COMP 348 Principles of Programming Languages

Lecture Notes
Peter Grogono
These notes may be copied for students who are taking either
COMP 348 Principles of Programming Languages or COMP 6411
Comparative Study of Programming Languages.
© Peter Grogono, 2002, 2003, 2004
Department of Computer Science and Software Engineering
Concordia University
Montreal, Quebec
Contents
1 Introduction 1
1.1 How important are programming languages? . . . . . . . . . . . . . . . . . . . . . 1
1.2 Features of the Course . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2 Anomalies 3
3 The Procedural Paradigm 7
3.1 Early Days . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2 FORTRAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 Algol 60 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.4 COBOL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.5 PL/I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.6 Algol 68 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.7 Pascal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.8 Modula–2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.9 C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.10 Ada . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4 The Functional Paradigm 20
4.1 LISP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.1.1 Basic LISP Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.1.2 A LISP Interpreter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.1.3 Dynamic Binding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.2 Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.3 SASL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.4 SML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.5 Other Functional Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5 Haskell 34
5.1 Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.1.1 Hugs Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.1.2 Literate Haskell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.1.3 Defining Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.1.4 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.1.5 Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.2 Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.2.1 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.2.2 Lazy Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.2.3 Pattern Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
5.2.4 Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.2.5 Building Haskell Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5.2.6 Guards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.3 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.3.1 Using Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.3.2 Failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.3.3 Reporting Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.3.4 Standard Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.3.5 Standard Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.3.6 Defining New Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3.7 Defining Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.3.8 Types with Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.4 Input and Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.4.1 String Formatting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.4.2 Interactive Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.5 Reasoning about programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6 The Object Oriented Paradigm 77
6.1 Simula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.2 Smalltalk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.3 CLU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.4 C++ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.5 Eiffel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.5.1 Programming by Contract . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.5.2 Repeated Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.5.3 Exception Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.6 Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.6.1 Portability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.6.2 Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.6.3 Exception Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.6.4 Concurrency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.7 Kevo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.8 Other OOPLs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.9 Issues in OOP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.9.1 Algorithms + Data Structures = Programs . . . . . . . . . . . . . . . . . . 93
6.9.2 Values and Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.9.3 Classes versus Prototypes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.9.4 Types versus Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.9.5 Pure versus Hybrid Languages . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.9.6 Closures versus Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.9.7 Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.10 A Critique of C++ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.11 Evaluation of OOP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
7 Backtracking Languages 106
7.1 Prolog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
7.2 Alma-0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
7.3 Other Backtracking Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
8 Structure 118
8.1 Block Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
8.2 Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
8.2.1 Encapsulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
8.3 Control Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
8.3.1 Loop Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
8.3.2 Procedures and Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
8.3.3 Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
9 Names and Binding 125
9.1 Free and Bound Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
9.2 Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
9.3 Early and Late Binding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
9.4 What Can Be Named? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
9.5 What is a Variable Name? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
9.6 Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
9.6.1 Ad Hoc Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
9.6.2 Parametric Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
9.6.3 Object Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
9.7 Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
9.8 Scope and Extent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
9.8.1 Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
9.8.2 Are nested scopes useful? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
9.8.3 Extent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
10 Abstraction 138
10.1 Abstraction as a Technical Term . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
10.1.1 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
10.1.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
10.1.3 Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
10.1.4 Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
10.2 Computational Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
11 Implementation 144
11.1 Compiling and Interpreting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
11.1.1 Compiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
11.1.2 Interpreting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
11.1.3 Mixed Implementation Strategies . . . . . . . . . . . . . . . . . . . . . . . . 146
11.1.4 Consequences for Language Design . . . . . . . . . . . . . . . . . . . . . . . 147
11.2 Garbage Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
12 Conclusion 150
References 151
A Abbreviations and Glossary 156
B Theoretical Issues 159
B.1 Syntactic and Lexical Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
B.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
B.3 Type Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
B.4 Regular Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
B.4.1 Tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
B.4.2 Context Free Grammars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
B.4.3 Control Structures and Data Structures . . . . . . . . . . . . . . . . . . . . 162
B.4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
C Haskell Implementation 164
List of Figures
1 The list structure (A (B C)) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2 A short session with Hugs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3 Hugs commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4 Parameters for :set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5 A Literate Haskell program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6 Pattern matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
7 Parsing precedence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
8 Operator precedence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
9 Standard Class Inheritance Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
10 The Class Frac . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
11 Trying out type Frac . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
12 Class BTree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
13 The fringe of a tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
14 Key-value tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
15 Proof of ancestor(pam, jim) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
16 Effect of cuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
17 REs and Control Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
18 REs and Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
19 The graph for succ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
1 Introduction
In order to understand why programming languages (PLs) are as they are today, and to predict
how they might develop in the future, we need to know something about how they evolved. We
consider early languages, but the main focus of the course is on contemporary and evolving PLs.
The treatment is fairly abstract. We do not discuss details of syntax, and we try to avoid unproductive
“language wars”. Here is a useful epigram for the course:
C and Pascal are the same language. (Dennis Ritchie)
We can summarize our goals as follows: if the course is successful, it should help us to find useful
answers to questions such as these.
What is a good programming language?
How should we choose an appropriate language for a particular task?
How should we approach the problem of designing a programming language?
1.1 How important are programming languages?
As the focus of practitioners and researchers has moved from programming to large-scale software
development, PLs no longer seem so central to Computer Science (CS). Brooks (1987) and others
have argued that, because coding occupies only a small fraction of the total time required for
project development, developments in programming languages cannot lead to dramatic increases
in software productivity.
The assumption behind this reasoning is that the choice of PL affects only the coding phase of the
project. For example, if coding requires 15% of total project time, we could deduce that reducing
coding time to zero would reduce the total project time by only 15%. Maintenance, however, is
widely acknowledged to be a major component of software development. Estimates of maintenance
time range from “at least 50%” (Lientz and Swanson 1980) to “between 65% and 75%” (McKee
1984) of the total time. Since the choice of PL can influence the ease of maintenance, these figures
suggest that the PL might have a significant effect on total development time.
Another factor that we should consider in studying PLs is their ubiquity: PLs are everywhere.
  • The “obvious” PLs are the major implementation languages: Ada, C++, COBOL, FORTRAN, . . . .
  • PLs for the World Wide Web: Java, Javascript, . . . .
  • Scripting PLs: Javascript, Perl, Tcl/Tk, Python, . . . .
  • “Hidden” languages: spreadsheets, macro languages, input for complex applications, . . . .
The following scenario has occurred often in the history of programming. Developers realize that
an application requires a format for expressing input data. The format increases in complexity
until it becomes a miniature programming language. By the time that the developers realize this,
it is too late for them to redesign the format, even if they had principles that they could use to
help them. Consequently, the notation develops into a programming language with many of the
bad features of old, long-since rejected programming languages.
There is an unfortunate tendency in Computer Science to re-invent language features without
carefully studying previous work. Brinch Hansen (1999) points out that, although safe and provably
correct methods of programming concurrent processes were developed by himself and Hoare in the
mid-seventies, Java does not make use of these methods and is insecure.
We conclude that PLs are important. However, their importance is often under-estimated, sometimes with unfortunate consequences.
1.2 Features of the Course
We will review a number of PLs during the course, but the coverage is very uneven. The time
devoted to a PL depends on its relative importance in the evolution of PLs rather than the extent
to which it was used or is used. For example, C and C++ are both widely-used languages but,
since they did not contribute profound ideas, they are discussed briefly. To provide insight into
functional programming, we consider Haskell in some depth.
With hindsight, we can recognize the mistakes and omissions of language designers and criticize
their PLs accordingly. These criticisms do not imply that the designers were stupid or incompetent.
On the contrary, we have chosen to study PLs whose designers had deep insights and made profound
contributions. Given the state of the art at the time when they designed the PLs, however, the
designers could hardly be expected to foresee the understanding that would come only with another
twenty or thirty years of experience.
We can echo Perlis (1978, pages 88–91) in saying:
[Algol 60] was more racehorse than camel . . . . a rounded work of art . . . . Rarely
has a construction so useful and elegant emerged as the output of a committee of 13
meeting for about 8 days . . . . Algol deserves our affection and appreciation.
and yet
[Algol 60] was a noble begin but never intended to be a satisfactory end.
The approach of the course is conceptual rather than technical. We do not dwell on fine lexical
or syntactic distinctions, semantic details, or legalistic issues. Instead, we attempt to focus on the
deep and influential ideas that have developed with PLs.
Exercises are provided mainly to provoke thought. In many cases, the solution to an exercise can
be found (implicitly!) elsewhere in these notes. If we spend time to think about a problem, we are
better prepared to understand and appreciate its solution later. Some of the exercises may appear
in revised form in the examination at the end of the course.
Many references are provided. The intention is not that you read them all but rather that, if you
are interested in a particular aspect of the course, you should be able to find out more about it by
reading the indicated material.
2 Anomalies
There are many ways of motivating the study of PLs. One way is to look at some anomalies:
programs that look as if they ought to work but don’t, or programs that look as if they shouldn’t
work but do.
Example 1: Assignments. Most PLs provide an assignment statement of the form v := E in
which v is a variable and E is an expression. The expression E yields a value. The variable v is not
necessarily a simple name: it may be an expression such as a[i] or r.f. In any case, it yields an
address into which the value of E is stored. In some PLs, we can use a function call in place of E
but not in place of v. For example, we cannot write a statement such as
top(stack) = 6;
to replace the top element of a stack. 2
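One resolution can be sketched in C++ (a language discussed later in these notes): a function that returns a reference yields an address rather than just a value, so the call can stand on the left of an assignment. The function below is an illustrative sketch, not a quotation from any standard library.

```cpp
#include <vector>

// Returning a reference (int&) makes the call an "lvalue": it denotes
// an address into which a value can be stored, so top(s) = 6 is legal.
int& top(std::vector<int>& stack) {
    return stack.back();
}

// Usage:
//   std::vector<int> s = {1, 2, 3};
//   top(s) = 6;   // replaces the top element, as Example 1 wished
```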
Exercise 1. Explain why the asymmetry mentioned in Example 1 exists. Can you see any
problems that might be encountered in implementing a function like top? 2
Example 2: Parameters and Results. PLs are fussy about the kinds of objects that can be
passed as parameters to a function or returned as results by a function. In C, an array can be
passed to a function, but only by reference, and a function cannot return an array. In Pascal,
arrays can be passed by value or by reference to a function, but a function cannot return an array.
Most languages do not allow types to be passed as parameters.
It would be useful, for example, to write a procedure such as
void sort (type t, t a[]) { . . . . }
and then call sort(float, scores). 2
Exercise 2. Suppose you added types as parameters to a language. What other changes would
you have to make to the language? 2
Exercise 3. C++ provides templates. Is this equivalent to having types as parameters? 2
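As a starting point for Exercise 3, the hypothetical sort(type t, t a[]) of Example 2 can be rendered as a C++ template. The name sortArray is invented for this sketch; note that the type parameter T is fixed at compile time for each call, which already hints at one difference between templates and genuine run-time type parameters.

```cpp
#include <algorithm>
#include <vector>

// The type parameter t of sort(type t, t a[]) becomes a template
// parameter T; the compiler generates one instance of sortArray for
// each element type actually used.
template <typename T>
void sortArray(std::vector<T>& a) {
    std::sort(a.begin(), a.end());
}

// Usage: sortArray(scores), where scores is a std::vector<float>,
// corresponds to the call sort(float, scores).
```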
Example 3: Data Structures and Files. Conceptually, there is not a great deal of difference
between a data structure and a file. The important difference is the more or less accidental feature
that data structures usually live in memory and files usually live in some kind of external storage
medium, such as a disk. Yet PLs treat data structures and files in completely different ways. 2
Exercise 4. Explain why most PLs treat data structures and files differently. 2
Example 4: Arrays. In PLs such as C and Pascal, the size of an array is fixed at compile-time.
In other PLs, even ones that are older than C and Pascal, such as Algol 60, the size of an array
is not fixed until run-time. What difference does this make to the implementor of the PL and the
programmers who use it?
Suppose that we have a two-dimensional array a. We access a component of a by writing a[i,j]
(except in C, where we have to write a[i][j]). We can access a row of the array by writing a[i].
Why can’t we access a column by writing a[,j]?
It is easy to declare rectangular arrays in most PLs. Why is there usually no provision for triangular,
symmetric, banded, or sparse arrays? 2
Exercise 5. Suggest ways of declaring arrays of the kinds mentioned in Example 4. 2
Example 5: Data Structures and Functions. Pascal and C programs contain declarations
for functions and data. Both languages also provide a data aggregation — record in Pascal and
struct in C. A record looks rather like a small program: it contains data declarations, but it
cannot contain function declarations. 2
Exercise 6. Suggest some applications for records with both data and function components. 2
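A C++ struct shows what such a record looks like: it may contain function declarations alongside data declarations, which is precisely the step from records toward the classes of Section 6. The Counter type below is a hypothetical example invented for illustration.

```cpp
// A record with both data and function components.
struct Counter {
    int count = 0;                  // data component
    void increment() { ++count; }   // function component acting on the data
};

// Usage:
//   Counter c;
//   c.increment();   // c.count is now 1
```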
Example 6: Memory Management. Most C/C++ programmers learn quite quickly that it
is not a good idea to write functions like this one.
char *money (int amount)
{
    char buffer[16];
    sprintf(buffer, "%d.%d", amount / 100, amount % 100);
    return buffer;
}
Writing C functions that return strings is awkward and, although experienced C programmers have
ways of avoiding the problem, it remains a defect of both C and C++. 2
Exercise 7. What is wrong with the function money in Example 6? Why is it difficult for a C
function to return a string (i.e., char *)? 2
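One repair, sketched in C++ rather than C: buffer still lives on money's stack frame, but returning a std::string copies the characters out before the frame dies, so no dangling pointer escapes. (This sketch also changes the format to %02d so that 1005 prints as "10.05" rather than "10.5".)

```cpp
#include <cstdio>
#include <string>

// The std::string return value owns its own copy of the characters,
// so it remains valid after money's local buffer is gone.
std::string money(int amount) {
    char buffer[16];
    std::snprintf(buffer, sizeof buffer, "%d.%02d",
                  amount / 100, amount % 100);
    return std::string(buffer);
}
```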
Example 7: Factoring. In algebra, we are accustomed to factoring expressions. For example,
when we see the expression Ax + Bx we are strongly tempted to write it in the form (A + B)x.
Sometimes this is possible in PLs: we can certainly rewrite
z = A * x + B * x;
as
z = (A + B) * x;
Most PLs allow us to write something like this:
if (x > PI) y = sin(x) else y = cos(x)
Only a few PLs allow the shorter form:
y = if (x > PI) then sin(x) else cos(x);
Very few PLs recognize the even shorter form:
y = (if (x > PI) then sin else cos)(x);
2
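Both of the shorter forms in Example 7 are in fact expressible in C and C++ using the conditional operator. The sketch below uses wrapper functions mySin and myCos (invented names) because C++'s std::sin and std::cos are overloaded, which would make the bare names ambiguous in this context.

```cpp
#include <cmath>

const double PI = 3.14159265358979323846;

double mySin(double x) { return std::sin(x); }
double myCos(double x) { return std::cos(x); }

// y = if (x > PI) then sin(x) else cos(x);
double shorter(double x) { return x > PI ? mySin(x) : myCos(x); }

// y = (if (x > PI) then sin else cos)(x);
// The condition selects a function value, which is then applied to x.
double shortest(double x) { return (x > PI ? mySin : myCos)(x); }
```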
Example 8: Returning Functions. Some PLs, such as Pascal, allow nested functions, whereas
others, such as C, do not. Suppose that we were allowed to nest functions in C, and we defined
the function shown in Listing 1.
Listing 1: Returning a function
typedef int intint(int);
intint addk (int k)
{
    int f (int n)
    {
        return n + k;
    }
    return f;
}
The idea is that addk returns a function that “adds k to its argument”. We would use it as in the
following program, which should print “10”:
int add6 (int) = addk(6);
printf("%d", add6(4));
2
Exercise 8. “It would be fairly easy to change C so that a function could return a function.” True
or false? (Note that you can represent the function itself by a pointer to its first instruction.) 2
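Modern C++ can express addk directly, which sheds light on why plain C cannot: a lambda that captures k by value carries its own copy of k, so the function returned by addk remains valid after addk's stack frame has gone. This is a sketch of the same idea as Listing 1, not the extended C the example imagines.

```cpp
#include <functional>

// The lambda [k](int n) { ... } is a closure: it packages the code
// together with the captured value of k.
std::function<int(int)> addk(int k) {
    return [k](int n) { return n + k; };
}

// Usage:
//   auto add6 = addk(6);
//   add6(4);   // yields 10, as the example intends
```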
Example 9: Functions as Values. Example 8 is just one of several oddities about the way
programs handle functions. Some PLs do not permit this sequence:
if (x > PI) f = sin else f = cos;
f(x);
2
Exercise 9. Why are statements like this permitted in C (with appropriate syntax) but not in
Pascal? 2
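In C, the sequence of Example 9 is legal once f is declared as a pointer to function: a function name is an ordinary value of that type. The sketch below phrases it in C++, again with non-overloaded wrappers (sinD, cosD, invented names) around the overloaded library functions.

```cpp
#include <cmath>

double sinD(double x) { return std::sin(x); }
double cosD(double x) { return std::cos(x); }

// f is a variable whose values are functions; it is assigned
// conditionally and called later, exactly as in Example 9.
double selectAndApply(double x, double pi) {
    double (*f)(double);
    if (x > pi) f = sinD; else f = cosD;
    return f(x);
}
```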
Example 10: Functions that work in more than one way. The function Add(x,y,z)
attempts to satisfy the constraint x + y = z. For instance:
Add(2,3,n) assigns 5 to n.
Add(i,3,5) assigns 2 to i.
2
Example 11: Logic computation. It would be convenient if we could encode logical assertions
such as
∀ i . 0 < i < N ⇒ a[i − 1] < a[i]
in the form
all (int i = 1; i < N; i++)
a[i-1] < a[i]
and
∃ i . 0 ≤ i < N ∧ a[i] = k
Listing 2: The Factorial Function
typedef TYPE . . . .
TYPE fac (int n, TYPE f)
{
    if (n <= 1)
        return 1;
    else
        return n * f(n-1, f);
}
printf("%d", fac(3, fac));
in the form
exists (int i = 0; i < N; i++)
a[i] == k
2
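The proposed all/exists notation can be approximated today as ordinary functions taking a predicate over an index range. The names and signatures below are invented for this sketch; no existing PL feature is being quoted.

```cpp
#include <functional>

// all(lo, hi, p): does p(i) hold for every i in [lo, hi)?
bool all(int lo, int hi, std::function<bool(int)> p) {
    for (int i = lo; i < hi; ++i)
        if (!p(i)) return false;   // a counterexample refutes the universal
    return true;
}

// exists(lo, hi, p): does p(i) hold for some i in [lo, hi)?
bool exists(int lo, int hi, std::function<bool(int)> p) {
    for (int i = lo; i < hi; ++i)
        if (p(i)) return true;     // a witness establishes the existential
    return false;
}

// Usage, for an array a of length N:
//   all(1, N, [&](int i) { return a[i-1] < a[i]; })   // a is sorted
//   exists(0, N, [&](int i) { return a[i] == k; })    // k occurs in a
```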
Example 12: Implicit Looping. It is a well-known fact that we can rewrite programs that
use loops as programs that use recursive functions. It is less well-known that we can eliminate the
recursion as well: see Listing 2.
This program computes factorials using neither loops nor recursion:
fac(3, fac) = 3 × fac(2, fac)
            = 3 × 2 × fac(1, fac)
            = 3 × 2 × 1
            = 6
In early versions of Pascal, functions like this actually worked. When type-checking was improved,
however, they would no longer compile. 2
Exercise 10. Can you complete the definition of TYPE in Listing 2? 2
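One way to give TYPE a well-typed definition, sketched in C++ rather than C: the difficulty is that TYPE must mention itself (a function taking an int and a TYPE). Giving the recursive type a name, as a class with a call operator, lets the self-application type-check; the recursion still happens only through the parameter f, as in Listing 2. The name Fac is invented for this sketch.

```cpp
// Fac plays the role of TYPE: something callable with an int and a Fac.
// Because the class name is in scope inside its own definition, the
// recursive type presents no problem here, unlike in C's typedef.
struct Fac {
    int operator()(int n, const Fac& f) const {
        return n <= 1 ? 1 : n * f(n - 1, f);
    }
};

// Usage:
//   Fac fac;
//   fac(3, fac);   // yields 6, exactly as traced above
```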
The point of these examples is to show that PLs affect the way we think. It would not occur to
most programmers to write programs in the style of the preceding sections because most PLs do
not permit these constructions. Working with low-level PLs leads to the belief that “this is all that
computers can do”.
Garbage collection provides an example. Many programmers have struggled for years writing
programs that require explicit deallocation of memory. Memory “leaks” are one of the most
common forms of error in large programs and they can do serious damage. (One of the Apollo
missions almost failed because of a memory leak that was corrected shortly before memory was
exhausted, when the rocket was already in lunar orbit.) These programmers may be unaware of
the existence of PLs that provide garbage collection, or they may believe (because they have been
told) that garbage collection has an unacceptably high overhead.
3 The Procedural Paradigm
The introduction of the von Neumann architecture was a crucial step in the development of electronic computers. The basic idea is that instructions can be encoded as data and stored in the
memory of the computer. The first consequence of this idea is that changing and modifying the
stored program is simple and efficient. In fact, changes can take place at electronic speeds, a very
different situation from earlier computers that were programmed by plugging wires into panels.
The second, and ultimately more far-reaching, consequence is that computers can process programs
themselves, under program control. In particular, a computer can translate a program from one
notation to another. Thus the stored-program concept led to the development of programming
“languages”.
The history of PLs, like the history of any technology, is complex. There are advances and setbacks; ideas that enter the mainstream and ideas that end up in a backwater; even ideas that are
submerged for a while and later surface in an unexpected place.
With the benefit of hindsight, we can identify several strands in the evolution of PLs. These
strands are commonly called “paradigms” and, in this course, we survey the paradigms separately
although their development was interleaved.
Sources for this section include (Wexelblat 1981; Williams and Campbell-Kelly 1989; Bergin and
Gibson 1996).
3.1 Early Days
The first PLs evolved from machine code. The first programs used numbers to refer to machine
addresses. One of the first additions to programming notation was the use of symbolic names
rather than numbers to represent addresses.
Briefly, it enables the programmer to refer to any word in a programme by means of a
label or tag attached to it arbitrarily by the programmer, instead of by its address in
the store. Thus, for example, a number appearing in the calculation might be labelled
‘a3’. The programmer could then write simply ‘A a3’ to denote the operation of adding
this number into the accumulator, without having to specify just where the number is
located in the machine. (Mutch and Gill 1954)
The key point in this quotation is the phrase “instead of by its address in the store”. Instead of
writing
Location Order
100 A 104
101 A 2
102 T 104
103 H 24
104 C 50
105 T 104
the programmer would write
A a3
A 2
T a3
H 24
a3) C 50
T a3
systematically replacing the address 104 by the symbol a3 and omitting the explicit addresses.
This establishes the principle that a variable name stands for a memory location, a principle that
influenced the subsequent development of PLs and is now known — perhaps inappropriately — as
value semantics.
The importance of subroutines and subroutine libraries was recognized before high-level pro-
gramming languages had been developed, as the following quotation shows.
The following advantages arise from the use of such a library:
1. It simplifies the task of preparing problems for the machine;
2. It enables routines to be more readily understood by other users, as conventions
are standardized and the units of a routine are much larger, being subroutines
instead of individual orders;
3. Library subroutines may be used directly, without detailed coding and punching;
4. Library subroutines are known to be correct, thus greatly reducing the overall
chance of error in a complete routine, and making it much easier to locate errors.
. . . . Another difficulty arises from the fact that, although it is desirable to have
subroutines available to cover all possible requirements, it is also undesirable to allow
the size of the resulting library to increase unduly. However, a subroutine can be made
more versatile by the use of parameters associated with it, thus reducing the total size
of the library.
We may divide the parameters associated with subroutines into two classes.
EXTERNAL parameters, i.e. parameters which are fixed throughout the solution of a
problem and arise solely from the use of the library;
INTERNAL parameters, i.e. parameters which vary during the solution of the problem.
. . . . Subroutines may be divided into two types, which we have called OPEN and
CLOSED. An open subroutine is one which is included in the routine as it stands
whereas a closed subroutine is placed in an arbitrary position in the store and can be
called into use by any part of the main routine. (Wheeler 1951)
Exercise 11. This quotation introduces a theme that has continued with variations to the present
day. Find in it the origins of concern for:
correctness;
maintenance;
encapsulation;
parameterization;
genericity;
reuse;
space/time trade-offs.
2
Machine code is a sequence of “orders” or “instructions” that the computer is expected to execute.
The style of programming that this viewpoint developed became known as the “imperative” or
“procedural” programming paradigm. In these notes, we use the term “procedural” rather than
“imperative” because programs resemble “procedures” (in the English, non-technical sense) or
recipes rather than “commands”. Confusingly, the individual steps of procedural PLs, such as
Pascal and C, are often called “statements”, although in logic a “statement” is a sentence that is
either true or false.
By default, the commands of a procedural program are executed sequentially. Procedural PLs
provide various ways of escaping from the sequence. The earliest mechanisms were the “jump”
command, which transferred control to another part of the program, and the “jump and store link”
command, which transferred control but also stored a “link” to which control would be returned
after executing a subroutine.
The data structures of these early languages were usually rather simple: typically primitive values
(integers and floats) were provided, along with single- and multi-dimensioned arrays of primitive
values.
3.2 FORTRAN
FORTRAN was introduced in 1957 at IBM by a team led by John Backus. The “Preliminary
Report” describes the goal of the FORTRAN project:
The IBM Mathematical Formula Translation System or briefly, FORTRAN, will com-
prise a large set of programs to enable the IBM 704 to accept a concise formulation of a
problem in terms of a mathematical notation and to produce automatically a high-speed
704 program for the solution of the problem. (Quoted in (Sammet 1969).)
This suggests that the IBM team’s goal was to eliminate programming! The following quotation
seems to confirm this:
If it were possible for the 704 to code problems for itself and produce as good programs
as human coders (but without the errors), it was clear that large benefits could be
achieved. (Backus 1957)
It is interesting to note that, 20 years later, Backus (1978) criticized FORTRAN and similar
languages as “lacking useful mathematical properties”. He saw the assignment statement as a
source of inefficiency: “the von Neumann bottleneck”. The solution, however, was very similar
to the solution he advocated in 1957 — programming must become more like mathematics: “we
should be focusing on the form and content of the overall result”.
Although FORTRAN did not eliminate programming, it was a major step towards the elimination
of assembly language coding. The designers focused on efficient implementation rather than elegant
language design, knowing that acceptance depended on the high performance of compiled programs.
FORTRAN has value semantics. Variable names stand for memory addresses that are determined
when the program is loaded.
The major achievements of FORTRAN are:
efficient compilation;
separate compilation (programs can be presented to the compiler as separate subroutines, but
the compiler does not check for consistency between components);
demonstration that high-level programming, with automatic translation to machine code, is
feasible.
The principal limitations of FORTRAN are:
Flat, uniform structure. There is no concept of nesting in FORTRAN. A program consists
of a sequence of subroutines and a main program. Variables are either global or local to
subroutines. In other words, FORTRAN programs are rather similar to assembly language
programs: the main difference is that a typical line of FORTRAN describes evaluating an
expression and storing its value in memory whereas a typical line of assembly language
specifies a machine instruction (or a small group of instructions in the case of a macro).
Limited control structures. The control structures of FORTRAN are IF, DO, and GOTO. Since
there are no compound statements, labels provide the only indication that a sequence of
statements forms a group.
Unsafe memory allocation. FORTRAN borrows the concept of COMMON storage from assembly
language programs. This enables different parts of a program to share regions of memory, but
the compiler does not check for consistent usage of these regions. One program component
might use a region of memory to store an array of integers, and another might assume that
the same region contains reals. To conserve precious memory, FORTRAN also provides the
EQUIVALENCE statement, which allows variables with different names and types to share a
region of memory.
No recursion. FORTRAN allocates all data, including the parameters and local variables of
subroutines, statically. Recursion is forbidden because only one instance of a subroutine can
be active at one time.
Exercise 12. The FORTRAN 1966 Standard stated that a FORTRAN implementation may
allow recursion but is not required to do so. How would you interpret this statement if you were:
writing a FORTRAN program?
writing a FORTRAN compiler?
2
3.3 Algol 60
During the late fifties, most of the development of PLs was coming from industry. IBM domi-
nated, with COBOL, FORTRAN, and FLPL (FORTRAN List Processing Language), all designed
for the IBM 704. Algol 60 (Naur et al. 1960; Naur 1978; Perlis 1978) was designed by an interna-
tional committee, partly to provide a PL that was independent of any particular company and its
computers. The committee included both John Backus (chief designer of FORTRAN) and John
McCarthy (designer of LISP).
The goal was a “universal programming language”. In one sense, Algol was a failure: few complete,
high-quality compilers were written and the language was not widely used (although it was used
more in Europe than in North America). In another sense, Algol was a huge success: it became the
Listing 3: An Algol Block
begin
integer x;
begin
function f(x) begin ... end;
integer x;
real y;
x := 2;
y := 3.14159;
end;
x := 1;
end
standard language for describing algorithms. For the better part of 30 years, the ACM required
submissions to the algorithm collection to be written in Algol.
The major innovations of Algol are discussed below.
Block Structure. Algol programs are recursively structured. A program is a block. A block
consists of declarations and statements. There are various kinds of statement; in particular,
one kind of statement is a block. A variable or function name declared in a block can
be accessed only within the block: thus Algol introduced nested scopes. The recursive
structure of programs means that large programs can be constructed from small programs.
In the Algol block shown in Listing 3, the two assignments to x refer to two different variables.
The run-time entity corresponding to a block is called an activation record (AR). The AR
is created on entry to the block and destroyed after the statements of the block have been
executed. The syntax of Algol ensures that blocks are fully nested; this in turn means that
ARs can be allocated on a stack. Block structure and stacked ARs have been incorporated
into almost every language since Algol.
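Nested scopes of this kind survive in essentially every modern language. A minimal Python sketch (function names hypothetical) of the shadowing in Listing 3, where the two assignments to x affect two different variables:

```python
def outer():
    x = 1              # the outer block's x
    def inner():
        x = 2          # a different x, local to the nested scope
        return x
    # the assignment inside inner never touches the outer x
    return inner(), x

print(outer())  # (2, 1)
```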
Dynamic Arrays. The designers of Algol realized that it was relatively simple to allow the size of
an array to be determined at run-time. The compiler statically allocates space for a pointer
and an integer (collectively called a “dope vector”) on the stack. At run-time, when the size
of the array is known, the appropriate amount of space is allocated on the stack and the
components of the “dope vector” are initialized. The following code works fine in Algol 60.
procedure average (n); integer n;
begin
real array a[1:n];
. . . .
end;
Despite the simplicity of the implementation, successor PLs such as C and Pascal dropped
this useful feature.
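In Python the equivalent of Algol's run-time-sized array is trivial, since all allocation is dynamic; this sketch (function name hypothetical) mirrors the average procedure above:

```python
def average(values):
    n = len(values)
    a = [0.0] * n        # size known only at run time, like Algol's real array a[1:n]
    for i in range(n):
        a[i] = float(values[i])
    return sum(a) / n
```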
Call By Name. The default method of passing parameters in Algol was “call by name” and it
was described by a rather complicated “copy rule”. The essence of the copy rule is that the
program behaves as if the text of the formal parameter in the function is replaced by the
text of the actual parameter. The complications arise because it may be necessary to rename
some of the variables during the textual substitution. The usual implementation strategy was
to translate the actual parameter into a procedure with no arguments (called a “thunk”);
Listing 4: Call by name
procedure count (n); integer n;
begin
n := n + 1
end
Listing 5: A General Sum Function
integer procedure sum (max, i, val); integer max, i, val;
begin
integer s;
s := 0;
for i := 1 until max do
s := s + val;
sum := s
end
each occurrence of the formal parameter inside the function was translated into a call to this
function.
The mechanism seems strange today because few modern languages use it. However, the
Algol committee had several valid reasons for introducing it.
Call by name enables procedures to alter their actual parameters. If the procedure count
is defined as in Listing 4, the statement
count(widgets)
has the same effect as the statement
begin
widgets := widgets + 1
end
The other parameter passing mechanism provided by Algol, call by value, does not allow
a procedure to alter the value of its actual parameters in the calling environment: the
parameter behaves like an initialized local variable.
Call by name provides control structure abstraction. The procedure in Listing 5 provides
a form of abstraction of a for loop. The first parameter specifies the number of iterations,
the second is the loop index, and the third is the loop body. The statement
sum(3, i, a[i])
computes a[1]+a[2]+a[3].
Call by name evaluates the actual parameter exactly as often as it is accessed. (This is
in contrast with call by value, where the parameter is usually evaluated exactly once, on
entry to the procedure.) For example, if we declare the procedure try as in Listing 6,
it is safe to call try(x > 0, 1.0/x), because, if x ≤ 0, the expression 1.0/x will not be
evaluated.
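The thunk-based implementation is easy to imitate in any language with first-class functions. The Python sketch below (names hypothetical) reproduces the behaviour of Listing 5 and of try: each "name" parameter is passed as a zero-argument function that is re-evaluated on every access, and never evaluated if it is never accessed.

```python
def sum_by_name(max_, set_i, val):
    # set_i(k) plays the role of assigning to the loop index i;
    # val() re-evaluates the actual parameter on every access
    s = 0
    for k in range(1, max_ + 1):
        set_i(k)
        s += val()
    return s

class Env:
    pass

env = Env()
a = [None, 10, 20, 30]          # 1-based, as in the Algol example
total = sum_by_name(3, lambda k: setattr(env, 'i', k), lambda: a[env.i])
# total is a[1] + a[2] + a[3] = 60

def try_(b, x):
    # x() is evaluated only if b is true, so try_(x > 0, lambda: 1.0 / x) is safe
    return x() if b else 0.0
```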
Own Variables. A variable in an Algol procedure can be declared own. The effect is that the
variable has local scope (it can be accessed only by the statements within the procedure)
but global extent (its lifetime is the execution of the entire program).
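The separation of scope and extent that own provides can be imitated with a closure; a minimal Python sketch (names hypothetical), in which count is accessible only inside counter but survives across calls. (Unlike Algol's own, each call to make_counter creates a fresh variable.)

```python
def make_counter():
    count = 0              # like an 'own' variable: local scope, global extent
    def counter():
        nonlocal count
        count += 1
        return count
    return counter

tick = make_counter()
```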
Exercise 13. Why does i appear in the parameter list of sum? 2
Listing 6: Using call by name
real procedure try (b, x); boolean b; real x;
begin
try := if b then x else 0.0
end
Exercise 14. Discuss the initialization of own variables. 2
Algol 60 and most of its successors, like FORTRAN, have value semantics. A variable name stands
for a memory address that is determined when the block containing the variable declaration is
entered at run-time.
With hindsight, we can see that Algol made important contributions but also missed some very
interesting opportunities.
An Algol block without statements is, in effect, a record. Yet Algol 60 does not provide
records.
The local data of an AR is destroyed after the statements of the AR have been executed. If
the data was retained rather than destroyed, Algol would be a language with modules.
An Algol block consists of declarations followed by statements. Suppose that declarations and
statements could be interleaved in a block. In the following block, D denotes a sequence of
declarations and S denotes a sequence of statements.
begin
D1
S1
D2
S2
end
A natural interpretation would be that S1 and S2 are executed concurrently.
Own variables were in fact rather problematic in Algol, for various reasons including the difficulty
of reliably initializing them (see Exercise 14). But the concept was important: it is the
separation of scope and extent that ultimately leads to objects.
The call by name mechanism was a first step towards the important idea that functions can
be treated as values. The actual parameter in an Algol call, assuming the default calling
mechanism, is actually a parameterless procedure, as mentioned above. Applying this idea
consistently throughout the language would have led to high order functions and paved the
way to functional programming.
The Algol committee knew what they were doing, however. They knew that incorporating the
“missed opportunities” described above would have led to significant implementation problems. In
particular, since they believed that the stack discipline obtained with nested blocks was crucial for
efficiency, anything that jeopardized it was not acceptable.
Algol 60 was simple and powerful, but not quite powerful enough. The dominant trend after Algol
was towards languages of increasing complexity, such as PL/I and Algol 68. Before discussing
these, we take a brief look at COBOL.
3.4 COBOL
COBOL (Sammet 1978) introduced structured data and implicit type conversion. When COBOL
was introduced, “programming” was more or less synonymous with “numerical computation”.
COBOL introduced “data processing”, where data meant large numbers of characters. The data
division of a COBOL program contained descriptions of the data to be processed.
Another important innovation of COBOL was a new approach to data types. The problem of type
conversion had not arisen previously because only a small number of types were provided by the
PL. COBOL introduced many new types, in the sense that data could have various degrees of
precision, and different representations as text. The choice made by the designers of COBOL was
radical: type conversion should be automatic.
The assignment statement in COBOL has several forms, including
MOVE X TO Y.
If X and Y have different types, the COBOL compiler will attempt to find a conversion from one
type to the other. In most PLs of the time, a single statement translated into a small number of
machine instructions. In COBOL, a single statement could generate a large amount of machine
code.
Example 13: Automatic conversion in COBOL. The Data Division of a COBOL program
might contain these declarations:
77 SALARY PICTURE 99999, USAGE IS COMPUTATIONAL.
77 SALREP PICTURE $$$,$$9.99
The first indicates that SALARY is to be stored in a form suitable for computation (probably, but
not necessarily, binary form) and the second provides a format for reporting salaries as amounts
in dollars. (Only one dollar symbol will be printed, immediately before the first significant digit).
The Procedure Division would probably contain a statement like
MOVE SALARY TO SALREP.
which implicitly requires the conversion from binary to character form, with appropriate formatting.
2
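A rough Python rendering of the code a COBOL compiler generates for this MOVE (function name hypothetical): the binary value is edited into character form with comma grouping, two decimal places, and a single floating dollar sign, approximating PICTURE $$$,$$9.99.

```python
def move_to_salrep(salary):
    # binary (COMPUTATIONAL) value -> edited character form
    return "${:,.2f}".format(salary)

print(move_to_salrep(12345))  # $12,345.00
```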
Exercise 15. Despite significant advances in the design and implementation of PLs, it remains
true that FORTRAN is widely used for “number crunching” and COBOL is widely used for data
processing. Can you explain why? 2
3.5 PL/I
During the early 60s, the dominant languages were Algol, COBOL, and FORTRAN. The continuing
desire for a “universal language” that would be applicable to a wide variety of problem domains led
IBM to propose a new programming language (originally called NPL but changed, after objections
from the UK’s National Physical Laboratory, to PL/I) that would combine the best features of
these three languages. Insiders at the time referred to the new language as “CobAlgoltran”.
The design principles of PL/I (Radin 1978) included:
the language should contain the features necessary for all kinds of programming;
a programmer could learn a subset of the language, suitable for a particular application,
without having to learn the entire language.
An important lesson of PL/I is that these design goals are doomed to failure. A programmer who
has learned a “subset” of PL/I is likely, like all programmers, to make a mistake. With luck, the
compiler will detect the error and provide a diagnostic message that is incomprehensible to the
programmer because it refers to a part of the language outside the learned subset. More probably,
the compiler will not detect the error and the program will behave in a way that is inexplicable to
the programmer, again because it is outside the learned subset.
PL/I extends the automatic type conversion facilities of COBOL to an extreme degree. For exam-
ple, the expression (Gelernter and Jagannathan 1990)
(’57’ || 8) + 17
is evaluated as follows:
1. Convert the integer 8 to the string ’8’.
2. Concatenate the strings ’57’ and ’8’, obtaining ’578’.
3. Convert the string ’578’ to the integer 578.
4. Add 17 to 578, obtaining 595.
5. Convert the integer 595 to the string ’595’.
The compiler’s policy, on encountering an assignment x = E, might be paraphrased as: “Do
everything possible to compile this statement; as far as possible, avoid issuing any diagnostic
message that would tell the programmer what is happening”.
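The coercion sequence can be spelled out in Python (helper names hypothetical); the point is that PL/I performs all five steps silently:

```python
def pl1_concat(a, b):
    return str(a) + str(b)          # steps 1-2: coerce both operands to strings

def pl1_add(a, b):
    return int(a) + int(b)          # steps 3-4: coerce both operands to integers

result = str(pl1_add(pl1_concat('57', 8), 17))   # step 5: back to a string
print(result)  # 595
```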
PL/I did introduce some important new features into PLs. They were not all well-designed, but
their existence encouraged others to produce better designs.
Every variable has a storage class: static, automatic, based, or controlled. Some of
these were later incorporated into C.
An object associated with a based variable x requires explicit allocation and is placed on
the heap rather than the stack. Since we can execute the statement allocate x as often as
necessary, based variables provide a form of template.
PL/I provides a wide range of programmer-defined types. Types, however, could not be named.
PL/I provided a simple, and not very safe, form of exception handling. Statements of the
following form are allowed anywhere in the program:
ON condition
BEGIN;
. . . .
END;
If the condition (which might be OVERFLOW, PRINTER OUT OF PAPER, etc.) becomes TRUE,
control is transferred to whichever ON statement for that condition was most recently executed.
After the statements between BEGIN and END (the handler) have been executed, control returns
to the statement that raised the exception or, if the handler contains a GOTO statement, to the
target of that statement.
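The dynamic "most recently executed handler wins" rule can be sketched with a per-condition handler stack (Python, names hypothetical; returning control to the statement that raised the exception is not modelled here):

```python
handlers = {}                      # condition name -> stack of handlers

def on(condition, handler):
    handlers.setdefault(condition, []).append(handler)   # most recent on top

def signal(condition):
    # transfer control to the most recently installed handler, as PL/I does
    handlers[condition][-1]()

log = []
on('OVERFLOW', lambda: log.append('first handler'))
on('OVERFLOW', lambda: log.append('second handler'))
signal('OVERFLOW')                 # runs the second handler only
```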
Exercise 16. Discuss potential problems of the PL/I exception handling mechanism. 2
3.6 Algol 68
Whereas Algol 60 is a simple and expressive language, its successor Algol 68 (van Wijngaarden
et al. 1975; Lindsey 1996) is much more complex. The main design principle of Algol 68 was
orthogonality: the language was to be defined using a number of basic concepts that could be
combined in arbitrary ways. Although it is true that lack of orthogonality can be a nuisance in
PLs, it does not necessarily follow that orthogonality is always a good thing.
The important features introduced by Algol 68 include the following.
The language was described in a formal notation that specified the complete syntax and
semantics of the language (van Wijngaarden et al. 1975). The fact that the Report was very
hard to understand may have contributed to the slow acceptance of the language.
Operator overloading: programmers can provide new definitions for standard operators such
as “+”. Even the priority of these operators can be altered.
Algol 68 has a very uniform notation for declarations and other entities. For example, Algol
68 uses the same syntax (mode name = expression) for types, constants, variables, and func-
tions. This implies that, for all these entities, there must be forms of expression that yield
appropriate values.
In a collateral clause of the form (x, y, z), the expressions x, y, and z can be evaluated in
any order, or concurrently. In a function call f(x, y, z), the argument list is a collateral clause.
Collateral clauses provide a good, and early, example of the idea that a PL specification should
intentionally leave some implementation details undefined. In this example, the Algol 68 report
does not specify the order of evaluation of the expressions in a collateral clause. This gives
the implementor freedom to use any order of evaluation and hence, perhaps, to optimize.
The operator ref stands for “reference” and means, roughly, “use the address rather than the
value”. This single keyword introduces call by reference, pointers, dynamic data structures,
and other features to the language. It appears in C in the form of the operators “*” and “&”.
A large vocabulary of PL terms, some of which have become part of the culture (cast, coercion,
narrowing, . . . .) and some of which have not (mode, weak context, voiding,. . . .).
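Of these features, operator overloading has been the most widely imitated; a minimal Python sketch giving "+" a new meaning for a user-defined type (class name hypothetical). Note that Python, unlike Algol 68, does not let the programmer alter operator priorities.

```python
class Vec:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, other):           # user-supplied meaning for '+'
        return Vec(self.x + other.x, self.y + other.y)

v = Vec(1, 2) + Vec(3, 4)
print(v.x, v.y)  # 4 6
```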
Like Algol 60, Algol 68 was not widely used, although it was popular for a while in various parts
of Europe. The ideas that Algol 68 introduced, however, have been widely imitated.
Exercise 17. Algol 68 has a rule requiring that, for an assignment x := E, the lifetime of the
variable x must be less than or equal to the lifetime of the object obtained by evaluating E.
Explain the motivation for this rule. To what extent can it be checked by the compiler? 2
3.7 Pascal
Pascal was designed by Wirth (1996) as a reaction to the complexity of Algol 68, PL/I, and other
languages that were becoming popular in the late 60s. Wirth made extensive use of the ideas of
Dijkstra and Hoare (later published as (Dahl, Dijkstra, and Hoare 1972)), especially Hoare’s ideas
of data structuring. The important contributions of Pascal included the following.
Pascal demonstrated that a PL could be simple yet powerful.
The type system of Pascal was based on primitives (integer, real, bool, . . . .) and mech-
anisms for building structured types (array, record, file, set, . . . .). Thus data types in
Pascal form a recursive hierarchy just as blocks do in Algol 60.
Pascal provides no implicit type conversions other than subrange to integer and integer
to real. All other type conversions are explicit (even when no action is required) and the
compiler checks type correctness.
Pascal was designed to match Wirth’s (1971) ideas of program development by stepwise re-
finement. Pascal is a kind of “fill in the blanks” language in which all programs have a similar
structure, determined by the relatively strict syntax. Programmers are expected to start with
a complete but skeletal “program” and flesh it out in a series of refinement steps, each of
which makes certain decisions and adds new details. The monolithic structure that this idea
imposes on programs is a drawback of Pascal because it prevents independent compilation of
components.
Pascal was a failure because it was too simple. Because of the perceived missing features, supersets
were developed and, inevitably, these became incompatible. The first version of “Standard Pascal”
was almost useless as a practical programming language and the Revised Standard described a
usable language but appeared only after most people had lost interest in Pascal.
Like Algol 60, Pascal missed important opportunities. The record type was a useful innovation
(although very similar to the Algol 68 struct) but allowed data only. Allowing functions in a
record declaration would have paved the way to modular and even object oriented programming.
Nevertheless, Pascal had a strong influence on many later languages. Its most important innova-
tions were probably the combination of simplicity, data type declarations, and static type checking.
Exercise 18. List some of the “missing features” of Pascal. 2
Exercise 19. It is well-known that the biggest loophole in Pascal’s type structure was the variant
record. How serious do you think this problem was? 2
3.8 Modula–2
Wirth (1982) followed Pascal with Modula–2, which inherits Pascal’s strengths and, to some ex-
tent, removes Pascal’s weaknesses. The important contribution of Modula–2 was, of course, the
introduction of modules. (Wirth’s first design, Modula, was never completed. Modula–2 was the
product of a sabbatical year in California, where Wirth worked with the designers of Mesa, another
early modular language.)
A module in Modula–2 has an interface and an implementation. The interface provides infor-
mation about the use of the module to both the programmer and the compiler. The implementation
contains the “secret” information about the module. This design has the unfortunate consequence
that some information that should be secret must be put into the interface. For example, the
compiler must know the size of the object in order to declare an instance of it. This implies that
the size must be deducible from the interface which implies, in turn, that the interface must contain
the representation of the object. (The same problem appears again in C++.)
Modula–2 provides a limited escape from this dilemma: a programmer can define an “opaque” type
with a hidden representation. In this case, the interface contains only a pointer to the instance
and the representation can be placed in the implementation module.
The important features of Modula–2 are:
Modules with separated interface and implementation descriptions (based on Mesa).
Coroutines.
3.9 C
C is a very pragmatic PL. Ritchie (1996) designed it for a particular task — systems
programming — for which it has been widely used. The enormous success of C is partly accidental.
UNIX, after Bell released it to universities, became popular, with good reason. Since UNIX
depended heavily on C, the spread of UNIX inevitably led to the spread of C.
C is based on a small number of primitive concepts. For example, arrays are defined in terms
of pointers and pointer arithmetic. This is both the strength and weakness of C. The number of
concepts is small, but C does not provide real support for arrays, strings, or boolean operations.
C is a low-level language by comparison with the other PLs discussed in this section. It is designed
to be easy to compile and to produce efficient object code. The compiler is assumed to be rather
unsophisticated (a reasonable assumption for a compiler running on a PDP-11 in the late sixties)
and in need of hints such as register. C is notable for its concise syntax. Some syntactic features
are inherited from Algol 68 (for example, += and other assignment operators) and others are unique
to C and C++ (for example, postfix and prefix ++ and --).
3.10 Ada
Ada (Whitaker 1996) represents the last major effort in procedural language design. It is a large
and complex language that combines then-known programming features with little attempt at
consolidation. It was the first widely-used language to provide full support for concurrency, with
interactions checked by the compiler, but this aspect of the language proved hard to implement.
Ada provides templates for procedures, record types, generic packages, and task types. The corre-
sponding objects are: blocks and records (representable in the language); and packages and tasks
(not representable in the language). It is not clear why four distinct mechanisms are required
(Gelernter and Jagannathan 1990). The syntactic differences suggest that the designers did not
look for similarities between these constructs. A procedure definition looks like this:
procedure procname ( parameters ) is
body
A record type looks like this:
type recordtype ( parameters ) is
body
The parameters of a record type are optional. If present, they have a different form than the
parameters of procedures.
A generic package looks like this:
generic ( parameters ) package packagename is
package description
The parameters can be types or values. For example, the template
generic
max: integer;
type element is private;
package Stack is
. . . .
might be instantiated by a declaration such as
package intStack is new Stack(20, integer)
Finally, a task template looks like this (no parameters are allowed):
task type templatename is
task description
Of course, programmers hardly notice syntactic differences of this kind: they learn the correct
incantation and recite it without thinking. But it is disturbing that the language designers
apparently did not consider possible relationships between these four kinds of declaration. Changing
the syntax would be a minor improvement, but uncovering deep semantic similarities might have a
significant impact on the language as a whole, just as the identity declaration of Algol 68 suggested
new and interesting possibilities.
Exercise 20. Propose a uniform style for Ada declarations. 2
4 The Functional Paradigm
Procedural programming is based on instructions (“do something”) but, inevitably, procedural PLs
also provide expressions (“calculate something”). The key insight of functional programming (FP)
is that everything can be done with expressions: the commands are unnecessary.
This point of view has a solid foundation in theory. Turing (1936) introduced an abstract model of
“programming”, now known as the Turing machine. Kleene (1936) and Church (1941) introduced
the theory of recursive functions. The two theories were later shown (by Kleene) to be equivalent:
each had the same computational power. Other theories, such as Post production systems, were
shown to have the same power. This important theoretical result shows that FP is not a complete
waste of time but it does not tell us whether FP is useful or practical. To decide that, we must
look at the functional programming languages (FPLs) that have actually been implemented.
Most functional languages support high order functions. Roughly, a high order function is a
function that takes another function as a parameter or returns a function. More precisely:
A zeroth order expression contains only variables and constants.
A first order expression may also contain function invocations, but the results and parameters
of functions are variables and constants (that is, zeroth order expressions).
In general, in an n-th order expression, the results and parameters of functions are (n−1)-th
order expressions.
A high order expression is an n-th order expression with n ≥ 2.
The same conventions apply in logic with “function” replaced by “function or predicate”. In
first-order logic, quantifiers can bind variables only; in a high order logic, quantifiers can bind
predicates.
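As a concrete illustration of the hierarchy (this sketch is in Python and is not from the original notes), a second order function takes a first order function as a parameter or returns one:

```python
# A sketch of the "order" hierarchy: twice is second order because its
# parameter f and its result are themselves (first order) functions.
def twice(f):
    return lambda x: f(f(x))

add3 = lambda x: x + 3      # first order: parameters and result are values
add6 = twice(add3)          # applying a high order function yields a function
```

Here twice(add3)(10) evaluates to add3(add3(10)), that is, 16.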
4.1 LISP
Anyone could learn LISP in one day, except that if they already knew FORTRAN, it
would take three days. Marvin Minsky
Functional programming was introduced in 1958 in the form of LISP by John McCarthy. The
following account of the development of LISP is based on McCarthy’s (1978) history.
The important early decisions in the design of LISP were:
to provide list processing (which already existed in languages such as Information Processing
Language (IPL) and FORTRAN List Processing Language (FLPL));
to use a prefix notation (emphasizing the operator rather than the operands of an expression);
to use the concept of “function” as widely as possible (cons for list construction; car and cdr
for extracting list components; cond for conditional, etc.);
to provide higher order functions and hence a notation for functions (based on Church’s (1941)
λ-notation);
to avoid the need for explicit erasure of unused list structures.
McCarthy (1960) wanted a language with a solid mathematical foundation and decided that recur-
sive function theory was more appropriate for this purpose than the then-popular Turing machine
model. He considered it important that LISP expressions should obey the usual mathematical laws
allowing replacement of expressions and:
Figure 1: The list structure (A (B C))
Another way to show that LISP was neater than Turing machines was to write a
universal LISP function and show that it is briefer and more comprehensible than
the description of a universal Turing machine. This was the LISP function eval [e, a],
which computes the value of a LISP expression e, the second argument a being a list
of assignments of values to variables. . . . Writing eval required inventing a notation
for representing LISP functions as LISP data, and such a notation was devised for the
purpose of the paper with no thought that it would be used to express LISP programs
in practice. (McCarthy 1978)
After the paper was written, McCarthy’s graduate student S. R. Russell noticed that eval could be
used as an interpreter for LISP and hand-coded it, thereby producing the first LISP interpreter.
Soon afterwards, Timothy Hart and Michael Levin wrote a LISP compiler in LISP; this is probably
the first instance of a compiler written in the language that it compiled.
The function application f(x, y) is written in LISP as (f x y). The function name always comes
first: a + b is written in LISP as (+ a b). All expressions are enclosed in parentheses and can be
nested to arbitrary depth.
There is a simple relationship between the text of an expression and its representation in memory.
An atom is a simple object such as a name or a number. A list is a data structure composed of
cons-cells (so called because they are constructed by the function cons); each cons-cell has two
pointers and each pointer points either to another cons-cell or to an atom. Figure 1 shows the list
structure corresponding to the expression (A (B C)). Each box represents a cons-cell. There are
two lists, each with two elements, and each terminated with NIL. The diagram is simplified in that
the atoms A, B, C, and NIL would themselves be list structures in an actual LISP system.
The function cons constructs a list from its head and tail: (cons head tail). The value of
(car list) is the head of the list and the value of (cdr list) is the tail of the list. Thus:
(car (cons head tail)) → head
(cdr (cons head tail)) → tail
The names car and cdr originated in IBM 704 hardware; they are abbreviations for “contents
of address register” (the top 18 bits of a 36-bit word) and “contents of decrement register” (the
bottom 18 bits).
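The behaviour of cons-cells can be sketched with ordinary pairs (a Python illustration, not the original LISP; NIL is modelled by None):

```python
NIL = None  # end-of-list marker, playing the role of LISP's NIL

def cons(head, tail):    # build a cons-cell from a head and a tail
    return (head, tail)

def car(cell):           # "contents of address register": the head
    return cell[0]

def cdr(cell):           # "contents of decrement register": the tail
    return cell[1]

# The structure (A (B C)) of Figure 1: two lists, each terminated by NIL.
inner = cons("B", cons("C", NIL))
lst = cons("A", cons(inner, NIL))
```

The identities in the text hold directly: car(cons(h, t)) is h and cdr(cons(h, t)) is t.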
It is easy to translate between list expressions and the corresponding data structures. There
is a function eval (mentioned in the quotation above) that evaluates a stored list expression.
Consequently, it is straightforward to build languages and systems “on top of” LISP and LISP is
often used in this way.
It is interesting to note that the close relationship between code and data in LISP mimics the von
Neumann architecture at a higher level of abstraction.
LISP was the first in a long line of functional programming (FP) languages. Its principal contri-
butions are listed below.
Names. In procedural PLs, a name denotes a storage location (value semantics). In LISP, a name
is a reference to an object, not a location (reference semantics). In the Algol sequence
int n;
n := 2;
n := 3;
the declaration int n; assigns a name to a location, or “box”, that can contain an integer.
The next two statements put different values, first 2 then 3, into that box. In the LISP
sequence
(progn
(setq x (car structure))
(setq x (cdr structure)))
x becomes a reference first to (car structure) and then to (cdr structure). The two
objects have different memory addresses. A consequence of the use of names as references to
objects is that eventually there will be objects for which there are no references: these objects
are “garbage” and must be automatically reclaimed if the interpreter is not to run out of
memory. The alternative — requiring the programmer to explicitly deallocate old cells —
would add considerable complexity to the task of writing LISP programs. Nevertheless, the
decision to include automatic garbage collection (in 1958!) was courageous and influential.
A PL in which variable names are references to objects in memory is said to have reference
semantics. All FPLs and most OOPLs have reference semantics.
Note that reference semantics is not the same as “pointers” in languages such as Pascal and
C. A pointer variable stands for a location in memory and therefore has value semantics; it
just so happens that the location is used to store the address of another object.
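Python also has reference semantics, so the setq example above can be mirrored directly (an illustration, not from the original notes): assignment rebinds the name to a different object rather than storing a value into a fixed box.

```python
# Reference semantics: a name refers to an object; assignment makes it
# refer to a different object with a different address (id).
structure = [1, 2, 3]
x = structure[0]        # like (setq x (car structure)): x refers to 1
first_ref = id(x)
x = structure[1:]       # like (setq x (cdr structure)): x refers to [2, 3]
second_ref = id(x)
```

The two objects that x refers to have different addresses, and the object no longer referenced becomes garbage, reclaimed automatically as in LISP.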
Lambda. LISP uses “lambda expressions”, based on Church’s λ-calculus, to denote functions.
For example, the function that squares its argument is written
(lambda (x) (* x x))
by analogy to Church’s f = λx . x². We can apply a lambda expression to an argument to
obtain the value of a function application. For example, the expression
((lambda (x) (* x x)) 4)
yields the value 16.
However, the lambda expression itself cannot be evaluated. Consequently, LISP had to resort
to programming tricks to make higher order functions work. For example, if we want to pass
the squaring function as an argument to another function, we must wrap it up in a “special
form” called function:
(f (function (lambda (x) (* x x))) . . . .)
Similar complexities arise when a function returns another function as a result.
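In a language where lambda expressions are ordinary values (this Python sketch is an illustration, not from the original notes), no special form is needed:

```python
# A lambda expression is itself a value: it can be named, passed as an
# argument, and applied, with no "function" wrapper.
square = lambda x: x * x

def apply_to_four(f):            # f is passed like any other argument
    return f(4)

result = apply_to_four(lambda x: x * x)
```

Here result is 16, just as ((lambda (x) (* x x)) 4) yields 16 in LISP, but without any programming tricks.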
Dynamic Scoping. Dynamic scoping was an “accidental” feature of LISP: it arose as a side-effect
of the implementation of the look-up table for variable values used by the interpreter. The
C-like program in Listing 7 illustrates the difference between static and dynamic scoping. In
Listing 7: Static and Dynamic Binding
int x = 4; // 1
void f()
{
    printf("%d", x);
}
void main()
{
    int x = 7; // 2
    f();
}
C, the variable x in the body of the function f is a use of the global variable x defined in the
first line of the program. Since the value of this variable is 4, the program prints 4. (Do not
confuse dynamic scoping with dynamic binding!)
A LISP interpreter constructs its environment as it interprets. The environment behaves like
a stack (last in, first out). The initial environment is empty, which we denote by ⟨⟩. After
interpreting the LISP equivalent of the line commented with “1”, the environment contains
the global binding for x: ⟨x = 4⟩. When the interpreter evaluates the function main, it inserts
the local x into the environment, obtaining ⟨x = 7, x = 4⟩. The interpreter then evaluates the
call f(); when it encounters x in the body of f, it uses the first value of x in the environment
and prints 7.
Although dynamic scoping is natural for an interpreter, it is inefficient for a compiler. In-
terpreters are slow anyway, and the overhead of searching a linear list for a variable value
just makes them slightly slower still. A compiler, however, has more efficient ways of ac-
cessing variables, and forcing it to maintain a linear list would be unacceptably inefficient.
Consequently, early LISP systems had an unfortunate discrepancy: the interpreters used dy-
namic scoping and the compilers used static scoping. Some programs gave one answer when
interpreted and another answer when compiled!
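The interpreter's behaviour can be sketched with an explicit environment searched front to back (a Python illustration, not an actual LISP interpreter):

```python
# Dynamic scoping: free variables are resolved against a stack-like
# environment, so f sees whichever binding of x was pushed most recently.
env = [("x", 4)]                  # global binding, line "1"

def lookup(name):
    for n, v in env:              # first match wins (last in, first out)
        if n == name:
            return v
    raise NameError(name)

def f():
    return lookup("x")            # free variable: resolved dynamically

def main():
    env.insert(0, ("x", 7))       # local binding, line "2"
    try:
        return f()                # dynamic scoping: f sees the local x
    finally:
        env.pop(0)                # leaving main removes the binding
```

Under this scheme main() returns 7, whereas a statically scoped language prints 4; after main returns, lookup("x") yields the global value 4 again.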
Exercise 21. Describe a situation in which dynamic scoping is useful. 2
Interpretation. LISP was the first major language to be interpreted. Originally, the LISP inter-
preter behaved as a calculator: it evaluated expressions entered by the user, but its internal
state did not change. It was not long before a form for defining functions was introduced
(originally called define, later changed to defun) to enable users to add their own functions
to the list of built-in functions.
A LISP program has no real structure. On paper, a program is a list of function definitions;
the functions may invoke one another with either direct or indirect recursion. At run-time, a
program is the same list of functions, translated into internal form, added to the interpreter.
The current dialect of LISP is called Common LISP (Steele et al. 1990). It is a much larger and
more complex language than the original LISP and includes many features of Scheme (described
below). Common LISP provides static scoping with dynamic scoping as an option.
Exercise 22. The LISP interpreter, written in LISP, contains expressions such as the one shown
in Listing 8. We might paraphrase this as: “if the car of the expression that we are currently
evaluating is car, the value of the expression is obtained by taking the car of the cadr (that is,
the second term) of the expression”. How much can you learn about a language by reading an
interpreter written in the language? What can you not learn? 2
Listing 8: Defining car
(cond
((eq (car expr) ’car)
(car (cadr expr)) )
. . . .
4.1.1 Basic LISP Functions
The principal — and only — data structure in classical LISP is the list. Lists are written (a b c).
Lists also represent programs: the list (f x y) stands for the function f applied to the arguments
x and y.
The following functions are provided for lists.
cons builds a list, given its head and tail.
first (originally called car) returns the head (first component) of a list.
rest (originally called cdr) returns the tail of a list.
second (originally called cadr, short for car cdr) returns the second element of a list.
null is a predicate that returns true if its argument is the empty list.
atom is a predicate that returns true if its argument is not a list — and must therefore be an
“atom”, that is, a variable name or other primitive object.
A form looks like a function but does not evaluate its argument. The important forms are:
quote takes a single argument and does not evaluate it.
cond is the conditional construct; it has the form
(cond (p1 e1) (p2 e2) ... (t e))
and works like this: if p1 is true, return e1; if p2 is true, return e2; . . . ; else return e.
def is used to define functions:
(def name (lambda parameters body ))
4.1.2 A LISP Interpreter
The following interpreter is based on McCarthy’s LISP 1.5 Evaluator from Lisp 1.5 Programmer’s
Manual by McCarthy et al., MIT Press 1962.
An environment is a list of name/value pairs. The function pairs builds an environment from
a list of names and a list of expressions.
(def pairs (lambda (names exps env)
(cond
((null names) env)
(t (cons
(cons (first names) (first exps))
(pairs (rest names) (rest exps) env) )) ) ))
The function lookup finds the value of a name in a table. The table is represented as a list of
pairs.
(def lookup (lambda (name table)
(cond
((eq name (first (first table))) (rest (first table)))
(t (lookup name (rest table))) ) ))
The heart of the interpreter is the function eval which evaluates an expression exp in an environ-
ment env.
(def eval (lambda (exp env)
(cond
((null exp) nil)
((atom exp) (lookup exp env))
((eq (first exp) (quote quote)) (second exp))
((eq (first exp) (quote cond)) (evcon (second exp) env) )
(t (apply (first exp) (evlist (rest exp) env) env)) ) ))
The function evcon is used to evaluate cond forms; it takes a list of pairs of the form (cnd exp)
and returns the value of the expression corresponding to the first true cnd.
(def evcon (lambda (tests env)
(cond
((null tests) nil)
(t
(cond
((eval (first (first tests)) env) (eval (second (first tests)) env))
(t (evcon (rest tests) env)) ) ) ) ))
The function evlist evaluates a list of expressions in an environment. It is a straightforward
recursion.
(def evlist (lambda (exps env)
(cond
((null exps) nil)
(t (cons (eval (first exps) env)
(evlist (rest exps) env) )) ) ))
The function apply applies a function fun to arguments args in the environment env. The cases
it considers are:
Built-in functions: first, second, cons, and other built-in functions not considered here.
Any other function with an atomic name: this is assumed to be a user-defined function, and
lookup is used to find the lambda form corresponding to the name.
A function which is a lambda form.
(def apply (lambda (fun args env)
(cond
((eq fun (quote first)) (first (first args)))
((eq fun (quote second)) (second (first args)))
((eq fun (quote cons)) (cons (first args) (second args)))
((atom fun) (apply (eval fun env) args env))
((eq (first fun) (quote lambda))
(eval (third fun) (pairs (second fun) args env)) ) ) ))
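Because the interpreter is so small, it can be transcribed almost line for line. The sketch below (a Python illustration, not from the original notes; Python lists stand in for LISP lists and strings for atoms) follows the surface syntax (cond (p1 e1) ...) and reproduces the dynamic binding of the original, since a lambda body is evaluated in an environment that extends the caller's.

```python
# A Python transcription of the mini-interpreter. An environment is a
# list of (name, value) pairs, searched front to back.
def lookup(name, table):
    for n, v in table:
        if n == name:
            return v
    raise NameError(name)

def pairs(names, exps, env):
    # extend env with a binding for each name
    return list(zip(names, exps)) + env

def ev(exp, env):
    if exp is None or exp == []:             # the null case
        return None
    if isinstance(exp, str):                 # atom: look up its value
        return lookup(exp, env)
    head = exp[0]
    if head == "quote":
        return exp[1]
    if head == "cond":
        return evcon(exp[1:], env)
    return apply_(head, [ev(e, env) for e in exp[1:]], env)

def evcon(tests, env):
    for cnd, e in tests:                     # first true condition wins
        if ev(cnd, env):
            return ev(e, env)
    return None

def apply_(fun, args, env):
    if fun == "first":
        return args[0][0]
    if fun == "cons":
        return [args[0]] + args[1]
    if isinstance(fun, str):                 # user-defined: find its lambda
        return apply_(lookup(fun, env), args, env)
    assert fun[0] == "lambda"                # (lambda (params) body)
    # dynamic binding: the body runs in an extension of the CALLER's env
    return ev(fun[2], pairs(fun[1], args, env))
```

For example, ev(["cons", ["quote", "a"], ["quote", ["b"]]], []) builds the list ["a", "b"].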
4.1.3 Dynamic Binding
The classical LISP interpreter was implemented while McCarthy was still designing the language.
It contains an important defect that was not discovered for some time. James Slagle defined a
function like this
(def testr (lambda (x p f u)
(cond
((p x) (f x))
((atom x) (u))
(t (testr (rest x) p f (lambda () (testr (first x) p f u)))) ) ))
and it did not give the correct result.
We use a simpler example to explain the problem. Suppose we define:
(def show (lambda ()
x ))
Calling this function generates an error because x is not defined. (The interpreter above does not
incorporate error detection and would simply fail with this function.) However, we can wrap show
inside another function:
(def try (lambda (x)
(show) ))
We then find that (try t) evaluates to t. When show is called, the binding (x.t) is in the
environment, and so that is the value that show returns.
In other words, the value of a variable in LISP depends on the dynamic behaviour of the program
(which functions have been called) rather than on the static text of the program.
We say that LISP uses dynamic binding whereas most other languages, including Scheme,
Haskell, and C++, use static binding.
Correcting the LISP interpreter to provide static binding is not difficult: it requires a slightly
more complicated data structure for environments (instead of having a simple list of name/value
pairs, we effectively build a list of activation records). However, the dynamic binding problem
was not discovered until LISP was well-entrenched: it would be almost 20 years before Guy Steele
introduced Scheme, with static binding, in 1978.
People wrote LISP compilers, but it is hard to write a compiler with dynamic binding. Conse-
quently, there were many LISP systems that provided dynamic binding during interpretation and
static binding for compiled programs!
4.2 Scheme
Scheme was designed by Guy L. Steele Jr. and Gerald Jay Sussman (1975). It is very similar to
LISP in both syntax and semantics, but it corrects some of the errors of LISP and is both simpler
and more consistent.
The starting point of Scheme was an attempt by Steele and Sussman to understand Carl Hewitt’s
theory of actors as a model of computation. The model was object oriented and influenced by
Smalltalk (see Section 6.2). Steele and Sussman implemented the actor model using a small LISP
Listing 9: Factorial with functions
(define factorial
(lambda (n) (if (= n 0)
1
(* n (factorial (- n 1))))))
Listing 10: Factorial with actors
(define actorial
(alpha (n c) (if (= n 0)
(c 1)
(actorial (- n 1) (alpha (f) (c (* f n)))))))
interpreter. The interpreter provided lexical scoping, a lambda operation for creating functions, and
an alpha operation for creating actors. For example, the factorial function could be represented
either as a function, as in Listing 9, or as an actor, as in Listing 10. Implementing the interpreter
brought an odd fact to light: the interpreter’s code for handling alpha was identical to the code
for handling lambda! This indicated that closures — the objects created by evaluating lambda —
were useful for both high order functional programming and object oriented programming (Steele
1996).
LISP ducks the question “what is a function?” It provides lambda notation for functions, but a
lambda expression can only be applied to arguments, not evaluated itself. Scheme provides an
answer to this question: the value of a function is a closure. Thus in Scheme we can write both
(define num 6)
which binds the value 6 to the name num and
(define square (lambda (x) (* x x)))
which binds the squaring function to the name square. (Scheme actually provides an abbreviated
form of this definition, to spare programmers the trouble of writing lambda all the time, but the
form shown is accepted by the Scheme compiler and we use it in these notes.)
Since the Scheme interpreter accepts a series of definitions, as in LISP, it is important to understand
the effect of the following sequence:
(define n 4)
(define f (lambda () n))
(define n 7)
(f)
The final expression calls the function f that has just been defined. The value returned is the
value of n, but which value, 4 or 7? If this sequence could be written in LISP, the result would
be 7, which is the value of n in the environment when (f) is evaluated. Scheme, however, uses
static scoping. The closure created by evaluating the definition of f includes all name bindings
in effect at the time of definition. Consequently, a Scheme interpreter yields 4 as the value of
(f). The answer to the question “what is a function?” is “a function is an expression (the body
of the function) and an environment containing the values of all variables accessible at the point
of definition”.
Closures in Scheme are ordinary values. They can be passed as arguments and returned by func-
tions. (Both are possible in LISP, but awkward because they require special forms.)
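Python closures behave like Scheme's (this sketch is an illustration, not from the original notes): a returned function carries the bindings in effect at its point of definition.

```python
# A closure is "an expression plus an environment": each call to
# make_adder creates a separate closure with its own binding of k.
def make_adder(k):
    return lambda x: x + k    # captures the binding of k statically

add4 = make_adder(4)
add7 = make_adder(7)          # a distinct closure; its k is 7
```

add4 and add7 are ordinary values: they can be stored, passed as arguments, and applied like any other.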
Listing 11: Differentiating in Scheme
(define derive (lambda (f dx)
(lambda (x)
(/ (- (f (+ x dx)) (f x)) dx))))
(define square (lambda (x) (* x x)))
(define Dsq (derive square 0.001))
Listing 12: Banking in Scheme
(define (make-account balance)
(define (withdraw amount)
(if (>= balance amount)
(sequence (set! balance (- balance amount))
balance)
"Insufficient funds"))
(define (deposit amount)
(set! balance (+ balance amount))
balance)
(define (dispatch m)
(cond
((eq? m ’withdraw) withdraw)
((eq? m ’deposit) deposit)
(else (error "Unrecognized transaction" m))))
dispatch)
Example 14: Differentiating. Differentiation is a function that maps functions to functions.
Approximately (Abelson and Sussman 1985):
Df(x) = (f(x + dx) − f(x)) / dx
We define the Scheme functions shown in Listing 11. After these definitions have been evaluated,
Dsq is, effectively, the function
f(x) = ((x + 0.001)² − x²) / 0.001 = 2x + 0.001
We can apply Dsq like this (“->” is the Scheme prompt):
-> (Dsq 3)
6.001
2
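The function derive translates directly into any language with closures; the following Python sketch (an illustration, not from the original notes) mirrors Listing 11:

```python
# derive maps a function to a function: the result is a closure over
# both f and dx that approximates the derivative of f.
def derive(f, dx):
    return lambda x: (f(x + dx) - f(x)) / dx

square = lambda x: x * x
Dsq = derive(square, 0.001)    # Dsq(x) is approximately 2x + 0.001
```

Applying Dsq to 3.0 gives approximately 6.001, as in the Scheme dialog.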
Scheme avoids the problem of incompatibility between interpretation and compilation by being
statically scoped, whether it is interpreted or compiled. The interpreter uses a more elaborate data
structure for storing values of local variables to obtain the effect of static scoping. In addition to
full support for high order functions, Scheme introduced continuations.
Although Scheme is primarily a functional language, side-effects are allowed. In particular, set!
changes the value of a variable. (The “!” is a reminder that set! has side-effects.)
Listing 13: Using the account
-> ((acc ’withdraw) 50)
50
-> ((acc ’withdraw) 100)
Insufficient funds
-> ((acc ’deposit) 100)
150
Example 15: Banking in Scheme. The function shown in Listing 12 shows how side-effects
can be used to define an object with changing state (Abelson and Sussman 1985, page 173). The
following dialog shows how banking works in Scheme. The first step is to create an account.
-> (define acc (make-account 100))
The value of acc is a closure consisting of the function dispatch together with an environment in
which balance = 100. The function dispatch takes a single argument which must be withdraw
or deposit; it returns one of the functions withdraw or deposit which, in turn, takes an amount
as argument. The value returned is the new balance. Listing 13 shows some simple applications of
acc. The quote sign (’) is required to prevent the evaluation of withdraw or deposit. 2
This example demonstrates that with higher order functions and control of state (by side-effects)
we can obtain a form of OOP. The limitation of this approach, when compared to OOPLs such as
Simula and Smalltalk (described in Section 6) is that we can define only one function at a time.
This function must be used to dispatch messages to other, local, functions.
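The make-account example depends only on closures and assignment, so it can be mirrored in Python (an illustration, not from the original notes):

```python
# An "object" as a closure over mutable state: dispatch selects one of
# the local functions, each of which shares the captured balance.
def make_account(balance):
    def withdraw(amount):
        nonlocal balance
        if balance >= amount:
            balance -= amount
            return balance
        return "Insufficient funds"
    def deposit(amount):
        nonlocal balance
        balance += amount
        return balance
    def dispatch(m):
        if m == "withdraw":
            return withdraw
        if m == "deposit":
            return deposit
        raise ValueError("Unrecognized transaction: " + m)
    return dispatch

acc = make_account(100)
```

The dialog of Listing 13 then reads acc("withdraw")(50), which returns 50, and so on; no quote sign is needed because Python strings are self-evaluating.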
4.3 SASL
SASL (St. Andrew’s Symbolic Language) was introduced by David Turner (1976). It has an Algol-
like syntax and is of interest because the compiler translates the source code into a combinator
expression which is then processed by graph reduction (Turner 1979). Turner subsequently designed
KRC (Kent Recursive Calculator) (1981) and Miranda (1985), all of which are implemented with
combinator reduction.
Combinator reduction implements call by name (the default method for passing parameters in
Algol 60) but with an optimization. If the parameter is not needed in the function, it is not
evaluated, as in Algol 60. If it is needed one or more times, it is evaluated exactly once. Since
SASL expressions do not have side-effects, evaluating an expression more than once will always
give the same result. Thus combinator reduction is (in this sense) the most efficient way to pass
parameters to functions. Evaluating an expression only when it is needed, and never more than
once, is called call by need or lazy evaluation.
The following examples use SASL notation. The expression x::xs denotes a list with first element
(head) x and remaining elements (tail) xs. The definition
nums(n) = n::nums(n+1)
apparently defines an infinite list:
nums(0) = 0::nums(1) = 0::1::nums(2) = . . . .
The function second returns the second element of a list. In SASL, we can define it like this:
second (x::y::xs) = y
Although nums(0) is an “infinite” list, we can find its second element in SASL:
second(nums(0)) = second(0::nums(1)) = second(0::1::nums(2)) = 1
This works because SASL evaluates a parameter only when its value is needed for the calculation
to proceed. In this example, as soon as the argument of second is in the form 0::1::. . . ., the
required result is known.
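The same effect can be sketched with Python generators (an illustration, not from the original notes), which also produce elements only on demand:

```python
# nums(0) denotes an "infinite" list, but elements are computed lazily,
# only when the consumer forces them.
def nums(n):
    while True:
        yield n
        n += 1

def second(xs):
    next(xs)              # discard the head
    return next(xs)       # force only the second element
```

second(nums(0)) terminates with the value 1 even though the list is conceptually infinite, because only two elements are ever evaluated.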
Call by need is the only method of passing arguments in SASL but it occurs as a special case in
other languages. If we consider if as a function, so that the expression
if P then X else Y
is a fancy way of writing if(P,X,Y), then we see that if must use call by need for its second and
third arguments. If it did not, we would not be able to write expressions such as
if x = 0 then 1 else 1/x
In C, the operators && (AND) and || (OR) are defined as follows:
X && Y ≡ if X then Y else false
X || Y ≡ if X then true else Y
These definitions provide the effect of lazy evaluation and allow us to write expressions such as
if (p != NULL && p->f > 0) . . . .
4.4 SML
SML (Milner, Tofte, and Harper 1990; Milner and Tofte 1991) was designed as a “metalanguage”
(ML) for reasoning about programs as part of the Edinburgh Logic for Computable Functions
(LCF) project. The language survived after the rest of the project was abandoned and became
“standard” ML, or SML. The distinguishing feature of SML is that it is statically typed in the
sense of Section B.3 and that most types can be inferred by the compiler.
In the following example, the programmer defines the factorial function and SML responds with its
type. The programmer then tests the factorial function with argument 6; SML assigns the result
to the variable it, which can be used in the next interaction if desired. SML is run interactively,
and prompts with “-”.
- fun fac n = if n = 0 then 1 else n * fac(n-1);
val fac = fn : int -> int
- fac 6;
val it = 720 : int
SML also allows function declaration by cases, as in the following alternative declaration of the
factorial function:
- fun fac 0 = 1
= | fac n = n * fac(n-1);
val fac = fn : int -> int
- fac 6;
val it = 720 : int
Since SML recognizes that the first line of this declaration is incomplete, it changes the prompt to
“=” on the second line. The vertical bar “|” indicates that we are declaring another “case” of the
declaration.
Each case of a declaration by cases includes a pattern. In the declaration of fac, there are
two patterns. The first, 0, is a constant pattern, and matches only itself. The second, \tt n,
Listing 14: Function composition
- infix o;
- fun (f o g) x = g (f x);
val o = fn : (’a -> ’b) * (’b -> ’c) -> ’a -> ’c
- val quad = sq o sq;
val quad = fn : real -> real
- quad 3.0;
val it = 81.0 : real
Listing 15: Finding factors
- fun hasfactor f n = n mod f = 0;
val hasfactor = fn : int -> int -> bool
- hasfactor 3 9;
val it = true : bool
is a variable pattern, and matches any value of the appropriate type. Note that the definition
fun sq x = x * x; would fail because SML cannot decide whether the type of x is int or real.
- fun sq x:real = x * x;
val sq = fn : real -> real
- sq 17.0;
val it = 289.0 : real
We can pass functions as arguments to other functions. The function o (intended to resemble the
small circle that mathematicians use to denote functional composition) is built-in, but even if it
wasn’t, we could easily declare it and use it to build the fourth power function, as in Listing 14. The
symbols ’a, ’b, and ’c are type names; they indicate that SML has recognized o as a polymorphic
function.
The function hasfactor defined in Listing 15 returns true if its first argument is a factor of
its second argument. All functions in SML have exactly one argument. It might appear that
hasfactor has two arguments, but this is not the case. The declaration of hasfactor introduces
two functions, as shown in Listing 16. Functions like hasfactor take their arguments one at a
time. Applying the first argument, as in hasfactor 2, yields a new function. The trick of applying
one argument at a time is called “currying”, after the American logician Haskell Curry. It may be
helpful to consider the types involved:
hasfactor : int -> int -> bool
hasfactor 2 : int -> bool
hasfactor 2 6 : bool
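Currying is easily imitated in a language with closures; the Python sketch below (an illustration, not from the original notes) gives hasfactor the same one-argument-at-a-time shape as the SML types:

```python
# hasfactor takes its arguments one at a time: applying the first
# argument yields a new function, as in the SML types above.
def hasfactor(f):
    return lambda n: n % f == 0    # hasfactor f : int -> bool

even = hasfactor(2)                # partial application, as in Listing 16
```

hasfactor(3)(9) is True, and even(6) is True, mirroring the SML dialog.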
The following brief discussion, adapted from (Åke Wikström 1987), shows how functions can be
used to build a programmer’s toolkit. The functions here are for list manipulation, which is a
widely used example but not the only way in which a FPL can be used. We start with a list
generator, defined as an infix operator.
Listing 16: A function with one argument
- val even = hasfactor 2;
val even = fn : int -> bool;
- even 6;
val it = true : bool
Listing 17: Sums and products
- fun sum [] = 0
= | sum (x::xs) = x + sum xs;
val sum = fn : int list -> int
- fun prod [] = 1
= | prod (x::xs) = x * prod xs;
val prod = fn : int list -> int
- sum (1 -- 5);
val it = 15 : int
- prod (1 -- 5);
val it = 120 : int
Listing 18: Using reduce
- fun sum xs = reduce add 0 xs;
val sum = fn : int list -> int
- fun prod xs = reduce mul 1 xs;
val prod = fn : int list -> int
- infix --;
- fun (m -- n) = if m < n then m :: (m+1 -- n) else [];
val -- = fn : int * int -> int list
- 1 -- 5;
val it = [1,2,3,4,5] : int list
The functions sum and prod in Listing 17 compute the sum and product, respectively, of a list of
integers. We note that sum and prod have a similar form. This suggests that we can abstract the
common features into a function reduce that takes a binary function, a value for the empty list,
and a list. We can use reduce to obtain one-line definitions of sum and prod, as in Listing 18. The
idea of processing a list by recursion has been captured in the definition of reduce.
- fun reduce f u [] = u
= | reduce f u (x::xs) = f x (reduce f u xs);
val reduce = fn : (’a -> ’b -> ’b) -> ’b -> ’a list -> ’b
We can also define a sorting function as
- val sort = reduce insert nil;
val sort = fn : int list -> int list
where insert is the function that inserts a number into an ordered list, preserving the ordering:
- fun insert (x:int) [] = x::[]
| insert x (y::ys) = if x <= y then x::y::ys else y::insert x ys;
val insert = fn : int -> int list -> int list
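The whole toolkit (reduce as a right fold, sum, prod, insert, and sort) can be transcribed into Python (an illustration, not from the original notes):

```python
# reduce is a right fold: it replaces the list constructors with f and
# the empty list with u, exactly as in the SML definition.
def reduce(f, u, xs):
    if not xs:
        return u
    return f(xs[0], reduce(f, u, xs[1:]))

def sum_(xs):
    return reduce(lambda x, acc: x + acc, 0, xs)

def prod(xs):
    return reduce(lambda x, acc: x * acc, 1, xs)

def insert(x, ys):
    # insert x into the ordered list ys, preserving the ordering
    if not ys:
        return [x]
    if x <= ys[0]:
        return [x] + ys
    return [ys[0]] + insert(x, ys[1:])

def sort(xs):
    return reduce(insert, [], xs)
```

As in the SML version, sum_ of the list 1..5 is 15, prod of it is 120, and sort is insertion sort obtained for free from reduce.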
4.5 Other Functional Languages
We have not discussed FP, the notation introduced by Backus (1978). Although Backus’s speech
(and subsequent publication) turned many people in the direction of functional programming, his
ideas for the design of FPLs were not widely adopted.
Since 1978, FPLs have split into two streams: those that provide mainly “eager evaluation” (that
is, call by value), following LISP and Scheme, and those that provide mainly “lazy evaluation”,
following SASL. The most important language in the first category is still SML, while Haskell
appears to be on the way to becoming the dominant FPL for teaching and other applications.
5 Haskell
COMP 348 is about the principles of programming languages. In order to understand the prin-
ciples, you need to know more than one programming language. Even this is not enough: you
really need to know more than one style of programming. Accordingly, the course begins with an
introduction to a language that is quite different to languages that you may already be familiar
with, such as C++, Java, and Basic.
Haskell is a functional programming language in the sense of Section 4. Although it has many
of the features that we associate with other languages, such as variables, expressions, and data
structures, its foundation is the mathematical concept of a function.
You can find information about Haskell at www.haskell.org. Haskell gets its name from the
American logician Haskell Curry (1900–82 — see http://www.haskell.org/bio.html).
5.1 Getting Started
There are several Haskell compilers and all of them are free. One of the first Haskell implementa-
tions was called “Gofer”; we will be using a system derived from Gofer and called HUGS (Haskell
Users’ Gofer System). At Concordia, Hugs has been installed on both Solaris8 machines and
linux hosts in pkg/hugs98. Executables are in /site/bin and can be run without any special
preparation. Documentation is in pkg/hugs98/docs.
If you want to run Hugs on your own PC, download it from http://www.haskell.org/hugs;
installation is straightforward.
Hugs provides an interactive programming environment. It is based on a “read/eval/print” loop:
the system reads an expression, evaluates it, and prints the value. Figure 2 shows Hugs working out
2+2. The prompt Prelude> indicates that the only module that has been loaded is the “prelude”.
As well as expressions, the environment recognizes various commands with the prefix “:”. For
instance, if you enter :browse (or :b for short), Hugs will display the functions and structures
defined in the current environment.
Haskell computes with integers and floating-point numbers. Integers have unlimited size, and
conversions are performed as required.
Prelude> 1234567890987654321 / 1234567
1.0e+012 :: Double
Prelude> 1234567890987654321 ‘div‘ 1234567
1000000721700 :: Integer
Strings are built into Haskell. A string literal is enclosed in double quotes, as in C++. Haskell uses
the same escape sequences as C++ (e.g., '\n', etc.). However, Haskell strings are not terminated by
'\0'. The infix operator ++ concatenates strings.
Prelude> "Hallo" ++ " " ++ "everyone!"
"Hallo everyone!" :: [Char]
The basic data structure of almost all functional programming languages, including Haskell, is
the list. In Haskell, lists are enclosed in square brackets with items separated by commas. The
concatenation operator for lists is ++. This is the same as the concatenation operator for strings
— Haskell considers a string to be simply a list of characters.
Prelude> [1,2,3] ++ [4,5,6]
[1,2,3,4,5,6] :: [Integer]
The prelude provides a number of basic operations on lists.
__ __ __ __ ____ ___ _____________________________________________
|| || || || || || ||__ Hugs 98: Based on the Haskell 98 standard
||___|| ||__|| ||__|| __|| Copyright (c) 1994-2001
||---|| ___|| World Wide Web: http://haskell.org/hugs
|| || Report bugs to: hugs-bugs@haskell.org
|| || Version: December 2001 _____________________________________________
Haskell 98 mode: Restart with command line option -98 to enable extensions
Reading file "C:\Program Files\Hugs98\lib\Prelude.hs":
Hugs session for:
C:\Program Files\Hugs98\lib\Prelude.hs
Type :? for help
Prelude> 2+2
4 :: Integer
Prelude>
Figure 2: A short session with Hugs
Prelude> [1..20]
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20] :: [Integer]
Prelude> sum [1..20]
210 :: Integer
Prelude> product [1..20]
2432902008176640000 :: Integer
Prelude> take 5 [1..20]
[1,2,3,4,5] :: [Integer]
Prelude> drop 10 [1..20]
[11,12,13,14,15,16,17,18,19,20] :: [Integer]
Prelude> filter even [1..20]
[2,4,6,8,10,12,14,16,18,20] :: [Integer]
Prelude> take 5 "Peter Piper pickled a pepper"
"Peter" :: [Char]
Evaluating expressions is fun for a few minutes, but it quickly gets boring. To do anything inter-
esting, you have to write your own programs — which, in Haskell, means mainly writing functions
(although later we will also introduce new data types).
The environment does not allow you to introduce any new definitions:
Prelude> n = 99
ERROR - Syntax error in input (unexpected ‘=’)
Fortunately, it does allow you to read definitions from a file. The command is :load and it is followed
by a path name. Here is how the definitions for this lecture are loaded. Note that the prompt
changes to reflect the newly loaded module:
Prelude> :load "e:/courses/comp348/src/lec1.hs"
Reading file "e:/courses/comp348/src/lec1.hs":
Command                   Brief description                                      Note
:?                        List all Hugs commands
:! ⟨command⟩              Execute a shell command
:quit                     Quit the current Hugs session
:load ⟨filename⟩ ...      Load Haskell source from a file                        (1)
:also ⟨filename⟩ ...      Load additional files                                  (1)
:reload                   Repeat last load command                               (1)
:browse ⟨module⟩ ...      Display functions exported by module(s)
:set ⟨options⟩            Set flags and environment parameters (see Figure 4)
:cd ⟨directory⟩           Change working directory
:edit ⟨file⟩              Start the editor                                       (2)
:find ⟨name⟩              Start the editor with the cursor on the object named   (2)
:gc                       Force garbage collection
:info ⟨name⟩ ...          Display information about files, classes, etc.
:module ⟨module⟩          Change module
:names ⟨pattern⟩          Find names that match the pattern                      (2)
:project ⟨project file⟩   Load a project (obsolete)                              (3)
:type ⟨expression⟩        Print the type of ⟨expression⟩ without evaluating it
:version                  Display Hugs version
Figure 3: Hugs commands
Hugs session for:
C:\Program Files\Hugs98\lib\Prelude.hs
e:/courses/comp348/src/lec1.hs
The file lec1.hs looks like this (the “....” indicates material that is discussed below):
module Lec1 where
fc1 n = if n == 0 then 1 else n * fc1(n - 1)
....
This code introduces a Haskell module called Lec1 that contains a number of definitions. The
first definition introduces a function called fc1.
In the examples that follow, we will show Haskell code as it appears in the source file. When we
need to show the effect of evaluating an expression, we will include the prompt, as in the examples
above.
5.1.1 Hugs Commands
Before describing the Haskell language, we digress briefly to explain the commands provided by the
Hugs environment. Figure 3 provides a brief summary of each command. Note that any command
can be abbreviated: for example, you can enter :b instead of :browse. Figure 4 shows options
for the :set command; these options can also be specified on the command line. The WinHugs
environment has icons for most of these commands.
Notes for Figure 3
±s Print # of reductions and cells after evaluation
±t Print type after evaluation
±f Terminate evaluation after first error
±g Print cells recovered by garbage collector
±l Use literate modules as default
±e Warn about errors in literate modules
±. Print dots to show progress (not recommended!)
±q Print nothing to show progress
±w Always show which modules are loaded
±k Show “kind” errors in full
±u Use show to display results
±i Chase imports when loading modules
p⟨string⟩   Set prompt to ⟨string⟩
r⟨string⟩   Set repeat last expression string to ⟨string⟩
P⟨string⟩   Set search path for modules to ⟨string⟩
E⟨string⟩   Use editor setting given by ⟨string⟩
F⟨string⟩   Set preprocessor filter to ⟨string⟩
h⟨num⟩      Set heap size to ⟨num⟩ bytes (has no effect on current session)
c⟨num⟩      Set constraint cutoff limit to ⟨num⟩
Figure 4: Parameters for :set. The symbol ± indicates a toggle: +x turns x on and -x turns it
off. The other commands take a string or a number as argument, as shown.
1. :load removes all definitions (except those belonging to the Prelude) and loads new defini-
tions from the named file(s). :also is similar but it does not remove previous definitions.
:reload repeats the previous :load command.
One easy way to develop a Hugs program is to:
• Write the first version in a file fun.hs
• Leaving the editor running, load this file into Hugs and try the functions: :l fun.hs
• Make corrections in the editor window
• Reload the file in the Hugs window: :r
2. The editor commands assume that Hugs has been linked to an editor. For example, you may
like to use emacs with Hugs. :edit suspends Hugs and starts the editor; when you close
the editor, Hugs resumes. :find is similar, but it positions the editor cursor at the chosen
object.
The default editor for Hugs in Windows is notepad. You may find it more convenient to ignore
these commands and simply run Hugs and your favourite editor as independent processes.
In this case, all you have to do is save the corrected file in the editor window and enter :r in
Hugs to reload the modified source code.
3. The project command reads a project file containing information about the files that Hugs
should load. Recent versions of Hugs use a smarter method of loading projects, called “import
chasing”, that makes the :project command unnecessary in most cases.
You can obtain the information in Figure 4 by entering :set without any arguments. The output
also includes the current settings which, for a Windows session, will look something like this:
Current settings: +tfewuiAR -sgl.qQkI -h250000 -p"%s> " -r$$ -c40
Search path : -P{Hugs}\lib;{Hugs}\lib\hugs;{Hugs}\lib\exts;{Hugs}\lib\win32
Project Path :
Editor setting : -EC:\WINNT\notepad.exe
Preprocessor : -F
Compatibility : Haskell 98 (+98)
5.1.2 Literate Haskell
Donald Knuth proposed the idea that programs should be readable documents with detailed ex-
planations and commentary built in, rather than just code listings with occasional — and often
useless — comments. A program written in this way is called a literate program and the art of
writing such programs is called literate programming.
The first step towards literate programming is to make “comments” very easy to write and read.
A consequence of this step is that the code itself becomes slightly harder to write and read. The
overhead for code in literate Haskell (at least the Hugs version) is relatively light. Here are the
rules:
• Literate Haskell source files use the extension .lhs instead of .hs.
• A line beginning with > is interpreted as Haskell code.
• The lines before and after lines containing code must be either blank or lines containing code
(but see the e switch below).
• Any other lines are ignored by the compiler.
The check for blank lines can be suppressed by turning off the e switch in Hugs. Here’s an example
to illustrate how this works. Suppose that the source file vector.lhs contains this text:
Vector is an instance of the type class Show.
This enables easy conversion of vectors to strings.
> instance Show Vector where
>   show (V x y z) = "(" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ ")"
By default, the e switch in the compiler is on (as if we had entered :set +e). When Hugs loads
this file, it will report:
Prelude> :l e:/courses/comp348/src/vector.lhs
Reading file "e:/courses/comp348/src/vector.lhs":
Parsing
ERROR "e:/courses/comp348/src/vector.lhs":14 - Program line next to comment
There are two ways to fix this problem. One is to turn off the e switch (its default setting is +e):
Prelude> :set -e
and then reload the source file — it will load without errors. The other way is to put blank lines
in the source file:
Vector is an instance of the type class Show.
This enables easy conversion of vectors to strings.

> instance Show Vector where
>   show (V x y z) = "(" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ ")"
In general, it is a good idea to put blank lines in the source file anyway, because they make the
code stand out more clearly.
Figure 5 shows a small but complete example of a literate Haskell program.
5.1.3 Defining Functions
Since Haskell is a functional language, defining functions is fundamental and there are, in fact,
several ways to define functions. We will start by considering the favourite function of functional
programmers, the factorial function:
0! = 1
n! = 1 × 2 × 3 × · · · × n for n ≥ 1
The first point to note is that, since everything is a value in Haskell, there is no need for a return
statement: you write just the value that you want returned. Our first definition of the factorial
function uses recursion:
fc1 n = if n == 0 then 1 else n * fc1(n - 1)
This definition works fine, but it is not the preferred style for Haskell. Many functions, including
this one, are defined by cases. The cases for factorial are n = 0 and n > 0. Haskell has a case
expression (analogous to switch in C++ and Java) and we use it here for the second definition of
factorial:
fc2 n = case n of
0 -> 1
n -> n * fc2(n - 1)
This is the first definition that is written on more than one line. It can be written on one line but, to
make the syntax legal, there must be a semicolon between the two case clauses (see Section 5.1.4):
fc2 n = case n of 0 -> 1 ; n -> n * fc2(n - 1)
Even this, however, is not idiomatic Haskell. The preferred form of definition is a set of equa-
tions:
fc3 0 = 1
fc3 n = n * fc3(n - 1)
Although we have written these equations together, they are in fact independent assertions for
Haskell. If we wrote the first but not the second, fc3 0 would evaluate correctly to 1, but fc3 4
would cause an error.
There are also “closed form” expressions for some functions. We can compute n! by constructing
the list [1,2,3,...,n] and then using the Prelude function product to multiply all of the list
items:
fc4 n = product [1..n]
This works correctly when n = 0 because the “product” of the empty list is defined to be 1:
Lec1> product []
Author: Peter Grogono
Date: 5 October 2003
Description: Example of a Literate Haskell source file

> module Vector where

This is a module for vectors in 3D Euclidean space.
We define V to be the constructor for vectors.

> data Vector = V Double Double Double

Vector is an instance of the type class Show.
This enables easy conversion of vectors to strings.

> instance Show Vector where
>   show (V x y z) = "(" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ ")"

Vector is an equality type. Vectors are equal if all of their components
are equal. It might be an improvement to consider vectors to be equal if
their components are approximately equal.

> instance Eq Vector where
>   V x y z == V x' y' z' = x == x' && y == y' && z == z'

Vector is a numeric class. The following definitions provide sums,
differences, and outer (cross) products of vectors.

> instance Num Vector where
>   V x y z + V x' y' z' = V (x + x') (y + y') (z + z')
>   V x y z - V x' y' z' = V (x - x') (y - y') (z - z')
>   V x y z * V x' y' z' =
>     V (y * z' - y' * z) (z * x' - z' * x) (x * y' - x' * y)
>   negate (V x y z) = V (negate x) (negate y) (negate z)

A vector can be scaled by a real number. Haskell does not allow us to
overload '*' and '/' for vector/scalar operations.

> scale (V x y z) d = V (x * d) (y * d) (z * d)

These auxiliary functions define the dot product and length of a vector.
The expression 'norm v' yields a unit vector with the same direction as v.

> dot (V x y z) (V x' y' z') = x * x' + y * y' + z * z'
> len v = sqrt (dot v v)
> norm v = if l == 0
>          then error "Norm of zero vector"
>          else scale v (1/l)
>   where l = len v

Figure 5: A Literate Haskell program
1 :: Integer
To illustrate some other styles of function definition, we will use an even simpler function: inc
adds 1 to its argument. The “obvious” definition is a single equation:
inc1 n = n + 1
Purists have a problem with this definition: they would prefer definitions to have the form x = e,
in which x is the name of the thing being defined and e is the defining expression. We can write
the definition in this way in Haskell:
inc2 = \n -> n + 1
We can read this definition as “inc2 is the function that maps n to n + 1”. The backslash
indicates that n is a parameter and the “arrow” -> links the parameter to the body of the function.
Backslash looks a bit like λ, and the Haskell notation suggests the way in which the function is
written in λ-calculus: λn. n + 1.
The expression \n -> n + 1 can be used alone: it is an anonymous function.
Lec1> (\n -> n + 1) 7
8 :: Integer
There is another way of writing this anonymous function. Its form is ? + 1, in which ? stands for
the unknown parameter. Putting parentheses around +1 makes this expression into a function:
Lec1> (+1) 6
7 :: Integer
In general, given a binary operator, we can create functions from it by giving it either a left or a
right operand. For example, (/3) is a function that divides its argument by 3 and (3/) is a function
that divides its argument into 3:
Lec1> (/3) 9
3.0 :: Double
Lec1> (12/) 5
2.4 :: Double
This provides us with another definition of inc:
inc3 = (+1)
Operators with one argument missing are called sections. The general rules are:
(op expr) is equivalent to \x -> (x op expr)
(expr op) is equivalent to \x -> (expr op x)
There is an important exception to this rule: (-1) is not a section: it is just −1. However,
(subtract expr) is a section equivalent to \x -> x - expr.
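As a quick check of these rules, here is a small sketch (the function names are ours, not from the lecture) defining functions as sections:

```haskell
-- Sections: a binary operator with one operand supplied.
halve :: Double -> Double
halve = (/ 2)          -- right section: \x -> x / 2

recipOf :: Double -> Double
recipOf = (1 /)        -- left section: \x -> 1 / x

minus1 :: Integer -> Integer
minus1 = subtract 1    -- (-1) would be the literal -1, not a section
```

Loaded into Hugs or GHCi, halve 8 gives 4.0 and minus1 10 gives 9.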
So far, all of our functions have had exactly one argument. In fact, all Haskell functions take
exactly one argument. Nevertheless, we can write definitions such as
add x y = x + y
and use the functions defined:
Lec1> add 347589 23874
371463 :: Integer
How do we account for this? The function add is actually a higher-order function: it expects
one argument and returns a function that expects the other argument. To see this, define
inc4 = add 1
and try it:
Lec1> inc4 5
6 :: Integer
Thus add 1 is “the function that adds 1 to its argument” and add 99 is “the function that adds
99 to its argument”, and so on. Haskell interprets the expression add 2 3 as “apply the function
add 2 to the argument 3”.
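The following sketch (our own names) makes this currying behaviour concrete: each partial application of add is itself a useful function:

```haskell
add :: Integer -> Integer -> Integer
add x y = x + y

inc4 :: Integer -> Integer
inc4 = add 1            -- partial application: waits for the second argument

addTen :: [Integer] -> [Integer]
addTen = map (add 10)   -- partial applications combine nicely with map
```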
5.1.4 Syntax
Syntax conventions for functional programming are different from “traditional” programming
styles. You will have noticed, for example, that we write inc n rather than inc(n). Here are
some of the rules.
• The “blank” operator denotes function application. Thus f x means “the result of applying the
  function f to the argument x”. The technical name for the blank operator is juxtaposition.
  Juxtaposition binds more tightly than any other operation.
• Because application binds strongly, we must put parentheses around arguments that are not
  simple. For example, the Haskell expression f n+1 denotes f(n) + 1; if we want the effect of
  f(n + 1), we must write f(n+1).
• Haskell interprets f x y as ((f x) y). First, the function f is applied to the argument x;
  then the result, which must be a function, is applied to the argument y.
• The application f(g(x)) applies g to x and then applies f to the result. We can write this in
  Haskell either as f(g(x)) or as f (g x).
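A tiny experiment (f is a made-up function) confirms how tightly application binds:

```haskell
f :: Integer -> Integer
f x = 10 * x

a :: Integer
a = f 2 + 1     -- parses as (f 2) + 1, i.e. 21

b :: Integer
b = f (2 + 1)   -- parentheses force the argument, i.e. 30
```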
Compared to programs in most other languages, Haskell programs are remarkably free from “syn-
tactic clutter” — braces, semicolons, and such. The trade-off is that Haskell programmers must
respect layout rules. Unlike C++, for example, in Haskell indentation and line breaks make a
difference. The rules are simple and intuitive and, if you use existing source code as a model, you
may never have to learn them explicitly.
It is important to know that the rules exist, however, because otherwise Haskell’s error messages
can be quite confusing. For example, inserting this definition into lec1.hs
fc6 n = if
n == 0 then 1 else n * fc6(n - 1)
gives this error:
Reading file "e:/courses/comp348/src/lec1.hs":
Parsing
ERROR "e:/courses/comp348/src/lec1.hs":27 -
Syntax error in expression (unexpected ‘;’, possibly due to bad layout)
Haskell allows semicolons but does not require them. However, the parser inserts a semicolon at
the end of every line. Consequently, if you break a line in the wrong place, you get the mysterious
diagnostic “unexpected ‘;’ ”.
There’s nothing wrong with line breaks, provided that they occur in sensible places. The following
definition is fine:
fc6 n =
if n == 0
then 1
else n * fc6(n - 1)
We have already seen the Haskell case expression. Cases must appear on separate lines:
case e of
p1 -> e1
p2 -> e2
p3 -> e3
or separated by semicolons:
case e of p1 -> e1 ; p2 -> e2 ; p3 -> e3
There are a few other Haskell constructs for which layout is important, and we will state the rules
when we introduce the constructs.
5.1.5 Tracing
Debugging functional programs is difficult. One problem is that we cannot interleave print statements
with other statements, because functional programs don’t have statements. Why? Because
statements have side-effects and functional programs are not supposed to have side-effects.
The Hugs Trace module provides a way around the problem of no side-effects. It defines a function
trace with two arguments: a string and an expression. When trace is called, it prints the message
as a side-effect and returns the expression.
To use the Trace module, write
import Trace
at the beginning of your program. Then replace any expression e by
(trace (s) (e))
in which s is a string. The parentheses are not always necessary (depending on the context) but
they do no harm.
Since trace returns the value of e, the meaning of your program should not be affected. However,
you will be able to monitor its progress by looking at what trace prints.
Here is a demonstration of Trace in use. We import the Trace module and define our old friend,
the factorial function. We keep the original version for later use and define another version with a
call to trace.
module Demo where
import Trace
{-
fac 0 = 1
fac n = n * fac (n - 1)
-}
fac 0 = 1
fac n = trace ("fac " ++ show n ++ "\n") (n * fac (n - 1))
Calling the traced version of fac gives:
Demo> fac 6
fac 6
fac 5
fac 4
fac 3
fac 2
fac 1
720 :: Integer
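In GHC, the same facility is provided by the standard module Debug.Trace; the sketch below follows the pattern above. The trace output is a side-effect and does not change the value returned:

```haskell
import Debug.Trace (trace)

-- Factorial instrumented with trace: each call reports its argument,
-- but the result is still the ordinary factorial.
fac :: Integer -> Integer
fac 0 = 1
fac n = trace ("fac " ++ show n) (n * fac (n - 1))
```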
5.2 Details
5.2.1 Lists
We have already seen (in Section 5) that Haskell supports lists. Here are the basic built-in opera-
tions for lists:
• Lists are enclosed in square brackets: [...].
• The empty list is [].
• Items within lists are separated by commas: [1,2,3].
• A list may contain items of any type, but all of the items in a given list must have the same
  type (note the use of -- to separate comments from code):
Lec1> [1,2,3] -- integer list - OK
[1,2,3] :: [Integer]
Lec1> ["abc","def"] -- string list - OK
["abc","def"] :: [[Char]]
Lec1> ['a','b','c'] -- a character list is a string
"abc" :: [Char]
Lec1> [1, 1.5] -- integers are promoted if necessary
[1.0,1.5] :: [Double]
Lec1> [1,'a'] -- mixed types - BAD!
ERROR - Illegal Haskell 98 class constraint in inferred type
*** Expression : [1,'a']
*** Type : Num Char => [Char]
• The infix operator : builds lists: it takes an item on the left and a list on the right. The
  operator : is right-associative, so parentheses are not required in the second example below:
Lec1> 1:[]
[1] :: [Integer]
Lec1> 1:2:[]
[1,2] :: [Integer]
• A very common idiom in Haskell programming is x:xs (read “x followed by xes”). This is a
  list consisting of an item x and a list xs. The initial letter may vary: for example, a:as,
  y:ys, and so on.
• The expression [x..y] constructs a list, provided that x and y are members of the same
  enumerated type:
Lec1> [2..4]
[2,3,4] :: [Integer]
Lec1> [3,6..20]
[3,6,9,12,15,18] :: [Integer]
Lec1> [’a’..’z’]
"abcdefghijklmnopqrstuvwxyz" :: [Char]
Lec1> [5..2]
[] :: [Integer]
• The infix operator ++ concatenates lists.
There are functions for extracting the first item of a list (head) and the remaining items of the list
(tail) but we don’t often need them. The reason is that we can define list functions by using the
constructors as patterns. Most function definitions need only two cases: what is the result for an
empty list, []? and what is the result for a non-empty list x:xs?
For example, to find the length of a list, we define:
len [] = 0
len (x:xs) = 1 + len xs
The parentheses are required: since juxtaposition has higher precedence than colon, Haskell would
parse len x:xs as (len x):xs and report a type error.
If the append operator ++ did not exist, we could append lists using our own function app defined
by
app [] ys = ys
app (x:xs) ys = x:app xs ys
Any Haskell function with two arguments¹ can be used as an infix operator by putting back quotes
(grave accents) around it. The second example below demonstrates this. The third example shows
what happens if we try to build a list with mixed types.
what happens if we try to build a list with mixed types.
Lec1> app [1,2,3] [4,5,6]
[1,2,3,4,5,6] :: [Integer]
Lec1> [1,2,3] ‘app‘ [4,5,6]
[1,2,3,4,5,6] :: [Integer]
Lec1> app ['a','b'] [1,2]
ERROR - Illegal Haskell 98 class constraint in inferred type
*** Expression : app ['a','b'] [1,2]
*** Type : Num Char => [Char]
A common operation on lists is to select items that satisfy a predicate. For example, to obtain the
even items in a list of integers, we could use the Prelude function even:
evens [] = []
evens (x:xs) = if even x then x:evens xs else evens xs
¹ Of course, it hasn’t really got two arguments — see Section 5.1.3.
In functional programming, however, we usually want to generalize, using functions to do so.
The function filt takes a predicate p and a list, and returns the items in the list that satisfy
the predicate. (A better name would be filter but a function with this name is defined in the
Prelude: it does the same as filt.)
filt p [] = []
filt p (x:xs) = if p x then x:filt p xs else filt p xs
Now we can write:
Lec1> evens [1..20]
[2,4,6,8,10,12,14,16,18,20] :: [Integer]
Lec1> filt even [1..20]
[2,4,6,8,10,12,14,16,18,20] :: [Integer]
Using filt, we can write a better definition for evens:
evens = filt even
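To make the idea self-contained, here are filt and evens again; since filt behaves like the Prelude’s filter, a section works as a predicate too:

```haskell
-- Keep the items of a list that satisfy a predicate.
filt :: (a -> Bool) -> [a] -> [a]
filt _ []     = []
filt p (x:xs) = if p x then x : filt p xs else filt p xs

evens :: [Integer] -> [Integer]
evens = filt even

small :: [Integer] -> [Integer]
small = filt (< 5)    -- a section used as the predicate
```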
5.2.2 Lazy Evaluation
Haskell evaluates expressions lazily: this means that it never evaluates an expression until the
value is actually needed. It is perfectly alright to introduce a non-terminating definition into a
Haskell program, but things go wrong when we try to use it. Here is a non-terminating definition
loop = loop
and here is an attempt to use it (the message ¦Interrupted!¦ appears when the user interrupts
the program, for example, by entering ctrl-C):
Lec1> loop + 3
¦Interrupted!¦
Lec1>
There are other things that fail:
Lec1> 1/0
Program error: primDivDouble 1.0 0.0
But now define a constant function that doesn’t use its argument:
one n = 1
Since Haskell never evaluates the argument of this function, we can give it any argument at all:
Lec1> one 77
1 :: Integer
Lec1> one (1/0)
1 :: Integer
Lec1> one loop
1 :: Integer
Lazy evaluation allows us to work with infinite lists. The lists do not have an infinite number of
items: what “infinite” means in this context is “unbounded” — the list appears to have as many
items as we need. For example, if we define
from n = n:from (n+1)
nats = from 1
then from n yields an unbounded list starting at n and nats yields as many natural numbers as
we might need: [1,2,3,...]. We can use the built-in function take to bite off finite chunks from
the beginning of an infinite list:
Lec1> take 6 nats
[1,2,3,4,5,6] :: [Integer]
Lec1> take 6 (evens nats)
[2,4,6,8,10,12] :: [Integer]
Here is an alternative definition of nats:
nats = [1..]
As an application of infinite lists, the function nomul returns the items in list l that are not
multiples of n. The function sieve takes a list and returns its first item n followed by the list
obtained by applying itself to the list with all multiples of n removed.
nomul n l = filter (\m -> mod m n /= 0) l
sieve (n:ns) = n:sieve (nomul n ns)
This is what happens when we apply sieve to the list [2,3,4,...].
Lec1> sieve (from 2)
[2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89
Interrupted!
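Since laziness makes the whole computation demand-driven, we can also name the infinite result and sample it with take (the name primes is ours):

```haskell
nomul :: Integer -> [Integer] -> [Integer]
nomul n l = filter (\m -> mod m n /= 0) l

sieve :: [Integer] -> [Integer]
sieve (n:ns) = n : sieve (nomul n ns)

primes :: [Integer]
primes = sieve [2..]   -- an "infinite" list; only demanded items are computed
```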
5.2.3 Pattern Matching
It is time to look more closely at some of the simple definitions we have already seen:
fac 0 = 1
fac n = n * fac(n - 1)
len [] = 0
len (x:xs) = 1 + len xs
The object following the function name on the left of = is a pattern. When the function is called,
the argument of the function is matched against the available patterns. If a match succeeds,
the expression on the right side of = is evaluated. Pattern matching may include binding, which
assigns values to variables in the expression.
Examples:
• In the call fac 0, the argument 0 matches the pattern 0. The expression 1 is evaluated,
  yielding 1.
• In the call fac 3, the argument 3 matches the pattern n, yielding the binding {n = 3}. The
  expression n * fac(n - 1) is evaluated with this binding, yielding 3 * fac(2).
• In the call len [], the argument [] matches the pattern []. The expression 0 is evaluated,
  yielding 0.
• In the call len [27..30], the argument is equivalent to 27:28:29:30:[]. This matches the
  pattern x:xs, yielding the bindings {x = 27, xs = [28..30]}. The expression 1 + len xs is
  evaluated with these bindings, yielding 1 + len [28..30].
Pattern matching occurs in a number of contexts in Haskell. All of these contexts can be expressed
in terms of the case expression:
Kind          Pattern    Expression   Condition for match         Binding
Wild card     _          e            none                        none
Constant      k          e            k == e                      none
Variable      v          e            none                        v = e
Constructor   C p1 p2    C′ e1 e2     C = C′, p1 ∼ e1, p2 ∼ e2    bindings from p1 ∼ e1 and p2 ∼ e2
Figure 6: Pattern matching
case e of p1 -> e1 ; p2 -> e2 ; ...; pn -> en
Here is how other kinds of expression map to case expressions:

\p1 p2 ... pn -> e  =  \x1 x2 ... xn -> case (x1, x2, ..., xn) of (p1, p2, ..., pn) -> e
    (where the xi are new variables)

if e1 then e2 else e3  =  case e1 of True -> e2 ; False -> e3

let p1 = e1 ; p2 = e2 ; ... ; pn = en in e  =  case (e1, e2, ..., en) of (p1, p2, ..., pn) -> e
The case expression
case e of p1 -> e1 ; p2 -> e2 ; ...; pn -> en
is evaluated as follows:
• The expression e is matched against p1, p2, ..., pn in turn.
• If e matches pi, then the value of the expression is obtained by evaluating ei with any bindings
  obtained by matching.
• If none of the patterns match, the expression fails.
Note that the order of patterns is significant. For example, the following definition of the factorial
function does not work, because the first pattern always succeeds and evaluation never terminates:
fac n = n * fac(n - 1)
fac 0 = 1
Figure 6 summarizes the important kinds of pattern and the way in which they match. The
expression p ∼ e stands for “pattern p matches expression e”.
• The underscore character, _, is called a “wild card” because it matches anything but produces
  no binding. Use the wild card when you don’t need the value of the argument in the function.
  For example, the head and tail selectors for lists can be defined as
  head (x:_) = x
  tail (_:xs) = xs
• A constant matches a constant with the same value but produces no binding. For example,
  [] matches the empty list but nothing else.
• A pattern consisting of a variable name matches any expression and binds that expression to
  the variable.
• If the pattern is a constructor with patterns as arguments, the expression must be an application
  of the same constructor to expressions. The corresponding expressions are matched and the
  bindings from the matches are combined. Since “:” is the list constructor, the pattern (3:xs)
  matches any list whose first element is 3.
Haskell patterns are linear. This means that each variable in a pattern must be distinct. Suppose
that we are defining a function merge that takes two lists as arguments and that we need to consider
the case in which the first item in both lists is the same. We are not allowed to write
merge (x:xs) (x:ys) = ... -- BAD!
Instead, we must write
merge (x:xs) (y:ys) = if x == y then ... else ...
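A complete merge along these lines might look as follows (a sketch; the lecture leaves the body elided). It merges two sorted lists, keeping one copy of items that appear in both:

```haskell
-- Pattern variables x and y are distinct, as linearity requires.
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) =
  if x == y then x : merge xs ys
  else if x < y then x : merge xs (y:ys)
  else y : merge (x:xs) ys
```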
Here are some examples of the pattern matching rules. First, the two most common cases for a
list function are the empty list pattern, [], and the non-empty list pattern, (x:xs). The latter
pattern also matches a list with one item, in which case xs is bound to [].
Sometimes a third pattern is necessary. For example, if we define
swap [] = []
swap (x:y:ys) = y:x:swap ys
the expression “swap [3]” gives an error because the list with one element does not match any
pattern. The solution is to include this case in the definition:
swap [] = []
swap [x] = [x]
swap (x:y:ys) = y:x:swap ys
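We can check that the three equations cooperate:

```haskell
-- Exchange adjacent pairs; a trailing odd item is left in place.
swap :: [a] -> [a]
swap []       = []
swap [x]      = [x]
swap (x:y:ys) = y : x : swap ys
```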
The tuple is another kind of constructor in Haskell. (A tuple corresponds roughly to a C++
struct.) Tuples are enclosed in parentheses and the items inside them are separated by commas.
The items need not be of the same type. Unlike lists, tuples of different lengths have different
types.
A tuple with two items is usually called a pair. We can define selectors for pairs like this:
first (x,_) = x
second (_,y) = y
However, these definitions are not really necessary, because the Prelude defines functions fst and
snd that do the same thing.
We can also define
plus (x,y) = x + y
This is not the same function as the function add that we defined earlier:
add x y = x + y
The argument of plus must be a pair; there is no equivalent to “add 1”.
Functions can return tuples. The following function breaks a list into two roughly equal parts and
returns the two parts as a pair:
brk xs = (take (len xs ‘div‘ 2) xs, drop (len xs ‘div‘ 2) xs)
Obviously, we would like to avoid the repeated evaluation of “len xs ‘div‘ 2”. In a procedural
language, we would introduce a local variable. The Haskell equivalent is:
brk xs = (take m xs, drop m xs)
where m = len xs ‘div‘ 2
This may not look “functional” but in fact it is. It is “syntactic sugar” for
brk xs = (\m -> (take m xs, drop m xs)) (len xs ‘div‘ 2)
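Here is the where version in full, using the Prelude’s length (the lecture’s len would work equally well):

```haskell
-- Split a list into two roughly equal halves.
brk :: [a] -> ([a], [a])
brk xs = (take m xs, drop m xs)
  where m = length xs `div` 2
```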
Description             Syntax             Associativity
simple term             t
parenthesized term      (t)
as-patterns             a@p                right
application             f x                left
do, if, let, \, case    (leftwards)        right
infix operators         — see Figure 8
function types          ->                 right
contexts                =>
type constraints        ::
do, if, let, \, case    (rightwards)       right
sequences               ..
generators              <-
grouping                ,                  n-ary
guards                  |
cases                   ->
definitions             =
separation              ;
Figure 7: Parsing precedence, from highest (at the top) to lowest (at the bottom).
                          Associativity
Precedence   Left                            None                      Right
9            !!                                                        .
8                                                                      ^  ^^  **
7            *  /  `div`  `mod`
             `rem`  `quot`
6            +  -
5                                                                      :  ++
4                                            ==  /=  <  <=  >  >=
                                             `elem`  `notElem`
3                                                                      &&
2                                                                      ||
1            >>  >>=
0                                                                      $  $!  `seq`
Figure 8: Operator precedence, from highest (at the top) to lowest (at the bottom).
5.2.4 Operators
Figures 7 and 8 show the precedence rules that Haskell uses for parsing expressions and operators.
We have not encountered all of them yet.
Some constructs (do, if, let, \, and case) apparently have two precedences: one leftwards
and one rightwards. This is because of a metarule which says that each of these constructs
extends as far to the right as possible. In the first line in the table below, let seems to have
lower precedence than +, because the whole expression “a + a” is part of it. In the second
line, let seems to have higher precedence than +, because its components group together. The
second line is an application of the metarule: the let expression extends as far as possible.
Note that parentheses override the metarule, as shown in the third line of the table. Note also
that all three parsings are what we might intuitively expect.
The expression parses as
let a = 3 in a + a let a = 3 in (a + a)
k + let a = 3 in a + a k + (let a = 3 in (a + a))
k + (let a = 3 in a) + a (k + (let a = 3 in a)) + a
5.2.5 Building Haskell Programs
In procedural programming, we have a collection of control structures, or abstraction mechanisms
(assignment, sequence, conditional, loop, procedure call, etc.), and we use them to build programs.
In Functional Programming (FP), we have only one abstraction mechanism: function definition.
This makes the task of programming both simpler and harder: simpler because there are fewer
things to worry about, harder because we have a more limited toolkit.
The tendency in FP is to define a lot of small functions and then to build programs using these
functions. This can be confusing for the beginner, because the number of small functions is large.
But, with experience, this approach to programming becomes quite manageable.
The Prelude is a collection of functions that many programmers have found useful. It is often easier
to define a new function using Prelude functions than it is to write the function “from scratch”.
Here are some of the Prelude functions and their definitions.
Since Haskell does not have an explicit loop construct, recursion is used frequently. The key step
in writing a function is often to choose the appropriate parameter for recursion.
Recall that “take n xs” returns the first n items of the list xs. If n is zero, the result is the empty
list. Otherwise, we use recursion on the value of n. To be safe, we define the appropriate response
when the list is empty.
take 0 xs = []
take _ [] = []
take n (x:xs) = x:take (n-1) xs
Recall also that “drop n xs” throws away the first n items of the list xs and returns the rest. The
definition is similar.
drop 0 xs = xs
drop _ [] = []
drop n (_:xs) = drop (n-1) xs
The functions takeWhile and dropWhile are only a little more complicated.
takeWhile p [] = []
takeWhile p (x:xs) = if p x then x : takeWhile p xs else []
dropWhile p [] = []
dropWhile p (x:xs) = if p x then dropWhile p xs else x:xs
Haskell provides another way of writing the last line. The infix operator @, read “as”, allows us to
write a pattern in two equivalent ways. In this case, we want xs and x:xs' to stand for the same
list. Note that one or more primes may be added to a variable name to give a new name, as in
mathematics.
dropWhile p xs@(x:xs') = if p x then dropWhile p xs' else xs
The functions break and span are complementary. They split a list into two parts depending on
the value of a predicate.
Prelude> span (<5) [1..20]
([1,2,3,4],[5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
Prelude> break (>=5) [1..20]
([1,2,3,4],[5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
In particular, note:
Prelude> break isSpace "Here is a sentence."
("Here"," is a sentence.") :: ([Char],[Char])
The definitions of span and break are as follows:
span p [] = ([],[])
span p xs@(x:xs') = if p x
                    then (x:ys, zs)
                    else ([],xs)
    where (ys,zs) = span p xs'
break p = span (not . p)
With these functions working, we can write words, which splits a string into words separated by
blanks, and unwords, which puts the words together again.
words s = case dropWhile isSpace s of
            "" -> []
            s' -> w : words s''
                  where (w,s'') = break isSpace s'
unwords [] = ""
unwords [w] = w
unwords (w:ws) = w ++ ' ' : unwords ws
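Note that words collapses runs of blanks, so unwords (words s) normalizes spacing rather than reproducing s exactly:

```haskell
-- words splits on runs of whitespace; unwords rejoins with single blanks.
split :: [String]
split = words "the  quick   brown fox"   -- ["the","quick","brown","fox"]

restored :: String
restored = unwords split                 -- "the quick brown fox"
```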
Now we are ready for something more elaborate. Suppose we have a very long string of words,
perhaps an entire file, and we want to format it so that each line is no longer than 80 characters.
The first step is to break it up into a list of words using the function words.
Here is the first version of a function that does the rest. The idea is to put words into a line until
the next word won’t fit — that is, it will make the line more than 80 characters long. When this
happens, we put the line into the output and carry on.
The problem is that we have no local variables and no state. Where do we store the information in
this “line”? The answer is that we need an accumulating parameter and recursion. We define
brk1 as
brk1 [] line = [line]
brk1 (w:ws) line = if length w + 1 + length line <= 80
then brk1 ws (line ++ " " ++ w)
else line : brk1 (w:ws) []
If there is something in our word string w:ws, this function calculates how long the line will become
if we append a blank and the word w to it. If the new length is at most 80 characters, the word is
moved into the line. Otherwise, the result is a list consisting of the line and the result of processing
the rest of the word list. We start it off with an empty line:
brk1 sentence []
This function has one error: the first character of every line that it outputs is a blank. The next
version corrects that error.
brk2 [] line = [line]
brk2 (w:ws) line = if length w + 1 + length line <= 80
then
if line == ""
then brk2 ws w
else brk2 ws (line ++ " " ++ w)
else line : brk2 (w:ws) []
Finally, to make our function presentable to users, we wrap its parts up into a single definition:
format sen = brk (words sen) [] where
brk [] line = [line]
brk ws@(w:ws’) line = if length w + 1 + length line <= 80
then
if line == ""
then brk ws’ w
else brk ws’ (line ++ " " ++ w)
else line : brk ws []
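As a quick sanity check, the finished function can be exercised as follows (the fragment repeats the definition of format from above so that it is self-contained):

```haskell
-- format splits a sentence into lines of at most 80 characters.
format :: String -> [String]
format sen = brk (words sen) []
  where
    brk [] line = [line]
    brk ws@(w:ws') line =
      if length w + 1 + length line <= 80
        then if line == ""
               then brk ws' w
               else brk ws' (line ++ " " ++ w)
        else line : brk ws []
```

Applied to a string of thirty nine-letter words, format produces four lines, each at most 80 characters long; a sentence that already fits on one line comes back unchanged as a single-element list.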
5.2.6 Guards
The pattern-matching mechanism is handy but rather inflexible because it only allows for exact
matching. For example, if we want a factorial function that does not blow up for negative numbers,
we have to write
fac n = if n <= 0 then 1 else n * fac(n-1)
We can do better with guards. A guard is a condition that qualifies a pattern. As we know, the
pattern
n
matches any number. The pattern
n | n < 0
matches any negative number. Once we have provided a pattern, we can specify several guards,
each with its appropriate definition. This version of factorial works for all integers:
fac n
| n <= 0 = 1
| n > 0 = n * fac(n-1)
The guard otherwise includes any conditions not covered by previous guards. For example, the
actual definitions of take and takeWhile in the Prelude are:
take n _ | n <= 0 = []
take _ [] = []
take n (x:xs) = x : take (n-1) xs
takeWhile p [] = []
takeWhile p (x:xs)
| p x = x : takeWhile p xs
| otherwise = []
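Guards can make a case analysis read almost like mathematics. Here is a small made-up example (classify is our own illustration, not a Prelude function):

```haskell
-- The guards are tried in order; otherwise catches everything else.
classify :: Int -> String
classify n
  | n < 0     = "negative"
  | n == 0    = "zero"
  | otherwise = "positive"
```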
5.3 Types
Haskell is a strongly typed language. This means that, if a program compiles without error, it
will either give an answer of the expected type or it will fail. Semantically, every type includes a
value ⊥ (pronounced “bottom” because it is the minimal element of a lattice) that denotes failure.
For example, if the type of a function is Int, the function will return either an integer or ⊥. Failure
is discussed in Section 5.3.2. For the rest of this section, we will discuss programs that do not fail.
Haskell uses the Hindley-Milner type system. An expression is written without type declara-
tions but, if it is type correct, the compiler can find a principal type or most general type for
it. There are a small number of situations in which the type checker needs a type declaration to
help it.
The following examples illustrate principal types. First, declare i to be the identity function. In
all of these examples, we follow the convention of declaring the type before the function. The type
declaration is accepted in this form by the Haskell compiler. It can also be obtained using the
:type command in the Hugs interpreter.
i :: a -> a
i x = x
The Haskell type is a -> a which we can read as “the type of a function that takes an argument
of any type and returns a result of the same type”. The letter a is called a type variable. In the
formal type system, this type is written ∀ a . a → a and is read “for all types a, the type a to a”.
The next example is a constant function: it takes two arguments and throws away the second one.
k :: a -> b -> a
k x y = x
There are two points to note. First, the type is a → b → a although you might have expected
ab → a. This is because Haskell functions take their arguments “one at a time”. The right arrow
associates to the right and the type a → b → a is parsed as a → (b → a). In other words, k is a
function that takes one argument of type a and returns a function with type b → a. The following
tests verify this:
Types> k 'x'
k 'x' :: a -> Char
Types> k 'x' 5
'x' :: Char
The second point is simply that the type contains two type variables, a and b. Since there is
nothing in the definition that connects the types, the compiler assumes that they are different
(although they don’t have to be).
In the next example, the two arguments are used to build a list. Consequently, they must have the
same type. The type [a] is the type of a list whose elements have type a.
two :: a -> a -> [a]
two x y = [x,y]
When we apply a function to an actual argument, the types become specialized. Providing a Char
as the first argument of two forces the second argument to be Char as well.
Types> two 'a'
two 'a' :: Char -> [Char]
The function two ’a’ accepts an argument of type Char but an argument of another kind gives a
type error.
Types> two 'a' 'b'
"ab" :: [Char]
Types> two 'a' 3
ERROR - Illegal Haskell 98 class constraint in inferred type
*** Expression : two 'a' 3
*** Type : Num Char => [Char]
The next function introduces a new feature: it uses the equality predicate ==. Haskell allows this
even without further type information, but adds a requirement to the type: it must be an equality
type. The expression Eq a => is called a context and it says that the type a must be an instance
of the class Eq.
cmp :: Eq a => a -> a -> [a]
cmp x y = if x == y then [x,y] else []
If we change == to >, we get a different requirement: the type a must be an instance of the class
Ord, also called an ordinal type. The class Ord is a subclass of the class Eq: this means that any
type that provides > must also provide ==.
gt :: Ord a => a -> a -> [a]
gt x y = if x > y then [x,y] else []
Something else happens if we change one argument of > to the digit “1”. The context becomes
(Ord a, Num a), asserting that a must be both an ordinal type and a numeric type.
n1 :: (Ord a, Num a) => a -> a -> a
n1 x y = if x > 1 then x + y else 0
If we write 1.5 instead of 1, we obtain yet another context: a must be an ordinal type and a
fractional type.
n2 :: (Ord a, Fractional a) => a -> a -> a
n2 x y = if x > 1.5 then x + y else 0
As stated above, Haskell finds the most general type. We can specialize types in two ways. First,
we can add a type annotation to any variable or expression in the definition. Note that we are not
allowed to annotate a pattern — only an expression. If we specify that x is a Double, the compiler
infers that y must be a Double too, because x and y both occur in the expression x + y.
n3 :: Double -> Double -> Double
n3 x y = if (x::Double) > 1.5 then x + y else 0
We can also specify the type of the entire function. Here is the same function again, but with the
type Int specified by the programmer. Asking Haskell for the type of this function shows that it
has used the type we provided rather than the more general type.
n4 :: Int -> Int -> Int
n4 x y = if x > 1 then x + y else 0
Types> :t n4
n4 :: Int -> Int -> Int
Of course, if we do provide a type, we’d better get it right. In the next example, we provide an
incorrect type for the same function:
n5 :: Int -> Char -> Int
n5 x y = if x > 1 then x + y else 0
This attempt does not fool the compiler:
Reading file "e:/courses/comp348/src/types.hs":
Type checking
ERROR "e:/courses/comp348/src/types.hs":31 - Type error in application
*** Expression : x + y
*** Term : x
*** Type : Int
*** Does not match : Char
It is often easier to write functions than to figure out their types, which can get quite complicated.
In this example, we apply a function to a list. Haskell infers that the most general type of the
function is a → b, that the second argument of apply must be a list of as, and that the result is a
list of bs.
apply :: (a -> b) -> [a] -> [b]
apply f [] = []
apply f (x:xs) = f x : apply f xs
Types can get even longer, as the following example shows. The individual types are f :: a → b,
g :: c → a, h :: d → c; the type of x is d, and the type of the result is b.
compose :: (a -> b) -> (c -> a) -> (d -> c) -> d -> b
compose f g h x = f (g (h x))
The type of a tuple is written in the same way as values of the type. In this example, the type
(a, b) is “the type of pairs whose first element has type a and whose second element has type b”.
Similar notation is used for longer tuples.
cp :: a -> b -> (a,b)
cp x y = (x, y)
Summarizing what we have seen so far:
- All valid Haskell functions and expressions have a type.
- The type inferred by the compiler is the most general type.
- The most general type may be specialized by type annotations provided by the user.
- A context indicates that a type is an instance of a class.
5.3.1 Using Types
Since Haskell infers types, it is possible to virtually ignore types while programming. This is
unwise, for two reasons:
- It is often useful to have a clear idea of the types of your variables when writing functions.
- Haskell errors are very often caused by type mismatches. If you don't understand the type
  system, you will find these errors very hard to understand. If you do understand the type
  system, you will find them quite helpful.
For example, one version of the function brk function in Section 5.2.5 looked like this:
brk1 [] line = [line]
brk1 (w:ws) line = if length w + 1 + length line <= 80
then brk1 ws (line ++ ' ' ++ w)
else line : brk1 (w:ws) []
Haskell did not accept this version, reporting:
Reading file "e:/courses/comp348/src/lec1.hs":
Type checking
ERROR "e:/courses/comp348/src/lec1.hs":9 - Type error in application
*** Expression : ' ' ++ w
*** Term : ' '
*** Type : Char
*** Does not match : [a]
To make sense of this error:
- In the line beginning ERROR, the :9 after the path name is the line number. It identifies
  "then brk1 ws (line ++ ' ' ++ w)" as the line with the error.
- The expression which cannot be typed is ' ' ++ w.
- The term (that is, part of the expression) which is causing the problem is ' '. It has type
  Char, but the type checker is expecting something of type [a] — that is, a list.
- The correction is to replace ' ' by something which is a list; in this case, [' '] or simply " ".
When you are reading someone else’s function definitions, knowing the types of the functions often
makes it easier to understand them. For this reason, it is a common convention in Haskell (and other
functional languages) to write the type declaration, as inferred by the compiler, above the function
declaration. Haskell accepts these type declarations, provided they are correctly formatted, and
reports an error if there is an inconsistency between the declaration type and the inferred type.
However, you are allowed to declare a more specialized type than the type that Haskell infers. For
example, if your module includes the declaration
app :: [Int] -> [Int] -> [Int]
app [] ys = ys
app (x:xs) ys = x:app xs ys
Haskell will accept the declaration, record the type you have specified, and give an error if you try
to use it incorrectly:
Lec1> :info app
app :: [Int] -> [Int] -> [Int]
Lec1> app "abc" "def"
ERROR - Type error in application
*** Expression : app "abc" "def"
*** Term : "def"
*** Type : String
*** Does not match : [Int]
There are some occasions when the type system doesn’t work as you would expect. You may have
already discovered this problem when representing sets as lists:
Lec1> union [] []
ERROR - Unresolved overloading
*** Type : Eq a => [a]
*** Expression : union [] []
In fact, even [] gives a type error:
Prelude> []
ERROR - Cannot find "show" function for:
*** Expression : []
*** Of type : [a]
This is a deficiency of the type system, but it is not an important one. If the expression union [] []
arises in a context, the empty lists will be the result of some computation and their types will be
known. Here is a contrived example: we define
comb n = union [1,3..n] [1,5..n]
and evaluate comb n with n = 0. This calls union [] [], but Haskell does not complain:
Lec1> comb 0
[] :: [Integer]
5.3.2 Failure
There are two main causes of failure:
- A function is given an argument that is of the correct type but outside the domain of the
  function. E.g., head [] or 1/0.
- A function fails to terminate.
Although the semantics says that the value of a failed program is always ⊥, the symptoms of
failure can vary quite a lot.
Domain errors are usually trapped by the runtime system:
Prelude> 1/0
Program error: primDivDouble 1.0 0.0
Prelude> head []
Program error: head []
A non-terminating program may run until it is interrupted, whether or not it is producing
output:
Prelude> [1..]
[1,2,3,4,5,6,Interrupted!
Prelude> f where f = f
Interrupted!
Some implementations of Hugs, including WinHugs, do not detect stack overflow, with the
result that programs may crash:
Prelude> g 1 where g n = g(n+1)
5.3.3 Reporting Errors
There is a built-in function
error :: String -> a
The result returned by this function is the only value that is shared by all types: ⊥ (pronounced
“bottom”). As a side-effect, it displays the string.
Here is an example of its use. The definition of head in the Prelude is:
head (x:xs) = x
head [] = error "PreludeList.head: head []"
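We can use error in the same way in our own functions. A hypothetical example (the name quotient and its message are ours, not the Prelude's):

```haskell
-- Trap the bad argument explicitly instead of letting div fail at runtime.
quotient :: Int -> Int -> Int
quotient _ 0 = error "quotient: division by zero"
quotient n d = n `div` d
```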
5.3.4 Standard Types
There are a number of types that are either built-in, defined in the Prelude, or defined in libraries.
The first letter of a type is always upper-case. This is not a convention: the compiler uses the case
of the first letter to distinguish type names from type variables, ordinary variables, and function
names.
- The type Bool has values True and False.
- The type Char has the ASCII characters as its values. The type String is equivalent to
  [Char].
- There are a number of numeric types. These include Int (fixed size integers); Integer (integers
  of any size); Float (single-precision floats); Double (double precision floats); and Rational
  (ratios of Integers).
- The type () has only two values: (), which is the empty tuple, and ⊥ (bottom). This type
  corresponds roughly to void in C++.
5.3.5 Standard Classes
The Standard Classes of Haskell are defined in the Prelude. Here are a couple of examples. The
equality class specifies that a type instance must provide the functions == and /=. However, if
the user supplies one of these functions, the compiler will provide the other one automatically, as
shown by the mutually recursive definitions.
class Eq a where
(==), (/=) :: a -> a -> Bool
-- Minimal complete definition: (==) or (/=)
x == y = not (x/=y)
x /= y = not (x==y)
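For example, a user-defined type need only supply ==; the default definition of /= then works automatically (Direction is a made-up type for illustration):

```haskell
data Direction = North | South | East | West

-- Minimal complete definition: (==). The Prelude supplies (/=).
instance Eq Direction where
  North == North = True
  South == South = True
  East  == East  = True
  West  == West  = True
  _     == _     = False
```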
The ordinal class inherits from the equality class and provides additional functions for comparison.
An instance type must provide at least <=; all of the other functions can be inferred by the compiler.
class (Eq a) => Ord a where
compare :: a -> a -> Ordering
(<), (<=), (>=), (>) :: a -> a -> Bool
max, min :: a -> a -> a
-- Minimal complete definition: (<=) or compare
-- using compare can be more efficient for complex types
compare x y | x==y = EQ
| x<=y = LT
| otherwise = GT
x <= y = compare x y /= GT
[Figure 9 is a diagram of the superclass relationships among the standard classes:
Eq is a superclass of Ord and of Num; Show is a superclass of Num;
Ord and Num are superclasses of Real; Num is a superclass of Fractional;
Real and Enum are superclasses of Integral; Real and Fractional are
superclasses of RealFrac; Fractional is a superclass of Floating;
RealFrac and Floating are superclasses of RealFloat.]
Figure 9: Standard Class Inheritance Graph
x < y = compare x y == LT
x >= y = compare x y /= LT
x > y = compare x y == GT
max x y | x <= y = y
| otherwise = x
min x y | x <= y = x
| otherwise = y
Figure 9 shows the class inheritance graph for the standard classes. There is no need to memorize
this graph: you will need it only if you are designing your own numeric type and, even then,
Haskell’s error messages will guide you. What is more important is to know the functions that a
class is supposed to implement; you can get this information from the Prelude, as shown above.
The class Show must provide a function that translates a value of the type into a string.
5.3.6 Defining New Types
As an illustration of how programmers can define new Haskell types, we discuss the type Frac,
shown in Figure 10. This type is actually unnecessary, because there is already a type Rational,
but it is not defined in the Prelude.
The first step is to declare the type and constructors for it. The name of the type is Frac and
the name of the only constructor is F. These names do not have to be different: we could have
used Frac for both. To construct a fraction, we must provide two integers:
Frac> F 9 12
9/12 :: Frac
module Frac where
data Frac = F Integer Integer
instance Show Frac where
show (F n d) = if d == 1
then show n
else show n ++ "/" ++ show d
instance Eq Frac where
    F n d == F n' d' = n * d' == n' * d
instance Ord Frac where
    F n d <= F n' d' = n * d' <= n' * d
instance Num Frac where
    F n d + F n' d' = normalize( F (n * d' + n' * d) (d * d') )
    F n d - F n' d' = normalize( F (n * d' - n' * d) (d * d') )
    F n d * F n' d' = normalize( F (n * n') (d * d') )
    negate (F n d) = normalize( F (negate n) d )
    fromInteger n = F n 1
instance Fractional Frac where
    recip (F n d) = normalize( F d n )
normalize :: Frac -> Frac
normalize (F n d) =
if d == 0
then error ("Bad fraction: " ++ show (F n d))
else if n == 0
then F 0 1
else if d < 0
then normalize (F (negate n) (negate d))
else F (n `div` g) (d `div` g)
where g = gcd n d
-- Harmonic series: 1/1 + 1/2 + 1/3 + ...
harm :: Integer -> [Frac]
harm n
| n <= 0 = []
| otherwise = harm (n - 1) ++ [F 1 n]
Figure 10: The Class Frac
We could introduce new function names for operations on fractions. For example, addFrac
might add two fractions. There are many advantages, however, to making the type Frac an
instance of numeric classes. If we do this, we can overload function names and, for example,
add fractions using +.
From Figure 9, we can see that any numeric type must be an instance of class Show. We
provide a version of the required function show that writes a fraction in the form "2/3". As
a nice touch, it shows fractions with denominator equal to 1 as integers.
The next step is to make Frac an equality type. We define ==, and Haskell automatically
provides /=.
We can provide an ordering for fractions. The function that we choose is not arbitrary: if
we provide <=, as here, Haskell will provide a number of other functions, including <, >, >=,
compare, min, and max.
Caution: providing a different comparison, such as >=, may lead to looping!
The next step is to make Frac an instance of Num, which requires overloaded definitions of +,
-, and *. It is useful also to define negate, because Haskell treats -x as (negate x), and a
function that converts integers to Fracs. Note that the Num class does not contain division
operations. This enables us to define types like Vector, which can be added, subtracted, and
multiplied, but not divided.
Since we do intend Fracs to be divided, we make Frac an instance of Fractional. All
that is needed is a function recip defined so that recip x = 1/x. Haskell will then define
x / y as x * recip y.
We want to work with fractions in “normal form”. For example, 4/6 should be converted to 2/3
and 5/(−16) should be converted to −5/16. We also want to check for zero denominators and
put the fraction zero into the normal form 0/1. All of these tasks are performed by normalize,
which uses the standard function gcd to cancel common factors. We use normalize for every
operation that constructs a new fraction.
Finally and for fun, we introduce the function harm to construct a list of the fractions
1/1, 1/2, 1/3, . . . , 1/n.
Figure 11 shows a few simple examples of the new type. The functions work as expected. The
last example is of interest, because we did not define sum. However, sum is defined for all instances
of Num, so it works correctly for Frac. The result is exact because the components of a Frac are
Integer (“large integers”) rather than Int (32-bit integers).
5.3.7 Defining Data Types
A type may have more than one constructor. Enumerations have several constructors but no
data, as in these declarations:
data Bool = False | True
data Colour = Red | Green | Blue
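Functions over an enumeration are defined by cases on its constructors. A small sketch (the predicate warm is our own illustration; the Colour declaration is repeated so the fragment is self-contained):

```haskell
data Colour = Red | Green | Blue
  deriving (Eq, Show)

-- One equation per constructor.
warm :: Colour -> Bool
warm Red   = True
warm Green = False
warm Blue  = False
```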
Haskell provides special syntax (if/then/else) for conditional expressions because programmers
are familiar with the English words. The special syntax is not strictly necessary: although if itself
is a reserved word and cannot be redefined, we could define an equivalent function
cond True x y = x
cond False x y = y
relying on lazy evaluation to ensure that only one of x or y is actually evaluated.
Frac> F 3 4 > F 5 6
False :: Bool
Frac> F 3 4 + F 7 12
4/3 :: Frac
Frac> $$ > F 6 5
True :: Bool
Frac> F 7 4 / F 2 5
35/8 :: Frac
Frac> harm 5
[1,1/2,1/3,1/4,1/5] :: [Frac]
Frac> sum (harm 50)
13943237577224054960759/3099044504245996706400 :: Frac
Figure 11: Trying out type Frac
Binary Search Trees Constructors may also define data components. In Figure 12, we define
a type for binary trees with integers at the nodes. A BTree is either Empty, or consists of an Int
and two subtrees of type BTree. Here are some things to note about this definition:
Function show uses a “helper” function with an additional argument to display trees. Using
the tree t defined at the bottom of Figure 12:
BTree> t
1
2
3
4
5
6
8 :: BTree
The functions size, height, and treeToList are trivial to define by cases on the constructors.
The function insert assumes that the trees are binary search trees. A new entry is inserted
in the left or right subtree unless it is equal to an entry already in the tree, in which case it is
ignored (an alternative would be to report an error).
Comparing trees for equality, using ==, is straightforward. The important thing to note is that
the apparently unnecessary case Empty == Empty = True must be included — otherwise
the function loops!
Constructors may define data components. As an example, we will develop a type for binary trees
with two constructors: Empty constructs an empty tree, and Tree n l r constructs a tree with
integer n at the root, a left subtree l, and a right subtree r. Note that this is a recursive type
definition because BTree occurs on both sides of =. Just as with recursion in functions, we need a
base case (in this example the base case is Empty) and an inductive case.
data BTree = Empty | Tree Int BTree BTree
module BTree where
import Trace
data BTree = Empty | Tree Int BTree BTree
instance Show BTree where
show t = sh t ""
where
sh Empty sp = ""
sh (Tree n l r) sp = let sps = sp ++ " "
in sh r sps ++ sp ++ show n ++ "\n" ++ sh l sps
instance Eq BTree where
Empty == Empty = True
(Tree n l r) == (Tree n' l' r') = n == n' && l == l' && r == r'
size Empty = 0
size (Tree n l r) = 1 + size l + size r
height Empty = 0
height (Tree n l r) = 1 + max (height l) (height r)
treeToList Empty = []
treeToList (Tree n l r) = treeToList l ++ treeToList r ++ [n]
listToTree [] = Empty
listToTree (n:ns) = insert n (listToTree ns)
insert m Empty = Tree m Empty Empty
insert m t@(Tree n l r) =
if m < n then Tree n (insert m l) r
else if m > n then Tree n l (insert m r)
else t
find Empty m = False
find (Tree n l r) m =
if m == n then True
else if m < n then find l m
else find r m
merge x y = listToTree(treeToList x ++ treeToList y)
Figure 12: Class BTree
We can build trees using the constructors. This is a bit tedious, and we will discuss easier ways
later.
t = Tree 1
( Tree 2
( Tree 3
Empty
( Tree 4
Empty
( Tree 5
Empty
Empty ) ) )
Empty )
( Tree 6 Empty Empty )
The simple functions below illustrate how we use the constructors as patterns in functions that
operate on trees. The size of a tree is the number of nodes that it contains and the height of a tree
is the length of the longest path from the root to a leaf node.
size Empty = 0
size (Tree n l r) = 1 + size l + size r
height Empty = 0
height (Tree n l r) = 1 + max (height l) (height r)
The function treeToList converts a tree to a list. The ordering of the list is significant: it would be
natural to use inorder (left/root/right), because this gives an ordered list of integers. However, we
want to be able to use the list to recreate the tree, and for this purpose postorder (left/right/root)
is better.
Here are these functions in use:
BTree> size t
6 :: Integer
BTree> height t
5 :: Integer
BTree> treeToList t
[5,4,3,2,6,1] :: [Int]
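To see why the choice of traversal order matters, compare the postorder version above with an inorder one (a self-contained sketch repeating the BTree declaration; the name inorder is our own addition):

```haskell
data BTree = Empty | Tree Int BTree BTree

-- Inorder (left/root/right): for a BST this yields the entries in ascending order.
inorder :: BTree -> [Int]
inorder Empty = []
inorder (Tree n l r) = inorder l ++ [n] ++ inorder r

-- Postorder (left/right/root): the root comes last, which is what
-- listToTree needs, because the last element is the first one inserted.
postorder :: BTree -> [Int]
postorder Empty = []
postorder (Tree n l r) = postorder l ++ postorder r ++ [n]
```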
As with previous types, we make the type BTree an instance of class Show so that we can convert
trees to strings. The implementation of show below uses a “helper function” sh with a second
argument giving the amount of indentation to use when printing a node. In recursive calls, the
indentation is controlled by the depth of the node in the tree. For example:
BTree> t
 6
1
 2
    5
   4
  3
:: BTree
We can easily define an equality relation on trees: trees are equal if the integers at their root nodes
are equal and their subtrees are (recursively) equal. Surprisingly, we must declare that Empty is
equal to Empty — without this component of the definition, the function loops.
instance Eq BTree where
Empty == Empty = True
(Tree n l r) == (Tree n' l' r') = n == n' && l == l' && r == r'
Let us assume that our binary trees are going to be used as binary search trees (BST). That
is, we must ensure that all entries in every left subtree are less than the root and that all entries
in every right subtree are greater than the root. The following function inserts an integer into a
BST, maintaining the BST property. We have to decide what to do with duplicate entries, which
are not allowed in a binary search tree. This version of function insert ignores them.
It is important to “rebuild” the parts of the tree that we have taken apart. The following code for
the case m < n returns only part of the tree:
if m < n
then insert m l
The function insert provides another way of building trees. Note that the last node in the
expression is the first node to be inserted in the tree — and it becomes the root.
t1 = insert 1
(insert 8
(insert 3
(insert 2
(insert 5
(insert 2
(insert 6
(insert 4 Empty)))))))
BTree> t1
  8
 6
  5
4
  3
 2
  1
Although this is a slight improvement over calling constructors, it is still rather clumsy. Here is a
function that takes integers from a list and builds a BST from them.
listToTree [] = Empty
listToTree (n:ns) = insert n (listToTree ns)
t2 = listToTree [5, 2, 7, 0, 8, 3, 6, 1, 4]
BTree> t2
  8
   7
 6
  5
4
  3
   2
 1
  0
We have written treeToList and listToTree so that they are inverses.
BTree> t2 == listToTree (treeToList t2)
True :: Bool
It is easy to merge trees using treeToList and listToTree.
merge x y = listToTree (treeToList x ++ treeToList y)
Then, with
t3 = listToTree [14, 9, 13, 16, 11, 17, 19, 18, 10, 20, 12]
we have
BTree> merge t2 t3
20
19
18
17
16
14
13
12
11
10
9
8
7
6
5
4
3
2
1
0
:: BTree
The function find uses the familiar algorithm to decide whether a given integer is stored in the
tree. To make trees useful, we should store a key/value pair at each node. Then find would match
the key and return the corresponding value.
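A minimal sketch of such a find function follows; the name find and the Bool result are taken from the description above, and the BTree declaration is repeated so that the sketch is self-contained.

```haskell
-- The declaration from Section 5.3.7, repeated for completeness:
data BTree = Empty | Tree Int BTree BTree

-- Standard BST search: compare with the root and descend into the
-- left or right subtree accordingly.
find :: Int -> BTree -> Bool
find _ Empty = False
find m (Tree n l r) =
  if m == n
  then True
  else if m < n
  then find m l
  else find m r
```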
The fringe of a tree is a list of the leaf nodes as they would be visited in a left to right scan.
Assuming that we can recognize a leaf, it is straightforward to define fringe, as in Figure 13.
We can test these functions with the trees that we already have.
BTree> fringe t1
[1,3,5,8] :: [Int]
BTree> fringe t2
[0,2,5,7] :: [Int]
isLeaf (Tree _ Empty Empty) = True
isLeaf _ = False
fringe Empty = []
fringe t@(Tree n l r) =
  if isLeaf t
  then [n]
  else fringe l ++ fringe r
sameFringe x y = fringe x == fringe y
Figure 13: The fringe of a tree
There is a famous problem in Computer Science called the “same fringe problem”. Given two trees,
how do we decide whether they have the same fringe? Since list comparison is provided for any
equality type, and Integer is an equality type, we have immediately:
sameFringe x y = fringe x == fringe y
BTree> sameFringe t1 t2
False :: Bool
Let us investigate this more closely. Here is a modified version of fringe that traces each discovery
of a leaf node:
fringe Empty = []
fringe t@(Tree n l r) =
  if isLeaf t
  then trace ("Leaf " ++ show n ++ "\n") (n : (fringe l ++ fringe r))
  else fringe l ++ fringe r
This is what happens when we call sameFringe again with the traced version of fringe:
BTree> sameFringe t1 t2
Leaf 1
Leaf 0
False :: Bool
Lazy evaluation ensures that only the first leaf of each fringe is computed: the fringes begin with
1 and 0 respectively, so the comparison fails as soon as these two leaves have been found, and no
further leaves are visited.
5.3.8 Types with Parameters
In this section, we revisit trees, but with a new twist. The trees in Section 5.3.7 were defined with
integers at the nodes. In the declaration below, we define trees with an arbitrary type, indicated
by the type variable a, at the nodes. On the left side of the declaration, Tree a indicates that the
type variable a is a parameter of the type. Whenever we refer to the type on the right side, we
must specify it as Tree a again.
module Tree where
data Tree a = Empty | T a (Tree a) (Tree a)
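To see the benefit of the type parameter, note that a single polymorphic function now works for trees over any node type. The traversal and sample trees below are illustrative (treeToList was defined for integer trees in Section 5.3.7; here it is polymorphic):

```haskell
-- The parameterized declaration from the text:
data Tree a = Empty | T a (Tree a) (Tree a)

-- An in-order traversal, defined once for every instance of Tree a:
treeToList :: Tree a -> [a]
treeToList Empty = []
treeToList (T x l r) = treeToList l ++ [x] ++ treeToList r

-- The same type constructor instantiated at two different node types:
ti :: Tree Int
ti = T 2 (T 1 Empty Empty) (T 3 Empty Empty)

ts :: Tree String
ts = T "b" (T "a" Empty Empty) (T "c" Empty Empty)
```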
In practice, trees with integer nodes are not very useful. We need to store key/value pairs at
the nodes so that we can store useful information in the tree, as shown in Figure 14. The type
module Tree where

data (Ord k, Show k, Show v) => Tree k v =
    Empty
  | T k v (Tree k v) (Tree k v)

instance (Ord k, Show k, Show v) => Show (Tree k v) where
  show Empty = ""
  show (T key value left right) =
    show left ++
    leftjust (show key) 12 ++ show value ++ "\n" ++
    show right

leftjust :: [Char] -> Int -> [Char]
leftjust s n
  | length s >= n = s
  | otherwise = leftjust (s ++ " ") n

out t = putStr (show t)

insert :: (Ord a, Show a, Show b) => a -> b -> Tree a b -> Tree a b
insert newKey newValue Empty = T newKey newValue Empty Empty
insert newKey newValue t@(T key value left right) =
  if newKey == key then T key newValue left right
  else if newKey < key then T key value (insert newKey newValue left) right
  else T key value left (insert newKey newValue right)

build :: (Ord a, Show a, Show b) => [(a,b)] -> Tree a b
build [] = Empty
build ((key,value):ks) = insert key value (build ks)

t = build [
      ("Charles", 4163548695),
      ("Anne", 5144831733),
      ("George", 8147586977),
      ("Edward", 4504657464),
      ("Diana", 2124756843),
      ("Feridoon", 6063647586),
      ("Bill", 4508982212) ]
Figure 14: Key-value tree
variables k and v stand for “key” and “value” respectively. We require that k is an ordered type
so that we can compare keys.
When we construct a non-empty tree, we must now provide four arguments: the key, the value,
the left subtree, and the right subtree. The insertion function insert can use the operators ==, <,
and > with keys because the type of the keys is an instance of class Ord. When we supply a new
pair with a key that already exists in the tree, the new value replaces the old value.
As before, we can reduce the amount of writing required by providing a function, build, that
builds a tree from a list of pairs.
Here is a test:
t = build [ (6,"Anne"), (3,"Bill"), (8,"Charles"), (1,"Diana") ]
There is a catch with parameterized types: when we use them as instances of a class, we must
repeat all of the constraints. For example, to make Tree an instance of Show, we must write:
instance (Ord k, Show k, Show v) => Show (Tree k v) where
....
In the source file:
t = build [
("Charles", 4163548695),
("Anne", 5144831733),
("George", 8147586977),
("Edward", 4504657464),
("Diana", 2124756843),
("Feridoon", 6063647586),
("Bill", 4508982212) ]
A Haskell session:
Tree> t
"Anne" 5144831733
"Bill" 4508982212
"Charles" 4163548695
"Diana" 2124756843
"Edward" 4504657464
"Feridoon" 6063647586
"George" 8147586977
:: Tree [Char] Integer
Formulas It is often useful to define types with many constructors. The following declaration
introduces a type suitable for formulas in propositional calculus, with constants true and false,
propositional symbols (or names), conjunctions, disjunctions, negations, and implications.
module Logic where
data Formula =
    TT
  | FF
  | Var String
  | Con [Formula]
  | Dis [Formula]
  | Neg Formula
  | Imp Formula Formula
When we define a function for a type like this one, we must usually include cases for all of the
constructors. The following definition of show is incomplete because it does not indicate operator
precedence.
instance Show Formula where
  show TT = "true"
  show FF = "false"
  show (Var name) = name
  show (Con []) = ""
  show (Con [f]) = show f
  show (Con (f:fs)) = show f ++ " and " ++ show (Con fs)
  show (Dis []) = ""
  show (Dis [f]) = show f
  show (Dis (f:fs)) = show f ++ " or " ++ show (Dis fs)
  show (Neg f) = "not " ++ show f
  show (Imp p q) = show p ++ " => " ++ show q
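Other functions over Formula follow the same pattern, with one case per constructor. For example, here is a hypothetical evaluator (not part of the original module) that computes the truth value of a formula, given truth values for its propositional symbols in an association list:

```haskell
data Formula = TT | FF | Var String | Con [Formula] | Dis [Formula]
             | Neg Formula | Imp Formula Formula

-- Evaluate a formula; unknown symbols default to False.
-- Note that the empty conjunction is True and the empty disjunction is False,
-- following the usual conventions.
eval :: [(String, Bool)] -> Formula -> Bool
eval _   TT = True
eval _   FF = False
eval env (Var name) = maybe False id (lookup name env)
eval env (Con fs) = all (eval env) fs
eval env (Dis fs) = any (eval env) fs
eval env (Neg f) = not (eval env f)
eval env (Imp p q) = not (eval env p) || eval env q
```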
As before, we can build formulas using the constructors, although this is rather tedious.
f = (Imp p (Imp p (Imp p p))) where p = Var "p"
g = Imp (Var "p") (Con [Var "q", Neg (Var "r")])
Logic> f
p => p => p => p :: Formula
Logic> g
p => q and not r :: Formula
5.4 Input and Output
Haskell handles input and output using monads. Monads make it possible to obtain the effect of
managing state changes, such as those performed by I/O, within a purely functional language. In
this course, we will not discuss monads in detail; instead we will just provide some simple examples
of their use.
The first program reads a character and echoes it. The keyword do is followed by a sequence of
actions. Each action must have the same indentation. The function getChar performs an action
and returns a value, in this case a Char. The function putChar takes a Char as its argument and
performs an action — writing the character.
test1 = do
  c <- getChar
  putChar c
The next example shows how to write functions that perform actions. The function ready performs
the following sequence of actions: output a prompt; read a character; return a Bool value that
depends on the character read.
ready = do
  putStr "Ready? "
  char <- getChar
  return (char == 'y')
test2 = do
  t <- ready
  if t
    then putStr "\nReady"
    else putStr "\nNot ready"
The function getline in the next example is recursive. The else part requires an action sequence
that is introduced by a second do. The program test3 reads a line, applies a function (switch)
to it, and then writes the line.
getline = do
  char <- getChar
  if char == '\n'
    then return ""
    else do
      line <- getline
      return (char:line)
switch s = map sw s where
  sw char =
    if isLower char
    then toUpper char
    else if isUpper char
    then toLower char
    else char
test3 = do
  line <- getline
  putStr (switch line)
The next example uses files instead of console functions.
process s = filter (not . isSpace) s
test4 = do
  s <- readFile "e:/temp/test.dat"
  writeFile "e:/temp/test.res" (process s)
Since it is not good practice to build file names into programs, the next example reads the input
and output file names from the console.
main = do
  putStr "Input file: "
  iFile <- getLine
  putStr "Output file: "
  oFile <- getLine
  s <- readFile iFile
  writeFile oFile s
  putStr (iFile ++ " copied to " ++ oFile ++ "\n")
5.4.1 String Formatting
By default, Hugs does not interpret formatting characters in strings. The following exchange
with the interpreter illustrates this (the standard function unlines converts a list of strings into a
single string, with each list item followed by a newline character):
Prelude> unlines ["first line", "second line", "third line"]
"first line\nsecond line\nthird line\n"
The IO function putStr interprets the formatting characters correctly:
Prelude> putStr(unlines ["first line", "second line", "third line"])
first line
second line
third line
Since putStr :: String -> IO () is an IO function, you can use it only at the “top level” —
that is, as a command that you enter or as part of a do sequence. This does not matter in practice,
because there is no need to format strings until you actually want to see them at the end of a
calculation.
5.4.2 Interactive Programs
The standard IO functions include interact, a function that enables you to write interactive
programs easily. The argument for interact is a function of type String -> String. A simple
example of such a function is map toUpper. If you write
interact (map toUpper)
the interpreter will echo each character as you type it, converting lower case letters to upper case
letters.
Lazy evaluation plays an important role here: the program does not read the entire string of
characters and then output the converted string; instead, it reads each character and immediately
echoes its converted form.
The following example provides a more elaborate version of the same idea. The function machine
:: String -> String simulates a vending machine. The argument string is a sequence of “com-
mands” interpreted as follows:
Command Effect
’n’ Insert a nickel (5 cents)
’d’ Insert a dime (10 cents)
’q’ Insert a quarter (25 cents)
’r’ Demand a refund
The output from the machine is a string defined as follows:
String Condition
"" the amount entered is less than 25 cents
"candy" the amount entered was exactly 25 cents
"candy and n cents change" the amount entered was 25 + n cents
"Refund: n" response to ’r’ with n cents stored
Here is the definition of the function machine:
machine commands = mac commands 0
  where
    mac "" stored = "Amount left: " ++ show stored
    mac ('q':cs) stored = respond (stored+25) cs
    mac ('d':cs) stored = respond (stored+10) cs
    mac ('n':cs) stored = respond (stored+5) cs
    mac ('r':cs) stored = "Refund: " ++ show stored ++ "\n" ++ mac cs 0
    mac (_:cs) stored = mac cs stored
    respond stored cs =
      if stored == 25
      then "candy\n" ++ mac cs 0
      else if stored > 25
      then "candy and " ++ show (stored-25) ++ " cents change\n" ++ mac cs 0
      else mac cs stored
The following dialog with Hugs shows the result when the user enters the string "qdddnnr". To
end the input string, and hence the dialogue, the user enters ctrl-z. The second example shows
the response to "dddd" followed by ctrl-z, showing that the machine reports the amount it has
left before terminating.
VM> interact machine
candy
candy and 5 cents change
Refund: 10
Amount left: 0
VM> interact machine
candy and 5 cents change
Amount left: 10
VM> ....
5.5 Reasoning about programs
Haskell function definitions not only look like equations but, in an important sense, they are
equations. We will illustrate this with a simple example of program manipulation. The following
definitions of append (append two lists) and length (length of a list) should be familiar by now.
The only difference is that we have numbered the equations for future reference.
append [] ys = ys (1)
append (x:xs) ys = x : append xs ys (2)
length [] = 0 (3)
length (x:xs) = 1 + length xs (4)
Suppose that we want to define a new function len2 such that len2 xs ys returns the sum of the
lengths of the lists xs and ys. Here is a simple definition:
len2 xs ys = length (append xs ys) (5)
The obvious problem with this definition is that it is inefficient: it constructs a new list just for the
purpose of finding its length. How can we avoid this inefficiency? As a first step, we instantiate
(5) with xs = [] and then use (1) to simplify the resulting expression:
len2 [] ys = length (append [] ys)            (6)
           = length ys                        (7)
Next, we instantiate (5) with xs = (x:xs) and then use (2), (4), and (5) to simplify the resulting
expression:
len2 (x:xs) ys = length (append (x:xs) ys)    (8)
               = length (x : append xs ys)    (9)
               = 1 + length (append xs ys)    (10)
               = 1 + len2 xs ys               (11)
We can now combine (7) and (11) to obtain a definition of len2 that does not use append:
len2 [] ys = length ys
len2 (x:xs) ys = 1 + len2 xs ys
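We can spot-check that the derived definition agrees with the original definition (5) on sample inputs. This is only a sketch, not a proof; the equational derivation above is the real argument.

```haskell
append :: [a] -> [a] -> [a]
append [] ys = ys
append (x:xs) ys = x : append xs ys

-- The derived definition: no intermediate list is built.
len2 :: [a] -> [b] -> Int
len2 [] ys = length ys
len2 (x:xs) ys = 1 + len2 xs ys

-- Both definitions agree on every pair of sample lists.
check :: Bool
check = and [ len2 xs ys == length (append xs ys)
            | xs <- samples, ys <- samples ]
  where
    samples = [[], [1], [1, 2], [1, 2, 3]] :: [[Int]]
```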
We could, of course, have obtained this definition of len2 with a little thought and no formal
deduction. Here is a somewhat more complicated example in which the final answer is less obvious.
Suppose we would like a function isSubList such that isSubList xs ys returns True if and only if
xs is a sublist of ys. The following definition is clearly correct but not, as it stands, implementable
(think of bs as the “before list” and as as the “after list”):
isSubList xs ys = ∃ bs,as • bs ++ xs ++ as == ys
To make progress, we assume that ys is not empty (and so can be written y:ys’) and that there
are two cases: either bs = [] or bs = y:bs’ (clearly, if bs is not empty, its first element must be
y). Then we have:
isSubList xs (y:ys') = ∃ as • [] ++ xs ++ as == y:ys'
                     ∨ ∃ bs',as • (y:bs') ++ xs ++ as == y:ys'
                     = ∃ as • xs ++ as == y:ys'
                     ∨ ∃ bs',as • bs' ++ xs ++ as == ys'
                     = ∃ as • xs ++ as == y:ys'
                     ∨ isSubList xs ys'
The second disjunct is a straightforward recursion. The first requires another function which, on
the basis of its required behaviour, we will call isPrefix. The definition of isPrefix is
isPrefix xs ys = ∃ as • xs ++ as == ys
Assuming both lists are not empty, we can write this as
isPrefix (x:xs') (y:ys') = ∃ as • (x:xs') ++ as == (y:ys')
                         = x == y ∧ ∃ as' • xs' ++ as' == ys'
                         = x == y ∧ isPrefix xs' ys'
It is straightforward to add the base cases to obtain the following definition:
isSubList [] _ = True
isSubList _ [] = False
isSubList xs (y:ys) = isPrefix xs (y:ys) || isSubList xs ys
  where
    isPrefix [] _ = True
    isPrefix _ [] = False
    isPrefix (x:xs) (y:ys) = x == y && isPrefix xs ys
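A few sample evaluations illustrate the definition (which is repeated below so that the example is self-contained). Note that isSubList tests for a contiguous segment of the list, not for an arbitrary subsequence:

```haskell
-- The definition from the text, repeated for completeness.
isSubList :: Eq a => [a] -> [a] -> Bool
isSubList [] _ = True
isSubList _ [] = False
isSubList xs (y:ys) = isPrefix xs (y:ys) || isSubList xs ys
  where
    isPrefix [] _ = True
    isPrefix _ [] = False
    isPrefix (a:as) (b:bs) = a == b && isPrefix as bs

-- [2,3] occurs as a contiguous run in [1,2,3,4]:
contiguous = isSubList [2, 3] [1, 2, 3, 4]   -- True
-- [2,4] is a subsequence of [1,2,3,4] but not a contiguous run:
scattered = isSubList [2, 4] [1, 2, 3, 4]    -- False
```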
6 The Object Oriented Paradigm
A fundamental feature of our understanding of the world is that we organize our ex-
perience as a number of distinct objects (tables and chairs, bank loans and algebraic
expressions, . . .) . . . . When we wish to solve a problem on a computer, we often
need to construct within the computer a model of that aspect of the real or conceptual
world to which the solution of the problem will be applied. (Hoare 1968)
Object Oriented Programming (OOP) is the currently dominant programming paradigm. For
many people today, “programming” means “object oriented programming”. Nevertheless, there
is a substantial portion of the software industry that has not adopted OOP, and there remains
widespread misunderstanding about what OOP actually is and what, if any, benefits it has to offer.
6.1 Simula
Many programs are computer simulations of the real world or a conceptual world. Writing such
programs is easier if there is a correspondence between objects in the world and components of the
program.
Simula originated in the Norwegian Computing Centre in 1962. Kristen Nygaard proposed a
language for simulation to be developed by himself and Ole-Johan Dahl (1978). Key insights were
developed in 1965 following experience with Simula I:
the purpose of the language was to model systems;
a system is a collection of interacting processes;
a process can be represented during program execution by multiple procedures, each with its
own Algol-style stack.
The main lessons of Simula I were:
the distinction between a program text and its execution;
the fact that data and operations belong together and that most useful programming constructs
contain both.
Simula 67 was a general purpose programming language that incorporated the ideas of Simula I
but put them into a more general context. The basic concept of Simula 67 was to be “classes of
objects”. The major innovation was “block prefixing” (Nygaard and Dahl 1978).
Prefixing emerged from the study of queues, which are central to discrete-event simulations. The
queue mechanism (a list linked by pointers) can be split off from the elements of the queue (ob-
jects such as trucks, buses, people, and so on, depending on the system being simulated). Once
recognized, the prefix concept could be used in any context where common features of a collection
of classes could be abstracted in a prefixed block or, as we would say today, a superclass.
Dahl (1978, page 489) has provided his own analysis of the role of blocks in Simula.
Deleting the procedure definitions and final statement from an Algol block gives a pure data
record.
Deleting the final statement from an Algol block, leaving procedure definitions and data dec-
larations, gives an abstract data object.
Adding coroutine constructs to Algol blocks provides quasi-parallel programming capabilities.
Listing 19: A Simula Class
class Account (real balance);
begin
procedure Deposit (real amount)
balance := balance + amount;
procedure Withdraw (real amount)
balance := balance - amount;
end;
Adding a prefix mechanism to Algol blocks provides an abstraction mechanism (the class
hierarchy).
All of these concepts are subsumed in the Simula class.
Listing 19 shows a simple Simula-style class, with modernized syntax. Before we use this class, we
must declare a reference to it and then instantiate it on the heap:
ref (Account) myAccount;
myAccount := new Account(1000);
Note that the class is syntactically more like a procedure than a data object: it has an (optional)
formal parameter and it must be called with an actual parameter. It differs from an ordinary Algol
procedure in that its activation record (AR) remains on the heap after the invocation.
To inherit from Account we write something like:
Account class ChequeAccount (real amount);
. . . .
This explains the term “prefixing” — the declaration of the child class is prefixed by the name of
its parent. Inheritance and subclasses introduced a new programming methodology, although this
did not become evident for several years.
Simula provides:
coroutines that permit the simulation of concurrent processes;
multiple stacks, needed to support coroutines;
classes that combine data and a collection of functions that operate on the data;
prefixing (now known as inheritance) that allows a specialized class to be derived from a
general class without unnecessary code duplication;
a garbage collector that frees the programmer from the responsibility of deallocating storage.
Simula itself had an uneven history. It was used more in Europe than in North America, but it
never achieved the recognition that it deserved. This was partly because there were few Simula
compilers and the good compilers were expensive.
On the other hand, the legacy that Simula left is considerable: a new paradigm of programming.
Exercise 23. Coroutines are typically implemented by adding two new statements to the language:
suspend and resume. A suspend statement, executed within a function, switches control to a
scheduler which stores the value of the program counter somewhere and continues the execution of
another function. A resume statement has the effect of restarting a previously suspended function.
At any one time, exactly one function is executing. What features might a concurrent language
provide that are not provided by coroutines? What additional complications are introduced by
concurrency? 2
6.2 Smalltalk
Smalltalk originated with Alan Kay’s reflections on the future of computers and programming in
the late 60s. Kay (1996) was influenced by LISP, especially by its one-page metacircular interpreter,
and by Simula. It is interesting to note that Kay gave a talk about his ideas at MIT in November
1971; the talk inspired Carl Hewitt’s work on his Actor model, an early attempt at formalizing
objects. In turn, Sussman and Steele wrote an interpreter for Actors that eventually became
Scheme (see Section 4.2). The cross-fertilization between OOPL and FPL that occurred during
these early days has, sadly, not continued.
The first version of Smalltalk was implemented by Dan Ingalls in 1972, using BASIC (!) as the
implementation language (Kay 1996, page 533). Smalltalk was inspired by Simula and LISP; it
was based on six principles (Kay 1996, page 534).
1. Everything is an object.
2. Objects communicate by sending and receiving messages (in terms of objects).
3. Objects have their own memory (in terms of objects).
4. Every object is an instance of a class (which must be an object).
5. The class holds the shared behaviour for its instances (in the form of objects in a program
list).
6. To [evaluate] a program list, control is passed to the first object and the remainder is treated
as its message.
Principles 1–3 provide an “external” view and remained stable as Smalltalk evolved. Principles
4–6 provide an “internal” view and were revised following implementation experience. Principle
6 reveals Kay’s use of LISP as a model for Smalltalk — McCarthy had described LISP with a
one-page meta-circular interpreter; one of Kay’s goals was to do the same for Smalltalk.
Smalltalk was also strongly influenced by Simula. However, it differs from Simula in several ways:
Simula distinguishes primitive types, such as integer and real, from class types. In Smalltalk,
“everything is an object”.
In particular, classes are objects in Smalltalk. To create a new instance of a class, you send a
message to it. Since a class object must belong to a class, Smalltalk requires metaclasses.
Smalltalk effectively eliminates passive data. Since objects are “active” in the sense that they
have methods, and everything is an object, there are no data primitives.
Smalltalk is a complete environment, not just a compiler. You can edit, compile, execute, and
debug Smalltalk programs without ever leaving the Smalltalk environment.
The “block” is an interesting innovation of Smalltalk. A block is a sequence of statements that
can be passed as a parameter to various control structures and behaves rather like an object. In
the Smalltalk statement
10 timesRepeat: [Transcript nextPutAll: ' Hi!']
the receiver is 10, the message is timesRepeat:, and [Transcript nextPutAll: ' Hi!'] is
the parameter of the message.
Blocks may have parameters. One technique for iteration in Smalltalk is to use a do: message
which executes a block (its parameter) for each component of a collection. The component is
passed to the block as a parameter. The following loop determines whether "Mika" occurs in the
current object (assumed to be a collection):
self do: [:item | item = 'Mika' ifTrue: [^true]].
^false
The first occurrence of item introduces it as the name of the formal parameter for the block.
The second occurrence is its use. Blocks are closely related to closures (Section 4.2). The block
[:x | E] corresponds to λx . E.
The first practical version of Smalltalk was developed in 1976 at Xerox Palo Alto Research Center
(PARC). The important features of Smalltalk are:
everything is an object;
an object has private data and public functions;
objects collaborate by exchanging “messages”;
every object is a member of a class;
there is an inheritance hierarchy (actually a tree) with the class Object as its root;
all classes inherit directly or indirectly from the class Object;
blocks;
coroutines;
garbage collection.
Exercise 24. Discuss the following statements (Gelernter and Jagannathan 1990, page 245).
1. [Data objects are replaced by program structures.] Under this view of things, particularly
under Smalltalk’s approach, there are no passive objects to be acted upon. Every entity is an
active agent, a little program in itself. This can lead to counter-intuitive interpretations of simple
operations, and can be destructive to a natural sense of hierarchy.
2. Algol 60 gives us a representation for procedures, but no direct representation for a procedure
activation. Simula and Smalltalk inherit this same limitation: they give us representations for
classes, but no direct representations for the objects that are instantiated from classes. Our
ability to create single objects and to specify (constant) object “values” is compromised. 2
Exercise 25. In Smalltalk, an object is an instance of a class. The class is itself an object, and
therefore is a member of a class. A class whose instances are classes is called a metaclass. Do
these definitions imply an infinite regression? If so, how could the infinite regression be avoided?
Is there a class that is a member of itself? 2
6.3 CLU
CLU is not usually considered an OOPL because it does not provide any mechanisms for incre-
mental modification. However:
The resulting programming methodology is object-oriented: programs are developed
by thinking about the objects they manipulate and then inventing a modular structure
based on these objects. Each type of object is implemented by its own program module.
(Liskov 1996, page 475)
Parnas (1972) introduced the idea of “information hiding”: a program module should have a public
interface and a private implementation. Liskov and Zilles (1974) introduced CLU specifically to
support this style of programming.
An abstract data type (ADT) specifies a type, a set of operations that may be performed on
instances of the type, and the effects of these operations. The implementation of an ADT provides
the actual operations that have the specified effects and also prevents programmers from doing
anything else. The word “abstract” in CLU means “user defined”, or not built in to the language:
I referred to the types as “abstract” because they are not provided directly by a programming
language but instead must be implemented by the user. An abstract type
is abstract in the same way that a procedure is an abstract operation. (Liskov 1996,
page 473)
This is not a universally accepted definition. Many people use the term ADT in the following
sense:
An abstract data type is a system consisting of three constituents:
1. some sets of objects;
2. a set of syntactic descriptions of the primitive functions;
3. a semantic description — that is, a sufficiently complete set of relationships that
specify how the functions interact with each other.
(Martin 1986)
The difference is that an ADT in CLU provides a particular implementation of the type (and, in
fact, the implementation must be unique) whereas Martin’s definition requires only a specification.
A stack ADT in CLU would have functions called push and pop to modify the stack, but a
stack ADT respecting Martin’s definition would define the key property of a stack (the sequence
push(); pop() has no effect) but would not provide an implementation of these functions.
Although Simula influenced the design of CLU, it was seen as deficient in various ways. Simula:
does not provide encapsulation: clients could directly access the variables, as well as the
functions, of a class;
does not provide “type generators” or, as we would say now, “generic types” (“templates” in
C++);
associates operations with objects rather than types;
handles built-in and user-defined types differently — for example, instances of user-defined
classes are always heap-allocated.
CLU’s computational model is similar to that of LISP: names are references to heap-allocated
objects. The stack-only model of Algol 60 was seen as too restrictive for a language supporting
data abstractions. The reasons given for using a heap include:
Declarations are easy to implement because, for all types, space is allocated for a pointer.
Stack allocation breaks encapsulation because the size of the object (which should be an
implementation “secret”) must be known at the point of declaration.
Variable declaration is separated from object creation. This simplifies safe initialization.
Variable and object lifetimes are not tied. When the variable goes out of scope, the object
continues to exist (there may be other references to it).
The meaning of assignment is independent of type. The statement x := E means simply
that x becomes a reference (implemented as a pointer, of course) to E. There is no need to
worry about the semantics of copying an object. Copying, if needed, should be provided by a
member function of the class.
Listing 20: A CLU Cluster
intset = cluster is create, insert, delete, member, size, choose
rep = array[int]
create = proc () returns (cvt)
return (rep$new())
end create
insert = proc (s: intset, x: int)
if ∼member(s, x) then rep$addh(down(s), x) end
end insert
member = proc (s: cvt, x: int) returns (bool)
return (getind(s, x) ≤ rep$high(s))
end member
. . . .
end intset
Reference semantics requires garbage collection. Although garbage collection has a run-time
overhead, efficiency was not a primary goal of the CLU project. Furthermore, garbage collec-
tion eliminates memory leaks and dangling pointers, notorious sources of obscure errors.
CLU is designed for programming with ADTs. Associated with the interface is an implementation
that defines a representation for the type and provides bodies for the functions.
Listing 20 shows an extract from a cluster that implements a set of integers (Liskov and Guttag
1986, page 63). The components of a cluster, such as intset, are:
1. A header that gives the name of the type being implemented (intset) and the names of
the operations provided (create, insert, . . . .).
2. A definition of the representation type, introduced by the keyword rep. An intset is repre-
sented by an array of integers.
3. Implementations of the functions mentioned in the header and, if necessary, auxiliary func-
tions that are not accessible to clients.
The abstract type intset and the representation type array[int], referred to as rep, are distinct.
CLU provides special functions up, which converts the representation type to the abstract type,
and down, which converts the abstract type to the representation type. For example, the function
insert has a parameter s of type intset which is converted to the representation type so that it
can be treated as an array.
It often happens that the client provides an abstract argument that must immediately be converted
to the representation type; conversely, many functions create a value of the representation type
but must return a value of the abstract type. For both of these cases, CLU provides the keyword
cvt as an abbreviation. In the function create above, cvt converts the new instance of rep to
intset before returning, and in the function member, the formal parameter s:cvt indicates that
the client will provide an intset which will be treated in the function body as a rep.
All operations on abstract data in CLU contain the type of the operator explicitly, using the syntax
type$operation.
CLU draws a distinction between mutable and immutable objects. An object is mutable if its value
can change during execution, otherwise it is immutable. For example, integers are immutable but
arrays are mutable (Liskov and Guttag 1986, pages 19–20).
The mutability of an object depends on its type and, in particular, on the operations provided by
the type. Clearly, an object is mutable if its type has operations that change the object. After
executing
x: T = intset$create()
y: T
y := x
the variables x and y refer to the same object. If the object is immutable, the fact that it is shared
is undetectable by the program. If it is mutable, changes made via the reference x will be visible
via the reference y. Whether this is desirable depends on your point of view.
The important contributions of CLU include:
modules (“clusters”);
safe encapsulation;
distinction between the abstract type and the representation type;
names are references, not values;
the distinction between mutable and immutable objects;
analysis of copying and comparison of ADTs;
generic clusters (a cluster may have parameters, notably type parameters);
exception handling.
Whereas Simula provides tools for the programmer and supports a methodology, CLU provides
the same tools and enforces the methodology. The cost is an increase in complexity. Several
keywords (rep, cvt, down, and up) are needed simply to maintain the distinction between abstract
and representation types. In Simula, three concepts — procedures, classes, and records — are
effectively made equivalent, resulting in significant simplification. In CLU, procedures, records,
and clusters are three different things.
The argument in favour of CLU is that the compiler will detect encapsulation and other errors.
But is it the task of a compiler to prevent the programmer doing things that might be completely
safe? Or should this role be delegated to a style checker, just as C programmers use both cc (the
C compiler) and lint (the C style checker)?
The designers of CLU advocate defensive programming:
Of course, checking whether inputs [i.e., parameters of functions] are in the permit-
ted subset of the domain takes time, and it is tempting not to bother with the checks,
or to use them only while debugging, and suppress them during production. This is
generally an unwise practice. It is better to develop the habit of defensive programming,
that is, writing each procedure to defend itself against errors. . . .
Defensive programming makes it easier to debug programs. . . . (Liskov and Guttag
1986, page 100)
Exercise 26. Do you agree that defensive programming is a better habit than the alternative
suggested by Liskov and Guttag? Can you propose an alternative strategy? 2
6.4 C++
C++ was developed at Bell Labs by Bjarne Stroustrup (1994). The task assigned to Stroustrup was
to develop a new systems PL that would replace C. C++ is the result of two major design decisions:
first, Stroustrup decided that the new language would be adopted only if it was compatible with
C; second, Stroustrup’s experience of completing a Ph.D. at Cambridge University using Simula
convinced him that object orientation was the correct approach but that efficiency was essential for
acceptance. C++ is widely used but has been sharply criticized (Sakkinen 1988; Sakkinen 1992;
Joyner 1992).
C++:
is almost a superset of C (that is, there are only a few C constructs that are not accepted by
a C++ compiler);
is a hybrid language (i.e., a language that supports both imperative and OO programming),
not a “pure” OO language;
emphasizes the stack rather than the heap, although both stack and heap allocation are provided;
provides multiple inheritance, genericity (in the form of “templates”), and exception handling;
does not provide garbage collection.
Exercise 27. Explain why C++ is the most widely-used OOPL despite its technical flaws. 2
6.5 Eiffel
Software Engineering was a key objective in the design of Eiffel.
Eiffel embodies a “certain idea” of software construction: the belief that it is possible to
treat this task as a serious engineering enterprise, whose goal is to yield quality software
through the careful production and continuous development of parameterizable, scien-
tifically specified, reusable components, communicating on the basis of clearly defined
contracts and organized in systematic multi-criteria classifications. (Meyer 1992)
Eiffel borrows from Simula 67, Algol W, Alphard, CLU, Ada, and the specification language Z.
Like some other OOPLs, it provides multiple inheritance. It is notable for its strong typing, an
unusual exception handling mechanism, and assertions. By default, object names are references;
however, Eiffel also provides “expanded types” whose instances are named directly.
Eiffel does not have: global variables; enumerations; subranges; goto, break, or continue state-
ments; procedure variables; casts; or pointer arithmetic. Input/output is defined by libraries rather
than being built into the language.
The main features of Eiffel are as follows. Eiffel:
is strictly OO: all functions are defined as methods in classes;
provides multiple inheritance and repeated inheritance;
supports “programming by contract” with assertions;
has an unusual exception handling mechanism;
provides generic classes;
may provide garbage collection.
6.5.1 Programming by Contract
The problem with writing a function to compute square roots is that it is not clear what to do if
the argument provided by the caller is negative. The solution adopted in Listing 21 is to allow for
the possibility that the argument might be negative.
Listing 21: Finding a square root
double sqrt (double x)
{
    if (x ≥ 0.0)
        return . . . .; // √x
    else
        ??
}
Listing 22: A contract for square roots
sqrt (x: real): real is
    require x ≥ 0.0
    Result := . . . . -- √x
    ensure abs(Result * Result / x - 1.0) ≤ 1e-6
end
This approach, applied throughout a software system, is called defensive programming
(Section 6.3). It has several disadvantages:
Testing arguments adds overhead to the program.
Most of the time spent testing is wasted, because programmers will rarely call sqrt with a
negative argument.
The failure action, indicated by “??” in Listing 21, is hard to define. In particular, it is
undesirable to return a funny value such as −999.999, or to send a message to the console
(there may not be one), or to add another parameter to indicate success/failure. The only
reasonable possibility is some kind of exception.
Meyer’s solution is to write sqrt as shown in Listing 22. The require/ensure clauses constitute
a contract between the function and the caller. In words:
If the actual parameter supplied, x, is positive, the result returned will be approximately √x.
If the argument supplied is negative, the contract does not constrain the function in any way at all:
it can return -999.999 or any other number, terminate execution, or whatever. The caller therefore
has a responsibility to ensure that the argument is valid.
The advantages of this approach include:
Many tests become unnecessary.
Programs are not cluttered with tests for conditions that are unlikely to arise.
The require conditions can be checked dynamically during testing and (perhaps) disabled
during production runs. (Hoare considers this policy to be analogous to wearing your life
jacket during on-shore training but leaving it behind when you go to sea.)
Require and ensure clauses are a useful form of documentation; they appear in Eiffel inter-
faces.
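Languages without native contracts can approximate the discipline. Here is a minimal Java sketch of the square-root contract, using assert for the require and ensure clauses; the method name, the tolerance, and the use of Math.sqrt as the "real" computation are our own choices, not from the text. Like Eiffel's disabled checks, Java assertions cost nothing unless enabled (java -ea):

```java
class Contract {
    // require: x >= 0.  ensure: result * result is close to x.
    static double sqrt(double x) {
        assert x >= 0.0 : "require violated: x must be non-negative";
        double result = Math.sqrt(x);  // stands in for the real computation
        assert Math.abs(result * result - x) <= 1e-6 * Math.max(x, 1.0)
            : "ensure violated";
        return result;
    }
}
```

If the caller breaks the contract by passing a negative argument, the require check fails during testing; with assertions disabled, the function is unconstrained, exactly as in the Eiffel reading of the contract.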
Functions in a child class must provide a contract at least as strong as that of the corresponding
function in the parent class, as shown in Listing 23. We must have R_p ⇒ R_c and E_c ⇒ E_p. Note
Listing 23: Eiffel contracts
class Parent
feature
    f() is
        require R_p
        ensure E_p
        end
. . . .
class Child inherit Parent
feature
    f() is
        require R_c
        ensure E_c
. . . .
the directions of the implications: the ensure clauses are “covariant” but the require clauses are
“contravariant”. The following analogy explains this reversal.
A pet supplier offers the following contract: “if you send at least $200, I will send you an animal”.
The pet supplier has a subcontractor who offers the following contract: “if you send me at least
$150, I will send you a dog”. These contracts respect the Eiffel requirements: the contract offered
by the subcontractor has a weaker requirement ($150 is less than $200) but promises more (a dog
is more specific than an animal). Thus the contractor can use the subcontractor when a dog is
required and make a profit of $50.
Eiffel achieves the subcontract specifications by requiring the programmer to define
    R_c ≡ R_p ∨ · · ·
    E_c ≡ E_p ∧ · · ·
where “· · ·” stands for the additional assertion written in the child class.
The ∨ ensures that the child’s requirement is weaker than the parent’s requirement, and the ∧
ensures that the child’s commitment is stronger than the parent’s commitment.
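The effect of the ∨ and ∧ can be sketched as ordinary boolean functions. In this hypothetical Java fragment (the particular conditions are invented for illustration), the child's precondition is the parent's or-ed with an addition, and its postcondition is the parent's and-ed with an addition, so the former can only become weaker and the latter only stronger:

```java
class Contracts {
    // Parent contract for f(x): require x > 0, ensure result >= 0.
    static boolean requireParent(int x) { return x > 0; }
    static boolean ensureParent(int r)  { return r >= 0; }

    // Child additions (illustrative): also accept 0, and promise an even result.
    static boolean requireExtra(int x) { return x == 0; }
    static boolean ensureExtra(int r)  { return r % 2 == 0; }

    // Effective child contract: R_c = R_p OR extra, E_c = E_p AND extra.
    static boolean requireChild(int x) { return requireParent(x) || requireExtra(x); }
    static boolean ensureChild(int r)  { return ensureParent(r) && ensureExtra(r); }
}
```

Because requireChild is a disjunction containing requireParent, R_p ⇒ R_c holds by construction; because ensureChild is a conjunction containing ensureParent, E_c ⇒ E_p holds by construction.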
In addition to require and ensure clauses, Eiffel has provision for writing class invariants. A
class invariant is a predicate over the variables of an instance of the class. It must be true when
the object is created and after each function of the class has executed.
Eiffel programmers are not obliged to write assertions, but it is considered good Eiffel “style” to
include them. In practice, it is sometimes difficult to write assertions that make non-trivial claims
about the objects. Typical contracts say things like: “after you have added an element to this
collection, it contains one more element”. Since assertions may contain function calls, it is possible
in principle to say things like: “this structure is a heap (in the sense of heap sort)”. However:
this does not seem to be done much in practice;
complex assertions increase run-time overhead, increasing the incentive to disable checking;
a complex assertion might itself contain errors, leading to a false sense of confidence in the
correctness of the program.
6.5.2 Repeated Inheritance
Listing 24 shows a collection of classes (Meyer 1992, page 169). The features of HomeBusiness
include:
Listing 24: Repeated inheritance in Eiffel
class House
feature
address: String
value: Money
end
class Residence inherit House
rename value as residenceValue
end
class Business inherit House
rename value as businessValue
end
class HomeBusiness inherit Residence Business
. . . .
end
address, inherited once but by two paths;
residenceValue, inherited from value in class House;
businessValue, inherited from value in class House.
Repeated inheritance can be used to solve various problems that appear in OOP. For example, it is
difficult in some languages to provide collections with more than one iteration sequence. In Eiffel,
more than one sequence can be obtained by repeated inheritance of the successor function.
C++ has a similar mechanism, but it applies to entire classes (via the virtual mechanism) rather
than to individual attributes of classes.
6.5.3 Exception Handling
An Eiffel function must either fulfill its contract or report failure. If a function contains a rescue
clause, this clause is invoked if an operation within the function reports failure. A return from
the rescue clause indicates that the function has failed. However, a rescue clause may perform
some cleaning-up actions and then invoke retry to attempt the calculation again. The function in
Listing 25 illustrates the idea. The mechanism seems harder to use than other, more conventional,
exception handling mechanisms. It is not obvious that there are many circumstances in which it
makes sense to “retry” a function.
Exercise 28. Early versions of Eiffel had a reference semantics (variable names denoted references
to objects). After Eiffel had been used for a while, Meyer introduced expanded types. An instance
of an expanded type is an object that has a name (value semantics). Expanded types introduce
various irregularities into the language, because they do not have the same properties as ordinary
types. Why do you think expanded types were introduced? Was the decision to introduce them
wise? 2
6.6 Java
Java (Arnold and Gosling 1998) is an OOPL introduced by Sun Microsystems. Its syntax bears
some relationship to that of C++, but Java is simpler in many ways than C++. Key features of
Java include the following.
Listing 25: Exceptions in Eiffel
tryIt is
local
triedFirst: Boolean -- initially false
do
if not triedFirst
then firstmethod
else secondmethod
end
rescue
if not triedFirst
then
triedFirst := true
retry
else
restoreInvariant
-- report failure
end
end
Java is compiled to byte codes that are interpreted. Since any computer that has a Java byte
code interpreter can execute Java programs, Java is highly portable.
The portability of Java is exploited in network programming: Java byte codes can be transmitted
across a network and executed by any processor with an interpreter.
Java offers security. The byte codes are checked by the interpreter and have limited function-
ality. Consequently, Java byte codes do not have the potential to penetrate system security
in the way that a binary executable (or even a MS-Word macro) can.
Java has a class hierarchy with class Object at the root and provides single inheritance of
classes.
In addition to classes, Java provides interfaces with multiple inheritance.
Java has an exception handling mechanism.
Java provides concurrency in the form of threads.
Primitive values, such as int, are not objects in Java. However, Java provides wrapper
classes, such as Integer, for each primitive type.
A variable name in Java is a reference to an object.
Java provides garbage collection.
6.6.1 Portability
Compiling to byte codes is an implementation mechanism and, as such, is not strongly relevant to
this course. The technique is quite old — Pascal P achieved portability in the same way during
the mid-seventies — and has the disadvantage of inefficiency: Java programs are typically an order
of magnitude slower than C++ programs with the same functionality. For some applications, such
as simple Web “applets”, the inefficiency is not important. More efficient implementations are
promised. One interesting technique is “just in time” compilation, in which program statements
are compiled immediately before execution; parts of the program that are not needed during a
particular run (this may be a significant fraction of a complex program) are not compiled.
Listing 26: Interfaces
interface Comparable
{
    Boolean equalTo (Comparable other);
    // Return True if ‘this’ and ‘other’ objects are equal.
}
interface Ordered extends Comparable
{
    Boolean lessThan (Ordered other);
    // Return True if ‘this’ object precedes ‘other’ object in the ordering.
}
The portability of Java has been a significant factor in the rapid spread of its popularity, particularly
for Web programming.
6.6.2 Interfaces
A Java class, as usual in OOP, is a compile-time entity whose run-time instances are objects. An
interface declaration is similar to a class declaration, but introduces only abstract methods and
constants. Thus an interface has no instances.
A class may implement an interface by providing definitions for each of the methods declared in
the interface. (The class may also make use of the values of the constants declared in the interface.)
A Java class can inherit from at most one parent class but it may inherit from several interfaces
provided, of course, that it implements all of them. Consequently, the class hierarchy is a rooted
tree but the interface hierarchy is a directed acyclic graph. Interfaces provide a way of describing
and factoring the behaviour of classes.
Listing 26 shows a couple of simple examples of interfaces. The first interface ensures that instances
of a class can be compared. (In fact, all Java classes inherit the function equals from class Object
that can be redefined for comparison in a particular class.) If we want to store instances of a class
in an ordered set, we need a way of sorting them as well: this requirement is specified by the second
interface.
Class Widget is required to implement equalTo and lessThan:
class Widget implements Ordered
{
    . . . .
}
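A complete, runnable version of this arrangement might look as follows; the size field and its use in the comparisons are our own illustrative choices:

```java
interface Comparable {
    boolean equalTo(Comparable other);
}

interface Ordered extends Comparable {
    boolean lessThan(Ordered other);
}

// A hypothetical Widget, ordered by a size field.
class Widget implements Ordered {
    private final int size;
    Widget(int size) { this.size = size; }

    public boolean equalTo(Comparable other) {
        return (other instanceof Widget) && ((Widget) other).size == size;
    }
    public boolean lessThan(Ordered other) {
        return (other instanceof Widget) && size < ((Widget) other).size;
    }
}
```

Note that the locally declared Comparable shadows java.lang.Comparable here; a real program would choose a different name or package.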
6.6.3 Exception Handling
The principal components of exception handling in Java are as follows:
There is a class hierarchy rooted at class Exception. Instances of a subclass of class Exception
are used to convey information about what has gone wrong. We will call such instances
“exceptions”.
The throw statement is used to signal the fact that an exception has occurred. The argument
of throw is an exception.
Listing 27: Declaring a Java exception class
public class DivideByZero extends Exception
{
    DivideByZero()
    {
        super("Division by zero");
    }
}
Listing 28: Another Java exception class
public class NegativeSquareRoot extends Exception
{
    public float value;
    NegativeSquareRoot(float badValue)
    {
        super("Square root with negative argument");
        value = badValue;
    }
}
The try statement is used to attempt an operation and handle any exceptions that it raises.
A try statement may use one or more catch statements to handle exceptions of various types.
The following simple example indicates the way in which exceptions might be used. Listing 27
shows the declaration of an exception for division by zero. The next step is to declare another
exception for negative square roots, as shown in Listing 28.
Next, we suppose that either of these problems may occur in a particular computation, as shown
in Listing 29. Finally, we write some code that tries to perform the computation, as shown in
Listing 30.
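Assembled into a single self-contained program, the pieces of Listings 27–30 fit together like this. The body of bigCalculation is invented for illustration, and value is widened to double for simplicity:

```java
class DivideByZero extends Exception {
    DivideByZero() { super("Division by zero"); }
}

class NegativeSquareRoot extends Exception {
    final double value;
    NegativeSquareRoot(double badValue) {
        super("Square root with negative argument");
        value = badValue;
    }
}

class Demo {
    // A stand-in computation: fails on any negative argument.
    static void bigCalculation(double x)
            throws DivideByZero, NegativeSquareRoot {
        if (x < 0) throw new NegativeSquareRoot(x);
    }

    public static void main(String[] args) {
        try {
            bigCalculation(-4.0);
        } catch (DivideByZero e) {
            System.out.println("Oops! divided by zero.");
        } catch (NegativeSquareRoot n) {
            System.out.println("Argument for square root was " + n.value);
        }
    }
}
```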
6.6.4 Concurrency
Java provides concurrency in the form of threads. Threads are defined by a class in the standard
library. Java threads have been criticized on (at least) two grounds.
Listing 29: Throwing exceptions
public void bigCalculation ()
    throws DivideByZero, NegativeSquareRoot
{
    . . . .
    if (. . . .) throw new DivideByZero();
    . . . .
    if (. . . .) throw new NegativeSquareRoot(x);
    . . . .
}
Listing 30: Using the exceptions
{
    try
    {
        bigCalculation();
    }
    catch (DivideByZero e)
    {
        System.out.println("Oops! divided by zero.");
    }
    catch (NegativeSquareRoot n)
    {
        System.out.println("Argument for square root was " + n.value);
    }
}
The synchronization methods provided by Java (based on the early concepts of semaphores
and monitors) are out of date and do not benefit from more recent research in this area.
The concurrency mechanism is insecure. Brinch Hansen (1999) argues that Java’s concurrency
mechanisms are unsafe and that such mechanisms must be defined in the base language and
checked by the compiler, not provided separately in a library.
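A minimal example of library-defined threads and monitor-style synchronized methods; the Counter class is our own illustration:

```java
class Counter {
    private int count = 0;

    // 'synchronized' serializes access to the counter. This monitor-style
    // locking lives in the library and language syntax but, as Brinch Hansen
    // argues, correct use is not enforced by the compiler: forgetting
    // 'synchronized' compiles silently and races at run time.
    synchronized void increment() { count++; }
    synchronized int get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        Counter c = new Counter();
        Runnable task = () -> {
            for (int i = 0; i < 10000; i++) c.increment();
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println(c.get()); // 20000, thanks to synchronization
    }
}
```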
Exercise 29. Byte codes provide portability. Can you suggest any other advantages of using byte
codes? 2
6.7 Kevo
All of the OOPLs previously described in this section are class-based languages. In a class-based
language, the programmer defines one or more classes and, during execution, instances of these
classes are created and become the objects of the system. (In Smalltalk, classes are run-time
objects, but Smalltalk is nevertheless a class-based language.) There is another, smaller, family of
OOPLs that use prototypes rather than classes.
Although there are several prototype-based OOPLs, we discuss only one of them here. Antero
Taivalsaari (1993) has given a thorough account of the motivation and design of his language,
Kevo. Taivalsaari (1993, page 172) points out that class-based OOPLs, starting with Simula, are
based on an Aristotelian view in which
the world is seen as a collection of objects with well-defined properties arranged [ac-
cording] to a hierarchical taxonomy of concepts.
The problem with this approach is that there are many classes which people understand intuitively
but which are not easily defined in taxonomic terms. Common examples include “book”, “chair”,
“game”, etc.
Prototypes provide an alternative to classes. In a prototype-based OOPL, each new type of object
is introduced by defining a prototype or exemplar, which is itself an object. Identical copies
of a prototype — as many as needed — are created by cloning. Alternatively, we can clone a
Listing 31: Object Window
REF Window
Object.new → Window
Window ADDS
VAR rect
Prototypes.Rectangle.new → rect
VAR contents
METHOD drawFrame . . . .;
METHOD drawContents . . . .;
METHOD refresh
self.drawFrame
self.drawContents;
ENDADDS;
Listing 32: Object Title Window
REF TitleWindow
Window.new → TitleWindow
TitleWindow ADDS
VAR title
"Anonymous" → title
METHOD drawFrame . . . .;
ENDADDS;
prototype, modify it in some way, and use the resulting object as a prototype for another collection
of identical objects.
The modified object need not be self-contained. Some of its methods will probably be the same
as those of the original object, and these methods can be delegated to the original object rather
than replicated in the new object. Delegation thus plays the role of inheritance in a prototype-
based language and, for this reason, prototype-based languages are sometimes called delegation
languages. This is misleading, however, because delegation is not the only possible mechanism for
inheritance with prototypes.
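Delegation can be sketched even in a class-based language. In this hypothetical Java fragment, an object holds named slots and forwards any message it does not define to its prototype; cloning simply creates an empty object that delegates everything to the original:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// A toy prototype object: named slots holding behaviour.
class Proto {
    private final Map<String, Supplier<String>> slots = new HashMap<>();
    private final Proto parent;  // delegate for slots not defined locally

    Proto(Proto parent) { this.parent = parent; }

    void define(String name, Supplier<String> body) { slots.put(name, body); }

    // Lookup: use a local slot if present, otherwise delegate to the parent.
    String send(String name) {
        Supplier<String> body = slots.get(name);
        if (body != null) return body.get();
        if (parent != null) return parent.send(name);
        throw new RuntimeException("message not understood: " + name);
    }

    // A clone starts empty and delegates to the object it was cloned from.
    Proto cloneObject() { return new Proto(this); }
}
```

A clone answers messages exactly as its prototype does until a slot is redefined, which is the overriding behaviour of TitleWindow.drawFrame in Listing 32.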
Example 16: Objects in Kevo. This example illustrates how new objects are introduced in
Kevo; it is based on (Taivalsaari 1993, pages 187–188). Listing 31 shows the introduction of a
window object. The first line, REF Window, declares a new object with no properties. The second
line defines Window as a clone of Object. In Kevo terminology, Window receives the basic printing
and creation operations defined for Object. In the next block of code, properties are added to the
window using the module operation ADDS.
The window has two components: rect and contents. The properties of rect are obtained
from the prototype Rectangle. The properties of contents are left unspecified in this example.
Finally, three new methods are added to Window. The first two are left undefined here, but the
third, refresh, draws the window frame first and then the contents.
Listing 32 shows how to add a title to a window. It declares a new object, TitleWindow with
Window as a prototype. Then it adds a new variable, title, with initial value "Anonymous", and
provides a new method drawFrame that overrides Window.drawFrame and draws a frame with a
title. 2
6.8 Other OOPLs
The OOPLs listed below are interesting, but lack of time precludes further discussion of them.
Beta. Beta is a descendant of Simula, developed in Denmark. It differs in many respects from
“mainstream” OOPLs.
Blue. Blue (Kölling 1999) is a language and environment designed for teaching OOP. The language
was influenced to some extent by Dee (Grogono 1991b). The environment is interesting in
that users work directly with “objects” rather than program text and generated output.
CLOS. The Common LISP Object System is an extension to LISP that provides OOP. It has a
number of interesting features, including “multi-methods” — the choice of a function is based
on the class of all of its arguments rather than its first (possibly implicit) argument. CLOS
also provides “before” and “after” methods — code defined in a superclass that is executed
before or after the invocation of the function in a class.
Self. Self is perhaps the best-known prototype language. Considerable effort has been put into
efficient implementation, to the extent that Self programs run at up to half of the speed of
equivalent C++ programs. This is an impressive achievement, given that Self is considerably
more flexible than C++.
6.9 Issues in OOP
OOP is the programming paradigm that currently dominates industrial software development. In
this section, we discuss various issues that arise in OOP.
6.9.1 Algorithms + Data Structures = Programs
The distinction between code and data was made at an early stage. It is visible in FORTRAN and
reinforced in Algol 60. Algol provided the recursive block structure for code but almost no facilities
for data structuring. Algol 68 and Pascal provided recursive data structures but maintained the
separation between code and data.
LISP brought algorithms (in the form of functions) and data structures together, but:
the mutability of data was downplayed;
subsequent FPLs reverted to the Algol model, separating code and data.
Simula started the move back towards the von Neumann Architecture at a higher level of abstrac-
tion than machine code: the computer was recursively divided into smaller computers, each with its
own code and stack. This was the feature of Simula that attracted Alan Kay and led to Smalltalk
(Kay 1996).
Many OOPLs since Smalltalk have withdrawn from its radical ideas. C++ was originally called
“C with classes” (Stroustrup 1994) and modern C++ remains an Algol-like systems programming
language that supports OOP. Ada has shown even more reluctance to move in the direction of
OOP, although Ada 9X has a number of OO features.
6.9.2 Values and Objects
Some data behave like mathematical objects and other data behave like representations of physical
objects (MacLennan 1983).
Consider the sets { 1, 2, 3 } and { 2, 3, 1 }. Although these sets have different representations (the
ordering of the members is different), they have the same value because in set theory the only
important property of a set is the members that it contains: “two sets are equal if and only if they
contain the same members”.
Consider two objects representing people:
[name="Peter", age=55]
[name="Peter", age=55]
Although these objects have the same value, they might refer to different people who have the same
name and age. The consequences of assuming that these two objects denote the same person could
be serious: in fact, innocent people have been arrested and jailed because they had the misfortune
to have the same name and social security number as a criminal.
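Java makes the distinction concrete: == compares identities, while equals can be defined to compare attribute values. A sketch with a hypothetical Person class:

```java
import java.util.Objects;

class Person {
    final String name;
    final int age;
    Person(String name, int age) { this.name = name; this.age = age; }

    // Value comparison: two Persons are equals if name and age match.
    @Override public boolean equals(Object o) {
        return (o instanceof Person)
            && name.equals(((Person) o).name)
            && age == ((Person) o).age;
    }
    @Override public int hashCode() { return Objects.hash(name, age); }
}
```

Creating new Person("Peter", 55) twice yields objects that are equals but not ==: equal values, distinct identities, which is exactly the distinction that matters when the "equal" person is arrested.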
In the following comparison of values and objects, each statement about values is paired with the
corresponding statement about objects.
Values: Integers have values. For example, the integer 6 is a value.
Objects: Objects possess values as attributes. For example, a counter might have 6 as its current
value.
Values: Values are abstract. The value 7 is the common property shared by all sets with cardinality 7.
Objects: Objects are concrete. An object is a particular collection of subobjects and values. A
counter has one attribute: the value of the count.
Values: Values are immutable (i.e., do not change).
Objects: Objects are usually, but not necessarily, mutable (i.e., may change).
Values: It does not make sense to change a value. If we change the value 3, it is no longer 3.
Objects: It is meaningful, and often useful, to change the value of an object. For example, we can
increment a counter.
Values: Values need representations. We need representations in order to compute with values.
The representations 7, VII, sept, sju, and a particular combination of bits in the memory of a
computer are different representations of the same abstract value.
Objects: Objects need representations. An object has subobjects and values as attributes.
Values: It does not make sense to talk about the identity of a value. Suppose we evaluate 2 + 3,
obtaining 5, and then 7 − 2, again obtaining 5. Is it the same 5 that we obtain from the second
calculation?
Objects: Identity is an important attribute of objects. Even objects with the same attributes may
have their own, unique identities: consider identical twins.
Values: We can copy a representation of a value, but copying a value is meaningless.
Objects: Copying an object certainly makes sense. Sometimes, it may even be useful, as when we
copy objects from memory to disk, for example. However, multiple copies of an object with a
supposedly unique identity can cause problems, known as “integrity violations” in the database
community.
Values: We consider different representations of the same value to be equal.
Objects: An object is equal to itself. Whether two distinct objects with equal attributes are equal
depends on the application. A person who is arrested because his name and social insurance
number are the same as those of a wanted criminal may not feel “equal to” the criminal.
Values: Aliasing is not a problem for values. It is immaterial to the programmer, and should not
even be detectable, whether two instances of a value are stored in the same place or in different
places.
Objects: All references to a particular object should refer to the same object. This is sometimes
called “aliasing”, and is viewed as harmful, but it is in fact desirable for objects with identity.
Values: Functional and logical PLs usually provide values.
Objects: Object oriented PLs should provide objects.
Values: Programs that use values only have the property of referential transparency.
Objects: Programs that provide objects do not provide referential transparency, but may have
other desirable properties such as encapsulation.
Values: Small values can be held in machine words. Large values are best represented by references
(pointers to the data structure representing the value).
Objects: Very small objects may be held in machine words. Most objects should be represented by
references (pointers to the data structure holding the object).
Values: The assignment operator for values should be a reference assignment. There is no need to
copy values, because they do not change.
Objects: The assignment operator for objects should be a reference assignment. Objects should
not normally be copied, because copying endangers their integrity.
The distinction between values and objects is not as clear as these comparisons suggest. When we
look at programming problems in more detail, doubts begin to arise.
Exercise 30. Should sets be values? What about sets with mutable components? Does the model
handle objects with many to one abstraction functions? What should a PL do about values and
objects? 2
Conclusions:
The CM must provide values and may provide objects.
What is “small” and what is “large”? This is an implementation issue: the compiler must
decide.
What is “primitive” and what is “user defined”? The PL should minimize the distinction as
much as possible.
Strings should behave like values. Since they have unbounded size, many PLs treat them as
objects.
Copying and comparing are semantic issues. Compilers can provide default operations and
“building bricks”, but they cannot provide fully appropriate implementations.
6.9.3 Classes versus Prototypes
The most widely-used OOPLs — C++, Smalltalk, and Eiffel — are class-based. Classes provide a
visible program structure that seems to be reassuring to designers of complex software systems.
Prototype languages are a distinct but active minority. The relation between classes and prototypes
is somewhat analogous to the relation between procedural (FORTRAN and Algol) and functional
(LISP) languages in the 60s.
Prototypes provide more flexibility and are better suited to interactive environments than classes.
Although large systems have been built using prototype languages, these languages seem to be
more often used for exploratory programming.
6.9.4 Types versus Objects
In most class-based PLs, functions are associated with objects. This has the effect of making binary
operations asymmetric. For example, x = y is interpreted as x.equal(y) or, approximately,
“send the message equal with argument y to the object x”. It is not clear whether, with this
interpretation, y = x means the same as x = y. The problem is complicated further by inheritance:
if class C is a subclass of class P, can we compare an instance of C to an instance of P, and is this
comparison commutative?
Some of the problems can be avoided by associating functions with types rather than with objects.
This is the approach taken by CLU and some of the “functional” OOPLs. An expression such as
x = y is interpreted in the usual mathematical sense as equal(x,y). However, even this approach
has problems if x and y do not have the same type.
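The asymmetry can be demonstrated concretely. In the following hypothetical Java sketch, the method form dispatches on the receiver alone, so p.equalTo(c) and c.equalTo(p) can disagree, while a type-based static equal treats both arguments alike:

```java
class P {
    final int v;
    P(int v) { this.v = v; }
    boolean equalTo(P other) { return v == other.v; }
}

class C extends P {
    final int w;
    C(int v, int w) { super(v); this.w = w; }
    // Dispatch depends on the receiver: a C checks w, a P does not.
    @Override boolean equalTo(P other) {
        return (other instanceof C) && v == other.v && w == ((C) other).w;
    }
}

class Sym {
    // Type-based alternative: both arguments are treated symmetrically.
    static boolean equal(P x, P y) {
        return x.getClass() == y.getClass() && x.equalTo(y);
    }
}
```

With P p = new P(1) and C c = new C(1, 2), p.equalTo(c) is true but c.equalTo(p) is false, which is the commutativity failure described above; Sym.equal rejects the mixed pair in both orders.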
6.9.5 Pure versus Hybrid Languages
Smalltalk, Eiffel, and Java are pure OOPLs: they can describe only systems of objects and pro-
grammers are forced to express everything in terms of objects.
C++ is a hybrid PL in that it supports both procedural and object styles. Once the decision was
made to extend C rather than starting afresh (Stroustrup 1994, page 43), it was inevitable that
C++ would be a hybrid language. In fact, its precursor, “C with classes”, did not even claim to be
object oriented.
A hybrid language can be dangerous for inexperienced programmers because it encourages a hybrid
programming style. OO design is performed perfunctorily and design problems are solved by
reverting to the more general, but lower level, procedural style. (This is analogous to the early
days of “high level” programming when programmers would revert to assembler language to express
constructs that were not easy or possible to express at a high level.)
For experienced programmers, and especially those trained in OO techniques, a hybrid language
is perhaps better than a pure OOPL. These programmers will use OO techniques most of the
time and will only resort to procedural techniques when these are clearly superior to the best OO
solution. As OOPLs evolve, these occasions should become rarer.
Perhaps “pure or hybrid” is the wrong question. A better question might be: suppose a PL
provides OOP capabilities. Does it need to provide anything else? If so, what? One answer is: FP
capabilities. The multi-paradigm language LEDA (Budd 1995) demonstrates that it is possible to
provide functional, logic, and object programming in a single language. However, LEDA provides
these in a rather ad hoc way. A more tightly-integrated design would be preferable, but is harder
to accomplish.
6.9.6 Closures versus Classes
The most powerful programming paradigms developed so far are those that do not separate code
and data but instead provide “code and data” units of arbitrary size. This can be accomplished
by high order functions and mutable state, as in Scheme (Section 4.2) or by classes. Although
each method has advantages for certain applications, the class approach seems to be better in most
practical situations.
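The two approaches can be compared directly in C++, which supports both. The sketch below (names are illustrative) builds a counter first as a closure over mutable state, in the Scheme style, and then as a class:

```cpp
#include <cassert>
#include <functional>

// Closure version: a higher-order function returning a function that
// captures mutable state.
std::function<int()> makeCounter() {
    int count = 0;
    return [count]() mutable { return ++count; };
}

// Class version: the same "code + data" unit expressed as a class.
class Counter {
    int count;
public:
    Counter() : count(0) {}
    int next() { return ++count; }
};
```

Both versions package code together with the state it updates; the class version makes the interface explicit, which is one reason it tends to scale better.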
6.9.7 Inheritance
Inheritance is one of the key features of OOP and at the same time one of the most troublesome.
Conceptually, inheritance can be defined in a number of ways and has a number of different uses.
In practice, most (but not all!) OOPLs provide a single mechanism for inheritance and leave it to
the programmer to decide how to use the mechanism. Moreover, at the implementation level,
the varieties of inheritance all look much the same — a fact that discourages PL designers from
providing multiple mechanisms.
One of the clearest examples of the division is Meyer’s (1988) “marriage of convenience” in which
the class ArrayStack inherits from Array and Stack. The class ArrayStack behaves like a stack
— it has functions such as push, pop, and so on — and is represented as an array. There are other
ways of obtaining this effect — notably, the class ArrayStack could have an array object as an
attribute — but Meyer maintains that inheritance is the most appropriate technique, at least in
Eiffel.
Whether or not we agree with Meyer, it is clear that two different relations are involved.
The relation ArrayStack ∼ Stack is not the same as the relation ArrayStack ∼ Array. Hence the
question: should an OOPL distinguish between these relations or not?
Based on the “marriage of convenience” the answer would appear to be “yes” — two distinct ends
are achieved and the means by which they are achieved should be separated. A closer examination,
however, shows that the answer is by no means so clear-cut. There are many practical programming
situations in which code in a parent class can be reused in a child class even when interface
consistency is the primary purpose of inheritance.
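For comparison, the alternative mentioned above, in which the stack holds an array object as an attribute rather than inheriting from Array, might be sketched in C++ as follows (the template and names are illustrative, not Meyer's Eiffel code):

```cpp
#include <cassert>
#include <vector>

// ArrayStack behaves like a stack (push, pop) and is represented by an
// array; here the array is an attribute rather than a parent class.
template <typename T>
class ArrayStack {
    std::vector<T> rep;  // the "Array" part: representation only
public:
    void push(const T &x) { rep.push_back(x); }
    T pop() {
        T top = rep.back();
        rep.pop_back();
        return top;
    }
    bool empty() const { return rep.empty(); }
};
```

With composition, clients see only the stack behaviour; with the "marriage of convenience", the Array interface is inherited as well and, in Eiffel, can be hidden by the subclass.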
6.10 A Critique of C++
C++ is both the most widely-used OOPL and the most heavily criticized. Further criticism might
appear to be unnecessary. The purpose of this section is to show how C++ fails to fulfill its role
as the preferred OOPL.
The first part of the following discussion is based on a paper by LaLonde and Pugh (1995) in which
they discuss C++ from the point of view of a programmer familiar with the principles of OOP,
exemplified by a language such as Smalltalk, but who is not familiar with the details of C++.
We begin by writing a specification for a simple BankAccount class with an owner name, a balance,
and functions for making deposits and withdrawals: see Listing 33.
Data members are prefixed with “p” to distinguish them from parameters with similar names.
The instance variable pTrans is a pointer to an array containing the transactions that have
taken place for this account. Functions that operate on this array have been omitted.
We can create instances of BankAccount on either the stack or the heap:
BankAccount account1("John"); // on the stack
BankAccount *account2 = new BankAccount("Jane"); // on the heap
The first account, account1, will be deleted automatically at the end of the scope containing the
declaration. The second account, account2, will be deleted only when the program executes a
delete operation. In both cases, the destructor will be invoked.
The notation for accessing functions depends on whether the variable refers to the object directly
or is a pointer to the object.
account1.balance()
account2->balance()
A member function can access the private parts of another instance of the same class, as shown in
Listing 34. This comes as a surprise to Smalltalk programmers because, in Smalltalk, an object
can access only its own components.
The next step is to define the member functions of the class BankAccount: see Listing 35.
Listing 36 shows a simple test of the class BankAccount. This test introduces two bugs:
- The array account1.pTrans is lost because it is over-written by account2.pTrans.
- At the end of the test, account2.pTrans will be deleted twice: once when the destructor for account1 is called, and again when the destructor for account2 is called.
Although most introductions to C++ focus on stack objects, it is more interesting to consider heap
objects: Listing 37.
The code in Listing 37 — creating three special accounts — is rather specialized. Listing 38
generalizes it so that we can enter the account owner’s names interactively.
Listing 33: Declaring a Bank Account
class BankAccount
{
public:
    BankAccount(char *newOwner = "");
    ~BankAccount();
    void owner(char *newOwner);
    char *owner();
    double balance();
    void credit(double amount);
    void debit(double amount);
    void transfer(BankAccount *other, double amount);
private:
    char *pOwner;
    double pBalance;
    Transaction *pTrans[MAXTRANS];
};
Listing 34: Accessing Private Parts
void BankAccount::transfer(BankAccount *other, double amount)
{
    pBalance += amount;
    other->pBalance -= amount;
}
Listing 35: Member Functions for the Bank Account
BankAccount::BankAccount(char *newOwner)
{
    pOwner = newOwner;
    pBalance = 0.0;
}

BankAccount::~BankAccount()
{
    // Delete individual transactions
    delete [] pTrans; // Delete the array of transactions
}

void BankAccount::owner(char *newOwner)
{
    pOwner = newOwner;
}

char *BankAccount::owner()
{
    return pOwner;
}
. . . .
Listing 36: Testing the Bank Account
void test()
{
    BankAccount account1("Anne");
    BankAccount account2("Bill");
    account1 = account2;
}
Listing 37: Using the Heap
BankAccount **accounts = new BankAccount * [3];
accounts[0] = new BankAccount("Anne");
accounts[1] = new BankAccount("Bill");
accounts[2] = new BankAccount("Chris");
// Account operations omitted.
// Delete everything.
for (int i = 0; i < 3; i++)
    delete accounts[i];
delete [] accounts;
Listing 38: Entering a Name
BankAccount **accounts = new BankAccount * [3];
char *buffer = new char[100];
for (int i = 0; i < 3; i++)
{
    cout << "Please enter your name: ";
    cin.getline(buffer, 100);
    accounts[i] = new BankAccount(buffer);
}
// Account operations omitted.
// Delete everything.
for (int i = 0; i < 3; i++)
    delete accounts[i];
delete [] accounts;
delete [] buffer;
Listing 39: Adding buffers for names
// char * buffer = new char[100]; (deleted)
for (int i = 0; i < 3; i++)
{
    cout << "Please enter your name: ";
    char *buffer = new char[100];
    cin.getline(buffer, 100);
    accounts[i] = new BankAccount(buffer);
}
// Account operations omitted.
// Delete everything.
for (int i = 0; i < 3; i++)
{
    delete [] accounts[i]->owner();
    delete accounts[i];
}
delete [] accounts;
The problem with this code is that all of the bank accounts have the same owner: that of the last
name entered. This is because each account contains a pointer to the unique buffer. We can fix
this by allocating a new buffer for each account, as in Listing 39.
This version works, but is poorly designed. Deleting the owner string should be the responsibility
of the object, not the client. The destructor for BankAccount should delete the owner string. After
making this change, we can remove the deletion of the owner string from the code above: Listing 40.
Unfortunately, this change introduces another problem. Suppose we use the loop to create two
accounts and then create the third account with this statement:
accounts[2] = new BankAccount("Tina");
The account destructor fails because "Tina" was not allocated using new. To correct this problem,
we must modify the constructor for BankAccount. The rule we must follow is that data must be
allocated with the constructor and deallocated with the destructor, or controlled entirely outside
Listing 40: Correcting the Destructor
BankAccount::~BankAccount()
{
    delete [] pOwner;
    // Delete individual transactions
    delete [] pTrans; // Delete the array of transactions
}
Listing 41: Correcting the Constructor
BankAccount::BankAccount(char *newOwner)
{
    pOwner = new char[strlen(newOwner) + 1];
    strcpy(pOwner, newOwner);
}
the object. Listing 41 shows the new constructor.
The next problem occurs when we try to use the overloaded function owner() to change the name
of an account owner.
accounts[2]->owner("Fred");
Once again, we have introduced two bugs:
- The original owner of accounts[2] will be lost.
- When the destructor is invoked for accounts[2], it will attempt to delete "Fred", which was not allocated by new.
To correct these problems, we must rewrite the member function owner(), as shown in Listing 42.
We now have bank accounts that work quite well, provided that they are allocated on the heap. If
we try to mix heap objects and stack objects, however, we again encounter difficulties: Listing 43.
The error occurs because transfer expects the address of an account rather than an actual account.
We correct it by applying the address-of operator, &.
account1->transfer(&account3, 10.0);
We could make things easier for the programmer who is using accounts by declaring an overloaded
version of transfer() as shown below. Note that, in the version of Listing 44, we must use the
operator “.” rather than “→”. This is because other behaves like an object, although in fact an
address has been passed.
The underlying problem here is that references (&) work best with stack objects and pointers (*)
work best with heap objects. It is awkward to mix stack and heap allocation in C++ because two
Listing 42: Correcting the Owner Function
void BankAccount::owner(char *newOwner)
{
    delete [] pOwner;
    pOwner = new char[strlen(newOwner) + 1];
    strcpy(pOwner, newOwner);
}
Listing 43: Stack and Heap Allocation
BankAccount *account1 = new BankAccount("Anne");
BankAccount *account2 = new BankAccount("Bill");
BankAccount account3("Chris");
account1->transfer(account2, 10.0); // OK
account1->transfer(account3, 10.0); // Error!
Listing 44: Revising the Transfer Function
void BankAccount::transfer(BankAccount &other, double amount)
{
    pBalance += amount;
    other.pBalance -= amount;
}
sets of functions are needed.
If stack objects are used without references, the compiler is forced to make copies. To implement
the code shown in Listing 45, C++ uses the (default) copy constructor for BankAccount twice:
first to make a copy of account2 to pass to test() and, second, to make another copy of account2
that will become myAccount. At the end of the scope, these two copies must be deleted.
The destructors will fail because the default copy constructor makes a shallow copy — it does not
copy the pTrans fields of the accounts. The solution is to write our own copy constructor and
assignment overload for class BankAccount, as in Listing 46. The function pDuplicate() makes
a deep copy of an account: it must copy all of the components of a bank account object. The
function pDelete() destroys all of the components of the old object and has the same effect as the
destructor. It is recommended practice in C++ to write these functions for any class that contains
pointers.
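A minimal, compilable version of this recommended practice might look as follows. The class Owned and the helper names pDuplicate and pDelete are illustrative, echoing the listings above rather than reproducing them:

```cpp
#include <cassert>
#include <cstring>

// Owned holds a raw character pointer, like pOwner above. Its copy
// constructor, assignment operator, and destructor share two helpers:
// pDuplicate (deep copy) and pDelete (release).
class Owned {
    char *pName;
    void pDuplicate(const Owned &other) {
        pName = new char[std::strlen(other.pName) + 1];
        std::strcpy(pName, other.pName);
    }
    void pDelete() { delete [] pName; }
public:
    explicit Owned(const char *name) {
        pName = new char[std::strlen(name) + 1];
        std::strcpy(pName, name);
    }
    Owned(const Owned &other) { pDuplicate(other); }
    Owned &operator=(const Owned &rhs) {
        if (this != &rhs) {
            pDelete();
            pDuplicate(rhs);
        }
        return *this;
    }
    ~Owned() { pDelete(); }
    const char *name() const { return pName; }
};
```

Because every copy gets its own buffer, the double-deletion bug of Listing 36 cannot occur.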
LaLonde and Pugh continue with a discussion of infix operators for user-defined classes. If the
objects are small — complex numbers, for example — it is reasonable to store the objects on
the stack and to use call by value. C++ is well-suited to this approach: storage management is
automatic and the only overhead is the copying of values during function call and return.
If the objects are large, however, there are problems. Suppose that we want to provide the standard
algebraic operations (+, −, ∗) for 4 × 4 matrices. (This would be useful for graphics programming,
for example.) Since a matrix of doubles occupies 4 × 4 × 8 = 128 bytes, copying a matrix is relatively expensive
and the stack approach will be inefficient. But if we write functions that work with pointers to
matrices, it is difficult to manage allocation and deallocation. Even a simple function call such as
project(A + B);
will create a heap object corresponding to A + B but provides no opportunity to delete this object.
Listing 45: Copying
BankAccount test(BankAccount account)
{
    return account;
}

BankAccount myAccount;
myAccount = test(account2);
Listing 46: Copy Constructors
BankAccount::BankAccount(BankAccount &account)
{
    pDuplicate(account);
}

BankAccount & BankAccount::operator = (BankAccount &rhs)
{
    if (this == &rhs)
        return *this;
    pDelete();
    pDuplicate(rhs);
    return *this;
}
Listing 47: Using extended assignment operators
A = C;
A *= D;
A += B;
(The function cannot delete it, because it cannot tell whether its argument is a temporary.) The
same problem arises with expressions such as
A = B + C * D;
It is possible to implement the extended assignment operators, such as +=, without losing objects.
But this approach reduces us to a kind of assembly language style of programming, as shown in
Listing 47.
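The loss of temporaries can be made visible with a small counting sketch. The class Mat and the function add below are illustrative stand-ins for a matrix class with pointer-based operators, not the code under discussion:

```cpp
#include <cassert>

// Mat stands in for a large matrix class; add() mimics an operator+
// that returns a freshly allocated heap object. The static counter
// records how many Mats are currently live on the heap.
struct Mat {
    static int live;
    double v;  // stand-in for the 4 x 4 array of elements
    explicit Mat(double v) : v(v) { ++live; }
    ~Mat() { --live; }
};
int Mat::live = 0;

Mat *add(const Mat *a, const Mat *b) {
    return new Mat(a->v + b->v);  // caller must delete it -- but cannot
}
```

In a nested call such as `add(add(a, b), c)`, the inner result is passed on and its address is never stored, so it can never be deleted.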
These observations by LaLonde and Pugh seem rather hostile towards C++. They are making a
comparison between C++ and Smalltalk based only on ease of programming. The comparison is
one-sided because they do not discuss factors that might make C++ a better choice than Smalltalk
in some circumstances, such as when efficiency is important. Nevertheless, the discussion makes an
important point: the “stack model” (with value semantics) is inferior to the “heap model” (with
reference semantics) for many OOP applications. It is significant that most language designers
have chosen the heap model for OOP (Simula, Smalltalk, CLU, Eiffel, Sather, Dee, Java, Blue,
amongst others) and that C++ is in the minority on this issue.
Another basis for criticizing C++ is that it has become, perhaps not intentionally, a hacker’s
language. A sign of good language design is that the features of the language are used most of
the time to solve the problems that they were intended to solve. A sign of a hacker’s language is
that language “gurus” introduce “hacks” that solve problems by using features in bizarre ways for
which they were not originally intended.
Input and output in C++ is based on “streams”. A typical output statement looks something like
this:
cout << "The answer is " << answer << ’.’ << endl;
The operator << requires a stream reference as its left operand and an object as its right operand.
Since it returns a stream reference (usually its left operand), it can be used as an infix operator
in compound expressions such as the one shown. The notation seems clear and concise and, at
first glance, appears to be an elegant way of introducing output and input (with the operator >>)
capabilities.
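To allow instances of a user-defined class as right operands, the programmer overloads << to take and return a stream reference, so that chaining works. A minimal sketch (the Point class is illustrative):

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Point is an illustrative user-defined class. The overloaded operator
// takes a stream reference and returns it, so insertions can be chained.
struct Point {
    int x, y;
};

std::ostream &operator<<(std::ostream &os, const Point &p) {
    return os << '(' << p.x << ", " << p.y << ')';
}
```

After this declaration, `cout << "at " << p << endl;` compiles because each insertion yields the stream for the next one.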
Listing 48: A Manipulator Class
class ManipulatorForInts
{
    friend ostream & operator << (ostream &, const ManipulatorForInts &);
public:
    ManipulatorForInts (int (ios::*)(int), int);
    . . . .
private:
    int (ios::*memberfunc)(int);
    int value;
};
Obviously, << is heavily overloaded. There must be a separate function for each class for which
instances can be used as the right operand. (<< is also overloaded with respect to its left operand,
because there are different kinds of output streams.) In the example above, the first three right
operands are a string, a double, and a character.
The first problem with streams is that, although they are superficially object oriented (a stream
is an object), they do not work well with inheritance. Overloading provides static polymorphism
which is incompatible with dynamic polymorphism (recall Section 9.6). It is possible to write
code that uses dynamic binding but only by providing auxiliary functions.
The final operand, endl, writes end-of-line to the stream and flushes it. It is a global function with
signature
ostream & endl (ostream & s)
A programmer who needs additional “manipulators” of this kind has no alternative but to add
them to the global name space.
Things get more complicated when formatting is added to streams. For example, we could change
the example above to:
cout << "The answer is " << setprecision(8) << answer << ’.’ << endl;
The first problem with this is that it is verbose. (The verbosity is not really evident in such a small
example, but it is very obvious when streams are used, for example, to format a table of heterogeneous data.) The second, and more serious, problem is the implementation of setprecision. The
way this is done is (roughly) as follows (Teale 1993, page 171). First, a class for manipulators with
integer arguments is declared, as in Listing 48, and the inserter function is defined:
ostream & operator << (ostream & os, const ManipulatorForInts & m)
{
    (os.*(m.memberfunc))(m.value);
    return os;
}
Second, the function that we need can be defined:
ManipulatorForInts setprecision (int n)
{
    return ManipulatorForInts (&ios::precision, n);
}
It is apparent that a programmer must be quite proficient in C++ to understand these definitions,
let alone design similar functions. To make matters worse, the actual definitions are not as shown
Listing 49: File handles as Booleans
ifstream infile ("stats.dat");
if (infile) . . . .
if (!infile) . . . .
here, because that would require class declarations for many different types, but are implemented
using templates.
There are many other tricks in the implementation of streams. For example, we can use file handles
as if they had boolean values, as in Listing 49. It seems that there is a boolean value, infile, that
we can negate, obtaining !infile. In fact, infile can be used as a conditional because there is
a function that converts infile to (void *), with the value NULL indicating that the file is not
open, and !infile can be used as a conditional because the operator ! is overloaded for class
ifstream to return an int which is 0 if the file is open.
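The trick can be reproduced in a few lines. The hypothetical Handle class below behaves like a boolean in conditions for exactly the reasons described: it converts to (void *), with null meaning "not open", and it overloads ! to return an int:

```cpp
#include <cassert>

// Handle imitates the ifstream trick: usable as a condition because it
// converts to (void *), and negatable because ! is overloaded.
class Handle {
    bool open;
public:
    explicit Handle(bool open) : open(open) {}
    operator void *() const {
        return open ? const_cast<Handle *>(this) : 0;
    }
    int operator!() const { return open ? 0 : 1; }
};
```

Code such as `if (h)` and `if (!h)` now compiles, even though Handle has no boolean value at all.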
These are symptoms of an even deeper malaise in C++ than the problems pointed out by authors of
otherwise excellent critiques such as (Joyner 1992; Sakkinen 1992). Most experienced programmers
can get used to the well-known idiosyncrasies of C++, but few realize that frequently-used and
apparently harmless constructs are implemented by subtle and devious tricks. Programs that look
simple may actually conceal subtle traps.
6.11 Evaluation of OOP
OOP provides a way of building programs by incremental modification.
- Programs can often be extended by adding new code rather than altering existing code.
- The mechanism for incremental modification without altering existing code is inheritance. However, there are several different ways in which inheritance can be defined and used, and some of them conflict.
- The same identifier can be used in various contexts to name related functions. For example, all objects can respond to the message “print yourself”.
- Class-based OOPLs can be designed so as to support information hiding (encapsulation) — the separation of interface and implementation. However, there may be conflicts between inheritance and encapsulation.
- An interesting, but rarely noted, feature of OOP is that the call graph is created and modified during execution. This is in contrast to procedural languages, in which the call graph can be statically inferred from the program text.
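The last point can be illustrated with a short sketch (the Shape classes are illustrative): the function that a call invokes is chosen at run time by the receiving object, so the call graph depends on the data.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Every Shape can respond to the message "print yourself".
struct Shape {
    virtual ~Shape() {}
    virtual std::string print() const = 0;
};
struct Circle : Shape {
    std::string print() const { return "circle"; }
};
struct Square : Shape {
    std::string print() const { return "square"; }
};

// The program text says only "call print()"; which print() runs is
// decided at run time by the objects actually in the vector.
std::string printAll(const std::vector<Shape *> &shapes) {
    std::string out;
    for (size_t i = 0; i < shapes.size(); i++)
        out += shapes[i]->print() + " ";
    return out;
}
```

No static inspection of printAll can say which print functions it calls; that is determined by the vector's run-time contents.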
Section 6.9 contains a more detailed review of issues in OOP.
Exercise 31. Simula is 33 years old. Can you account for the slow acceptance of OOPLs by
professional programmers? □
7 Backtracking Languages
The PLs that we discuss in this section differ in significant ways, but they have an important
feature in common: they are designed to solve problems with multiple answers, and to find as
many answers as required.
In mathematics, a many-valued expression can be represented as a relation or as a function returning a set. Consider the relation “divides”, denoted by | in number theory. Any integer greater
than 1 has at least two divisors (itself and 1) and possibly others. For example, 1 | 6, 2 | 6, 3 | 6,
and 6 | 6. The functional analog of the divides relation is the set-returning function divisors:
divisors(6) = { 1, 2, 3, 6 }
divisors(7) = { 1, 7 }
divisors(24) = { 1, 2, 3, 4, 6, 8, 12, 24 }
We could define functions that return sets in an FPL, or we could design a language that works directly with relations. The approach that has been most successful, however, is to produce solutions
one at a time. The program runs until it finds the first solution, reports that solution or uses it
in some way, then backtracks to find another solution, reports that, and so on until all the choices
have been explored or the user has had enough.
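As a point of comparison, the set-returning style is easy to express in a conventional language. The sketch below (in C++, purely illustrative) computes the whole answer set at once, whereas a backtracking language would deliver the same values one at a time, on demand:

```cpp
#include <cassert>
#include <vector>

// divisors(n) returns the complete answer set in one call.
std::vector<int> divisors(int n) {
    std::vector<int> ds;
    for (int d = 1; d <= n; d++)
        if (n % d == 0) ds.push_back(d);
    return ds;
}
```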
7.1 Prolog
The computational model (see Section 10.2) of a logic PL is some form of mathematical logic.
Propositional calculus is too weak because it does not have variables. Predicate calculus is too
strong because it is undecidable. (This means that there is no effective procedure that determines
whether a given predicate is true.) Practical logic PLs are based on a restricted form of predicate
calculus that has a decision procedure.
The first logic PL was Prolog, introduced by Colmerauer (1973) and Kowalski (1974). The discussion in this section is based on (Bratko 1990).
We can express facts in Prolog as predicates with constant arguments. All Prolog textbooks start
with facts about someone’s family: see Listing 50. We read parent(pam, bob) as “Pam is the
parent of Bob”. Thus Bob’s parents are Pam and Tom; similarly for the other facts. Note that,
since constants in Prolog start with a lower case letter, we cannot write Pam, Bob, etc.
Prolog is interactive; it prompts with ?- to indicate that it is ready to accept a query. Prolog
responds to simple queries about facts that it has been told with “yes” and “no”, as in Listing 51.
If the query contains a logical variable — an identifier that starts with an upper case letter —
Prolog attempts to find a value of the variable that makes the query true.
If it succeeds, it replies with the binding.
Listing 50: A Family
parent(pam, bob).
parent(tom, bob).
parent(tom, liz).
parent(bob, ann).
parent(bob, pat).
parent(pat, jim).
Listing 51: Prolog queries
?- parent(tom, liz).
yes
?- parent(tom, jim).
no
?- parent(pam, X).
X = bob
If there is more than one binding that satisfies the query, Prolog will find them all. (The exact
mechanism depends on the implementation. Some implementations require the user to ask for more
results. Here we assume that the implementation returns all values without further prompting.)
?- parent(bob, C).
C = ann
C = pat
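The flavour of such a query can be imitated in a conventional language by storing the facts as data and collecting every binding. The C++ sketch below (illustrative only; it does not capture Prolog's generality) enumerates the bindings for ?- parent(bob, C):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> Fact;  // parent(first, second)

// The family database as data rather than as clauses.
std::vector<Fact> parentFacts() {
    std::vector<Fact> db;
    db.push_back(Fact("pam", "bob"));
    db.push_back(Fact("tom", "bob"));
    db.push_back(Fact("tom", "liz"));
    db.push_back(Fact("bob", "ann"));
    db.push_back(Fact("bob", "pat"));
    db.push_back(Fact("pat", "jim"));
    return db;
}

// All bindings of C satisfying parent(p, C), in database order.
std::vector<std::string> childrenOf(const std::string &p) {
    std::vector<Fact> db = parentFacts();
    std::vector<std::string> bindings;
    for (size_t i = 0; i < db.size(); i++)
        if (db[i].first == p) bindings.push_back(db[i].second);
    return bindings;
}
```

The Prolog program needs no such loop: enumeration of bindings is built into the language.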
Consider the problem of finding Jim’s grandparents. In logic, we can express “G is a grandparent
of Jim” in the following form: “there is a person P such that P is a parent of Jim and there is a
person G such that G is a parent of P”. In Prolog, this becomes:
?- parent(P, jim), parent(G, P).
P = pat
G = bob
The comma in this query corresponds to conjunction (“and”): the responses to the query are
bindings that make all of the terms valid simultaneously. Disjunction (“or”) is provided by multiple
facts or rules: in the example above, Bob is the parent of C is true if C is Ann or C is Pat.
We can express the “grandparent” relation by writing a rule and adding it to the collection
of facts (which becomes a collection of facts and rules). The rule for “grandparent” is “G is a
grandparent of C if G is a parent of P and P is a parent of C”. In Prolog:
grandparent(G, C) :- parent(G, P), parent(P, C).
Siblings are people who have the same parents. We could write this rule for siblings:
sibling(X, Y) :- parent(P, X), parent(P, Y).
Testing this rule reveals a problem.
?- sibling(pat, X).
X = ann
X = pat
We may not expect the second response, because we do not consider Pat to be a sibling of herself. But Prolog simply notes that the predicate parent(P, X), parent(P, Y) is satisfied by the
bindings P = bob, X = pat, and Y = pat. To correct this rule, we would have to require that X
and Y have different values:
sibling(X, Y) :- parent(P, X), parent(P, Y), different(X, Y).
It is clear that different(X, Y) must be built into Prolog somehow, because it would not be
feasible to list every true instantiation of it as a fact.
Prolog allows rules to be recursive: a predicate name that appears on the left side of a rule may
also appear on the right side. For example, a person A is an ancestor of a person X if either A is
a parent of X or A is an ancestor of a parent P of X:
ancestor(A, X) :- parent(A, X).
parent(pam, bob)
----------------
ancestor(pam, bob)     parent(bob, pat)
---------------------------------------
ancestor(pam, pat)     parent(pat, jim)
---------------------------------------
ancestor(pam, jim)
Figure 15: Proof of ancestor(pam, jim)
ancestor(A, X) :- ancestor(A, P), parent(P, X).
We confirm the definition by testing:
?- ancestor(pam, jim).
yes
We can read a Prolog program in two ways. Read declaratively, ancestor(pam, jim) is a
statement and Prolog’s task is to prove that it is true. The proof is shown in Figure 15, which uses
the convention that a premiss P written above a line, with the conclusion Q below it, means “from
P infer Q”. Read procedurally, ancestor(pam, jim) defines a sequence of fact-searches and
rule-applications that establish (or fail to establish) the goal. The procedural steps for this example
are as follows.
1. Using the first rule for ancestor, look in the data base for the fact parent(pam, jim).
2. Since this search fails, try the second rule with A = pam and X = jim. This gives two new
subgoals: ancestor(pam, P) and parent(P, jim).
3. The second of these subgoals is satisfied by P = pat from the database. The goal is now
ancestor(pam, pat).
4. The goal expands to the subgoals ancestor(pam, P) and parent(P, pat). The second of
these subgoals is satisfied by P = bob from the database. The goal is now ancestor(pam, bob).
5. The goal ancestor(pam, bob) is satisfied by applying the first rule with the known fact
parent(pam, bob).
In the declarative reading, Prolog finds multiple results simply because they are there. In the
procedural reading, multiple results are obtained by backtracking. Every point in the program at
which there is more than one choice of a variable binding is called a choice point. The choice
points define a tree: the root of the tree is the starting point of the program (the main goal) and
each leaf of the tree represents either success (the goal is satisfied with the chosen bindings) or
failure (the goal cannot be satisfied with these bindings). Prolog performs a depth-first search of
this tree.
Sometimes, we can tell from the program that backtracking will be useless. Declaratively, the
predicate minimum(X,Y,Min) is satisfied if Min is the minimum of X and Y. We define rules for the
cases X ≤ Y and X > Y:
minimum(X, Y, X) :- X =< Y.
minimum(X, Y, Y) :- X > Y.
It is easy to see that, if one of these rules succeeds, we do not want to backtrack to the other,
because it is certain to fail. We can tell Prolog not to backtrack by writing “!”, pronounced “cut”.
Program                         Meaning
p :- a,b.    p :- c.           p ⇐⇒ (a ∧ b) ∨ c
p :- a,!,b.  p :- c.           p ⇐⇒ (a ∧ b) ∨ (∼a ∧ c)
p :- c.      p :- a,!,b.       p ⇐⇒ c ∨ (a ∧ b)
Figure 16: Effect of cuts
minimum(X, Y, X) :- X =< Y, !.
minimum(X, Y, Y) :- X > Y, !.
The introduction of the cut operation has an interesting consequence. If a Prolog program has no
cuts, we can freely change the order of goals and clauses. These changes may affect the efficiency
of the program, but they do not affect its declarative meaning. If the program has cuts, this is
no longer true. Figure 16 shows an example of the use of cut and its effect on the meaning of a
program.
Recall the predicate different introduced above to distinguish siblings, and consider the following
declarations and queries.
different(X, X) :- !, fail.
different(X, Y).
?- different(tom, tom).
no
?- different(ann, tom).
yes
The query different(tom, tom) matches different(X, X) with the binding X = tom, and the
match succeeds. The Prolog constant fail then reports failure. Since the cut has been passed, no
backtracking occurs, and the query fails. The query different(ann, tom) does not match
different(X, X), but it does match the second clause, different(X, Y), and therefore succeeds.
This idea can be used to define negation in Prolog:
not(P) :- P, !, fail.
not(P).
However, it is important to understand what negation in Prolog means. Consider Listing 52,
assuming that the entire database is included. The program has stated correctly that crows fly and
penguins do not fly. But it has made this inference only because the database has no information
about penguins. The same program can produce an unlimited number of (apparently) correct
responses, and also some incorrect responses. We account for this by saying that Prolog assumes
a closed world. Prolog can make inferences from the “closed world” consisting of the facts and
rules in its database.
Now consider negation. In Prolog, proving not(P) means “P cannot be proved” which means “P
cannot be inferred from the facts and rules in the database”. Consequently, we should be sceptical
about positive results obtained from Prolog programs that use not. The response
?- not human(peter).
yes
means no more than “the fact that Peter is human cannot be inferred from the database”.
The Prolog system proceeds top-down. It attempts to match the entire goal to one of the rules
and, having found a match, applies the same idea to the parts of the matched formulas. The
“matching” process is called unification.
Listing 52: Who can fly?
flies(albatross).
flies(blackbird).
flies(crow).
?- flies(crow).
yes
?- flies(penguin).
no
?- flies(elephant).
no
?- flies(brick).
no
?- flies(goose).
no
Suppose that we have two terms A and B built up from constants and variables. A unifier of A
and B is a substitution that makes the terms identical. A substitution is a binding of variable
names to values. In the following example we follow the Prolog convention, using upper case letters
X, Y, . . . to denote variables and lower case letters a, b, . . . to denote constants. Let A and B be the
terms
A ≡ left(tree(X, tree(a, Y)))
B ≡ left(tree(b, Z))
and let
σ = (X = b, Z = tree(a, Y))
be a substitution. Then σA = σB = left(tree(b, tree(a, Y))) and σ is a unifier of A and B. Note
that the unifier is not necessarily unique. However, we can define a most general unifier which
is unique.
The unification algorithm contains a test called the occurs check that is used to reject a substitution of the form X = f(X), in which a variable is bound to a formula that contains the variable.
If we changed A and B above to
A ≡ left(tree(X, tree(a, Y)))
B ≡ left(tree(b, Y))
they would not unify.
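The algorithm can be sketched concretely. The C++ code below (an illustrative toy, not a production unifier) represents terms, applies substitutions, and performs unification with the occurs check; it accepts the first pair of terms above and rejects the second:

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// A term is a variable (upper-case name, no arguments) or a functor
// applied to subterms; constants are functors with no arguments.
struct Term {
    std::string name;
    std::vector<Term> args;
    bool isVar() const {
        return std::isupper(static_cast<unsigned char>(name[0])) && args.empty();
    }
};

typedef std::map<std::string, Term> Subst;

// Convenience constructor for terms.
Term mk(const std::string &n, const std::vector<Term> &as = std::vector<Term>()) {
    Term t;
    t.name = n;
    t.args = as;
    return t;
}

// Follow variable bindings until an unbound variable or a functor.
Term walk(Term t, const Subst &s) {
    while (t.isVar()) {
        Subst::const_iterator it = s.find(t.name);
        if (it == s.end()) break;
        t = it->second;
    }
    return t;
}

// The occurs check: does variable v appear anywhere inside t?
bool occurs(const std::string &v, const Term &t, const Subst &s) {
    Term r = walk(t, s);
    if (r.isVar()) return r.name == v;
    for (size_t i = 0; i < r.args.size(); i++)
        if (occurs(v, r.args[i], s)) return true;
    return false;
}

// Unify a and b, extending substitution s; reject X = f(... X ...).
bool unify(const Term &a, const Term &b, Subst &s) {
    Term x = walk(a, s), y = walk(b, s);
    if (x.isVar() && y.isVar() && x.name == y.name) return true;
    if (x.isVar()) {
        if (occurs(x.name, y, s)) return false;  // occurs check
        s[x.name] = y;
        return true;
    }
    if (y.isVar()) return unify(y, x, s);
    if (x.name != y.name || x.args.size() != y.args.size()) return false;
    for (size_t i = 0; i < x.args.size(); i++)
        if (!unify(x.args[i], y.args[i], s)) return false;
    return true;
}
```

Real Prolog systems use more general term representations and, as noted below, usually omit the occurs check for speed.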
You can find out whether a particular version of Prolog implements the occurs check by trying the
following query:
?- X = f(X).
If the reply is “no”, the occurs check is implemented. If, as is more likely, the occurs check is not
performed, the response will be
X = f(f(f(f( . . . .
Prolog syntax corresponds to a restricted form of first-order predicate calculus called clausal
form logic. It is not practical to use the full predicate calculus as a basis for a PL because it
is undecidable. Clausal form logic is semi-decidable: there is an algorithm that will find a proof
Listing 53: Restaurants
good(les_halles).
expensive(les_halles).
good(joes_diner).
reasonable(Restaurant) :- not expensive(Restaurant).
?- good(X), reasonable(X).
X = joes_diner
?- reasonable(X), good(X).
no
for any formula that is true in the logic. If a formula is false, the algorithm may fail after a finite
time or may loop forever. The proof technique, called SLD resolution, was introduced by Robinson
(1965). SLD stands for “Selecting a literal, using a Linear strategy, restricted to Definite clauses”.
The proof of validity of SLD resolution assumes that unification is implemented with the occurs
check. Unfortunately, the occurs check is expensive to implement and most Prolog systems omit
it. Consequently, most Prolog systems are technically unsound, although problems are rare in
practice.
A Prolog program apparently has a straightforward interpretation as a statement in logic, but the
interpretation is slightly misleading. For example, since Prolog works through the rules in the
order in which they are written, the order is significant. Since the logical interpretation of a rule
sequence is disjunction, it follows that disjunction in Prolog does not commute.
Example 17: Failure of commutativity. In the context of the rules
wife(jane).
female(X) :- wife(X).
female(X) :- female(X).
we obtain a valid answer to a query:
?- female(Y).
Y = jane
But if we change the order of the rules, as in
wife(jane).
female(X) :- female(X).
female(X) :- wife(X).
the same query leads to endless applications of the second rule. □
Exercise 32. Listing 53 shows the results of two queries about restaurants. In logic, conjunction
is commutative (P ∧ Q ≡ Q ∧ P). Explain why commutativity fails in this program. □
The important features of Prolog include:
- Prolog is based on a mathematical model (clausal form logic with SLD resolution).
- “Pure” Prolog programs, which fully respect the logical model, can be written but are often
inefficient.
- In order to write efficient programs, a Prolog programmer must be able to read programs both
declaratively (to check the logic) and procedurally (to ensure efficiency).
- Prolog implementations introduce various optimizations (the cut, omitting the occurs check)
that improve the performance of programs but compromise the mathematical model.
- Prolog has garbage collection.
7.2 Alma-0
One of the themes of this course is the investigation of ways in which different paradigms of
programming can be combined. The PL Alma-0 takes as its starting point a simple imperative
language, Modula–2 (Wirth 1982), and adds a number of features derived from logic programming
to it. The result is a simple but highly expressive language. Some features of Modula–2 are omitted
(the CARDINAL type; sets; variant records; open array parameters; procedure types; pointer types;
the CASE, WITH, LOOP, and EXIT statements; nested procedures; modules). They are replaced by
nine new features, described below. This section is based on Apt et al. (1998).
BES. Boolean expressions may be used as statements. The expression is evaluated. If its value
is TRUE, the computation succeeds and the program continues as usual. If the value of the
expression is FALSE, the computation fails. However, this does not mean that the program
stops: instead, control returns to the nearest preceding choice point and resumes with a
new choice.
Example 18: Ordering 1. If the array a is ordered, the following statement succeeds.
Otherwise, it fails; the loop exits, and i has the smallest value for which the condition is
FALSE.
FOR i := 1 TO n - 1 DO a[i] <= a[i+1] END
□
SBE. A statement sequence may be used as a boolean expression. If the computation of a sequence
of statements succeeds, it is considered equivalent to a boolean expression with value TRUE.
If the computation fails, it is considered equivalent to a boolean expression with value FALSE.
Example 19: Ordering 2. The following sequence decides whether a precedes b in the
lexicographic ordering. The FOR statement finds the first position at which the arrays differ.
If they are equal, the entire sequence fails. If there is a difference at position i, the sequence
succeeds iff a[i] < b[i].
NOT FOR i := 1 TO n DO a[i] = b[i] END;
a[i] < b[i]
□
EITHER–ORELSE. The construct EITHER-ORELSE is added to the language to permit
backtracking. An EITHER statement has the form
EITHER <seq> ORELSE <seq> . . . .ORELSE <seq> END
Computation of an EITHER statement starts by evaluating the first sequence. If it succeeds,
computation continues at the statement following END. If this computation subsequently
fails, control returns to the second sequence in the EITHER statement, with the program in
the same state as it was when the EITHER statement was entered. Each sequence is tried in
turn in this way, until one succeeds or the last one fails.
Listing 54: Creating choice points
EITHER y := x
ORELSE x > 0; y := -x
END;
x := x + b;
y < 0
Listing 55: Pattern matching
SOME i := 0 TO n - m DO
FOR j := 0 TO m - 1 DO
s[i+j] = p[j]
END
END
Example 20: Choice points. Assume that the sequence shown in Listing 54 is entered
with x = a > 0. The sequence of events is as follows:
- The assignment y := x assigns a to y. This sequence succeeds and transfers control to
x := x + b.
- The assignment x := x + b assigns a + b to x.
- The test y < 0 fails because y = a > 0.
- The original value of x is restored. The test x > 0 succeeds because x = a > 0, and y
receives the value −a.
- The final sequence is performed and succeeds, leaving x = a + b and y = −a.
□
SOME. The SOME statement permits the construction of statements that work like EITHER
statements but have a variable number of ORELSE clauses. The statement
SOME i := 1 TO n DO <seq> END
is equivalent to
EITHER i := 1; <seq>
ORELSE SOME i := 2 TO n DO <seq> END
END
Example 21: Pattern matching. Suppose that we have two arrays of characters: a
pattern p[0..m-1] and a string s[0..n-1]. The sequence shown in Listing 55 sets i to the
first occurrence of p in s, or fails if there is no match. □
COMMIT. The COMMIT statement provides a way of restricting backtracking. (It is similar to
the cut in Prolog.) The statement
COMMIT <seq> END
evaluates the statement sequence <seq>; if <seq> succeeds (possibly by backtracking), all of
the choice points created during its evaluation are removed.
Example 22: Ordering 3. Listing 56 shows another way of checking lexicographic
ordering. With the COMMIT, this sequence yields the value of a[i] < b[i] for the smallest i
Listing 56: Checking the order
COMMIT
SOME i := 1 TO n DO
a[i] <> b[i]
END
END;
a[i] < b[i]
such that a[i] <> b[i]. Without the COMMIT, it succeeds if a[i] < b[i] is TRUE for some
value of i such that a[i] <> b[i]. □
FORALL. The FORALL statement provides for the exploration of all choice points generated by a
statement sequence. The statement
FORALL <seq1> DO <seq2> END
evaluates <seq1> and then <seq2>. If <seq1> generates choice points, control backtracks
through these, even if the original sequence succeeded. Whenever <seq2> succeeds, its choice
points are removed (that is, there is an implicit COMMIT around <seq2>).
Example 23: Matching. Suppose we enclose the pattern matching code above in a function
Match. The following sequence finds the number of times that p occurs in s.
count := 0;
FORALL k := Match(p, s) DO count := count + 1 END;
□
EQ. Variables in Alma-0 start life as uninitialized values. After a value has been assigned to
a variable, it ceases to be uninitialized and becomes initialized. Expressions containing
uninitialized values are allowed, but they do not have values. For example, the comparison
s = t is evaluated as follows:
- if both s and t are initialized, their values are compared as usual;
- if one side is an uninitialized variable and the other is an expression with a value, this
value is assigned to the variable;
- if both sides are uninitialized variables, an error is reported.
Example 24: Remarkable sequences. A sequence with 27 integer elements is remarkable
if it contains three 1’s, three 2’s, . . . ., three 9’s arranged in such a way that for
i = 1, 2, . . . , 9 there are exactly i numbers between successive occurrences of i. Here is a
remarkable sequence:
(1, 9, 1, 2, 1, 8, 2, 4, 6, 2, 7, 9, 4, 5, 8, 6, 3, 4, 7, 5, 3, 9, 6, 8, 3, 5, 7).
Problem 1. Write a program that tests whether a given array is a remarkable sequence.
Problem 2. Find a remarkable sequence.
Due to the properties of “=”, the program shown in Listing 57 is a solution for both of these
problems! If the given sequence is initialized, the program tests whether it is remarkable; if
it is not initialized, the program assigns a remarkable sequence to it. □
Listing 57: Remarkable sequences
TYPE Sequence = ARRAY [1..27] OF INTEGER;
PROCEDURE Remarkable (VAR a: Sequence);
VAR i, j: INTEGER;
BEGIN
FOR i := 1 TO 9 DO
SOME j := 1 TO 25 - 2 * i DO
a[j] = i;
a[j+i+1] = i;
a[j+2*i+2] = i
END
END
END;
Listing 58: Compiling a MIX parameter
VAR v: integer;
BEGIN
v := E;
P(v)
END
MIX. Modula–2 provides call by value (the default) and call by reference (indicated by writing
VAR before the formal parameter). Alma-0 adds a third parameter passing mechanism: call
by mixed form. The parameter is passed either by value or by reference, depending on the
form of the actual parameter.
We can think of the implementation of mixed form in the following way. Suppose that
procedure P has been declared as
PROCEDURE P (MIX n: INTEGER); BEGIN . . . .END;
The call P(v), in which v is a variable, is compiled in the usual way, interpreting MIX as VAR.
The call P(E), in which E is an expression, is compiled as if the programmer had written the
code shown in Listing 58.
Example 25: Linear Search. The function Find in Listing 59 uses a mixed form
parameter.
The call Find(7,a) tests if 7 occurs in a.
The call Find(x, a), where x is an integer variable, assigns each component of a to x
by backtracking.
Listing 59: Linear search
TYPE Vector = ARRAY [1..N] OF INTEGER;
PROCEDURE Find (MIX e: INTEGER; a: Vector);
VAR i: INTEGER;
BEGIN
SOME i := 1 TO N DO e = a[i] END
END;
Listing 60: Generalized Addition
PROCEDURE Plus (MIX x, y, z: INTEGER);
BEGIN
IF KNOWN(x); KNOWN(y) THEN z = x + y
ELSIF KNOWN(y); KNOWN(z) THEN x = z - y
ELSIF KNOWN(x); KNOWN(z) THEN y = z - x
END
END;
The sequence
Find(x, a); Find(x, b)
tests whether the arrays a and b have an element in common; if so, x receives the value
of the common element.
The statement
FORALL Find(x, a) DO Find(x, b) END
tests whether all elements of a are also elements of b.
The statement
FORALL Find(x, a); Find(x, b) DO WRITELN(x) END
prints all of the elements that a and b have in common.
□
KNOWN. The test KNOWN(x) succeeds iff the variable x is initialized.
Example 26: Generalized Addition. Listing 60 shows the declaration of a function
Plus. The call Plus(x,y,z) will succeed if x+y = z provided that at least two of the actual
parameters have values. For example, Plus(2,3,5) succeeds and Plus(5,n,12) assigns 7 to
n. □
The language Alma-0 has a declarative semantics in which the constructs described above
correspond to logical formulas. It also has a procedural semantics that describes how the
constructs can be efficiently executed, using backtracking. Alma-0 demonstrates that, if appropriate
care is taken, it is possible to design PLs that smoothly combine paradigms without losing the
advantages of either paradigm.
The interesting feature of Alma-0 is that it demonstrates that it is possible to start with a simple,
well-defined language (Modula-2), remove features that are not required for the current purpose,
and add a small number of simple, new features that provide expressive power when used together.
Exercise 33. The new features of Alma-0 are orthogonal, both to the existing features of
Modula-2 and to each other. How does this use of orthogonality compare with the orthogonal design of
Algol 68? □
7.3 Other Backtracking Languages
Prolog and Alma-0 are by no means the only PLs that incorporate backtracking. Others include
Icon (Griswold and Griswold 1983), SETL (Schwartz, Dewar, Dubinsky, and Schonberg 1986), and
2LP (McAloon and Tretkoff 1995).
Constraint PLs are attracting increasing attention. A program is a set of constraints, expressed
as equalities and inequalities, that must be satisfied. Prolog is strong in logical deduction but weak
in numerical calculation; constraint PLs attempt to correct this imbalance.
8 Structure
The earliest programs had little structure. Although a FORTRAN program has a structure — it
consists of a main program and subroutines — the main program and the subroutines themselves
are simply lists of statements.
8.1 Block Structure
Algol 60 was the first major PL to provide recursive, hierarchical structure. It is probably significant
that Algol 60 was also the first language to be defined, syntactically at least, with a context-free
grammar — specifically, a grammar written in Backus-Naur Form (BNF). It is straightforward
to define recursive structures using context-free grammars. The basis of Algol 60
syntax can be written in extended BNF as follows:
PROG → BLOCK
BLOCK → "begin" { DECL } { STMT } "end"
STMT → BLOCK | . . . .
DECL → HEADER BLOCK | . . . .
With blocks come nested scopes. Earlier languages had local scope (for example, FORTRAN
subroutines have local variables) but scopes were continuous (no “holes”) and not nested. It was
not possible for one declaration to override another declaration of the same name. In Algol 60,
however, an inner declaration overrides an outer declaration. The outer block in Listing 61 declares
an integer k, which is first set to 6 and then incremented. The inner block declares another integer
k, which is set to 1 and then disappears, unused, at the end of the block.
See Section 9.8.1 for further discussion of nesting.
8.2 Modules
By the early 70s, people realized that the “monolithic” structure of Algol-like programs was too
restricted for large and complex programs. Furthermore, the “separate compilation” model of
FORTRAN — and later C — was seen to be unsafe.
The new model that was introduced by languages such as Mesa and Modula divided a large program
into modules.
Listing 61: Nested Declarations
begin
integer k;
k := 6;
begin
integer k;
k := 1;
end;
k := k + 1;
end;
8 STRUCTURE 119
- A module is a meaningful unit (rather than a collection of possibly unrelated declarations)
that may introduce constants, types, variables, and functions (called collectively features).
- A module may import some or all of the features of other modules.
- In typical modular languages, modules can export types, functions, and variables.
Example 27: Stack Module. One module of a program might declare a stack module:
module Stack
{
export initialize, push, pop, . . . .
. . . .
}
Elsewhere in the program, we could use the stack module:
import Stack;
initialize();
push(67);
. . . .
□
This form of module introduces a single instance: there is only one stack, and it is shared by all
parts of the program that import it. In contrast, a stack template introduces a type from which
we can create as many instances as we need.
Example 28: Stack Template. The main difference between the stack module and the stack
template is that the template exports a type.
module Stack
{
export type STACK, initialize, push, pop, . . . .
. . . .
}
In other parts of the program, we can use the type STACK to create stacks.
import Stack;
STACK myStack;
initialize(myStack);
push(myStack, 67);
. . . .
STACK yourStack;
. . . .
□
The fact that we can declare many stacks is somewhat offset by the increased complexity of the
notation: the name of the stack instance must appear in every function call. Some modular
languages provide an alternative syntax in which the name of the instance appears before the
function name, as if the module were a record. The last few lines of the example above would be
written as:
import Stack;
STACK myStack;
myStack.initialize();
myStack.push(67);
. . . .
8.2.1 Encapsulation
Parnas (1972) wrote a paper that made a number of important points.
- A large program will be implemented as a set of modules.
- Each programmer, or programming team, will be the owner of some modules and a user of
other modules.
- It is desirable that a programmer should be able to change the implementation of a module
without affecting (or even informing) other programmers.
- This implies that a module should have an interface that is relatively stable and an
implementation that changes as often as necessary. Only the owner of a module has access to its
implementation; users do not have such access.
Parnas introduced these ideas as the “principle of information hiding”. Today, it is more usual to
refer to the goals as encapsulation.
In order to use the module correctly without access to its implementation, users require a
specification written in an appropriate language. The specification answers the question “What does
this module do?” and the implementation answers the question “How does this module work?”
Most people today accept Parnas’s principle, even if they do not apply it well. It is therefore
important to appreciate how radical it was at the time of its introduction.
[Parnas] has proposed a still more radical solution. His thesis is that the programmer
is most effective if shielded from, rather than exposed to, the details of construction of
system parts other than his own. This presupposes that all interfaces are completely and
precisely defined. While that is definitely sound design, dependence upon its perfect
accomplishment is a recipe for disaster. A good information system both exposes
interface errors and stimulates their correction. (Brooks 1975, page 78)
Brooks wrote this in the first edition of The Mythical Man Month in 1975. Twenty years later, he
had changed his mind.
Parnas was right, and I was wrong. I am now convinced that information hiding, today
often embodied in object-oriented programming, is the only way of raising the level of
software design. (Brooks 1995, page 272)
Parnas’s paper was published in 1972, only shortly after Wirth’s (1971) paper on program
development by stepwise refinement. Yet Parnas’s approach is significantly different from Wirth’s.
Whereas Wirth assumes that a program can be constructed by filling in details, perhaps with
occasional backtracking to correct an error, Parnas sees programming as an iterative process in which
revisions and alterations are part of the normal course of events. This is a more “modern” view
and certainly a view that reflects the actual construction of large and complex software systems.
8.3 Control Structures
Early PLs, such as FORTRAN, depended on three control structures:
- sequence;
- transfer (goto);
- call/return.
These structures are essentially those provided by assembly language. The first contribution of
PLs was to provide names for variables and infix operators for building expressions.
The “modern” control structures first appeared in Algol 60. The most significant contribution
of Algol 60 was the block: a program unit that could contain data, functions, and statements.
Algol 60 also contributed the familiar if-then-else statement and a rather complex loop statement.
However, Algol 60 also provided goto statements and most programmers working in the sixties
assumed that goto was essential.
The storm of controversy raised by Dijkstra’s (1968) letter condemning the goto statement indicates
the extent to which programmers were wedded to the goto statement. Even Knuth (1974) entered
the fray with arguments supporting the goto statement in certain circumstances, although his
paper is well-balanced overall. The basic ideas that Dijkstra proposed were as follows.
- There should be one flow into, and one flow out of, each control structure.
- All programming needs can be met by three structures satisfying the above property: the
sequence; the loop with preceding test (while-do); and the conditional (if-then-else).
Dijkstra’s proposals had previously been justified in a formal sense by Böhm and Jacopini (1966).
This paper establishes that any program written with goto statements can be rewritten as an
equivalent program that uses sequence, conditional, and loop structures only; it may be necessary
to introduce Boolean variables. Although the result is interesting, Dijkstra’s main point was not
to transform old, goto-ridden programs but to write new programs using simple control structures
only.
Despite the controversy, Dijkstra’s arguments had the desired effect. In languages designed after
1968, the role of goto was down-played. Both C and Pascal have goto statements but they are
rarely used in well-written programs. C, with break, continue, and return, has even less need
for goto than Pascal. Ada has a goto statement because it was part of the US DoD requirements
but, again, it is rarely needed. C++ inherited goto statements from C. Java does not have a goto
statement, although some uses of continue look rather like jumps.
The use of control structures rather than goto statements has several advantages.
- It is easier to read and maintain programs. Maintenance errors are less common.
- Precise reasoning (for example, the use of formal semantics) is simplified when goto statements
are absent.
- Certain optimizations become feasible because the compiler can obtain more precise
information in the absence of goto statements.
8.3.1 Loop Structures
The control structure that has elicited the most discussion is the loop. The basic loop has a single
test that precedes the body:
Listing 62: Reading in Pascal
var n, sum : integer;
sum := 0;
read(n);
while n >= 0 do
begin
sum := sum + n;
read(n);
end
Listing 63: Reading in C
int sum = 0;
while (1)
{
int n;
scanf("%d", &n);
if (n < 0)
break;
sum += n;
}
while E do S
It is occasionally useful to perform the test after the body: Pascal provides repeat/until and C
provides do/while.
This is not quite enough, however, because there are situations in which it is useful to exit from
a loop from within the body. Consider, for example, the problem of reading and adding numbers,
terminating when a negative number is read without including the negative number in the sum.
Listing 62 shows the code in Pascal. In C and C++, we can use break to avoid the repeated code,
as shown in Listing 63.
Some recent languages, recognizing this problem, provide only one form of loop statement consisting
of an unconditional repeat and a conditional exit that can be used anywhere in the body of the
loop.
8.3.2 Procedures and Functions
Some languages, such as Ada and Pascal, distinguish the terms procedure and function. Both
names refer to “subroutines” — program components that perform a specific task when invoked
from another component of the program. A procedure is a subroutine that has some effect on the
program data but does not return a result. A function is a subroutine that returns a result; a
function may have some effect on program data, called a “side effect”, but this is often considered
undesirable.
Languages that make a distinction between procedures and functions usually also make a distinction
between “actions” and “data”. For example, at the beginning of the first Pascal text (Jensen and
Wirth 1976), we find the following statement:
An algorithm or computer program consists of two essential parts, a description of
actions which are to be performed, and a description of the data, which are
manipulated by the actions. Actions are described by so-called statements, and data are
described by so-called declarations and definitions.
In other languages, such as Algol and C, the distinction between “actions” and “data” is less
emphasized. In these languages, every “statement” has a value as well as an effect. Consistently,
the distinction between “procedure” and “function” also receives less emphasis.
C, for example, does not have keywords corresponding to procedure and function. There is a
single syntax for declaring and defining all functions and a convention that, if a function does not
return a result, it is given the return type void.
We can view the progression here as a separation of concerns. The “concerns” are: whether a
subroutine returns a value; and whether it has side-effects. Ada and Pascal combine the concerns
but, for practical reasons, allow “functions with side-effects”. In C, there are two concerns: “having
a value” and “having an effect” and they are essentially independent.
Example 29: Returning multiple results. Blue (Kölling 1999) has a uniform notation for
returning one or more results. For example, we can define a routine with this heading:
findElem (index: Integer) -> (found: Boolean, elem: Element) is . . . .
The parameter index is passed to the routine findElem, and the routine returns two values: found
to indicate whether something was found, and elem as the object found. (If the object is not found,
elem is set to the special value nil.)
Adding multiple results introduces the problem of how to use the results. Blue also provides
multi-assignments: we can write
success, thing := findElem("myKey")
Multi-assignments, provided they are implemented properly, provide other advantages. For
example, we can write the standard “swap sequence”
t := a;
a := b;
b := t;
in the more concise and readable form
a, b := b, a
□
Exercise 34. Describe a “proper” implementation of the multi-assignment statement. Can you
think of any other applications of multi-assignment? □
Exercise 35. Using Example 29 and other examples, discuss the introduction of new features into
PLs. □
8.3.3 Exceptions
The basic control structures are fine for most applications provided that nothing goes wrong.
Although in principle we can do everything with conditions and loops, there are situations in
which these constructs are inadequate.
Example 30: Matrix Inversion. Suppose that we are writing a function that inverts a matrix.
In almost all cases, the given matrix is non-singular and the inversion proceeds normally.
Occasionally, however, a singular matrix is passed to the function and, at some point in the calculation,
a division by zero will occur. What choices do we have?
- We could test the matrix for singularity before starting to invert it. Unfortunately, the test
involves computing the determinant of the matrix, which is almost as much work as inverting
it. This is wasteful, because we know that most of the matrices that we will receive are
non-singular.
- We could test every divisor before performing a division. This is wasteful if the matrix is
unlikely to be singular. Moreover, the divisions probably occur inside nested loops: each level
of loop will require Boolean variables and exit conditions that are triggered by a zero divisor,
adding overhead.
- We could rely on the hardware to signal an exception when division by zero occurs. The exception
can be caught by a handler that can be installed at any convenient point in the calling
environment.
□
PL/I (Section 3.5) was the first major language to provide exception handling. C provided
primitives (setjmp/longjmp) that made it possible to implement exception handling. For further
descriptions of exception handling in particular languages, see Section 3.
9 Names and Binding
Names are a central feature of all programming languages. In the earliest programs, numbers were
used for all purposes, including machine addresses. Replacing numbers by symbolic names was one
of the first major improvements to program notation (Mutch and Gill 1954).
9.1 Free and Bound Names
The use of names in programming is very similar to the use of names in mathematics. We say that
“x occurs free” in an expression such as x² + 2x + 5. We do not know anything about the value of
this expression because we do not know the value of x. On the other hand, we say that “n occurs
bound” in the expression
∑_{n=0}^{5} (2n + 1). (12)
More precisely, the binding occurrence of n is in n = 0. The n in the expression (2n + 1) is a
reference to, or use of, this binding.
There are two things to note about (12). First, it has a definite value, 36, because n is bound to a
value (actually, the set of values { 0, 1, . . . , 5 }). Second, the particular name that we use does not
change the value of the expression. If we systematically change n to k, we obtain the expression
∑_{k=0}^{5} (2k + 1), which has the same value as (12).
Similarly, the expression
∫_0^∞ e^{−x} dx,
which contains a binding occurrence of x (in dx) and a use of x (in e^{−x}), has a value that does not
depend on the particular name x.
The expression
∫ e^{−x} dx
is interesting because it binds x but does not have a numerical value; the convention is that it
denotes a function, −e^{−x}. The conventional notation for functions, however, is ambiguous. It is
clear that −e^{−x} is a function of x because, by convention, e denotes a constant. But what about
−e^{−ax}: is it a function of a or a function of x? There are several ways of resolving this ambiguity.
We can write
x → −e^{−ax},
read “x maps to −e^{−ax}”. Or we can use the notation of the λ-calculus:
λx . −e^{−ax}.
In predicate calculus, predicates may contain free names, as in the following expression, whose
truth value depends on the value of n:
n mod 2 = 0 ∧ n mod 3 = 1. (13)
Names are bound by the quantifiers ∀ (for all) and ∃ (there exists). We say that the formula
∀ n . n mod 2 = 0 ⇒ n² mod 2 = 0
is closed because it contains no free variables. In this case, the formula is also true, because the
implication holds for all values of n. Strictly, we should specify the range of values that n is allowed to assume.
9 NAMES AND BINDING 126
We could do this implicitly: for example, it is very likely that (13) refers to integers. Alternatively,
we could define the range of values explicitly, as in
∀ n ∈ Z . n mod 2 = 0 ⇒ n² mod 2 = 0.
Precisely analogous situations occur in programming. In the function
int f(int n)
{
return k + n;
}
k occurs free and n occurs bound. Note that we cannot tell the value of n from the definition of
the function but we know that n will be given a value when the function is called.
In most PLs, a program with free variables will not compile. A C compiler, for example, will
accept the function f defined above only if it is compiled in a scope that contains a declaration of
k. Some early PLs, such as FORTRAN and PL/I, provided implicit declarations for variables that
the programmer did not declare; this is now understood to be error-prone and is not a feature of
recent PLs.
9.2 Attributes
In mathematics, a variable normally has only one attribute: its value. In (12), n is bound to the
values 0, 1, . . . , 5. Sometimes, we specify the domain from which these values are chosen, as in
n ∈ Z.
In programming, a name may have several attributes, and they may be bound at different times.
For example, in the sequence
int n;
n = 6;
the first line binds the type int to n and the second line binds the value 6 to n. The first binding
occurs when the program is compiled. (The compiler records the fact that the type of n is int;
neither this fact nor the name n appears explicitly in the compiled code.) The second binding
occurs when the program is executed.
This example shows that there are two aspects of binding that we must consider in PLs: the
attribute that is bound, and the time at which the binding occurs. The attributes that may be
bound to a name include: type, address, and value. The times at which the bindings occur include:
compile time, link time, load time, block entry time, and statement execution time.
Definition. A binding is static if it occurs before the program is executed: during
compilation or linking. A binding is dynamic if it occurs while the program is running: during
loading, block entry, or statement execution.
Example 31: Binding. Consider the program shown in Listing 64. The address of k is bound
when the program starts; its value is bound by the scanf call. The address and value of n are
bound only when the function f is called; if f is not called (because k ≤ 0), then n is never bound.
2
Listing 64: Binding
void f()
{
    int n = 7;
    printf("%d", n);
}
void main ()
{
    int k;
    scanf("%d", &k);
    if (k > 0)
        f();
}
9.3 Early and Late Binding
An anonymous contributor to the Encyclopedia of Computer Science wrote:
Broadly speaking, the history of software development is the history of ever-later bind-
ing time.
As a rough rule of thumb:
Early binding ⇒ efficiency;
Late binding ⇒ flexibility.
Example 32: Variable Addressing. In FORTRAN, addresses are bound to variable names at
compile time. The result is that, in the compiled code, variables are addressed directly, without
any indexing or other address calculations. (In reality, the process is somewhat more complicated.
The compiler assigns an address relative to a compilation unit. When the program is linked, the
address of the unit within the program is added to this address. When the program is loaded, the
address of the program is added to the address. The important point is that, by the time execution
begins, the “absolute” address of the variable is known.)
FORTRAN is efficient, because absolute addressing is used. It is inflexible, because all addresses
are assigned at load time. This leads to wasted space, because all local variables occupy space
whether or not they are being used, and also prevents the use of direct or indirect recursion.
In languages of the Algol family (Algol 60, Algol 68, C, Pascal, etc.), local variables are allocated on
the run-time stack. When a function is called, a stack frame (called, in this context, an activation
record or AR) is created for it and space for its parameters and local variables is allocated in the
AR. The variable must be addressed by adding an offset (computed by the compiler) to the address
of the AR, which is not known until the function is called.
Algol-style addressing is slightly less efficient than FORTRAN addressing because addresses must be allocated at block-entry time and indexed addressing is required. It is more flexible than FORTRAN because inactive functions do not use up space and recursion works.
Languages that provide dynamic allocation, such as Pascal and C and most of their successors, have
yet another way of binding an address to a variable. In these languages, a statement such as new(p)
allocates space on the “heap” and stores the address of that space in the pointer p. Accessing the
variable requires indirect addressing, which is slightly slower than indexed addressing, but greater
flexibility is obtained because dynamic variables do not have to obey stack discipline (last in, first
out). 2
Example 33: Function Addressing. When the compiler encounters a function definition, it
binds an address to the function name. (As above, several steps must be completed before the
absolute address of the function is determined, but this is not relevant to the present discussion.)
When the compiler encounters a function invocation, it generates a call to the address that it has
assigned to the function.
This statement is true of all PLs developed before OOP was introduced. OOPLs provide “virtual”
functions. A virtual function name may refer to several functions. The actual function referred to
in a particular invocation is not in general known until the call is executed.
Thus in non-OO PLs, functions are statically bound, with the result that calls are efficiently
executed but no decisions can be made at run-time. In OOPLs, functions are dynamically bound.
Calls take longer to execute, but there is greater flexibility because the decision as to which function
to execute is made at run-time.
The overhead of a dynamically-bound function call depends on the language. In Smalltalk, the
overhead can be significant, because the CM requires a search. In practice, the search can be
avoided in up to 95% of calls by using a cache. In C++, the compiler generates “virtual function
tables” and dynamic binding requires only indirect addressing. Note, however, that this provides
yet another example of the principle: Smalltalk binds later than C++, runs more slowly, but
provides greater flexibility. 2
9.4 What Can Be Named?
The question “what can be named?” and its converse “what can be denoted?” are important
questions to ask about a PL.
Definition. An entity can be named if the PL provides a definitional mechanism that associates
a name with an instance of the entity. An entity can be denoted if the PL provides ways of
expressing instances of the entity as expressions.
Example 34: Functions. In C, we can name functions but we cannot denote them. The
definition
int f(int x)
{
    B
}
introduces a function with name f, parameter x, and body B. Thus functions in C can be named.
There is no way that we can write a function without a name in C. Thus functions in C cannot be
denoted.
The corresponding definition in Scheme would be:
(define f (lambda (x)
    E))
This definition can be split into two parts: the name of the function being defined (f) and the
function itself ((lambda (x) E)). The notation is analogous to that of the λ-calculus:
f = λx . E
FPLs allow expressions analogous to λx . E to be used as functions. LISP, as mentioned in Sec-
tion 4.1, is not one of these languages. However, Scheme is, and so are SML, Haskell, and other
FPLs. 2
Example 35: Types. In early versions of C, types could be denoted but not named. For
example, we could write
int *p[10];
but we could not give a name to the type of p (“array of 10 pointers to int”). Later versions of C
introduced the typedef statement, making types nameable:
typedef int *API[10];
API p;
2
9.5 What is a Variable Name?
We have seen that most PLs choose between two interpretations of a variable name. In a PL
with value semantics, variable names denote memory locations. Assigning to a variable changes
the contents of the location named by the variable. In a PL with reference semantics, variable
names denote objects in memory. Assigning to a variable, if permitted by the language, changes
the denotation of the name to a different object.
Reference semantics is sometimes called “pointer semantics”. This is reasonable in the sense that
the implementation of reference semantics requires the storage of addresses — that is, pointers.
It is misleading in that providing pointers is not the same as providing reference semantics. The
distinction is particularly clear in Hoare’s work on PLs. Hoare (1974) has this to say about the
introduction of pointers into PLs.
Many language designers have preferred to extend [minor, localized faults in Algol 60
and other PLs] throughout the whole language by introducing the concept of reference,
pointer, or indirect address into the language as an assignable item of data. This
immediately gives rise in a high-level language to one of the most notorious confusions
of machine code, namely that between an address and its contents. Some languages
attempt to solve this by even more confusing automatic coercion rules. Worse still, an
indirect assignment through a pointer, just as in machine code, can update any store
location whatsoever, and the damage is no longer confined to the variable explicitly
named as the target of assignment. For example, in Algol 68, the assignment
x := y
always changes x, but the assignment
p := y + 1
may, if p is a reference variable, change any other variable (of appropriate type) in the
whole machine. One variable it can never change is p! . . . . References are like jumps,
leading wildly from one part of a data structure to another. Their introduction into
high-level languages has been a step backward from which we may never recover.
One year later, Hoare (1975) provided his solution to the problem of references in high-level PLs:
In this paper, we will consider a class of data structures for which the amount of
storage can actually vary during the lifetime of the data, and we will show that it can
be satisfactorily accommodated in a high-level language using solely high-level problem-
oriented concepts, and without the introduction of references.
The implementation that Hoare proposes in this paper is a reference semantics with types. Explicit
types make it possible to achieve
. . . . a significant improvement on the efficiency of compiled LISP, perhaps even a
factor of two in space-time cost for suitable applications. (Hoare 1975)
All FPLs and most OOPLs (the notable exception, of course, being C++) use reference semantics.
There are good reasons for this choice, but the reasons are not the same for each paradigm.
In a FPL, all (or at least most) values are immutable. If X and Y have the same value,
the program cannot tell whether X and Y are distinct objects that happen to be equal, or
pointers to the same object. It follows that value semantics, which requires copying, would be
wasteful because there is no point in making copies of immutable objects.
(LISP provides two tests for equality. (eq x y) is true if x and y are pointers to the same
object. (equal x y) is true if the objects x and y have the same extensional value. These
two functions are provided partly for efficiency and partly to cover up semantic deficiencies of
the implementation. Some other languages provide similar choices for comparison.)
One of the important aspects of OOP is object identity. If a program object X corresponds
to a unique entity in the world, such as a person, it should be unique in the program too. This
is most naturally achieved with a reference semantics.
The use of reference semantics in OOP is discussed at greater length in (Grogono and Chalin 1994).
9.6 Polymorphism
The word “polymorphism” is derived from Greek and means, literally, “many shapes”. In PLs,
“polymorphism” is used to describe a situation in which one name can refer to several different en-
tities. The most common application is to functions. There are several kinds of polymorphism; the
terms “ad hoc polymorphism” and “parametric polymorphism” are due to Christopher Strachey.
9.6.1 Ad Hoc Polymorphism
In the code
int m, n;
float x, y;
printf("%d %f", m + n, x + y);
the symbol “+” is used in two different ways. In the expression m + n, it stands for the function
that adds two integers. In the expression x + y, it stands for the function that adds two floats.
In general, ad hoc polymorphism refers to the use of a single function name to refer to two or more
distinct functions. Typically the compiler uses the types of the arguments of the function to decide
which function to call. Ad hoc polymorphism is also called “overloading”.
Almost all PLs provide ad hoc polymorphism for built-in operators such as “+”, “−”, “*”, etc.
Ada, C++, and other recent languages also allow programmers to overload functions. (Strictly,
we should say “overload function names” but the usage “overloaded functions” is common.) In
general, all that the programmer has to do is write several definitions, using the same function
name, but ensuring that either the number or the types of the arguments are different.
Example 36: Overloaded Functions. The following code declares three distinct functions in
C++.
int max (int x, int y);
int max (int x, int y, int z);
float max (float x, float y);
2
9.6.2 Parametric Polymorphism
Suppose that a language provides the type list as a parameterized type. That is, we can make
declarations such as these:
list(int)
list(float)
Suppose also that we have two functions: sum computes the sum of the components of a given
list, and len computes the number of components in a given list. There is an important difference
between these two functions. In order to compute the sum of a list, we must be able to choose an
appropriate “add” function, and this implies that we must know the type of the components. On
the other hand, there seems to be no need to know the type of the components if all we need to
do is count them.
A function such as len, which counts the components of a list but does not care about their type,
has parametric polymorphism.
Example 37: Lists in SML. In SML, functions can be defined by cases, “[]” denotes the empty
list, and “::” denotes list construction. We write definitions without type declarations and SML
infers the types. Here are functions that sum the components of a list and count the components
of a list, respectively:
fun sum [] = 0
  | sum (x::xs) = x + sum(xs)
fun len [] = 0
  | len (x::xs) = 1 + len(xs)
SML infers the type of sum to be list(int) → int: since the sum of the empty list is (integer)
0, the type of the operator “+” must be int × int → int, and all of the components must be
integers.
For the function len, SML can infer nothing about the type of the components from the function
definition, and it assigns the type list(α) → int to the function. α is a type variable, acting as
a type parameter. When len is applied to an actual list, α will implicitly assume the type of the
list components. 2
9.6.3 Object Polymorphism
In OOPLs, there may be many different objects that provide a function called, say, f. However,
the effect of invoking f may depend on the object. The details of this kind of polymorphism,
which are discussed in Section 6, depend on the particular OOPL. This kind of polymorphism is a
fundamental, and very important, aspect of OOP.
9.7 Assignment
Consider the assignment x := E. Whatever the PL, the semantics of this statement will be
something like:
evaluate the expression E;
store the value obtained in x.
The assignment is unproblematic if x and E have the same type. But what happens if they have
different types? There are several possibilities.
- The statement is considered to be an error. This will occur in a PL that provides static type checking but does not provide coercion. For example, Pascal provides only a few coercions (subrange to integer, integer to real, etc.) and rejects other type differences.
- The compiler will generate code to convert the value of expression E to the type of x. This approach was taken to extremes in COBOL and PL/I. It exists in C and C++, but C++ is stricter than C.
- The value of E will be assigned to x anyway. This is the approach taken by PLs that use dynamic type checking. Types are associated with objects rather than with names.
9.8 Scope and Extent
The two most important properties of a variable name are scope and extent.
9.8.1 Scope
The scope of a name is the region of the program text in which a name may be used. In C++, for
example, the scope of a local variable starts at the declaration of the variable and ends at the end
of the block in which the declaration appears, as shown in Listing 65. Scope is a static property
of a name that is determined by the semantics of the PL and the text of the program. Examples
of other scope rules follow.
FORTRAN. Local variables are declared at the start of subroutines and at the start of the main
program. Their scope starts at the declarations and ends at the end of the program unit (that
is, subroutine or main program) within which the declaration appears. Subroutine names
have global scope. It is also possible to make variables accessible in selected subroutines
by means of COMMON statements, as shown in Listing 66. This method of “scope control” is
insecure because the compiler does not attempt to check consistency between program units.
Algol 60. Local variables are declared at the beginning of a block. Scopes are nested: a decla-
ration of a name in an inner block hides a declaration of the same name in an outer block,
as shown in Listing 67.
Listing 65: Local scope in C++
{
    // x not accessible.
    t x;
    // x accessible
    . . . .
    {
        // x still accessible in inner block
        . . . .
    }
    // x still accessible
}
// x not accessible
Listing 66: FORTRAN COMMON blocks
subroutine first
integer i
real a(100)
integer b(250)
common /myvars/ a, b
. . . .
return
end
subroutine second
real x(100), y(100)
integer k(250)
common /myvars/ x, k, y
. . . .
return
end
Listing 67: Nested scopes in Algol 60
begin
real x; integer k;
begin
real x;
x := 3.0
end;
x := 6.0;
end
The fine control of scope provided by Algol 60 was retained in Algol 68 but simplified in
other PLs of the same generation.
Pascal. Control structures and nested functions are nested but declarations are allowed only at
the start of the program or the start of a function declaration. In early Pascal, the order of
declarations was fixed: constants, types, variables, and functions were declared in that order.
Later dialects of Pascal sensibly relaxed this ordering.
Pascal also has a rather curious scoping feature: within the statement S of the statement
with r do S, a field f of the record r may be written as f rather than as r.f.
C. Function declarations cannot be nested and all local variables must appear before executable
statements in a block. The complete rules of C scoping, however, are complicated by the fact
that C has five overloading classes (Harbison and Steele 1987):
- preprocessor macro names;
- statement labels;
- structure, union, and enumeration tags;
- component names;
- other names.
Consequently, the following declarations do not conflict:
{
    char str[200];
    struct str { int m, n; } x;
    . . . .
}
Inheritance. In OOPLs, inheritance is, in part, a scoping mechanism. When a child class C
inherits a parent class P, some or all of the attributes of P are made visible in C.
C++. There are a number of additional scope control mechanisms in C++. Class attributes may
be declared private (visible only within the class), protected (visible within the class and
derived classes), and public (visible outside the class). Similarly, a class may be derived as
private, protected, or public. A function or a class may be declared as a friend of a class,
giving it access to the private attributes of the class. Finally, C++ follows C in providing
directives such as static and extern to modify scope.
Global Scope. The name is visible throughout the program. Global scope is useful for pervasive
entities, such as library functions and fundamental constants (π = 3.1415926) but is best
avoided for application variables.
FORTRAN does not have global variables, although programmers simulate them by overusing COMMON declarations. Subroutine names in FORTRAN are global.
Names declared at the beginning of the outermost block of Algol 60 and Pascal programs
have global scope. (There may be “holes” in these global scopes if the program contains local
declarations of the same names.)
The largest default scope in C is “file scope”. Global scope can be obtained using the storage
class extern.
Listing 68: Declaration before use
void main ()
{
    const int i = 6;
    {
        int j = i;
        const int i = 7;
        cout << j << endl;
    }
}
Listing 69: Qualified names
class Widget
{
public:
    void f();
};
Widget w;
f();    // Error: f is not in scope
w.f();  // OK
Local Scope. In block-structured languages, names declared in a block are local to the block.
There is a question as to whether the scope starts at the point of definition or is the entire
block. (In other words, does the PL require “declaration before use”?) C++ uses the “declare
before use” rule and the program in Listing 68 prints 6.
In Algol 60 and C++, local scopes can be as small as the programmer needs. Any statement
context can be instantiated by a block that contains local variable declarations. In other
languages, such as Pascal, local declarations can be used only in certain places. Although a
few people do not like the fine control offered by Algol 60 and C++, it seems best to provide
the programmer with as much freedom as possible and to keep scopes as small as possible.
Qualified Scope. The components of a structure, such as a Pascal record or a C struct, have
names. These names are usually hidden, but can be made visible by the name of the object.
Listing 69 provides an example in C++. The construct w. opens a scope in which the public
attributes of the object w are visible.
The dot notation also appears in many OOPLs, where it is used to select attributes and
functions of an object. For an attribute, the syntax is the same: o.a denotes attribute a
of object o. If the attribute is a function, a parameter list follows: o.f(x) denotes the
invocation of function f with argument x in object o.
Import and Export. In modular languages, such as Modula-2, a name or group of names can
be brought into a scope with an import clause. The module that provides the definitions
must have a corresponding export clause.
Typically, if module A exports name X and module B imports name X from A, then X may be
used in module B as if it was declared there.
Namespaces. The mechanisms above are inadequate for very large programs developed by large
teams. Problems arise when a project uses several libraries that may have conflicting names.
Namespaces, provided by PLs such as C++ and Common LISP, provide a higher level of
name management than the regular language features.
In summary:
- Nested scopes are convenient in small regions, such as functions. The advantage of nested scopes for large structures, such as classes or modules, is doubtful.
- Nesting is a form of scope management. For large structures, explicit control by name qualification may be better than nesting.
- Qualified names work well at the level of classes and modules, when the source of names is obvious.
- The import mechanism has the problem that the source of a name is not obvious where it appears: users must scan possibly remote import declarations.
- Qualified names are inadequate for very large programs.
- Library designers can reduce the potential for name conflicts by using distinctive prefixes. For example, all names supplied by the commercial graphics library FastGraph have fg_ as a prefix.
- Namespaces provide a better solution than prefixes.
Scope management is important because the programmer has no “work arounds” if the scoping
mechanisms provided by the PL are inadequate. This is because PLs typically hide the scoping
mechanisms.
If a program is compiled, names will normally not be accessible at run-time unless they are included
for a specific purpose such as debugging. In fact, one way to view compiling is as a process that
converts names to numbers.
Interpreters generally have a repository that contains the value of each name currently accessible
to the program, but programmers may have limited access to this repository.
9.8.2 Are nested scopes useful?
As scope control mechanisms became more complicated, some people began to question the need
for nested scopes (Clarke, Wilden, and Wolf 1980; Hanson 1981).
- On a small scale, for example in statements and functions, nesting works fairly well. It is, however, the cause of occasional errors that could be eliminated by requiring all names in a statement or function to be unique.
- On a large scale, for example in classes and modules, nesting is less effective because there is more text and a larger number of names. It is probably better to provide special mechanisms to provide the scope control that is needed rather than to rely on simple nesting.
C++ allows classes to be nested (this seems inconsistent, given that functions cannot be nested).
The effect is that the nested class can be accessed only by the enclosing class. An alternative
mechanism would be a declaration that states explicitly that class A can be accessed only by class
B. This mechanism would have the advantage that it would also be able to describe constraints
such as “class A can be accessed only by classes B, C, and D”.
9.8.3 Extent
The extent, also called lifetime, of a name is the period of time during program execution during
which the object corresponding to the name exists. Understanding the relation between scope and
extent is an important part of understanding a PL.
Global names usually exist for the entire lifetime of the execution: they have global extent.
In FORTRAN, local variables also exist for the lifetime of the execution. Programmers assume
that, on entry to a subroutine, its local variables will not have changed since the last invocation of
the subroutine.
In Algol 60 and subsequent stack-based languages, local variables are instantiated whenever control
enters a block and they are destroyed (at least in principle) when control leaves the block. It is an
error to create an object on the stack and to pass a reference to that object to an enclosing scope.
Some PLs attempt to detect this error (e.g., Algol 68), some try to prevent it from occurring (e.g.,
Pascal), and others leave it as a problem for the programmer (e.g., C and C++).
In PLs that use a reference model, objects usually have unlimited extent, whether or not their
original names are accessible. Examples include Simula, Smalltalk, Eiffel, CLU, and all FPLs. The
advantage of the reference model is that the problems of disappearing objects (dangling pointers)
and inaccessible objects (memory leaks) do not occur. The disadvantage is that GC and the
accompanying overhead are more or less indispensable.
The separation of extent from scope was a key step in the evolution of post-Algol PLs.
10 Abstraction
Abstraction is one of the most important mental tools in Computer Science. The biggest problems
that we face in Computer Science are concerned with complexity; abstraction helps us to manage
complexity. Abstraction is particularly important in the study of PLs because a PL is a two-fold
abstraction.
A PL must provide an abstract view of the underlying machine. The values that we use in pro-
gramming — integers, floats, booleans, arrays, and so on — are abstracted from the raw bits of
computer memory. The control structures that we use — assignments, conditionals, loops, and so
on — are abstracted from the basic machine instructions. The first task of a PL is to hide the
low-level details of the machine and present a manageable interface, preferably an interface that
does not depend on the particular machine that is being used.
A PL must also provide ways for the programmer to abstract information about the world. Most
programs are connected to the world in one way or another, and some programs are little more
than simulations of some aspect of the world. For example, an accounting program is, at some
level, a simulation of tasks that used to be performed by clerks and accountants. The second
task of a PL is to provide as much help as possible to the programmer who is creating a simplified
model of the complex world in a computer program.
In the evolution of PLs, the first need came before the second. Early PLs emphasize abstraction
from the machine: they provide assignments, loops, and arrays. Programs are seen as lists of
instructions for the computer to execute. Later PLs emphasize abstraction from the world: they
provide objects and processes. Programs are seen as descriptions of the world that happen to be
executable.
The idea of abstracting from the world is not new, however. The design of Algol 60 began in
the late fifties and, even at that time, the explicit goal of the designers was a “problem oriented
language”.
Example 38: Sets and Bits. Here is an early example that contrasts the problem-oriented and
machine-oriented approaches to PL design.
Pascal provides the problem-oriented abstraction set. We can declare types whose values are sets;
we can define literal constants and variables of these types; and we can write expressions using set
operations, as shown in Listing 70. In contrast, C provides bit-wise operations on integer data, as
shown in Listing 71.
These features of Pascal and C have almost identical implementations. In each case, the elements
of a set are encoded as positions within a machine word; the presence or absence of an element
Listing 70: Sets in Pascal
type SmallNum = set of 1..10;
var s, t: SmallNum;
var n: integer;
begin
s := [1, 3, 5, 7, 9];
t := s + [2, 4, 6, 8];
if (s <= t) . . . .
if (n in s) . . . .
end
Listing 71: Bit operations in C
int n = 0x01010101;
int mask = 0x10000;
if (n & mask) . . . .
is represented by a 1-bit or a 0-bit, respectively; and the operations for and-ing and or-ing bits
included in most machine languages are used to manipulate the words.
The appearance of the program, however, is quite different. The Pascal programmer can think
at the level of sets, whereas the C programmer is forced to think at the level of words and bits.
Neither approach is necessarily better than the other: Pascal was designed for teaching introductory
programming and C was designed for systems programmers. The trend, however, is away from
machine abstractions such as bit strings and towards problem domain abstractions such as sets. 2
10.1 Abstraction as a Technical Term
“Abstraction” is a word with quite general meaning. In Computer Science in general, and in PL
in particular, it is often given a special sense. An abstraction mechanism provides a way of
naming and parameterizing a program entity.
10.1.1 Procedures.
The earliest abstraction mechanism was the procedure, which exists in rudimentary form even in
assembly language. Procedural abstraction enables us to give a name to a statement, or group of
statements, and to use the name elsewhere in the program to trigger the execution of the statements.
Parameters enable us to customize the execution according to the particular requirements of the
caller.
10.1.2 Functions.
The next abstraction mechanism, which also appeared at an early stage was the function. Func-
tional abstraction enables us to give a name to an expression, and to use the name to trigger
evaluation of the expression.
In many PLs, procedures and functions are closely related. Typically, they have similar syntax and
similar rules of construction. The reason for this is that an abstraction mechanism that works only
for simple expressions is not very useful. In practice, we need to abstract from arbitrarily complex
programs whose main purpose is to yield a single value.
The difference between procedures and functions is most evident at the call site, where a procedure
call appears in a statement context and a function call appears in an expression context.
FORTRAN provides procedures (called “subroutines”) and functions. Interestingly, COBOL does
not provide procedure or function abstraction, although the PERFORM verb provides a rather re-
stricted and unsafe way of executing statements remotely.
LISP provides function abstraction. The effect of procedures is obtained by writing functions with
side-effects.
Algol 60, Algol 68, and C take the view that everything is an expression. (The idea seems to be
that, since any computation leaves something in the accumulator, the programmer might as well
be allowed to use it.) Thus statements have a value in these languages. It is not surprising that
10 ABSTRACTION 140
this viewpoint minimizes the difference between procedures and functions. In Algol 68 and C, a
procedure is simply a function that returns a value of type void. Since this value is unique, it
requires log 1 = 0 bits of storage and can be optimized away by the compiler.
10.1.3 Data Types.
Although Algol 68 provided a kind of data type abstraction, Pascal was more systematic and
more influential. Pascal provides primitive types, such as integer, real, and boolean, and type
expressions (C denotes an integer constant and T denotes a type):
C..C
(C, C, . . . ., C)
array [T] of T
record v: T; . . . . end
set of T
file of T
Type expressions can be named and the names can be used in variable declarations or as compo-
nents of other types. Pascal, however, does not provide a parameter mechanism for types.
10.1.4 Classes.
Simula introduced an abstraction mechanism for classes. Although this was an important and
profound step, with hindsight we can see that two simple ideas were needed: giving a name to an
Algol block, and freeing the data component of the block from the tyranny of stack discipline.
10.2 Computational Models
There are various ways of describing a PL. We can write an informal description in natural language
(e.g., the Pascal User Manual and Report (Jensen and Wirth 1976)), a formal description in natural
language (e.g., the Algol report (Naur et al. 1960)), or a document in a formal notation (e.g., the
Revised Report on Algol 68 (van Wijngaarden et al. 1975)). Alternatively, we can provide a formal
semantics for the language, using the axiomatic, denotational, or operational approaches.
There is, however, an abstraction that is useful for describing a PL in general terms, without
the details of every construct, and that is the computational model (also sometimes called
the semantic model ). The computational model (CM) is an abstraction of the operational
semantics of the PL and it describes the effects of various operations without describing the actual
implementation.
Programmers who use a PL acquire intuitive knowledge of its CM. When a program behaves in an
unexpected way, the implementation has failed to respect the CM.
Example 39: Passing parameters. Suppose that a PL specifies that arguments are passed by
value. This means that the programs in Listing 72 and Listing 73 are equivalent. The statement
T a = E means “the variable a has type T and initial value E” and the statement x ← a means
“the value of a is assigned (by copying) to x”.
A compiler could implement this program by copying the value of the argument, as if executing
the assignment x ← a. Alternatively, it could pass the address of the argument a to the function
but not allow the statement S to change the value of a. Both implementations respect the CM. 2
Listing 72: Passing a parameter
void f (T x);
{
S;
}
T a = E;
f(a);
Listing 73: Revealing the parameter passing mechanism
T a = E;
{
T x;
x ← a;
S;
}
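The by-value CM of the listings above can be observed in C, which passes scalar arguments by value. In this sketch T is int, E is 41, and S increments x:

```c
/* Listing 72 in C: the parameter x is a copy of the argument,
   so the statement S inside f cannot change the caller's variable. */
void f(int x) {
    x = x + 1;             /* S: changes only the local copy */
}

int call_by_value_demo(void) {
    int a = 41;            /* T a = E */
    f(a);
    return a;              /* still 41: any implementation that
                              respects the CM must preserve this */
}
```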
Example 40: Parameters in FORTRAN. If a FORTRAN subroutine changes the value
of one of its formal parameters (called “dummy variables” in FORTRAN parlance) the value of
the corresponding actual argument at the call site also changes. For example, if we define the
subroutine
subroutine p (count)
integer count
. . . .
count = count + 1
return
end
then, after executing the sequence
integer total
total = 0
call p(total)
the value of total is 1. The most common way of achieving this result is to pass the address of the
argument to the subroutine. This enables the subroutine both to access the value of the argument
and to alter it.
Unfortunately, most FORTRAN compilers use the same mechanism for passing constant arguments.
The effect of executing
call p(0)
is to change all instances of 0 (the constant “zero”) in the program to 1 (the constant “one”)! The
resulting bugs can be quite hard to detect for a programmer who is not familiar with FORTRAN’s
idiosyncrasies.
The programmer has a mental CM for FORTRAN which predicts that, although variable arguments
may be changed by a subroutine, constant arguments will not be. The problem arises because
typical FORTRAN compilers do not respect this CM. 2
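The mechanism behind this bug can be simulated in C, with a static variable standing in for the compiler's pooled constant 0 (the names here are hypothetical, for illustration only):

```c
/* Hypothetical simulation of a FORTRAN compiler's constant pool:
   every textual occurrence of the literal 0 reads this one cell. */
static int pool_zero = 0;

/* subroutine p(count): FORTRAN passes the address of the argument. */
static void p(int *count) {
    *count = *count + 1;
}

int after_call_p_zero(void) {
    pool_zero = 0;        /* reset the pool for a fresh run */
    p(&pool_zero);        /* "call p(0)": the address of the pooled 0 */
    return pool_zero;     /* now 1 -- every later "0" in the program reads 1 */
}
```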
A simple CM does not imply easy implementation. Often, the opposite is true. For example, C is
easy to compile, but has a rather complex CM.
Example 41: Array Indexing. In the CM of C, a[i] ≡ ∗(a + i). To understand this, we must
know that an array can be represented by a pointer to its first element and that we can access
elements of the array by adding the index (multiplied by an appropriate but implicit factor) to this
pointer.
In Pascal’s CM, an array is a mapping. The declaration
var A : array [1..N] of real;
introduces A as a mapping:
A : {1, . . . , N} → ℝ.
A compiler can implement arrays in any suitable way that respects this mapping. 2
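The C half of the comparison can be checked directly; because array indexing is defined as pointer arithmetic and + is commutative, even the odd-looking form i[a] is legal:

```c
/* In C's CM, a[i] is defined as *(a + i): the array name decays to a
   pointer to its first element, and indexing is pointer arithmetic. */
int index_vs_deref(void) {
    int a[4] = {10, 20, 30, 40};
    int i = 2;
    return a[i] == *(a + i)    /* the defining equivalence */
        && i[a] == 30;         /* i[a] means *(i + a): also legal */
}
```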
Example 42: Primitives in OOPLs. There are two ways in which an OOPL can treat prim-
itive entities such as booleans, characters, and integers.
- Primitive values can have a special status that distinguishes them from objects. A language
that uses this policy is likely to be efficient, because the compiler can process primitives without
the overhead of treating them as full-fledged objects. On the other hand, it may be confusing
for the user to deal with two different kinds of variable — primitive values and objects.
- Primitive values can be given the same status as objects. This simplifies the programmer's
task, because all variables behave like objects, but a naive implementation will probably be
inefficient.
A solution for this dilemma is to define a CM for the language in which all variables are objects.
An implementor is then free to implement primitive values efficiently provided that the behaviour
predicted by the CM is obtained at all times.
In Smalltalk, every variable is an object. Early implementations of Smalltalk were inefficient,
because all variables were actually implemented as objects. Java provides primitives with efficient
implementations but, to support various OO features, also has classes for the primitive types. 2
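One classic way to reconcile the "everything is an object" CM with efficiency is pointer tagging, a technique along these lines used by many later Smalltalk and LISP implementations. A sketch in C with an invented tagging convention (low bit 1 marks an immediate integer; the names are illustrative):

```c
#include <stdint.h>

/* A value is either an immediate small integer (low bit 1) or a pointer
   to a heap object (low bit 0, since objects are word-aligned).
   The CM says every variable is an object; the tag is invisible to it. */
typedef uintptr_t value;

static value    box_int(intptr_t n)  { return ((uintptr_t)n << 1) | 1; }
static int      is_int(value v)      { return (int)(v & 1); }
static intptr_t unbox_int(value v)   { return (intptr_t)v >> 1; }

/* "Sending + to an integer object" compiles to plain arithmetic. */
static value add(value a, value b) {
    return box_int(unbox_int(a) + unbox_int(b));
}

intptr_t tagged_demo(void) {
    value seven = box_int(7), five = box_int(5);
    return is_int(seven) ? unbox_int(add(seven, five)) : -1;   /* 12 */
}
```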
Example 43: λ-Calculus as a Computational Model. FPLs are based on the theory of
partial recursive functions. They attempt to use this theory as a CM. The advantage of this point
of view is that program text resembles mathematics and, to a certain extent, programs can be
manipulated as mathematical objects.
The disadvantage is that mathematics and computation are not the same. Some operations that
are intuitively simple, such as maintaining the balance of a bank account subject to deposits and
withdrawals, are not easily described in classical mathematics. The underlying difficulty is that
many computations depend on a concept of mutable state that does not exist in mathematics.
FPLs have an important property called referential transparency. An expression is referentially
transparent if its only attribute is its value. The practical consequence of referential transparency
is that we can freely substitute equal expressions.
For example, if we are dealing with numbers, we can replace E + E by 2 × E; the replacement
would save time if the evaluation of E requires a significant amount of work. However, if E were not
referentially transparent — perhaps because its evaluation requires reading from an input stream or
the generation of random numbers — the replacement would change the meaning of the program.
2
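The distinction can be made concrete in C, where a function with a side effect stands in for an expression that reads input or generates random numbers (a sketch; the function names are illustrative):

```c
/* E + E can be replaced by 2 * E only if E is referentially transparent.
   pure() is; impure() reads and updates mutable state, so the
   "optimization" would change the program's meaning. */
static int calls = 0;

int pure(void)   { return 21; }
int impure(void) { return ++calls; }   /* stands in for reading input */

int rt_holds(void) {
    return pure() + pure() == 2 * pure();    /* 42 == 42: substitution safe */
}

int rt_fails(void) {
    calls = 0;
    int twice = impure() + impure();         /* 1 + 2 = 3 */
    calls = 0;
    return twice == 2 * impure();            /* 3 == 2 is false */
}
```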
Example 44: Literals in OOP. How should an OOPL treat literal constants, such as 7? The
following discussion is based on (Kölling 1999, pages 71–74).
There are two possible ways of interpreting the symbol “7”: we could interpret it as an object
constructor, constructing the object 7; or as a constant reference to an object that already exists.
Dee (Grogono 1991a) uses a variant of the first alternative: literals are constructors. This introduces
several problems. For example, what does the comparison 7 = 7 mean? Are there two 7-objects
or just one? The CM of Dee says that the first occurrence of the literal “7” constructs a new
object and subsequent occurrences return a reference to this (unique) object.
Smalltalk uses the second approach. Literals are constant references to objects. These objects,
however, “just magically exist in the Smalltalk universe”. Smalltalk does not explain where they
come from, or when and how they are created.
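The first interpretation, in which the first occurrence of a literal constructs an object that later occurrences share, amounts to interning. A minimal C sketch for small integer literals (the table and names are invented for illustration, not Dee's actual implementation):

```c
#include <stdlib.h>

/* A toy integer object and an interning table for literals 0..255.
   The first occurrence of a literal constructs the object; subsequent
   occurrences return a reference to the same unique object. */
typedef struct { int value; } IntObject;

static IntObject *interned[256];

IntObject *literal(int n) {          /* n assumed to be in 0..255 */
    if (interned[n] == NULL) {
        interned[n] = malloc(sizeof(IntObject));
        interned[n]->value = n;
    }
    return interned[n];
}

int seven_is_seven(void) {
    /* Under interning, "7 = 7" compares two references to one object. */
    return literal(7) == literal(7);
}
```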
Blue avoids these problems by introducing the new concept of a manifest class. Manifest classes
are defined by enumeration. The enumeration may be finite, as for class Boolean with members
true and false. Infinite enumerations, as for class Integer, are allowed, although we cannot see
the complete declaration.
The advantage of the Blue approach, according to Kölling (1999, page 72), is that:
. . . . the distinction is made at the logical level. The object model explicitly recognizes
these two different types of classes, and differences between numbers and complex
objects can be understood independently of implementation concerns. They
cease to be anomalies or special cases at a technical level. This reduces the dangers of
misunderstandings on the side of the programmer. [Emphasis added.]
2
Exercise 36. Do you think that Kölling’s arguments for introducing a new concept (manifest
classes) are justified? 2
Summary
Early CMs:
- modelled programs as code acting on data;
- structured programs by recursive decomposition of code.
Later CMs:
- modelled programs as packages of code and data;
- structured programs by recursive decomposition of packages of code and data.
The CM should:
- help programmers to reason about programs;
- help programmers to read and write programs;
- constrain but not determine the implementation.
11 Implementation
This course is concerned mainly with the nature of PLs. Implementation techniques are a separate,
and very rich, topic. Nevertheless, there are a few interesting things that we can say about
implementation, and that is the purpose of this section.
11.1 Compiling and Interpreting
11.1.1 Compiling
As we have seen, the need for programming languages at a higher level than assembly language
was recognized early. People also realized that the computer itself could perform the translation
from a high level notation to machine code. In the following quote, “conversion” is used in roughly
the sense that we would say “compilation” today.
On the other hand, large storage capacities will lead in the course of time to very
elaborate conversion routines. Programmes in the future may be written in a very
different language from that in which the machines execute them. It is to be hoped (and
it is indeed one of the objects of using a conversion routine) that many of the tiresome
blunders that occur in present-day programmes will be avoided when programmes can
be written in a language in which the programmer feels more at home. However, any
mistake that the programmer does commit may prove more difficult to find because it
will not be detected until the programme has been converted into machine language.
Before he can find the cause of the trouble the programmer will either have to investigate
the conversion process or enlist the aid of checking routines which will ‘reconvert’ and
provide him with diagnostic information in his own language. (Gill 1954)
A compiler translates source language to machine language. Since practical compilation systems
allow programs to be split into several components for compiling, there are usually several phases.
1. The components of the source program are compiled into object code.
2. The object code units and library functions required by the program are linked to form an
executable file.
3. The executable file is loaded into memory and executed.
Since machine code is executed directly, compiled programs ideally run at the maximum possible
speed of the target machine. In practice, the code generated by a compiler is usually not optimal.
For very small programs, a programmer skilled in assembly language may be able to write more
efficient code than a compiler can generate. For large programs, a compiler can perform global
optimizations that are infeasible for a human programmer and, in any case, hand-coding such large
programs would require excessive time.
All compilers perform some checks to ensure that the source code that they are translating is
correct. In early languages, such as FORTRAN, COBOL, PL/I, and C, the checking was quite
superficial and it was easy to write programs that compiled without errors but failed when they
were executed. The value of compile-time checking began to be recognized in the late 60s, and
languages such as Pascal, Ada, and C++ provided more thorough checking than their predecessors.
The goal of checking is to ensure that, if a program compiles without errors, it will not fail when
it is executed. This goal is hard to achieve and, even today, there are few languages that achieve
it totally.
Compiled PLs can be classified according to whether the compiler can accept program components
or only entire programs. An Algol 60 program is a block and must be compiled as a block. Standard
Pascal programs cannot be split up, but recent dialects of Pascal provide program components.
The need to compile programs in parts was recognized at an early stage: even FORTRAN was
implemented in this way. The danger of separate compilation is that there may be inconsistencies
between the components that the compiler cannot detect. We can identify various levels of checking
that have evolved.
- A FORTRAN compiler provides no checking between components. When the compiled components
are linked, the linker will report subroutines that have not been defined but it will
not detect discrepancies in the number or type of arguments. The result of such a discrepancy
is, at best, run-time failure.
- C and C++ rely on the programmer to provide header files containing declarations and implementation
files containing definitions. The compiler checks that declarations and definitions
are compatible.
- Overloaded functions in C++ present a problem because one name may denote several functions.
A key design decision in C++, however, was to use the standard linker (Stroustrup
1994, page 34). The solution is “name mangling” — the C++ compiler alters the names of
functions by encoding their argument types in such a way that each function has a unique
name.
- Modula-2 programs consist of interface modules and implementation modules. When the compiler
compiles an implementation module, it reads all of the relevant interfaces and performs
full checking. In practice, the interface modules are themselves compiled for efficiency, but
this is not logically necessary.
- Eiffel programs consist of classes that are compiled separately. The compiler automatically
derives an interface from the class declaration and uses the generated interfaces of other classes
to perform full checking. Other OOPLs that use this method include Blue, Dee, and Java.
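The name-mangling idea mentioned above for overloaded C++ functions can be sketched in C. The encoding used here (a length-prefixed name followed by one-letter type codes) is invented for illustration and does not match any real compiler's scheme:

```c
#include <stdio.h>
#include <string.h>

/* Toy name mangling: encode a function's name and parameter types into
   a single linker-level name, so that overloaded functions get distinct
   names. Invented codes: i = int, d = double, P = pointer. */
const char *mangle(const char *name, const char *argtypes) {
    static char out[128];
    snprintf(out, sizeof out, "_Z%zu%s_%s", strlen(name), name, argtypes);
    return out;
}
/* f(int, double) mangles to "_Z1f_id"; f(int*) mangles to "_Z1f_Pi",
   so the standard linker sees two different symbols. */
```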
11.1.2 Interpreting
An interpreter accepts source language and run-time input data and executes the program di-
rectly. It is possible to interpret without performing any translation at all, but this approach
tends to be inefficient. Consequently, most interpreters do a certain amount of translation before
executing the code. For example, the source program might be converted into an efficient internal
representation, such as an abstract syntax tree, which is then “executed”.
Interpreting is slower than executing compiled code, typically by a factor of 10 or more. Since no
time is spent in compiling, however, interpreted languages can provide immediate response. For
this reason, prototyping environments often provide an interpreted language. Early BASIC and
LISP systems relied on interpretation, but more recent interactive systems provide an option to
compile and may even provide compilation only.
Most interpreters check correctness of programs during execution, although some interpreters may
report obvious (e.g., syntactic) errors as the program is read. Languages such as LISP and Smalltalk
are often said to be untyped, because programs do not contain type declarations, but this is
incorrect. The type checking performed by both of these languages is actually more thorough than
that performed by C and Pascal compilers; the difference is that errors are not detected until the
offending statements are actually executed. For safety-critical applications, this may be too late.
11.1.3 Mixed Implementation Strategies
Some programming environments, particularly those for LISP, provide both interpretation and
compilation. In fact, some LISP environments allow source code and compiled code to be freely
mixed.
Byte Codes. Another mixed strategy consists of using a compiler to generate byte codes and an
interpreter to execute the byte codes. The “byte codes”, as the name implies, are simply streams
of bytes that represent the program: a machine language for an “abstract machine”. One of the
first programming systems to be based on byte codes was the Pascal “P” compiler.
The advantage of byte code programs is that they can, in principle, be run on any machine. Thus
byte codes provide a convenient way of implementing “portable” languages.
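A byte-code interpreter in the style described here can be very small. The following C sketch defines an invented four-instruction stack machine; P-code and Java byte code are, of course, far richer:

```c
/* A minimal stack-based abstract machine with four instructions. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

int run(const int *code) {
    int stack[64], sp = 0;
    for (int pc = 0; ; ) {
        switch (code[pc++]) {
        case OP_PUSH: stack[sp++] = code[pc++];        break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}

int bytecode_demo(void) {
    /* (2 + 3) * 4 as byte codes; the same stream runs unchanged on any
       machine that has an interpreter for this instruction set. */
    int code[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD,
                   OP_PUSH, 4, OP_MUL, OP_HALT };
    return run(code);
}
```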
The Pascal “P” compiler provided a simple method for porting Pascal to a new kind of machine.
The “Pascal P package” consisted of the P-code compiler written in P-code. A Pascal implementor
had to perform the following steps:
1. Write a P-code interpreter. Since P-code is quite simple, this is not a very arduous job.
2. Use the P-code interpreter to interpret the P-code compiler. This step yields a P-code pro-
gram that can be used to translate Pascal programs to P-code. At this stage, the implementor
has a working Pascal system, but it is inefficient because it depends on the P-code interpreter.
3. Modify the compiler source so that the compiler generates machine code for the new machine.
This is a somewhat tedious task, but it is simplified by the fact that the compiler is written
in Pascal.
4. Use the compiler created in step 2 to compile the modified compiler. The result of this step
is a P-code program that translates Pascal to machine code.
5. Use the result of step 4 to compile the modified compiler. The result is a machine code
program that translates Pascal to machine code — in other words, a Pascal compiler for the
new machine.
Needless to say, all of these steps must be carried out with great care. It is important to ensure
that each step is completed correctly before beginning the next.
With the exercise of due caution, the process described above (called bootstrapping) can be
simplified. For example, the first version of the compiler, created in step 2, does not have to
compile the entire Pascal language, but only those parts that are needed by the compiler itself.
The complete sequence of steps can be used to generate a Pascal compiler that can compile itself,
but may be incomplete in other respects. When that compiler is available, it can be extended to
compile the full language.
Java uses byte codes to obtain portability in a different way. Java was designed to be used as an
internet language. Any program that is to be distributed over the internet must be capable of
running on all the servers and clients of the network. If the program is written in a conventional
language, such as C, there must be a compiled version for every platform (a platform consists of
a machine and operating system). If the program is written in Java, it is necessary only to provide
a Java byte code interpreter for each platform.
Just In Time. “Just in time” (JIT) compilation is a recent strategy that is used for Java and a
few research languages. The idea is that the program is interpreted, or perhaps compiled rapidly
without optimization. As each statement is executed, efficient code is generated for it. This
approach is based on the following assumptions:
- in a typical run, much of the code of a program will not be executed;
- if a statement has been executed once, it is likely to be executed again.
JIT provides fast response, because the program can be run immediately, and efficient execution,
because the second and subsequent cycles of a loop are executed from efficiently compiled code.
Thus JIT combines some of the advantages of interpretation and compilation.
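The control flow behind these assumptions can be sketched schematically in C. Real JIT compilers generate machine code; that step is only modelled here by a flag and counters, and all names are illustrative:

```c
/* Schematic JIT dispatch: a statement is interpreted on its first
   execution, and "compiled" (modelled by a flag) for subsequent runs. */
typedef struct {
    int hot;            /* has efficient code been generated yet? */
    int interp_runs;    /* executions via the slow interpreter */
    int compiled_runs;  /* executions via the generated fast code */
} Stmt;

void execute(Stmt *s) {
    if (!s->hot) {
        s->interp_runs++;    /* slow path: interpret once... */
        s->hot = 1;          /* ...and generate code for next time */
    } else {
        s->compiled_runs++;  /* fast path */
    }
}

int jit_demo(void) {
    Stmt loop_body = {0, 0, 0};
    for (int i = 0; i < 10; i++)
        execute(&loop_body);
    /* one interpreted execution, nine from "compiled" code */
    return loop_body.interp_runs * 100 + loop_body.compiled_runs;
}
```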
Interactive Systems A PL can be designed for writing large programs that run autonomously
or for interacting with the user. The categories overlap: PLs that can be used interactively, such
as BASIC, LISP, SML, and Smalltalk, can also be used to develop large programs. The converse
is not true: an interactive environment for C++ would not be very useful.
An interactive PL is usually interpreted, but interpretation is not the only possible means of
implementation. PLs that start out with interpreters often acquire compilers at a later stage —
BASIC, LISP, and a number of other interactive PLs followed this pattern of development.
A PL is suitable for interactive use if programs can be described in small chunks. FPLs are
often supported interactively because a programmer can build up a program as a collection of func-
tion definitions, where most functions are fairly small (e.g., they fit on a screen). Block structured
languages, and modular languages, are less suitable for this kind of program development.
11.1.4 Consequences for Language Design
Any PL can be compiled or interpreted. A PL intended for interpretation, however, is usually
implemented interactively. This implies that the user must be able to enter short sequences of
code that can be executed immediately in the current context. Many FPLs are implemented in
this way (often with a compiler); the user can define new functions and invoke any function that
has previously been defined.
11.2 Garbage Collection
There is a sharp distinction: either a PL is implemented with garbage collection (GC) or it is
not. The implementor does not really have a choice. An implementation of LISP, Smalltalk, or
Java without GC would be useless — programs would exhaust memory in a few minutes at most.
Conversely, it is almost impossible to implement C or C++ with GC, because programmers can
perform arithmetic on pointers.
The imperative PL community has been consistently reluctant to incorporate GC: there is no
widely-used imperative PL that provides GC. The functional PL community has been equally
consistent, but in the opposite direction. The first FPL, LISP, provides GC, and so do all of its
successors.
The OOP community is divided on the issue of GC, but most OOPLs provide it: Simula, Smalltalk,
CLU, Eiffel, and Java. The exception, of course, is C++, for reasons that have been clearly
explained (Stroustrup 1994, page 219).
I deliberately designed C++ not to rely on [garbage collection]. I feared the very
significant space and time overheads I had experienced with garbage collectors. I feared
the added complexity of implementation and porting that a garbage collector would
impose. Also, garbage collection would make C++ unsuitable for many of the low-level
tasks for which it was intended. I like the idea of garbage collection as a mechanism
that simplifies design and eliminates a source of errors. However, I am fully convinced
that had garbage collection been an integral part of C++ originally, C++ would have
been stillborn.
It is interesting to study Stroustrup’s arguments in detail. In his view, the “fundamental reasons”
that garbage collection is desirable are (Stroustrup 1994, page 220):
1. Garbage collection is easiest for the user. In particular, it simplifies the building and use of
some libraries.
2. Garbage collection is more reliable than user-supplied memory management schemes for
some applications.
The emphasis is added: Stroustrup makes it clear that he would have viewed GC in a different
light if it provided advantages for all applications.
Stroustrup provides more arguments against GC than in favour of it (Stroustrup 1994, page 220):
1. Garbage collection causes run-time and space overheads that are not affordable for many
current C++ applications running on current hardware.
2. Many garbage collection techniques imply service interruptions that are not acceptable for
important classes of applications, such as hard real-time applications, device drivers, control
applications, human interface code on slow hardware, and operating system kernels.
3. Some applications do not have the hardware resources of a traditional general-purpose com-
puter.
4. Some garbage collection schemes require banning several basic C facilities such as pointer
arithmetic, unchecked arrays, and unchecked function arguments as used by printf().
5. Some garbage collection schemes impose constraints on object layout or object creation that
complicate interfacing with other languages.
It is hard to argue with Stroustrup’s conclusion (Stroustrup 1994, page 220):
I know that there are more reasons for and against, but no further reasons are needed.
These are sufficient arguments against the view that every application would be better
done with garbage collection. Similarly, these are sufficient arguments against the view
that no application would be better done with garbage collection. (Stroustrup 1994,
page 220) [Emphasis in the original.]
Stroustrup’s arguments do not preclude GC as an option for C++ and, in fact, Stroustrup advo-
cates this. But it remains true that it is difficult to add GC to C++ and the best that can be done
is probably “conservative” GC. In the meantime, most C++ programmers spend large amounts of
time figuring out when to call destructors and debugging the consequences of their wrong decisions.
Conservative GC depends on the fact that it is possible to guess with a reasonable degree of
accuracy whether a particular machine word is a pointer (Boehm and Weiser 1988). In typical
modern processors, a pointer is a 4-byte quantity, aligned on a word boundary, and its value is a
legal address. By scanning memory and noting words with these properties, the GC can identify
regions of memory that might be in use. The problem is that there will occasionally be words
that look like pointers but actually are not: this results in about 10% of garbage being retained.
The converse, and more serious, problem of identifying live data as garbage cannot occur: hence
the term “conservative”.
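The word-guessing heuristic can be sketched in C. The alignment and range tests below follow the description above, with heap_lo and heap_hi standing in for the collector's knowledge of the heap (illustrative values and names, not the Boehm–Weiser code):

```c
#include <stdint.h>

/* Conservative pointer-finding heuristic: a word "might be a pointer"
   if it is aligned on a word boundary and falls inside the heap. */
static uintptr_t heap_lo, heap_hi;

int might_be_pointer(uintptr_t word) {
    return word % sizeof(void *) == 0   /* aligned on a word boundary */
        && word >= heap_lo              /* ...and a legal heap address */
        && word < heap_hi;
}

int conservative_demo(void) {
    heap_lo = 0x10000; heap_hi = 0x20000;
    int hit  = might_be_pointer(0x10008);  /* plausible pointer: retain */
    int miss = might_be_pointer(0x10003);  /* misaligned integer: ignore */
    return hit * 10 + miss;
}
```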
The implementation of Eiffel provided by Meyer’s own company, ISE, has a garbage collector that
can be turned on and off. Eiffel, however, is a public-domain language that can be implemented
by anyone who chooses to do so — there are in fact a small number of companies other than ISE
that provide Eiffel. Implementors are encouraged, but not required, to provide GC. Meyer (1992,
page 335) recommends that an Eiffel garbage collector should have the following properties.
- It should be efficient, with a low overhead on execution time.
- It should be incremental, working in small, frequent bursts of activity rather than waiting for
memory to fill up.
- It should be tunable. For example, programmers should be able to switch off GC in critical
sections and to request a full clean-up when it is safe to do so.
The concern about the efficiency of GC is the result of the poor performance of early, mark-sweep
collectors. Modern garbage collectors have much lower overheads and are “tunable” in Meyer’s
sense. Larose and Feeley (1999), for example, describe a garbage collector for a Scheme compiler.
Programs written in Erlang, translated into Scheme, and compiled with this compiler are used in
ATM switches. (Erlang is a proprietary functional language developed by Ericsson.)
GC has divided the programming community into two camps: those who use PLs with GC and
consider it indispensable, and those who use PLs without GC and consider it unnecessary. Experience
with C++ is beginning to affect the polarization, as the difficulties and dangers of explicit
memory management in complex OO systems become apparent. Several commercial products
that offer “conservative” GC are now available and their existence suggests that developers will
increasingly respect the contribution of GC.
12 Conclusion
Values and Objects. Mathematically oriented views of programming favour values. Simulation
oriented views favour objects. Both views are useful for particular applications. A PL that
provides both in a consistent and concise way might be simple, expressive, and powerful.
The “logical variable” (a variable that is first unbound, then bound, and later may be unbound
during backtracking) is a third kind of entity, after values and objects. It might be fruitful to
design a PL with objects, logical variables, unification, and backtracking. Such a language
would combine the advantages of OOP and logic programming.
Making Progress. It is not hard to design a large and complex PL by throwing in numerous
features. The hard part of design is to find simple, powerful abstractions — to achieve more
with less. In this respect, PL design resembles mathematics. The significant advances in
mathematics are often simplifications that occur when structures that once seemed distinct
are united in a common abstraction. Similar simplifications have occurred in the evolution
of PLs: for example, Simula (Section 6.1) and Scheme (Section 4.2). But programs are not
mathematical objects. A rigorously mathematical approach can lead all too rapidly to the
“Turing tar pit”.
Practice and Theory. The evolution of PLs shows that, most of the time, practice leads theory.
Designers and implementors introduce new ideas, then theoreticians attempt to explain what they
did and how they could have done it better. There are a number of PLs that have been based
on purely theoretical principles but few, if any, are in widespread use.
Theory-based PLs are important because they provide useful insights. Ideally, they serve
as testing environments for ideas that may eventually be incorporated into new mainstream
PLs. Unfortunately, it is often the case that they show retroactively how something should
have been done when it is already too late to improve it.
Multiparadigm Programming. PLs have evolved into several strands: procedural program-
ming, now more or less subsumed by object oriented programming; functional programming;
logic programming and its successor, constraint programming. Each paradigm performs well
on a particular class of applications. Since people are always searching for “universal” solu-
tions, it is inevitable that some will try to build a “universal” PL by combining paradigms.
Such attempts will be successful to the extent that they achieve overall simplification. It is
not clear that any future universal language will be accepted by the programming community;
it seems more likely that specialized languages will dominate in the future.
References
Abelson, H. and G. Sussman (1985). Structure and Interpretation of Computer Programs. MIT
Press.
Åke Wikström (1987). Functional Programming Using Standard ML. Prentice Hall.
Apt, K. R., J. Brunekreef, V. Partington, and A. Schaerf (1998, September). Alma-0: an impera-
tive language that supports declarative programming. ACM Trans. Programming Languages
and Systems 20(5), 1014–1066.
Arnold, K. and J. Gosling (1998). The Java Programming Language (Second ed.). The Java
Series. Addison Wesley.
Ashcroft, E. and W. Wadge (1982). Rx for semantics. ACM Trans. Programming Languages and
Systems 4(2), 283–294.
Backus, J. (1957, February). The FORTRAN automatic coding system. In Proceedings of the
Western Joint Computer Conference.
Backus, J. (1978, August). Can programming be liberated from the von Neumann style? A
functional style and its algebra of programs. Communications of the ACM 21(8), 613–641.
Bergin, Jr., T. J. and R. G. Gibson, Jr. (Eds.) (1996). History of Programming Languages–II.
ACM Press (Addison Wesley).
Boehm, H.-J. and M. Weiser (1988). Garbage collection in an uncooperative environment. Soft-
ware: Practice & Experience 18(9), 807–820.
Bohm, C. and G. Jacopini (1966, May). Flow diagrams, Turing machines, and languages with
only two formation rules. Communications of the ACM 9(5), 366–371.
Bratko, I. (1990). PROLOG: Programming for Artificial Intelligence (Second ed.). Addison Wes-
ley.
Brinch Hansen, P. (1999, April). Java’s insecure parallelism. ACM SIGPLAN 34(4), 38–45.
Brooks, F. P. (1975). The Mythical Man-Month. Addison Wesley.
Brooks, F. P. (1987, April). No silver bullet: Essence and accidents of software engineering.
IEEE Computer 20(4), 10–18.
Brooks, F. P. (1995). The Mythical Man-Month (Anniversary ed.). Addison Wesley.
Budd, T. A. (1995). Multiparadigm Programming in Leda. Addison Wesley.
Church, A. (1941). The calculi of lambda-conversion. In Annals of Mathematics Studies, Vol-
ume 6. Princeton University Press.
Clarke, L. A., J. C. Wilden, and A. L. Wolf (1980). Nesting in Ada programs is for the birds. In
Proc. ACM SIGPLAN Symposium on the Ada programming Language. Reprinted as ACM
SIGPLAN Notices, 15(11):139–145, November 1980.
Colmerauer, A., H. Kanoui, P. Roussel, and R. Pasero (1973). Un système de communication
homme-machine en français. Technical report, Groupe de Recherche en Intelligence Artifi-
cielle, Université d'Aix-Marseille.
Dahl, O.-J., E. W. Dijkstra, and C. A. R. Hoare (1972). Structured Programming. Academic
Press.
Dijkstra, E. W. (1968, August). Go to statement considered harmful. Communications of the
ACM 11(8), 538.
Gelernter, D. and S. Jagannathan (1990). Programming Linguistics. MIT Press.
Gill, S. (1954). Getting programmes right. In International Symposium on Automatic Digi-
tal Computing. National Physical Laboratory. Reprinted as pp. 289–292 of (Williams and
Campbell-Kelly 1989).
Griswold, R. E. and M. T. Griswold (1983). The Icon Programming Language. Prentice Hall.
Grogono, P. (1991a, January). The Dee report. Technical Report
OOP–91–2, Department of Computer Science, Concordia University.
http://www.cs.concordia.ca/~faculty/grogono/dee.html.
Grogono, P. (1991b, January). Issues in the design of an object oriented
programming language. Structured Programming 12(1), 1–15. Available as
www.cs.concordia/~faculty/grogono/oopissue.ps.
Grogono, P. and P. Chalin (1994, May). Copying, sharing, and aliasing. In Colloquium on Object
Orientation in Databases and Software Engineering (ACFAS’94), Montreal, Quebec. Avail-
able as www.cs.concordia/~faculty/grogono/copying.ps.
Hanson, D. R. (1981, August). Is block structure necessary? Software: Practice and Experi-
ence 11(8), 853–866.
Harbison, S. P. and G. L. Steele, Jr. (1987). C: A Reference Manual (second ed.). Prentice Hall.
Hoare, C. A. R. (1968). Record handling. In Programming Languages, pp. 291–347. Academic
Press.
Hoare, C. A. R. (1974). Hints on programming language design. In C. Bunyan (Ed.), Computer
Systems Reliability, Volume 20 of State of the Art Report, pp. 505–34. Pergamon/Infotech.
Reprinted as pages 193–216 of (Hoare 1989).
Hoare, C. A. R. (1975, June). Recursive data structures. Int. J. Computer and Information
Science 4(2), 105–32. Reprinted as pages 217–243 of (Hoare 1989).
Hoare, C. A. R. (1989). Essays in Computing Science. Prentice Hall. Edited by C. B. Jones.
Jensen, K. and N. Wirth (1976). Pascal: User Manual and Report. Springer-Verlag.
Joyner, I. (1992). C++? a critique of C++. Available from ian@syacus.acus.oz.au. Second
Edition.
Kay, A. C. (1996). The early history of Smalltalk. In History of Programming Languages–II, pp.
511–578. ACM Press (Addison Wesley).
Kleene, S. C. (1936). General recursive functions of natural numbers. Mathematische An-
nalen 112, 727–742.
Knuth, D. E. (1974). Structured programming with GOTO statements. ACM Computing Sur-
veys 6(4), 261–301.
Kölling, M. (1999). The Design of an Object-Oriented Environment and Language for Teaching.
Ph. D. thesis, Basser Department of Computer Science, University of Sydney.
Kowalski, R. A. (1974). Predicate logic as a programming language. In Information processing
74, pp. 569–574. North Holland.
LaLonde, W. and J. Pugh (1995, March/April). Complexity in C++: a Smalltalk perspective.
Journal of Object-Oriented Programming 8(1), 49–56.
Landin, P. J. (1965, August). A correspondence between Algol 60 and Church’s lambda-notation.
Communications of the ACM 8, 89–101 and 158–165.
Larose, M. and M. Feeley (1999). A compacting incremental collector and its performance in a
production quality compiler. In ACM SIGPLAN International Symposium on Memory Man-
agement (ISMM'98). Reprinted as ACM SIGPLAN Notices 34(3):1–9, March 1999.
Lientz, B. P. and E. B. Swanson (1980). Software Maintenance Management. Addison Wesley.
Lindsey, C. H. (1996). A history of Algol 68. In History of Programming Languages, pp. 27–83.
ACM Press (Addison Wesley).
Liskov, B. (1996). A history of CLU. In History of Programming Languages–II, pp. 471–496.
ACM Press (Addison Wesley).
Liskov, B. and J. Guttag (1986). Abstraction and Specification in Program Development. MIT
Press.
Liskov, B. and S. Zilles (1974, April). Programming with abstract data types. ACM SIGPLAN
Notices 9(4), 50–9.
MacLennan, B. J. (1983, December). Values and objects in programming languages. ACM SIG-
PLAN Notices 17(12), 70–79.
Martin, J. J. (1986). Data Types and Data Structures. Prentice Hall International.
McAloon, K. and C. Tretkoff (1995). 2LP: Linear programming and logic programming. In P. V.
Hentenryck and V. Saraswat (Eds.), Principles and Practice of Constraint Programming,
pp. 101–116. MIT Press.
McCarthy, J. (1960, April). Recursive functions of symbolic expressions and their computation
by machine, Part I. Communications of the ACM 3(4), 184–195.
McCarthy, J. (1978). History of LISP. In R. L. Wexelblat (Ed.), History of Programming Lan-
guages, pp. 173–197. Academic Press.
McKee, J. R. (1984). Maintenance as a function of design. In Proc. AFIPS National Computer
Conference, pp. 187–93.
Meyer, B. (1988). Object-oriented Software Construction. Prentice Hall International. Second
Edition, 1997.
Meyer, B. (1992). Eiffel: the Language. Prentice Hall International.
Milne, R. E. and C. Strachey (1976). A Theory of Programming Language Semantics. John
Wiley.
Milner, R. and M. Tofte (1991). Commentary on Standard ML. MIT Press.
Milner, R., M. Tofte, and R. Harper (1990). The Definition of Standard ML. MIT Press.
Mutch, E. N. and S. Gill (1954). Conversion routines. In International Symposium on Automatic
Digital Computing. National Physical Laboratory. Reprinted as pp. 283–289 of (Williams and
Campbell-Kelly 1989).
Naur, P. (1978). The European side of the last phase of the development of Algol. In History of
Programming Languages. ACM Press. Reprinted as pages 92–137 of (Wexelblat 1981).
Naur, P. et al. (1960, May). Report on the algorithmic language Algol 60. Communications of
the ACM 3(5), 299–314.
Nygaard, K. and O.-J. Dahl (1978). The development of the SIMULA languages. In R. L.
Wexelblat (Ed.), History of Programming Languages, pp. 439–478. Academic Press.
Parnas, D. L. (1972, December). On the criteria to be used in decomposing systems into modules.
Communications of the ACM 15(12), 1053–1058.
Perlis, A. J. (1978). The American side of the development of Algol. In History of Programming
Languages. ACM Press. Reprinted as pages 75–91 of (Wexelblat 1981).
Radin, G. (1978). The early history and characteristics of PL/I. In History of Programming
Languages. ACM Press. Reprinted as pages 551–574 of (Wexelblat 1981).
Ritchie, D. M. (1996). The development of the C programming language. In History of Program-
ming Languages, pp. 671–686. ACM Press (Addison Wesley).
Robinson, A. J. (1965). A machine-oriented logic based on the resolution principle. Journal of
the ACM 12, 23–41.
Sakkinen, M. (1988). On the darker side of C++. In S. Gjessing and K. Nygaard (Eds.), Pro-
ceedings of the 1988 European Conference of Object Oriented Programming, pp. 162–176.
Springer LNCS 322.
Sakkinen, M. (1992). The darker side of C++ revisited. Structured Programming 13.
Sammet, J. (1969). Programming Languages: History and Fundamentals. Prentice Hall.
Sammet, J. E. (1978). The early history of COBOL. In History of Programming Languages.
ACM Press. Reprinted as pages 199–241 of (Wexelblat 1981).
Schwartz, J. T., R. B. K. Dewar, E. Dubinsky, and E. Schonberg (1986). Programming with sets
— an introduction to SETL. Springer Verlag.
Steele, Jr., G. L. (1996). The evolution of LISP. In History of Programming Languages, pp.
233–308. ACM Press (Addison Wesley).
Steele, Jr., G. L. et al. (1990). Common LISP: the Language (second ed.). Digital Press.
Steele, Jr., G. L. and G. J. Sussman (1975). Scheme: an interpreter for the extended lambda
calculus. Technical Report Memo 349, MIT Artificial Intelligence Laboratory.
Strachey, C. (1966). Towards a formal semantics. In T. B. Steel (Ed.), Formal Language De-
scription Languages for Computer Programming. North-Holland.
Stroustrup, B. (1994). The Design and Evolution of C++. Addison Wesley.
Taivalsaari, A. (1993). A Critical View of Inheritance and Reusability in Object-oriented Pro-
gramming. Ph. D. thesis, University of Jyväskylä.
Teale, S. (1993). C++ IOStreams Handbook. Addison Wesley. Reprinted with corrections, Febru-
ary 1994.
Turing, A. M. (1936). On computable numbers with an application to the Entscheidungsproblem.
Proc. London Math. Soc. 2(42), 230–265.
Turner, D. (1976). The SASL Language Manual. Technical report, University of St. Andrews.
Turner, D. (1979). A new implementation technique for applicative languages. Software: Practice
& Experience 9, 31–49.
Turner, D. (1981). KRC language manual. Technical report, University of Kent.
Turner, D. (1985, September). Miranda: a non-strict functional language with polymorphic
types. In Proceedings of the Second International Conference on Functional Programming
Languages and Computer Architecture. Springer LNCS 301.
van Wijngaarden, A. et al. (1975). Revised report on the algorithmic language Algol 68. Acta
Informatica 5, 1–236. (In three parts.).
Wexelblat, R. L. (Ed.) (1981). History of Programming Languages. Academic Press. From the
ACM SIGPLAN History of programming Languages Conference, 1–3 June 1978.
Wheeler, D. J. (1951). Planning the use of a paper library. In Report of a Conference on High
Speed Automatic Calculating-Engines. University Mathematical Laboratory, Cambridge.
Reprinted as pp. 42–44 of (Williams and Campbell-Kelly 1989).
Whitaker, W. A. (1996). Ada — the project. In History of Programming Languages, pp. 173–228.
ACM Press (Addison Wesley).
Williams, M. R. and M. Campbell-Kelly (Eds.) (1989). The Early British Computer Conferences,
Volume 14 of The Charles Babbage Institute Reprint Series for the History of Computing.
The MIT Press and Tomash Publishers.
Wirth, N. (1971, April). Program development by stepwise refinement. Communications of the
ACM 14(4), 221–227.
Wirth, N. (1976). Algorithms + Data Structures = Programs. Prentice Hall.
Wirth, N. (1982). Programming in Modula-2. Springer-Verlag.
Wirth, N. (1996). Recollections about the development of Pascal. In History of Programming
Languages, pp. 97–110. ACM Press (Addison Wesley).
A Abbreviations and Glossary
Actual parameter. A value passed to a function.
ADT. Abstract data type. A data type described in terms of the operations that can be performed
on the data.
AR. Activation record. The run-time data structure corresponding to an Algol 60 block. Sec-
tion 3.3.
Argument. A value passed to a function; synonym of actual parameter.
Bind. Associate a property with a name.
Block. In Algol 60, a syntactic unit containing declarations of variables and functions, followed
by code. Successors of Algol 60 define blocks in similar ways. In Smalltalk, a block is a
closure consisting of code, an environment with bindings for local variables, and possibly
parameters.
BNF. Backus-Naur Form (originally Backus Normal Form). A notation for describing a context-
free grammar, introduced by John Backus for describing Algol 60. Section B.4.2.
Call by need. The argument of a function is not evaluated until its value is needed, and then
only once. See lazy evaluation.
CBN. Call by name. A parameter passing method in which actual parameters are evaluated when
they are needed by the called function. Section 3.3
CBV. Call by value. A parameter passing method in which the actual parameter is evaluated
before the function is called.
Closure. A data structure consisting of one (or more, but usually one) function and its lexical
environment. Section 4.2.
CM. Computational Model. Section 10.2
Compile. Translate source code (program text) into code for a physical (or possibly abstract)
machine.
CS. Computer Science.
Delegation. An object that accepts a request and then forwards the request to another object is
said to “delegate” the request.
Dummy parameter. Formal parameter (FORTRAN usage).
Dynamic. Used to describe something that happens during execution, as opposed to static.
Dynamic Binding. A binding that occurs during execution.
Dynamic Scoping. The value of a non-local variable is the most recent value that has been
assigned to it during the execution of the program. Cf. Lexical Scoping. Section 4.1.
Dynamic typing. Performing type checking “on the fly”, while the program is running.
Eager evaluation. The arguments of a function are evaluated at the time that it is called, whether
or not their values are needed.
Early binding. Binding a property to a name at an early stage, usually during compilation.
Exception handling. A PL feature that permits a computation to be abandoned; control is
transferred to a handler in a possibly non-local scope.
Extent. The time during execution for which a variable is active. It is a dynamic property, as
opposed to scope.
Formal parameter. A variable name that appears in a function heading; the variable will receive
a value when the function is called. “Parameter” without qualification usually means “formal
parameter”.
FP(L). Functional Programming (Language). Section 4.
Garbage collection. Recycling of unused memory when performed automatically by the imple-
mentation rather than coded by the programmer. (The term “garbage collection” is widely
used but inaccurate. The “garbage” is thrown away: it is the useful data that is collected!)
GC. Garbage collection.
Global scope. A name that is accessible throughout the (static) program text has global scope.
Heap. A region for storing variables in which no particular order of allocation or deallocation is
defined.
High order function. A function that either accepts a function as an argument, or returns a
function as a value, or both.
Higher order function. Same as high order function.
Inheritance. In OOPLs, when an object (or class) obtains some of its features from another
object (or class), it is said to inherit those features.
Interpret. Execute a program one statement at a time, without translating it into machine code.
Lambda calculus. A notation for describing functions, using the Greek letter λ, introduced by
Church.
Late binding. A binding (that is, attachment of a property to a name) that is performed at a later
time than would normally be expected. For example, procedural languages bind functions to
names during compilation, but OOPLs “late bind” function names during execution.
Lazy evaluation. If the implementation uses lazy evaluation, or call by need, a function does not
evaluate an argument until the value of the argument is actually required for the computation.
When the argument has been evaluated, its value is stored and it is not evaluated again during
the current invocation of the function.
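The mechanism described in this entry can be sketched in a few lines. The following Python Thunk class is purely illustrative (it is not taken from any particular library): it delays a computation and caches the result so the suspension is evaluated at most once.

```python
class Thunk:
    """Call by need: delay a computation, then cache its result."""
    def __init__(self, compute):
        self.compute = compute      # a zero-argument function (the suspension)
        self.evaluated = False
        self.value = None

    def force(self):
        if not self.evaluated:      # evaluate at most once
            self.value = self.compute()
            self.evaluated = True
        return self.value

calls = []

def expensive():
    calls.append(1)                 # record each actual evaluation
    return 6 * 7

t = Thunk(expensive)                # nothing evaluated yet
assert t.force() == 42              # evaluated now, on first demand
assert t.force() == 42              # cached; expensive() not called again
assert len(calls) == 1
```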
Lexical Scoping. The value of a non-local variable is determined by the static context, that is,
by the text of the program. Cf. Dynamic Scoping. Section 4.1.
Local scope. A name that is accessible in only a part of the program text has local scope.
OOP(L). Object Oriented Programming (Language). Section 6.
Orthogonal. Two language features are orthogonal if they can be combined without losing any
of their important properties. Section 3.6.
Overloading. Using the same name to denote more than one entity (the entities are usually
functions).
Own variable. A local variable in an Algol 60 program that has local scope (it can be accessed
only within the procedure in which it is declared) but global extent (it is created when the
procedure is first called and persists until the program terminates). Section 3.3.
Parameter. Literally “unknown” or “unmeasurable” (from Greek). Used in PLs for the objects
passed to functions.
Partial function. A function that is defined over only a part of its domain. √x is partial over
the domain of reals because it has no real value if x < 0.
PL. Programming Language.
Polymorphism. Using one name to denote several distinct objects. (Literally “many shapes”).
Recursion. A process or object that is defined partly in terms of itself without circularity.
Reference semantics. Variable names denote objects. Used by all functional and logic languages,
and most OOPLs.
RE. Regular Expression. Section B.4.
Scope. The region of text in which a variable is accessible. It is a static property, as opposed to
extent.
Semantic Model. An alternative term for Computational Model (CM). Section 10.2.
Semantics. A system that assigns meaning to programs.
Stack. A region for storing variables in which the most recently allocated unit is the first one to
be deallocated (“last-in, first out”, LIFO). The name suggests a stack of plates.
Static. Used to describe something that happens during compilation (or possibly during linking
but, anyway, before the program is executed), as opposed to dynamic.
Static Binding. A value that is bound before the program is executed. Cf. Dynamic Binding.
Static typing. Type checking performed during compilation.
Strong typing. Detect and report all type errors, as opposed to weak typing.
Template. A syntactic construct from which many distinct objects can be generated. A template
is different from a declaration that constructs only a single object. (The word “template”
has a special sense in C++, but the usage is more or less compatible with our definition.)
Thunk. A function with no parameters used to implement the call by name mechanism of Algol
60 and a few other languages.
Total function. A function that is defined for all values of its arguments in its domain. e^x is
total over the domain of reals.
Value semantics. Variable names denote locations. Most imperative languages use value seman-
tics.
Weak typing. Detect and report some type errors, but allow some possible type errors to remain
in the program.
B Theoretical Issues
As stated in Section 1, theory is not a major concern of the course. In this section, however, we
briefly review the contributions that theory has made to the development of PLs and discuss the
importance and relevance of these contributions.
B.1 Syntactic and Lexical Issues
There is a large body of knowledge concerning formal languages. There is a hierarchy of languages
(the “Chomsky hierarchy”) and, associated with each level of the hierarchy, formal machines that
can “recognize” the corresponding languages and perform other tasks. The first two levels of the
hierarchy are the most familiar to programmers: regular languages and finite state automata;
and context-free languages and push-down automata. Typical PLs have context-free (or almost
context-free) grammars, and their tokens, or lexemes, can be described by a regular language.
Consequently, programs can be split into lexemes by a finite state automaton and parsed by a
push-down automaton.
It might appear, then, that the problem of PL design at this level is “solved”. Unfortunately,
there are exceptions. The grammar of C++, for example, is not context-free. This makes the
construction of a C++ parser unnecessarily difficult and accounts for the slow development of
C++ programming tools. The preprocessor is another obstacle that affects program development
environments in both C and C++.
Exercise 37. Give examples to demonstrate the difficulties that the C++ preprocessor causes in
a program development environment. 2
B.2 Semantics
There are many kinds of semantics, including: algebraic, axiomatic, denotational, operational,
and others. They are not competitive but rather complementary. There are many applications of
semantics, and a particular semantic system may be more or less appropriate for a particular task.
For example, a denotational semantics is a mapping from program constructs to abstract math-
ematical entities that represent the “meaning” of the constructs. Denotational semantics is an
effective language for communication between the designer and implementors of a PL but is not
usually of great interest to a programmer who uses the PL. An axiomatic semantics, on the other
hand, is not very useful to an implementor but may be very useful to a programmer who wishes
to prove the correctness of a program.
The basic ideas of denotational semantics are due to Landin (1965) and Strachey (1966). The most
complete description was given by Milne (1976) after Strachey's death. Denotational semantics
is a powerful descriptive tool: it provides techniques for mapping almost any PL feature into a
high order mathematical function. Denotational semantics developed as a descriptive technique
and was extended to handle increasingly arcane features, such as jumps into blocks.
Ashcroft and Wadge published an interesting paper (1982), noting that PL features that were
easy to implement were often hard to describe and vice versa. They suggested that denotational
semantics should be used prescriptively rather than descriptively. In other words, a PL designer
should start with a simple denotational semantics and then figure out how to implement the
language — or perhaps leave that task to implementors altogether. Their own language, Lucid,
was designed according to these principles.
To some extent, the suggestions of Ashcroft and Wadge have been followed (although not neces-
sarily as a result of their paper). A description of a new PL in the academic literature is usually
accompanied by a denotational semantics for the main features of the PL. It is not clear, however,
that semantic techniques have had a significant impact on mainstream language design.
B.3 Type Theory
There is a large body of knowledge on type theory. We can summarize it concisely as follows. Here
is a function declaration.
T f(S x) { B; return y; }
The declaration introduces a function called f that takes an argument, x of type S, performs
calculations indicated as B, and returns a result, y of type T. If there is a type theory associated
with the language, we should be able to prove a theorem of the following form:
If x has type S then the evaluation of f(x) yields a value of type T.
The reasoning used in the proof is based on the syntactic form of the function body, B.
If we can do this for all legal programs in a language L, we say that L is statically typed. If L is
indeed statically typed:
A compiler for L can check the type correctness of all programs.
A program that is type-correct will not fail because of a type error when it is executed.
A program that is type-correct may nevertheless fail in various ways. Type checking does not
usually detect errors such as division by zero. (In general, most type checking systems assume that
functions are total: a function with type S → T, given a value of type S, will return a value of
type T. Some commonly used functions are partial: for example x/y ≡ divide(x, y) is undefined
for y = 0.) A program can be type-correct and yet give completely incorrect answers. In general:
Most PLs are not statically typed in the sense defined above, although the number of loopholes
in the type system may be small.
A type-correct program may give incorrect answers.
Type theories for modern PLs, with objects, classes, inheritance, overloading, genericity, dy-
namic binding, and other features, are very complex.
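A minimal illustration of the second point may help. Here Python type annotations stand in for a static type signature (the function name is hypothetical): the function is type-correct, mapping two floats to a float, yet it can still fail at run time because division is a partial function.

```python
def divide(x: float, y: float) -> float:
    # "Type-correct": float -> float -> float.
    return x / y

assert divide(1.0, 4.0) == 0.25

# Type checking does not rule out applying a partial function
# outside its domain:
try:
    divide(1.0, 0.0)
except ZeroDivisionError:
    print("type-correct, yet fails at run time")
```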
Type theory leads to attractive and interesting mathematics and, consequently, tends to be over-
valued by the theoretical Computer Science community. The theory of program correctness is more
difficult and has attracted less attention, although it is arguably more important.
Exercise 38. Give an example of an expression or statement in Pascal or C that contains a type
error that the compiler cannot detect. 2
B.4 Regular Languages
Regular languages are based on a simple set of operations: sequence, choice, and repetition. Since
these operations occur in many contexts in the study of PLs, it is interesting to look briefly at
regular languages and to see how the concepts can be applied.
In the formal study of regular languages, we begin with an alphabet, Σ, which is a finite set of
symbols. For example, we might have Σ = { 0, 1 }. The set of all finite strings, including the empty
string, constructed from the symbols of Σ is written Σ*.
We then introduce a variety of regular expressions, which we will refer to as RE here. (The
abbreviation RE is also used to stand for “recursively enumerable”, but in this section it stands
for “regular expression”.) Each form of RE is defined by the set of strings in Σ* that it generates.
This set is called a “regular language”.
1. ∅ is RE and denotes the empty set.
2. ε (the string containing no symbols) is RE and denotes the set { ε }.
3. For each symbol x ∈ Σ, x is RE and denotes the set { x }.
4. If r is RE with language R, then r* is RE and denotes the set R* (where
   R* = { x1 x2 . . . xn | n ≥ 0 ∧ xi ∈ R }).
5. If r and s are RE with languages R and S respectively, then (r + s) is RE and denotes the
   set R ∪ S.
6. If r and s are RE with languages R and S respectively, then (rs) is RE and denotes the set
   RS (where
   RS = { rs | r ∈ R ∧ s ∈ S }).
We add two further notations that serve solely as abbreviations.
1. The expression a^n represents the string aa . . . a containing n occurrences of the symbol a.
2. The expression a+ is an abbreviation for the RE aa* that denotes the set { a, aa, aaa, . . . }.
B.4.1 Tokens
We can use regular languages to describe the tokens (or lexemes) of most PLs. For example:

UC         = A + B + · · · + Z
LC         = a + b + · · · + z
LETTER     = UC + LC
DIGIT      = 0 + 1 + · · · + 9
IDENTIFIER = LETTER ( LETTER + DIGIT )*
INTCONST   = DIGIT+
FLOATCONST = DIGIT+ . DIGIT*
. . . .
We can use a program such as lex to construct a lexical analyzer (scanner) from a regular expression
that defines the tokens of a PL.
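The token definitions above translate almost mechanically into a scanner. A sketch using Python's re module in the role of lex (the choice operator "+" becomes "|", and FLOATCONST must be tried before INTCONST so that the longest match wins):

```python
import re

# The regular definitions from above, in re syntax.
TOKEN_RE = re.compile(r"""
    (?P<FLOATCONST>[0-9]+\.[0-9]*)          # DIGIT+ . DIGIT*
  | (?P<INTCONST>[0-9]+)                    # DIGIT+
  | (?P<IDENTIFIER>[A-Za-z][A-Za-z0-9]*)    # LETTER (LETTER + DIGIT)*
  | (?P<SKIP>\s+)                           # whitespace between tokens
""", re.VERBOSE)

def scan(text):
    """Split text into (kind, lexeme) pairs with a finite-state matcher."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad character at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

assert scan("x1 42 3.14") == [
    ("IDENTIFIER", "x1"), ("INTCONST", "42"), ("FLOATCONST", "3.14")]
```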
Exercise 39. Some compilers allow “nested comments”. For example, { . . . { . . . } . . . } in Pascal or
/* . . . /* . . . */ . . . */ in C. What consequences does this have for the lexical analyzer (scanner)? 2
RE      Control Structure
x       statement
rs      statement; statement
r*      while expression do statement
r + s   if condition then statement else statement

Figure 17: REs and Control Structures
B.4.2 Context Free Grammars
Context free grammars for PLs are usually written in BNF (Backus-Naur Form). A grammar rule,
or production, consists of a non-terminal symbol (the symbol being defined), a connector (usually
::= or →), and a sequence of terminal and non-terminal symbols. The syntax for the assignment,
for example, might be defined:
ASSIGNMENT → VARIABLE “ := ” EXPRESSION
Basic BNF provides no mechanism for choice: we must provide one rule for each possibility. Also,
BNF provides no mechanism for repetition; we must use recursion instead. The following example
illustrates both of these limitations.
SEQUENCE → EMPTY
SEQUENCE → STATEMENT SEQUENCE
BNF has been extended in a variety of ways. With a few exceptions, most of the extended forms
can be described simply: the sequence of symbols on the right of the connector is replaced by a
regular expression. The extension enables us to express choice and repetition within a single rule.
In grammars, the symbol | rather than + is used to denote choice.

STATEMENT = ASSIGNMENT | CONDITIONAL | · · ·
SEQUENCE  = ( STATEMENT )*
Extended BNF (EBNF) provides a more concise way of describing grammars than BNF. Just as
parsers can be constructed from BNF grammars, they can be constructed from EBNF grammars.
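The construction is direct: in a hand-written recursive-descent parser, an EBNF choice becomes a conditional and an EBNF repetition becomes a loop. A sketch for the rule SEQUENCE = ( STATEMENT )*, with hypothetical token names standing in for real syntax:

```python
def parse_sequence(tokens, pos=0):
    """SEQUENCE = ( STATEMENT )* : the EBNF '*' becomes a while loop."""
    statements = []
    # STATEMENT = ASSIGNMENT | CONDITIONAL : choice becomes a membership test.
    while pos < len(tokens) and tokens[pos] in ("ASSIGNMENT", "CONDITIONAL"):
        statements.append(tokens[pos])
        pos += 1
    return statements, pos

stmts, end = parse_sequence(["ASSIGNMENT", "CONDITIONAL", "END"])
assert stmts == ["ASSIGNMENT", "CONDITIONAL"] and end == 2
```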
Exercise 40. Discuss the difficulty of adding attributes to an EBNF grammar. 2
B.4.3 Control Structures and Data Structures
Figure 17 shows a correspondence between REs and the control structures of Algol-like languages.
Similarly, Figure 18 shows a correspondence between REs and data structures.
Exercise 41. Why might Figures 17 and 18 be of interest to a PL designer? 2
B.4.4 Discussion
These analogies suggest that the mechanisms for constructing REs — concatenation, alternation,
and repetition — are somehow fundamental. In particular, the relation between the standard
control structures and the standard data structures can be helpful in programming. In Jackson
RE      Data Structure
x       int n;
rs      struct { int n; float y; }
r^n     int a[n];
r*      int a[]
r + s   union { int n; float y; }

Figure 18: REs and Data Structures
Structured Design (JSD), for example, the data structures appropriate for the application are
selected first and then the program is constructed with the corresponding control structures.
The limitations of REs are also interesting. When we move to the next level of the Chomsky
hierarchy, Context Free Languages (CFLs), we obtain the benefits of recursion. The corresponding
control structure is the procedure and the corresponding data structures are recursive structures
such as lists and trees (Wirth 1976, page 163).
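The correspondence can be made concrete in code. In this Python sketch (the function names are our own), a sequence type (an r* structure) is traversed by a loop, while a recursive type is traversed by a recursive procedure, matching the shape of the data in each case:

```python
# r* as data: a list; r* as control: a loop over its elements.
def total(xs):
    s = 0
    for x in xs:            # iteration matches the sequential data shape
        s += x
    return s

# Recursion as data: a binary tree (value, left, right);
# recursion as control: a recursive procedure with the same shape.
def tree_total(t):
    if t is None:
        return 0
    value, left, right = t
    return value + tree_total(left) + tree_total(right)

assert total([1, 2, 3]) == 6
assert tree_total((1, (2, None, None), (3, None, None))) == 6
```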
C Haskell Implementation
Several modern functional programming languages, including Haskell, are implemented using com-
binators and graph reduction. The technique described below was introduced into mathematical
logic by M. Schoenfinkel in 1924 and was rediscovered as an implementation method by D.A. Turner
in 1979.
We start with three observations.
• Operators are really just functions. For example, we can write x + y as add x y. Thus we only
have to consider prefix functions.

• Functions of more than one variable can be reduced to functions of one variable. We understand
add x y as (add x) y: we evaluate (add x), obtaining a function of one variable, which we then
apply to y.

• In a definition such as

succ x = add 1 x

we can distinguish between bound variables, or simply variables, such as x in this example,
and constants, such as 1 and add. Although add might be defined elsewhere, it is a constant
as far as this definition is concerned.
We would like to write the definition of succ in the form
succ = ???
so that the distinction between the name being defined and the definition is clear. Schoenfinkel
showed how to do this using the following abstraction algorithm. If e is an expression possibly
containing x, the abstraction of e with respect to x is written [x] e. It is an expression that does
not contain x. The rules for abstraction follow. In the second rule, we assume x ≠ y, because
otherwise the first rule would apply.
[x] x = I
[x] y = K y
[x] (f g) = S ([x] f) ([x] g)
Applying these rules to the function succ, defined above, gives
succ = [x] (add 1 x)
= S ([x] (add 1)) ([x] x)
= S (S (K add) (K 1)) I
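The three abstraction rules are directly executable. In this Python sketch (the representation and the function name abstract are our own choices), an application (f g) is a pair (f, g), and variables and constants are strings:

```python
# Terms: variables and constants are strings; application (f g) is the pair (f, g).
def abstract(x, e):
    """Compute [x] e using Schoenfinkel's three rules."""
    if e == x:
        return "I"                                      # [x] x = I
    if isinstance(e, tuple):
        f, g = e
        return (("S", abstract(x, f)), abstract(x, g))  # [x] (f g) = S ([x] f) ([x] g)
    return ("K", e)                                     # [x] y = K y   (y != x)

# succ x = add 1 x, i.e. ((add 1) x):
succ = abstract("x", (("add", "1"), "x"))

# The result represents S (S (K add) (K 1)) I, as derived above.
assert succ == (("S", (("S", ("K", "add")), ("K", "1"))), "I")
```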
These expressions quickly become quite long. If we abstract n from
fac n = if (n == 0) 1 (n × fac (n − 1))
(where we treat if as a function of three arguments), we obtain
fac = S (S (S (K if) (S (S (K ==) (K 0)) I)) (K 1)) (S (S (K ×) I) (S (K fac) (S (S (K −) I) (K 1))))
In general, the size of the SKI expression increases exponentially with the size of the original
expression. Turner introduced new functions B and C and optimizations to overcome the size
problem. Using these, the definition of fac becomes:
fac = S (C (B if (== 0)) 1) (S × (B fac (C − 1)))
[Application-tree diagram for succ = S (S (K add) (K 1)) I omitted.]

Figure 19: The graph for succ
To make use of these expressions, we apply an argument and then make use of the following
reduction rules:
I x = x
K x y = x
S f g x = (f x) (g x)
B f g x = f (g x)
C f g x = f x g
Using the simpler example, succ, we have:
succ 4 = S (S (K add) (K 1)) I 4
= (S (K add) (K 1) 4) (I 4)
= ((K add 4) (K 1 4)) 4
= + 1 4
= 5
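The reduction rules can also be animated mechanically. The following Python evaluator is a sketch, not a faithful graph reducer: it represents an application (f g) as the pair (f, g), treats add as a hypothetical arithmetic primitive, and reduces at the head of the expression, just as the rules above prescribe.

```python
def reduce_term(t):
    """Normalize a combinator term by repeatedly applying the head rule."""
    # Unwind the spine: ((f a) b) -> head f with argument list [a, b].
    spine = []
    while isinstance(t, tuple):
        t, arg = t
        spine.append(arg)
    spine.reverse()
    head, args = t, spine
    if head == "I" and len(args) >= 1:                  # I x = x
        return reduce_term(rebuild(args[0], args[1:]))
    if head == "K" and len(args) >= 2:                  # K x y = x
        return reduce_term(rebuild(args[0], args[2:]))
    if head == "S" and len(args) >= 3:                  # S f g x = (f x) (g x)
        f, g, x = args[0], args[1], args[2]
        return reduce_term(rebuild(((f, x), (g, x)), args[3:]))
    if head == "add" and len(args) >= 2:                # primitive addition
        a, b = reduce_term(args[0]), reduce_term(args[1])
        return reduce_term(rebuild(a + b, args[2:]))
    return rebuild(head, args)                          # no rule applies

def rebuild(head, args):
    for a in args:
        head = (head, a)
    return head

# succ = S (S (K add) (K 1)) I, applied to 4:
succ = (("S", (("S", ("K", "add")), ("K", 1))), "I")
assert reduce_term((succ, 4)) == 5
```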
For practical implementation, these expressions are represented as graphs: Figure 19 shows the
graph corresponding to succ. To evaluate an expression such as succ 4, we add a third subtree
to the top node, giving S three arguments, and transform the graph, always using the rule for
whatever function is at the root of the tree.
The basic technique of graph reduction is quite slow. When Turner presented one of his functional
programming languages at a conference in 1982, he said that it was about 500 times slower than
conventional languages such as C and FORTRAN. Twenty years of research, however, have led
to steady improvements, and a modern implementation of a functional language has performance
comparable to that of a compiled procedural language.

Contents

1 Introduction
  1.1 How important are programming languages?
  1.2 Features of the Course
2 Anomalies
3 The Procedural Paradigm
  3.1 Early Days
  3.2 FORTRAN
  3.3 Algol 60
  3.4 COBOL
  3.5 PL/I
  3.6 Algol 68
  3.7 Pascal
  3.8 Modula-2
  3.9 C
  3.10 Ada
4 The Functional Paradigm
  4.1 LISP
    4.1.1 Basic LISP Functions
    4.1.2 A LISP Interpreter
    4.1.3 Dynamic Binding
  4.2 Scheme
  4.3 SASL
  4.4 SML
  4.5 Other Functional Languages
5 Haskell
  5.1 Getting Started
    5.1.1 Hugs Commands
    5.1.2 Literate Haskell
    5.1.3 Defining Functions
    5.1.4 Syntax
    5.1.5 Tracing
  5.2 Details
    5.2.1 Lists
    5.2.2 Lazy Evaluation
    5.2.3 Pattern Matching
    5.2.4 Operators
    5.2.5 Building Haskell Programs
    5.2.6 Guards
  5.3 Types
    5.3.1 Using Types
    5.3.2 Failure
    5.3.3 Reporting Errors
    5.3.4 Standard Types
    5.3.5 Standard Classes
    5.3.6 Defining New Types
    5.3.7 Defining Data Types
    5.3.8 Types with Parameters
  5.4 Input and Output
    5.4.1 String Formatting
    5.4.2 Interactive Programs
  5.5 Reasoning about programs
6 The Object Oriented Paradigm
  6.1 Simula
  6.2 Smalltalk
  6.3 CLU
  6.4 C++
  6.5 Eiffel
    6.5.1 Programming by Contract
    6.5.2 Repeated Inheritance
    6.5.3 Exception Handling
  6.6 Java
    6.6.1 Portability
    6.6.2 Interfaces
    6.6.3 Exception Handling
    6.6.4 Concurrency
  6.7 Kevo
  6.8 Other OOPLs
  6.9 Issues in OOP
    6.9.1 Algorithms + Data Structures = Programs
    6.9.2 Values and Objects
    6.9.3 Classes versus Prototypes
    6.9.4 Types versus Objects
    6.9.5 Pure versus Hybrid Languages
    6.9.6 Closures versus Classes
    6.9.7 Inheritance
    6.9.8 Encapsulation
  6.10 A Critique of C++
  6.11 Evaluation of OOP
7 Backtracking Languages
  7.1 Prolog
  7.2 Alma-0
  7.3 Other Backtracking Languages
8 Structure
  8.1 Block Structure
  8.2 Modules
  8.3 Control Structures
    8.3.1 Loop Structures
    8.3.2 Procedures and Functions
    8.3.3 Exceptions
9 Names and Binding
  9.1 Free and Bound Names
  9.2 Attributes
  9.3 Scope and Extent
    9.3.1 Scope
    9.3.2 Are nested scopes useful?
    9.3.3 Extent
  9.4 What Can Be Named?
  9.5 What is a Variable Name?
  9.6 Assignment
  9.7 Early and Late Binding
  9.8 Polymorphism
    9.8.1 Ad Hoc Polymorphism
    9.8.2 Parametric Polymorphism
    9.8.3 Object Polymorphism
10 Abstraction
  10.1 Abstraction as a Technical Term
    10.1.1 Procedures
    10.1.2 Functions
    10.1.3 Data Types
    10.1.4 Classes
  10.2 Computational Models
11 Implementation
  11.1 Compiling and Interpreting
    11.1.1 Compiling
    11.1.2 Interpreting
    11.1.3 Mixed Implementation Strategies
    11.1.4 Consequences for Language Design
  11.2 Garbage Collection
12 Conclusion
References
A Abbreviations and Glossary
B Theoretical Issues
  B.1 Syntactic and Lexical Issues
    B.1.1 Tokens
    B.1.2 Context Free Grammars
  B.2 Semantics
  B.3 Type Theory
  B.4 Regular Languages
    B.4.1 REs and Control Structures
    B.4.2 REs and Data Structures
    B.4.3 Control Structures and Data Structures
    B.4.4 Discussion
C Haskell Implementation

List of Figures
1  The list structure (A (B C))
2  A short session with Hugs
3  Hugs commands
4  Parameters for :set
5  A Literate Haskell program
6  Pattern matching
7  Operator precedence
8  Parsing precedence
9  The Class Frac
10 Trying out type Frac
11 Class BTree
12 Key-value tree
13 The fringe of a tree
14 Standard Class Inheritance Graph
15 Proof of ancestor(pam, jim)
16 Effect of cuts
17 REs and Control Structures
18 REs and Data Structures
19 The graph for succ

COMP 348: Principles of Programming Languages

1 Introduction

In order to understand why programming languages (PLs) are as they are today, and to predict how they might develop in the future, we need to know something about how they evolved. We consider early languages, but the main focus of the course is on contemporary and evolving PLs. The treatment is fairly abstract; we do not discuss details of syntax, and we try to avoid unproductive “language wars”. Here is a useful epigram for the course: “C and Pascal are the same language” (Dennis Ritchie). We can summarize our goals as follows: if the course is successful, it should help us to find useful answers to questions such as these. What is a good programming language? How should we choose an appropriate language for a particular task? How should we approach the problem of designing a programming language?

1.1 How important are programming languages?

As the focus of practitioners and researchers has moved from programming to large-scale software development, PLs no longer seem so central to Computer Science (CS). Brooks (1987) and others have argued that developments in programming languages cannot lead to dramatic increases in software productivity, because coding occupies only a small fraction of the total time required for project development. The assumption behind this reasoning is that the choice of PL affects only the coding phase of the project. For example, if coding requires 15% of total project time, we could deduce that reducing coding time to zero would reduce the total project time by only 15%. Maintenance, however, is widely acknowledged to be a major component of software development. Estimates of maintenance time range from “at least 50%” (Lientz and Swanson 1980) to “between 65% and 75%” (McKee 1984) of the total time. Since the choice of PL can influence the ease of maintenance, these figures suggest that the PL might have a significant effect on total development time.

Another factor that we should consider in studying PLs is their ubiquity: PLs are everywhere. The “obvious” PLs are the major implementation languages: Ada, C++, COBOL, FORTRAN, . . . PLs for the World Wide Web: Java, Javascript, . . . Scripting PLs: Javascript, Perl, Python, Tcl/Tk, . . . “Hidden” languages: spreadsheets, macro languages, input for complex applications, . . .

The following scenario has occurred often in the history of programming. Developers realize that an application requires a format for expressing input data. The format increases in complexity until it becomes a miniature programming language.
By the time that the developers realize this, it is too late for them to redesign the format, even if they had principles that they could use to help them. Consequently, the notation develops into a programming language with many of the bad features of old, long-since rejected programming languages.

There is an unfortunate tendency in Computer Science to re-invent language features without carefully studying previous work, sometimes with unfortunate consequences. Brinch Hansen (1999) points out that, although safe and provably correct methods of programming concurrent processes were developed by himself and Hoare in the mid-seventies, Java does not make use of these methods and is insecure. We conclude that PLs are important. With hindsight, we can recognize the mistakes and omissions of language designers and criticize their PLs accordingly. These criticisms do not imply that the designers were stupid or incompetent. On the contrary, we have chosen to study PLs whose designers had deep insights and made profound contributions. Given the state of the art at the time when they designed the PLs, the designers could hardly be expected to foresee the understanding that would come only with another twenty or thirty years of experience. We can echo Perlis (1978, pages 88–91) in saying: [Algol 60] was more racehorse than camel . . . a rounded work of art . . . Rarely has a construction so useful and elegant emerged as the output of a committee of 13 meeting for about 8 days . . . and yet [Algol 60] was a noble begin but never intended to be a satisfactory end . . . Algol deserves our affection and appreciation. If we spend time to think about a problem, we are better prepared to understand and appreciate its solution later.
1.2 Features of the Course

We will review a number of PLs during the course, but the coverage is very uneven. The time devoted to a PL depends on its relative importance in the evolution of PLs rather than the extent to which it was used or is used; the importance of some little-used languages is often under-estimated. For example, C and C++ are both widely-used languages but, since they did not contribute profound ideas, they are discussed briefly. To provide insight into functional programming, we consider Haskell in some depth. The approach of the course is conceptual rather than technical. We do not dwell on fine lexical or syntactic distinctions, semantic details, or legalistic issues. Instead, we attempt to focus on the deep and influential ideas that have developed with PLs. Exercises are provided mainly to provoke thought. In many cases, the solution to an exercise can be found (implicitly!) elsewhere in these notes. Some of the exercises may appear in revised form in the examination at the end of the course. Many references are provided. The intention is not that you read them all but rather that, if you are interested in a particular aspect of the course, you should be able to find out more about it by reading the indicated material.

2 Anomalies

There are many ways of motivating the study of PLs. One way is to look at some anomalies: programs that look as if they ought to work but don’t, or programs that look as if they shouldn’t work but do.

Example 1: Assignments. Most PLs provide an assignment statement of the form v := E, in which v is a variable and E is an expression. The expression E yields a value. The variable v is not necessarily a simple name: it may be an expression such as a[i] or r.f. In any case, it yields an address into which the value of E is stored. In PLs such as C and Pascal, we can use a function call in place of E but not in place of v. For example, we cannot write a statement such as top(stack) = 6 to replace the top element of a stack. □

Exercise 1. Explain why the asymmetry mentioned in Example 1 exists. Can you see any problems that might be encountered in implementing a function like top? □

Example 2: Parameters and Results. PLs are fussy about the kinds of objects that can be passed as parameters to a function or returned as results by a function. In C, an array can be passed to a function, but only by reference, and a function cannot return an array. In Pascal, arrays can be passed by value or by reference to a function, but a function cannot return an array. Most languages do not allow types to be passed as parameters. It would be useful, for example, to write a procedure such as void sort (type t, t a[]) {. . .} and then call sort(float, scores). □

Exercise 2. Suppose you added types as parameters to a language. What other changes would you have to make to the language? □

Exercise 3. C++ provides templates. Is this equivalent to having types as parameters? □
Example 3: Data Structures and Files. Conceptually, there is not a great deal of difference between a data structure and a file. The important difference is the more or less accidental feature that data structures usually live in memory and files usually live in some kind of external storage medium, such as a disk. Yet PLs treat data structures and files in completely different ways. □

Exercise 4. Explain why most PLs treat data structures and files differently. □

Example 4: Arrays. In some PLs, such as Algol 60, the size of an array is not fixed until run-time. In other PLs, such as Pascal, the size of an array is fixed at compile-time. What difference does this make to the implementor of the PL and the programmers who use it? Suppose that we have a two-dimensional array a. We access a component of a by writing a[i,j] (except in C, where we have to write a[i][j]). We can access a row of the array by writing a[i]. Why can’t we access a column by writing a[,j]?

It is easy to declare rectangular arrays in most PLs. Why is there usually no provision for triangular, symmetric, banded, or sparse arrays? □

Exercise 5. Suggest ways of declaring arrays of the kinds mentioned in Example 4. □

Example 5: Data Structures and Functions. Pascal and C programs contain declarations for functions and data. Both languages also provide a data aggregation: the record in Pascal and the struct in C. A record looks rather like a small program: it contains data declarations, but it cannot contain function declarations. □

Exercise 6. Suggest some applications for records with both data and function components. □

Example 6: Memory Management. Most C/C++ programmers learn quite quickly that it is not a good idea to write functions like this one:

char *money (int amount) {
    char buffer[16];
    sprintf(buffer, "%d.%d", amount / 100, amount % 100);
    return buffer;
}

Writing C functions that return strings is awkward and, although experienced C programmers have ways of avoiding the problem, it remains a defect of both C and C++. □

Exercise 7. What is wrong with the function money in Example 6? Why is it difficult for a C function to return a string (i.e., char *)? □

Example 7: Factoring. In algebra, we are accustomed to factoring expressions. For example, when we see the expression Ax + Bx, we are strongly tempted to write it in the form (A + B)x. Sometimes this is possible in PLs: we can certainly rewrite z = A * x + B * x as z = (A + B) * x. Most PLs allow us to write something like this: if (x > PI) y = sin(x) else y = cos(x). Only a few PLs allow the shorter form: y = if (x > PI) then sin(x) else cos(x). Very few PLs recognize the even shorter form: y = (if (x > PI) then sin else cos)(x). □

Example 8: Returning Functions. Some PLs, such as Pascal, allow nested functions, whereas others, such as C, do not. Suppose that we were allowed to nest functions in C, and we defined the function shown in Listing 1.

Listing 1: Returning a function

typedef int intint(int);
intint addk (int k) {
    int f (int n) {
        return n + k;
    }
    return f;
}

The idea is that addk returns a function that “adds k to its argument”. We would use it as in the following program, which should print “10”:

int add6 (int) = addk(6);
printf("%d", add6(4));

□

Exercise 8. “It would be fairly easy to change C so that a function could return a function.” True or false? (Note that you can represent the function itself by a pointer to its first instruction.) □

Example 9: Functions as Values. Example 8 is just one of several oddities about the way programs handle functions. Some PLs do not permit this sequence: if (x > PI) f = sin else f = cos; f(x). □

Exercise 9. Why are statements like this permitted in C (with appropriate syntax) but not in Pascal? □

Example 10: Functions that work in more than one way. The function Add(x,y,z) attempts to satisfy the constraint x + y = z. For instance: Add(2,3,n) assigns 5 to n, and Add(i,3,5) assigns 2 to i. □

Example 11: Logic computation. It would be convenient if we could encode logical assertions such as ∀ i . 0 < i < N ⇒ a[i−1] < a[i] in the form

all (int i = 1; i < N; i++) a[i-1] < a[i]

and ∃ i . 0 ≤ i < N ∧ a[i] = k in the form

exists (int i = 0; i < N; i++) a[i] == k

□

Working with low-level PLs leads to the belief that “this is all that computers can do”. f). however. Memory “leaks” are one of the most common forms of error in large programs and they can do serious damage. . . Many programmers have struggled for years writing programs that require explicit deallocation of memory. 6 in the form exists (int i = 0. or they may believe (because they have been told) that garbage collection has an unacceptably high overhead. else return n * f(n-1. fac(3. TYPE f) { if (n <= 1) return 1. i < N. i++) a[i] == k 2 Example 12: Implicit Looping. when the rocket was already in lunar orbit. It is a well-known fact that we can rewrite programs that use loops as programs that use recursive functions. TYPE fac (int n. When type-checking was improved. It would not occur to most programmers to write programs in the style of the preceding sections because most PLs do not permit these constructions. they would no longer compile. . fac) = 3 × 2 × fac(1. functions like this actually worked. This program computes factorials using neither loops nor recursion: fac(3.) These programmers may be unaware of the existence of PLs that provide garbage collection. fac) = 3×2×1 = 6 In early versions of Pascal. } printf("%d". (One of the Apollo missions almost failed because of a memory leak that was corrected shortly before memory was exhausted. It is less well-known that we can eliminate the recursion as well: see Listing 2.2 ANOMALIES Listing 2: The Factorial Function typedef TYPE . . fac) = 3 × fac(2. Can you complete the definition of TYPE in Example 2? 2 The point of these examples is to show that PLs affect the way we think. Garbage collection provides an example. fac)). 2 Exercise 10.

3 The Procedural Paradigm

The introduction of the von Neumann architecture was a crucial step in the development of electronic computers. The basic idea is that instructions can be encoded as data and stored in the memory of the computer. The first consequence of this idea is that changing and modifying the stored program is simple and efficient; in fact, changes can take place at electronic speeds, a very different situation from earlier computers that were programmed by plugging wires into panels. The second, and ultimately more far-reaching, consequence is that computers can process programs themselves, under program control. In particular, a computer can translate a program from one notation to another. Thus the stored-program concept led to the development of programming “languages”.

The history of PLs, like the history of any technology, is complex. There are advances and setbacks; ideas that enter the mainstream and ideas that end up in a backwater; even ideas that are submerged for a while and later surface in an unexpected place. With the benefit of hindsight, we can identify several strands in the evolution of PLs. These strands are commonly called “paradigms” and, in this course, we survey the paradigms separately although their development was interleaved. Sources for this section include (Wexelblat 1981; Williams and Campbell-Kelly 1989; Bergin and Gibson 1996).

3.1 Early Days

The first PLs evolved from machine code. The first programs used numbers to refer to machine addresses. One of the first additions to programming notation was the use of symbolic names rather than numbers to represent addresses. Briefly, it enables the programmer to refer to any word in a programme by means of a label or tag attached to it arbitrarily by the programmer, instead of by its address in the store. Thus, for example, a number appearing in the calculation might be labelled ‘a3’. The programmer could then write simply ‘A a3’ to denote the operation of adding this number into the accumulator, without having to specify just where the number is located in the machine. (Mutch and Gill 1954)

Instead of writing

Location  Order
100       A 104
101       A 2
102       T 104
103       H 24
104       C 50
105       T 104

the programmer would write

A a3
A 2
T a3
H 24
C 50
T a3

systematically replacing the address 104 by the symbol a3 and omitting the explicit addresses.
The key point in this quotation is the phrase “instead of by its address in the store”. This establishes the principle that a variable name stands for a memory location, a principle that influenced the subsequent development of PLs and is now known — perhaps inappropriately — as value semantics.

The importance of subroutines and subroutine libraries was recognized before high-level programming languages had been developed, as the following quotation shows. Subroutines may be divided into two types, which we have called OPEN and CLOSED. An open subroutine is one which is included in the routine as it stands whereas a closed subroutine is placed in an arbitrary position in the store and can be called into use by any part of the main routine. . . . The following advantages arise from the use of such a library: 1. It simplifies the task of preparing problems for the machine. 2. It enables routines to be more readily understood by other users, as conventions are standardized and the units of a routine are much larger, being subroutines instead of individual orders. 3. Library subroutines may be used directly, without detailed coding and punching. 4. Library subroutines are known to be correct, thus greatly reducing the overall chance of error in a complete routine, and making it much easier to locate errors. . . . However, although it is desirable to have subroutines available to cover all possible requirements, it is also undesirable to allow the size of the resulting library to increase unduly. Another difficulty arises from the fact that a subroutine can be made more versatile by the use of parameters associated with it, thus reducing the total size of the library. We may divide the parameters associated with subroutines into two classes. INTERNAL parameters, i.e. parameters which vary during the solution of the problem. EXTERNAL parameters, i.e. parameters which are fixed throughout the solution of a problem and arise solely from the use of the library. (Wheeler 1951)
Exercise 11. This quotation introduces a theme that has continued with variations to the present day. Find in it the origins of concern for: correctness; maintenance; encapsulation; parameterization; genericity; reuse; space/time trade-offs. □

Machine code is a sequence of “orders” or “instructions” that the computer is expected to execute. By default, the commands of a procedural program are executed sequentially. Procedural PLs provide various ways of escaping from the sequence. The earliest mechanisms were the “jump” command, which transferred control to another part of the program, and the “jump and store link” command, which transferred control but also stored a “link” to which control would be returned after executing a subroutine. The style of programming that developed from this viewpoint became known as the “imperative” or “procedural” programming paradigm. In these notes, we use the term “procedural” rather than “imperative” because programs resemble “procedures” (in the English, non-technical sense) or recipes rather than “commands”. Confusingly, the individual steps of procedural PLs, such as Pascal and C, are often called “statements”, although in logic a “statement” is a sentence that is either true or false. The data structures of these early languages were usually rather simple: typically, primitive values (integers and floats) were provided, along with single- and multi-dimensioned arrays of primitive values.

3.2 FORTRAN

FORTRAN was introduced in 1957 at IBM by a team led by John Backus. The “Preliminary Report” describes the goal of the FORTRAN project: The IBM Mathematical Formula Translation System or briefly, FORTRAN, will comprise a large set of programs to enable the IBM 704 to accept a concise formulation of a problem in terms of a mathematical notation and to produce automatically a high-speed 704 program for the solution of the problem. (Backus 1957) This suggests that the IBM team’s goal was to eliminate programming! The following quotation seems to confirm this: If it were possible for the 704 to code problems for itself and produce as good programs as human coders (but without the errors), it was clear that large benefits could be achieved. (Quoted in (Sammet 1969).) Although FORTRAN did not eliminate programming, it was a major step towards the elimination of assembly language coding. The designers focused on efficient implementation rather than elegant language design, knowing that acceptance depended on the high performance of compiled programs. FORTRAN has value semantics: variable names stand for memory addresses that are determined when the program is loaded. The major achievements of FORTRAN are: efficient compilation; demonstration that high-level programming, with automatic translation to machine code, is feasible. It is interesting to note that, 20 years later, Backus (1978) criticized FORTRAN and similar languages as “lacking useful mathematical properties”. He saw the assignment statement as a source of inefficiency: “the von Neumann bottleneck”.
The following quotation seems to confirm this: If it were possible for the 704 to code problems for itself and produce as good programs as human coders (but without the errors). The designers focused on efficient implementation rather than elegant language design. (Quoted in (Sammet 1969). The earliest mechanisms were the “jump” command. The solution. however. will comprise a large set of programs to enable the IBM 704 to accept a concise formulation of a problem in terms of a mathematical notation and to produce automatically a high-speed 704 program for the solution of the problem. 3. which transferred control but also stored a “link” to which control would be returned after executing a subroutine. knowing that acceptance depended on the high performance of compiled programs. 20 years later. the commands of a procedural program are executed sequentially. it was a major step towards the elimination of assembly language coding.

FORTRAN programs are rather similar to assembly language programs: the main difference is that a typical line of FORTRAN describes evaluating an expression and storing its value in memory, whereas a typical line of assembly language specifies a machine instruction (or a small group of instructions in the case of a macro).

The major achievements of FORTRAN also include: separate compilation (programs can be presented to the compiler as separate subroutines, but the compiler does not check for consistency between components); demonstration that high-level programming, with automatic translation to machine code, is feasible.

The principal limitations of FORTRAN are:

Flat, uniform structure. There is no concept of nesting in FORTRAN. A program consists of a sequence of subroutines and a main program. Variables are either global or local to subroutines.

Limited control structures. The control structures of FORTRAN are IF, DO, and GOTO. Since there are no compound statements, labels provide the only indication that a sequence of statements forms a group.

Unsafe memory allocation. FORTRAN borrows the concept of COMMON storage from assembly language programs. This enables different parts of a program to share regions of memory, but the compiler does not check for consistent usage of these regions. One program component might use a region of memory to store an array of integers, and another might assume that the same region contains reals. FORTRAN also provides the EQUIVALENCE statement, which allows variables with different names and types to share a region of memory.

No recursion. To conserve precious memory, FORTRAN allocates all data, including the parameters and local variables of subroutines, statically. Recursion is forbidden because only one instance of a subroutine can be active at one time.

Exercise 12. The FORTRAN 1966 Standard stated that a FORTRAN implementation may allow recursion but is not required to do so. How would you interpret this statement if you were: writing a FORTRAN program? writing a FORTRAN compiler? 2

3.3 Algol 60

During the late fifties, most of the development of PLs was coming from industry. IBM dominated, with FORTRAN, COBOL, and FLPL (FORTRAN List Processing Language), all designed for the IBM 704. Algol 60 (Naur et al. 1960; Naur 1978; Perlis 1978) was designed by an international committee, partly to provide a PL that was independent of any particular company and its computers. The goal was a “universal programming language”. The committee included both John Backus (chief designer of FORTRAN) and John McCarthy (designer of LISP).

In one sense, Algol was a failure: few complete, high-quality compilers were written and the language was not widely used (although it was used more in Europe than in North America). In another sense, Algol was a huge success: it became the

standard language for describing algorithms. For the better part of 30 years, the ACM required submissions to the algorithm collection to be written in Algol. The major innovations of Algol are discussed below.

Block Structure. Algol programs are recursively structured. A program is a block. A block consists of declarations and statements. There are various kinds of statement; in particular, one kind of statement is a block. The syntax of Algol ensures that blocks are fully nested. The recursive structure of programs means that large programs can be constructed from small programs.

A variable or function name declared in a block can be accessed only within the block: thus Algol introduced nested scopes. In the Algol block shown in Listing 3, the two assignments to x refer to two different variables.

Listing 3: An Algol Block
begin
  integer x;
  begin
    integer x;
    real y;
    x := 2;
    y := 3.14159;
  end;
  x := 1;
end

The run-time entity corresponding to a block is called an activation record (AR). The AR is created on entry to the block and destroyed after the statements of the block have been executed; this in turn means that ARs can be allocated on a stack. Block structure and stacked ARs have been incorporated into almost every language since Algol.

Dynamic Arrays. The designers of Algol realized that it was relatively simple to allow the size of an array to be determined at run-time. The compiler statically allocates space for a pointer and an integer (collectively called a “dope vector”) on the stack. At run-time, when the size of the array is known, the appropriate amount of space is allocated on the stack and the components of the “dope vector” are initialized. The following code works fine in Algol 60.

procedure average (n); integer n;
begin
  real array a[1:n];
  . . . .
end;

Despite the simplicity of the implementation, successor PLs such as C and Pascal dropped this useful feature.

Call By Name. The default method of passing parameters in Algol was “call by name”, and it was described by a rather complicated “copy rule”. The essence of the copy rule is that the program behaves as if the text of the formal parameter in the function is replaced by the text of the actual parameter. The complications arise because it may be necessary to rename some of the variables during the textual substitution:

begin
  function f(x)
  begin
    . . . .
  end;
  . . . .
end;

The usual implementation strategy was to translate the actual parameter into a procedure with no arguments (called a “thunk”);

each occurrence of the formal parameter inside the function was translated into a call to this function.

Call by name enables procedures to alter their actual parameters. If the procedure count is defined as in Listing 4, the statement count(widgets) has the same effect as the statement begin widgets := widgets + 1 end.

Listing 4: Call by name
procedure count (n); integer n;
begin
  n := n + 1
end

The other parameter passing mechanism provided by Algol, call by value, does not allow a procedure to alter the value of its actual parameters in the calling environment: the parameter behaves like an initialized local variable.

Call by name evaluates the actual parameter exactly as often as it is accessed. (This is in contrast with call by value, where the parameter is usually evaluated exactly once, on entry to the procedure.) For example, if we declare the procedure try as in Listing 6, it is safe to call try(x > 0, 1.0/x), because, if x ≤ 0, the expression 1.0/x will not be evaluated.

Call by name provides control structure abstraction. The procedure in Listing 5 provides a form of abstraction of a for loop. The first parameter specifies the number of iterations, the second is the loop index, and the third is the loop body. The statement sum(3, i, a[i]) computes a[1]+a[2]+a[3].

Listing 5: A General Sum Function
integer procedure sum (max, i, val); integer max, i, val;
begin
  integer s;
  s := 0;
  for i := 1 until max do s := s + val;
  sum := s
end

The mechanism seems strange today because few modern languages use it. However, the Algol committee had several valid reasons for introducing it.

Exercise 13. Why does i appear in the parameter list of Sum? 2

Listing 6: Using call by name
real procedure try (b, x); boolean b; real x;
begin
  try := if b then x else 0.0
end

Own Variables. A variable in an Algol procedure can be declared own. The effect is that the variable has local scope (it can be accessed only by the statements within the procedure) but global extent (its lifetime is the execution of the entire program). Own variables were in fact rather problematic in Algol, for various reasons including the difficulty of reliably initializing them (see Exercise 14). But the concept was important: it is the separation of scope and extent that ultimately leads to objects.

Exercise 14. Discuss the initialization of own variables. 2

With hindsight, we can see that Algol made important contributions but also missed some very interesting opportunities.

The actual parameter in an Algol call, assuming the default calling mechanism, is, in effect, a parameterless procedure. The call by name mechanism was a first step towards the important idea that functions can be treated as values. Applying this idea consistently throughout the language would have led to high order functions and paved the way to functional programming.

An Algol block consists of declarations followed by statements. Suppose that declarations and statements could be interleaved in a block. In the following block, D denotes a sequence of declarations and S denotes a sequence of statements:

begin D1 S1 D2 S2 end

A natural interpretation would be that S1 and S2 are executed concurrently. The local data of an AR is destroyed after the statements of the AR have been executed. If the data was retained rather than destroyed, Algol would be a language with modules. An Algol block without statements is, in effect, a record. Yet Algol 60 does not provide records.

The Algol committee knew what they were doing. They knew that incorporating the “missed opportunities” described above would have led to significant implementation problems. In particular, they believed that the stack discipline obtained with nested blocks was crucial for efficiency; anything that jeopardized it was not acceptable.

Algol 60 was simple and powerful, but not quite powerful enough. The dominant trend after Algol was towards languages of increasing complexity, such as PL/I and Algol 68. Before discussing these, we take a brief look at COBOL.

Algol 60 and most of its successors, like FORTRAN, have value semantics. A variable name stands for a memory address that is determined when the block containing the variable declaration is entered at run-time.
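The combination of local scope and global extent provided by an own variable can be sketched with a Python closure. This is a hypothetical illustration, not Algol: the variable is visible only inside the returned procedure, yet it survives between calls.

```python
# 'count' plays the role of an own variable: local scope (only 'counter'
# can see it) but global extent (it persists across calls).

def make_counter():
    count = 0
    def counter():
        nonlocal count   # rebind the enclosed variable, not a local copy
        count += 1
        return count
    return counter

tick = make_counter()
print(tick(), tick(), tick())  # 1 2 3
```

The initialization problem mentioned above shows up here too: `count = 0` runs once, when the closure is created, and the language must define exactly when that happens.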

3.4 COBOL

COBOL (Sammet 1978) introduced structured data and implicit type conversion. When COBOL was introduced, “programming” was more or less synonymous with “numerical computation”. COBOL introduced “data processing”, where data meant large numbers of characters. The data division of a COBOL program contained descriptions of the data to be processed.

Another important innovation of COBOL was a new approach to data types. The problem of type conversion had not arisen previously because only a small number of types were provided by the PL. COBOL introduced many new types, in the sense that data could have various degrees of precision, and different representations as text. The choice made by the designers of COBOL was radical: type conversion should be automatic.

The assignment statement in COBOL has several forms, including MOVE X TO Y. If X and Y have different types, the COBOL compiler will attempt to find a conversion from one type to the other. In most PLs of the time, a single statement translated into a small number of machine instructions. In COBOL, a single statement could generate a large amount of machine code.

Example 13: Automatic conversion in COBOL. The Data Division of a COBOL program might contain these declarations:

77 SALARY PICTURE 99999, USAGE IS COMPUTATIONAL.
77 SALREP PICTURE $$$,$$9.99

The first indicates that SALARY is to be stored in a form suitable for computation (probably, but not necessarily, binary form) and the second provides a format for reporting salaries as amounts in dollars. (Only one dollar symbol will be printed, immediately before the first significant digit.) The Procedure Division would probably contain a statement like

MOVE SALARY TO SALREP.

which implicitly requires the conversion from binary to character form, with appropriate formatting. 2

Exercise 15. Despite significant advances in the design and implementation of PLs, it remains true that FORTRAN is widely used for “number crunching” and COBOL is widely used for data processing. Can you explain why? 2

3.5 PL/I

During the early 60s, the dominant languages were Algol, COBOL, and FORTRAN. The continuing desire for a “universal language” that would be applicable to a wide variety of problem domains led IBM to propose a new programming language (originally called NPL but changed, after objections from the UK’s National Physical Laboratory, to PL/I) that would combine the best features of these three languages. Insiders at the time referred to the new language as “CobAlgoltran”. The design principles of PL/I (Radin 1978) included: the language should contain the features necessary for all kinds of programming; a programmer could learn a subset of the language, suitable for a particular application, without having to learn the entire language.
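As a rough illustration of the implicit conversion performed by MOVE SALARY TO SALREP, here is a hedged Python sketch. The function name is invented, and Python's format strings only approximate COBOL's PICTURE editing rules:

```python
# A loose analogue (not real COBOL semantics) of MOVE SALARY TO SALREP:
# a binary integer is converted to an edited character string.

def move_salary_to_salrep(salary):
    # SALARY PICTURE 99999: an unsigned integer of up to five digits.
    assert 0 <= salary <= 99999
    # SALREP PICTURE $$$,$$9.99: comma-grouped digits, two decimal places,
    # and a single '$' floated to just before the first significant digit.
    return "${:,.2f}".format(salary)

print(move_salary_to_salrep(12345))  # $12,345.00
```

The point of the example is that a one-line MOVE hides a nontrivial conversion routine, which is exactly why a single COBOL statement could generate a large amount of machine code.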

An important lesson of PL/I is that these design goals are doomed to failure. A programmer who has learned a “subset” of PL/I is likely, like all programmers, to make a mistake. With luck, the compiler will detect the error and provide a diagnostic message that is incomprehensible to the programmer because it refers to a part of the language outside the learned subset. More probably, the compiler will not detect the error and the program will behave in a way that is inexplicable to the programmer, again because it is outside the learned subset.

PL/I extends the automatic type conversion facilities of COBOL to an extreme degree. For example, the expression (Gelernter and Jagannathan 1990)

(’57’ || 8) + 17

is evaluated as follows:

1. Convert the integer 8 to the string ’8’.
2. Concatenate the strings ’57’ and ’8’, obtaining ’578’.
3. Convert the string ’578’ to the integer 578.
4. Add 17 to 578, obtaining 595.
5. Convert the integer 595 to the string ’595’.

The compiler’s policy, on encountering an assignment x = E, might be paraphrased as: “Do everything possible to compile this statement; as far as possible, avoid issuing any diagnostic message that would tell the programmer what is happening”.

PL/I did introduce some important new features into PLs. They were not all well-designed, but their existence encouraged others to produce better designs. Some of these were later incorporated into C.

Types. PL/I provides a wide range of programmer-defined types. Every variable has a storage class: static, automatic, based, or controlled. An object associated with a based variable x requires explicit allocation and is placed on the heap rather than the stack; the allocated objects themselves could not be named. Since we can execute the statement allocate x as often as necessary, based variables provide a form of template.

Exception Handling. PL/I provided a simple, and not very safe, form of exception handling. Statements of the following form are allowed anywhere in the program:

ON condition BEGIN;
. . . .
END;

If the condition (which might be OVERFLOW, PRINTER OUT OF PAPER, etc.) becomes TRUE, control is transferred to whichever ON statement for that condition was most recently executed. After the statements between BEGIN and END (the handler) have been executed, control returns to the statement that raised the exception or, if the handler contains a GOTO statement, to the target of that statement.

Exercise 16. Discuss potential problems of the PL/I exception handling mechanism. 2
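The coercion chain for (’57’ || 8) + 17 can be simulated in Python. The helper names pl1_concat and pl1_add are invented for illustration; Python itself would reject '578' + 17 with a TypeError, which is precisely the diagnostic PL/I declined to give:

```python
# A sketch of PL/I-style coercions: operators force their operands into
# whatever type the operation needs.

def pl1_concat(a, b):
    # '||' converts both operands to strings, then concatenates.
    return str(a) + str(b)

def pl1_add(a, b):
    # '+' converts both operands to numbers, then adds.
    return int(a) + int(b)

step1 = pl1_concat('57', 8)   # '578'
step2 = pl1_add(step1, 17)    # 595
print(step2)                  # 595
```

Each helper silently performs a conversion the programmer may not have intended, which is the danger the text describes.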

3.6 Algol 68

Whereas Algol 60 is a simple and expressive language, its successor Algol 68 (van Wijngaarden et al. 1975; Lindsey 1996) is much more complex. The main design principle of Algol 68 was orthogonality: the language was to be defined using a number of basic concepts that could be combined in arbitrary ways. Although it is true that lack of orthogonality can be a nuisance in PLs, it does not necessarily follow that orthogonality is always a good thing.

The language was described in a formal notation that specified the complete syntax and semantics of the language (van Wijngaarden et al. 1975). The fact that the Report was very hard to understand may have contributed to the slow acceptance of the language. The important features introduced by Algol 68 include the following.

Algol 68 has a very uniform notation for declarations and other entities: it uses the same syntax (mode name = expression) for types, constants, variables, and functions. This implies that, for all these entities, there must be forms of expression that yield appropriate values.

In a collateral clause of the form (x, y, z), the expressions x, y, and z can be evaluated in any order, or concurrently. In a function call f (x, y, z), the argument list is a collateral clause. Collateral clauses provide a good, and early, example of the idea that a PL specification should intentionally leave some implementation details undefined. In this example, the Algol 68 report does not specify the order of evaluation of the expressions in a collateral clause. This gives the implementor freedom to use any order of evaluation and hence, perhaps, to optimize.

The operator ref stands for “reference” and means, roughly, “use the address rather than the value”. This single keyword introduces call by reference, pointers, dynamic data structures, and other features to the language. It appears in C in the form of the operators “*” and “&”.

Operator overloading: programmers can provide new definitions for standard operators such as “+”. Even the priority of these operators can be altered.

Algol 68 has a rule that requires, for an assignment x := E, that the lifetime of the variable x must be less than or equal to the lifetime of the object obtained by evaluating E.

Exercise 17. Explain the motivation for this rule. To what extent can it be checked by the compiler? 2

Algol 68 was not widely used, although it was popular for a while in various parts of Europe. The ideas that Algol 68 introduced, however, have been widely imitated, and a large vocabulary of PL terms emerged, some of which have become part of the culture (cast, coercion, narrowing, . . . .) and some of which have not (mode, weak context, voiding, . . . .).

3.7 Pascal

Pascal was designed by Wirth (1996) as a reaction to the complexity of Algol 68, PL/I, and other languages that were becoming popular in the late 60s. Wirth made extensive use of the ideas of Dijkstra and Hoare (later published as (Dahl, Dijkstra, and Hoare 1972)), especially Hoare’s ideas of data structuring. The important contributions of Pascal included the following.

Pascal demonstrated that a PL could be simple yet powerful. The type system of Pascal was based on primitives (integer, real, bool, . . . .) and mechanisms for building structured types (array, record, file, set, . . . .). Thus data types in Pascal form a recursive hierarchy just as blocks do in Algol 60.
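Algol 68's operator overloading survives in many later languages. As a sketch, Python allows a restricted form of the same idea through special methods, although, unlike Algol 68, it does not allow operator priorities to be altered. The Vec class here is invented for illustration:

```python
# User-defined meaning for "+": a small vector type.

class Vec:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # "+" on two Vec values now means component-wise addition.
        return Vec(self.x + other.x, self.y + other.y)

v = Vec(1, 2) + Vec(3, 4)
print(v.x, v.y)  # 4 6
```

The design question raised by Algol 68 remains visible here: overloading makes code read naturally, but the meaning of "+" now depends on the types of its operands.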

Pascal provides no implicit type conversions other than subrange to integer and integer to real. All other type conversions are explicit (even when no action is required) and the compiler checks type correctness. Its most important innovations were probably the combination of simplicity and static type checking.

Pascal was designed to match Wirth’s (1971) ideas of program development by stepwise refinement. Programmers are expected to start with a complete but skeletal “program” and flesh it out in a series of refinement steps, each of which makes certain decisions and adds new details. Pascal is a kind of “fill in the blanks” language in which all programs have a similar structure, determined by the relatively strict syntax. The monolithic structure that this idea imposes on programs is a drawback of Pascal because it prevents independent compilation of components.

Like Algol 60, Pascal missed important opportunities. The record type was a useful innovation (although very similar to the Algol 68 struct) but allowed data only. Allowing functions in a record declaration would have paved the way to modular and even object oriented programming. It is well-known that the biggest loop-hole in Pascal’s type structure was the variant record.

Nevertheless, Pascal had a strong influence on many later languages. In another sense, Pascal was a failure because it was too simple. Because of the perceived missing features, supersets were developed and, inevitably, these became incompatible. The first version of “Standard Pascal” was almost useless as a practical programming language and the Revised Standard described a usable language but appeared only after most people had lost interest in Pascal.

Exercise 18. List some of the “missing features” of Pascal. 2

3.8 Modula–2

Wirth (1982) followed Pascal with Modula–2, which inherits Pascal’s strengths and, to some extent, removes Pascal’s weaknesses. Modula–2 was the product of a sabbatical year in California, where Wirth worked with the designers of Mesa, another early modular language. (Wirth’s first design, Modula, was never completed.) The important contribution of Modula–2 was, of course, the introduction of modules. The important features of Modula–2 are:

Modules with separated interface and implementation descriptions (based on Mesa). A module in Modula–2 has an interface and an implementation. The interface provides information about the use of the module to both the programmer and the compiler. The implementation contains the “secret” information about the module. This design has the unfortunate consequence that some information that should be secret must be put into the interface. For example, the compiler must know the size of the object in order to declare an instance of it. This implies that the size must be deducible from the interface, which implies, in turn, that the interface must contain the representation of the object. (The same problem appears again in C++.) Modula–2 provides a limited escape from this dilemma: a programmer can define an “opaque” type with a hidden representation. In this case, the interface contains only a pointer to the instance and the representation can be placed in the implementation module.

Coroutines.

Exercise 19. How serious do you think this problem was? 2
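The effect of an opaque type, where clients manipulate a handle while the representation stays private to the implementation, can be sketched in Python. The names here are invented for illustration; in Modula–2 the split would be enforced by separate interface and implementation modules:

```python
# "Implementation": the secret representation. The leading underscore
# signals that clients should not touch it directly.
class _StackRep:
    def __init__(self):
        self.items = []

# "Interface": clients see only these operations and an opaque handle.
def new_stack():
    return _StackRep()

def push(handle, item):
    handle.items.append(item)

def pop(handle):
    return handle.items.pop()

s = new_stack()
push(s, 1)
push(s, 2)
print(pop(s))  # 2
```

Because clients only ever hold a reference to the instance, the representation can change without recompiling client code, which is exactly the benefit the opaque-pointer trick buys in Modula–2.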

3.9 C

C is a very pragmatic PL. Ritchie (Ritchie 1996) designed it for a particular task — systems programming — for which it has been widely used. It is designed to be easy to compile and to produce efficient object code. The compiler is assumed to be rather unsophisticated (a reasonable assumption for a compiler running on a PDP/11 in the late sixties) and in need of hints such as register.

C is notable for its concise syntax. Some syntactic features are inherited from Algol 68 (for example, += and other assignment operators) and others are unique to C and C++ (for example, postfix and prefix ++ and --).

C is a low-level language by comparison with the other PLs discussed in this section. It is based on a small number of primitive concepts. For example, arrays are defined in terms of pointers and pointer arithmetic, but C does not provide real support for arrays, strings, or boolean operations. This is both the strength and weakness of C.

The enormous success of C is partly accidental. UNIX, after Bell released it to universities, became popular, with good reason. Since UNIX depended heavily on C, the spread of UNIX inevitably led to the spread of C.

3.10 Ada

Ada (Whitaker 1996) represents the last major effort in procedural language design. It is a large and complex language that combines then-known programming features with little attempt at consolidation. It was the first widely-used language to provide full support for concurrency, with interactions checked by the compiler, but this aspect of the language proved hard to implement.

Ada provides templates for procedures, record types, generic packages, and task types. The corresponding objects are: blocks and records (representable in the language), and packages and tasks (not representable in the language). It is not clear why four distinct mechanisms are required (Gelernter and Jagannathan 1990). The syntactic differences suggest that the designers did not look for similarities between these constructs.

A procedure definition looks like this:

procedure procname ( parameters ) is body

A record type looks like this:

type recordtype ( parameters ) is body

The parameters of a record type are optional. If present, they have a different form than the parameters of procedures.

A generic package looks like this:

generic ( parameters ) package packagename is package description

The parameters can be types or values. For example, the template

generic
  max: integer;
  type element is private;
package Stack is
  . . . .

might be instantiated by a declaration such as

package intStack is new Stack(20, integer)

Finally, a task template looks like this (no parameters are allowed):

task type templatename is task description

Of course, programmers hardly notice syntactic differences of this kind: they learn the correct incantation and recite it without thinking. But it is disturbing that the language designers apparently did not consider possible relationships between these four kinds of declaration. Changing the syntax would be a minor improvement, but uncovering deep semantic similarities might have a significant impact on the language as a whole, just as the identity declaration of Algol 68 suggested new and interesting possibilities.

Exercise 20. Propose a uniform style for Ada declarations. 2

4 The Functional Paradigm

Procedural programming is based on instructions (“do something”) but, inevitably, procedural PLs also provide expressions (“calculate something”). The key insight of functional programming (FP) is that everything can be done with expressions: the commands are unnecessary.

This point of view has a solid foundation in theory. Turing (1936) introduced an abstract model of “programming”, now known as the Turing machine. Kleene (1936) and Church (1941) introduced the theory of recursive functions. The two theories were later shown (by Kleene) to be equivalent: each had the same computational power. Other theories, such as Post production systems, were shown to have the same power. This important theoretical result shows that FP is not a complete waste of time but it does not tell us whether FP is useful or practical. To decide that, we must look at the functional programming languages (FPLs) that have actually been implemented.

Most functional languages support high order functions. Roughly, a high order function is a function that takes another function as a parameter or returns a function. More precisely: A zeroth order expression contains only variables and constants. A first order expression may also contain function invocations, but the results and parameters of functions are variables and constants (that is, zeroth order expressions). In general, in an n-th order expression, the results and parameters of functions are (n − 1)-th order expressions. A high order expression is an n-th order expression with n ≥ 2. The same conventions apply in logic with “function” replaced by “function or predicate”. In first-order logic, quantifiers can bind variables only; in a high order logic, quantifiers can bind predicates.
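A small Python illustration of these orders: compose below is a high order (second order) function, since both its parameters and its result are functions. The names are invented for illustration:

```python
# A second order function: takes functions, returns a function.
def compose(f, g):
    return lambda x: f(g(x))

inc = lambda n: n + 1       # first order: parameters and result are values
double = lambda n: 2 * n

inc_then_double = compose(double, inc)
print(inc_then_double(5))   # 12, i.e. double(inc(5))
```

Here `5` is a zeroth order expression, `inc(5)` is first order, and `compose(double, inc)` is second order because its parameters are themselves functions.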

4.1 LISP
Anyone could learn LISP in one day, except that if they already knew FORTRAN, it would take three days. Marvin Minsky

Functional programming was introduced in 1958 in the form of LISP by John McCarthy. The following account of the development of LISP is based on McCarthy’s (1978) history. The important early decisions in the design of LISP were: to provide list processing (which already existed in languages such as Information Processing Language (IPL) and FORTRAN List Processing Language (FLPL)); to use a prefix notation (emphasizing the operator rather than the operands of an expression); to use the concept of “function” as widely as possible (cons for list construction; car and cdr for extracting list components; cond for conditional, etc.); to provide higher order functions and hence a notation for functions (based on Church’s (1941) λ-notation); to avoid the need for explicit erasure of unused list structures. McCarthy (1960) wanted a language with a solid mathematical foundation and decided that recursive function theory was more appropriate for this purpose than the then-popular Turing machine model. He considered it important that LISP expressions should obey the usual mathematical laws allowing replacement of expressions and: 20

Figure 1: The list structure (A (B C))

    Another way to show that LISP was neater than Turing machines was to write a universal LISP function and show that it is briefer and more comprehensible than the description of a universal Turing machine. This was the LISP function eval [e, a], which computes the value of a LISP expression e, the second argument a being a list of assignments of values to variables. . . . Writing eval required inventing a notation for representing LISP functions as LISP data, and such a notation was devised for the purpose of the paper with no thought that it would be used to express LISP programs in practice. (McCarthy 1978)

After the paper was written, McCarthy’s graduate student S. R. Russell noticed that eval could be used as an interpreter for LISP and hand-coded it, thereby producing the first LISP interpreter. Soon afterwards, Timothy Hart and Michael Levin wrote a LISP compiler in LISP; this is probably the first instance of a compiler written in the language that it compiled.

The function application f (x, y) is written in LISP as (f x y). The function name always comes first: a + b is written in LISP as (+ a b). All expressions are enclosed in parentheses and can be nested to arbitrary depth. There is a simple relationship between the text of an expression and its representation in memory. An atom is a simple object such as a name or a number. A list is a data structure composed of cons-cells (so called because they are constructed by the function cons); each cons-cell has two pointers and each pointer points either to another cons-cell or to an atom. Figure 1 shows the list structure corresponding to the expression (A (B C)). Each box represents a cons-cell. There are two lists, each with two elements, and each terminated with NIL. The diagram is simplified in that the atoms A, B, C, and NIL would themselves be list structures in an actual LISP system.

The function cons constructs a list from its head and tail: (cons head tail).
The value of (car list) is the head of the list and the value of (cdr list) is the tail of the list. Thus: (car (cons head tail)) → head (cdr (cons head tail)) → tail The names car and cdr originated in IBM 704 hardware; they are abbreviations for “contents of address register” (the top 18 bits of a 36-bit word) and “contents of decrement register” (the bottom 18 bits). It is easy to translate between list expressions and the corresponding data structures. There is a function eval (mentioned in the quotation above) that evaluates a stored list expression. Consequently, it is straightforward to build languages and systems “on top of” LISP and LISP is often used in this way.
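A minimal Python model of cons-cells (invented here for illustration; real LISP cells also carry type information) makes these equations concrete:

```python
# Each cons-cell is a pair of pointers; lists are terminated by NIL.
NIL = None

def cons(head, tail):
    return (head, tail)

def car(cell):
    return cell[0]

def cdr(cell):
    return cell[1]

# The list structure (A (B C)) from Figure 1: two two-element lists.
inner = cons('B', cons('C', NIL))
lst = cons('A', cons(inner, NIL))

print(car(cons('h', 't')))   # h  -- (car (cons head tail)) yields head
print(cdr(cons('h', 't')))   # t  -- (cdr (cons head tail)) yields tail
print(car(car(cdr(lst))))    # B
```

The same pair structure also shows why it is easy to translate between the text of an expression and its representation in memory: both are nested pairs.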


It is interesting to note that the close relationship between code and data in LISP mimics the von Neumann architecture at a higher level of abstraction. LISP was the first in a long line of functional programming (FP) languages. Its principal contributions are listed below.

Names. In procedural PLs, a name denotes a storage location (value semantics). In LISP, a name is a reference to an object, not a location (reference semantics). In the Algol sequence

int n; n := 2; n := 3;

the declaration int n; assigns a name to a location, or “box”, that can contain an integer. The next two statements put different values, first 2 then 3, into that box. In the LISP sequence

(progn (setq x (car structure)) (setq x (cdr structure)))

x becomes a reference first to (car structure) and then to (cdr structure). The two objects have different memory addresses.

A consequence of the use of names as references to objects is that eventually there will be objects for which there are no references: these objects are “garbage” and must be automatically reclaimed if the interpreter is not to run out of memory. The alternative — requiring the programmer to explicitly deallocate old cells — would add considerable complexity to the task of writing LISP programs. Nevertheless, the decision to include automatic garbage collection (in 1958!) was courageous and influential.

A PL in which variable names are references to objects in memory is said to have reference semantics. All FPLs and most OOPLs have reference semantics. Note that reference semantics is not the same as “pointers” in languages such as Pascal and C. A pointer variable stands for a location in memory and therefore has value semantics; it just so happens that the location is used to store the address of another object.

Lambda. LISP uses “lambda expressions”, based on Church’s λ-calculus, to denote functions. For example, the function that squares its argument is written

(lambda (x) (* x x))

by analogy to Church’s f = λx . x^2.
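For comparison, in Python a lambda expression is itself a value that can be applied directly or passed to other functions, with no special wrapping:

```python
# The squaring function as a lambda expression.
square = lambda x: x * x
print(square(4))               # 16 -- like ((lambda (x) (* x x)) 4)

# A function is passed like any other value.
def apply_twice(f, v):
    return f(f(v))

print(apply_twice(square, 3))  # 81
```

This is the "functions as values" idea applied consistently; early LISP approximated it, but, as described below, only with some programming tricks.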
We can apply a lambda expression to an argument to obtain the value of a function application. For example, the expression

    ((lambda (x) (* x x)) 4)

yields the value 16. However, the lambda expression itself cannot be evaluated. Consequently, LISP had to resort to programming tricks to make higher order functions work. For example, if we want to pass the squaring function as an argument to another function, we must wrap it up in a "special form" called function:

    (f (function (lambda (x) (* x x))) ....)

Similar complexities arise when a function returns another function as a result.

Dynamic Scoping. Dynamic scoping was an "accidental" feature of LISP: it arose as a side-effect of the implementation of the look-up table for variable values used by the interpreter. The C-like program in Listing 7 illustrates the difference between static and dynamic scoping.

(Do not confuse dynamic scoping with dynamic binding!)

Listing 7: Static and Dynamic Binding

    int x = 4;                  // 1
    void f () {
        printf("%d", x);
    }
    void main () {
        int x = 7;              // 2
        f ();
    }

With static scoping, the variable x in the body of the function f is a use of the global variable x defined in the first line of the program. Since the value of this variable is 4, the program prints 4.

A LISP interpreter constructs its environment as it interprets. The initial environment is empty. After interpreting the LISP equivalent of the line commented with "1", the environment contains the global binding for x: x = 4. When the interpreter evaluates the function main, it inserts the local x into the environment, obtaining x = 4, x = 7. The environment behaves like a stack (last in, first out). The interpreter then evaluates the call f(); when it encounters x in the body of f, it uses the first value of x in the environment and prints 7.

Although dynamic scoping is natural for an interpreter, it is inefficient for a compiler. Interpreters are slow anyway, and the overhead of searching a linear list for a variable value just makes them slightly slower still. A compiler, however, has more efficient ways of accessing variables, and forcing it to maintain a linear list would be unacceptably inefficient. Consequently, early LISP systems had an unfortunate discrepancy: the interpreters used dynamic scoping and the compilers used static scoping. Some programs gave one answer when interpreted and another answer when compiled!

Exercise 21. Describe a situation in which dynamic scoping is useful.

The current dialect of LISP is called Common LISP (Steele et al. 1990). It is a much larger and more complex language than the original LISP and includes many features of Scheme (described below). Common LISP provides static scoping with dynamic scoping as an option.

Interpretation. LISP was the first major language to be interpreted. Originally, the LISP interpreter behaved as a calculator: it evaluated expressions entered by the user, but its internal state did not change. It was not long before a form for defining functions was introduced (originally called define, later changed to defun) to enable users to add their own functions to the list of built-in functions.

A LISP program has no real structure. On paper, a program is a list of function definitions; the functions may invoke one another with either direct or indirect recursion. At run-time, a program is the same list of functions, translated into internal form, added to the interpreter. The LISP interpreter, written in LISP, contains expressions such as the one shown in Listing 8. We might paraphrase this as: "if the car of the expression that we are currently evaluating is car, the value of the expression is obtained by taking the car of the cadr (that is, the second term) of the expression".

Exercise 22. How much can you learn about a language by reading an interpreter written in the language? What can you not learn?
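The environment-as-stack behaviour can be simulated directly. The following Python sketch (an aside, not from the notes) interprets the program of Listing 7 with an explicit binding stack, showing why a dynamically scoped interpreter prints 7:

```python
# A stack of (name, value) bindings, searched from the most recent.
env = [("x", 4)]                 # global binding, line "1" of Listing 7

def lookup(name):
    for n, v in reversed(env):   # most recent binding first
        if n == name:
            return v
    raise NameError(name)

def f():
    return lookup("x")           # the body of f uses x

def main():
    env.append(("x", 7))         # local binding, line "2"
    result = f()                 # dynamic scoping: f sees this x
    env.pop()
    return result

print(main())   # prints 7; static scoping would use the global x and print 4
```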

Listing 8: Defining car

    (cond ((eq (car expr) 'car)
           (car (cadr expr)) )
          ....)

4.1.1 Basic LISP Functions

The principal — and only — data structure in classical LISP is the list. Lists are written (a b c). Lists also represent programs: the list (f x y) stands for the function f applied to the arguments x and y. The following functions are provided for lists:

first (originally called car) returns the head (first component) of a list.

rest (originally called cdr) returns the tail of a list.

second (originally called cadr, short for "car cdr") returns the second element of a list.

cons builds a list, given its head and tail.

null is a predicate that returns true if its argument is the empty list.

atom is a predicate that returns true if its argument is not a list — and must therefore be an "atom", that is, a variable name or other primitive object.

A form looks like a function but does not evaluate its arguments. The important forms are:

quote takes a single argument and does not evaluate it.

cond is the conditional construct. It has the form (cond (p1 e1) (p2 e2) ... (t e)) and works like this: if p1 is true, return e1; if p2 is true, return e2; ...; else return e.

def is used to define functions: (def name (lambda parameters body)).

4.1.2 A LISP Interpreter

The following interpreter is based on McCarthy's LISP 1.5 Evaluator from the LISP 1.5 Programmer's Manual by McCarthy et al. (MIT Press, 1962). An environment is a list of name/value pairs. The function pairs builds an environment from a list of names and a list of expressions:

    (def pairs (lambda (names exps env)
      (cond ((null names) env)
            (t (cons (cons (first names) (first exps))
                     (pairs (rest names) (rest exps) env) )) ) ))

The function lookup finds the value of a name in a table. The table is represented as a list of pairs.

    (def lookup (lambda (name table)
      (cond ((eq name (first (first table))) (rest (first table)))
            (t (lookup name (rest table))) ) ))

The heart of the interpreter is the function eval, which evaluates an expression exp in an environment env:

    (def eval (lambda (exp env)
      (cond ((null exp) nil)
            ((atom exp) (lookup exp env))
            ((eq (first exp) (quote quote)) (second exp))
            ((eq (first exp) (quote cond)) (evcon (second exp) env) )
            (t (apply (first exp) (evlist (rest exp) env) env)) ) ))

The function evcon is used to evaluate cond forms: it takes a list of pairs of the form (cnd exp) and returns the value of the expression corresponding to the first true cnd.

    (def evcon (lambda (tests env)
      (cond ((null tests) nil)
            (t (cond ((eval (first (first tests)) env)
                      (eval (second (first tests)) env))
                     (t (evcon (rest tests) env)) ) ) ) ))

The function evlist evaluates a list of expressions in an environment. It is a straightforward recursion.

    (def evlist (lambda (exps env)
      (cond ((null exps) nil)
            (t (cons (eval (first exps) env)
                     (evlist (rest exps) env) )) ) ))

The function apply applies a function fun to arguments args in the environment env. The cases it considers are:

Built-in functions: first, second, cons, and other built-in functions not considered here.

A function which is a lambda form.

Any other function with an atomic name: this is assumed to be a user-defined function, and lookup is used to find the lambda form corresponding to the name.

    (def apply (lambda (fun args env)
      (cond ((eq fun (quote first)) (first (first args)))
            ((eq fun (quote second)) (second (first args)))
            ((eq fun (quote cons)) (cons (first args) (second args)))
            ((atom fun) (apply (eval fun env) args env))
            ((eq (first fun) (quote lambda))
             (eval (third fun) (pairs (second fun) args env)) ) ) ))
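A direct transliteration into Python may make the interpreter's structure easier to follow. This is an illustrative sketch (names l_eval and l_apply are invented to avoid clashing with Python built-ins), covering only atoms, quote, cond, the list built-ins, and lambda application:

```python
def lookup(name, env):
    # env is a list of (name, value) pairs, most recent first
    for n, v in env:
        if n == name:
            return v
    raise NameError(name)

def pairs(names, exps, env):
    # Bind parameter names to argument values, extending env.
    return list(zip(names, exps)) + env

def l_eval(exp, env):
    if isinstance(exp, str):                   # an atom: look it up
        return lookup(exp, env)
    if exp[0] == "quote":                      # (quote e) -> e, unevaluated
        return exp[1]
    if exp[0] == "cond":                       # first true test wins
        for test, result in exp[1]:
            if l_eval(test, env):
                return l_eval(result, env)
        return None
    return l_apply(exp[0], [l_eval(e, env) for e in exp[1:]], env)

def l_apply(fun, args, env):
    if fun == "first":
        return args[0][0]
    if fun == "rest":
        return args[0][1:]
    if fun == "cons":
        return [args[0]] + args[1]
    if isinstance(fun, str):                   # user-defined: fetch its lambda
        return l_apply(lookup(fun, env), args, env)
    # fun is ("lambda", params, body)
    return l_eval(fun[2], pairs(fun[1], args, env))

# ((lambda (x) (cons x (quote (b c)))) (quote a))  =>  (a b c)
prog = [("lambda", ["x"], ["cons", "x", ["quote", ["b", "c"]]]),
        ["quote", "a"]]
print(l_eval(prog, []))   # ['a', 'b', 'c']
```

Note that pairs prepends the new bindings, so this sketch reproduces the interpreter's stack-like (dynamic) environment behaviour discussed above.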

4.1.3 Dynamic Binding

The classical LISP interpreter was implemented while McCarthy was still designing the language. It contains an important defect that was not discovered for some time. James Slagle defined a function like this:

    (def testr (lambda (x p f u)
      (cond ((p x) (f x))
            ((atom x) (u))
            (t (testr (rest x) p f
                      (lambda () (testr (first x) p f u)))) ) ))

and it did not give the correct result. We use a simpler example to explain the problem. Suppose we define:

    (def show (lambda () x ))

Calling this function generates an error because x is not defined. (The interpreter above does not incorporate error detection and would simply fail with this function.) However, we can wrap show inside another function:

    (def try (lambda (x) (show) ))

We then find that (try t) evaluates to t. When show is called, the binding (x . t) is in the environment, and so that is the value that show returns. In other words, the value of a variable in LISP depends on the dynamic behaviour of the program (which functions have been called) rather than on the static text of the program. We say that LISP uses dynamic binding, whereas most other languages, including Scheme, Haskell, and C++, use static binding.

Correcting the LISP interpreter to provide static binding is not difficult: it requires a slightly more complicated data structure for environments (instead of having a simple list of name/value pairs, we effectively build a list of activation records). People wrote LISP compilers, but it is hard to write a compiler with dynamic binding. Consequently, there were many LISP systems that provided dynamic binding during interpretation and static binding for compiled programs! However, the dynamic binding problem was not discovered until LISP was well-entrenched: it would be almost 20 years before Guy Steele introduced Scheme, with static binding, in 1978.

4.2 Scheme

Scheme was designed by Guy L. Steele Jr. and Gerald Jay Sussman (1975). It is very similar to LISP in both syntax and semantics, but it corrects some of the errors of LISP and is both simpler and more consistent. The starting point of Scheme was an attempt by Steele and Sussman to understand Carl Hewitt's theory of actors as a model of computation. The model was object oriented and influenced by Smalltalk (see Section 6.2). Steele and Sussman implemented the actor model using a small LISP

interpreter. The interpreter provided lexical scoping, a lambda operation for creating functions, and an alpha operation for creating actors. For example, the factorial function could be represented either as a function, as in Listing 9, or as an actor, as in Listing 10.

Listing 9: Factorial with functions

    (define factorial
      (lambda (n)
        (if (= n 0)
            1
            (* n (factorial (- n 1))))))

Listing 10: Factorial with actors

    (define actorial
      (alpha (n c)
        (if (= n 0)
            (c 1)
            (actorial (- n 1) (alpha (f) (c (* f n)))))))

Implementing the interpreter brought an odd fact to light: the interpreter's code for handling alpha was identical to the code for handling lambda! This indicated that closures — the objects created by evaluating lambda — were useful for both higher order functional programming and object oriented programming (Steele 1996).

LISP ducks the question "what is a function?" It provides lambda notation for functions, but a lambda expression can only be applied to arguments, not evaluated itself. Scheme provides an answer to this question: the value of a function is a closure. The answer to the question "what is a function?" is "a function is an expression (the body of the function) and an environment containing the values of all variables accessible at the point of definition".

Closures in Scheme are ordinary values. They can be passed as arguments and returned by functions. (Both are possible in LISP, but awkward because they require special forms.) Thus in Scheme we can write both

    (define num 6)

which binds the value 6 to the name num, and

    (define square (lambda (x) (* x x)))

which binds the squaring function to the name square. (Scheme actually provides an abbreviated form of this definition, to spare programmers the trouble of writing lambda all the time, but the form shown is accepted by the Scheme compiler and we use it in these notes.)

Since the Scheme interpreter accepts a series of definitions, as in LISP, it is important to understand the effect of the following sequence:

    (define n 4)
    (define f (lambda () n))
    (define n 7)
    (f)

The final expression calls the function f that has just been defined. The value returned is the value of n — but which value, 4 or 7? If this sequence could be written in LISP, the result would be 7, which is the value of n in the environment when (f) is evaluated. Scheme, however, uses static scoping. The closure created by evaluating the definition of f includes all name bindings in effect at the time of definition. Consequently, a Scheme interpreter yields 4 as the value of (f).
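Closures are now commonplace. As an aside (not part of the notes), the same idea in Python: a function value packages its body together with the bindings at its point of definition.

```python
def make_adder(n):
    # The returned function is a closure: it carries the binding
    # of n that was in effect when it was created.
    def add(x):
        return x + n
    return add

add4 = make_adder(4)
add7 = make_adder(7)

# Each closure remembers its own n.
print(add4(10))   # 14
print(add7(10))   # 17
```

The two closures share the same body but have different environments, which is exactly the distinction the lambda/alpha discovery above turns on.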

Scheme avoids the problem of incompatibility between interpretation and compilation by being statically scoped, whether it is interpreted or compiled. The interpreter uses a more elaborate data structure for storing values of local variables to obtain the effect of static scoping. In addition to full support for higher order functions, Scheme introduced continuations.

Although Scheme is primarily a functional language, side-effects are allowed. In particular, set! changes the value of a variable. (The "!" is a reminder that set! has side-effects.)

Example 14: Differentiating. Differentiation is a function that maps functions to functions. Approximately (Abelson and Sussman 1985):

    D f(x) = (f(x + dx) - f(x)) / dx

We define the Scheme functions shown in Listing 11.

Listing 11: Differentiating in Scheme

    (define derive
      (lambda (f dx)
        (lambda (x)
          (/ (- (f (+ x dx)) (f x)) dx))))
    (define square (lambda (x) (* x x)))
    (define Dsq (derive square 0.001))

After these definitions have been evaluated, Dsq is, effectively, the function

    f(x) = ((x + 0.001)² - x²) / 0.001 = 2x + ...

We can apply Dsq like this ("->" is the Scheme prompt):

    -> (Dsq 3)
    6.001

Listing 12: Banking in Scheme

    (define (make-account balance)
      (define (withdraw amount)
        (if (>= balance amount)
            (sequence (set! balance (- balance amount))
                      balance)
            ("Insufficient funds")))
      (define (deposit amount)
        (set! balance (+ balance amount))
        balance)
      (define (dispatch m)
        (cond ((eq? m 'withdraw) withdraw)
              ((eq? m 'deposit) deposit)
              (else (error "Unrecognized transaction" m))))
      dispatch)

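The account object of Listing 12 maps almost directly onto Python closures; this sketch (an aside, not from the notes) uses a closure over balance and a dispatch function, mirroring the Scheme version:

```python
def make_account(balance):
    # balance lives in the closure's environment; withdraw and
    # deposit mutate it, playing the role of set! in Scheme.
    def withdraw(amount):
        nonlocal balance
        if balance >= amount:
            balance -= amount
            return balance
        return "Insufficient funds"
    def deposit(amount):
        nonlocal balance
        balance += amount
        return balance
    def dispatch(m):
        if m == "withdraw":
            return withdraw
        if m == "deposit":
            return deposit
        raise ValueError("Unrecognized transaction: " + m)
    return dispatch

acc = make_account(100)
print(acc("withdraw")(50))    # 50
print(acc("withdraw")(100))   # Insufficient funds
print(acc("deposit")(100))    # 150
```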
Example 15: Banking in Scheme. The function shown in Listing 12 shows how side-effects can be used to define an object with changing state (Abelson and Sussman 1985, page 173). The following dialog shows how banking works in Scheme. The first step is to create an account:

    -> (define acc (make-account 100))

The value of acc is a closure consisting of the function dispatch together with an environment in which balance = 100. The function dispatch takes a single argument which must be withdraw or deposit; it returns one of the functions withdraw or deposit which, in turn, takes an amount as argument. The quote sign (') is required to prevent the evaluation of withdraw or deposit. This function must be used to dispatch messages to the other, local, functions. Listing 13 shows some simple applications of acc. The value returned is the new balance.

Listing 13: Using the account

    -> ((acc 'withdraw) 50)
    50
    -> ((acc 'withdraw) 100)
    Insufficient funds
    -> ((acc 'deposit) 100)
    150

This example demonstrates that with higher order functions and control of state (by side-effects) we can obtain a form of OOP. The limitation of this approach, when compared to OOPLs such as Simula and Smalltalk (described in Section 6), is that we can define only one function at a time.

4.3 SASL

SASL (St. Andrew's Symbolic Language) was introduced by David Turner (1976). It has an Algol-like syntax and is of interest because the compiler translates the source code into a combinator expression which is then processed by graph reduction (Turner 1979). Turner subsequently designed KRC (Kent Recursive Calculator) (1981) and Miranda (1985), all of which are implemented with combinator reduction.

Combinator reduction implements call by name (the default method for passing parameters in Algol 60), but with an optimization. If the parameter is not needed in the function, it is not evaluated, as in Algol 60. If it is needed one or more times, it is evaluated exactly once, and never more than once. Since SASL expressions do not have side-effects, evaluating an expression more than once will always give the same result. Thus combinator reduction is (in this sense) the most efficient way to pass parameters to functions. Evaluating an expression only when it is needed, and never more than once, is called call by need or lazy evaluation.

The following examples use SASL notation. The expression x::xs denotes a list with first element (head) x and remaining elements (tail) xs. The definition

    nums(n) = n::nums(n+1)

apparently defines an infinite list:

    nums(0) = 0::nums(1) = 0::1::nums(2) = ...

The function second returns the second element of a list; we can define it like this:

    second (x::y::xs) = y
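Python generators give a rough analogue of SASL's lazy lists; nums and second below are hypothetical Python counterparts (an aside, not SASL code):

```python
def nums(n):
    # Conceptually infinite: elements are produced only on demand.
    while True:
        yield n
        n += 1

def second(xs):
    next(xs)          # discard the head
    return next(xs)   # force exactly one more element

print(second(nums(0)))   # 1 — only two elements are ever computed
```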

Although nums(0) is an "infinite" list, we can find its second element in SASL:

    second(nums(0)) = second(0::nums(1)) = second(0::1::nums(2)) = 1

This works because SASL evaluates a parameter only when its value is needed for the calculation to proceed: as soon as the argument of second is in the form 0::1::..., the required result is known. Call by need is the only method of passing arguments in SASL, but it occurs as a special case in other languages. If we consider if as a function, so that the expression

    if P then X else Y

is a fancy way of writing if(P, X, Y), then we see that if must use call by need for its second and third arguments. If it did not, we would not be able to write expressions such as

    if x = 0 then 1 else 1/x

In C, the functions && (AND) and || (OR) are defined as follows:

    X && Y  ≡  if X then Y else false
    X || Y  ≡  if X then true else Y

These definitions provide the effect of lazy evaluation and allow us to write expressions such as

    if (p != NULL && p->f > 0) ...

4.4 SML

SML (Milner, Tofte, and Harper 1990; Milner and Tofte 1991) was designed as a "metalanguage" (ML) for reasoning about programs as part of the Edinburgh Logic for Computable Functions (LCF) project. The language survived after the rest of the project was abandoned and became "standard" ML, or SML. The distinguishing feature of SML is that it is statically typed in the sense of Section B.3 and that most types can be inferred by the compiler.

SML is run interactively and prompts with "-". In the following example, the programmer defines the factorial function and SML responds with its type:

    - fun fac n = if n = 0 then 1 else n * fac(n-1);
    val fac = fn : int -> int

The programmer then tests the factorial function with argument 6. SML assigns the result to the variable it, which can be used in the next interaction if desired:

    - fac 6;
    val it = 720 : int

SML also allows function declaration by cases, as in the following alternative declaration of the factorial function:

    - fun fac 0 = 1
    =   | fac n = n * fac(n-1);
    val fac = fn : int -> int
    - fac 6;
    val it = 720 : int

Since SML recognizes that the first line of this declaration is incomplete, it changes the prompt to "=" on the second line. The vertical bar "|" indicates that we are declaring another "case" of the declaration. Each case of a declaration by cases includes a pattern. In the declaration of fac, there are two patterns. The first, 0, is a constant pattern, and matches only itself. The second, n, is a variable pattern, and matches any value of the appropriate type.

We can pass functions as arguments to other functions, as in Listing 14. The function o (intended to resemble the small circle that mathematicians use to denote functional composition) is built-in and defined as an infix operator, but even if it wasn't, we could easily declare it and use it to build the fourth power function.

Listing 14: Function composition

    - fun sq x:real = x * x;
    val sq = fn : real -> real
    - infix o;
    - fun (f o g) x = g (f x);
    val o = fn : ('a -> 'b) * ('b -> 'c) -> 'a -> 'c
    - val quad = sq o sq;
    val quad = fn : real -> real
    - quad 3.0;
    val it = 81.0 : real
    - sq 17.0;
    val it = 289.0 : real

The symbols 'a, 'b, and 'c are type names; they indicate that SML has recognized o as a polymorphic function. Note that the definition

    fun sq x = x * x;

would fail because SML cannot decide whether the type of x is int or real.

The function hasfactor defined in Listing 15 returns true if its first argument is a factor of its second argument.

Listing 15: Finding factors

    - fun hasfactor f n = n mod f = 0;
    val hasfactor = fn : int -> int -> bool
    - hasfactor 3 9;
    val it = true : bool

All functions in SML have exactly one argument. It might appear that hasfactor has two arguments, but this is not the case: functions like hasfactor take their arguments one at a time. Applying the first argument, as in hasfactor 2, yields a new function, as shown in Listing 16. In this sense, the declaration of hasfactor introduces two functions. The trick of applying one argument at a time is called "currying", after the American logician Haskell Curry.

Listing 16: A function with one argument

    - val even = hasfactor 2;
    val even = fn : int -> bool
    - even 6;
    val it = true : bool

It may be helpful to consider the types involved:

    hasfactor     : int -> int -> bool
    hasfactor 2   : int -> bool
    hasfactor 2 6 : bool
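Currying is easy to imitate in Python with nested functions. The sketch below (an aside, not SML) mirrors hasfactor:

```python
def hasfactor(f):
    # hasfactor takes its arguments one at a time, like the SML version:
    # applying the first argument yields a new function of one argument.
    def of_(n):
        return n % f == 0
    return of_

print(hasfactor(3)(9))     # True
even = hasfactor(2)        # partial application gives a new function
print(even(6))             # True
print(even(7))             # False
```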

The following brief discussion, adapted from (Åke Wikström 1987), shows how functions can be used to build a programmer's toolkit. The functions here are for list manipulation, which is a widely used example but not the only way in which a FPL can be used. We start with a list generator, defined as an infix operator:

    - infix --;
    - fun (m -- n) = if m < n then m :: (m+1 -- n) else [];
    val -- = fn : int * int -> int list
    - 1 -- 5;
    val it = [1,2,3,4,5] : int list

The functions sum and prod in Listing 17 compute the sum and product, respectively, of a list of integers.

Listing 17: Sums and products

    - fun sum [] = 0
    =   | sum (x::xs) = x + sum xs;
    val sum = fn : int list -> int
    - sum (1 -- 5);
    val it = 15 : int
    - fun prod [] = 1
    =   | prod (x::xs) = x * prod xs;
    val prod = fn : int list -> int
    - prod (1 -- 5);
    val it = 120 : int

We note that sum and prod have a similar form. This suggests that we can abstract the common features into a function reduce that takes a binary function, a value for the empty list, and a list. The idea of processing a list by recursion has been captured in the definition of reduce:

    - fun reduce f u [] = u
    =   | reduce f u (x::xs) = f x (reduce f u xs);
    val reduce = fn : ('a -> 'b -> 'b) -> 'b -> 'a list -> 'b

We can use reduce to obtain one-line definitions of sum and prod, as in Listing 18.

Listing 18: Using reduce

    - fun sum xs = reduce add 0 xs;
    val sum = fn : int list -> int
    - fun prod xs = reduce mul 1 xs;
    val prod = fn : int list -> int

We can also define a sorting function as

    - val sort = reduce insert nil;
    val sort = fn : int list -> int list

where insert is the function that inserts a number into an ordered list, preserving the ordering:

    - fun insert x:int [] = x::[]
    =   | insert x (y::ys) = if x <= y then x::y::ys else y::insert x ys;
    val insert = fn : int -> int list -> int list

4.5 Other Functional Languages

We have not discussed FP, the notation introduced by Backus (1978). Although Backus's speech (and subsequent publication) turned many people in the direction of functional programming, his ideas for the design of FPLs were not widely adopted. Since 1978, FPLs have split into two streams: those that provide mainly "eager evaluation" (that is, call by value), following LISP and Scheme, and those that provide mainly "lazy evaluation", following SASL.

The most important language in the first category is still SML, while Haskell appears to be on the way to becoming the dominant FPL for teaching and other applications.
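As an aside (not part of the notes), the reduce of Listing 18 is a right fold, and the whole toolkit translates directly into Python; reduce_r and insert below are hypothetical Python counterparts:

```python
def reduce_r(f, u, xs):
    # Right fold: reduce_r(f, u, [x1, x2, ...]) = f(x1, f(x2, ... u))
    if not xs:
        return u
    return f(xs[0], reduce_r(f, u, xs[1:]))

def insert(x, ys):
    # Insert x into the ordered list ys, preserving the ordering.
    if not ys or x <= ys[0]:
        return [x] + ys
    return [ys[0]] + insert(x, ys[1:])

nums = list(range(1, 6))                          # 1 -- 5
print(reduce_r(lambda x, acc: x + acc, 0, nums))  # sum:  15
print(reduce_r(lambda x, acc: x * acc, 1, nums))  # prod: 120
print(reduce_r(insert, [], [3, 1, 2]))            # sort: [1, 2, 3]
```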

5 Haskell

COMP 348 is about the principles of programming languages. In order to understand the principles, you need to know more than one programming language. Even this is not enough: you really need to know more than one style of programming. Accordingly, the course begins with an introduction to a language that is quite different from languages that you may already be familiar with, such as C++, Java, and Basic.

Haskell is a functional programming language in the sense of Section 4. Although it has many of the features that we associate with other languages, such as variables, expressions, and data structures, its foundation is the mathematical concept of a function. Haskell gets its name from the American logician Haskell Curry (1900-82; see http://www.haskell.org/bio.html). You can find information about Haskell at www.haskell.org.

5.1 Getting Started

There are several Haskell compilers and all of them are free. One of the first Haskell implementations was called "Gofer"; we will be using a system derived from Gofer and called HUGS (Haskell Users' Gofer System). At Concordia, Hugs has been installed on both Solaris8 machines and linux hosts in pkg/hugs98. Executables are in /site/bin and can be run without any special preparation. Documentation is in pkg/hugs98/docs. If you want to run Hugs on your own PC, download it from http://www.haskell.org/hugs; installation is straightforward.

Hugs provides an interactive programming environment. It is based on a "read/eval/print" loop: the system reads an expression, evaluates it, and prints the value. Figure 2 shows Hugs working out 2 + 2. The prompt Prelude> indicates that the only module that has been loaded is the "prelude". As well as expressions, the environment recognizes various commands with the prefix ":". For instance, if you enter :browse (or :b for short), Hugs will display the functions and structures defined in the current environment.

Haskell computes with integers and floating-point numbers. Integers have unlimited size and conversions are performed as required:

    Prelude> 1234567890987654321 / 1234567
    1.0e+012 :: Double
    Prelude> 1234567890987654321 `div` 1234567
    1000000721700 :: Integer

Strings are built into Haskell. A string literal is enclosed in double quotes, as in C++. Haskell uses the same escape sequences as C++ (e.g. '\n'), but Haskell strings are not terminated by '\0'. The infix operator ++ concatenates strings:

    Prelude> "Hallo" ++ " " ++ "everyone!"
    "Hallo everyone!" :: [Char]

The basic data structure of almost all functional programming languages, including Haskell, is the list. In Haskell, lists are enclosed in square brackets with items separated by commas. The concatenation operator for lists is ++. This is the same as the concatenation operator for strings — Haskell considers a string to be simply a list of characters.

    Prelude> [1,2,3] ++ [4,5,6]
    [1,2,3,4,5,6] :: [Integer]

The prelude provides a number of basic operations on lists.

Figure 2: A short session with Hugs

    Hugs 98: Based on the Haskell 98 standard
    Copyright (c) 1994-2001
    World Wide Web: http://haskell.org/hugs
    Report bugs to: hugs-bugs@haskell.org
    Version: December 2001

    Haskell 98 mode: Restart with command line option -98 to enable extensions

    Reading file "C:\Program Files\Hugs98\lib\Prelude.hs":

    Hugs session for: C:\Program Files\Hugs98\lib\Prelude.hs
    Type :? for help
    Prelude> 2+2
    4 :: Integer
    Prelude>

Here are some of the basic list operations provided by the prelude:

    Prelude> sum [1..20]
    210 :: Integer
    Prelude> product [1..20]
    2432902008176640000 :: Integer
    Prelude> take 5 [1..20]
    [1,2,3,4,5] :: [Integer]
    Prelude> drop 10 [1..20]
    [11,12,13,14,15,16,17,18,19,20] :: [Integer]
    Prelude> filter even [1..20]
    [2,4,6,8,10,12,14,16,18,20] :: [Integer]
    Prelude> take 5 "Peter Piper pickled a pepper"
    "Peter" :: [Char]

Evaluating expressions is fun for a few minutes, but it quickly gets boring. To do anything interesting, you have to write your own programs — which, in Haskell, means mainly writing functions (although later we will also introduce new data types). The environment does not allow you to introduce any new definitions:

    Prelude> n = 99
    ERROR - Syntax error in input (unexpected `=')

Fortunately, it does allow you to read definitions from a file. The command is :load and it is followed by a path name. Here is how the definitions for this lecture are loaded; note that the prompt changes to reflect the newly loaded module:

    Prelude> :load "e:/courses/comp348/src/lec1.hs"
    Reading file "e:/courses/comp348/src/lec1.hs":
    Hugs session for:
    C:\Program Files\Hugs98\lib\Prelude.hs
    e:/courses/comp348/src/lec1.hs

The file lec1.hs looks like this (the "..." indicates material that is discussed below):

    module Lec1 where

    fc1 n = if n == 0 then 1 else n * fc1(n - 1)
    ...

This code introduces a Haskell module called Lec1 that contains a number of definitions. The first definition introduces a function called fc1.

5.1.1 Hugs Commands

Before describing the Haskell language, we digress briefly to explain the commands provided by the Hugs environment. Figure 3 provides a brief summary of each command. Note that any command can be abbreviated: for example, you can enter :b instead of :browse. The WinHugs environment has icons for most of these commands. Figure 4 shows options for the :set command; these options can also be specified on the command line.

In the examples that follow, we will show Haskell code as it appears in the source file. When we need to show the effect of evaluating an expression, we will include the prompt, as in the examples above.

Figure 3: Hugs commands

    Command                 Brief description                                     Note
    :?                      List all Hugs commands
    :! command              Execute a shell command
    :quit                   Quit the current Hugs session
    :load filename ...      Load Haskell source from a file                       (1)
    :also filename ...      Load additional files                                 (1)
    :reload                 Repeat last load command                              (1)
    :browse module ...      Display functions exported by module(s)
    :set options            Set flags and environment parameters (see Figure 4)
    :cd directory           Change working directory
    :edit file              Start the editor                                      (2)
    :find name              Start the editor with the cursor on the object named  (2)
    :gc                     Force garbage collection
    :info name ...          Display information about files, classes, etc.
    :module module          Change module
    :names pattern          Find names that match the pattern
    :project project file   Load a project (obsolete)                             (3)
    :type expression        Print the type of expression without evaluating it
    :version                Display Hugs version

Notes for Figure 3:

1. :load removes all definitions (except those belonging to the Prelude) and loads new definitions from the named file(s). :also is similar but it does not remove previous definitions. :reload repeats the previous :load command.

2. The editor commands assume that Hugs has been linked to an editor. The default editor for Hugs in Windows is notepad, but for a Windows session you may like to use emacs with Hugs. :edit suspends Hugs and starts the editor; when you close the editor, Hugs resumes. :find is similar, but it positions the editor cursor at the chosen object.

3. The :project command reads a project file containing information about the files that Hugs should load. Recent versions of Hugs use a smarter method of loading projects, called "import chasing", that makes the :project command unnecessary in most cases.

You may find it more convenient to ignore these commands and simply run Hugs and your favourite editor as independent processes. One easy way to develop a Hugs program is to:

1. Write the first version in a file fun.hs.
2. Load this file into Hugs and try the functions: :l fun.hs
3. Leaving the editor running, make corrections in the editor window and reload the file in the Hugs window: :r

After step 3, all you have to do is save the corrected file in the editor window and enter :r in Hugs to reload the modified source code.

Figure 4: Parameters for :set

    ±s        Print # of reductions and cells after evaluation
    ±t        Print type after evaluation
    ±f        Terminate evaluation after first error
    ±g        Print cells recovered by garbage collector
    ±l        Use literate modules as default
    ±e        Warn about errors in literate modules
    ±.        Print dots to show progress (not recommended!)
    ±q        Print nothing to show progress
    ±w        Always show which modules are loaded
    ±k        Show "kind" errors in full
    ±u        Use show to display results
    ±i        Chase imports when loading modules
    p string  Set prompt to string
    r string  Set repeat last expression string to string
    P string  Set search path for modules to string
    E string  Use editor setting given by string
    F string  Set preprocessor filter to string
    h num     Set heap size to num bytes (has no effect on current session)
    c num     Set constraint cutoff limit to num

The symbol ± indicates a toggle: +x turns x on and -x turns it off. The other commands take a string or a number as argument. You can obtain the information in Figure 4 by entering :set without any arguments. The output also includes the current settings which, for a Windows session, will look something like this:

    Current settings: +tfewuiAR -sgl.qQkI -h250000 -p"%s> " -r$$ -c40
    Search path     : -P{Hugs}\lib;{Hugs}\lib\hugs;{Hugs}\lib\exts;{Hugs}\lib\win32
    Project Path    :
    Editor setting  : -EC:\WINNT\notepad.exe
    Preprocessor    : -F
    Compatibility   : Haskell 98 (+98)

5.1.2 Literate Haskell

Donald Knuth proposed the idea that programs should be readable documents with detailed explanations and commentary built in, rather than just code listings with occasional (and often useless) comments. A program written in this way is called a literate program, and the art of writing such programs is called literate programming.

The first step towards literate programming is to make "comments" very easy to write and read. A consequence of this step is that the code itself becomes slightly harder to write and read. The overhead for code in literate Haskell (at least the Hugs version) is relatively light. Here are the rules:

• Literate Haskell source files use the extension .lhs instead of .hs.
• A line beginning with > is interpreted as Haskell code.
• Any other lines are ignored by the compiler.
• The lines before and after lines containing code must be either blank or lines containing code (but see the e switch below).

Here's an example to illustrate how this works. Suppose that the source file vector.lhs contains this text:

    Vector is an instance of the type class Show.
    This enables easy conversion of vectors to strings.
    > instance Show Vector where
    >   show (V x y z) = "(" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ ")"

By default, the e switch in the compiler is on (as if we had entered :set +e). When Hugs loads this file, it will report:

    Prelude> :l e:/courses/comp348/src/vector.lhs
    Reading file "e:/courses/comp348/src/vector.lhs":
    Parsing
    ERROR "e:/courses/comp348/src/vector.lhs":14 - Program line next to comment

There are two ways to fix this problem. One is to turn off the e switch (its default setting is +e):

    Prelude> :set -e

and then reload the source file; it will load without errors. The other way is to put blank lines in the source file:

    Vector is an instance of the type class Show.
    This enables easy conversion of vectors to strings.

    > instance Show Vector where
    >   show (V x y z) = "(" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ ")"

In general, it is a good idea to put blank lines in the source file anyway, because they make the code stand out more clearly. Figure 5 shows a small but complete example of a literate Haskell program.

5.1.3 Defining Functions

Since Haskell is a functional language, defining functions is fundamental and there are, in fact, several ways to define functions. We will start by considering the favourite function of functional programmers, the factorial function:

    0! = 1
    n! = 1 × 2 × 3 × ... × n    for n ≥ 1

Our first definition of the factorial function uses recursion:

    fc1 n = if n == 0 then 1 else n * fc1(n - 1)

The first point to note is that, since everything is a value in Haskell, there is no need for a return statement: you write just the value that you want returned.

Many functions, including this one, are defined by cases. The cases for factorial are n = 0 and n > 0. Haskell has a case expression (analogous to switch in C++ and Java) and we use it here for the second definition of factorial:

    fc2 n = case n of
              0 -> 1
              n -> n * fc2(n - 1)

This is the first definition that is written on more than one line. It can be written on one line but, to make the syntax legal, there must be a semicolon between the two case clauses (see Section 5.1.4):

    fc2 n = case n of 0 -> 1 ; n -> n * fc2(n - 1)

This definition works fine, but even this, in fact, is not idiomatic Haskell: it is not the preferred style. The preferred form of definition is a set of equations:

    fc3 0 = 1
    fc3 n = n * fc3(n - 1)

Although we have written these equations together, they are in fact independent assertions for Haskell. If we wrote the first but not the second, fc3 0 would evaluate correctly to 1, but fc3 4 would cause an error.

There are also "closed form" expressions for some functions. We can compute n! by constructing the list [1..n] and then using the Prelude function product to multiply all of the list items:

    fc4 n = product [1..n]

This works correctly when n = 0 because the "product" of the empty list is defined to be 1:

    Lec1> product []
    1 :: Integer
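The same equational style handles other functions defined by cases. As a further illustration (ours, not from the notes), here is the Fibonacci function written as a set of independent equations:

```haskell
-- Fibonacci defined by cases, in the same equational style as fc3.
-- (Illustrative sketch; the name fib is ours, not from the notes.)
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)
```

As with fc3, the equations are tried in order, so the base cases must come first: the pattern n would also match 0 and 1.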

    Author:      Peter Grogono
    Date:        5 October 2003
    Description: Example of a Literate Haskell source file

    > module Vector where

    This is a module for vectors in 3D Euclidean space.
    We define V to be the constructor for vectors.

    > data Vector = V Double Double Double

    Vector is an instance of the type class Show.
    This enables easy conversion of vectors to strings.

    > instance Show Vector where
    >   show (V x y z) = "(" ++ show x ++ ", " ++ show y ++ ", " ++ show z ++ ")"

    Vector is an equality type. Vectors are equal if all of their
    components are equal. It might be an improvement to consider vectors
    to be equal if their components are approximately equal.

    > instance Eq Vector where
    >   V x y z == V x' y' z' = x == x' && y == y' && z == z'

    Vector is a numeric class. The following definitions provide sums,
    differences, and outer (cross) products of vectors.

    > instance Num Vector where
    >   V x y z + V x' y' z' = V (x + x') (y + y') (z + z')
    >   V x y z - V x' y' z' = V (x - x') (y - y') (z - z')
    >   V x y z * V x' y' z' =
    >     V (y * z' - y' * z) (z * x' - z' * x) (x * y' - x' * y)
    >   negate (V x y z) = V (negate x) (negate y) (negate z)

    A vector can be scaled by a real number. Haskell does not allow us to
    overload '*' and '/' for vector/scalar operations.

    > scale (V x y z) d = V (x * d) (y * d) (z * d)

    These auxiliary functions define the dot product and length of a vector.
    The expression 'norm v' yields a unit vector with the same direction as v.

    > dot (V x y z) (V x' y' z') = x * x' + y * y' + z * z'
    > len v = sqrt (dot v v)
    > norm v = if l == 0
    >          then error "Norm of zero vector"
    >          else scale v (1/l)
    >          where l = len v

    Figure 5: A Literate Haskell program

To illustrate some other styles of function definition, we will use an even simpler function: inc adds 1 to its argument. The "obvious" definition is a single equation:

    inc1 n = n + 1

Purists have a problem with this definition: they would prefer definitions to have the form x = e, in which x is the name of the thing being defined and e is the defining expression. The expression \n -> n + 1 is an anonymous function: the backslash indicates that n is a parameter, and the "arrow" -> links the parameter to the body of the function, n + 1. Backslash looks a bit like λ, and the Haskell notation suggests the way in which the function is written in λ-calculus: λn . n + 1. The anonymous function can be used alone:

    Lec1> (\n -> n + 1) 7
    8 :: Integer

We can write the definition in this way in Haskell:

    inc2 = \n -> n + 1

We can read this definition as "inc2 is the function that maps n to n + 1".

There is another way of writing this anonymous function. Putting parentheses around +1 makes this expression into a function:

    Lec1> (+1) 6
    7 :: Integer

This provides us with another definition of inc:

    inc3 = (+1)

Operators with one argument missing are called sections. In general, given a binary operator, we can create functions from it by giving it either a left or a right operand. For example, (/3) is a function that divides its argument by 3 and (3/) is a function that divides its argument into 3:

    Lec1> (/3) 9
    3.0 :: Double
    Lec1> (12/) 5
    2.4 :: Double

The general rules are:

    (op expr)  is equivalent to  \x -> (x op expr)
    (expr op)  is equivalent to  \x -> (expr op x)

There is an important exception to this rule: (-1) is not a section; it is just −1. Instead, (subtract expr) is a section equivalent to \x -> x - expr.

So far, all of our functions have had exactly one argument. Nevertheless, we can write definitions such as

    add x y = x + y

and use the functions defined:

    Lec1> add 347589 23874
    371463 :: Integer

How do we account for this? The function add is actually a second order function that expects one argument and returns a function that expects another argument. In fact, all Haskell functions take exactly one argument.
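Sections and partial application can be combined freely; here is a small sketch (the names inc', double, and applyTwice are ours, not from the notes):

```haskell
-- Sections and partial application (illustrative; names are ours).
inc' :: Integer -> Integer
inc' = (+1)               -- section: right operand supplied

double :: Integer -> Integer
double = (2*)             -- section: left operand supplied

-- A second order function, like add: it takes a function and
-- returns a function.
applyTwice :: (a -> a) -> a -> a
applyTwice f x = f (f x)
```

For example, applyTwice inc' 5 evaluates to 7, and applyTwice double 3 to 12.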

To see this, we can define

    inc4 = add 1

and try it:

    Lec1> inc4 5
    6 :: Integer

Thus add 1 is "the function that adds 1 to its argument" and add 99 is "the function that adds 99 to its argument".

5.1.4 Syntax

Syntax conventions for functional programming are different from "traditional" programming styles. Compared to programs in most other languages, Haskell programs are remarkably free from "syntactic clutter": braces, semicolons, and so on.

The trade-off is that Haskell programmers must respect layout rules. The rules are simple and intuitive and, if you use existing source code as a model, you may never have to learn them explicitly. It is important to know that the rules exist, however, because otherwise Haskell's error messages can be quite confusing. Here are some of the rules.

The "blank" operator denotes function application. Thus f x means "the result of applying the function f to the argument x". The technical name for the blank operator is juxtaposition. You will have noticed, for example, that we write inc n rather than inc(n).

Juxtaposition binds higher than any other operation. For example, the Haskell expression f n+1 denotes f(n) + 1. Because application binds strongly, we must put parentheses around arguments that are not simple: if we want the effect of f(n + 1), we must write f(n+1).

Juxtaposition associates to the left: Haskell interprets f x y as ((f x) y). First, the function f is applied to the argument x; then the result, which must be a function, is applied to the argument y. For example, Haskell interprets the expression add 2 3 as "apply the function add 2 to the argument 3". The application f(g(x)) applies g to x and then applies f to the result; we can write this in Haskell either as f(g(x)) or as f (g x).

Unlike C++, in Haskell indentation and line breaks make a difference. Haskell allows semicolons but does not require them: the parser inserts a semicolon at the end of every line. There's nothing wrong with line breaks, provided that they occur in sensible places. The following definition is fine:

    fc6 n =
      if n == 0
      then 1
      else n * fc6(n - 1)

However, if you break a line in the wrong place (for example, by starting a continuation line at the left margin), you get the mysterious diagnostic "unexpected ';'". Inserting this definition into lec1.hs

    fc6 n
    = if n == 0 then 1 else n * fc6(n - 1)

gives this error:

    Reading file "e:/courses/comp348/src/lec1.hs":
    Parsing
    ERROR "e:/courses/comp348/src/lec1.hs":27 - Syntax error in expression
    (unexpected ';', possibly due to bad layout)

There are a few other Haskell constructs for which layout is important, and we will state the rules when we introduce the constructs. We have already seen the Haskell case expression. Cases must appear on separate lines:

    case e of
      p1 -> e1
      p2 -> e2
      p3 -> e3

or separated by semicolons:

    case e of p1 -> e1 ; p2 -> e2 ; p3 -> e3

5.1.5 Tracing

Debugging functional programs is difficult. One problem is that we cannot interleave print statements with other statements, because functional programs don't have statements. Why? Because statements have side-effects, and functional programs are not supposed to have side-effects.

The Hugs Trace module provides a way around the problem of no side-effects. It defines a function trace with two arguments: a string and an expression. When trace is called, it prints the string as a side-effect and returns the value of the expression. To use it, replace any expression e by (trace (s) (e)) in which s is a string. The parentheses are not always necessary (depending on the context) but they do no harm. Since trace returns the value of e, the meaning of your program should not be affected. However, you will be able to monitor its progress by looking at what trace prints.

To use the Trace module, write import Trace at the beginning of your program. Here is a demonstration of Trace in use. We import the Trace module and define our old friend, the factorial function. We keep the original version (as a comment) for later use and define another version with a call to trace.

    module Demo where

    import Trace

    {- fac 0 = 1
       fac n = n * fac (n - 1) -}

    fac 0 = 1
    fac n = trace ("fac " ++ show n ++ "\n") (n * fac (n - 1))

Calling the traced version of fac gives:

    Demo> fac 6
    fac 6
    fac 5
    fac 4
    fac 3
    fac 2
    fac 1
    720 :: Integer

5.2 Details

5.2.1 Lists

We have already seen (in Section 5) that Haskell supports lists. Here are the basic built-in operations for lists.

Lists are enclosed in square brackets: [...]. The empty list is []. Items within lists are separated by commas: [1,2,3]. A list may contain items of any type, but all of the items in a given list must have the same type (note the use of -- to separate comments from code):

    Lec1> [1,2,3]          -- integer list
    [1,2,3] :: [Integer]
    Lec1> [1,2.5]          -- integers are promoted if necessary
    [1.0,2.5] :: [Double]
    Lec1> ["abc","def"]    -- string list
    ["abc","def"] :: [[Char]]
    Lec1> ['a','b','c']    -- a character list is a string
    "abc" :: [Char]
    Lec1> [1,'a']          -- mixed types: BAD!
    ERROR - Illegal Haskell 98 class constraint in inferred type
    *** Expression : [1,'a']
    *** Type       : Num Char => [Char]

The infix operator : builds lists: it takes an item on the left and a list on the right. The operator : is right-associative, so parentheses are not required in the second example below:

    Lec1> 1:[]
    [1] :: [Integer]
    Lec1> 1:2:[]
    [1,2] :: [Integer]

A very common idiom in Haskell programming is x:xs (read "x followed by xes"). This is a list consisting of an item x and a list xs. The initial letter may vary: for example, a:as, y:ys, and so on.

The infix operator ++ concatenates lists. If the append operator ++ did not exist, we could append lists using our own function app defined by

    app [] ys = ys
    app (x:xs) ys = x:app xs ys

Any Haskell function with two arguments(1) can be used as an infix operator by putting back quotes (grave accents) around it. The second example below demonstrates this. The third example shows what happens if we try to build a list with mixed types.

    Lec1> app [1,2,3] [4,5,6]
    [1,2,3,4,5,6] :: [Integer]
    Lec1> [1,2,3] `app` [4,5,6]
    [1,2,3,4,5,6] :: [Integer]
    Lec1> app ['a','b'] [1,2]
    ERROR - Illegal Haskell 98 class constraint in inferred type
    *** Expression : app ['a','b'] [1,2]
    *** Type       : Num Char => [Char]

The expression [x..y] constructs a list, provided that x and y are members of the same enumerated type:

    Lec1> [2..4]
    [2,3,4] :: [Integer]
    Lec1> [3,6..20]
    [3,6,9,12,15,18] :: [Integer]
    Lec1> ['a'..'z']
    "abcdefghijklmnopqrstuvwxyz" :: [Char]
    Lec1> [5..2]
    [] :: [Integer]

There are functions for extracting the first item of a list (head) and the remaining items of the list (tail), but we don't often need them. The reason is that we can define list functions by using the constructors as patterns. Most function definitions need only two cases: what is the result for an empty list, []? and what is the result for a non-empty list x:xs? For example, to find the length of a list, we define:

    len [] = 0
    len (x:xs) = 1 + len xs

The parentheses are required: since juxtaposition has higher precedence than colon, Haskell would parse len x:xs as (len x):xs and report a type error.

A common operation on lists is to select items that satisfy a predicate. For example, to obtain the even items in a list of integers, we could use the Prelude function even:

    evens [] = []
    evens (x:xs) = if even x then x:evens xs else evens xs

(1) Of course, it hasn't really got two arguments; see Section 5.3.
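The same two-case scheme works for many list functions. For instance (our sketch, mirroring len and evens), here is the sum of a list of integers:

```haskell
-- Summing a list with the standard two patterns: [] and (x:xs).
-- (The name summ is ours, chosen to avoid clashing with the Prelude's sum.)
summ :: [Integer] -> Integer
summ []     = 0            -- the sum of the empty list is 0
summ (x:xs) = x + summ xs  -- add the head to the sum of the tail
```

As with len, the parentheses around (x:xs) are required.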


In functional programming, however, we usually want to generalize, using functions to do so. The function filt takes a predicate p and a list, and returns the items in the list that satisfy the predicate. (A better name would be filter, but a function with this name is defined in the Prelude: it does the same as filt.)

    filt p [] = []
    filt p (x:xs) = if p x then x:filt p xs else filt p xs

Now we can write:

    Lec1> evens [1..20]
    [2,4,6,8,10,12,14,16,18,20] :: [Integer]
    Lec1> filt even [1..20]
    [2,4,6,8,10,12,14,16,18,20] :: [Integer]

Using filt, we can write a better definition for evens:

    evens = filt even

5.2.2 Lazy Evaluation

Haskell evaluates expressions lazily: this means that it never evaluates an expression until the value is actually needed. It is perfectly alright to introduce a non-terminating definition into a Haskell program, but things go wrong when we try to use it. Here is a non-terminating definition

    loop = loop

and here is an attempt to use it (the message {Interrupted!} appears when the user interrupts the program, for example, by entering ctrl-C):

    Lec1> loop + 3
    {Interrupted!}
    Lec1>

There are other things that fail:

    Lec1> 1/0
    Program error: primDivDouble 1.0 0.0

But now define a constant function that doesn't use its argument:

    one n = 1

Since Haskell never evaluates the argument of this function, we can give it any argument at all:

    Lec1> one 77
    1 :: Integer
    Lec1> one (1/0)
    1 :: Integer
    Lec1> one loop
    1 :: Integer

Lazy evaluation allows us to work with infinite lists. The lists do not have an infinite number of items: what "infinite" means in this context is "unbounded"; the list appears to have as many items as we need. For example, if we define

    from n = n:from (n+1)
    nats = from 1


then from n yields an unbounded list starting at n, and nats yields as many natural numbers as we might need: [1,2,3,...]. We can use the built-in function take to bite off finite chunks from the beginning of an infinite list:

    Lec1> take 6 nats
    [1,2,3,4,5,6] :: [Integer]
    Lec1> take 6 (evens nats)
    [2,4,6,8,10,12] :: [Integer]

Here is an alternative definition of nats:

    nats = [1..]

As an application of infinite lists, we can generate prime numbers. The function nomul returns the items in list l that are not multiples of n. The function sieve takes a list and returns its first item n, followed by the result of applying itself to the rest of the list with multiples of n removed.

    nomul n l = filter (\m -> mod m n /= 0) l
    sieve (n:ns) = n:sieve (nomul n ns)

This is what happens when we apply sieve to the list [2,3,4,...]:

    Lec1> sieve (from 2)
    [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89{Interrupted!}
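Another classic use of lazy lists (a standard Haskell idiom, not from the notes) defines the Fibonacci numbers as a self-referential infinite list:

```haskell
-- The infinite list of Fibonacci numbers. Laziness means each element
-- is computed only on demand: the list refers to itself, shifted by one,
-- via zipWith (+).
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
```

As with nats, only take forces a finite prefix: take 8 fibs yields [0,1,1,2,3,5,8,13].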

5.2.3 Pattern Matching

It is time to look more closely at some of the simple definitions we have already seen:

    fac 0 = 1
    fac n = n * fac(n - 1)

    len [] = 0
    len (x:xs) = 1 + len xs

The object following the function name on the left of = is a pattern. When the function is called, the argument of the function is matched against the available patterns. If a match succeeds, the expression on the right side of = is evaluated. Pattern matching may include binding, which assigns values to variables in the expression. Examples:

In the call fac 0, the argument 0 matches the pattern 0. The expression 1 is evaluated, yielding 1.

In the call fac 3, the argument 3 matches the pattern n, yielding the binding { n = 3 }. The expression n * fac(n - 1) is evaluated with this binding, yielding 3 * fac(2).

In the call len [], the argument [] matches the pattern []. The expression 0 is evaluated, yielding 0.

In the call len [27..30], the argument is equivalent to 27:28:29:30:[]. This matches the pattern x:xs, yielding the bindings { x = 27, xs = [28..30] }. The expression 1 + len xs is evaluated with these bindings, yielding 1 + len [28..30].

Pattern matching occurs in a number of contexts in Haskell. All of these contexts can be expressed in terms of the case expression:


    Kind          Pattern    Expression   Condition for match         Binding
    Wild card     _          e            none                        none
    Constant      k          e            k == e                      none
    Variable      v          e            none                        v = e
    Constructor   C p1 p2    C e1 e2      C = C, p1 ~ e1, p2 ~ e2     bindings from the matches

    Figure 6: Pattern matching

    case e of p1 -> e1 ; p2 -> e2 ; ... ; pn -> en

Here is how other kinds of expression map to case expressions (the xi are new variables):

    \p1 p2 ... pn -> e                =  \x1 x2 ... xn ->
                                           case (x1, x2, ..., xn) of (p1, p2, ..., pn) -> e
    if e1 then e2 else e3             =  case e1 of True -> e2 ; False -> e3
    let p1 = e1 ; ... ; pn = en in e  =  case (e1, e2, ..., en) of (p1, p2, ..., pn) -> e

The case expression

    case e of p1 -> e1 ; p2 -> e2 ; ... ; pn -> en

is evaluated as follows. The expression e is matched against p1, p2, ..., pn in turn. If e matches pi, then the value of the expression is obtained by evaluating ei with any bindings obtained by matching. If none of the patterns match, the expression fails.

Note that the order of patterns is significant. For example, the following definition of the factorial function does not work, because the first pattern always succeeds and evaluation never terminates:

    fac n = n * fac(n - 1)
    fac 0 = 1

Figure 6 summarizes the important kinds of pattern and the way in which they match. The expression p ~ e stands for "pattern p matches expression e". The underscore character, _, is called a "wild card" because it matches anything but produces no binding. Use the wild card when you don't need the value of the argument in the function. For example, the head and tail selectors for lists can be defined as

    head (x:_) = x
    tail (_:xs) = xs

A constant matches a constant with the same value but produces no binding. For example, [] matches the empty list but nothing else. A pattern consisting of a variable name matches any expression and binds that expression to the variable. If the pattern is a constructor with patterns as arguments, the expression must be an invocation of the same constructor applied to expressions; the corresponding expressions are matched and the bindings from the matches are combined. Since ":" is the list constructor, the pattern (3:xs) matches any list whose first element is 3.
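A single case expression can combine several of these pattern kinds; here is a small sketch (the function describe is ours, not from the notes):

```haskell
-- One case expression using a constant pattern, a one-item pattern,
-- and a constructor pattern built entirely from wild cards.
describe :: [a] -> String
describe xs = case xs of
                []      -> "empty"              -- constant pattern
                [_]     -> "one item"           -- matches exactly one item
                (_:_:_) -> "two or more items"  -- at least two items
```

The patterns are tried in order, so the more specific cases come first.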

Haskell patterns are linear. This means that each variable in a pattern must be distinct. Suppose that we are defining a function merge that has two lists as arguments, and that we need to consider the case in which the first item in both lists is the same. We are not allowed to write

    merge (x:xs) (x:ys) = ...    -- BAD!

Instead, we must write

    merge (x:xs) (y:ys) = if x == y then ... else ...

Here are some examples of the pattern matching rules. As we know, the most common two cases for a list function are the empty list pattern, [], and the non-empty list pattern, (x:xs). (This pattern matches the list with one item, in which case xs is bound to [].) Sometimes a third pattern is necessary. For example, if we define

    swap [] = []
    swap (x:y:ys) = y:x:swap ys

the expression "swap [3]" gives an error, because the list with one element does not match any pattern. The solution is to include this case in the definition:

    swap [] = []
    swap [x] = [x]
    swap (x:y:ys) = y:x:swap ys

The tuple is another kind of constructor in Haskell. (A tuple corresponds roughly to a C++ struct.) Tuples are enclosed in parentheses and the items inside them are separated by commas. The items need not be of the same type. Unlike lists, tuples of different lengths have different types. A tuple with two items is usually called a pair. We can define selectors for pairs like this:

    first (x, _) = x
    second (_, y) = y

However, these definitions are not really necessary, because the Prelude defines functions fst and snd that do the same thing.

We can also define

    plus (x, y) = x + y

This is not the same function as the Prelude function

    add x y = x + y

The argument of plus must be a pair: there is no equivalent to "add 1".

Functions can return tuples. The following function breaks a list into two roughly equal parts and returns the two parts as a pair:

    brk xs = (take (len xs `div` 2) xs, drop (len xs `div` 2) xs)

Obviously, we would like to avoid the repeated evaluation of "len xs `div` 2". In a procedural language, we would introduce a local variable. The Haskell equivalent is:

    brk xs = (take m xs, drop m xs)
             where m = len xs `div` 2

This may not look "functional" but in fact it is. It is "syntactic sugar" for

    brk xs = (\m -> (take m xs, drop m xs)) (len xs `div` 2)
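Pattern matching works on tuples just as it does on lists; here is a sketch (both functions are ours, not from the notes) showing a pair pattern and a function that returns a pair:

```haskell
-- Matching a pair pattern, in the style of first and second.
swapPair :: (a, b) -> (b, a)
swapPair (x, y) = (y, x)

-- Returning a pair: the smallest and largest items of a non-empty list,
-- using a where clause to name the pair, as in brk.
minMax :: Ord a => [a] -> (a, a)
minMax [x]    = (x, x)
minMax (x:xs) = (min x lo, max x hi)
  where (lo, hi) = minMax xs
```

Note that the where clause itself binds a pair pattern, (lo, hi), to the result of the recursive call.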

5.2.4 Operators

Figures 7 and 8 show the precedence rules that Haskell uses for parsing expressions and operators. We have not encountered all of them yet.

    Description                         Syntax          Associativity
    simple term                         t
    parenthesized term                  (t)
    as-patterns                         a@p             right
    application                         f x             left
    do, if, let, \, case (leftwards)                    right
    infix operators                     see Figure 8
    function types                      ->              right
    contexts                            =>              right
    type constraints                    ::
    do, if, let, \, case (rightwards)
    sequences                           , ..
    generators                          <-
    grouping                            ( )
    guards                              |
    cases                               ->
    definitions                         =
    separation                          ;

    Figure 7: Parsing precedence, from highest (at the top) to lowest (at the bottom)

    Precedence   Left-associative                  Non-associative                      Right-associative
    9            !!                                                                     .
    8                                                                                   ^  ^^  **
    7            *  /  `div`  `mod`  `rem`  `quot`
    6            +  -
    5                                                                                   :  ++
    4                                              ==  /=  <  <=  >  >=  `elem`  `notElem`
    3                                                                                   &&
    2                                                                                   ||
    1            >>  >>=
    0                                                                                   $  $!  `seq`

    Figure 8: Operator precedence, from highest (at the top) to lowest (at the bottom)

Some constructs (do, if, let, \, and case) apparently have two precedences: one leftwards and one rightwards. This is because of a metarule which says that each of these constructs

extends as far to the right as possible, because its components group together. This can be confusing for the beginner. In the first line of the table below, let seems to have higher precedence than +, because the whole expression "a + a" is part of it. In the second line, let seems to have lower precedence than +: the second line is an application of the metarule, since the let expression extends as far as possible. Note that parentheses override the metarule, as shown in the third line, and that all three parsings are what we might intuitively expect.

    expression                     parses as
    let a = 3 in a + a             let a = 3 in (a + a)
    k + let a = 3 in a + a         k + (let a = 3 in (a + a))
    k + (let a = 3 in a) + a       (k + (let a = 3 in a)) + a

5.2.5 Building Haskell Programs

In procedural programming, we have a collection of control structures, or abstraction mechanisms (assignment, sequence, conditional, loop, procedure call, etc.), and we use them to build programs. In Functional Programming (FP), we have only one abstraction mechanism: function definition. This makes the task of programming both simpler and harder: simpler because there are fewer things to worry about, and harder because we have a more limited toolkit. Since Haskell does not have an explicit loop construct, recursion is used frequently. The tendency in FP is to define a lot of small functions and then to build programs using these functions. This can be confusing for the beginner, because the number of small functions is large. But, with experience, this approach to programming becomes quite manageable.

The Prelude is a collection of functions that many programmers have found useful. It is often easier to define a new function using Prelude functions than it is to write the function "from scratch". Here are some of the Prelude functions and their definitions.

The key step in writing a function is often to choose the appropriate parameter for recursion. Recall that "take n xs" returns the first n items of the list xs. In this case, we use recursion on the value of n: if n is zero, the result is the empty list. In the second line, we define the appropriate response when the list is empty.

    take 0 xs = []
    take _ [] = []
    take n (x:xs) = x:take (n-1) xs

Recall also that "drop n xs" throws away the first n items of the list xs and returns the rest. The definition is similar. If n is zero, the list is unchanged. Otherwise, the list is shortened as n is reduced.

    drop 0 xs = xs
    drop _ [] = []
    drop n (_:xs) = drop (n-1) xs

The functions takeWhile and dropWhile are only a little more complicated.

    takeWhile p [] = []
    takeWhile p (x:xs) = if p x then x : takeWhile p xs else []

    dropWhile p [] = []
    dropWhile p (x:xs) = if p x then dropWhile p xs else x:xs

Haskell provides another way of writing the last line. The infix operator @, read "as", allows us to write a pattern in two equivalent ways. In this case, we want xs and x:xs' to stand for the same list:

    dropWhile p xs@(x:xs') = if p x then dropWhile p xs' else xs

Note that one or more primes may be added to a variable name to give a new name, as in mathematics.

The functions break and span are complementary. They split a list into two parts depending on the value of a predicate:

    Prelude> span (<5) [1..20]
    ([1,2,3,4],[5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])
    Prelude> break (>=5) [1..20]
    ([1,2,3,4],[5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20])

In particular, note:

    Prelude> break isSpace "Here is a sentence."
    ("Here"," is a sentence.") :: ([Char],[Char])

The definitions of span and break are as follows:

    span p [] = ([],[])
    span p xs@(x:xs') = if p x
                        then (x:ys, zs)
                        else ([], xs)
                        where (ys,zs) = span p xs'
    break p = span (not . p)

With these functions working, we can write words, which splits a string into words separated by blanks, and unwords, which puts the words together again.

    words s = case dropWhile isSpace s of
                "" -> []
                s' -> w : words s''
                      where (w,s'') = break isSpace s'

    unwords [] = ""
    unwords [w] = w
    unwords (w:ws) = w ++ ' ' : unwords ws

Now we are ready for something more elaborate. Suppose we have a very long string of words, perhaps an entire file, and we want to format it so that each line is no longer than 80 characters. The first step is to break it up into a list of words using the function words. Here is the first version of a function that does the rest. The idea is to put words into a line until the next word won't fit, that is, until it would make the line more than 80 characters long. When this happens, we put the line into the output and carry on. Where do we store the information in this "line"? The problem is that we have no local variables and no state. The answer is that we need an accumulating parameter and recursion. We define brk1 as

    brk1 [] line = [line]
    brk1 (w:ws) line = if length w + 1 + length line <= 80
                       then brk1 ws (line ++ " " ++ w)
                       else line : brk1 (w:ws) []

If there is something in our word string w:ws, this function calculates how long the line will become if we append a blank and the word w to it. If the length is less than 80 characters, the word is moved into the line. Otherwise, the result is a list consisting of the line and the result of processing the rest of the word list. We start it off with an empty line:

    brk1 sentence []

This function has one error: the first character of every line that it outputs is a blank. The next version corrects that error.

    brk2 [] line = [line]
    brk2 (w:ws) line = if length w + 1 + length line <= 80
                       then if line == ""
                            then brk2 ws w
                            else brk2 ws (line ++ " " ++ w)
                       else line : brk2 (w:ws) []

Finally, to make our function presentable to users, we wrap its parts up into a single definition:

    format sen = brk (words sen) []
      where
        brk [] line = [line]
        brk ws@(w:ws') line = if length w + 1 + length line <= 80
                              then if line == ""
                                   then brk ws' w
                                   else brk ws' (line ++ " " ++ w)
                              else line : brk ws []

5.2.6 Guards

The pattern-matching mechanism is handy but rather inflexible, because it only allows for exact matching. As we know, the pattern n matches any number; a pattern alone cannot test a condition such as n < 0. For example, if we want a factorial function that does not blow up for negative numbers, we have to write

    fac n = if n <= 0 then 1 else n * fac(n-1)

We can do better with guards. A guard is a condition that qualifies a pattern. The pattern n | n < 0 matches any negative number. Once we have provided a pattern, we can specify several guards, each with their appropriate definitions. This version of factorial works for all integers:

    fac n | n <= 0 = 1
          | n > 0  = n * fac(n-1)

The guard otherwise includes any conditions not covered by previous guards. For example, the actual definitions of take and takeWhile in the Prelude are:

    take n _ | n <= 0 = []
    take _ []         = []
    take n (x:xs)     = x : take (n-1) xs

takeWhile p []                 = []
takeWhile p (x:xs) | p x       = x : takeWhile p xs
                   | otherwise = []

5.3 Types

Haskell is a strongly typed language. This means that, if a program compiles without error, it will either give an answer of the expected type or it will fail. In the formal type system, every type includes a value ⊥ (pronounced "bottom" because it is the minimal element of a lattice) that denotes failure. Consequently, if the type of a function is Int, the function will return either an integer or ⊥. Failure is discussed in Section 5.3.2. For the rest of this section, we will discuss programs that do not fail.

Haskell uses the Hindley-Milner type system. An expression is written without type declarations but, if it is type correct, the compiler can find a principal type, or most general type, for it. The following examples illustrate principal types. In all of these examples, we follow the convention of declaring the type before the function. The type declaration is accepted in this form by the Haskell compiler. It can also be obtained using the :type command in the Hugs interpreter. There are a small number of situations in which the type checker needs a type declaration to help it.

First, declare i to be the identity function:

i :: a -> a
i x = x

The Haskell type is a -> a, which we can read as "the type of a function that takes an argument of any type and returns a result of the same type". The letter a is called a type variable. In the formal type system, this type is written ∀ a . a → a and is read "for all types a, the type a to a".

The next example is a constant function: it takes two arguments and throws away the second one.

k :: a -> b -> a
k x y = x

There are two points to note. First, the type is a → b → a although you might have expected a × b → a. This is because Haskell functions take their arguments "one at a time". Semantically, k is a function that takes one argument of type a and returns a function with type b → a. The right arrow associates to the right and the type a → b → a is parsed as a → (b → a). The following test verifies this:

Types> k 'x'
k 'x' :: a -> Char
Types> k 'x' 5
'x' :: Char

The second point is simply that the type contains two type variables, a and b. Since there is nothing in the definition that connects the types, the compiler assumes that they are different (although they don't have to be).

In the next example, the two arguments are used to build a list. Consequently, they must have the same type. The type [a] is the type of a list whose elements have type a.

two :: a -> a -> [a]
two x y = [x,y]
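The claim that k 'x' is itself a function can be demonstrated directly. The following is a small sketch; the name kx is hypothetical, introduced here only to give the partial application a name:

```haskell
-- The constant function from the text.
k :: a -> b -> a
k x y = x

-- Because the arrow associates to the right, k 'x' is itself a
-- function of type b -> Char: whatever argument it receives, it
-- returns 'x'.
kx :: b -> Char
kx = k 'x'
```

Applying kx to an Int, a String, or any other value always yields 'x', exactly as the interpreter session shows.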

We can specialize types in two ways. also called an ordinal type.5 then x + y else 0 As stated above. Note that we are not allowed to annotate a pattern — only an expression. Here is the same function again. The context becomes (Ord a. Providing a Char as the first argument of two forces the second argument to be Char as well. because x and y both occur in the expression x + y. The class Ord is a subclass of the class Eq: this means that any type that provides > must also provide ==. Asking Haskell for the type of this function shows that it has used the type we provided rather than the more general type. n3 :: Double -> Double -> Double n3 x y = if (x::Double) > 1.y] else [] Something else happens if we change one argument of > to the digit “1”. cmp :: Eq a => a -> a -> [a] cmp x y = if x == y then [x. the types become specialized. Fractional a) => a -> a -> a n2 x y = if x > 1.5 then x + y else 0 We can also specify the type of the entire function. If we specify that x is a Double. but adds a requirement to the type: it must be an equality type. Num a). The expression Eq => is called a context and it says that the type a must be an instance of the class Eq. but with the type Int specified by the programmer. we can add a type annotation to any variable or expression in the definition.5 instead of 1. Types> two ’a’ two ’a’ :: Char -> [Char] The function two ’a’ accepts an argument of type Char but an argument of another kind gives a type error. the compiler infers that y must be a Double too. n4 :: Int -> Int -> Int n4 x y = if x > 1 then x + y else 0 Types> :t n4 . asserting that a must be both an ordinal type and a numeric type. gt :: Ord a => a -> a -> [a] gt x y = if x > y then [x. n2 :: (Ord a. Num a) => a -> a -> a n1 x y = if x > 1 then x + y else 0 If we write 1. Types> two ’a’ ’b’ "ab" :: [Char] Types> two ’a’ 3 ERROR .5 HASKELL 55 When we apply a function to an actual argument. Haskell finds the most general type. 
we get a different requirement: the type a must be an instance of the class Ord.y] else [] If we change == to >.Illegal Haskell 98 class constraint in inferred type *** Expression : two ’a’ 3 *** Type : Num Char => [Char] The next function introduces a new feature: it uses the equality predicate ==. Haskell allows this even without further type information. n1 :: (Ord a. First. we obtain yet another context: a must be an ordinal type and a fractional type.

Haskell errors are very often caused by type mismatches. the type (a. and the result is a list of bs. The most general type may be specialized by type annotations provided by the user. compose :: (a -> b) -> (c -> a) -> (d -> c) -> d -> b compose f g h x = f (g (h x)) The type of a tuple is written in the same way as values of the type.hs":31 . In this example.hs": Type checking ERROR "e:/courses/comp348/src/types.1 Using Types Since Haskell infers types. Haskell infers: that the most general type of the function is a → b.b) cp x y = (x. we’d better get it right.5 HASKELL n4 :: Int -> Int -> Int 56 Of course. you will find these errors very hard to understand. In the next example. which can get quite complicated. it is possible to virtually ignore types while programming. for two reasons: It is often useful to have a clear idea of the types of your variables when writing functions. the type of x is d.3. 5. we apply a function to a list. h :: d → c. . cp :: a -> b -> (a. Similar notation is used for longer tuples. This is unwise. and the type of the result is b. g :: c → a. b) is “the type of pairs whose first element has type a and whose second element has type b”. the second argument of apply must be a list of as. as the following example shows. y) Summarizing what we have seen so far: All valid Haskell functions and expressions have a type. In this example. if we do provide a type. If you do understand the type system. apply :: (a -> b) -> [a] -> [b] apply f [] = [] apply f (x:xs) = f x : apply f xs Types can get even longer. The type inferred by the compiler is the most general type. you will find them quite helpful. 
we provide an incorrect type for the same function: n5 :: Int -> Char -> Int n5 x y = if x > 1 then x + y else 0 This attempt does not fool the compiler: Reading file "e:/courses/comp348/src/types.Type error in application *** Expression : x + y *** Term : x *** Type : Int *** Does not match : Char It is often easier to write functions than to figure out their types. A context indicates that a type is an instance of a class. The individual types are f :: a → b. If you don’t understand the type system.

2. provided they are correctly formatted. the :9 after the path name is the line number. one version of the function brk function in Section 5.Type error in application *** Expression : app "abc" "def" *** Term : "def" *** Type : String *** Does not match : [Int] There are some occasions when the type system doesn’t work as you would expect. However. but the type checker is expecting something of type [a] — that is. The correction is to replace ’ ’ by something which is a list.hs": Type checking ERROR "e:/courses/comp348/src/lec1. For example. record the type you have specified. if your module includes the declaration app :: [Int] -> [Int] -> [Int] app [] ys = ys app (x:xs) ys = x:app xs ys Haskell will accept the declaration. reporting: Reading file "e:/courses/comp348/src/lec1. part of the expression) which is causing the problem is ’ ’.5 HASKELL For example. a list. as inferred by the compiler.Type error in application *** Expression : ’ ’ ++ w *** Term : ’ ’ *** Type : Char *** Does not match : [a] To make sense of this error: 57 In the line beginning ERROR. Haskell accepts these type declarations. in this case. knowing the types of the functions often makes it easier to understand them. You may have already discovered this problem when representing sets as lists: Lec1> union [] [] . When you are reading someone else’s function definitions. The term (that is. it is a common convention in Haskell (and other functional languages) to write the type declaration.5 looked like this: brk1 [] line = [line] brk1 (w:ws) line = if length w + 1 + length line <= 80 then brk1 ws (line ++ ’ ’ ++ w) else line : brk1 (w:ws) [] Haskell did not accept this version. It identifies then brk1 ws (line ++ ’ ’ ++ w) as the line with the error. you are allowed to declare a more specialized type than the type that Haskell infers.hs":9 . The expression which cannot be typed is ’ ’ ++ w. above the function declaration. 
and give an error if you try to use it incorrectly: Lec1> :info app app :: [Int] -> [Int] -> [Int] Lec1> app "abc" "def" ERROR . It has type Char. [’ ’] or simply " ". and reports an error if there is an inconsistency between the declaration type and the inferred type. For this reason.

6.2 Failure There are only two main causes of failure: A function is given an argument that is of the correct type but outside the domain of the function.3.3.4.5.g.n] [1.3. including WinHugs.Cannot find "show" function for: *** Expression : [] *** Of type : [a] 58 This is a deficiency of the type system.3. but it is not an important one. This calls union [] [].n] and evaluate comb n with n = 0.0 0. A function fails to terminate.2.5 HASKELL ERROR . Domain errors are usually trapped by the runtime system: Prelude> 1/0 Program error: primDivDouble 1. but Haskell does not complain: Lec1> comb 0 [] :: [Integer] 5. even [] gives a type error: Prelude> [] ERROR .5.] [1. the symptoms of failure can vary quite a lot. If the expression union [] [] arises in a context.. Although the semantics says that the value of a failed program is always ⊥. head [] or 1/0. the empty lists will be the result of some computation and their types will be known. Here is a contrived example: we define comb n = union [1. E..Interrupted! Prelude> f where f = f Interrupted! Some implementations of Hugs.3 Reporting Errors There is a built-in function . whether or not it is producing output: Prelude> [1. with the result that programs may crash: Prelude> g 1 where g n = g(n+1) 5. do not detect stack overflow.Unresolved overloading *** Type : Eq a => [a] *** Expression : union [] [] In fact...0 Prelude> head [] Program error: head [] A non-terminating program may run until it is interrupted.

as shown by the mutually recursive definitions.

error :: String -> a

The result returned by this function is the only value that is shared by all types: ⊥ (pronounced "bottom"). As a side-effect, it displays the string. Here is an example of its use. The definition of head in the Prelude is:

head (x:xs) = x
head []     = error "PreludeList.head: head []"

5.3.4 Standard Types

There are a number of types that are either built-in, defined in the Prelude, or defined in libraries.

The type Bool has values True and False.

The type Char has the ASCII characters as its values. The type String is equivalent to [Char].

There are a number of numeric types. These include Int (fixed size integers), Integer (integers of any size), Float (single-precision floats), Double (double precision floats), and Rational (ratios of Integers).

The type () has only two values: (), which is the empty tuple, and ⊥ (bottom). This type corresponds roughly to void in C++.

The first letter of a type is always upper-case. This is not a convention: the compiler uses the case of the first letter to distinguish type names from type variables, ordinary variables, and function names.

5.3.5 Standard Classes

The Standard Classes of Haskell are defined in the Prelude. Here are a couple of examples.

The equality class specifies that a type instance must provide the functions == and /=. However, if the user supplies one of these functions, the compiler will provide the other one automatically.

class Eq a where
  (==), (/=) :: a -> a -> Bool
    -- Minimal complete definition: (==) or (/=)
  x == y = not (x/=y)
  x /= y = not (x==y)

The ordinal class inherits from the equality class and provides additional functions for comparison. An instance type must provide at least <= or compare; all of the other functions can be inferred by the compiler.

class (Eq a) => Ord a where
  compare              :: a -> a -> Ordering
  (<), (<=), (>=), (>) :: a -> a -> Bool
  max, min             :: a -> a -> a
    -- Minimal complete definition: (<=) or compare
    -- using compare can be more efficient for complex types
  compare x y | x==y      = EQ
              | x<=y      = LT
              | otherwise = GT
  x <= y = compare x y /= GT

x <  y = compare x y == LT
x >= y = compare x y /= LT
x >  y = compare x y == GT
max x y | x <= y    = y
        | otherwise = x
min x y | x <= y    = x
        | otherwise = y

The class Show must provide a function that translates a value of the type into a string, as shown above.

[Figure 9: Standard Class Inheritance Graph, showing Eq, Ord, Enum, Show, Num, Real, Integral, Fractional, RealFrac, Floating, and RealFloat]

Figure 9 shows the class inheritance graph for the standard classes. There is no need to memorize this graph: you will need it only if you are designing your own numeric type and, even then, you can get this information from the Prelude. Haskell's error messages will guide you. What is more important is to know the functions that a class is supposed to implement.

5.3.6 Defining New Types

As an illustration of how programmers can define new Haskell types, we discuss the type Frac, shown in Figure 10. This type is actually unnecessary, because there is already a type Rational, but Rational is not defined in the Prelude. The first step is to declare the type and constructors for it. The name of the type is Frac and the name of the only constructor is F. To construct a fraction, we must provide two integers:

Frac> F 9 12
9/12 :: Frac

F n’ d’ = normalize( F (n * d’ . harm :: Integer -> [Frac] harm n | n <= 0 = [] | otherwise = harm (n .Harmonic series: 1/1 + 1/2 + 1/3 + .n’ * d) (d * d’) ) F n d * F n’ d’ = normalize( F (n * n’) (d * d’) ) negate (F n d) = normalize( F (negate n) d ) fromInteger n = F n 1 instance Fractional Frac where recip (F n d) = normalize( F d n ) normalize :: Frac -> Frac normalize (F n d) = if d == 0 then error ("Bad fraction: " ++ show (F n d)) else if n == 0 then F 0 1 else if d < 0 then normalize (F (negate n) (negate d)) else F (n ‘div‘ g) (d ‘div‘ g) where g = gcd n d -.1) ++ [F 1 n] Figure 10: The Class Frac ..5 HASKELL 61 module Frac where data Frac = F Integer Integer instance Show Frac where show (F n d) = if d == 1 then show n else show n ++ "/" ++ show d instance Eq Frac where F n d == F n’ d’ = n * d’ == n’ * d instance Ord Frac where F n d <= F n’ d’ = n * d’ <= n’ * d instance Num Frac where F n d + F n’ d’ = normalize( F (n * d’ + n’ * d) (d * d’) ) F n d ..
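The normalization idea at the heart of Figure 10 can be shown in a condensed, self-contained sketch. Here addFrac stands in for the overloaded + of the figure, and only the Show and Eq instances are reproduced:

```haskell
data Frac = F Integer Integer

-- Show fractions with denominator 1 as integers, as in the figure.
instance Show Frac where
  show (F n d) = if d == 1 then show n else show n ++ "/" ++ show d

-- Two fractions are equal when cross-multiplication agrees.
instance Eq Frac where
  F n d == F n' d' = n * d' == n' * d

-- addFrac stands in for the overloaded + of Figure 10.
addFrac :: Frac -> Frac -> Frac
addFrac (F n d) (F n' d') = normalize (F (n * d' + n' * d) (d * d'))

-- normalize cancels common factors, moves the sign to the numerator,
-- maps every zero to 0/1, and rejects a zero denominator.
normalize :: Frac -> Frac
normalize (F n d)
  | d == 0    = error ("Bad fraction: " ++ show n ++ "/" ++ show d)
  | n == 0    = F 0 1
  | d < 0     = normalize (F (negate n) (negate d))
  | otherwise = F (n `div` g) (d `div` g)
  where g = gcd n d
```

Because every constructing operation passes through normalize, 9/12 becomes 3/4 and 1/4 + 1/4 comes out as 1/2 rather than 8/16.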

5 HASKELL 62 We could introduce new function names for operations on fractions. 4/6 should be converted to 2/3 and 5/(−16) should be converted to −5/16. compare. We use normalize for every operation that constructs a new fraction. we make Frac an instance of Fractional.3. >=. For example. Note that the Num class does not contain division operations. 1/n. We want to work with fractions in “normal form”. Caution: providing a different comparison. and max. The special syntax is actually unnecessary. . 1/3. Enumerations have several constructions but no data. subtracted. Finally and for fun. it shows fractions with denominator equal to 1 as integers. because Haskell treats -x as (negate x). . to making the type Frac an instance of numeric classes. For example. We can provide an ordering for fractions. which can be added. . The result is exact because the components of a Frac are Integer (“large integers”) rather than Int (32-bit integers). but not divided. . The functions work as expected. addFrac might add two fractions. >. including <. 1/2. The last example is of interest. All of these tasks are performed by normalize. The function that we choose is not arbitrary: if we provide <=. min. and *. which uses the standard function gcd to cancel common factors. because we could define if True x y = x if False x y = y relying on lazy evaluation to ensure that only one of x or y is actually evaluated. add fractions using +. we introduce the function harm to construct a list of the fractions 1/1. and a function that converts integers to Fracs. Figure 11 shows a few simple examples of the new type. we can see that any numeric class must be an instance of class Show. may lead to looping! The next step is to make Frac an instance of Num. which requires overloaded definitions of +. All that is needed is a function recip defined so that recip x = 1/x. Since we do intend Fracs to be divided. There are many advantages. From Figure 9. 
Haskell will then define x/y = x × recip y.7 Defining Data Types A type may have more than one constructor. It is useful also to define negate. however. sum is defined for all instances of Num. We also want to check for zero denominators and put the fraction zero into the normal form 0/1. we can overload function names and. as here. If we do this. . as in these declarations: data Bool = False | True data Colour = Red | Green | Blue Haskell provides special syntax (if/then/else) for conditional expressions because programmers are familiar with the English words. such as >=. This enables us to define types like Vector. However. for example. We provide a version of the required function show that writes a fraction in the form "2/3". because we did not define sum. 5. We define ==. and Haskell automatically provides /=. The next step is to make Frac an equality type. so it works correctly for Frac. and multiplied. As a nice touch. Haskell will provide a number of other functions. -.

and treeToList are trivial to define by cases on the constructors. in which case it is ignored (an alternative would be to report an error).1/4.1/2. Here are some things to note about this definition: Function show uses a “helper” function with an additional argument to display trees. or consists of an Int and two subtrees of type BTree. The function insert assumes that the trees are binary search trees. height.1/3. The important thing to note is that the apparently unnecessary case Empty == Empty = True must be included — otherwise the function loops! Constructors may define data components. data BTree = Empty | Tree Int BTree BTree . we need a base case (in this example the base case is Empty) and an inductive case. A new entry is inserted in the left or right subtree unless it is equal to an entry already in the tree. Note that this is a recursive type definition because BTree occurs on both sides of =. A BTree is either Empty. In Figure 12. Comparing trees for equality. Using the tree t defined at the bottom of Figure 12: BTree> t 1 2 3 4 5 6 8 :: BTree The functions size. a left subtree l. and Tree n l r constructs a tree with integer n at the root. and a right subtree r. Just as with recursion in functions.5 HASKELL Frac> F 3 4 > F 5 6 False :: Bool Frac> F 3 4 + F 7 12 4/3 :: Frac Frac> $$ > F 6 5 True :: Bool Frac> F 7 4 / F 2 5 35/8 :: Frac Frac> harm 5 [1. As an example. we define a type for binary trees with integers at the nodes. we will develop a type for binary trees with two constructors: Empty constructs an empty tree. using == is straightforward.1/5] :: 63 [Frac] Frac> sum (harm 50) 13943237577224054960759/3099044504245996706400 :: Figure 11: Trying out type Frac Frac Binary Search Trees Constructors may also define data components.

5 HASKELL 64 module BTree where import Trace data BTree = Empty | Tree Int BTree BTree instance Show BTree where show t = sh t "" where sh Empty sp = "" sh (Tree n l r) sp = let sps = sp ++ " " in sh r sps ++ sp ++ show n ++ "\n" ++ sh l sps instance Eq BTree where Empty == Empty = True (Tree n l r) == (Tree n’ l’ r’) = n == n’ && l == l’ && r == r’ size Empty = 0 size (Tree n l r) = 1 + size l + size r height Empty = 0 height (Tree n l r) = 1 + max (height l) (height r) treeToList Empty = [] treeToList (Tree n l r) = treeToList l ++ treeToList r ++ [n] listToTree [] = Empty listToTree (n:ns) = insert n (listToTree ns) insert insert if m else else m Empty = Tree m Empty Empty m t@(Tree n l r) = < n then Tree n (insert m l) r if m > n then Tree n l (insert m r) t find Empty m = False find (Tree n l r) m = if m == n then True else if m < n then find l m else find r m merge x y = listToTree(treeToList x ++ treeToList y) Figure 12: Class BTree .

6. The size of a tree is the number of nodes that it contains and the height of a tree is the length of the longest path from the root to a leaf node. However. and we will discuss easier ways later. because this gives an ordered list of integers. In recursive calls. and for this purpose postorder (left/right/root) is better.4. size Empty = 0 size (Tree n l r) = 1 + size l + size r height Empty = 0 height (Tree n l r) = 1 + max (height l) (height r) The function treeToList converts a tree to a list.2. For example: BTree> t 6 1 2 5 4 :: 3 BTree . The ordering of the list is significant: it would be natural to use inorder (left/root/right). The implementation of show below uses a “helper function” sh with a second argument giving the amount of indentation to use when printing a node. This is a bit tedious. we make the type Btree an instance of class Show so that we can convert trees to strings.3. t = Tree 1 ( Tree 2 ( Tree 3 Empty ( Tree 4 Empty ( Tree 5 Empty Empty ) ) ) Empty ) ( Tree 6 Empty Empty ) The simple functions below illustrate how we use the constructors as patterns in functions that operate on trees.1] :: [Int] As with previous types.5 HASKELL 65 We can build trees using the constructors. we want to be able to use the list to recreate the tree. the indentation is controlled by the depth of the node in the tree. Here are these functions in use: BTree> size t 6 :: Integer BTree> height t 5 :: Integer BTree> treeToList t [5.
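The values quoted for the example tree can be checked with a self-contained sketch of size and height together with the tree t defined above:

```haskell
data BTree = Empty | Tree Int BTree BTree

-- The number of nodes in a tree.
size :: BTree -> Int
size Empty        = 0
size (Tree _ l r) = 1 + size l + size r

-- The length of the longest path from the root to a leaf.
height :: BTree -> Int
height Empty        = 0
height (Tree _ l r) = 1 + max (height l) (height r)

-- The tree t from the text: a long left spine 1-2-3-4-5 and a leaf 6.
t :: BTree
t = Tree 1 (Tree 2 (Tree 3 Empty (Tree 4 Empty (Tree 5 Empty Empty))) Empty)
           (Tree 6 Empty Empty)
```

As reported in the session above, size t is 6 and height t is 5.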

Note that the last node in the expression is the first node to be inserted in the tree — and it becomes the root. 6. instance Eq BTree where Empty == Empty = True (Tree n l r) == (Tree n’ l’ r’) = n == n’ && l == l’ && r == r’ Let us assume that our binary trees are going to be used as binary search trees (BST). This version of function insert ignores them. the function loops. 8. That is.5 HASKELL 66 We can easily define an equality relation on trees: trees are equal if the integers at their root nodes are equal and their subtrees are (recursively) equal. Surprisingly. 7. The following function inserts an integer into a BST. The following code for the case m < n returns only part of the tree: if m < n then insert m l The function insert provides another way of building trees. which are not allowed in a binary search tree. t1 = insert 1 (insert 8 (insert 3 (insert 2 (insert 5 (insert 2 (insert 6 (insert 4 Empty))))))) BTree> t1 8 6 5 4 3 2 1 Although this is a slight improvement over calling constructors. it is still rather clumsy. maintaining the BST property. We have to decide what to do with duplicate entries. Here is a function that takes integers from a list and builds a BST from them. 2. listToTree [] = Empty listToTree (n:ns) = insert n (listToTree ns) t2 = listToTree [5. 3. 0. 1. It is important to “rebuild” the parts of the tree that we have taken apart. we must declare the Empty is equal to Empty — without this component of the definition. 4] BTree> t2 8 7 6 5 4 . we must ensure that all entries in every left subtree are less than the root and that all entries in every right subtree are greater than the root.

11.5 HASKELL 3 2 1 0 We have written treeToList and listToTree so that they are inverses. We can test these functions with the trees that we already have. 16.5. Assuming that we can recognize a leaf. 17. BTree> t2 == listToTree (treeToList t2) True :: Bool It is easy to merge trees using treeToList and listToTree. 13. as in Figure 13. 18.3. we should store a key/value pair at each node.2.5. BTree> fringe t1 [1. it is straightforward to define fringe. 10. 20. 12] we have BTree> merge t2 t3 20 19 18 17 16 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 :: BTree 67 The function find uses the familiar algorithm to decide whether a given integer is stored in the tree. 19. Then find would match the key and return the corresponding value.7] :: [Int] . The fringe of a tree is a list of the leaf nodes as they would be visited in a left to right scan. with t3 = listToTree [14. To make trees useful. merge x y = listToTree (treeToList x ++ treeToList y) Then. 9.8] :: [Int] BTree> fringe t2 [0.
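The claim that treeToList and listToTree are inverses can be checked with a self-contained sketch (postorder listing and duplicates ignored, as above; Eq is derived here for brevity where the text defines the instance by hand):

```haskell
-- deriving Eq for brevity; the text defines the instance by hand.
data BTree = Empty | Tree Int BTree BTree deriving (Eq, Show)

-- Insert into a binary search tree, ignoring duplicates.
insert :: Int -> BTree -> BTree
insert m Empty = Tree m Empty Empty
insert m t@(Tree n l r)
  | m < n     = Tree n (insert m l) r   -- rebuild the node we took apart
  | m > n     = Tree n l (insert m r)
  | otherwise = t                       -- duplicates are ignored

-- Postorder listing: left, right, root.
treeToList :: BTree -> [Int]
treeToList Empty        = []
treeToList (Tree n l r) = treeToList l ++ treeToList r ++ [n]

listToTree :: [Int] -> BTree
listToTree []     = Empty
listToTree (n:ns) = insert n (listToTree ns)
```

With postorder the root is listed last, so listToTree (which inserts the last element first) reconstructs exactly the same tree.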

The type . Whenever we refer to the type on the right side. Tree a indicates that the type variable a is a parameter of the type. we have immediately: sameFringe x y = fringe x == fringe y BTree> sameFringe t1 t2 False :: Bool Let us investigate this more closely.5 HASKELL isLeaf (Tree _ Empty Empty) = True isLeaf _ = False fringe Empty = [] fringe t@(Tree n l r) = if isLeaf t then trace ("Leaf " ++ show n ++ "\n") (n : (fringe l ++ fringe r)) else fringe l ++ fringe r sameFringe x y = fringe x == fringe y Figure 13: The fringe of a tree 68 There is a famous problem in Computer Science called the “same fringe problem”.8 Types with Parameters (fringe l ++ fringe r)) This is what happens when we call sameFringe again with the traced version of fringe: In this section. module Tree where data Tree a = Empty | T a (Tree a) (Tree a) In practice. we revisit trees.3. Here is a modified version of fringe that traces each discovery of a leaf node: fringe Empty = [] fringe t@(Tree n l r) = if isLeaf t then trace ("Leaf " ++ show n ++ "\n") (n : else fringe l ++ fringe r BTree> sameFringe t1 t2 Leaf 1 Leaf 0 False :: Bool 5. Given two trees. as shown in Figure 14. we must specify it as Tree a again. We need to store key/value pairs at the nodes so that we can store useful information in the tree.3. at the nodes. but with a new twist. we define trees with an arbitrary type. trees with integer nodes are not very useful. In the declaration below. On the left side of the declaration. how do we decide whether they have the same fringe? Since list comparison is provided for any equality type.7 were defined with integers at the nodes. indicated by the type variable a. The trees in Section 5. and Integer is an equality type.
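The fringe computation can be sketched self-contained, without the trace call of Figure 13:

```haskell
data BTree = Empty | Tree Int BTree BTree

isLeaf :: BTree -> Bool
isLeaf (Tree _ Empty Empty) = True
isLeaf _                    = False

-- The fringe is the list of leaf values in left-to-right order.
fringe :: BTree -> [Int]
fringe Empty = []
fringe t@(Tree n l r)
  | isLeaf t  = [n]
  | otherwise = fringe l ++ fringe r

-- Two trees of different shapes may still have the same fringe.
sameFringe :: BTree -> BTree -> Bool
sameFringe x y = fringe x == fringe y
```

For example, a tree with leaves 2 and 3 hanging directly off the root and a tree that reaches the same leaves through an extra interior node have equal fringes even though their shapes differ.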

6063647586). Show b) => [(a.value):ks) = insert key value (build ks) t = build [ ("Charles".5 HASKELL 69 module Tree where data (Ord k. 4508982212) ] Figure 14: Key-value tree . ("George". ("Anne". 4163548695). 5144831733). Show k. 8147586977). ("Feridoon". ("Edward". Show a. Show v) => Show (Tree k v) where show Empty = "" show (T key value left right) = show left ++ leftjust (show key) 12 ++ show value ++ "\n" ++ show right leftjust :: [Char] -> Int -> [Char] leftjust s n | length s >= n = s | otherwise = leftjust (s ++ " ") n out t = putStr (show t) insert :: (Ord a. Show k. ("Bill". 4504657464). Show b) => a -> b -> Tree a b -> Tree a b insert newKey newValue Empty = T newKey newValue Empty Empty insert newKey newValue t@(T key value left right) = if newKey == key then T key newValue left right else if newKey < key then T key value (insert newKey newValue left) right else T key value left (insert newKey newValue right) build :: (Ord a.b)] -> Tree a b build [] = Empty build ((key. 2124756843). Show a. ("Diana". Show v) => Tree k v = Empty | T k v (Tree k v) (Tree k v) instance (Ord k.
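A lookup in the style of find can be sketched for the key/value tree. The name treeLookup is hypothetical (the text defines find only for integer trees), and the datatype context of Figure 14 is omitted here for brevity:

```haskell
-- The datatype context of Figure 14 is omitted for brevity.
data Tree k v = Empty | T k v (Tree k v) (Tree k v)

-- Insertion as in Figure 14: an existing key has its value replaced.
insert :: Ord k => k -> v -> Tree k v -> Tree k v
insert newKey newValue Empty = T newKey newValue Empty Empty
insert newKey newValue (T key value left right)
  | newKey == key = T key newValue left right
  | newKey <  key = T key value (insert newKey newValue left) right
  | otherwise     = T key value left (insert newKey newValue right)

-- A hypothetical lookup: Maybe signals a missing key.
treeLookup :: Ord k => k -> Tree k v -> Maybe v
treeLookup _ Empty = Nothing
treeLookup k (T key value left right)
  | k == key  = Just value
  | k <  key  = treeLookup k left
  | otherwise = treeLookup k right
```

The Ord context on k is exactly what lets the guards compare keys with == and <.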

When we construct a non-empty tree. Show k. module Logic where data Formula = TT | FF | Var String | Con [Formula] | Dis [Formula] | Neg Formula . propositional symbols (or names). 2124756843). and > with keys because the type of the keys is an instance of class Ord. build. we can reduce the amount of writing required by providing a function. The following declaration introduces a type suitable for formulas in propositional calculus. to make Tree an instance of Show. and the right subtree. 4508982212) ] A Haskell session: Tree> t "Anne" 5144831733 "Bill" 4508982212 "Charles" 4163548695 "Diana" 2124756843 "Edward" 4504657464 "Feridoon" 6063647586 "George" 8147586977 :: Tree [Char] Integer Formulas It is often useful to define types with many constructors. 8147586977). ("Edward". the left subtree. ("Feridoon". When we supply a new pair with a key that already exists in the tree. Here is a test: t = build [ (6. 4504657464). 4163548695).. <. (8. 5144831733). we must write: instance (Ord k. ("Anne". For example. the value. ("Diana". (3. the new value replaces the old value.5 HASKELL 70 variables k and v stand for “key” and “value” respectively."Bill"). and implications. we must repeat all of the constraints. that builds a tree from a list of pairs. As before."Anne"). disjunctions. Show v) => Show (Tree k v) where . conjunctions. ("Bill". we must now provide four arguments: the key. with constants true and false. negations. 6063647586)."Charles"). In the source file: t = build [ ("Charles". (1. We require that k is an ordered type so that we can compare keys..."Diana") ] There is a catch with parameterized types: when we use them as instances of a class. ("George". The insertion function insert can use the operators ==.

| Imp Formula Formula

When we define a function for a type like this one, we must usually include cases for all of the constructors, although this is rather tedious. The following definition of show is incomplete because it does not indicate operator precedence.

instance Show Formula where
  show TT           = "true"
  show FF           = "false"
  show (Var name)   = name
  show (Con [])     = ""
  show (Con [f])    = show f
  show (Con (f:fs)) = show f ++ " and " ++ show (Con fs)
  show (Dis [])     = ""
  show (Dis [f])    = show f
  show (Dis (f:fs)) = show f ++ " or " ++ show (Dis fs)
  show (Neg f)      = "not " ++ show f
  show (Imp p q)    = show p ++ " => " ++ show q

As before, we can build formulas using the constructors.

f = (Imp p (Imp p (Imp p p))) where p = Var "p"
g = Imp (Var "p") (Con [Var "q", Neg (Var "r")])

Logic> f
p => p => p => p :: Formula
Logic> g
p => q and not r :: Formula

5.4 Input and Output

Haskell handles input and output using monads. Monads are pure functions that provide the effect of managing state changes. In this course, we will not discuss monads in detail; instead we will just provide some simple examples of their use.

The keyword do is followed by a sequence of actions. Each action must have the same indentation. The function getChar performs an action and returns a value, in this case a Char. The function putChar takes a Char as its argument and performs an action — writing the character. The first program reads a character and echoes it:

test1 = do c <- getChar
           putChar c

The next example shows how to write functions that perform actions. The function ready performs the following sequence of actions: output a prompt; read a character; return a Bool value that depends on the character read.

ready = do putStr "Ready? "
           char <- getChar
           return (char == 'y')
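A function over Formula must handle every constructor. As an illustration, here is a hypothetical evaluator (not part of the notes) that computes the truth value of a formula, given truth values for its propositional symbols:

```haskell
data Formula = TT | FF | Var String | Con [Formula] | Dis [Formula]
             | Neg Formula | Imp Formula Formula

-- A hypothetical evaluator: env supplies a truth value for each
-- propositional symbol.
eval :: (String -> Bool) -> Formula -> Bool
eval _   TT        = True
eval _   FF        = False
eval env (Var x)   = env x
eval env (Con fs)  = all (eval env) fs    -- conjunction
eval env (Dis fs)  = any (eval env) fs    -- disjunction
eval env (Neg f)   = not (eval env f)
eval env (Imp p q) = not (eval env p) || eval env q
```

Note that Con [] evaluates to True and Dis [] to False, the usual identities for "and" and "or".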

process s = filter (not .readFile iFile writeFile oFile s putStr (iFile ++ " copied to " ++ oFile ++ "\n") . getline = do char <.readFile "e:/temp/test. and then writes the line. The else part requires an action sequence that is introduced by a second do.res" (process s) Since it is not good practice to build file names into programs.getLine s <.getChar if char == ’\n’ then return "" else do line <. isSpace) s test4 = do s <.ready if t then putStr "\nReady" else putStr "\nNot ready" The function getline in the next example is recursive.getline putStr (switch line) The next example uses files instead of console functions. the next example reads the input and output file names from the console.dat" writeFile "e:/temp/test. The program test3 reads a line. main = do putStr "Input file: " iFile <.getLine putStr "Output file: " oFile <. applies a function (switch) to it.getLine return (char:line) switch s = map sw s where sw char = if isLower char then toUpper char else if isUpper char then toLower char else char test3 = do line <.5 HASKELL 72 test2 = do t <.
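The pure parts of these programs can be exercised without performing any IO. The following sketch separates out switch and process exactly as they appear above:

```haskell
import Data.Char (isLower, isUpper, isSpace, toLower, toUpper)

-- Swap the case of every letter, as in test3 above.
switch :: String -> String
switch = map sw
  where sw c | isLower c = toUpper c
             | isUpper c = toLower c
             | otherwise = c

-- Remove all whitespace, as in test4 above.
process :: String -> String
process = filter (not . isSpace)
```

Keeping the transformation pure and applying it inside a small IO wrapper is the pattern all of the file examples above follow.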

5.4.1 String Formatting

By default, Hugs does not interpret formatting characters in strings. This does not matter in practice, because there is no need to format strings until you actually want to see them at the end of a calculation. The following exchange with the interpreter illustrates this (the standard function unlines converts a list of strings into a single string with the list items separated by newline characters):

    Prelude> unlines ["first line", "second line", "third line"]
    "first line\nsecond line\nthird line\n"

The IO function putStr interprets the formatting characters correctly:

    Prelude> putStr (unlines ["first line", "second line", "third line"])
    first line
    second line
    third line

Since putStr :: String -> IO () is an IO function, you can use it only at the “top level” — that is, as a command that you enter or as part of a do sequence.

5.4.2 Interactive Programs

The standard IO functions include interact, a function that enables you to write interactive programs easily. The argument for interact is a function of type String -> String. A simple example of such a function is map toUpper. If you write

    interact (map toUpper)

the interpreter will echo each character as you type it, converting lower case letters to upper case letters. The effect of lazy evaluation is important here: the effect is not to read a string of characters and then output the same string with lower case letters converted; instead, each character is read and the converted character is immediately echoed.

The following example provides a more elaborate version of the same idea. The function machine :: String -> String simulates a vending machine. The argument string is a sequence of “commands” interpreted as follows:

    Command   Effect
    'n'       Insert a nickel (5 cents)
    'd'       Insert a dime (10 cents)
    'q'       Insert a quarter (25 cents)
    'r'       Demand a refund

The output from the machine is a string defined as follows:

    String                          Condition
    ""                              the amount entered is less than 25 cents
    "candy"                         the amount entered was exactly 25 cents
    "candy and n cents change"      the amount entered was 25 + n cents
    "Refund: n"                     response to 'r' with n cents stored
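The claim that unformatted and formatted output carry the same information can be spot-checked, since the standard function lines undoes unlines. A small sketch using only Prelude functions:

```haskell
-- unlines joins strings with '\n'; lines splits them apart again.
joined :: String
joined = unlines ["first line", "second line", "third line"]
-- joined == "first line\nsecond line\nthird line\n"

split :: [String]
split = lines joined
-- split recovers the original list of strings
```

Printing joined with putStr produces the three lines; printing it at the interpreter prompt shows the raw string with embedded \n characters.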

Here is the definition of the function machine:

    machine commands = mac commands 0
      where
      mac "" stored       = "Amount left: " ++ show stored
      mac ('q':cs) stored = respond (stored+25) cs
      mac ('d':cs) stored = respond (stored+10) cs
      mac ('n':cs) stored = respond (stored+5) cs
      mac ('r':cs) stored = "Refund: " ++ show stored ++ "\n" ++ mac cs 0
      mac (_:cs) stored   = mac cs stored
      respond stored cs =
        if stored == 25
        then "candy\n" ++ mac cs 0
        else if stored > 25
        then "candy and " ++ show (stored-25) ++ " cents change\n" ++ mac cs 0
        else mac cs stored

The following dialog with Hugs shows the result when the user enters the string "qdddnnr". To end the input string, and hence the dialogue, the user enters ctrl-z. The second example shows the response to "dddd" followed by ctrl-z, showing that the machine reports the amount it has left before terminating.

    VM> interact machine
    candy
    candy and 5 cents change
    Refund: 10
    Amount left: 0
    VM> interact machine
    candy and 5 cents change
    Amount left: 10

5.5 Reasoning about programs

Haskell function definitions not only look like equations but, in an important sense, they are equations. We will illustrate this with a simple example of program manipulation. The following definitions of append (append two lists) and length (length of a list) should be familiar by now. The only difference is that we have numbered the equations for future reference.

    append [] ys     = ys                  (1)
    append (x:xs) ys = x : append xs ys    (2)

    length []     = 0                      (3)
    length (x:xs) = 1 + length xs          (4)

Suppose that we want to define a new function len2 such that len2 xs ys returns the sum of the lengths of the lists xs and ys. Here is a simple definition:

    len2 xs ys = length (append xs ys)     (5)
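Because machine is a pure function of type String -> String, its behaviour can be checked directly, without interact. The sketch below restates the definition so that it stands alone (respond is written with guards rather than nested if, which does not change its behaviour):

```haskell
machine :: String -> String
machine commands = mac commands 0
  where
    -- mac processes the remaining commands with the stored amount.
    mac "" stored       = "Amount left: " ++ show stored
    mac ('q':cs) stored = respond (stored + 25) cs
    mac ('d':cs) stored = respond (stored + 10) cs
    mac ('n':cs) stored = respond (stored + 5) cs
    mac ('r':cs) stored = "Refund: " ++ show stored ++ "\n" ++ mac cs 0
    mac (_:cs) stored   = mac cs stored
    -- respond decides whether enough money has been inserted.
    respond stored cs
      | stored == 25 = "candy\n" ++ mac cs 0
      | stored > 25  = "candy and " ++ show (stored - 25)
                         ++ " cents change\n" ++ mac cs 0
      | otherwise    = mac cs stored
```

For example, machine "qdddnnr" yields "candy\ncandy and 5 cents change\nRefund: 10\nAmount left: 0", matching the Hugs dialog.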

The obvious problem with this definition is that it is inefficient: it constructs a new list just for the purpose of finding its length. How can we avoid this inefficiency? As a first step, we instantiate (5) with xs = [] and then use (1) to simplify the resulting expression:

    len2 [] ys = length (append [] ys)     (6)
               = length ys                 (7)

Next, we instantiate (5) with xs = (x:xs) and then use (2), (4), and (5) to simplify the resulting expression:

    len2 (x:xs) ys = length (append (x:xs) ys)    (8)
                   = length (x : append xs ys)    (9)
                   = 1 + length (append xs ys)    (10)
                   = 1 + len2 xs ys               (11)

We can now combine (7) and (11) to obtain a definition of len2 that does not use append:

    len2 [] ys     = length ys
    len2 (x:xs) ys = 1 + len2 xs ys

We could, of course, have obtained this definition of len2 with a little thought and no formal deduction. Here is a somewhat more complicated example in which the final answer is less obvious.

Suppose we would like a function isSubList such that isSubList xs ys returns True if and only if xs is a sublist of ys. The following definition, written on the basis of its required behaviour, is clearly correct but not, as it stands, implementable (think of bs as the “before list” and as as the “after list”):

    isSubList xs ys = ∃ bs,as • bs ++ xs ++ as == ys

To make progress, we assume that ys is not empty (and so can be written y:ys') and that there are two cases: either bs = [] or bs = y:bs' (clearly, if bs is not empty, its first element must be y). Then we have:

    isSubList xs (y:ys') = ∃ as • [] ++ xs ++ as == y:ys' ∨ ∃ bs',as • (y:bs') ++ xs ++ as == y:ys'
                         = ∃ as • xs ++ as == y:ys' ∨ ∃ bs',as • bs' ++ xs ++ as == ys'
                         = ∃ as • xs ++ as == y:ys' ∨ isSubList xs ys'

The second disjunct is a straightforward recursion. The first requires another function, which we will call isPrefix. The definition of isPrefix is

    isPrefix xs ys = ∃ as • xs ++ as == ys

Assuming both lists are not empty, we can write this as

    isPrefix (x:xs') (y:ys') = ∃ as • (x:xs') ++ as == (y:ys')
                             = x == y ∧ ∃ as' • xs' ++ as' == ys'
                             = x == y ∧ isPrefix xs' ys'

It is straightforward to add the base cases to obtain the following definition:
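The equational derivation for len2 can be spot-checked by running the original and derived definitions side by side. A sketch, with primed names so that it does not clash with the Prelude:

```haskell
-- Equations (1)-(2): list append.
append' :: [a] -> [a] -> [a]
append' [] ys     = ys
append' (x:xs) ys = x : append' xs ys

-- Equations (3)-(4): list length.
length' :: [a] -> Int
length' []     = 0
length' (_:xs) = 1 + length' xs

-- Equation (5): the original, inefficient definition.
len2 :: [a] -> [a] -> Int
len2 xs ys = length' (append' xs ys)

-- The derived definition, which builds no intermediate list.
len2' :: [a] -> [a] -> Int
len2' [] ys     = length' ys
len2' (_:xs) ys = 1 + len2' xs ys
```

Both versions give len2 "abc" "de" == 5; the derivation guarantees they agree on all inputs.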

    isSubList [] _ = True
    isSubList _ [] = False
    isSubList xs (y:ys) = isPrefix xs (y:ys) || isSubList xs ys
      where
      isPrefix [] _ = True
      isPrefix _ [] = False
      isPrefix (x:xs) (y:ys) = x == y && isPrefix xs ys
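The derived definition of isSubList can be exercised directly. A sketch, restated with primed names so that it stands alone:

```haskell
-- True iff xs occurs contiguously within ys.
isSubList' :: Eq a => [a] -> [a] -> Bool
isSubList' [] _ = True
isSubList' _ [] = False
isSubList' xs (y:ys) = isPrefix' xs (y:ys) || isSubList' xs ys
  where
    isPrefix' [] _ = True
    isPrefix' _ [] = False
    isPrefix' (p:ps) (q:qs) = p == q && isPrefix' ps qs

-- isSubList' "cde" "abcdef" == True   (contiguous occurrence)
-- isSubList' "ce"  "abcdef" == False  (the elements occur, but not contiguously)
-- isSubList' ""    "abc"    == True   (the empty list is a sublist of any list)
```

Note that "sublist" here means a contiguous segment, exactly the property expressed by the predicate ∃ bs,as • bs ++ xs ++ as == ys.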

6 The Object Oriented Paradigm

A fundamental feature of our understanding of the world is that we organize our experience as a number of distinct objects (tables and chairs, bank loans and algebraic expressions, and so on).

    When we wish to solve a problem on a computer, we often need to construct within the computer a model of that aspect of the real or conceptual world to which the solution of the problem will be applied. (Hoare 1968)

Object Oriented Programming (OOP) is the currently dominant programming paradigm. For many people today, “programming” means “object oriented programming”. Nevertheless, there is a substantial portion of the software industry that has not adopted OOP, and there remains widespread misunderstanding about what OOP actually is and what, if any, benefits it has to offer.

6.1 Simula

Many programs are computer simulations of the real world or a conceptual world. Writing such programs is easier if there is a correspondence between objects in the world and components of the program.

Simula originated in the Norwegian Computing Centre in 1962, when Kristen Nygaard proposed a language for simulation to be developed by himself and Ole-Johan Dahl (1978). Key insights were developed in 1965 following experience with Simula I: the purpose of the language was to model systems; a system is a collection of interacting processes; a process can be represented during program execution by multiple procedures, each with its own Algol-style stack.

The main lessons of Simula I were: the distinction between a program text and its execution; and the fact that data and operations belong together, and that most useful programming constructs contain both.

Simula 67 was a general purpose programming language that incorporated the ideas of Simula I but put them into a more general context. The basic concept of Simula 67 was to be “classes of objects”. The major innovation was “block prefixing” (Nygaard and Dahl 1978). Prefixing emerged from the study of queues, which are central to discrete-event simulations. The queue mechanism (a list linked by pointers) can be split off from the elements of the queue (objects such as trucks, buses, people, and so on, depending on the system being simulated). Once recognized, the prefix concept could be used in any context where common features of a collection of classes could be abstracted in a prefixed block or, as we would say today, a superclass.

Dahl (1978, page 489) has provided his own analysis of the role of blocks in Simula:

- Deleting the final statement from an Algol block, leaving procedure definitions and data declarations, gives an abstract data object.
- Deleting the procedure definitions and final statement from an Algol block gives a pure data record.
- Adding coroutine constructs to Algol blocks provides quasi-parallel programming capabilities.

Dahl's analysis continues: adding a prefix mechanism to Algol blocks provides an abstraction mechanism (the class hierarchy). All of these concepts are subsumed in the Simula class. In addition, Simula provides:

- coroutines that permit the simulation of concurrent processes;
- multiple stacks, needed to support coroutines;
- classes that combine data and a collection of functions that operate on the data;
- prefixing (now known as inheritance) that allows a specialized class to be derived from a general class without unnecessary code duplication;
- a garbage collector that frees the programmer from the responsibility of deallocating storage.

Coroutines are typically implemented by adding two new statements to the language: suspend and resume. A suspend statement, executed within a function, switches control to a scheduler which stores the value of the program counter somewhere and continues the execution of another function. A resume statement has the effect of restarting a previously suspended function. At any one time, exactly one function is executing.

Listing 19: A Simula Class

    class Account (real balance);
    begin
      procedure Deposit (real amount)
        balance := balance + amount;
      procedure Withdraw (real amount)
        balance := balance - amount;
    end;

Listing 19 shows a simple Simula-style class. Note that the class is syntactically more like a procedure than a data object: it has an (optional) formal parameter and it must be called with an actual parameter. It differs from an ordinary Algol procedure in that its AR remains on the heap after the invocation. Before we use this class, we must declare a reference to it and then instantiate it on the heap:

    ref (Account) myAccount;
    myAccount := new Account(1000);

To inherit from Account we write something like:

    Account class ChequeAccount (real amount);
    ...

This explains the term “prefixing” — the declaration of the child class is prefixed by the name of its parent. Inheritance and subclasses introduced a new programming methodology.

Simula itself had an uneven history. It was used more in Europe than in North America, but it never achieved the recognition that it deserved. This was partly because there were few Simula compilers and the good compilers were expensive. On the other hand, the legacy that Simula left is considerable, although this did not become evident for several years: a new paradigm of programming.

Exercise 23. What features might a concurrent language provide that are not provided by coroutines? What additional complications are introduced by concurrency? 2

6.2 Smalltalk

Smalltalk originated with Alan Kay's reflections on the future of computers and programming in the late 60s. Kay (1996) was influenced by LISP, especially by its one-page metacircular interpreter, and by Simula. It is interesting to note that Kay gave a talk about his ideas at MIT in November 1971; the talk inspired Carl Hewitt's work on his Actor model, an early attempt at formalizing objects. In turn, Sussman and Steele wrote an interpreter for Actors that eventually became Scheme (see Section 4.2). The cross-fertilization between OOPL and FPL that occurred during these early days has, sadly, not continued.

The first version of Smalltalk was implemented by Dan Ingalls in 1972, using BASIC (!) as the implementation language (Kay 1996, page 533). Smalltalk was inspired by Simula and LISP; it was based on six principles (Kay 1996, page 534).

1. Everything is an object.
2. Objects communicate by sending and receiving messages (in terms of objects).
3. Objects have their own memory (in terms of objects).
4. Every object is an instance of a class (which must be an object).
5. The class holds the shared behaviour for its instances (in the form of objects in a program list).
6. To [evaluate] a program list, control is passed to the first object and the remainder is treated as its message.

Principles 1–3 provide an “external” view and remained stable as Smalltalk evolved. Principles 4–6 provide an “internal” view and were revised following implementation experience. Principle 6 reveals Kay's use of LISP as a model for Smalltalk: McCarthy had described LISP with a one-page meta-circular interpreter, and one of Kay's goals was to do the same for Smalltalk.

Smalltalk was also strongly influenced by Simula. However, it differs from Simula in several ways:

- Simula distinguishes primitive types, such as integer and real, from class types; in Smalltalk, “everything is an object”.
- Since objects are “active” in the sense that they have methods, Smalltalk effectively eliminates passive data: there are no data primitives.
- In particular, classes are objects in Smalltalk. To create a new instance of a class, you send a message to it. Since a class object must belong to a class, Smalltalk requires metaclasses.

Smalltalk is a complete environment, not just a compiler. You can edit, compile, execute, and debug Smalltalk programs without ever leaving the Smalltalk environment.

In the Smalltalk statement

    10 timesRepeat: [Transcript nextPutAll: ' Hi!']

the receiver is 10, the message is timesRepeat:, and [Transcript nextPutAll: ' Hi!'] is the parameter of the message.

The “block” is an interesting innovation of Smalltalk. A block is a sequence of statements that can be passed as a parameter to various control structures and behaves rather like an object. Blocks may have parameters. One technique for iteration in Smalltalk is to use a do: message, which executes a block (its parameter) for each component of a collection. The component is passed to the block as a parameter.

The following loop determines whether "Mika" occurs in the current object (assumed to be a collection):

    self do: [:item | item = "Mika" ifTrue: [^true]].
    ^false

The first occurrence of item introduces it as the name of the formal parameter for the block; the second occurrence is its use. Blocks are closely related to closures (Section 4.2): the block [:x | E] corresponds to λx . E.

The first practical version of Smalltalk was developed in 1976 at Xerox Palo Alto Research Center (PARC). The important features of Smalltalk are:

- everything is an object;
- objects collaborate by exchanging “messages”;
- every object is a member of a class;
- there is an inheritance hierarchy (actually a tree) with the class Object as its root: all classes inherit directly or indirectly from the class Object;
- blocks;
- coroutines;
- garbage collection.

Exercise 24. Discuss the following statements (Gelernter and Jagannathan 1990, page 245).

1. [Data objects are replaced by program structures.] Under this view of things, there are no passive objects to be acted upon. Every entity is an active agent, a little program in itself. This can lead to counter-intuitive interpretations of simple operations, and can be destructive to a natural sense of hierarchy, particularly under Smalltalk's approach.
2. Algol 60 gives us a representation for procedures, but no direct representation for a procedure activation. Simula and Smalltalk inherit this same limitation: they give us representations for classes, but no direct representations for the objects that are instantiated from classes. Our ability to create single objects and to specify (constant) object “values” is compromised. 2

Exercise 25. In Smalltalk, every object is a member of a class. The class is itself an object, and therefore is a member of a class. A class whose instances are classes is called a metaclass. Do these definitions imply an infinite regression? If so, how could the infinite regression be avoided? Is there a class that is a member of itself? 2

6.3 CLU

CLU is not usually considered an OOPL because it does not provide any mechanisms for incremental modification. However:

    The resulting programming methodology is object-oriented: programs are developed by thinking about the objects they manipulate and then inventing a modular structure based on these objects. Each type of object is implemented by its own program module. (Liskov 1996, page 475)

Parnas (1972) introduced the idea of “information hiding”: a program module should have a public interface and a private implementation; an object has private data and public functions. Liskov and Zilles (1974) introduced CLU specifically to support this style of programming.

An abstract data type (ADT) specifies a type, a set of operations that may be performed on instances of the type, and the effects of these operations. The implementation of an ADT provides the actual operations that have the specified effects and also prevents programmers from doing anything else. CLU is designed for programming with ADTs. The word “abstract” in CLU means “user defined”, or not built in to the language:

    I referred to the types as “abstract” because they are not provided directly by a programming language but instead must be implemented by the user. An abstract type is abstract in the same way that a procedure is an abstract operation. (Liskov 1996, page 473)

This is not a universally accepted definition. Many people use the term ADT in the following sense:

    An abstract data type is a system consisting of three constituents:
    1. some sets of objects;
    2. a set of syntactic descriptions of the primitive functions;
    3. a semantic description — that is, a sufficiently complete set of relationships that specify how the functions interact with each other. (Martin 1986)

The difference is that an ADT in CLU provides a particular implementation of the type (and, in fact, the implementation must be unique) whereas Martin's definition requires only a specification. A stack ADT in CLU would have functions called push and pop to modify the stack, but a stack ADT respecting Martin's definition would define the key property of a stack (the sequence push(); pop() has no effect) without providing an implementation of these functions.

Although Simula influenced the design of CLU, it was seen as deficient in various ways. Simula:

- does not provide encapsulation: clients could directly access the variables, as well as the functions, of a class;
- does not provide “type generators” or, as we would say now, “generic types” (“templates” in C++);
- associates operations with objects rather than types;
- handles built-in and user-defined types differently — for example, instances of user-defined classes are always heap-allocated.

The stack-only model of Algol 60 was seen as too restrictive for a language supporting data abstractions. CLU's computational model is instead similar to that of LISP: names are references to heap-allocated objects. The statement x := E means simply that x becomes a reference (implemented as a pointer, of course) to E. The reasons given for using a heap include:

- Declarations are easy to implement because, for all types, space is allocated for a pointer. Variable declaration is separated from object creation; this simplifies safe initialization.
- The meaning of assignment is independent of type. There is no need to worry about the semantics of copying an object. Copying, if needed, should be provided by a member function of the class.
- Variable and object lifetimes are not tied. When the variable goes out of scope, the object continues to exist (there may be other references to it).
- Stack allocation breaks encapsulation because the size of the object (which should be an implementation “secret”) must be known at the point of declaration.

Listing 20 shows an extract from a cluster that implements a set of integers (Liskov and Guttag 1986, pages 19–20). The components of a cluster are:

1. A header that gives the name of the type being implemented (intset) and the names of the operations provided (create, insert, ...).
2. A definition of the representation type, referred to as rep, introduced by the keyword rep. An intset is represented by an array of integers.
3. Implementations of the functions mentioned in the header and, if necessary, auxiliary functions that are not accessible to clients.

Listing 20: A CLU Cluster

    intset = cluster is create, insert, delete, member, size, choose
      rep = array[int]
      create = proc () returns (cvt)
        return (rep$new())
        end create
      insert = proc (s: intset, x: int)
        if ∼member(s, x) then rep$addh(down(s), x) end
        end insert
      member = proc (s: cvt, x: int) returns (bool)
        return (getind(s, x) ≤ rep$high(s))
        end member
      ...
      end intset

All operations on abstract data in CLU contain the type of the operator explicitly, using the syntax type$operation. The abstract type, such as intset, and the representation type, array[int], are distinct. CLU provides special functions up, which converts the representation type to the abstract type, and down, which converts the abstract type to the representation type. For example, the function insert has a parameter s of type intset which is converted to the representation type so that it can be treated as an array.

It often happens that the client provides an abstract argument that must immediately be converted to the representation type; conversely, many functions create a value of the representation type but must return a value of the abstract type. For both of these cases, CLU provides the keyword cvt as an abbreviation. In the function create above, cvt converts the new instance of rep to intset before returning, and in the function member, the formal parameter s: cvt indicates that the client will provide an intset which will be treated in the function body as a rep.

Reference semantics requires garbage collection. Although garbage collection has a run-time overhead, efficiency was not a primary goal of the CLU project. Furthermore, garbage collection eliminates memory leaks and dangling pointers, notorious sources of obscure errors.

CLU draws a distinction between mutable and immutable objects. An object is mutable if its value can change during execution; otherwise it is immutable. For example, integers are immutable but arrays are mutable (Liskov and Guttag 1986, page 63).

The mutability of an object depends on its type and, in particular, on the operations provided by the type: an object is mutable if its type has operations that change the object. In CLU, names are references, not values. After executing

    x: T := intset$create()
    y: T
    y := x

the variables x and y refer to the same object. If the object is immutable, the fact that it is shared is undetectable by the program. If it is mutable, changes made via the reference x will be visible via the reference y.

The important contributions of CLU include:

- modules (“clusters”);
- safe encapsulation;
- the distinction between the abstract type and the representation type;
- the distinction between mutable and immutable objects;
- analysis of copying and comparison of ADTs;
- generic clusters (a cluster may have parameters, notably type parameters);
- exception handling.

In Simula, three concepts — procedures, classes, and records — are effectively made equivalent, resulting in significant simplification. In CLU, procedures, records, and clusters are three different things: several keywords (rep, cvt, down, and up) are needed simply to maintain the distinction between abstract and representation types. The cost is an increase in complexity.

Whereas Simula provides tools for the programmer and supports a methodology, CLU provides the same tools and enforces the methodology. Whether this is desirable depends on your point of view. The argument in favour of CLU is that the compiler will detect encapsulation and other errors. But is it the task of a compiler to prevent the programmer doing things that might be completely safe? Or should this role be delegated to a style checker, just as C programmers use both cc (the C compiler) and lint (the C style checker)? The designers of CLU advocate defensive programming:

    Of course, checking whether inputs [i.e. parameters of functions] are in the permitted subset of the domain takes time, and it is tempting not to bother with the checks, or to use them only while debugging, and suppress them during production. This is generally an unwise practice. It is better to develop the habit of defensive programming, that is, writing each procedure to defend itself against errors. (Liskov and Guttag 1986, page 100)

Exercise 26. Do you agree that defensive programming is a better habit than the alternative suggested by Liskov and Guttag? Can you propose an alternative strategy? 2

6.4 C++

C++ was developed at Bell Labs by Bjarne Stroustrup (1994). The task assigned to Stroustrup was to develop a new systems PL that would replace C. C++ is the result of two major design decisions:
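CLU's reference semantics (y := x makes y a second reference to the same mutable object) can be mimicked in Haskell, the language used in Section 5, with an IORef. This is our own sketch, not CLU:

```haskell
import Data.IORef

-- x and y are two names for one heap-allocated, mutable object.
shareDemo :: IO [Int]
shareDemo = do
  x <- newIORef [1, 2, 3]   -- create the object; x refers to it
  let y = x                 -- "y := x" copies the reference, not the object
  modifyIORef y (0 :)       -- mutate the object through y ...
  readIORef x               -- ... and observe the change through x
```

Running shareDemo returns [0,1,2,3]: the update made via y is visible via x, exactly the sharing behaviour described above for mutable CLU objects.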

first, Stroustrup decided that the new language would be adopted only if it was compatible with C; second, Stroustrup's experience of completing a Ph.D. at Cambridge University using Simula convinced him that object orientation was the correct approach, but that efficiency was essential for acceptance.

C++:

- is almost a superset of C (that is, there are only a few C constructs that are not accepted by a C++ compiler);
- is a hybrid language (i.e. a language that supports both imperative and OO programming), not a “pure” OO language;
- emphasizes the stack rather than the heap, although both stack and heap allocation is provided;
- provides multiple inheritance;
- provides genericity (in the form of “templates”) and exception handling;
- does not provide garbage collection.

C++ is widely used but has been sharply criticized (Sakkinen 1988; Sakkinen 1992; Joyner 1992).

Exercise 27. Explain why C++ is the most widely-used OOPL despite its technical flaws. 2

6.5 Eiffel

Software Engineering was a key objective in the design of Eiffel.

    Eiffel embodies a “certain idea” of software construction: the belief that it is possible to treat this task as a serious engineering enterprise, whose goal is to yield quality software through the careful production and continuous development of parameterizable, scientifically specified, reusable components, communicating on the basis of clearly defined contracts and organized in systematic multi-criteria classifications. (Meyer 1992)

Eiffel borrows from Simula 67, Algol W, Alphard, CLU, Ada, and the specification language Z. It is notable for its strong typing. The main features of Eiffel are as follows.

- Eiffel is strictly OO: all functions are defined as methods in classes.
- Eiffel provides multiple inheritance and repeated inheritance.
- Eiffel supports “programming by contract” with assertions.
- Eiffel has an unusual exception handling mechanism.
- Eiffel may provide garbage collection.
- Eiffel provides generic classes.
- Like some other OOPLs, object names are references by default; Eiffel also provides “expanded types” whose instances are named directly.
- Input/output is defined by libraries rather than being built into the language.

Eiffel does not have: global variables; enumerations; subranges; procedure variables; casts; pointer arithmetic; or goto, break, or continue statements.

6.5.1 Programming by Contract

The problem with writing a function to compute square roots is that it is not clear what to do if the argument provided by the caller is negative. The solution adopted in Listing 21 is to allow for the possibility that the argument might be negative.

Listing 21: Finding a square root

    double sqrt (double x)
    {
      if (x ≥ 0.0)
        return √x;
      else
        ??
    }

The failure action, indicated by “??” in Listing 21, is hard to define. In particular, it is undesirable to return a funny value such as −999.999, to terminate execution, to send a message to the console (there may not be one), or to add another parameter to indicate success/failure. The only reasonable possibility is some kind of exception. The caller therefore has a responsibility to ensure that the argument is valid. This approach, applied throughout a software system, is called defensive programming (Section 6.3).

Meyer's solution is to write sqrt as shown in Listing 22.

Listing 22: A contract for square roots

    sqrt (x: real): real is
      require
        x ≥ 0.0
      -- Result := √x
      ensure
        abs (Result * Result / x - 1.0) ≤ 1e-6
      end

In words: if the actual parameter supplied, x, is positive, the result returned will be approximately √x. If the argument supplied is negative, the contract does not constrain the function in any way at all: it can return -999.999 or any other number, terminate execution, or whatever.

The require/ensure clauses constitute a contract between the function and the caller. The advantages of this approach include:

- Many tests become unnecessary, because programmers will rarely call sqrt with a negative argument.
- Programs are not cluttered with tests for conditions that are unlikely to arise.
- Testing arguments adds overhead to the program, and most of the time spent testing is wasted.
- The require conditions can be checked dynamically during testing and (perhaps) disabled during production runs. (Hoare considers this policy to be analogous to wearing your life jacket during on-shore training but leaving it behind when you go to sea.)
- Require and ensure clauses are a useful form of documentation; they appear in Eiffel interfaces.

Functions in a child class must provide a contract at least as strong as that of the corresponding function in the parent class, as shown in Listing 23. We must have Rp ⇒ Rc and Ec ⇒ Ep.
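The require/ensure idea can be expressed executably in Haskell, the language used in Section 5. This is our own sketch (the name checkedSqrt and the treatment of x == 0 are ours), not Eiffel syntax:

```haskell
-- require: x >= 0; ensure: abs (r*r/x - 1) <= 1e-6 (for x > 0).
checkedSqrt :: Double -> Double
checkedSqrt x
  | not (x >= 0.0) = error "require violated: x >= 0.0"
  | x == 0.0       = 0.0   -- avoid dividing by zero in the ensure test
  | postcondition  = r
  | otherwise      = error "ensure violated"
  where
    r = sqrt x
    postcondition = abs (r * r / x - 1.0) <= 1e-6
```

As in Eiffel, a caller that respects the require clause always receives a result satisfying the ensure clause; a caller that violates it gets an exception rather than a “funny value”.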

Typical contracts say things like: “after you have added an element to this collection, it contains one more element”. With more effort, it is possible in principle to say things like: “this structure is a heap (in the sense of heap sort)”. However, this does not seem to be done much in practice: it is sometimes difficult to write assertions that make non-trivial claims about the objects.

Listing 23 shows the general pattern of Eiffel contracts under inheritance.

Listing 23: Eiffel contracts

    class Parent feature
        f() is
            require
                Rp
            ensure
                Ep
            end
    end

    class Child inherit Parent feature
        f() is
            require
                Rc
            ensure
                Ec
            end
    end

Eiffel achieves the subcontract specification by requiring the programmer to define

    Rc ≡ Rp ∨ · · ·
    Ec ≡ Ep ∧ · · ·

The ∨ ensures that the child’s requirement is weaker than the parent’s requirement, and the ∧ ensures that the child’s commitment is stronger than the parent’s commitment. Note the directions of the implications: the ensure clauses are “covariant” but the require clauses are “contravariant”.

The following analogy explains this reversal. A pet supplier offers the following contract: “if you send at least $200, I will send you an animal”. The pet supplier has a subcontractor who offers the following contract: “if you send me at least $150, I will send you a dog”. These contracts respect the Eiffel requirements: the contract offered by the subcontractor has a weaker requirement ($150 is less than $200) but promises more (a dog is more specific than an animal). Thus the contractor can use the subcontractor when a dog is required and make a profit of $50.

In addition to require and ensure clauses, Eiffel has provision for writing class invariants. A class invariant is a predicate over the variables of an instance of the class. It must be true when the object is created and after each function of the class has executed.

Eiffel programmers are not obliged to write assertions, but it is considered good Eiffel “style” to include them. In practice, complex assertions increase run-time overhead, increasing the incentive to disable checking. Moreover, since assertions may contain function calls, a complex assertion might itself contain errors, leading to a false sense of confidence in the correctness of the program.
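Outside Eiffel, the same discipline can be imitated with ordinary assertions. The following Java sketch is my own illustration (the classes and the specific conditions are invented); it mimics a contravariant require and a covariant ensure:

```java
// Hypothetical illustration of require/ensure with Java assertions.
// Run with "java -ea" so that assert statements are checked.
class Parent {
    // require: n >= 10    ensure: result >= 0
    int f(int n) {
        assert n >= 10 : "require violated";     // precondition Rp
        int result = n - 10;
        assert result >= 0 : "ensure violated";  // postcondition Ep
        return result;
    }
}

class Child extends Parent {
    // Weaker precondition (n >= 0), stronger postcondition (result >= 10):
    // every caller relying on the parent's contract is still served.
    @Override
    int f(int n) {
        assert n >= 0 : "require violated";       // Rc = Rp or (0 <= n < 10)
        int result = n + 10;
        assert result >= 10 : "ensure violated";  // Ec = Ep and (result >= 10)
        return result;
    }
}
```

A Child can be substituted wherever a Parent is expected: it accepts every argument the Parent accepts and promises at least as much.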

6.5.2 Repeated Inheritance

Listing 24 shows a collection of classes (Meyer 1992, page 169).

Listing 24: Repeated inheritance in Eiffel

    class House feature
        address: String
        value: Money
    end

    class Residence inherit
        House rename value as residenceValue end
    end

    class Business inherit
        House rename value as businessValue end
    end

    class HomeBusiness inherit
        Residence
        Business
    end

The features of HomeBusiness include: address, inherited once but by two paths; residenceValue, inherited from value in class House; and businessValue, also inherited from value in class House.

Repeated inheritance can be used to solve various problems that appear in OOP. For example, it is difficult in some languages to provide collections with more than one iteration sequence; in Eiffel, more than one sequence can be obtained by repeated inheritance of the successor function. C++ has a similar mechanism, but it applies to entire classes (via the virtual mechanism) rather than to individual attributes of classes.

Early versions of Eiffel had a reference semantics (variable names denoted references to objects). After Eiffel had been used for a while, Meyer introduced expanded types. An instance of an expanded type is an object that has a name (value semantics). Expanded types introduce various irregularities into the language, because they do not have the same properties as ordinary types.

Exercise 28. Why do you think expanded types were introduced? Was the decision to introduce them wise?

6.5.3 Exception Handling

An Eiffel function must either fulfill its contract or report failure. If a function contains a rescue clause, this clause is invoked if an operation within the function reports failure. A rescue clause may perform some cleaning-up actions and then invoke retry to attempt the calculation again. A return from the rescue clause indicates that the function has failed. The function in Listing 25 illustrates the idea.

Listing 25: Exceptions in Eiffel

    tryIt is
    local
        triedFirst: Boolean    -- initially false
    do
        if not triedFirst
            then firstmethod
            else secondmethod
        end
    rescue
        if not triedFirst then
            triedFirst := true
            retry
        else
            restoreInvariant
            -- report failure
        end
    end

The mechanism seems harder to use than other, more conventional, exception handling mechanisms, and it is not obvious that there are many circumstances in which it makes sense to “retry” a function.
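Languages with conventional exception handling express the same pattern with an explicit loop. The following Java sketch is an invented analogue of tryIt in Listing 25:

```java
// Hand-coded analogue of Eiffel's rescue/retry (all names are illustrative).
class Calculator {
    static int firstMethod(int x) {
        if (x < 0) throw new ArithmeticException("first method failed");
        return x * 2;
    }

    static int secondMethod(int x) {
        return Math.abs(x) * 2;  // fallback that cannot fail
    }

    // Try firstMethod; if it reports failure, "retry" with secondMethod.
    static int tryIt(int x) {
        boolean triedFirst = false;
        while (true) {
            try {
                return triedFirst ? secondMethod(x) : firstMethod(x);
            } catch (ArithmeticException e) {
                if (triedFirst) throw e;  // both attempts failed: report failure
                triedFirst = true;        // corresponds to Eiffel's retry
            }
        }
    }
}
```

The loop makes the control flow explicit, which is arguably easier to follow than the implicit re-entry performed by retry.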

6.6 Java

Java (Arnold and Gosling 1998) is an OOPL introduced by Sun Microsystems. Its syntax bears some relationship to that of C++, but Java is simpler in many ways than C++. Key features of Java include the following.

- A variable name in Java is a reference to an object.
- Java has a class hierarchy with class Object at the root and provides single inheritance of classes.
- In addition to classes, Java provides interfaces with multiple inheritance.
- Primitive values, such as int, are not objects in Java. Instead, Java provides wrapper classes, such as Integer, for each primitive type.
- Java is compiled to byte codes that are interpreted.
- Java provides garbage collection.
- Java provides concurrency in the form of threads.
- Java has an exception handling mechanism.

6.6.1 Portability

Compiling to byte codes is an implementation mechanism and, as such, is not strongly relevant to this course. However, it is the basis of Java’s portability: since any computer that has a Java byte code interpreter can execute Java programs, Java is highly portable. The technique is quite old (Pascal P achieved portability in the same way during the mid-seventies) and has the disadvantage of inefficiency: Java programs are typically an order of magnitude slower than C++ programs with the same functionality. For some applications, such as simple Web “applets”, the inefficiency is not important. More efficient implementations are promised. One interesting technique is “just in time” compilation, in which program statements are compiled immediately before execution; consequently, parts of the program that are not needed during a particular run (this may be a significant fraction of a complex program) are not compiled.

The portability of Java is exploited in network programming: Java byte codes can be transmitted across a network and executed by any processor with an interpreter. Java also offers security: the byte codes are checked by the interpreter and have limited functionality. Consequently, Java byte codes do not have the potential to penetrate system security in the way that a binary executable (or even a MS-Word macro) can.
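Among the features just listed, the wrapper classes deserve a small illustration: primitives such as int are not objects, but each has an object counterpart such as Integer, and it is the wrapper that collections store. The example below is my own (modern Java converts between the two automatically):

```java
// Primitive int versus the wrapper class Integer.
class WrapperDemo {
    // Collections cannot hold primitives, so the list stores Integer objects.
    static int sum(java.util.List<Integer> xs) {
        int total = 0;            // a primitive value, not an object
        for (Integer x : xs) {
            total += x;           // unboxing: Integer -> int
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(java.util.Arrays.asList(1, 2, 3)));  // prints 6
    }
}
```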

The portability of Java has been a significant factor in the rapid spread of its popularity, particularly for Web programming.

6.6.2 Interfaces

A Java class, as usual in OOP, is a compile-time entity whose run-time instances are objects. An interface declaration is similar to a class declaration, but introduces only abstract methods and constants; thus an interface has no instances. A class may implement an interface by providing definitions for each of the methods declared in the interface. (The class may also make use of the values of the constants declared in the interface.) A Java class can inherit from at most one parent class, but it may inherit from several interfaces provided, of course, that it implements all of them. Consequently, the class hierarchy is a rooted tree but the interface hierarchy is a directed acyclic graph. Interfaces provide a way of describing and factoring the behaviour of classes. Listing 26 shows a couple of simple examples of interfaces.

Listing 26: Interfaces

    interface Comparable {
        Boolean equalTo (Comparable other);
        // Return True if ‘this’ and ‘other’ objects are equal.
    }

    interface Ordered extends Comparable {
        Boolean lessThan (Ordered other);
        // Return True if ‘this’ object precedes ‘other’ object in the ordering.
    }

The first interface ensures that instances of a class can be compared. (In fact, all Java classes inherit the function equals from class Object; it can be redefined for comparison in a particular class.) If we want to store instances of a class in an ordered set, we need a way of sorting them as well: this requirement is specified by the second interface. Class Widget is required to implement equalTo and lessThan:

    class Widget implements Ordered {
        ....
    }

6.6.3 Exception Handling

The principal components of exception handling in Java are as follows:

- There is a class hierarchy rooted at class Exception. Instances of a subclass of class Exception are used to convey information about what has gone wrong. We will call such instances “exceptions”.
- The throw statement is used to signal the fact that an exception has occurred. The argument of throw is an exception.
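Listing 26 leaves the method bodies abstract. A complete, hypothetical implementation might look like the sketch below; the bodies and the size attribute are my own inventions, and boolean replaces the listing’s Boolean:

```java
// Fleshed-out version of Listing 26. Declaring our own Comparable here
// shadows java.lang.Comparable within this (default) package.
interface Comparable {
    boolean equalTo(Comparable other);
}

interface Ordered extends Comparable {
    boolean lessThan(Ordered other);
}

class Widget implements Ordered {
    private final int size;   // invented attribute used for comparisons

    Widget(int size) { this.size = size; }

    public boolean equalTo(Comparable other) {
        return other instanceof Widget && ((Widget) other).size == size;
    }

    public boolean lessThan(Ordered other) {
        return other instanceof Widget && size < ((Widget) other).size;
    }
}
```

Because Widget implements Ordered, and Ordered extends Comparable, a single class picks up both behaviours: one interface inheriting from another is how the interface hierarchy forms a directed acyclic graph.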

- The try statement is used to attempt an operation and handle any exceptions that it raises. A try statement may use one or more catch statements to handle exceptions of various types.

The following simple example indicates the way in which exceptions might be used. Listing 27 shows the declaration of an exception for division by zero. The next step is to declare another exception for negative square roots, as shown in Listing 28. Next, we suppose that either of these problems may occur in a particular computation, as shown in Listing 29. Finally, we write some code that tries to perform the computation, as shown in Listing 30.

Listing 27: Declaring a Java exception class

    public class DivideByZero extends Exception {
        DivideByZero() {
            super("Division by zero");
        }
    }

Listing 28: Another Java exception class

    public class NegativeSquareRoot extends Exception {
        public float value;
        NegativeSquareRoot(float badValue) {
            super("Square root with negative argument");
            value = badValue;
        }
    }

Listing 29: Throwing exceptions

    public void bigCalculation()
            throws DivideByZero, NegativeSquareRoot {
        ....
        if (....) throw new DivideByZero();
        ....
        if (....) throw new NegativeSquareRoot(x);
        ....
    }

Listing 30: Using the exceptions

    {
        try {
            bigCalculation();
        }
        catch (DivideByZero e) {
            System.out.println("Oops! divided by zero.");
        }
        catch (NegativeSquareRoot n) {
            System.out.println("Argument for square root was " + n.value);
        }
    }

6.6.4 Concurrency

Java provides concurrency in the form of threads. Threads are defined by a class in the standard library. Java threads have been criticized on (at least) two grounds:

- The synchronization methods provided by Java (based on the early concepts of semaphores and monitors) are out of date and do not benefit from more recent research in this area.
- The concurrency mechanism is insecure. Brinch Hansen (1999) argues that Java’s concurrency mechanisms are unsafe and that such mechanisms must be defined in the base language and checked by the compiler, not provided separately in a library.

Exercise 29. Byte codes provide portability. Can you suggest any other advantages of using byte codes?
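For concreteness, here is a miniature, invented example of the primitives that these criticisms target: two threads share a counter, and synchronized provides monitor-style mutual exclusion.

```java
// Two threads incrementing a shared counter; "synchronized" gives each
// method monitor semantics, so the increments cannot interleave badly.
class Counter {
    private int count = 0;

    synchronized void increment() { count++; }
    synchronized int value() { return count; }

    static int runDemo() {
        Counter c = new Counter();
        Runnable work = () -> { for (int i = 0; i < 10000; i++) c.increment(); };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return c.value();  // 20000; without synchronized, often less
    }
}
```

Nothing stops a programmer from omitting synchronized: the compiler accepts the racy version silently, which is exactly Brinch Hansen’s complaint about library-based concurrency.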

6.7 Kevo

All of the OOPLs previously described in this section are class-based languages. In a class-based language, the programmer defines one or more classes and, during execution, instances of these classes are created and become the objects of the system. (In Smalltalk, classes are run-time objects, but Smalltalk is nevertheless a class-based language.)

There is another, smaller, family of OOPLs that use prototypes rather than classes. In a prototype-based OOPL, each new type of object is introduced by defining a prototype or exemplar, which is itself an object. Identical copies of a prototype (as many as needed) are created by cloning. Alternatively, we can clone a prototype, modify it in some way, and use the resulting object as a prototype for another collection of identical objects.

Taivalsaari (1993, page 172) points out that class-based OOPLs, starting with Simula, are based on an Aristotelian view in which the world is seen as a collection of objects with well-defined properties arranged according to a hierarchical taxonomy of concepts. The problem with this approach is that there are many classes which people understand intuitively but which are not easily defined in taxonomic terms: common examples include “book”, “chair”, and “game”. Prototypes provide an alternative to classes.

Although there are several prototype-based OOPLs, we discuss only one of them here. Antero Taivalsaari (1993) has given a thorough account of the motivation and design of his language, Kevo.

Example 16: Objects in Kevo. This example, based on (Taivalsaari 1993, pages 187–188), illustrates how new objects are introduced in Kevo. Listing 31 shows the introduction of a window object.

Listing 31: Object Window

    REF Window
    Object.new → Window
    Window ADDS
        VAR rect
        Rectangle.new → rect
        VAR contents
        METHOD drawFrame ... ;
        METHOD drawContents ... ;
        METHOD refresh
            self.drawFrame
            self.drawContents ;
    ENDADDS;

The first line, REF Window, declares a new object with no properties. The second line defines Window as a clone of Object; Window receives the basic printing and creation operations defined for Object. In the next block of code, properties are added to the window; in Kevo terminology, properties are added using the module operation ADDS. The window has two components: rect and contents. The properties of rect are obtained from the prototype Rectangle; the properties of contents are left unspecified in this example. Finally, three new methods are added to Window. The first two are left undefined here, but the third, refresh, draws the window frame first and then the contents.

Listing 32 shows how to add a title to a window. It declares a new object, TitleWindow, with Window as a prototype. Then it adds a new variable, title, with initial value "Anonymous", and provides a new method drawFrame that overrides Window.drawFrame and draws a frame with a title.

Listing 32: Object Title Window

    REF TitleWindow
    Window.new → TitleWindow
    TitleWindow ADDS
        VAR title
        "Anonymous" → title
        METHOD drawFrame ... ;
    ENDADDS;

The modified object need not be self-contained. Some of its methods will probably be the same as those of the original object, and these methods can be delegated to the original object rather than replicated in the new object. Delegation thus plays the role of inheritance in a prototype-based language and, for this reason, prototype-based languages are sometimes called delegation languages. This is misleading, however, because delegation is not the only possible mechanism for inheritance with prototypes.
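The flavour of prototypes and delegation can be simulated even in a class-based language. The sketch below is a loose, hypothetical Java model of Listings 31 and 32 (not real Kevo): an object is a map of named slots, and lookup is delegated to the prototype when a slot is missing.

```java
// Toy model of prototype objects with delegation.
class Proto {
    private final java.util.Map<String, Object> slots = new java.util.HashMap<>();
    private final Proto parent;   // the prototype this object was cloned from

    Proto(Proto parent) { this.parent = parent; }

    // Corresponds loosely to Kevo's ADDS: attach a property to this object.
    void add(String name, Object value) { slots.put(name, value); }

    // Slot lookup: local slots first, then delegate along the prototype chain.
    Object get(String name) {
        if (slots.containsKey(name)) return slots.get(name);
        if (parent != null) return parent.get(name);
        throw new RuntimeException("unknown slot: " + name);
    }

    static Object demo(String slot) {
        Proto window = new Proto(null);
        window.add("refresh", "draw frame, then contents");
        Proto titleWindow = new Proto(window);   // window is the prototype
        titleWindow.add("title", "Anonymous");
        // "title" is found locally; "refresh" is delegated to window.
        return titleWindow.get(slot);
    }
}
```

A real prototype language also supports cloning and method slots; here delegation alone carries the weight of inheritance, as described above.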

6.8 Other OOPLs

The OOPLs listed below are interesting, but lack of time precludes further discussion of them.

Beta. Beta is a descendant of Simula, developed in Denmark.

Blue. Blue (Kölling 1999) is a language and environment designed for teaching OOP. The environment is interesting in that users work directly with “objects” rather than program text and generated output. The language was influenced to some extent by Dee (Grogono 1991b).

CLOS. The Common LISP Object System is an extension to LISP that provides OOP. It has a number of interesting features, including “multi-methods”: the choice of a function is based on the class of all of its arguments rather than its first (possibly implicit) argument. CLOS also provides “before” and “after” methods: code defined in a superclass that is executed before or after the invocation of the function in a class.

Self. Self is perhaps the best-known prototype language. Considerable effort has been put into efficient implementation, to the extent that Self programs run at up to half of the speed of equivalent C++ programs. This is an impressive achievement, given that Self is considerably more flexible than C++.

6.9 Issues in OOP

OOP is the programming paradigm that currently dominates industrial software development. In this section, we discuss various issues that arise in OOP.

6.9.1 Algorithms + Data Structures = Programs

The distinction between code and data was made at an early stage: it is visible in FORTRAN and reinforced in Algol 60. Algol provided the recursive block structure for code but almost no facilities for data structuring. Algol 68 and Pascal provided recursive data structures but maintained the separation between code and data.

LISP brought algorithms (in the form of functions) and data structures together, but the mutability of data was downplayed, and subsequent FPLs reverted to the Algol model, separating code and data.

Simula started the move back towards the von Neumann architecture, but at a higher level of abstraction than machine code: the computer was recursively divided into smaller computers, each with its own code and stack. This was the feature of Simula that attracted Alan Kay and led to Smalltalk (Kay 1996). Many OOPLs since Smalltalk have withdrawn from its radical ideas. C++ was originally called “C with classes” (Stroustrup 1994), and modern C++ remains an Algol-like systems programming language that supports OOP. Ada has shown even more reluctance to move in the direction of OOP, although Ada 9X has a number of OO features.

6.9.2 Values and Objects

Some data behave like mathematical objects and other data behave like representations of physical objects (MacLennan 1983).

Consider the sets { 1, 2, 3 } and { 2, 3, 1 }. Although these sets have different representations (the ordering of the members is different), they have the same value, because in set theory the only important property of a set is the members that it contains: “two sets are equal if and only if they contain the same members”.

Consider two objects representing people:

    [name="Peter", age=55]
    [name="Peter", age=55]

Although these objects have the same value, they might refer to different people who have the same name and age. The consequences of assuming that these two objects denote the same person could be serious: in fact, innocent people have been arrested and jailed because they had the misfortune to have the same name and social security number as a criminal.

In the following comparison of values and objects, each statement about values is balanced by a corresponding statement about objects.

Values are abstract. The value 7 is the common property shared by all sets with cardinality 7. Objects are concrete: an object is a particular collection of subobjects and values.

Values are immutable (that is, they do not change); objects are usually, but not necessarily, mutable (they may change). It does not make sense to change a value: if we change the value 3, it is no longer 3. It is meaningful, and often useful, to change the value of an object: for example, we can increment a counter.

Values need representations: we need representations in order to compute with values. The representations 7, VII, sept, sju, and a particular combination of bits in the memory of a computer are different representations of the same abstract value. Objects also need representations: an object has subobjects and values as attributes. A counter, for example, has one attribute, the value of the count; a particular counter might have 6 as its current value.

It does not make sense to talk about the identity of a value. Suppose we evaluate 2 + 3, obtaining 5, and then 7 − 2, again obtaining 5: is it the same 5 that we obtain from the second calculation? Identity, however, is an important attribute of objects. Even objects with the same attributes may have their own unique identities: consider identical twins.
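Java makes the contrast directly visible: == compares object identity, while equals can be defined to compare values. The Person class below is an invented illustration of the two objects discussed above.

```java
// Value equality (equals) versus object identity (==).
class Person {
    final String name;
    final int age;

    Person(String name, int age) { this.name = name; this.age = age; }

    @Override public boolean equals(Object o) {
        return o instanceof Person
            && ((Person) o).name.equals(name)
            && ((Person) o).age == age;
    }

    @Override public int hashCode() { return name.hashCode() * 31 + age; }

    public static void main(String[] args) {
        Person p1 = new Person("Peter", 55);
        Person p2 = new Person("Peter", 55);
        System.out.println(p1.equals(p2));  // true: same value (attributes)
        System.out.println(p1 == p2);       // false: two distinct identities
    }
}
```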

We can copy a representation of a value, but copying the value itself is meaningless. Copying an object, on the other hand, certainly makes sense; it may even be useful, as when we copy objects from memory to disk. Sometimes, however, multiple copies of an object with a supposedly unique identity can cause the problems known as “integrity violations” in the database community.

We consider different representations of the same value to be equal. An object is equal to itself; whether two distinct objects with equal attributes are equal depends on the application. A person who is arrested because his name and social insurance number are the same as those of a wanted criminal may not feel “equal to” the criminal.

It is immaterial to the programmer, and should not even be detectable, whether two instances of a value are stored in the same place or in different places. All references to a particular object, however, should refer to the same object. This is sometimes called “aliasing” and is viewed as harmful, but it is in fact desirable for objects with identity. Aliasing is not a problem for values.

Small values can be held in machine words; large values are best represented by references (pointers to the data structure representing the value). Very small objects may be held in machine words, but most objects should be represented by references (pointers to the data structure holding the object).

There is no need to copy values, because they do not change. Objects should not normally be copied either, because copying endangers their integrity. The assignment operator, both for values and for objects, should therefore be a reference assignment.

Programs that use values only have the property of referential transparency. Programs that provide objects do not provide referential transparency, but may have other desirable properties such as encapsulation.

The distinction between values and objects is not as clear as these comparisons suggest. When we look at programming problems in more detail, doubts begin to arise. For example, strings should behave like values but, since they have unbounded size, many PLs treat them as objects.

Exercise 30. Should sets be values? What about sets with mutable components? Does the model handle objects with many-to-one abstraction functions? What should a PL do about values and objects?

Conclusions: the CM must provide values and may provide objects. Functional and logical PLs usually provide values; object oriented PLs should provide objects. What is “small” and what is “large”? This is an implementation issue: the compiler must decide. What is “primitive” and what is “user defined”? The PL should minimize the distinction as much as possible.
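Reference assignment and aliasing, as recommended above for objects, can be seen in a few lines of Java (the Cell class is invented):

```java
// Aliasing: two names for one mutable object.
class Cell {
    int value = 0;

    static int[] demo() {
        Cell a = new Cell();
        Cell b = a;           // reference assignment: b is an alias for a
        b.value = 7;          // mutation through one name is seen through both
        Cell c = new Cell();  // a distinct object with equal attributes
        return new int[] { a.value, c.value };  // { 7, 0 }
    }
}
```

After the assignment, a and b denote the same object, so the update through b is visible through a; c, although it started with the same attributes, is untouched. This is exactly the behaviour that is desirable for objects with identity and meaningless for values.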

Copying and comparing are semantic issues. Compilers can provide default operations and “building bricks”, but they cannot provide fully appropriate implementations.

6.9.3 Classes versus Prototypes

The most widely-used OOPLs (C++, Smalltalk, and Eiffel) are class-based. Prototype languages are a distinct but active minority. The relation between classes and prototypes is somewhat analogous to the relation between procedural (FORTRAN and Algol) and functional (LISP) languages in the 60s. Classes provide a visible program structure that seems to be reassuring to designers of complex software systems. Prototypes provide more flexibility and are better suited to interactive environments than classes. Although large systems have been built using prototype languages, these languages seem to be more often used for exploratory programming.

6.9.4 Types versus Objects

In most class-based PLs, functions are associated with objects. This has the effect of making binary operations asymmetric. For example, x = y is interpreted as x.equal(y) or, approximately, “send the message equal with argument y to the object x”. It is not clear whether, with this interpretation, y = x means the same as x = y. The problem is complicated further by inheritance: if class C is a subclass of class P, can we compare an instance of C to an instance of P, and is this comparison commutative?

Some of the problems can be avoided by associating functions with types rather than with objects. An expression such as x = y is then interpreted in the usual mathematical sense as equal(x, y). This is the approach taken by CLU and some of the “functional” OOPLs. However, even this approach has problems if x and y do not have the same type.

6.9.5 Pure versus Hybrid Languages

Smalltalk, Eiffel, and Java are pure OOPLs: they can describe only systems of objects, and programmers are forced to express everything in terms of objects. C++ is a hybrid PL in that it supports both procedural and object styles. In fact, its precursor, “C with classes”, did not even claim to be object oriented. Once the decision was made to extend C rather than starting afresh (Stroustrup 1994, page 43), it was inevitable that C++ would be a hybrid language.

A hybrid language can be dangerous for inexperienced programmers because it encourages a hybrid programming style: OO design is performed perfunctorily, and design problems are solved by reverting to the more general, but lower level, procedural style. (This is analogous to the early days of “high level” programming, when programmers would revert to assembler language to express constructs that were not easy or possible to express at a high level.) For experienced programmers, and especially those trained in OO techniques, a hybrid language is perhaps better than a pure OOPL: these programmers will use OO techniques most of the time and will resort to procedural techniques only when they are clearly superior to the best OO solution. As OOPLs evolve, such occasions should become rarer.

Perhaps “pure or hybrid” is the wrong question. A better question might be: suppose a PL provides OOP capabilities; does it need to provide anything else? If so, what? One answer is: FP capabilities.

6.9.6 Closures versus Classes

The most powerful programming paradigms developed so far are those that do not separate code and data but instead provide “code and data” units of arbitrary size. This can be accomplished by high order functions and mutable state, as in Scheme (Section 4.2), or by classes. Although each method has advantages for certain applications, the class approach seems to be better in most practical situations. The multi-paradigm language LEDA (Budd 1995) demonstrates that it is possible to provide functional, logic, and object programming in a single language, although LEDA provides these in a rather ad hoc way. A more tightly-integrated design would be preferable, but is harder to accomplish.

6.9.7 Inheritance

Inheritance is one of the key features of OOP and at the same time one of the most troublesome. Conceptually, inheritance can be defined in a number of ways and has a number of different uses. In practice, most (but not all!) OOPLs provide a single mechanism for inheritance and leave it to the programmer to decide how to use the mechanism. Moreover, at the implementation level, the varieties of inheritance all look much the same, a fact that discourages PL designers from providing multiple mechanisms.

One of the clearest examples of the division is Meyer’s (1988) “marriage of convenience”, in which the class ArrayStack inherits from both Array and Stack. The class ArrayStack behaves like a stack (it has functions such as push, pop, and so on) and is represented as an array. There are other ways of obtaining this effect (notably, the class ArrayStack could have an array object as an attribute), but Meyer maintains that inheritance is the most appropriate technique, at least in Eiffel.

Whether or not we agree with Meyer, it is clear that two different relations are involved: the relation ArrayStack ∼ Stack is not the same as the relation ArrayStack ∼ Array. Hence the question: should an OOPL distinguish between these relations or not? Based on the “marriage of convenience”, the answer would appear to be “yes”: two distinct ends are achieved, and the means by which they are achieved should be separated. A closer examination, however, shows that the answer is by no means so clear-cut. There are many practical programming situations in which code in a parent class can be reused in a child class even when interface consistency is the primary purpose of inheritance.

6.10 A Critique of C++

C++ is both the most widely-used OOPL and the most heavily criticized, so further criticism might appear to be unnecessary. However, the purpose of this section is to show how C++ fails to fulfill its role as the preferred OOPL. The first part of the following discussion is based on a paper by LaLonde and Pugh (1995), in which they discuss C++ from the point of view of a programmer familiar with the principles of OOP, exemplified by a language such as Smalltalk, but who is not familiar with the details of C++.
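Returning to the ArrayStack of Section 6.9.7: in Java-like languages the usual choice is the alternative Meyer rejects, namely making the array an attribute and inheriting only the Stack interface. A hypothetical sketch:

```java
// Composition instead of Meyer's "marriage of convenience":
// ArrayStack *has* an array; only the Stack behaviour is inherited.
interface Stack {
    void push(int x);
    int pop();
    boolean isEmpty();
}

class ArrayStack implements Stack {
    private int[] items = new int[8];   // the representation, kept hidden
    private int top = 0;

    public void push(int x) {
        if (top == items.length) {
            items = java.util.Arrays.copyOf(items, 2 * items.length);
        }
        items[top++] = x;
    }

    public int pop() {
        if (isEmpty()) throw new RuntimeException("empty stack");
        return items[--top];
    }

    public boolean isEmpty() { return top == 0; }

    static int[] demo() {
        ArrayStack s = new ArrayStack();
        s.push(1); s.push(2); s.push(3);
        return new int[] { s.pop(), s.pop(), s.pop() };  // LIFO order
    }
}
```

The ArrayStack ∼ Stack relation is expressed by implements; the ArrayStack ∼ Array relation is a private implementation detail, invisible to clients. This separates the two relations that the marriage of convenience deliberately merges.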

We begin by writing a specification for a simple BankAccount class with an owner name, a balance, and functions for making deposits and withdrawals: see Listing 33. Data members are prefixed with “p” to distinguish them from parameters with similar names. The instance variable pTrans is a pointer to an array containing the transactions that have taken place for this account; functions that operate on this array have been omitted.

Listing 33: Declaring a Bank Account

    class BankAccount {
    public:
        BankAccount(char *newOwner = "");
        ~BankAccount();
        void owner(char *newOwner);
        char *owner();
        void credit(double amount);
        void debit(double amount);
        void transfer(BankAccount *other, double amount);
        double balance();
    private:
        char *pOwner;
        double pBalance;
        Transaction *pTrans[MAXTRANS];
    };

We can create instances of BankAccount on either the stack or the heap:

    BankAccount account1("John");                     // on the stack
    BankAccount *account2 = new BankAccount("Jane");  // on the heap

The first account, account1, will be deleted automatically at the end of the scope containing the declaration. The second account, account2, will be deleted only when the program executes a delete operation. In both cases, the destructor will be invoked. The notation for accessing functions depends on whether the variable refers to the object directly or is a pointer to the object:

    account1.balance()
    account2->balance()

A member function can access the private parts of another instance of the same class, as shown in Listing 34. This comes as a surprise to Smalltalk programmers because, in Smalltalk, an object can access only its own components.

Listing 34: Accessing Private Parts

    void BankAccount::transfer(BankAccount *other, double amount) {
        pBalance += amount;
        other->pBalance -= amount;
    }

The next step is to define the member functions of the class BankAccount: see Listing 35.

Listing 35: Member Functions for the Bank Account

    BankAccount::BankAccount(char *newOwner) {
        pOwner = newOwner;
        pBalance = 0.0;
    }

    BankAccount::~BankAccount() {
        // Delete individual transactions
        ....
        // Delete the array of transactions
        delete [] pTrans;
    }

    void BankAccount::owner(char *newOwner) {
        pOwner = newOwner;
    }

    char *BankAccount::owner() {
        return pOwner;
    }

    // Account operations omitted.

Listing 36 shows a simple test of the class BankAccount.

Listing 36: Testing the Bank Account

    void test() {
        BankAccount account1("Anne");
        BankAccount account2("Bill");
        // Account operations omitted.
        account1 = account2;
    }

This test introduces two bugs. At the end of the test, the destructors for both accounts are invoked. The array account1.pTrans is lost, because it is over-written by account2.pTrans in the assignment. The array account2.pTrans will be deleted twice: once when the destructor for account1 is called, and again when the destructor for account2 is called.

Although most introductions to C++ focus on stack objects, it is more interesting to consider heap objects: Listing 37.

Listing 37: Using the Heap

    BankAccount **accounts = new BankAccount * [3];
    accounts[0] = new BankAccount("Anne");
    accounts[1] = new BankAccount("Bill");
    accounts[2] = new BankAccount("Chris");
    // Account operations omitted.
    // Delete everything.
    for (int i = 0; i < 3; i++)
        delete accounts[i];
    delete [] accounts;

The code in Listing 37, which creates three specific accounts, is rather specialized. Listing 38 generalizes it so that we can enter the account owners’ names interactively.

Listing 38: Entering a Name

    BankAccount **accounts = new BankAccount * [3];
    char *buffer = new char[100];
    for (int i = 0; i < 3; i++) {
        cout << "Please enter your name: ";
        cin.getline(buffer, 100);
        accounts[i] = new BankAccount(buffer);
    }
    // Account operations omitted.
    // Delete everything.
    for (int i = 0; i < 3; i++)
        delete accounts[i];
    delete [] accounts;
    delete [] buffer;

The problem with this code is that all of the bank accounts have the same owner: that of the last name entered. This is because each account contains a pointer to the unique buffer. We can fix this by allocating a new buffer for each account, as in Listing 39.

Listing 39: Adding buffers for names

    BankAccount **accounts = new BankAccount * [3];
    for (int i = 0; i < 3; i++) {
        cout << "Please enter your name: ";
        char *buffer = new char[100];
        cin.getline(buffer, 100);
        accounts[i] = new BankAccount(buffer);
    }
    // Account operations omitted.
    // Delete everything.
    for (int i = 0; i < 3; i++) {
        delete [] accounts[i]->owner();
        delete accounts[i];
    }
    delete [] accounts;

This version works, but is poorly designed: deleting the owner string should be the responsibility of the object, not the client. The rule we must follow is that data must be allocated by the constructor and deallocated by the destructor, or controlled entirely outside the object.
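The shared-buffer bug is not peculiar to C++. In Java, where every object variable is a reference, the same mistake looks like this (the classes below are invented analogues, not the C++ code above):

```java
// Java analogue of Listing 38's bug: one char[] aliased by every account.
class Account {
    private final char[] owner;

    Account(char[] owner) { this.owner = owner; }  // buggy: keeps the alias
    // Fix: this.owner = owner.clone();  (a fresh buffer, as in Listing 39)

    String owner() { return new String(owner).trim(); }
}

class AliasBugDemo {
    static String[] demo() {
        char[] buffer = new char[8];
        java.util.Arrays.fill(buffer, ' ');

        "Anne".getChars(0, 4, buffer, 0);
        Account first = new Account(buffer);

        java.util.Arrays.fill(buffer, ' ');
        "Bill".getChars(0, 4, buffer, 0);   // overwrites Anne's name too
        Account second = new Account(buffer);

        // Both accounts report the last name entered.
        return new String[] { first.owner(), second.owner() };
    }
}
```

Garbage collection removes the double-delete half of the problem, but the aliasing half survives intact: sharing one mutable buffer gives every account the last name entered.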

Listing 40: Correcting the Destructor

    BankAccount::~BankAccount()
    {
        delete [] pOwner;
        // Delete individual transactions ...
        delete [] pTrans;    // Delete the array of transactions
    }

Listing 41: Correcting the Constructor

    BankAccount::BankAccount(char *newOwner)
    {
        pOwner = new char[strlen(newOwner) + 1];
        strcpy(pOwner, newOwner);
        pBalance = 0.0;
    }

Listing 41 shows the new constructor. The next problem occurs when we try to use the overloaded function owner() to change the name of an account owner:

    accounts[2]->owner("Fred");

Once again, we have introduced two bugs: the original owner string of accounts[2] will be lost, and, when the destructor is invoked for accounts[2], it will attempt to delete "Fred", which was not allocated by new. To correct this problem, we must rewrite the member function owner(), as shown in Listing 42.

Listing 42: Correcting the Owner Function

    void BankAccount::owner(char *newOwner)
    {
        delete [] pOwner;
        pOwner = new char[strlen(newOwner) + 1];
        strcpy(pOwner, newOwner);
    }

We now have bank accounts that work quite well, provided that they are allocated on the heap. If we try to mix heap objects and stack objects, we again encounter difficulties, as in Listing 43.

Listing 43: Stack and Heap Allocation

    BankAccount *account1 = new BankAccount("Anne");
    BankAccount *account2 = new BankAccount("Bill");
    BankAccount account3("Chris");
    account1->transfer(account2, 10.0);   // OK
    account1->transfer(account3, 10.0);   // Error!

The error occurs because transfer expects the address of an account rather than an actual account. The client could write

    account1->transfer(&account3, 10.0);

but we could make things easier for the programmer who is using accounts by declaring an overloaded version of transfer(): we correct the problem by introducing the reference operator, &, as shown in Listing 44.

Listing 44: Revising the Transfer Function

    void BankAccount::transfer(BankAccount &other, double amount)
    {
        pBalance += amount;
        other.pBalance -= amount;
    }

Note that, in the version of Listing 44, we must use the operator "." rather than "->". This is because other behaves like an object, although in fact an address has been passed. The underlying problem here is that references (&) work best with stack objects and pointers (*) work best with heap objects. It is awkward to mix stack and heap allocation in C++ because two sets of functions are needed.

If the objects are small — complex numbers, for example — it is reasonable to store the objects on the stack and to use call by value. C++ is well-suited to this approach: storage management is automatic and the only overhead is the copying of values during function call and return. If the objects are large, however, there are problems. Consider Listing 45.

Listing 45: Copying

    BankAccount test(BankAccount account)
    {
        return account;
    }

    BankAccount myAccount;
    myAccount = test(account2);

To implement the code shown in Listing 45, C++ uses the (default) copy constructor for BankAccount twice: first to make a copy of account2 to pass to test() and, second, to make another copy of account2 that will become myAccount. At the end of the scope, these two copies must be deleted. The destructors will fail because the default copy constructor makes a shallow copy — it does not copy the pTrans fields of the accounts. The solution is to write our own copy constructor and assignment overload for class BankAccount, as in Listing 46. The function pDuplicate() makes a deep copy of an account: it must make copies of all of the components of a bank account object. The function pDelete() has the same effect as the destructor: it destroys all components of the old object. It is recommended practice in C++ to write these functions for any class that contains pointers.

LaLonde and Pugh continue with a discussion of infix operators for user-defined classes. Suppose that we want to provide the standard algebraic operations (+, −, ∗) for 4 × 4 matrices. (This would be useful for graphics programming.) Since a matrix occupies 4 × 4 × 8 = 128 bytes, copying a matrix is relatively expensive and the stack approach will be inefficient. But if we write functions that work with pointers to matrices, it is difficult to manage allocation and deallocation. Even a simple function call such as project(A + B) will create a heap object corresponding to A + B but provides no opportunity to delete this object.
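To make the rule concrete, here is a minimal sketch of a BankAccount that follows the discipline described above: memory is acquired in the constructor and released in the destructor, and the copy constructor and assignment operator make deep copies. This is our own illustrative code, not code from the notes; it inlines the work of pDuplicate() and pDelete(), and the transaction array is omitted so that only the owner string has to be managed.

```cpp
// Illustrative sketch (assumption: only the owner string is owned memory).
#include <cstring>

class BankAccount {
public:
    BankAccount(const char *newOwner) : pBalance(0.0) {
        pOwner = new char[std::strlen(newOwner) + 1];
        std::strcpy(pOwner, newOwner);
    }
    // Copy constructor: deep-copies the owner string.
    BankAccount(const BankAccount &other) : pBalance(other.pBalance) {
        pOwner = new char[std::strlen(other.pOwner) + 1];
        std::strcpy(pOwner, other.pOwner);
    }
    // Assignment: guard against self-assignment, free, then deep-copy.
    BankAccount &operator=(const BankAccount &rhs) {
        if (this == &rhs) return *this;
        delete [] pOwner;
        pOwner = new char[std::strlen(rhs.pOwner) + 1];
        std::strcpy(pOwner, rhs.pOwner);
        pBalance = rhs.pBalance;
        return *this;
    }
    ~BankAccount() { delete [] pOwner; }
    const char *owner() const { return pOwner; }
    double balance() const { return pBalance; }
    void transfer(BankAccount &other, double amount) {
        pBalance += amount;
        other.pBalance -= amount;
    }
private:
    char *pOwner;
    double pBalance;
};
```

Because copies now own distinct buffers, the double-deletion failures described above cannot occur.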

Listing 46: Copy Constructors

    BankAccount::BankAccount(BankAccount &account)
    {
        pDuplicate(account);
    }

    BankAccount & BankAccount::operator = (BankAccount &rhs)
    {
        if (this == &rhs)
            return *this;
        pDelete();
        pDuplicate(rhs);
        return *this;
    }

The same problem arises with expressions such as A = B + C * D. (The function cannot delete its argument, because it cannot tell whether the argument is a temporary.) It is possible to implement the extended assignment operators, such as +=, without losing objects, as shown in Listing 47. But this approach reduces us to a kind of assembly language style of programming.

Listing 47: Using extended assignment operators

    A = C;
    A += B;
    A *= D;

These observations by LaLonde and Pugh seem rather hostile towards C++. They are making a comparison between C++ and Smalltalk based only on ease of programming. The comparison is one-sided because they do not discuss factors that might make C++ a better choice than Smalltalk in some circumstances, such as when efficiency is important. Nevertheless, the discussion makes an important point: the "stack model" (with value semantics) is inferior to the "heap model" (with reference semantics) for many OOP applications. It is significant that most language designers have chosen the heap model for OOP (Simula, Smalltalk, CLU, Eiffel, Sather, Dee, Java, Blue, amongst others) and that C++ is in the minority on this issue.

A sign of good language design is that the features of the language are used most of the time to solve the problems that they were intended to solve. A sign of a hacker's language is that language "gurus" introduce "hacks" that solve problems by using features in bizarre ways for which they were not originally intended. Another basis for criticizing C++ is that it has become, perhaps not intentionally, a hacker's language.

Input and output in C++ is based on "streams". A typical output statement looks something like this:

    cout << "The answer is " << answer << '.' << endl;

The operator << requires a stream reference as its left operand and an object as its right operand. Since it returns a stream reference (usually its left operand), it can be used as an infix operator in compound expressions such as the one shown. The notation seems clear and concise and, at first glance, appears to be an elegant way of introducing output and input (with the operator >>) capabilities.
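The inserter convention just described can be sketched concretely. The example below is ours, not from the notes, and the Complex type is invented for illustration: operator<< takes a stream reference on the left and an object on the right, and returns the stream so that calls can be chained.

```cpp
// Illustrative sketch of a user-defined inserter (our example type).
#include <ostream>
#include <sstream>

struct Complex { double re, im; };   // invented example type

std::ostream &operator<<(std::ostream &os, const Complex &c) {
    os << "(" << c.re << ", " << c.im << ")";
    return os;   // returning the stream reference is what enables chaining
}
```

With this definition, an expression such as `out << "z = " << Complex{1, 2} << '.'` associates left to right, each call returning the same stream.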

In the example above, the first three right operands are a string, a double, and a character. The final operand, endl, writes end-of-line to the stream and flushes it. It is a global function with signature

    ios & endl (ios & s)

A programmer who needs additional "manipulators" of this kind has no alternative but to add them to the global name space.

The first problem with streams is that, although they are superficially object oriented (a stream is an object), they do not work well with inheritance. Overloading provides static polymorphism, which is incompatible with dynamic polymorphism (recall Section 9.6). It is possible to write code that uses dynamic binding, but only by providing auxiliary functions. Obviously, << is heavily overloaded: there must be a separate function for each class for which instances can be used as the right operand. (<< is also overloaded with respect to its left operand, because there are different kinds of output streams.)

Things get more complicated when formatting is added to streams. For example, we could change the example above to:

    cout << "The answer is " << setprecision(8) << answer << '.' << endl;

The first problem with this is that it is verbose. (The verbosity is not really evident in such a small example, but it is very obvious when streams are used, for example, to format a table of heterogeneous data.) The second, and more serious, problem is the implementation of setprecision. The way this is done is (roughly) as follows (Teale 1993, page 171). First, a class for manipulators with integer arguments is declared, as in Listing 48, and the inserter function is defined:

    ostream & operator << (ostream & os, const ManipulatorForInts & m)
    {
        (os.*m.memberfunc)(m.value);
        return os;
    }

Second, the function that we need can be defined:

    ManipulatorForInts setprecision (int n)
    {
        return ManipulatorForInts (& ios::precision, n);
    }

Listing 48: A Manipulator Class

    class ManipulatorForInts
    {
        friend ostream & operator << (ostream &, const ManipulatorForInts &);
    public:
        ManipulatorForInts (int (ios::*)(int), int);
    private:
        int (ios::*memberfunc)(int);
        int value;
    };

It is apparent that a programmer must be quite proficient in C++ to understand these definitions, let alone design similar functions. To make matters worse, the actual definitions are not as shown here, but are implemented using templates.
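A manipulator without arguments is simpler than setprecision: it is just a global function taking and returning a stream, which the standard library applies via a dedicated overload of <<. The sketch below is ours (the manipulator name `tab` is invented), illustrating exactly the kind of function that, as noted above, must live in the global name space.

```cpp
// Illustrative sketch of an endl-style manipulator (our invented example).
#include <ostream>
#include <sstream>

std::ostream &tab(std::ostream &s) {
    return s.put('\t');   // insert a tab character and return the stream
}
```

The call `out << 1 << tab << 2` works because the standard library defines an overload of << whose right operand is a function pointer of this shape; it simply applies the function to the stream.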

There are many other tricks in the implementation of streams. For example, we can use file handles as if they had boolean values, as in Listing 49.

Listing 49: File handles as Booleans

    ifstream infile ("stats.dat");
    if (infile) . . .
    if (!infile) . . .

It seems that there is a boolean value, infile, that we can negate, obtaining !infile. In fact, infile can be used as a conditional because there is a function that converts infile to (void *), with the value NULL indicating that the file is not open; and !infile can be used as a conditional because the operator ! is overloaded for class ifstream to return an int which is 0 if the file is open.

Programs that look simple may actually conceal subtle traps. Most experienced programmers can get used to the well-known idiosyncrasies of C++, but few realize that frequently-used and apparently harmless constructs are implemented by subtle and devious tricks. These are symptoms of an even deeper malaise in C++ than the problems pointed out by authors of otherwise excellent critiques such as (Joyner 1992; Sakkinen 1992).

6.11 Evaluation of OOP

OOP provides a way of building programs by incremental modification. Programs can often be extended by adding new code rather than altering existing code. The mechanism for incremental modification without altering existing code is inheritance. However, there are several different ways in which inheritance can be defined and used, and some of them conflict.

Class-based OOPLs can be designed so as to support information hiding (encapsulation) — the separation of interface and implementation. However, there may be conflicts between inheritance and encapsulation.

The same identifier can be used in various contexts to name related functions; for example, all objects can respond to the message "print yourself". An interesting, but rarely noted, feature of OOP is that the call graph is created and modified during execution. This is in contrast to procedural languages, in which the call graph can be statically inferred from the program text.

Section 6.9 contains a more detailed review of issues in OOP.

Exercise 31. Simula is 33 years old. Can you account for the slow acceptance of OOPLs by professional programmers?

7 Backtracking Languages

The PLs that we discuss in this section differ in significant ways, but they have an important feature in common: they are designed to solve problems with multiple answers, and to find as many answers as required.

In mathematics, a many-valued expression can be represented as a relation or as a function returning a set. Consider the relation "divides", denoted by | in number theory. Any integer greater than 1 has at least two divisors (itself and 1) and possibly others. For example, 1 | 6, 2 | 6, 3 | 6, and 6 | 6. The functional analog of the divides relation is the set-returning function divisors:

    divisors(6)  = { 1, 2, 3, 6 }
    divisors(7)  = { 1, 7 }
    divisors(24) = { 1, 2, 3, 4, 6, 8, 12, 24 }

We could define functions that return sets in a FPL, or we could design a language that works directly with relations. The approach that has been most successful, however, is to produce solutions one at a time. The program runs until it finds the first solution, reports that solution or uses it in some way, then backtracks to find another solution, and so on until all the choices have been explored or the user has had enough.

7.1 Prolog

The computational model (see Section 10.2) of a logic PL is some form of mathematical logic. Propositional calculus is too weak because it does not have variables. Predicate calculus is too strong because it is undecidable. (This means that there is no effective procedure that determines whether a given predicate is true.) Practical logic PLs are based on a restricted form of predicate calculus that has a decision procedure.

The first logic PL was Prolog, introduced by Colmerauer (1973) and Kowalski (1974). The discussion in this section is based on (Bratko 1990).

We can express facts in Prolog as predicates with constant arguments. All Prolog textbooks start with facts about someone's family: see Listing 50. We read parent(pam, bob) as "Pam is the parent of Bob", and similarly for the other facts. Thus Bob's parents are Pam and Tom. Note that, since constants in Prolog start with a lower case letter, we cannot write Pam, Bob, etc.

Listing 50: A Family

    parent(pam, bob).
    parent(tom, bob).
    parent(tom, liz).
    parent(bob, ann).
    parent(bob, pat).
    parent(pat, jim).

Prolog is interactive: it prompts with ?- to indicate that it is ready to accept a query. Prolog responds to simple queries about facts that it has been told with "yes" and "no". If the query contains a logical variable — an identifier that starts with an upper case letter — Prolog attempts to find a value of the variable that makes the query true. If it succeeds, it replies with the binding.
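To see what answering such a query involves, the facts of Listing 50 can be written down as plain data; a query like parent(P, jim) then amounts to collecting every binding of P that matches a fact. The following is an illustrative sketch in C++ (our code, not part of the notes):

```cpp
// Sketch: the family database as data, and a query as a search for bindings.
#include <set>
#include <string>
#include <utility>
#include <vector>

const std::vector<std::pair<std::string, std::string>> parentFacts = {
    {"pam", "bob"}, {"tom", "bob"}, {"tom", "liz"},
    {"bob", "ann"}, {"bob", "pat"}, {"pat", "jim"}};

// All bindings of P that satisfy parent(P, child).
std::set<std::string> parentsOf(const std::string &child) {
    std::set<std::string> result;
    for (const auto &f : parentFacts)
        if (f.second == child)      // the fact matches the query
            result.insert(f.first); // record the binding of P
    return result;
}
```

An empty result corresponds to Prolog answering "no"; each element of the result corresponds to one binding that Prolog would report.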

Listing 51: Prolog queries

    ?- parent(pam, bob).
    yes
    ?- parent(tom, jim).
    no
    ?- parent(P, jim).
    P = pat
    ?- parent(tom, X).
    X = bob
    X = liz
    ?- parent(bob, C).
    C = ann
    C = pat

If there is more than one binding that satisfies the query, Prolog will find them all. (The exact mechanism depends on the implementation. Some implementations require the user to ask for more results. Here we assume that the implementation returns all values without further prompting.)

Consider the problem of finding Jim's grandparents. In logic, we can express "G is a grandparent of Jim" in the following form: "there is a person P such that P is a parent of Jim and there is a person G such that G is a parent of P". In Prolog, this becomes:

    ?- parent(P, jim), parent(G, P).
    P = pat
    G = bob

The comma in this query corresponds to conjunction ("and"): the responses to the query are bindings that make all of the terms valid simultaneously. We can express the "grandparent" relation by writing a rule and adding it to the collection of facts (which becomes a collection of facts and rules). The rule for "grandparent" is "G is a grandparent of C if G is a parent of P and P is a parent of C". In Prolog:

    grandparent(G, C) :- parent(G, P), parent(P, C).

Siblings are people who have the same parents. We could write this rule for siblings:

    sibling(X, Y) :- parent(P, X), parent(P, Y).

Testing this rule reveals a problem:

    ?- sibling(pat, X).
    X = ann
    X = pat

We may not expect the second response, because we do not consider Pat to be a sibling of herself. To correct this rule, we would have to require that X and Y have different values:

    sibling(X, Y) :- parent(P, X), parent(P, Y), different(X, Y).

It is clear that different(X, Y) must be built into Prolog somehow, because it would not be feasible to list every true instantiation of it as a fact.

Disjunction ("or") is provided by multiple facts or rules: in the example above, "Bob is the parent of C" is true if C is Ann or C is Pat. But Prolog also allows rules to be recursive: a predicate name that appears on the left side of a rule may also appear on the right side. For example, a person A is an ancestor of a person X if either A is a parent of X or A is an ancestor of a parent P of X:

    ancestor(A, X) :- parent(A, X).



    ancestor(A, X) :- ancestor(A, P), parent(P, X).

We confirm the definition by testing:

    ?- ancestor(pam, jim).
    yes

We can read a Prolog program in two ways. Read declaratively, ancestor(pam, jim) is a statement and Prolog's task is to prove that it is true. The proof is shown in Figure 15, which uses the convention that writing P above a line and Q below it means "from P infer Q".

Figure 15: Proof of ancestor(pam, jim)

    parent(pam, bob)
    ----------------
    ancestor(pam, bob)    parent(bob, pat)
    --------------------------------------
    ancestor(pam, pat)    parent(pat, jim)
    --------------------------------------
    ancestor(pam, jim)

Read procedurally, ancestor(pam, jim) defines a sequence of fact-searches and rule-applications that establish (or fail to establish) the goal. The procedural steps for this example are as follows.

1. Using the first rule for ancestor, look in the database for the fact parent(pam, jim).
2. Since this search fails, try the second rule with A = pam and X = jim. This gives two new subgoals: ancestor(pam, P) and parent(P, jim).
3. The second of these subgoals is satisfied by P = pat from the database. The goal is now ancestor(pam, pat).
4. The goal expands to the subgoals ancestor(pam, P) and parent(P, pat). The second of these subgoals is satisfied by P = bob from the database. The goal is now ancestor(pam, bob).
5. The goal ancestor(pam, bob) is satisfied by applying the first rule with the known fact parent(pam, bob).

In the declarative reading, Prolog finds multiple results simply because they are there. In the procedural reading, multiple results are obtained by backtracking. Every point in the program at which there is more than one choice of a variable binding is called a choice point. The choice points define a tree: the root of the tree is the starting point of the program (the main goal) and each leaf of the tree represents either success (the goal is satisfied with the chosen bindings) or failure (the goal cannot be satisfied with these bindings). Prolog performs a depth-first search of this tree.

Sometimes, we can tell from the program that backtracking will be useless. Declaratively, the predicate minimum(X, Y, Min) is satisfied if Min is the minimum of X and Y. We define rules for the cases X ≤ Y and X > Y:

    minimum(X, Y, X) :- X ≤ Y.
    minimum(X, Y, Y) :- X > Y.

It is easy to see that, if one of these rules succeeds, we do not want to backtrack to the other, because it is certain to fail. We can tell Prolog not to backtrack by writing "!", pronounced "cut".
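The procedural reading above can be sketched as an explicit search. The code below is our illustrative C++, not from the notes: rule 1 is a direct scan of the facts, rule 2 recurses through a parent, and returning false from a failed alternative plays the role of backtracking to the next choice point.

```cpp
// Sketch: depth-first search mirroring Prolog's two rules for ancestor.
#include <string>
#include <utility>
#include <vector>

const std::vector<std::pair<std::string, std::string>> db = {
    {"pam", "bob"}, {"tom", "bob"}, {"tom", "liz"},
    {"bob", "ann"}, {"bob", "pat"}, {"pat", "jim"}};

bool ancestor(const std::string &a, const std::string &x) {
    // Rule 1: ancestor(A, X) :- parent(A, X).
    for (const auto &f : db)
        if (f.first == a && f.second == x)
            return true;
    // Rule 2: ancestor(A, X) :- ancestor(A, P), parent(P, X).
    for (const auto &f : db)                 // each fact is a choice point
        if (f.second == x && ancestor(a, f.first))
            return true;
    return false;                            // all choices failed: backtrack
}
```

The search terminates because the parent relation in the database is acyclic.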

    minimum(X, Y, X) :- X ≤ Y, !.
    minimum(X, Y, Y) :- X > Y, !.

The introduction of the cut operation has an interesting consequence. If a Prolog program has no cuts, we can freely change the order of goals and clauses. These changes may affect the efficiency of the program, but they do not affect its declarative meaning. If the program has cuts, this is no longer true. Figure 16 shows an example of the use of cut and its effect on the meaning of a program.

Figure 16: Effect of cuts

    Program                       Meaning
    p :- a,b.    p :- c.          p ⇐⇒ (a ∧ b) ∨ c
    p :- a,!,b.  p :- c.          p ⇐⇒ (a ∧ b) ∨ (∼a ∧ c)
    p :- c.      p :- a,!,b.      p ⇐⇒ c ∨ (a ∧ b)

Recall the predicate different introduced above to distinguish siblings, and consider the following declarations and queries:

    different(X, X) :- !, fail.
    different(X, Y).

    ?- different(Tom, Tom).
    no
    ?- different(Ann, Tom).
    yes

The query different(Tom, Tom) matches different(X, X) with the binding X = Tom. The Prolog constant fail then reports failure. Since the cut has been passed, no backtracking occurs, and the query fails. The query different(Ann, Tom) does not match different(X, X), but it does match the second clause, different(X, Y), and therefore succeeds.

This idea can be used to define negation in Prolog:

    not(P) :- P, !, fail.
    not(P).

However, it is important to understand what negation in Prolog means. Consider Listing 52, assuming that the entire database is included. The program has stated correctly that crows fly and penguins do not fly. But it has made this inference only because the database has no information about penguins. The same program can produce an unlimited number of (apparently) correct responses, and also some incorrect responses. We account for this by saying that Prolog assumes a closed world: Prolog can make inferences from the "closed world" consisting of the facts and rules in its database.

Now consider negation. In Prolog, proving not(P) means "P cannot be proved", which means "P cannot be inferred from the facts and rules in the database". Consequently, we should be sceptical about positive results obtained from Prolog programs that use not. The response

    ?- not human(peter).
    yes

means no more than "the fact that Peter is human cannot be inferred from the database".

The Prolog system proceeds top-down. It attempts to match the entire goal to one of the rules and, having found a match, applies the same idea to the parts of the matched formulas. The "matching" process is called unification.
Listing 52: Who can fly?

    flies(albatross).
    flies(blackbird).
    flies(crow).

    ?- flies(crow).
    yes
    ?- flies(penguin).
    no
    ?- flies(elephant).
    no
    ?- flies(brick).
    no
    ?- flies(goose).
    no
Suppose that we have two terms A and B built up from constants and variables. A unifier of A and B is a substitution that makes the terms identical. A substitution is a binding of variable names to values. In the following example we follow the Prolog convention, using upper case letters X, Y, . . . to denote variables and lower case letters a, b, . . . to denote constants. Let A and B be the terms

    A ≡ left(tree(X, tree(a, Y)))
    B ≡ left(tree(b, Z))

and let σ = (X = b, Z = tree(a, Y)) be a substitution. Then

    σA = σB = left(tree(b, tree(a, Y)))

and σ is a unifier of A and B. Note that the unifier is not necessarily unique. However, we can define a most general unifier, which is unique.

The unification algorithm contains a test called the occurs check that is used to reject a substitution of the form X = f(X), in which a variable is bound to a formula that contains the variable. If we changed A and B above to

    A ≡ left(tree(X, tree(a, Y)))
    B ≡ left(tree(b, Y))

they would not unify. You can find out whether a particular version of Prolog implements the occurs check by trying the following query:

    ?- X = f(X).

If the reply is "no", the occurs check is implemented. If, as is more likely, the occurs check is not performed, the response will be X = f(f(f(f( . . . .

Prolog syntax corresponds to a restricted form of first-order predicate calculus called clausal form logic. It is not practical to use the full predicate calculus as a basis for a PL because it is undecidable. Clausal form logic is semi-decidable: there is an algorithm that will find a proof for any formula that is true in the logic. If a formula is false, the algorithm may fail after a finite time or may loop forever. The proof technique, called SLD resolution, was introduced by Robinson (1965). SLD stands for "Selecting a literal, using a Linear strategy, restricted to Definite clauses".

The proof of validity of SLD resolution assumes that unification is implemented with the occurs check. Unfortunately, the occurs check is expensive to implement and most Prolog systems omit it. Consequently, most Prolog systems are technically unsound, although problems are rare in practice.

Example 17: Failure of commutativity. A Prolog program apparently has a straightforward interpretation as a statement in logic, but the interpretation is slightly misleading. In logic, conjunction is commutative (P ∧ Q ≡ Q ∧ P). In Prolog, since the system works through the rules in the order in which they are written, the order is significant. In the context of the rules

    wife(jane).
    female(X) :- wife(X).
    female(X) :- female(X).

we obtain a valid answer to a query:

    ?- female(Y).
    Y = jane

But if we change the order of the two rules for female, the same query leads to endless applications of the rule female(X) :- female(X). Since the logical interpretation of a rule sequence is disjunction, it follows that disjunction in Prolog does not commute. Conjunction does not commute either: Listing 53 shows the results of two queries about restaurants that differ only in the order of their terms.

Listing 53: Restaurants

    good(les halles).
    expensive(les halles).
    good(joes diner).
    reasonable(Restaurant) :- not expensive(Restaurant).

    ?- good(X), reasonable(X).
    X = joes diner
    ?- reasonable(X), good(X).
    no

Exercise 32. Explain why commutativity fails in this program.

The important features of Prolog include:

- Prolog is based on a mathematical model (clausal form logic with SLD resolution).
- "Pure" Prolog programs, which fully respect the logical model, can be written but are often inefficient.
- In order to write efficient programs, a Prolog programmer must be able to read programs both declaratively (to check the logic) and procedurally (to ensure efficiency).

- Prolog implementations introduce various optimizations (the cut, omitting the occurs check) that improve the performance of programs but compromise the mathematical model.
- Prolog has garbage collection.

7.2 Alma-0

One of the themes of this course is the investigation of ways in which different paradigms of programming can be combined. The PL Alma-0 takes as its starting point a simple imperative language, Modula-2 (Wirth 1982), and adds a number of features derived from logic programming to it. The result is a simple but highly expressive language. This section is based on Apt et al. (1998).

Some features of Modula-2 are omitted (the CARDINAL type, sets, variant records, open array parameters, procedure types, pointer types, nested procedures, modules, and the CASE, LOOP, WITH, and EXIT statements). They are replaced by nine new features, described below.

BES. Boolean expressions may be used as statements. The expression is evaluated. If its value is TRUE, the computation succeeds and the program continues as usual. If the value of the expression is FALSE, the computation fails. However, this does not mean that the program stops: instead, control returns to the nearest preceding choice point and resumes with a new choice.

Example 18: Ordering 1. If the array a is ordered, the following statement succeeds:

    FOR i := 1 TO n - 1 DO a[i] <= a[i+1] END

SBE. A statement sequence may be used as a boolean expression. If the computation of a sequence of statements succeeds, it is considered equivalent to a boolean expression with value TRUE. Otherwise, it is considered equivalent to a boolean expression with value FALSE.

Example 19: Ordering 2. The following sequence decides whether a precedes b in the lexicographic ordering:

    NOT FOR i := 1 TO n DO a[i] = b[i] END;
    a[i] < b[i]

The FOR statement finds the first position at which the arrays differ: when a comparison fails, the loop exits, and i has the smallest value for which the condition is FALSE. If there is a difference at position i, the sequence succeeds iff a[i] < b[i].

EITHER-ORELSE. The construct EITHER-ORELSE is added to the language to permit backtracking. An EITHER statement has the form

    EITHER <seq> ORELSE <seq> . . . ORELSE <seq> END

Computation of an EITHER statement starts by evaluating the first sequence. If it succeeds, computation continues at the statement following END. If this computation subsequently fails, control returns to the second sequence in the EITHER statement, with the program in the same state as it was when the EITHER statement was entered. Each sequence is tried in turn in this way, until one succeeds or the last one fails, in which case the entire EITHER statement fails.
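The state-restoring behaviour of EITHER-ORELSE can be modelled by running each alternative on a copy of the state. The sketch below is our illustrative C++, and it is deliberately simplified: it restores the state when an alternative fails immediately, but it does not model resumption of a later alternative when a statement after the END fails.

```cpp
// Sketch: alternatives tried on a private copy of the state.
#include <functional>
#include <vector>

struct State { int x, y; };   // invented state for illustration

// Try each alternative in turn; keep the state of the first that succeeds.
bool either(State &s, const std::vector<std::function<bool(State &)>> &alts) {
    for (const auto &alt : alts) {
        State trial = s;       // snapshot taken at the choice point
        if (alt(trial)) {      // success: commit the new state
            s = trial;
            return true;
        }                      // failure: trial is discarded, s unchanged
    }
    return false;              // every alternative failed
}
```

Each snapshot corresponds to the requirement that a later alternative starts "with the program in the same state as it was when the EITHER statement was entered".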

Example 20: Choice points. Assume that the sequence shown in Listing 54 is entered with x = a > 0.

Listing 54: Creating choice points

    EITHER y := x ORELSE x > 0; y := -x END;
    x := x + b;
    y < 0

The sequence of events is as follows. The assignment y := x assigns a to y. This sequence succeeds and transfers control to x := x + b. The assignment x := x + b assigns a + b to x. The test y < 0 fails because y = a > 0, so control returns to the choice point created by the EITHER statement and the original value of x is restored. In the second sequence, the test x > 0 succeeds because x = a > 0, and y receives the value −a. The final sequence is performed and succeeds, leaving x = a + b and y = −a.

COMMIT. The COMMIT statement provides a way of restricting backtracking. (It is similar to the cut in Prolog.) The statement

    COMMIT <seq> END

evaluates the statement sequence <seq>. If <seq> succeeds (possibly by backtracking), all of the choice points created during its evaluation are removed.

SOME. The SOME statement permits the construction of statements that work like EITHER statements but have a variable number of ORELSE clauses. The statement

    SOME i := 1 TO n DO <seq> END

is equivalent to

    EITHER i := 1; <seq> ORELSE SOME i := 2 TO n DO <seq> END

Example 21: Pattern matching. Suppose that we have two arrays of characters: a pattern p[0..m-1] and a string s[0..n-1]. The sequence shown in Listing 55 sets i to the position of the first occurrence of p in s, or fails if there is no match.

Listing 55: Pattern matching

    SOME i := 0 TO n - m DO
        FOR j := 0 TO m - 1 DO
            s[i+j] = p[j]
        END
    END

Example 22: Ordering 3. Listing 56 shows another way of checking lexicographic ordering.

Listing 56: Checking the order

    COMMIT
        SOME i := 1 TO n DO a[i] <> b[i] END
    END;
    a[i] < b[i]

With the COMMIT, this sequence yields the value of a[i] < b[i] for the smallest i such that a[i] <> b[i]. Without the COMMIT, it succeeds if a[i] < b[i] is TRUE for some value of i such that a[i] <> b[i].

FORALL. The FORALL statement provides for the exploration of all choice points generated by a statement sequence. The statement

    FORALL <seq1> DO <seq2> END

evaluates <seq1> and then <seq2>. If <seq1> generates choice points, control backtracks through these, even if the original sequence succeeded. Whenever <seq2> succeeds, its choice points are removed (that is, there is an implicit COMMIT around <seq2>).

Example 23: Matching. Suppose we enclose the pattern matching code above in a function Match. The following sequence finds the number of times that p occurs in s:

    count := 0;
    FORALL k := Match(p, s) DO count := count + 1 END;

EQ. Variables in Alma-0 start life as uninitialized values. After a value has been assigned to a variable, it ceases to be uninitialized and becomes initialized. Expressions containing uninitialized values are allowed, but they do not have values. The comparison s = t is evaluated as follows: if both s and t are initialized, their values are compared as usual; if one side is an uninitialized variable and the other is an expression with a value, this value is assigned to the variable; if both sides are uninitialized variables, an error is reported.

Example 24: Remarkable sequences. A sequence with 27 integer elements is remarkable if it contains three 1's, three 2's, . . . , three 9's arranged in such a way that, for i = 1, 2, . . . , 9, there are exactly i numbers between successive occurrences of i.

Problem 1. Write a program that tests whether a given array is a remarkable sequence.

Problem 2. Find a remarkable sequence.
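The FORALL-with-Match example above has a direct conventional reading, sketched here in our own C++ (not from the notes): enumerating all choice points of Match means trying every starting position of the pattern.

```cpp
// Sketch: counting every match, as FORALL k := Match(p, s) DO ... END does.
#include <string>

int countMatches(const std::string &p, const std::string &s) {
    int count = 0;
    if (p.empty() || p.size() > s.size()) return count;
    for (size_t i = 0; i + p.size() <= s.size(); i++)   // each choice of i
        if (s.compare(i, p.size(), p) == 0)             // the FOR loop of Listing 55
            count++;                                    // body of the FORALL
    return count;
}
```

Note that overlapping occurrences are counted separately, just as each successful choice point of Match would be visited by FORALL.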
Here is a remarkable sequence:

    (1, 9, 1, 2, 1, 8, 2, 4, 6, 2, 7, 9, 4, 5, 8, 6, 3, 4, 7, 5, 3, 9, 6, 8, 3, 5, 7)

Due to the properties of "=", the program shown in Listing 57 is a solution for both of these problems! If the given sequence is initialized, the program tests whether it is remarkable; if it is not initialized, the program assigns a remarkable sequence to it.
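Problem 1, testing a given array, also has a direct conventional solution, sketched here in our own C++ (not the Alma-0 program of Listing 57). The condition "exactly i numbers between successive occurrences of i" means that the three positions of i are spaced i + 1 apart.

```cpp
// Sketch: a direct check of the "remarkable" property.
#include <vector>

bool isRemarkable(const std::vector<int> &a) {
    if (a.size() != 27) return false;
    for (int i = 1; i <= 9; i++) {
        std::vector<int> pos;                 // positions where i occurs
        for (int j = 0; j < 27; j++)
            if (a[j] == i) pos.push_back(j);
        if (pos.size() != 3) return false;    // exactly three occurrences
        // i numbers between occurrences: indices differ by i + 1
        if (pos[1] - pos[0] != i + 1 || pos[2] - pos[1] != i + 1)
            return false;
    }
    return true;
}
```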

Listing 57: Remarkable sequences

    TYPE Sequence = ARRAY [1..27] OF INTEGER;

    PROCEDURE Remarkable (VAR a: Sequence);
      VAR i, j: INTEGER;
    BEGIN
      FOR i := 1 TO 9 DO
        SOME j := 1 TO 25 - 2 * i DO
          a[j] = i;
          a[j+i+1] = i;
          a[j+2*i+2] = i
        END
      END
    END;

MIX. Modula-2 provides call by value (the default) and call by reference (indicated by writing VAR before the formal parameter). Alma-0 adds a third parameter passing mechanism: call by mixed form, indicated by writing MIX before the formal parameter. The parameter is passed either by value or by reference, depending on the form of the actual parameter. We can think of the implementation of mixed form in the following way. Suppose that procedure P has been declared as

    PROCEDURE P (MIX n: INTEGER);
    BEGIN
      . . .
    END;

The call P(v), in which v is a variable, is compiled in the usual way, interpreting MIX as VAR. The call P(E), in which E is an expression, is compiled as if the programmer had written the code shown in Listing 58.

Listing 58: Compiling a MIX parameter

    VAR v: INTEGER;
    BEGIN
      v := E;
      P(v)
    END

Example 25: Linear Search. The function Find in Listing 59 uses a mixed form parameter. The call Find(7, a) tests whether 7 occurs in a. The call Find(x, a), where x is an integer variable, assigns each component of a to x by backtracking.

Listing 59: Linear search

    TYPE Vector = ARRAY [1..N] OF INTEGER;

    PROCEDURE Find (MIX e: INTEGER; a: Vector);
    BEGIN
      SOME i := 1 TO N DO
        e = a[i]
      END
    END;
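As a rough, hypothetical sketch (the helper below is invented, not Alma-0), the two modes of a MIX parameter such as Find's can be imitated in Python with a generator: passing a value tests membership, while passing None — standing for an uninitialized variable — enumerates the components by backtracking.

```python
def find(e, a):
    """Yield every component of a that e can be bound to (None = unbound)."""
    for v in a:
        if e is None or e == v:
            yield v

a = [3, 1, 4, 1, 5]
print(any(find(7, a)))      # like Find(7, a): does 7 occur?  -> False
print(any(find(4, a)))      # -> True
print(list(find(None, a)))  # like Find(x, a): bind x to each component in turn
```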

The sequence

    Find(x, a); Find(x, b)

tests whether the arrays a and b have an element in common; if so, x receives the value of the common element. The statement

    FORALL Find(x, a) DO Find(x, b) END

tests whether all elements of a are also elements of b. The statement

    FORALL Find(x, a); Find(x, b) DO WRITELN(x) END

prints all of the elements that a and b have in common. □

KNOWN. The test KNOWN(x) succeeds iff the variable x is initialized.

Example 26: Generalized Addition. Listing 60 shows the declaration of a function Plus. The call Plus(x, y, z) will succeed if x + y = z, provided that at least two of the actual parameters have values. For example, Plus(2, 3, 5) succeeds and Plus(5, n, 12) assigns 7 to n. □

Listing 60: Generalized Addition

    PROCEDURE Plus (MIX x, y, z: INTEGER);
    BEGIN
      IF KNOWN(x); KNOWN(y) THEN z = x + y
      ELSIF KNOWN(y); KNOWN(z) THEN x = z - y
      ELSIF KNOWN(x); KNOWN(z) THEN y = z - x
      END
    END;

Exercise 33. The new features of Alma-0 are orthogonal, both to the existing features of Modula-2 and to each other. How does this use of orthogonality compare with the orthogonal design of Algol 68? □

The language Alma-0 has a declarative semantics, in which the constructs described above correspond to logical formulas. It also has a procedural semantics that describes how the constructs can be efficiently executed.

The interesting feature of Alma-0 is that it demonstrates that it is possible to start with a simple, well-defined language (Modula-2), remove features that are not required for the current purpose, and add a small number of simple, new features that provide expressive power when used together. Alma-0 demonstrates that, if appropriate care is taken, it is possible to design PLs that smoothly combine paradigms without losing the advantages of either paradigm.

7.3 Other Backtracking Languages

Prolog and Alma-0 are by no means the only PLs that incorporate backtracking. Others include Icon (Griswold and Griswold 1983), SETL (Schwartz, Dewar, Dubinsky, and Schonberg 1986), and 2LP (McAloon and Tretkoff 1995).
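The generalized addition of Example 26 above can be sketched in Python. This is a hypothetical illustration: None again plays the role of an uninitialized MIX parameter, and the final equality check mirrors the testing use of "=".

```python
def plus(x, y, z):
    """Return (x, y, z) satisfying x + y = z, or None on failure."""
    if [x, y, z].count(None) > 1:
        return None                  # at least two values are required
    if x is None:
        x = z - y
    elif y is None:
        y = z - x
    elif z is None:
        z = x + y
    return (x, y, z) if x + y == z else None

print(plus(2, 3, None))   # -> (2, 3, 5)
print(plus(5, None, 12))  # -> (5, 7, 12)
print(plus(2, 3, 6))      # fails, like a failing Alma-0 test -> None
```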

Constraint PLs are attracting increasing attention. Prolog is strong in logical deduction but weak in numerical calculation; constraint PLs attempt to correct this imbalance. A program is a set of constraints, expressed as equalities and inequalities, that must be satisfied.
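A minimal generate-and-test sketch (hypothetical; the names are invented) conveys the flavour: the "program" is just a list of constraints, and the implementation searches for an assignment that satisfies all of them. Real constraint solvers propagate constraints rather than enumerating blindly.

```python
from itertools import product

def solve(constraints, ranges):
    """Return the first assignment satisfying every constraint, or None."""
    names = sorted(ranges)
    for values in product(*(ranges[n] for n in names)):
        env = dict(zip(names, values))
        if all(c(env) for c in constraints):
            return env
    return None

# The "program": x + y = 10, x < y, y - x > 2.
constraints = [
    lambda e: e["x"] + e["y"] == 10,
    lambda e: e["x"] < e["y"],
    lambda e: e["y"] - e["x"] > 2,
]
print(solve(constraints, {"x": range(10), "y": range(10)}))
```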

8 Structure

The earliest programs had little structure. Although a FORTRAN program has a structure — it consists of a main program and subroutines — the main program and the subroutines themselves are simply lists of statements.

8.1 Block Structure

Algol 60 was the first major PL to provide recursive, hierarchical structure. It is probably significant that Algol 60 was also the first language to be defined, syntactically at least, with a context-free grammar — specifically, a grammar written in Backus-Naur Form (BNF). It is straightforward to define recursive structures using context-free grammars. The basis of Algol 60 syntax can be written in extended BNF as follows:

    PROG  →  BLOCK
    BLOCK →  "begin" { DECL } { STMT } "end"
    STMT  →  BLOCK | · · ·
    DECL  →  HEADER BLOCK | · · ·

With blocks come nested scopes. Earlier languages had local scope (for example, FORTRAN subroutines have local variables) but scopes were continuous (no "holes") and not nested: it was not possible for one declaration to override another declaration of the same name. In Algol 60, an inner declaration overrides an outer declaration. In Listing 61, the outer block declares an integer k, which is first set to 6 and later incremented; the inner block declares another integer k, which is set to 1 and then disappears, unused, at the end of the inner block. See Section 9.1 for further discussion of nesting.

Listing 61: Nested Declarations

    begin
      integer k;
      k := 6;
      begin
        integer k;
        k := 1;
      end;
      k := k + 1;
    end;

8.2 Modules

By the early 70s, people realized that the "monolithic" structure of Algol-like programs was too restricted for large and complex programs. Furthermore, the "separate compilation" model of FORTRAN — and later C — was seen to be unsafe. The new model that was introduced by languages such as Mesa and Modula divided a large program into modules.

A module is a meaningful unit (rather than a collection of possibly unrelated declarations) that may introduce constants, types, variables, and functions (called collectively features). A module may import some or all of its features.

Example 27: Stack Module. One module of a program might declare a stack module:

    module Stack
    {
      export initialize, push, pop, ...
      . . .
    }

In other parts of the program, we could use the stack module:

    import Stack;
    . . .
    initialize();
    . . .
    push(67);
    . . .

This form of module introduces a single instance: there is only one stack, and it is shared by all parts of the program that import it. □

Example 28: Stack Template. In contrast, a stack template introduces a type from which we can create as many instances as we need:

    module Stack
    {
      export type STACK, initialize, push, pop, ...
      . . .
    }

Elsewhere in the program, we can use the type STACK to create stacks:

    import Stack;
    STACK myStack;
    STACK yourStack;
    . . .
    initialize(myStack);
    . . .
    push(myStack, 67);
    . . .

The main difference between the stack module and the stack template is that the template exports a type. In typical modular languages, modules can export types, variables, and functions. □

The fact that we can declare many stacks is somewhat offset by the increased complexity of the notation: the name of the stack instance must appear in every function call. Some modular languages provide an alternative syntax in which the name of the instance appears before the function name, as if the module was a record.
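The same contrast can be sketched in Python (hypothetically — the names are invented): module-level definitions behave like the stack module of Example 27, one shared instance, while a class behaves like the template of Example 28.

```python
_items = []                     # the single, shared stack of a "stack module"

def initialize(): _items.clear()
def push(x): _items.append(x)
def pop(): return _items.pop()

class Stack:                    # the "stack template": it exports a type
    def __init__(self): self.items = []
    def push(self, x): self.items.append(x)
    def pop(self): return self.items.pop()

initialize(); push(67)
my_stack, your_stack = Stack(), Stack()   # as many instances as we need
my_stack.push(1); your_stack.push(2)
print(pop(), my_stack.pop(), your_stack.pop())   # -> 67 1 2
```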

The last few lines of Example 28 would then be written as:

    import Stack;
    STACK myStack;
    . . .
    myStack.initialize();
    . . .
    myStack.push(67);
    . . .

8.2.1 Encapsulation

Parnas (1972) wrote a paper that made a number of important points. It is desirable that a programmer should be able to change the implementation of a module without affecting (or even informing) other programmers. This implies that a module should have an interface that is relatively stable and an implementation that changes as often as necessary. A large program will be implemented as a set of modules. Each programmer, or programming team, will be the owners of some modules and users of other modules. Only the owner of a module has access to its implementation; users do not have such access. In order to use the module correctly without access to its implementation, users require a specification written in an appropriate language. The specification answers the question "What does this module do?" and the implementation answers the question "How does this module work?"

Parnas introduced these ideas as the "principle of information hiding". Today, it is more usual to refer to the goals as encapsulation. Most people today accept Parnas's principle, even if they do not apply it well. It is therefore important to appreciate how radical it was at the time of its introduction. Parnas's paper was published in 1972, only shortly after Wirth's (1971) paper on program development by stepwise refinement. Brooks wrote this in the first edition of The Mythical Man Month in 1975:

    [Parnas] has proposed a still more radical solution. His thesis is that the programmer is most effective if shielded from, rather than exposed to, the details of construction of system parts other than his own. This presupposes that all interfaces are completely and precisely defined. While that is definitely sound design, dependence upon its perfect accomplishment is a recipe for disaster. A good information system both exposes interface errors and stimulates their correction. (Brooks 1975, page 78)

Twenty years later, he had changed his mind:

    Parnas was right, and I was wrong. I am now convinced that information hiding, today often embodied in object-oriented programming, is the only way of raising the level of software design. (Brooks 1995, page 272)

Whereas Wirth assumes that a program can be constructed by filling in details, perhaps with occasional backtracking to correct an error, Parnas sees programming as an iterative process in which revisions and alterations are part of the normal course of events. This is a more "modern" view and certainly a view that reflects the actual construction of large and complex software systems.
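Parnas's principle can be sketched in Python (a hypothetical illustration; the names are invented): user code depends only on a stable interface, so the owner can replace the representation without informing users.

```python
class Table:
    """Interface: put(key, value) and get(key). Representation hidden."""
    def __init__(self):
        self._pairs = []              # implementation 1: association list
    def put(self, key, value):
        self._pairs.append((key, value))
    def get(self, key):
        for k, v in reversed(self._pairs):
            if k == key:
                return v
        return None

class HashTable(Table):
    """Same interface; the implementation changed to a dictionary."""
    def __init__(self):
        self._map = {}
    def put(self, key, value):
        self._map[key] = value
    def get(self, key):
        return self._map.get(key)

for cls in (Table, HashTable):        # user code is unaffected by the change
    t = cls()
    t.put("a", 1); t.put("a", 2)
    print(t.get("a"), t.get("b"))     # -> 2 None, twice
```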

8.3 Control Structures

Early PLs, such as FORTRAN, depended on three control structures: sequence, transfer (goto), and call/return. These structures are essentially those provided by assembly language. The first contribution of PLs was to provide names for variables and infix operators for building expressions.

The "modern" control structures first appeared in Algol 60. The most significant contribution of Algol 60 was the block: a program unit that could contain data, functions, and statements. Algol 60 also contributed the familiar if-then-else statement and a rather complex loop statement. However, Algol 60 also provided goto statements, and most programmers working in the sixties assumed that goto was essential.

The storm of controversy raised by Dijkstra's (1968) letter condemning the goto statement indicates the extent to which programmers were wedded to the goto statement. Even Knuth (1974) entered the fray with arguments supporting the goto statement in certain circumstances, although his paper is well-balanced overall. Dijkstra's proposals had previously been justified in a formal sense by Bohm and Jacopini (1966). This paper establishes that any program written with goto statements can be rewritten as an equivalent program that uses sequence, conditional, and loop structures only. Although the result is interesting, it may be necessary to introduce Boolean variables, and Dijkstra's main point was not to transform old, goto-ridden programs but to write new programs using simple control structures only.

The basic ideas that Dijkstra proposed were as follows:

- There should be one flow into, and one flow out of, each control structure.
- All programming needs can be met by three structures satisfying the above property: the sequence, the loop with preceding test (while-do), and the conditional (if-then-else).

Dijkstra's arguments had the desired effect. In languages designed after 1968, the role of goto was down-played. Both C and Pascal have goto statements, but they are rarely used in well-written programs. C, with break, continue, and return, has even less need for goto than Pascal. C++ inherited goto statements from C but, again, it is rarely needed. Ada has a goto statement because it was part of the US DoD requirements. Java does not have a goto statement, although some uses of continue look rather like jumps.

The use of control structures rather than goto statements has several advantages:

- It is easier to read and maintain programs.
- Maintenance errors are less common.
- Precise reasoning (for example, the use of formal semantics) is simplified when goto statements are absent.
- Certain optimizations become feasible because the compiler can obtain more precise information in the absence of goto statements.

8.3.1 Loop Structures

The control structure that has elicited the most discussion is the loop. The basic loop has a single test that precedes the body:

    while E do S

It is occasionally useful to perform the test after the body: Pascal provides repeat/until and C provides do/while, but this is often considered undesirable. This is not quite enough, however, because there are situations in which it is useful to exit from a loop from within the body. Consider, for example, the problem of reading and adding numbers, terminating when a negative number is read without including the negative number in the sum. Listing 62 shows the code in Pascal: the call of read must be repeated. In C and C++, we can use break to avoid the repeated code, as shown in Listing 63. Some recent languages, recognizing this problem, provide only one form of loop statement, consisting of an unconditional repeat and a conditional exit that can be used anywhere in the body of the loop.

Listing 62: Reading in Pascal

    var n, sum: int;
    sum := 0;
    read(n);
    while n >= 0 do
      begin
        sum := sum + n;
        read(n);
      end

Listing 63: Reading in C

    int sum = 0;
    while (true) {
      int n;
      cin >> n;
      if (n < 0) break;
      sum += n;
    }

8.3.2 Procedures and Functions

Some languages, such as Ada and Pascal, distinguish the terms procedure and function. Both names refer to "subroutines" — program components that perform a specific task when invoked from another component of the program. A procedure is a subroutine that has some effect on the program data but does not return a result. A function is a subroutine that returns a result. A function may also have some effect on program data, called a "side effect"; this is often considered undesirable.

Languages that make a distinction between procedures and functions usually also make a distinction between "actions" and "data". For example, at the beginning of the first Pascal text (Jensen and Wirth 1976), we find the following statement:

    An algorithm or computer program consists of two essential parts, a description of actions which are to be performed, and a description of the data, which are manipulated by the actions. Actions are described by so-called statements, and data are described by so-called declarations and definitions.

In other languages, such as Algol and C, the distinction between "actions" and "data" is less emphasized. In these languages, every "statement" has a value as well as an effect. Consistently, the distinction between "procedure" and "function" also receives less emphasis. C, for example, does not have keywords corresponding to procedure and function. There is a single syntax for declaring and defining all functions, and a convention that, if a function does not return a result, it is given the return type void.

We can view the progression here as a separation of concerns. The "concerns" are: whether a subroutine returns a value, and whether it has side-effects — and they are essentially independent. Ada and Pascal combine the concerns but, for practical reasons, allow "functions with side-effects".

Example 29: Returning multiple results. Adding multiple results introduces the problem of how to use the results. Blue (Kölling 1999) has a uniform notation for returning either one or more results. For example, we can define a routine with this heading:

    findElem (index: Integer) -> (found: Boolean, elem: Element) is
      . . .

The parameter index is passed to the routine findElem, and the routine returns two values: found, to indicate whether something was found, and elem, the object found. (If the object is not found, elem is set to the special value nil.) Blue also provides multi-assignments: we can write

    success, thing := findElem("myKey")

Multi-assignments, provided they are implemented properly, provide other advantages. For example, we can write the standard "swap sequence"

    t := a;
    a := b;
    b := t;

in the more concise and readable form

    a, b := b, a

□

Exercise 34. Describe a "proper" implementation of the multi-assignment statement. Can you think of any other applications of multi-assignment? □

Exercise 35. Using Example 29 and other examples, discuss the introduction of new features into PLs. □

8.3.3 Exceptions

The basic control structures are fine for most applications, provided that nothing goes wrong. Although in principle we can do everything with conditions and loops, there are situations in which these constructs are inadequate.

Example 30: Matrix Inversion. Suppose that we are writing a function that inverts a matrix. In almost all cases, the given matrix is non-singular and the inversion proceeds normally. Occasionally, however, a singular matrix is passed to the function and, at some point in the calculation, a division by zero will occur. What choices do we have?

- We could test the matrix for singularity before starting to invert it. Unfortunately, the test involves computing the determinant of the matrix, which is almost as much work as inverting it. This is wasteful, because we know that most of the matrices that we will receive are non-singular.

- We could test every divisor before performing a division. This, too, is wasteful if the matrix is unlikely to be singular, adding overhead to every division. Moreover, the divisions probably occur inside nested loops: each level of loop will require Boolean variables and exit conditions that are triggered by a zero divisor.

- We could rely on the hardware to signal an exception when division by zero occurs. The exception can be caught by a handler that can be installed at any convenient point in the calling environment. □

PL/I (Section 3.5) was the first major language to provide exception handling. C provided primitives (setjmp/longjmp) that made it possible to implement exception handling. For further descriptions of exception handling in particular languages, see Section 3.
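The third choice can be sketched in Python (a hypothetical illustration — Python's ZeroDivisionError stands in for the hardware-signalled exception). The elimination below does no pivoting, so a singular matrix eventually produces a zero divisor, which the caller catches at a convenient point; note that, without pivoting, some non-singular matrices would also trip the handler.

```python
def invert(m):
    """Gauss-Jordan inversion with no pivoting; a zero pivot raises."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for i in range(n):
        piv = aug[i][i]
        aug[i] = [x / piv for x in aug[i]]       # ZeroDivisionError if piv == 0
        for k in range(n):
            if k != i:
                f = aug[k][i]
                aug[k] = [x - f * y for x, y in zip(aug[k], aug[i])]
    return [row[n:] for row in aug]

try:
    print(invert([[1, 2], [3, 4]]))   # -> [[-2.0, 1.0], [1.5, -0.5]]
except ZeroDivisionError:
    print("singular matrix")

try:
    print(invert([[1, 2], [2, 4]]))   # singular: the handler catches it
except ZeroDivisionError:
    print("singular matrix")
```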

9 Names and Binding

Names are a central feature of all programming languages. In the earliest programs, numbers were used for all purposes, including machine addresses. Replacing numbers by symbolic names was one of the first major improvements to program notation (Mutch and Gill 1954).

9.1 Free and Bound Names

The use of names in programming is very similar to the use of names in mathematics. We say that "x occurs free" in an expression such as x² + 2x + 5. We do not know anything about the value of this expression because we do not know the value of x. On the other hand, we say that "n occurs bound" in the expression

    Σ_{n=0}^{5} (2n + 1).                                  (12)

More precisely, the binding occurrence of n is in "n = 0"; the n in the expression (2n + 1) is a reference to, or use of, this binding. There are two things to note about (12). First, it has a definite value, 36, because n is bound to a value (actually, to the set of values { 0, 1, . . . , 5 }). Second, the particular name that we use does not change the value of the expression: if we systematically change n to k, we obtain the expression Σ_{k=0}^{5} (2k + 1), which has the same value as (12).

The expression ∫ e^{-x} dx, which contains a binding occurrence of x (in dx) and a use of x (in e^{-x}), is interesting because it binds x but does not have a numerical value. The expression ∫_0^∞ e^{-x} dx, by contrast, has a definite value that does not depend on the particular name x.

It is clear that -e^{-x} is a function of x because, by convention, e denotes a constant. But what about -e^{-ax}: is it a function of a or a function of x? The conventional notation for functions is ambiguous, and there are several ways of resolving this ambiguity. We can write x → -e^{-ax}, read "x maps to -e^{-ax}". Or we can use the notation of the λ-calculus: λx . -e^{-ax}.

In predicate calculus, names are bound by the quantifiers ∀ (for all) and ∃ (there exists). Predicates may contain free names, as in the following expression, whose truth value depends on the value of n:

    n mod 2 = 0 ∧ n mod 3 = 1.

The formula

    ∀ n . n mod 2 = 0 ⇒ n² mod 2 = 0                       (13)

is closed because it contains no free variables. In this case, the formula is also true, because its body is true for all values of n. Strictly, we should specify the range of values that n is allowed to assume.

We could do this implicitly: for example, it is very likely that (13) refers to integers. Alternatively, we could define the range of values explicitly, as in

    ∀ n ∈ Z . n mod 2 = 0 ⇒ n² mod 2 = 0.

Sometimes, we simply specify the domain from which the values are chosen, as in n ∈ Z.

Precisely analogous situations occur in programming. In the function

    int f (int n)
    {
      return k + n;
    }

k occurs free and n occurs bound. Some early PLs, such as FORTRAN and PL/I, provided implicit declarations for variables that the programmer did not declare; this is now understood to be error-prone and is not a feature of recent PLs. In most PLs, a program with free variables will not compile. A C compiler, for example, will accept the function f defined above only if it is compiled in a scope that contains a declaration of k.

9.2 Attributes

In mathematics, a variable normally has only one attribute: its value. In (12), n is bound to the values 0, 1, . . . , 5. In programming, a name may have several attributes, and they may be bound at different times. For example, in the sequence

    int n;
    n = 6;

the first line binds the type int to n and the second line binds the value 6 to n. The first binding occurs when the program is compiled. (The compiler records the fact that the type of n is int; neither this fact nor the name n appears explicitly in the compiled code.) The second binding occurs when the program is executed.

This example shows that there are two aspects of binding that we must consider in PLs: the attribute that is bound, and the time at which the binding occurs. The attributes that may be bound to a name include: type, address, and value. The times at which the bindings occur include: compile time, link time, load time, block entry time, and statement execution time.

Definition. A binding is static if it occurs before the program is executed: during compilation or linking. A binding is dynamic if it occurs while the program is running: during loading, block entry, or statement execution.

Example 31: Binding. Consider the program shown in Listing 64. The address of k is bound when the program starts; its value is bound by the scanf call. The address and value of n are bound only when the function f is called; if f is not called (because k ≤ 0), then n is never bound. □

Listing 64: Binding

    void f ()
    {
      int n = 7;
      printf("%d", n);
    }

    void main ()
    {
      int k;
      scanf("%d", &k);
      if (k > 0) f();
    }

9.3 Early and Late Binding

An anonymous contributor to the Encyclopedia of Computer Science wrote:

    Broadly speaking, the history of software development is the history of ever-later binding time. . . .

As a rough rule of thumb:

    Early binding ⇒ efficiency.
    Late binding ⇒ flexibility.

Example 32: Variable Addressing. In FORTRAN, addresses are bound to variable names at compile time. (In reality, the process is somewhat more complicated. The compiler assigns an address relative to a compilation unit. When the program is linked, the address of the unit within the program is added to this address. When the program is loaded, the address of the program is added to the address.) The important point is that, by the time execution begins, the "absolute" address of the variable is known: in the compiled code, variables are addressed directly, without any indexing or other address calculations. FORTRAN is efficient, because absolute addressing is used. It is inflexible, because all addresses are assigned at load time. This leads to wasted space, because all local variables occupy space whether or not they are being used, and it also prevents the use of direct or indirect recursion.

In languages of the Algol family (Algol 60, Algol 68, C, Pascal, etc.), local variables are allocated on the run-time stack. When a function is called, a stack frame (called an activation record or AR) is created for it, and space for its parameters and local variables is allocated in the AR. A variable must be addressed by adding an offset (computed by the compiler) to the address of the AR, which is not known until the function is called. Algol-style allocation is slightly less efficient than FORTRAN's, because addresses must be allocated at block-entry time and indexed addressing is required. It is more flexible than FORTRAN because inactive functions do not use up space and recursion works.

Languages that provide dynamic allocation, such as Pascal and C and most of their successors, have yet another way of binding an address to a variable. In these languages, a statement such as new(p) allocates space on the "heap" and stores the address of that space in the pointer p. Accessing the variable requires indirect addressing, which is slightly slower than indexed addressing, but greater flexibility is obtained because dynamic variables do not have to obey stack discipline (last in, first out). □

Example 33: Function Addressing. In C, functions are statically bound. When the compiler encounters a function definition, it binds an address to the function name. When the compiler encounters a function invocation, it generates a call to the address that it has assigned to the function. (As above, several steps must be completed before the absolute address of the function is determined, but this is not relevant to the present discussion.) Thus in non-OO PLs, functions are statically bound, with the result that calls are efficiently executed but no decisions can be made at run-time. This statement is true of all PLs developed before OOP was introduced.

OOPLs provide "virtual" functions. A virtual function name may refer to several functions; the actual function referred to in a particular invocation is not in general known until the call is executed. Thus, in OOPLs, functions are dynamically bound. Calls take longer to execute, but there is greater flexibility because the decision as to which function to execute is made at run-time.

The overhead of a dynamically-bound function call depends on the language. In C++, the compiler generates "virtual function tables" and dynamic binding requires only indirect addressing. In Smalltalk, the overhead can be significant, because finding the right method requires a search; in practice, the search can be avoided in up to 95% of calls by using a cache. Note, however, that this provides yet another example of the principle: Smalltalk binds later than C++, runs more slowly, but provides greater flexibility. □

9.4 What Can Be Named?

The question "what can be named?" and its converse "what can be denoted?" are important questions to ask about a PL.

Definition. An entity can be named if the PL provides a definitional mechanism that associates a name with an instance of the entity. An entity can be denoted if the PL provides ways of expressing instances of the entity as expressions.

Example 34: Functions. In C, we can name functions but we cannot denote them. The definition

    int f (int x) { B }

introduces a function with name f, parameter x, and body B. Thus functions in C can be named. There is, however, no way that we can write a function without a name in C. Thus functions in C cannot be denoted. The corresponding definition in Scheme would be:

    (define f (lambda (x) E))

This definition can be split into two parts: the name of the function being defined (f) and the function itself ((lambda (x) E)). The notation is analogous to that of the λ-calculus:

    f = λx . E

FPLs allow expressions analogous to λx . E to be used as functions. Scheme is one of these languages, and so are SML, Haskell, and other FPLs. LISP, as mentioned in Section 4.1, is not one of these languages. □

Example 35: Types. In early versions of C, types could be denoted but not named. For example, we could write

    int *p[10];

but we could not give a name to the type of p ("array of 10 pointers to int"). Later versions of C introduced the typedef statement, making types nameable:

    typedef int *API[10];
    API p;

□

9.5 What is a Variable Name?

We have seen that most PLs choose between two interpretations of a variable name. In a PL with value semantics, variable names denote memory locations. Assigning to a variable changes the contents of the location named by the variable. In a PL with reference semantics, variable names denote objects in memory. Assigning to a variable, if permitted by the language, changes the denotation of the name to a different object.

Reference semantics is sometimes called "pointer semantics". This is reasonable in the sense that the implementation of reference semantics requires the storage of addresses — that is, pointers. It is misleading in that providing pointers is not the same as providing reference semantics. The distinction is particularly clear in Hoare's work on PLs. Hoare (1974) has this to say about the introduction of pointers into PLs:

    Many language designers have preferred to extend [minor, localized faults in Algol 60 and other PLs] throughout the whole language by introducing the concept of reference, pointer, or indirect address into the language as an assignable item of data. This immediately gives rise in a high-level language to one of the most notorious confusions of machine code, namely that between an address and its contents. Some languages attempt to solve this by even more confusing automatic coercion rules. Worse still, an indirect assignment through a pointer, just as in machine code, can update any store location whatsoever, and the damage is no longer confined to the variable explicitly named as the target of assignment. . . . For example, in Algol 68, the assignment x := y always changes x, but the assignment p := y + 1 may, if p is a reference variable, change any other variable (of appropriate type) in the whole machine. One variable it can never change is p! . . . References are like jumps, leading wildly from one part of a data structure to another. Their introduction into high-level languages has been a step backward from which we may never recover.

One year later, Hoare (1975) provided his solution to the problem of references in high-level PLs:

  In this paper, we will consider a class of data structures for which the amount of storage can actually vary during the lifetime of the data, and we will show that it can be satisfactorily accommodated in a high-level language using solely high-level problem-oriented concepts, and without the introduction of references. . . . Explicit types make it possible to achieve . . . a significant improvement on the efficiency of compiled LISP, perhaps even a factor of two in space-time cost for suitable applications. (Hoare 1975)

The implementation that Hoare proposes in this paper is, in fact, a reference semantics with types.

All FPLs and most OOPLs (the notable exception, of course, being C++) use reference semantics. There are good reasons for this choice, but the reasons are not the same for each paradigm. In a FPL, all (or at least most) values are immutable. It follows that value semantics, which requires copying, would be wasteful, because there is no point in making copies of immutable objects. In OOP, one of the important aspects is object identity: if a program object X corresponds to a unique entity in the world, such as a person, it should be unique in the program too. This is most naturally achieved with a reference semantics. With value semantics, if X and Y have the same value, the program cannot tell whether X and Y are distinct objects that happen to be equal, or pointers to the same object.

(LISP provides two tests for equality: (eq x y) is true if x and y are pointers to the same object, while (equal x y) is true if the objects x and y have the same extensional value. These two functions are provided partly for efficiency and partly to cover up semantic deficiencies of the implementation. Some other languages provide similar choices for comparison.) The use of reference semantics in OOP is discussed at greater length in (Grogono and Chalin 1994).

9.6 Polymorphism

The word "polymorphism" is derived from Greek and means, literally, "many shapes". In PLs, "polymorphism" is used to describe a situation in which one name can refer to several different entities. The most common application is to functions. There are several kinds of polymorphism; the terms "ad hoc polymorphism" and "parametric polymorphism" are due to Christopher Strachey.

9.6.1 Ad Hoc Polymorphism

In the code

  int m, n;
  float x, y;
  printf("%d %f", m + n, x + y);

the symbol "+" is used in two different ways. In the expression m + n, it stands for the function that adds two integers; in the expression x + y, it stands for the function that adds two floats. In general, ad hoc polymorphism refers to the use of a single function name to refer to two or more distinct functions. Typically, the compiler uses the types of the arguments of the function to decide which function to call. Ad hoc polymorphism is also called "overloading".
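Overload selection by argument type can be made visible with a small sketch (the `add` functions below are invented for illustration; they return a tag instead of a sum so the chosen overload is observable):

```cpp
#include <cassert>
#include <string>

// Ad hoc polymorphism: one name, two distinct functions.
// The compiler selects an overload from the static argument types.
std::string add(int, int)       { return "int+int"; }
std::string add(double, double) { return "float+float"; }
```

The call `add(1, 2)` resolves to the integer version and `add(1.0, 2.0)` to the floating-point version, entirely at compile time.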

Almost all PLs provide ad hoc polymorphism for built-in operators such as "+", "−", and "*". Ada, C++, and other recent languages also allow programmers to overload functions. (Strictly, we should say "overload function names", but the usage "overloaded functions" is common.) In general, all that the programmer has to do is write several definitions, using the same function name, but ensuring that either the number or the type of the arguments are different.

Example 36: Overloaded Functions. The following code declares three distinct functions in C++:

  int max (int x, int y);
  int max (int x, int y, int z);
  float max (float x, float y);

9.6.2 Parametric Polymorphism

Suppose that a language provides the type list as a parameterized type. That is, we can make declarations such as these:

  list(int)
  list(float)

Suppose also that we have two functions: sum computes the sum of the components of a given list, and len computes the number of components in a given list. There is an important difference between these two functions. In order to compute the sum of a list, we must be able to choose an appropriate "add" function, and this implies that we must know the type of the components. On the other hand, there seems to be no need to know the type of the components if all we need to do is count them.

Example 37: Lists in SML. In SML, functions can be defined by cases: "[]" denotes the empty list, and "::" denotes list construction. We write definitions without type declarations, and SML infers the types. Here are functions that sum the components of a list and count the components of a list, respectively:

  sum [] = 0
  | sum (x::xs) = x + sum(xs)

  len [] = 0
  | len (x::xs) = 1 + len(xs)

SML infers the type of sum to be list(int) → int: since the sum of the empty list is (integer) 0, the type of the operator "+" must be int × int → int, and all of the components must be integers. For the function len, which counts the components of a list but does not care about their type, SML can infer nothing about the type of the components from the function definition, and it assigns the type list(α) → int to the function. Here, α is a type variable, acting as a type parameter. When len is applied to an actual list, α will implicitly assume the type of the list components. A function such as len has parametric polymorphism.
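The same contrast between sum and len can be sketched in C++, where a template parameter plays the role of the type variable α (this is a minimal illustration, not part of the original notes):

```cpp
#include <cassert>
#include <list>

// Parametric polymorphism: T is a type parameter, like alpha in
// the SML type list(alpha) -> int. len never inspects the elements.
template <typename T>
int len(const std::list<T> &xs) {
    int n = 0;
    for (const T &x : xs) { (void)x; ++n; }
    return n;
}

// sum must apply an "add" operation to the components, so its type
// is constrained to integer lists, just as SML infers list(int) -> int.
int sum(const std::list<int> &xs) {
    int s = 0;
    for (int x : xs) s += x;
    return s;
}
```

`len` works for a list of any component type, while `sum` compiles only for lists whose components can be added as integers.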

9.6.3 Object Polymorphism

In OOPLs, there may be many different objects that provide a function called, say, f. The effect of invoking f may depend on the object: types are associated with objects rather than with names. This kind of polymorphism is a fundamental, and very important, aspect of OOP. The details of this kind of polymorphism depend on the particular OOPL; OOPLs are discussed in Section 6.

9.7 Assignment

Consider the assignment x := E. Whatever the PL, the semantics of this statement will be something like: evaluate the expression E, then store the value obtained in x. The assignment is unproblematic if x and E have the same type. But what happens if they have different types? There are several possibilities.

The statement is considered to be an error. This will occur in a PL that provides static type checking but does not provide coercion. Pascal, for example, provides only a few coercions (subrange to integer, integer to real, etc.) and rejects other type differences.

The compiler will generate code to convert the value of expression E to the type of x. This approach was taken to extremes in COBOL and PL/I. It exists in C and C++, but C++ is stricter than C.

The value of E will be assigned to x anyway. This is the approach taken by PLs that use dynamic type checking.

9.8 Scope and Extent

The two most important properties of a variable name are scope and extent.

9.8.1 Scope

The scope of a name is the region of the program text in which the name may be used. Scope is a static property of a name that is determined by the semantics of the PL and the text of the program. In C++, for example, the scope of a local variable starts at the declaration of the variable and ends at the end of the block in which the declaration appears, as shown in Listing 65. Examples of other scope rules follow.

Algol 60. Local variables are declared at the beginning of a block. Scopes are nested: a declaration of a name in an inner block hides a declaration of the same name in an outer block, as shown in Listing 67.

FORTRAN. Local variables are declared at the start of subroutines and at the start of the main program. Their scope starts at the declaration and ends at the end of the program unit (that is, subroutine or main program) within which the declaration appears. Subroutine names have global scope. It is also possible to make variables accessible in selected subroutines by means of COMMON statements, as shown in Listing 66. This method of "scope control" is insecure, because the compiler does not attempt to check consistency between program units.

Listing 65: Local scope in C++

  {
    // x not accessible
    . . .
    int x;
    // x accessible
    . . .
    {
      // x still accessible in inner block
      . . .
    }
    // x still accessible
  }
  // x not accessible

Listing 66: FORTRAN COMMON blocks

  subroutine first
  integer i
  real a(100)
  integer b(250)
  common /myvars/ a, b
  . . .
  return
  end

  subroutine second
  real x(100), y(100)
  integer k(250)
  common /myvars/ x, y, k
  . . .
  return
  end

Listing 67: Nested scopes in Algol 60

  begin
    real x;
    . . .
    x := 3.0;
    . . .
    begin
      real x;
      . . .
      x := 6.0
    end
  end

The fine control of scope provided by Algol 60 was retained in Algol 68 but simplified in other PLs of the same generation.

Pascal. Control structures and functions can be nested, but declarations are allowed only at the start of the program or the start of a function declaration. In early Pascal, the order of declarations was fixed: constants, types, variables, statement labels, and functions were declared in that order. Later dialects of Pascal sensibly relaxed this ordering. Pascal also has a rather curious scoping feature: within the statement S of the statement with r do S, a field f of the record r may be written as f rather than as r.f.

C. Function declarations cannot be nested, and all local variables must appear before executable statements in a block. The complete rules of C scoping, however, are complicated by the fact that C has five overloading classes (Harbison and Steele 1987): preprocessor macro names; statement labels; structure, union, and enumeration tags; component names; and other names. Consequently, the following declarations do not conflict:

  {
    char str[200];
    struct str {int m, n; } x;
    . . .
  }

Global Scope. Names declared at the beginning of the outermost block of Algol 60 and Pascal programs have global scope: the name is visible throughout the program. (There may be "holes" in these global scopes if the program contains local declarations of the same names.) Subroutine names in FORTRAN are global, but FORTRAN does not have global variables, although programmers simulate them by over-using COMMON declarations. The largest default scope in C is "file scope"; global scope can be obtained using the storage class extern. Global scope is useful for pervasive entities, such as library functions and fundamental constants (π = 3.1415926), but is best avoided for application variables.

C++. C++ follows C in providing directives such as static and extern to modify scope, and there are a number of additional scope control mechanisms in C++. Class attributes may be declared private (visible only within the class), protected (visible within the class and derived classes), or public (visible outside the class). Similarly, a class may be derived as private, protected, or public. Finally, a function or a class may be declared as a friend of a class, giving it access to the private attributes of the class. In OOPLs, inheritance is, in part, a scoping mechanism: when a child class C inherits a parent class P, some or all of the attributes of P are made visible in C.

Local Scope. In block-structured languages, names declared in a block are local to the block. In Algol 60 and C++, any statement context can be instantiated by a block that contains local variable declarations, so local scopes can be as small as the programmer needs. In other languages, such as Pascal, local declarations can be used only in certain places. There is a question as to whether the scope starts at the point of declaration or is the entire block. (In other words, does the PL require "declaration before use"?) C++ uses the "declare before use" rule, and the program in Listing 68 prints 6. Although a few people do not like the fine control offered by Algol 60 and C++, it seems best to provide the programmer with as much freedom as possible and to keep scopes as small as possible.

Listing 68: Declaration before use

  void main ()
  {
    const int i = 6;
    {
      int j = i;
      const int i = 7;
      cout << j << endl;
    }
  }

Qualified Scope. The components of a structure, such as a Pascal record or a C struct, have names. These names are usually hidden, but can be made visible by the name of the object. The dot notation also appears in many OOPLs, where it is used to select attributes and functions of an object: o.a denotes attribute a of object o. If the attribute is a function, a parameter list follows: o.f(x) denotes the invocation of function f with argument x in object o. For an attribute, the syntax is the same: o.f. Listing 69 provides an example in C++: the construct w.f() opens a scope in which the public attributes of the object w are visible.

Listing 69: Qualified names

  class Widget
  {
  public:
    void f();
  };
  Widget w;
  w.f();  // OK
  f();    // Error: f is not in scope

Import and Export. In modular languages, such as Modula-2, a name or group of names can be brought into a scope with an import clause. The module that provides the definitions must have a corresponding export clause. Typically, if module A exports name X, and module B imports name X from A, then X may be used in module B as if it was declared there.

Namespaces. The mechanisms above are inadequate for very large programs developed by large teams. Problems arise when a project uses several libraries that may have conflicting names.
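The interplay of qualified names and explicit import can be sketched with C++ namespaces (the library names below are invented for illustration):

```cpp
#include <cassert>

// Two libraries may export the same name without conflict if each
// wraps its names in a namespace; clients qualify or import explicitly.
namespace fastgraph { int version() { return 1; } }
namespace slowgraph { int version() { return 2; } }

int pick() {
    using slowgraph::version;              // import one name into this scope
    return fastgraph::version() + version();  // qualified + imported: 1 + 2
}
```

Inside `pick`, the unqualified `version` refers to the imported name, while the other library's function must be reached by qualification — the same trade-off as the Modula-2 import clause, with the same caveat that the reader must find the import to know where a name comes from.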

Interpreters generally have a repository that contains the value of each name currently accessible to the program, but programmers may have limited access to this repository, because PLs typically hide the scoping mechanisms. If a program is compiled, names will normally not be accessible at run-time unless they are included for a specific purpose such as debugging. In fact, one way to view compiling is as a process that converts names to numbers. Scope management is important because the programmer has no "work-arounds" if the scoping mechanisms provided by the PL are inadequate.

Qualified names work well at the level of classes and modules, but they are inadequate for very large programs. Library designers can reduce the potential for name conflicts by using distinctive prefixes: for example, all names supplied by the commercial graphics library FastGraph have fg_ as a prefix. Namespaces, provided by PLs such as C++ and Common LISP, offer a better solution than prefixes: they provide a higher level of name management than the regular language features. The import mechanism has the problem that the source of a name is not obvious where it appears: users must scan possibly remote import declarations.

9.8.2 Are nested scopes useful?

As scope control mechanisms became more complicated, some people began to question the need for nested scopes (Clarke, Wilden, and Wolf 1980; Hanson 1981). Nesting is a form of scope management. On a small scale, for example in statements and functions, nesting works fairly well when the source of names is obvious. It is, however, the cause of occasional errors that could be eliminated by requiring all names in a statement or function to be unique. On a large scale, for example in classes and modules, nesting is less effective because there is more text and a larger number of names. The advantage of nested scopes for large structures, such as classes or modules, is doubtful: explicit control by name qualification may be better than nesting.

C++ allows classes to be nested (this seems inconsistent, given that functions cannot be nested); the effect is that the nested class can be accessed only by the enclosing class. An alternative mechanism would be a declaration that states explicitly that class A can be accessed only by class B. This mechanism would have the advantage that it would also be able to describe constraints such as "class A can be accessed only by classes B, C, and D".

In summary: nested scopes are convenient in small regions, but for large structures it is probably better to provide special mechanisms for the scope control that is needed rather than to rely on simple nesting.

9.8.3 Extent

The extent, also called lifetime, of a name is the period of time during program execution during which the object corresponding to the name exists. Understanding the relation between scope and extent is an important part of understanding a PL; the separation of extent from scope was a key step in the evolution of post-Algol PLs.

Global names usually exist for the entire lifetime of the execution: they have global extent. In FORTRAN, local variables also exist for the lifetime of the execution. Programmers assume that, on entry to a subroutine, its local variables will not have changed since the last invocation of the subroutine.

In Algol 60 and subsequent stack-based languages, local variables are instantiated whenever control enters a block, and they are destroyed (at least in principle) when control leaves the block. It is an error to create an object on the stack and to pass a reference to that object to an enclosing scope. Some PLs attempt to detect this error (e.g., Algol 68), some try to prevent it from occurring (e.g., Pascal), and others leave it as a problem for the programmer (e.g., C and C++).

In PLs that use a reference model, objects usually have unlimited extent, whether or not their original names are accessible. Examples include Simula, Smalltalk, CLU, Eiffel, and all FPLs. The advantage of the reference model is that the problems of disappearing objects (dangling pointers) and inaccessible objects (memory leaks) do not occur. The disadvantage is that GC and the accompanying overhead are more or less indispensable.
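The contrast between FORTRAN-style unlimited extent and Algol-style block extent can be sketched in C++, which offers both through the `static` storage class (names below are invented for illustration):

```cpp
#include <cassert>

// A static local has the extent of the whole execution: it survives
// between calls, like a local variable in classic FORTRAN.
int next_ticket() {
    static int n = 0;  // initialized once, before the first call
    return ++n;
}

// An automatic local has block extent: it is created afresh on every
// entry to the block and destroyed on exit, as in Algol 60.
int fresh_counter() {
    int n = 0;
    return ++n;        // always 1: n never survives a call
}
```

Both variables have the same scope (the function body); only their extents differ, which is exactly the scope/extent separation described above.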

10 Abstraction

Abstraction is one of the most important mental tools in Computer Science. The biggest problems that we face in Computer Science are concerned with complexity, and abstraction helps us to manage complexity. Abstraction is particularly important in the study of PLs because a PL is a two-fold abstraction.

First, a PL must provide an abstract view of the underlying machine: it must hide the low-level details of the machine and present a manageable interface, preferably an interface that does not depend on the particular machine that is being used. The values that we use in programming — integers, floats, booleans, and so on — are abstracted from the raw bits of computer memory; the control structures that we use — assignments, conditionals, loops, and so on — are abstracted from the basic machine instructions.

Second, a PL must provide ways for the programmer to abstract information about the world: it must provide as much help as possible to the programmer who is creating a simplified model of the complex world in a computer program. The idea of abstracting from the world is not new. Most programs are connected to the world in one way or another, and some programs are little more than simulations of some aspect of the world. For example, an accounting program is, at some level, a simulation of tasks that used to be performed by clerks and accountants.

In the evolution of PLs, the first need came before the second. Early PLs emphasize abstraction from the machine: they provide assignments, loops, and arrays, and programs are seen as lists of instructions for the computer to execute. Later PLs emphasize abstraction from the world: they provide objects and processes, and programs are seen as descriptions of the world that happen to be executable.

Here is an early example that contrasts the problem-oriented and machine-oriented approaches to PL design. The design of Algol 60 began in the late fifties and, even at that time, the explicit goal of the designers was a "problem oriented language".

Example 38: Sets and Bits. Pascal provides the problem-oriented abstraction set, as shown in Listing 70. We can declare types whose values are sets, we can define literal constants and variables of these types, and we can write expressions using set operations. In contrast, C provides bit-wise operations on integer data, as shown in Listing 71. These features of Pascal and C have almost identical implementations: in each case, the elements of a set are encoded as positions within a machine word, the presence or absence of an element is represented by a 1-bit or a 0-bit, and the operations for and-ing and or-ing bits included in most machine languages are used to manipulate the words.

Listing 70: Sets in Pascal

  type SmallNum = set of [1..10];
  var s, t: SmallNum;
  var n: integer;
  begin
    s := [1, 3, 5, 7, 9];
    t := s + [2, 4, 6, 8];
    if (n in s) . . .
    if (s <= t) . . .
  end
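The shared implementation technique — one bit position per possible element — can be sketched directly (a minimal illustration; the function names are invented, and only elements 0–31 fit in one 32-bit word):

```cpp
#include <cassert>

// A Pascal-style set of small integers, implemented the machine-oriented
// way: element i is present iff bit i of a machine word is 1.
typedef unsigned int SmallSet;

SmallSet make_set()                        { return 0u; }
SmallSet add(SmallSet s, int i)            { return s | (1u << i); }  // s + [i]
bool     member(SmallSet s, int i)         { return (s & (1u << i)) != 0; }  // i in s
SmallSet set_union(SmallSet a, SmallSet b) { return a | b; }          // a + b
bool     subset(SmallSet a, SmallSet b)    { return (a & ~b) == 0; }  // a <= b
```

Each Pascal set operation maps to a single and/or/not instruction on the word, which is why the two listings have "almost identical implementations" even though one is problem-oriented and the other machine-oriented.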

Listing 71: Bit operations in C

  int n = 0x10101010101;
  int mask = 0x10000;
  . . .
  if (n & mask) . . .

Although the implementations are almost identical, the appearance of the programs is quite different: the Pascal programmer can think at the level of sets, whereas the C programmer is forced to think at the level of words and bits. Neither approach is necessarily better than the other: Pascal was designed for teaching introductory programming, and C was designed for systems programmers. The trend, however, is away from machine abstractions such as bit strings and towards problem domain abstractions such as sets.

10.1 Abstraction as a Technical Term

"Abstraction" is a word with quite general meaning. In Computer Science in general, and in PL in particular, it is often given a special sense: an abstraction mechanism provides a way of naming and parameterizing a program entity.

10.1.1 Procedures. The earliest abstraction mechanism was the procedure, which exists in rudimentary form even in assembly language. Procedural abstraction enables us to give a name to a statement, or group of statements, and to use the name elsewhere in the program to trigger the execution of the statements. Parameters enable us to customize the execution according to the particular requirements of the caller. Interestingly, COBOL does not provide procedure or function abstraction, although the PERFORM verb provides a rather restricted and unsafe way of executing statements remotely.

10.1.2 Functions. The next abstraction mechanism, which also appeared at an early stage, was the function. Functional abstraction enables us to give a name to an expression, and to use the name to trigger evaluation of the expression. In practice, we need to abstract from arbitrarily complex programs whose main purpose is to yield a single value, because an abstraction mechanism that works only for simple expressions is not very useful. LISP provides function abstraction.

Procedures and functions are closely related. Typically, they have similar syntax and similar rules of construction; the difference between them is most evident at the call site, where a procedure call appears in a statement context and a function call appears in an expression context. FORTRAN provides procedures (called "subroutines") and functions. Algol 60, Algol 68, and C take the view that everything is an expression. (The idea seems to be that, since any computation leaves something in the accumulator, the programmer might as well be allowed to use it.) Thus statements have a value in these languages, and the effect of procedures is obtained by writing functions with side-effects. It is not surprising that this viewpoint minimizes the difference between procedures and functions.

In Algol 68 and C, a procedure is simply a function that returns a value of type void. Since this value is unique, it requires log 1 = 0 bits of storage and can be optimized away by the compiler.

10.1.3 Data Types. Pascal provides primitive types, such as integer, real, and boolean, and type expressions (here C denotes an integer constant and T denotes a type):

  C..C
  (C, C, . . . , C)
  array [T] of T
  record v: T; . . . end
  set of T
  file of T

Type expressions can be named, and the names can be used in variable declarations or as components of other types. Pascal, however, does not provide a parameter mechanism for types. Although Algol 68 provided a kind of data type abstraction, Pascal was more systematic and more influential.

10.1.4 Classes. Simula introduced an abstraction mechanism for classes. Although this was an important and profound step, with hindsight we can see that two simple ideas were needed: giving a name to an Algol block, and freeing the data component of the block from the tyranny of stack discipline.

10.2 Computational Models

There are various ways of describing a PL. We can write an informal description in natural language (e.g., the Pascal User Manual and Report (Jensen and Wirth 1976)), a formal description in natural language (e.g., the Algol report (Naur et al. 1960)), or a document in a formal notation (e.g., the Revised Report on Algol 68 (van Wijngaarden et al. 1975)). Alternatively, we can provide a formal semantics for the language, using the axiomatic, denotational, or operational approaches. There is, however, an abstraction that is useful for describing a PL in general terms, without the details of every construct, and that is the computational model (also sometimes called the semantic model). The computational model (CM) is an abstraction of the operational semantics of the PL: it describes the effects of various operations without describing the actual implementation. Programmers who use a PL acquire intuitive knowledge of its CM; when a program behaves in an unexpected way, the implementation has failed to respect the CM.

Example 39: Passing parameters. Suppose that a PL specifies that arguments are passed by value. This means that the programs in Listing 72 and Listing 73 are equivalent. (The statement T a = E means "the variable a has type T and initial value E", and the statement x ← a means "the value of a is assigned, by copying, to x".) A compiler could implement this program by copying the value of the argument, as if executing the assignment x ← a. Alternatively, it could pass the address of the argument a to the function but not allow the statement S to change the value of a. Both implementations respect the CM.

Listing 72: Passing a parameter

  void f (T x)
  {
    S;
  }
  T a = E;
  f (a);

Listing 73: Revealing the parameter passing mechanism

  T a = E;
  {
    T x;
    x ← a;
    S;
  }

Example 40: Parameters in FORTRAN. If a FORTRAN subroutine changes the value of one of its formal parameters (called "dummy variables" in FORTRAN parlance), the value of the corresponding actual argument at the call site also changes. This enables the subroutine both to access the value of the argument and to alter it. The most common way of achieving this result is to pass the address of the argument to the subroutine. For example, if we define the subroutine

  subroutine p (count)
  integer count
  count = count + 1
  return
  end

then, after executing the sequence

  integer total
  total = 0
  call p(total)

the value of total is 1. The programmer has a mental CM for FORTRAN which predicts that, although variable arguments may be changed by a subroutine, constant arguments will not be. The problem arises because typical FORTRAN compilers do not respect this CM. Unfortunately, most FORTRAN compilers use the same mechanism for passing constant arguments: the effect of executing call p(0) is to change all instances of 0 (the constant "zero") in the program to 1 (the constant "one")! The resulting bugs can be quite hard to detect for a programmer who is not familiar with FORTRAN's idiosyncrasies.

A simple CM does not imply easy implementation. Often, the opposite is true: C is easy to compile, but has a rather complex CM.
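The two parameter-passing behaviours in this example can be sketched side by side in C++ (a minimal illustration; the names are invented):

```cpp
#include <cassert>

// Pass by value: the function receives a copy, so the caller's
// argument is unchanged -- the CM of Listing 72/73.
void inc_by_value(int count) { count = count + 1; }

// Pass by reference: the function updates the caller's variable,
// which is what the FORTRAN programmer expects of call p(total).
void inc_by_reference(int &count) { count = count + 1; }

int demo_value()     { int total = 0; inc_by_value(total);     return total; }
int demo_reference() { int total = 0; inc_by_reference(total); return total; }
```

Note that C++ rejects `inc_by_reference(0)` at compile time, because a literal cannot bind to a non-const reference; this is precisely the check whose absence makes `call p(0)` so dangerous in old FORTRAN implementations.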

Example 41: Array Indexing. In Pascal's CM, an array is a mapping. The declaration

  var A : array [1..N] of real;

introduces A as a mapping:

  A : { 1, . . . , N } → R.

A compiler can implement arrays in any suitable way that respects this mapping. The CM of C is quite different: to understand it, we must know that an array can be represented by a pointer to its first element, and that we can access elements of the array by adding the index (multiplied by an appropriate but implicit factor) to this pointer: a[i] ≡ ∗(a + i).

Example 42: Primitives in OOPLs. There are two ways in which an OOPL can treat primitive entities such as booleans, characters, and integers.

Primitive values can be given the same status as objects. In Smalltalk, every variable is an object. This simplifies the programmer's task, because all variables behave like objects. Early implementations of Smalltalk were inefficient, however, because all variables were actually implemented as objects. A solution for this dilemma is to define a CM for the language in which all variables are objects; an implementor is then free to implement primitive values efficiently, provided that the behaviour predicted by the CM is obtained at all times.

Alternatively, primitive values can have a special status that distinguishes them from objects. Java provides primitives with efficient implementations but, to support various OO features, also has classes for the primitive types. A language that uses this policy is likely to be efficient, because the compiler can process primitives without the overhead of treating them as full-fledged objects. However, it may be confusing for the user to deal with two different kinds of variable — primitive values and objects.

Example 43: λ-Calculus as a Computational Model. FPLs are based on the theory of partial recursive functions; they attempt to use this theory as a CM. The advantage of this point of view is that program text resembles mathematics and, to a certain extent, programs can be manipulated as mathematical objects. The disadvantage is that mathematics and computation are not the same. The underlying difficulty is that many computations depend on a concept of mutable state that does not exist in mathematics. Some operations that are intuitively simple, such as maintaining the balance of a bank account subject to deposits and withdrawals, are not easily described in classical mathematics.

FPLs have an important property called referential transparency. An expression is referentially transparent if its only attribute is its value. The practical consequence of referential transparency is that we can freely substitute equal expressions: for example, we can replace E + E by 2 × E. If we are dealing with numbers, the replacement would save time if the evaluation of E requires a significant amount of work. However, if E were not referentially transparent — perhaps because its evaluation requires reading from an input stream or the generation of random numbers — the replacement would change the meaning of the program.
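The substitution E + E ⇒ 2 × E can be tested concretely (a minimal sketch with invented names; the global counter stands in for any side effect such as input or random-number generation):

```cpp
#include <cassert>

// Referentially transparent: pure(x)'s only attribute is its value,
// so pure(3) + pure(3) may be replaced by 2 * pure(3).
int pure(int x) { return x * x; }

int calls = 0;  // observable state shared between evaluations

// Not referentially transparent: each evaluation has a side effect,
// so impure() + impure() and 2 * impure() mean different things.
int impure() { return ++calls; }
```

For `pure`, both forms yield 18; for `impure`, the two-call form yields 1 + 2 = 3 while the doubled single call yields 2, so the "equal expression" substitution changes the program's meaning.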

Example 44: Literals in OOP. How should an OOPL treat literal constants, such as 7? The following discussion is based on (Kölling 1999, pages 71–74). There are two possible ways of interpreting the symbol "7": we could interpret it as an object constructor, constructing the object 7, or as a constant reference to an object that already exists.

Smalltalk uses the second approach: literals are constant references to objects. This introduces several problems. Smalltalk does not explain where these objects come from, or when and how they are created; they "just magically exist in the Smalltalk universe". For example, what does the comparison 7 = 7 mean? Are there two 7-objects or just one?

Dee (Grogono 1991a) uses a variant of the first alternative: literals are constructors. The CM of Dee says that the first occurrence of the literal "7" constructs a new object, and subsequent occurrences return a reference to this (unique) object.

Blue avoids these problems by introducing the new concept of a manifest class. Manifest classes are defined by enumeration. The enumeration may be finite, as for class Boolean with members true and false; infinite enumerations, as for class Integer, are also allowed. The advantage of the Blue approach, according to Kölling (1999, page 72), is that:

  The object model explicitly recognizes these two different types of classes. . . . They cease to be anomalies or special cases at a technical level. . . . the distinction is made at the logical level. . . . and differences between numbers and complex objects can be understood independently of implementation concerns. This reduces the dangers of misunderstandings on the side of the programmer. [Emphasis added.]

Exercise 36. Do you think that Kölling's arguments for introducing a new concept (manifest classes) are justified?

Summary

Early CMs modelled programs as code acting on data; programs were structured by recursive decomposition of code. Later CMs modelled programs as packages of code and data; programs were structured by recursive decomposition of packages of code and data. The CM should: help programmers to reason about programs; help programmers to read and write programs; and constrain, but not determine, the implementation.

11 Implementation

This course is concerned mainly with the nature of PLs; implementation techniques are a separate, and very rich, topic, and there are few interesting things that we can say about implementation in isolation. Nevertheless, implementation has influenced the design of PLs in several ways, and that is the purpose of this section.

11.1 Compiling and Interpreting

11.1.1 Compiling

As we have seen, the need for programming languages at a higher level than assembly language was recognized early. People also realized that the computer itself could perform the translation from a high level notation to machine code. In the following quote, "conversion" is used in roughly the sense that we would say "compilation" today.

  It is to be hoped (and it is indeed one of the objects of using a conversion routine) that many of the tiresome blunders that occur in present-day programmes will be avoided when programmes can be written in a language in which the programmer feels more at home. However, any mistake that the programmer does commit may prove more difficult to find because it will not be detected until the programme has been converted into machine language. Before he can find the cause of the trouble the programmer will either have to investigate the conversion process or enlist the aid of checking routines which will 'reconvert' and provide him with diagnostic information in his own language. In any case, large storage capacities will lead in the course of time to very elaborate conversion routines. Programmes in the future may be written in a very different language from that in which the machines execute them. (Gill 1954)

A compiler translates source language to machine language. Since practical compilation systems allow programs to be split into several components for compiling, there are usually several phases:

1. The components of the source program are compiled into object code.
2. The object code units and library functions required by the program are linked to form an executable file.
3. The executable file is loaded into memory and executed.

Since machine code is executed directly, compiled programs ideally run at the maximum possible speed of the target machine. In practice, the code generated by a compiler is usually not optimal. For very small programs, a programmer skilled in assembly language may be able to write more efficient code than a compiler can generate. For large programs, however, a compiler can perform global optimizations that are infeasible for a human programmer and, in any case, hand-coding such large programs would require excessive time.

All compilers perform some checks to ensure that the source code that they are translating is correct. In early languages, such as FORTRAN, COBOL, and PL/I, the checking was quite superficial, and it was easy to write programs that compiled without errors but failed when they were executed. The value of compile-time checking began to be recognized in the late 60s, and languages such as Pascal, Ada, and C++ provided more thorough checking than their predecessors. The goal of checking is to ensure that, if a program compiles without errors, it will not fail when it is executed.

11. The need to compile programs in parts was recognized at an early stage: even FORTRAN was implemented in this way. however. because programs do not contain type declarations. Early BASIC and LISP systems relied on interpretation. Dee. This goal is hard to achieve and. Modula-2 programs consist of interface modules and implementation modules. syntactic) errors as the program is read. The compiler checks that declarations and definitions are compatible. most interpreters do a certain amount of translation before executing the code.2 Interpreting An interpreter accepts source language and run-time input data and executes the program directly. but recent dialects of Pascal provide program components. but this approach tends to be inefficient. interpreted languages can provide immediate response. run-time failure. A key design decision in C++. The type checking performed by both of these languages is actually more thorough than .1.g. Since no time is spent in compiling. at best. An Algol 60 program is a block and must be compiled as a block. Eiffel programs consist of classes that are compiled separately. Consequently. Overloaded functions in C++ present a problem because one name may denote several functions. When the compiled components are linked. When the compiler compiles an implementation module. which is then “executed”. was to use the standard linker (Stroustrup 1994. It is possible to interpret without performing any translation at all. Most interpreters check correctness of programs during execution. Standard Pascal programs cannot be split up. although some interpreters may report obvious (e. but this is not logically necessary. such as an abstract syntax tree.11 IMPLEMENTATION 145 it is executed. typically by a factor of 10 or more. Other OOPLs that use this method include Blue. Language such as LISP and Smalltalk are often said to be untyped. the source program might be converted into an efficient internal representation. 
The result of such a discrepancy is. the interface modules are themselves compiled for efficiency. but more recent interactive systems provide an option to compile and may even provide compilation only. the linker will report subroutines that have not been defined but it will not detect discrepancies in the number of type of arguments. page 34). but this is incorrect. C and C++ rely on the programmer to provide header files containing declarations and implementation files containing definitions. Compiled PLs can be classified according to whether the compiler can accept program components or only entire programs. The compiler automatically derives an interface from the class declaration and uses the generated interfaces of other classes to perform full checking. and Java. however. it reads all of the relevant interfaces and performs full checking. The danger of separate compilation is that there may be inconsistencies between the components that the compiler cannot detect. even today. In practice. Interpreting is slower than executing compiled code. The solution is “name mangling” — the C++ compiler alters the names of functions by encoding their argument types in such a way that each function has a unique name. prototyping environments often provide an interpreted language. A FORTRAN compiler provides no checking between components.. We can identify various levels of checking that have evolved. For example. For this reason. there are few languages that achieve it totally.
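The name mangling used by C++ compilers can be imitated with a toy encoder. The encoding below is invented for illustration; real compilers each use their own conventions, and the type codes and prefix here are assumptions, not any compiler's actual scheme.

```python
# Toy name mangler: encode argument types into the linker-level name,
# so that overloaded functions get distinct symbols. This scheme is
# invented for illustration; real C++ compilers use their own encodings.

TYPE_CODES = {"int": "i", "double": "d", "char*": "Pc"}

def mangle(name, arg_types):
    """Return a distinct linker name for 'name' overloaded on arg_types."""
    return "_Z" + str(len(name)) + name + "".join(TYPE_CODES[t] for t in arg_types)

print(mangle("max", ["int", "int"]))        # _Z3maxii
print(mangle("max", ["double", "double"]))  # _Z3maxdd
```

Because each overloading now has a unique name, an ordinary linker that matches symbols by string comparison can link overloaded functions correctly.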

The difference is that errors are not detected until the offending statements are actually executed. For safety-critical applications, this may be too late.

11.1.3 Mixed Implementation Strategies

Some programming environments, particularly those for LISP, provide both interpretation and compilation. For example, some LISP environments allow source code and compiled code to be freely mixed.

Byte Codes. Another mixed strategy consists of using a compiler to generate byte codes and an interpreter to execute the byte codes. The "byte codes", as the name implies, are simply streams of bytes that represent the program: a machine language for an "abstract machine". One of the first programming systems to be based on byte codes was the Pascal "P" compiler.

The Pascal "P" compiler provided a simple method for porting Pascal to a new kind of machine. The "Pascal P package" consisted of the P-code compiler written in P-code. A Pascal implementor had to perform the following steps:

1. Write a P-code interpreter. Since P-code is quite simple, this is not a very arduous job.

2. Use the P-code interpreter to interpret the P-code compiler. This step yields a P-code program that can be used to translate Pascal programs to P-code. At this stage, the implementor has a working Pascal system, but it is inefficient because it depends on the P-code interpreter.

3. Modify the compiler source so that the compiler generates machine code for the new machine. This is a somewhat tedious task, but it is simplified by the fact that the compiler is written in Pascal.

4. Use the compiler created in step 2 to compile the modified compiler. The result of this step is a P-code program that translates Pascal to machine code.

5. Use the result of step 4 to compile the modified compiler. The result is a machine code program that translates Pascal to machine code — in other words, a Pascal compiler for the new machine.

Needless to say, all of these steps must be carried out with great care. It is important to ensure that each step is completed correctly before beginning the next. With the exercise of due caution, the complete sequence of steps can be used to generate a Pascal compiler that can compile itself. In fact, the process described above (called bootstrapping) can be simplified: the first version of the compiler, created in step 2, does not have to compile the entire Pascal language, but only those parts that are needed by the compiler itself, and may be incomplete in other respects. When that compiler is available, it can be extended to compile the full language.

The advantage of byte code programs is that they can, in principle, be run on any machine. Thus byte codes provide a convenient way of implementing "portable" languages. Java uses byte codes to obtain portability in a different way. Java was designed to be used as an internet language. Any program that is to be distributed over the internet must be capable of running on all the servers and clients of the network. If the program is written in a conventional language, such as C, there must be a compiled version for every platform (a platform consists of a machine and operating system). If the program is written in Java, it is necessary only to provide a Java byte code interpreter for each platform.
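The abstract-machine idea behind P-code and Java byte codes can be shown with a miniature stack machine. The instruction set below is invented and far simpler than any real byte code:

```python
# A miniature byte-code interpreter for an invented stack machine.
# PUSH n pushes a constant; ADD and MUL pop two values and push the result.

def run(code):
    stack = []
    for op, *args in code:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

# (2 + 3) * 4, compiled by hand into byte codes:
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]
print(run(program))  # 20
```

A compiler for a portable language emits instruction streams like `program`; only `run`, the interpreter, has to be rewritten for each new machine.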

Just In Time. "Just in time" (JIT) compilation is a recent strategy that is used for Java and a few research languages. The idea is that the program is interpreted, or perhaps compiled rapidly without optimization. As each statement is executed, efficient code is generated for it. This approach is based on the following assumptions: in a typical run, much of the code of a program will not be executed; and, if a statement has been executed once, it is likely to be executed again. Thus JIT combines some of the advantages of interpretation and compilation. JIT provides fast response, because the program can be run immediately, and efficient execution, because the second and subsequent cycles of a loop are executed from efficiently compiled code.

Interactive Systems. A PL can be designed for writing large programs that run autonomously or for interacting with the user. An interactive PL is usually interpreted, but interpretation is not the only possible means of implementation. The categories overlap: PLs that can be used interactively can also be used to develop large programs, and a PL intended for interpretation, such as BASIC, is usually implemented interactively.

11.1.4 Consequences for Language Design

Any PL can be compiled or interpreted, but a PL is suitable for interactive use only if programs can be described in small chunks. This implies that the user must be able to enter short sequences of code that can be executed immediately in the current context. FPLs are often supported interactively because a programmer can build up a program as a collection of function definitions, where most functions are fairly small (e.g., they fit on a screen). An implementation of LISP is usually interactive: the user can define new functions and invoke any function that has previously been defined. LISP, Smalltalk, and a number of other interactive PLs followed this pattern of development. PLs that start out with interpreters often acquire compilers at a later stage — BASIC is an example, and many FPLs are implemented in this way (often with a compiler). Block structured languages and modular languages, however, are less suitable for this kind of program development. The converse is not true: an interactive environment for C++ would not be very useful.

11.2 Garbage Collection

There is a sharp distinction: either a PL is implemented with garbage collection (GC) or it is not. The implementor does not really have a choice. LISP, Smalltalk, or Java without GC would be useless — programs would exhaust memory in a few minutes at most. Conversely, it is almost impossible to implement C or C++ with GC, because programmers can perform arithmetic on pointers.

The imperative PL community has been consistently reluctant to incorporate GC: there is no widely-used imperative PL that provides GC. The functional PL community has been equally consistent, but in the opposite direction. The first FPL, LISP, provides GC, and so do all of its successors, such as SML. The OOP community is divided on the issue of GC, but most OOPLs provide it: Simula, Smalltalk, CLU, Eiffel, and Java. The exception, of course, is C++, for reasons that have been clearly explained (Stroustrup 1994, page 219).
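As a rough sketch of the JIT idea, translate a statement the first time it is executed and reuse the translated form afterwards. The caricature below uses Python's built-in compile as a stand-in for a native code generator; the class and method names are invented for illustration.

```python
# Caricature of JIT execution: the first time an expression is run it is
# compiled on the spot; the compiled form is cached so later runs reuse
# it. Python's compile() stands in for a generator of machine code.

class MiniJIT:
    def __init__(self):
        self.cache = {}          # source text -> compiled code object

    def execute(self, src, env):
        code = self.cache.get(src)
        if code is None:         # first execution: slow path
            code = compile(src, "<jit>", "eval")
            self.cache[src] = code
        return eval(code, {}, env)

jit = MiniJIT()
for x in range(5):               # loop body compiled once, reused 4 times
    result = jit.execute("x * x + 1", {"x": x})
print(result, len(jit.cache))    # 17 1
```

The second and subsequent iterations of the loop run entirely from the cached, already-translated form, which is exactly the assumption that makes JIT pay off.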

I deliberately designed C++ not to rely on [garbage collection]. I feared the very significant space and time overheads I had experienced with garbage collectors. I feared the added complexity of implementation and porting that a garbage collector would impose. Also, garbage collection would make C++ unsuitable for many of the low-level tasks for which it was intended. I like the idea of garbage collection as a mechanism that simplifies design and eliminates a source of errors. However, I am fully convinced that had garbage collection been an integral part of C++ originally, C++ would have been stillborn. (Stroustrup 1994, page 220) [Emphasis in the original.]

It is interesting to study Stroustrup's arguments in detail. In his view, the "fundamental reasons" that garbage collection is desirable are (Stroustrup 1994, page 220):

1. Garbage collection is easiest for the user. In particular, it simplifies the building and use of some libraries.

2. Garbage collection is more reliable than user-supplied memory management schemes for some applications.

The emphasis is added: Stroustrup makes it clear that he would have viewed GC in a different light if it provided advantages for all applications. Stroustrup provides more arguments against GC than in favour of it (Stroustrup 1994, page 220):

1. Garbage collection causes run-time and space overheads that are not affordable for many current C++ applications running on current hardware.

2. Many garbage collection techniques imply service interruptions that are not acceptable for important classes of applications, such as hard real-time applications, device drivers, control applications, human interface code on slow hardware, and operating system kernels.

3. Some applications do not have the hardware resources of a traditional general-purpose computer.

4. Some garbage collection schemes require banning several basic C facilities such as pointer arithmetic, unchecked arrays, and unchecked function arguments as used by printf().

5. Some garbage collection schemes impose constraints on object layout or object creation that complicates interfacing with other languages.

These are sufficient arguments against the view that every application would be better done with garbage collection. Similarly, these are sufficient arguments against the view that no application would be better done with garbage collection. It is hard to argue with Stroustrup's conclusion (Stroustrup 1994, page 220): "I know that there are more reasons for and against, but no further reasons are needed."

Stroustrup's arguments do not preclude GC as an option for C++ and, in fact, Stroustrup advocates this. Most C++ programmers spend large amounts of time figuring out when to call destructors and debugging the consequences of their wrong decisions. But it remains true that it is difficult to add GC to C++, and the best that can be done is probably "conservative" GC. Conservative GC depends on the fact that it is possible to guess with a reasonable degree of accuracy whether a particular machine word is a pointer (Boehm and Weiser 1988). In typical modern processors, a pointer is a 4-byte quantity, aligned on a word boundary, and its value is a legal address.

By scanning memory and noting words with these properties, the GC can identify regions of memory that might be in use. The problem is that there will occasionally be words that look like pointers but actually are not: this results in about 10% of garbage being retained. The converse, and more serious, problem of identifying live data as garbage cannot occur: hence the term "conservative". Several commercial products that offer "conservative" GC are now available, and their existence suggests that developers will increasingly respect the contribution of GC.

GC has divided the programming community into two camps: those who use PLs with GC and consider it indispensable, and those who use PLs without GC and consider it unnecessary. Experience with C++ is beginning to affect the polarization, as the difficulties and dangers of explicit memory management in complex OO systems become apparent.

The concern about the efficiency of GC is the result of the poor performance of early, mark-sweep collectors. Modern garbage collectors have much lower overheads and are "tunable" in Meyer's sense. Meyer (1992, page 335) recommends that an Eiffel garbage collector should have the following properties:

1. It should be efficient, with a low overhead on execution time.

2. It should be incremental, working in small, frequent bursts of activity rather than waiting for memory to fill up.

3. It should be tunable: programmers should be able to switch off GC in critical sections and to request a full clean-up when it is safe to do so.

Eiffel is a public-domain language that can be implemented by anyone who chooses to do so — there are in fact a small number of companies other than ISE that provide Eiffel. Implementors are encouraged, but not required, to provide GC. The implementation of Eiffel provided by Meyer's own company, ISE, has a garbage collector that can be turned on and off.

Larose and Feeley (1999) describe a garbage collector for a Scheme compiler. Programs written in Erlang, translated into Scheme, and compiled with this compiler are used in ATM switches. (Erlang is a proprietary functional language developed by Ericsson.)
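The mark-sweep collectors mentioned above work in two phases: mark everything reachable from the roots, then sweep up the rest. A minimal sketch over an explicit object graph (the representation here is invented for illustration):

```python
# Minimal mark-sweep collection over an explicit object graph.
# Each "object" is a dict whose "refs" list points at other objects.

def collect(heap, roots):
    """Return the list of live objects; everything else is garbage."""
    marked = set()
    stack = list(roots)
    while stack:                        # mark phase: trace from the roots
        obj = stack.pop()
        if id(obj) not in marked:
            marked.add(id(obj))
            stack.extend(obj["refs"])
    # sweep phase: keep the marked objects, discard the rest
    return [obj for obj in heap if id(obj) in marked]

a = {"name": "a", "refs": []}
b = {"name": "b", "refs": [a]}
c = {"name": "c", "refs": []}           # unreachable: will be collected
live = collect([a, b, c], roots=[b])
print(sorted(o["name"] for o in live))  # ['a', 'b']
```

A real collector frees the unmarked objects' storage instead of building a new list, and an incremental collector in Meyer's sense would interleave small amounts of this work with execution rather than scanning the whole heap at once.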

12 Conclusion

Values and Objects. Mathematically oriented views of programming favour values; simulation oriented views favour objects. Both views are useful for particular applications. A PL that provides both in a consistent and concise way might be simple, expressive, and powerful. The "logical variable" (a variable that is first unbound, then bound, and later may be unbound during backtracking) is a third kind of entity, after values and objects. It might be fruitful to design a PL with objects, logical variables, unification, and backtracking. Such a language would combine the advantages of OOP and logic programming.

Practice and Theory. The evolution of PLs shows that, most of the time, practice leads theory. Designers and implementors introduce new ideas; then theoreticians attempt to explain what they did and how they could have done it better. There are a number of PLs that have been based on purely theoretical principles but few, if any, are in widespread use. Theory-based PLs are important because they provide useful insights. Ideally, they serve as testing environments for ideas that may eventually be incorporated into new mainstream PLs. Unfortunately, it is often the case that they show retroactively how something should have been done when it is already too late to improve it. But programs are not mathematical objects: a rigorously mathematical approach can lead all too rapidly to the "Turing tar pit".

Multiparadigm Programming. PLs have evolved into several strands: procedural programming, now more or less subsumed by object oriented programming; functional programming; and logic programming and its successor, constraint programming. Each paradigm performs well on a particular class of applications. Since people are always searching for "universal" solutions, it is inevitable that some will try to build a "universal" PL by combining paradigms. Such attempts will be successful to the extent that they achieve overall simplification. It is not clear that any future universal language will be accepted by the programming community; it seems more likely that specialized languages will dominate in the future.

Making Progress. It is not hard to design a large and complex PL by throwing in numerous features. The hard part of design is to find simple, powerful abstractions — to achieve more with less. In this respect, PL design resembles mathematics. The significant advances in mathematics are often simplifications that occur when structures that once seemed distinct are united in a common abstraction. Similar simplifications have occurred in the evolution of PLs: consider, for example, Simula (Section 6.1) and Scheme (Section 4.2).
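The behaviour of a logical variable, bound during forward execution and unbound again during backtracking, can be sketched with a trail of bindings. This is a simplified illustration, not the implementation of any particular logic programming system; all names are invented.

```python
# Sketch of logical variables with a binding trail. bind() records each
# binding on the trail; undo_to() unbinds, in reverse order, when the
# program backtracks past a choice point.

class LogicVar:
    def __init__(self):
        self.value = None     # None means unbound

trail = []

def bind(var, value):
    var.value = value
    trail.append(var)

def undo_to(mark):
    while len(trail) > mark:
        trail.pop().value = None   # unbind on backtracking

x = LogicVar()
mark = len(trail)    # choice point
bind(x, 42)
print(x.value)       # 42
undo_to(mark)        # backtrack past the choice point
print(x.value)       # None: unbound again
```

Unification would extend bind so that two unbound variables can be linked and bound together; the trail mechanism is what makes undoing such bindings cheap.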

References

Abelson, H. and G. J. Sussman (1985). Structure and Interpretation of Computer Programs. MIT Press.

Apt, K. R., J. Brunekreef, V. Partington, and A. Schaerf (1998, September). Alma-0: an imperative language that supports declarative programming. ACM Trans. Programming Languages and Systems 20(5), 1014–1066.

Arnold, K. and J. Gosling (1998). The Java Programming Language (Second ed.). The Java Series. Addison Wesley.

Ashcroft, E. A. and W. W. Wadge (1982, April). Rx for semantics. ACM Trans. Programming Languages and Systems 4(2), 283–294.

Backus, J. (1957). The FORTRAN automatic coding system. In Proceedings of the Western Joint Computer Conference.

Backus, J. (1978, August). Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. Communications of the ACM 21(8), 613–641.

Bergin, T. J. and R. G. Gibson (Eds.) (1996). History of Programming Languages–II. ACM Press (Addison Wesley).

Boehm, H. and M. Weiser (1988). Garbage collection in an uncooperative environment. Software: Practice & Experience 18(9), 807–820.

Böhm, C. and G. Jacopini (1966, May). Flow diagrams, Turing machines, and languages with only two formation rules. Communications of the ACM 9(5), 366–371.

Bratko, I. (1990). PROLOG: Programming for Artificial Intelligence (Second ed.). Addison Wesley.

Brinch Hansen, P. (1999, April). Java's insecure parallelism. ACM SIGPLAN Notices 34(4), 38–45.

Brooks, F. P., Jr. (1975). The Mythical Man-Month. Addison Wesley.

Brooks, F. P., Jr. (1987, April). No silver bullet: Essence and accidents of software engineering. IEEE Computer 20(4), 10–18.

Brooks, F. P., Jr. (1995). The Mythical Man-Month (Anniversary ed.). Addison Wesley.

Budd, T. (1995). Multiparadigm Programming in Leda. Addison Wesley.

Church, A. (1941). The calculi of lambda-conversion. In Annals of Mathematics Studies, Volume 6. Princeton University Press.

Clarke, L. A., J. C. Wilden, and A. L. Wolf (1980, November). Nesting in Ada programs is for the birds. In Proc. ACM SIGPLAN Symposium on the Ada Programming Language. Reprinted as ACM SIGPLAN Notices, 15(11):139–145, November 1980.

Colmerauer, A., H. Kanoui, R. Pasero, and P. Roussel (1973). Un système de communication homme-machine en français. Technical report, Groupe de Recherche en Intelligence Artificielle, Université d'Aix-Marseille.

Dahl, O.-J., E. W. Dijkstra, and C. A. R. Hoare (1972). Structured Programming. Academic Press.

Dijkstra, E. W. (1968, August). Go to statement considered harmful. Communications of the ACM 11(8), 538.

Gelernter, D. and S. Jagannathan (1990). Programming Linguistics. MIT Press.

Wikström, Å. (1987). Functional Programming Using Standard ML. Prentice Hall.

Gill, S. (1954). Getting programmes right. In International Symposium on Automatic Digital Computing. National Physical Laboratory. Reprinted as pp. 289–292 of (Williams and Campbell-Kelly 1989).

Griswold, R. E. and M. T. Griswold (1983). The Icon Programming Language. Prentice Hall.

Grogono, P. (1991a, January). The Dee report. Technical Report OOP–91–2, Department of Computer Science, Concordia University, Montreal, Quebec. Available as www.cs.concordia.ca/~faculty/grogono/dee.html.

Grogono, P. (1991b, January). Issues in the design of an object oriented programming language. Structured Programming 12(1), 1–15. Available as www.cs.concordia.ca/~faculty/grogono/oopissue.ps.

Grogono, P. and P. Chalin (1994). Copying, sharing, and aliasing. In Colloquium on Object Orientation in Databases and Software Engineering (ACFAS'94), Montreal, Quebec. Available as www.cs.concordia.ca/~faculty/grogono/copying.ps.

Hanson, D. R. (1981, August). Is block structure necessary? Software: Practice and Experience 11(8), 853–866.

Harbison, S. P. and G. L. Steele, Jr. (1987). C: A Reference Manual (second ed.). Prentice Hall.

Hoare, C. A. R. (1968). Record handling. In Programming Languages, pp. 291–347. Academic Press.

Hoare, C. A. R. (1974). Hints on programming language design. In C. Bunyan (Ed.), Computer Systems Reliability, Volume 20 of State of the Art Report, pp. 505–34. Pergamon/Infotech. Reprinted as pages 193–216 of (Hoare 1989).

Hoare, C. A. R. (1975, June). Recursive data structures. Int. Journal of Computer and Information Science 4(2), 105–32. Reprinted as pages 217–243 of (Hoare 1989).

Hoare, C. A. R. (1989). Essays in Computing Science. Prentice Hall. Edited by C. B. Jones.

Jensen, K. and N. Wirth (1976). Pascal: User Manual and Report (Second ed.). Springer-Verlag.

Joyner, I. (1992). C++? A critique of C++. Available from ian@syacus.acus.oz.au.

Kay, A. (1996). The early history of Smalltalk. In History of Programming Languages–II, pp. 511–578. ACM Press (Addison Wesley).

Kleene, S. C. (1936). General recursive functions of natural numbers. Mathematische Annalen 112, 727–742.

Knuth, D. E. (1974). Structured programming with GOTO statements. ACM Computing Surveys 6(4), 261–301.

Kölling, M. (1999, March). The Design of an Object-Oriented Environment and Language for Teaching. Ph.D. thesis, Basser Department of Computer Science, University of Sydney.

Kowalski, R. A. (1974). Predicate logic as a programming language. In Information Processing 74, pp. 569–574. North Holland.

LaLonde, W. R. and J. Pugh (1995, March/April). Complexity in C++: a Smalltalk perspective. Journal of Object-Oriented Programming 8(1), 49–56.

Landin, P. J. (1965). A correspondence between Algol 60 and Church's lambda-notation. Communications of the ACM 8, 89–101 and 158–165.

Larose, M. and M. Feeley (1999). A compacting incremental collector and its performance in a production quality compiler. In ACM SIGPLAN International Symposium on Memory Management (ISMM'98). Reprinted as ACM SIGPLAN Notices 34(3):1–9, March 1999.

Lientz, B. P. and E. B. Swanson (1980). Software Maintenance Management. Addison Wesley.

Lindsey, C. H. (1996). A history of Algol 68. In History of Programming Languages–II, pp. 27–83. ACM Press (Addison Wesley).

Liskov, B. (1996). A history of CLU. In History of Programming Languages–II, pp. 471–496. ACM Press (Addison Wesley).

Liskov, B. and J. Guttag (1986). Abstraction and Specification in Program Development. MIT Press.

Liskov, B. and S. Zilles (1974, April). Programming with abstract data types. ACM SIGPLAN Notices 9(4), 50–9.

MacLennan, B. J. (1982, December). Values and objects in programming languages. ACM SIGPLAN Notices 17(12), 70–79.

Martin, J. J. (1986). Data Types and Data Structures. Prentice Hall International.

McAloon, K. and C. Tretkoff (1995). 2LP: Linear programming and logic programming. In P. Van Hentenryck and V. Saraswat (Eds.), Principles and Practice of Constraint Programming, pp. 101–116. MIT Press.

McCarthy, J. (1960, April). Recursive functions of symbolic expressions and their computation by machine, Part I. Communications of the ACM 3(4), 184–195.

McCarthy, J. (1978). History of LISP. In History of Programming Languages, pp. 173–197. ACM Press.

McKee, J. R. (1984). Maintenance as a function of design. In Proc. AFIPS National Computer Conference, pp. 187–93.

Meyer, B. (1988). Object-oriented Software Construction. Prentice Hall International. Second Edition, 1997.

Meyer, B. (1992). Eiffel: the Language. Prentice Hall International.

Milne, R. and C. Strachey (1976). A Theory of Programming Language Semantics. John Wiley.

Milner, R. and M. Tofte (1991). Commentary on Standard ML. MIT Press.

Milner, R., M. Tofte, and R. Harper (1990). The Definition of Standard ML. MIT Press.

Mutch, E. and S. Gill (1954). Conversion routines. In International Symposium on Automatic Digital Computing. National Physical Laboratory. Reprinted as pp. 283–289 of (Williams and Campbell-Kelly 1989).

Naur, P. (1960, May). Report on the algorithmic language Algol 60. Communications of the ACM 3(5), 299–314.

Naur, P. (1978). The European side of the last phase of the development of Algol. In History of Programming Languages. Reprinted as pages 92–137 of (Wexelblat 1981).

Nygaard, K. and O.-J. Dahl (1978). The development of the SIMULA languages. In History of Programming Languages. Reprinted as pages 439–478 of (Wexelblat 1981).

Parnas, D. L. (1972, December). On the criteria to be used in decomposing systems into modules. Communications of the ACM 15(12), 1053–1058.

Perlis, A. J. (1978). The American side of the development of Algol. In History of Programming Languages. Reprinted as pages 75–91 of (Wexelblat 1981).

Radin, G. (1978). The early history and characteristics of PL/I. In History of Programming Languages. Reprinted as pages 551–574 of (Wexelblat 1981).

Ritchie, D. M. (1996). The development of the C programming language. In History of Programming Languages–II, pp. 671–686. ACM Press (Addison Wesley).

Robinson, J. A. (1965). A machine-oriented logic based on the resolution principle. Journal of the ACM 12, 23–41.

Sakkinen, M. (1988). On the darker side of C++. In S. Gjessing and K. Nygaard (Eds.), Proceedings of the 1988 European Conference on Object Oriented Programming, pp. 162–176. Springer LNCS 322.

Sakkinen, M. (1992). The darker side of C++ revisited. Structured Programming 13.

Sammet, J. E. (1969). Programming Languages: History and Fundamentals. Prentice Hall.

Sammet, J. E. (1978). The early history of COBOL. In History of Programming Languages. Reprinted as pages 199–241 of (Wexelblat 1981).

Schwartz, J. T., R. B. K. Dewar, E. Dubinsky, and E. Schonberg (1986). Programming with sets — an introduction to SETL. Springer Verlag.

Steele, G. L., Jr. (1990). Common LISP: the Language (second ed.). Digital Press.

Steele, G. L., Jr. and R. P. Gabriel (1996). The evolution of LISP. In History of Programming Languages–II, pp. 233–308. ACM Press (Addison Wesley).

Steele, G. L., Jr. and G. J. Sussman (1975). Scheme: an interpreter for the extended lambda calculus. Technical Report Memo 349, MIT Artificial Intelligence Laboratory.

Strachey, C. (1966). Towards a formal semantics. In T. B. Steel (Ed.), Formal Language Description Languages for Computer Programming. North-Holland.

Stroustrup, B. (1994). The Design and Evolution of C++. Addison Wesley. Reprinted with corrections, February 1994.

Taivalsaari, A. (1993). A Critical View of Inheritance and Reusability in Object-oriented Programming. Ph.D. thesis, University of Jyväskylä.

Teale, S. (1993). C++ IOStreams Handbook. Addison Wesley.

Turing, A. M. (1936). On computable numbers with an application to the Entscheidungsproblem. Proc. London Math. Soc. 2(42), 230–265.

Turner, D. A. (1976). The SASL Language Manual. Technical report, University of St. Andrews.

Turner, D. A. (1979). A new implementation technique for applicative languages. Software: Practice & Experience 9, 31–49.

Turner, D. A. (1981). KRC language manual. Technical report, University of Kent.

Turner, D. A. (1985, September). Miranda: a non-strict functional language with polymorphic types. In Proceedings of the Second International Conference on Functional Programming Languages and Computer Architecture. Springer LNCS 201.

van Wijngaarden, A., et al. (1975). Revised report on the algorithmic language Algol 68. Acta Informatica 5, 1–236. (In three parts.)

Wexelblat, R. L. (Ed.) (1981). History of Programming Languages. Academic Press. From the ACM SIGPLAN History of Programming Languages Conference, 1–3 June 1978.

Wheeler, D. J. (1951). Planning the use of a paper library. In Report of a Conference on High Speed Automatic Calculating-Engines. University Mathematical Laboratory, Cambridge. Reprinted as pp. 42–44 of (Williams and Campbell-Kelly 1989).

Whitaker, W. (1996). Ada — the project. In History of Programming Languages–II, pp. 173–228. ACM Press (Addison Wesley).


Williams, M. R. and M. Campbell-Kelly (Eds.) (1989). The Early British Computer Conferences, Volume 14 of The Charles Babbage Institute Reprint Series for the History of Computing. The MIT Press and Tomash Publishers. Wirth, N. (1971, April). Program development by stepwise refinement. Communications of the ACM 14 (4), 221–227. Wirth, N. (1976). Algorithms + Data Structures = Programs. Prentice Hall. Wirth, N. (1982). Programming in Modula-2. Springer-Verlag. Wirth, N. (1996). Recollections about the development of Pascal. In History of Programming Languages, pp. 97–110. ACM Press (Addison Wesley).

A Abbreviations and Glossary

Actual parameter. A value passed to a function.

ADT. Abstract data type. A data type described in terms of the operations that can be performed on the data.

AR. Activation record. The run-time data structure corresponding to an Algol 60 block. Section 3.3.

Argument. A value passed to a function; synonym of actual parameter.

Bind. Associate a property with a name.

Block. In Algol 60, a syntactic unit containing declarations of variables and functions, followed by code. Successors of Algol 60 define blocks in similar ways. In Smalltalk, a block is a closure consisting of code, an environment with bindings for local variables, and possibly parameters.

BNF. Backus-Naur Form (originally Backus Normal Form). A notation for describing a context-free grammar, introduced by John Backus for describing Algol 60. Section B.4.2.

Call by need. The argument of a function is not evaluated until its value is needed, and then only once. See lazy evaluation.

CBN. Call by name. A parameter passing method in which actual parameters are evaluated when they are needed by the called function. Section 3.3.

CBV. Call by value. A parameter passing method in which the actual parameter is evaluated before the function is called.

Closure. A data structure consisting of one (or more, but usually one) function and its lexical environment. Section 4.2.

CM. Computational Model. Section 10.2.

Compile. Translate source code (program text) into code for a physical (or possibly abstract) machine.

CS. Computer Science.

Delegation. An object that accepts a request and then forwards the request to another object is said to "delegate" the request.

Dummy parameter. Formal parameter (FORTRAN usage).

Dynamic. Used to describe something that happens during execution, as opposed to static.

Dynamic Binding. A binding that occurs during execution.

Dynamic Scoping. The value of a non-local variable is the most recent value that has been assigned to it during the execution of the program. Cf. Lexical Scoping.
Section 4.1. Dynamic typing. Performing type checking “on the fly”, while the program is running. Eager evaluation. The arguments of a function are evaluated at the time that it is called, whether or not their values are needed. 156
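Several of these entries (closure, CBV, CBN, eager evaluation) can be illustrated concretely. Here is a minimal sketch in Python (my own illustration, not from the text; Python itself evaluates arguments eagerly, so call by name is simulated by passing a thunk, i.e. a zero-argument function):

```python
def make_counter():
    # 'count' lives in the closure: code plus its lexical environment.
    count = 0
    def next_value():
        nonlocal count
        count += 1
        return count
    return next_value

def call_by_value(f, arg_expr):
    return f(arg_expr())          # evaluate the argument exactly once, up front

def call_by_name(f, arg_expr):
    return f(arg_expr)            # pass the unevaluated thunk; f decides when

counter = make_counter()
print(counter())                  # 1
print(counter())                  # 2

# Under call by name, an argument that is never used is never evaluated:
evaluations = []
def noisy():
    evaluations.append("evaluated")
    return 42

result = call_by_name(lambda t: "ignored", noisy)
print(result, evaluations)        # ignored []
```

The closure returned by `make_counter` is exactly the glossary's "function plus its lexical environment"; each call updates the captured `count`.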


Early binding. Binding a property to a name at an early stage, usually during compilation.
Exception handling. A PL feature that permits a computation to be abandoned; control is transferred to a handler in a possibly non-local scope.
Extent. The time during execution during which a variable is active. It is a dynamic property, as opposed to scope.
Formal parameter. A variable name that appears in a function heading; the variable will receive a value when the function is called. "Parameter" without qualification usually means "formal parameter".
FP(L). Functional Programming (Language). Section 4.
Garbage collection. Recycling of unused memory when performed automatically by the implementation rather than coded by the programmer. (The term "garbage collection" is widely used but inaccurate. The "garbage" is thrown away: it is the useful data that is collected!)
GC. Garbage collection.
Global scope. A name that is accessible throughout the (static) program text has global scope.
Heap. A region for storing variables in which no particular order of allocation or deallocation is defined.
High order function. A function that either accepts a function as an argument, or returns a function as a value, or both.
Higher order function. Same as high order function.
Inheritance. In OOPLs, when an object (or class) obtains some of its features from another object (or class), it is said to inherit those features.
Interpret. Execute a program one statement at a time, without translating it into machine code.
Lambda calculus. A notation for describing functions, using the Greek letter λ, introduced by Church.
Late binding. A binding (that is, attachment of a property to a name) that is performed at a later time than would normally be expected. For example, procedural languages bind functions to names during compilation, but OOPLs "late bind" function names during execution.
Lazy evaluation. If the implementation uses lazy evaluation, or call by need, a function does not evaluate an argument until the value of the argument is actually required for the computation. When the argument has been evaluated, its value is stored and it is not evaluated again during the current invocation of the function.
Lexical Scoping. The value of a non-local variable is determined in the static context as determined by the text of the program. Cf. Dynamic Scoping. Section 4.1.
Local scope. A name that is accessible in only a part of the program text has local scope.
OOP(L). Object Oriented Programming (Language). Section 6.
Orthogonal. Two language features are orthogonal if they can be combined without losing any of their important properties. Section 3.6.
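The lazy evaluation entry describes call by need: the argument is evaluated at most once, and the value is then cached. A small sketch of a memoizing thunk in Python (my own illustration, not from the text):

```python
class Thunk:
    """A suspended computation, evaluated at most once (call by need)."""
    def __init__(self, expr):
        self.expr = expr          # a zero-argument function
        self.evaluated = False
        self.value = None

    def force(self):
        if not self.evaluated:
            self.value = self.expr()   # evaluate on first demand...
            self.evaluated = True      # ...and cache the result
        return self.value

calls = []
t = Thunk(lambda: calls.append("eval") or 99)
print(t.force())   # 99 -- evaluated now
print(t.force())   # 99 -- cached; the expression is not re-evaluated
print(calls)       # ['eval']
```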

Overloading. Using one name to denote several distinct objects.
Own variable. A local variable in an Algol 60 program that has local scope (it can be accessed only within the procedure in which it is declared) but global extent (it is created when the procedure is first called and persists until the program terminates). Section 3.3.
Parameter. Literally "unknown" or "unmeasurable" (from Greek). Used in PLs for the objects passed to functions.
Partial function. A function that is defined over only a part of its domain. √x is partial over the domain of reals because it has no real value if x < 0.
PL. Programming Language.
Polymorphism. (Literally "many shapes".) Using the same name to denote more than one entity (the entities are usually functions).
RE. Regular Expression. Section B.4.
Recursion. A process or object that is defined partly in terms of itself without circularity.
Reference semantics. Variable names denote objects. Used by all functional and logic languages, and most OOPLs.
Scope. The region of text in which a variable is accessible. It is a static property, as opposed to extent.
Semantic Model. An alternative term for Computational Model (CM). Section 10.2.
Semantics. A system that assigns meaning to programs.
Stack. A region for storing variables in which the most recently allocated unit is the first one to be deallocated ("last-in, first out", LIFO). The name suggests a stack of plates.
Static. Used to describe something that happens during compilation (or possibly during linking but, anyway, before the program is executed), as opposed to dynamic.
Static Binding. A value that is bound before the program is executed. Cf. Dynamic Binding.
Static typing. Type checking performed during compilation.
Strong typing. Detect and report all type errors, as opposed to weak typing.
Template. A syntactic construct from which many distinct objects can be generated. A template is different from a declaration that constructs only a single object. (The word "template" has a special sense in C++, but the usage is more or less compatible with our definition.)
Thunk. A function with no parameters used to implement the call by name mechanism of Algol 60 and a few other languages.
Total function. A function that is defined for all values of its arguments in its domain. e^x is total over the domain of reals.
Value semantics. Variable names denote locations. Most imperative languages use value semantics.
Weak typing. Detect and report some type errors, but allow some possible type errors to remain in the program.
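The value semantics and reference semantics entries can be demonstrated directly. In this sketch (my own, not from the text), Python supplies reference semantics, and an explicit copy simulates value semantics:

```python
import copy

a = [1, 2, 3]
b = a                  # reference semantics: b names the same object as a
b.append(4)
print(a)               # [1, 2, 3, 4] -- the change is visible through a

c = [1, 2, 3]
d = copy.deepcopy(c)   # simulated value semantics: d is an independent copy
d.append(4)
print(c)               # [1, 2, 3] -- c is unaffected
```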

B Theoretical Issues

As stated in Section 1, theory is not a major concern of the course. In this section, we briefly review the contributions that theory has made to the development of PLs and discuss the importance and relevance of these contributions.

B.1 Syntactic and Lexical Issues

There is a large body of knowledge concerning formal languages. There is a hierarchy of languages (the "Chomsky hierarchy") and, associated with each level of the hierarchy, formal machines that can "recognize" the corresponding languages and perform other tasks. The first two levels of the hierarchy are the most familiar to programmers: regular languages and finite state automata, and context-free languages and push-down automata.

Typical PLs have context-free (or almost context-free) grammars, and their tokens, or lexemes, can be described by a regular language. Consequently, programs can be split into lexemes by a finite state automaton and parsed by a push-down automaton. It might appear, then, that the problem of PL design at this level is "solved". Unfortunately, there are exceptions. The grammar of C++, for example, is not context-free. This makes the construction of a C++ parser unnecessarily difficult and accounts for the slow development of C++ programming tools. The preprocessor is another obstacle that affects program development environments in both C and C++.

Exercise 37. Give examples to demonstrate the difficulties that the C++ preprocessor causes in a program development environment. □

B.2 Semantics

There are many kinds of semantics, including: algebraic, axiomatic, denotational, and operational. They are not competitive but rather complementary, and a particular semantic system may be more or less appropriate for a particular task. There are many applications of semantics. For example, a denotational semantics is a mapping from program constructs to abstract mathematical entities that represent the "meaning" of the constructs. An axiomatic semantics, on the other hand, is not very useful to an implementor but may be very useful to a programmer who wishes to prove the correctness of a program.

The basic ideas of denotational semantics are due to Landin (1965) and Strachey (1966). The most complete description was completed by Milne (1976) after Strachey's death. Denotational semantics is a powerful descriptive tool: it provides techniques for mapping almost any PL feature into a high order mathematical function. It developed as a descriptive technique and was extended to handle increasingly arcane features, such as jumps into blocks. Denotational semantics is an effective language for communication between the designer and implementors of a PL but is not usually of great interest to a programmer who uses the PL.

Ashcroft and Wadge published an interesting paper (1982), noting that PL features that were easy to implement were often hard to describe and vice versa. They suggested that denotational semantics should be used prescriptively rather than descriptively. In other words, a PL designer should start with a simple denotational semantics and then figure out how to implement the language — or perhaps leave that task to implementors altogether. Their own language, Lucid, was designed according to these principles.
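To make the denotational idea concrete, here is a toy sketch (my own, not from the text, and far simpler than a real denotational definition): each construct of a tiny expression language denotes a mathematical object, namely a function from environments to numbers.

```python
# Each constructor maps a syntactic form to its denotation:
# a function from environments (dicts of variable values) to numbers.

def num(n):
    return lambda env: n                      # a literal denotes a constant function

def var(name):
    return lambda env: env[name]              # a variable denotes a lookup

def add(e1, e2):
    return lambda env: e1(env) + e2(env)      # meaning of '+' built from submeanings

def mul(e1, e2):
    return lambda env: e1(env) * e2(env)

# The denotation of the expression  x * (y + 2):
meaning = mul(var("x"), add(var("y"), num(2)))
print(meaning({"x": 3, "y": 4}))              # 18
```

The key property on display is compositionality: the meaning of an expression is built only from the meanings of its parts.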

To some extent, the suggestions of Ashcroft and Wadge have been followed (although not necessarily as a result of their paper). A description of a new PL in the academic literature is usually accompanied by a denotational semantics for the main features of the PL. It is not clear, however, that semantic techniques have had a significant impact on mainstream language design.

B.3 Type Theory

There is a large body of knowledge on type theory. We can summarize it concisely as follows. Here is a function declaration:

T f (S x) { B; return y; }

The declaration introduces a function called f that takes an argument, x of type S, performs calculations indicated as B, and returns a result, y of type T. In general, we should be able to prove a theorem of the following form: if x has type S, then the evaluation of f(x) yields a value of type T. The reasoning used in the proof is based on the syntactic form of the function body, B.

If we can do this for all legal programs in a language L, we say that L is statically typed. If L is indeed statically typed, a compiler for L can check the type correctness of all programs, and a program that is type-correct will not fail because of a type error when it is executed. (In general, most type checking systems assume that functions are total: a function with type S → T will return a value of type T, given a value of type S. Some commonly used functions are partial: for example, x/y ≡ divide(x, y) is undefined for y = 0.)

A program that is type-correct may nevertheless fail in various ways. Type checking does not usually detect errors such as division by zero, and a type-correct program may give completely incorrect answers.

In general: most PLs are not statically typed in the sense defined above, although the number of loopholes in the type system may be small. Type theories for modern PLs, with objects, classes, inheritance, overloading, genericity, dynamic binding, and other features, are very complex. Type theory leads to attractive and interesting mathematics and, consequently, tends to be overvalued by the theoretical Computer Science community. The theory of program correctness is more difficult and has attracted less attention, although it is arguably more important.

Exercise 38. Give an example of an expression or statement in Pascal or C that contains a type error that the compiler cannot detect. □

B.4 Regular Languages

Regular languages are based on a simple set of operations: sequence, choice, and repetition. Since these operations occur in many contexts in the study of PLs, it is interesting to look briefly at regular languages and to see how the concepts can be applied.
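The point that a type-correct program may still fail can be illustrated with a partial function. A sketch in Python (my own illustration; Python's annotations are not checked statically, so this only mimics the S → T reasoning):

```python
def divide(x: float, y: float) -> float:
    """Nominal type: float -> float -> float, but the function is partial:
    it is undefined (raises at run time) when y == 0."""
    return x / y

print(divide(10.0, 4.0))     # 2.5 -- "type-correct" and defined

try:
    divide(10.0, 0.0)        # equally "type-correct", yet it fails
except ZeroDivisionError as e:
    print("runtime failure:", e)
```

No type system of the kind described above rules out the second call; this is exactly the gap between type correctness and total correctness.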

In the formal study of regular languages, we begin with an alphabet, Σ, which is a finite set of symbols. For example, we might have Σ = { 0, 1 }. The set of all finite strings, including the empty string, constructed from the symbols of Σ is written Σ*.

We then introduce a variety of regular expressions, which we will refer to as RE here. (The abbreviation RE is also used to stand for "recursively enumerable", but in this section it stands for "regular expression".) Each form of RE is defined by the set of strings in Σ* that it generates. This set is called a "regular language".

1. ∅ is RE and denotes the empty set.
2. ε (the string containing no symbols) is RE and denotes the set { ε }.
3. For each symbol x ∈ Σ, x is RE and denotes the set { x }.
4. If r and s are RE with languages R and S respectively, then (r + s) is RE and denotes the set R ∪ S.
5. If r and s are RE with languages R and S respectively, then (rs) is RE and denotes the set RS (where RS = { rs | r ∈ R ∧ s ∈ S }).
6. If r is RE with language R, then r* is RE and denotes the set R* (where R* = { x1 x2 ... xn | n ≥ 0 ∧ xi ∈ R }).

We add two further notations that serve solely as abbreviations. The expression a+ is an abbreviation for the RE aa* that denotes the set { a, aa, aaa, ... }. The expression a^n represents the string aa...a containing n occurrences of the symbol a.

B.4.1 Tokens

We can use regular languages to describe the tokens (or lexemes) of most PLs. For example:

UC         = A + B + ··· + Z
LC         = a + b + ··· + z
LETTER     = UC + LC
DIGIT      = 0 + 1 + ··· + 9
IDENTIFIER = LETTER ( LETTER + DIGIT )*
INTCONST   = DIGIT+
FLOATCONST = DIGIT+ . DIGIT*

We can use a program such as lex to construct a lexical analyzer (scanner) from a regular expression that defines the tokens of a PL.

Exercise 39. Some compilers allow "nested comments", for example { ... { ... } ... } in Pascal or /* ... /* ... */ ... */ in C. What consequences does this have for the lexical analyzer (scanner)? □
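The token definitions above translate almost directly into the regular-expression notation used by tools such as lex. A scanner sketch in Python (my own code; the regexes are my translation of the UC/LC/DIGIT definitions, not part of the text):

```python
import re

# Longest/first-match order matters: FLOATCONST must precede INTCONST,
# or "3.14" would be scanned as the integer "3".
TOKEN_SPEC = [
    ("FLOATCONST", r"\d+\.\d*"),               # DIGIT+ . DIGIT*
    ("INTCONST",   r"\d+"),                    # DIGIT+
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9]*"),   # LETTER ( LETTER + DIGIT )*
    ("SKIP",       r"\s+"),                    # whitespace between tokens
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def scan(text):
    tokens = []
    for m in MASTER.finditer(text):
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
    return tokens

print(scan("x1 42 3.14"))
# [('IDENTIFIER', 'x1'), ('INTCONST', '42'), ('FLOATCONST', '3.14')]
```

This mirrors what lex generates: one finite state automaton built from the alternation of all token patterns.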

B.4.2 Context Free Grammars

Context free grammars for PLs are usually written in BNF (Backus-Naur Form). A grammar rule, or production, consists of a non-terminal symbol (the symbol being defined), a connector (usually ::= or →), and a sequence of terminal and non-terminal symbols. The syntax for the assignment, for example, might be defined:

ASSIGNMENT → VARIABLE " := " EXPRESSION

Basic BNF provides no mechanism for choice: we must provide one rule for each possibility. BNF also provides no mechanism for repetition; we must use recursion instead. The following example illustrates both of these limitations.

STATEMENT → ASSIGNMENT
STATEMENT → CONDITIONAL
SEQUENCE → EMPTY
SEQUENCE → STATEMENT SEQUENCE

BNF has been extended in a variety of ways. In grammars, the symbol | rather than + is used to denote choice. Extended BNF (EBNF) provides a more concise way of describing grammars than BNF. The extension enables us to express choice and repetition within a single rule:

STATEMENT = ASSIGNMENT | CONDITIONAL | ···
SEQUENCE = ( STATEMENT )*

With a few exceptions, most of the extended forms can be described simply: the sequence of symbols on the right of the connector is replaced by a regular expression. Just as parsers can be constructed from BNF grammars, they can be constructed from EBNF grammars.

Exercise 40. Discuss the difficulty of adding attributes to an EBNF grammar. □

B.4.3 Control Structures and Data Structures

Figure 17 shows a correspondence between REs and the control structures of Algol-like languages. Similarly, Figure 18 shows a correspondence between REs and data structures.

RE     Control Structure
x      statement
rs     statement; statement
r*     while expression do statement
r+s    if condition then statement else statement

Figure 17: REs and Control Structures

RE     Data Structure
x      int n;
rs     struct { int n; float y; }
r^n    int a[n];
r*     int a[]
r+s    union { int n; float y; }

Figure 18: REs and Data Structures

Exercise 41. Why might Figures 17 and 18 be of interest to a PL designer? □

B.4.4 Discussion

These analogies suggest that the mechanisms for constructing REs — concatenation, alternation, and repetition — are somehow fundamental. In particular, the relation between the standard control structures and the standard data structures can be helpful in programming. In Jackson Structured Design (JSD), for example, the data structures appropriate for the application are selected first and then the program is constructed with the corresponding control structures.

The limitations of REs are also interesting. When we move to the next level of the Chomsky hierarchy, Context Free Languages (CFLs), we obtain the benefits of recursion. The corresponding control structure is the procedure and the corresponding data structures are recursive structures such as lists and trees (Wirth 1976, page 163).
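The correspondence in Figure 17 also suggests how an EBNF grammar becomes a parser: repetition ( STATEMENT )* becomes a while loop, and choice becomes a conditional. A recursive-descent sketch in Python (the grammar symbols come from the text; the parser code and the token spellings "assign"/"cond" are my own invention):

```python
def parse_sequence(tokens):
    """SEQUENCE = ( STATEMENT )*  -- repetition becomes a while loop."""
    count = 0
    while tokens and tokens[0] in ("assign", "cond"):
        parse_statement(tokens)
        count += 1
    return count          # number of statements recognized

def parse_statement(tokens):
    """STATEMENT = ASSIGNMENT | CONDITIONAL  -- choice becomes if/elif."""
    head = tokens.pop(0)
    if head == "assign":
        pass              # ASSIGNMENT -> VARIABLE ":=" EXPRESSION would go here
    elif head == "cond":
        pass              # CONDITIONAL would go here
    else:
        raise SyntaxError(f"unexpected token: {head}")

print(parse_sequence(["assign", "cond", "assign"]))   # 3
```

Adding a genuinely recursive rule (say, a block containing a SEQUENCE) would require the procedure-calls-procedure structure that the text associates with the next level of the Chomsky hierarchy.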

C Haskell Implementation

Several modern functional programming languages, including Haskell, are implemented using combinators and graph reduction. The technique described below was introduced into mathematical logic by M. Schoenfinkel in 1924 and was rediscovered as an implementation method by D.A. Turner in 1979.

We start with three observations. First, operators are really just functions: for example, we can write x + y as add x y. Thus we only have to consider prefix functions. Second, functions of more than one variable can be reduced to functions of one variable: we understand add x y as (add x) y, that is, we evaluate (add x), obtaining a function of one variable, which we then apply to y. Third, in a definition such as

succ x = add 1 x

we can distinguish between bound variables, or simply variables, such as x in this example, and constants, such as 1 and add. Although add might be defined elsewhere, it is a constant as far as this definition is concerned.

We would like to write the definition of succ in the form

succ = ???

so that the distinction between the name being defined and the definition is clear. Schoenfinkel showed how to do this using the following abstraction algorithm. If e is an expression possibly containing x, the abstraction of e with respect to x is written [x] e. It is an expression that does not contain x. The rules for abstraction follow; in the second rule, we assume x ≠ y, because otherwise the first rule would apply.

[x] x = I
[x] y = K y
[x] (f g) = S ([x] f) ([x] g)

Applying these rules to the function succ, defined above, gives:

succ = [x] (add 1 x)
     = S ([x] (add 1)) ([x] x)
     = S (S (K add) (K 1)) I

These expressions quickly become quite long. If we abstract n from

fac n = if (n == 0) 1 (n × fac (n − 1))

(where we treat if as a function of three arguments), we obtain

fac = S (S (S (K if) (S (S (K ==) (K 0)) I)) (K 1)) (S (S (K ×) I) (S (K fac) (S (S (K −) I) (K 1))))

In general, the size of the SKI expression increases exponentially with the size of the original expression. Turner introduced new functions B and C and optimizations to overcome the size problem. Using these, the definition of fac becomes:

fac = S (C (B if (== 0)) 1) (S × (B fac (C − 1)))
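The three abstraction rules can be transcribed almost line for line. A sketch in Python (my own code, not the book's; expressions are strings for variables and constants, and nested 2-tuples for application):

```python
def abstract(x, e):
    """Compute [x] e using Schoenfinkel's three rules."""
    if e == x:
        return "I"                                      # [x] x = I
    if isinstance(e, tuple):
        f, g = e
        return (("S", abstract(x, f)), abstract(x, g))  # [x] (f g) = S ([x] f) ([x] g)
    return ("K", e)                                     # [x] y = K y,  x != y

def show(e):
    """Render a tuple expression in the text's notation."""
    if isinstance(e, tuple):
        f, g = e
        rhs = show(g)
        if isinstance(g, tuple):
            rhs = "(" + rhs + ")"
        return show(f) + " " + rhs
    return e

# succ x = add 1 x, i.e. the application ((add 1) x):
succ = abstract("x", (("add", "1"), "x"))
print(show(succ))   # S (S (K add) (K 1)) I
```

The printed result matches the hand derivation of succ above.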

To make use of these expressions, we apply an argument and then make use of the following reduction rules:

I x = x
K x y = x
S f g x = (f x) (g x)
B f g x = f (g x)
C f g x = f x g

Using the simpler example, succ, we have:

succ 4 = S (S (K add) (K 1)) I 4
       = (S (K add) (K 1) 4) (I 4)
       = ((K add 4) (K 1 4)) 4
       = add 1 4
       = 5

For practical implementation, these expressions are represented as graphs: Figure 19 shows the graph corresponding to succ.

[Figure 19: The graph for succ]

To evaluate an expression such as succ 4, we add a third subtree to the top node, giving S three arguments, and transform the graph, always using the rule for whatever function is at the root of the tree.

The basic technique of graph reduction is quite slow. When Turner presented one of his functional programming languages at a conference in 1982, he said that it was about 500 times slower than conventional languages such as C and FORTRAN. Twenty years of research, however, have led to steady improvements, and a modern implementation of a functional language has performance comparable to that of a compiled procedural language.
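The reduction rules can likewise be implemented directly. This Python sketch (my own, not the book's implementation; it covers only S, K, I and a built-in add) normalizes the succ 4 example by repeatedly rewriting the function at the root of the expression:

```python
def apply_args(head, args):
    # rebuild a left-nested application: head applied to each argument in turn
    for a in args:
        head = (head, a)
    return head

def evaluate(e):
    """Normalize an SKI expression; tuples are applications (f, arg)."""
    args = []
    while isinstance(e, tuple):      # unwind the spine: ((f a) b) -> f, [a, b]
        e, arg = e
        args.insert(0, arg)
    f = e
    if f == "I" and len(args) >= 1:                       # I x = x
        return evaluate(apply_args(args[0], args[1:]))
    if f == "K" and len(args) >= 2:                       # K x y = x
        return evaluate(apply_args(args[0], args[2:]))
    if f == "S" and len(args) >= 3:                       # S f g x = (f x) (g x)
        g, h, x = args[0], args[1], args[2]
        return evaluate(apply_args(((g, x), (h, x)), args[3:]))
    if f == "add" and len(args) >= 2:                     # built-in arithmetic
        total = evaluate(args[0]) + evaluate(args[1])
        return evaluate(apply_args(total, args[2:]))
    return apply_args(f, args)                            # normal form reached

# succ = S (S (K add) (K 1)) I, from the abstraction derivation:
succ = (("S", (("S", ("K", "add")), ("K", 1))), "I")
print(evaluate((succ, 4)))   # 5
```

A real implementation would rewrite the graph in place and share subgraphs, which is where the performance gains mentioned above come from; this sketch rebuilds tuples instead.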
