
Communicating Sequential Processes

C. A. R. Hoare

June 21, 2004

© C. A. R. Hoare, 1985–2004

This document is an electronic version of Communicating Sequential

Processes, ﬁrst published in 1985 by Prentice Hall International. It may be

copied, printed, and distributed free of charge. However, such copying,

printing, or distribution may not:

− be carried out for commercial gain; or

− take place within India, Pakistan, Bangladesh, Sri Lanka, or the Maldives;

or

− involve any modiﬁcation to the document itself.

Questions and comments are welcome, and should be sent to the editor of

this version: Jim.Davies@comlab.ox.ac.uk.

Foreword

For a variety of reasons, this is a book eagerly awaited by all who knew it

was in the making; to say that their patience has been rewarded would be an

understatement.

A simple reason was that it is Tony Hoare’s ﬁrst book. Many know him

from the lectures he has untiringly given all over the world; many more know

him as the articulate and careful author of a number of articles (of great vari-

ety!) that became classics almost before the printer’s ink had dried. But a

book is a diﬀerent medium: here the author can express himself without the

usually stringent limitations of time and space; it gives him the opportunity

of revealing himself more intimately and of covering a topic of wider span,

opportunities of which Tony Hoare has made the best use we could hope for.

A more solid reason was derived from the direct contents of the book.

When concurrency confronted the computing community about a quarter of

a century ago, it caused an endless confusion, partly by the technically very

diﬀerent circumstances in which it emerged, partly by the accident of history

that it introduced non-determinism at the same time. The disentanglement of

that confusion required the hard work of a mature and devoted scientist who,

with luck, would clarify the situation. Tony Hoare has devoted a major part

of his scientiﬁc endeavours to that challenge, and we have every reason to be

grateful for that.

The most profound reason, however, was keenly felt by those who had

seen earlier drafts of his manuscript, which shed with surprising clarity new

light on what computing science could—or even should—be. To say or feel

that the computing scientist’s main challenge is not to get confused by the

complexities of his own making is one thing; it is quite a diﬀerent matter to

discover and show how a strict adherence to the tangible and quite explicit

elegance of a few mathematical laws can achieve this lofty goal. It is here

that we, the grateful readers, reap to my taste the greatest beneﬁts from the

scientiﬁc wisdom, the notational intrepidity, and the manipulative agility of

Charles Antony Richard Hoare.

Edsger W. Dijkstra

Preface

This is a book for the aspiring programmer, the programmer who aspires to

greater understanding and skill in the practice of an intellectually demanding

profession. It is designed to appeal ﬁrst to a natural sense of curiosity, which

is aroused by a new approach to a familiar topic. The approach is illustrated

by a host of examples drawn from a wide range of applications, from vending

machines through fairy stories and games to computer operating systems. The

treatment is based on a mathematical theory, which is described by a system-

atic collection of algebraic laws.

The ultimate objective of the book is to convey an insight which will enable

the reader to see both current and future problems in a fresh light, in which

they can be more eﬃciently and more reliably solved; and even better, they can

sometimes be avoided.

The most obvious application of the new ideas is to the speciﬁcation,

design, and implementation of computer systems which continuously act and

interact with their environment. The basic idea is that these systems can be

readily decomposed into subsystems which operate concurrently and interact

with each other as well as with their common environment. The parallel com-

position of subsystems is as simple as the sequential composition of lines or

statements in a conventional programming language.

This insight brings practical beneﬁts. Firstly, it avoids many of the tra-

ditional problems of parallelism in programming—interference, mutual ex-

clusion, interrupts, multithreading, semaphores, etc. Secondly, it includes

as special cases many of the advanced structuring ideas which have been

explored in recent research into programming languages and programming

methodology—the monitor, class, module, package, critical region, envelope,

form, and even the humble subroutine. Finally, it provides a secure mathem-

atical foundation for avoidance of errors such as divergence, deadlock and

non-termination, and for achievement of provable correctness in the design

and implementation of computer systems.

I have tried hard to present the ideas in a logically and psychologically well-

ordered sequence, starting with the simple basic operators, and progressing

towards their more elaborate applications. An assiduous reader may study the

book from cover to cover. But many readers will start with greater interest in


some topics than others; and for their beneﬁt each chapter of the book has

been structured to permit judicious selection.

1. Each new idea is introduced by an informal description and illuminated

by a number of small examples, which will probably be helpful to all

readers.

2. The algebraic laws which describe the essential properties of the various

operations will be of interest to those with a taste for mathematical

elegance. They will also be of beneﬁt for those who wish to optimise their

system designs by means of correctness-preserving transformations.

3. The proposed implementations are unusual in that they use a very simple

purely functional subset of the well-known programming language LISP.

This will aﬀord additional excitement to those who have access to a LISP

implementation on which to exercise and demonstrate their designs.

4. The deﬁnitions of traces and speciﬁcations will be of interest to systems

analysts, who need to specify a client’s requirements before undertaking

an implementation. They will also be of interest to senior programmers,

who need to design a system by splitting it into subsystems with clearly

speciﬁed interfaces.

5. The proof rules will be of interest to implementors who take seriously

their obligation to produce reliable programs to a known speciﬁcation, to

a ﬁxed schedule, and at a ﬁxed cost.

6. Finally, the mathematical theory gives a rigorous deﬁnition of the

concept of a process, and the operators in terms of which processes are

constructed. These deﬁnitions are a basis for the algebraic laws, the

implementations and the proof rules.

A reader may consistently or intermittently omit or postpone any of these

topics which are of lesser interest, or which present greater diﬃculty of un-

derstanding.

The succession of chapters in the book has also been organised to per-

mit judicious browsing, selection, or rearrangement. The earlier sections of

Chapter 1 and Chapter 2 will be a necessary introduction for all readers, but

later sections may be more lightly skimmed or postponed to a second pass.

Chapters 3, 4 and 5 are independent of each other, and may be started in

any combination or in any order, according to the interest and inclination of the

reader. So if at any stage there is any diﬃculty of understanding, it is advisable

to continue reading at the next section or even the next chapter, since there

is a reasonable expectation that the omitted material will not be immediately

required again. When such a requirement arises, there will often be an explicit

backward reference, which can be followed when there is suﬃcient motivation

to do so.

I hope everything in the book will in the end be found interesting and re-

warding; but not everyone will wish to read and master it in the order presen-

ted.


The examples chosen to illustrate the ideas of this book will all seem very

small. This is deliberate. The early examples chosen to illustrate each new

idea must be so simple that the idea cannot be obscured by the complexity or

unfamiliarity of the example. Some of the later examples are more subtle; the

problems themselves are of the kind that could generate much confusion and

complexity; and the simplicity of their solution could be due to the power of the

concepts used and the elegance of the notations in which they are expressed.

Nevertheless, each reader will be familiar, perhaps painfully familiar, with

problems of far greater scope, complexity and importance than the examples

appropriate for an introductory text. Such problems may seem to be intract-

able by any mathematical theory. Please do not give way to irritation or disil-

lusion, but rather accept the challenge of trying to apply these new methods

to existing problems.

Start with some grossly over-simpliﬁed version of some selected aspect

of the problem, and gradually add the complexities which appear to be neces-

sary. It is surprising how often the initial over-simpliﬁed model will convey

additional insight, to assist in the solution of the problem as a whole. Perhaps

the model can serve as a structure on which complex detail can later be safely

superimposed. And the ﬁnal surprise is that perhaps some of the additional

complexity turns out to be unnecessary after all. In such cases, the eﬀort of

mastering a new method receives its most abundant reward.

Notations are a frequent complaint. A student setting out to learn the

Russian language often complains at the initial hurdle of learning the unfamil-

iar letters of the Cyrillic alphabet, especially since many of them have strange

pronunciations. If it is any consolation, this should be the least of your wor-

ries. After learning the script, you must learn the grammar and the vocabulary,

and after that you must master the idiom and style, and after that you must

develop ﬂuency in the use of the language to express your own ideas. All this

requires study and exercise and time, and cannot be hurried.

So it is with mathematics. The symbols may initially appear to be a seri-

ous hurdle; but the real problem is to understand the meaning and properties

of the symbols and how they may and may not be manipulated, and to gain

ﬂuency in using them to express new problems, solutions, and proofs. Finally,

you will cultivate an appreciation of mathematical elegance and style. By that

time, the symbols will be invisible; you will see straight through them to what

they mean. The great advantage of mathematics is that the rules are much

simpler than those of a natural language, and the vocabulary is much smal-

ler. Consequently, when presented with something unfamiliar it is possible

to work out a solution for yourself, by logical deduction and invention rather

than by consulting books or experts.

That is why mathematics, like programming, can be so enjoyable. But it is

not always easy. Even mathematicians ﬁnd it diﬃcult to study new branches

of their subject. The theory of communicating processes is a new branch of

mathematics; programmers who study it start with no disadvantage over math-

ematicians; but they will end with the distinct advantage of putting their know-

ledge to practical use.


The material of this book has been tested by presentation in informal work-

shops as well as on formal academic courses. It was ﬁrst designed for a one-

semester Master’s course in software engineering, though most of it could be

presented in the ﬁnal or even the second year of a Bachelor’s degree in com-

puting science. The main prerequisite is some acquaintance with high-school

algebra, the concepts of set theory, and the notations of the predicate calculus.

These are summarised on the ﬁrst page of the glossary of symbols just after

this preface. The book is also a suitable basis for an intensive one-week course

for experienced programmers. In such a course, the lecturer would concen-

trate on examples and deﬁnitions, leaving the more mathematical material for

later private study. If even less time is available, a course which ends after

Chapter 2 is quite worthwhile; and even in a single hour’s seminar it is possible,

by careful selection, to get as far as the edifying tale of the five dining

philosophers.

It is great fun to present lectures and seminars on communicating sequen-

tial processes, since the examples give scope for exercise of the histrionic skills

of the lecturer. Each example presents a little drama which can be acted with

due emphasis on the feelings of the human participants. An audience usu-

ally ﬁnds something particularly farcical about deadlock. But they should be

constantly warned about the dangers of anthropomorphism. The mathemat-

ical formulae have deliberately abstracted from the motives, preferences, and

emotional responses by which the lecturer “lends an air of verisimilitude to

an otherwise bald and unconvincing tale”. So one must learn to concentrate

attention on the cold dry text of the mathematical formulae, and cultivate an

appreciation for their elegant abstraction. In particular, some of the recurs-

ively deﬁned algorithms have something of the breathtaking beauty of a fugue

composed by J. S. Bach.

Summary

Chapter 1 introduces the basic concept of a process as a mathematical abstrac-

tion of the interactions between a system and its environment. It shows how

the familiar technique of recursion may be used to describe processes that last

a long time, or forever. The concepts are explained ﬁrst by example and then

by pictures; a more complete explanation is given by algebraic laws, and by an

implementation on a computer in a functional programming language.

The second part of the chapter explains how the behaviour of a process

can be recorded as a trace of the sequence of actions in which it engages. Many

useful operations on traces are deﬁned. A process can be speciﬁed in advance

of implementation by describing the properties of its traces. Rules are given

to help in implementation of processes which can be proved to meet their

speciﬁcations.

The second chapter describes how processes can be assembled together

into systems, in which the components interact with each other and with their

external environment. The introduction of concurrency does not by itself in-

troduce any element of nondeterminism. The main example of this chapter is

a treatment of the traditional tale of the ﬁve dining philosophers.

The second part of Chapter 2 shows how processes can be conveniently

adapted to new purposes by changing the names of the events in which they

engage. The chapter concludes with an explanation of the mathematical theory

of deterministic processes, including a simple account of the domain theory

of recursion.

The third chapter gives one of the simplest known solutions to the vexed

problem of nondeterminism. Nondeterminism is shown to be a valuable tech-

nique for achieving abstraction, since it arises naturally from the decision to

ignore or conceal those aspects of the behaviour of a system in which we are

no longer interested. It also preserves certain symmetries in the deﬁnition of

the operators of the mathematical theory.

Proof methods for nondeterministic processes are slightly more complic-

ated than those for deterministic processes, since it is necessary to demon-

strate that every possible nondeterministic choice will result in a behaviour

which meets the given speciﬁcation. Fortunately, there are techniques for


avoiding nondeterminism, and these are used extensively in Chapters 4 and 5.

Consequently the study or mastery of Chapter 3 can be postponed until just

before Chapter 6, in which the introduction of nondeterminism can no longer

be avoided.

In the later sections of Chapter 3, there is given a complete mathematical

deﬁnition of the concept of a nondeterministic process. This deﬁnition will

be of interest to the pure mathematician, who wishes to explore the founda-

tions of the subject, or to verify by proof the validity of the algebraic laws and

other properties of processes. Applied mathematicians (including program-

mers) may choose to regard the laws as self-evident or justiﬁed by their utility;

and they may safely omit the more theoretical sections.

Chapter 4 at last introduces communication: it is a special case of interac-

tion between two processes, one of which outputs a message at the same time

as the other one inputs it. Thus communication is synchronised; if buﬀering is

required on a channel, this is achieved by interposing a buﬀer process between

the two processes.

An important objective in the design of concurrent systems is to achieve

greater speed of computation in the solution of practical problems. This is il-

lustrated by the design of some simple systolic (or iterative) array algorithms.

A simple case is a pipe, deﬁned as a sequence of processes in which each pro-

cess inputs only from its predecessor and outputs only to its successor. Pipes

are useful for the implementation of a single direction of a communications

protocol, structured as a hierarchy of layers. Finally, the important concept

of an abstract data type is modelled as a subordinate process, each instance of

which communicates only with the block in which it is declared.

Chapter 5 shows how the conventional operators of sequential program-

ming can be integrated within the framework of communicating sequential

processes. It may be surprising to experienced programmers that these oper-

ators enjoy the same kind of elegant algebraic properties as the operators of

familiar mathematical theories; and that sequential programs can be proved

to meet their speciﬁcations in much the same way as concurrent programs.

Even the externally triggered interrupt is deﬁned and shown to be useful, and

subject to elegant laws.

Chapter 6 describes how to structure and implement a system in which

a limited number of physical resources such as discs and line printers can be

shared among a greater number of processes, whose resource requirements

vary with time. Each resource is represented as a single process. On each

occasion that a resource is required by a user process, a new virtual resource

is created.

A virtual resource is a process which behaves as if it were subordinate to

the user process; but it also communicates with the real resource whenever re-

quired. Such communications are interleaved with those of other concurrently

active virtual processes. So the real and virtual processes play the same roles

as the monitors and envelopes of PASCAL PLUS. The chapter is illustrated by

the modular development of a series of complete but very simple operating

systems, which are the largest examples given in this book.


Chapter 7 describes a number of alternative approaches to concurrency

and communication, and explains the technical, historical, and personal motives

which led to the theory expounded in the preceding chapters. Here I acknow-

ledge my great debt to other authors, and give recommendations and an intro-

duction to further reading in the ﬁeld.

Acknowledgements

It is a great pleasure to acknowledge the profound and original work of Robin

Milner, expounded in his seminal work on a Calculus for Communicating Sys-

tems. His original insights, his personal friendship and his professional rivalry

have been a constant source of inspiration and encouragement for the work

which culminated in the publication of this book.

For the last twenty years I have been considering the problems of program-

ming for concurrent computations, and the design of a programming language

to ease those problems. During that period I have proﬁted greatly by collab-

oration with many scientists, including Per Brinch Hansen, Stephen Brookes,

Dave Bustard, Zhou Chao Chen, Ole-Johan Dahl, Edsger W. Dijkstra, John Elder,

Jeremy Jacob, Ian Hayes, Jim Kaubisch, John Kennaway, T. Y. Kong, Peter Lauer,

Mike McKeag, Carroll Morgan, Ernst-Rudiger Olderog, Rudi Reinecke, Bill Ros-

coe, Alex Teruel, Alastair Tocher and Jim Welsh.

Finally, my special thanks go to O.-J. Dahl, E. W. Dijkstra, Leslie M. Goldschlager,

Jeff Sanders and others who have carefully studied an earlier draft

of this text, and who have pointed out its errors and obscurities; and in par-

ticular to the participants in the Wollongong Summer School on the Science of

Computer Programming in January 1983, the attendants at my seminar in the

Graduate School of the Chinese Academy of Science, April 1983, and students

of the M.Sc. in Computation at Oxford University in the years 1979 to 1984.

Glossary of Symbols

Logic

Notation      Meaning                                   Example
=             equals                                    x = x
≠             is distinct from                          x ≠ x + 1
□             end of an example or proof
P ∧ Q         P and Q (both true)                       x ≤ x + 1 ∧ x ≠ x + 1
P ∨ Q         P or Q (one or both true)                 x ≤ y ∨ y ≤ x
¬ P           not P (P is not true)                     ¬ 3 ≥ 5
P ⇒ Q         if P then Q                               x < y ⇒ x ≤ y
P ≡ Q         P if and only if Q                        x < y ≡ y > x
∃x • P        there exists an x such that P             ∃x • x > y
∀x • P        for all x, P                              ∀x • x < x + 1
∃x : A • P    there exists an x in set A such that P
∀x : A • P    for all x in set A, P

Sets

Notation            Meaning                                        Example
∈                   is a member of                                 Napoleon ∈ mankind
∉                   is not a member of                             Napoleon ∉ Russians
{}                  the empty set (with no members)                ¬ Napoleon ∈ {}
{a}                 the singleton set of a; a is the only member   x ∈ {a} ≡ x = a
{a, b, c}           the set with members a, b, and c               c ∈ {a, b, c}
{ x | P(x) }        the set of all x such that P(x)                {a} = { x | x = a }
A ∪ B               A union B                                      A ∪ B = { x | x ∈ A ∨ x ∈ B }
A ∩ B               A intersect B                                  A ∩ B = { x | x ∈ A ∧ x ∈ B }
A − B               A minus B                                      A − B = { x | x ∈ A ∧ ¬ x ∈ B }
A ⊆ B               A is contained in B                            A ⊆ B ≡ ∀x : A • x ∈ B
A ⊇ B               A contains B                                   A ⊇ B ≡ B ⊆ A
{ x : A | P(x) }    the set of x in A such that P(x)
ℕ                   the set of natural numbers                     {0, 1, 2, ...}
ℙ A                 the power set of A                             ℙ A = { X | X ⊆ A }
⋃_{n≥0} A_n         union of a family of sets                      ⋃_{n≥0} A_n = { x | ∃n ≥ 0 • x ∈ A_n }
⋂_{n≥0} A_n         intersection of a family of sets               ⋂_{n≥0} A_n = { x | ∀n ≥ 0 • x ∈ A_n }
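
For finite sets, most of the notations in this table have a direct counterpart in Python's built-in set type. The following is only an illustrative sketch: the sample sets A and B, the predicate "x is odd", the finite family of three sets, and the helper name power_set are all assumptions made for the example and are not part of the text.

```python
from itertools import chain, combinations

def power_set(a):
    """Return the power set of a finite set, as a set of frozensets."""
    items = list(a)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

# Hypothetical sample sets, chosen only for illustration.
A = {1, 2, 3}
B = {3, 4}

union = A | B             # A ∪ B
intersection = A & B      # A ∩ B
difference = A - B        # A − B
contained = {1, 2} <= A   # {1, 2} ⊆ A

# { x : A | P(x) } as a set comprehension, taking P(x) to be "x is odd".
odd_members = {x for x in A if x % 2 == 1}

# Union and intersection of a finite family of sets A_0, A_1, A_2.
family = [{0, 1}, {1, 2}, {1, 3}]
family_union = set().union(*family)
family_intersection = set.intersection(*family)
```

The infinite cases in the table (for example ℕ, or a family indexed by all n ≥ 0) have no such direct representation; the sketch covers finite sets only.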

Functions

Notation           Meaning                                          Example
f : A → B          f is a function which maps each member           square : ℕ → ℕ
                   of A to a member of B
f(x)               that member of B to which f maps x (in A)
injection          a function f which maps each member of A         x ≠ y ⇒ f(x) ≠ f(y)
                   to a distinct member of B
f⁻¹                inverse of an injection f                        x = f(y) ≡ y = f⁻¹(x)
{ f(x) | P(x) }    the set formed by applying f to all x
                   such that P(x)
f(C)               the image of C under f                           { y | ∃x • y = f(x) ∧ x ∈ C }
                                                                    square({3, 5}) = {9, 25}
f ∘ g              f composed with g                                f ∘ g(x) = f(g(x))
λx • f(x)          the function which maps each value               (λx • f(x))(3) = f(3)
                   of x to f(x)
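
The function notations can be tried out in the same spirit. The sketch below assumes finite domains, and the helper names image, is_injection, inverse and compose are hypothetical names introduced for this illustration only.

```python
def square(n):
    """The function square : N -> N used in the examples of the table."""
    return n * n

def image(f, c):
    """f(C): the set of values f(x) for x in C."""
    return {f(x) for x in c}

def is_injection(f, domain):
    """True if f maps distinct members of a finite domain to distinct values."""
    return len(image(f, domain)) == len(set(domain))

def inverse(f, domain):
    """Inverse of an injection f on a finite domain, as a dict from f(x) to x."""
    if not is_injection(f, domain):
        raise ValueError("f is not an injection on this domain")
    return {f(x): x for x in domain}

def compose(f, g):
    """f composed with g: apply g first, then f."""
    return lambda x: f(g(x))
```

For instance, image(square, {3, 5}) evaluates to {9, 25}, and square is an injection on the natural numbers but not on a domain that contains both 1 and -1.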

Glossary of Symbols xvii

Traces

Section   Notation     Meaning                                    Example
1.5       ⟨⟩           the empty trace
1.5       ⟨a⟩          the trace containing only a
                       (singleton sequence)
1.5       ⟨a, b, c⟩    the trace with three symbols,
                       a then b, then c
1.6.1     ⌢            (between traces) followed by               ⟨a, b, c⟩ = ⟨a, b⟩ ⌢ ⟨⟩ ⌢ ⟨c⟩
1.6.1     sⁿ           s repeated n times                         ⟨a, b⟩² = ⟨a, b, a, b⟩
1.6.2     s ↾ A        s restricted to A                          ⟨b, c, d, a⟩ ↾ {a, c} = ⟨c, a⟩
1.6.5     s ≤ t        s is a prefix of t                         ⟨a, b⟩ ≤ ⟨a, b, c⟩
4.2.2     s ≤ⁿ t       s is like t with up to n                   ⟨a, b⟩ ≤² ⟨a, b, c, d⟩
                       symbols removed
1.6.5     s in t       s is in t                                  ⟨c, d⟩ in ⟨b, c, d, a, b⟩
1.6.6     #s           the length of s                            #⟨b, c, b, a⟩ = 4
1.6.6     s ↓ b        the count of b in s                        ⟨b, c, b, a⟩ ↓ b = 2
1.9.6     s ↓ c        the communications on channel c            ⟨c.1, a.4, c.3, d.1⟩ ↓ c = ⟨1, 3⟩
                       recorded in s
1.9.2     ⌢/ s         flatten s                                  ⌢/ ⟨⟨a, b⟩, ⟨⟩⟩ = ⟨a, b⟩
1.9.7     s ; t        s successfully followed by t               (s ⌢ ⟨✓⟩) ; t = s ⌢ t
1.6.4     A*           set of sequences with elements in A        A* = { s | s ↾ A = s }
1.6.3     s₀           the head of s                              ⟨a, b, c⟩₀ = a
1.6.3     s′           the tail of s                              ⟨a, b, c⟩′ = ⟨b, c⟩
1.9.4     s[i]         the ith element of s                       ⟨a, b, c⟩[1] = b
1.9.1     f*(s)        f star of s                                square*(⟨1, 5, 3⟩) = ⟨1, 25, 9⟩
1.9.5     s̄            reverse of s                               s = ⟨a, b, c⟩ gives s̄ = ⟨c, b, a⟩
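
Several of the trace operations in this table can be tried out by representing a trace as a Python tuple, so that catenation is +, repetition is *, and length is len. This is only a sketch under that assumption (the book's own implementations use LISP); the function names restrict, is_prefix, is_in, count, flatten, star, head and tail are hypothetical names chosen for the illustration.

```python
def restrict(s, a):
    """s restricted to A: keep only the symbols of s that are in the set A."""
    return tuple(x for x in s if x in a)

def is_prefix(s, t):
    """s <= t: s is an initial subsequence of t."""
    return t[:len(s)] == s

def is_in(s, t):
    """s in t: s occurs as a contiguous subsequence of t."""
    return any(t[i:i + len(s)] == s for i in range(len(t) - len(s) + 1))

def count(s, b):
    """The count of b in s."""
    return s.count(b)

def flatten(ss):
    """Catenate a sequence of traces into a single trace."""
    return tuple(x for s in ss for x in s)

def star(f, s):
    """f star of s: apply f to each symbol of s."""
    return tuple(f(x) for x in s)

def head(s):
    """The head (first symbol) of a non-empty trace."""
    return s[0]

def tail(s):
    """The tail of a non-empty trace: everything after the head."""
    return s[1:]
```

With these definitions, restrict(("b", "c", "d", "a"), {"a", "c"}) yields ("c", "a"), which agrees with the example given for restriction in the table.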

Special Events

Section   Notation   Meaning
1.9.7     ✓          success (successful termination)
2.6.2     l.a        participation in event a by a process named l
4.1       c.v        communication of value v on channel c
4.5       l.c        channel c of a process named l
4.5       l.c.v      communication of a message v on channel l.c
5.4.1     ↯          catastrophe (lightning)
5.4.3     ⓧ          exchange
5.4.4     ⓒ          checkpoint for later recovery
6.2       acquire    acquisition of a resource
6.2       release    release of a resource

Processes

Section   Notation            Meaning
1.1       αP                  the alphabet of process P
4.1       αc                  the set of messages communicable on channel c
1.1.1     a → P               a then P
1.1.3     (a → P | b → Q)     a then P choice b then Q (provided a ≠ b)
1.1.3     (x : A → P(x))      (choice of) x from A then P(x)
1.1.2     µ X : A • F(X)      the process X with alphabet A such that X = F(X)
1.8       P / s               P after (engaging in events of trace) s
2.3       P || Q              P in parallel with Q
2.6.2     l : P               P with name l
2.6.4     L : P               P with names from set L
3.2       P ⊓ Q               P or Q (non-deterministic)
3.3       P □ Q               P choice Q
3.5       P \ C               P without C (hiding)
3.6       P ||| Q             P interleave Q
4.4       P >> Q              P chained to Q
4.5       P // Q              P subordinate to Q
6.4       l :: P // Q         remote subordination
5.1       P ; Q               P (successfully) followed by Q
5.4       P △ Q               P interrupted by Q
5.4.1     P ↯̂ Q               P but on catastrophe Q
5.4.2     P̂                   restartable P
5.4.3     P ⓧ Q               P alternating with Q
5.5       P <| b |> Q         P if b else Q
5.1       *P                  repeat P
5.5       b * P               while b repeat P
5.5       x := e              x becomes (value of) e
4.2       b!e                 on (channel) b output (value of) e
4.2       b?x                 on (channel) b input to x
6.2       l!e?x               call of shared subroutine named l
                              with value parameter e and results to x
1.10.1    P sat S             (process) P satisfies (specification) S
1.10.1    tr                  an arbitrary trace of the specified process
3.7       ref                 an arbitrary refusal of the specified process
5.5.2     x^✓                 the final value of x produced by the specified process
5.5.1     var(P)              set of variables assignable by P
5.5.1     acc(P)              set of variables accessible by P
2.8.2     P ⊑ Q               (deterministic) Q can do at least as much as P
3.9       P ⊑ Q               (nondeterministic) Q is as good as P or better
5.5.1     𝒟 e                 expression e is defined

Algebra

Term            Meaning
reflexive       a relation R such that x R x
antisymmetric   a relation R such that x R y ∧ y R x ⇒ x = y
transitive      a relation R such that x R y ∧ y R z ⇒ x R z
partial order   a relation ≤ that is reflexive, antisymmetric, and transitive
bottom          a least element ⊥ such that ⊥ ≤ x
monotonic       a function f that respects a partial order: x ≤ y ⇒ f(x) ≤ f(y)
strict          a function f that preserves bottom: f(⊥) = ⊥
idempotent      a binary operator f such that x f x = x
symmetric       a binary operator f such that x f y = y f x
associative     a binary operator f such that x f (y f z) = (x f y) f z
distributive    f distributes through g if x f (y g z) = (x f y) g (x f z)
                and (y g z) f x = (y f x) g (z f x)
unit            of f is an element 1 such that x f 1 = 1 f x = x
zero            of f is an element 0 such that x f 0 = 0 f x = 0
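
The properties of binary operators listed above can be checked mechanically on a small sample of values. The sketch below is an assumption-laden illustration, not part of the theory: the checker names and the sample range are hypothetical, and passing on a finite sample only shows that no counterexample was found there, whereas a single failure does refute the law.

```python
from itertools import product
import operator

def is_idempotent(f, xs):
    """x f x = x for every sampled x."""
    return all(f(x, x) == x for x in xs)

def is_symmetric(f, xs):
    """x f y = y f x for every sampled pair."""
    return all(f(x, y) == f(y, x) for x, y in product(xs, repeat=2))

def is_associative(f, xs):
    """x f (y f z) = (x f y) f z for every sampled triple."""
    return all(
        f(x, f(y, z)) == f(f(x, y), z) for x, y, z in product(xs, repeat=3)
    )

def distributes_through(f, g, xs):
    """Both distribution laws of f through g, for every sampled triple."""
    return all(
        f(x, g(y, z)) == g(f(x, y), f(x, z))
        and f(g(y, z), x) == g(f(y, x), f(z, x))
        for x, y, z in product(xs, repeat=3)
    )

def is_unit(f, one, xs):
    """x f 1 = 1 f x = x for every sampled x."""
    return all(f(x, one) == x and f(one, x) == x for x in xs)

def is_zero(f, zero, xs):
    """x f 0 = 0 f x = 0 for every sampled x."""
    return all(f(x, zero) == zero and f(zero, x) == zero for x in xs)

# Hypothetical sample of integers used for the checks.
SAMPLE = range(-3, 4)
```

On this sample, max passes the idempotence, symmetry and associativity checks, multiplication distributes through addition, and subtraction fails both the symmetry and the associativity checks.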

Graphs

Term               Meaning
graph              a relation drawn as a picture
node               a circle in a graph representing an element in the domain
                   or range of a relation
arc                a line or arrow in a graph connecting nodes between which
                   the pictured relation holds
undirected graph   graph of a symmetric relation
directed graph     graph of an asymmetric relation often drawn with arrows
directed cycle     a set of nodes connected in a cycle by arrows
                   all in the same direction
undirected cycle   a set of nodes connected in a cycle by arcs or
                   arrows in either direction

Contents

Foreword
Preface
Summary
Acknowledgements
Glossary of Symbols

1 Processes
    1.1 Introduction
    1.2 Pictures
    1.3 Laws
    1.4 Implementation of processes
    1.5 Traces
    1.6 Operations on traces
    1.7 Implementation of traces
    1.8 Traces of a process
    1.9 More operations on traces
    1.10 Specifications

2 Concurrency
    2.1 Introduction
    2.2 Interaction
    2.3 Concurrency
    2.4 Pictures
    2.5 Example: The Dining Philosophers
    2.6 Change of symbol
    2.7 Specifications
    2.8 Mathematical theory of deterministic processes

3 Nondeterminism
    3.1 Introduction
    3.2 Nondeterministic or
    3.3 General choice
    3.4 Refusals
    3.5 Concealment
    3.6 Interleaving
    3.7 Specifications
    3.8 Divergence
    3.9 Mathematical theory of non-deterministic processes

4 Communication
    4.1 Introduction
    4.2 Input and output
    4.3 Communications
    4.4 Pipes
    4.5 Subordination

5 Sequential Processes
    5.1 Introduction
    5.2 Laws
    5.3 Mathematical treatment
    5.4 Interrupts
    5.5 Assignment

6 Shared Resources
    6.1 Introduction
    6.2 Sharing by interleaving
    6.3 Shared storage
    6.4 Multiple resources
    6.5 Operating systems
    6.6 Scheduling

7 Discussion
    7.1 Introduction
    7.2 Shared storage
    7.3 Communication
    7.4 Mathematical models

Select Bibliography
Index

1 Processes

1.1 Introduction

Forget for a while about computers and computer programming, and think

instead about objects in the world around us, which act and interact with us and

with each other in accordance with some characteristic pattern of behaviour.

Think of clocks and counters and telephones and board games and vending

machines. To describe their patterns of behaviour, ﬁrst decide what kinds of

event or action will be of interest; and choose a diﬀerent name for each kind.

In the case of a simple vending machine, there are two kinds of event:

coin—the insertion of a coin in the slot of a vending machine;

choc—the extraction of a chocolate from the dispenser of the machine.

In the case of a more complex vending machine, there may be a greater variety

of events:

in1p—the insertion of one penny;

in2p—the insertion of a two penny coin;

small—the extraction of a small biscuit or cookie;

large—the extraction of a large biscuit or cookie;

out1p—the extraction of one penny in change.

Note that each event name denotes an event class; there may be many oc-

currences of events in a single class, separated in time. A similar distinction

between a class and an occurrence should be made in the case of the letter

‘h’, of which there are many occurrences spatially separated in the text of this

book.

The set of names of events which are considered relevant for a particular

description of an object is called its alphabet. The alphabet is a permanent

predeﬁned property of an object. It is logically impossible for an object to

engage in an event outside its alphabet; for example, a machine designed to

sell chocolates could not suddenly deliver a toy battleship. But the converse


does not hold. A machine designed to sell chocolates may actually never do

so—perhaps because it has not been ﬁlled, or it is broken, or nobody wants

chocolates. But once it is decided that choc is in the alphabet of the machine,

it remains so, even if that event never actually occurs.
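
The idea of an alphabet as a fixed set of event names can be made concrete with a small sketch. It assumes that events are represented as strings and that a record of behaviour is a tuple of events; the constant names and the helper within_alphabet are hypothetical, introduced here only for illustration.

```python
# Alphabets of the two vending machines described above, as fixed sets of
# event names. The constant names are assumptions made for this sketch.
SIMPLE_MACHINE_ALPHABET = frozenset({"coin", "choc"})
COMPLEX_MACHINE_ALPHABET = frozenset({"in1p", "in2p", "small", "large", "out1p"})

def within_alphabet(record, alphabet):
    """True if every event in the recorded sequence belongs to the alphabet."""
    return all(event in alphabet for event in record)
```

A record such as ("coin", "choc", "coin") lies within the simple machine's alphabet, whereas ("coin", "battleship") does not. The empty record also lies within it, and choc remains a member of the alphabet even though it never occurs in that record.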

The choice of an alphabet usually involves a deliberate simpliﬁcation, a

decision to ignore many other properties and actions which are considered to

be of lesser interest. For example, the colour, weight, and shape of a vending

machine are not described, and certain very necessary events in its life, such as

replenishing the stack of chocolates or emptying the coin box, are deliberately

ignored—perhaps on the grounds that they are not (or should not be) of any

concern to the customers of the machine.

The actual occurrence of each event in the life of an object should be

regarded as an instantaneous or an atomic action without duration. Extended

or time-consuming actions should be represented by a pair of events, the ﬁrst

denoting its start and the second denoting its ﬁnish. The duration of an action

is represented by the interval between the occurrence of its start event and the

occurrence of its finish event; during such an interval, other events may occur.

Two extended actions may overlap in time if the start of each one precedes the

ﬁnish of the other.

Another detail which we have deliberately chosen to ignore is the exact

timing of occurrences of events. The advantage of this is that designs and

reasoning about them are simpliﬁed, and furthermore can be applied to phys-

ical and computing systems of any speed and performance. In cases where

timing of responses is critical, these concerns can be treated independently of

the logical correctness of the design. Independence of timing has always been

a necessary condition to the success of high-level programming languages.

A consequence of ignoring time is that we refuse to answer or even to ask

whether one event occurs simultaneously with another. When simultaneity of a

pair of events is important (e.g. in synchronisation) we represent it as a single-

event occurrence; and when it is not, we allow two potentially simultaneous

event occurrences to be recorded in either order.

In choosing an alphabet, there is no need to make a distinction between

events which are initiated by the object (perhaps choc) and those which are

initiated by some agent outside the object (for example, coin). The avoidance

of the concept of causality leads to considerable simpliﬁcation in the theory

and its application.

Let us now begin to use the word process to stand for the behaviour pattern

of an object, insofar as it can be described in terms of the limited set of events

selected as its alphabet. We shall use the following conventions.

1. Words in lower-case letters denote distinct events, e.g.,

coin, choc, in2p, out1p

and so also do the letters, a, b, c, d, e.

2. Words in upper-case letters denote speciﬁc deﬁned processes, e.g.,

VMS—the simple vending machine

VMC—the complex vending machine

and the letters P, Q, R (occurring in laws) stand for arbitrary processes.

3. The letters x, y, z are variables denoting events.

4. The letters A, B, C stand for sets of events.

5. The letters X, Y are variables denoting processes.

6. The alphabet of process P is denoted αP, e.g.,

αVMS = {coin, choc}

αVMC = {in1p, in2p, small, large, out1p}

The process with alphabet A which never actually engages in any of the

events of A is called STOP_A. This describes the behaviour of a broken object:

although it is equipped with the physical capabilities to engage in the events

of A, it never exercises those capabilities. Objects with diﬀerent alphabets are

distinguished, even if they never do anything. So STOP_{αVMS} might have given out

a chocolate, whereas STOP_{αVMC} could never give out a chocolate, only a

biscuit. A customer for either machine knows these facts, even if he does not

know that both machines are broken.

In the remainder of this introduction, we shall deﬁne some simple nota-

tions to aid in the description of objects which actually succeed in doing some-

thing.

1.1.1 Preﬁx

Let x be an event and let P be a process. Then

(x → P) (pronounced “x then P”)

describes an object which first engages in the event x and then behaves exactly

as described by P. The process (x → P) is defined to have the same alphabet as

P, so this notation must not be used unless x is in that alphabet; more formally,

α(x → P) = αP provided x ∈ αP

Examples

X1 A simple vending machine which consumes one coin before breaking

(coin → STOP_{αVMS})

X2 A simple vending machine that successfully serves two customers before breaking

(coin → (choc → (coin → (choc → STOP_{αVMS}))))

Initially, this machine will accept insertion of a coin in its slot, but will not allow

a chocolate to be extracted. But after the ﬁrst coin is inserted, the coin slot

closes until a chocolate is extracted. This machine will not accept two coins in

a row, nor will it give out two consecutive chocolates.

In future, we shall omit brackets in the case of linear sequences of events,

like those in X2, on the convention that → is right associative.

X3 A counter starts on the bottom left square of a board, and can move only

up or right to an adjacent white square

αCTR = {up, right}

CTR = (right → up → right → right → STOP_{αCTR})

Note that the → operator always takes a process on the right and a single event

on the left. If P and Q are processes, it is syntactically incorrect to write

P →Q

The correct method of describing a process which behaves ﬁrst like P and

then like Q is described in Chapter 5. Similarly, if x and y are events, it is

syntactically incorrect to write

x →y

Such a process could be correctly described

x →(y →STOP)

Thus we carefully distinguish the concept of an event from that of a process

which engages in events—maybe many events or even none.

1.1.2 Recursion

The preﬁx notation can be used to describe the entire behaviour of a process

that eventually stops. But it would be extremely tedious to write out the full

behaviour of a vending machine for its maximum design life; so we need a

method of describing repetitive behaviours by much shorter notations. Prefer-

ably these notations should not require a prior decision on the length of life

of an object; this will permit description of objects which will continue to act

and interact with their environment for as long as they are needed.

Consider the simplest possible everlasting object, a clock which never does

anything but tick (the act of winding it is deliberately ignored)

αCLOCK = {tick}

Consider next an object that behaves exactly like the clock, except that it ﬁrst

emits a single tick

(tick →CLOCK)

The behaviour of this object is indistinguishable from that of the original clock.

This reasoning leads to formulation of the equation

CLOCK = (tick →CLOCK)

This can be regarded as an implicit deﬁnition of the behaviour of the clock,

in the same way that the square root of two might be deﬁned as the positive

solution for x in the equation

x = x^2 + x − 2

The equation for the clock has some obvious consequences, which are

derived by simply substituting equals for equals

CLOCK

= (tick →CLOCK) [original equation]

= (tick →(tick →CLOCK)) [by substitution]

= (tick →(tick →(tick →CLOCK))) [similarly]

The equation can be unfolded as many times as required, and the possibility of

further unfolding will still be preserved. The potentially unbounded behaviour

of the CLOCK has been eﬀectively deﬁned as

tick →tick →tick →· · ·

in the same way as the square root of two can be thought of as the limit of a

series of decimals

1.414 . . .

This method of self-referential recursive deﬁnition of processes will work only

if the right-hand side of the equation starts with at least one event preﬁxed

to all recursive occurrences of the process name. For example, the recursive

equation

X = X

does not succeed in deﬁning anything: everything is a solution to this equation.

A process description which begins with a preﬁx is said to be guarded. If F(X)

is a guarded expression containing the process name X, and A is the alphabet

of X, then we claim that the equation

X = F(X)

has a unique solution with alphabet A. It is sometimes convenient to denote

this solution by the expression

µ X : A • F(X)

Here X is a local name (bound variable), and can be changed at will, since

µ X : A • F(X) = µ Y : A • F(Y)

This equality is justiﬁed by the fact that a solution for X in the equation

X = F(X)

is also a solution for Y in the equation

Y = F(Y)

In future, we will give recursive deﬁnitions of processes either by equations,

or by use of µ, whichever is more convenient. In the case of µ X : A • F(X), we

shall often omit explicit mention of the alphabet A, where this is obvious from

the content or context of the process.

Examples

X1 A perpetual clock

CLOCK = µ X : {tick} • (tick → X)

X2 At last, a simple vending machine which serves as many chocs as required

VMS = (coin →(choc →VMS))

As explained above, this equation is just an alternative for the more formal

deﬁnition

VMS = µ X : {coin, choc} • (coin → (choc → X))

X3 A machine that gives change for 5p repeatedly

αCH5A = {in5p, out2p, out1p}

CH5A = (in5p → out2p → out1p → out2p → CH5A)

X4 A different change-giving machine with the same alphabet

CH5B = (in5p → out1p → out1p → out1p → out2p → CH5B)

The claim that guarded equations have a solution, and that this solution is

unique, may be informally justified by the method of substitution.

Each time that the right-hand side of the equation is substituted for every

occurrence of the process name, the expression deﬁning the behaviour of the

process gets longer, and so describes a longer initial segment of behaviour. Any

ﬁnite amount of behaviour can be determined in this way. Two objects which

behave the same up to every moment in time have the same behaviour, i.e.,

they are the same process. Those who ﬁnd this reasoning incomprehensible or

unconvincing should accept this claim as an axiom, whose value and relevance

will gradually become more apparent. A more formal proof cannot be given

without some mathematical deﬁnition of exactly what a process is. This will

be done in Section 2.8.3. The account of recursion given here relies heavily on

guardedness of recursive equations. A meaning for unguarded recursions will

be discussed in Section 3.8.

1.1.3 Choice

By means of preﬁxing and recursion it is possible to describe objects with a

single possible stream of behaviour. However, many objects allow their beha-

viour to be inﬂuenced by interaction with the environment within which they

are placed. For example, a vending machine may oﬀer a choice of slots for

inserting a 2p or a 1p coin; and it is the customer that decides between these

two events. If x and y are distinct events

(x → P | y → Q)

describes an object which initially engages in either of the events x or y. After

the ﬁrst event has occurred, the subsequent behaviour of the object is de-

scribed by P if the ﬁrst event was x, or by Q if the ﬁrst event was y. Since x

and y are diﬀerent events, the choice between P and Q is determined by the

ﬁrst event that actually occurs. As before, we insist on constancy of alphabets,

i.e.,

α(x → P | y → Q) = αP provided {x, y} ⊆ αP and αP = αQ

The bar | should be pronounced “choice”: “x then P choice y then Q”.

Examples

X1 The possible movements of a counter on the board

are deﬁned by the process

(up → STOP | right → right → up → STOP)

X2 A machine which offers a choice of two combinations of change for 5p

(compare 1.1.2 X3 and X4, which offer no choice).

CH5C = in5p → (out1p → out1p → out1p → out2p → CH5C
             | out2p → out1p → out2p → CH5C)

The choice is exercised by the customer of the machine.

X3 A machine that serves either chocolate or toﬀee on each transaction

VMCT = µ X • coin → (choc → X | toffee → X)

X4 A more complicated vending machine, which offers a choice of coins and

a choice of goods and change

VMC = (in2p → (large → VMC
              | small → out1p → VMC)
      | in1p → (small → VMC
              | in1p → (large → VMC
                      | in1p → STOP)))

Like many complicated machines, this has a design ﬂaw. It is often easier to

change the user manual than correct the design, so we write a notice on the

machine

“WARNING: do not insert three pennies in a row.”

X5 A machine that allows its customer to sample a chocolate, and trusts him

to pay after. The normal sequence of events is also allowed

VMCRED = µ X • (coin → choc → X
              | choc → coin → X)

X6 To prevent loss, an initial payment is extracted for the privilege of using

VMCRED

VMS2 = (coin → VMCRED)

This machine will allow insertion of up to two consecutive coins before ex-

traction of up to two consecutive chocolates; but it will never give out more

chocolates than have been previously paid for.
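The claim about VMS2 can be checked mechanically on short traces. The following is only a sketch: the dictionary encoding and the state names (start, X, paid, owed) are illustrative assumptions and are not part of the notation of the text.

```python
# Each state maps an offered event to the state that follows it.
# State names are hypothetical labels for the points reached in the
# definitions of VMS2 and VMCRED.
VMS2 = {
    "start": {"coin": "X"},                  # VMS2 = coin -> VMCRED
    "X": {"coin": "paid", "choc": "owed"},   # VMCRED offers coin or choc
    "paid": {"choc": "X"},                   # coin -> choc -> X
    "owed": {"coin": "X"},                   # choc -> coin -> X
}

def traces(machine, state, depth):
    """Yield every trace of at most `depth` events starting from `state`."""
    yield ()
    if depth > 0:
        for event, nxt in machine[state].items():
            for rest in traces(machine, nxt, depth - 1):
                yield (event,) + rest

def credit(trace):
    """Coins inserted minus chocolates extracted."""
    return trace.count("coin") - trace.count("choc")

all_traces = list(traces(VMS2, "start", 8))
credits = {credit(t) for t in all_traces}
```

Within the explored depth the credit never becomes negative and never exceeds two, which agrees with the informal description. This is an exhaustive check up to a bound, not a proof.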

X7 A copying process engages in the following events

in.0—input of zero on its input channel

in.1—input of one on its input channel

out.0—output of zero on its output channel

out.1—output of one on its output channel

Its behaviour consists of a repetition of pairs of events. On each cycle, it inputs

a bit and outputs the same bit

COPYBIT = µ X • (in.0 → out.0 → X
               | in.1 → out.1 → X)

Note how this process allows its environment to choose which value should

be input, but no choice is oﬀered in the case of output. That will be the main

diﬀerence between input and output in our treatment of communication in

Chapter 4.

The deﬁnition of choice can readily be extended to more than two altern-

atives, e.g.,

(x → P | y → Q | ... | z → R)

Note that the choice symbol | is not an operator on processes; it would be

syntactically incorrect to write P | Q, for processes P and Q. The reason for

this rule is that we want to avoid giving a meaning to

(x → P | x → Q)

which appears to oﬀer a choice of ﬁrst event, but actually fails to do so.

This problem is solved, at the expense of introducing nondeterminism, in Sec-

tion 3.3. Meanwhile, if x, y, z are distinct events,

(x → P | y → Q | z → R)

should be regarded as a single operator with three arguments P, Q, R. It cannot

be regarded as equal to

(x → P | (y → Q | z → R))

which is syntactically incorrect.

In general, if B is any set of events and P(x) is an expression deﬁning a

process for each diﬀerent x in B, then

(x : B →P(x))

deﬁnes a process which ﬁrst oﬀers a choice of any event y in B, and then

behaves like P(y). It should be pronounced “x from B then P of x”. In this

construction, x is a local variable, so

(x : B →P(x)) = (y : B →P(y))

The set B deﬁnes the initial menu of the process, since it gives the set of actions

between which a choice is to be made at the start.

Examples

X8 A process which at all times can engage in any event of its alphabet A

αRUN_A = A

RUN_A = (x : A → RUN_A)

In the special case that the menu contains only one event e,

(x : {e} → P(x)) = (e → P(e))

since e is the only possible initial event. In the even more special case that the

initial menu is empty, nothing at all can happen, so

(x : {} → P(x)) = (y : {} → Q(y)) = STOP

The binary choice operator | can also be defined using the more general notation

(a → P | b → Q) = (x : B → R(x))

where B = {a, b} and R(x) = if x = a then P else Q

Choice between three or more alternatives can be similarly expressed.

Thus choice, preﬁxing and STOP are deﬁned as just special cases of the gen-

eral choice notation. This will be a great advantage in the formulation of gen-

eral laws governing processes (Section 1.3), and in their implementation (Sec-

tion 1.4).

1.1.4 Mutual recursion

Recursion permits the deﬁnition of a single process as the solution of a single

equation. The technique is easily generalised to the solution of sets of simul-

taneous equations in more than one unknown. For this to work properly, all

the right-hand sides must be guarded, and each unknown process must appear

exactly once on the left-hand side of one of the equations.

Example

X1 A drinks dispenser has two buttons labelled ORANGE and LEMON. The

actions of pressing the two buttons are setorange and setlemon. The actions of

dispensing a drink are orange and lemon. The choice of drink that will next be

dispensed is made by pressing the corresponding button. Before any button

is pressed, no drink will be dispensed. Here are the equations deﬁning the

alphabet and behaviour of the process DD. The deﬁnition uses two auxiliary

deﬁnitions of O and L, which are mutually recursive

αDD = αO = αL = {setorange, setlemon, orange, lemon}

DD = (setorange → O | setlemon → L)

O = (orange → O | setlemon → L | setorange → O)

L = (lemon → L | setorange → O | setlemon → L)

Informally, after the ﬁrst event (which must be a setting event) the dispenser

is in one of two states O or L. In each state, it may serve the appropriate drink

or be set into the other state. Pressing the button for the same state is allowed,

but has no eﬀect.
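The three equations can be read as a transition table. The sketch below assumes a dictionary layout and a helper named run, neither of which comes from the text; it also takes serving lemon to leave the dispenser in state L, as the informal description requires.

```python
# Transition table for the drinks dispenser. Keys are the process names
# DD, O and L; each maps an offered event to the process that follows.
DD = {
    "DD": {"setorange": "O", "setlemon": "L"},
    "O": {"orange": "O", "setlemon": "L", "setorange": "O"},
    "L": {"lemon": "L", "setorange": "O", "setlemon": "L"},
}

def run(machine, state, events):
    """Return the state reached after `events`, or None if one is refused."""
    for event in events:
        if event not in machine[state]:
            return None
        state = machine[state][event]
    return state
```

For instance, run(DD, "DD", ["orange"]) is None, because no drink is dispensed before a button has been pressed.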

By using indexed variables, it is possible to specify inﬁnite sets of equations.

Examples

X2 An object starts on the ground, and may move up. At any time thereafter

it may move up and down, except that when on the ground it cannot move any

further down. But when it is on the ground, it may move around. Let n range

over the natural numbers {0, 1, 2, ...}. For each n, introduce the indexed name

CT_n to describe the behaviour of the object when it is n moves off the ground.

Its initial behaviour is defined as

CT_0 = (up → CT_1 | around → CT_0)

and the remaining inﬁnite set of equations consists of

CT_{n+1} = (up → CT_{n+2} | down → CT_n)

where n ranges over the natural numbers 0, 1, 2, …

An ordinary inductive deﬁnition is one whose validity depends on the fact

that the right-hand side of each equation uses only indices less than that of

the left-hand side. Here, CT_{n+1} is defined in terms of CT_{n+2}, and so this can be

regarded only as an inﬁnite set of mutually recursive deﬁnitions, whose validity

depends on the fact that the right-hand side of each equation is guarded.
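Because the index n is the only thing that distinguishes one equation from another, the infinite set of equations can be sketched as functions of n. The function names ct_menu, ct_after and height_after are illustrative assumptions.

```python
def ct_menu(n):
    """Events offered by CT_n, the object when it is n moves off the ground."""
    if n == 0:
        return {"up", "around"}
    return {"up", "down"}

def ct_after(n, event):
    """Index of the process that CT_n becomes after engaging in `event`."""
    if event not in ct_menu(n):
        raise ValueError(f"CT_{n} cannot engage in {event}")
    return {"up": n + 1, "down": n - 1, "around": n}[event]

def height_after(trace, n=0):
    """Follow a whole sequence of events from CT_n; return the final index."""
    for event in trace:
        n = ct_after(n, event)
    return n
```

Each call inspects only one equation, so no finite table of states is needed even though there is one equation for every natural number.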

1.2 Pictures

It may be helpful sometimes to make a pictorial representation of a process as

a tree structure, consisting of circles connected by arrows. In the traditional

terminology of state machines, the circles represent states of the process, and

the arrows represent transitions between the states. The single circle at the

root of the tree (usually drawn at the top of the page) is the starting state; and

the process moves downward along the arrows. Each arrow is labelled by the

event which occurs on making that transition. The arrows leading from the

same node must all have diﬀerent labels.

Examples (1.1.1 X1, X2; 1.1.3 X3)

[Figure: tree pictures X1, X2, and X3; arrows are labelled coin, choc, and toffee.]

In these three examples, every branch of each tree ends in STOP, represented

as a circle with no arrows leading out of it. To represent processes with

unbounded behaviour it is necessary to introduce another convention, namely

an unlabelled arrow leading from a leaf circle back to some earlier circle in the

tree. The convention is that when a process reaches the node at the tail of the

arrow, it immediately and imperceptibly goes back to the node to which the

arrow points.

[Figure: pictures X4 and X5; arrows are labelled coin, choc, and toffee.]

Clearly, these two pictures illustrate exactly the same process (1.1.3 X3). It is

one of the weaknesses of pictures that proofs of such an equality are diﬃcult

to conduct pictorially.

Another problem with pictures is that they cannot illustrate processes with

a very large or infinite number of states, for example CT_0

[Figure: partial picture of CT_0; arrows are labelled around, up, and down.]

There is never enough room to draw the whole picture. A counter with only

65536 diﬀerent states would take a long time to draw.

1.3 Laws

Even with the very restricted set of notations introduced so far, there are many

diﬀerent ways of describing the same behaviour. For example, it obviously

should not matter in which order a choice between events is presented

(x → P | y → Q) = (y → Q | x → P)

On the other hand, a process that can do something is not the same as one

that cannot do anything

(x →P) ≠ STOP

In order to understand a notation properly and to use it eﬀectively, we must

learn to recognise which expressions describe the same object and which do

not, just as everyone who understands arithmetic knows that (x + y) is the

same number as (y +x). Identity of processes with the same alphabet may be

proved or disproved by appeal to algebraic laws very like those of arithmetic.

The first law (L1 below) deals with the choice operator (1.1.3). It states that

two processes defined by choice are different if they offer different choices on

the first step, or if after the same first step they behave differently. However,

if the initial choices are the same, and for each initial choice the subsequent

behaviours are the same, then obviously the processes are identical.

L1 (x : A →P(x)) = (y : B →Q(y)) ≡ (A = B ∧ ∀x : A • P(x) = Q(x))

Here and elsewhere, we assume without stating it that the alphabets of the

processes on each side of an equation are the same.

The law L1 has a number of consequences

L1A STOP ≠ (d → P)

Proof : LHS = (x : {} → P)      by definition (1.1.3 end)
            ≠ (x : {d} → P)     because {} ≠ {d}
            = RHS               by definition (1.1.3 end)

L1B (c → P) ≠ (d → Q) if c ≠ d

Proof : {c} ≠ {d}

L1C (c → P | d → Q) = (d → Q | c → P)

Proof : Define R(x) = P if x = c
                    = Q if x = d
        LHS = (x : {c, d} → R(x))   by definition
            = (x : {d, c} → R(x))   because {c, d} = {d, c}
            = RHS                   by definition

L1D (c → P) = (c → Q) ≡ P = Q

Proof : {c} = {c}

These laws permit proof of simple theorems.

Examples

X1 (coin →choc →coin →choc →STOP) ≠ (coin →STOP)

Proof : by L1D then L1A.

X2 µ X • (coin → (choc → X | toffee → X))

= µ X • (coin → (toffee → X | choc → X))

Proof : by L1C.

To prove more general theorems about recursively-deﬁned processes, it is

necessary to introduce a law which states that every properly guarded recursive

equation has only one solution.

L2 If F(X) is a guarded expression,

(Y = F(Y)) ≡ (Y = µ X • F(X))

An immediate but important corollary states that µ X • F(X) is indeed a solu-

tion of the relevant equation

L2A µ X • F(X) = F(µ X • F(X))

Example

X3 Let VM1 = (coin →VM2) and VM2 = (choc →VM1)

Required to prove VM1 = VMS.

Proof : VM1 = (coin →VM2) deﬁnition of VM1

= (coin →(choc →VM1)) deﬁnition of VM2

Therefore VM1 is a solution of the same recursive equation as VMS. Since

the equation is guarded, there is only one solution. So VM1 and VMS are just

diﬀerent names for this unique solution.

This theorem may seem so obviously true that its proof in no way adds

to its credibility. The only purpose of the proof is to show by example that

the laws are powerful enough to establish facts of this kind. When proving

obvious facts from less obvious laws, it is important to justify every line of the

proof in full, as a check that the proof is not circular.

The law L2 can be extended to mutual recursion. A set of mutually recurs-

ive equations can be written in the general form using subscripts

X_i = F(i, X) for all i in S

where

S is an indexing set with one member for each equation, and

X is an array of processes with indices ranging over the set S, and

F(i, X) is a guarded expression.

Under these conditions, the law L3 states that there is only one array X whose

elements satisfy all the equations

L3 Under the conditions explained above,

if (∀i : S • (X_i = F(i, X) ∧ Y_i = F(i, Y))) then X = Y

1.4 Implementation of processes

Every process P expressible in the notations introduced so far can be written

in the form

(x : B →F(x))

where F is a function from symbols to processes, and where B may be empty

(in the case of STOP), or may contain only one member (in the case of preﬁx),

or may contain more than one member (in the case of choice). In the case of

a recursively deﬁned process, we have insisted that the recursion should be

guarded, so that it may be written

µ X • (x : B →F(x, X))

and this may be unfolded to the required form using L2A

(x : B →F(x, µ X • (x : B →F(x, X))))

Thus every process may be regarded as a function F with a domain B, deﬁning

the set of events in which the process is initially prepared to engage; and for

each x in B, F(x) deﬁnes the future behaviour of the process if the ﬁrst event

was x.

This insight permits every process to be represented as a function in some

suitable functional programming language such as LISP. Each event in the al-

phabet of a process is represented as an atom, for example "COIN, "TOFFEE.

A process is a function which can be applied to such a symbol as an argument.

If the symbol is not a possible ﬁrst event for the process, the function gives

as its result a special symbol "BLEEP, which is used only for this purpose. For

example, since STOP never engages in any event, this is the only result it can

ever give, and so it is deﬁned

STOP = λx • "BLEEP

But if the actual argument is a possible event for the process, the function gives

back as its result another function, representing the subsequent behaviour of

the process. Thus (coin →STOP) is represented as the function

λx • if x = "COIN then

STOP

else

"BLEEP

This last example takes advantage of the facility of LISP for returning a

function (e.g., STOP) as the result of a function. LISP also allows a function

to be passed as an argument to a function, a facility used in representing a

general preﬁx operation (c →P)

preﬁx(c, P) = λx • if x = c then

P

else

"BLEEP

A function to represent a general binary choice (c →P ¦ d →Q) requires

four parameters

choice2(c, P, d, Q) = λx • if x = c then

P

else if x = d then

Q

else

"BLEEP

Recursively deﬁned processes may be represented with the aid of the

LABEL feature of LISP. For example, the simple vending machine process

(µ X • coin →choc →X) is represented as

LABEL X • preﬁx("COIN, preﬁx("CHOC, X))

The LABEL may also be used to represent mutual recursion. For example,

CT (1.1.4 X2) may be regarded as a function from natural numbers to processes

(which are themselves functions—but let not that be a worry). So CT may be

deﬁned

CT = LABEL X • (λn • if n = 0 then

choice2("AROUND, X(0), "UP, X(1))

else

choice2("UP, X(n +1), "DOWN, X(n −1)))

The process that starts on the ground is CT(0).
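The same representation can be sketched in a language with first-class functions other than LISP. The version below is an adaptation in Python, not the notation of the text: the constant BLEEP stands in for the special symbol, events are plain strings, and the process arguments of prefix and choice2 are passed as zero-argument functions because Python evaluates arguments eagerly. An ordinary recursive def plays the role of LABEL.

```python
BLEEP = "BLEEP"  # stand-in for the special symbol returned on refusal

def STOP(x):
    """STOP never engages in any event."""
    return BLEEP

def prefix(c, p):
    """(c -> P); `p` is a zero-argument function that returns the process P."""
    return lambda x: p() if x == c else BLEEP

def choice2(c, p, d, q):
    """(c -> P | d -> Q), with P and Q delayed in the same way."""
    return lambda x: p() if x == c else q() if x == d else BLEEP

def VMS(x):
    """VMS = coin -> choc -> VMS."""
    return prefix("coin", lambda: prefix("choc", lambda: VMS))(x)

def CT(n):
    """CT(n) is the process for the object n moves off the ground."""
    if n == 0:
        return choice2("around", lambda: CT(0), "up", lambda: CT(1))
    return choice2("up", lambda: CT(n + 1), "down", lambda: CT(n - 1))
```

Applying a process to an event gives either BLEEP or the next process, so VMS("coin")("choc") gives back VMS itself, while VMS("choc") is BLEEP. The delayed arguments are a design choice forced by eager evaluation; in a lazy language they would not be needed.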

If P is a function representing a process, and A is a list containing the

symbols of its alphabet, the LISP function menu(A, P) gives a list of all those

symbols of A which can occur as the ﬁrst event in the life of P

menu(A, P) = if A = NIL then

NIL

else if P(car(A)) = "BLEEP then

menu(cdr(A), P)

else

cons(car(A), menu(cdr(A), P))
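For comparison, here is a sketch of menu in Python, with the alphabet given as a list of strings. The constant BLEEP and the eager two-argument prefix are assumptions made so that the example is self-contained.

```python
BLEEP = "BLEEP"  # stand-in for the special refusal symbol

def STOP(x):
    return BLEEP

def prefix(c, p):
    """(c -> P) for a process p that is already fully defined."""
    return lambda x: p if x == c else BLEEP

def menu(alphabet, process):
    """Symbols of `alphabet` which `process` accepts as its first event."""
    return [x for x in alphabet if process(x) != BLEEP]

A = ["coin", "choc"]
P = prefix("coin", prefix("choc", STOP))
```

With these definitions, menu(A, P) is ["coin"], menu(A, P("coin")) is ["choc"], and menu(A, STOP) is the empty list.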

If x is in menu(A, P), P(x) is not "BLEEP, and is therefore a function deﬁn-

ing the future behaviour of P after engaging in x. Thus if y is in menu(A, P(x)),

then P(x)(y) will give its later behaviour, after both x and y have occurred. This

suggests a useful method of exploring the behaviour of a process. Write a pro-

gram which ﬁrst outputs the value of menu(A, P) on a screen, and then inputs

a symbol from the keyboard. If the symbol is not in the menu, it should be

greeted with an audible bleep and then ignored. Otherwise the symbol is ac-

cepted, and the process is repeated with P replaced by the result of applying P

to the accepted symbol. The process is terminated by typing an "END symbol.

Thus if k is the sequence of symbols input from the keyboard, the following

function gives the sequence of outputs required

interact(A, P, k) =

cons(menu(A, P), if car(k) = "END then

NIL

else if P(car(k)) = "BLEEP then

cons("BLEEP, interact(A, P, cdr(k)))

else

interact(A, P(car(k)), cdr(k)))
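An iterative sketch of the same function in Python is given below. It assumes that the keyboard input is a finite list of strings, that END and BLEEP are ordinary string constants, and that running out of input is treated like END; none of these choices is prescribed by the text.

```python
BLEEP, END = "BLEEP", "END"

def menu(alphabet, process):
    """Symbols of `alphabet` which `process` accepts as its first event."""
    return [x for x in alphabet if process(x) != BLEEP]

def interact(alphabet, process, keys):
    """Outputs produced while the symbols in `keys` are offered to `process`."""
    outputs = [menu(alphabet, process)]
    for k in keys:
        if k == END:
            break
        nxt = process(k)
        if nxt == BLEEP:
            outputs.append(BLEEP)   # refused: bleep, process unchanged
        else:
            process = nxt           # accepted: continue as P(k)
        outputs.append(menu(alphabet, process))
    return outputs

def VMS(x):
    """VMS = coin -> choc -> VMS, written out directly as nested functions."""
    return (lambda y: VMS if y == "choc" else BLEEP) if x == "coin" else BLEEP

session = interact(["coin", "choc"], VMS, ["coin", "coin", "choc", END])
```

The value of session is the list of menus shown to the user, with a BLEEP recorded where the second coin was refused.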

The notations used above for deﬁning LISP functions are very informal,

and they will need to be translated to the speciﬁc conventional S-expression

form of some particular implementation of LISP. For example in LISPkit, the

preﬁx function can be deﬁned

(preﬁx

lambda

(a p)

(lambda (x) (if (eq x a) p (quote BLEEP))))

Fortunately, we shall use only a very small subset of pure functional LISP, so

there should be little diﬃculty in translating and running these processes in a

variety of dialects on a variety of machines.

If there are several versions of LISP available, choose one with proper static

binding of variables. A LISP with lazy evaluation is also more convenient, since

it permits direct encoding of recursive equations, without using LABEL, thus

VMS = preﬁx("COIN, preﬁx("CHOC, VMS))

If input and output are implemented by lazy evaluation, the interact function

may be called with the keyboard as its third parameter; and the menu for the

process P will appear as the ﬁrst output. By selecting and inputting a symbol

from the successive menus, a user can interactively explore the behaviour of

the process P.

In other versions of LISP, the interact function should be rewritten, using

explicit input and output to achieve the same eﬀect. When this has been done,

it is possible to observe the computer executing any process that has been

represented as a LISP function. In this sense, such a LISP function may be

regarded as an implementation of the corresponding process. Furthermore, a

LISP function such as preﬁx which operates on these representations may be

regarded as the implementation of the corresponding operator on processes.

1.5 Traces

A trace of the behaviour of a process is a ﬁnite sequence of symbols recording

the events in which the process has engaged up to some moment in time.

Imagine there is an observer with a notebook who watches the process and

writes down the name of each event as it occurs. We can validly ignore the

possibility that two events occur simultaneously; for if they did, the observer

would still have to record one of them ﬁrst and then the other, and the order

in which he records them would not matter.

A trace will be denoted as a sequence of symbols, separated by commas

and enclosed in angular brackets

⟨x, y⟩ consists of two events, x followed by y.

⟨x⟩ is a sequence containing only the event x.

⟨⟩ is the empty sequence containing no events.

Examples

X1 A trace of the simple vending machine VMS (1.1.2 X2) at the moment it

has completed service of its ﬁrst two customers

⟨coin, choc, coin, choc⟩

X2 A trace of the same machine before the second customer has extracted his

choc

⟨coin, choc, coin⟩

Neither the process nor its observer understands the concept of a completed

transaction. The hunger of the expectant customer, and the readiness of the

machine to satisfy it are not in the alphabet of these processes, and cannot be

observed or recorded.

X3 Before a process has engaged in any events, the notebook of the observer

is empty. This is represented by the empty trace

⟨⟩

Every process has this as its shortest possible trace.

X4 The complex vending machine VMC (1.1.3 X4) has the following seven

traces of length two or less

⟨⟩

⟨in2p⟩ ⟨in1p⟩

⟨in2p, large⟩ ⟨in2p, small⟩ ⟨in1p, in1p⟩ ⟨in1p, small⟩

Only one of the four traces of length two can actually occur for a given machine.

The choice between them will be determined by the wishes of the first customer

to use the machine.
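The count of seven can be reproduced by enumerating traces from a transition table for VMC. This is a sketch under assumptions: the state names (paid2, change, paid1, paid1+1) are hypothetical labels for the intermediate points of the VMC definition, and traces are modelled as tuples.

```python
# Each state maps an offered event to the state that follows it.
VMC = {
    "VMC": {"in2p": "paid2", "in1p": "paid1"},
    "paid2": {"large": "VMC", "small": "change"},
    "change": {"out1p": "VMC"},
    "paid1": {"small": "VMC", "in1p": "paid1+1"},
    "paid1+1": {"large": "VMC", "in1p": "STOP"},
    "STOP": {},   # reached after three pennies in a row
}

def traces(machine, state, depth):
    """Yield every trace of at most `depth` events starting from `state`."""
    yield ()
    if depth > 0:
        for event, nxt in machine[state].items():
            for rest in traces(machine, nxt, depth - 1):
                yield (event,) + rest

short = set(traces(VMC, "VMC", 2))
longer = set(traces(VMC, "VMC", 4))
```

The set short has exactly seven members: the empty trace, two traces of length one, and four of length two.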

X5 A trace of VMC if its ﬁrst customer has ignored the warning is

⟨in1p, in1p, in1p⟩

The trace does not actually record the breakage of the machine. Breakage is

only indicated by the fact that among all the possible traces of the machine

there is no trace which extends this one, i.e., there is no event x such that

⟨in1p, in1p, in1p, x⟩

is a possible trace of VMC. The customer may fret and fume; the observer may

watch eagerly with pencil poised; but not another single event can occur, and

not another symbol will ever be written in the notebook. The ultimate disposal

of customer and machine are not in our chosen alphabet.

1.6 Operations on traces

Traces play a central role in recording, describing, and understanding the be-

haviour of processes. In this section we explore some of the general properties

of traces and of operations on them. We will use the following conventions

s, t, u stand for traces

S, T, U stand for sets of traces

f , g, h stand for functions

1.6.1 Catenation

By far the most important operation on traces is catenation, which constructs

a trace from a pair of operands s and t by simply putting them together in this

order; the result will be denoted

s ⌢ t

For example

⟨coin, choc⟩ ⌢ ⟨coin, toffee⟩ = ⟨coin, choc, coin, toffee⟩

⟨in1p⟩ ⌢ ⟨in1p⟩ = ⟨in1p, in1p⟩

⟨in1p, in1p⟩ ⌢ ⟨⟩ = ⟨in1p, in1p⟩

The most important properties of catenation are that it is associative, and

has ⟨⟩ as its unit

L1 s ⌢ ⟨⟩ = ⟨⟩ ⌢ s = s

L2 s ⌢ (t ⌢ u) = (s ⌢ t) ⌢ u

The following laws are both obvious and useful

L3 s ⌢ t = s ⌢ u ≡ t = u

L4 s ⌢ t = u ⌢ t ≡ s = u

L5 s ⌢ t = ⟨⟩ ≡ s = ⟨⟩ ∧ t = ⟨⟩

Let f stand for a function which maps traces to traces. The function is

said to be strict if it maps the empty trace to the empty trace

f(⟨⟩) = ⟨⟩

It is said to be distributive if it distributes through catenation

f(s ⌢ t) = f(s) ⌢ f(t)

All distributive functions are strict.

If n is a natural number, we define tⁿ as n copies of t catenated with each

other. It is readily defined by induction on n

L6 t⁰ = ⟨⟩

L7 tⁿ⁺¹ = t ⌢ tⁿ

This deﬁnition itself gives two useful laws; here are two more which can be

proved from them

L8 tⁿ⁺¹ = tⁿ ⌢ t

L9 (s ⌢ t)ⁿ⁺¹ = s ⌢ (t ⌢ s)ⁿ ⌢ t
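The laws of catenation can be checked mechanically on small examples. The following Python sketch is an illustration only and is not part of the original text: it assumes that a trace is represented as a tuple of event names, and the names cat and power are chosen here for convenience.

```python
def cat(s, t):
    """Catenation of two traces represented as tuples."""
    return s + t


def power(t, n):
    """n copies of t catenated together, by induction on n (L6 and L7)."""
    if n == 0:
        return ()
    return cat(t, power(t, n - 1))


s = ("coin", "choc")
t = ("coin", "toffee")
u = ("in1p",)

# L1: the empty trace is the unit of catenation
assert cat(s, ()) == cat((), s) == s
# L2: catenation is associative
assert cat(s, cat(t, u)) == cat(cat(s, t), u)
# L8 and L9, checked for n = 2
assert power(t, 3) == cat(power(t, 2), t)
assert power(cat(s, t), 3) == cat(cat(s, power(cat(t, s), 2)), t)
```

Such checks do not prove the laws, which hold for all traces, but they are a convenient guard against transcription errors.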

1.6.2 Restriction

The expression (t ↾ A) denotes the trace t when restricted to symbols in the set

A; it is formed from t simply by omitting all symbols outside A. For example

⟨around, up, down, around⟩ ↾ {up, down} = ⟨up, down⟩

Restriction is distributive and therefore strict

L1 ⟨⟩ ↾ A = ⟨⟩

L2 (s ⌢ t) ↾ A = (s ↾ A) ⌢ (t ↾ A)

Its effect on singleton sequences is obvious

L3 ⟨x⟩ ↾ A = ⟨x⟩ if x ∈ A

L4 ⟨y⟩ ↾ A = ⟨⟩ if y ∉ A

A distributive function is uniquely deﬁned by deﬁning its eﬀect on singleton

sequences, since its eﬀect on all longer sequences can be calculated by distrib-

uting the function to each individual element of the sequence and catenating

the results. For example, if y ≠ x

⟨x, y, x⟩ ↾ {x}

= (⟨x⟩ ⌢ ⟨y⟩ ⌢ ⟨x⟩) ↾ {x}

= (⟨x⟩ ↾ {x}) ⌢ (⟨y⟩ ↾ {x}) ⌢ (⟨x⟩ ↾ {x}) [by L2]

= ⟨x⟩ ⌢ ⟨⟩ ⌢ ⟨x⟩ [by L3 and L4]

= ⟨x, x⟩

The following laws show the relationship between restriction and set opera-

tions. A trace restricted to the empty set of symbols leaves nothing; and a

successive restriction by two sets is the same as a single restriction by the in-

tersection of the two sets. These laws can be proved by induction on the length

of s

L5 s ↾ {} = ⟨⟩

L6 (s ↾ A) ↾ B = s ↾ (A ∩ B)
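Restriction is easily expressed in a modern language. The sketch below is an illustration under stated assumptions, not part of the original text: traces are taken to be Python tuples, sets of symbols are Python sets, and the name restrict is chosen here.

```python
def restrict(t, A):
    """The trace t with all symbols outside the set A omitted."""
    return tuple(x for x in t if x in A)


s = ("around", "up", "down", "around")
t = ("up", "around")
A = {"up", "down"}
B = {"down", "around"}

# The example from the text
assert restrict(s, A) == ("up", "down")
# L2: restriction distributes through catenation
assert restrict(s + t, A) == restrict(s, A) + restrict(t, A)
# L5: restriction to the empty set leaves nothing
assert restrict(s, set()) == ()
# L6: successive restriction is restriction by the intersection
assert restrict(restrict(s, A), B) == restrict(s, A & B)
```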

1.6.3 Head and tail

If s is a nonempty sequence, its first symbol is denoted s₀, and the result of

removing the first symbol is s′. For example

⟨x, y, x⟩₀ = x

⟨x, y, x⟩′ = ⟨y, x⟩

Both of these operations are undeﬁned for the empty sequence.

L1 (⟨x⟩ ⌢ s)₀ = x

L2 (⟨x⟩ ⌢ s)′ = s

L3 s = (⟨s₀⟩ ⌢ s′) if s ≠ ⟨⟩

The following law gives a convenient method of proving whether two traces

are equal

L4 s = t ≡ (s = t = ⟨⟩ ∨ (s₀ = t₀ ∧ s′ = t′))

1.6.4 Star

The set A∗ is the set of all finite traces (including ⟨⟩) which are formed from

symbols in the set A. When such traces are restricted to A, they remain unchanged.

This fact permits a simple definition

A∗ = { s | s ↾ A = s }

The following laws are consequences of this deﬁnition

L1 ⟨⟩ ∈ A∗

L2 ⟨x⟩ ∈ A∗ ≡ x ∈ A

L3 (s ⌢ t) ∈ A∗ ≡ s ∈ A∗ ∧ t ∈ A∗

They are sufficiently powerful to determine whether a trace is a member of A∗

or not. For example, if x ∈ A and y ∉ A

⟨x, y⟩ ∈ A∗

≡ (⟨x⟩ ⌢ ⟨y⟩) ∈ A∗

≡ (⟨x⟩ ∈ A∗) ∧ (⟨y⟩ ∈ A∗) [by L3]

≡ true ∧ false [by L2]

≡ false

Another useful law could serve as a recursive definition of A∗

L4 A∗ = { t | t = ⟨⟩ ∨ (t₀ ∈ A ∧ t′ ∈ A∗) }
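Law L4 can be read directly as a recursive membership test that uses only the head and tail of a trace. The following Python sketch is an illustration, not part of the original text; it assumes traces are tuples, and the names head, tail, in_star and restrict are chosen here.

```python
def head(s):
    """First symbol of a nonempty trace; undefined for the empty trace."""
    if not s:
        raise ValueError("head is undefined for the empty trace")
    return s[0]


def tail(s):
    """The trace with its first symbol removed; undefined for the empty trace."""
    if not s:
        raise ValueError("tail is undefined for the empty trace")
    return s[1:]


def in_star(t, A):
    """Membership of t in the set of finite traces over A, following L4."""
    return t == () or (head(t) in A and in_star(tail(t), A))


def restrict(t, A):
    return tuple(x for x in t if x in A)


A = {"x"}
assert in_star((), A)
assert in_star(("x", "x"), A)
assert not in_star(("x", "y"), A)
# Agreement with the original definition: t is in A* exactly when (t restricted to A) = t
for t in [(), ("x",), ("y",), ("x", "y", "x")]:
    assert in_star(t, A) == (restrict(t, A) == t)
```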

1.6.5 Ordering

If s is a copy of an initial subsequence of t, it is possible to ﬁnd some extension

u of s such that s ⌢ u = t. We therefore define an ordering relation

s ≤ t = (∃u • s ⌢ u = t)

and say that s is a prefix of t. For example,

⟨x, y⟩ ≤ ⟨x, y, x, w⟩

⟨x, y⟩ ≤ ⟨z, y, x⟩ ≡ x = z

The ≤ relation is a partial ordering, and its least element is ⟨⟩, as stated in

laws L1 to L4

L1 ⟨⟩ ≤ s                      least element

L2 s ≤ s                       reflexive

L3 s ≤ t ∧ t ≤ s ⇒ s = t       antisymmetric

L4 s ≤ t ∧ t ≤ u ⇒ s ≤ u       transitive

The following law, together with L1, gives a method for computing whether

s ≤ t or not

L5 (⟨x⟩ ⌢ s) ≤ t ≡ t ≠ ⟨⟩ ∧ x = t₀ ∧ s ≤ t′

The prefixes of a given sequence are totally ordered

L6 s ≤ u ∧ t ≤ u ⇒ s ≤ t ∨ t ≤ s

If s is a subsequence of t (not necessarily initial), we say s is in t; this may

be deﬁned

L7 s in t = (∃u, v • t = u ⌢ s ⌢ v)

This relation is also a partial ordering, in that it satisﬁes laws L1 to L4 above.

It also satisﬁes

L8 (⟨x⟩ ⌢ s) in t ≡ t ≠ ⟨⟩ ∧ ((t₀ = x ∧ s ≤ t′) ∨ (⟨x⟩ ⌢ s in t′))

A function f from traces to traces is said to be monotonic if it respects the

ordering ≤, i.e.,

f (s) ≤ f (t) whenever s ≤ t

All distributive functions are monotonic, for example

L9 s ≤ t ⇒ (s ↾ A) ≤ (t ↾ A)

A dyadic function may be monotonic in either argument, keeping the other ar-

gument constant. For example, catenation is monotonic in its second argument

(but not its ﬁrst)

L10 t ≤ u ⇒ (s ⌢ t) ≤ (s ⌢ u)

A function which is monotonic in all its arguments is said simply to be mono-

tonic.
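Laws L1 and L5 give a recursive test for the prefix ordering, and L8 gives one for the relation in. The Python sketch below is an illustration only, not part of the original text; it assumes traces are tuples, and the names is_prefix and is_in are chosen here.

```python
def is_prefix(s, t):
    """s <= t, computed by L1 and L5."""
    if s == ():
        return True                      # L1: the empty trace is least
    return t != () and s[0] == t[0] and is_prefix(s[1:], t[1:])   # L5


def is_in(s, t):
    """s in t, that is, s is a contiguous subsequence of t (L7), computed by L8."""
    if s == ():
        return True                      # take u empty and v = t in L7
    return t != () and (
        (t[0] == s[0] and is_prefix(s[1:], t[1:])) or is_in(s, t[1:])
    )


t = ("x", "y", "x", "w")
assert is_prefix(("x", "y"), t)
assert not is_prefix(("y", "x"), t)
assert is_in(("y", "x"), t)
assert not is_in(("y", "w"), t)
# L10: catenation is monotonic in its second argument
s = ("a", "b")
assert is_prefix(s + ("x",), s + ("x", "y"))
```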

1.6.6 Length

The length of the trace t is denoted #t. For example

#⟨x, y, x⟩ = 3

The laws which define # are

L1 #⟨⟩ = 0

L2 #⟨x⟩ = 1

L3 #(s ⌢ t) = (#s) + (#t)

The number of occurrences in t of symbols from A is counted by #(t ↾ A).

L4 #(t ↾ (A ∪ B)) = #(t ↾ A) + #(t ↾ B) − #(t ↾ (A ∩ B))

L5 s ≤ t ⇒ #s ≤ #t

L6 #(tⁿ) = n × (#t)

The number of occurrences of a symbol x in a trace s is defined

s ↓ x = #(s ↾ {x})
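The counting laws can be illustrated on a small example. The Python sketch below is not part of the original text; it assumes traces are tuples, with # rendered by the built-in len, and the names restrict and count are chosen here.

```python
def restrict(t, A):
    return tuple(x for x in t if x in A)


def count(s, x):
    """Number of occurrences of the symbol x in the trace s."""
    return len(restrict(s, {x}))


t = ("x", "y", "x")
A = {"x"}
B = {"x", "y"}

assert len(t) == 3
# L4: inclusion and exclusion for counting symbols from two sets
assert len(restrict(t, A | B)) == (
    len(restrict(t, A)) + len(restrict(t, B)) - len(restrict(t, A & B))
)
# L6: the length of n copies of t
assert len(t * 4) == 4 * len(t)
assert count(t, "x") == 2
```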


1.7 Implementation of traces

In order to represent traces in a computer and to implement operations on

them, we need a high-level list-processing language. Fortunately, LISP is very

suitable for this purpose. Traces are represented in the obvious way by lists

of atoms representing their events

⟨⟩ = NIL

⟨coin⟩ = cons("COIN, NIL)

⟨coin, choc⟩ = "(COIN CHOC)

which means cons("COIN, cons("CHOC, NIL))

Operations on traces can be readily implemented as functions on lists.

For example, the head and tail of a nonempty list are given by the primitive

functions car and cdr

t₀ = car(t)

t′ = cdr(t)

⟨x⟩ ⌢ s = cons(x, s)

General catenation is implemented as the familiar append function, which is

deﬁned by recursion

s ⌢ t = append(s, t)

where

append(s, t) =
    if s = NIL then
        t
    else
        cons(car(s), append(cdr(s), t))

The correctness of this deﬁnition follows from the laws

⟨⟩ ⌢ t = t

s ⌢ t = ⟨s₀⟩ ⌢ (s′ ⌢ t) whenever s ≠ ⟨⟩

The termination of the LISP append function is guaranteed by the fact that the

list supplied as the ﬁrst argument of each recursive call is shorter than it was

at the previous level of recursion. Similar arguments establish the correctness

of the implementations of the other operations deﬁned below.


To implement restriction, we represent a ﬁnite set B as a list of its mem-

bers. The test (x ∈ B) is accomplished by a call on the function

ismember(x, B) =
    if B = NIL then
        false
    else if x = car(B) then
        true
    else
        ismember(x, cdr(B))

(s ↾ B) can now be implemented by the function

restrict(s, B) =
    if s = NIL then
        NIL
    else if ismember(car(s), B) then
        cons(car(s), restrict(cdr(s), B))
    else
        restrict(cdr(s), B)

A test of (s ≤ t) is implemented as a function which delivers the answer

true or false; it relies on 1.6.5 L1 and L5

isprefix(s, t) =
    if s = NIL then
        true
    else if t = NIL then
        false
    else
        car(s) = car(t) and isprefix(cdr(s), cdr(t))
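For readers without a LISP system to hand, the functions of this section can be transcribed almost word for word into Python. The sketch below is an illustration, not part of the original text. It assumes that NIL is represented by None and a cons cell by a two-element tuple; the helper trace, which builds a list from its arguments, is an addition made here for convenience.

```python
NIL = None


def cons(x, s):
    return (x, s)


def car(s):
    return s[0]


def cdr(s):
    return s[1]


def append(s, t):
    # Recursion on the first argument, which is shorter at each call
    return t if s is NIL else cons(car(s), append(cdr(s), t))


def ismember(x, B):
    if B is NIL:
        return False
    return x == car(B) or ismember(x, cdr(B))


def restrict(s, B):
    if s is NIL:
        return NIL
    if ismember(car(s), B):
        return cons(car(s), restrict(cdr(s), B))
    return restrict(cdr(s), B)


def isprefix(s, t):
    if s is NIL:
        return True
    if t is NIL:
        return False
    return car(s) == car(t) and isprefix(cdr(s), cdr(t))


def trace(*events):
    """Build a list of cons cells from the given events (helper assumed here)."""
    s = NIL
    for e in reversed(events):
        s = cons(e, s)
    return s
```

For example, append(trace("coin"), trace("choc")) yields the same list as trace("coin", "choc").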

1.8 Traces of a process

In Section 1.5 a trace of a process was introduced as a sequential record of

the behaviour of a process up to some moment in time. Before the process

starts, it is not known which of its possible traces will actually be recorded:

the choice will depend on environmental factors beyond the control of the

process. However the complete set of all possible traces of a process P can be

known in advance, and we deﬁne a function traces(P) to yield that set.


Examples

X1 The only trace of the behaviour of the process STOP is ⟨⟩. The notebook

of the observer of this process remains forever blank

traces(STOP) = {⟨⟩}

X2 There are only two traces of the machine that ingests a coin before breaking

traces(coin → STOP) = {⟨⟩, ⟨coin⟩}

X3 A clock that does nothing but tick

traces(µ X • tick → X) = {⟨⟩, ⟨tick⟩, ⟨tick, tick⟩, . . .}

= {tick}∗

As with most interesting processes, the set of traces is inﬁnite, although of

course each individual trace is ﬁnite.

X4 A simple vending machine

traces(µ X • coin → choc → X) = { s | ∃n • s ≤ ⟨coin, choc⟩ⁿ }

1.8.1 Laws

In this section we show how to calculate the set of traces of any process defined

using the notations introduced so far. As mentioned above, STOP has only one

trace

L1 traces(STOP) = { t | t = ⟨⟩ } = {⟨⟩}

A trace of (c → P) may be empty, because ⟨⟩ is a trace of the behaviour of

every process up to the moment that it engages in its very first action. Every

nonempty trace begins with c, and its tail must be a possible trace of P

L2 traces(c → P) = { t | t = ⟨⟩ ∨ (t₀ = c ∧ t′ ∈ traces(P)) }

= {⟨⟩} ∪ { ⟨c⟩ ⌢ t | t ∈ traces(P) }

A trace of the behaviour of a process which oﬀers a choice between initial

events must be a trace of one of the alternatives

L3 traces(c → P | d → Q) =

{ t | t = ⟨⟩ ∨ (t₀ = c ∧ t′ ∈ traces(P)) ∨ (t₀ = d ∧ t′ ∈ traces(Q)) }

These three laws are summarised in the single general law governing choice

L4 traces(x : B → P(x)) = { t | t = ⟨⟩ ∨ (t₀ ∈ B ∧ t′ ∈ traces(P(t₀))) }

To discover the set of traces of a recursively deﬁned process is a bit more

diﬃcult. A recursively deﬁned process is the solution of an equation

X = F(X)

First, we deﬁne iteration of the function F by induction

F⁰(X) = X

Fⁿ⁺¹(X) = F(Fⁿ(X))

= Fⁿ(F(X))

= F(. . . (F(F(X))) . . .) [the outer F applied n times]

Then, provided that F is guarded, we can deﬁne

L5 traces(µ X : A • F(X)) = ⋃_{n≥0} traces(Fⁿ(STOP_A))

Examples

X1 Recall that RUN_A was defined in 1.1.3 X8 as

µ X : A • F(X)

where F(X) = (x : A → X)

We wish to prove that

traces(RUN_A) = A∗

Proof : A∗ = ⋃_{n≥0} { s | s ∈ A∗ ∧ #s ≤ n }

So by L5 it is sufficient to prove, for all n, that

traces(Fⁿ(STOP_A)) = { s | s ∈ A∗ ∧ #s ≤ n }

This is done by induction on n.

1. traces(STOP_A)

= {⟨⟩}

= { s | s ∈ A∗ ∧ #s ≤ 0 }

2. traces(Fⁿ⁺¹(STOP_A))

= traces(x : A → Fⁿ(STOP_A)) [def. F, Fⁿ⁺¹]

= { t | t = ⟨⟩ ∨ (t₀ ∈ A ∧ t′ ∈ traces(Fⁿ(STOP_A))) } [L4]

= { t | t = ⟨⟩ ∨ (t₀ ∈ A ∧ (t′ ∈ A∗ ∧ #t′ ≤ n)) } [ind. hyp.]

= { t | (t = ⟨⟩ ∨ (t₀ ∈ A ∧ t′ ∈ A∗)) ∧ #t ≤ n + 1 } [property of #]

= { t | t ∈ A∗ ∧ #t ≤ n + 1 } [1.6.4 L4]

X2 We want to prove 1.8 X4, i.e.,

traces(VMS) = ⋃_{n≥0} { s | s ≤ ⟨coin, choc⟩ⁿ }

Proof : The inductive hypothesis is

traces(Fⁿ(STOP)) = { t | t ≤ ⟨coin, choc⟩ⁿ }

where F(X) = coin → choc → X

1. traces(STOP) = {⟨⟩} = { s | s ≤ ⟨coin, choc⟩⁰ } [1.6.1 L6]

2. traces(coin → choc → Fⁿ(STOP))

= {⟨⟩, ⟨coin⟩} ∪ { ⟨coin, choc⟩ ⌢ t | t ∈ traces(Fⁿ(STOP)) } [L2 twice]

= {⟨⟩, ⟨coin⟩} ∪ { ⟨coin, choc⟩ ⌢ t | t ≤ ⟨coin, choc⟩ⁿ } [ind. hyp.]

= { s | s = ⟨⟩ ∨ s = ⟨coin⟩ ∨ ∃t • s = ⟨coin, choc⟩ ⌢ t ∧ t ≤ ⟨coin, choc⟩ⁿ }

= { s | s ≤ ⟨coin, choc⟩ⁿ⁺¹ }

The conclusion follows by L5.
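The successive approximations used in this proof can be computed explicitly, since each of them is a finite set of traces. The following Python sketch is an illustration only, not part of the original text. It assumes traces are tuples and sets of traces are Python sets; the names traces_stop, traces_prefix, approximation and prefixes are chosen here.

```python
def traces_stop():
    """L1: STOP has only the empty trace."""
    return {()}


def traces_prefix(c, traces_p):
    """L2: the traces of (c -> P), given the traces of P."""
    return {()} | {(c,) + t for t in traces_p}


def F(traces_x):
    """F(X) = coin -> choc -> X, acting on sets of traces."""
    return traces_prefix("coin", traces_prefix("choc", traces_x))


def approximation(n):
    """The traces of F applied n times to STOP."""
    ts = traces_stop()
    for _ in range(n):
        ts = F(ts)
    return ts


def prefixes(t):
    """The set of all prefixes of the trace t."""
    return {t[:i] for i in range(len(t) + 1)}


# The inductive hypothesis of the proof, checked for the first few n
for n in range(5):
    assert approximation(n) == prefixes(("coin", "choc") * n)
```

By L5 the traces of VMS are the union of all these approximations; the program can of course compute only finitely many of them.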

As mentioned in Section 1.5, a trace is a sequence of symbols recording

the events in which a process P has engaged up to some moment in time. From

this it follows that ⟨⟩ is a trace of every process up to the moment in which it

engages in its very first event. Furthermore, if (s ⌢ t) is a trace of a process up

to some moment, then s must have been a trace of that process up to some

earlier moment. Finally, every event that occurs must be in the alphabet of the

process. These three facts are formalised in the laws

L6 ⟨⟩ ∈ traces(P)

L7 s ⌢ t ∈ traces(P) ⇒ s ∈ traces(P)

L8 traces(P) ⊆ (αP)∗

There is a close relationship between the traces of a process and the picture

of its behaviour drawn as a tree. For any node on the tree, the trace of the

behaviour of a process up to the time when it reaches that node is just the

sequence of labels encountered on the path leading from the root of the tree

to that node. For example, in the tree for VMC shown in Figure 1.1, the trace

corresponding to the path from the root to the black node is

⟨in2p, small, out1p⟩

[Figure 1.1: the behaviour of VMC drawn as a tree, with branches labelled in2p, in1p, large, small and out1p]

Clearly, all initial subpaths of a path in a tree are also paths in the same

tree; this is stated more formally in L7 above. The empty trace deﬁnes the path

from the root to itself, which justiﬁes the law L6. The traces of a process are

just the set of paths leading from the root to some node in the tree.

Conversely, because the branches leading from each node are all labelled

with diﬀerent events, each trace of a process uniquely speciﬁes a path leading

from the root of a tree to a particular node. Thus any set of traces satisfying

L6 and L7 constitutes a convenient mathematical representation for a tree with

no duplicate labels on branches emerging from a single node.

1.8.2 Implementation

Suppose a process has been implemented as a LISP function P, and let s be

a trace. Then it is possible to test whether s is a possible trace of P by the

function

istrace(s, P) =
    if s = NIL then
        true
    else if P(car(s)) = "BLEEP then
        false
    else
        istrace(cdr(s), P(car(s)))

Since s is ﬁnite, the recursion involved here will terminate, having explored

only a ﬁnite initial segment of the behaviour of the process P. It is because

we avoid inﬁnite exploration that we can safely deﬁne a process as an inﬁnite

object, i.e., a function whose result is a function whose result is a function

whose result…
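The same test can be written in Python. The sketch below is an illustration under stated assumptions, not part of the original text: a process is assumed to be a Python function which, given an event, returns either the constant BLEEP or the function representing the subsequent behaviour, and a trace is a tuple. The definitions of stop, prefix and vms are written here to make the example self-contained.

```python
BLEEP = "BLEEP"


def stop(event):
    """STOP refuses every event."""
    return BLEEP


def prefix(c, p):
    """The process (c -> p)."""
    return lambda event: p if event == c else BLEEP


def vms(event):
    """VMS = (coin -> (choc -> VMS)); the recursion is unfolded only on demand."""
    return prefix("choc", vms) if event == "coin" else BLEEP


def istrace(s, p):
    """Test whether the trace s is a possible trace of the process p."""
    if not s:
        return True
    nxt = p(s[0])
    if nxt is BLEEP:
        return False
    return istrace(s[1:], nxt)


assert istrace(("coin", "choc", "coin"), vms)
assert not istrace(("coin", "coin"), vms)
```

Because vms is unfolded only when an event is offered to it, the test explores no more of the process than the length of the trace requires.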


1.8.3 After

If s ∈ traces(P) then

P / s (P after s)

is a process which behaves the same as P behaves from the time after P has

engaged in all the actions recorded in the trace s. If s is not a trace of P, (P / s)

is not deﬁned.

Examples

X1 (VMS / ⟨coin⟩) = (choc → VMS)

X2 (VMS / ⟨coin, choc⟩) = VMS

X3 (VMC / ⟨in1p⟩³) = STOP

X4 To avoid loss arising from installation of VMCRED (1.1.3 X5, X6), the owner

decides to eat the first chocolate himself

(VMCRED / ⟨choc⟩) = VMS2

In a tree picture of P (Figure 1.1), (P / s) denotes the whole subtree whose

root lies at the end of the path labelled by the symbols of s. Thus the subtree

below the black node in Figure 1.1 is denoted by

VMC / ⟨in2p, small, out1p⟩

The following laws describe the meaning of the operator /. After doing

nothing, a process remains unchanged

L1 P / ⟨⟩ = P

After engaging in s ⌢ t, the behaviour of P is the same as that of (P / s) after

engaging in t

L2 P / (s ⌢ t) = (P / s) / t

After engaging in a single event c, the behaviour of a process is as deﬁned by

this initial choice

L3 (x : B → P(x)) / ⟨c⟩ = P(c) provided that c ∈ B

A corollary shows that / ⟨c⟩ is the inverse of the prefixing operator c →

L3A (c → P) / ⟨c⟩ = P

The traces of (P / s) are defined

L4 traces(P / s) = { t | s ⌢ t ∈ traces(P) } provided that s ∈ traces(P)
**

In order to prove that a process P never stops it is suﬃcient to prove that

∀s : traces(P) • P / s ≠ STOP

Another desirable property of a process is cyclicity; a process P is deﬁned as

cyclic if in all circumstances it is possible for it to return to its initial state, i.e.,

∀s : traces(P) • ∃t • (P / (s ⌢ t) = P)

STOP is trivially cyclic; but if any other process is cyclic, then it also has the

desirable property of never stopping.
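Law L4 suggests a direct calculation of the traces of (P / s) from a set of traces of P. The Python sketch below is an illustration only, not part of the original text. It works on a finite approximation to traces(VMS), namely all its traces of length at most six, so it can only suggest, and not establish, properties such as cyclicity. The names prefixes, after and vms6 are chosen here.

```python
def prefixes(t):
    """The set of all prefixes of the trace t."""
    return {t[:i] for i in range(len(t) + 1)}


def after(traces_p, s):
    """L4: the traces t such that s followed by t is a trace of P.

    Undefined (here: an error) when s is not itself a trace of P.
    """
    if s not in traces_p:
        raise ValueError("P / s is not defined: s is not a trace of P")
    n = len(s)
    return {t[n:] for t in traces_p if t[:n] == s}


# All traces of VMS of length at most six
vms6 = prefixes(("coin", "choc") * 3)

# L1: after doing nothing, the traces are unchanged
assert after(vms6, ()) == vms6
# L2: after (s followed by t) is the same as (after s) after t
assert after(vms6, ("coin", "choc")) == after(after(vms6, ("coin",)), ("choc",))
# After one full cycle, what remains is again a set of prefixes of coin, choc cycles
assert after(vms6, ("coin", "choc")) == prefixes(("coin", "choc") * 2)
```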

Examples

X5 The following processes are cyclic (1.1.3 X8, 1.1.2 X2, 1.1.3 X3, 1.1.4 X2)

RUN_A, VMS, (choc → VMS), VMCT, CT₇

X6 The following are not cyclic, because it is not possible to return them to

their initial state (1.1.2 X2, 1.1.3 X3, 1.1.4 X2)

(coin → VMS), (choc → VMCT), (around → CT₇)

For example, in the initial state of choc →VMCT only a chocolate is obtainable,

but subsequently whenever choc is obtainable a choice of toﬀee is also possible;

consequently none of these subsequent states is equal to the initial state.

Warning: The use of / in a recursively deﬁned process has the unfortunate con-

sequence of invalidating its guards, thereby introducing the danger of multiple

solutions to the recursive equations. For example

X = (a → (X / ⟨a⟩))

is not guarded, and has as its solution any process of the form

a →P

for any P.

Proof : (a → ((a → P) / ⟨a⟩)) = (a → P) by L3A.

For this reason, we will never use the / operator in recursive process definitions.

1.9 More operations on traces

This section describes some further operations on traces; it may be skipped

at this stage, since backwards references will be given in later chapters where

the operations are used.

1.9.1 Change of symbol

Let f be a function mapping symbols from a set A to symbols in a set B. From

f we can derive a new function f∗ which maps a sequence of symbols in A∗ to

a sequence in B∗ by applying f to each element of the sequence. For example,

if double is a function which doubles its integer argument

double∗(⟨1, 5, 3, 1⟩) = ⟨2, 10, 6, 2⟩

A starred function is obviously distributive and therefore strict

L1 f∗(⟨⟩) = ⟨⟩

L2 f∗(⟨x⟩) = ⟨f(x)⟩

L3 f∗(s ⌢ t) = f∗(s) ⌢ f∗(t)

Other laws are obvious consequences

L4 f∗(s)₀ = f(s₀) if s ≠ ⟨⟩

L5 #f∗(s) = #s

But here is an “obvious” law which is unfortunately not generally true

f∗(s ↾ A) = f∗(s) ↾ f(A)

where f(A) = { f(x) | x ∈ A }.

The simplest counterexample is given by the function f such that

f (b) = f (c) = c where b ≠ c

Therefore

f∗(⟨b⟩ ↾ {c})

= f∗(⟨⟩) [since b ≠ c]

= ⟨⟩ [L1]

≠ ⟨c⟩

= ⟨c⟩ ↾ {c}

= f∗(⟨b⟩) ↾ f({c}) [since f(b) = f(c) = c]

However, the law is true if f is a one-one function (injection)

L6 f∗(s ↾ A) = f∗(s) ↾ f(A) provided that f is an injection.
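Both the starred function and the counterexample can be reproduced in a few lines. The Python sketch below is an illustration, not part of the original text; it assumes traces are tuples, and the names star, restrict and image are chosen here.

```python
def star(f):
    """The function on traces derived from f by applying it to each element."""
    return lambda s: tuple(f(x) for x in s)


def restrict(s, A):
    return tuple(x for x in s if x in A)


def image(f, A):
    """The set of all f(x) for x in A."""
    return {f(x) for x in A}


def double(n):
    return 2 * n


def f(x):
    # f(b) = f(c) = c, so f is not an injection
    return "c"


assert star(double)((1, 5, 3, 1)) == (2, 10, 6, 2)

# The counterexample: the two sides of the "obvious" law differ
lhs = star(f)(restrict(("b",), {"c"}))
rhs = restrict(star(f)(("b",)), image(f, {"c"}))
assert lhs == ()
assert rhs == ("c",)

# For the injection double, the law L6 holds on this example
s, A = (1, 5, 3, 1), {1, 3}
assert star(double)(restrict(s, A)) == restrict(star(double)(s), image(double, A))
```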


1.9.2 Catenation

Let s be a sequence, each of whose elements is itself a sequence. Then

⌢/ s

is obtained by catenating all the elements together in the original order. For

example

⌢/ ⟨⟨1, 3⟩, ⟨⟩, ⟨7⟩⟩ = ⟨1, 3⟩ ⌢ ⟨⟩ ⌢ ⟨7⟩

= ⟨1, 3, 7⟩

This operator is distributive

L1 ⌢/ ⟨⟩ = ⟨⟩

L2 ⌢/ ⟨s⟩ = s

L3 ⌢/ (s ⌢ t) = (⌢/ s) ⌢ (⌢/ t)
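This operation is what many languages call flattening a sequence of sequences. A minimal Python sketch, given for illustration and not part of the original text, assuming traces are tuples and using the name flatten chosen here:

```python
def flatten(ss):
    """Catenate all the elements of ss, each itself a trace, in their original order."""
    result = ()
    for s in ss:
        result = result + s
    return result


assert flatten(((1, 3), (), (7,))) == (1, 3, 7)
# L3: the operator distributes through catenation
s = ((1, 3), ())
t = ((7,),)
assert flatten(s + t) == flatten(s) + flatten(t)
```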

1.9.3 Interleaving

A sequence s is an interleaving of two sequences t and u if it can be split into

a series of subsequences, with alternate subsequences extracted from t and u.

For example

s = ⟨1, 6, 3, 1, 5, 4, 2, 7⟩

is an interleaving of t and u, where

t = ⟨1, 6, 5, 2, 7⟩ and u = ⟨3, 1, 4⟩

A recursive deﬁnition of interleaving can be given by means of the follow-

ing laws

L1 ⟨⟩ interleaves (t, u) ≡ (t = ⟨⟩ ∧ u = ⟨⟩)

L2 s interleaves (t, u) ≡ s interleaves (u, t)

L3 (⟨x⟩ ⌢ s) interleaves (t, u) ≡

(t ≠ ⟨⟩ ∧ t₀ = x ∧ s interleaves (t′, u)) ∨

(u ≠ ⟨⟩ ∧ u₀ = x ∧ s interleaves (t, u′))
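Laws L1 and L3 translate directly into a recursive test. The Python sketch below is an illustration only, not part of the original text; it assumes traces are tuples, and the name interleaves is used as a three-argument function. Note that the test may explore both alternatives of L3, so its running time can grow rapidly when t and u share many symbols.

```python
def interleaves(s, t, u):
    """Test whether s is an interleaving of t and u, following L1 and L3."""
    if s == ():
        return t == () and u == ()        # L1
    x, rest = s[0], s[1:]
    return (
        (t != () and t[0] == x and interleaves(rest, t[1:], u))
        or (u != () and u[0] == x and interleaves(rest, t, u[1:]))
    )                                      # L3


s = (1, 6, 3, 1, 5, 4, 2, 7)
t = (1, 6, 5, 2, 7)
u = (3, 1, 4)
assert interleaves(s, t, u)
# L2: the relation is symmetric in t and u
assert interleaves(s, u, t)
assert not interleaves((1, 2), (1,), ())
```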

1.9.4 Subscription

If 0 ≤ i < #s, we use the conventional notation s[i] to denote the i-th element

of the sequence s as described by L1

L1 s[0] = s₀ ∧ s[i + 1] = s′[i] provided s ≠ ⟨⟩

L2 (f∗(s))[i] = f(s[i]) for i < #s

1.9.5 Reversal

If s is a sequence, \overline{s} is formed by taking its elements in reverse order. For

example

\overline{⟨3, 5, 37⟩} = ⟨37, 5, 3⟩

Reversal is defined fully by the following laws

L1 \overline{⟨⟩} = ⟨⟩

L2 \overline{⟨x⟩} = ⟨x⟩

L3 \overline{s ⌢ t} = \overline{t} ⌢ \overline{s}

Reversal enjoys a number of simple algebraic properties, including

L4 \overline{\overline{s}} = s

Exploration of other properties is left to the reader. One of the useful facts

about reversal is that \overline{s}₀ is the last element of the sequence, and in general

L5 \overline{s}[i] = s[#s − i − 1] for i < #s
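The defining laws of reversal give a simple recursive program. The Python sketch below is an illustration, not part of the original text; it assumes traces are tuples indexed from zero, and the name reverse is chosen here.

```python
def reverse(s):
    """Reversal, obtained from L1 to L3 by splitting s into its head and tail."""
    if s == ():
        return ()                          # L1
    return reverse(s[1:]) + (s[0],)        # L3 with L2 applied to the head


s = (3, 5, 37)
t = (1, 2)
assert reverse(s) == (37, 5, 3)
# L3: the reversal of a catenation
assert reverse(s + t) == reverse(t) + reverse(s)
# L4: reversing twice gives back the original
assert reverse(reverse(s)) == s
# L5: indexing into the reversed sequence
for i in range(len(s)):
    assert reverse(s)[i] == s[len(s) - i - 1]
```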

1.9.6 Selection

If s is a sequence of pairs, we deﬁne s ↓ x as the result of selecting from s all

those pairs whose ﬁrst element is x and then replacing each pair by its second

element. We write a pair with a dot between its two components. Thus if

s = ⟨a.7, b.9, a.8, c.0⟩

then s ↓ a = ⟨7, 8⟩ and s ↓ d = ⟨⟩

L1 ⟨⟩ ↓ x = ⟨⟩

L2 (⟨y.z⟩ ⌢ t) ↓ x = t ↓ x if y ≠ x

L3 (⟨x.z⟩ ⌢ t) ↓ x = ⟨z⟩ ⌢ (t ↓ x)

If s is not a sequence of pairs, s ↓ a denotes the number of occurrences of a in

s (as deﬁned in Section 1.6.6).
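Selection from a sequence of pairs may be sketched as follows. The Python code is an illustration only, not part of the original text; it assumes that a pair written with a dot is represented as a two-element tuple, and the name select is chosen here.

```python
def select(s, x):
    """The second components of those pairs in s whose first component is x."""
    return tuple(z for (y, z) in s if y == x)


s = (("a", 7), ("b", 9), ("a", 8), ("c", 0))
assert select(s, "a") == (7, 8)
assert select(s, "d") == ()
# L3: a leading pair with first component x contributes its second component
assert select((("a", 1),) + s, "a") == (1,) + select(s, "a")
```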

1.9.7 Composition

Let ✓ be a symbol denoting successful termination of the process which en-

gages in it. As a result, this symbol can appear only at the end of a trace. Let t

be a trace recording a sequence of events which start when s has successfully

terminated. The composition of s and t is denoted (s ; t). If ✓ does not occur

in s, then t cannot start

L1 s ; t = s if ¬(⟨✓⟩ in s)

If ✓ does occur at the end of s, it is removed and t is appended to the result

L2 (s ⌢ ⟨✓⟩) ; t = s ⌢ t if ¬(⟨✓⟩ in s)

The symbol ✓ may be regarded as a sort of glue which sticks s and t together;

in the absence of the glue, t cannot stick (L1). If ✓ occurs (incorrectly) in the

middle of a trace, we stipulate for the sake of completeness that all symbols

after the ﬁrst occurrence are irrelevant and should be discarded

L2A (s ⌢ ⟨✓⟩ ⌢ u) ; t = s ⌢ t if ¬(⟨✓⟩ in s)

This unfamiliar operator enjoys a number of familiar algebraic properties.

Like catenation it is associative. Unlike catenation, it is monotonic in its ﬁrst

as well as its second argument. Also, it is strict in its ﬁrst argument, and has

⟨✓⟩ as its left unit

L3 s ; (t ; u) = (s ; t) ; u

L4A s ≤ t ⇒((u ; s) ≤ (u ; t))

L4B s ≤ t ⇒((s ; u) ≤ (t ; u))

L5 ⟨⟩ ; t = ⟨⟩

L6 ⟨✓⟩ ; t = t

If ✓ never occurs except at the end of a trace, ⟨✓⟩ is a right unit as well

L7 s ; ⟨✓⟩ = s provided ¬(⟨✓⟩ in (\overline{s})′)
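Laws L1, L2 and L2A together determine the composition of any two traces, and so can be turned into a program. The Python sketch below is an illustration, not part of the original text; it assumes traces are tuples, represents the termination symbol by the string "✓" bound to the name TICK, and uses the name compose, all chosen here.

```python
TICK = "✓"


def compose(s, t):
    """The composition (s ; t) of two traces."""
    if TICK not in s:
        return s                  # L1: without the glue, t cannot start
    i = s.index(TICK)             # position of the first occurrence
    return s[:i] + t              # L2 and L2A: discard it and all that follows


s = ("a", TICK)
t = ("b", TICK)
u = ("c",)

assert compose(("a",), t) == ("a",)                              # L1
assert compose(s, t) == ("a", "b", TICK)                         # L2
assert compose(("a", TICK, "z"), u) == ("a", "c")                # L2A
assert compose(s, compose(t, u)) == compose(compose(s, t), u)    # L3
assert compose((), t) == ()                                      # L5
assert compose((TICK,), t) == t                                  # L6
assert compose(s, (TICK,)) == s                                  # L7
```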

1.10 Speciﬁcations

A speciﬁcation of a product is a description of the way it is intended to behave.

This description is a predicate containing free variables, each of which stands

for some observable aspect of the behaviour of the product. For example, the

speciﬁcation of an electronic ampliﬁer, with an input range of one volt and

with an approximate gain of 10, is given by the predicate

AMP10 = (0 ≤ v ≤ 1 ⇒ | v′ − 10 × v | ≤ 1)

In this specification, it is understood that v stands for the input voltage and

v′ stands for the output voltage. Such an understanding of the meaning of

variables is essential to the use of mathematics in science and engineering.

In the case of a process, the most obviously relevant observation of its

behaviour is the trace of events that occur up to a given moment in time. We

will use the special variable tr to stand for an arbitrary trace of the process

being specified, just as v and v′ are used for arbitrary observations of voltage

in the previous example.

Examples

X1 The owner of a vending machine does not wish to make a loss by installing

it. He therefore speciﬁes that the number of chocolates dispensed must never

exceed the number of coins inserted

NOLOSS = (#(tr ↾ {choc}) ≤ #(tr ↾ {coin}))

In future we will use the abbreviation (introduced in 1.6.6)

tr ↓ c = #(tr ↾ {c})

to stand for the number of occurrences of the symbol c in tr.

X2 The customer of a vending machine wants to ensure that it will not absorb

further coins until it has dispensed the chocolate already paid for

FAIR1 = ((tr ↓ coin) ≤ (tr ↓ choc) +1)

X3 The manufacturer of a simple vending machine must meet the requirements

both of its owner and its customer

VMSPEC = NOLOSS ∧ FAIR1

= (0 ≤ ((tr ↓ coin) − (tr ↓ choc)) ≤ 1)

X4 The specification of a correction to the complex vending machine forbids

it to accept three pennies in a row

VMCFIX = (¬ ⟨in1p⟩³ in tr)

X5 The specification of a mended machine

MENDVMC = (tr ∈ traces(VMC) ∧ VMCFIX)

X6 The specification of VMS2 (1.1.3 X6)

0 ≤ ((tr ↓ coin) − (tr ↓ choc)) ≤ 2
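Since a specification is a predicate on the variable tr, it can be written as a function from traces to truth values and tried out on particular traces. The Python sketch below is an illustration only, not part of the original text; it assumes traces are tuples, and the lower-case function names are chosen here to mirror NOLOSS, FAIR1, VMSPEC and the specification of VMS2.

```python
def count(tr, c):
    """The number of occurrences of the symbol c in the trace tr."""
    return sum(1 for e in tr if e == c)


def noloss(tr):
    return count(tr, "choc") <= count(tr, "coin")


def fair1(tr):
    return count(tr, "coin") <= count(tr, "choc") + 1


def vmspec(tr):
    return noloss(tr) and fair1(tr)


def vms2spec(tr):
    return 0 <= count(tr, "coin") - count(tr, "choc") <= 2


assert vmspec(("coin", "choc", "coin"))
assert not noloss(("choc",))                 # a chocolate given away
assert not fair1(("coin", "coin"))           # a second coin absorbed too soon
assert vms2spec(("coin", "coin"))            # but acceptable for VMS2
# The two forms of VMSPEC agree on these traces
for tr in [(), ("coin",), ("coin", "coin"), ("choc",), ("coin", "choc")]:
    assert vmspec(tr) == (0 <= count(tr, "coin") - count(tr, "choc") <= 1)
```

Testing a predicate on sample traces can expose an error in a specification, but it cannot show that a process satisfies it; that requires the proof methods of the following sections.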


1.10.1 Satisfaction

If P is a product which meets a speciﬁcation S, we say that P satisﬁes S, abbre-

viated to

P sat S

This means that every possible observation of the behaviour of P is described

by S; or in other words, S is true whenever its variables take values observed

from the product P, or more formally, ∀tr • tr ∈ traces(P) ⇒S. For example,

the following table gives some observations of the properties of an ampliﬁer

observation   1    2    3    4    5

v             0    .5   .5   2    .1

v′            0    5    4    1    3

All observations except the last are described by AMP10. The second and third

columns illustrate the fact that the output of the ampliﬁer is not completely

determined by its input. The fourth column shows that if the input voltage is

outside the speciﬁed range, the output voltage can be anything at all, without

violating the speciﬁcation. (In this simple example we have ignored the pos-

sibility that excessive input may break the product.)
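The claims made about the five observations can be checked by evaluating the predicate AMP10 on each of them. The Python sketch below is an illustration, not part of the original text; the function name amp10 and the parameter name v_out (standing for the output voltage) are chosen here.

```python
def amp10(v, v_out):
    """AMP10: if the input is in range, the output is within 1 of ten times the input."""
    return (not (0 <= v <= 1)) or abs(v_out - 10 * v) <= 1


# The five observations of the table, as pairs (input, output)
observations = [(0, 0), (0.5, 5), (0.5, 4), (2, 1), (0.1, 3)]
results = [amp10(v, v_out) for (v, v_out) in observations]

# All observations except the last are described by AMP10
assert results == [True, True, True, True, False]
```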

The following laws give the most general properties of the satisﬁes relation.

The speciﬁcation true which places no constraints whatever on observations

of a product will be satisﬁed by all products; even a broken product satisﬁes

such a weak and undemanding speciﬁcation

L1 P sat true

If a product satisfies two different specifications, it also satisfies their conjunction

L2A If P sat S

and P sat T

then P sat (S ∧ T)

The law L2A generalises to inﬁnite conjunctions, i.e., to universal quantiﬁca-

tion. Let S(n) be a predicate containing the variable n

L2 If ∀n • (P sat S(n))

then P sat (∀n • S(n))

provided that P does not depend on n.

If a speciﬁcation S logically implies another speciﬁcation T, then every ob-

servation described by S is also described by T. Consequently every product

which satisﬁes S must also satisfy the weaker speciﬁcation T


L3 If P sat S

and S ⇒T

then P sat T

In the light of this law, we will sometimes lay out proofs as a chain; so if S ⇒T,

we write

P sat S

⇒ T

as an abbreviation for the fuller proof

P sat S

S ⇒T

P sat T [by L3]

The laws and their explanations given above apply to all kinds of products

and all kinds of speciﬁcations. In the next section we shall give the additional

laws which apply to processes.

1.10.2 Proofs

In the design of a product, the designer has a responsibility to ensure that it will

satisfy its speciﬁcation; this responsibility may be discharged by the reasoning

methods of the relevant branches of mathematics, for example, geometry or

the diﬀerential and integral calculus. In this section we shall give a collection of

laws which permit the use of mathematical reasoning to ensure that a process

P meets its speciﬁcation S.

We will sometimes write the speciﬁcation as S(tr), suggesting that a spe-

ciﬁcation will normally contain tr as a free variable. However, the real reason

for making tr explicit is to indicate how tr may be substituted by some more

elaborate expression, as for example in S(tr′). It is important to note that both

S and S(tr) can have other free variables besides tr.

Any observation of the process STOP will always be an empty trace, since

this process never does anything

L4A STOP sat (tr = ⟨⟩)

A trace of the process (c →P) is initially empty. Every subsequent trace begins

with c, and its tail is a trace of P. Consequently its tail must be described by

any speciﬁcation of P

L4B If P sat S(tr)

then (c → P) sat (tr = ⟨⟩ ∨ (tr₀ = c ∧ S(tr′)))

A corollary of this law deals with double prefixing

L4C If P sat S(tr)

then (c → d → P) sat (tr ≤ ⟨c, d⟩ ∨ (tr ≥ ⟨c, d⟩ ∧ S(tr′′)))

Binary choice is similar to preﬁxing, except that the trace may begin with either

of the two alternative events, and its tail must be described by the speciﬁcation

of the chosen alternative

L4D If P sat S(tr)

and Q sat T(tr)

then (c → P | d → Q) sat

(tr = ⟨⟩ ∨ (tr₀ = c ∧ S(tr′)) ∨ (tr₀ = d ∧ T(tr′)))

All the laws given above are special cases of the law for general choice

L4 If ∀x : B • (P(x) sat S(tr, x))

then (x : B → P(x)) sat (tr = ⟨⟩ ∨ (tr₀ ∈ B ∧ S(tr′, tr₀)))

The law governing the after operator is surprisingly simple. If tr is a trace

of (P / s), s ⌢ tr is a trace of P, and therefore must be described by any specification

which P satisfies

L5 If P sat S(tr)

and s ∈ traces(P)

then P / s sat S(s ⌢ tr)

Finally, we need a law to establish the correctness of a recursively deﬁned

process

L6 If F(X) is guarded

and STOP sat S

and ((X sat S) ⇒(F(X) sat S))

then (µ X • F(X)) sat S

The antecedents of this law ensure (by induction) that

Fⁿ(STOP) sat S for all n

Since F is guarded, Fⁿ(STOP) fully describes at least the first n steps of the

behaviour of µ X • F(X). So each trace of µ X • F(X) is a trace of Fⁿ(STOP) for

some n. This trace must therefore satisfy the same specification as Fⁿ(STOP),

which (for all n) is S. A more formal proof can be given in terms of the mathematical

theory of Section 2.8.

Example

X1 We want to prove (1.1.2 X2, 1.10 X3) that

VMS sat VMSPEC

Proof :

1. STOP

sat tr = ⟨⟩ [L4A]

⇒ 0 ≤ ((tr ↓ coin) − (tr ↓ choc)) ≤ 1 [since (⟨⟩ ↓ coin) = (⟨⟩ ↓ choc) = 0]

The conclusion follows by an (implicit) appeal to L3.

2. Assume X sat (0 ≤ ((tr ↓ coin) − (tr ↓ choc)) ≤ 1), then

(coin → choc → X)

sat (tr ≤ ⟨coin, choc⟩) ∨

(tr ≥ ⟨coin, choc⟩ ∧ 0 ≤ ((tr′′ ↓ coin) − (tr′′ ↓ choc)) ≤ 1) [L4C]

⇒ 0 ≤ ((tr ↓ coin) − (tr ↓ choc)) ≤ 1

since

⟨⟩ ↓ coin = ⟨⟩ ↓ choc = ⟨coin⟩ ↓ choc = 0

and

⟨coin⟩ ↓ coin = ⟨coin, choc⟩ ↓ coin = ⟨coin, choc⟩ ↓ choc = 1

and

tr ≥ ⟨coin, choc⟩ ⇒

(tr ↓ coin = tr′′ ↓ coin + 1 ∧ tr ↓ choc = tr′′ ↓ choc + 1)

The conclusion follows by appeal to L3 and L6.

The fact that a process P satisfies its specification does not necessarily

mean that it is going to be satisfactory in use. For example, since

tr = ⟨⟩ ⇒ 0 ≤ (tr ↓ coin − tr ↓ choc) ≤ 1

one can prove by L3 and L4A that

STOP sat 0 ≤ (tr ↓ coin − tr ↓ choc) ≤ 1

Yet STOP will not serve as an adequate vending machine, either for its owner

or for the customer. It certainly avoids doing anything wrong; but only by the

lazy expedient of doing nothing at all. For this reason, STOP satisﬁes every

speciﬁcation which is satisﬁable by any process.

Fortunately, it is obvious by independent reasoning that VMS will never

stop. In fact, any process deﬁned solely by preﬁxing, choice, and guarded

recursions will never stop. The only way to write a process that can stop is

to include explicitly the process STOP, or the process (x : B →P(x)) where B

is the empty set. By avoiding such elementary mistakes one can guarantee to

write processes that never stop. However, after introduction of concurrency

in the next chapter, such simple precautions are no longer adequate. A more

general method of specifying and proving that a process will never stop is

described in Section 3.7.

Concurrency 2

2.1 Introduction

A process is deﬁned by describing the whole range of its potential behaviour.

Frequently, there will be a choice between several diﬀerent actions, for ex-

ample, the insertion of a large coin or a small one into a vending machine VMC

(1.1.3 X4). On each such occasion, the choice of which event will actually occur

can be controlled by the environment within which the process evolves. For

example, it is the customer of the vending machine who may select which coin

to insert. Fortunately, the environment of a process itself may be described

as a process, with its behaviour deﬁned by familiar notations. This permits

investigation of the behaviour of a complete system composed from the pro-

cess together with its environment, acting and interacting with each other as

they evolve concurrently. The complete system should also be regarded as a

process, whose range of behaviour is deﬁnable in terms of the behaviour of

its component processes; and the system may in turn be placed within a yet

wider environment. In fact, it is best to forget the distinction between pro-

cesses, environments, and systems; they are all of them just processes whose

behaviour may be prescribed, described, recorded and analysed in a simple

and homogeneous fashion.

2.2 Interaction

When two processes are brought together to evolve concurrently, the usual

intention is that they will interact with each other. These interactions may be

regarded as events that require simultaneous participation of both the pro-

cesses involved. For the time being, let us conﬁne attention to such events,

and ignore all others. Thus we will assume that the alphabets of the two pro-

cesses are the same. Consequently, each event that actually occurs must be a

possible event in the independent behaviour of each process separately. For

example, a chocolate can be extracted from a vending machine only when its

customer wants it and only when the vending machine is prepared to give it.

If P and Q are processes with the same alphabet, we introduce the notation

P || Q

to denote the process which behaves like the system composed of processes

P and Q interacting in lock-step synchronisation as described above.

Examples

X1 A greedy customer of a vending machine is perfectly happy to obtain a

toﬀee or even a chocolate without paying. However, if thwarted in these de-

sires, he is reluctantly prepared to pay a coin, but then he insists on taking a

chocolate

GRCUST = (toffee → GRCUST
| choc → GRCUST
| coin → choc → GRCUST)

When this customer is brought together with the machine VMCT (1.1.3 X3)

his greed is frustrated, since the vending machine does not allow goods to be

extracted before payment. Similarly, VMCT never gives a toﬀee, because the

customer never wants one after he has paid

(GRCUST || VMCT) = µ X • (coin →choc →X)

This example shows how a process which has been deﬁned as a composition

of two subprocesses may also be described as a simple single process, without

using the concurrency operator ||.

X2 A foolish customer wants a large biscuit, so he puts his coin in the vending

machine VMC. He does not notice whether he has inserted a large coin or a

small one; nevertheless, he is determined on a large biscuit

FOOLCUST = (in2p → large → FOOLCUST
| in1p → large → FOOLCUST)

Unfortunately, the vending machine is not prepared to yield a large biscuit for

only a small coin

(FOOLCUST || VMC) = µ X • (in2p → large → X | in1p → STOP)

The STOP that supervenes after the ﬁrst in1p is known as deadlock. Although

each component process is prepared to engage in some further action, these

actions are diﬀerent; since the processes cannot agree on what the next action

shall be, nothing further can happen.

The stories that accompany these examples show a sad betrayal of proper

standards of scientiﬁc abstraction and objectivity. It is important to remember

that events are intended to be neutral transitions which could be observed and

recorded by some dispassionate visitor from another planet, who knows noth-

ing of the pleasures of eating biscuits, or of the hunger suﬀered by the foolish

customer as he vainly tries to obtain sustenance. We have deliberately chosen

the alphabet of relevant events to exclude such internal emotional states; if

and when desired, further events can be introduced to model internal state

changes, as shown in 2.3 X1.

2.2.1 Laws

The laws governing the behaviour of (P || Q) are exceptionally simple and

regular. The ﬁrst law expresses the logical symmetry between a process and

its environment

L1 P || Q = Q || P

The next law shows that when three processes are assembled, it does not mat-

ter in which order they are put together

L2 P || (Q || R) = (P || Q) || R

Thirdly, a deadlocked process infects the whole system with deadlock; but

composition with RUN_αP (1.1.3 X8) makes no difference

L3A P || STOP_αP = STOP_αP

L3B P || RUN_αP = P

The next laws show how a pair of processes either engage simultaneously in

the same action, or deadlock if they disagree on what the ﬁrst action should

be

L4A (c →P) || (c →Q) = (c →(P || Q))

L4B (c →P) || (d →Q) = STOP if c ≠ d

These laws readily generalise to cases when one or both processes oﬀer a choice

of initial event; only events which they both oﬀer will remain possible when

the processes are combined

L4 (x : A →P(x)) || (y : B →Q(y)) = (z : (A∩B) →(P(z) || Q(z)))

It is this law which permits a system deﬁned in terms of concurrency to be

given an alternative description without concurrency.

Example

X1 Let P = (a → b → P | b → P)

and Q = (a → (b → Q | c → Q))

Then

(P || Q)

= a → ((b → P) || (b → Q | c → Q)) [by L4A]

= a → (b → (P || Q)) [by L4]

= µ X • (a → b → X) [since the recursion is guarded]

2.2.2 Implementation

The implementation of the || operator is clearly based on L4

intersect(P, Q) = λz • if P(z) = "BLEEP or Q(z) = "BLEEP then

"BLEEP

else

intersect(P(z), Q(z))
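The definition of intersect can be tried out directly. The following is a minimal Python sketch rather than the LISP of the text: it assumes that a process is represented as a Python function from an event name to either the next process or the marker BLEEP, and it recalls VMCT from 1.1.3 X3 as an assumption. The helper names prefix and accepts are illustrative only.

```python
BLEEP = "BLEEP"  # marker returned when an event is refused

def prefix(c, p):
    """(c -> p): engage in c and then behave like p; refuse everything else."""
    return lambda x: p if x == c else BLEEP

def intersect(p, q):
    """P || Q for processes with the same alphabet, following law L4."""
    def combined(z):
        pz, qz = p(z), q(z)
        if pz == BLEEP or qz == BLEEP:
            return BLEEP            # one of the operands refuses z
        return intersect(pz, qz)    # both engage in z together
    return combined

def grcust(x):
    # GRCUST = (toffee -> GRCUST | choc -> GRCUST | coin -> choc -> GRCUST)
    if x in ("toffee", "choc"):
        return grcust
    if x == "coin":
        return prefix("choc", grcust)
    return BLEEP

def vmct(x):
    # Assumed from 1.1.3 X3: VMCT = coin -> (choc -> VMCT | toffee -> VMCT)
    if x == "coin":
        return lambda y: vmct if y in ("choc", "toffee") else BLEEP
    return BLEEP

def accepts(p, trace):
    """True if p can engage in the events of trace, in order."""
    for event in trace:
        p = p(event)
        if p == BLEEP:
            return False
    return True

system = intersect(grcust, vmct)
print(accepts(system, ["coin", "choc", "coin", "choc"]))
print(accepts(system, ["choc"]))
```

Under these assumptions the combined process accepts repeated coin, choc cycles and refuses an initial choc or toffee, in agreement with X1 of Section 2.2.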

2.2.3 Traces

Since each action of (P || Q) requires simultaneous participation of both P and

Q, each sequence of such actions must be possible for both these operands.

For the same reason, / s distributes through ||.

L1 traces(P || Q) = traces(P) ∩traces(Q)

L2 (P || Q) / s = (P / s) || (Q / s)
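Law L1 can be checked on a small case. The sketch below is an illustration under stated assumptions, not part of the text: processes are encoded as finite state machines (a dictionary from state to the events offered and their successor states), traces are enumerated only up to a fixed length, and the names traces and lockstep are hypothetical. The processes P and Q are those of 2.2.1 X1.

```python
def traces(machine, start, depth):
    """All traces of length <= depth, as a set of tuples of events."""
    result = {()}
    frontier = {((), start)}
    for _ in range(depth):
        frontier = {(t + (e,), nxt)
                    for t, state in frontier
                    for e, nxt in machine[state].items()}
        result |= {t for t, _ in frontier}
    return result

def lockstep(m1, m2):
    """P || Q for equal alphabets: only events offered by both operands remain."""
    return {(s1, s2): {e: (n1, m2[s2][e])
                       for e, n1 in m1[s1].items() if e in m2[s2]}
            for s1 in m1 for s2 in m2}

# P = (a -> b -> P | b -> P)
P = {"P": {"a": "P1", "b": "P"}, "P1": {"b": "P"}}
# Q = (a -> (b -> Q | c -> Q))
Q = {"Q": {"a": "Q1"}, "Q1": {"b": "Q", "c": "Q"}}

lhs = traces(lockstep(P, Q), ("P", "Q"), 6)
rhs = traces(P, "P", 6) & traces(Q, "Q", 6)
print(lhs == rhs)
```

Both sides contain exactly the prefixes of a, b, a, b, ... up to the chosen length, which matches the result µ X • (a → b → X) derived in 2.2.1 X1.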

2.3 Concurrency

The operator described in the previous section can be generalised to the case

when its operands P and Q have diﬀerent alphabets

αP ≠ αQ

When such processes are assembled to run concurrently, events that are in both

their alphabets (as explained in the previous section) require simultaneous

participation of both P and Q. However, events in the alphabet of P but not

in the alphabet of Q are of no concern to Q, which is physically incapable of

controlling or even of noticing them. Such events may occur independently

of Q whenever P engages in them. Similarly, Q may engage alone in events

which are in the alphabet of Q but not of P. Thus the set of all events that are

logically possible for the system is simply the union of the alphabets of the

component processes

α(P || Q) = αP ∪αQ

This is a rare example of an operator which takes operands with diﬀerent al-

phabets, and yields a result with yet a third alphabet. However in the case when

the two operands have the same alphabet, so does the resulting combination,

and (P || Q) has exactly the meaning described in the previous section.

Examples

X1 Let αNOISYVM = {coin, choc, clink, clunk, toffee}, where clink is the sound

of a coin dropping into the moneybox of a noisy vending machine, and clunk

is the sound made by the vending machine on completion of a transaction.

The noisy vending machine has run out of toﬀee

NOISYVM =

(coin →clink →choc →clunk →NOISYVM)

The customer of this machine deﬁnitely prefers toﬀee; the curse is what he

utters when he fails to get it; he then has to take a chocolate instead

αCUST = {coin, choc, curse, toffee}

CUST = (coin → (toffee → CUST | curse → choc → CUST))

The result of the concurrent activity of these two processes is

(NOISYVM || CUST) =
µ X • (coin → (clink → curse → choc → clunk → X
| curse → clink → choc → clunk → X))

Note that the clink may occur before the curse, or the other way round. They

may even occur simultaneously, and it will not matter in which order they

are recorded. Note also that the mathematical formula in no way represents

the fact that the customer prefers to get a toﬀee rather than utter a curse. The

formula is an abstraction from reality, which ignores human emotions and

concentrates on describing only the possibilities of occurrence and non-occurrence

of events within the alphabet of the processes, whether those events are de-

sired or not.

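X2 [Figure: a board of two rows and three columns; rows i = 1, 2 are numbered from the bottom and columns j = 1, 2, 3 from the left.]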

A counter starts at the middle bottom square of the board, and may move

within the board either up, down, left or right. Let

αP = {up, down}

P = (up → down → P)

αQ = {left, right}

Q = (right → left → Q | left → right → Q)

The behaviour of this counter may be defined as P || Q.

In this example, the alphabets αP and αQ have no event in common. Con-

sequently, the movements of the counter are an arbitrary interleaving of ac-

tions from the process P with actions from the process Q. Such interleavings

are very laborious to describe without concurrency. For example, let R_ij stand

for the behaviour of the counter (X2) when situated in row i and column j of

the board, for i ∈ {1, 2}, j ∈ {1, 2, 3}. Then

(P || Q) = R_12

where

R_21 = (down → R_11 | right → R_22)
R_11 = (up → R_21 | right → R_12)
R_22 = (down → R_12 | left → R_21 | right → R_23)
R_12 = (up → R_22 | left → R_11 | right → R_13)
R_23 = (down → R_13 | left → R_22)
R_13 = (up → R_23 | left → R_12)

2.3.1 Laws

The ﬁrst three laws for the extended form of concurrency are similar to those

for interaction (Section 2.2.1)

L1,2 || is symmetric and associative

L3A P || STOP_αP = STOP_αP

L3B P || RUN_αP = P

Let a ∈ (αP − αQ), b ∈ (αQ − αP) and {c, d} ⊆ (αP ∩ αQ). The following

laws show the way in which P engages alone in a, Q engages alone in b, but c

and d require simultaneous participation of both P and Q

L4A (c →P) || (c →Q) = c →(P || Q)

L4B (c →P) || (d →Q) = STOP if c ≠ d

L5A (a →P) || (c →Q) = a →(P || (c →Q))

L5B (c →P) || (b →Q) = b →((c →P) || Q)

L6 (a → P) || (b → Q) = (a → (P || (b → Q)) | b → ((a → P) || Q))

These laws can be generalised to deal with the general choice operator

L7 Let P = (x : A → P(x))

and Q = (y : B → Q(y))

Then (P || Q) = (z : C → P′ || Q′)

where C = (A ∩ B) ∪ (A − αQ) ∪ (B − αP)

and P′ = P(z) if z ∈ A
P′ = P otherwise

and Q′ = Q(z) if z ∈ B
Q′ = Q otherwise.

These laws permit a process deﬁned by concurrency to be redeﬁned without

that operator, as shown in the following example.

Example

X1 Let αP = {a, c}

αQ = {b, c}

and P = (a →c →P)

Q = (c →b →Q)

P || Q

= (a → c → P) || (c → b → Q) [by definition]

= a → ((c → P) || (c → b → Q)) [by L5A]

= a → c → (P || (b → Q)) [by L4A] …‡

Also

P || (b → Q)

= (a → ((c → P) || (b → Q))
| b → (P || Q)) [by L6]

= (a → b → ((c → P) || Q)
| b → (P || Q)) [by L5B]

= (a → b → c → (P || (b → Q))
| b → a → c → (P || (b → Q))) [by ‡ above]

= µ X • (a → b → c → X
| b → a → c → X) [since this is guarded]

Therefore

(P || Q) = (a → c → µ X • (a → b → c → X
| b → a → c → X)) [by ‡ above]
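The result of this derivation can be checked mechanically. The following Python sketch applies law L7 one event at a time; it assumes that a process is a function from an event name to either the next process or the marker BLEEP, and that alphabets are passed as Python sets. The sketch tests each successor as soon as it is computed, so a refused event is rejected immediately. The names concurrent and accepts are illustrative choices.

```python
BLEEP = "BLEEP"  # marker returned when an event is refused

def concurrent(p, alpha_p, alpha_q, q):
    """P || Q for possibly different alphabets, following law L7."""
    def aux(p, q):
        def combined(x):
            in_p, in_q = x in alpha_p, x in alpha_q
            if not (in_p or in_q):
                return BLEEP                  # x is outside both alphabets
            next_p = p(x) if in_p else p      # P takes part only if x is in its alphabet
            next_q = q(x) if in_q else q      # likewise for Q
            if next_p == BLEEP or next_q == BLEEP:
                return BLEEP                  # a required participant refuses x
            return aux(next_p, next_q)
        return combined
    return aux(p, q)

def p(x):
    # P = (a -> c -> P)
    return (lambda y: p if y == "c" else BLEEP) if x == "a" else BLEEP

def q(x):
    # Q = (c -> b -> Q)
    return (lambda y: q if y == "b" else BLEEP) if x == "c" else BLEEP

def accepts(proc, trace):
    """True if proc can engage in the events of trace, in order."""
    for event in trace:
        proc = proc(event)
        if proc == BLEEP:
            return False
    return True

system = concurrent(p, {"a", "c"}, {"b", "c"}, q)
print(accepts(system, "acabc"), accepts(system, "acbac"), accepts(system, "ab"))
```

After the initial a, c the events a and b may occur in either order before the next c, as the final formula of the example states; a trace such as a, b is refused because Q cannot engage in b before the first c.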

2.3.2 Implementation

The implementation of the operator || is derived directly from the law L7. The

alphabets of the operands are represented as ﬁnite lists of symbols, A and B.

Test of membership uses the function ismember(x, A) deﬁned in Section 1.7.

P || Q is implemented by calling a function concurrent(P, αP, αQ, Q),

which is deﬁned as follows

concurrent(P, A, B, Q) = aux(P, Q)

where

aux(P, Q) = λx • if P = "BLEEP or Q = "BLEEP then

"BLEEP

else if ismember(x, A) and ismember(x, B) then

aux(P(x), Q(x))

else if ismember(x, A) then

aux(P(x), Q)

else if ismember(x, B) then

aux(P, Q(x))

else

"BLEEP

2.3.3 Traces

Let t be a trace of (P || Q). Then every event in t which belongs to the alphabet

of P has been an event in the life of P; and every event in t which does not

belong to αP has occurred without the participation of P. Thus (t ↾ αP) is a

trace of all those events in which P has participated, and is therefore a trace

of P. By a similar argument (t ↾ αQ) is a trace of Q. Furthermore, every event

in t must be in either αP or αQ. This reasoning suggests the law

L1 traces(P || Q) =
{ t | (t ↾ αP) ∈ traces(P) ∧ (t ↾ αQ) ∈ traces(Q) ∧ t ∈ (αP ∪ αQ)* }

The next law shows how the / s operator distributes through parallel compos-

ition

L2 (P || Q) / s = (P / (s ↾ αP)) || (Q / (s ↾ αQ))

When αP = αQ, it follows that

s ↾ αP = s ↾ αQ = s

and these laws are then the same as in Section 2.2.3.

Example

X1 (See 2.3 X1.)

Let t1 = ⟨coin, clink, curse⟩, then

t1 ↾ αNOISYVM = ⟨coin, clink⟩ [which is in traces(NOISYVM)]

t1 ↾ αCUST = ⟨coin, curse⟩ [which is in traces(CUST)]

therefore

t1 ∈ traces(NOISYVM || CUST)

Similar reasoning shows that

⟨coin, curse, clink⟩ ∈ traces(NOISYVM || CUST)

This shows that the curse and the clink may be recorded one after the other

in either order. They may even occur simultaneously, but we have made the

decision to provide no way of recording this.
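The membership test given by law L1 is easy to mechanise. The following Python sketch is an illustration under stated assumptions: NOISYVM and CUST are encoded as small state machines with integer states, and the helper names restrict, is_trace and is_parallel_trace are hypothetical.

```python
def restrict(trace, alphabet):
    """t restricted to A: the events of t that belong to A, in order."""
    return tuple(e for e in trace if e in alphabet)

def is_trace(machine, start, trace):
    """True if the state machine can engage in the events of trace, in order."""
    state = start
    for e in trace:
        if e not in machine[state]:
            return False
        state = machine[state][e]
    return True

# NOISYVM = (coin -> clink -> choc -> clunk -> NOISYVM)
NOISYVM = {0: {"coin": 1}, 1: {"clink": 2}, 2: {"choc": 3}, 3: {"clunk": 0}}
ALPHA_NOISYVM = {"coin", "choc", "clink", "clunk", "toffee"}

# CUST = (coin -> (toffee -> CUST | curse -> choc -> CUST))
CUST = {0: {"coin": 1}, 1: {"toffee": 0, "curse": 2}, 2: {"choc": 0}}
ALPHA_CUST = {"coin", "choc", "curse", "toffee"}

def is_parallel_trace(t):
    """Law L1: t is a trace of (NOISYVM || CUST)."""
    return (set(t) <= ALPHA_NOISYVM | ALPHA_CUST
            and is_trace(NOISYVM, 0, restrict(t, ALPHA_NOISYVM))
            and is_trace(CUST, 0, restrict(t, ALPHA_CUST)))

print(is_parallel_trace(("coin", "clink", "curse")))
print(is_parallel_trace(("coin", "curse", "clink")))
print(is_parallel_trace(("coin", "toffee")))
```

The first two traces are accepted, showing both orders of clink and curse; the third is rejected because toffee lies in the alphabet of NOISYVM, which never offers it.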

In summary, a trace of (P || Q) is a kind of interleaving of a trace of P with

a trace of Q, in which events which are in the alphabet of both of them occur

only once. If αP ∩ αQ = {} then the traces are pure interleavings (Section 1.9.3),

as shown in 2.3 X2. At the other extreme, where αP = αQ, every event belongs

to both of the alphabets, and the meaning of (P || Q) is exactly as deﬁned for

interaction (Section 2.2).

L3A If αP ∩ αQ = {}, then

traces(P || Q) = { s | ∃ t : traces(P); u : traces(Q) • s interleaves (t, u) }

L3B If αP = αQ then

traces(P || Q) = traces(P) ∩traces(Q)

2.4 Pictures

A process P with alphabet {a, b, c} is pictured as a box labelled P, from which

emerge a number of lines, each labelled with a different event from its

alphabet (Figure 2.1). Similarly, Q with its alphabet {b, c, d} may be pictured as in

Figure 2.2.

[Figure 2.1: a box labelled P with lines a, b and c]

[Figure 2.2: a box labelled Q with lines b, c and d]

When these two processes are put together to evolve concurrently, the

resulting system may be pictured as a network in which similarly labelled lines

are connected, but lines labelled by events in the alphabet of only one process

are left free (Figure 2.3).

[Figure 2.3: boxes P and Q with their lines b and c connected; line a of P and line d of Q are left free]

A third process R with αR = {c, e} may be added, as shown in Figure 2.4.

This diagram shows that the event c requires participation of all three pro-

cesses, b requires participation of P and Q, whereas each remaining event is

the sole concern of a single process. Pictures of this kind will be known as

connection diagrams.

[Figure 2.4: boxes P, Q and R; line c connects all three, line b connects P and Q, and lines a, d and e are left free]

[Figure 2.5: a single box labelled (P || Q || R) with lines a, b, c, d and e]

But these pictures could be quite misleading. A system constructed from

three processes is still only a single process, and may therefore be pictured as a

single box (Figure 2.5). The number 60 can be constructed as the product of

three other numbers (3 ×4 ×5); but after it has been so constructed it is still

only a single number, and the manner of its construction is no longer relevant

or even observable.

2.5 Example: The Dining Philosophers

In ancient times, a wealthy philanthropist endowed a College to accommodate

ﬁve eminent philosophers. Each philosopher had a room in which he could en-

gage in his professional activity of thinking; there was also a common dining

room, furnished with a circular table, surrounded by ﬁve chairs, each labelled

by the name of the philosopher who was to sit in it. The names of the

philosophers were PHIL_0, PHIL_1, PHIL_2, PHIL_3, PHIL_4, and they were disposed in this

order anticlockwise around the table. To the left of each philosopher there was

laid a golden fork, and in the centre stood a large bowl of spaghetti, which was

constantly replenished.

A philosopher was expected to spend most of his time thinking; but when

he felt hungry, he went to the dining room, sat down in his own chair, picked

up his own fork on his left, and plunged it into the spaghetti. But such is the

tangled nature of spaghetti that a second fork is required to carry it to the

mouth. The philosopher therefore had also to pick up the fork on his right.

When he was finished he would put down both his forks, get up from his chair,

and continue thinking. Of course, a fork can be used by only one philosopher

at a time. If the other philosopher wants it, he just has to wait until the fork

is available again.

2.5.1 Alphabets

We shall now construct a mathematical model of this system. First we must

select the relevant sets of events. For PHIL_i, the set is defined

αPHIL_i = {i.sits down, i.gets up,
i.picks up fork.i, i.picks up fork.(i ⊕ 1),
i.puts down fork.i, i.puts down fork.(i ⊕ 1)}

where ⊕ is addition modulo 5, so i ⊕ 1 identiﬁes the right-hand neighbour of

the ith philosopher.

Note that the alphabets of the philosophers are mutually disjoint. There

is no event in which they participate jointly, so there is no way whatsoever in

which they can interact or communicate with each other—a realistic reﬂection

of the behaviour of philosophers in those days.

The other actors in our little drama are the ﬁve forks, each of which bears

the same number as the philosopher who owns it. A fork is picked up and put

down either by this philosopher, or by his neighbour on the other side. The

alphabet of the ith fork is deﬁned

αFORK_i = {i.picks up fork.i, (i ⊖ 1).picks up fork.i,
i.puts down fork.i, (i ⊖ 1).puts down fork.i}

where ⊖ denotes subtraction modulo 5.

Thus each event except sitting down and getting up requires participa-

tion of exactly two adjacent actors, a philosopher and a fork, as shown in the

connection diagram of Figure 2.6.

[Figure 2.6: connection diagram of the five philosophers and the five forks, which are numbered 1 to 5 in the figure. The events sits down and gets up of each philosopher are free lines; each picks up and puts down event connects a philosopher either to his own fork or to the fork of his right-hand neighbour.]

2.5.2 Behaviour

Apart from thinking and eating which we have chosen to ignore, the life of

each philosopher is described as the repetition of a cycle of six events

PHIL_i = (i.sits down →
i.picks up fork.i →
i.picks up fork.(i ⊕ 1) →
i.puts down fork.i →
i.puts down fork.(i ⊕ 1) →
i.gets up → PHIL_i)

The role of a fork is a simple one; it is repeatedly picked up and put down

by one of its adjacent philosophers (the same one on both occasions)

FORK_i = (i.picks up fork.i → i.puts down fork.i → FORK_i
| (i ⊖ 1).picks up fork.i → (i ⊖ 1).puts down fork.i → FORK_i)

The behaviour of the whole College is the concurrent combination of the be-

haviour of each of these components

PHILOS = (PHIL_0 || PHIL_1 || PHIL_2 || PHIL_3 || PHIL_4)

FORKS = (FORK_0 || FORK_1 || FORK_2 || FORK_3 || FORK_4)

COLLEGE = PHILOS || FORKS

An interesting variation of this story allows the philosophers to pick up

their two forks in either order, or put them down in either order. Consider

the behaviour of each philosopher’s hand separately. Each hand is capable of

picking up the relevant fork, but both hands are needed for sitting down and

getting up

αLEFT_i = {i.picks up fork.i, i.puts down fork.i,
i.sits down, i.gets up}

αRIGHT_i = {i.picks up fork.(i ⊕ 1), i.puts down fork.(i ⊕ 1),
i.sits down, i.gets up}

LEFT_i = (i.sits down → i.picks up fork.i →
i.puts down fork.i → i.gets up → LEFT_i)

RIGHT_i = (i.sits down → i.picks up fork.(i ⊕ 1) →
i.puts down fork.(i ⊕ 1) → i.gets up → RIGHT_i)

PHIL_i = LEFT_i || RIGHT_i

Synchronisation of sitting down and getting up by both LEFT_i and RIGHT_i

ensures that no fork can be raised except when the relevant philosopher is seated.

Apart from this, operations on the two forks are arbitrarily interleaved.

In yet another variation of the story, each fork may be picked up and put

down many times on each occasion that the philosopher sits down. Thus the

behaviour of the hands is modiﬁed to contain an iteration, for example

LEFT_i = (i.sits down →
µ X • (i.picks up fork.i → i.puts down fork.i → X
| i.gets up → LEFT_i))

2.5.3 Deadlock!

When a mathematical model had been constructed, it revealed a serious danger.

Suppose all the philosophers get hungry at about the same time; they all sit

down; they all pick up their own forks; and they all reach out for the other

fork—which isn’t there. In this undigniﬁed situation, they will all inevitably

starve. Although each actor is capable of further action, there is no action

which any pair of them can agree to do next.

However, our story does not end so sadly. Once the danger was detected,

there were suggested many ways to avert it. For example, one of the philo-

sophers could always pick up the wrong fork ﬁrst—if only they could agree

which one it should be! The purchase of a single additional fork was ruled

out for similar reasons, whereas the purchase of ﬁve more forks was much too

expensive. The solution ﬁnally adopted was the appointment of a footman,

whose duty it was to assist each philosopher into and out of his chair. His

alphabet was deﬁned as

⋃_{i=0}^{4} {i.sits down, i.gets up}

This footman was given secret instructions never to allow more than four

philosophers to be seated simultaneously. His behaviour is most simply defined by

mutual recursion. Let

U = ⋃_{i=0}^{4} {i.gets up}    D = ⋃_{i=0}^{4} {i.sits down}

FOOT_j defines the behaviour of the footman with j philosophers seated

FOOT_0 = (x : D → FOOT_1)

FOOT_j = (x : D → FOOT_{j+1} | y : U → FOOT_{j−1}) for j ∈ {1, 2, 3}

FOOT_4 = (y : U → FOOT_3)

A college free of deadlock is defined

NEWCOLLEGE = (COLLEGE || FOOT_0)

The edifying tale of the dining philosophers is due to Edsger W. Dijkstra. The

footman is due to Carel S. Scholten.
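The deadlock of the COLLEGE and its removal by the footman can be observed in a small simulation. The following Python sketch is one possible encoding and not part of the original model: each component is a finite state machine, events are written as tuples such as ("picks up", i, j) for i.picks up fork.j, and the search visits every reachable state once rather than enumerating traces. All function names and state names are illustrative assumptions.

```python
from collections import deque

N = 5  # philosophers and forks, numbered 0..4

def cycle(events):
    """State machine that repeats the listed events in order."""
    return {k: {e: (k + 1) % len(events)} for k, e in enumerate(events)}

def phil(i):
    r = (i + 1) % N  # i (+) 1, the right-hand neighbour
    return cycle([("sits down", i), ("picks up", i, i), ("picks up", i, r),
                  ("puts down", i, i), ("puts down", i, r), ("gets up", i)])

def fork(i):
    left = (i - 1) % N  # i (-) 1
    return {"free": {("picks up", i, i): "own", ("picks up", left, i): "other"},
            "own": {("puts down", i, i): "free"},
            "other": {("puts down", left, i): "free"}}

def footman():
    moves = {}
    for j in range(N):          # j philosophers seated, 0 <= j <= 4
        moves[j] = {}
        if j < N - 1:           # never seat more than four
            moves[j].update({("sits down", i): j + 1 for i in range(N)})
        if j > 0:
            moves[j].update({("gets up", i): j - 1 for i in range(N)})
    return moves

def deadlocked_states(components, starts):
    """Breadth-first search of reachable states; returns those offering no event."""
    alphabets = [{e for moves in m.values() for e in moves} for m in components]
    involved = {e: [k for k, a in enumerate(alphabets) if e in a]
                for e in set().union(*alphabets)}
    start = tuple(starts)
    seen, queue, stuck = {start}, deque([start]), []
    while queue:
        state = queue.popleft()
        # an event can occur only if every component having it in its alphabet offers it
        enabled = [e for e, ks in involved.items()
                   if all(e in components[k][state[k]] for k in ks)]
        if not enabled:
            stuck.append(state)
        for e in enabled:
            nxt = list(state)
            for k in involved[e]:
                nxt[k] = components[k][state[k]][e]
            nxt = tuple(nxt)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return stuck

college = [phil(i) for i in range(N)] + [fork(i) for i in range(N)]
starts = [0] * N + ["free"] * N
print(deadlocked_states(college, starts))
print(deadlocked_states(college + [footman()], starts + [0]))
```

In this encoding the only stuck state of the COLLEGE is the one in which every philosopher holds his own fork and waits for the other, and adding the footman leaves no reachable stuck state.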

2.5.4 Proof of absence of deadlock

In the original COLLEGE the risk of deadlock was far from obvious; the claim

that NEWCOLLEGE is free from deadlock should therefore be proved with some

care. What we must prove can be stated formally as

(NEWCOLLEGE / s) ≠ STOP for all s ∈ traces(NEWCOLLEGE)

The proof proceeds by taking an arbitrary trace s, and showing that in all

cases there is at least one event by which s can be extended and still remain in

traces(NEWCOLLEGE). First we deﬁne the number of seated philosophers

seated(s) = #(s ↾ D) − #(s ↾ U) where U and D are defined above

Because (by 2.3.3 L1) s ↾ (U ∪ D) ∈ traces(FOOT_0), we know seated(s) ≤ 4. If

seated(s) ≤ 3, at least one more philosopher can sit down, so that there is no

deadlock. In the remaining case that seated(s) = 4, consider the number of

philosophers who are eating (with both their forks raised). If this is nonzero,

then an eating philosopher can always put down his left fork. In the remaining

case, that no philosopher is eating, consider the number of raised forks. If this

is three or less, then one of the seated philosophers can pick up his left fork. If

there are four raised forks, then the philosopher to the left of the vacant seat

already has raised his left fork and can pick up his right one. If there are ﬁve

raised forks, then at least one of the seated philosophers must be eating.

This proof involves analysis of a number of cases, described informally in

terms of the behaviour of this particular example. Let us consider an altern-

ative proof method: program a computer to explore all possible behaviours

of the system to look for deadlock. In general, we could never know whether

such a program had looked far enough to guarantee absence of deadlock. But

in the case of a finite-state system like the COLLEGE it is sufficient to consider

only those traces whose length does not exceed a known upper bound on

the number of states. The number of states of (P || Q) does not exceed the

product of the number of states of P and the number of states of Q. Since each

philosopher has six states and each fork has three states, the total number of

states of the COLLEGE does not exceed

6^5 × 3^5, or approximately 1.8 million

Since the alphabet of the footman is contained in that of the COLLEGE, the

NEWCOLLEGE cannot have more states than the COLLEGE. Since in nearly

every state there are two or more possible events, the number of traces that

must be examined will exceed two raised to the power of 1.8 million. There

is no hope that a computer will ever be able to explore all these possibilities.

Proof of the absence of deadlock, even for quite simple ﬁnite processes, will

remain the responsibility of the designer of concurrent systems.
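The bound on the number of states quoted above follows from the product rule for (P || Q). A short Python sketch of the arithmetic, with variable names chosen only for illustration:

```python
phil_states = 6   # one state before each of the six events in a philosopher's cycle
fork_states = 3   # on the table, held by its owner, held by the neighbour
count = 5         # five philosophers and five forks

# The number of states of a parallel composition is at most the product
# of the numbers of states of its components.
bound = phil_states ** count * fork_states ** count
print(bound)  # 1889568
```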

2.5.5 Inﬁnite overtaking

Apart from deadlock, there is another danger that faces a dining philosopher:

that of being inﬁnitely overtaken. Suppose that a seated philosopher has an

extremely greedy left neighbour, and a rather slow left arm. Before he can pick

up his left fork, his left neighbour rushes in, sits down, rapidly picks up both

forks, and spends a long time eating. Eventually he puts down both forks,

and leaves his seat. But then the left neighbour instantly gets hungry again,

rushes in, sits down, and rapidly snatches both forks, before his long-seated

and long-suﬀering right neighbour gets around to picking up the fork they

share. Since this cycle may be repeated indeﬁnitely, a seated philosopher may

never succeed in eating.

The correct solution to this problem is probably to regard it as insoluble,

because if any philosopher is as greedy as described above, then somebody

(either he or his neighbours) will inevitably spend a long time hungry. There is

no clever way of ensuring general satisfaction, and the only eﬀective solution

is to buy more forks, and plenty of spaghetti. However, if it is important to

guarantee that a seated philosopher will eventually eat, modify the behaviour

of the footman: having helped a philosopher to his seat he waits until that

philosopher has picked up both forks before he allows either of his neighbours

to sit down.

But there remains a more philosophical problem about inﬁnite overtaking.

Suppose the footman conceives an irrational dislike for one of his philosoph-

ers, and persistently delays the action of escorting him to his chair, even when

the philosopher is ready to engage in that event. This is a possibility that can-

not be described in our conceptual framework, because we cannot distinguish

it from the possibility that the philosopher himself takes an indeﬁnitely long

time to get hungry. So here is a problem, like detailed timing problems, which

we have deliberately decided to ignore, or rather to delegate it to a diﬀerent

phase of design and implementation. It is an implementor’s responsibility to

ensure that any desirable event that becomes possible will take place within

an acceptable interval. The implementor of a conventional high-level program-

ming language has a similar obligation not to insert arbitrary delays into the

execution of a program, even though the programmer has no way of enforcing

or even describing this obligation.

2.6 Change of symbol

The example of the previous section involved two collections of processes,

philosophers and forks; within each collection the processes have very similar

behaviour, except that the names of the events in which they engage are dif-

ferent. In this section we introduce a convenient method of deﬁning groups of

processes with similar behaviour. Let f be a one-one function (injection) which

maps the alphabet of P onto a set of symbols A

f : αP →A

We deﬁne the process f (P) as one which engages in the event f (c) whenever

P would have engaged in c. It follows that

αf (P) = f (αP)

traces(f(P)) = { f*(s) | s ∈ traces(P) }

(For the definition of f* see 1.9.1).
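These two equations can be illustrated on a small process. The Python sketch below is a hedged illustration: it assumes that a process is encoded as a finite state machine and that f is given as a dictionary, and it borrows P = (up → down → P) from 2.3 X2 with the renaming of up to right and down to left. The names change_symbol, f_star and traces are hypothetical.

```python
def f_star(f, trace):
    """f*(s): apply f to every symbol of the trace s."""
    return tuple(f[e] for e in trace)

def change_symbol(f, machine):
    """f(P): relabel every transition of the state machine; f must be one-one."""
    if len(set(f.values())) != len(f):
        raise ValueError("f must be a one-one function (an injection)")
    return {state: {f[e]: nxt for e, nxt in moves.items()}
            for state, moves in machine.items()}

def traces(machine, start, depth):
    """All traces of length <= depth, as a set of tuples of events."""
    result = {()}
    frontier = {((), start)}
    for _ in range(depth):
        frontier = {(t + (e,), nxt)
                    for t, state in frontier
                    for e, nxt in machine[state].items()}
        result |= {t for t, _ in frontier}
    return result

# P = (up -> down -> P), with alphabet {up, down}
P = {0: {"up": 1}, 1: {"down": 0}}
f = {"up": "right", "down": "left"}

fP = change_symbol(f, P)
print(traces(fP, 0, 4) == {f_star(f, s) for s in traces(P, 0, 4)})
```

The injectivity check reflects the requirement that f be one-one; without it two distinct events of P could be merged into a single event of f(P).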

Examples

X1 After a few years, the price of everything goes up. To represent the eﬀect

of inﬂation, we deﬁne a function f by the following equations

f (in2p) = in10p f (large) = large

f (in1p) = in5p f (small) = small

f (out1p) = out5p

The new vending machine is

NEWVMC = f (VMC)

X2 A counter behaves like CT_0 (1.1.4 X2), except that it moves right and left

instead of up and down

f(up) = right, f(down) = left, f(around) = around,

LR_0 = f(CT_0)

The main reason for changing event names of processes in this fashion is

to enable them to be composed usefully in concurrent combination.

X3 A counter moves left, right, up or down on an inﬁnite board with bound-

aries at the left and at the bottom

It starts at the bottom left corner. On this square alone, it can turn around. As

in 2.3 X2, vertical and horizontal movements can be modelled as independent

actions of separate processes; but around requires simultaneous participation

of both

LRUD = LR_0 || CT_0

X4 We wish to connect two instances of COPYBIT (1.1.3 X7) in series, so that

each bit output by the ﬁrst is simultaneously input by the second. First, we

need to change the names of the events used for internal communication; we

therefore introduce two new events mid.0 and mid.1, and deﬁne the functions

f and g to change the output of one process and the input of the other

f (out.0) = g(in.0) = mid.0

f (out.1) = g(in.1) = mid.1

f (in.0) = in.0, f (in.1) = in.1

g(out.0) = out.0, g(out.1) = out.1

The answer we want is

CHAIN2 = f (COPYBIT) || g(COPYBIT)

Note that each output of 0 or 1 by the left operand of || is (by the deﬁnition of

f and g) the very same event (mid.0 or mid.1) as the input of the same 0 or 1

by the right operand. This models the synchronised communication of binary

digits on a channel which connects the two operands, as shown in Figure 2.7.

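[Figure 2.7: two COPYBIT boxes in series; the left box has free lines in.0 and in.1, the right box has free lines out.0 and out.1, and the lines mid.0 and mid.1 connect them]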

The left operand oﬀers no choice of which value is transmitted on the con-

necting channel, whereas the right operand is prepared to engage in either of

the events mid.0 or mid.1. It is therefore the outputting process that determ-

ines on each occasion which of these two events will occur. This method of

communication between concurrent processes will be generalised in Chapter 4.

Note that the internal communications mid.0 and mid.1 are still in the

alphabet of the composite processes, and can be observed (or even perhaps

controlled) by its environment. Sometimes one wishes to ignore or conceal

such internal events; in the general case such concealment may introduce non-

determinism, so this topic is postponed to Section 3.5.

X5 We wish to represent the behaviour of a Boolean variable used by a com-

puter program. The events in its alphabet are

assign0—assignment of value zero to the variable

assign1—assignment of value one to the variable

fetch0—access of the value of the variable at a time when it is zero

fetch1—access of the value of the variable at a time when it is one

The behaviour of the variable is remarkably similar to that of the drinks dis-

penser (1.1.4 X1), so we deﬁne

BOOL = f (DD)

where the deﬁnition of f is a trivial exercise. Note that the Boolean variable

refuses to give its value until after a value has been ﬁrst assigned. An attempt

to fetch an unassigned value would result in deadlock—which is probably the

kindest failure mode for incorrect programs, because the simplest postmortem

will pinpoint the error.

The tree picture of f (P) may be constructed from the tree picture of P by

simply applying the function f to the labels on all the branches. Because f

is a one-one function, this transformation preserves the structure of the tree

and the important distinctness of labels on all branches leading from the same

node. For example, a picture of NEWVMC is shown in Figure 2.8.

[Figure 2.8: tree picture of NEWVMC, with branches labelled in5p, in10p, out5p, small and large]

2.6.1 Laws

Change of symbol by application of a one-one function does not change the

structure of the behaviour of a process. This is reﬂected by the fact that func-

tion application distributes through all the other operators, as described in the

following laws. The following auxiliary deﬁnitions are used

f(B) = { f(x) | x ∈ B }

f⁻¹ is the inverse of f

f ◦ g is the composition of f and g

f* is defined in Section 1.9.1

(the need for f⁻¹ in the following laws is an important reason for insisting that

f is an injection).

After change of symbol, STOP still performs no event from its changed

alphabet

L1 f(STOP_A) = STOP_f(A)

In the case of a choice, the symbols oﬀered for selection are changed, and the

subsequent behaviour is similarly changed

L2 f(x : B → P(x)) = (y : f(B) → f(P(f^{-1}(y))))

The use of f^{-1} on the right-hand side may need explanation. Recall that P is a function delivering a process depending on selection of some x from the set B. But the variable y on the right-hand side is selected from the set f(B). The corresponding event for P is f^{-1}(y), which is in B (since y ∈ f(B)). The behaviour of P after this event is P(f^{-1}(y)), and the actions of this process must continue to be changed by application of f.

Change of symbol simply distributes through parallel composition

L3 f (P || Q) = f (P) || f (Q)

Change of symbol distributes in a slightly more complex way over recursion,

changing the alphabet in the appropriate way

L4 f(µ X : A • F(X)) = (µ Y : f(A) • f(F(f^{-1}(Y))))

Again, the use of f^{-1} on the right-hand side may be puzzling. Recall that the validity of the recursion on the left-hand side requires that F is a function which takes as argument a process with alphabet A, and delivers a process with the same alphabet. On the right-hand side, Y is a variable ranging over processes with alphabet f(A), and cannot be used as an argument to F until its alphabet has been changed back to A. This is done by applying the inverse function f^{-1}. Now F(f^{-1}(Y)) has alphabet A, so an application of f will transform the alphabet to f(A), thus ensuring the validity of the recursion on the right-hand side of the law.

The composition of two changes of symbol is deﬁned by the composition

of the two symbol-changing functions

L5 f (g(P)) = (f ◦ g) (P)

The traces of a process after change of symbol are obtained simply by

changing the individual symbols in every trace of the original process

L6 traces(f(P)) = { f^*(s) | s ∈ traces(P) }

The explanation of the next and ﬁnal law is similar to that of L6

L7 f(P) / f^*(s) = f(P / s)
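Laws L6 and L7 can be checked on a small finite example. The following Python sketch is an illustration only: it represents a process by a finite set of its traces, and the names f_star, rename, after and VMS_TRACES, as well as the particular symbol change, are assumptions made for this example rather than anything defined in the text.

```python
def f_star(f, s):
    """Apply the symbol change f to every symbol of the trace s (the f^* of L6)."""
    return tuple(f[x] for x in s)

def rename(f, traces):
    """Trace set of f(P), computed as in law L6."""
    return {f_star(f, s) for s in traces}

def after(traces, s):
    """Trace set of P / s: every t such that s followed by t is a trace of P."""
    return {t[len(s):] for t in traces if t[:len(s)] == s}

# Assumed example: the shortest traces of a simple vending machine.
VMS_TRACES = {(), ("coin",), ("coin", "choc"), ("coin", "choc", "coin")}
f = {"coin": "in5p", "choc": "out"}  # one-one, as the laws require

s = ("coin",)
lhs = after(rename(f, VMS_TRACES), f_star(f, s))  # f(P) / f^*(s)
rhs = rename(f, after(VMS_TRACES, s))             # f(P / s)
```

On this example both sides of L7 yield the same trace set; the check relies on f being one-one, since otherwise distinct traces could be merged by the renaming.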


2.6.2 Process labelling

Change of symbol is particularly useful in constructing groups of similar pro-

cesses which operate concurrently in providing identical services to their com-

mon environment, but which do not interact with each other in any way at all.

This means that they must all have diﬀerent and mutually disjoint alphabets.

To achieve this, each process is labelled by a diﬀerent name; and each event

of a labelled process is also labelled by its name. A labelled event is a pair l.x

where l is a label, and x is the symbol standing for the event.

A process P labelled by l is denoted by

l : P

It engages in the event l.x whenever P would have engaged in x. The function

required to deﬁne l : P is

f_l(x) = l.x for all x in αP

and the definition of labelling is

l : P = f_l(P)

Examples

X1 A pair of vending machines standing side by side

(left : VMS) || (right : VMS)

The alphabets of the two processes are disjoint, and every event that occurs

is labelled by the name of the machine on which it occurred. If the machines

were not named before being placed in parallel, every event would require

participation of both of them, and the pair would be indistinguishable from a

single machine; this is a consequence of the fact that

(VMS || VMS) = VMS

The labelling of processes permits them to be used in the manner of variables in a high-level programming language, declared locally in the block of program which uses them.

X2 The behaviour of a Boolean variable is modelled by BOOL (2.6 X5). The

behaviour of a block of program is represented by a process USER. This process

assigns and accesses the values of two Boolean variables named b and c. Thus

αUSER includes such compound events as

b.assign.0—to assign value zero to b

c.fetch.1—to access the current value of c when it is one


The USER process runs in parallel with its two Boolean variables

b : BOOL || c : BOOL || USER

Inside the USER program, the following eﬀects may be achieved

b := false ; P    by    (b.assign.0 → P)

b := ¬ c ; P      by    (c.fetch.0 → b.assign.1 → P
                         | c.fetch.1 → b.assign.0 → P)

Note how the current value of the variable is discovered by allowing the variable

to make the choice between fetch.0 and fetch.1; and this choice aﬀects in an

appropriate way the subsequent behaviour of the USER.

In X2 and the following examples it would have been more convenient to

deﬁne the eﬀect of the single assignment, e.g.,

b := false

rather than the pair of commands

b := false ; P

which explicitly mentions the rest of the program P. The means of doing this

will be introduced in Chapter 5.

X3 A USER process needs two count variables named l and m. They are ini-

tialised to 0 and 3 respectively. The USER process increments each variable by

l.up or m.up, and decrements it (when positive) by l.down and m.down. A test

of zero is provided by the events l.around and m.around. Thus the process

CT (1.1.4 X2) can be used after appropriate labelling by l and by m

(l : CT_0 || m : CT_3 || USER)

Within the USER process the following eﬀects (expressed in conventional nota-

tion) can be achieved

(m := m + 1 ; P)          by    (m.up → P)

if l = 0 then P else Q    by    (l.around → P
                                 | l.down → l.up → Q)

Note how the test for zero works: an attempt is made by l.down to reduce

the count by one, at the same time as attempting l.around. The count selects

between these two events: if the value is zero, l.around is selected; if non-zero,

the other. But in the latter case, the value of the count has been decremented,

and it must immediately be restored to its original value by l.up. In the next

example, restoration of the original value is more laborious.


(m := m+l ; P)

is implemented by ADD, where ADD is deﬁned recursively

ADD = DOWN_0

DOWN_i = (l.down → DOWN_{i+1} | l.around → UP_i)

UP_0 = P

UP_{i+1} = l.up → m.up → UP_i

The DOWN_i processes discover the initial value of l by decrementing it to zero. The UP_i processes then add the discovered value to both m and to l, thereby restoring l to its initial value and adding this value to m.
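The behaviour of ADD can be followed step by step in a small simulation. The Python sketch below is an illustration only: it models the two count variables as plain integers rather than as CT processes, and the function name add, its parameters and the event strings are assumptions made for this example.

```python
def add(l_val, m_val):
    """Simulate ADD: return the final values of l and m and the events performed."""
    events = []
    i = 0
    # DOWN_i: while l is positive the counter accepts l.down, so i counts upwards.
    while l_val > 0:
        l_val -= 1
        i += 1
        events.append("l.down")
    # When l reaches zero, only l.around is possible; control passes to UP_i.
    events.append("l.around")
    # UP_{i+1} = l.up -> m.up -> UP_i, repeated until UP_0 = P is reached.
    while i > 0:
        l_val += 1
        m_val += 1
        i -= 1
        events += ["l.up", "m.up"]
    return l_val, m_val, events

final_l, final_m, trace = add(2, 3)
```

Starting from l = 2 and m = 3, the simulation performs two l.down events, one l.around, and then two rounds of l.up followed by m.up, leaving l restored and m increased by the original value of l.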

The eﬀect of an array variable can be achieved by a collection of concurrent

processes, each labelled by its index within the array.

X4 The purpose of the process EL is to record whether the event in has oc-

curred or not. On the ﬁrst occurrence of in it responds no and on each sub-

sequent occurrence it responds yes

αEL = {in, no, yes}

EL = in →no →µ X • (in →yes →X)

This process can be used in an array to mimic the behaviour of a set of small

integers

SET3 = (0 : EL) || (1 : EL) || (2 : EL) || (3 : EL)

The whole array can be labelled yet again before use

m : SET3 || USER

Each event in α(m : SET3) is a triple, e.g., m.2.in. Within the USER process, the

eﬀect of

if 2 ∈ m then P else (m := m ∪ {2} ; Q)

may be achieved by

m.2.in → (m.2.yes → P | m.2.no → Q)

2.6.3 Implementation

To implement symbol change in general, we need to know the inverse g of

the symbol-changing function f . We also need to ensure that g will give the


special answer "BLEEP when applied to an argument outside the range of f .

The implementation is based upon 2.6.1 L4.

change(g, P) = λx • if g(x) = "BLEEP then
                        "BLEEP
                    else if P(g(x)) = "BLEEP then
                        "BLEEP
                    else
                        change(g, P(g(x)))

The special case of process labelling can be implemented more simply.

The compound event l.x is represented as the pair of atoms cons("l, "x). Now

(l : P) is implemented by

label(l, P) = λy • if null(y) or atom(y) then
                       "BLEEP
                   else if car(y) ≠ l then
                       "BLEEP
                   else if P(cdr(y)) = "BLEEP then
                       "BLEEP
                   else
                       label(l, P(cdr(y)))
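The two definitions above translate almost directly into a language with first-class functions. The following Python sketch is one possible rendering, not the book's implementation: a process is assumed to be a function from an event to either a new process or a sentinel BLEEP, a labelled event l.x is assumed to be the tuple (l, x), and the helpers prefix, vms, g and accepts are hypothetical names introduced for the example.

```python
BLEEP = object()  # sentinel standing in for the "BLEEP atom

def prefix(c, p):
    """(c -> p): accept only the event c, then behave like p."""
    return lambda x: p if x == c else BLEEP

def vms(x):
    """Assumed example process: VMS = coin -> choc -> VMS."""
    return prefix("choc", vms) if x == "coin" else BLEEP

def change(g, p):
    """f(P), where g is the inverse of f and returns BLEEP outside the range of f."""
    def changed(x):
        y = g(x)
        if y is BLEEP:
            return BLEEP
        q = p(y)
        return BLEEP if q is BLEEP else change(g, q)
    return changed

def label(l, p):
    """(l : P): accept only pairs (l, x) where P accepts x."""
    def labelled(y):
        if not isinstance(y, tuple) or len(y) != 2 or y[0] != l:
            return BLEEP
        q = p(y[1])
        return BLEEP if q is BLEEP else label(l, q)
    return labelled

def g(x):
    """Inverse of the assumed symbol change coin -> in5p, choc -> out."""
    return {"in5p": "coin", "out": "choc"}.get(x, BLEEP)

def accepts(p, trace):
    """True if the process p can engage in every event of trace, in order."""
    for x in trace:
        p = p(x)
        if p is BLEEP:
            return False
    return True
```

For instance, accepts(label("left", vms), [("left", "coin"), ("left", "choc")]) holds, while the same process refuses both the unlabelled event "coin" and the event ("right", "coin").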

2.6.4 Multiple labelling

The definition of labelling can be extended to allow each event to take any label

l from a set L. If P is a process, (L : P) is deﬁned as a process which behaves

exactly like P, except that it engages in the event l.c (where l ∈ L and c ∈ αP)

whenever P would have done c. The choice of the label l is made independently

on each occasion by the environment of (L : P).

Example

X1 A lackey is a junior footman, who helps his single master to and from his

seat, and stands behind his chair while he eats

αLACKEY = {sits down, gets up}

LACKEY = (sits down → gets up → LACKEY)

To teach the lackey to share his services among ﬁve masters (but serving only

one at a time), we deﬁne


L = {0, 1, 2, 3, 4}

SHARED LACKEY = (L : LACKEY)

The shared lackey could be employed to protect the dining philosophers from

deadlock when the footman (2.5.3) is on holiday. Of course the philosophers

may go hungrier during the holiday, since only one of them is allowed to the

table at a time.

If L contains more than one label, the tree picture of L : P is similar to

that for P; but it is much more bushy in the sense that there are many more

branches leading from each node. For example, the picture of the LACKEY is

a single trunk with no branches (Figure 2.9).

[Figure 2.9: tree picture of LACKEY, a single trunk labelled sits down, gets up, sits down]

However, the picture of {0, 1} : LACKEY is a complete binary tree (Figure 2.10);

the tree for the SHARED LACKEY is even more bushy.

[Figure 2.10: tree picture of {0, 1} : LACKEY, branching on 0.sits down and 1.sits down, then on 0.gets up and 1.gets up]

In general, multiple labelling can be used to share the services of a single

process among a number of other labelled processes, provided that the set

of labels is known in advance. This technique will be exploited more fully in

Chapter 6.


2.7 Speciﬁcations

Let P and Q be processes intended to run concurrently, and suppose we have proved that P sat S(tr) and that Q sat T(tr). Let tr be a trace of (P || Q). It follows by 2.3.3 L1 that (tr ↾ αP) is a trace of P, and consequently it satisfies S, i.e.,

S(tr ↾ αP)

Similarly, (tr ↾ αQ) is a trace of Q, so

T(tr ↾ αQ)

This argument holds for every trace of (P || Q). Consequently we may deduce

(P || Q) sat (S(tr ↾ αP) ∧ T(tr ↾ αQ))

This informal reasoning is summarised in the law

L1 If P sat S(tr)
   and Q sat T(tr)
   then (P || Q) sat (S(tr ↾ αP) ∧ T(tr ↾ αQ))

Example

X1 (See 2.3.1 X1)

Let αP = {a, c}

    αQ = {b, c}

and P = (a → c → P)

    Q = (c → b → Q)

We wish to prove that

(P || Q) sat 0 ≤ tr ↓ a − tr ↓ b ≤ 2

The proof of 1.10.2 X1 can obviously be adapted to show that

P sat (0 ≤ tr ↓ a − tr ↓ c ≤ 1)

and

Q sat (0 ≤ tr ↓ c − tr ↓ b ≤ 1)

By L1 it follows that

(P || Q) sat (0 ≤ (tr ↾ αP) ↓ a − (tr ↾ αP) ↓ c ≤ 1 ∧
              0 ≤ (tr ↾ αQ) ↓ c − (tr ↾ αQ) ↓ b ≤ 1)

⇒ 0 ≤ tr ↓ a − tr ↓ b ≤ 2    [since (tr ↾ A) ↓ a = tr ↓ a whenever a ∈ A]
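The conclusion of X1 can also be tested exhaustively on short traces. The Python sketch below is a bounded check under stated assumptions, not a proof: the state encoding (a pair of positions within the two cycles), the function names steps and traces, and the bound of eight events are all choices made for this illustration.

```python
def steps(state):
    """Events possible for (P || Q) in a state, with the state that follows each.

    state = (p, q): p is 0 before a and 1 before c in P = (a -> c -> P);
    q is 0 before c and 1 before b in Q = (c -> b -> Q).
    """
    p, q = state
    if p == 0:
        yield "a", (1, q)        # a belongs to P alone
    if p == 1 and q == 0:
        yield "c", (0, 1)        # c needs both P and Q
    if q == 1:
        yield "b", (p, 0)        # b belongs to Q alone

def traces(max_len):
    """All traces of (P || Q) of length at most max_len."""
    frontier = [((), (0, 0))]
    result = [()]
    for _ in range(max_len):
        frontier = [(tr + (ev,), nxt)
                    for tr, st in frontier
                    for ev, nxt in steps(st)]
        result.extend(tr for tr, _ in frontier)
    return result

# Bounded check of the specification 0 <= tr|a - tr|b <= 2.
spec_holds = all(0 <= tr.count("a") - tr.count("b") <= 2 for tr in traces(8))
```

Every one of the four states offers at least one event, which agrees with the observation that this particular combination never stops.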

Since the laws for sat allow STOP to satisfy every satisfiable specification,

reasoning based on these laws can never prove absence of deadlock. More

powerful laws will be given in Section 3.7. Meanwhile, one way to eliminate

the risk of stoppage is by careful proof, as in Section 2.5.4. Another method

is to show that a process deﬁned by the parallel combinator is equivalent

to a non-stopping process deﬁned without this combinator, as was done in

2.3.1 X1. However, such proofs involve long and tedious algebraic transform-

ations. Wherever possible, one should appeal to some general law, such as

L2 If P and Q never stop and if (αP ∩αQ) contains at most one event, then

(P || Q) never stops.

Example

X2 The process (P || Q) deﬁned in X1 will never stop, because

αP ∩ αQ = {c}

The proof rule for change of symbol is

L3 If P sat S(tr)
   then f(P) sat S(f^{-1*}(tr))

The use of f^{-1} in the consequent of this law may need extra explanation. Let tr be a trace of f(P). Then f^{-1*}(tr) is a trace of P. The antecedent of L3 states that every trace of P satisfies S. It follows that f^{-1*}(tr) satisfies S, which is exactly what is stated by the consequent of L3.

2.8 Mathematical theory of deterministic processes

In our description of processes, we have stated a large number of laws, and we

have occasionally used them in proofs. The laws have been justiﬁed (if at all)

by informal explanations of why we should expect and want them to be true.

For a reader with the instincts of an applied mathematician or engineer, that

may be enough. But the question also arises, are these laws in fact true? Are

they even consistent? Should there be more of them? Or are they complete

in the sense that they permit all true facts about processes to be proved from

them? Could one manage with fewer and simpler laws? These are questions

for which an answer must be sought in a deeper mathematical investigation.


2.8.1 The basic deﬁnitions

In constructing a mathematical model of a physical system, it is a good strategy

to deﬁne the basic concepts in terms of attributes that can be directly or in-

directly observed or measured. For a deterministic process P, we are familiar

with two such attributes

αP—the set of events in which the process is in principle capable of

engaging

traces(P)—the set of all sequences of events in which the process can

actually participate if required.

We have explained how these two sets must satisfy the three laws 1.8.1 L6,

L7, L8. Consider now an arbitrary pair of sets (A, S) which satisfy these three

laws. This pair uniquely identiﬁes a process P whose traces are S constructed

according to the following deﬁnitions. Let

P^0 = { x | ⟨x⟩ ∈ S }

and, for all x in P^0, let P(x) be the process whose traces are

{ t | ⟨x⟩⌢t ∈ S }

Then

αP = A and P = (x : P^0 → P(x))

Furthermore, the pair (A, S) can be recovered by the equations

A = αP

S = traces(x : P^0 → P(x))

Thus there is a one-one correspondence between each process P and the pairs

of sets (αP, traces(P)). In mathematics, this is a suﬃcient justiﬁcation for

identifying the two concepts, by using one of them as the deﬁnition of the

other.

D0 A deterministic process is a pair

(A, S)

where A is any set of symbols and S is any subset of A^* which satisfies the two conditions

C0 ⟨⟩ ∈ S

C1 ∀ s, t • s⌢t ∈ S ⇒ s ∈ S

The simplest example of a process which meets this deﬁnition is the one

that does nothing


D1 STOP_A = (A, { ⟨⟩ })

At the other extreme there is the process that will do anything at any time

D2 RUN_A = (A, A^*)

The various operators on processes can now be formally deﬁned by show-

ing how the alphabet and traces of the result are derived from the alphabet

and traces of the operands

D3 (x : B → (A, S(x))) = (A, { ⟨⟩ } ∪ { ⟨x⟩⌢s | x ∈ B ∧ s ∈ S(x) }) provided B ⊆ A

D4 (A, S) / s = (A, { t | (s⌢t) ∈ S }) provided s ∈ S

D5 µ X : A • F(X) = (A, ⋃_{n≥0} traces(F^n(STOP_A))) provided F is a guarded expression

D6 (A, S) || (B, T) = (A ∪ B, { s | s ∈ (A ∪ B)^* ∧ (s ↾ A) ∈ S ∧ (s ↾ B) ∈ T })

D7 f(A, S) = (f(A), { f^*(s) | s ∈ S }) provided f is one-one

Of course, it is necessary to prove that the right-hand sides of these deﬁnitions

are actually processes, i.e., that they satisfy the conditions C0 and C1 of D0.

Fortunately, that is quite easy.
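The model of D0 and the parallel operator of D6 can be tried out on finite approximations. The Python sketch below is an illustration under stated assumptions: trace sets are cut off at a maximum length so that they stay finite, and the names is_process, restrict, parallel and cyclic are hypothetical helpers introduced for this example.

```python
from itertools import product

def is_process(alphabet, traces):
    """Check conditions C0 and C1 of D0 on a finite set of traces over alphabet."""
    in_alphabet = all(x in alphabet for s in traces for x in s)
    c0 = () in traces
    c1 = all(s[:k] in traces for s in traces for k in range(len(s)))
    return in_alphabet and c0 and c1

def restrict(s, alphabet):
    """s restricted to alphabet: drop every symbol outside the alphabet."""
    return tuple(x for x in s if x in alphabet)

def parallel(p, q, max_len):
    """D6, limited to traces of length at most max_len."""
    (a, s), (b, t) = p, q
    ab = a | b
    result = set()
    for n in range(max_len + 1):
        for cand in product(sorted(ab), repeat=n):
            if restrict(cand, a) in s and restrict(cand, b) in t:
                result.add(cand)
    return ab, result

def cyclic(pattern, max_len):
    """Traces (up to max_len) of a process that repeats pattern for ever."""
    return {tuple(pattern[i % len(pattern)] for i in range(n))
            for n in range(max_len + 1)}

P = ({"a", "c"}, cyclic("ac", 6))   # P = (a -> c -> P)
Q = ({"b", "c"}, cyclic("cb", 6))   # Q = (c -> b -> Q)
PQ = parallel(P, Q, 6)
```

With these definitions is_process(*PQ) holds, and parallel(P, P, 6) returns P itself, which is the bounded counterpart of the law P || P = P for deterministic processes.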

In Chapter 3, it will become apparent that D0 is not a fully adequate deﬁn-

ition of the concept of a process, because it does not represent the possibil-

ity of nondeterminism. Consequently, a more general and more complicated

deﬁnition will be required. All laws for nondeterministic processes are true

for deterministic processes as well. But deterministic processes obey some

additional laws, for example

P || P = P

To avoid confusion, in this book we have avoided quoting such laws; so all

quoted laws may safely be applied to nondeterministic processes as well as

deterministic ones, (except 2.2.1 L3A, 2.2.3 L1, 2.3.1 L3A, 2.3.3 L1, L2, L3A,

L3B, which are false for processes containing CHAOS (3.8)).

2.8.2 Fixed point theory

The purpose of this section is to give an outline of a proof of the fundamental

theorem of recursion, that a recursively deﬁned process (2.8.1 D5) is indeed a

solution of the corresponding recursive equation, i.e.,

µ X • F(X) = F(µ X • F(X))

The treatment follows the ﬁxed-point theory of Scott.


First, we need to specify an ordering relationship ⊑ among processes

D1 (A, S) ⊑ (B, T) = (A = B ∧ S ⊆ T)

Two processes are comparable in this ordering if they have the same alphabet, and one of them can do everything done by the other, and maybe more. This ordering is a partial order in the sense that

L1 P ⊑ P

L2 P ⊑ Q ∧ Q ⊑ P ⇒ P = Q

L3 P ⊑ Q ∧ Q ⊑ R ⇒ P ⊑ R

A chain in a partial order is an inﬁnite sequence of elements

{ P_0, P_1, P_2, … }

such that

P_i ⊑ P_{i+1} for all i

We define the limit (least upper bound) of such a chain

⊔_{i≥0} P_i = (αP_0, ⋃_{i≥0} traces(P_i))

In future, we will apply the limit operator ⊔ only to sequences of processes that form a chain.

A partial order is said to be complete if it has a least element, and all chains

have a least upper bound. The set of all processes with a given alphabet A

forms a complete partial order (c.p.o.), since it satisﬁes the laws

L4 STOP_A ⊑ P provided αP = A

L5 P_i ⊑ ⊔_{i≥0} P_i

L6 (∀ i ≥ 0 • P_i ⊑ Q) ⇒ (⊔_{i≥0} P_i) ⊑ Q

Furthermore the deﬁnition of µ (2.8.1 D5) can be reformulated in terms of a

limit

L7 µ X : A • F(X) = ⊔_{i≥0} F^i(STOP_A)

A function F from one c.p.o. to another one (or the same one) is said to

be continuous if it distributes over the limits of all chains, i.e.,

F(⊔_{i≥0} P_i) = ⊔_{i≥0} F(P_i) if { P_i | i ≥ 0 } is a chain

(All continuous functions are monotonic in the sense that

P ⊑ Q ⇒ F(P) ⊑ F(Q) for all P and Q,

so that the right-hand side of the previous equation is also the limit of an ascending chain.) A function G of several arguments is defined as continuous if it is continuous in each of its arguments separately, for example

G((⊔_{i≥0} P_i), Q) = ⊔_{i≥0} G(P_i, Q) for all Q

and

G(Q, (⊔_{i≥0} P_i)) = ⊔_{i≥0} G(Q, P_i) for all Q

The composition of continuous functions is also continuous; and indeed any

expression constructed by application of any number of continuous functions

to any number and combination of variables is continuous in each of those

variables. For example, if F, G and H are continuous

G(F(X), H(X, Y))

is continuous in X, i.e.,

G(F(⊔_{i≥0} P_i), H((⊔_{i≥0} P_i), Y)) = ⊔_{i≥0} G(F(P_i), H(P_i, Y)) for all Y

All the operators (except /) deﬁned in D3 to D7 are continuous in the sense

deﬁned above

L8 (x : B → (⊔_{i≥0} P_i(x))) = ⊔_{i≥0} (x : B → P_i(x))

L9 µ X : A • F(X, (⊔_{i≥0} P_i)) = ⊔_{i≥0} µ X : A • F(X, P_i) provided F is continuous

L10 (⊔_{i≥0} P_i) || Q = Q || (⊔_{i≥0} P_i) = ⊔_{i≥0} (Q || P_i)

L11 f(⊔_{i≥0} P_i) = ⊔_{i≥0} f(P_i)

Consequently if F(X) is any expression constructed solely in terms of these

operators, it will be continuous in X. Now it is possible to prove the basic

ﬁxed-point theorem

F(µ X : A • F(X))

= F(⊔_{i≥0} F^i(STOP_A))    [def. µ]

= ⊔_{i≥0} F(F^i(STOP_A))    [continuity F]

= ⊔_{i≥1} F^i(STOP_A)    [def. F^{i+1}]

= ⊔_{i≥0} F^i(STOP_A)    [STOP_A ⊑ F(STOP_A)]

= µ X : A • F(X)    [def. µ]

This proof has relied only on the fact that F is continuous. The guardedness

of F is necessary only to establish uniqueness of the solution.
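The chain of approximations F^i(STOP_A) used in this proof can be computed explicitly for a simple guarded F. The Python sketch below is an illustration only: it assumes F(X) = (coin → choc → X), works with trace sets cut off at a maximum length so that the iteration terminates, and the names F and approximations are introduced for the example.

```python
def F(traces, max_len):
    """Traces of (coin -> choc -> X), given the traces of X, cut off at max_len."""
    out = {(), ("coin",)}
    out |= {("coin", "choc") + s for s in traces}
    return {s for s in out if len(s) <= max_len}

def approximations(max_len):
    """The chain STOP, F(STOP), F(F(STOP)), ... up to the point where it stabilises."""
    chain = [{()}]                      # traces of STOP
    while True:
        nxt = F(chain[-1], max_len)
        if nxt == chain[-1]:
            return chain
        chain.append(nxt)

chain = approximations(4)
limit = set().union(*chain)             # union of the traces along the chain
```

Each element of the chain is contained in the next, and applying F to the limit gives the limit back, which is the bounded analogue of µ X • F(X) = F(µ X • F(X)).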


2.8.3 Unique solutions

In this section we treat more formally the reasoning given in Section 1.1.2

to show that an equation deﬁning a process by guarded recursion has only

one solution. In doing so, we shall make explicit more general conditions for

uniqueness of such solutions. For simplicity, we deal only with single equa-

tions; the treatment easily extends to sets of simultaneous equations.

If P is a process and n is a natural number, we define (P ↾ n) as a process which behaves like P for its first n events, and then stops; more formally

(A, S) ↾ n = (A, { s | s ∈ S ∧ #s ≤ n })

It follows that

L1 P ↾ 0 = STOP

L2 P ↾ n ⊑ P ↾ (n + 1) ⊑ P

L3 P = ⊔_{n≥0} P ↾ n

L4 ⊔_{n≥0} P_n = ⊔_{n≥0} (P_n ↾ n)

Let F be a monotonic function from processes to processes. F is said to

be constructive if

F(X) ↾ (n + 1) = F(X ↾ n) ↾ (n + 1) for all X

This means that the behaviour of F(X) on its first n + 1 steps is determined by the behaviour of X on its first n steps only; so if s ≠ ⟨⟩

s ∈ traces(F(X)) ≡ s ∈ traces(F(X ↾ (#s − 1)))

Prefixing is the primary example of a constructive function, since

(c → P) ↾ (n + 1) = (c → (P ↾ n)) ↾ (n + 1)

General choice is also constructive

(x : B → P(x)) ↾ (n + 1) = (x : B → (P(x) ↾ n)) ↾ (n + 1)

The identity function I is not constructive, since

I(c → P) ↾ 1 = c → STOP

≠ STOP

= I((c → P) ↾ 0) ↾ 1
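The defining equation of a constructive function is easy to test on finite trace sets. The Python sketch below is an illustration only: processes are represented by their trace sets, and the names truncate, prefix, identity and constructive_at, together with the sample process, are assumptions made for this example.

```python
def truncate(traces, n):
    """Traces of (P restricted to its first n events)."""
    return {s for s in traces if len(s) <= n}

def prefix(c, traces):
    """Traces of (c -> P), given the traces of P."""
    return {()} | {(c,) + s for s in traces}

def identity(traces):
    """The identity function I on processes."""
    return set(traces)

def constructive_at(F, traces, n):
    """Check F(X) restricted to n+1 equals F(X restricted to n) restricted to n+1."""
    return truncate(F(traces), n + 1) == truncate(F(truncate(traces, n)), n + 1)

# Assumed sample process: c -> c -> c -> STOP.
X = {(), ("c",), ("c", "c"), ("c", "c", "c")}

prefix_ok = all(constructive_at(lambda t: prefix("c", t), X, n) for n in range(4))
identity_ok = constructive_at(identity, X, 0)
```

On this sample, prefixing passes the check for every n tried, whereas the identity function already fails at n = 0, matching the counterexample in the text.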

We can now formulate the fundamental theorem

L5 Let F be a constructive function. The equation

X = F(X)

has only one solution for X.


Proof : Let X be an arbitrary solution. First by induction we prove the lemma

that

X ↾ n = F^n(STOP) ↾ n

Base case.

X ↾ 0 = STOP = STOP ↾ 0 = F^0(STOP) ↾ 0

Induction step.

X ↾ (n + 1)

= F(X) ↾ (n + 1)    [since X = F(X)]

= F(X ↾ n) ↾ (n + 1)    [F is constructive]

= F(F^n(STOP) ↾ n) ↾ (n + 1)    [hypothesis]

= F(F^n(STOP)) ↾ (n + 1)    [F is constructive]

= F^{n+1}(STOP) ↾ (n + 1)    [def. F^n]

Now we go back to the main theorem

X

= ⊔_{n≥0} (X ↾ n)    [L3]

= ⊔_{n≥0} F^n(STOP) ↾ n    [just proved]

= ⊔_{n≥0} F^n(STOP)    [L4]

= µ X • F(X)    [2.8.2 L7]

Thus all solutions of X = F(X) are equal to µ X • F(X); or in other words,

µ X • F(X) is the only solution of the equation.

The usefulness of this theorem is much increased if we can recognise

which functions are constructive and which are not. Let us deﬁne a nondestruct-

ive function G as one which satisﬁes

G(P) ↾ n = G(P ↾ n) ↾ n for all n and P.

Alphabet transformation is nondestructive in this sense, since

f(P) ↾ n = f(P ↾ n) ↾ n

So is the identity function. Any monotonic function which is constructive is also nondestructive. But the after operator is destructive, since

((c → c → STOP) / ⟨c⟩) ↾ 1 = c → STOP

≠ STOP

= (c → STOP) / ⟨c⟩

= (((c → c → STOP) ↾ 1) / ⟨c⟩) ↾ 1


Any composition of nondestructive functions (G and H) is also nondestruct-

ive, because

G(H(P)) ↾ n = G(H(P) ↾ n) ↾ n = G(H(P ↾ n) ↾ n) ↾ n = G(H(P ↾ n)) ↾ n

Even more important, any composition of a constructive function with nondestruct-

ive functions is also constructive. So if all of F, G, …, H are nondestructive

and just one of them is constructive, then

F(G(. . . (H(X)) . . .))

is a constructive function of X.

The above reasoning extends readily to functions of more than one argu-

ment. For example parallel composition is nondestructive (in both its argu-

ments) because

(P || Q) ↾ n = ((P ↾ n) || (Q ↾ n)) ↾ n

Let E be an expression containing the process variable X. Then E is said to be

guarded in X if every occurrence of X in E has a constructive function applied to

it, and no destructive function. Thus the following expression is constructive

in X

(c → X | d → f(X || P) | e → (f(X) || Q)) || ((d → X) || R)

The important consequence of this is that constructiveness can be deﬁned

syntactically by the following conditions for guardedness

D1 Expressions constructed solely by means of the operators concurrency,

symbol change, and general choice are said to be guard-preserving.

D2 An expression which does not contain X is said to be guarded in X.

D3 A general choice

(x : B →P(X, x))

is guarded in X if P(X, x) is guard-preserving for all x.

D4 A symbol change f (P(X)) is guarded in X if P(X) is guarded in X.

D5 A concurrent system P(X) || Q(X) is guarded in X if both P(X) and Q(X)

are guarded in X.

Finally, we reach the conclusion

L6 If E is guarded in X, then the equation

X = E

has a unique solution.

3 Nondeterminism

3.1 Introduction

The choice operator (x : B → P(x)) is used to deﬁne a process which exhibits

a range of possible behaviours; and the concurrency operator || permits some

other process to make a selection between the alternatives oﬀered in the set B.

For example, the change-giving machine CH5C (1.1.3 X2) oﬀers its customer

the choice of taking his change as three small coins and one large, or two large

coins and one small.

Such processes are called deterministic, because whenever there is more

than one event possible, the choice between them is determined externally by

the environment of the process. It is determined either in the sense that the

environment can actually make the choice, or in the weaker sense that the

environment can observe which choice has been made at the very moment of

the choice.

Sometimes a process has a range of possible behaviours, but the environment of the process does not have any ability to influence or even observe the

selection between the alternatives. For example, a diﬀerent change-giving ma-

chine may give change in either of the combinations described above; but the

choice between them cannot be controlled or even predicted by its user. The

choice is made, as it were internally, by the machine itself, in an arbitrary or

nondeterministic fashion. The environment cannot control the choice or even

observe it; it cannot ﬁnd out exactly when the choice was made, although it

may later infer which choice was made from the subsequent behaviour of the

process.

There is nothing mysterious about this kind of nondeterminism: it arises

from a deliberate decision to ignore the factors which influence the selection.

For example, the combination of change given by the machine may depend on

the way in which the machine has been loaded with large and small coins; but

we have excluded these events from the alphabet. Thus nondeterminism is

useful for maintaining a high level of abstraction in descriptions of the beha-

viour of physical systems and machines.


3.2 Nondeterministic or

If P and Q are processes, then we introduce the notation

P ⊓ Q (P or Q)

to denote a process which behaves either like P or like Q, where the selection between them is made arbitrarily, without the knowledge or control of the external environment. The alphabets of the operands are assumed to be the same

α(P ⊓ Q) = αP = αQ

Examples

X1 A change-giving machine which always gives the right change in one of

two combinations

CH5D = (in5p → ((out1p → out1p → out1p → out2p → CH5D)
                ⊓ (out2p → out1p → out2p → CH5D)))

X2 CH5D may give a different combination of change on each occasion of use.

Here is a machine that always gives the same combination, but we do not know

initially which it will be (see 1.1.2 X3, X4)

CH5E = CH5A ⊓ CH5B

Of course, after this machine gives its ﬁrst coin in change, its subsequent be-

haviour is entirely predictable. For this reason

CH5D ≠ CH5E

Nondeterminism has been introduced here in its purest and simplest form by the binary operator ⊓. Of course, ⊓ is not intended as a useful operator for implementing a process. It would be very foolish to build both P and Q, put them in a black bag, make an arbitrary choice between them, and then throw the other one away!

The main advantage of nondeterminism is in specifying a process. A process specified as (P ⊓ Q) can be implemented either by building P or by building Q. The choice can be made in advance by the implementor on grounds not relevant (and deliberately ignored) in the specification, such as low cost, fast response times, or early delivery. In fact, the ⊓ operator will not often be used directly even in specifications; nondeterminism arises more naturally from use of the other operators defined later in this chapter.


3.2.1 Laws

The algebraic laws governing nondeterministic choice are exceptionally simple

and obvious. A choice between P and P is vacuous

L1 P ⊓ P = P (idempotence)

It does not matter in which order the choice is presented

L2 P ⊓ Q = Q ⊓ P (symmetry)

A choice between three alternatives can be split into two successive binary choices. It does not matter in which way this is done

L3 P ⊓ (Q ⊓ R) = (P ⊓ Q) ⊓ R (associativity)

The occasion on which a nondeterministic choice is made is not significant. A process which first does x and then makes a choice is indistinguishable from one which first makes the choice and then does x

L4 x → (P ⊓ Q) = (x → P) ⊓ (x → Q) (distribution)

The law L4 states that the prefixing operator distributes through nondeterminism. Such operators are said to be distributive. A dyadic operator is said to be distributive if it distributes through ⊓ in both its argument positions independently. Most of the operators defined so far for processes are distributive in this sense

L5 (x : B → (P(x) ⊓ Q(x))) = (x : B → P(x)) ⊓ (x : B → Q(x))

L6 P || (Q ⊓ R) = (P || Q) ⊓ (P || R)

L7 (P ⊓ Q) || R = (P || R) ⊓ (Q || R)

L8 f(P ⊓ Q) = f(P) ⊓ f(Q)

However, the recursion operator is not distributive, except in the trivial case where the operands of ⊓ are identical. This point is simply illustrated by the difference between the two processes

P = µ X • ((a → X) ⊓ (b → X))

Q = (µ X • (a → X)) ⊓ (µ X • (b → X))

P can make an independent choice between a and b on each iteration, so its

traces include

⟨a, b, b, a, b⟩

Q must make a choice between always doing a and always doing b, so its traces do not include the one displayed above. However, P may choose always to do a or always to do b, so

traces(Q) ⊆ traces(P)
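The difference between the two processes can be made concrete by listing their traces up to a fixed length. The Python sketch below is an illustration only; the function names traces_P and traces_Q and the length bound are assumptions made for this example.

```python
from itertools import product

def traces_P(max_len):
    """P chooses afresh between a and b on each iteration: every sequence over {a, b}."""
    return {s for n in range(max_len + 1) for s in product("ab", repeat=n)}

def traces_Q(max_len):
    """Q chooses once between always doing a and always doing b."""
    return {(x,) * n for x in "ab" for n in range(max_len + 1)}

mixed = ("a", "b", "b", "a", "b")
in_P = mixed in traces_P(5)
in_Q = mixed in traces_Q(5)
contained = traces_Q(5) <= traces_P(5)
```

The mixed trace belongs to P but not to Q, while every trace of Q (up to the bound) is also a trace of P.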

In some theories, nondeterminism is obliged to be fair, in the sense that an

event that inﬁnitely often may happen eventually must happen (though there

is no limit to how long it may be delayed). In our theory, there is no such

concept of fairness. Because we observe only ﬁnite traces of the behaviour of

a process, if an event can be postponed indeﬁnitely, we can never tell whether

it is going to happen or not. If we want to insist that the event shall happen

eventually, we must state that there is a number n such that every trace longer

than n contains that event. Then the process must be designed explicitly to

satisfy this constraint. For example, in the process P_0 defined below, the event a must always occur within n steps of its previous occurrence

P_i = (a → P_0) ⊓ (b → P_{i+1})

P_n = (a → P_0)

Later, we will see that both Q and P_0 are valid implementations of P.

If fairness of nondeterminism is required, this should be speciﬁed and

implemented at a separate stage, for example, by ascribing nonzero probabil-

ities to the alternatives of a nondeterministic choice. It seems highly desirable

to separate complex probabilistic reasoning from concerns about the logical

correctness of the behaviour of a process.

In view of laws L1 to L3 it is useful to introduce a multiple-choice operator. Let S be a finite nonempty set

S = {i, j, … k}

Then we define

⊓_{x:S} P(x) = P(i) ⊓ P(j) ⊓ … ⊓ P(k)

⊓_{x:S} is meaningless when S is either empty or infinite.

3.2.2 Implementations

As mentioned above, one of the main reasons for the introduction of non-

determinism is to abstract from details of implementation. This means that

there may be many diﬀerent implementations of a nondeterministic process P,

each with an observably diﬀerent pattern of behaviour. The diﬀerences arise

from different permitted resolutions of the nondeterminism inherent in P. The

choice involved may be made by the implementor before the process starts, or

it may be postponed until the process is actually running. For example, one

way of implementing (P ⊓ Q) is to select the first operand

or1(P, Q) = P


Another implementation is obtained by selecting the second operand, perhaps

on the grounds of greater eﬃciency on a particular machine

or2(P, Q) = Q

Yet a third implementation postpones the decision until the process is running;

it then allows the environment to make the choice, by selecting an event that

is possible for one process but not the other. If the event is possible for both

processes, the decision is again postponed

or3(P, Q) = λx • (if P(x) = "BLEEP then
                      Q(x)
                  else if Q(x) = "BLEEP then
                      P(x)
                  else
                      or3(P(x), Q(x)))

Here we have given three diﬀerent possible implementations of the same oper-

ator. In fact there are many more: for example, an implementation may behave

like or3 for the ﬁrst ﬁve steps; and if all these steps are possible both for P

and for Q, it then arbitrarily chooses P.
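
The three implementations can be sketched in Python, keeping the convention that a process is a function from an event to either the atom BLEEP or the subsequent process. This is only an illustration: the Python encoding, and the helper names STOP and prefix, are assumptions made for the sketch and are not part of the notation used in the text.

```python
BLEEP = "BLEEP"  # stand-in for the atom "BLEEP: the event is refused


def STOP(x):
    """The process that refuses every event."""
    return BLEEP


def prefix(c, P):
    """The process (c -> P): accepts only c, then behaves like P."""
    return lambda x: P if x == c else BLEEP


def or1(P, Q):
    """Resolve the choice in advance in favour of the first operand."""
    return P


def or2(P, Q):
    """Resolve the choice in advance in favour of the second operand."""
    return Q


def or3(P, Q):
    """Postpone the choice until the environment's event settles it."""
    def step(x):
        p, q = P(x), Q(x)
        if p == BLEEP:
            return q          # only Q (or neither) can engage in x
        if q == BLEEP:
            return p          # only P can engage in x
        return or3(p, q)      # both can: keep both alternatives running
    return step


P = prefix("a", STOP)
Q = prefix("b", STOP)
```

With these definitions or1(P, Q) refuses b and or2(P, Q) refuses a, whereas or3(P, Q) accepts either event. The last branch of or3 shows where the cost arises: as long as both operands accept every event offered, both have to be kept running.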

Since the designer of the process (P ⊓ Q) has no control over whether P

or Q will be selected, he must ensure that his system will work correctly for

both choices. If there is any risk that either P or Q will deadlock with its

environment, then (P ⊓ Q) also runs that same risk. The implementation or3

is the one which minimises the risk of deadlock by delaying the choice until

the environment makes it, and then selecting whichever of P or Q does not

deadlock. For this reason, the deﬁnition of or3 is sometimes known as angelic

nondeterminism. But the price to be paid is high in terms of eﬃciency: if the

choice between P and Q is not made on the ﬁrst step, both P and Q have

to be executed concurrently until the environment chooses an event which is

possible for one but not the other. In the simple but extreme case of or3(P, P),

this will never happen, and the ineﬃciency will also be extreme.

In contrast to or3, the implementations or1 and or2 are asymmetric:

or1(P, Q) ≠ or1(Q, P)

This seems to violate law 3.2.1 L2; but this is not so. The laws apply to pro-

cesses, not to any particular implementation of them. In fact they assert the

identity of the set of all implementations of their left and right hand sides. For

example, since or3 is symmetric,

{ or1(P, Q), or2(P, Q), or3(P, Q) }
= { P, Q, or3(P, Q) }
= { or2(Q, P), or1(Q, P), or3(Q, P) }

One of the advantages of introducing nondeterminism is to avoid the loss of

symmetry that would result from selecting one of the two simple implementa-

tions, and yet to avoid the ineﬃciency of the symmetric implementation or3.

3.2.3 Traces

If s is a trace of P, then s is also a possible trace of (P ⊓ Q), i.e., in the case

that P is selected. Similarly, if s is a trace of Q, it is also a trace of (P ⊓ Q).

Conversely, each trace of (P ⊓ Q) must be a trace of one or both alternatives.

The behaviour of (P ⊓ Q) after s is defined by whichever of P or Q could

engage in s; if both could, the choice remains nondeterministic.

L1 traces(P ⊓ Q) = traces(P) ∪ traces(Q)

L2 (P ⊓ Q) / s = Q / s               if s ∈ (traces(Q) − traces(P))
               = P / s               if s ∈ (traces(P) − traces(Q))
               = (P / s) ⊓ (Q / s)   if s ∈ (traces(P) ∩ traces(Q))

3.3 General choice

The environment of (P ⊓ Q) has no control or even knowledge of the choice

that is made between P and Q, or even the time at which the choice is made. So

(P ⊓ Q) is not a helpful way of combining processes, because the environment

must be prepared to deal with either P or Q; and either one of them separately

would be easier to deal with.

We therefore introduce another operation (P □ Q), for which the environment

can control which of P and Q will be selected, provided that this control

is exercised on the very ﬁrst action. If this action is not a possible ﬁrst action

of P, then Q will be selected; but if Q cannot engage initially in the action,

P will be selected. If, however, the ﬁrst action is possible for both P and Q,

then the choice between them is nondeterministic. (Of course, if the event is

impossible for both P and Q, then it just cannot happen.) As usual

α(P □ Q) = αP = αQ

In the case that no initial event of P is also possible for Q, the general

choice operator is the same as the | operator, which has been used hitherto to

represent choice between different events

(c → P □ d → Q) = (c → P | d → Q) if c ≠ d.

However if the initial events are the same, (P □ Q) degenerates to nondeterministic

choice

(c → P □ c → Q) = (c → P ⊓ c → Q)

Here we have adopted the convention that → binds more tightly than □.

3.3.1 Laws

The algebraic laws for □ are similar to those for ⊓, and for the same reasons.

L1–L3 □ is idempotent, symmetric, and associative.

L4 P □ STOP = P

The following law formalises the informal definition of the operation

L5 (x : A → P(x)) □ (y : B → Q(y)) =
     (z : (A ∪ B) →
        (if z ∈ (A − B) then
           P(z)
         else if z ∈ (B − A) then
           Q(z)
         else if z ∈ (A ∩ B) then
           (P(z) ⊓ Q(z))))

Like all other operators introduced so far (apart from recursion), □ distributes

through ⊓

L6 P □ (Q ⊓ R) = (P □ Q) ⊓ (P □ R)

What may seem more surprising is that ⊓ distributes through □

L7 P ⊓ (Q □ R) = (P ⊓ Q) □ (P ⊓ R)

This law states that choices made nondeterministically and choices made by

the environment are independent, in the sense that the selection made by one

of them does not inﬂuence the choice made by the other.

Let John be the agent which makes nondeterministic choices and Mary be

the environment. On the left-hand side of the law, John chooses (⊓) between

P and letting Mary choose (□) between Q and R. On the right-hand side, Mary

chooses either (1) to oﬀer John the choice between P and Q or (2) to oﬀer John

the choice between P and R.

On both sides of the equation, if John chooses P, then P will be the overall

outcome. But if John does not select P, the choice between Q and R is made

by Mary. Thus the results of the choice strategies described on the left- and

right-hand sides of the law are always equal. Of course, the same reasoning

applies to L6.

The explanation given above is rather subtle; perhaps it would be better to

explain the law as the unexpected but unavoidable consequence of other more

obvious deﬁnitions and laws given later in this chapter.

3.3.2 Implementation

The implementation of the choice operator follows closely the law L5. Assum-

ing the symmetry of or, it is also symmetrical

choice(P, Q) = λx • if P(x) = "BLEEP then
                       Q(x)
                     else if Q(x) = "BLEEP then
                       P(x)
                     else
                       or(P(x), Q(x))

3.3.3 Traces

Every trace of (P □ Q) must be a trace of P or a trace of Q, and conversely

L1 traces(P □ Q) = traces(P) ∪ traces(Q)

The next law is slightly different from the corresponding law for ⊓

L2 (P □ Q) / s = P / s               if s ∈ traces(P) − traces(Q)
               = Q / s               if s ∈ traces(Q) − traces(P)
               = (P / s) ⊓ (Q / s)   if s ∈ traces(P) ∩ traces(Q)

3.4 Refusals

The distinction between (P ⊓ Q) and (P □ Q) is quite subtle. They cannot

be distinguished by their traces, because each trace of one of them is also a

possible trace of the other. However, it is possible to put them in an environment

in which (P ⊓ Q) can deadlock at its first step, but (P □ Q) cannot. For

example let x ≠ y and

P = (x → P), Q = (y → Q), αP = αQ = {x, y}

Then

(P □ Q) || P = (x → P) = P

but

(P ⊓ Q) || P = (P || P) ⊓ (Q || P) = P ⊓ STOP

This shows that in environment P, process (P ⊓ Q) may reach deadlock

but process (P □ Q) cannot. Of course, even with (P ⊓ Q) we cannot be sure

that deadlock will occur; and if it does not occur, we will never know that it

might have. But the mere possibility of an occurrence of deadlock is enough

to distinguish (P □ Q) from (P ⊓ Q).

In general, let X be a set of events which are oﬀered initially by the envir-

onment of a process P, which in this context we take to have the same alphabet

as P. If it is possible for P to deadlock on its ﬁrst step when placed in this en-

vironment, we say that X is a refusal of P. The set of all such refusals of P is

denoted

refusals(P)

Note that the refusals of a process constitute a family of sets of symbols. This

is an unfortunate complexity, but it does seem to be unavoidable in a proper

treatment of nondeterminism. Instead of refusals, it might seem more natural

to use the sets of symbols which a process may be ready to accept; however

the refusals are slightly simpler because they obey laws L9 and L10 of Sec-

tion 3.4.1 (below), whereas the corresponding laws for ready sets would be

more complicated.

The introduction of the concept of a refusal permits a clear formal dis-

tinction to be made between deterministic and nondeterministic processes. A

process is said to be deterministic if it can never refuse any event in which it

can engage. In other words, a set is a refusal of a deterministic process only if

that set contains no event in which that process can initially engage; or more

formally

P is deterministic ⇒ (X ∈ refusals(P) ≡ (X ∩ P^0 = {}))

where P^0 = { x | ⟨x⟩ ∈ traces(P) }.

This condition applies not only on the initial step of P but also after any

possible sequence of actions of P. Thus we can deﬁne

P is deterministic ≡

∀s : traces(P) • (X ∈ refusals(P / s) ≡ (X ∩ (P / s)^0 = {}))

A nondeterministic process is one that does not enjoy this property, i.e., there

is at some time some event in which it can engage; but also (as a result of some

internal nondeterministic choice) it may refuse to engage in that event, even

though the environment is ready for it.
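
As an illustration only, the condition for the first step can be checked mechanically when the alphabet is finite. The sketch below assumes that the initial events and the refusals of a process are supplied directly as Python sets; the function names are hypothetical, and the check covers only the initial step, not the behaviour after each trace.

```python
from itertools import combinations


def subsets(alphabet):
    """All subsets of a finite alphabet, as frozensets."""
    items = sorted(alphabet)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def deterministic_at_first_step(alphabet, initials, refusals):
    """X is a refusal exactly when X contains no initial event."""
    return all((X in refusals) == (not (X & initials))
               for X in subsets(alphabet))


alphabet = frozenset({"x", "y"})
initials = frozenset({"x", "y"})   # both x and y are possible first events

# The environment chooses between x and y: only {} can be refused.
general = {frozenset()}
# The choice is made internally: {x} or {y} may be refused as well.
internal = {frozenset(), frozenset({"x"}), frozenset({"y"})}
```

The first family of refusals passes the check. The second fails it, because {x} is refused even though x is a possible first event.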

3.4.1 Laws

The following laws deﬁne the refusals of various simple processes. The process

STOP does nothing and refuses everything

L1 refusals(STOP_A) = all subsets of A (including A itself)

A process c → P refuses every set that does not contain the event c

L2 refusals(c → P) = { X | X ⊆ (αP − {c}) }

These two laws have a common generalisation

L3 refusals(x : B → P(x)) = { X | X ⊆ (αP − B) }

If P can refuse X, so will P ⊓ Q if P is selected. Similarly every refusal of Q is

also a possible refusal of (P ⊓ Q). These are its only refusals, so

L4 refusals(P ⊓ Q) = refusals(P) ∪ refusals(Q)

A converse argument applies to (P □ Q). If X is not a refusal of P, then P

cannot refuse X, and neither can (P □ Q). Similarly, if X is not a refusal of Q,

then it is not a refusal of (P □ Q). However, if both P and Q can refuse X, so

can (P □ Q)

L5 refusals(P □ Q) = refusals(P) ∩ refusals(Q)

Comparison of L5 with L4 shows the distinction between □ and ⊓.
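
Laws L4 and L5 can be checked on small examples. The sketch below is an assumption-laden model rather than anything defined in the text: a finite, non-divergent process is represented as a tuple of stable states, each state being a Python dict from the events it offers to the subsequent process. The names prefix, internal, general and refusals are hypothetical.

```python
from itertools import combinations

STOP = ({},)  # a single stable state that offers no events


def prefix(c, P):
    """(c -> P): one stable state offering only c."""
    return ({c: P},)


def internal(P, Q):
    """Nondeterministic choice: any stable state of P or of Q may be chosen."""
    return P + Q


def general(P, Q):
    """General choice: each stable state offers the events of both operands."""
    states = []
    for m in P:
        for n in Q:
            merged = dict(m)
            for e, nxt in n.items():
                # an event offered by both operands leads to an internal choice
                merged[e] = internal(merged[e], nxt) if e in merged else nxt
            states.append(merged)
    return tuple(states)


def refusals(P, alphabet):
    """Sets X that some stable state of P can refuse entirely."""
    candidates = [frozenset(c) for r in range(len(alphabet) + 1)
                  for c in combinations(sorted(alphabet), r)]
    return {X for X in candidates if any(X.isdisjoint(m) for m in P)}


A = {"x", "y"}
P = prefix("x", STOP)
Q = prefix("y", STOP)
```

For these P and Q, the refusals of internal(P, Q) are {}, {x} and {y}, the union required by L4, while general(P, Q) can refuse only {}, the intersection required by L5.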

If P can refuse X and Q can refuse Y, then their combination (P || Q) can

refuse all events refused by P as well as all events refused by Q, i.e., it can

refuse the union of the two sets X and Y

L6 refusals(P || Q) = { X ∪ Y | X ∈ refusals(P) ∧ Y ∈ refusals(Q) }

For symbol change, the relevant law is clear

L7 refusals(f(P)) = { f(X) | X ∈ refusals(P) }

There are a number of general laws about refusals. A process can refuse

only events in its own alphabet. A process deadlocks when the environment

oﬀers no events; and if a process refuses a nonempty set, it can also refuse

any subset of that set. Finally, any event x which cannot occur initially may be

added to any set X already refused.

L8 X ∈ refusals(P) ⇒X ⊆ αP

L9 {} ∈ refusals(P)

L10 (X ∪ Y) ∈ refusals(P) ⇒ X ∈ refusals(P)

L11 X ∈ refusals(P) ⇒ (X ∪ {x}) ∈ refusals(P) ∨ ⟨x⟩ ∈ traces(P)

3.5 Concealment

In general, the alphabet of a process contains just those events which are con-

sidered to be relevant, and whose occurrence requires simultaneous particip-

ation of an environment. In describing the internal behaviour of a mechan-

ism, we often need to consider events representing internal transitions of that

mechanism. Such events may denote the interactions and communications

between concurrently acting components from which the mechanism has been

constructed, e.g., CHAIN2 (2.6 X4) and 2.6.2 X3. After construction of the mech-

anism, we conceal the structure of its components; and we also wish to conceal

all occurrences of actions internal to the mechanism. In fact, we want these ac-

tions to occur automatically and instantaneously as soon as they can, without

being observed or controlled by the environment of the process. If C is a ﬁnite

set of events to be concealed in this way, then

P \ C

is a process which behaves like P, except that each occurrence of any event in

C is concealed. Clearly it is our intention that

α(P \ C) = (αP) −C

Examples

X1 A noisy vending machine (2.3 X1) can be placed in a soundproof box

NOISYVM \ {clink, clunk}

Its unexercised capability of dispensing toﬀee can also be removed from its

alphabet, without aﬀecting its actual behaviour. The resulting process is equal

to the simple vending machine

VMS = NOISYVM \ {clink, clunk, toffee}

When two processes have been combined to run concurrently, their mutual

interactions are usually regarded as internal workings of the resulting systems;

they are intended to occur autonomously and as quickly as possible, without

the knowledge or intervention of the system’s outer environment. Thus it is

the symbols in the intersection of the alphabets of the two components that

need to be concealed.

X2 Let

αP = {a, c}

αQ = {b, c}

P = (a →c →P)

Q = (c →b →Q)

as in (2.3.1 X1).

The action c in the alphabet of both P and Q is now regarded as an internal

action, to be concealed

(P || Q) \ {c} = (a → c → µ X • (a → b → c → X
                                | b → a → c → X)) \ {c}
               = a → µ X • (a → b → X
                           | b → a → X)

3.5.1 Laws

The ﬁrst laws state that concealing no symbols has no eﬀect, and that it makes

no diﬀerence in what order the symbols of a set are concealed. The remaining

laws of this group show how concealment distributes through other operators.

Concealment of nothing leaves everything revealed

L1 P \ {} = P

To conceal one set of symbols and then some more is the same as concealing

them all simultaneously.

L2 (P \ B) \ C = P \ (B ∪C)

Concealment distributes in the familiar way through nondeterministic choice

L3 (P ⊓ Q) \ C = (P \ C) ⊓ (Q \ C)

Concealment does not aﬀect the behaviour of a stopped process, only its al-

phabet

L4 STOP_A \ C = STOP_{A−C}

The purpose of concealment is to allow any of the concealed events to occur

automatically and instantaneously, but make such occurrences totally invis-

ible. Unconcealed events remain unchanged

L5 (x →P) \ C = x →(P \ C) if x ∉ C

= P \ C if x ∈ C

If C contains only events in which P and Q participate independently, conceal-

ment of C distributes through their concurrent composition

L6 If αP ∩ αQ ∩ C = {}, then

(P || Q) \ C = (P \ C) || (Q \ C)

This is not a commonly useful law, because what we usually wish to conceal

are the interactions between concurrent processes, i.e., the events of αP ∩αQ,

in which they participate jointly.

Concealment distributes in the obvious way through symbol change by a

one-one function

L7 f (P \ C) = f (P) \ f (C)

If none of the possible initial events of a choice is concealed, then the

initial choice remains the same as it was before concealment

L8 If B ∩ C = {}, then

(x : B →P(x)) \ C = (x : B →(P(x) \ C))

Like the choice operator □, the concealment of events can introduce nondeterminism.

When several different concealed events can happen, it is not

determined which of them will occur; but whichever does occur is concealed

L9 If B ⊆ C, and B is finite and not empty, then

(x : B → P(x)) \ C = ⊓_{x∈B} (P(x) \ C)

In the intermediate case, when some of the initial events are concealed and

some are not, the situation is rather more complicated. Consider the process

(c → P | d → Q) \ C

where

c ∈ C, d ∉ C

The concealed event c may happen immediately. In this case the total beha-

viour will be deﬁned by (P \ C), and the possibility of occurrence of the event

d will be withdrawn. But we cannot reliably assume that d will not happen.

If the environment is ready for it, d may very well happen before the hidden

event, after which the hidden event c can no longer occur. But even if d occurs,

it might have been performed by (P \ C) after the hidden occurrence of c. In

this case, the total behaviour is as deﬁned by

(P \ C) □ (d → (Q \ C))

The choice between this and (P \ C) is nondeterministic. This is a rather

convoluted justiﬁcation for the rather complex law

(c → P | d → Q) \ C =
    (P \ C) ⊓ ((P \ C) □ (d → (Q \ C)))

Similar reasoning justiﬁes the more general law

L10 If C ∩ B is finite and non-empty, then

(x : B → P(x)) \ C =
    Q ⊓ (Q □ (x : (B − C) → P(x)))

where

Q = ⊓_{x∈B∩C} P(x) \ C

A pictorial illustration of these laws is given in Section 3.5.4.

Note that \ C does not distribute backwards through □. A counterexample

is

(c → STOP □ d → STOP) \ {c}

= STOP ⊓ (STOP □ (d → STOP))   [L10]
= STOP ⊓ (d → STOP)   [3.3.1 L4]
≠ d → STOP
= STOP □ (d → STOP)
= ((c → STOP) \ {c}) □ ((d → STOP) \ {c})

Concealment reduces the alphabet of a process. We can also deﬁne an

operation which extends the alphabet of a process P by inclusion of symbols

of a set B

α(P^{+B}) = αP ∪ B

P^{+B} = (P || STOP_B) provided B ∩ αP = {}

None of the new events of B will ever actually occur, so the behaviour of P^{+B}

is effectively the same as that of P

L11 traces(P^{+B}) = traces(P)

Consequently, concealment of B reverses the extension of the alphabet by B

L12 (P^{+B}) \ B = P

It is appropriate here to raise a problem that will be solved later, in Sec-

tion 3.8. In simple cases, concealment distributes through recursion

(µ X : A • (c → X)) \ {c}
= µ X : (A − {c}) • ((c → X^{+{c}}) \ {c})
= µ X : (A − {c}) • X   [by L12, L5]

Thus the attempt to conceal an inﬁnite sequence of consecutive events leads

to the same unfortunate result as an inﬁnite loop or unguarded recursion. The

general name for this phenomenon is divergence.

The same problem arises even if the divergent process is inﬁnitely often

capable of some unconcealed event, for example

(µ X • (c → X □ d → P)) \ {c}
= µ X • ((c → X □ d → P) \ {c})
= µ X • (X \ {c}) ⊓ ((X \ {c}) □ d → (P \ {c}))   [by L10]

Here again, the recursion is unguarded, and leads to divergence. Even though

it seems that the environment is inﬁnitely often oﬀered the choice of selecting

d, there is no way of preventing the process from inﬁnitely often choosing to

perform the hidden event instead. This possibility seems to aid in achieving the

highest eﬃciency of implementation. It also seems to be related to our decision

not to insist on fairness of nondeterminism, as discussed in Section 3.2.1. A

more rigorous discussion of divergence is given in Section 3.8.

There is a sense, an important one, in which hiding is in fact fair. Let

d ∈ αR, and consider the process

((c → a → P | d → STOP) \ {c}) || (a → R)
= ((a → P \ {c}) ⊓ (a → P \ {c} □ d → STOP)) || (a → R)   [L10]
= ((a → P \ {c}) || (a → R)) ⊓ ((a → P \ {c} □ d → STOP) || (a → R))
= a → ((P \ {c}) || R)

This shows that a process which oﬀers the choice between a hidden action c

and a nonhidden one d cannot insist that the nonhidden action shall occur.

If the environment (in this example, a → R) is not prepared for d, then the

hidden event must occur, so that the environment has the chance to interact

with the process (e.g. (a → P \ {c})) which results.

3.5.2 Implementation

For simplicity, we shall implement an operation which hides a single symbol

at a time

hide(P, c) = P \ {c}

A set of two or more symbols may be hidden by hiding one after the other,

since

P \ {c1, c2, ..., cn} = (...((P \ {c1}) \ {c2}) \ ...) \ {cn}

The simplest implementation is one that always makes the hidden event occur

invisibly, whenever it can and as soon as it can

hide(P, c) = if P(c) = "BLEEP then
               (λx • if P(x) = "BLEEP then
                       "BLEEP
                     else
                       hide(P(x), c))
             else
               hide(P(c), c)

Let us explore what happens when the hide function is applied to a pro-

cess which is capable of engaging in an inﬁnite sequence of hidden events, for

example

hide(µ X • (c → X □ d → P), c)

In this case, the test (P(c) = "BLEEP) will always yield FALSE, so the hide

function will always select its else clause, thereby immediately calling itself

recursively. There is no exit from this recursion, so no further communication

with the outside world will ever occur. This is the penalty for attempting to

implement a divergent process.

This implementation of concealment does not obey L2; indeed, the order

in which it hides the symbols is signiﬁcant, as shown by the example

P = (c → STOP | d → a → STOP)

Then

hide(hide(P, c), d)

= hide(hide(STOP, c), d)

= STOP

and

hide(hide(P, d), c)

= hide(hide((a →STOP), d), c)

= (a →STOP)

But as explained in Section 3.2.2, a particular implementation of a nondetermin-

istic operator does not have to obey the laws. It is suﬃcient that both the

results shown above are permitted implementations of the same process

P \ {c, d} = (STOP ⊓ (a → STOP))
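
The dependence on the order of hiding can be reproduced with a direct transcription of hide into Python. This is a sketch under the same convention as before (a process is a function from an event to BLEEP or to the subsequent process); the helpers STOP, menu and accepts are hypothetical names introduced only for the illustration.

```python
BLEEP = "BLEEP"  # stand-in for the atom "BLEEP


def STOP(x):
    """The process that refuses every event."""
    return BLEEP


def menu(branches):
    """Process offering the events in `branches` (a dict event -> process)."""
    return lambda x: branches.get(x, BLEEP)


def hide(P, c):
    """Let the hidden event c occur invisibly as soon as it can."""
    if P(c) == BLEEP:
        def visible(x):
            nxt = P(x)
            return BLEEP if nxt == BLEEP else hide(nxt, c)
        return visible
    return hide(P(c), c)   # c is possible: take it silently, at once


def accepts(P, trace):
    """True if P can engage in the events of `trace` in order."""
    for x in trace:
        P = P(x)
        if P == BLEEP:
            return False
    return True


# P = (c -> STOP | d -> a -> STOP)
P = menu({"c": STOP, "d": menu({"a": STOP})})

c_then_d = hide(hide(P, "c"), "d")   # behaves like STOP
d_then_c = hide(hide(P, "d"), "c")   # behaves like (a -> STOP)
```

Here c_then_d refuses a, whereas d_then_c accepts it. Because hide calls itself eagerly in its last line, applying this sketch to a process that can always engage in c would recurse until Python's recursion limit is reached, which mirrors the penalty described above for a divergent process.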

3.5.3 Traces

If t is a trace of P, the corresponding trace of P \ C is obtained from t simply

by removing all occurrences of any of the symbols in C. Conversely each trace

of P \ C must have been obtained from some such trace of P. We can therefore

state

L1 traces(P \ C) = { t ↾ (αP − C) | t ∈ traces(P) }

provided that ∀s : traces(P) • ¬ diverges(P / s, C)

The condition diverges(P, C) means that P diverges immediately on con-

cealment of C, i.e. that it can engage in an unbounded sequence of hidden

events. Thus we deﬁne

diverges(P, C) = ∀n • ∃s : traces(P) ∩ C* • #s > n
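
For a finite collection of traces, the effect of law L1 can be computed directly. The sketch below assumes traces are represented as Python tuples and that the process does not diverge; restrict and hidden_traces are hypothetical helper names, and the sample traces are the first few traces of the process of 3.5 X2 before concealment.

```python
def restrict(t, A):
    """t restricted to A: drop every symbol of t that is not in A."""
    return tuple(x for x in t if x in A)


def hidden_traces(traces, alphabet, C):
    """Traces after concealing C, for a finite set of traces."""
    visible = alphabet - C
    return {restrict(t, visible) for t in traces}


alphabet = {"a", "b", "c"}
traces = {(), ("a",), ("a", "c"), ("a", "c", "a"), ("a", "c", "b")}
result = hidden_traces(traces, alphabet, {"c"})
```

Several traces of the original process may collapse to one trace after concealment: here both ("a",) and ("a", "c") become ("a",).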

Corresponding to a single trace s of P \ C, there can be several traces t of

the possible behaviour in which P has engaged which cannot be distinguished

after the concealment, i.e., t ↾ (αP − C) = s. The next law states that after s it is

not determined which of the possible subsequent behaviours of P will deﬁne

the subsequent behaviour of (P \ C).

L2 (P \ C) / s = (⊓_{t∈T} P / t) \ C

where T = traces(P) ∩ { t | t ↾ (αP − C) = s }

provided that T is finite and s ∈ traces(P \ C)

These laws are restricted to the case when the process does not diverge.

The restrictions are not serious, because divergence is never the intended result

of the attempted definition of a process. For a fuller treatment of divergence

see Section 3.8.

3.5.4 Pictures

Nondeterministic choice can be represented in a picture by a node from which

emerge two or more unlabelled arrows; on reaching the node, a process passes

imperceptibly along one of the emergent arrows, the choice being nondetermin-

istic.

[Figure 3.1: a node with unlabelled arrows leading to P and Q]

Thus P ⊓ Q is pictured as in Figure 3.1. The algebraic laws governing

nondeterminism assert identities between such processes, e.g., associativity

of ⊓ is illustrated in Figure 3.2.

[Figure 3.2: associativity of ⊓, drawn with the processes P, Q and R]

Concealment of symbols may be regarded as an operation which simply

removes concealed symbols from all arrows which they label, so that these arcs

turn into unlabelled arrows. The resulting nondeterminism emerges naturally,

as shown in Figure 3.3.

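[Figure 3.3: a node with arrows labelled c and d leading to P and Q, concealed by \ {c, d}, equals a node with unlabelled arrows leading to P \ {c, d} and Q \ {c, d}]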

But what is the meaning of a node if some of its arcs are labelled and

some are not? The answer is given by the law 3.5.1 L10. Such a node can be

eliminated by redrawing as shown in Figure 3.4.

[Figure 3.4: elimination of a node with both labelled and unlabelled arcs (labels a, b, c; processes P, Q, R)]

It is fairly obvious that such eliminations are always possible for ﬁnite

trees. They are also possible for inﬁnite graphs, provided that the graph con-

tains no inﬁnite path of consecutive unlabelled arrows, as for example in Fig-

ure 3.5. Such a picture can arise only in the case of divergence, which we have

already decided to regard as an error.

[Figure 3.5: diagram with label a and process P]

As a result of applying the transformation L10, it is possible that the node

may acquire two emergent lines with the same label. Such nodes can be elim-

inated by the law given at the end of Section 3.3 (Figure 3.6).

[Figure 3.6: elimination of a node with two emergent lines bearing the same label (labels a, b; processes P, Q, R)]

The pictorial representation of processes and the laws which govern them

are included here as an aid to memory and understanding; they are not in-

tended to be used for practical transformation or manipulation of large-scale

processes.

3.6 Interleaving

The || operator was defined in Chapter 2 in such a way that actions in the alphabet

of both operands require simultaneous participation of them both, whereas

the remaining actions of the system occur in an arbitrary interleaving. Using

this operator, it is possible to combine interacting processes with diﬀering al-

phabets into systems exhibiting concurrent activity, but without introducing

nondeterminism.

However, it is sometimes useful to join processes with the same alphabet

to operate concurrently without directly interacting or synchronising with each

other. In this case, each action of the system is an action of exactly one of the

processes. If one of the processes cannot engage in the action, then it must

have been the other one; but if both processes could have engaged in the same

action, the choice between them is nondeterministic. This form of combination

is denoted

P ||| Q P interleave Q

and its alphabet is deﬁned by the usual stipulation

α(P ||| Q) = αP = αQ

Examples

X1 A vending machine that will accept up to two coins before dispensing up

to two chocolates (1.1.3 X6)

(VMS ||| VMS) = VMS2

X2 A footman made from four lackeys, each serving only one philosopher at

a time (see Section 2.5.4).

3.6.1 Laws

L1–L3 ||| is associative, symmetric, and distributes through ⊓.

L4 P ||| STOP = P

L5 P ||| RUN = RUN provided P does not diverge

L6 (x → P) ||| (y → Q) = (x → (P ||| (y → Q)) □ y → ((x → P) ||| Q))

L7 If P = (x : A → P(x))

and Q = (y : B → Q(y))

then P ||| Q = (x : A → (P(x) ||| Q) □ y : B → (P ||| Q(y)))

Note that ||| does not distribute through □. This is shown by the counterexample

(where b ≠ c)

((a → STOP) ||| (b → Q □ c → R)) / ⟨a⟩
= (b → Q □ c → R)
≠ ((b → Q) ⊓ (c → R))
= ((a → STOP □ b → Q) ||| (a → STOP □ c → R)) / ⟨a⟩

On the left-hand side of this chain, the occurrence of a can involve progress

only of the left operand of |||, so no nondeterminism is introduced. The left

operand stops, and the choice between b and c is left open to the environment.

On the right-hand side of the chain, the event a may be an event of either

operand of |||, the choice being nondeterministic. Thus the environment can

no longer choose whether the next event will be b or c.

L6 and L7 state that it is the environment which chooses between the

initial events oﬀered by the operands of |||. Nondeterminism arises only when

the chosen event is possible for both operands.

Example

X1 Let R = (a → b → R), then

(R ||| R)
= (a → ((b → R) ||| R) □ a → (R ||| (b → R)))   [L6]
= a → (((b → R) ||| R) ⊓ (R ||| (b → R)))
= a → ((b → R) ||| R)   [L2]

Also

(b → R) ||| R

= (a → ((b → R) ||| (b → R)) □ b → (R ||| R))   [L6]
= (a → (b → ((b → R) ||| R))
   □ b → (a → ((b → R) ||| R)))   [as shown above]
= µ X • (a → b → X
         □ b → a → X)   [since the recursion is guarded]

Thus (R ||| R) is identical to the example 3.5 X2. A similar proof shows that

(VMS ||| VMS) = VMS2.

3.6.2 Traces and refusals

A trace of (P ||| Q) is an arbitrary interleaving of a trace from P with a trace

from Q. For a deﬁnition of interleaving, see Section 1.9.3.

L1 traces(P ||| Q) =

{ s | ∃t : traces(P) • ∃u : traces(Q) • s interleaves (t, u) }
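
For finite sets of traces, law L1 can be evaluated by enumerating interleavings. The following sketch assumes traces are Python tuples; the function names interleavings and interleave_traces are hypothetical.

```python
def interleavings(t, u):
    """All sequences s such that s interleaves (t, u)."""
    if not t:
        return {tuple(u)}
    if not u:
        return {tuple(t)}
    # the first event of s comes either from t or from u
    return ({(t[0],) + s for s in interleavings(t[1:], u)} |
            {(u[0],) + s for s in interleavings(t, u[1:])})


def interleave_traces(traces_p, traces_q):
    """Traces of the interleaving of two processes, for finite trace sets."""
    return {s for t in traces_p for u in traces_q
            for s in interleavings(t, u)}


traces_p = {(), ("a",)}   # traces of (a -> STOP)
traces_q = {(), ("b",)}   # traces of (b -> STOP)
result = interleave_traces(traces_p, traces_q)
```

In this small case the result contains both ("a", "b") and ("b", "a"), since the two events may occur in either order.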

(P ||| Q) can engage in any initial action possible for either P or Q; and it can

therefore refuse only those sets which are refused by both P and Q

L2 refusals(P ||| Q) = refusals(P □ Q)

The behaviour of (P ||| Q) after engaging in the events of the trace s is

deﬁned by the rather elaborate formula

L3 (P ||| Q) / s = ⊓_{(t,u)∈T} (P / t) ||| (Q / u)

where

T = { (t, u) | t ∈ traces(P) ∧ u ∈ traces(Q) ∧ s interleaves (t, u) }

This law reﬂects the fact that there is no way of knowing in which way a trace

s of (P ||| Q) has been constructed as an interleaving of a trace from P and a

trace from Q; thus after s, the future behaviour of (P ||| Q) may reﬂect any one

of the possible interleavings. The choice between them is not known and not

determined.

3.7 Speciﬁcations

In Section 3.4 we saw the need to introduce refusal sets as one of the important

indirectly observable aspects of the behaviour. In specifying a process, we

therefore need to describe the desired properties of its refusal sets as well as

its traces. Let us use the variable ref to denote an arbitrary refusal set of a

process, in the same way as we have used tr to denote an arbitrary trace. As a

result, when P is a nondeterministic process the meaning of

P sat S(tr, ref )

is revised to

∀tr, ref • tr ∈ traces(P) ∧ ref ∈ refusals(P / tr) ⇒S(tr, ref )

Examples

X1 When a vending machine has ingested more coins than it has dispensed

chocolates, the customer speciﬁes that it must not refuse to dispense a chocol-

ate

FAIR = (tr ↓ choc < tr ↓ coin ⇒choc ∉ ref )

It is implicitly understood that every trace tr and every refusal ref of the

speciﬁed process at all times should satisfy this speciﬁcation.

X2 When a vending machine has given out as many chocolates as have been

paid for, the owner speciﬁes that it must not refuse a further coin

PROFIT1 = (tr ↓ choc = tr ↓ coin ⇒coin ∉ ref )

X3 A simple vending machine should satisfy the combined specification

NEWVMSPEC = FAIR ∧ PROFIT1 ∧ (tr ↓ choc ≤ tr ↓ coin)

This speciﬁcation is satisﬁed by VMS. It is also satisﬁed by a vending machine

like VMS2 (1.1.3 X6) which will accept several coins in a row, and then give out

several chocolates.

X4 If desired, one may place a limit on the balance of coins which may be

accepted in a row

ATMOST2 = (tr ↓ coin −tr ↓ choc ≤ 2)

X5 If desired, one can insist that the machine accept at least two coins in a

row whenever the customer oﬀers them

ATLEAST2 = (tr ↓ coin −tr ↓ choc < 2 ⇒coin ∉ ref )

X6 The process STOP refuses every event in its alphabet. The following

predicate specifies that a process with alphabet A will never stop

NONSTOP = (ref ≠ A)

If P sat NONSTOP, and if an environment allows all events in A, P must perform

one of them. Since (see X3 above)

NEWVMSPEC ⇒ ref ≠ {coin, choc}

it follows that any process which satisﬁes NEWVMSPEC will never stop.
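
Specifications of this kind are predicates on the pair (tr, ref), so they can be written as ordinary Boolean functions and tested against a bounded set of observations. The sketch below is illustrative only: it encodes the predicates of X1, X2, X3 and X6 in Python and enumerates the observations of the simple vending machine, assumed here to be VMS = µ X • coin → choc → X, up to a chosen trace length. All function names are hypothetical, and a bounded check of this kind is not a proof.

```python
from itertools import combinations

ALPHABET = ("coin", "choc")


def count(tr, e):
    """tr ↓ e: the number of occurrences of e in tr."""
    return tr.count(e)


def fair(tr, ref):
    """X1: if more coins than chocolates, choc is not refused."""
    return not (count(tr, "choc") < count(tr, "coin")) or "choc" not in ref


def profit(tr, ref):
    """X2: if chocolates equal coins, a further coin is not refused."""
    return not (count(tr, "choc") == count(tr, "coin")) or "coin" not in ref


def newvmspec(tr, ref):
    """X3: the combined specification."""
    return (fair(tr, ref) and profit(tr, ref)
            and count(tr, "choc") <= count(tr, "coin"))


def nonstop(tr, ref):
    """X6: the whole alphabet is never refused."""
    return set(ref) != set(ALPHABET)


def vms_observations(max_len):
    """Pairs (tr, ref) of the assumed VMS, for traces up to max_len."""
    for n in range(max_len + 1):
        tr = tuple(ALPHABET[i % 2] for i in range(n))   # coin, choc, coin, ...
        offered = ALPHABET[n % 2]                        # the only event not refused
        others = [e for e in ALPHABET if e != offered]
        for r in range(len(others) + 1):
            for ref in combinations(others, r):
                yield tr, frozenset(ref)
```

Every observation produced by vms_observations satisfies both newvmspec and nonstop, whereas an observation such as the empty trace with refusal {coin} violates newvmspec.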

These examples show how the introduction of ref into the speciﬁcation of

a process permits the expression of a number of subtle but important proper-

ties; perhaps the most important of all is the property that the process must

not stop (X6). These advantages are obtained at the cost of slightly increased

complexity in proof rules and in proofs.

It is also desirable to prove that a process does not diverge. Section 3.8

describes how a divergent process is one that can do anything and refuse any-

thing. So if there is a set which cannot be refused, then the process is not di-

vergent. This justiﬁes formulation of a suﬃcient condition for non-divergence

NONDIV = (ref ≠ A)

Fortunately

NONSTOP ≡ NONDIV

so proof of absence of divergence does not entail any more work than proof

of absence of deadlock.

3.7.1 Proofs

In the following proof rules, a speciﬁcation will be written in any of the forms S,

S(tr), S(tr, ref ), according to convenience. In all cases, it should be understood

that the speciﬁcation may contain tr and ref among its free variables.

By the definition of nondeterminism, (P ⊓ Q) behaves either like P or like

Q. Therefore every observation of its behaviour will be an observation possible

for P or for Q or for both. This observation will therefore be described by the

speciﬁcation of P or by the speciﬁcation of Q or by both. Consequently, the

proof rule for nondeterminism has an exceptionally simple form

L1 If P sat S

and Q sat T

then (P ⊓ Q) sat (S ∨ T)

The proof rule for STOP states that it does nothing and refuses anything

L2A STOP_A sat (tr = ⟨⟩ ∧ ref ⊆ A)

Since refusals are always contained in the alphabet (3.4.1 L8) the clause ref ⊆ A

can be omitted. So if we omit alphabets altogether (which we shall do in future),

the law L2A is identical to that for deterministic processes (1.10.2 L4A)

STOP sat tr = ⟨⟩

The previous law for preﬁxing (1.10.2 L4B) is also still valid, but it is not

quite strong enough to prove that the process cannot stop before its initial

action. The rule needs to be strengthened by mention of the fact that in the

initial state, when tr = ⟨⟩, the initial action cannot be refused

L2B If P sat S(tr)

then (c → P) sat ((tr = ⟨⟩ ∧ c ∉ ref) ∨ (tr_0 = c ∧ S(tr′)))

The law for general choice (1.10.2 L4) needs to be similarly strengthened

L2 If ∀x : B • P(x) sat S(tr, x)

then (x : B → P(x)) sat

((tr = ⟨⟩ ∧ (B ∩ ref = {})) ∨ (tr_0 ∈ B ∧ S(tr′, tr_0)))

The law for parallel composition given in 2.7 L1 is still valid, provided that

the speciﬁcations make no mention of refusal sets. In order to deal correctly

with refusals, a slightly more complicated law is required

L3 If P sat S(tr, ref )

and Q sat T(tr, ref )

and neither P nor Q diverges

then (P || Q) sat

(∃X, Y, ref • ref = (X ∪ Y) ∧ S(tr ↾ αP, X) ∧ T(tr ↾ αQ, Y))

The law for change of symbol needs a similar adaptation

L4 If P sat S(tr, ref)

then f(P) sat S(f⁻¹*(tr), f⁻¹(ref)) provided f is one-one.

The law for □ is surprisingly simple

L5 If P sat S(tr, ref )

and Q sat T(tr, ref )

and neither P nor Q diverges

then (P □ Q) sat (if tr = ⟨⟩ then (S ∧ T) else (S ∨ T))

Initially, when tr = ⟨⟩, a set is refused by (P □ Q) only if it is refused by both

P and Q. This set must therefore be described by both their specifications.

Subsequently, when tr ≠ ⟨⟩, each observation of (P □ Q) must be an observation

either of P or of Q, and must therefore be described by one of their

specifications (or both).

The law for interleaving does not need to mention refusal sets

L6 If P sat S(tr)

and Q sat T(tr)

and neither P nor Q diverges

then (P ||| Q) sat (∃s, t • (tr interleaves (s, t) ∧ S(s) ∧ T(t)))

The law for concealment is complicated by the need to guard against di-

vergence

L7 If P sat (NODIV ∧ S(tr, ref ))

then (P \ C) sat ∃s • tr = s ↾ (αP − C) ∧ S(tr, ref ∪ C)

where NODIV states that the number of hidden symbols that can occur is

bounded by some function of the non-hidden symbols that have occurred

NODIV = #(tr ↾ C) ≤ f(tr ↾ (αP − C))

where f is some total function from traces to natural numbers.

The clause ref ∪C in the consequent of law L7 requires some explanation.

It is due to the fact that P \ C can refuse a set X only when P can refuse the

whole set X ∪ C, i.e., X together with all the hidden events. P \ C cannot

refuse to interact with its external environment until it has reached a state

in which it cannot engage in any further concealed internal activities. This

kind of fairness is a most important feature of any reasonable deﬁnition of

concealment, as described in Section 3.5.1.

The proof method for recursion (1.10.2 L6) also needs to be strengthened.

Let S(n) be a predicate containing the variable n, which ranges over the natural

numbers 0, 1, 2, . . .

L8 If S(0)

and ((X sat S(n)) ⇒ (F(X) sat S(n + 1)))

then (µ X • F(X)) sat (∀n • S(n))

This law is valid even for an unguarded recursion, though the strongest spe-

ciﬁcation which can be proved of such a process is the vacuous speciﬁcation

true.

3.8 Divergence

In previous chapters, we have observed the restriction that the equations which

deﬁne a process must be guarded (Section 1.1.2). This restriction has ensured

that the equations have only a single solution (1.3 L2). It has also released us

from the obligation of giving a meaning to the inﬁnite recursion

µ X • X

Unfortunately, the introduction of concealment (Section 3.5) means that

an apparently guarded recursion is not constructive. For example, consider

the equation

X = c → (X \ {c})^{+{c}}

This has as solutions both (c →STOP) and (c →a →STOP), a fact which may

be checked by substitution.

Consequently, any recursion equation which involves recursion under the

hiding operator is potentially unguarded, and liable to have more than one

solution. Which solution should be taken as the right one? We stipulate that

the right solution is the least deterministic, because this allows a nondetermin-

istic choice between all the other solutions. With this understanding, we can

altogether remove the restriction that recursions must be guarded, and we can

give a (possibly nondeterministic) meaning to every expression of the form

µ X • F(X), where F is deﬁned in terms of any of the operators introduced in

this book (except /), and observing all alphabet constraints.

3.8.1 Laws

Since CHAOS is the most nondeterministic process it cannot be changed by

adding yet further nondeterministic choices; it is therefore a zero of ⊓

L1 P ⊓ CHAOS = CHAOS

A function of processes that yields CHAOS if any of its arguments is CHAOS

is said to be strict. The above law (plus symmetry) states that ⊓ is a strict

function. CHAOS is such an awful process that almost any process which is

deﬁned in terms of CHAOS is itself equal to CHAOS.

L2 The following operations are strict

/ s, ||, f, □, \ C, |||, and µ X

However preﬁxing is not strict.

L3 CHAOS ≠ (a →CHAOS)

because the right-hand side can be relied upon to do a before becoming com-

pletely unreliable.

As mentioned before, CHAOS is the most unpredictable and most uncon-

trollable of processes. There is nothing that it might not do; furthermore, there

is nothing that it might not refuse to do!

L4 traces(CHAOS_A) = A*

L5 refusals(CHAOS_A) = all subsets of A.

3.8.2 Divergences

A divergence of a process is deﬁned as any trace of the process after which the

process behaves chaotically. The set of all divergences is deﬁned

divergences(P) = { s | s ∈ traces(P) ∧ (P / s) = CHAOS_αP }

It follows immediately that

L1 divergences(P) ⊆ traces(P)

Because / t is strict,

CHAOS / t = CHAOS

and it follows that the divergences of a process are extension-closed, in the

sense that

L2 s ∈ divergences(P) ∧ t ∈ (αP)* ⇒ (s⌢t) ∈ divergences(P)

Since CHAOS_A may refuse any subset of its alphabet A

L3 s ∈ divergences(P) ∧ X ⊆ αP ⇒ X ∈ refusals(P / s)

The three laws given above state general properties of divergences of any

process. The following laws show how the divergences of compound processes

are determined by the divergences and traces of their components. Firstly, the

process STOP never diverges

L4 divergences(STOP) = {}

At the other extreme, every trace of CHAOS leads to CHAOS

L5 divergences(CHAOS_A) = A*

A process defined by choice does not diverge on its first step. Consequently,

its divergences are determined by what happens after the first step

L6 divergences(x : B → P(x)) = { ⟨x⟩⌢s | x ∈ B ∧ s ∈ divergences(P(x)) }

Any divergence of P is also a divergence of (P ⊓ Q) and of (P □ Q)

L7 divergences(P ⊓ Q) = divergences(P □ Q)

= divergences(P) ∪ divergences(Q)

Since || is strict, a divergence of (P || Q) starts with a trace of the nondivergent

activity of both P and Q, which leads to divergence of either P or of Q (or of

both)

L8 divergences(P || Q) =

{ s⌢t | t ∈ (αP ∪ αQ)* ∧

((s ↾ αP ∈ divergences(P) ∧ s ↾ αQ ∈ traces(Q)) ∨

(s ↾ αP ∈ traces(P) ∧ s ↾ αQ ∈ divergences(Q))) }

A similar explanation applies to |||

L9 divergences(P ||| Q) =

{ u | ∃s, t • u interleaves (s, t) ∧

((s ∈ divergences(P) ∧ t ∈ traces(Q)) ∨

(s ∈ traces(P) ∧ t ∈ divergences(Q))) }

Divergences of a process resulting from concealment include traces derived

from the original divergences, plus those resulting from the attempt to conceal

an inﬁnite sequence of symbols

L10 divergences(P \ C) =

{ (s ↾ (αP − C))⌢t |

t ∈ (αP − C)* ∧

(s ∈ divergences(P) ∨

(∀n • ∃u ∈ C* • #u > n ∧ (s⌢u) ∈ traces(P))) }

A process deﬁned by symbol change diverges only when its argument diverges

L11 divergences(f(P)) = { f*(s) | s ∈ divergences(P) }

provided f is one-one.

It is a shame to devote so much attention to divergence, when divergence is

always something we do not want. Unfortunately, it seems to be an inevitable

consequence of any efficient or even computable method of implementation. It

can arise from either concealment or unguarded recursion; and it is part of the

task of a system designer to prove that for his particular design the problem

will not occur. In order to prove that something can’t happen, we need to use

a mathematical theory in which it can!

3.9 Mathematical theory of non-deterministic processes

The laws given in this chapter are distinctly more complicated than the laws

given in the two earlier chapters; and the informal justiﬁcations and examples

carry correspondingly less conviction. It is therefore even more important to

construct a proper mathematical deﬁnition of the concept of a nondetermin-

istic process, and prove the correctness of the laws from the deﬁnitions of the

operators.

As in Section 2.8.1, a mathematical model is based on the relevant directly

or indirectly observable properties of a process. These certainly include its

alphabet and its traces; but for a nondeterministic process there are also its

refusals (Section 3.4) and divergences (Section 3.8). In addition to refusals at

the ﬁrst step of a process P, it is necessary also to take into account what P

may refuse after engaging in an arbitrary trace s of its behaviour. We therefore

deﬁne the failures of a process as a relation (set of pairs)

failures(P) = { (s, X) | s ∈ traces(P) ∧ X ∈ refusals(P / s) }

If (s, X) is a failure of P, this means that P can engage in the sequence of events

recorded by s, and then refuse to do anything more, in spite of the fact that

its environment is prepared to engage in any of the events of X. The failures

of a process are more informative about the behaviour of that process than its

traces or refusals, which can both be deﬁned in terms of failures

traces(P) = { s | ∃X • (s, X) ∈ failures(P) }

= domain(failures(P))

refusals(P) = { X | (⟨⟩, X) ∈ failures(P) }

The various properties of traces (1.8.1 L6, L7, L8) and refusals (3.4.1 L8, L9,

L10, L11) can easily be reformulated in terms of failures (see conditions C1,

C2, C3, C4 under the deﬁnition D0 below).

We are now ready for the bold decision that a process is uniquely deﬁned

by the three sets specifying its alphabet, its failures, and its divergences; and

conversely, any three sets which satisfy the relevant conditions uniquely deﬁne

a process. We will ﬁrst deﬁne the powerset of A as the set of all its subsets

PA = { X | X ⊆ A }

D0 A process is a triple (A, F, D), where

A is any set of symbols (for simplicity ﬁnite)

F is a relation between A* and PA

D is a subset of A*

provided that they satisfy the following conditions

C1 (⟨⟩, {}) ∈ F

C2 (s⌢t, X) ∈ F ⇒ (s, {}) ∈ F

C3 (s, Y) ∈ F ∧ X ⊆ Y ⇒ (s, X) ∈ F

C4 (s, X) ∈ F ∧ x ∈ A ⇒ (s, X ∪ {x}) ∈ F ∨ (s⌢⟨x⟩, {}) ∈ F

C5 D ⊆ domain(F)

C6 s ∈ D ∧ t ∈ A* ⇒ s⌢t ∈ D

C7 s ∈ D ∧ X ⊆ A ⇒ (s, X) ∈ F

(the last three conditions reﬂect the laws 3.8.2 L1, L2, L3).

The simplest process which satisﬁes this deﬁnition is also the worst

D1 CHAOS_A = (A, (A* × PA), A*)

where

A* × PA is the Cartesian product { (s, X) | s ∈ A* ∧ X ∈ PA }

This is the largest process with alphabet A, since every member of A* is both

a trace and a divergence, and every subset of A is a refusal after all traces.

Another simple process is deﬁned

D2 STOP_A = (A, {⟨⟩} × PA, {})

This process never does anything, can refuse everything, and has no diver-

gences.
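For a small alphabet the triple of D0 can be written out in full, and the conditions C1 to C7 checked mechanically. The following is only a sketch under stated assumptions: traces are Python tuples, refusal sets are frozensets, and the names powerset, stop and satisfies_d0 are illustrative, not part of the theory. Because C6 requires closure under every extension in A*, a finite listing can represent only divergence-free processes, so the sketch rejects any non-empty D.

```python
from itertools import combinations


def powerset(alphabet):
    """All subsets of `alphabet` as frozensets (the set PA of the text)."""
    items = sorted(alphabet)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


def stop(alphabet):
    """STOP_A as in D2: failures {<>} x PA and no divergences."""
    a = frozenset(alphabet)
    return a, {((), x) for x in powerset(a)}, set()


def satisfies_d0(process):
    """Check C1-C7 on an explicitly enumerated triple (A, F, D)."""
    a, f, d = process
    domain = {s for s, _ in f}
    c1 = ((), frozenset()) in f
    # C2: every prefix of a trace in F is itself a trace in F
    c2 = all((s[:i], frozenset()) in f
             for s, _ in f for i in range(len(s) + 1))
    # C3: refusal sets are subset-closed
    c3 = all((s, x) in f for s, y in f for x in powerset(y))
    # C4: an event is either refusable or can be performed
    c4 = all((s, x | {e}) in f or (s + (e,), frozenset()) in f
             for s, x in f for e in a)
    c5 = d <= domain
    # C6 cannot be enumerated for a non-empty D; accept only D = {}
    c6 = not d
    c7 = all((s, x) in f for s in d for x in powerset(a))
    return all([c1, c2, c3, c4, c5, c6, c7])
```

Applied to stop({"a", "b"}) the check succeeds; deleting the pair (⟨⟩, {}) from the failures makes it fail on C1.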

An operator is deﬁned on processes by showing how the three sets of the

result can be derived from those of their operands. Of course it is necessary

to show that the result of the operation satisfies the seven conditions of D0; this

proof is usually based on the assumption that its operands do so to start with.

The simplest operation to define is the nondeterministic or (⊓). Like many

other operations, it is defined only for operands with the same alphabet

D3 (A, F1, D1) ⊓ (A, F2, D2) = (A, F1 ∪ F2, D1 ∪ D2)

The resulting process can fail or diverge in all cases that either of its two op-

erands can do so. The laws 3.2 L1, L2, L3 are direct consequences of this

deﬁnition.
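Definition D3 is simple enough to execute directly on finite triples. The sketch below assumes the same representation as before (tuples for traces, frozensets for refusals); the function name nd_or and the two sample processes are illustrative only. Idempotence, symmetry and associativity then follow from the corresponding properties of set union.

```python
def nd_or(p, q):
    """Nondeterministic or of D3: union of failures and of divergences."""
    a1, f1, d1 = p
    a2, f2, d2 = q
    if a1 != a2:
        raise ValueError("operands must have the same alphabet")
    return a1, f1 | f2, d1 | d2


ALPHABET = frozenset({"a"})
EMPTY = frozenset()

# STOP over the alphabet {a}: does nothing, refuses everything
STOP_A = (ALPHABET, {((), EMPTY), ((), ALPHABET)}, set())

# (a -> STOP) over {a}: cannot refuse a initially, then behaves like STOP
A_THEN_STOP = (ALPHABET,
               {((), EMPTY), (("a",), EMPTY), (("a",), ALPHABET)},
               set())

CHOICE = nd_or(STOP_A, A_THEN_STOP)
```

CHOICE may refuse a at the first step (like STOP) but may also perform it, which is exactly the behaviour of STOP ⊓ (a → STOP).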

The deﬁnitions of all the other operators can be given similarly; but it

seems slightly more elegant to write separate deﬁnitions for the alphabets,

failures and divergences. The deﬁnitions of the divergences have been given

in Section 3.8.2, so it remains only to deﬁne the alphabets and the failures.

D4 If αP(x) = A for all x

and B ⊆ A

then α(x : B →P(x)) = A.

D5 α(P || Q) = (αP ∪αQ)

D6 α(f (P)) = f (α(P))

D7 α(P □ Q) = α(P ||| Q) = αP provided αP = αQ

D8 α(P \ C) = αP −C

D9 failures(x : B → P(x)) =

{ (⟨⟩, X) | X ⊆ (αP − B) } ∪

{ (⟨x⟩⌢s, X) | x ∈ B ∧ (s, X) ∈ failures(P(x)) }

D10 failures(P || Q) =

{ (s, X ∪ Y) | s ∈ (αP ∪ αQ)* ∧ (s ↾ αP, X) ∈ failures(P) ∧

(s ↾ αQ, Y) ∈ failures(Q) } ∪

{ (s, X) | s ∈ divergences(P || Q) }

D11 failures(f(P)) = { (f*(s), f(X)) | (s, X) ∈ failures(P) }

D12 failures(P □ Q) =

{ (s, X) | (s, X) ∈ failures(P) ∩ failures(Q) ∨

(s ≠ ⟨⟩ ∧ (s, X) ∈ failures(P) ∪ failures(Q)) } ∪

{ (s, X) | s ∈ divergences(P □ Q) }

D13 failures(P ||| Q) =

{ (s, X) | ∃t, u • s interleaves (t, u) ∧ (t, X) ∈ failures(P) ∧ (u, X) ∈ failures(Q) } ∪

{ (s, X) | s ∈ divergences(P ||| Q) }

D14 failures(P \ C) =

{ (s ↾ (αP − C), X) | (s, X ∪ C) ∈ failures(P) } ∪

{ (s, X) | s ∈ divergences(P \ C) }

The explanations of these laws may be derived from the explanations of the

corresponding traces and refusals, together with the laws for \.

It remains to give a deﬁnition for processes deﬁned recursively by means

of µ. The treatment is based on the same ﬁxed point theory as Section 2.8.2,

except that the definition of the ordering ⊑ is different

D15 (A, F1, D1) ⊑ (A, F2, D2) ≡ (F2 ⊆ F1 ∧ D2 ⊆ D1)

P ⊑ Q now means that Q is equal to P or better in the sense that it is less likely

to diverge and less likely to fail. Q is more predictable and more controllable

than P, because if Q can do something undesirable P can do it too; and if Q can

refuse to do something desirable, P can also refuse. CHAOS can do anything

at any time, and can refuse to do anything at any time. True to its name, it is

the least predictable and controllable of all processes; or in short the worst

L1 CHAOS ⊑ P

This ordering is clearly a partial order. In fact it is a complete partial order, with

a limit operation deﬁned in terms of the intersections of descending chains of

failures and divergences

D16 ⊔_{n≥0} (A, F_n, D_n) = (A, ⋂_{n≥0} F_n, ⋂_{n≥0} D_n)

provided (∀n ≥ 0 • F_{n+1} ⊆ F_n ∧ D_{n+1} ⊆ D_n)

The µ operation is deﬁned in the same way as for deterministic processes (2.8.2

L7), except for the diﬀerence in the deﬁnition of the ordering, which requires

that CHAOS be used in place of STOP

D17 µ X : A • F(X) = ⊔_{i≥0} F^i(CHAOS_A)

The proof that this is a solution (in fact the most nondeterministic solution)

of the relevant equation is the same as that given in Section 2.8.2.
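The construction of D17 can be watched at work on a truncated model. The sketch below is an illustration under loud assumptions, not the theory itself: every set is cut off at traces of length depth so that CHAOS_A becomes finite, the only operator provided is prefixing (following D9 and 3.8.2 L6 with B = {event}), and the names chaos, prefix and mu are invented for the purpose. Starting from the truncated CHAOS_A, each application of the body removes failures and divergences, and the iteration stops when nothing changes.

```python
from itertools import combinations


def subsets(alphabet):
    """All subsets of `alphabet` as frozensets."""
    items = sorted(alphabet)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def all_traces(alphabet, depth):
    """Every trace over `alphabet` of length at most `depth`."""
    layer, result = {()}, {()}
    for _ in range(depth):
        layer = {s + (e,) for s in layer for e in alphabet}
        result |= layer
    return result


def chaos(alphabet, depth):
    """CHAOS_A of D1, cut off at traces of length `depth`."""
    traces = all_traces(alphabet, depth)
    return (frozenset(alphabet),
            {(s, x) for s in traces for x in subsets(alphabet)},
            traces)


def prefix(event, process, depth):
    """(event -> process), truncated at `depth`."""
    a, f, d = process
    first = {((), x) for x in subsets(a - {event})}
    later = {((event,) + s, x) for s, x in f if len(s) < depth}
    return a, first | later, {(event,) + s for s in d if len(s) < depth}


def mu(body, alphabet, depth):
    """Iterate `body` from truncated CHAOS_A until the triple is stable."""
    current = chaos(alphabet, depth)
    while True:
        following = body(current)
        if following == current:
            return current
        current = following
```

With alphabet {a, b} and depth 3, mu(lambda x: prefix("a", x, 3), {"a", "b"}, 3) yields a process whose traces are the sequences of at most three a's, which can refuse b but never a, and which has no divergences. The unguarded body lambda x: x is stable at once and returns the truncated CHAOS_A.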

As before, the validity of the proof depends critically on the fact that all the

operators used on the right-hand side of the recursion should be continuous

in the appropriate ordering. Fortunately, all the operators deﬁned in this book

(except /) are continuous, and so is every formula constructed from them. In

the case of the concealment operator, the requirement of continuity was one

of the main motivations for the rather complicated treatment of divergence.

Communication 4

4.1 Introduction

In previous chapters we have introduced and illustrated a general concept of

an event as an action without duration, whose occurrence may require simul-

taneous participation by more than one independently described process. In

this chapter we shall concentrate on a special class of event known as a com-

munication. A communication is an event that is described by a pair c.v where

c is the name of the channel on which the communication takes place and v

is the value of the message which passes. Examples of this convention have

already been given in COPYBIT (1.1.3 X7) and CHAIN2 (2.6 X4).

The set of all messages which P can communicate on channel c is deﬁned

αc(P) = { v | c.v ∈ αP }

We also deﬁne functions which extract channel and message components of a

communication

channel(c.v) = c, message(c.v) = v

All the operations introduced in this chapter can be deﬁned in terms of

the more primitive concepts introduced in earlier chapters, and most of the

laws are just special cases of previously familiar laws. The reason for introducing

special notations is that they are suggestive of useful applications and

implementation methods, and that in some cases the imposition of notational

restrictions permits the use of more powerful reasoning methods.

4.2 Input and output

Let v be any member of αc(P). A process which ﬁrst outputs v on the channel

c and then behaves like P is deﬁned

(c!v →P) = (c.v →P)

The only event in which this process is initially prepared to engage is the com-

munication event c.v.

A process which is initially prepared to input any value x communicable

on the channel c, and then behave like P(x), is deﬁned

(c?x → P(x)) = (y : { y | channel(y) = c } → P(message(y)))

Example

X1 Using the new deﬁnitions of input and output we can rewrite 1.1.3 X7

COPYBIT = µ X • (in?x →(out!x →X))

where αin(COPYBIT) = αout(COPYBIT) = {0, 1}

We shall observe the convention that channels are used for communication

in only one direction and between only two processes. A channel which is used

only for output by a process will be called an output channel of that process;

and one used only for input will be called an input channel. In both cases,

we shall say loosely that the channel name is a member of the alphabet of the

process.

When drawing a connection diagram (Section 2.4) of a process, the chan-

nels are drawn as arrows in the appropriate direction, and labelled with the

name of the channel (Figure 4.1).

[Figure 4.1: a process P with its channels left, right and down drawn as labelled arrows]

Let P and Q be processes, and let c be an output channel of P and an input

channel of Q. When P and Q are composed concurrently in the system (P || Q),

communication will occur on channel c on each occasion that P outputs a

message and Q simultaneously inputs that message. An outputting process

speciﬁes a unique value for the message, whereas the inputting process is

prepared to accept any communicable value. Thus the event that will actually

occur is the communication c.v, where v is the value speciﬁed by the outputting

process. This requires the obvious constraint that the channel c must have the

same alphabet at both ends, i.e.,

αc(P) = αc(Q)

In future, we will assume satisfaction of this constraint; and where no con-

fusion can arise we will write αc for αc(P). An example of the working of

this model for communication has been given in CHAIN2 (2.6 X4); and more

interesting examples will be given in Section 4.3 and subsequent sections.

In general, the value to be output by a process is speciﬁed by means of an

expression containing variables to which a value has been assigned by some

previous input, as illustrated in the following examples.

Examples

X1 A process which immediately copies every message it has input from the

left by outputting it to the right

αleft(COPY) = αright(COPY)

COPY = µ X • (left?x →right!x →X)

If αleft = {0, 1}, COPY is almost identical to COPYBIT (1.1.3 X7).

X2 A process like COPY, except that every number input is doubled before it

is output

αleft = αright = N

DOUBLE = µ X • (left?x →right!(x +x) →X)

X3 The value of a punched card is a sequence of eighty characters, which may

be read as a single value along the left channel. A process which reads cards

and outputs their characters one at a time

αleft = { s | s ∈ αright* ∧ #s = 80 }

UNPACK = P_⟨⟩

where P_⟨⟩ = left?s → P_s

P_⟨x⟩ = right!x → P_⟨⟩

P_(⟨x⟩⌢s) = right!x → P_s

X4 A process which inputs characters one at a time from the left, and assembles

them into lines of 125 characters' length. Each completed line is output

on the right as a single array-valued message

αright = { s | s ∈ αleft* ∧ #s = 125 }

PACK = P_⟨⟩

where P_s = right!s → P_⟨⟩ if #s = 125

P_s = left?x → P_(s⌢⟨x⟩) if #s < 125

Here, P_s describes the behaviour of the process when it has input and packed

the characters in the sequence s; they are waiting to be output when the line

is long enough.
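The behaviour of UNPACK (X3) and PACK (X4) may be imitated by Python generators, in which consuming an item from the argument plays the part of input on the left and yield plays the part of output on the right. This is a sketch only: the generator names and the width parameter (which stands in for the fixed line length 125) are assumptions made so that small examples can be tried.

```python
def unpack(cards):
    """Read whole cards on the left; emit their characters one at a time."""
    for card in cards:          # left?s
        for ch in card:         # right!x for each x of s in turn
            yield ch


def pack(chars, width=125):
    """Read characters on the left; emit each completed line as one value."""
    line = []
    for ch in chars:            # left?x while #s < width
        line.append(ch)
        if len(line) == width:  # right!s as soon as #s = width
            yield tuple(line)
            line = []
    # An incomplete final line is never output, just as P_s with
    # #s < 125 offers only further input.
```

For example, list(pack("abcdefg", width=3)) gives the two lines ("a", "b", "c") and ("d", "e", "f"); the trailing "g" is still waiting to be packed.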

X5 A process which copies from left to right, except that each pair of consec-

utive asterisks is replaced by a single “↑”

αleft = αright − {“↑”}

SQUASH = µ X • (left?x →

if x ≠ “*” then (right!x → X)

else left?y → if y = “*” then (right!“↑” → X)

else (right!“*” → right!y → X))
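The conditional structure of SQUASH translates almost line for line into a sequential program. In this sketch (the function name squash is an assumption) the whole input is supplied as a finite sequence and the outputs are collected in a list; a lone asterisk at the very end of the input produces no output, because the process is still waiting to see the following character.

```python
def squash(chars):
    """Replace each pair of consecutive asterisks by a single up-arrow."""
    out = []
    source = iter(chars)
    for x in source:                 # left?x
        if x != "*":
            out.append(x)            # right!x
            continue
        y = next(source, None)       # left?y
        if y is None:
            break                    # input exhausted while waiting for y
        if y == "*":
            out.append("↑")          # right!"↑"
        else:
            out.extend(["*", y])     # right!"*" then right!y
    return out
```

Thus squash("a**b*c") delivers a, ↑, b, *, c in that order.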

A process may be prepared initially to communicate on any one of a set of

channels, leaving the choice between them to the other processes with which

it is connected. For this purpose we adapt the choice notation introduced in

Chapter 1. If c and d are distinct channel names

(c?x → P(x) | d?y → Q(y))

denotes a process which initially inputs x on c and then behaves like P(x),

or initially inputs y on channel d and then behaves like Q(y). The choice

is determined by whichever of the corresponding outputs is ready ﬁrst, as

explained below.

Since we have decided to abstract from the timing of events and the speed

of processes which engage in them, the last sentence of the previous paragraph

may require explanation. Consider the case when the channels c and d are

output channels of two other separate processes, which are independent in

the sense that they do not directly or indirectly communicate with each other.

The actions of these two processes are therefore arbitrarily interleaved. Thus

if one process is making progress towards an output on c, and the other is

making progress towards an output on d, it is not determined which of them

reaches its output ﬁrst. An implementor will be expected, but not compelled,

to resolve this nondeterminism in favour of the ﬁrst output to become ready.

This policy also protects against the deadlock that will result if the second

output is never going to occur, or if it can occur only after the ﬁrst output,

as in the case when both the channels c and d are connected to the same

concurrent process, which outputs on one and then the other

(c!2 →d!4 →P)

Thus the presentation of a choice of inputs not only protects against dead-

lock but also achieves greater eﬃciency and reduces response times to pro-

ferred communications. A traveller who is waiting for a number 127 bus will

in general have to wait longer than one who is prepared to travel in either a

number 19 or a number 127, whichever arrives ﬁrst at the bus stop. On the

assumption of random arrivals, the traveller who oﬀers a choice will wait only

half as long—paradoxically, it is as though he is waiting twice as fast! To wait

for the ﬁrst of many possible events is the only way of achieving this: purchase

of faster computers is useless.

Examples

X6 A process which accepts input on either of the two channels left1 or left2,

and immediately outputs the message to the right

αleft1 = αleft2 = αright

MERGE = (left1?x →right!x →MERGE

| left2?x → right!x → MERGE)

The output of this process is an interleaving of the messages input from left1

and left2.

X7 A process that is always prepared to input a value on the left, or to output

to the right a value which it has most recently input

αleft = αright

VAR = left?x → VAR_x

where

VAR_x = (left?y → VAR_y

| right!x → VAR_x)

Here VAR_x behaves like a program variable with current value x. New values

are assigned to it by communication on the left channel, and its current value

is obtained by communication on the right channel. If αleft = {0, 1} the behaviour

of VAR is almost identical to that of BOOL (2.6 X5).

X8 A process which inputs from up and left, outputs to down a function of

what it has input, before repeating

NODE(v) = µ X • (up?sum →left?prod →

down!(sum+v ×prod) →X)

X9 A process which is at all times ready to input a message on the left, and

to output on its right the first message which it has input but not yet output

BUFFER = P_⟨⟩

where

P_⟨⟩ = left?x → P_⟨x⟩

P_(⟨x⟩⌢s) = (left?y → P_(⟨x⟩⌢s⌢⟨y⟩)

| right!x → P_s)

BUFFER behaves like a queue; messages join the right-hand end of the queue

and leave it from the left end, in the same order as they joined, but after a

possible delay, during which later messages may join the queue.

X10 A process which behaves like a stack of messages. When empty, it responds

to the signal empty. At all times it is ready to input a new message from

the left and put it on top of the stack; and whenever nonempty, it is prepared

to output and remove the top element of the stack

STACK = P_⟨⟩

where

P_⟨⟩ = (empty → P_⟨⟩ | left?x → P_⟨x⟩)

P_(⟨x⟩⌢s) = (right!x → P_s | left?y → P_(⟨y⟩⌢⟨x⟩⌢s))

This process is very similar to the previous example, except that when empty it

participates in the empty event, and that it puts newly arrived messages on the

same end of the stored sequence as it takes them off. Thus if y is the newly

arrived input message, and x is the message currently ready for output, the

STACK stores ⟨y⟩⌢⟨x⟩⌢s but the BUFFER stores ⟨x⟩⌢s⌢⟨y⟩.
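The contrast between the two stored sequences can be made concrete with a pair of small classes. This is a sketch under assumptions: the class and method names are invented, each method stands for one event of the process, and an event that the process would refuse is modelled by raising RuntimeError.

```python
from collections import deque


class Buffer:
    """BUFFER: state <x>^s; input y gives <x>^s^<y>, output removes x."""

    def __init__(self):
        self.state = deque()

    def left(self, y):
        self.state.append(y)          # new message joins the far end

    def right(self):
        if not self.state:
            raise RuntimeError("output on right is refused when empty")
        return self.state.popleft()


class Stack:
    """STACK: state <x>^s; input y gives <y>^<x>^s, output removes x."""

    def __init__(self):
        self.state = deque()

    def empty(self):
        if self.state:
            raise RuntimeError("empty is refused when nonempty")

    def left(self, y):
        self.state.appendleft(y)      # new message joins the output end

    def right(self):
        if not self.state:
            raise RuntimeError("output on right is refused when empty")
        return self.state.popleft()
```

After the inputs 1, 2, 3 the buffer outputs 1, 2, 3 and the stack outputs 3, 2, 1; the only difference between the two classes is the end of the sequence at which left inserts.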

4.2.1 Implementation

In a LISP implementation of communicating processes, the event c.v is naturally

represented by the dotted pair c.v, which is constructed by

cons("c, v)

Input and output commands are conveniently implemented as functions which

ﬁrst take a channel name as argument. If the process is not prepared to com-

municate on the channel, it delivers the answer "BLEEP. The actual value com-

municated is treated separately in the next stage, as described below.

If Q is the input command

(c?x → Q(x))

then Q("c) ≠ "BLEEP; instead, its result is a function which expects the input

value x as its argument, and delivers as its result the process Q(x). Thus Q is

implemented by calling the LISP function

input("c, λx • Q(x))

which is deﬁned

input(c, F) = λy • if y ≠ c then "BLEEP else F

It follows that Q / ⟨c.v⟩ is represented in LISP by Q("c)(v), provided that ⟨c.v⟩

is a trace of Q.

If P is the output command

(c!v → P′)

then P("c) ≠ "BLEEP; instead, its result is the pair cons(v, P′). Thus P is implemented

by calling the LISP function

output("c, v, P′)

which is deﬁned

output(c, v, P) = λy • if y ≠ c then "BLEEP else cons(v, P)

It follows that v = car(P("c)), and that P / ⟨c.v⟩ is represented in LISP by

cdr(P("c)), provided that ⟨c.v⟩ is a trace of P.

In theory, if αc is ﬁnite, it would be possible to treat c.v as a single event,

passed as a parameter to the input and output commands. But this would be

grotesquely ineﬃcient, since the only way of ﬁnding out what value is output

would be to test whether P(c.v) ≠ "BLEEP for all values v in αc, until the right

one is found. One of the justiﬁcations for introducing specialised notation

for input and output is to encourage and permit methods of implementation

which are significantly more efficient. The disadvantage is that the implementation

of nearly all the other operators needs to be recoded in the light of this

optimisation.

Examples

X1 COPY = LABEL X • input("left, λx • output("right, x, X))

X2 PACK = P(NIL)

where

P = LABEL X •

(λs • if length(s) = 125 then

output("right, s, X(NIL))

else

input("left,

λx • X(append(s, cons(x, NIL)))))
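The same scheme can be tried out in any language with first-class functions. The following Python transliteration is a sketch only, with several assumptions: the string "BLEEP" stands for the atom "BLEEP, a Python pair (v, P) stands for cons(v, P), ordinary recursive definitions replace LABEL, the trailing underscore in input_ merely avoids the Python builtin, and pack takes the line length as a parameter (at least 1) in place of the fixed 125.

```python
BLEEP = "BLEEP"


def input_(c, f):
    """input(c, F): on channel c deliver F, which awaits the input value."""
    return lambda y: f if y == c else BLEEP


def output(c, v, p):
    """output(c, v, P): on channel c deliver the pair (v, P)."""
    return lambda y: (v, p) if y == c else BLEEP


def copy(y):
    """COPY = LABEL X . input("left, lambda x . output("right, x, X))."""
    return input_("left", lambda x: output("right", x, copy))(y)


def pack(width):
    """PACK with lines of `width` characters; returns the process P(NIL)."""
    def p(s):
        if len(s) == width:
            return output("right", s, p(()))
        return input_("left", lambda x: p(s + (x,)))
    return p(())
```

Here copy("left")(7) is the process COPY / ⟨left.7⟩; applying it to "right" gives the pair (7, copy), while applying it to "left" gives "BLEEP".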

4.2.2 Speciﬁcations

In specifying the behaviour of a communicating process, it is convenient to

describe separately the sequences of messages that pass along each of the

channels. If c is a channel name, we deﬁne (see Section 1.9.6)

tr ↓ c = message*(tr ↾ αc)

It is convenient just to omit the tr ↓, and write right ≤ left instead of tr ↓

right ≤ tr ↓ left.

Another useful deﬁnition places a lower bound on the length of a preﬁx

s ≤ⁿ t = (s ≤ t ∧ #t ≤ #s + n)

This means that s is a prefix of t, with not more than n items removed. The

following laws are obvious and useful

s ≤⁰ t ≡ (s = t)

s ≤ⁿ t ∧ t ≤ᵐ u ⇒ s ≤ⁿ⁺ᵐ u

s ≤ t ≡ ∃n • s ≤ⁿ t
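The bounded prefix relation is easily expressed as a predicate on finite sequences, which makes the three laws convenient to test on examples. In this sketch sequences are Python tuples and the function names is_prefix and prefix_within are assumptions.

```python
def is_prefix(s, t):
    """s <= t: s is an initial subsequence of t."""
    return len(s) <= len(t) and t[:len(s)] == s


def prefix_within(s, t, n):
    """s <=n t: s is a prefix of t with at most n items removed."""
    return is_prefix(s, t) and len(t) <= len(s) + n
```

For instance, a trace of a one-place copying process in which left = ⟨1, 2, 3⟩ and right = ⟨1, 2⟩ satisfies prefix_within(right, left, 1), whereas right = ⟨1⟩ would not.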

Examples

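X1 COPY sat right ≤¹ left

X2 DOUBLE sat right ≤¹ double*(left)

X3 UNPACK sat right ≤ ⌢/ left

where

⌢/⟨s₀, s₁, ..., sₙ₋₁⟩ = s₀⌢s₁⌢...⌢sₙ₋₁ (see 1.9.2)

The specification here states that the output on the right is obtained by flattening

the sequence of sequences input on the left.

X4 PACK sat ((⌢/ right ≤¹²⁵ left) ∧ (#* right ∈ {125}*))

This specification states that each element output on the right is itself a sequence

of length 125, and the catenation of all these sequences is an initial

subsequence of what has been input on the left.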

This speciﬁcation states that each element output on the right is itself a se-

quence of length 125, and the catenation of all these sequences is an initial

subsequence of what has been input on the left.

If ⊕ is a binary operator, it is convenient to apply it distributively to the corresponding

elements of two sequences. The length of the resulting sequence

is equal to that of the shorter operand

s ⊕ t = ⟨⟩ if s = ⟨⟩ or t = ⟨⟩

= ⟨s₀ ⊕ t₀⟩⌢(s′ ⊕ t′) otherwise

Clearly

(s ⊕ t)[i] = s[i] ⊕ t[i] for i ≤ min(#s, #t).

and

s ≤ⁿ t ⇒ (s ⊕ u ≤ⁿ t ⊕ u) ∧ (u ⊕ s ≤ⁿ u ⊕ t)

Examples

X5 The Fibonacci sequence

⟨1, 1, 2, 3, 5, 8, ...⟩

is deﬁned by the recurrence relation

ﬁb[0] = ﬁb[1] = 1

ﬁb[i +2] = ﬁb[i +1] +ﬁb[i]

The second line can be rewritten using the ′ operator to left-shift the sequence

by one place

fib″ = fib′ + fib

The original deﬁnition of the Fibonacci sequence may be recovered from this

more cryptic form by subscripting both sides of the equation

fib″[i] = (fib′ + fib)[i]

⇒ fib′[i + 1] = fib′[i] + fib[i] [1.9.4 L1]

⇒ fib[i + 2] = fib[i + 1] + fib[i]

Another explanation of the meaning of the equation is as a description of the

inﬁnite sum, where the left shift is clearly displayed

1, 1, 2, 3, 5, ...        fib

   ↙  ↙  ↙  ↙

1, 2, 3, 5, ...         + fib′

   ↙  ↙  ↙

2, 3, 5, ...            = fib″

In the above discussion, ﬁb is regarded as an inﬁnite sequence. If s is a ﬁnite

initial subsequence of ﬁb (with #s ≥ 2) then instead of the equation we get the

inequality

s″ ≤ s′ + s

This formulation can be used to specify a process FIB which outputs the Fibon-

acci sequence to the right.

FIB sat (right ≤ ⟨1, 1⟩ ∨ (⟨1, 1⟩ ≤ right ∧ right″ ≤ right′ + right))
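The specification of FIB can be evaluated on any finite candidate for the sequence right. In the sketch below (all function names are assumptions) the left shift s′ is the slice s[1:], the pointwise sum stops at the end of the shorter operand as the definition of ⊕ requires, and ≤ is the prefix ordering on lists.

```python
from operator import add


def pointwise(op, s, t):
    """s (+) t: apply op to corresponding elements; length of the shorter."""
    return [op(a, b) for a, b in zip(s, t)]


def is_prefix(s, t):
    """s <= t for finite sequences."""
    return list(t[:len(s)]) == list(s)


def fib_spec(right):
    """right <= <1,1>  or  (<1,1> <= right and right'' <= right' + right)."""
    right = list(right)
    if is_prefix(right, [1, 1]):
        return True
    shifted_sum = pointwise(add, right[1:], right)
    return is_prefix([1, 1], right) and is_prefix(right[2:], shifted_sum)
```

The sequence ⟨1, 1, 2, 3, 5⟩ satisfies the predicate, since right″ = ⟨2, 3, 5⟩ is a prefix of right′ + right = ⟨2, 3, 5, 8⟩; the sequence ⟨1, 1, 2, 4⟩ does not.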

X6 A variable with value x outputs on the right the value most recently input

on the left, or x, if there is no such input. More formally, if the most recent

action was an output, then the value which was output is equal to the last item

in the sequence ⟨x⟩⌢left

VAR_x sat (channel(tr̅₀) = right ⇒ right̅₀ = (⟨x⟩⌢left)̅₀)

where s̅₀ is the last element of s (Section 1.9.5).

This is an example of a process that cannot be adequately speciﬁed solely

in terms of the sequence of messages on its separate channels. It is also ne-

cessary to know the order in which the communications on separate channels

are interleaved, for example that the latest communication is on the right. In

general, this extra complexity will be necessary for processes which use the

choice operator.

X7 The MERGE process produces an interleaving (Section 1.9.3) of the two

sequences input on left1 and left2, buﬀering up to one message

MERGE sat ∃r • right ≤¹ r ∧ r interleaves (left1, left2)

X8 BUFFER sat right ≤ left

A process which satisﬁes the speciﬁcation right ≤ left describes the beha-

viour of a transparent communications protocol, which is guaranteed to deliver

on the right only those messages which have been submitted on the left, and in

the same order. A protocol achieves this in spite of the fact that the place where

the messages are submitted is widely separated from the place where they are

received, and the fact that the communications medium which connects the

two places is somewhat unreliable. Examples will be given in Section 4.4.5.

4.3 Communications

Let P and Q be processes, and let c be a channel used for output by P and for

input by Q. Thus the set containing all communication events of the form c.v

is within the intersection of the alphabet of P with the alphabet of Q. When

these processes are composed concurrently in the system (P || Q), a com-

munication c.v can occur only when both processes engage simultaneously in

that event, i.e., whenever P outputs a value v on the channel c, and Q simul-

taneously inputs the same value. An inputting process is prepared to accept

any communicable value, so it is the outputting process that determines which

actual message value is transmitted on each occasion, as in 2.6 X4.

Thus output may be regarded as a specialised case of the preﬁx operator,

and input a special case of choice; and this leads to the law

L1 (c!v →P) || (c?x →Q(x)) = c!v →(P || Q(v))

Note that c!v remains on the right-hand side of this equation as an observable

action in the behaviour of the system. This represents the physical possibility

of tapping the wires connecting the components of a system, and of thereby


keeping a log of their internal communications. It is also a help in reasoning

about the system.

If desired, such internal communications can be concealed by applying the

concealment operator described in Section 3.5 outside the parallel composition

of the two processes which communicate on the same channel, as shown by

the law

L2 ((c!v →P) || (c?x →Q(x))) \ C = (P || Q(v)) \ C

where C = { c.v | v ∈ αc }

Examples will be given in Sections 4.4 and 4.5.

The speciﬁcation of the parallel composition of communicating processes

takes a particularly simple form when channel names are used to denote the

sequences of messages passing on them. Let c be the name of a channel along

which P and Q communicate. In the speciﬁcation of P, c stands for the se-

quence of messages communicated by P on c. Similarly, in the speciﬁcation of

Q, c stands for the sequence of messages communicated by Q.

Fortunately, by the very nature of communication, when P and Q commu-

nicate on c, the sequences of messages sent and received must at all times be

identical. Consequently this sequence must satisfy both the speciﬁcation of P

and the speciﬁcation of Q. The same is true for all channels in the intersection

of their alphabets.

Consider now a channel d in the alphabet of P but not of Q. This channel

cannot be mentioned in the speciﬁcation of Q, so the values communicated on

it are constrained only by the speciﬁcation of P. Similarly, it is Q that determ-

ines the properties of the communications on its own channels. Consequently

a speciﬁcation of the behaviour of (P || Q) can be simply formed as the logical

conjunction of the speciﬁcation of P with that of Q. However, this simpliﬁca-

tion is valid only when the speciﬁcations of P and Q are expressed wholly in

terms of the channel names, which is not always possible, as shown by 4.2.2

X6.

Example

X1 Let

P = (left?x →mid!(x ×x) →P)

Q = (mid?y →right!(173 ×y) →Q)

Clearly

P sat (mid ≤^1 square^*(left))

Q sat (right ≤^1 173 × mid)

where (173 × mid) multiplies each message of mid by 173. It follows that

(P || Q) sat (right ≤^1 173 × mid) ∧ (mid ≤^1 square^*(left))


The speciﬁcation here implies

right ≤ 173 × square^*(left)

which was presumably the original intention.
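As a rough illustration only, the network of X1 can be mimicked with Python generators, each generator standing for one process and the iterator passed between them standing for the channel mid. The helper is_prefix is a name introduced here to model the ≤ relation on sequences; the sketch ignores concurrency and the one-message slack expressed by ≤^1.

```python
from typing import Iterable, Iterator, Sequence


def P(left: Iterable[int]) -> Iterator[int]:
    """P = (left?x -> mid!(x * x) -> P): square every input."""
    for x in left:
        yield x * x


def Q(mid: Iterable[int]) -> Iterator[int]:
    """Q = (mid?y -> right!(173 * y) -> Q): multiply every input by 173."""
    for y in mid:
        yield 173 * y


def is_prefix(s: Sequence[int], t: Sequence[int]) -> bool:
    """Models s <= t: the sequence s is an initial segment of t."""
    return list(t[: len(s)]) == list(s)


left = [1, 2, 3, 4]
# Suppose only the first three inputs have passed through the network so far.
right = list(Q(P(left[:3])))
expected = [173 * x * x for x in left]
```

Here right is [173, 692, 1557], which is a prefix of 173 × square^*(left), as the derived specification requires.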

When communicating processes are connected by the concurrency oper-

ator ||, the resulting formulae are highly suggestive of a physical implement-

ation method in which electronic components are connected by wires along

which they communicate. The purpose of such an implementation is to in-

crease the speed with which useful results can be produced.

The technique is particularly eﬀective when the same calculation must be

performed on each member of a stream of input data, and the results must be

output at the same rate as the input, but possibly after an initial delay. Such

systems are called data ﬂow networks.

A picture of a system of communicating processes closely represents their

physical realisation. An output channel of one process is connected to a

like-named input channel of the other process, but channels in the alphabet of

only one process are left free. Thus the example X1 can be drawn, as shown in

Figure 4.2.

[Figure 4.2: P and Q connected in series; left is input to P, mid runs from P to Q, and right is output from Q]

Examples

X2 Two streams of numbers are to be input from left1 and left2. For each

x read from left1 and each y from left2, the number (a × x + b × y) is to be

output on the right. The speed requirement dictates that the multiplications

must proceed concurrently. We therefore deﬁne two processes, and compose

them

X21 = (left1?x →mid!(a ×x) →X21)

X22 = (left2?y →mid?z →right!(z +b ×y) →X22)

X2 = (X21 || X22)

Clearly,

X2 sat (mid ≤^1 a × left1 ∧ right ≤^1 mid + b × left2)

⇒ (right ≤ a × left1 + b × left2)


X3 A stream of numbers is to be input on the left, and on the right is output

a weighted sum of consecutive pairs of input numbers, with weights a and b.

More precisely, we require that

right ≤ a × left + b × left′

The solution can be constructed by adding a new process X23 to the solution

of X2

X3 = (X2 || X23)

where

X23 sat (left1 ≤^1 left ∧ left2 ≤^1 left′)

X23 can be deﬁned

X23 = (left?x →left1!x →(µ X • left?x →left2!x →left1!x →X))

It copies from left to both left1 and left2, but omits the ﬁrst element in the

case of left2.

A picture of the network of X3 is shown in Figure 4.3.

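[Figure 4.3: X23 inputs from left and outputs on left1 and left2; X21 inputs from left1 and outputs on mid; X22 inputs from left2 and mid and outputs on right]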
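The specification right ≤ a × left + b × left′ of X3 can be checked on small inputs with a sequential sketch such as the following. It computes the required output stream directly rather than modelling the three processes, and the function name weighted_pairs is chosen here purely for illustration.

```python
from typing import Iterable, Iterator


def weighted_pairs(left: Iterable[int], a: int, b: int) -> Iterator[int]:
    """Yield a * left[i] + b * left[i + 1] for each consecutive pair.

    left' (the tail of left) supplies the second element of each pair,
    so no output is possible until two inputs have arrived.
    """
    it = iter(left)
    try:
        prev = next(it)
    except StopIteration:
        return
    for cur in it:
        yield a * prev + b * cur
        prev = cur


result = list(weighted_pairs([1, 2, 3, 4], a=10, b=1))
```

With weights a = 10 and b = 1 the inputs 1, 2, 3, 4 give 12, 23, 34: one output fewer than the number of inputs, because the first element of left is never paired with a predecessor.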

When two concurrent processes communicate with each other by output

and input only on a single channel, they cannot deadlock (compare 2.7 L2). As

a result, any network of nonstopping processes which is free of cycles cannot

deadlock, since an acyclic graph can be decomposed into subgraphs connected

only by a single arrow.

However, the network of X3 contains an undirected cycle, and cyclic net-

works cannot be decomposed into subnetworks except with connections on

two or more channels; so in this case absence of deadlock cannot so easily be

assured. For example, if the two outputs left2!x →left1!x → in the loop of X3

were reversed, deadlock would occur rapidly.

In proving the absence of deadlock it is often possible to ignore the content

of the messages, and regard each communication on channel c as a single event


named c. Communications on unconnected channels can be ignored. Thus X3

can be written in terms of these events

(µ X • left1 →mid →X)

|| (µ Y • left2 →mid →Y)

|| (left1 →(µ Z • left2 →left1 →Z))

= µ X3 • (left1 →left2 →mid →X3)

This proves that X3 cannot deadlock, using algebraic methods as in 2.3 X1.

These examples show how data ﬂow networks can be set up to compute

one or more streams of results from one or more streams of input data. The

shape of the network corresponds closely to the structure of the operands and

operators appearing in the expressions to be computed. When these patterns

are large but regular, it is convenient to use subscripted names for channels,

and to introduce an iterated notation for concurrent combination

||_{i<n} P(i) = (P(0) || P(1) || … || P(n − 1))

A regular network of this kind is known as an iterative array. If the connection

diagram has no directed cycles, the term systolic array is often used, since data

passes through the system much like blood through the chambers of the heart.

Examples

X4 The channels { left_j | j < n } are used to input the coordinates of successive

points in n-dimensional space. Each coordinate set is to be multiplied by a ﬁxed

vector V of length n, and the resulting scalar product is to be output to the

right; or more formally

right ≤ Σ_{j=0}^{n−1} V_j × left_j

It is speciﬁed that in each microsecond the n coordinates of one point are

to be input and one scalar product is to be output. The speed of each indi-

vidual processor is such that it takes nearly one microsecond to do an input,

a multiplication, an addition and an output. It is therefore clear that at least n

processors will be required to operate concurrently. The solution to the prob-

lem should therefore be designed as an iterative array with at least n elements.

Let us replace the Σ in the speciﬁcation by its usual inductive deﬁnition

mid_0 = 0^*

mid_{j+1} = V_j × left_j + mid_j    for j < n

right = mid_n

Thus we have split the speciﬁcation into a conjunction of n + 1 component

equations, each containing at most one multiplication. All that is required is


to write a process for each equation: for j < n, we write

MULT_0 = (µ X • mid_0!0 → X)

MULT_{j+1} = (µ X • left_j?x → mid_j?y → mid_{j+1}!(V_j × x + y) → X)

MULT_{n+1} = (µ X • mid_n?x → right!x → X)

NETWORK = ||_{j<n+2} MULT_j

The connection diagram is shown in Figure 4.4.

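[Figure 4.4: the processes MULT_0, MULT_1, …, MULT_n, MULT_{n+1} connected in a chain by the channels mid_0, …, mid_n; each MULT_{j+1} (for j < n) also inputs from left_j, and MULT_{n+1} outputs on right]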

X5 This is similar to X4, except that m different scalar products of the same

coordinate sets are required almost simultaneously. Effectively, the channel

left_j (for j < n) is to be used to input the jth column of an infinite array; this

is to be multiplied by the (n × m) matrix M, and the ith column of the result

is to be output on right_i, for i < m. In formulae

right_i = Σ_{j<n} M_{ij} × left_j

The coordinates of the result are required as rapidly as before, so at least m×n

processes are required.


The solution might ﬁnd practical application in a graphics display device

which automatically transforms or even rotates a two-dimensional representa-

tion of a three-dimensional object. The shape is deﬁned by a series of points in

absolute space; the iterative array applies linear transformations to compute

the deﬂection on the x and y plates of the cathode ray tube; a third output

coordinate could perhaps control the intensity of the beam.

[Figure 4.5: a rectangular array of processes, with dimensions labelled n and m and channels labelled left and right]

The solution is based on Figure 4.5. Each column of this array (except

the last) is modelled on the solution to X4; but it copies each value input on

its horizontal input channel to its neighbour on its horizontal output channel.

The processes on the right margin merely discard the values they input. It

would be possible to economise by absorbing the functions of these marginal

processors into their neighbours. The details of the solution are left as an

exercise.


X6 The input on channel c is to be interpreted as the successive digits of a

natural number C, starting from the least signiﬁcant digit, and expressed with

number base b. We deﬁne the value of the input number as

C = Σ_{i≥0} c[i] × b^i

where c[i] < b for all i.

Given a ﬁxed multiplier M, the output on channel d is to be the successive

digits of the product M ×C. The digits are to be output after minimal delay.

Let us specify the problem more precisely. The desired output d is

d = Σ_{i≥0} M × c[i] × b^i

The jth element of d must be the jth digit, which can be computed by the

formula

d[j] = ((Σ_{i≥0} M × c[i] × b^i) div b^j) mod b

= (M × c[j] + z_j) mod b

where z_j = (Σ_{i<j} M × c[i] × b^i) div b^j and div denotes integer division.

z_j is the carry term, and can readily be proved to satisfy the inductive

definition

z_0 = 0

z_{j+1} = ((M × c[j] + z_j) div b)

We therefore deﬁne a process MULT1(z), which keeps the carry z as a

parameter

MULT1(z) = c?x → d!(M × x + z) mod b → MULT1((M × x + z) div b)

The initial value of z is zero, so the required solution is

MULT = MULT1(0)
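The behaviour of MULT1 can be followed in a short sequential sketch, in which the carry z is a local variable rather than a process parameter. The helpers to_digits and from_digits are introduced here only to convert between numbers and least-significant-first digit streams; the input is padded with leading zeros so that the final carry is flushed.

```python
from typing import Iterable, Iterator, List


def mult1(digits: Iterable[int], M: int, b: int = 10) -> Iterator[int]:
    """Multiply a least-significant-first digit stream by the fixed multiplier M."""
    z = 0  # the carry, initially zero as in MULT = MULT1(0)
    for x in digits:
        t = M * x + z
        yield t % b   # d!((M * x + z) mod b)
        z = t // b    # carry passed to the next step


def to_digits(n: int, b: int, width: int) -> List[int]:
    """Digits of n in base b, least significant first, padded to width."""
    return [(n // b**i) % b for i in range(width)]


def from_digits(ds: Iterable[int], b: int) -> int:
    """Value of a least-significant-first digit sequence in base b."""
    return sum(d * b**i for i, d in enumerate(ds))


product = from_digits(mult1(to_digits(153091, 10, 8), M=7), 10)
```

Each output digit depends only on the digits input so far, which is why the process can output with minimal delay.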

X7 The problem is the same as X6, except M is a multi-digit number

M = Σ_{i<n} M_i × b^i

A single processor can multiply only single-digit numbers. However, output

is to be produced at a rate which allows only one multiplication per digit.

Consequently, at least n processors are required. We will get each NODE_i to

look after one digit M_i of the multiplier.


The basis of a solution is the traditional manual algorithm for multi-digit

multiplication, except that the partial sums are added immediately to the next

row of the table

. . . 153091   C          the incoming number
         253   M          the multiplier
. . . 306182   M_2 × C    computed by NODE_2
. . . 765455   M_1 × C    computed by NODE_1
. . . 827275   25 × C     computed by NODE_1
. . . 459273   M_0 × C    computed by NODE_0
. . . 732023   M × C      computed by NODE_0

[Figure 4.6: the nodes NODE_{n−1}, …, NODE_0 arranged in a row from left to right, linked by the channels c_0, c_1, …, c_{n−1} and d_0, d_1, …, d_{n−1}]

The nodes are connected as shown in Figure 4.6. The original input comes

in on c_0 and is propagated leftward on the c channels. The partial answers are

propagated rightward on the d channels, and the desired answer is output on

d_0. Fortunately each node can give one digit of its result before communicating

with its left neighbour. Furthermore, the leftmost node can be defined to

behave like the answer to X6

NODE_{n−1}(z) = c_{n−1}?x → d_{n−1}!(M_{n−1} × x + z) mod b →

NODE_{n−1}((M_{n−1} × x + z) div b)

The remaining nodes are similar, except that each of them passes the input

digit to its left neighbour, and adds the result from its left neighbour to its

own carry. For k < n −1

NODE_k(z) = c_k?x → d_k!(M_k × x + z) mod b →

c_{k+1}!x → d_{k+1}?y →

NODE_k(y + (M_k × x + z) div b)

The whole network is deﬁned

||_{i<n} NODE_i(0)
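The cooperation of the nodes can be traced in a sequential sketch in which each node is a recursive call and its carry is held in a list. This is an assumption-laden model rather than a concurrent implementation: the name multiply_stream and the list z of carries are introduced here, and the recursion stands in for the exchange on c_{k+1} and d_{k+1}.

```python
from typing import Iterable, Iterator, List, Sequence


def multiply_stream(digits: Iterable[int], M_digits: Sequence[int],
                    b: int = 10) -> Iterator[int]:
    """Multiply a least-significant-first digit stream by M = sum(M_digits[i] * b**i)."""
    n = len(M_digits)
    z: List[int] = [0] * n  # z[k] is the carry kept by NODE_k, initially 0

    def node(k: int, x: int) -> int:
        t = M_digits[k] * x + z[k]
        out = t % b                  # digit output on d_k
        if k < n - 1:
            y = node(k + 1, x)       # pass x on c_{k+1}, receive y on d_{k+1}
            z[k] = y + t // b
        else:
            z[k] = t // b            # the leftmost node behaves like MULT1
        return out

    for x in digits:
        yield node(0, x)


# C = 153091 and M = 253, both least significant digit first; C is zero-padded.
c_digits = [1, 9, 0, 3, 5, 1, 0, 0, 0, 0]
d_digits = list(multiply_stream(c_digits, [3, 5, 2]))
value = sum(d * 10**i for i, d in enumerate(d_digits))
```

The resulting digits spell out 38732023, whose last six digits 732023 agree with the bottom row of the table above.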


X7 is a simple example from a class of ingenious network algorithms, in

which there is an essential cycle in the directed graph of communication chan-

nels. But the statement of the problem has been much simpliﬁed by the as-

sumption that the multiplier is known in advance and ﬁxed for all time. In a

practical application, it is much more likely that such parameters would have

to be input along the same channel as the subsequent data, and would have

to be reinput whenever it is required to change them. The implementation of

this requires great care, but little ingenuity.

A simple implementation method is to introduce a special symbol, say

reload, to indicate that the next number or numbers are to be treated as a

change of parameter; and if the number of parameters is variable, an endreload

symbol may also be introduced.

Example

X8 Same as X4, except that the parameters V_j are to be reloaded by the number

immediately following a reload symbol. The definition of MULT_{j+1} needs to be

changed to include the multiplier as parameter

MULT_{j+1}(v) = left_j?x →

if x = reload then (left_j?y → MULT_{j+1}(y))

else (mid_j?y → mid_{j+1}!(v × x + y) → MULT_{j+1}(v))

4.4 Pipes

In this section we shall conﬁne attention to processes with only two channels

in their alphabet, namely an input channel left and an output channel right.

Such processes are called pipes, and they may be pictured as in Figure 4.7.

[Figure 4.7: two separate pipes, P and Q, each drawn with an input channel left and an output channel right]

The processes P and Q may be joined together so that the right channel of

P is connected to the left channel of Q, and the sequence of messages output

by P and input by Q on this internal channel is concealed from their common


environment. The result of the connection is denoted

P>>Q

and may be pictured as the series shown in Figure 4.8.

[Figure 4.8: P and Q connected in series; left enters P, an unnamed channel joins P to Q, and right leaves Q]

This connection diagram represents the concealment of the connecting

channel by not giving it a name. It also shows that all messages input on the

left channel of (P>>Q) are input by P, and all messages output on the right

channel of (P>>Q) are output by Q. Finally (P>>Q) is itself a pipe, and may

again be placed in series with other pipes

(P>>Q)>>R, (P>>Q)>>(R>>S), etc.

By 4.4.1 L1 >> is associative, so in future we shall omit brackets in such a series.

The validity of chaining processes by >> depends on the obvious alphabet

constraints

α(P>>Q) = αleft(P) ∪ αright(Q)

and a further constraint states that the connected channels are capable of

transmitting the same kind of message

αright(P) = αleft(Q)

Examples

X1 A pipe which outputs each input value multiplied by four (4.2 X2)

QUADRUPLE = DOUBLE>>DOUBLE

X2 A process which inputs cards of eighty characters and outputs their text,

tightly packed into lines of 125 characters each (4.2 X3, X4)

UNPACK>>PACK

This process is quite diﬃcult to write using conventional structured program-

ming techniques, because it is not clear whether the major loop should iterate

once per input card, or once per output line. The problem is known by Michael

Jackson as structure clash. The solution given above contains a separate loop

in each of the two processes, which nicely matches the structure of the original

problem.
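The way two separate loops dissolve the structure clash can be seen in a sketch using Python generators, where composing the generators plays the role of >>. The definitions of unpack and pack below are illustrative stand-ins for UNPACK and PACK, not transcriptions of them; in particular, an incomplete final line is simply never output, and the widths are parameters so that small examples can be tried.

```python
from typing import Iterable, Iterator


def unpack(cards: Iterable[str]) -> Iterator[str]:
    """Loop once per input card, emitting its characters one at a time."""
    for card in cards:
        yield from card


def pack(chars: Iterable[str], width: int = 125) -> Iterator[str]:
    """Loop once per output line, collecting exactly width characters."""
    line = []
    for ch in chars:
        line.append(ch)
        if len(line) == width:
            yield "".join(line)
            line = []
    # Characters left in line stay buffered: no partial line is emitted.


# Cards of four characters repacked into lines of five.
lines = list(pack(unpack(["abcd", "efgh", "ijkl"]), width=5))
```

Neither loop needs to know the other's record length, so changing the card size or the line size affects only one of the two functions.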


X3 Same as X2, except that each pair of consecutive asterisks is replaced by

“↑” (4.2 X5)

UNPACK>>SQUASH>>PACK

In a conventional sequential program, this minor change in speciﬁcation could

cause severe problems. It is nice to avoid such problems by the simple ex-

pedient of inserting an additional process. This kind of modularity has been

introduced and exploited by the designers of operating systems.

X4 Same as X2, except that the reading of cards may continue when the printer

is held up, and the printing can continue when the card reader is held up (4.2

X9)

UNPACK>>BUFFER>>PACK

The buﬀer holds characters which have been produced by the UNPACK pro-

cess, but not yet consumed by the PACK process. They will be available for

input by the PACK process during times when the UNPACK process is tempor-

arily delayed.

The buﬀer thus smoothes out temporary variations in the rate of produc-

tion and consumption. However it can never solve the problem of long-term

mismatch between the rates of production and consumption. If the card reader

is on average slower than the printer, the buﬀer will be nearly always empty,

and no smoothing eﬀect will be achieved. If the reader is faster, the buﬀer will

expand indeﬁnitely, until it consumes all available storage space.

X5 In order to avoid undesirable expansion of buﬀers, it is usual to limit the

number of messages buﬀered. Even the single buﬀer provided by the COPY

process (4.2 X1) may be adequate. Here is a version of X4 which reads one card

ahead on input and buﬀers one line on output

COPY>>UNPACK>>PACK>>COPY

Note the alphabets of the two instances of COPY are diﬀerent, a fact which

should be understood from the context in which they are placed.

X6 A double buffer, which accepts up to two messages before requiring output

of the ﬁrst

COPY>>COPY

Its behaviour is similar to that of CHAIN2 (2.6 X4) and VMS2 (1.1.3 X6).

4.4.1 Laws

The most useful algebraic property of chaining is associativity

L1 P>>(Q>>R) = (P>>Q)>>R


The remaining laws show how input and output can be implemented in a

pipe; they enable process descriptions to be simpliﬁed by a form of symbolic

execution. For example, if the process on the left of >> starts with output of a

message v to the right, and the process on the right of >> starts with input from

the left, the message v is transmitted from the former process to the latter;

however the actual communication is concealed, as shown in the following law

L2 (right!v → P)>>(left?y → Q(y)) = P>>Q(v)

If one of the processes is determined to communicate with the other, but the

other is prepared to communicate externally, it is the external communica-

tion that takes place ﬁrst, and the internal communication is saved up for a

subsequent occasion

L3 (right!v → P)>>(right!w → Q) =

right!w → ((right!v → P)>>Q)

L4 (left?x →P(x))>>(left?y →Q(y)) =

left?x →(P(x)>>(left?y →Q(y)))

If both processes are prepared for external communication, then either may

happen ﬁrst

L5 (left?x → P(x))>>(right!w → Q) =

(left?x → (P(x)>>(right!w → Q))

| right!w → ((left?x → P(x))>>Q))

The law L5 is equally valid when the operator >> is replaced by >>R>>, since

pipes in the middle of a chain cannot communicate directly with the environ-

ment

L6 (left?x → P(x))>>R>>(right!w → Q) =

(left?x → (P(x)>>R>>(right!w → Q))

| right!w → ((left?x → P(x))>>R>>Q))

Similar generalisations may be made to the other laws

L7 If R is a chain of processes all starting with output to the right,

R>>(right!w →Q) = right!w →(R>>Q)

L8 If R is a chain of processes all starting with input from the left,

(left?x →P(x))>>R = left?x →(P(x)>>R)


Examples

X1 Let us deﬁne

R(y) = (right!y →COPY)>>COPY

So

R(y)

= (right!y →COPY)>>(left?x →right!x →COPY) [def COPY]

= COPY>>(right!y →COPY) [L2]

X2 COPY>>COPY

= (left?x →right!x →COPY)>>COPY [def COPY]

= left?x →((right!x →COPY)>>COPY) [L4]

= left?x →R(x) [def R(x)]

X3 From the last line of X1 we deduce

R(y)

= (left?x → right!x → COPY)>>(right!y → COPY)

= (left?x → ((right!x → COPY)>>(right!y → COPY))

| right!y → (COPY>>COPY)) [L5]

= (left?x → right!y → R(x)

| right!y → left?x → R(x)) [L3, X2]

This shows that a double buﬀer, after input of its ﬁrst message, is prepared

either to output that message or to input a second message before doing so.

The reasoning of the above proofs is very similar to that of 2.3.1 X1.

4.4.2 Implementation

In the implementation of (P>>Q) three cases are distinguished

1. If communication can take place on the internal connecting channel, it

does so immediately, without consideration of the external environment.

If an inﬁnite sequence of such communications is possible, the process

diverges (Section 3.5.2).

2. Otherwise, if the environment is interested in communication on the left

channel, this is dealt with by P.


3. Or if the environment is interested in the right channel, this is dealt with

by Q.

For an explanation of the input and output operations, see Section 4.2.1.

chain(P, Q) =
  if P("right) ≠ "BLEEP and Q("left) ≠ "BLEEP then
    chain(cdr(P("right)), Q("left)(car(P("right)))) [Case 1]
  else
    λx •
      if x = "right then
        if Q("right) = "BLEEP then
          "BLEEP
        else
          cons(car(Q("right)), chain(P, cdr(Q("right)))) [Case 3]
      else
        if x = "left then
          if P(x) = "BLEEP then
            "BLEEP
          else
            λy • chain(P("left)(y), Q) [Case 2]
        else
          "BLEEP

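The same three cases can be tried out in Python. The sketch below is one possible rendering under stated assumptions: a process is modelled as a function from a channel name to either BLEEP, an input function (for "left"), or a pair of output value and successor process (for "right"); the helper names output and copy are introduced here and are not part of the definition above.

```python
BLEEP = "BLEEP"


def output(value, then):
    """right!value -> then(): offers one output, refuses everything else."""
    return lambda ch: (value, then()) if ch == "right" else BLEEP


def copy():
    """COPY = left?x -> right!x -> COPY."""
    return lambda ch: (lambda x: output(x, copy)) if ch == "left" else BLEEP


def chain(p, q):
    """P >> Q, following the three cases distinguished in the text."""
    p_out, q_in = p("right"), q("left")
    if p_out is not BLEEP and q_in is not BLEEP:
        value, p_next = p_out
        return chain(p_next, q_in(value))             # Case 1: internal
    def proc(ch):
        if ch == "left":
            p_in = p("left")
            if p_in is BLEEP:
                return BLEEP
            return lambda y: chain(p_in(y), q)        # Case 2: left, by P
        if ch == "right":
            q_out = q("right")
            if q_out is BLEEP:
                return BLEEP
            value, q_next = q_out
            return (value, chain(p, q_next))          # Case 3: right, by Q
        return BLEEP
    return proc


double_buffer = chain(copy(), copy())
after_two = double_buffer("left")(1)("left")(2)
first, rest = after_two("right")
```

After two inputs the double buffer refuses a third, and it then outputs the stored messages in the order in which they arrived. As in the definition above, a pair of operands that can communicate internally for ever would make chain recurse without end, which corresponds to divergence.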
4.4.3 Livelock

The chaining operator connects two processes by just one channel; and so it

introduces no risk of deadlock. If both P and Q are nonstopping, then (P>>Q)

will not stop either. Unfortunately, there is a new danger that the processes

P and Q will spend the whole time communicating with each other, so that

(P>>Q) never again communicates with the external world. This case of diver-

gences (Sections 3.5.1, 3.8) is illustrated by the trivial example

P = (right!1 →P)

Q = (left?x →Q)

(P>>Q) is obviously a useless process; it is even worse than STOP, in that

like an endless loop it may consume unbounded computing resources without

achieving anything. A less trivial example is (P>>Q), where

P = (right!1 → P | left?x → P1(x))

Q = (left?x → Q | right!1 → Q1)


In this example, divergence derives from the mere possibility of infinite internal

communication; it exists even though the choice of external communication on

the left and on the right is oﬀered on every possible occasion, and even though

after such an external communication the subsequent behaviour of (P>>Q)

would not diverge.

A simple method to prove (P>>Q) is free of livelock is to show that P is

left-guarded in the sense that it can never output an inﬁnite series of messages

to the right without interspersing inputs from the left. To ensure this, we must

prove that the length of the sequence output to the right is at all times bounded

above by some well-deﬁned function f of the sequence of values input from

the left; or more formally, we deﬁne

P is left-guarded ≡ ∃f • P sat (#right ≤ f (left))

Left-guardedness is often obvious from the text of P.

L1 If every recursion used in the deﬁnition of P is guarded by an input from

the left, then P is left-guarded.

L2 If P is left-guarded then (P>>Q) is free of livelock.

Exactly the same reasoning applies to right-guardedness of the second operand

of >>

L3 If Q is right-guarded then (P>>Q) is free of livelock.

Examples

X1 The following are left-guarded by L1 (4.2 X1, X2, X5, X9)

COPY, DOUBLE, SQUASH, BUFFER

X2 The following are left-guarded in accordance with the original definition,

because

UNPACK sat #right ≤ #(⌢/ left)

PACK sat #right ≤ #left

X3 BUFFER is not right-guarded, since it can input arbitrarily many messages

from the left without ever outputting to the right.

4.4.4 Speciﬁcations

A speciﬁcation of a pipe can often be expressed as a relation S(left, right)

between the sequences of messages input on the left channel and the sequence

of messages output on the right. When two pipes are connected in series, the


sequence right produced by the left operand is equated with the sequence left

consumed by the right operand; and this common sequence is then concealed.

All that is known of the concealed sequence is that it exists. But we also need

to avert the risk of livelock. Thus we explain the rule

L1 If P sat S(left, right)

and Q sat T(left, right)

and P is left-guarded or Q is right-guarded

then P>>Q sat ∃s • S(left, s) ∧ T(s, right)

This states that the relation between left and right which is maintained by

(P>>Q) is the normal relational composition of the relation for P with the

relation for Q. Since the >> operator cannot introduce deadlock in pipes, we

can aﬀord to omit reasoning about refusals.

Examples

X1 DOUBLE sat right ≤^1 double^*(left)

DOUBLE is left-guarded and right-guarded, so

(DOUBLE>>DOUBLE)

sat ∃s • (s ≤^1 double^*(left) ∧ right ≤^1 double^*(s))

≡ right ≤^2 double^*(double^*(left))

≡ right ≤^2 quadruple^*(left)

X2 Let us use recursion together with >> to give an alternative definition of a

buﬀer

BUFF = µ X • (left?x →(X>>(right!x →COPY)))

We wish to prove that

BUFF sat (right ≤ left)

Assume that

X sat #left ≥ n ∨ right ≤ left

We know that

COPY sat right ≤ left

Therefore

(right!x →COPY)


sat (right = left = ⟨⟩ ∨ (right ≤ ⟨x⟩ ∧ right′ ≤ left))

⇒ right ≤ ⟨x⟩⌢left

Since the right operand is right-guarded, by L1 and the assumption

(X>>(right!x →COPY))

sat (∃s • (#left ≥ n ∨ s ≤ left) ∧ right ≤ ⟨x⟩⌢s)

⇒ (#left ≥ n ∨ right ≤ ⟨x⟩⌢left)

Therefore

left?x →(. . .)

sat right = left = ⟨⟩ ∨

(left > ⟨⟩ ∧ (#left′ ≥ n ∨ right ≤ ⟨left_0⟩⌢left′))

⇒ #left ≥ n + 1 ∨ right ≤ left

The desired conclusion follows by the proof rule for recursive processes (3.7.1

L8). The simpler law (1.10.2 L6) cannot be used, because the recursion is not

obviously guarded.

4.4.5 Buﬀers and protocols

A buﬀer is a process which outputs on the right exactly the same sequence

of messages as it has input from the left, though possibly after some delay;

furthermore, when non-empty, it is always ready to output on the right. More

formally, we deﬁne a buﬀer to be a process P which never stops, which is free

of livelock, and which meets the speciﬁcation

P sat (right ≤ left) ∧ (if right = left then left ∉ ref else right ∉ ref )

Here c ∉ ref means that the process cannot refuse to communicate on channel

c (Sections 3.7, 3.4). It follows that all buﬀers are left-guarded.

Example

X1 The following processes are buﬀers

COPY, (COPY>>COPY), BUFF, BUFFER

Buffers are clearly useful for storing information which is waiting to be

processed. But they are even more useful as speciﬁcations of the desired be-

haviour of a communications protocol, which is intended to deliver messages

in the same order in which they have been submitted. Such a protocol consists

of two processes, a transmitter T and a receiver R, which are connected in

series (T>>R). If the protocol is correct, clearly (T>>R) must be a buﬀer.


In practice, the wire that connects the transmitter to the receiver is quite

long, and the messages which are sent along it are subject to corruption or

loss. Thus the behaviour of the wire itself can be modelled by a process WIRE,

which may behave not quite like a buﬀer. It is the task of the protocol designer

to ensure that in spite of the bad behaviour of the wire, the system as a whole

acts as a buffer; i.e.,

(T>>WIRE>>R) is a buﬀer

A protocol is usually built in a number of layers, (T_1, R_1), (T_2, R_2), …,

(T_n, R_n), each one using the previous layer as its communication medium

T_n>>…>>(T_2>>(T_1>>WIRE>>R_1)>>R_2)>>…>>R_n

Of course, when the protocol is implemented in practice, all the transmitters

are collected into a single transmitter at one end and all the receivers at the

other, in accordance with the changed bracketing

(T_n>>…>>T_2>>T_1)>>WIRE>>(R_1>>R_2>>…>>R_n)

The law of associativity of >> guarantees that this regrouping does not change

the behaviour of the system.

In practice, protocols must be more complicated than this, since single-

directional flow of messages is not adequate to achieve reliable communication

on an unreliable wire: it is necessary to add channels in the reverse direction,

to enable the receiver to send back acknowledgement signals for successfully

transmitted messages, so that unacknowledged messages can be retransmit-

ted.

The following laws are useful in proving the correctness of protocols. They

are due to A. W. Roscoe.

L1 If P and Q are buﬀers, so are (P>>Q) and (left?x →(P>>(right!x →Q)))

L2 If T>>R = (left?x →(T>>(right!x →R))) then (T>>R) is a buﬀer.

The following is a generalisation of L2

L3 If for some function f and for all z

(T(z)>>R(z)) = (left?x →(T(f (x, z))>>(right!x →R(f (x, z)))))

then T(z)>>R(z) is a buﬀer for all z.

Examples

X2 The following are buﬀers by L1

COPY>>COPY, BUFFER>>COPY, COPY>>BUFFER, BUFFER>>BUFFER


X3 It has been shown in 4.4.1 X1 and X2 that

(COPY>>COPY) = (left?x → (COPY>>(right!x → COPY)))

By L2 it is therefore a buﬀer.

X4 (Phase encoding) A phase encoder is a process T which inputs a stream

of bits, and outputs ⟨0, 1⟩ for each 0 input and ⟨1, 0⟩ for each 1 input. A

decoder R reverses this translation

T = left?x →right!x →right!(1 −x) →T

R = left?x →left?y →if y = x then FAIL else (right!x →R)

where the process FAIL is left undeﬁned.

We wish to prove by L2 that (T>>R) is a buﬀer

(T>>R)

= left?x →((right!x →right!(1 −x) →T)>>

(left?x →left?y →

if y = x then FAIL else (right!x →R)))

= left?x →(T>> if (1 −x) = x then FAIL else (right!x →R))

= left?x →(T>>(right!x →R))

Therefore (T>>R) is a buﬀer, by L2.
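The claim that the decoder undoes the encoder can be exercised with two Python generators standing for T and R. Since FAIL is left undefined in the text, raising ValueError here is purely an assumption made for the sketch; the names encode and decode are likewise chosen for illustration.

```python
from typing import Iterable, Iterator


def encode(bits: Iterable[int]) -> Iterator[int]:
    """T = left?x -> right!x -> right!(1 - x) -> T."""
    for x in bits:
        yield x
        yield 1 - x


def decode(bits: Iterable[int]) -> Iterator[int]:
    """R = left?x -> left?y -> if y = x then FAIL else (right!x -> R)."""
    it = iter(bits)
    for x in it:
        y = next(it, None)
        if y is None:
            return  # second half of the pair has not arrived yet
        if y == x:
            raise ValueError("phase error: FAIL")  # assumed behaviour of FAIL
        yield x


message = [0, 1, 1, 0]
on_the_wire = list(encode(message))
received = list(decode(on_the_wire))
```

Because the encoder never sends two equal bits as a pair, the failure branch is unreachable when the decoder is fed directly by the encoder, which mirrors the step in the proof where (1 − x) = x is discharged.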

X5 (Bit stuﬃng) The transmitter T faithfully reproduces the input bits from

left to right, except that after three consecutive 1-bits which have been output,

it inserts a single extra 0. Thus the input 01011110 is output as 010111010.

The receiver R removes these extra zeroes. Thus (T>>R) must be proved to be

a buﬀer. The construction of T and R, and the proof of their correctness, are

left as an exercise.
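Without giving the CSP construction or the proof, which remain the exercise, the intended input and output behaviour of the two processes can be sketched as follows. The sketch assumes that the count of consecutive 1-bits restarts after each inserted 0, which is consistent with the example in the text; the function names stuff and unstuff are introduced here.

```python
from typing import Iterable, Iterator


def stuff(bits: Iterable[int]) -> Iterator[int]:
    """Copy bits, inserting an extra 0 after every three consecutive 1s output."""
    run = 0
    for bit in bits:
        yield bit
        run = run + 1 if bit == 1 else 0
        if run == 3:
            yield 0   # the stuffed bit
            run = 0


def unstuff(bits: Iterable[int]) -> Iterator[int]:
    """Copy bits, discarding the bit that follows three consecutive 1s."""
    run = 0
    skip = False
    for bit in bits:
        if skip:
            skip = False  # this is the stuffed 0: drop it
            run = 0
            continue
        yield bit
        run = run + 1 if bit == 1 else 0
        if run == 3:
            skip = True


original = [int(c) for c in "01011110"]
sent = list(stuff(original))
recovered = list(unstuff(sent))
```

For the input 01011110 the transmitted sequence is 010111010, as stated above, and the receiver recovers the original bits.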

X6 (Line sharing) It is desired to copy data from a channel left1 to right1 and

from left2 to right2. This can most easily be achieved by two disjoint protocols,

each using a diﬀerent wire. Unfortunately, only a single wire mid is available,

and this must be used for both streams of data, as shown by Figure 4.9.

[Figure 4.9: T inputs from left1 and left2 and outputs on mid; R inputs from mid and outputs on right1 and right2]

Messages input by T must be tagged before transmission along mid, and


R must untag them and output them on the corresponding right channel

T = (left1?x → mid!tag1(x) → T

| left2?y → mid!tag2(y) → T)

R = mid?z →

if tag(z) = 1 then

(right1!untag(z) →R)

else

(right2!untag(z) →R)

This solution is quite unsatisfactory. If two messages are input on left1, but

the recipient is not yet ready for them, the whole system will have to wait, and

transmission between left2 and right2 may be seriously delayed. To insert

buﬀers on the channels will only postpone the problem for a short while. The

correct solution is to introduce another channel in the reverse direction, and

for R to send signals back to T to stop sending messages on the stream for

which there seems to be little demand. This is known as ﬂow control.

4.5 Subordination

Let P and Q be processes with

αP ⊆ αQ

In the combination (P || Q), each action of P can occur only when Q permits

it to occur; whereas Q can engage independently in the actions of (αQ −αP),

without the permission and without the knowledge of its partner P. Thus

P serves Q as a slave or subordinate process, while Q acts as a master or

main process. When communications between a subordinate process and a

main process are to be concealed from their common environment, we use the

asymmetric notation

P // Q

Using the concealment operator, this can be deﬁned

P // Q = (P || Q) \ αP

This notation is used only when αP ⊆ αQ; and then

α(P // Q) = (αQ −αP)

It is usually convenient to give the subordinate process a name, say m,

which is used in the main process for all interactions with its subordinate. The

process naming technique described in Section 2.6.2 can be readily extended

to communicating processes, by introducing compound channel names. These

take the form m.c, where m is a process name and c is the name of one of its

channels. Each communication on this channel is a triple

m.c.v

where αm.c(m : P) = αc(P) and v ∈ αc(P).

In the construction (m : P // Q), Q communicates with P along channels

with compound names of the form m.c and m.d; whereas P uses the corres-

ponding simple channels c and d for the same communications. Thus for

example

(m : (c!v →P) // (m.c?x →Q(x))) = (m : P // Q(v))

Since all these communications are concealed from the environment, the name

m can never be detected from the outside; it therefore serves as a local name

for the subordinate process.

Subordination can be nested, for example

(n : (m : P // Q) // R)

In this case, all occurrences of events involving the name m are concealed

before the name n is attached to the remaining events, all of which are in the

alphabet of Q, and not of P. There is no way that R can communicate directly

with P, or even know of the existence of P or its name m.

Examples

X1 (for DOUBLE see 4.2 X2)

doub : DOUBLE // Q

The subordinate process acts as a simple subroutine called from within the

main process Q. Inside Q, the value of 2 ×e may be obtained by a successive

output of the argument e on the left channel of doub, and input of the result

on the right channel

doub.left!e →(doub.right?x →. . . )

X2 One subroutine may use another as a subroutine, and do so several times

QUADRUPLE =

(doub : DOUBLE // (µ X • left?x →doub.left!x →

doub.right?y →doub.left!y →

doub.right?z →right!z →X))

This is designed itself to be used as a subroutine

quad : QUADRUPLE // Q

This version of QUADRUPLE is similar to that of 4.4 X1, but does not have the

same double-buﬀering eﬀect.
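The subroutine reading of X1 and X2 can be imitated with Python generators. This is a loose sketch and not the process semantics: one call of send stands for an output on the left channel followed by an input on the right channel, and the helper start is a hypothetical name introduced here.

```python
def start(proc):
    """Run a generator-based process up to its first input."""
    next(proc)
    return proc


def double():
    """DOUBLE: repeatedly input x on left and output 2 * x on right."""
    result = None
    while True:
        x = yield result          # left?x (also delivers the previous right! value)
        result = 2 * x


def quadruple():
    """QUADRUPLE: uses a private subordinate doub twice for each input."""
    doub = start(double())        # doub : DOUBLE // ...
    result = None
    while True:
        x = yield result          # left?x
        y = doub.send(x)          # doub.left!x then doub.right?y
        z = doub.send(y)          # doub.left!y then doub.right?z
        result = z                # delivered as right!z on the next yield
```

Because doub is a local variable of quadruple, nothing outside can reach it, which mirrors the concealment of the name doub.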

X3 A conventional program variable named m may be modelled as a subor-

dinate process

m : VAR // Q

Inside the main process Q, the value of m can be assigned, read, and updated

by input and output, as described in 2.6.2 X2

m := 3; P is implemented by (m.left!3 →P)

x := m; P is implemented by (m.right?x →P)

m := m+3; P is implemented by (m.right?y →m.left!(y +3) →P)

A subordinate process may be used to implement a data structure with a

more elaborate behaviour than just a simple variable.

X4 (see 4.2 X9)

(q : BUFFER // Q)

The subordinate process serves as an unbounded queue named q. Within Q,

the output q.left!v adds v to one end of the queue, and q.right?y removes an

element from the other end, and gives its value to y. If the queue is empty, the

queue will not respond, and the system may deadlock.

X5 (see 4.2 X10) A stack with name st is declared

st : STACK // Q

Inside the main process Q, st.left!v can be used to push the value v onto the

stack, and st.right?x will pop the top value. To deal with the possibility that

the stack is empty, a choice construction can be used

(st.right?x →Q1(x) | st.empty →Q2)

If the stack is non-empty, the ﬁrst alternative is selected; if empty, deadlock is

avoided and the second alternative is selected.

A subordinate process with several channels may be used by several con-

current processes, provided that they do not use the same channel.

X6 A process Q is intended to communicate a stream of values to R; these

values are to be buﬀered by a subordinate buﬀer process named b, so that

output from Q will not be delayed when R is not ready for input. Q uses

channel b.left for its output and R uses b.right for its input

(b : BUFFER // (Q || R))

Note that if R attempts to input from an empty buﬀer, the system will not

necessarily deadlock; R will simply be delayed until Q next outputs a value to

the buﬀer. (If Q and R communicate with the buﬀer on the same channel, then

that channel must be in the alphabet of both of them; and the deﬁnition of ||

would require them always to communicate simultaneously the same value—

which would be quite wrong.)

The subordination operator may be used to deﬁne subroutines by recur-

sion. Each level of recursion (except the last) declares a new local subroutine

to deal with the recursive call(s).

X7 (Factorial)

FAC = µ X • left?n →

(if n = 0 then

(right!1 →X)

else

(f : X // (f .left!(n −1) →

f .right?y →right!(n ×y) →X)))

The subroutine FAC uses channels left and right to communicate paramet-

ers and results to its calling process; and it uses channels f .left and f .right to

communicate with its subordinate process named f . In these respects it is

similar to the QUADRUPLE subroutine (X2). The only diﬀerence is that the

subordinate process is isomorphic to FAC itself.
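The recursive declaration of a fresh subordinate at each level can be imitated with Python generators. This is only an analogy under stated assumptions: send stands for one output on left followed by one input on right, and start is a hypothetical helper.

```python
def start(proc):
    """Run a generator-based process up to its first input."""
    next(proc)
    return proc


def fac():
    """FAC: input n on left, output n! on right, repeatedly."""
    result = None
    while True:
        n = yield result              # left?n
        if n == 0:
            result = 1                # right!1
        else:
            f = start(fac())          # f : X // ... a new local copy of FAC
            y = f.send(n - 1)         # f.left!(n - 1) then f.right?y
            result = n * y            # right!(n * y)
```

Each non-zero call creates its own subordinate f, which is discarded once its answer has been read, so the chain of subordinates is as deep as the argument.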

This is a boringly familiar example of recursion, expressed in an unfamil-

iar but rather cumbersome notational framework. A less familiar idea is that

of using recursion together with subordination to implement an unbounded

data structure. Each level of the recursion stores a single component of the

structure, and declares a new local subordinate data structure to deal with the

rest.

X8 (Unbounded ﬁnite set) A process which implements a set inputs its mem-

bers on its left channel. After each input, it outputs a YES if it has already input

the same value, and NO otherwise. It is very similar to the set of 2.6.2 X4, ex-

cept that it will store messages of any kind

SET = left?x →right!NO →(rest : SET // LOOP(x))

where

LOOP(x) = µ X • left?y →

(if y = x then

right!YES →X

else

(rest.left!y →

rest.right?z →

right!z →X))

The set starts empty; therefore on input of its first member x it immediately

outputs NO. It then declares a subordinate process called rest, which is

going to store all members of the set except x. The LOOP is designed to input

subsequent members of the set. If the newly input member is equal to x, the

answer YES is sent back immediately on the right channel. Otherwise, the new

member is passed on for storage by rest. Whatever answer (YES or NO) is sent

back by rest is passed on again, and the LOOP repeats.
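The behaviour just described can be tried out with a generator-based sketch in Python. It is an analogy, not the formal definition: each send stands for an input on left followed by the answer on right, and start is a hypothetical helper.

```python
def start(proc):
    """Run a generator-based process up to its first input."""
    next(proc)
    return proc


def set_process():
    """SET: answers "YES" if the value was input before, "NO" otherwise."""
    x = yield None                    # left?x, the first member
    rest = start(set_process())       # rest : SET // LOOP(x)
    answer = "NO"                     # right!NO for the first member
    while True:                       # LOOP(x)
        y = yield answer              # left?y
        if y == x:
            answer = "YES"
        else:
            answer = rest.send(y)     # rest.left!y then rest.right?z
```

Each stored member owns the next level of the structure, so a set with k members is a chain of k + 1 generators, the last of which is still empty.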

X9 (Binary tree) A more eﬃcient representation of a set is as a binary tree,

which relies on some given total ordering ≤over its elements. Each node stores

its earliest inserted element, and declares two subordinate trees, one to store

elements smaller than the earliest, and one to store the bigger elements. The

external speciﬁcation of the tree is the same as X8

TREE = left?x →

right!NO →

(smaller : TREE //

(bigger : TREE //

LOOP))

The design of the LOOP is left as an exercise.

4.5.1 Laws

The following obvious laws govern communications between a process and

its subordinates. The ﬁrst law describes concealed communication in each

direction between the main and subordinate processes

L1A (m : (c?x →P(x))) // (m.c!v →Q) = (m : P(v)) // Q

L1B (m : (d!v →P)) // (m.d?x →Q(x)) = (m : P) // Q(v)

If b is a channel not named by m, the main process can communicate on

b without aﬀecting the subordinate

L2 (m : P // (b!e →Q)) = (b!e →(m : P // Q))

The only process capable of making a choice for a subordinate process is its

main process

L3 (m : (c?x →P1(x) | d?y →P2(y))) // (m.c!v →Q) = (m : P1(v) // Q)

If two subordinate processes have the same name, one of them is inaccessible

L4 m : P // (m : Q // R) = (m : Q // R)

Usually, the order in which subordinate processes are written does not matter

L5 m : P // (n : Q // R) = n : Q // (m : P // R)

provided that m and n are distinct names

The use of recursion in deﬁning subordinate processes is suﬃciently sur-

prising to raise doubts as to whether it actually works. These doubts may be

slightly alleviated by showing how the combination evolves. The example be-

low uses a particular trace of behaviour of the process, and shows how that

trace is produced. More important, it shows how other slightly diﬀering traces

cannot be produced.

Example

X1 A typical trace of SET is

s = ⟨left.1, right.NO, left.2, right.NO⟩

The value of SET / s can be calculated using laws L1A, L1B, and L2:

SET / ⟨left.1⟩

= right!NO → (rest : SET // LOOP(1))

SET / ⟨left.1, right.NO⟩

= (rest : SET // LOOP(1))

SET / ⟨left.1, right.NO, left.2⟩

= (rest : SET //

(rest.left!2 → rest.right?z → right!z → LOOP(1)))

= (rest : (right!NO → (rest : SET // LOOP(2))) //

(rest.right?z → right!z → LOOP(1)))

= rest : (rest : SET // LOOP(2)) // (right!NO → LOOP(1))

SET / s =

rest : (rest : SET // LOOP(2)) // LOOP(1)

It is obvious from this that ⟨left.1, right.NO, left.2, right.YES⟩ is not a trace of

SET. The reader may check that

SET / (s ⌢ ⟨left.2, right.YES⟩) = SET / s

and

SET / (s ⌢ ⟨left.5, right.NO⟩) =

rest : (rest : (rest : SET // LOOP(5)) // LOOP(2)) // LOOP(1)

4.5.2 Connection diagrams

A subordinate process may be drawn inside the box representing the process

that uses it, as shown for 4.5 X1 in Figure 4.10. For nested subordinate pro-

cesses, the boxes nest more deeply, as shown for 4.5 X2 in Figure 4.11.

A recursive process is one that is nested inside itself, like the picture of the

artist’s studio, in which there stands on an easel the completed painting itself,

which shows on the easel a completed painting…Such a picture in practice can

never be completed. Fortunately, for a process it is not necessary to complete

the picture: it evolves automatically as needed during its activity. Thus (see 4.5.1

X1) we may picture successive stages in the early history of a set as shown in

Figure 4.12.

If we ignore the nesting of the boxes, this can be drawn as a linear structure

as in Figure 4.13. Similarly, the example TREE (4.5 X9) could be drawn as in

Figure 4.14.

The connection diagrams suggest how a corresponding network might be

constructed from hardware components, with boxes representing integrated

circuits and arrows representing wires between them. Of course in any prac-

tical realisation, the recursion must be unfolded to some ﬁnite limit before the

network can start its normal operation; and if this limit is exceeded during

operation, the network cannot successfully continue.

Dynamic reallocation and reconﬁguration of hardware networks is a lot

more diﬃcult than the stack-based storage allocation which makes recursion

in conventional sequential programs so eﬃcient. Nevertheless, recursion is

surely justiﬁed by the aid that it gives in the invention and design of algorithms;

and if not by that, then at least by the intellectual joy which it gives to those

who understand it and use it.

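Figure 4.10 (diagram: the box Q contains the subordinate box doub : DOUBLE, joined to it by the channels doub.left and doub.right)

Figure 4.11 (diagram: the box Q contains the box quad, reached by the channels quad.left and quad.right; inside quad, LOOP is joined to doub : DOUBLE by doub.left and doub.right)

Figure 4.12 (diagrams: SET / ⟨left.1, right.NO⟩ drawn as LOOP(1) containing rest : SET, joined by rest.left and rest.right, with external channels left and right; SET / s drawn as LOOP(1) containing LOOP(2), which in turn contains ...)

Figure 4.13 (diagram: LOOP(x), LOOP(y), LOOP(z) drawn as a linear chain with external channels left and right)

Figure 4.14 (diagram: the TREE structure with external channels left and right)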

5 Sequential Processes

5.1 Introduction

The process STOP is deﬁned as one that never engages in any action. It is

not a useful process, and probably results from a deadlock or other design

error, rather than a deliberate choice of the designer. However, there is one

good reason why a process should do nothing more, namely that it has already

accomplished everything that it was designed to do. Such a process is said

to terminate successfully. In order to distinguish between this and STOP, it

is convenient to regard successful termination as a special event, denoted by

the symbol ✓ (pronounced “success”). A sequential process is deﬁned as one

which has ✓ in its alphabet; and naturally this can only be the last event in

which it engages. We stipulate that ✓ cannot be an alternative in the choice

construct

(x : B →P(x)) is invalid if ✓ ∈ B

SKIP_A is defined as a process which does nothing but terminate successfully

αSKIP_A = A ∪ {✓}

As usual, we shall frequently omit the subscript alphabet.

Examples

X1 A vending machine that is intended to serve only one customer with chocolate

or toffee and then terminate successfully

VMONE = (coin → (choc → SKIP | toffee → SKIP))

In designing a process to solve a complex task, it is frequently useful to

split the task into two subtasks, one of which must be completed successfully

before the other begins. If P and Q are sequential processes with the same

alphabet, their sequential composition

P ; Q

is a process which ﬁrst behaves like P; but when P terminates successfully,

(P ; Q) continues by behaving as Q. If P never terminates successfully, neither

does (P ; Q).

X2 A vending machine designed to serve exactly two customers, one after the

other

VMTWO = VMONE ; VMONE

A process which repeats similar actions as often as required is known as

a loop; it can be defined as a special case of recursion

*P = µ X • (P ; X)

= P ; P ; P ; ...

α(*P) = αP − {✓}

Clearly such a loop will never terminate successfully; that is why it is conveni-

ent to remove ✓ from its alphabet.

X3 A vending machine designed to serve any number of customers

VMCT = *VMONE

This is identical to VMCT (1.1.3 X3).

A sequence of symbols is said to be a sentence of a process P if P termin-

ates successfully after engaging in the corresponding sequence of actions. The

set of all such sentences is called the language accepted by P. Thus the nota-

tions introduced for describing sequential processes may also be used to deﬁne

the grammar of a simple language, such as might be used for communication

between a human being and a computer.

X4 A sentence of Pidgingol consists of a noun clause followed by a predic-

ate. A predicate is a verb followed by a noun clause. A verb is either bites or

scratches. The deﬁnition of a noun clause is given more formally below

αPIDGINGOL = {a, the, cat, dog, bites, scratches}

PIDGINGOL = NOUNCLAUSE ; PREDICATE

PREDICATE = VERB ; NOUNCLAUSE

VERB = (bites → SKIP | scratches → SKIP)

NOUNCLAUSE = ARTICLE ; NOUN

ARTICLE = (a → SKIP | the → SKIP)

NOUN = (cat → SKIP | dog → SKIP)

Example sentences of Pidgingol are

the cat scratches a dog

a dog bites the cat

To describe languages with an unbounded number of sentences, it is necessary

to use some kind of iteration or recursion.

X5 A noun clause which may contain any number of adjectives furry or prize

NOUNCLAUSE = ARTICLE ; µ X • (furry → X | prize → X

| cat → SKIP | dog → SKIP)

Examples of a noun clause are

the furry furry prize dog

a dog

X6 A process which accepts any number of as followed by a b and then the

same number of cs, after which it terminates successfully

A^n B C^n = µ X • (b → SKIP

| a → (X ; (c → SKIP)))

If a b is accepted ﬁrst, the process terminates; no as and no cs are accepted,

so their numbers are the same. If the second branch is taken, the accepted

sentence starts with a and ends with c, and between these is the sentence

accepted by the recursive call on the process X. If we assume that the recursive

call accepts an equal number of as and cs, then so will the non-recursive call

on A^n B C^n, since it accepts just one more a at the beginning and one more c at

the end.

This example shows how sequential composition, used in conjunction with

recursion, can deﬁne a machine with an inﬁnite number of states.
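The process of X6 can be executed in a small Python sketch which borrows the functional representation of Section 5.3.3: a process is a function from an event to the next process, or to BLEEP when the event is refused. The Python names (prefix, accepts, is_sentence, anbcn) are assumptions made for this illustration.

```python
BLEEP = "BLEEP"      # result of offering an event that is refused
SUCCESS = "SUCCESS"  # stands for the termination event


def STOP(x):
    return BLEEP


def SKIP(x):
    return STOP if x == SUCCESS else BLEEP


def prefix(c, p):
    """(c → p)"""
    return lambda x: p if x == c else BLEEP


def sequence(p, q):
    """(p ; q), following the definition in Section 5.3.3."""
    if p(SUCCESS) != BLEEP:
        return q

    def step(x):
        nxt = p(x)
        return BLEEP if nxt == BLEEP else sequence(nxt, q)

    return step


def anbcn(x):
    """µ X • (b → SKIP | a → (X ; (c → SKIP)))"""
    if x == "b":
        return SKIP
    if x == "a":
        return sequence(anbcn, prefix("c", SKIP))
    return BLEEP


def accepts(p, events):
    """True if p can engage in all the given events, in order."""
    for x in events:
        p = p(x)
        if p == BLEEP:
            return False
    return True


def is_sentence(p, word):
    """A sentence is a sequence after which p terminates successfully."""
    return accepts(p, list(word) + [SUCCESS])
```

The number of pending cs is held in the nesting depth of the sequence closures, which is one way of seeing why the machine has no fixed bound on its number of states.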

X7 A process which first behaves like A^n B C^n, but then accepts a d followed by

the same number of es

A^n B C^n D E^n = ((A^n B C^n) ; d → SKIP) || C^n D E^n

where C^n D E^n = f(A^n B C^n) for f which maps a to c, b to d, and c to e.

In this example, the process on the left of the || is responsible for ensuring

an equal number of as and cs (separated by a b). It will not allow a d until the

proper number of cs have arrived; but the es (which are not in its alphabet)

are ignored. The process on the right of || is responsible for ensuring an equal

number of es as cs. It ignores the as and the b, which are not in its alphabet.

The pair of processes terminate together when they have both completed their

allotted tasks.

The notations for deﬁning a language by means of an accepting process

are as powerful as those of regular expressions. The use of recursion intro-

duces some of the power of context-free grammars, but not all. A process can

only deﬁne those languages that can be parsed from left to right without back-

tracking or look-ahead. This is because the use of the choice operator requires

that the ﬁrst event of each alternative is diﬀerent from all its other ﬁrst events.

Consequently, it is not possible to use the construction of X5 to deﬁne a noun

clause in which the word prize can be either a noun or an adjective or both,

e.g., the prize dog, the furry prize.

The use of □ (Section 3.3) would not help, because it introduces nondeterminism,

and allows an arbitrary choice of the clause which will analyse

the rest of the input. If the choice is wrong, the process will deadlock before

reaching the end of the input text. What is required to solve this problem is a

new kind of choice operator which provides angelic nondeterminism like or3

(Section 3.2.2). This new operator requires that the two alternatives run con-

currently until the environment makes the choice; its deﬁnition is left as an

exercise.

Without angelic nondeterminism the language-deﬁning method described

above is not as powerful as context-free grammars, because it requires left-to-

right parsability without back-tracking. However, the introduction of || permits

deﬁnition of languages which are not context-free, for example X7.

X8 A process which accepts any interleaving of downs and ups, except that it

terminates successfully on the ﬁrst occasion that the number of downs exceeds

the number of ups

POS = (down → SKIP | up → (POS ; POS))

If the ﬁrst symbol is down, the task of POS is accomplished. But if the ﬁrst

symbol is up, it is necessary to accept two more downs than ups. The only way

of achieving this is ﬁrst to accept one more down than up; and then again to

accept one more down than up. Thus two successive recursive calls on POS

are needed, one after the other.
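The argument for the two recursive calls can be checked by running POS in the functional representation of Section 5.3.3, here transcribed into Python. The helper names (accepts, pos) are assumptions of this sketch.

```python
BLEEP = "BLEEP"      # result of offering an event that is refused
SUCCESS = "SUCCESS"  # stands for the termination event


def STOP(x):
    return BLEEP


def SKIP(x):
    return STOP if x == SUCCESS else BLEEP


def sequence(p, q):
    """(p ; q), following the definition in Section 5.3.3."""
    if p(SUCCESS) != BLEEP:
        return q

    def step(x):
        nxt = p(x)
        return BLEEP if nxt == BLEEP else sequence(nxt, q)

    return step


def pos(x):
    """POS = (down → SKIP | up → (POS ; POS))"""
    if x == "down":
        return SKIP
    if x == "up":
        return sequence(pos, pos)
    return BLEEP


def accepts(p, events):
    """True if p can engage in all the given events, in order."""
    for x in events:
        p = p(x)
        if p == BLEEP:
            return False
    return True
```

After a single up, termination becomes possible only once two further downs (net of any later ups) have been accepted.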

X9 The process C_0 behaves like CT_0 (1.1.4 X2)

C_0 = (around → C_0 | up → C_1)

C_{n+1} = POS ; C_n

= (POS ; ... ; POS) ; POS ; C_0 for all n ≥ 0

where the parenthesised group contains n copies of POS

We can now solve the problem mentioned in 2.6.2 X3, and encountered

again in 4.5 X4, that each operation on a subordinate process explicitly mentions

the rest of the user process which follows it. The required effect can now

be more conveniently achieved by means of SKIP and sequential composition.

X10 A USER process manipulates two count variables named l and m (see

2.6.2 X3)

l : CT_0 || m : CT_3 || USER

The following subprocess (inside the USER) adds the current value of l to m

ADD = (l.around → SKIP

| l.down → (ADD ; (m.up → l.up → SKIP)))

If the value of l is initially zero, nothing needs to be done. Otherwise, l is

decremented, and its reduced value is added to m (by the recursive call to ADD).

Then m is incremented once more, and l is also incremented, to compensate

for the initial decrementation and bring it back to its initial value.

5.2 Laws

The laws for sequential composition are similar to those for catenation (Section

1.6.1), with SKIP playing the role of the unit

L1 SKIP ; P = P ; SKIP = P

L2 (P ; Q) ; R = P ; (Q ; R)

L3 (x : B →P(x)) ; Q = (x : B →(P(x) ; Q))

The law for the choice operator has corollaries

L4 (a →P) ; Q = a →(P ; Q)

L5 STOP ; Q = STOP

When sequential processes are composed in parallel, the combination terminates

successfully just when both components do so

L6 SKIP_A || SKIP_B = SKIP_{A∪B}

A successfully terminating process participates in no further event oﬀered by

a concurrent partner

L7 ((x : B → P(x)) || SKIP_A) = (x : (B − A) → (P(x) || SKIP_A))

In a concurrent combination of a sequential with a nonsequential process,

when does the combination terminate successfully? If the alphabet of

the sequential process wholly contains that of its partner, termination of the

partnership is determined by that of the sequential process, since the other

process can do nothing when its partner is ﬁnished.

L8 STOP_A || SKIP_B = SKIP_B if ✓ ∉ A ∧ A ⊆ B.

The condition for the validity of this law is a very reasonable one, which should

always be observed when ✓ is in the alphabet of only one of a pair of processes

running concurrently. In this way, we avoid the problem of a process which

continues after engaging in ✓.

The laws L1 to L3 may be used to prove the claim made in 5.1 X9 that C_0

behaves like CT_0 (1.1.4 X2). This is done by showing that C satisfies the set of

guarded recursive equations used to define CT. The equation for CT_0 is the

same as that for C_0

C_0 = (around → C_0 | up → C_1)   [def C_0]

For n > 0, we need to prove

C_n = (up → C_{n+1} | down → C_{n−1})

Proof

LHS

= POS ; C_{n−1}   [def C_n]

= (down → SKIP | up → POS ; POS) ; C_{n−1}   [def POS]

= (down → (SKIP ; C_{n−1}) | up → (POS ; POS) ; C_{n−1})   [L3]

= (down → C_{n−1} | up → POS ; (POS ; C_{n−1}))   [L1, L2]

= (down → C_{n−1} | up → POS ; C_n)   [def C_n]

= RHS   [def C_n]

Since C_n obeys the same set of guarded recursive equations as CT_n, they are

the same.

This proof has been written out in full, in order to illustrate the use of

the laws, and also in order to allay suspicion of circularity. What seems most

suspicious is that the proof does not use induction on n. In fact, any attempt

to use induction on n will fail, because the very definition of CT_n contains the

process CT_{n+1}. Fortunately, an appeal to the law of unique solutions is both

simple and successful.

5.3 Mathematical treatment

The mathematical deﬁnition of sequential composition must be formulated in

such a way as to ensure the truth of the laws quoted in the previous section.

Special care needs to be exercised on

P ; SKIP = P

As usual, the treatment of deterministic processes is much simpler, and will

be completed ﬁrst.

5.3.1 Deterministic processes

Operations on deterministic processes are deﬁned in terms of the traces of

their results. The ﬁrst and only action of the process SKIP is successful ter-

mination, so it has only two traces

L0 traces(SKIP) = {⟨⟩, ⟨✓⟩}

To deﬁne sequential composition of processes, it is convenient ﬁrst to deﬁne

sequential composition of their individual traces. If s and t are traces and s

does not contain ✓

(s ; t) = s

(s ⌢ ⟨✓⟩) ; t = s ⌢ t

(see Section 1.9.7 for a fuller treatment). A trace of (P ; Q) consists of a trace

of P; and if this trace ends in ✓, the ✓ is replaced by a trace of Q

L1 traces(P ; Q) = { s ; t | s ∈ traces(P) ∧ t ∈ traces(Q) }

An equivalent deﬁnition is

L1A traces(P ; Q) = { s | s ∈ traces(P) ∧ ¬ ⟨✓⟩ in s } ∪

{ s ⌢ t | s ⌢ ⟨✓⟩ ∈ traces(P) ∧ t ∈ traces(Q) }

This deﬁnition may be simpler to understand but it is more complicated to

use.
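The agreement of the two definitions can be checked on a small example by computing both with finite sets of traces in Python. Traces are modelled as tuples and ✓ as the string "✓"; the function names are hypothetical, and the sketch assumes that ✓ can occur only as the last symbol of a trace.

```python
TICK = "✓"


def seq_trace(s, t):
    """s ; t on traces: a final ✓ in s is replaced by t, otherwise t is ignored."""
    if s and s[-1] == TICK:
        return s[:-1] + t
    return s


def traces_seq_l1(tp, tq):
    """L1: every s ; t with s a trace of P and t a trace of Q."""
    return {seq_trace(s, t) for s in tp for t in tq}


def traces_seq_l1a(tp, tq):
    """L1A: unterminated traces of P, plus terminated ones extended by Q."""
    unterminated = {s for s in tp if TICK not in s}
    extended = {s[:-1] + t for s in tp if s and s[-1] == TICK for t in tq}
    return unterminated | extended


# traces of (a → SKIP) and (b → SKIP)
TP = {(), ("a",), ("a", TICK)}
TQ = {(), ("b",), ("b", TICK)}
```

For these two processes both functions give the traces of (a → b → SKIP), in which the ✓ of the first operand no longer appears.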

The whole intention of the ✓ symbol is that it should terminate the process

which engages in it. We therefore need the law

L2 P / s = SKIP if s ⌢ ⟨✓⟩ ∈ traces(P)

This law is essential in the proof of

P ; SKIP = P

Unfortunately, it is not in general true. For example, if

P = (SKIP_{{}} || c → STOP_{{c}})

then traces(P) = {⟨⟩, ⟨✓⟩, ⟨c⟩, ⟨c, ✓⟩, ⟨✓, c⟩} and P / ⟨⟩ ≠ SKIP, even though

⟨✓⟩ ∈ traces(P). We therefore need to impose alphabet constraints on parallel

composition. (P || Q) must be regarded as invalid unless

αP ⊆ αQ ∨ αQ ⊆ αP ∨ ✓ ∈ ((αP ∩ αQ) ∪ (\overline{αP} ∩ \overline{αQ}))

where the overline denotes complement. For similar reasons, alphabet change must be guaranteed to leave ✓ unchanged,

so f (P) is invalid unless

f (✓) = ✓

Furthermore, if m is a process name, we must adopt the convention that

m.✓ = ✓

Finally, we must never use ✓ in the choice construct

(✓ → P | c → Q)

This restriction also rules out RUN_A when ✓ ∈ A.

5.3.2 Non-deterministic processes

Sequential composition of nondeterministic processes presents a number of

problems. The first of them is that a nondeterministic process like SKIP ⊓

(c →SKIP) does not satisfy the law L2 of the previous section. A solution of

this is to weaken 5.3.1 L2 to

L2A s ⌢ ⟨✓⟩ ∈ traces(P) ⇒ (P / s) ⊑ SKIP

This means that whenever P can terminate, it can do so without oﬀering any

alternative event to the environment. To maintain the truth of L2A, all restric-

tions of the previous section must be observed, and also

SKIP must never appear unguarded in an operand of □

✓ must not appear in the alphabet of either operand of |||

(It is possible that a slight change to the definitions of □ and ||| might permit

relaxation of these restrictions.)

In addition to the laws given earlier in this chapter, sequential composition

of nondeterministic processes satisﬁes the following laws. Firstly, a divergent

process remains divergent, no matter what is speciﬁed to happen after its

successful termination

L1 CHAOS ; P = CHAOS

Sequential composition distributes through nondeterministic choice

L2A (P ⊓ Q) ; R = (P ; R) ⊓ (Q ; R)

L2B R ; (P ⊓ Q) = (R ; P) ⊓ (R ; Q)

To deﬁne (P ; Q) in the mathematical model of nondeterministic pro-

cesses (Section 3.9) requires treatment of its failures and divergences. But ﬁrst

we describe its refusals (Section 3.4). If P can refuse X, and cannot terminate

successfully, it follows that X ∪ {✓} is also a refusal of P (3.4.1 L11). In this

case X is a refusal of (P ; Q). But if P oﬀers the option of successful termin-

ation, then in (P ; Q) this transition may occur autonomously; its occurrence

is concealed, and any refusal of Q is also a refusal of (P ; Q). The case where

successful termination of P is nondeterministic is also treated in the deﬁnition

D1 refusals(P ; Q) = { X | (X ∪ {✓}) ∈ refusals(P) } ∪

{ X | ⟨✓⟩ ∈ traces(P) ∧ X ∈ refusals(Q) }

The traces of (P ; Q) are deﬁned in exactly the same way as for determ-

inistic processes. The divergences of (P ; Q) are deﬁned by the remark that

it diverges whenever P diverges; or when P has terminated successfully and

then Q diverges

D2 divergences(P ; Q) = { s | s ∈ divergences(P) ∧ ¬ ⟨✓⟩ in s } ∪

{ s ⌢ t | s ⌢ ⟨✓⟩ ∈ traces(P) ∧ ¬ ⟨✓⟩ in s ∧

t ∈ divergences(Q) }

Any failure of (P ; Q) is either a failure of P before P can terminate, or it is a

failure of Q after P has terminated successfully

D3 failures(P ; Q) = { (s, X) | (s, X ∪ {✓}) ∈ failures(P) } ∪

{ (s ⌢ t, X) | s ⌢ ⟨✓⟩ ∈ traces(P) ∧

(t, X) ∈ failures(Q) } ∪

{ (s, X) | s ∈ divergences(P ; Q) }

5.3.3 Implementation

SKIP is implemented as the process which accepts only the symbol "SUCCESS.

It does not matter what it does afterwards

SKIP = λx •if x = "SUCCESS then STOP else "BLEEP

A sequential composition behaves like the second operand if the ﬁrst operand

terminates; otherwise, the ﬁrst operand participates in the ﬁrst event, and the

rest of it is composed with the second operand

sequence(P, Q) = if P("SUCCESS) ≠ "BLEEP then Q

else λx •if P(x) = "BLEEP then "BLEEP

else sequence(P(x), Q)
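For readers who wish to experiment, the two definitions above can be transcribed almost directly into Python. In this sketch a process is a Python function from an event to the next process, or to the string BLEEP when the event is refused; prefix and accepts are hypothetical helpers added for the illustration.

```python
BLEEP = "BLEEP"      # result of offering an event that is refused
SUCCESS = "SUCCESS"  # stands for the termination event


def STOP(x):
    """Refuses every event."""
    return BLEEP


def SKIP(x):
    """Accepts only SUCCESS; what it does afterwards does not matter."""
    return STOP if x == SUCCESS else BLEEP


def prefix(c, p):
    """(c → p): accept the event c, then behave like p."""
    return lambda x: p if x == c else BLEEP


def sequence(p, q):
    """Behave like q as soon as p can terminate; otherwise let p take the event."""
    if p(SUCCESS) != BLEEP:
        return q

    def step(x):
        nxt = p(x)
        return BLEEP if nxt == BLEEP else sequence(nxt, q)

    return step


def accepts(p, events):
    """True if p can engage in all the given events, in order."""
    for x in events:
        p = p(x)
        if p == BLEEP:
            return False
    return True


# a one-flavour variant of VMONE, and two of them in sequence
vmone = prefix("coin", prefix("choc", SKIP))
vmtwo = sequence(vmone, vmone)
```

Note that sequence tests its first operand for termination before any event is offered, so the SUCCESS of the first operand never becomes visible as an event of the composition.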

5.4 Interrupts

In this section we define a kind of sequential composition (P △ Q) which does

not depend on successful termination of P. Instead, the progress of P is just

interrupted on occurrence of the first event of Q; and P is never resumed. It

follows that a trace of (P △ Q) is just a trace of P up to an arbitrary point when

the interrupt occurs, followed by any trace of Q.

α(P △ Q) = αP ∪ αQ

traces(P △ Q) = { s ⌢ t | s ∈ traces(P) ∧ t ∈ traces(Q) }

To avoid problems, we specify that ✓ must not be in αP.

The next law states that it is the environment which determines when Q

shall start, by selecting an event which is initially oﬀered by Q but not by P

L1 (x : B → P(x)) △ Q = Q □ (x : B → (P(x) △ Q))

If (P △ Q) can be interrupted by R, this is the same as P interruptible by

(Q △ R)

L2 (P △ Q) △ R = P △ (Q △ R)

Since STOP offers no first event, it can never be triggered by the environment.

Similarly, if STOP is interruptible, only the interrupt can actually occur. Thus

STOP is a unit of △

L3 P △ STOP = P = STOP △ P

The interrupt operator executes both of its operands at most once, so it distributes

through nondeterministic choice

L4A P △ (Q ⊓ R) = (P △ Q) ⊓ (P △ R)

L4B (Q ⊓ R) △ P = (Q △ P) ⊓ (R △ P)

Finally, one cannot cure a divergent process by interrupting it; nor is it safe to

specify a divergent process after the interrupt

L5 CHAOS △ P = CHAOS = P △ CHAOS

In the remainder of this section, we shall insist that the possible initial

events of the interrupting process are outside the alphabet of the interrupted

process. Since the occurrence of interrupt is visible and controllable by the

environment, this restriction preserves determinism, and reasoning about the

operators is simpliﬁed. To emphasise the preservation of determinism, we

extend the deﬁnition of the choice operator. Provided that c ∉ B

(x : B → P(x) | c → Q) ≡ (x : (B ∪ {c}) → (if x = c then Q else P(x)))

and similarly for more operands.
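Under the restriction just stated, the interrupt operator of this section admits a simple deterministic implementation in the style of Section 5.3.3. The Python below is a sketch under that assumption; interrupt, prefix and accepts are names chosen for the illustration.

```python
BLEEP = "BLEEP"  # result of offering an event that is refused


def STOP(x):
    return BLEEP


def prefix(c, p):
    """(c → p)"""
    return lambda x: p if x == c else BLEEP


def interrupt(p, q):
    """p interruptible by q; assumes the initial events of q are outside αp."""
    def step(x):
        qn = q(x)
        if qn != BLEEP:          # the environment selected a first event of q
            return qn            # p is abandoned and never resumed
        pn = p(x)
        return BLEEP if pn == BLEEP else interrupt(pn, q)

    return step


def accepts(p, events):
    """True if p can engage in all the given events, in order."""
    for x in events:
        p = p(x)
        if p == BLEEP:
            return False
    return True


P = prefix("a", prefix("b", STOP))
Q = prefix("i", prefix("c", STOP))
R = interrupt(P, Q)
```

Here the event i may occur before, between, or after the events of P, and in each case the remainder of P is lost.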

5.4.1 Catastrophe

Let ↯ be a symbol standing for a catastrophic interrupt event, which it is reasonable

to suppose would not be caused by P; more formally

↯ ∉ αP

Then a process which behaves like P up to catastrophe and thereafter like Q

is defined

P ↯̂ Q = P △ (↯ → Q)

Here Q is perhaps a process which is intended to effect a recovery after catastrophe.

Note that the infix operator ↯̂ is distinguished from the event ↯ by

the circumflex.

The first law is just an obvious formulation of the informal description of

the operator

L1 (P ↯̂ Q) / (s ⌢ ⟨↯⟩) = Q for s ∈ traces(P)

In the deterministic model, this single law uniquely identifies the meaning of

the operator. In a nondeterministic universe, uniqueness would require additional

laws stating strictness and distributivity in both arguments.

The second law gives a more explicit description of the first and subsequent

steps of the process. It shows how ↯̂ distributes back through →

L2 (x : B → P(x)) ↯̂ Q = (x : B → (P(x) ↯̂ Q) | ↯ → Q)

This law too uniquely deﬁnes the operator on deterministic processes.

5.4.2 Restart

One possible response to catastrophe is to restart the original process again.

Let P be a process such that ↯ ∉ αP. We specify P̂ as a process which behaves

as P until ↯ occurs, and after each ↯ behaves like P from the start again. Such

a process is called restartable and is defined by the simple recursion

αP̂ = αP ∪ {↯}

P̂ = µ X • (P ↯̂ X)

= P ↯̂ (P ↯̂ (P ↯̂ ...))

This is a guarded recursion, since the occurrence of X is guarded by ↯. P̂ is

certainly a cyclic process (Section 1.8.3), even if P is not.
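The catastrophe operator and the restartable process built from it can be sketched in Python in the functional style of Section 5.3.3. The names catastrophe, restartable and ZAP (standing for the catastrophe event) are assumptions of this illustration; the recovery process is passed as a zero-argument function so that the recursion is unfolded only when a catastrophe actually occurs.

```python
BLEEP = "BLEEP"  # result of offering an event that is refused
ZAP = "ZAP"      # stands for the catastrophe event, assumed not to be in αP


def STOP(x):
    return BLEEP


def prefix(c, p):
    """(c → p)"""
    return lambda x: p if x == c else BLEEP


def catastrophe(p, recover):
    """Behave like p until ZAP occurs, then like the process returned by recover()."""
    def step(x):
        if x == ZAP:
            return recover()
        pn = p(x)
        return BLEEP if pn == BLEEP else catastrophe(pn, recover)

    return step


def restartable(p):
    """After each ZAP, behave like p from the start again."""
    return catastrophe(p, lambda: restartable(p))


def accepts(p, events):
    """True if p can engage in all the given events, in order."""
    for x in events:
        p = p(x)
        if p == BLEEP:
            return False
    return True


P = prefix("a", prefix("b", STOP))
```

The lambda in restartable plays the part of the guard in the recursion: nothing is unfolded until ZAP is offered.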

Catastrophe is not the only reason for a restart. Consider a process de-

signed to play a game, interacting with its human opponent by means of a

selection of keys on a keyboard (see the description of the interact function of

Section 1.4). Humans sometimes get dissatisﬁed with the progress of a game,

and wish to start a new game. For this purpose, a new and special key
(↯) is provided on the keyboard; depression of this key at any point in the
progress of the game will restart the game. It is convenient to define a game P
independently of the restart facility and then transform it into a restartable
game $\hat{P}$ by using the operator defined above. This idea is due to Alex Teruel.

The informal definition of $\hat{P}$ is expressed by the law

L1 $\hat{P} / (s^{\frown}\langle ↯ \rangle) = \hat{P}$ for $s \in traces(P)$

But this law does not uniquely define $\hat{P}$, since it is equally well satisfied by
RUN. However, $\hat{P}$ is the smallest deterministic process that satisfies L1.
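The catastrophe and restart operators can be sketched in Python, in the spirit of the book's own representation of a process as a function from an event to the next process (or the atom "BLEEP when the event is refused). This is only an illustrative sketch: the names `LIGHTNING`, `prefix`, `stop`, `after`, `catastrophe` and `restartable` are assumptions introduced here, not part of the original text.

```python
BLEEP = "BLEEP"          # the book's atom for a refused event
LIGHTNING = "lightning"  # assumed name standing for the catastrophe event

def stop(x):
    """STOP refuses every event."""
    return BLEEP

def prefix(event, nxt):
    """The process (event -> nxt)."""
    return lambda x: nxt if x == event else BLEEP

def catastrophe(p, q):
    """Behave like p until LIGHTNING occurs, then like q (law L2 of 5.4.1)."""
    def step(x):
        if x == LIGHTNING:
            return q
        nxt = p(x)
        return BLEEP if nxt == BLEEP else catastrophe(nxt, q)
    return step

def restartable(p):
    """The recursion mu X . (p catastrophe X), unfolded one step at a time."""
    return lambda x: catastrophe(p, restartable(p))(x)

def after(p, trace):
    """Feed a list of events to p; return the resulting process or BLEEP."""
    for x in trace:
        p = p(x)
        if p == BLEEP:
            return BLEEP
    return p

game = prefix("a", prefix("b", stop))
r = restartable(game)
ok = after(r, ["a", LIGHTNING, "a", "b"]) != BLEEP   # restart, then play on
refused = after(r, ["a", "a"]) == BLEEP              # game itself refuses a second "a"
```

The recursion is unfolded lazily inside a lambda, which mirrors the remark that the occurrence of X is guarded by the catastrophe event.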

5.4.3 Alternation

Suppose P and Q are processes which play games in the manner described in
Section 5.4.2; and a human player wishes to play both games simultaneously,
alternating between them in the same way as a chess master plays a
simultaneous match by cycling round several weaker opponents. We therefore
provide a new key ⓧ, which causes alternation between the two games P and Q.
This is rather like an interrupt, in that the current game is interrupted at an
arbitrary point; but it differs from the interrupt in that the current state of
the current game is preserved, so that it can be resumed when the other game is
later interrupted. The process which plays the games P and Q simultaneously
is denoted (P ⓧ Q), and it is most clearly specified by the laws

L1 $ⓧ \in (\alpha(P \mathbin{ⓧ} Q) - \alpha P - \alpha Q)$

L2 $(P \mathbin{ⓧ} Q) / s = (P / s) \mathbin{ⓧ} Q$ if $s \in traces(P)$

L3 $(P \mathbin{ⓧ} Q) / \langle ⓧ \rangle = (Q \mathbin{ⓧ} P)$

We want the smallest operator that satisfies L2 and L3. A more constructive
description of the operator can be derived from these laws; it shows how ⓧ
distributes backward through →

L4 $(x : B \to P(x)) \mathbin{ⓧ} Q = (x : B \to (P(x) \mathbin{ⓧ} Q) \mid ⓧ \to (Q \mathbin{ⓧ} P))$

The alternation operator is useful not only for playing games. A similar

facility should be provided in a “friendly” operating system for alternating

between system utilities. For example, you do not wish to lose your place in

the editor on switching to a “help” program, nor vice versa.

5.4.4 Checkpoints

Let P be a process which describes the behaviour of a long-lasting data base
system. When lightning (↯) strikes, one of the worst responses would be to
restart P in its initial state, losing all the laboriously accumulated data of the
system. It would be much better to return to some recent state of the system
which is known to be satisfactory. Such a state is known as a checkpoint. We
therefore provide a new key ⓒ, which should be pressed only when the current
state of the system is known to be satisfactory. When ↯ occurs, the most recent
checkpoint is restored; or if there is no checkpoint the initial state is restored.
We suppose ⓒ and ↯ are not in the alphabet of P, and define Ch(P) as the
process that behaves as P, but responds in the appropriate fashion to these
two events.

The informal definition of Ch(P) is most succinctly formalised in the laws

L1 $Ch(P) / (s^{\frown}\langle ↯ \rangle) = Ch(P)$ for $s \in traces(P)$

L2 $Ch(P) / (s^{\frown}\langle ⓒ \rangle) = Ch(P / s)$ for $s \in traces(P)$

Ch(P) can be defined more explicitly in terms of the operator Ch2(P, Q),
where P is the current process and Q is the most recent checkpoint waiting
to be reinstated. If catastrophe occurs before the first checkpoint, the system
restarts, as described by the laws

L3 $Ch(P) = Ch2(P, P)$

L4 If $P = (x : B \to P(x))$ then

$$Ch2(P, Q) = (x : B \to Ch2(P(x), Q) \mid ↯ \to Ch2(Q, Q) \mid ⓒ \to Ch2(P, P))$$

The law L4 is suggestive of a practical implementation method, in which

the checkpointed state is stored on some cheap but durable medium such as

magnetic disc or tape. When ⓒ occurs, the current state is copied as the new

checkpoint; when ↯ occurs, the checkpoint is copied back as the new current

state. For reasons of economy, a system implementor ensures that as much

data as possible is shared between the current and the checkpoint states. Such

optimisation is highly machine and application dependent; it is pleasing that

the mathematics is so simple.

The checkpointing operator is useful not only for large-scale data base

systems. When playing a diﬃcult game, a human player may wish to explore

a possible line of play without committing himself to it. So he presses the ⓒ
key to store the current position, and if his explorations are unsuccessful, use
of the ↯ key will restore the status quo.

These ideas of checkpointing have been explored by Ian Hayes.
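Law L4 translates almost line for line into a function over processes. The following Python sketch assumes the book's representation of a process as a function from an event to the next process or "BLEEP; the event names `LIGHTNING` and `CHECKPOINT`, the example process `count`, and the helper `after` are hypothetical names introduced only for this illustration.

```python
BLEEP = "BLEEP"
LIGHTNING, CHECKPOINT = "lightning", "checkpoint"   # assumed event names

def ch2(p, q):
    """Ch2(P, Q): p is the current process, q the most recent checkpoint."""
    def step(x):
        if x == LIGHTNING:
            return ch2(q, q)          # copy the checkpoint back
        if x == CHECKPOINT:
            return ch2(p, p)          # current state becomes the checkpoint
        nxt = p(x)
        return BLEEP if nxt == BLEEP else ch2(nxt, q)
    return step

def ch(p):
    """Law L3: before any checkpoint, the initial state is the checkpoint."""
    return ch2(p, p)

def count(n):
    """Example process: 'up' increments; the event 'is<n>' reveals the count."""
    def step(x):
        if x == "up":
            return count(n + 1)
        if x == f"is{n}":
            return count(n)
        return BLEEP
    return step

def after(p, trace):
    for x in trace:
        p = p(x)
        if p == BLEEP:
            return BLEEP
    return p

db = ch(count(0))
restored = after(db, ["up", "up", CHECKPOINT, "up", LIGHTNING, "is2"]) != BLEEP
```

Because processes here are immutable closures, "copying" a checkpoint is just keeping a second reference, which is one concrete form of the sharing between current and checkpoint states mentioned above.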

5.4.5 Multiple checkpoints

In using a checkpointable system Ch(P) it may happen that a checkpoint is

declared in error. In such cases, it may be desirable to cancel the most recent

checkpoint, and go back to the one before. For this we require a system which

retains two or more of the most recently checkpointed states. In principle,

there is no reason why we should not deﬁne a system Mch(P) which retains

all checkpoints back to the beginning of time. Each occurrence of ↯ returns to
the state just before the most recent ⓒ, rather than the state just after it. As
always we insist

$$\alpha Mch(P) - \alpha P = \{ⓒ, ↯\}$$

A ↯ before a ⓒ goes back to the beginning

L1 $Mch(P) / (s^{\frown}\langle ↯ \rangle) = Mch(P)$ for $s \in traces(P)$

A ↯ after a ⓒ cancels the effect of everything that has happened back to and
including the most recent ⓒ

L2 $Mch(P) / (s^{\frown}\langle ⓒ \rangle^{\frown} t^{\frown}\langle ↯ \rangle) = Mch(P) / s$ for $(s \upharpoonright \alpha P)^{\frown} t \in traces(P)$

A much more explicit description of Mch(P) can be given in terms of a
binary operator Mch2(P, Q), where P is the current process and Q is the stack
of checkpoints waiting to be resumed if necessary. The initial content of the
stack is an infinite sequence of copies of P

L3 $Mch(P) = \mu X \bullet Mch2(P, X) = Mch2(P, Mch(P)) = Mch2(P, Mch2(P, Mch2(P, \ldots)))$

On occurrence of ⓒ the current state is pushed down; on occurrence of ↯ the
whole stack is reinstated

L4 If $P = (x : B \to P(x))$ then

$$Mch2(P, Q) = (x : B \to Mch2(P(x), Q) \mid ⓒ \to Mch2(P, Mch2(P, Q)) \mid ↯ \to Q)$$

The pattern of recursions which appear in L4 is quite ingenious, but the mul-

tiple checkpoint facility could be very expensive to implement in practice when

the number of checkpoints gets large.

5.4.6 Implementation

The implementation of the various versions of interrupt is based on laws
which show how the operators distribute through →. Consider for example
the alternation operator (5.4.3 L4)

alternation(P, Q) = λx • if x = ⓧ then
                             alternation(Q, P)
                         else if P(x) = "BLEEP then
                             "BLEEP
                         else
                             alternation(P(x), Q)

A more surprising implementation is that of Mch (5.4.5 L3, L4)

Mch(P) = Mch2(P, Mch(P))

where

Mch2(P, Q) = λx • if x = ↯ then
                       Q
                   else if x = ⓒ then
                       Mch2(P, Mch2(P, Q))
                   else if P(x) = "BLEEP then
                       "BLEEP
                   else
                       Mch2(P(x), Q)

When this function is executed, the amount of store used grows in proportion

to the number of checkpoints; and available storage is very rapidly exhausted.

Of course, the storage can be reclaimed by the garbage collector on each
occurrence of ↯, but that is not really much consolation. As in the case of other
recursions, constraints of practical implementation enforce a finite bound on

the depth. In this case, the designer should impose a limit on the number of

checkpoints retained, and discard the earlier ones. But such a design is not so

elegantly expressible by recursion.
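One way such a bounded design might look is sketched below in Python, using an explicit stack of limited length instead of the recursion of L4. This is not the book's definition: the function `mch_bounded`, its `limit` parameter, the event names, and the example process `count` are all assumptions made for illustration. When the limit is exceeded the oldest checkpoint is silently discarded, and a catastrophe with no checkpoint left returns to the initial process.

```python
from collections import deque

BLEEP = "BLEEP"
LIGHTNING, CHECKPOINT = "lightning", "checkpoint"   # assumed event names

def mch_bounded(p, limit):
    """Multiple checkpoints, retaining at most `limit` of the most recent."""
    def make(current, stack):
        def step(x):
            if x == LIGHTNING:
                if not stack:                      # nothing saved: start again
                    return make(p, deque(maxlen=limit))
                older = deque(stack, maxlen=limit)
                return make(older.pop(), older)    # cancel the latest checkpoint
            if x == CHECKPOINT:
                newer = deque(stack, maxlen=limit)
                newer.append(current)              # drops the oldest when full
                return make(current, newer)
            nxt = current(x)
            return BLEEP if nxt == BLEEP else make(nxt, stack)
        return step
    return make(p, deque(maxlen=limit))

def count(n):
    """Example process: 'up' increments; the event 'is<n>' reveals the count."""
    def step(x):
        if x == "up":
            return count(n + 1)
        if x == f"is{n}":
            return count(n)
        return BLEEP
    return step

def after(p, trace):
    for x in trace:
        p = p(x)
        if p == BLEEP:
            return BLEEP
    return p

m = mch_bounded(count(0), limit=2)
t = ["up", CHECKPOINT, "up", CHECKPOINT, "up", CHECKPOINT, "up"]
```

The stack is copied on every change so that each process value stays immutable; a production design would more likely mutate a single bounded buffer.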

5.5 Assignment

In this section we shall introduce the most important aspects of conventional

sequential programming, namely assignments, conditionals, and loops. To

simplify the formulation of useful laws, some unusual notations will be deﬁned.

The essential feature of conventional computer programming is assign-

ment. If x is a program variable and e is an expression and P a process

(x := e ; P)

is a process which behaves like P, except that the initial value of x is deﬁned

to be the initial value of the expression e. Initial values of all other variables

are unchanged. Assignment by itself can be given a meaning by the deﬁnition

(x := e) = (x := e ; SKIP)

Single assignment generalises easily to multiple assignment. Let x stand
for a list of distinct variables

$$x = x_0, x_1, \ldots, x_{n-1}$$

Let e stand for a list of expressions

$$e = e_0, e_1, \ldots, e_{n-1}$$

Provided that the lengths of the two lists are the same

x := e

assigns the initial value of $e_i$ to $x_i$, for all i. Note that all the $e_i$ are evaluated
before any of the assignments are made, so that if y occurs in g

y := f ; z := g

is quite different from

y, z := f, g
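Python's tuple assignment happens to evaluate every right-hand side before binding any name, so it can illustrate the difference. In the sketch below the choices f = y + 1 and g = y × 2 are arbitrary examples, not taken from the text.

```python
def sequential(y, z):
    """y := f ; z := g  (g sees the new value of y)."""
    y = y + 1
    z = y * 2
    return y, z

def simultaneous(y, z):
    """y, z := f, g  (both expressions use the initial value of y)."""
    y, z = y + 1, y * 2
    return y, z

seq_result = sequential(1, 0)
sim_result = simultaneous(1, 0)
```

Starting from y = 1, the sequential version leaves z = 4 while the simultaneous version leaves z = 2.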

Let b be an expression that evaluates to a Boolean truth value (either true or
false). If P and Q are processes

$$P \triangleleft b \triangleright Q \qquad \text{(P if b else Q)}$$

is a process which behaves like P if the initial value of b is true, or like Q if the
initial value of b is false. The notation is novel, but less cumbersome than the
traditional

if b then P else Q

For similar reasons, the traditional loop

while b do Q

will be written

b ∗ Q

This may be defined by recursion

D1 $b * Q = \mu X \bullet ((Q ; X) \triangleleft b \triangleright SKIP)$

Examples

X1 A process that behaves like $CT_n$ (1.1.4 X2)

$$X1 = \mu X \bullet (around \to X \mid up \to (n := 1 ; X)) \triangleleft n = 0 \triangleright (up \to (n := n + 1 ; X) \mid down \to (n := n - 1 ; X))$$

The current value of the count is recorded in the variable n.

X2 A process that behaves like $CT_0$

n := 0 ; X1

The initial value of the count is set to zero.

X3 A process that behaves like POS (5.1 X8)

$$n := 1 ; (n > 0) * (up \to n := n + 1 \mid down \to n := n - 1)$$

Recursion has been replaced by a conventional loop.

X4 A process which divides a natural number x by a positive number y,
assigning the quotient to q and the remainder to r

$$QUOT = (q := x \div y ; r := x - q \times y)$$

X5 A process with the same effect as X4, which computes the quotient by the
slow method of repeated subtraction

$$LONGQUOT = (q := 0 ; r := x ; ((r \ge y) * (q := q + 1 ; r := r - y)))$$
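For comparison, the two division processes can be written as ordinary Python functions. This is a sketch under the assumptions stated in X4 (x a natural number, y positive); the function names `quot` and `longquot` are chosen here to echo the process names, and the explicit check on y is an addition, since the loop would never terminate for y = 0.

```python
def quot(x, y):
    """Direct division: q := x div y ; r := x - q * y."""
    q = x // y
    r = x - q * y
    return q, r

def longquot(x, y):
    """Division by repeated subtraction: (r >= y) * (q := q + 1 ; r := r - y)."""
    if y <= 0:
        raise ValueError("y must be positive, otherwise the loop cannot terminate")
    q, r = 0, x
    while r >= y:
        q, r = q + 1, r - y
    return q, r

example = longquot(17, 5)
```

Both functions return the same pair (q, r) for every natural x and positive y, which is the sense in which the two processes have the same effect.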

In a previous example (4.5 X3) we have shown how the behaviour of a variable
can be modelled by a subordinate process which communicates its value

with the process which uses it. In this chapter, we have deliberately rejected

that technique, because it does not have the properties which we would like.

For example, we want

(m := 1 ; m := 1) = (m := 1)

but unfortunately

(m.left!1 →m.left!1 →SKIP) ≠ (m.left!1 →SKIP)

5.5.1 Laws

In the laws for assignment, x and y stand for lists of distinct variables; e, f(x),
f(e) stand for lists of expressions, possibly containing occurrences of variables
in x or y; and f(e) contains $e_i$ whenever f(x) contains $x_i$ for all indices i. For
simplicity, in the following laws we shall assume that all expressions always
give a result, for any values of the variables they contain.

L1 (x := x) = SKIP

L2 (x := e ; x := f (x)) = (x := f (e))

L3 If x, y is a list of distinct variables (x := e) = (x, y := e, y)

L4 If x, y, z are of the same length as e, f , g respectively

(x, y, z := e, f , g) = (x, z, y := e, g, f )


Using these laws, it is possible to transform every sequence of assignments

into a single assignment to a list of all the variables involved.

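When $\triangleleft b \triangleright$ is considered as a binary infix operator, it possesses several
familiar algebraic properties

L5–L6 $\triangleleft b \triangleright$ is idempotent, associative, and distributes through $\triangleleft c \triangleright$

L7 $P \triangleleft true \triangleright Q = P$

L8 $P \triangleleft false \triangleright Q = Q$

L9 $P \triangleleft \lnot b \triangleright Q = Q \triangleleft b \triangleright P$

L10 $P \triangleleft b \triangleright (Q \triangleleft b \triangleright R) = P \triangleleft b \triangleright R$

L11 $P \triangleleft (a \triangleleft b \triangleright c) \triangleright Q = (P \triangleleft a \triangleright Q) \triangleleft b \triangleright (P \triangleleft c \triangleright Q)$

L12 $x := e ; (P \triangleleft b(x) \triangleright Q) = (x := e ; P) \triangleleft b(e) \triangleright (x := e ; Q)$

L13 $(P \triangleleft b \triangleright Q) ; R = (P ; R) \triangleleft b \triangleright (Q ; R)$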

To deal eﬀectively with assignment in concurrent processes, it is necessary

to impose a restriction that no variable assigned in one concurrent process

can ever be used in another. To enforce this restriction, we introduce two new

categories of symbol into the alphabets of sequential processes

var(P) the set of variables that may be assigned within P

acc(P) the set of variables that may be accessed in expressions within P.

All variables which may be changed may also be accessed

var(P) ⊆ acc(P) ⊆ αP

Similarly, we deﬁne acc(e) as the set of variables appearing in e. Now if P and

Q are to be joined by ||, we stipulate that

var(P) ∩ acc(Q) = var(Q) ∩ acc(P) = {}

Under this condition, it does not matter whether an assignment takes place

before a parallel split, or within one of its components after they are running

concurrently

L14 ((x := e ; P) || Q) = (x := e ; (P || Q))

provided that x ⊆ var(P) − acc(Q) and acc(e) ∩ var(Q) = {}

An immediate consequence of this is

(x := e ; P) || (y := f ; Q) = (x, y := e, f ; (P || Q))

provided that x ⊆ var(P) −acc(Q) −acc(f )

and y ⊆ var(Q) −acc(P) −acc(e)


This shows how the alphabet restriction ensures that assignments within

one component process of a concurrent pair cannot interfere with assignments

within the other. In an implementation, sequences of assignments may be car-

ried out either together or in any interleaving, without making any diﬀerence

to the externally observable actions of the process.

Finally, concurrent combination distributes through the conditional

L15 $P \parallel (Q \triangleleft b \triangleright R) = (P \parallel Q) \triangleleft b \triangleright (P \parallel R)$

provided that acc(b) ∩ var(P) = {}.

This law again states that it does not matter whether b is evaluated before

or after the parallel split.

We now deal with the problem which arises when expressions are undefined
for certain values of the variables they contain. If e is a list of expressions,
we define $\mathcal{D}e$ as a Boolean expression which is true just when all
the operands of e are within the domains of their operators. For example, in
natural number arithmetic,

$\mathcal{D}(x \div y) = (y > 0)$

$\mathcal{D}(y + 1, z + y) = true$

$\mathcal{D}(e + f) = \mathcal{D}e \land \mathcal{D}f$

$\mathcal{D}(r - y) = y \le r$

It is reasonable to insist that $\mathcal{D}e$ is always defined, i.e.,

$\mathcal{D}(\mathcal{D}e) = true$

We deliberately leave completely unspecified the result of an attempt to
evaluate an undefined expression: anything whatsoever may happen. This is
reflected by the use of CHAOS in the following laws.

L16′ $(x := e) = (x := e \triangleleft \mathcal{D}e \triangleright CHAOS)$

L17′ $P \triangleleft b \triangleright Q = ((P \triangleleft b \triangleright Q) \triangleleft \mathcal{D}b \triangleright CHAOS)$

Furthermore, the laws L2, L5, and L12 need slight modification

L2′ $(x := e ; x := f(x)) = (x := f(e) \triangleleft \mathcal{D}e \triangleright CHAOS)$

L5′ $(P \triangleleft b \triangleright P) = (P \triangleleft \mathcal{D}b \triangleright CHAOS)$

5.5.2 Speciﬁcations

A speciﬁcation of a sequential process describes not only the traces of the

events which occur, but also the relationship between these traces, the initial

values of the program variables, and their ﬁnal values. To denote the initial

value of a program variable x, we simply use the variable name x by itself. To

denote the final value, we decorate the name with a superscript ✓, as in $x^{\checkmark}$. The
value of $x^{\checkmark}$ is not observable until the process is terminated, i.e., the last event
of the trace is ✓. This fact is represented by not specifying anything about $x^{\checkmark}$
unless $\overline{tr}_0 = \checkmark$.

Examples

X1 A process which performs no action, but adds one to the value of x, and
terminates successfully with the value of y unchanged

$$tr = \langle\rangle \lor (tr = \langle\checkmark\rangle \land x^{\checkmark} = x + 1 \land y^{\checkmark} = y)$$

X2 A process which performs an event whose symbol is the initial value of
the variable x, and then terminates successfully, leaving the final values of x
and y equal to their initial values

$$tr = \langle\rangle \lor tr = \langle x \rangle \lor (tr = \langle x, \checkmark\rangle \land x^{\checkmark} = x \land y^{\checkmark} = y)$$

X3 A process which stores the identity of its first event as the final value of x

$$\#tr \le 2 \land (\#tr = 2 \Rightarrow (tr = \langle x^{\checkmark}, \checkmark\rangle \land y^{\checkmark} = y))$$

X4 A process which divides a nonnegative x by a positive y, and assigns the
quotient to q and the remainder to r

$$DIV = (y > 0 \Rightarrow tr = \langle\rangle \lor (tr = \langle\checkmark\rangle \land q^{\checkmark} = (x \div y) \land r^{\checkmark} = x - (q^{\checkmark} \times y) \land y^{\checkmark} = y \land x^{\checkmark} = x))$$

Without the precondition, this specification would be impossible to meet in its
full generality.

X5 Here are some more complex specifications which will be used later

$$DIVLOOP = (tr = \langle\rangle \lor (tr = \langle\checkmark\rangle \land r = (q^{\checkmark} - q) \times y + r^{\checkmark} \land r^{\checkmark} < y \land x^{\checkmark} = x \land y^{\checkmark} = y))$$

$$T(n) = r < n \times y$$


All variables in these and subsequent speciﬁcations are intended to denote

natural numbers, so subtraction is undeﬁned if the second operand is greater

than the ﬁrst.

We shall now formulate the laws which underlie proofs that a process
satisfies its specification. Let $S(x, tr, x^{\checkmark})$ be a specification. In order to prove
that SKIP satisfies this specification, clearly the specification must be true when
the trace is empty; furthermore, it must be true when the trace is $\langle\checkmark\rangle$ and
the final values of all variables $x^{\checkmark}$ are equal to their initial values. These two
conditions are also sufficient, as stated in the following law

L1 If $S(x, \langle\rangle, x^{\checkmark})$

and $S(x, \langle\checkmark\rangle, x)$

then SKIP sat $S(x, tr, x^{\checkmark})$

X6 The strongest specification satisfied by SKIP is

$SKIP_A$ sat $(tr = \langle\rangle \lor (tr = \langle\checkmark\rangle \land x^{\checkmark} = x))$

where x is a list of all variables in A and $x^{\checkmark}$ is a list of their ticked variants. X6
is an immediate consequence of L1 and vice versa.

X7 We can prove that

SKIP sat $(r < y \Rightarrow (T(n + 1) \Rightarrow DIVLOOP))$

Proof :

(1) Replacing tr by $\langle\rangle$ in the specification gives

$$r < y \land T(n + 1) \Rightarrow \langle\rangle = \langle\rangle \lor \ldots$$

which is a tautology.

(2) Replacing tr by $\langle\checkmark\rangle$ and final values by initial values gives

$$r < y \land T(n + 1) \Rightarrow (\langle\checkmark\rangle = \langle\rangle \lor (\langle\checkmark\rangle = \langle\checkmark\rangle \land x = x \land y = y \land r = (q - q) \times y + r \land r < y))$$

which is also a trivial theorem. This result will be used in X10.

It is a precondition of successful assignment x := e that the expressions e
on the right-hand side should be defined. In this case, if P satisfies a
specification S(x), (x := e ; P) satisfies the same specification, after modification to
reflect the fact that the initial value of x is e.

L2 If P sat S(x) then

(x := e ; P) sat $(\mathcal{D}e \Rightarrow S(e))$

The law for simple assignment can be derived from L2 on replacing P by SKIP,
and using X6 and 5.2 L1

L2A $x_0 := e$ sat $(\mathcal{D}e \land tr \ne \langle\rangle \Rightarrow tr = \langle\checkmark\rangle \land x_0^{\checkmark} = e \land x_1^{\checkmark} = x_1 \land \ldots)$

A consequence of L2 is that for any P, the strongest fact one can prove about
(x := 1/0 ; P) is

(x := 1/0 ; P) sat true

Whatever non-vacuous goal you may wish to achieve, it cannot be achieved by
starting with an illegal assignment.

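Examples

X8

SKIP sat $(tr \ne \langle\rangle \Rightarrow tr = \langle\checkmark\rangle \land q^{\checkmark} = q \land r^{\checkmark} = r \land y^{\checkmark} = y \land x^{\checkmark} = x)$

therefore

$(r := x - q \times y ; SKIP)$ sat $(x \ge q \times y \land tr \ne \langle\rangle \Rightarrow tr = \langle\checkmark\rangle \land q^{\checkmark} = q \land r^{\checkmark} = (x - q \times y) \land y^{\checkmark} = y \land x^{\checkmark} = x)$

therefore

$(q := x \div y ; r := x - q \times y)$ sat $(y > 0 \land x \ge (x \div y) \times y \land tr \ne \langle\rangle \Rightarrow tr = \langle\checkmark\rangle \land q^{\checkmark} = (x \div y) \land r^{\checkmark} = (x - (x \div y) \times y) \land y^{\checkmark} = y \land x^{\checkmark} = x)$

The specification on the last line is equivalent to DIV, defined in X4.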

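X9 Assume

X sat $(T(n) \Rightarrow DIVLOOP)$

therefore

$(r := r - y ; X)$ sat $(y \le r \Rightarrow (r - y < n \times y \Rightarrow (tr = \langle\rangle \lor tr = \langle\checkmark\rangle \land (r - y) = \ldots)))$

therefore

$(q := q + 1 ; r := r - y ; X)$ sat $(y \le r \Rightarrow (r < (n + 1) \times y \Rightarrow DIVLOOP'))$

where

$$DIVLOOP' = (tr = \langle\rangle \lor (tr = \langle\checkmark\rangle \land (r - y) = (q^{\checkmark} - (q + 1)) \times y + r^{\checkmark} \land r^{\checkmark} < y \land x^{\checkmark} = x \land y^{\checkmark} = y))$$

By elementary algebra of natural numbers

$$y \le r \Rightarrow (DIVLOOP' \equiv DIVLOOP)$$

therefore

$(q := q + 1 ; r := r - y ; X)$ sat $(y \le r \Rightarrow (T(n + 1) \Rightarrow DIVLOOP))$

This result will be used in X10.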

For general sequential composition, a much more complicated law is re-

quired, in which the traces of the components are sequentially composed, and

the initial state of the second component is identical to the ﬁnal state of the

ﬁrst component. However, the values of the variables in this intermediate state

are not observable; only the existence of such values is assured

L3 If P sat $S(x, tr, x^{\checkmark})$

and Q sat $T(x, tr, x^{\checkmark})$

and P does not diverge

then (P ; Q) sat $(\exists y, s, t \bullet tr = (s ; t) \land S(x, s, y) \land T(y, t, x^{\checkmark}))$

In this law, x is a list of all variables in the alphabet of P and Q, $x^{\checkmark}$ is a list of
their superscripted variants, and y a list of the same number of fresh variables.

The speciﬁcation of a conditional is the same as that of the ﬁrst component

if the condition is true, and the same as that of the second component if false.

L4 If P sat S and Q sat T

then $(P \triangleleft b \triangleright Q)$ sat $((b \land S) \lor (\lnot b \land T))$

An alternative form of this law is sometimes more convenient

L4A If P sat $(b \Rightarrow S)$ and Q sat $(\lnot b \Rightarrow S)$

then $(P \triangleleft b \triangleright Q)$ sat S


Example

X10 Let

$COND = (q := q + 1 ; r := r - y ; X) \triangleleft r \ge y \triangleright SKIP$

and

X sat (T(n) ⇒DIVLOOP)

then

COND sat (T(n +1) ⇒DIVLOOP)

The two suﬃcient conditions for this conclusion have been proved in X7 and

X9; the result follows by L4A.

The proof of a loop uses the recursive deﬁnition given in 5.5 D1, and the

law for unguarded recursion (3.7.1 L8). If R is the intended speciﬁcation of the

loop, we must ﬁnd a speciﬁcation S(n) such that S(0) is always true, and also

(∀n • S(n)) ⇒R

A general method to construct S(n) is to ﬁnd a predicate T(n, x), which de-

scribes the conditions on the initial state x such that the loop is certain to

terminate in less than n repetitions. Then deﬁne

S(n) = (T(n, x) ⇒R)

Clearly, no loop can terminate in less than no repetitions, so if T(n, x) has been

correctly deﬁned T(0, x) will be false, and consequently S(0) will be true. The

result of the proof of the loop will be ∀n • S(n), i.e.,

∀n • (T(n, x) ⇒R)

Since n has been chosen as a variable which does not occur in R, this is equi-

valent to

(∃n • T(n, x)) ⇒R

No stronger speciﬁcation can possibly be met, since

∃n • T(n, x)

is the precondition under which the loop terminates in some ﬁnite number of

iterations.

Finally, we must prove that the body of the loop meets its speciﬁcation.

Since the recursive equation for a loop involves a conditional, this task splits

into two. Thus we derive the general law

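L5 If $\lnot T(0, x)$ and $T(n, x) \Rightarrow \mathcal{D}b$

and SKIP sat $(\lnot b \Rightarrow (T(n, x) \Rightarrow R))$

and (X sat $(T(n, x) \Rightarrow R)$) ⇒ ((Q ; X) sat $(b \Rightarrow (T(n + 1, x) \Rightarrow R))$)

then (b ∗ Q) sat $((\exists n \bullet T(n, x)) \Rightarrow R)$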

Example

X11 We wish to prove that the program for long division by repeated
subtraction (5.5 X5) meets its specification DIV. The task splits naturally in two. The
second and more difficult part is to prove that the loop meets some suitably
formulated specification, namely

$(r \ge y) * (q := q + 1 ; r := r - y)$ sat $(y > 0 \Rightarrow DIVLOOP)$

First we need to formulate the condition under which the loop terminates in
less than n iterations

$$T(n) = r < n \times y$$

Here T(0) is obviously false; the clause $\exists n \bullet T(n)$ is equivalent to y > 0, which
is the precondition under which the loop terminates. The remaining steps of
the proof of the loop have already been taken in X7 and X9. The rest of the
proof is a simple exercise.
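A proof is not replaced by testing, but a small Python check can make the roles of T(n) and DIVLOOP concrete. The sketch below runs the loop from arbitrary initial values of q, r and y, then checks the terminated case of DIVLOOP together with the claim that T(n) bounds the number of iterations. The helper names `divloop` and `check` are assumptions introduced here.

```python
def divloop(q, r, y):
    """Run (r >= y) * (q := q + 1 ; r := r - y); also count the iterations."""
    steps = 0
    while r >= y:
        q, r = q + 1, r - y
        steps += 1
    return q, r, steps

def check(q, r, y, n):
    """Check DIVLOOP's final-state clause, and that T(n) implies fewer than n iterations."""
    q1, r1, steps = divloop(q, r, y)
    spec_holds = (r == (q1 - q) * y + r1) and (r1 < y)
    bound_holds = steps < n if r < n * y else True   # T(n) = r < n * y
    return spec_holds and bound_holds

all_ok = all(
    check(q, r, y, n)
    for q in range(3)
    for r in range(20)
    for y in range(1, 6)     # y > 0 is the termination precondition
    for n in range(8)
)
```

Note that T(0) can never hold for natural numbers, so the bound is vacuous for n = 0, in line with the requirement that S(0) be always true.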

The laws given in this section are designed as a calculus of total correctness

for purely sequential programs, which contain no input or output. If Q is such

a program, then a proof that

Q sat $(P(x) \land tr \ne \langle\rangle \Rightarrow tr = \langle\checkmark\rangle \land R(x, x^{\checkmark}))$ (1)

establishes that if P(x) is true of the initial values of the variables when Q
is started, then Q will terminate and $R(x, x^{\checkmark})$ will describe the relationship
between the initial values x and the final values $x^{\checkmark}$. Thus $(P(x), R(x, x^{\checkmark}))$ form
a precondition/postcondition pair in the sense of Cliff Jones. If $R(x^{\checkmark})$ does not
mention the initial values x, the assertion (1) is equivalent to

P(x) ⇒ wp(Q, R(x))

where wp is Dijkstra’s weakest precondition.

Thus in the special case of noncommunicating programs, the proof meth-

ods are mathematically equivalent to ones that are already familiar, though

the explicit mention of “tr = ⟨⟩” and “tr = ⟨✓⟩” makes them notationally more

clumsy. This extra burden is of course necessary, and therefore more accept-

able, when the methods are extended to deal with communicating sequential

processes.


5.5.3 Implementation

The initial and ﬁnal states of a sequential process can be represented as a

function which maps each variable name onto its value. A sequential process

is deﬁned as a function which maps its initial states onto its subsequent be-

haviour. Successful termination (✓) is represented by the atom "SUCCESS. A

process which is ready to terminate will accept this symbol, which it maps, not

onto another process, but onto the ﬁnal state of its variables.

The process SKIP takes an initial state as a parameter, accepts "SUCCESS

as its only action, and delivers its initial state as its ﬁnal state

SKIP = λs • λy • if y ≠ "SUCCESS then "BLEEP else s

An assignment is similar, except that its final state is slightly changed

assign(x, e) = λs • λy • if y ≠ "SUCCESS then
                              "BLEEP
                          else
                              update(s, x, e)

where update(s, x, e) is the function λy • if y = x then eval(e, s) else s(y) and
eval(e, s) is the result of evaluating the expression e in state s.

If e is undeﬁned in state s, we do not care what happens. Here, for simpli-

city, we have implemented only the single assignment. Multiple assignment is

a little more complicated.

To implement sequential composition, it is necessary ﬁrst to test whether

the ﬁrst operand has successfully terminated. If so, its ﬁnal state is passed on

to the second operand. If not, the ﬁrst action is that of the ﬁrst operand

sequence(P, Q) = λs • if P(s)("SUCCESS) ≠ "BLEEP then
                          Q(P(s)("SUCCESS))
                      else
                          λy • if P(s)(y) = "BLEEP then
                                   "BLEEP
                               else
                                   sequence(P(s)(y), Q)

The implementation of the conditional is as a conditional

condition(P, b, Q) = λs • if eval(b, s) then P(s) else Q(s)

The implementation of the loop (b ∗ Q) is left as an exercise.
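The definitions above carry over to Python with little change. The following is a simplified sketch, not the book's implementation: states are dictionaries rather than functions, expressions are Python functions from a state to a value, and only purely sequential programs are handled, so the only event a process ever accepts is "SUCCESS. One possible answer to the exercise is included as `loop`, obtained by unfolding definition D1; all function names are choices made here.

```python
BLEEP, SUCCESS = "BLEEP", "SUCCESS"

def skip(s):
    """Accept only SUCCESS and deliver the initial state as the final state."""
    return lambda y: dict(s) if y == SUCCESS else BLEEP

def assign(x, e):
    """x := e, where e maps a state (dict) to a value."""
    def proc(s):
        return lambda y: {**s, x: e(s)} if y == SUCCESS else BLEEP
    return proc

def sequence(p, q):
    """P ; Q. Simplification: p is assumed to offer no event but SUCCESS."""
    def proc(s):
        final = p(s)(SUCCESS)
        if final != BLEEP:
            return q(final)          # pass the final state of p on to q
        return lambda y: BLEEP
    return proc

def condition(p, b, q):
    """P if b else Q, with b evaluated in the initial state."""
    return lambda s: p(s) if b(s) else q(s)

def loop(b, q):
    """b * Q, by unfolding D1: (Q ; b * Q) if b else SKIP."""
    def proc(s):
        return condition(sequence(q, loop(b, q)), b, skip)(s)
    return proc

# LONGQUOT from 5.5 X5: q := 0 ; r := x ; (r >= y) * (q := q + 1 ; r := r - y)
body = sequence(assign("q", lambda s: s["q"] + 1),
                assign("r", lambda s: s["r"] - s["y"]))
longquot = sequence(assign("q", lambda s: 0),
                    sequence(assign("r", lambda s: s["x"]),
                             loop(lambda s: s["r"] >= s["y"], body)))

final_state = longquot({"x": 17, "y": 5})(SUCCESS)
```

Each loop iteration adds a few Python stack frames, so this direct unfolding is suitable only for small examples; an iterative `while` would be the practical choice.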


Note that the deﬁnition of sequence given above is more complicated than

that given in Section 5.3.3, because it takes a state as its ﬁrst argument, and it

has to supply the state as the ﬁrst argument of its operands. Unfortunately,

a similar complexity has to be introduced into the deﬁnitions of all the other

operators given in earlier chapters. A simpler alternative would be to model

variables as subordinate processes; but this would probably be a great deal less

eﬃcient than the use of conventional random access storage. When consider-

ations of eﬃciency are added to those of mathematical convenience, there are

adequate grounds for introducing the assignable program variable as a new

primitive concept, rather than deﬁning it in terms of previously introduced

concepts.

6 Shared Resources

6.1 Introduction

In Section 4.5 we introduced the concept of a named subordinate process

(m : R), whose sole task is to meet the needs of a single main process S; and

for this we have used the notation

(m : R // S)

Suppose now that S contains or consists of two concurrent processes (P || Q),

and that both P and Q require the services of the same subordinate process

(m : R). Unfortunately, it is not possible for P and Q both to communicate

with (m : R) along the same channels, because these channels would have to

be in the alphabet of both P and Q; and then the deﬁnition of || would require

that communications with (m : R) take place only when both P and Q com-

municate the same message simultaneously—which (as explained in 4.5 X6) is

far from the required eﬀect. What is needed is some way of interleaving the

communications between P and (m : R) with those between Q and (m : R). In

this way, (m : R) serves as a resource shared between P and Q; each of them

uses it independently, and their interactions with it are interleaved.

When the identity of all the sharing processes is known in advance, it is

possible to arrange that each sharing process uses a diﬀerent set of channels

to communicate with the shared resource. This technique was used in the

story of the dining philosophers (Section 2.5); each fork was shared between

two neighbouring philosophers, and the footman was shared among all ﬁve.

Another example was 4.5 X6, in which a buﬀer was shared between two pro-

cesses, one of which used only the left channel and the other used only the

right channel.

A general method of sharing is provided by multiple labelling (Section 2.6.4),

which eﬀectively creates enough separate channels for independent commu-

nication with each sharing process. Individual communications along these

channels are arbitrarily interleaved. But this method requires that the names

of all the sharing processes are known in advance; and so it is not adequate

for a subordinate process intended to serve the needs of a main process which


splits into an arbitrary number of concurrent subprocesses. This chapter in-

troduces techniques for sharing a resource among many processes, even when

their number and identities are not known in advance. It is illustrated by ex-

amples drawn from the design of an operating system.

6.2 Sharing by interleaving

The problem described in Section 6.1 arises from the use of the combinator || to

describe the concurrent behaviour of processes; and this problem can often be

avoided by using instead the interleaving form of concurrency (P ||| Q). Here, P

and Q have the same alphabet and their communications with external (shared)

processes are arbitrarily interleaved. Of course, this prohibits direct commu-

nication between P and Q; but indirect communication can be re-established

through the services of a shared subordinate process of appropriate design, as

shown in 4.5 X6 and in X2 below.

Example

X1 (Shared subroutine)

doub : DOUBLE // (P ||| Q)

Here, both P and Q may contain calls on the subordinate process

(doub.left!v →doub.right?x →SKIP)

Even though these pairs of communications from P and Q are arbitrarily
interleaved, there is no danger that one of the processes will accidentally obtain

an answer which should have been received by the other. To ensure this, all

subprocesses of the main process must observe a strict alternation of commu-

nications on the left channel with communications on the right channel of the

shared subordinate. For this reason, it seems worthwhile to introduce a special-

ised notation, whose exclusive use will guarantee observance of the required

discipline. The suggested notation is reminiscent of a traditional procedure

call in a high-level language, except that the value parameters are preceded by

! and the result parameters by ?, thus

doub!x?y = (doub.left!x →doub.right?y →SKIP)

The intended effect of sharing by interleaving is illustrated by the following
series of algebraic transformations. When two sharing processes both

simultaneously attempt to use the shared subroutine, matched pairs of com-

munications are taken in arbitrary order, but the components of a pair of com-

munications with one process are never separated by a communication with


another. For convenience, we use the following abbreviations

d!v for d.left!v

d?x for d.right?x

within a main process

!v for right!v

?x for left?x

within a subordinate process

Let

$D = ?x \to !(x + x) \to D$

$P = d!3 \to d?y \to P(y)$

$Q = d!4 \to d?z \to Q(z)$

$R = (d : D \mathbin{/\!/} (P \mathbin{|||} Q))$

(as in X1 above), then

$P \mathbin{|||} Q$

$= d!3 \to ((d?y \to P(y)) \mathbin{|||} Q) \;\Box\; d!4 \to (P \mathbin{|||} (d?z \to Q(z)))$

[by 3.6.1 L7]

The sharing processes each start with an output to the shared process. It is the
shared process that is offered the choice between them. But the shared process
is willing to accept either, so after hiding the choice becomes nondeterministic

$(d : D \mathbin{/\!/} (P \mathbin{|||} Q))$

$= ((d : (!3+3 \to D)) \mathbin{/\!/} ((d?y \to P(y)) \mathbin{|||} Q)) \sqcap ((d : (!4+4 \to D)) \mathbin{/\!/} (P \mathbin{|||} (d?z \to Q(z))))$

$= (d : D \mathbin{/\!/} (P(6) \mathbin{|||} Q)) \sqcap (d : D \mathbin{/\!/} (P \mathbin{|||} Q(8)))$

[4.5.1 L1, 3.5.1 L5, etc.]

The shared process oﬀers its result to whichever of the sharing processes is

ready to take it. Since one of these processes is still waiting for output, it is the

process which provided the argument that gets the result. That is why strict

alternation of output and input is so important in calling a shared subroutine.
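A rough Python analogue of the shared doubling subroutine is sketched below. It is not a faithful model of CSP: Python queues are buffered rather than synchronised, so a second caller could deposit its argument before the first has collected its result. To recover the guarantee that the two communications of a call are never separated, this sketch swaps in a lock around each call. The class name `SharedDoubler` and the helper `client` are assumptions made for illustration.

```python
import queue
import threading

class SharedDoubler:
    """Hypothetical stand-in for the shared subordinate process doub."""

    def __init__(self):
        self._left = queue.Queue(maxsize=1)    # arguments sent to the subroutine
        self._right = queue.Queue(maxsize=1)   # results sent back
        self._call = threading.Lock()          # keeps each pair of communications together
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        # D = ?x -> !(x + x) -> D
        while True:
            x = self._left.get()
            self._right.put(x + x)

    def call(self, x):
        # doub!x?y: one output followed immediately by one input
        with self._call:
            self._left.put(x)
            return self._right.get()

def client(doub, values, results):
    """A sharing process that makes a series of calls."""
    for v in values:
        results.append((v, doub.call(v)))

doub = SharedDoubler()
results = []
workers = [
    threading.Thread(target=client, args=(doub, range(k, k + 50), results))
    for k in (0, 100)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

However the two clients are interleaved, every recorded pair consists of an argument and its own doubled result.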

Example

X2 (Shared data structure) In an airline ﬂight reservation system, bookings are

made by many reservation clerks, whose actions are interleaved. Each reserva-


tion adds a passenger to the ﬂight list, and returns an indication whether that

passenger was already booked or not. For this oversimpliﬁed example, the set

implemented in 4.5 X8 will serve as a shared subordinate process, named by

the ﬂight number

AG109 : SET // (. . . (CLERK ||| CLERK ||| . . .) . . .)

Each CLERK books a passenger by the call

AG109!pass no?x

which stands for

(AG109.left!pass no →AG109.right?x →SKIP)

In these two examples, each occasion of use of the shared resource

involves exactly two communications, one to send the parameters and the other

to receive the results; after each pair of communications, the subordinate pro-

cess returns to a state in which it is ready to serve another process, or the same

one again. But frequently we wish to ensure that a whole series of communic-

ations takes place between two processes, without danger of interference by a

third process. For example, a single expensive output device may have to be

shared among many concurrent processes. On each occasion of use, a number

of lines constituting a ﬁle must be output consecutively, without any danger

of interleaving of lines sent by another process. For this purpose, the output

of a ﬁle must be preceded by an acquire which obtains exclusive use of the

resource; and on completion, the resource must be made available again by a

release.

Examples

X3 (Shared line printer)

LP = acquire → µ X • (left?s → h!s → X | release → LP)

Here, h is the channel which connects LP to the hardware of the line printer.

After acquisition, the process LP copies successive lines from its left channel

to its hardware, until a release signal returns it to its original state, in which it

is available for use by any other processes. This process is used as a shared

resource

lp.acquire →. . . lp.left!“A. JONES” →. . .

lp.left!nextline →. . . lp.release →

X4 (An improvement on X3) When a line printer is shared between many

users, the length of paper containing each ﬁle must be manually detached after

output from the previous and the following files. For this purpose, the printing

paper is usually divided into pages, which are separated by perforations; and

the hardware of the printer allows an operation throw, which moves the paper

rapidly to the end of the current page—or better, to the next outward-facing

fold in the paper stack. To assist in separation of output, ﬁles should begin

and end on page boundaries, and a complete row of asterisks should be printed

at the end of the last page of the ﬁle, and at the beginning of the ﬁrst page. To

prevent confusion, no complete line of asterisks is permitted to be printed in

the middle of a ﬁle

LP = (h!throw → h!asterisks →
      acquire → h!asterisks →
      µ X • (left?s → if s ≠ asterisks then
                          h!s → X
                      else
                          X
            | release → LP))

This version of LP is used in exactly the same way as the previous one.
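The behaviour of this version of LP can be imitated by a small state machine. The Python sketch below is an assumption-laden illustration, not a definition: the function line_printer, the constant ASTERISKS and the encoding of events as strings and pairs are all invented here, and the page throw that follows a release is performed eagerly rather than as a separate step.

```python
ASTERISKS = "*" * 40  # assumed representation of a complete row of asterisks


def line_printer(events):
    """Return what is sent on the hardware channel h for a user-side trace.

    events is a list of "acquire", "release" or ("left", line).
    """
    hardware = ["throw", ASTERISKS]            # h!throw -> h!asterisks
    acquired = False
    for event in events:
        if event == "acquire" and not acquired:
            acquired = True
            hardware.append(ASTERISKS)         # row at the start of the file
        elif event == "release" and acquired:
            acquired = False
            hardware += ["throw", ASTERISKS]   # LP starts again from the top
        elif isinstance(event, tuple) and event[0] == "left" and acquired:
            line = event[1]
            if line != ASTERISKS:              # a row of asterisks is dropped
                hardware.append(line)
        else:
            raise ValueError(f"LP would refuse {event!r} in this state")
    return hardware


trace = ["acquire", ("left", "A. JONES"), ("left", ASTERISKS),
         ("left", "nextline"), "release"]
printed = line_printer(trace)
```

The row of asterisks sent by the user in the middle of the file does not reach the hardware, so the only complete rows on paper are those that LP itself prints at the file boundaries.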

In the last two examples, the signals acquire and release prevent

arbitrary interleaving of lines from distinct ﬁles, and they do so without intro-

ducing the danger of deadlock. But if more than one resource is to be shared

in this fashion, the risk of deadlock cannot be ignored.

Example

X5 (Deadlock) Ann and Mary are good friends and good cooks; they share a

pot and a pan, which they acquire, use and release as they need them

UTENSIL = (acquire →use →use →. . . →release →UTENSIL)

pot : UTENSIL // pan : UTENSIL // (ANN ||| MARY)

Ann cooks in accordance with a recipe which requires a pot ﬁrst and then a

pan, whereas Mary needs a pan ﬁrst, then a pot

ANN = . . . pot.acquire →. . . pan.acquire →. . .

MARY = . . . pan.acquire →. . . pot.acquire →. . .

Unfortunately, they decide to prepare a meal at about the same time. Each of

them acquires her ﬁrst utensil; but when she needs her second utensil, she

ﬁnds that she cannot have it, because it is being used by the other.

[Figure: the lives of ANN (vertical axis) and MARY (horizontal axis), each running through the acquisition, use and release of the pot and the pan and ending with eat; a typical trajectory is drawn across the forbidden regions and the danger zone.]

Figure 6.1

The story of Ann and Mary can be visualised on a two-dimensional plot

(Figure 6.1), where the life of Ann is displayed along the vertical axis and Mary’s

life on the horizontal. The system starts in the bottom left hand corner, at the

beginning of both their lives. Each time Ann performs an action, the system

moves one step upward. Each time Mary performs an action, the system moves

one step right. The trajectory shown on the graph shows a typical interleaving

of Ann’s and Mary’s actions. Fortunately, this trajectory reaches the top right

hand corner of the graph where both cooks are enjoying their meal.

But this happy outcome is not certain. Because they cannot simultaneously

use a shared utensil, there are certain rectangular regions in the state space

through which the trajectory cannot pass. For example, in one of the hatched

regions both cooks would be using the pan, and this is not possible. Similarly,

exclusion on the use of the pot prohibits entry into the other hatched region. Thus if the

trajectory reaches the edge of one of these forbidden regions, it can only fol-

low the edge upward (for a vertical edge) or rightward (for a horizontal edge).

During this period, one of the cooks is waiting for release of a utensil by the

other.

Now consider the zone marked with dots. If ever the trajectory enters

this zone, it will inevitably end in deadlock at the top right hand corner of the

zone. The purpose of the picture is to show that the danger of deadlock arises

solely as a result of a concavity in the forbidden region which faces towards

the origin: other concavities are quite safe. The picture also shows that the

only sure way of preventing deadlock is to extend the forbidden region to

cover the danger zone, and so remove the concavity. One technique would be

to introduce an additional artiﬁcial resource which must be acquired before

either utensil, and must not be released until both utensils have been released.

This solution is similar to the one imposed by the footman in the story of the

dining philosophers (Section 2.5.3) where permission to sit down is a kind of

resource, of which only four instances are shared among ﬁve philosophers.

An easier solution is to insist that any cook who is going to want both utensils

must acquire the pan ﬁrst. This example is due to E. W. Dijkstra.

The easier solution suggested for the previous example generalises to any

number of users, and any number of resources. Provided that there is a ﬁxed

order in which all users acquire the resources they want, there is no risk of

deadlock. Users should release the resources as soon as they have ﬁnished with

them; the order of release does not matter. Users may even acquire resources

out of order, provided that at the time of acquisition they have already released

all resources which are later in the standard ordering. Observance of this

discipline of resource acquisition and release can often be checked by a visual

scan of the text of the user processes.
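The effect of a fixed order of acquisition can be checked by searching the state space of Figure 6.1 directly. The Python sketch below is a minimal illustration under stated assumptions: each user is reduced to a list of acquire and release steps, the use actions are omitted, and the names find_deadlocks, ANN, MARY and ANN_PAN_FIRST are introduced only for this example.

```python
def find_deadlocks(programs):
    """Return the reachable states in which some user is stuck for ever.

    Each program is a list of ("acquire", r) or ("release", r) steps; a
    state is the tuple of program counters, one per user.
    """
    start = tuple(0 for _ in programs)
    seen, frontier, deadlocks = {start}, [start], []
    while frontier:
        state = frontier.pop()
        held = set()
        for program, pc in zip(programs, state):
            mine = set()                      # resources held by this user
            for action, resource in program[:pc]:
                if action == "acquire":
                    mine.add(resource)
                else:
                    mine.discard(resource)
            held |= mine
        successors = []
        for user, (program, pc) in enumerate(zip(programs, state)):
            if pc < len(program):
                action, resource = program[pc]
                if action == "release" or resource not in held:
                    successors.append(state[:user] + (pc + 1,) + state[user + 1:])
        unfinished = any(pc < len(p) for p, pc in zip(programs, state))
        if unfinished and not successors:
            deadlocks.append(state)
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return deadlocks


ANN = [("acquire", "pot"), ("acquire", "pan"), ("release", "pot"), ("release", "pan")]
MARY = [("acquire", "pan"), ("acquire", "pot"), ("release", "pan"), ("release", "pot")]
ANN_PAN_FIRST = [("acquire", "pan"), ("acquire", "pot"), ("release", "pot"), ("release", "pan")]

original = find_deadlocks([ANN, MARY])
ordered = find_deadlocks([ANN_PAN_FIRST, MARY])
```

With the original recipes the search finds the single deadlocked state in which each cook holds her first utensil; when both cooks acquire the pan first, no deadlocked state is reachable, whatever the order of release.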

6.3 Shared storage

The purpose of this section is to argue against the use of shared storage; the

section may be omitted by those who are already convinced.

The behaviour of systems of concurrent processes can readily be imple-

mented on a single conventional stored program computer, by a technique

known as timesharing, in which a single processor executes each of the pro-

cesses in alternation, with process change on occurrence of interrupt from an

external device or from a regular timer. In this implementation, it is very easy

to allow the concurrent processes to share locations of common storage, which

are accessed and assigned simply by means of the usual machine instructions

within the code for each of the processes.

A location of shared storage can be modelled in our theory as a shared

variable (4.2 X7) with the appropriate symbolic name, for example

(count : VAR //

(count.left!0 →(P ||| Q)))

Shared storage must be clearly distinguished from the local storage described

in 5.5. The simplicity of the laws for reasoning about sequential processes de-

rives solely from the fact that each variable is updated by at most one process;

and these laws do not deal with the many dangers that arise from arbitrary

interleaving of assignments from diﬀerent processes.

These dangers are most clearly illustrated by the following example.

Example

X1 (Interference) The shared variable count is used to keep a count of the total

number of occurrences of some important event. On each occurrence of the

event, the relevant process P or Q attempts to update the count by the pair of

communications

count.right?x; count.left!(x +1)

Unfortunately, these two communications may be interleaved by a similar pair

of communications from the other process, resulting in the sequence

count.right?x →

count.right?y →

count.left!(y +1) →

count.left!(x +1) →. . .

As a consequence, the value of the count is incremented only by one instead

of two. This kind of error is known as interference, and it is an easy mistake in

the design of processes which share common storage. Further, the actual oc-

currence of the fault is highly nondeterministic; it is not reliably reproducible,

and so it is almost impossible to diagnose the error by conventional testing

techniques. As a result, I suspect that there are several operating systems in

current use which regularly produce slightly inaccurate summaries, statistics,

and accounts.
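Because the fault depends on timing, it is more reliably exposed by enumeration than by testing. The following Python sketch, a simplified model with invented names (interleavings, final_counts), runs every interleaving of the two pairs of communications on a count that starts at zero and records the final values.

```python
def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence in order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def final_counts():
    """Return the set of final values of count over all interleavings."""
    p = [("P", "read"), ("P", "write")]
    q = [("Q", "read"), ("Q", "write")]
    finals = set()
    for schedule in interleavings(p, q):
        count, local = 0, {}
        for who, action in schedule:
            if action == "read":
                local[who] = count        # count.right?x
            else:
                count = local[who] + 1    # count.left!(x + 1)
        finals.add(count)
    return finals


possible = final_counts()
```

Of the six interleavings, only the two in which one process completes its pair before the other begins give the correct total of two; the other four lose an increment.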

A possible solution to this problem is to make sure that no change of

process takes place during a sequence of actions which must be protected

from interleaving. Such a sequence is known as a critical region. On an imple-

mentation by a single processor, the required exclusion is often achieved by

inhibiting all interrupts for the duration of the critical region. This solution

has an undesirable eﬀect in delaying response to interrupts; and worse, it fails

completely as soon as a second processing unit is added to the computer.

A better solution was suggested by E. W. Dijkstra in his introduction of

the binary exclusion semaphore. A semaphore may be described as a process

which engages alternately in actions named P and V

SEM = (P →V →SEM)

This is declared as a shared resource

(mutex : SEM // . . .)

Each process, on entry into a critical region, must send the signal

mutex.P

and on exit from the critical region must engage in the event

mutex.V

Thus the critical region in which the count is incremented should appear

mutex.P →

count.right?x →count.left!(x +1) →

mutex.V →. . .

Provided that all processes observe this discipline, it is impossible for two pro-

cesses to interfere with each other’s updating of the count. But if any process

omits a P or a V, or gets them in the wrong order, the effect will be chaotic,

and will risk a disastrous or (perhaps worse) a subtle error.
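The protection given by the semaphore can be confirmed by the same kind of enumeration. In the Python sketch below, which is an illustrative model rather than an implementation, each process brackets its update with P and V, and only those interleavings that the process SEM would permit are kept; the helper names interleavings, respects_semaphore and final_counts_with_mutex are assumptions of the sketch.

```python
def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence in order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def respects_semaphore(schedule):
    """True when the mutex events alternate P, V, P, V as SEM requires."""
    taken = False
    for _, action in schedule:
        if action == "P":
            if taken:
                return False
            taken = True
        elif action == "V":
            if not taken:
                return False
            taken = False
    return True


def final_counts_with_mutex():
    """Return the final values of count over all schedules SEM allows."""
    body = ["P", "read", "write", "V"]
    first = [("first", a) for a in body]
    second = [("second", a) for a in body]
    finals = set()
    for schedule in interleavings(first, second):
        if not respects_semaphore(schedule):
            continue
        count, local = 0, {}
        for who, action in schedule:
            if action == "read":
                local[who] = count
            elif action == "write":
                count = local[who] + 1
        finals.add(count)
    return finals


protected = final_counts_with_mutex()
```

Every schedule that survives the filter runs one critical region to completion before the other starts, so the count always reaches two. The sketch assumes that both processes observe the discipline; it offers no protection against a process that omits its P or V.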

A much more robust way to prevent interference is to build the required

protection into the very design of the shared storage, taking advantage of

knowledge of its intended pattern of usage. For example, if a variable is to

be used only for counting, then the operation which increments it should be a

single atomic operation

count.up

and the shared resource should be designed like CT_0 (1.1.4 X2)

count : CT_0 // (. . . P ||| Q . . .)

In fact there are good reasons for recommending that each shared resource

be specially designed for its purpose, and that pure storage should not be

shared in the design of a system using concurrency. This not only avoids the

grave dangers of accidental interference; it also produces a design that can

be implemented eﬃciently on networks of distributed processing elements as

well as single-processor and multiprocessor computers with physically shared

store.

6.4 Multiple resources

In Section 6.2, we described how a number of concurrent processes with diﬀer-

ent behaviour could share a single subordinate process. Each sharing process

observes a discipline of alternating input and output, or alternating acquire

and release signals, to ensure that at any given time the resource is used by at

most one of the potentially sharing processes. Such resources are known as

serially reusable. In this section we introduce arrays of processes to represent

multiple resources with identical behaviour; and indices in the array ensure

that each element communicates safely with the process that has acquired it.

We shall therefore make substantial use of indices and indexed operators,

with obvious meaning. For example

||_{i<12} P_i = (P_0 || P_1 || . . . || P_11)

|||_{i<4} P = (P ||| P ||| P ||| P)

||_{i≥0} P_i = (P_0 || P_1 || . . .)

□_{i≥0} (f(i) → P_i) = (f(0) → P_0 | f(1) → P_1 | . . .)

In the last example, we insist that f is a one-one function, so that the choice

between the alternatives is made solely by the environment.

Examples

X1 (Re-entrant subroutine) A shared subroutine that is serially reusable can

be used by only one calling process at a time. If the execution of the subroutine

requires a considerable calculation, there could be corresponding delays to

the calling processes. If several processors are available to perform the cal-

culations, there is good reason to allow several instances of the subroutine to

proceed concurrently on diﬀerent processors. A subroutine capable of several

concurrent instances is known as re-entrant, and it is deﬁned as an array of

concurrent processes

doub : (||_{i<27} (i : DOUBLE)) // . . .

A typical call of this subroutine could be

(doub.3.left!30 →doub.3.right?y →SKIP)

The use of the index 3 ensures that the result of the call is obtained from the

same instance of doub to which the arguments were sent, even though some

other concurrent process may at the same time call another instance of the

array, resulting in an interleaving of the messages

doub.3.left.30, . . . doub.2.left.20,

. . . doub.3.right.60, . . . doub.2.right.40, . . .

When a process calls a re-entrant subroutine, it really does not matter

which element of the array responds to the call; any one that happens to be

free will be equally good. So rather than specifying a particular index 2 or 3, a

calling process should leave the selection arbitrary, by using the construct

□_{i≥0} (doub.i.left!30 → doub.i.right?y → SKIP)

This still observes the essential discipline that the same index is used for send-

ing the arguments and (immediately after) for receiving the result.
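The discipline of using one index for both halves of a call can be pictured as follows. This Python sketch is a loose analogy under stated assumptions: the class DoublePool and its methods send and receive are invented for illustration, and the nondeterministic choice among free instances is replaced by taking the lowest free index.

```python
class DoublePool:
    """A hypothetical array of doubling instances, indexed from 0."""

    def __init__(self, size):
        self.pending = [None] * size     # None marks an idle instance

    def send(self, arg):
        """Model doub.i.left!arg for some free i; return the chosen index."""
        for i, slot in enumerate(self.pending):
            if slot is None:
                self.pending[i] = arg + arg
                return i
        raise RuntimeError("no free instance: a CSP caller would wait here")

    def receive(self, i):
        """Model doub.i.right?y on the same index that received the argument."""
        result = self.pending[i]
        if result is None:
            raise RuntimeError("instance has no result to offer")
        self.pending[i] = None
        return result


doub = DoublePool(27)
i = doub.send(30)        # one caller
j = doub.send(20)        # another caller, interleaved with the first
y_first = doub.receive(i)
y_second = doub.receive(j)
```

Since each caller keeps the index returned when its argument was sent, the two results cannot be exchanged, however the four communications are interleaved.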

In the example shown above, there is an arbitrary limit of 27 simultaneous

activations of the subroutine. Since it is fairly easy to arrange that a single

processor divides its attention among a much larger number of processes, such

arbitrary limits can be avoided by introducing an inﬁnite array of concurrent

processes

doub : (||_{i≥0} i : D)

where D can now be designed to serve only a single call and then stop

D = left?x →right!(x +x) →STOP

A subroutine with no bound on its re-entrancy is known as a procedure.

The intention in using a procedure is that the eﬀect of each call

□_{i≥0} (doub.i.left!x → doub.i.right?y → SKIP)

should be identical to the call of a subordinate process D declared immediately

adjacent to the call

(doub : D // (doub.left!x →doub.right?y →SKIP))

This latter is known as a local procedure call, since it suggests execution of

the procedure on the same processor as the calling process; whereas the call

of a shared procedure is known as a remote call, since it suggests execution

on a separate possibly distant processor. Since the eﬀect of remote and local

calls is intended to be the same, the reasons for using the remote call can only

be political or economic—e.g., to keep the code of the procedure secret, or to

run it on a machine with special facilities which are too expensive to provide

on the machine on which the using processes run.

A typical example of an expensive facility is a high-volume backing store,

such as a disk or bubble memory.

X2 (Shared backing storage) A storage medium is split into B sectors which

can be read and written independently. Each sector can store one block of

information, which it inputs on the left and outputs on the right. Unfortunately

the storage medium is implemented in a technology with destructive read-out,

so that each block written can be read only once. Thus each sector behaves

like COPY (4.2 X1) rather than VAR (4.2 X7). The whole backing store is an

array of such sectors, indexed by numbers less than B

BSTORE = ||_{i<B} i : COPY

This store is intended for use as a subordinate process

(back : BSTORE // . . .)

Within its main process, the store may be used by the communications

back.i.left!bl → . . . back.i.right?y → . . .

The backing store may also be shared by concurrent processes. In this case,

the action

□_{i<B} (back.i.left!bl → . . .)

will simultaneously acquire an arbitrary free sector with number i, and write

the value of bl into it. Similarly, back.i.right?x will in a single action both

read the content of sector i into x and release this sector for use on another

occasion, very possibly by another process. It is this simpliﬁcation that is the

real motive for using COPY to model the behaviour of each sector: the story

of destructive read-out is just a story.

Of course, successful sharing of this backing store requires the utmost

discipline on the part of the sharing processes. A process may input from a

sector only if the same process has most recently output to that very sector;

and each output must eventually be followed by such an input. Failure to ob-

serve such disciplines will lead to deadlock, or even worse confusion. Methods

of enforcing this discipline painlessly will be introduced after the next example,

and will be extensively illustrated in the subsequent design of modules of an

operating system (Section 6.5).

X3 (Two line printers) Two identical line printers are available to serve the

demands of a collection of using processes. They both need the kind of protec-

tion from interleaving that was provided by LP (6.2 X4). We therefore declare

an array of two instances of LP, each of which is indexed by a natural number

indicating its position in the array

LP2 = (0 : LP || 1 : LP)

This array may itself be given a name for use as a shared resource

(lp : LP2 // . . .)

Each instance of LP is now prefixed twice, once by a name and once by an index;

thus communications with the using process have three or four components,

e.g.,

lp.0.acquire, lp.1.left.“A.JONES”, . . .

As in the case of a re-entrant procedure, when a process needs to acquire

one of an array of identical resources, it really cannot matter which element

of the array is selected on a given occasion. Any element which is ready to

respond to the acquire signal will be acceptable. A general choice construction

will make the required arbitrary choice

□_{i≥0} (lp.i.acquire → . . . lp.i.left!x → . . . lp.i.release → SKIP)

Here, the initial lp.i.acquire will acquire whichever of the two LP processes

is ready for this event. If neither is ready, the acquiring process will wait; if

both are ready, the choice between them is nondeterministic. After the initial

acquisition, the bound variable i takes as its value the index of the selected

resource, and all subsequent communications will be correctly directed to that

same resource.

When a shared resource has been acquired for temporary use within an-

other process, the resource is intended to behave exactly like a locally declared

subordinate process, communicating only with its using subprocess. Let us

therefore adapt the familiar notation for subordination, and write

(myﬁle :: lp // . . . myﬁle.left!x . . .)

instead of the much more cumbersome construction

□_{i≥0} (lp.i.acquire → . . . lp.i.left!x . . . ; lp.i.release → SKIP)

Here, the local name myﬁle has been introduced to stand for the indexed name

lp.i, and the technicalities of acquisition and release have been conveniently

suppressed. The new “::” notation is called remote subordination; it is dis-

tinguished from the familiar “:” notation in that it takes on its right, not a

complete process, but the name of a remotely positioned array of processes.
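Readers familiar with scoped resource management may find the following analogy helpful. It is a Python sketch only, with the class Printer and the function remote invented for the purpose: acquisition picks any free element of the array, the local name stands for that element inside the block, and release happens automatically on leaving it.

```python
from contextlib import contextmanager


class Printer:
    """A hypothetical stand-in for one element of the array lp."""

    def __init__(self):
        self.busy = False
        self.lines = []


@contextmanager
def remote(array):
    """Acquire any free element of array, yield it, and release it on exit."""
    for resource in array:
        if not resource.busy:
            break
    else:
        raise RuntimeError("all elements busy: a CSP process would wait here")
    resource.busy = True              # lp.i.acquire
    try:
        yield resource                # the local name stands for lp.i
    finally:
        resource.busy = False         # lp.i.release


lp = [Printer(), Printer()]
with remote(lp) as myfile:
    myfile.lines.append("A. JONES")   # myfile.left!x
```

Unlike the CSP construction, this sketch raises an error instead of waiting when no element is free, since a single sequential program has no other process that could perform the release.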

X4 (Two output ﬁles) A using process requires simultaneous use of two line

printers to output two ﬁles, f1 and f2

(f1 :: lp // (f2 :: lp // . . . f1.left!s1 →f2.left!s2 →. . .))

Here, the using process interleaves output of lines to the two diﬀerent ﬁles;

but each line is printed on the appropriate printer. Of course, deadlock will

be the certain result of any attempt to declare three printers simultaneously;

it is also a likely result of declaring two printers simultaneously in each of two

concurrent processes, as shown in the history of Ann and Mary (6.2 X5).

X5 (Scratch ﬁle) A scratch ﬁle is used for output of a sequence of blocks.

When the output is complete, the ﬁle is rewound, and the entire sequence of

blocks is read back fromthe beginning. When all the blocks have been read, the

scratch ﬁle will then give only empty signals; no further reading or writing is

possible. Thus a scratch ﬁle behaves like a ﬁle output to magnetic tape, which

must be rewound before being read. The empty signal serves as an end-of-ﬁle

marker

SCRATCH = WRITE_⟨⟩

WRITE_s = (left?x → WRITE_{s⌢⟨x⟩}
          | rewind → READ_s)

READ_{⟨x⟩⌢s} = (right!x → READ_s)

READ_⟨⟩ = (empty → READ_⟨⟩)

This may conveniently be used as a simple unshared subordinate process

(myfile : SCRATCH // . . . myfile.left!v . . . myfile.rewind . . .
    . . . (myfile.right?x → . . .
          | myfile.empty → . . .) . . .)

It will serve later as a model for a shared process.
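The two phases of SCRATCH correspond to a simple state machine. The Python sketch below is an informal rendering under assumptions of our own: the class Scratch and the method next_event are invented names, and the choice that the file offers to its user between right and empty is modelled by returning a pair that says which of the two events occurred.

```python
from collections import deque


class Scratch:
    """A hypothetical model of SCRATCH: write, rewind, then read back."""

    def __init__(self):
        self.blocks = deque()      # the sequence s
        self.reading = False       # False in WRITE_s, True in READ_s

    def left(self, x):
        """left?x, possible only before the rewind."""
        if self.reading:
            raise RuntimeError("no writing is possible after rewind")
        self.blocks.append(x)

    def rewind(self):
        if self.reading:
            raise RuntimeError("the file has already been rewound")
        self.reading = True

    def next_event(self):
        """Return ("right", x) while blocks remain, then ("empty", None)."""
        if not self.reading:
            raise RuntimeError("the file must be rewound before reading")
        if self.blocks:
            return ("right", self.blocks.popleft())
        return ("empty", None)


myfile = Scratch()
for v in ["a", "b", "c"]:
    myfile.left(v)
myfile.rewind()
read_back = [myfile.next_event() for _ in range(5)]
```

After the three blocks have been read back in order, every further request yields the empty signal, which is how the user recognises the end of the file.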

X6 (Scratch ﬁles on backing store) The scratch ﬁle described in X5 can be

readily implemented by holding the stored sequence of blocks in the main

store of a computer. But if the blocks are large and the sequence is long, this

could be an uneconomic use of main store, and it would be better to store the

blocks on a backing store.

Since each block in a scratch ﬁle is read and written only once, a backing

store (X2) with destructive read-out will suﬃce. An ordinary scratch ﬁle (held

in main store) is used to hold the sequence of indices of the sectors of backing

store on which the corresponding actual blocks of information are held; this

ensures that the correct blocks are read back, and in the correct sequence

BSCRATCH = (pagetable : SCRATCH //
    µ X • (left?x → (□_{i<B} back.i.left!x → pagetable.left!i → X)
          | rewind →
              pagetable.rewind →
              µ Y • (pagetable.right?i →
                        back.i.right?x → right!x → Y
                    | pagetable.empty → empty → Y)))

BSCRATCH uses the name back to address a backing store (X2) as a subordinate

process. This should be supplied

SCRATCHB = (back : BSTORE // BSCRATCH)

SCRATCHB can be used as a simple unshared subordinate process in exactly

the same way as the scratch ﬁle of X5

(myﬁle : SCRATCHB // . . . myﬁle.left!v . . .)

The eﬀect is exactly the same as use of SCRATCH, except that the maximum

length of the scratch ﬁle is limited to B blocks.
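The division of labour between the page table and the backing store may be easier to see in a conventional program. The following Python sketch is an illustration under stated assumptions, not a rendering of the formal definition: BackingStore and BackedScratch are invented names, the arbitrary choice of a free sector is replaced by the lowest free index, and a full store raises an error where the CSP process would simply wait.

```python
from collections import deque


class BackingStore:
    """Hypothetical array of B sectors, each behaving like COPY."""

    def __init__(self, sectors):
        self.content = [None] * sectors
        self.full = [False] * sectors

    def write_any(self, block):
        """Model the choice over i < B of back.i.left!block; return i."""
        for i, occupied in enumerate(self.full):
            if not occupied:
                self.content[i], self.full[i] = block, True
                return i
        raise RuntimeError("backing store full: the writer would wait here")

    def read(self, i):
        """Model back.i.right?x: destructive read that frees sector i."""
        if not self.full[i]:
            raise RuntimeError("sector is empty")
        self.full[i] = False
        return self.content[i]


class BackedScratch:
    """Scratch file whose blocks live on a BackingStore."""

    def __init__(self, store):
        self.store = store
        self.pagetable = deque()   # sector indices, in order of writing
        self.reading = False

    def left(self, x):
        if self.reading:
            raise RuntimeError("no writing is possible after rewind")
        self.pagetable.append(self.store.write_any(x))

    def rewind(self):
        self.reading = True

    def next_event(self):
        if not self.reading:
            raise RuntimeError("the file must be rewound before reading")
        if self.pagetable:
            return ("right", self.store.read(self.pagetable.popleft()))
        return ("empty", None)


back = BackingStore(4)
back.write_any("someone else's block")    # sector 0 is already in use
myfile = BackedScratch(back)
myfile.left("a")
myfile.left("b")
myfile.rewind()
read_back = [myfile.next_event() for _ in range(3)]
```

The page table holds only sector numbers, so the main store used by the file grows with the number of blocks and not with their size, and each sector is freed at the moment its block is read back.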

X7 (Serially reused scratch ﬁles) Suppose we want to share the scratch ﬁle

on backing store among a number of interleaved users, who will acquire, use,

and release it one at a time in the manner of a shared line printer (6.2 X3). For

this purpose, we must adapt BSCRATCH to accept acquire and release signals.

If a user releases his scratch ﬁle before reading to the end, there is a danger

that the unread blocks on backing store will never be reclaimed. This danger

can be averted by a loop that reads back these blocks and discards them

SCAN = µ X • (pagetable.right?i →
                  back.i.right?x →
                  X
              | pagetable.empty → SKIP)

A shared scratch ﬁle acquires its user, and then behaves as BSCRATCH.

The release signal causes an interrupt (Section 5.4) to the SCAN process

SHBSCR = acquire →
         (BSCRATCH △ (release → SCAN))

The serially reusable scratch file is provided by the simple loop

*SHBSCR

which uses BSTORE as a subordinate process

back : BSTORE // *SHBSCR

X8 (Multiplexed scratch files) In the previous two examples only one scratch

ﬁle is in use at a time. A backing store is usually suﬃciently large to allow

many scratch ﬁles to exist simultaneously, each occupying a disjoint subset

of the available sectors. The backing store can therefore be shared among

an unbounded array of scratch ﬁles. Each scratch ﬁle acquires a sector when

needed by outputting to it, and releases it automatically on inputting that block

again. The backing store is shared by the technique of multiple labelling (Sec-

tion 2.6.4), using as labels the same indices (natural numbers) which are used

in constructing the array of sharing processes

FILESYS = N : (back : BSTORE) // (||_{i≥0} i : SHBSCR)

where N = { i | i ≥ 0 }

This ﬁling system is intended to be used as a subordinate process, shared

by interleaving among any number of users

ﬁlesys : FILESYS // . . . (USER1 ||| USER2 ||| . . .)

Inside each user, a fresh scratch ﬁle can be acquired, used, and released by

remote subordination

myfile :: filesys // (. . . myfile.left!v . . . myfile.rewind . . . myfile.right?x . . .)

which is intended (apart from resource limitations) to have exactly the same

eﬀect as the simple subordination of a private scratch ﬁle (X5)

(myﬁle : SCRATCH //

. . . myﬁle.left!v . . . myﬁle.rewind . . . myﬁle.right?x . . .)

The structure of the filing system (X8) and its mode of use is a paradigm

solution of the problem of sharing a limited number of actual resources (sec-

tors on backing store) among an unknown number of users. The users do not

communicate directly with the resources; there is an intermediary virtual re-

source (the SHBSCR) which they declare and use as though it were a private

subordinate process. The function of the virtual resource is twofold

(1) It provides a nice clean interface to the user; in this example, SHBSCR glues

together into a single contiguous scratch ﬁle a set of sectors scattered on

backing store.

(2) It guarantees a proper, disciplined access to the actual resources; for

example, the process SHBSCR ensures that each user reads only from sectors

allocated to that user, and cannot forget to release sectors on ﬁnishing with

a scratch ﬁle.

Point (1) ensures that the discipline of Point (2) is painless.

The paradigm of actual and virtual resources is very important in the

design of resource-sharing systems. The mathematical deﬁnition of the paradigm

is quite complicated, since it uses an unbounded set of natural numbers to im-

plement the necessary dynamic creation of new virtual processes, and of new

channels through which to communicate with them. In a practical implement-

ation on a computer, these would be represented by control blocks, pointers to

activation records, etc. To use the paradigm effectively, it is certainly advisable

to forget the implementation method. But for those who wish to understand

it more fully before forgetting it, the following further explanation of X8 may

be helpful.

Inside a user process a scratch file is created by remote subordination

myﬁle :: ﬁlesys // (. . . myﬁle.left!v . . . myﬁle.rewind . . . myﬁle.right?x . . .)

By deﬁnition of remote subordination this is equivalent to

(□_{i≥0} filesys.i.acquire →
    filesys.i.left!v . . . filesys.i.rewind . . . filesys.i.right?x . . .
    filesys.i.release → SKIP)

Thus all communications between ﬁlesys and its users begin with ﬁlesys.i . . .

where i is the index of the particular instance of SHBSCR which has been ac-

quired by a particular user on a particular occasion. Furthermore, each occa-

sion of its use is surrounded by a matching pair of signals ﬁlesys.i.acquire and

ﬁlesys.i.release.

On the side of the subordinate process, each virtual scratch file begins by

acquiring its user, and then continues according to the pattern of X5 and X6

(acquire →. . . left?x . . . rewind . . . right!v . . . release . . .)

All other communications of the virtual scratch ﬁle are with the subordinate

BSTORE process, and are concealed from the user. Each instance of the virtual

scratch ﬁle is indexed by a diﬀerent index i, and then named by the name

ﬁlesys. So the externally visible behaviour of each instance is

(ﬁlesys.i.acquire →

ﬁlesys.i.left?x . . . ﬁlesys.i.rewind . . . ﬁlesys.i.right!v . . .

ﬁlesys.i.release)

This exactly matches the user’s pattern of communication as described above.

The matching pairs of acquire and release signals ensure that no user can

interfere with a scratch ﬁle that has been acquired by another user.

We turn now to communications within FILESYS between the virtual scratch

files and the backing store. These are concealed from the user, and do not even

have the name ﬁlesys attached to them. The relevant events are:

i.back.j.left.v denotes communication of block v from the ith element of

the array of scratch ﬁles to the jth sector of backing store

i.back.j.right.v denotes a communication in the reverse direction.

Each sector of the backing store behaves like COPY. After indexing with a

sector number j and naming by back, the jth sector behaves like

µ X • (back.j.left?x →back.j.right!x →X)

After multiple labelling by natural numbers it behaves like

µ X • (□_{i≥0} i.back.j.left?x → (□_{k≥0} k.back.j.right!x → X))

This is now ready to communicate on any occasion with any element of the

array of virtual scratch ﬁles. Each individual scratch ﬁle observes the discip-

line of reading only from those sectors which the scratch ﬁle itself has most

recently written to.

In the above description the role of the natural numbers i and j is merely

to permit any scratch ﬁle to communicate with any sector on disc, and to com-

municate safely with the user that has acquired it. The indices therefore serve

as a mathematical description of a kind of crossbar switch which is used in

a telephone exchange to allow any subscriber to communicate with any other

subscriber. A crude picture of this can be drawn as in Fig 6.2.

[Figure: the users (USER) are connected by a crossbar to the scratch files (SHBSCR), which are connected by a second crossbar to the sectors of the backing store (COPY).]

Figure 6.2

If the number of sectors in the backing store is inﬁnite, FILESYS behaves

exactly like a similarly constructed array of simple scratch ﬁles

||_{i≥0} i : (acquire → (SCRATCH △ (release → STOP)))

With a backing store of ﬁnite size, there is a danger of deadlock if the

backing store gets full at a time when all users are still writing to their scratch

ﬁles. In practice, this risk is usually reduced to insigniﬁcance by delaying

acquisition of new ﬁles when the backing store is nearly full.

6.5 Operating systems

The human users of a single large computer submit programs on decks of cards

for execution. The data for each program immediately follows it. The task of

a batch processing operating system is to share the resources of the computer

eﬃciently among these jobs. For this we postulate that each user’s program

is executed by a process called JOB, which inputs the cards of the program on

channel cr.right, runs the program on the data which immediately follows it

in the card reader, and outputs the results of execution on the channel lp.left.

We do not need to know about the internal structure of JOB—in early days it

used to be a FORTRAN monitor system. However, we need to rely on the fact

that it will terminate successfully within a reasonable interval after starting.

The alphabet of JOB is therefore deﬁned

αJOB = {cr.right, lp.left, ✓}

If LPH represents the hardware of the line printer and CRH represents the

hardware of the card reader, a single job for a single user will be executed by

JOB1 = (cr : CRH // lp : LPH // JOB)

An operating system that runs just one job and then terminates is not

much use. The simplest method of sharing a single computer among many

users is to run their jobs serially, one after the other

BATCH0 = (cr : CRH // lp : LPH // ∗JOB)

But this design ignores some important administrative details, such as separ-

ation of ﬁles output by each job, and separation of card decks containing each

job from the previous job, so that one job cannot read the cards containing its

successor. To solve these problems we use the LP process deﬁned in 6.2 X4,

and a CR process deﬁned below (X1)

JOBS = ∗((cr.acquire → lp.acquire → JOB) ;
         (cr.release → lp.release → SKIP))

BATCH1 = (cr : CR // lp : LP // JOBS)

BATCH1 is an abstract description of the simplest viable operating system,

sharing a computer among many users whose jobs are executed one at a time in

succession. The operating system expedites the transition between successive

jobs, and protects each job from possible interference by its predecessors.

Examples

X1 (A shared card reader) A special separator card is inserted at the front

of each jobﬁle loaded into the card reader. The card reader is acquired to read

all cards of a single jobﬁle and is then released. If the user attempts to read

beyond a separator card, further copies of the separator card are supplied,

without further input from the card reader. If the user fails to read up to the

separator card, the left-over cards are ﬂushed out. Superﬂuous separators are

ignored. Input from the hardware is achieved by h?x.

The shared card reader needs to read one card ahead, so the value of the

buﬀered card is used as an index

CR = h?x → if x = separator then CR else (acquire → CR_x)

CR_x = (right!x → h?y →
          if y ≠ separator then
            CR_y
          else
            µ X • (right!separator → X | release → CR)
       | release →
          µ X • (h?y → if y = separator then CR else X))

After ignoring an initial subsequence of separators, this process acquires

its user and copies on its right channel the sequence of nonseparator cards

which it reads from the hardware. On detecting a separator card, its value is

replicated as necessary until the user releases the resource. But if the user

releases the reader before the separator card is reached, the remaining cards

of the deck up to the next separator card must be ﬂushed out by reading and

ignoring them.
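The behaviour just described can be sketched in Python as a small class. This is an illustrative model rather than a translation of the CSP definition: the class name SharedCardReader, the SEPARATOR value, and the decision to treat exhausted hardware input as a separator are all assumptions of the sketch.

```python
SEPARATOR = "<separator>"   # hypothetical representation of the separator card


class SharedCardReader:
    """Illustrative model of CR: it always holds one card read ahead."""

    def __init__(self, hardware_cards):
        self._hw = iter(hardware_cards)   # stands in for input on channel h
        self._buffered = None             # the index x of CR_x

    def _next_card(self):
        # Assumption of this sketch: when the hardware runs dry, behave as if
        # a separator had been read (the CSP process would simply wait).
        return next(self._hw, SEPARATOR)

    def acquire(self):
        # CR: ignore superfluous separators, then buffer the first real card.
        for card in self._hw:
            if card != SEPARATOR:
                self._buffered = card
                return
        raise EOFError("no job file left in the reader")

    def read(self):
        # CR_x: output the buffered card and read one card ahead; once the
        # separator is reached it is replicated without further input.
        card = self._buffered
        if card != SEPARATOR:
            self._buffered = self._next_card()
        return card

    def release(self):
        # Flush the left-over cards of this deck up to the next separator.
        card = self._buffered
        while card != SEPARATOR:
            card = self._next_card()
        self._buffered = None


deck = [SEPARATOR, "a1", "a2", SEPARATOR, SEPARATOR,
        "b1", "b2", "b3", SEPARATOR, "c1"]
cr = SharedCardReader(deck)

cr.acquire()
first_job = [cr.read() for _ in range(4)]   # reads beyond the end of its deck
cr.release()

cr.acquire()
second_job = [cr.read()]                    # stops early; b2 and b3 are flushed
cr.release()

cr.acquire()
third_job = [cr.read(), cr.read()]
cr.release()
```

The first job receives copies of the separator once its own cards are used up, and the second job's unread cards never reach the third job.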

The BATCH1 operating system is logically complete. However, as the hardware
of the central processor gets faster, it outstrips the capacity of readers

and printers to supply input and transmit output. In order to establish a match

between input, output, and processing speeds, it is necessary to use two or

more readers and printers. Since only one job at a time is being processed,

the extra readers should be occupied in reading ahead the card ﬁle for the fol-

lowing job or jobs, and the extra printers should be occupied in printing the

output ﬁle for the previous job or jobs. Each input ﬁle must therefore be held

temporarily in a scratch ﬁle during the period between actual input on a card

reader and its consumption by JOB; and each output ﬁle must be similarly buf-

fered during the period between production of the lines by JOB and its actual

output on the line printer. This technique is known as spooling.

The overall structure of a spooled operating system is

OPSYS1 =

insys : INSPOOL //

outsys : OUTSPOOL //

BATCH

Here BATCH is like BATCH1, except that it uses remote subordination to ac-

quire any one of the waiting input ﬁles, and also to acquire an output ﬁle which

is destined for subsequent printing

BATCH = ∗(cr :: insys // lp :: outsys // JOB)

The spoolers are deﬁned in the next two examples.


X2 (Spooled output) A single virtual printer uses a temporary scratch ﬁle

(6.4 X5) to store blocks that have been output by its using process. When the

using process signals release of the virtual printer, then an actual printer (6.4

X3) is acquired to output the content of the temporary ﬁle

VLP = (temp : SCRATCH //
       µ X • left?x → temp.left!x → X
           | release → temp.rewind →
               (actual :: lp //
                µ Y • (temp.right?y → actual.left!y → Y
                      | temp.empty → SKIP)))

The requisite unbounded array of virtual line printers is deﬁned

VLPS = ||_{i≥0} i : (acquire → VLP)

Since we want the actual line printers (6.4 X3) to be used only in spooling mode,

we can declare them local to the spooling system using multiple labelling to

share them among all elements of the array VLPS as in 6.4 X8

OUTSPOOL = (N : (lp : LP2) // VLPS)

X3 (Spooled input) Input spooling is very similar to output spooling, except

that a real card reader is acquired ﬁrst, and is released at the end of the input

for a single job; a user process is then acquired to execute the job, and the

contents of the cards are output to it

VCR = temp : SCRATCH //
      (actual :: cr //
        (µ X • actual.right?x → if x = separator then
                                   SKIP
                                 else
                                   temp.left!x → X));
      (temp.rewind → acquire →
        (µ Y • (temp.right?x → right!x → Y
               | temp.empty → right!separator → Y)) △
        (release → SKIP))

INSPOOL = (N : cr : (0 : cr || 1 : cr)) // (||_{i≥0} i : VCR)


The input and output spoolers now supply an unbounded number of vir-

tual card readers and virtual line printers for use of the JOB process. As a

result, it is possible for two or more JOB processes to run concurrently, shar-

ing these virtual resources. Since no communication is required between these

jobs, simple interleaving is the appropriate sharing method. This technique is

known as multiprogramming; or if more than one actual hardware processor

is used, it is known as multiprocessing. However, the logical eﬀects of multi-

programming and multiprocessing are the same. Indeed, the operating system

deﬁned below has the same logical speciﬁcation as OPSYS1 deﬁned above

OPSYS = insys : INSPOOL // outsys : OUTSPOOL // BATCH4

where

BATCH4 = (|||_{i<4} BATCH)

In mathematics, the change to multiprogramming has been remarkably simple:

historically it caused much greater agony.
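The step from OPSYS1 to OPSYS can be imitated in a few lines of Python. The sketch below is only an analogy, not the CSP system itself: the name run_opsys, the use of queues as spools, and the function standing in for JOB are assumptions made for illustration. Each worker thread plays the part of one BATCH, and changing workers from 1 to 4 is the only difference between the two systems.

```python
import queue
import threading


def run_opsys(jobfiles, job, workers=4):
    """Run every job file through `job` using `workers` interleaved batches."""
    insys = queue.Queue()     # spooled input: one complete card file per entry
    outsys = queue.Queue()    # spooled output: one complete print file per entry
    for cards in jobfiles:
        insys.put(cards)

    def batch():
        while True:
            try:
                cards = insys.get_nowait()   # acquire any waiting input file
            except queue.Empty:
                return                       # no more jobs to run
            lines = job(cards)               # JOB sees only its own virtual devices
            outsys.put(lines)                # release: the whole file goes to print

    threads = [threading.Thread(target=batch) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [outsys.get() for _ in range(len(jobfiles))]


jobfiles = [[1, 2], [3], [4, 5, 6]]


def double(cards):
    # A stand-in for JOB: one output line per input card.
    return [2 * c for c in cards]


serial = run_opsys(jobfiles, double, workers=1)
multi = run_opsys(jobfiles, double, workers=4)
```

Because each job's output is handed over as a single file, the print files are never interleaved; only the order in which complete files appear may differ between the two runs.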

In the design of the VLP process within OUTSPOOL (X2) the subordinate

process SCRATCH was used to store the lines produced by each JOB until they

are output on an actual printer. In general the output ﬁles are too large to be

held in the main store of a computer, and should be held on a backing store

as illustrated in 6.4 X8. All the temporary ﬁles should share the same backing

store, so we need to replace the subordinate process

temp : SCRATCH // . . .

within VLP by a declaration of a remote subordinate process

temp :: ﬁlesys // . . .

and then declare the ﬁling system (6.4 X8) as a subordinate process of the

output spooler

(ﬁlesys : FILESYS // OUTSPOOL)

If the volume of card input is signiﬁcant, a similar change must be made to

INSPOOL. If a separate backing store is available for this purpose, the change is

easy. If not, we will need to share a single backing store between the temporary

ﬁles of both the input and the output spoolers. This means that FILESYS must

be declared as a subordinate process, shared by multiple labelling between

both spoolers; and this involves a change in the structure of the system. We

will do this redesign in a top-down fashion, trying to re-use as many of the

previously deﬁned modules as possible.


The operating system is composed from a batched multiprogramming system
BATCH4, and an input-output system, serving as a subordinate process

OP = IOSYSTEM // BATCH4

The input-output system shares a ﬁling system between an input spooler and

an output spooler

IOSYSTEM = SH : (filesys : FILESYS) //
           (lp : OUTSPOOL′ || cr : INSPOOL′)

and SH = { lp.i | i ≥ 0 } ∪ { cr.i | i ≥ 0 }, and OUTSPOOL′ and INSPOOL′ are
the same as X2 and X3, except that

temp : SCRATCH

is replaced by the equivalent remote subordination

temp :: ﬁlesys

In the design of the four operating systems described in this chapter

(BATCH1, OPSYS1, OPSYS, and OP) we have emphasised above all else the

virtue of modularity. This means that we have been able to re-use large parts

of the earlier systems within the later systems. Even more important, every

decision of detail is isolated within one or two modules of the system. Con-

sequently, if a detail must be changed, it is very easy to identify which mod-

ule must be altered, and the alteration can be conﬁned to that single module.

Among the easy changes are

• the number of line printers

• the number of card readers

• the number of concurrent batches

But not all changes will be so easy: a change to the value of the separator card

will aﬀect three modules, CR (X1), INSPOOL (X3) and JOB.

There are also a number of valuable improvements to the system which

would require very signiﬁcant changes in its structure

1. The user jobs should also have access to the ﬁling system, and to multiple

virtual input and output devices.

2. Users’ ﬁles should be stored permanently between the jobs they submit.

3. A method of checkpointing for rapid recovery from breakdown may be

needed.

4. If there is a backlog of jobs input but not yet executed, some method is

needed to control the order in which waiting jobs are started. This point

is taken up more fully in the next section.


One of the problems encountered in making these improvements is the

impossibility of sharing resources between a subordinate and its main process,

in those cases where the technique of multiple labelling is not appropriate. It

seems that a new deﬁnition of subordination is required, in which the alphabet

of the subordinate is not a subset of the alphabet of the main process. But this

is a topic which is left for future research.

6.6 Scheduling

When a limited number of resources is shared among a greater number of po-

tential users, there will always be the possibility that some aspiring users will

have to wait to acquire a resource until some other process has released it. If

at the time of release there are two or more processes waiting to acquire it,

the choice of which waiting process will acquire the resource has been non-

deterministic in all examples given so far. In itself, this is of little concern;

but suppose, by the time a resource is released again, yet another process

has joined the set of waiting processes. Since the choice between waiting pro-

cesses is again nondeterministic, the newly joined process may be the lucky

one chosen. If the resource is heavily loaded, this may happen again and again.

As a result, some of the processes may happen to be delayed forever, or at least

for a wholly unpredictable and unacceptable period of time. This is known as

the problem of inﬁnite overtaking (Section 2.5.5).

One solution to the problem is to ensure that all resources are lightly used.

This may be achieved either by providing more resources, or by rationing their

use, or by charging a heavy price for the services provided. In fact, these are

the only solutions in the case of a resource which is consistently under heavy

load. Unfortunately, even a resource which is on average lightly loaded will

quite often be heavily used for long periods (rush hours or peaks).

The problem can sometimes be mitigated by diﬀerential charging to try to

smooth the demand, but this is not always successful or even possible. During

the peaks, it is inevitable that, on the average, using processes will be subject to

delay. It is important to ensure that these delays are reasonably consistent and

predictable: you would much prefer to know that you will be served within the

hour than to wonder whether you will have to wait one minute or one day.

The task of deciding how to allocate a resource among waiting users is

known as scheduling. In order to schedule successfully, it is necessary to know

which processes are currently waiting for allocation of the resource. For this

reason, the acquisition of a resource cannot any longer be regarded as a single

atomic event. It must be split into two events

please, which requests the allocation

thankyou, which accompanies the actual allocation of the resource.

For each process, the period between the please and the thankyou is the period

during which the process has to wait for the resource. In order to identify


the requesting process, we will index each occurrence of please, thankyou

and release by a diﬀerent natural number. The requesting process acquires

its number on each occasion by the same construction as remote subordina-

tion (6.4, X3)

□_{i≥0} (res.i.please;
         res.i.thankyou;
         . . . ;
         res.i.release → SKIP)

A simple but eﬀective method of scheduling a resource is to allocate it to

the process which has been waiting longest for it. This policy is known as ﬁrst

come ﬁrst served (FCFS) or ﬁrst in ﬁrst out (FIFO). It is the queueing discipline

observed by passengers who form themselves into a line at a bus stop.

In such a place as a bakery, where customers are unable or unwilling to

form a line, there is an alternative mechanism to achieve the same eﬀect. A

machine is installed which issues tickets with strictly ascending serial numbers.

On entry to the bakery, a customer takes a ticket. When the server is ready,

he calls out the lowest ticket number of a customer who has taken a ticket

but has not yet been served. This is known as the bakery algorithm, and is

described more formally below. We assume that up to R customers can be

served simultaneously.

Example

X1 (The bakery algorithm) We need to keep the following counts

p customers who have said please

t customers who have said thankyou

r customers who have released their resources

Clearly, at all times r ≤ t ≤ p. Also, at all times, p is the number that will

be given to the next customer who enters the bakery, and t is the number of

the next customer to be served; furthermore, p − t is the number of waiting

customers, and R + r − t is the number of waiting servers. All counts are

initially zero, and can revert to zero again whenever they are all equal—say at

night after the last customer has left.

One of the main tasks of the algorithm is to ensure that there is never sim-

ultaneously a free resource and a waiting customer; whenever such a situation

arises, the very next event must be the thankyou of a customer obtaining the

resource

BAKERY = B_{0,0,0}

B_{p,t,r} = if 0 < r = t = p then
              BAKERY
            else if R + r − t > 0 ∧ p − t > 0 then
              t.thankyou → B_{p,t+1,r}
            else
              (p.please → B_{p+1,t,r}
              | (□_{i<t} i.release → B_{p,t,r+1}))

The bakery algorithm is due to Leslie Lamport.
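The three counts and the rule for forced thankyou events can be traced with a small Python model. It is a sketch under stated assumptions: the class name Bakery and the method names are illustrative, events are modelled as method calls, and each call returns the ticket numbers whose thankyou is forced by it.

```python
class Bakery:
    """Holds the state (p, t, r) of B_{p,t,r} for R resources."""

    def __init__(self, R):
        self.R = R
        self.p = self.t = self.r = 0

    def _settle(self):
        # While there is both a free resource and a waiting customer, the
        # very next event must be a thankyou for ticket number t.
        served = []
        while self.R + self.r - self.t > 0 and self.p - self.t > 0:
            served.append(self.t)
            self.t += 1
        return served

    def please(self):
        # The customer takes ticket p; returns (ticket, tickets now served).
        ticket = self.p
        self.p += 1
        return ticket, self._settle()

    def release(self, ticket):
        # Only a customer who has been served (ticket < t) may release.
        if not 0 <= ticket < self.t:
            raise ValueError("this ticket has not been served")
        self.r += 1
        served = self._settle()
        if 0 < self.r == self.t == self.p:
            self.p = self.t = self.r = 0      # all equal: revert to BAKERY
        return served


bakery = Bakery(R=2)
trace = [bakery.please() for _ in range(4)]   # four customers, two servers
after_first_release = bakery.release(0)       # ticket 2 is served next
after_second_release = bakery.release(1)      # then ticket 3
bakery.release(2)
bakery.release(3)
final_counts = (bakery.p, bakery.t, bakery.r)
```

Tickets 0 and 1 are served at once, tickets 2 and 3 wait and are then served strictly in ticket order, and the counts revert to zero when the last customer has released.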

7 Discussion

7.1 Introduction

The main objective of my research into communicating processes has been to

ﬁnd the simplest possible mathematical theory with the following desirable

properties

1. It should describe a wide range of interesting computer applications, from

vending machines, through process control and discrete event simulation,

to shared-resource operating systems.

2. It should be capable of eﬃcient implementation on a variety of conven-

tional and novel computer architectures, from time-sharing computers

through microprocessors to networks of communicating microprocessors.

3. It should provide clear assistance to the programmer in his tasks of spe-

ciﬁcation, design, implementation, veriﬁcation and validation of complex

computer systems.

It is not possible to claim that all these objectives have been achieved in

an optimal fashion. There is always hope that a radically diﬀerent approach,

or some signiﬁcant change to the detailed deﬁnitions, would lead to greater

success in one or more of the objectives listed above. This chapter initiates a

discussion of some of the alternatives which I and others have explored, and

an explanation of why I have not adopted them. It also gives me an opportunity

to acknowledge the inﬂuence of the original research of other workers in the

ﬁeld. Finally, I hope to encourage further research both into the foundations

of the subject and into its wider practical application.

7.2 Shared storage

The earliest proposals in the 1960s for the programming of concurrent oper-

ations within a single computer arose naturally from contemporaneous devel-

opments in computer architecture and operating systems. At that time pro-

cessing power was scarce and expensive, and it was considered very wasteful


that a processor should have to wait while communicating with slow peripheral

equipment, or even slower human beings. Consequently, cheaper special-

purpose processors (channels) were provided to engage independently in input–

output, thereby freeing the central processor to engage in other tasks. To keep

the valuable central processor busy, a timesharing operating system would en-

sure that there were several complete programs in the main store of the com-

puter, and at any time several of the programs could be using the input–output

processors, while another program used the central processor. At the termin-

ation of an input–output operation, an interrupt would enable the operating

system to reconsider which program should be receiving the attention of the

central processor.

The scheme described above suggests that the central processor and all

the channels should be connected to all the main storage of the computer; and

accesses from each processor to each word of store were interleaved with those

from the other processors. Nevertheless, each program under execution was

usually a complete job submitted by a diﬀerent user, and wholly independent

of all the other jobs.

For this reason, great care was expended in the design of hardware and

software to divide the store into disjoint segments, one for each program, and

to ensure that no program could interfere with the store of another. When it

became possible to attach several independent central processors to the same

computer, the eﬀect was to increase the throughput of jobs; and if the original

operating system were well structured this could be achieved with little eﬀect

on the code of the operating system, and even less on the programs for the

jobs which it executed.

The disadvantages of sharing a computer among several distinct jobs were

1. The amount of storage required goes up linearly with the number of jobs

executed.

2. The amount of time that each user has to wait for the result of his jobs is

also increased, except for the highest priority jobs.

It therefore seems tempting to allow a single job to take advantage of the

parallelism provided by the hardware of the computer, by initiating several

concurrent processes within the same area of storage allocated to a single

program.

7.2.1 Multithreading

The ﬁrst proposal of this kind was based on a jump (go to command). If L is

a label of some place in the program, the command fork L transfers control to

the label L, and also allows control to pass to the next command in sequence.

From then on, the effect is that two processors execute the same program at the

same time; each maintains its own locus of control threading its way through

the commands. Since each locus of control can fork again, this programming

technique is known as multithreading.


Having provided a method for a process to split in two, some method is

also required for two processes to merge into one. A very simple proposal is

to provide a command join which can be executed only when two processes

execute it simultaneously. The ﬁrst process to reach the command must wait

until another one also reaches it. Then only one process goes ahead to execute

the following command.

In its full generality, multithreading is an incredibly complex and error-

prone technique, not to be recommended in any but the smallest programs.

In excuse, we may plead that it was invented before the days of structured

programming, when even FORTRAN was still considered to be a high-level pro-

gramming language!

A variation of the fork command is still used in the UNIX™ operating

system. The fork does not mention a label. Its eﬀect is to take a completely

fresh copy of the whole of storage allocated to the program, and to allocate the

copy to a new process. Both the original and the new process resume execution

at the command following the fork. A facility is provided for each process to

discover whether it is the parent or the oﬀspring. The allocation of disjoint

storage areas to the processes removes the main diﬃculties and dangers of

multithreading, but it can be ineﬃcient both in time and in space. This means

that concurrency can be aﬀorded only at the outermost (most global) level of

a job, and its use on a small scale is discouraged.

7.2.2 cobegin. . . coend

A solution to the problems of multithreading was proposed by E. W. Dijkstra:

make sure that after a fork the two processors execute completely diﬀerent

blocks of program, with no possibility of jumping between them. If P and Q

are such blocks, the compound command

cobegin P; Q coend

causes P and Q to start simultaneously, and proceed concurrently until they

have both ended. After that, only a single processor goes on to execute the

following commands. This structured command can be implemented by the

unstructured fork and join commands, using labels L and J

fork L; P; go to J ; L:Q; J : join

The generalisation to more than two component processes is immediate and

obvious

cobegin P; Q; . . . ; R coend

One great advantage of this structured notation is that it is much easier

to understand what is likely to happen, especially if the variables used in each

of the blocks are distinct from the variables used in the others (a restriction

that can be checked or enforced by a compiler for a high-level language). In


this case, the processes are said to be disjoint, and (in the absence of commu-

nication) the concurrent execution of P and Q has exactly the same eﬀect as

their sequential execution in either order

begin P; Q end = begin Q; P end = cobegin P; Q coend

Furthermore, the proof methods for establishing correctness of the parallel

composition can be even simpler than the sequential case. That is why Dijk-

stra’s proposal forms the basis for the parallel construct in this book. The main

change is notational; to avoid confusion with sequential composition, I have

introduced the || operator to separate the processes; and this permits the use

of simple brackets to surround the command instead of the more cumbersome

cobegin. . . coend .
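A rough Python analogue of the structured command may make the equivalence for disjoint processes concrete. The function name cobegin and the blocks P and Q below are illustrative assumptions; note also that Python's Thread.join waits for one named thread, which is not the same primitive as the symmetric join command described earlier.

```python
import threading


def cobegin(*blocks):
    """Start every block concurrently; return only when all have ended."""
    threads = [threading.Thread(target=block) for block in blocks]
    for t in threads:
        t.start()      # one fork per block
    for t in threads:
        t.join()       # afterwards a single thread of control continues


state = {"x": 0, "y": 0}


def P():
    state["x"] = sum(range(10))    # P touches only x


def Q():
    state["y"] = 2 ** 5            # Q touches only y


cobegin(P, Q)
concurrent_result = dict(state)

state.update(x=0, y=0)
P()
Q()
sequential_result = dict(state)
```

Because P and Q update different variables, the concurrent run and the sequential run leave the store in the same final state. Python does not check disjointness; here it holds only by inspection.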

7.2.3 Conditional critical regions

The restriction that concurrent processes should not share variables has the

consequence that they cannot communicate or interact with each other in any

way, a restriction which seriously reduces the potential value of concurrency.

After reading this book, the introduction of (simulated) input and out-

put channels may seem the obvious solution; but in earlier days, an obvious

technique (suggested by the hardware of the computers) was to communic-

ate by sharing main storage among concurrent processes. Dijkstra showed

how this could be safely achieved by critical regions (Section 6.3) protected by

mutual exclusion semaphores. I later proposed that this method should be

formalised in the notations of a high-level programming language. A group

of variables which is to be updated in critical regions within several sharing

processes should be declared as a shared resource, for example

shared n : integer

shared position : record x, y : real end

Each critical region which updates this variable is preceded by a with clause,

quoting the variable name

with n do n := n + 1;

with position do begin x := x + δx; y := y + δy end

The advantage of this notation is that a compiler automatically introduces the

necessary semaphores, and surrounds each critical region by the necessary P

and V operations. Furthermore, it can check at compile time that no shared

variable is ever accessed or updated except from within a critical region pro-

tected by the relevant semaphore.
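Python's with statement applied to a lock gives a similar effect to the critical region "with n do n := n + 1", and can serve as a hedged illustration. The names n_lock and increment are assumptions of this sketch. Unlike the compiler described above, Python does not check that n is accessed only inside the region; that discipline is left to the programmer.

```python
import threading

n = 0
n_lock = threading.Lock()     # plays the part of the semaphore protecting n


def increment(times):
    global n
    for _ in range(times):
        with n_lock:          # P operation on entry, V operation on exit
            n = n + 1         # the critical region


threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock in place, every one of the 40 000 increments takes effect, whatever the interleaving of the four threads.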

Cooperation between processes which share store sometimes requires an-

other form of synchronisation. For example, suppose one process updates a

variable with the objective that other processes should read the new value.


The other processes must not read the variable until the updating has taken

place. Similarly, the ﬁrst process must not update the variable again until all

the other processes have read the earlier updating.

To solve this problem, a convenient facility is oﬀered by the conditional

critical region. This takes the form

with sharedvar when condition do critical region

On entry to the critical region, the value of the condition is tested. If it is true,

the critical region is executed normally. But if the condition is false, this entry

into the critical region is postponed, so that other processes are permitted to

enter their critical regions and update the shared variable. On completion of

each such update, the condition is retested. If it has become true, the delayed

process is permitted to proceed with its critical region; otherwise that process

is suspended again. If more than one delayed process can proceed, the choice

between them is arbitrary.

To solve the problem of updating and reading a message by many pro-

cesses, declare as part of the resource an integer variable to count the number

of processes that must read the message before it is updated again

shared message : record count : integer; content : . . . end

message.count := 0;

The updating process contains a critical region

with message when count = 0 do
  begin content := . . . ;
        . . . ;
        count := number of readers
  end

Each reading process contains a critical region

with message when count > 0 do
  begin my copy := content; count := count − 1 end

Conditional critical regions may be implemented by means of semaphores.

Compared with direct use of synchronisation semaphores by the programmer,

the overhead of conditional critical regions may be quite high, since the condi-

tions of all processes waiting to enter the region must be retested on every exit

from the region. Fortunately, the conditions do not have to be retested more

frequently than that, because restrictions on access to shared variables ensure

that the condition tested by a waiting process can change value only when the

shared variable itself changes value. All other variables in the condition must


be private to the waiting process, which obviously cannot change them while

it is waiting.
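The message example can be approximated in Python with a condition variable, whose wait_for method delays the caller until a predicate over the shared state becomes true. This is a sketch, not the notation proposed above: the class name Message and its methods are assumptions, and the notify_all calls make explicit the retesting that a conditional critical region would perform on every exit.

```python
import threading
from collections import Counter


class Message:
    def __init__(self, number_of_readers):
        self._readers = number_of_readers
        self._count = 0                          # message.count := 0
        self._content = None
        self._region = threading.Condition()     # exclusion plus delayed entry

    def update(self, content):
        with self._region:
            self._region.wait_for(lambda: self._count == 0)   # when count = 0
            self._content = content
            self._count = self._readers
            self._region.notify_all()            # waiting conditions are retested

    def read(self):
        with self._region:
            self._region.wait_for(lambda: self._count > 0)    # when count > 0
            my_copy = self._content
            self._count -= 1
            self._region.notify_all()
            return my_copy


message = Message(number_of_readers=2)
seen = []                                        # list.append is atomic in CPython


def reader():
    for _ in range(3):
        seen.append(message.read())


readers = [threading.Thread(target=reader) for _ in range(2)]
for t in readers:
    t.start()
for text in ("m1", "m2", "m3"):
    message.update(text)
for t in readers:
    t.join()
tally = Counter(seen)
```

Each message is read exactly twice before it is overwritten. As in the original formulation, nothing here prevents one reader from reading the same message twice in place of the other reader.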

7.2.4 Monitors

The development of monitors was inspired by the class of SIMULA 67, which

was itself a generalisation of the procedure concept of ALGOL 60. The basic

insight is that all meaningful operations on data (including its initialisation and

perhaps also its ﬁnalisation) should be collected together with the declaration

of the structure and type of the data itself; and these operations should be

invoked by procedure calls whenever required by the processes which share

the data. The important characteristic of a monitor is that only one of its

procedure bodies can be active at one time; even when two processes call a

procedure simultaneously (either the same procedure or two diﬀerent ones),

one of the calls is delayed until the other is completed. Thus the procedure

bodies act like critical regions protected by the same semaphore.

For example a very simple monitor acts like a count variable. In notations

based on PASCAL PLUS it takes the form

1 monitor count;
2 var n : integer;
3 procedure ∗up; begin n := n + 1 end;
4 procedure ∗down; when n > 0 do begin n := n − 1 end;
5 function ∗grounded : Boolean; begin grounded := (n = 0) end;
6 begin n := 0;
7 · · ·;
8 if n ≠ 0 then print(n)
9 end

Line 1    declares the monitor and gives it the name count.
     2    declares the shared variable n local to the monitor; it
          is inaccessible except within the monitor itself.
     3–5  declare three procedures with their bodies; the
          asterisks ensure they can be called from the program
          which uses the monitor;
     6    the monitor starts execution here;
     7    the three fat dots are an inner statement, which stands
          for the block that is going to use this monitor;
     8    the final value of n (if non-zero) is printed on exit from
          the user block.


A new instance of this monitor can be declared local to a block P

instance rocket : count; P

Within the block P, the starred procedures may be called by commands

rocket.up; . . . rocket.down; . . . ; if rocket.grounded then . . .

However an unstarred procedure or variable such as n cannot be accessed

from within P and observance of this restriction is enforced by a compiler.

The mutual exclusion inherent in the monitor ensures that the procedure of

the monitor can be safely called by any number of processes within P, and

there is no danger of interference in updating n. Note that an attempt to call

rocket.down when n = 0 will be delayed until some other process within P

calls rocket.up. This ensures that the value of n can never go negative.
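For comparison, the count monitor can be imitated in Python by a class whose methods all run under one condition variable, with a context manager standing in for the inner statement. The names Count and instance_count and the report parameter are assumptions of this sketch, and Python treats the leading underscore of _n as a convention only, so the privacy of n is not enforced as it is by the compiler described above.

```python
import threading
from contextlib import contextmanager


class Count:
    def __init__(self):
        self._n = 0                                  # line 6: n := 0
        self._monitor = threading.Condition()        # one body active at a time

    def up(self):
        with self._monitor:
            self._n += 1
            self._monitor.notify_all()               # delayed callers retest n > 0

    def down(self):
        with self._monitor:
            self._monitor.wait_for(lambda: self._n > 0)   # when n > 0 do
            self._n -= 1

    def grounded(self):
        with self._monitor:
            return self._n == 0


@contextmanager
def instance_count(report=print):
    monitor = Count()
    yield monitor                  # line 7: the inner statement (the user block)
    if not monitor.grounded():     # line 8: report a non-zero final value
        report(monitor._n)


printed = []
with instance_count(report=printed.append) as rocket:
    lander = threading.Thread(target=rocket.down)    # delayed while n = 0
    lander.start()
    rocket.up()
    rocket.up()
    lander.join()
```

The call of rocket.down is held up until the first rocket.up, and the final value 1 is reported on exit from the user block without the user asking for it.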

The eﬀect of the declaration of an instance of a monitor is explained by

a variation of the copy rule for procedure calls in ALGOL 60. Firstly, a copy

is taken of the text of the monitor; the using block P is copied in place of the

three dots inside the monitor, and all local names of the monitor are preﬁxed

by the name of the instance, as shown below

rocket.n : integer;
procedure rocket.up; begin rocket.n := rocket.n + 1 end;
procedure rocket.down;
  when rocket.n > 0 do begin rocket.n := rocket.n − 1 end;
function rocket.grounded : Boolean;
  begin rocket.grounded := (rocket.n = 0) end;
begin rocket.n := 0;
  · · ·;
  if rocket.n ≠ 0 then print(rocket.n)
end

Note how the copy rule has made it impossible for the user process to forget
to initialise the value of n, or to forget to print its final value when necessary.

The ineﬃciency of repeated testing of entry conditions has led to the

design of monitors with a more elaborate scheme of explicit waiting and expli-

cit signalling for resumption of waiting processes. These schemes even allow

a procedure call to suspend itself in the middle of its execution, after auto-

matically releasing exclusion of the suspended process. In this way, a number

of ingenious scheduling techniques can be eﬃciently implemented; but I now

think that the extra complexity is hardly worthwhile.


7.2.5 Nested monitors

A monitor instance can be used like a semaphore to protect a single resource

such as a line printer which must not be used by more than one process at a

time. Such a monitor could be declared

monitor singleresource;
var free : Boolean;
procedure ∗acquire; when free do free := false;
procedure ∗release; begin free := true end;
begin free := true; · · · end

However, the protection aﬀorded by this monitor can be evaded by a process

which uses the resource without acquiring it, or frustrated by one which forgets

to release it afterwards. Both of these dangers can be averted by a construction

similar to that of the virtual resource (6.4 X4). This takes the form of a monitor

declared locally within the actual resource monitor shown above. The name

of the virtual resource is starred to make it accessible for declaration by user

processes. However, the stars are removed from ∗acquire and ∗release, so

that these can be used only within the virtual resource monitor, and cannot be

misused by other processes.

monitor singleresource;
  free : Boolean;
  procedure acquire;
    when free do free := false;
  procedure release;
    begin
      free := true;
    end
  monitor ∗virtual;
    procedure ∗use(l : line); begin . . . end;
    begin
      acquire; · · ·; release
    end
  begin
    free := true; · · ·;
  end


An instance of this monitor is declared

    instance lpsystem : singleresource; P

A block within P which requires to output a ﬁle to a line printer is written

    instance mine : lpsystem.virtual;
    begin ... mine.use(l1); ... mine.use(l2); ... end

The necessary acquisition and release of the line printer are automatically in-

serted by the virtual monitor before and after this user block, in a manner

which prevents antisocial use of the line printer. In principle, it would be pos-

sible for the using block to split into parallel processes, all of which use the

instance mine of the virtual monitor, but this is probably not the intention here.

A monitor which is to be used only by a single process is known in PASCAL

PLUS as an envelope, and it can be implemented more eﬃciently without exclu-

sion or synchronisation; and the compiler checks that it is not inadvertently

shared.
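The nesting described above can be imitated in a modern language. The following is only a sketch in Python, not PASCAL PLUS: the class name SingleResource, the method names, and the use of a context manager to stand in for the starred virtual monitor are all assumptions made for illustration. The leading underscore plays the part of removing the star from acquire and release, so that user code reaches the printer only through virtual.

```python
import threading
from contextlib import contextmanager


class SingleResource:
    """Hypothetical analogue of the singleresource monitor."""

    def __init__(self):
        self._free = True
        self._cond = threading.Condition()
        self.printed = []          # stands in for the line printer's output

    def _acquire(self):            # unstarred: not meant for user processes
        with self._cond:
            self._cond.wait_for(lambda: self._free)   # "when free do"
            self._free = False

    def _release(self):            # unstarred: not meant for user processes
        with self._cond:
            self._free = True
            self._cond.notify_all()

    def _use(self, line):
        self.printed.append(line)

    @contextmanager
    def virtual(self):
        """Starred virtual resource: acquire before the user block, release after."""
        self._acquire()
        try:
            yield self._use
        finally:
            self._release()        # cannot be forgotten by the user


def run_jobs(names):
    lpsystem = SingleResource()

    def job(name):
        with lpsystem.virtual() as use:
            use(name + ": l1")
            use(name + ": l2")

    threads = [threading.Thread(target=job, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return lpsystem.printed
```

Whatever order the jobs are scheduled in, the two lines of each job appear next to each other in the output, because the acquisition and release bracket the whole user block.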

The meaning of these instance declarations can be calculated by repeated

application of the copy rule, with the result shown below

    var lpsystem.free : Boolean;
    procedure lpsystem.acquire;
      when lpsystem.free do lpsystem.free := false;
    procedure lpsystem.release; begin lpsystem.free := true end;
    begin lpsystem.free := true
      ⋮
      begin
        procedure mine.lpsystem.use(l : line); begin ... end;
        lpsystem.acquire;
        ... mine.lpsystem.use(l1);
        ... mine.lpsystem.use(l2);
        lpsystem.release;
      end;
      ⋮
    end

The explicit copying shown here is only for initial explanation; a more exper-

ienced programmer would never wish to see the expanded version, or even

think about it.

These notations were used in 1975 for the description of an operating sys-

tem similar to that of Section 6.5; and they were later implemented in PASCAL

PLUS. Extremely ingenious eﬀects can be obtained by a mixture of starring and

nesting; but the PASCAL- and SIMULA-based notations seem rather clumsy, and

explanations in terms of substitution and renaming are rather diﬃcult to fol-

low. It was Edsger W. Dijkstra’s criticisms of these aspects that ﬁrst impelled

me towards the design of communicating sequential processes.

However, it is now clear from the constructions of Section 6.5 that the

control of sharing requires complication, whether it is expressed within the

conceptual framework of communicating processes or within the copy-rule and

procedure-call semantics of PASCAL PLUS. The choice between the languages

seems partly a matter of taste, or perhaps eﬃciency. For implementation of

an operating system on a computer with shared main storage, PASCAL PLUS

probably has the advantage.

7.2.6 Ada™

Facilities oﬀered for concurrent programming in Ada are an amalgam of the

remote procedure call of PASCAL PLUS, with the less structured form of com-

munication by input and output. Processes are called tasks, and they commu-

nicate by call statements (which are like procedure calls with output and input

parameters), and accept statements (which are like procedure declarations in

their syntactic form and in their eﬀect). A typical accept statement might be

    accept put(V : in integer; PREV : out integer) do
      PREV := K; K := V end

A corresponding call might be

put(37, X)

The identiﬁer put is known as an entry name.

An accept and a call statement with the same name in different tasks are

executed when both processes are ready to execute them together. The eﬀect

is as follows

1. Input parameters are copied from the call to the accepting process.

2. The body of the accept statement is executed.

3. The values of the output parameters are copied back to the call.

4. Then both tasks continue execution at their next statements.

The execution of the body of an accept is known as a rendezvous, since the

calling and accepting task may be thought to be executing it together. The

rendezvous is an attractive feature of Ada, since it simpliﬁes the very common

practice of alternating output and input, without much complicating the case

when only input or only output is required.
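The four steps of the rendezvous can be sketched with threads and queues. This is an illustration in Python rather than Ada; the class Entry and its methods call and accept are hypothetical names, and the queue of pending calls is an assumption about how waiting callers might be held.

```python
import queue
import threading


class Entry:
    """Hypothetical stand-in for an Ada entry such as put."""

    def __init__(self):
        self._calls = queue.Queue()

    def call(self, *in_params):
        reply = queue.Queue(maxsize=1)
        self._calls.put((in_params, reply))    # step 1: inputs go to the acceptor
        return reply.get()                     # steps 3 and 4: wait for outputs, then continue

    def accept(self, body):
        in_params, reply = self._calls.get()   # wait until some task calls
        out = body(*in_params)                 # step 2: execute the accept body
        reply.put(out)                         # step 3: outputs go back to the caller


def demo():
    put = Entry()
    state = {"K": 0}

    def put_body(v):
        prev = state["K"]                      # PREV := K
        state["K"] = v                         # K := V
        return prev

    def server():
        for _ in range(2):
            put.accept(put_body)

    t = threading.Thread(target=server)
    t.start()
    x1 = put.call(37)
    x2 = put.call(5)
    t.join()
    return x1, x2, state["K"]
```

The caller is held until the body has finished, so the output value it receives is always the one computed during its own rendezvous.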

The Ada analogue of □ is the select statement, which takes the form

    select
      accept get(v : out integer) do v := B[i] end; i := i + 1; ...
    or accept put(v : in integer) do B[j] := v end; j := j + 1; ...
    or ...
    end select

Exactly one of the alternatives separated by or will be selected for execution,

depending on the choice made by the calling task(s). The remaining state-

ments of the selected alternative after the end of the accept are executed on

completion of the rendezvous, concurrently with continuation of the calling

task. Selection of an alternative can be inhibited by falsity of a preceding when

condition, for example

    when not full ⇒ accept ...

This achieves the eﬀect of a conditional critical region.
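The effect of a guarded alternative can be seen in a small bounded buffer. The sketch below is in Python, not Ada, and treats each operation as a conditional critical region; the class name BoundedBuffer and the capacity parameter are assumptions for illustration. Each wait_for call corresponds to a when condition.

```python
import threading
from collections import deque


class BoundedBuffer:
    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._cond = threading.Condition()

    def put(self, v):
        with self._cond:
            # when not full => accept put
            self._cond.wait_for(lambda: len(self._items) < self._capacity)
            self._items.append(v)
            self._cond.notify_all()

    def get(self):
        with self._cond:
            # when not empty => accept get
            self._cond.wait_for(lambda: len(self._items) > 0)
            v = self._items.popleft()
            self._cond.notify_all()
            return v


def demo(count=10, capacity=2):
    buf = BoundedBuffer(capacity)

    def producer():
        for v in range(count):
            buf.put(v)

    t = threading.Thread(target=producer)
    t.start()
    got = [buf.get() for _ in range(count)]
    t.join()
    return got
```

A producer that runs ahead is simply delayed at the guard until the consumer has made room.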

One of the alternatives in a select statement may begin with a delay instead

of an accept. This alternative may be selected if no other alternative is selected

during elapse of a period greater than the speciﬁed number of seconds’ delay.

The purpose of this is to guard against the danger that hardware or software

error might cause the select statement to wait forever. Since our mathematical

model deliberately abstracts from time, a delay cannot be faithfully repres-

ented, except by allowing wholly nondeterministic selection of the alternative

beginning with the delay.

One of the alternatives in a select statement may be the statement terminate.

This alternative is selected when all tasks which might call the given task have

terminated; and then the given task terminates too. This is not as convenient

as the inner statement of PASCAL PLUS, which allows the monitor to tidy up

on termination.

A select statement may have an else clause, which is selected if none of

the other alternatives can be selected immediately, either because all the when

conditions are false, or because there is no corresponding call already waiting

in some other task. This would seem to be equivalent to an alternative with

zero delay.

A call statement may also be protected against arbitrary delay by a delay

statement or an else clause. This may lead to some ineﬃciencies in implement-

ation on a distributed network of processors.

Tasks in Ada are declared in much the same way as subordinate processes

(Section 4.5); but like monitors in PASCAL PLUS, each one may serve any num-

ber of calling processes. Furthermore, the programmer must arrange for the

task to terminate properly. The deﬁnition of a task is split into two parts,

its speciﬁcation and its body. The speciﬁcation gives the task name and the

names and parameter types of all entries through which the task may be called.

This is the information required by the writer of the program which uses the

task, and by the compiler of that program. The body of the task deﬁnes its

behaviour, and may be compiled separately from the using program.

Each task in Ada may be given a ﬁxed priority. If several tasks are cap-

able of making progress, but only a lesser number of processors are available,

the tasks with lower priority will be neglected. The priority of execution of a

rendezvous is the higher of the priorities of the calling and of the accepting

tasks. The indication of priority is called a pragma; it is intended to improve

critical response times compared with non-critical ones, and it is not intended

to aﬀect the logical behaviour of the program. This is an excellent idea, since it

separates concern for abstract logical correctness from problems of real time

response, which can often be more easily solved by experiment or by judicious

selection of hardware.

Ada offers a number of additional facilities. It is possible to test how many

calls are waiting on an entry. One task may abruptly terminate (abort) another

task, and all tasks dependent upon it. Tasks may access and update shared

variables. The eﬀect of this is made even more unpredictable by the fact that

compilers are allowed to delay, reorder, or amalgamate such updating just as

if the variable were not shared. There are some additional complexities and

interaction eﬀects with other features which I have not mentioned.

Apart from the complexities listed in the preceding paragraph, tasking in

Ada seems to be quite well designed for implementation and use on a multi-

processor with shared storage.

7.3 Communication

The exploration of the possibility of structuring a program as a network of

communicating processes was also motivated by spectacular progress in the

development of computer hardware. The advent of the microprocessor rapidly

reduced the cost of processing power by several orders of magnitude. How-

ever, the power of each individual microprocessor was still rather less than

that of a typical computer of the traditional and still expensive kind. So it

would seem to be most economical to obtain greater power by use of several

microprocessors cooperating on a single task. These microprocessors would

be cheaply connected by wires, along which they could communicate with each

other. Each microprocessor would have its own local main store, which it could

access at high speed, thus avoiding the expensive bottleneck that tends to res-

ult from allowing only one processor at a time to access a shared store.

7.3.1 Pipes

The simplest pattern of communication between processing elements is just

single-directional message passing between each process and its neighbour in

a linear pipe, as described in Section 4.4. The idea was ﬁrst propounded by

Conway who illustrated it by examples similar to 4.4 X2 and X3, except that all

components of a pipe were expected to terminate successfully instead of run-

ning forever. He proposed that the pipe structure should be used for writing

a multiple-pass compiler for a programming language. On a computer with

adequate main storage, all the passes are simultaneously active, and control

transfers between the passes together with the messages, thus simulating par-

allel execution. On a computer with less main storage, only one pass at a time

is active, and sends its output to a ﬁle in backing store. On completion of each

pass, the next pass starts, taking its input from the ﬁle produced by its prede-

cessor. However, the ﬁnal result of the compiler is exactly the same, in spite

of the radical diﬀerences in the manner of execution. It is characteristic of a

successful abstraction in programming that it permits several diﬀerent imple-

mentation methods which are eﬃcient in diﬀering circumstances. In this case,

Conway’s suggestion could have been very valuable for software implementa-

tion on a computer range oﬀering widely varying options of store size.

The pipe is also the standard method of communication in the UNIX™
operating system, where the notation ‘|’ is used instead of ‘>>’.
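The point that a pipe admits two quite different implementations with the same final result can be illustrated with generators. The sketch below is in Python; the two passes (strip_comments and tokenise) are invented toy passes, not part of any real compiler, and a list stands in for the file in backing store.

```python
def strip_comments(lines):
    """First pass: drop '#' comments and blank lines."""
    for line in lines:
        code = line.split("#", 1)[0].strip()
        if code:
            yield code


def tokenise(lines):
    """Second pass: split each line into tokens."""
    for line in lines:
        yield from line.split()


def run_interleaved(source):
    # Both passes are active at once; control passes between them
    # together with each message.
    return list(tokenise(strip_comments(source)))


def run_pass_by_pass(source):
    # Only one pass is active at a time; its whole output is stored first.
    intermediate = list(strip_comments(source))
    return list(tokenise(intermediate))


SOURCE = ["x = 1  # init", "# only a comment", "y = x + 2"]
```

Both run_interleaved(SOURCE) and run_pass_by_pass(SOURCE) deliver the same token sequence; only the storage needed during execution differs.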

7.3.2 Multiple buﬀered channels

The pipe construction allows a linear chain of processes to communicate in a

single direction only, and it does not matter whether the message sequence is

buﬀered or not. The natural generalisation of the pipe is to permit any process

to communicate with any other process in either direction; and at ﬁrst sight it

seems equally natural to provide buﬀering on all the communication channels.

In the design of the RC4000 operating system, a facility for buﬀered communic-

ation was implemented in the kernel; and was used for communication between

modules providing services at a higher level. On a grander scale, a store-and-

forward packet switching network, like ARPAnet in the United States, inevitably

interposes buﬀering between the source and destination of messages.

When the pattern of communication between processes is generalised from

a linear chain to a network that may be cyclic, the presence or absence of buf-

fering can make a vital diﬀerence to the logical behaviour of the system. The

presence of buﬀering is not always favourable: for example, it is possible to

write a program that can deadlock if the length of the buﬀer is allowed to

grow greater than ﬁve, as well as a diﬀerent program that will deadlock un-

less the buﬀer length is allowed to exceed ﬁve. To avoid such irregularities,

the length of all buﬀers should be unbounded. Unfortunately, this leads to

grave problems of implementation eﬃciency when main storage is ﬁlled with

buﬀered messages. The mathematical treatment is also complicated by the

fact that every network is an inﬁnite-state machine, even when the compon-

ent processes are ﬁnite. Finally, for rapid and controllable interaction between

humans and computers, buﬀers only stand in the way, since they can inter-

pose delay between a stimulus and a response. If something goes wrong in

processing a buﬀered stimulus, it is much more diﬃcult to trace the fault

and recover from it. Buﬀering is a batch-processing technique, and should be

avoided wherever fast interactions are more important than heavy processor

utilisation.

7.3.3 Functional multiprocessing

A deterministic process may be deﬁned in terms of a mathematical function

from its input channels to its output channels. Each channel is identiﬁed with

the indeﬁnitely extensible sequence of messages which pass along it. Such

functions are deﬁned in the usual way by recursion on the structure of the input

sequences, except that the case of an empty input sequence is not considered.

For example, a process which outputs the result of multiplying each number

input by n is deﬁned

$$\mathit{prod}_n(\mathit{left}) = \langle n \times \mathit{left}_0 \rangle \mathbin{\frown} \mathit{prod}_n(\mathit{left}')$$

A function which takes two sorted streams (without duplicates) as parameters

and outputs their sorted merge (suppressing duplicates) is deﬁned

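$$
\begin{aligned}
\mathit{merge2}(\mathit{left1}, \mathit{left2}) ={} & \text{if } \mathit{left1}_0 < \mathit{left2}_0 \text{ then } \langle \mathit{left1}_0 \rangle \mathbin{\frown} \mathit{merge2}(\mathit{left1}', \mathit{left2}) \\
& \text{else if } \mathit{left2}_0 < \mathit{left1}_0 \text{ then } \langle \mathit{left2}_0 \rangle \mathbin{\frown} \mathit{merge2}(\mathit{left1}, \mathit{left2}') \\
& \text{else } \langle \mathit{left2}_0 \rangle \mathbin{\frown} \mathit{merge2}(\mathit{left1}', \mathit{left2}')
\end{aligned}
$$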

An acyclic network can be represented by a composition of such functions. For

example, a function which merges three sorted input streams can be deﬁned

merge3(left1, left2, left3) = merge2(left1, merge2(left2, left3))

Figure 7.1 shows a connection diagram of this function.

[Figure 7.1: connection diagram of merge3, composed of two merge2 processes with inputs left1, left2, and left3]

A cyclic network can be constructed by a set of mutually recursive equa-

tions. For example, consider the problem attributed by Dijkstra to Hamming,

namely to deﬁne a function which outputs in ascending sequence all numbers

which have only 2, 3, and 5 as non-trivial factors. The ﬁrst such number is 1;

and if x is any such number, so are 2 × x, 3 × x, and 5 × x. We therefore use

three processes $\mathit{prod}_2$, $\mathit{prod}_3$, and $\mathit{prod}_5$ to generate these products, and feed

them back into the process merge3, which ensures they are eventually output

in ascending order (Figure 7.2).

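[Figure 7.2: cyclic network in which the output of merge3, preceded by the initial value 1, is fed back through $\mathit{prod}_2$, $\mathit{prod}_3$, and $\mathit{prod}_5$ into merge3]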

The function which outputs the desired result has no inputs; it is simply

deﬁned

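$$\mathit{Hamming} = \langle 1 \rangle \mathbin{\frown} \mathit{merge3}(\mathit{prod}_2(\mathit{Hamming}),\ \mathit{prod}_3(\mathit{Hamming}),\ \mathit{prod}_5(\mathit{Hamming}))$$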
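These definitions can be tried out directly with lazy streams. The following is a sketch in Python, using generators for the indefinitely extensible sequences. It makes two simplifying assumptions that are not in the text: all streams are infinite (so, as above, the empty case is not considered), and each recursive reference to Hamming recomputes the stream instead of sharing one buffered copy among the three readers.

```python
from itertools import islice


def prod(n, left):
    """prod_n: output n times each number input."""
    for x in left:
        yield n * x


def merge2(left1, left2):
    """Sorted merge of two infinite ascending streams, suppressing duplicates."""
    x, y = next(left1), next(left2)
    while True:
        if x < y:
            yield x
            x = next(left1)
        elif y < x:
            yield y
            y = next(left2)
        else:                      # equal heads: output one, advance both
            yield y
            x, y = next(left1), next(left2)


def merge3(left1, left2, left3):
    return merge2(left1, merge2(left2, left3))


def hamming():
    yield 1
    yield from merge3(prod(2, hamming()),
                      prod(3, hamming()),
                      prod(5, hamming()))


first_twelve = list(islice(hamming(), 12))
```

Here first_twelve is [1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16]. The recomputation makes this version wasteful; a shared stream would need each value to be retained until all three consumers had taken it.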

The functional approach to multiprocessor networks is very diﬀerent from

that taken in this book in the following respects

1. A general implementation requires unbounded buﬀering on all channels.

2. Each value output into the buﬀer must be retained in the buﬀer until all the

inputting processes have taken it, which they may do at diﬀerent times.

3. There is no possibility for a process to wait for one of two inputs, whichever

one arrives ﬁrst.

4. The processes are all deterministic.

Recent research has been directed towards reducing the ineﬃciency of 1

and 2, and towards relaxing the restrictions 3 and 4.

7.3.4 Unbuﬀered communication

For many years now, I have chosen to take unbuﬀered (synchronised) commu-

nication as basic. My reasons have been

1. It matches closely a physical realisation on wires which connect processing

agents. Such wires cannot store messages.

2. It matches closely the eﬀect of calling and returning from subroutines

within a single processor, copying the values of the parameters and the

results.

3. When buﬀering is wanted, it can be implemented simply as a process; and

the degree of buﬀering can be precisely controlled by the programmer.

4. Other disadvantages of buﬀers have been mentioned at the end of Sec-

tion 7.3.2.

Of course, none of these arguments carry absolute conviction. For ex-

ample, if buﬀered communication were taken as primitive, this would make

no logical diﬀerence in the common case of alternation of subroutine call and

return; and synchronisation can be achieved in all other cases by following

every output by input of an acknowledgement, and every input by output of

an acknowledgement.
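The acknowledgement scheme just described can be sketched as follows, in Python. The class SyncChannel and its method names are hypothetical; the two queues are buffered, and synchronisation comes only from following each output by input of an acknowledgement and each input by output of one.

```python
import queue
import threading


class SyncChannel:
    """Synchronised communication built on buffered queues (illustrative)."""

    def __init__(self):
        self._data = queue.Queue()
        self._ack = queue.Queue()

    def output(self, v):
        self._data.put(v)
        self._ack.get()            # every output is followed by input of an ack

    def input(self):
        v = self._data.get()
        self._ack.put(None)        # every input is followed by output of an ack
        return v


def demo(values=(0, 1, 2)):
    ch = SyncChannel()
    received = []

    def consumer():
        for _ in values:
            received.append(ch.input())

    t = threading.Thread(target=consumer)
    t.start()
    for v in values:
        ch.output(v)               # returns only after the consumer has taken v
    t.join()
    return received
```

The sender can never run more than one message ahead of the receiver, which is the behaviour of an unbuffered channel.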

7.3.5 Communicating sequential processes

This was the title of my ﬁrst complete exposition of a programming language

based on concurrency and communication. That early proposal diﬀers from

this book in two signiﬁcant respects.

(1) Parallel composition

Channels are not named. Instead, the component processes of a parallel con-

struction have unique names, preﬁxed to them by a pair of colons

    [a :: P || b :: Q || ... || c :: R]

Within process P, the command b!v outputs the value v to the process named

b. This value is input by a command a?x occurring within the process Q. The

process names are local to the parallel command in which they are introduced,

and communications between the component processes are hidden.

The advantage of this scheme is that there is no need to introduce into

the language any concept of channel or channel declaration. Furthermore, it

is logically impossible to violate the restriction that a channel is between two

processes only, and one of them uses it for input and the other for output. But

there are some disadvantages, both in practice and in theory.

1. A serious practical disadvantage is that a subordinate process needs to

know the name of its using process; this complicates construction of lib-

raries of subordinate processes.

2. A disadvantage in the underlying mathematics is that parallel composi-

tion is an operation with a variable number of parameters and cannot be

reduced to a binary associative operator like ||.

(2) Automatic termination

In the early version, all processes of a parallel command were expected to

terminate. The reason for this was the hope that the correctness of a process

could be speciﬁed in the same way as for a conventional program by a post-

condition, i.e., a predicate intended to be true on successful termination. (That

hope was never fulﬁlled, and other proof methods (Section 1.10) now seem

more satisfactory). The obligation that a subordinate process should terminate

imposes an awkward obligation on the using process to signal its termination

to all subordinates. An ad hoc convention was therefore introduced. A loop

of the form

    *[a?x → P □ b?x → Q □ ...]

terminates automatically on termination of all of the processes a, b, … from

which input is requested. This enables the subordinate process to complete

any necessary ﬁnalisation code before termination—a feature which had proved

useful in SIMULA and PASCAL PLUS.

The trouble with this convention is that it is complicated to deﬁne and

implement; and methods of proving program correctness seem no simpler

with it than without. Now it seems to me better (as in Section 4.5) to relax the

restriction that simple subordinate processes must terminate; and take other

measures (Section 6.4) in the more complicated cases.

7.3.6 Occam

In contrast to Ada, occam is a very simple programming language, and very

closely follows the principles expounded in this book. The most immediately

striking diﬀerences are notational; occam syntax is designed to be composed

at a screen with the aid of a syntax checking editor; it uses preﬁx operators

instead of inﬁx, and it uses indentation instead of brackets.

    SEQ              for (P ; Q ; R)
      P
      Q
      R

    PAR              for (P || Q || R)
      P
      Q
      R

    ALT              for (c?x → P □ d?y → Q)
      c?x
        P
      d?y
        Q

    IF               for (P <| B |> Q)
      B
        P
      NOT B
        Q

    WHILE B          for (B * P)
      P

The ALT construct corresponds to the Ada select statement, and oﬀers a

similar range of options. Selection of an alternative may be inhibited by falsity

of a Boolean condition B

    B & c?x
      P

The input may be replaced by a SKIP, in which case the alternative may be

selected whenever the Boolean guard is true; or it may be replaced by a wait,

which allows the alternative to be selected after elapse of a speciﬁed interval.

The occam language does not have any distinct notations for pipes (Sec-

tion 4.4), subordinate processes (Section 4.5), or shared processes (Section 6.4).

All the required patterns of communication must be achieved by explicit iden-

tity of channel names. To help in this, procedures may be declared with chan-

nels as parameters. For example, the simple copying process may be declared

    PROC copy(CHAN left, right) =
      WHILE TRUE
        VAR x :
        SEQ
          left?x
          right!x :

The double buﬀer COPY>>COPY can now be constructed

    CHAN mid :
    PAR
      copy(left, mid)
      copy(mid, right)

A chain of n buﬀers may be constructed using an array of n channels and an

iterative form of the parallel construct, which constructs n − 2 processes, one
for each value of i between 0 and n − 3 inclusive

    CHAN mid[n − 1] :
    PAR
      copy(left, mid[0])
      PAR i = [0 FOR n − 2]
        copy(mid[i], mid[i + 1])
      copy(mid[n − 2], right)

Because occam is intended to be implemented with static storage allocation

on a ﬁxed number of processors, the value n in the above example must be a

constant. For the same reason, recursive procedures are not allowed.
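For comparison, the same chain can be sketched with threads. This is Python, not occam, and departs from the occam text in three labelled ways: one-slot queues only approximate unbuffered channels, a DONE marker is added so that the copy processes can terminate, and n is an ordinary run-time parameter.

```python
import queue
import threading

DONE = object()                    # assumed end-of-stream marker, not in occam


def copy(left, right):
    while True:
        x = left.get()             # left?x
        right.put(x)               # right!x
        if x is DONE:
            return


def chain(n, left, right):
    """Start n copy processes joined by n - 1 intermediate channels."""
    mid = [queue.Queue(maxsize=1) for _ in range(n - 1)]
    ends = [left] + mid + [right]
    threads = [threading.Thread(target=copy, args=(ends[i], ends[i + 1]))
               for i in range(n)]
    for t in threads:
        t.start()
    return threads


def demo(n=4, values=(1, 2, 3)):
    left, right = queue.Queue(maxsize=1), queue.Queue(maxsize=1)
    threads = chain(n, left, right)

    def feed():
        for v in values:
            left.put(v)
        left.put(DONE)

    feeder = threading.Thread(target=feed)
    feeder.start()
    out = []
    while True:
        x = right.get()
        if x is DONE:
            break
        out.append(x)
    feeder.join()
    for t in threads:
        t.join()
    return out
```

The values emerge on the right in the order in which they were supplied on the left, whatever the length of the chain.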

A similar construction may be used to achieve the eﬀect of subordinate

processes, for example

    PROC double(left, right) =
      WHILE TRUE
        VAR x :
        SEQ
          left?x
          right!(x + x) :

This may be declared subordinate to a single using process P

    CHAN doub.left, doub.right :
    PAR
      double(doub.left, doub.right)
      P

Inside P a number is doubled by

doub.left!4; doub.right?y . . .

Processes may be shared using arrays of channels (with one element per

using process) and an iterative form of the ALT construction. For example,

take an integrator, which after each new input outputs the sum of all numbers

it has input so far

    CHAN add[n], integral[n] :
    PAR
      VAR sum, x :
      SEQ
        sum := 0
        ALT i = [0 FOR n]
          add[i]?x
            SEQ
              sum := sum + x
              integral[i]!sum
      PAR i = [0 FOR n]
        ... user processes ...
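A shared integrator of this kind can be imitated in Python, under some stated assumptions. Python's standard library has no construct that waits on an array of queues, so the iterated ALT is approximated by a single request queue on which each user sends its own index i together with x. The number of rounds served is also a parameter introduced here for the sake of termination.

```python
import queue
import threading


def integrator(requests, integral, rounds):
    total = 0                          # sum := 0
    for _ in range(rounds):
        i, x = requests.get()          # stands in for ALT i = [0 FOR n]: add[i]?x
        total += x                     # sum := sum + x
        integral[i].put(total)         # integral[i]!sum


def demo():
    n = 2
    requests = queue.Queue()
    integral = [queue.Queue() for _ in range(n)]
    server = threading.Thread(target=integrator, args=(requests, integral, 3))
    server.start()
    results = []
    for i, x in [(0, 4), (1, 10), (0, 1)]:
        requests.put((i, x))           # user i outputs x on add[i]
        results.append(integral[i].get())
    server.join()
    return results
```

Each reply goes back on the channel belonging to the user that made the request, so users cannot receive one another's results.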

Like Ada, occam allows a programmer to assign relative priorities to pro-

cesses combined in parallel. This is done by using PRI PAR instead of PAR;

and the earlier processes in the list have higher priority. The screen-editing

facilities provided with the language facilitate reordering of processes when ne-

cessary. A similar option is oﬀered for the ALT construction, namely PRI ALT.

This ensures that if more than one alternative is ready for immediate selec-

tion, the textually earliest will be chosen—otherwise the eﬀect is the same as

the simple ALT. Of course, the programmer is urged to ensure that his pro-

grams are logically correct, independent of the assignment of priorities.

There are also facilities for distributing processes among distinct pro-

cessors, and for specifying which physical pins on each processor are to be

used for each relevant channel of the occam program, and which pin is to be

used for loading the code of the program itself.

7.4 Mathematical models

Recognition of the idea that a programming language should have a precise

mathematical meaning or semantics dates from the early 1960s. The mathem-

atics provides a secure, unambiguous, precise and stable speciﬁcation of the

language to serve as an agreed interface between its users and its implement-

ors. Furthermore, it gives the only reliable grounds for a claim that diﬀerent

implementations are implementations of the same language. So mathemat-

ical semantics are as essential to the objective of language standardisation as

measurement and counting are to the standardisation of nuts and bolts.

In the later 1960s an even more important role for mathematical semantics

was recognised, that of assisting a programmer to discharge his obligation of

establishing correctness of his program. Indeed R. W. Floyd suggested that the

semantics be formulated as a set of valid proof rules, rather than as an explicit

mathematical model. This suggestion has been adopted in the speciﬁcation of

PASCAL and Euclid and Gypsy.

The early design of Communicating Sequential Processes (Section 7.3.5)

had no mathematical semantics, and it left open a number of important design

questions, for example

1. Is it permissible to nest one parallel command inside another?

2. If so, is it possible to write a recursive procedure which calls itself in

parallel?

3. Is it theoretically possible to use output commands in guards?

The mathematical model given in this book answers “yes” to all these ques-

tions.

7.4.1 A calculus of communicating systems

The major breakthrough in the mathematical modelling of concurrency was

made by Robin Milner. The objective of his investigation was to provide a

framework for constructing and comparing diﬀerent models, at diﬀerent levels

of abstraction. So he starts with the basic syntax of expressions intended to

denote processes, and he deﬁnes a series of equivalences between the expres-

sions, of which the most important are

strong equivalence

observational equivalence

observational congruence

Each equivalence deﬁnes a diﬀerent model of concurrency. The initials CCS

usually refer to the model obtained by adopting observational congruence as

the deﬁnition of equality between processes.

The basic notations of CCS are illustrated by the following correspond-

ences

    a.P             corresponds to   a → P
    (a.P) + (b.Q)   corresponds to   (a → P | b → Q)
    NIL             corresponds to   STOP

More important than these notational distinctions are diﬀerences in the treat-

ment of hiding. In CCS, there is a special symbol τ which stands for the occur-

rence of a hidden event or an internal transition. The advantage of retaining

this vestigial record of a hidden event is that it can be freely used to guard re-

cursive equations and so ensure that they have unique solutions, as described

in Section 2.8.3. A second (but perhaps less signiﬁcant) advantage is that pro-

cesses which can engage in an unbounded sequence of τs do not all reduce to

CHAOS; so possibly useful distinctions can be drawn between divergent pro-

cesses. However, CCS fails to distinguish a possibly divergent process from

one that is similar in behaviour but nondivergent. I expect this would make

eﬃcient implementation of the full CCS language impossible.

CCS does not include ⊓ as a primitive operator. However, nondeterminism
can be modelled by use of τ, for example

    (τ.P) + (τ.Q)   corresponds to   P ⊓ Q
    (τ.P) + (a.Q)   corresponds to   P ⊓ (P □ (a → Q))

But these correspondences are not exact, because in CCS nondeterminism

deﬁned by τ would not be associative, as shown by the fact that the trees

in Figure 7.3 are distinct.

[Figure 7.3: distinct CCS trees for the different groupings of a τ-defined nondeterministic choice among P, Q, and R]

Furthermore, preﬁxing does not distribute through nondeterminism, be-

cause the trees in Figure 7.4 are distinct when P ≠ Q

[Figure 7.4: distinct CCS trees showing that prefixing by a does not distribute through nondeterministic choice between P and Q]

These examples show that CCS makes many distinctions between pro-

cesses which would be regarded as identical in this book. The reason for this

is that CCS is intended to serve as a framework for a family of models, each

of which may make more identiﬁcations than CCS but cannot make less. To

avoid restricting the range of models, CCS makes only those identiﬁcations

which seem absolutely essential. In the mathematical model of this book we

have pursued exactly the opposite goal—we have made as many identiﬁcations

as possible, preserving only the most essential distinctions. We have therefore

a far richer set of algebraic laws. It is hoped that these laws will be practic-

ally useful in reasoning about designs and implementations; in particular, they

permit more transformations and optimisations than CCS.

The basic concurrent combinator of CCS is denoted by the single bar |. It

is rather more complicated than the || combinator, in that it includes aspects

of hiding, nondeterminism and interleaving as well as synchronisation.

Each event in CCS has two forms, either simple ($a$) or overbarred ($\overline{a}$).

When two processes are put together to run concurrently, synchronisation oc-

curs only when one process engages in a barred event and the other engages in

the corresponding simple event. Their joint participation in such an event is

hidden by immediate conversion to τ. However, synchronisation is not com-

pulsory; each of the two events can also occur visibly and independently as an

interaction with the outer environment. Thus in CCS

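$$
\begin{aligned}
(a.P) \mid (b.Q) &= a.(P \mid (b.Q)) + b.((a.P) \mid Q) \\
(a.P) \mid (a.Q) &= a.(P \mid (a.Q)) + a.((a.P) \mid Q) \\
(a.P) \mid (\overline{a}.Q) &= \tau.(P \mid Q) + a.(P \mid (\overline{a}.Q)) + \overline{a}.((a.P) \mid Q)
\end{aligned}
$$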

Consequently, only two processes can engage in a synchronisation event; if

more than two processes are ready, the choice of which pair succeeds is non-

deterministic

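$$
\begin{aligned}
(a.P) \mid (a.Q) \mid (\overline{a}.R) ={} & \tau.(P \mid (a.Q) \mid R) + {} \\
& \tau.((a.P) \mid Q \mid R) + {} \\
& a.(P \mid (a.Q) \mid (\overline{a}.R)) + {} \\
& a.((a.P) \mid Q \mid (\overline{a}.R)) + {} \\
& \overline{a}.((a.P) \mid (a.Q) \mid R)
\end{aligned}
$$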

Because of the extra complexity of the parallel operator, there is no need

for a concealment operator. Instead, there is a restriction operator \, which

simply prevents all occurrence of the relevant events, and removes them from

the alphabet of the process, together with their overbarred variant. The eﬀect

is illustrated by the following laws of CCS

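$$
\begin{aligned}
(a.P) \setminus \{a\} &= (\overline{a}.P) \setminus \{a\} = \mathit{NIL} \\
(P + Q) \setminus \{a\} &= (P \setminus \{a\}) + (Q \setminus \{a\}) \\
((a.P) \mid (\overline{a}.Q)) \setminus \{a\} &= \tau.((P \mid Q) \setminus \{a\}) \\
((a.P) \mid (a.Q) \mid (\overline{a}.R)) \setminus \{a\} &= \tau.((P \mid (a.Q) \mid R) \setminus \{a\}) + \tau.(((a.P) \mid Q \mid R) \setminus \{a\})
\end{aligned}
$$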

The last law above illustrates the power of the CCS parallel combinator in

achieving the effect of sharing the process ($\overline{a}.R$) among two using processes
(a.P) and (a.Q). It was an objective of CCS to achieve the maximum expressive

power with as few distinct primitive operators as possible. This is the source

of the elegance and power of CCS, and greatly simpliﬁes the investigation of

families of models deﬁned by diﬀerent equivalence relations.
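The way the parallel combinator unfolds into a sum can be checked mechanically for small cases. The sketch below, in Python, computes only the first step of a composition of prefixed processes, and optionally applies restriction. Its representation is an assumption made for illustration: a component is a pair (action, continuation name), and a trailing prime, as in a', stands for the overbar.

```python
def co(action):
    """Complement of an action: a <-> a' (the prime stands for the overbar)."""
    return action[:-1] if action.endswith("'") else action + "'"


def show(component):
    action, cont = component
    return "(" + action + "." + cont + ")"


def expand(components, restricted=()):
    """First-step summands of (a1.P1) | (a2.P2) | ..., as (label, residue) pairs.

    Any visible action whose name is in `restricted` is prevented;
    synchronisations on it still occur, hidden as tau.
    """
    summands = []
    n = len(components)
    # One tau summand for each pair of complementary actions.
    for i in range(n):
        for j in range(i + 1, n):
            (ai, pi), (aj, pj) = components[i], components[j]
            if ai == co(aj):
                rest = [show(c) for c in components]
                rest[i], rest[j] = pi, pj
                summands.append(("tau", " | ".join(rest)))
    # One visible summand for each component acting independently.
    for i, (a, p) in enumerate(components):
        if a.rstrip("'") in restricted:
            continue
        rest = [show(c) for c in components]
        rest[i] = p
        summands.append((a, " | ".join(rest)))
    return summands


sharing = [("a", "P"), ("a", "Q"), ("a'", "R")]
unrestricted = expand(sharing)
restricted = expand(sharing, restricted={"a"})
```

For the three-process example, unrestricted has five summands (two labelled tau and three visible), while restricted keeps only the two tau summands, in agreement with the last law above.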

In this book, I have taken a complementary approach. Simplicity is sought

through design of a single simple model, in terms of which it is easy to deﬁne as

many operators as seem appropriate to investigate a range of distinct concepts.
For example, the nondeterministic choice ⊓ introduces nondeterminism in its
purest form, and is quite independent of environmental choice represented by
(x : B → P(x)). Similarly, || introduces concurrency and synchronisation, quite

independent of nondeterminism or hiding, each of which is represented by a

distinct operator. The fact that the corresponding concepts are distinct is per-

haps indicated by the simplicity of the algebraic laws. A reasonably wide range

of operators seems to be needed in the practical application of useful math-

ematical theories. Minimisation of operator sets is also useful, more especially

in theoretical investigations.

Milner has introduced a form of modal logic to specify the observable

behaviour of a process. The modality

    ⟨a⟩S

describes a process which may do a and then behave as described by S, and

the dual

    [a]S

describes a process that if it starts with a must behave like S afterwards. A

calculus of correctness is deﬁned which permits a proof that a process P meets

speciﬁcation S, a fact which is expressed in the traditional logical notation

    P ⊨ S

The calculus is very diﬀerent from that governing the sat notation, because it

is based on the structure of the speciﬁcation rather than the structure of the

programs. For example, the rule for negation is

If it is not true that P = F, then P = ¬ F

This means that the whole process P must be written before the proof of its

correctness starts. In contrast, the use of sat permits proof of the correctness

of a compound process to be constructed from a proof of correctness of its

parts. Furthermore, there is never a need to prove that a process does not

satisfy its speciﬁcation. Modal logic is a subject of great theoretical interest,

but in the context of communicating processes it does not yet show much

promise for useful application.
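To make the contrast concrete, here is a minimal Python sketch of a checker for such modal formulae over a finite transition system. Everything in it is an assumption made for illustration: the dictionary encoding of the transition system, the tags "dia" (may do a and then satisfy S) and "box" (whenever it does a, must then satisfy S), and the function name sat. Note how the recursion follows the structure of the formula, and how the negation case needs the complete process to be available before it can answer.

```python
# A transition system is a dict: state -> list of (action, next_state).
# Formulae are tuples (hypothetical encoding):
#   ("true",)             always satisfied
#   ("not", F)            negation
#   ("and", F, G)         conjunction
#   ("dia", a, F)         may do a and then satisfy F
#   ("box", a, F)         after every a, must satisfy F
TRUE = ("true",)

def sat(lts, state, f):
    """Decide whether the given state satisfies formula f."""
    kind = f[0]
    if kind == "true":
        return True
    if kind == "not":
        # The rule for negation: satisfied exactly when f[1] is not.
        return not sat(lts, state, f[1])
    if kind == "and":
        return sat(lts, state, f[1]) and sat(lts, state, f[2])
    if kind == "dia":
        return any(sat(lts, t, f[2]) for act, t in lts[state] if act == f[1])
    if kind == "box":
        return all(sat(lts, t, f[2]) for act, t in lts[state] if act == f[1])
    raise ValueError(kind)

# A simple vending machine: coin, then choc, then start again.
vm = {"VM": [("coin", "V1")], "V1": [("choc", "VM")]}

can_buy = sat(vm, "VM", ("dia", "coin", ("dia", "choc", TRUE)))
no_free_choc = sat(vm, "VM", ("not", ("dia", "choc", TRUE)))
```

In this sketch both can_buy and no_free_choc evaluate to True: the machine may accept a coin and then deliver a chocolate, and it cannot deliver a chocolate as its first action.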

In general, equality in CCS is a strong relation, since equal processes must

resemble each other both in their observable behaviour and in the structure of

their hidden behaviour. CCS is therefore a good model for formulating and ex-

ploring various weaker deﬁnitions of equivalence, which ignore more aspects

of the hidden behaviour. Milner accomplishes this by introducing the concept

of observational equivalence. This involves deﬁnition of the set of observa-

tions or experiments that may be made on a process; then two processes are

equivalent if there is no observation that can be made of one of them but not

the other—a nice application of the philosophical principle of the identity of

indiscernibles. The principle was taken as the basis of the mathematical the-

ory in this book, which equates a process with the set of observations that can

be made of its behaviour. A sign of the success of the principle is that two

processes P and Q are equivalent if and only if they satisfy the same speciﬁc-

ations

∀S • P ⊨ S ≡ Q ⊨ S

Unfortunately, it doesn’t always work as simply as this. If two processes

are to be regarded as equal, the result of transforming them by the same func-

tion should also be equal, i.e.,

(P ≡ Q) ⇒ (F(P) ≡ F(Q))

Since τ is supposed to be hidden, a natural deﬁnition of an observation would

lead to the equivalence

(τ.P) ≡ P

However, (τ.P + τ.NIL) should not be equivalent to (P + NIL), which equals P,

since the former can make a nondeterministic choice to deadlock instead of

behaving like P.
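The diﬀerence between the two processes can be exhibited on an explicit transition system. The Python sketch below is illustrative only: the state names, the choice of P as (a.NIL), and the function names tau_closure and may_deadlock_silently are assumptions of this sketch. It asks whether a process can reach, by hidden τ moves alone, a state in which no further action is possible.

```python
TAU = "tau"

def tau_closure(lts, start):
    """All states reachable from start using only tau transitions."""
    seen, todo = {start}, [start]
    while todo:
        s = todo.pop()
        for act, t in lts[s]:
            if act == TAU and t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

def may_deadlock_silently(lts, start):
    """True if some tau-reachable state has no transitions at all."""
    return any(not lts[s] for s in tau_closure(lts, start))

# Take P to be a.NIL for the purpose of the example.
lts = {
    "NIL": [],
    "P": [("a", "NIL")],
    "tau.P + tau.NIL": [(TAU, "P"), (TAU, "NIL")],
    "P + NIL": [("a", "NIL")],          # the same behaviour as P
}

with_taus = may_deadlock_silently(lts, "tau.P + tau.NIL")
without_taus = may_deadlock_silently(lts, "P + NIL")
```

Here with_taus is True and without_taus is False: the ﬁrst process may refuse a altogether after an invisible choice, whereas the second must always oﬀer it.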

Milner’s solution to this problem is to use the concept of congruence in

place of equivalence. Among the experiments which can be performed on the

process P is to place it in an environment F(P) (where F is composed of other

processes by means of operators in the language), and then to observe the

behaviour of this assembly. Processes P and Q are (observationally) congruent

if for every F expressible in the language the process F(P) is observationally

equivalent to F(Q). According to this deﬁnition τ.P is not congruent to P.

The discovery of a full set of laws of congruence is a signiﬁcant mathematical

achievement.

The need for the extra complexity of observational congruence is due to

the inability to make suﬃciently penetrating observations of the behaviour of

P, without placing it in an environment F(P). That is why we have had to in-

troduce the concept of a refusal set, rather than just a refusal of a single event.

The refusal set seems to be the weakest kind of observation that eﬃciently

represents the possibility of nondeterministic deadlock; and it therefore leads

to a much weaker equivalence, and to a more powerful set of algebraic laws

than CCS.
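A small calculation shows why a set of refused events carries more information than refusals of single events. The Python sketch below is an illustration under stated assumptions: a process is represented only by the menus (sets of events) that it may oﬀer on its ﬁrst step after any internal choice, the alphabet is {a, b}, and the function names are hypothetical. A set X is a refusal if some menu is disjoint from X.

```python
from itertools import combinations

def subsets(alphabet):
    """All subsets of the alphabet, as frozensets."""
    items = sorted(alphabet)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def refusals(menus, alphabet):
    """Sets X that may be refused: X is disjoint from some possible menu."""
    return {x for x in subsets(alphabet) if any(not (x & m) for m in menus)}

def single_event_refusals(menus, alphabet):
    """Events e for which the singleton {e} may be refused."""
    return {e for e in alphabet if frozenset({e}) in refusals(menus, alphabet)}

ALPHABET = frozenset({"a", "b"})

# P = (a -> STOP) |~| (b -> STOP): offers either {a} or {b}, chosen internally.
P = [frozenset({"a"}), frozenset({"b"})]
# Q = P |~| STOP: may additionally offer nothing at all.
Q = P + [frozenset()]

same_singles = single_event_refusals(P, ALPHABET) == single_event_refusals(Q, ALPHABET)
q_can_deadlock = ALPHABET in refusals(Q, ALPHABET)
p_can_deadlock = ALPHABET in refusals(P, ALPHABET)
```

In this sketch P and Q agree on every single-event refusal (each may refuse a, and each may refuse b), yet only Q can refuse the whole set {a, b}, which is exactly its possibility of nondeterministic deadlock.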

The description given above has overemphasised the diﬀerences of CCS,

and has overstated the case for practical application of the approach taken

in this book. The two approaches share their most important characteristic,

namely a sound mathematical basis for reasoning about speciﬁcations, designs

and implementations; and either of them can be used for both theoretical in-

vestigations and for practical applications.

Select Bibliography

Conway, M. E. ‘Design of a Separable Transition-Diagram Compiler,’ Comm.

ACM 6 (7), 8–15 (1963)

The classical paper on coroutines.

Hoare, C. A. R. ‘Monitors: An Operating System Structuring Concept,’ Comm.

ACM 17 (10), 549–557 (1974)

A programming language feature to aid in construction of operating

systems.

Hoare, C. A. R. ‘Communicating Sequential Processes,’ Comm. ACM 21 (8),

666–677 (1978)

A programming language design—an early version of the design

propounded in this book.

Milner, R. A Calculus of Communicating Systems, Lecture Notes in Computer

Science 92, Springer Verlag, New York (1980)

A clear mathematical treatment of the general theory of concurrency and

synchronisation.

Kahn, G. ‘The Semantics of a Simple Language for Parallel Programming,’ in

Information Processing 74, North Holland, Amsterdam, pp. 471–475

(1974)

An elegant treatment of functional multiprogramming.

Welsh, J. and McKeag, R. M. Structured System Programming, Prentice–Hall,

London, pp. 324 (1980)

An account of PASCAL PLUS and its use in structuring an operating

system and compiler.

Filman, R. E. and Friedman, D. P. Coordinated Computing, Tools and

Techniques for Distributed Software, McGraw–Hill pp. 370 (1984)

A useful survey and further bibliography.

Dahl, O–J. ‘Hierarchical Program Structures,’ in Structured Programming,

Academic Press, London, pp. 175–220 (1972)

An introduction to the inﬂuential ideas of SIMULA 67.

(INMOS Ltd.) occam™ Programming Manual, Prentice–Hall International, pp. 100 (1984)

(ANSI/MIL–STD 1815A) Ada™ Reference Manual

Chapter 9 describes the tasking feature.

Brookes, S. D., Hoare, C. A. R., and Roscoe, A. W. ‘A Theory of

Communicating Sequential Processes,’ Journal ACM 31 (7), 560–599

(1984)

An account of the mathematics of nondeterministic processes, but

excluding divergences.

Brookes, S. D. and Roscoe, A. W. ‘An Improved Failures Model for

Communicating Sequential Processes,’ in Proceedings NSF–SERC Seminar

on Concurrency, Springer Verlag, New York, Lecture Notes in Computer

Science (1985)

The improvement is the addition of divergences.




Glossary of Symbols

Logic

Notation        Meaning                                   Example
=               equals                                    x = x
≠               is distinct from                          x ≠ x + 1
□               end of an example or proof
P ∧ Q           P and Q (both true)                       x ≤ x + 1 ∧ x ≠ x + 1
P ∨ Q           P or Q (one or both true)                 x ≤ y ∨ y ≤ x
¬ P             not P (P is not true)                     ¬ 3 ≥ 5
P ⇒ Q           if P then Q                               x < y ⇒ x ≤ y
P ≡ Q           P if and only if Q                        x < y ≡ y > x
∃ x • P         there exists an x such that P             ∃ x • x > y
∀ x • P         for all x, P                              ∀ x • x < x + 1
∃ x : A • P     there exists an x in set A such that P
∀ x : A • P     for all x in set A, P

Sets

Notation          Meaning                                        Example
∈                 is a member of                                 Napoleon ∈ mankind
∉                 is not a member of                             Napoleon ∉ Russians
{}                the empty set (with no members)                ¬ Napoleon ∈ {}
{a}               the singleton set of a; a is the only member   x ∈ {a} ≡ x = a
{a, b, c}         the set with members a, b, and c               c ∈ {a, b, c}
{ x | P(x) }      the set of all x such that P(x)                {a} = { x | x = a }
A ∪ B             A union B                                      A ∪ B = { x | x ∈ A ∨ x ∈ B }
A ∩ B             A intersect B                                  A ∩ B = { x | x ∈ A ∧ x ∈ B }
A − B             A minus B                                      A − B = { x | x ∈ A ∧ ¬ x ∈ B }
A ⊆ B             A is contained in B                            A ⊆ B ≡ ∀ x : A • x ∈ B
A ⊇ B             A contains B                                   A ⊇ B ≡ B ⊆ A
{ x : A | P(x) }  the set of x in A such that P(x)
ℕ                 the set of natural numbers                     {0, 1, 2, . . .}
ℙ A               the power set of A                             ℙ A = { X | X ⊆ A }
⋃ (n≥0) A_n       union of a family of sets                      ⋃ (n≥0) A_n = { x | ∃ n ≥ 0 • x ∈ A_n }
⋂ (n≥0) A_n       intersection of a family of sets               ⋂ (n≥0) A_n = { x | ∀ n ≥ 0 • x ∈ A_n }

Functions

Notation          Meaning                                                  Example
f : A → B         f is a function which maps each member of A              square : ℕ → ℕ
                  to a member of B
f(x)              that member of B to which f maps x (in A)
injection         a function f which maps each member of A to a            x ≠ y ⇒ f(x) ≠ f(y)
                  distinct member of B
f⁻¹               inverse of an injection f                                x = f(y) ≡ y = f⁻¹(x)
{ f(x) | P(x) }   the set formed by applying f to all x such that P(x)
f(C)              the image of C under f,                                  square({3, 5}) = {9, 25}
                  { y | ∃ x • y = f(x) ∧ x ∈ C }
f ◦ g             f composed with g                                        f ◦ g(x) = f(g(x))
λ x • f(x)        the function which maps each value of x to f(x)          (λ x • f(x))(3) = f(3)


Traces

Section  Notation    Meaning                                         Example
1.5      ⟨⟩          the empty trace
1.5      ⟨a⟩         the trace containing only a (singleton sequence)
1.5      ⟨a, b, c⟩   the trace with three symbols, a then b, then c
1.6.1    ⌢           (between traces) followed by                    ⟨a, b, c⟩ = ⟨a, b⟩ ⌢ ⟨c⟩
1.6.1    sⁿ          s repeated n times                              ⟨a, b⟩² = ⟨a, b, a, b⟩
1.6.2    s ↾ A       s restricted to A                               ⟨b, c, d, a⟩ ↾ {a, c} = ⟨c, a⟩
1.6.5    s ≤ t       s is a prefix of t                              ⟨a, b⟩ ≤ ⟨a, b, c⟩
4.2.2    s ≤ⁿ t      s is like t with up to n symbols removed        ⟨a, b⟩ ≤² ⟨a, b, c, d⟩
1.6.5    s in t      s is in t                                       ⟨c, d⟩ in ⟨b, c, d, a, b⟩
1.6.6    #s          the length of s                                 #⟨b, c, b, a⟩ = 4
1.6.6    s ↓ b       the count of b in s                             ⟨b, c, b, a⟩ ↓ b = 2
1.9.6    s ↓ c       the communications on channel c recorded in s   ⟨c.1, a.4, c.3, d.1⟩ ↓ c = ⟨1, 3⟩
1.9.2    ⌢/ s        flatten s                                       ⌢/ ⟨⟨a, b⟩, ⟨⟩⟩ = ⟨a, b⟩
1.9.7    s ; t       s successfully followed by t                    (s ⌢ ⟨✓⟩) ; t = s ⌢ t
1.6.4    A*          set of sequences with elements in A             A* = { s | s ↾ A = s }
1.6.3    s₀          the head of s                                   ⟨a, b, c⟩₀ = a
1.6.3    s′          the tail of s                                   ⟨a, b, c⟩′ = ⟨b, c⟩
1.9.4    s[i]        the ith element of s                            ⟨a, b, c⟩[1] = b
1.9.1    f*(s)       f star of s                                     square*(⟨1, 5, 3⟩) = ⟨1, 25, 9⟩
1.9.5    s̄           reverse of s                                    s = ⟨a, b, c⟩ gives s̄ = ⟨c, b, a⟩

Special Events

Section  Notation   Meaning
1.9.7    ✓          success (successful termination)
2.6.2    l.a        participation in event a by a process named l
4.1      c.v        communication of value v on channel c
4.5      l.c        channel c of a process named l
4.5      l.c.v      communication of a message v on channel l.c
5.4.1    ↯          catastrophe (lightning)
5.4.3    ⓧ          exchange
5.4.4    ⓒ          checkpoint for later recovery
6.2      acquire    acquisition of a resource
6.2      release    release of a resource

Processes

Section  Notation            Meaning
1.1      αP                  the alphabet of process P
4.1      αc                  the set of messages communicable on channel c
1.1.1    a → P               a then P
1.1.3    (a → P | b → Q)     a then P choice b then Q (provided a ≠ b)
1.1.3    (x : A → P(x))      (choice of) x from A then P(x)
1.1.2    µ X : A • F(X)      the process X with alphabet A such that X = F(X)
1.8      P / s               P after (engaging in events of trace) s
2.3      P || Q              P in parallel with Q
2.6.2    l : P               P with name l
2.6.4    L : P               P with names from set L
3.2      P ⊓ Q               P or Q (non-deterministic)
3.3      P □ Q               P choice Q
3.5      P \ C               P without C (hiding)
3.6      P ||| Q             P interleave Q
4.4      P >> Q              P chained to Q
4.5      P // Q              P subordinate to Q
6.4      l :: P // Q         remote subordination
5.1      P ; Q               P (successfully) followed by Q
5.4      P △ Q               P interrupted by Q
5.4.1    P ↯ Q               P but on catastrophe Q
5.4.2    P̂                   restartable P
5.4.3    P ⓧ Q               P alternating with Q
5.5      P <| b |> Q         P if b else Q
5.1      *P                  repeat P
5.5      b * P               while b repeat P
5.5      x := e              x becomes (value of) e
4.2      b!e                 on (channel) b output (value of) e
4.2      b?x                 on (channel) b input to x
6.2      l!e?x               call of shared subroutine named l with value parameter e and results to x
1.10.1   P sat S             (process) P satisfies (specification) S
1.10.1   tr                  an arbitrary trace of the specified process
3.7      ref                 an arbitrary refusal of the specified process
5.5.2    x✓                  the final value of x produced by the specified process
5.5.1    var(P)              set of variables assignable by P
5.5.1    acc(P)              set of variables accessible by P
2.8.2    P ⊑ Q               (deterministic) Q can do at least as much as P
3.9      P ⊑ Q               (nondeterministic) Q is as good as P or better
5.5.1    D e                 expression e is defined

Algebra

Term            Meaning
reflexive       a relation R such that x R x
antisymmetric   a relation R such that x R y ∧ y R x ⇒ x = y
transitive      a relation R such that x R y ∧ y R z ⇒ x R z
partial order   a relation ≤ that is reflexive, antisymmetric, and transitive
bottom          a least element ⊥ such that ⊥ ≤ x
monotonic       a function f that respects a partial order: x ≤ y ⇒ f(x) ≤ f(y)
strict          a function f that preserves bottom: f(⊥) = ⊥
idempotent      a binary operator f such that x f x = x
symmetric       a binary operator f such that x f y = y f x
associative     a binary operator f such that x f (y f z) = (x f y) f z
distributive    f distributes through g if x f (y g z) = (x f y) g (x f z)
                and (y g z) f x = (y f x) g (z f x)
unit            of f is an element 1 such that x f 1 = 1 f x = x
zero            of f is an element 0 such that x f 0 = 0 f x = 0

Graphs

Term               Meaning
graph              a relation drawn as a picture
node               a circle in a graph representing an element in the domain or range of a relation
arc                a line or arrow in a graph connecting nodes between which the pictured relation holds
undirected graph   graph of a symmetric relation
directed graph     graph of an asymmetric relation, often drawn with arrows
directed cycle     a set of nodes connected in a cycle by arrows all in the same direction
undirected cycle   a set of nodes connected in a cycle by arcs or arrows in either direction

Contents

Foreword
Preface
Summary
Acknowledgements
Glossary of Symbols

1 Processes
    1.1 Introduction
    1.2 Pictures
    1.3 Laws
    1.4 Implementation of processes
    1.5 Traces
    1.6 Operations on traces
    1.7 Implementation of traces
    1.8 Traces of a process
    1.9 More operations on traces
    1.10 Specifications

2 Concurrency
    2.1 Introduction
    2.2 Interaction
    2.3 Concurrency
    2.4 Pictures
    2.5 Example: The Dining Philosophers
    2.6 Change of symbol
    2.7 Specifications
    2.8 Mathematical theory of deterministic processes

3 Nondeterminism
    3.1 Introduction
    3.2 Nondeterministic or
    3.3 General choice
    3.4 Refusals
    3.5 Concealment
    3.6 Interleaving
    3.7 Specifications
    3.8 Divergence
    3.9 Mathematical theory of non-deterministic processes

4 Communication
    4.1 Introduction
    4.2 Input and output
    4.3 Communications
    4.4 Pipes
    4.5 Subordination

5 Sequential Processes
    5.1 Introduction
    5.2 Laws
    5.3 Mathematical treatment
    5.4 Interrupts
    5.5 Assignment

6 Shared Resources
    6.1 Introduction
    6.2 Sharing by interleaving
    6.3 Shared storage
    6.4 Multiple resources
    6.5 Operating systems
    6.6 Scheduling

7 Discussion
    7.1 Introduction
    7.2 Shared storage
    7.3 Communication
    7.4 Mathematical models

Select Bibliography
Index

1 Processes

1.1 Introduction

Forget for a while about computers and computer programming, and think instead about objects in the world around us, which act and interact with us and with each other in accordance with some characteristic pattern of behaviour. Think of clocks and counters and telephones and board games and vending machines. To describe their patterns of behaviour, first decide what kinds of event or action will be of interest, and choose a different name for each kind. In the case of a simple vending machine, there are two kinds of event:

    coin: the insertion of a coin in the slot of a vending machine;
    choc: the extraction of a chocolate from the dispenser of the machine.

In the case of a more complex vending machine, there may be a greater variety of events:

    in1p: the insertion of one penny;
    in2p: the insertion of a two penny coin;
    small: the extraction of a small biscuit or cookie;
    large: the extraction of a large biscuit or cookie;
    out1p: the extraction of one penny in change.

Note that each event name denotes an event class; there may be many occurrences of events in a single class, separated in time. A similar distinction between a class and an occurrence should be made in the case of the letter ‘h’, of which there are many occurrences spatially separated in the text of this book.

The set of names of events which are considered relevant for a particular description of an object is called its alphabet. The alphabet is a permanent predefined property of an object. It is logically impossible for an object to engage in an event outside its alphabet; for example, a machine designed to sell chocolates could not suddenly deliver a toy battleship. But the converse does not hold. A machine designed to sell chocolates may actually never do so, perhaps because it has not been filled, or it is broken, or nobody wants chocolates. But once it is decided that choc is in the alphabet of the machine, it remains so, even if that event never actually occurs.

The choice of an alphabet usually involves a deliberate simplification, a decision to ignore many other properties and actions which are considered to be of lesser interest. For example, the colour, weight, and shape of a vending machine are not described, and certain very necessary events in its life, such as replenishing the stack of chocolates or emptying the coin box, are deliberately ignored, perhaps on the grounds that they are not (or should not be) of any concern to the customers of the machine.

The actual occurrence of each event in the life of an object should be regarded as an instantaneous or an atomic action without duration. Extended or time-consuming actions should be represented by a pair of events, the first denoting its start and the second denoting its finish. The duration of an action is represented by the interval between the occurrence of its start event and the occurrence of its finish event; during such an interval, other events may occur. Two extended actions may overlap in time if the start of each one precedes the finish of the other.

Another detail which we have deliberately chosen to ignore is the exact timing of occurrences of events. The advantage of this is that designs and reasoning about them are simplified, and furthermore can be applied to physical and computing systems of any speed and performance. In cases where timing of responses is critical, these concerns can be treated independently of the logical correctness of the design. Independence of timing has always been a necessary condition to the success of high-level programming languages.

A consequence of ignoring time is that we refuse to answer or even to ask whether one event occurs simultaneously with another. When simultaneity of a pair of events is important (e.g. in synchronisation) we represent it as a single-event occurrence; and when it is not, we allow two potentially simultaneous event occurrences to be recorded in either order.

In choosing an alphabet, there is no need to make a distinction between events which are initiated by the object (perhaps choc) and those which are initiated by some agent outside the object (for example, coin). The avoidance of the concept of causality leads to considerable simplification in the theory and its application.

Let us now begin to use the word process to stand for the behaviour pattern of an object, insofar as it can be described in terms of the limited set of events selected as its alphabet. We shall use the following conventions.

1. Words in lower-case letters denote distinct events, e.g., coin, choc, in2p, out1p; and so also do the letters a, b, c, d, e.

2. Words in upper-case letters denote specific defined processes, e.g.,

    VMS: the simple vending machine
    VMC: the complex vending machine

   and the letters P, Q, R (occurring in laws) stand for arbitrary processes.

3. The letters x, y, z are variables denoting events.

4. The letters A, B, C stand for sets of events.

5. The letters X, Y are variables denoting processes.

6. The alphabet of process P is denoted αP, e.g.,

    αVMS = {coin, choc}
    αVMC = {in1p, in2p, small, large, out1p}

The process with alphabet A which never actually engages in any of the events of A is called STOP_A. This describes the behaviour of a broken object: although it is equipped with the physical capabilities to engage in the events of A, it never exercises those capabilities. Objects with different alphabets are distinguished, even if they never do anything. So STOP_αVMS might have given out a chocolate, whereas STOP_αVMC could never give out a chocolate, only a biscuit. A customer for either machine knows these facts, even if he does not know that both machines are broken.

In the remainder of this introduction, we shall define some simple notations to aid in the description of objects which actually succeed in doing something.

1.1.1 Prefix

Let x be an event and let P be a process. Then

    (x → P)    (pronounced “x then P”)

describes an object which first engages in the event x and then behaves exactly as described by P. The process (x → P) is defined to have the same alphabet as P, so this notation must not be used unless x is in that alphabet; more formally,

    α(x → P) = αP    provided x ∈ αP

Examples

X1 A simple vending machine which consumes one coin before breaking

    (coin → STOP_αVMS)

X2 A simple vending machine that successfully serves two customers before breaking

    (coin → (choc → (coin → (choc → STOP_αVMS))))

Initially, this machine will accept insertion of a coin in its slot, but will not allow a chocolate to be extracted. But after the first coin is inserted, the coin slot closes until a chocolate is extracted. This machine will not accept two coins in a row, nor will it give out two consecutive chocolates.

In future, we shall omit brackets in the case of linear sequences of events, like those in X2, on the convention that → is right associative.

X3 A counter starts on the bottom left square of a board, and can move only up or right to an adjacent white square

    αCTR = {up, right}
    CTR = (right → up → right → right → STOP_αCTR)

Note that the → operator always takes a process on the right and a single event on the left. If P and Q are processes, it is syntactically incorrect to write

    P → Q

The correct method of describing a process which behaves first like P and then like Q is described in Chapter 5. Similarly, if x and y are events, it is syntactically incorrect to write

    x → y

Such a process could be correctly described

    x → (y → STOP)

Thus we carefully distinguish the concept of an event from that of a process which engages in events, maybe many events or even none.

1.1.2 Recursion

The prefix notation can be used to describe the entire behaviour of a process that eventually stops. But it would be extremely tedious to write out the full behaviour of a vending machine for its maximum design life; so we need a method of describing repetitive behaviours by much shorter notations. Preferably these notations should not require a prior decision on the length of life of an object; this will permit description of objects which will continue to act and interact with their environment for as long as they are needed.

Consider the simplest possible everlasting object, a clock which never does anything but tick (the act of winding it is deliberately ignored)

    αCLOCK = {tick}

Consider next an object that behaves exactly like the clock, except that it first emits a single tick

    (tick → CLOCK)

The behaviour of this object is indistinguishable from that of the original clock. This reasoning leads to formulation of the equation

    CLOCK = (tick → CLOCK)

This can be regarded as an implicit definition of the behaviour of the clock, in the same way that the square root of two might be defined as the positive solution for x in the equation

    x = x² + x − 2

The equation for the clock has some obvious consequences, which are derived by simply substituting equals for equals

    CLOCK = (tick → CLOCK)                      [original equation]
          = (tick → (tick → CLOCK))             [by substitution]
          = (tick → (tick → (tick → CLOCK)))    [similarly]

The equation can be unfolded as many times as required, and the possibility of further unfolding will still be preserved. The potentially unbounded behaviour of the CLOCK has been effectively defined as

    tick → tick → tick → · · ·

in the same way as the square root of two can be thought of as the limit of a series of decimals 1.414 . . .

This method of self-referential recursive definition of processes will work only if the right-hand side of the equation starts with at least one event prefixed to all recursive occurrences of the process name. For example, the recursive equation

    X = X

does not succeed in defining anything: everything is a solution to this equation. A process description which begins with a prefix is said to be guarded. If F(X) is a guarded expression containing the process name X, and A is the alphabet of X, then we claim that the equation

    X = F(X)

has a unique solution with alphabet A. It is sometimes convenient to denote this solution by the expression

    µ X : A • F(X)

Here X is a local name (bound variable), and can be changed at will, since

    µ X : A • F(X) = µ Y : A • F(Y)

This equality is justified by the fact that a solution for X in the equation X = F(X) is also a solution for Y in the equation Y = F(Y).

In future, we will give recursive definitions of processes either by equations, or by use of µ, whichever is more convenient. In the case of µ X : A • F(X), we shall often omit explicit mention of the alphabet A, where this is obvious from the content or context of the process.

Examples

X1 A perpetual clock

    CLOCK = µ X : {tick} • (tick → X)

X2 At last, a simple vending machine which serves as many chocs as required

    VMS = (coin → (choc → VMS))

As explained above, this equation is just an alternative for the more formal definition

    VMS = µ X : {coin, choc} • (coin → (choc → X))

X3 A machine that gives change for 5p repeatedly

    αCH5A = {in5p, out2p, out1p}
    CH5A = (in5p → out2p → out1p → out2p → CH5A)

X4 A different change-giving machine with the same alphabet

    CH5B = (in5p → out1p → out1p → out1p → out2p → CH5B)

The claim that guarded equations have a solution, and that this solution is unique, may be informally justified by the method of substitution. Each time that the right-hand side of the equation is substituted for every occurrence of the process name, the expression defining the behaviour of the process gets longer, and so describes a longer initial segment of behaviour. Any finite amount of behaviour can be determined in this way. Two objects which behave the same up to every moment in time have the same behaviour, i.e., they are the same process. Those who find this reasoning incomprehensible or unconvincing should accept this claim as an axiom, whose value and relevance will gradually become more apparent. A more formal proof cannot be given without some mathematical definition of exactly what a process is. This will be done in Section 2.8.3. The account of recursion given here relies heavily on guardedness of recursive equations. A meaning for unguarded recursions will be discussed in Section 3.8.

1.1.3 Choice

By means of prefixing and recursion it is possible to describe objects with a single possible stream of behaviour. However, many objects allow their behaviour to be influenced by interaction with the environment within which they are placed. For example, a vending machine may offer a choice of slots for inserting a 2p or a 1p coin; and it is the customer that decides between these two events.

If x and y are distinct events

    (x → P | y → Q)

describes an object which initially engages in either of the events x or y. After the first event has occurred, the subsequent behaviour of the object is described by P if the first event was x, or by Q if the first event was y. Since x and y are different events, the choice between P and Q is determined by the first event that actually occurs. As before, we insist on constancy of alphabets, i.e.,

    α(x → P | y → Q) = αP    provided {x, y} ⊆ αP and αP = αQ

The bar | should be pronounced “choice”: “x then P choice y then Q”.
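The provisos attached to prefix and choice can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the names Process, stop, prefix and choice are hypothetical, and the representation handles finite (non-recursive) processes only. It shows how the alphabet conditions and the distinctness of the two initial events might be checked mechanically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    alphabet: frozenset      # every event the process could ever engage in
    branches: tuple = ()     # pairs (first event, subsequent Process); empty for STOP


def stop(alphabet):
    """STOP_A: never engages in any event of its alphabet A."""
    return Process(frozenset(alphabet))


def prefix(x, p):
    """(x -> P), permitted only when x is in the alphabet of P."""
    if x not in p.alphabet:
        raise ValueError(f"event {x!r} is outside the alphabet of P")
    return Process(p.alphabet, ((x, p),))


def choice(x, p, y, q):
    """(x -> P | y -> Q) for distinct x and y, with P and Q over one alphabet."""
    if x == y:
        raise ValueError("the two initial events must be distinct")
    if p.alphabet != q.alphabet or not {x, y} <= p.alphabet:
        raise ValueError("alphabets must agree and contain both events")
    return Process(p.alphabet, ((x, p), (y, q)))


# A machine offering a choice of slots for a 2p or a 1p coin, then breaking.
COINS = {"in1p", "in2p"}
slots = choice("in2p", stop(COINS), "in1p", stop(COINS))
```

Here slots has the alphabet {in1p, in2p} and offers both coins initially, whereas choice("in1p", ..., "in1p", ...) or prefix("choc", stop(COINS)) is rejected, mirroring the side conditions stated above.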

Examples

X1 The possible movements of a counter on the board are defined by the process

    (up → STOP | right → right → up → STOP)

X2 A machine which offers a choice of two combinations of change for 5p (compare 1.1.2 X3 and X4, which offer no choice)

    CH5C = in5p → (out1p → out1p → out1p → out2p → CH5C
                  | out2p → out1p → out2p → CH5C)

The choice is exercised by the customer of the machine.

X3 A machine that serves either chocolate or toffee on each transaction

    VMCT = µ X • coin → (choc → X | toffee → X)

X4 A more complicated vending machine, which offers a choice of coins and a choice of goods and change

    VMC = (in2p → (large → VMC
                  | small → out1p → VMC)
          | in1p → (small → VMC
                  | in1p → (large → VMC
                           | in1p → STOP)))

Like many complicated machines, this has a design flaw. It is often easier to change the user manual than correct the design, so we write a notice on the machine

    “WARNING: do not insert three pennies in a row.”
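The design flaw in X4 can be located mechanically. The sketch below is an illustration rather than part of the notation: it transcribes VMC into a table of states (the state names PAID2, CHANGE, PAID1, PAID1x2 and the function name are hypothetical labels chosen here) and searches breadth-first for the shortest sequence of events after which nothing more can happen.

```python
from collections import deque

# Each state maps the events it offers to the state that follows.
VMC_STATES = {
    "VMC":     {"in2p": "PAID2", "in1p": "PAID1"},
    "PAID2":   {"large": "VMC", "small": "CHANGE"},
    "CHANGE":  {"out1p": "VMC"},
    "PAID1":   {"small": "VMC", "in1p": "PAID1x2"},
    "PAID1x2": {"large": "VMC", "in1p": "STOP"},
    "STOP":    {},                       # offers no events at all
}


def shortest_sequence_to_stop(states, start):
    """Return the shortest event sequence leading to a state with an empty menu."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, events = queue.popleft()
        if not states[state]:            # nothing is offered: the machine is stuck
            return events
        for event, following in states[state].items():
            if following not in seen:
                seen.add(following)
                queue.append((following, events + [event]))
    return None                          # no reachable state is stuck


flaw = shortest_sequence_to_stop(VMC_STATES, "VMC")
```

For this table the search returns three consecutive in1p events, which is exactly the behaviour the warning notice asks the customer to avoid.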

X5 A machine that allows its customer to sample a chocolate, and trusts him to pay after. The normal sequence of events is also allowed

    VMCRED = µ X • (coin → choc → X
                   | choc → coin → X)

X6 To prevent loss, an initial payment is extracted for the privilege of using VMCRED

    VMS2 = (coin → VMCRED)

This machine will allow insertion of up to two consecutive coins before extraction of up to two consecutive chocolates; but it will never give out more chocolates than have been previously paid for.

X7 A copying process engages in the following events

    in.0: input of zero on its input channel
    in.1: input of one on its input channel
    out.0: output of zero on its output channel
    out.1: output of one on its output channel

Its behaviour consists of a repetition of pairs of events. On each cycle, it inputs a bit and outputs the same bit

    COPYBIT = µ X • (in.0 → out.0 → X
                    | in.1 → out.1 → X)

Note how this process allows its environment to choose which value should be input, but no choice is offered in the case of output. That will be the main difference between input and output in our treatment of communication in Chapter 4.

The definition of choice can readily be extended to more than two alternatives, e.g.,

    (x → P | y → Q | . . . | z → R)

Note that the choice symbol | is not an operator on processes; it would be syntactically incorrect to write P | Q, for processes P and Q. The reason for this rule is that we want to avoid giving a meaning to

    (x → P | x → Q)

which appears to offer a choice of first event, but actually fails to do so. This problem is solved, at the expense of introducing nondeterminism, in Section 3.3. Meanwhile, if x, y, z are distinct events,

    (x → P | y → Q | z → R)

should be regarded as a single operator with three arguments P, Q, R. It cannot be regarded as equal to

    (x → P | (y → Q | z → R))

which is syntactically incorrect.

In general, if B is any set of events and P(x) is an expression defining a process for each different x in B, then

    (x : B → P(x))

defines a process which first offers a choice of any event y in B, and then behaves like P(y). It should be pronounced “x from B then P of x”. In this construction, x is a local variable, so

    (x : B → P(x)) = (y : B → P(y))

The set B defines the initial menu of the process, since it gives the set of actions between which a choice is to be made at the start.

Examples

X8 A process which at all times can engage in any event of its alphabet A

    αRUN_A = A
    RUN_A = (x : A → RUN_A)

In the special case that the menu contains only one event e,

    (x : {e} → P(x)) = (e → P(e))

since e is the only possible initial event. In the even more special case that the initial menu is empty, nothing at all can happen, so

    (x : {} → P(x)) = (y : {} → Q(y)) = STOP

The binary choice operator | can also be defined using the more general notation

    (a → P | b → Q) = (x : B → R(x))

where B = {a, b} and R(x) = if x = a then P else Q.
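The way in which STOP, prefix and binary choice arise as special cases of (x : B → P(x)) can be sketched in Python. This is a minimal illustration under assumptions made here: a process is modelled as a pair of its initial menu and a function from events to processes, and the names general_choice, menu and after are hypothetical helpers, not notation from the text.

```python
def general_choice(b, behaviour):
    """(x : B -> P(x)): the initial menu B paired with a function x |-> P(x)."""
    return (frozenset(b), behaviour)


# Empty menu: nothing can happen, so the behaviour function is never consulted.
STOP = general_choice((), lambda x: None)


def prefix(e, p):
    """(e -> P) as the special case with the singleton menu {e}."""
    return general_choice({e}, lambda x: p)


def choice2(a, p, b, q):
    """(a -> P | b -> Q) with B = {a, b} and R(x) = P if x = a, else Q (assumes a != b)."""
    return general_choice({a, b}, lambda x: p if x == a else q)


def run(alphabet):
    """RUN_A = (x : A -> RUN_A); the lambda delays the recursive call."""
    return general_choice(alphabet, lambda x: run(alphabet))


def menu(process):
    """The set of events offered initially."""
    return process[0]


def after(process, event):
    """The behaviour that follows an initial event taken from the menu."""
    b, behaviour = process
    if event not in b:
        raise ValueError(f"event {event!r} is not on the initial menu")
    return behaviour(event)
```

For instance, menu(STOP) is empty, menu(prefix("e", STOP)) is {"e"}, and run({"x", "y"}) still offers both events after any sequence of them.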

Choice between three or more alternatives can be similarly expressed. Thus choice, prefixing and STOP are defined as just special cases of the general choice notation. This will be a great advantage in the formulation of general laws governing processes (Section 1.3), and in their implementation (Section 1.4).

1.1.4 Mutual recursion

Recursion permits the definition of a single process as the solution of a single equation. The technique is easily generalised to the solution of sets of simultaneous equations in more than one unknown. For this to work properly, all the right-hand sides must be guarded, and each unknown process must appear exactly once on the left-hand side of one of the equations.

Example

X1 A drinks dispenser has two buttons labelled ORANGE and LEMON. The actions of pressing the two buttons are setorange and setlemon. The actions of dispensing a drink are orange and lemon. The choice of drink that will next be dispensed is made by pressing the corresponding button. Before any button is pressed, no drink will be dispensed. Here are the equations defining the alphabet and behaviour of the process DD. The definition uses two auxiliary definitions of O and L, which are mutually recursive

    αDD = αO = αL = {setorange, setlemon, orange, lemon}
    DD = (setorange → O | setlemon → L)
    O = (orange → O | setlemon → L | setorange → O)
    L = (lemon → L | setorange → O | setlemon → L)

Informally, after the first event (which must be a setting event) the dispenser is in one of two states O or L. In each state, it may serve the appropriate drink or be set into the other state. Pressing the button for the same state is allowed, but has no effect.

By using indexed variables, it is possible to specify infinite sets of equations.

Examples

X2 An object starts on the ground, and may move up. At any time thereafter it may move up and down, except that when on the ground it cannot move any further down. But when it is on the ground, it may move around. Let n range over the natural numbers {0, 1, 2, . . .}. For each n, introduce the indexed name CT_n to describe the behaviour of the object when it is n moves off the ground. Its initial behaviour is defined as

    CT_0 = (up → CT_1 | around → CT_0)

and the remaining infinite set of equations consists of

    CT_{n+1} = (up → CT_{n+2} | down → CT_n)

where n ranges over the natural numbers 0, 1, 2, . . .

An ordinary inductive definition is one whose validity depends on the fact that the right-hand side of each equation uses only indices less than that of the left-hand side. Here, CT_{n+1} is defined in terms of CT_{n+2}, and so this can be regarded only as an infinite set of mutually recursive definitions, whose validity depends on the fact that the right-hand side of each equation is guarded.

1.2 Pictures

It may be helpful sometimes to make a pictorial representation of a process as a tree structure, consisting of circles connected by arrows. In the traditional terminology of state machines, the circles represent states of the process, and the arrows represent transitions between the states. The single circle at the root of the tree (usually drawn at the top of the page) is the starting state, and the process moves downward along the arrows. Each arrow is labelled by the event which occurs on making that transition. The arrows leading from the same node must all have different labels.

Examples (1.1.1 X1, X2; 1.1.3 X3)

    [Figure: three tree diagrams X1, X2 and X3, with arrows labelled coin, choc and toffee]

In these three examples, every branch of each tree ends in STOP, represented as a circle with no arrows leading out of it. To represent processes with unbounded behaviour it is necessary to introduce another convention, namely an unlabelled arrow leading from a leaf circle back to some earlier circle in the tree. The convention is that when a process reaches the node at the tail of the arrow, it immediately and imperceptibly goes back to the node to which the arrow points.

    [Figure: two diagrams X4 and X5, with arrows labelled coin, choc and toffee]

Clearly, these two pictures illustrate exactly the same process (1.1.3 X3). It is one of the weaknesses of pictures that proofs of such an equality are difficult to conduct pictorially.

Another problem with pictures is that they cannot illustrate processes with a very large or infinite number of states, for example CT_0

    [Figure: the beginning of the diagram for CT_0, with arrows labelled around, up and down]

There is never enough room to draw the whole picture. A counter with only 65 536 different states would take a long time to draw.

1. The ﬁrst law (L1 below) deals with the choice operator (1. then obviously the processes are identical.3 end) if c ≠ d Proof : LHS = (x : {} → P ) ≠ (x : {d} → P ) = RHS L1B (c → P ) ≠ (d → Q ) Proof : {c} ≠ {d} L1C (c → P | d → Q ) = (d → Q | c → P ) if x = c if x = d Proof : Deﬁne R(x) = P =Q LHS = (x : {c.1. It states that two processes deﬁned by choice are diﬀerent if they oﬀer diﬀerent choices on the ﬁrst step. d} = {d. However. we must learn to recognise which expressions describe the same object and which do not. Identity of processes with the same alphabet may be proved or disproved by appeal to algebraic laws very like those of arithmetic. or if after the same ﬁrst step the behave diﬀerently. if the initial choices are the same.3 end) because {} ≠ {d} by deﬁnition (1. c} by deﬁnition . and for each initial choice the subsequent behaviours are the same. a process that can do something is not the same as one that cannot do anything (x → P ) ≠ STOP In order to understand a notation properly and to use it eﬀectively. d} → R(x)) = (x : {d.1. it obviously should not matter in which order a choice between events is presented (x → P | y → Q ) = (y → Q | x → P ) On the other hand.3). c} → R(x)) = RHS by deﬁnition because {c.14 1 Processes 1. The law L1 has a number of consequences L1A STOP ≠ (d → P ) by deﬁnition (1. For example.3 Laws Even with the very restricted set of notations introduced so far. there are many diﬀerent ways of describing the same behaviour. just as everyone who understands arithmetic knows that (x + y) is the same number as (y + x). we assume without stating it that the alphabets of the processes on each side of an equation are the same. L1 (x : A → P (x)) = (y : B → Q (y)) ≡ (A = B ∧ ∀ x : A • P (x) = Q (x)) Here and elsewhere.

L1D  (c → P) = (c → Q) ≡ P = Q

Proof:  {c} = {c}

These laws permit proof of simple theorems.

Examples

X1  (coin → choc → coin → choc → STOP) ≠ (coin → STOP)

Proof: by L1D then L1A.

X2  µ X • (coin → (choc → X | toffee → X)) = µ X • (coin → (toffee → X | choc → X))

Proof: by L1C.

To prove more general theorems about recursively-defined processes, it is necessary to introduce a law which states that every properly guarded recursive equation has only one solution.

L2  If F(X) is a guarded expression,

    (Y = F(Y)) ≡ (Y = µ X • F(X))

An immediate but important corollary states that µ X • F(X) is indeed a solution of the relevant equation

L2A  µ X • F(X) = F(µ X • F(X))

Example

X3 Let VM1 = (coin → VM2) and VM2 = (choc → VM1). Required to prove VM1 = VMS.

Proof:  VM1 = (coin → VM2)              definition of VM1
            = (coin → (choc → VM1))     definition of VM2

Therefore VM1 is a solution of the same recursive equation as VMS. Since the equation is guarded, there is only one solution. So VM1 and VMS are just different names for this unique solution.

This theorem may seem so obviously true that its proof in no way adds to its credibility. The only purpose of the proof is to show by example that the laws are powerful enough to establish facts of this kind. When proving obvious facts from less obvious laws, it is important to justify every line of the proof in full, as a check that the proof is not circular.
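The conclusion of X3 can be cross-checked, though not proved, by comparing the finite behaviours of the two definitions. The following sketch is an illustration under assumptions made here: the equations are transcribed into a table of states, "VMS'" is a hypothetical name for the state of VMS after the coin, and event_sequences is a hypothetical helper. Agreement up to a bound is only evidence; the proof itself rests on L2.

```python
# Each state maps the events it offers to the state that follows.
EQUATIONS = {
    "VMS":  {"coin": "VMS'"},    # VMS = (coin -> (choc -> VMS))
    "VMS'": {"choc": "VMS"},     # hypothetical name for VMS after the coin
    "VM1":  {"coin": "VM2"},     # VM1 = (coin -> VM2)
    "VM2":  {"choc": "VM1"},     # VM2 = (choc -> VM1)
}


def event_sequences(equations, start, depth):
    """All sequences of at most `depth` events that the process `start` can perform."""
    found = {()}
    frontier = {((), start)}
    for _ in range(depth):
        frontier = {
            (sequence + (event,), following)
            for sequence, state in frontier
            for event, following in equations[state].items()
        }
        found |= {sequence for sequence, _ in frontier}
    return found


same_up_to_six = (
    event_sequences(EQUATIONS, "VM1", 6) == event_sequences(EQUATIONS, "VMS", 6)
)
```

Both processes perform exactly the alternating sequences coin, choc, coin, . . . up to the chosen bound, so same_up_to_six is true, in agreement with the algebraic proof.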

The law L2 can be extended to mutual recursion. A set of mutually recursive equations can be written in the general form using subscripts

    X_i = F(i, X)    for all i in S

where S is an indexing set with one member for each equation, X is an array of processes with indices ranging over the set S, and F(i, X) is a guarded expression. Under these conditions, the law L3 states that there is only one array X whose elements satisfy all the equations

L3  Under the conditions explained above,

    if (∀ i : S • (X_i = F(i, X) ∧ Y_i = F(i, Y))) then X = Y

1.4 Implementation of processes

Every process P expressible in the notations introduced so far can be written in the form

    (x : B → F(x))

where F is a function from symbols to processes, and where B may be empty (in the case of STOP), or may contain only one member (in the case of prefix), or may contain more than one member (in the case of choice). In the case of a recursively defined process, we have insisted that the recursion should be guarded, so that it may be written

    µ X • (x : B → F(x, X))

and this may be unfolded to the required form using L2A

    (x : B → F(x, µ X • (x : B → F(x, X))))

Thus every process may be regarded as a function F with a domain B, defining the set of events in which the process is initially prepared to engage; and for each x in B, F(x) defines the future behaviour of the process if the first event was x.

This insight permits every process to be represented as a function in some suitable functional programming language such as LISP. Each event in the alphabet of a process is represented as an atom, for example "COIN, "TOFFEE. A process is a function which can be applied to such a symbol as an argument.
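As a rough illustration of this functional view, here is a minimal Python sketch in which closures stand in for the LISP functions developed in the rest of this section. It assumes strings in place of atoms and a marker value BLEEP returned for any symbol that is not a possible first event; the names STOP, prefix, choice2 and menu follow the notation of this section, while vms and the use of a named Python function in place of LABEL are choices made for this sketch.

```python
BLEEP = "BLEEP"   # marker returned when the symbol is not a possible first event


def STOP(x):
    """STOP never engages in any event, so every symbol is refused."""
    return BLEEP


def prefix(c, p):
    """(c -> P): accept c and then behave like p; refuse everything else."""
    return lambda x: p if x == c else BLEEP


def choice2(c, p, d, q):
    """(c -> P | d -> Q): accept c or d and continue with the matching process."""
    return lambda x: p if x == c else q if x == d else BLEEP


def vms(x):
    """mu X . coin -> choc -> X; the named function supplies the recursion."""
    return prefix("coin", prefix("choc", vms))(x)


def menu(alphabet, p):
    """The symbols of the alphabet that can occur as the first event of p."""
    return [x for x in alphabet if p(x) is not BLEEP]
```

With alphabet ["coin", "choc"], menu reports ["coin"] for vms and ["choc"] for vms("coin"), and applying vms to "coin" and then to "choc" gives back vms itself.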

If the symbol is not a possible first event for the process, the function gives as its result a special symbol "BLEEP, which is used only for this purpose. For example, since STOP never engages in any event, this is the only result it can ever give, and so it is defined

STOP = λ x • "BLEEP

But if the actual argument is a possible event for the process, the function gives back as its result another function, representing the subsequent behaviour of the process. Thus (coin → STOP) is represented as the function

λ x • if x = "COIN then STOP else "BLEEP

This last example takes advantage of the facility of LISP for returning a function (e.g., STOP) as the result of a function. LISP also allows a function to be passed as an argument to a function, a facility used in representing a general prefix operation (c → P)

prefix(c, P) = λ x • if x = c then P else "BLEEP

A function to represent a general binary choice (c → P | d → Q) requires four parameters

choice2(c, P, d, Q) = λ x • if x = c then P else if x = d then Q else "BLEEP

Recursively defined processes may be represented with the aid of the LABEL feature of LISP. For example, the simple vending machine process (µ X • coin → choc → X) is represented as

LABEL X • prefix("COIN, prefix("CHOC, X))

The LABEL may also be used to represent mutual recursion. For example, CT (1.1.4 X2) may be regarded as a function from natural numbers to processes

(which are themselves functions, but let not that be a worry). So CT may be defined

CT = LABEL X • (λ n • if n = 0 then choice2("AROUND, X(0), "UP, X(1))
                      else choice2("UP, X(n + 1), "DOWN, X(n − 1)))

The process that starts on the ground is CT(0).

If P is a function representing a process, and A is a list containing the symbols of its alphabet, the LISP function menu(A, P) gives a list of all those symbols of A which can occur as the first event in the life of P

menu(A, P) = if A = NIL then NIL
             else if P(car(A)) = "BLEEP then menu(cdr(A), P)
             else cons(car(A), menu(cdr(A), P))

If x is in menu(A, P), P(x) is not "BLEEP, and is therefore a function defining the future behaviour of P after engaging in x. Thus if y is in menu(A, P(x)), then P(x)(y) will give its later behaviour, after both x and y have occurred.

This suggests a useful method of exploring the behaviour of a process. Write a program which first outputs the value of menu(A, P) on a screen, and then inputs a symbol from the keyboard. If the symbol is not in the menu, it should be greeted with an audible bleep and then ignored. Otherwise the symbol is accepted, and the process is repeated with P replaced by the result of applying P to the accepted symbol. The process is terminated by typing an "END symbol. Thus if k is the sequence of symbols input from the keyboard, the following function gives the sequence of outputs required

interact(A, P, k) = cons(menu(A, P),
                         if car(k) = "END then NIL
                         else if P(car(k)) = "BLEEP then cons("BLEEP, interact(A, P, cdr(k)))
                         else interact(A, P(car(k)), cdr(k)))

The notations used above for defining LISP functions are very informal, and they will need to be translated to the specific conventional S-expression

form of some particular implementation of LISP. For example in LISPkit, the prefix function can be defined

(prefix lambda (a p) (lambda (x) (if (eq x a) p (quote BLEEP))))

Fortunately, we shall use only a very small subset of pure functional LISP, so there should be little difficulty in translating and running these processes in a variety of dialects on a variety of machines.

If there are several versions of LISP available, choose one with proper static binding of variables. A LISP with lazy evaluation is also more convenient, since it permits direct encoding of recursive equations, without using LABEL, thus

VMS = prefix("COIN, prefix("CHOC, VMS))

If input and output are implemented by lazy evaluation, the interact function may be called with the keyboard as its third parameter, and the menu for the process P will appear as the first output. By selecting and inputting a symbol from the successive menus, a user can interactively explore the behaviour of the process P.

In other versions of LISP, the interact function should be rewritten, using explicit input and output to achieve the same effect. When this has been done, it is possible to observe the computer executing any process that has been represented as a LISP function. In this sense, such a LISP function may be regarded as an implementation of the corresponding process. Furthermore, a LISP function such as prefix which operates on these representations may be regarded as the implementation of the corresponding operator on processes.

1.5 Traces

A trace of the behaviour of a process is a finite sequence of symbols recording the events in which the process has engaged up to some moment in time. Imagine there is an observer with a notebook who watches the process and writes down the name of each event as it occurs. We can validly ignore the possibility that two events occur simultaneously; for if they did, the observer would still have to record one of them first and then the other, and the order in which he records them would not matter.

A trace will be denoted as a sequence of symbols, separated by commas and enclosed in angular brackets

⟨x, y⟩ consists of two events, x followed by y.
⟨x⟩ is a sequence containing only the event x.
⟨⟩ is the empty sequence containing no events.
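The function representation of Section 1.4 carries over to any language with first-class functions. The following is a minimal sketch in Python rather than LISP. The names stop, prefix, choice2, vms and menu mirror the text; the constant BLEEP stands for the atom "BLEEP, and ordinary recursion by name is assumed in place of LABEL. It is an illustration under these assumptions, not a translation of any particular LISP system.

```python
BLEEP = "BLEEP"  # special result: the event offered is not possible


def stop(x):
    """STOP never engages in any event."""
    return BLEEP


def prefix(c, p):
    """(c -> P): accept c, then behave like p."""
    return lambda x: p if x == c else BLEEP


def choice2(c, p, d, q):
    """(c -> P | d -> Q): accept c or d and continue accordingly."""
    return lambda x: p if x == c else q if x == d else BLEEP


def vms(x):
    """VMS = (coin -> choc -> VMS), with recursion by name instead of LABEL."""
    return prefix("choc", vms) if x == "coin" else BLEEP


def menu(alphabet, p):
    """Symbols of the alphabet that can occur as the first event of p."""
    return [e for e in alphabet if p(e) != BLEEP]


ALPHABET = ["coin", "choc"]
print(menu(ALPHABET, vms))          # first menu offered by VMS
print(menu(ALPHABET, vms("coin")))  # menu after a coin has been inserted
```

Applying a process to an event yields either BLEEP or the next process, so exploring a behaviour is just repeated function application.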

Examples

X1 A trace of the simple vending machine VMS (1.1.2 X2) at the moment it has completed service of its first two customers

⟨coin, choc, coin, choc⟩

X2 A trace of the same machine before the second customer has extracted his choc

⟨coin, choc, coin⟩

Neither the process nor its observer understands the concept of a completed transaction. The hunger of the expectant customer, and the readiness of the machine to satisfy it are not in the alphabet of these processes, and cannot be observed or recorded.

X3 Before a process has engaged in any events, the notebook of the observer is empty. This is represented by the empty trace

⟨⟩

Every process has this as its shortest possible trace.

X4 The complex vending machine VMC (1.1.3 X4) has the following seven traces of length two or less

⟨⟩
⟨in2p⟩   ⟨in1p⟩
⟨in2p, large⟩   ⟨in2p, small⟩   ⟨in1p, in1p⟩   ⟨in1p, small⟩

Only one of the four traces of length two can actually occur for a given machine. The choice between them will be determined by the wishes of the first customer to use the machine.

X5 A trace of VMC if its first customer has ignored the warning is

⟨in1p, in1p, in1p⟩

The trace does not actually record the breakage of the machine. Breakage is only indicated by the fact that among all the possible traces of the machine there is no trace which extends this one, i.e., there is no event x such that

⟨in1p, in1p, in1p, x⟩

is a possible trace of VMC. The customer may fret and fume; the observer may watch eagerly with pencil poised; but not another single event can occur, and not another symbol will ever be written in the notebook. The ultimate disposal of customer and machine are not in our chosen alphabet.

1.6 Operations on traces

Traces play a central role in recording, describing, and understanding the behaviour of processes. In this section we explore some of the general properties of traces and of operations on them. We will use the following conventions

s, t, u stand for traces
S, T, U stand for sets of traces
f, g, h stand for functions

1.6.1 Catenation

By far the most important operation on traces is catenation, which constructs a trace from a pair of operands s and t by simply putting them together in this order; the result will be denoted

s ⌢ t

For example

⟨coin, choc⟩ ⌢ ⟨coin, toffee⟩ = ⟨coin, choc, coin, toffee⟩
⟨in1p⟩ ⌢ ⟨in1p⟩ = ⟨in1p, in1p⟩
⟨in1p, in1p⟩ ⌢ ⟨⟩ = ⟨in1p, in1p⟩

The most important properties of catenation are that it is associative, and has ⟨⟩ as its unit

L1 s ⌢ ⟨⟩ = ⟨⟩ ⌢ s = s
L2 s ⌢ (t ⌢ u) = (s ⌢ t) ⌢ u

The following laws are both obvious and useful

L3 s ⌢ t = s ⌢ u ≡ t = u
L4 s ⌢ t = u ⌢ t ≡ s = u
L5 s ⌢ t = ⟨⟩ ≡ s = ⟨⟩ ∧ t = ⟨⟩

Let f stand for a function which maps traces to traces. The function is said to be strict if it maps the empty trace to the empty trace

f(⟨⟩) = ⟨⟩

It is said to be distributive if it distributes through catenation

f(s ⌢ t) = f(s) ⌢ f(t)

All distributive functions are strict.

If n is a natural number, we define tⁿ as n copies of t catenated with each other. It is readily defined by induction on n

L6 t⁰ = ⟨⟩
L7 tⁿ⁺¹ = t ⌢ tⁿ

This definition itself gives two useful laws; here are two more which can be proved from them

L8 tⁿ⁺¹ = tⁿ ⌢ t
L9 (s ⌢ t)ⁿ⁺¹ = s ⌢ (t ⌢ s)ⁿ ⌢ t

1.6.2 Restriction

The expression (t ↾ A) denotes the trace t when restricted to symbols in the set A; it is formed from t simply by omitting all symbols outside A. For example

⟨around, up, down, around⟩ ↾ {up, down} = ⟨up, down⟩

Restriction is distributive and therefore strict

L1 ⟨⟩ ↾ A = ⟨⟩
L2 (s ⌢ t) ↾ A = (s ↾ A) ⌢ (t ↾ A)

Its effect on singleton sequences is obvious

L3 ⟨x⟩ ↾ A = ⟨x⟩   if x ∈ A
L4 ⟨y⟩ ↾ A = ⟨⟩    if y ∉ A

A distributive function is uniquely defined by defining its effect on singleton sequences, since its effect on all longer sequences can be calculated by distributing the function to each individual element of the sequence and catenating the results. For example, if y ≠ x

⟨x, y, x⟩ ↾ {x}
= (⟨x⟩ ⌢ ⟨y⟩ ⌢ ⟨x⟩) ↾ {x}
= (⟨x⟩ ↾ {x}) ⌢ (⟨y⟩ ↾ {x}) ⌢ (⟨x⟩ ↾ {x})     [by L2]
= ⟨x⟩ ⌢ ⟨⟩ ⌢ ⟨x⟩                              [by L3 and L4]
= ⟨x, x⟩

The following laws show the relationship between restriction and set operations. A trace restricted to the empty set of symbols leaves nothing; and a successive restriction by two sets is the same as a single restriction by the intersection of the two sets. These laws can be proved by induction on the length of s

L5 s ↾ {} = ⟨⟩
L6 (s ↾ A) ↾ B = s ↾ (A ∩ B)

1.6.3 Head and tail

If s is a nonempty sequence, its first symbol is denoted s₀, and the result of removing the first symbol is s′. For example

⟨x, y, x⟩₀ = x
⟨x, y, x⟩′ = ⟨y, x⟩

Both of these operations are undefined for the empty sequence.

L1 (⟨x⟩ ⌢ s)₀ = x
L2 (⟨x⟩ ⌢ s)′ = s
L3 s = (⟨s₀⟩ ⌢ s′)   if s ≠ ⟨⟩

The following law gives a convenient method of proving whether two traces are equal

L4 s = t ≡ (s = t = ⟨⟩ ∨ (s₀ = t₀ ∧ s′ = t′))

1.6.4 Star

The set A∗ is the set of all finite traces (including ⟨⟩) which are formed from symbols in the set A. When such traces are restricted to A, they remain unchanged. This fact permits a simple definition

A∗ = { s | s ↾ A = s }

The following laws are consequences of this definition

L1 ⟨⟩ ∈ A∗
L2 ⟨x⟩ ∈ A∗ ≡ x ∈ A
L3 (s ⌢ t) ∈ A∗ ≡ s ∈ A∗ ∧ t ∈ A∗

They are sufficiently powerful to determine whether a trace is a member of A∗ or not. For example, if x ∈ A and y ∉ A

⟨x, y⟩ ∈ A∗

≡ (⟨x⟩ ⌢ ⟨y⟩) ∈ A∗
≡ (⟨x⟩ ∈ A∗) ∧ (⟨y⟩ ∈ A∗)     [by L3]
≡ true ∧ false                [by L2]
≡ false

Another useful law could serve as a recursive definition of A∗

L4 A∗ = { t | t = ⟨⟩ ∨ (t₀ ∈ A ∧ t′ ∈ A∗) }

1.6.5 Ordering

If s is a copy of an initial subsequence of t, it is possible to find some extension u of s such that s ⌢ u = t. We therefore define an ordering relation

s ≤ t = (∃ u • s ⌢ u = t)

and say that s is a prefix of t. For example,

⟨x, y⟩ ≤ ⟨x, y, x, w⟩
⟨x, y⟩ ≤ ⟨z, y, x⟩ ≡ x = z

The ≤ relation is a partial ordering, and its least element is ⟨⟩, as stated in laws L1 to L4

L1 ⟨⟩ ≤ s                          least element
L2 s ≤ s                           reflexive
L3 s ≤ t ∧ t ≤ s ⇒ s = t           antisymmetric
L4 s ≤ t ∧ t ≤ u ⇒ s ≤ u           transitive

The following law, together with L1, gives a method for computing whether s ≤ t or not

L5 (⟨x⟩ ⌢ s) ≤ t ≡ t ≠ ⟨⟩ ∧ x = t₀ ∧ s ≤ t′

The prefixes of a given subsequence are totally ordered

L6 s ≤ u ∧ t ≤ u ⇒ s ≤ t ∨ t ≤ s

If s is a subsequence of t (not necessarily initial), we say s is in t; this may be defined

L7 s in t = (∃ u, v • t = u ⌢ s ⌢ v)

This relation is also a partial ordering, in that it satisfies laws L1 to L4 above. It also satisfies

L8 (⟨x⟩ ⌢ s) in t ≡ t ≠ ⟨⟩ ∧ ((t₀ = x ∧ s ≤ t′) ∨ (⟨x⟩ ⌢ s in t′))
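The two relations of this section are easy to experiment with. The sketch below assumes traces are represented as Python tuples; the function names is_prefix and is_in are illustrative choices and do not come from the text. It implements the definitions directly (existence of an extension u, or of a surrounding u and v) rather than the recursive laws L5 and L8.

```python
def is_prefix(s, t):
    """s <= t: some u exists with s ^ u == t, i.e. t starts with s."""
    return t[:len(s)] == s


def is_in(s, t):
    """L7: s in t iff t == u ^ s ^ v for some u and v (contiguous occurrence)."""
    return any(t[i:i + len(s)] == s for i in range(len(t) - len(s) + 1))


print(is_prefix(("x", "y"), ("x", "y", "x", "w")))
print(is_in(("y", "x"), ("z", "y", "x")))
```

Because `in` requires the symbols of s to occur consecutively in t, a trace such as ⟨x, y⟩ is not in ⟨x, z, y⟩ under this definition.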


A function f from traces to traces is said to be monotonic if it respects the ordering ≤, i.e., f (s) ≤ f (t ) whenever s ≤ t

All distributive functions are monotonic, for example

L9 s ≤ t ⇒ (s ↾ A) ≤ (t ↾ A)

A dyadic function may be monotonic in either argument, keeping the other argument constant. For example, catenation is monotonic in its second argument (but not its first)

L10 t ≤ u ⇒ (s ⌢ t) ≤ (s ⌢ u)

A function which is monotonic in all its arguments is said simply to be monotonic.

1.6.6 Length

The length of the trace t is denoted #t. For example

#⟨x, y, x⟩ = 3

The laws which define # are

L1 #⟨⟩ = 0
L2 #⟨x⟩ = 1
L3 #(s ⌢ t) = (#s) + (#t)

The number of occurrences in t of symbols from A is counted by #(t ↾ A).

L4 #(t ↾ (A ∪ B)) = #(t ↾ A) + #(t ↾ B) − #(t ↾ (A ∩ B))
L5 s ≤ t ⇒ #s ≤ #t
L6 #(tⁿ) = n × (#t)

The number of occurrences of a symbol x in a trace s is defined

s ↓ x = #(s ↾ {x})
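The counting laws can be checked on a concrete trace. In this sketch, traces are assumed to be Python tuples and sets of symbols are Python sets; restrict and count are illustrative names for (t ↾ A) and (s ↓ x), and the sample trace is made up for the purpose.

```python
def restrict(t, a):
    """(t restricted to A): omit every symbol of t that lies outside the set a."""
    return tuple(x for x in t if x in a)


def count(s, x):
    """s down x: the number of occurrences of the symbol x in s."""
    return len(restrict(s, {x}))


t = ("coin", "choc", "coin", "toffee", "coin")
A = {"coin", "choc"}
B = {"choc", "toffee"}

# L4: inclusion-exclusion for symbol counts.
lhs = len(restrict(t, A | B))
rhs = len(restrict(t, A)) + len(restrict(t, B)) - len(restrict(t, A & B))
print(lhs, rhs)

# L6: the length of n copies of t is n times the length of t.
print(len(t * 3) == 3 * len(t))
```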


1.7 Implementation of traces

In order to represent traces in a computer and to implement operations on them, we need a high-level list-processing language. Fortunately, LISP is very suitable for this purpose. Traces are represented in the obvious way by lists of atoms representing their events

⟨⟩ = NIL
⟨coin⟩ = cons("COIN, NIL)
⟨coin, choc⟩ = "(COIN CHOC)

which means cons("COIN, cons("CHOC, NIL)).

Operations on traces can be readily implemented as functions on lists. For example, the head and tail of a nonempty list are given by the primitive functions car and cdr

t₀ = car(t)
t′ = cdr(t)
⟨x⟩ ⌢ s = cons(x, s)

General catenation is implemented as the familiar append function, which is defined by recursion

s ⌢ t = append(s, t)

where

append(s, t) = if s = NIL then t
               else cons(car(s), append(cdr(s), t))

The correctness of this definition follows from the laws

⟨⟩ ⌢ t = t
s ⌢ t = ⟨s₀⟩ ⌢ (s′ ⌢ t)   whenever s ≠ ⟨⟩

The termination of the LISP append function is guaranteed by the fact that the list supplied as the first argument of each recursive call is shorter than it was at the previous level of recursion. Similar arguments establish the correctness of the implementations of the other operations defined below.


To implement restriction, we represent a finite set B as a list of its members. The test (x ∈ B) is accomplished by a call on the function

ismember(x, B) = if B = NIL then false
                 else if x = car(B) then true
                 else ismember(x, cdr(B))

(s ↾ B) can now be implemented by the function

restrict(s, B) = if s = NIL then NIL
                 else if ismember(car(s), B) then cons(car(s), restrict(cdr(s), B))
                 else restrict(cdr(s), B)

A test of (s ≤ t) is implemented as a function which delivers the answer true or false; it relies on 1.6.5 L1 and L5

isprefix(s, t) = if s = NIL then true
                 else if t = NIL then false
                 else car(s) = car(t) and isprefix(cdr(s), cdr(t))
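The recursive definitions of this section can be transcribed almost line for line. The following sketch assumes Python tuples in place of LISP lists, with the empty tuple playing the role of NIL, s[0] the role of car and s[1:] the role of cdr; the function names are those of the text.

```python
def append(s, t):
    """s ^ t, by recursion on the first argument."""
    if s == ():
        return t
    return (s[0],) + append(s[1:], t)


def ismember(x, b):
    """Test x in B, where the finite set B is given as a tuple of its members."""
    if b == ():
        return False
    return x == b[0] or ismember(x, b[1:])


def restrict(s, b):
    """s restricted to B: keep only the symbols of s that are members of b."""
    if s == ():
        return ()
    if ismember(s[0], b):
        return (s[0],) + restrict(s[1:], b)
    return restrict(s[1:], b)


def isprefix(s, t):
    """s <= t, following 1.6.5 L1 and L5."""
    if s == ():
        return True
    if t == ():
        return False
    return s[0] == t[0] and isprefix(s[1:], t[1:])


print(append(("coin",), ("choc",)))
print(restrict(("around", "up", "down", "around"), ("up", "down")))
```

Each recursive call receives a strictly shorter first argument, which is the same termination argument the text gives for append.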

1.8 Traces of a process

In Section 1.5 a trace of a process was introduced as a sequential record of the behaviour of a process up to some moment in time. Before the process starts, it is not known which of its possible traces will actually be recorded: the choice will depend on environmental factors beyond the control of the process. However, the complete set of all possible traces of a process P can be known in advance, and we define a function traces(P) to yield that set.


Examples

X1 The only trace of the behaviour of the process STOP is ⟨⟩. The notebook of the observer of this process remains forever blank

traces(STOP) = {⟨⟩}

X2 There are only two traces of the machine that ingests a coin before breaking

traces(coin → STOP) = {⟨⟩, ⟨coin⟩}

X3 A clock that does nothing but tick

traces(µ X • tick → X) = {⟨⟩, ⟨tick⟩, ⟨tick, tick⟩, . . .} = {tick}∗

As with most interesting processes, the set of traces is infinite, although of course each individual trace is finite.

X4 A simple vending machine

traces(µ X • coin → choc → X) = { s | ∃ n • s ≤ ⟨coin, choc⟩ⁿ }

1.8.1 Laws

In this section we show how to calculate the set of traces of any process defined using the notations introduced so far. As mentioned above, STOP has only one trace

L1 traces(STOP) = { t | t = ⟨⟩ } = {⟨⟩}

A trace of (c → P) may be empty, because ⟨⟩ is a trace of the behaviour of every process up to the moment that it engages in its very first action. Every nonempty trace begins with c, and its tail must be a possible trace of P

L2 traces(c → P) = { t | t = ⟨⟩ ∨ (t₀ = c ∧ t′ ∈ traces(P)) }
                 = {⟨⟩} ∪ { ⟨c⟩ ⌢ t | t ∈ traces(P) }

A trace of the behaviour of a process which offers a choice between initial events must be a trace of one of the alternatives

L3 traces(c → P | d → Q) = { t | t = ⟨⟩ ∨ (t₀ = c ∧ t′ ∈ traces(P)) ∨ (t₀ = d ∧ t′ ∈ traces(Q)) }


These three laws are summarised in the single general law governing choice

L4 traces(x : B → P(x)) = { t | t = ⟨⟩ ∨ (t₀ ∈ B ∧ t′ ∈ traces(P(t₀))) }
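Law L4 suggests a direct way to enumerate traces up to a chosen length. The sketch below assumes the function representation of processes from Section 1.4, written in Python with BLEEP standing for the atom "BLEEP; the name traces_upto and the length bound n are introduced here for illustration, since the full trace set of a recursive process is infinite.

```python
BLEEP = "BLEEP"


def stop(x):
    return BLEEP


def prefix(c, p):
    return lambda x: p if x == c else BLEEP


def vms(x):
    """VMS = (coin -> choc -> VMS)."""
    return prefix("choc", vms) if x == "coin" else BLEEP


def traces_upto(p, alphabet, n):
    """All traces of p of length <= n, following L4: the empty trace is
    always a trace, and <x> ^ t is a trace when x is initially possible
    and t is a trace of the process that follows x."""
    result = {()}
    if n > 0:
        for x in alphabet:
            q = p(x)
            if q != BLEEP:
                result |= {(x,) + t for t in traces_upto(q, alphabet, n - 1)}
    return result


ALPHABET = ("coin", "choc")
print(sorted(traces_upto(prefix("coin", stop), ALPHABET, 5)))
print(sorted(traces_upto(vms, ALPHABET, 3), key=len))
```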

To discover the set of traces of a recursively defined process is a bit more difficult. A recursively defined process is the solution of an equation

X = F(X)

First, we define iteration of the function F by induction

F⁰(X) = X
Fⁿ⁺¹(X) = F(Fⁿ(X)) = Fⁿ(F(X)) = F(. . . (F(F(X))) . . .)

where in the last form the outer F is applied n times to F(X).

Then, provided that F is guarded, we can define

L5 traces(µ X : A • F(X)) = ⋃ n≥0 traces(Fⁿ(STOP_A))

Examples

X1 Recall that RUN_A was defined in 1.1.3 X8 as

µ X : A • F(X)   where F(X) = (x : A → X)

We wish to prove that

traces(RUN_A) = A∗

Proof: A∗ = ⋃ n≥0 { s | s ∈ A∗ ∧ #s ≤ n }

So by L5 it is sufficient to prove for all n that

traces(Fⁿ(STOP_A)) = { s | s ∈ A∗ ∧ #s ≤ n }

This is done by induction on n.

1. traces(STOP_A) = {⟨⟩} = { s | s ∈ A∗ ∧ #s ≤ 0 }

2. traces(Fⁿ⁺¹(STOP_A))
   = traces(x : A → Fⁿ(STOP_A))                                   [def. F, Fⁿ⁺¹]
   = { t | t = ⟨⟩ ∨ (t₀ ∈ A ∧ t′ ∈ traces(Fⁿ(STOP_A))) }          [L4]
   = { t | t = ⟨⟩ ∨ (t₀ ∈ A ∧ (t′ ∈ A∗ ∧ #t′ ≤ n)) }              [ind. hyp.]
   = { t | (t = ⟨⟩ ∨ (t₀ ∈ A ∧ t′ ∈ A∗)) ∧ #t ≤ n + 1 }           [property of #]
   = { t | t ∈ A∗ ∧ #t ≤ n + 1 }                                  [1.6.4 L4]
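The claim proved by induction above can also be checked mechanically for small n. The sketch assumes a two-symbol alphabet A = {a, b} and the Python function representation of processes; iterate, all_traces and bounded_star are illustrative names. Exhaustive enumeration is possible here only because Fⁿ(STOP_A) stops after n events.

```python
from itertools import product

BLEEP = "BLEEP"
A = ("a", "b")  # an assumed small alphabet for the experiment


def stop(x):
    return BLEEP


def F(X):
    """F(X) = (x : A -> X)."""
    return lambda x: X if x in A else BLEEP


def iterate(n):
    """F^n(STOP_A): apply F to STOP n times."""
    p = stop
    for _ in range(n):
        p = F(p)
    return p


def all_traces(p):
    """Every trace of a process that is known to stop eventually."""
    result = {()}
    for x in A:
        q = p(x)
        if q != BLEEP:
            result |= {(x,) + t for t in all_traces(q)}
    return result


def bounded_star(n):
    """{ s | s in A* and #s <= n }."""
    return {s for k in range(n + 1) for s in product(A, repeat=k)}


print(all(all_traces(iterate(n)) == bounded_star(n) for n in range(5)))
```

A check of this kind covers only the values of n that are tried; the induction in the text is what establishes the result for all n.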

X2 We want to prove 1.8 X4, i.e.,

traces(VMS) = ⋃ n≥0 { s | s ≤ ⟨coin, choc⟩ⁿ }

Proof: The inductive hypothesis is

traces(Fⁿ(STOP)) = { t | t ≤ ⟨coin, choc⟩ⁿ }

where F(X) = coin → choc → X

1. traces(STOP) = {⟨⟩} = { s | s ≤ ⟨coin, choc⟩⁰ }                           [1.6.1 L6]

2. traces(coin → choc → Fⁿ(STOP))
   = {⟨⟩, ⟨coin⟩} ∪ { ⟨coin, choc⟩ ⌢ t | t ∈ traces(Fⁿ(STOP)) }              [L2 twice]
   = {⟨⟩, ⟨coin⟩} ∪ { ⟨coin, choc⟩ ⌢ t | t ≤ ⟨coin, choc⟩ⁿ }                 [ind. hyp.]
   = { s | s = ⟨⟩ ∨ s = ⟨coin⟩ ∨ ∃ t • s = ⟨coin, choc⟩ ⌢ t ∧ t ≤ ⟨coin, choc⟩ⁿ }
   = { s | s ≤ ⟨coin, choc⟩ⁿ⁺¹ }

The conclusion follows by L5.

As mentioned in Section 1.5, a trace is a sequence of symbols recording the events in which a process P has engaged up to some moment in time. From this it follows that ⟨⟩ is a trace of every process up to the moment in which it engages in its very first event. Furthermore, if (s ⌢ t) is a trace of a process up to some moment, then s must have been a trace of that process up to some earlier moment. Finally, every event that occurs must be in the alphabet of the process. These three facts are formalised in the laws

L6 ⟨⟩ ∈ traces(P)
L7 s ⌢ t ∈ traces(P) ⇒ s ∈ traces(P)
L8 traces(P) ⊆ (αP)∗

There is a close relationship between the traces of a process and the picture of its behaviour drawn as a tree. For any node on the tree, the trace of the behaviour of a process up to the time when it reaches that node is just the sequence of labels encountered on the path leading from the root of the tree to that node. For example, in the tree for VMC shown in Figure 1.1, the trace corresponding to the path from the root to the black node is

⟨in2p, small, out1p⟩

[Figure 1.1: the behaviour of VMC drawn as a tree, with branches labelled in2p, in1p, large, small and out1p]

Clearly, all initial subpaths of a path in a tree are also paths in the same tree; this is stated more formally in L7 above. The empty trace defines the path from the root to itself, which justifies the law L6. The traces of a process are just the set of paths leading from the root to some node in the tree.

Conversely, because the branches leading from each node are all labelled with different events, each trace of a process uniquely specifies a path leading from the root of a tree to a particular node. Thus any set of traces satisfying L6 and L7 constitutes a convenient mathematical representation for a tree with no duplicate labels on branches emerging from a single node.

1.8.2 Implementation

Suppose a process has been implemented as a LISP function P, and let s be a trace. Then it is possible to test whether s is a possible trace of P by the function

istrace(s, P) = if s = NIL then true
                else if P(car(s)) = "BLEEP then false
                else istrace(cdr(s), P(car(s)))

Since s is finite, the recursion involved here will terminate, having explored only a finite initial segment of the behaviour of the process P. It is because we avoid infinite exploration that we can safely define a process as an infinite object, i.e., a function whose result is a function whose result is a function whose result…
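The istrace function translates directly. This sketch assumes Python tuples for traces and the function representation of processes used in Section 1.4, with BLEEP standing for the atom "BLEEP; the vms definition is repeated so that the example is self-contained.

```python
BLEEP = "BLEEP"


def prefix(c, p):
    return lambda x: p if x == c else BLEEP


def vms(x):
    """VMS = (coin -> choc -> VMS)."""
    return prefix("choc", vms) if x == "coin" else BLEEP


def istrace(s, p):
    """Test whether the finite trace s is a possible trace of the process p."""
    if s == ():
        return True
    q = p(s[0])
    if q == BLEEP:
        return False
    return istrace(s[1:], q)


print(istrace(("coin", "choc", "coin"), vms))
print(istrace(("coin", "coin"), vms))
```

Only as many steps of the process are explored as there are symbols in s, which is why the unbounded behaviour of vms causes no difficulty.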

1.8.3 After

If s ∈ traces(P) then

P / s   (P after s)

is a process which behaves the same as P behaves from the time after P has engaged in all the actions recorded in the trace s. If s is not a trace of P, (P / s) is not defined.

Examples

X1 (VMS / ⟨coin⟩) = (choc → VMS)

X2 (VMS / ⟨coin, choc⟩) = VMS

X3 (VMC / ⟨in1p⟩³) = STOP

X4 To avoid loss arising from installation of VMCRED (1.1.3 X5, X6), the owner decides to eat the first chocolate himself

(VMCRED / ⟨choc⟩) = VMS2

In a tree picture of P (Figure 1.1), (P / s) denotes the whole subtree whose root lies at the end of the path labelled by the symbols of s. Thus the subtree below the black node in Figure 1.1 is denoted by

VMC / ⟨in2p, small, out1p⟩

The following laws describe the meaning of the operator /. After doing nothing, a process remains unchanged

L1 P / ⟨⟩ = P

After engaging in s ⌢ t, the behaviour of P is the same as that of (P / s) after engaging in t

L2 P / (s ⌢ t) = (P / s) / t

After engaging in a single event c, the behaviour of a process is as defined by this initial choice

L3 (x : B → P(x)) / ⟨c⟩ = P(c)   provided that c ∈ B

A corollary shows that / ⟨c⟩ is the inverse of the prefixing operator c →

L3A (c → P) / ⟨c⟩ = P

The traces of (P / s) are defined

L4 traces(P / s) = { t | s ⌢ t ∈ traces(P) }   provided that s ∈ traces(P)

In order to prove that a process P never stops it is sufficient to prove that

∀ s : traces(P) • P / s ≠ STOP

Another desirable property of a process is cyclicity; a process P is defined as cyclic if in all circumstances it is possible for it to return to its initial state, i.e.,

∀ s : traces(P) • ∃ t • (P / (s ⌢ t) = P)

STOP is trivially cyclic; but if any other process is cyclic, then it also has the desirable property of never stopping.

Examples

X5 The following processes are cyclic (1.1.3 X8, 1.1.2 X2, 1.1.3 X3, 1.1.4 X2)

RUN_A, VMS, (choc → VMS), VMCT, CT7

X6 The following are not cyclic, because it is not possible to return them to their initial state (1.1.2 X2, 1.1.3 X3, 1.1.4 X2)

(coin → VMS), (choc → VMCT), (around → CT7)

For example, in the initial state of choc → VMCT only a chocolate is obtainable, but subsequently whenever choc is obtainable a choice of toffee is also possible; consequently none of these subsequent states is equal to the initial state.

Warning: The use of / in a recursively defined process has the unfortunate consequence of invalidating its guards, thereby introducing the danger of multiple solutions to the recursive equations. For example

X = (a → (X / ⟨a⟩))

is not guarded, and has as its solution any process of the form

a → P

for any P.

Proof: (a → ((a → P) / ⟨a⟩)) = (a → P) by L3A.

For this reason, we will never use the / operator in recursive process definitions.
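On the function representation of Section 1.4, the after operator amounts to applying the process to the events of s one at a time (law L2 applied repeatedly, with L3 for each single event). The sketch below assumes that representation in Python; the name after and the choice to raise ValueError when s is not a trace of P are assumptions made here, since the text says only that (P / s) is then not defined.

```python
BLEEP = "BLEEP"


def prefix(c, p):
    return lambda x: p if x == c else BLEEP


def vms(x):
    """VMS = (coin -> choc -> VMS)."""
    return prefix("choc", vms) if x == "coin" else BLEEP


def menu(alphabet, p):
    return [e for e in alphabet if p(e) != BLEEP]


def after(p, s):
    """P / s: the process p after it has engaged in the events of s."""
    for x in s:
        p = p(x)
        if p == BLEEP:
            raise ValueError("s is not a trace of P, so P / s is not defined")
    return p


ALPHABET = ("coin", "choc")
print(menu(ALPHABET, after(vms, ("coin",))))  # X1: behaves like (choc -> VMS)
print(after(vms, ("coin", "choc")) is vms)    # X2: back to VMS itself
```

In this particular encoding the process reached after ⟨coin, choc⟩ is the very same function object as vms, which mirrors example X2; in general, equality of processes is not decidable by comparing functions.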

1.9 More operations on traces

This section describes some further operations on traces; it may be skipped at this stage, since backwards references will be given in later chapters where the operations are used.

1.9.1 Change of symbol

Let f be a function mapping symbols from a set A to symbols in a set B. From f we can derive a new function f∗ which maps a sequence of symbols in A∗ to a sequence in B∗ by applying f to each element of the sequence. For example, if double is a function which doubles its integer argument

double∗(⟨1, 5, 3, 1⟩) = ⟨2, 10, 6, 2⟩

A starred function is obviously distributive and therefore strict

L1 f∗(⟨⟩) = ⟨⟩
L2 f∗(⟨x⟩) = ⟨f(x)⟩
L3 f∗(s ⌢ t) = f∗(s) ⌢ f∗(t)

Other laws are obvious consequences

L4 f∗(s)₀ = f(s₀)   if s ≠ ⟨⟩
L5 #f∗(s) = #s

But here is an “obvious” law which is unfortunately not generally true

f∗(s ↾ A) = f∗(s) ↾ f(A)

where f(A) = { f(x) | x ∈ A }.

The simplest counterexample is given by the function f such that

f(b) = f(c) = c   where b ≠ c

Therefore

f∗(⟨b⟩ ↾ {c})
= f∗(⟨⟩)                    [since b ≠ c]
= ⟨⟩                        [L1]
≠ ⟨c⟩
= ⟨c⟩ ↾ {c}
= f∗(⟨b⟩) ↾ f({c})          [since f(c) = c]

However, the law is true if f is a one-one function (injection)

L6 f∗(s ↾ A) = f∗(s) ↾ f(A)   provided that f is an injection.
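The counterexample can be replayed concretely. In this sketch traces are assumed to be Python tuples and symbol sets Python sets; star, restrict and image are illustrative names for f∗, ↾ and f(A).

```python
def star(f):
    """f*: apply f to every element of a trace."""
    return lambda s: tuple(f(x) for x in s)


def restrict(t, a):
    """t restricted to the set a."""
    return tuple(x for x in t if x in a)


def image(f, a):
    """f(A) = { f(x) | x in A }."""
    return {f(x) for x in a}


def double(n):
    return 2 * n


print(star(double)((1, 5, 3, 1)))


def f(x):
    """Not an injection: f(b) = f(c) = c."""
    return "c"


lhs = star(f)(restrict(("b",), {"c"}))            # f*(<b> restricted to {c})
rhs = restrict(star(f)(("b",)), image(f, {"c"}))  # f*(<b>) restricted to f({c})
print(lhs, rhs)
```

The two sides differ because f maps the excluded symbol b onto the retained symbol c; with an injective f this collision cannot happen, which is the content of L6.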

1.9.2 Catenation

Let s be a sequence, each of whose elements is itself a sequence. Then ⌢/ s is obtained by catenating all the elements together in the original order. For example

⌢/ ⟨⟨1, 3⟩, ⟨⟩, ⟨7⟩⟩ = ⟨1, 3⟩ ⌢ ⟨⟩ ⌢ ⟨7⟩ = ⟨1, 3, 7⟩

This operator is distributive

L1 ⌢/ ⟨⟩ = ⟨⟩
L2 ⌢/ ⟨s⟩ = s
L3 ⌢/ (s ⌢ t) = (⌢/ s) ⌢ (⌢/ t)

1.9.3 Interleaving

A sequence s is an interleaving of two sequences t and u if it can be split into a series of subsequences, with alternate subsequences extracted from t and u. For example

s = ⟨1, 6, 3, 1, 5, 4, 2, 7⟩

is an interleaving of t and u, where

t = ⟨1, 6, 5, 2, 7⟩ and u = ⟨3, 1, 4⟩

A recursive definition of interleaving can be given by means of the following laws

L1 ⟨⟩ interleaves (t, u) ≡ (t = ⟨⟩ ∧ u = ⟨⟩)
L2 s interleaves (t, u) ≡ s interleaves (u, t)
L3 (⟨x⟩ ⌢ s) interleaves (t, u) ≡ (t ≠ ⟨⟩ ∧ t₀ = x ∧ s interleaves (t′, u))
                                  ∨ (u ≠ ⟨⟩ ∧ u₀ = x ∧ s interleaves (t, u′))

1.9.4 Subscription

If 0 ≤ i < #s, we use the conventional notation s[i] to denote the i th element of the sequence s as described by L1

L1 s[0] = s₀ ∧ s[i + 1] = s′[i]   provided s ≠ ⟨⟩
L2 (f∗(s))[i] = f(s[i])   for i < #s
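Laws L1 and L3 of interleaving already form a terminating recursive test. The sketch below assumes traces as Python tuples; the function name interleaves follows the text, and the sample sequences are chosen for illustration. The recursion may explore both alternatives of L3, so its worst-case cost grows exponentially; it is meant as a specification-level check, not an efficient algorithm.

```python
def interleaves(s, t, u):
    """Test whether s is an interleaving of t and u (laws L1 and L3)."""
    if s == ():
        return t == () and u == ()  # L1
    x, rest = s[0], s[1:]
    return (
        (t != () and t[0] == x and interleaves(rest, t[1:], u))
        or (u != () and u[0] == x and interleaves(rest, t, u[1:]))
    )  # L3


s = (1, 6, 3, 1, 5, 4, 2, 7)
t = (1, 6, 5, 2, 7)
u = (3, 1, 4)
print(interleaves(s, t, u))
print(interleaves(s, u, t))  # L2: the relation is symmetric in t and u
```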

1.9.5 Reversal

If s is a sequence, s̄ is formed by taking its elements in reverse order. For example

⟨3, 5, 37⟩‾ = ⟨37, 5, 3⟩

Reversal is defined fully by the following laws

L1 ⟨⟩‾ = ⟨⟩
L2 ⟨x⟩‾ = ⟨x⟩
L3 (s ⌢ t)‾ = t̄ ⌢ s̄

Reversal enjoys a number of simple algebraic properties, including

L4 (s̄)‾ = s

Exploration of other properties is left to the reader. One of the useful facts about reversal is that s̄₀ is the last element of the sequence, and in general

L5 s̄[i] = s[#s − i − 1]   for i < #s

1.9.6 Selection

If s is a sequence of pairs, we define s ↓ x as the result of selecting from s all those pairs whose first element is x and then replacing each pair by its second element. We write a pair with a dot between its two components. Thus if

s = ⟨a.7, b.9, a.8, c.0⟩

then s ↓ a = ⟨7, 8⟩ and s ↓ d = ⟨⟩

L1 ⟨⟩ ↓ x = ⟨⟩
L2 (⟨y.z⟩ ⌢ t) ↓ x = t ↓ x   if y ≠ x
L3 (⟨x.z⟩ ⌢ t) ↓ x = ⟨z⟩ ⌢ (t ↓ x)

If s is not a sequence of pairs, s ↓ a denotes the number of occurrences of a in s (as defined in Section 1.6.6).

1.9.7 Composition

Let ✓ be a symbol denoting successful termination of the process which engages in it. As a result, this symbol can appear only at the end of a trace. Let t be a trace recording a sequence of events which start when s has successfully terminated. The composition of s and t is denoted (s ; t). If ✓ does not occur in s, then t cannot start

L1 s ; t = s   if ¬ (⟨✓⟩ in s)

If ✓ does occur at the end of s, it is removed and t is appended to the result

L2 (s ⌢ ⟨✓⟩) ; t = s ⌢ t   if ¬ (⟨✓⟩ in s)

The symbol ✓ may be regarded as a sort of glue which sticks s and t together; in the absence of the glue, t cannot stick (L1). If ✓ occurs (incorrectly) in the middle of a trace, we stipulate for the sake of completeness that all symbols after the first occurrence are irrelevant and should be discarded

L2A (s ⌢ ⟨✓⟩ ⌢ u) ; t = s ⌢ t   if ¬ (⟨✓⟩ in s)

This unfamiliar operator enjoys a number of familiar algebraic properties. Like catenation it is associative. Unlike catenation, it is monotonic in its first as well as its second argument. Also, it is strict in its first argument, and has ⟨✓⟩ as its left unit

L3 s ; (t ; u) = (s ; t) ; u
L4A s ≤ t ⇒ ((u ; s) ≤ (u ; t))
L4B s ≤ t ⇒ ((s ; u) ≤ (t ; u))
L5 ⟨⟩ ; t = ⟨⟩
L6 ⟨✓⟩ ; t = t

If ✓ never occurs except at the end of a trace, ⟨✓⟩ is a right unit as well

L7 s ; ⟨✓⟩ = s   provided ¬ (⟨✓⟩ in (s̄)′)

1.10 Specifications

A specification of a product is a description of the way it is intended to behave. This description is a predicate containing free variables, each of which stands for some observable aspect of the behaviour of the product. For example, the specification of an electronic amplifier, with an input range of one volt and with an approximate gain of 10, is given by the predicate

AMP10 = (0 ≤ v ≤ 1 ⇒ | v′ − 10 × v | ≤ 1)

In this specification, it is understood that v stands for the input voltage and v′ stands for the output voltage. Such an understanding of the meaning of variables is essential to the use of mathematics in science and engineering.

In the case of a process, the most obviously relevant observation of its behaviour is the trace of events that occur up to a given moment in time. We will use the special variable tr to stand for an arbitrary trace of the process being specified, just as v and v′ are used for arbitrary observations of voltage in the previous example.

Examples

X1 The owner of a vending machine does not wish to make a loss by installing it. He therefore specifies that the number of chocolates dispensed must never exceed the number of coins inserted

NOLOSS = (#(tr ↾ {choc}) ≤ #(tr ↾ {coin}))

In future we will use the abbreviation (introduced in 1.6.6)

tr ↓ c = #(tr ↾ {c})

to stand for the number of occurrences of the symbol c in tr.

X2 The customer of a vending machine wants to ensure that it will not absorb further coins until it has dispensed the chocolate already paid for

FAIR1 = ((tr ↓ coin) ≤ (tr ↓ choc) + 1)

X3 The manufacturer of a simple vending machine must meet the requirements both of its owner and its customer

VMSPEC = NOLOSS ∧ FAIR1
       = (0 ≤ ((tr ↓ coin) − (tr ↓ choc)) ≤ 1)

X4 The specification of a correction to the complex vending machine forbids it to accept three pennies in a row

VMCFIX = (¬ ⟨in1p⟩³ in tr)

X5 The specification of a mended machine

MENDVMC = (tr ∈ traces(VMC) ∧ VMCFIX)

X6 The specification of VMS2 (1.1.3 X6)

0 ≤ ((tr ↓ coin) − (tr ↓ choc)) ≤ 2
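Specifications of this kind are simply predicates on a trace, so they can be written as boolean functions. The sketch below assumes traces as Python tuples; down is an illustrative name for the ↓ counting operator, and the lower-case function names stand for the predicates NOLOSS, FAIR1 and VMSPEC. The final loop checks, over all short traces on {coin, choc}, that the conjunction agrees with the single inequality given in X3.

```python
from itertools import product


def down(tr, c):
    """tr down c: the number of occurrences of the symbol c in tr."""
    return sum(1 for x in tr if x == c)


def noloss(tr):
    return down(tr, "choc") <= down(tr, "coin")


def fair1(tr):
    return down(tr, "coin") <= down(tr, "choc") + 1


def vmspec(tr):
    return noloss(tr) and fair1(tr)


def vmspec_as_inequality(tr):
    return 0 <= down(tr, "coin") - down(tr, "choc") <= 1


agree = all(
    vmspec(tr) == vmspec_as_inequality(tr)
    for k in range(6)
    for tr in product(("coin", "choc"), repeat=k)
)
print(agree)
```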

1.10.1 Satisfaction

If P is a product which meets a specification S, we say that P satisfies S, abbreviated to

P sat S

This means that every possible observation of the behaviour of P is described by S; or in other words, S is true whenever its variables take values observed from the product P, or more formally, ∀ tr • tr ∈ traces(P) ⇒ S. For example, the following table gives some observations of the properties of an amplifier

       1    2    3    4    5
v      0   .5   .5    2   .1
v′     0    5    4    1    3

All observations except the last are described by AMP10. The second and third columns illustrate the fact that the output of the amplifier is not completely determined by its input. The fourth column shows that if the input voltage is outside the specified range, the output voltage can be anything at all, without violating the specification. (In this simple example we have ignored the possibility that excessive input may break the product.)

The following laws give the most general properties of the satisfies relation. The specification true which places no constraints whatever on observations of a product will be satisfied by all products; even a broken product satisfies such a weak and undemanding specification

L1 P sat true

If a product satisfies two different specifications, it also satisfies their conjunction

L2A If P sat S
    and P sat T
    then P sat (S ∧ T)

The law L2A generalises to infinite conjunctions, i.e., to universal quantification. Let S(n) be a predicate containing the variable n

L2 If ∀ n • (P sat S(n))
   then P sat (∀ n • S(n))
   provided that P does not depend on n.

If a specification S logically implies another specification T, then every observation described by S is also described by T. Consequently every product which satisfies S must also satisfy the weaker specification T

L3 If P sat S
   and S ⇒ T
   then P sat T

In the light of this law, we will sometimes lay out proofs as a chain; so if S ⇒ T, we write

P sat S
  ⇒ T

as an abbreviation for the fuller proof

P sat S
S ⇒ T
P sat T     [by L3]

The laws and their explanations given above apply to all kinds of products and all kinds of specifications. In the next section we shall give the additional laws which apply to processes.

1.10.2 Proofs

In the design of a product, the designer has a responsibility to ensure that it will satisfy its specification; this responsibility may be discharged by the reasoning methods of the relevant branches of mathematics, for example, geometry or the differential and integral calculus. In this section we shall give a collection of laws which permit the use of mathematical reasoning to ensure that a process P meets its specification S.

We will sometimes write the specification as S(tr), suggesting that a specification will normally contain tr as a free variable. However, the real reason for making tr explicit is to indicate how tr may be substituted by some more elaborate expression, as for example in S(tr′). It is important to note that both S and S(tr) can have other free variables besides tr.

Any observation of the process STOP will always be an empty trace, since this process never does anything

L4A STOP sat (tr = ⟨⟩)

A trace of the process (c → P) is initially empty. Every subsequent trace begins with c, and its tail is a trace of P. Consequently its tail must be described by any specification of P

L4B If P sat S(tr)
    then (c → P) sat (tr = ⟨⟩ ∨ (tr₀ = c ∧ S(tr′)))

A corollary of this law deals with double prefixing

L4C If    P sat S(tr)
    then  (c → d → P) sat (tr ≤ ⟨c, d⟩ ∨ (tr ≥ ⟨c, d⟩ ∧ S(tr″)))

Binary choice is similar to prefixing, except that the trace may begin with either of the two alternative events, and its tail must be described by the specification of the chosen alternative

L4D If    P sat S(tr)
    and   Q sat T(tr)
    then  (c → P | d → Q) sat (tr = ⟨⟩ ∨ (tr₀ = c ∧ S(tr′)) ∨ (tr₀ = d ∧ T(tr′)))

All the laws given above are special cases of the law for general choice

L4  If    ∀ x : B • (P(x) sat S(tr, x))
    then  (x : B → P(x)) sat (tr = ⟨⟩ ∨ (tr₀ ∈ B ∧ S(tr′, tr₀)))

The law governing the after operator is surprisingly simple. If tr is a trace of (P / s), s⌢tr is a trace of P, and therefore must be described by any specification which P satisfies

L5  If    P sat S(tr)
    and   s ∈ traces(P)
    then  P / s sat S(s⌢tr)

Finally, we need a law to establish the correctness of a recursively defined process

L6  If    F(X) is guarded
    and   STOP sat S
    and   ((X sat S) ⇒ (F(X) sat S))
    then  (µ X • F(X)) sat S

The antecedents of this law ensure (by induction) that

    Fⁿ(STOP) sat S    for all n

Since F is guarded, Fⁿ(STOP) fully describes at least the first n steps of the behaviour of µ X • F(X). So each trace of µ X • F(X) is a trace of Fⁿ(STOP) for some n. This trace must therefore satisfy the same specification as Fⁿ(STOP), which (for all n) is S. A more formal proof can be given in terms of the mathematical theory of Section 2.8.

Example

X1 We want to prove (1.1.2 X2, 1.10 X3) that

    VMS sat VMSPEC

Proof :

1.  STOP sat tr = ⟨⟩                                   [L4A]
         ⇒ 0 ≤ (tr ↓ coin − tr ↓ choc) ≤ 1             [since (⟨⟩ ↓ coin) = (⟨⟩ ↓ choc) = 0]

    The conclusion follows by an (implicit) appeal to L3.

2.  Assume X sat (0 ≤ ((tr ↓ coin) − (tr ↓ choc)) ≤ 1), then

    (coin → choc → X)
      sat (tr ≤ ⟨coin, choc⟩ ∨
           (tr ≥ ⟨coin, choc⟩ ∧ 0 ≤ ((tr″ ↓ coin) − (tr″ ↓ choc)) ≤ 1))     [L4C]
      ⇒ 0 ≤ ((tr ↓ coin) − (tr ↓ choc)) ≤ 1

    since ⟨⟩ ↓ coin = ⟨⟩ ↓ choc = ⟨coin⟩ ↓ choc = 0
    and   ⟨coin⟩ ↓ coin = (⟨coin, choc⟩ ↓ coin) = ⟨coin, choc⟩ ↓ choc = 1
    and   tr ≥ ⟨coin, choc⟩ ⇒ (tr ↓ coin = tr″ ↓ coin + 1 ∧ tr ↓ choc = tr″ ↓ choc + 1)

The conclusion follows by appeal to L3 and L6.

The fact that a process P satisfies its specification does not necessarily mean that it is going to be satisfactory in use. For example, since

    tr = ⟨⟩ ⇒ 0 ≤ (tr ↓ coin − tr ↓ choc) ≤ 1

one can prove by L3 and L4A that

    STOP sat 0 ≤ (tr ↓ coin − tr ↓ choc) ≤ 1

Yet STOP will not serve as an adequate vending machine, either for its owner or for the customer. It certainly avoids doing anything wrong, but only by the lazy expedient of doing nothing at all. For this reason, STOP satisfies every specification which is satisfiable by any process.

Fortunately, it is obvious by independent reasoning that VMS will never stop. In fact, any process defined solely by prefixing, choice, and guarded recursions will never stop. The only way to write a process that can stop is to include explicitly the process STOP, or the process (x : B → P(x)) where B is the empty set. By avoiding such elementary mistakes one can guarantee to write processes that never stop. However, after introduction of concurrency in the next chapter, such simple precautions are no longer adequate. A more general method of specifying and proving that a process will never stop is described in Section 3.7.
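The satisfaction relation can be explored mechanically on small examples. The following Python sketch enumerates the traces of a process up to a chosen length and tests a specification on each of them. It is only a bounded check, not a proof in the sense of laws L1 to L6, and the dictionary encoding of processes together with the names traces, sat, vmspec and BAD are assumptions made for this illustration.

```python
# A process is encoded (by assumption) as a zero-argument function returning
# a dict that maps each initially possible event to the process that follows.

def STOP():
    return {}                      # STOP offers no events at all

def VMS():
    # VMS = (coin -> choc -> VMS)
    return {"coin": lambda: {"choc": VMS}}

def BAD():
    # A hypothetical faulty machine that gives chocolate before payment.
    return {"choc": lambda: {"coin": BAD}}

def traces(process, depth):
    """All traces of `process` of length at most `depth`, as tuples."""
    found = [()]
    if depth > 0:
        for event, following in process().items():
            found += [(event,) + t for t in traces(following, depth - 1)]
    return found

def vmspec(tr):
    """VMSPEC: 0 <= (tr ↓ coin) - (tr ↓ choc) <= 1."""
    return 0 <= tr.count("coin") - tr.count("choc") <= 1

def sat(process, spec, depth):
    """Bounded form of `P sat S`: S holds of every trace up to `depth`."""
    return all(spec(tr) for tr in traces(process, depth))

print(sat(VMS, vmspec, 8), sat(STOP, vmspec, 8), sat(BAD, vmspec, 8))
```

Both VMS and STOP pass the check, which mirrors the observation that STOP satisfies the specification simply by doing nothing; the hypothetical BAD machine fails on its very first event.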


2 Concurrency

2.1 Introduction

A process is defined by describing the whole range of its potential behaviour. Frequently, there will be a choice between several different actions, for example, the insertion of a large coin or a small one into a vending machine VMC (1.1.3 X4). On each such occasion, the choice of which event will actually occur can be controlled by the environment within which the process evolves. For example, it is the customer of the vending machine who may select which coin to insert. Fortunately, the environment of a process itself may be described as a process, with its behaviour defined by familiar notations. This permits investigation of the behaviour of a complete system composed from the process together with its environment, acting and interacting with each other as they evolve concurrently. The complete system should also be regarded as a process, whose range of behaviour is definable in terms of the behaviour of its component processes; and the system may in turn be placed within a yet wider environment. In fact, it is best to forget the distinction between processes, environments, and systems; they are all of them just processes whose behaviour may be prescribed, described, recorded and analysed in a simple and homogeneous fashion.

2.2 Interaction

When two processes are brought together to evolve concurrently, the usual intention is that they will interact with each other. These interactions may be regarded as events that require simultaneous participation of both the processes involved. For the time being, let us confine attention to such events, and ignore all others. Thus we will assume that the alphabets of the two processes are the same. Consequently, each event that actually occurs must be a possible event in the independent behaviour of each process separately. For example, a chocolate can be extracted from a vending machine only when its customer wants it and only when the vending machine is prepared to give it.

If P and Q are processes with the same alphabet, we introduce the notation

    P || Q

to denote the process which behaves like the system composed of processes P and Q interacting in lock-step synchronisation as described above.

Examples

X1 A greedy customer of a vending machine is perfectly happy to obtain a toffee or even a chocolate without paying. However, if thwarted in these desires, he is reluctantly prepared to pay a coin, but then he insists on taking a chocolate

    GRCUST = (toffee → GRCUST
             | choc → GRCUST
             | coin → choc → GRCUST)

When this customer is brought together with the machine VMCT (1.1.3 X3) his greed is frustrated, since the vending machine does not allow goods to be extracted before payment. Similarly, VMCT never gives a toffee, because the customer never wants one after he has paid

    (GRCUST || VMCT) = µ X • (coin → choc → X)

This example shows how a process which has been defined as a composition of two subprocesses may also be described as a simple single process, without using the concurrency operator ||.

X2 A foolish customer wants a large biscuit, so he puts his coin in the vending machine VMC. He does not notice whether he has inserted a large coin or a small one; nevertheless, he is determined on a large biscuit

    FOOLCUST = (in2p → large → FOOLCUST
               | in1p → large → FOOLCUST)

Unfortunately, the vending machine is not prepared to yield a large biscuit for only a small coin

    (FOOLCUST || VMC) = µ X • (in2p → large → X | in1p → STOP)

The STOP that supervenes after the first in1p is known as deadlock. Although each component process is prepared to engage in some further action, these actions are different; since the processes cannot agree on what the next action shall be, nothing further can happen.

The stories that accompany these examples show a sad betrayal of proper standards of scientific abstraction and objectivity. It is important to remember that events are intended to be neutral transitions which could be observed and recorded by some dispassionate visitor from another planet, who knows nothing of the pleasures of eating biscuits, or of the hunger suffered by the foolish customer as he vainly tries to obtain sustenance. We have deliberately chosen the alphabet of relevant events to exclude such internal emotional states; if and when desired, further events can be introduced to model internal state changes, as shown in 2.3 X1.

2.2.1 Laws

The laws governing the behaviour of (P || Q) are exceptionally simple and regular. The first law expresses the logical symmetry between a process and its environment

L1  P || Q = Q || P

The next law shows that when three processes are assembled, it does not matter in which order they are put together

L2  P || (Q || R) = (P || Q) || R

Thirdly, a deadlocked process infects the whole system with deadlock; but composition with RUNαP (1.1.3 X8) makes no difference

L3A P || STOPαP = STOPαP
L3B P || RUNαP = P

The next laws show how a pair of processes either engage simultaneously in the same action, or deadlock if they disagree on what the first action should be

L4A (c → P) || (c → Q) = (c → (P || Q))
L4B (c → P) || (d → Q) = STOP        if c ≠ d

These laws readily generalise to cases when one or both processes offer a choice of initial event; only events which they both offer will remain possible when the processes are combined

L4  (x : A → P(x)) || (y : B → Q(y)) = (z : (A ∩ B) → (P(z) || Q(z)))

It is this law which permits a system defined in terms of concurrency to be given an alternative description without concurrency.
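Law L4 can be read as a recipe for computing the combined behaviour one event at a time. The following Python sketch assumes that a process is represented as a function from an event to either the next process or a refusal marker BLEEP. The helper names prefix and accepts are hypothetical, and the definition of VMCT is taken on trust from 1.1.3 X3.

```python
BLEEP = "BLEEP"   # marker returned when a process refuses an event

def prefix(c, P):
    """(c -> P): engage in c and then behave like P."""
    return lambda x: P if x == c else BLEEP

def intersect(P, Q):
    """P || Q for processes with the same alphabet, following law L4."""
    def combined(z):
        p, q = P(z), Q(z)
        if p is BLEEP or q is BLEEP:
            return BLEEP           # z is not offered by both operands
        return intersect(p, q)
    return combined

def GRCUST(x):
    # GRCUST = (toffee -> GRCUST | choc -> GRCUST | coin -> choc -> GRCUST)
    if x in ("toffee", "choc"):
        return GRCUST
    if x == "coin":
        return prefix("choc", GRCUST)
    return BLEEP

def VMCT(x):
    # Assumed from 1.1.3 X3: VMCT = (coin -> (choc -> VMCT | toffee -> VMCT))
    if x == "coin":
        return lambda y: VMCT if y in ("choc", "toffee") else BLEEP
    return BLEEP

def accepts(P, trace):
    """True if P can engage in every event of `trace` in order."""
    for event in trace:
        P = P(event)
        if P is BLEEP:
            return False
    return True

SYSTEM = intersect(GRCUST, VMCT)
print(accepts(SYSTEM, ["coin", "choc", "coin", "choc"]))
print(accepts(SYSTEM, ["toffee"]))
```

Under these assumptions the combined process accepts only alternations of coin and choc, in agreement with (GRCUST || VMCT) = µ X • (coin → choc → X).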

Example

X1 Let P = (a → b → P | b → P)
   and Q = (a → (b → Q | c → Q))

Then
    (P || Q) = a → ((b → P) || (b → Q | c → Q))      [by L4A]
             = a → (b → (P || Q))                     [by L4A]
             = µ X • (a → b → X)                      [since the recursion is guarded.]

2.2.2 Implementation

The implementation of the || operator is clearly based on L4

    intersect(P, Q) = λ z • if P(z) = "BLEEP or Q(z) = "BLEEP then
                              "BLEEP
                            else
                              intersect(P(z), Q(z))

2.2.3 Traces

Since each action of (P || Q) requires simultaneous participation of both P and Q, each sequence of such actions must be possible for both these operands. For the same reason, / s distributes through ||.

L1  traces(P || Q) = traces(P) ∩ traces(Q)

L2  (P || Q) / s = (P / s) || (Q / s)

2.3 Concurrency

The operator described in the previous section can be generalised to the case when its operands P and Q have different alphabets

    αP ≠ αQ

When such processes are assembled to run concurrently, events that are in both their alphabets (as explained in the previous section) require simultaneous participation of both P and Q. However, events in the alphabet of P but not in the alphabet of Q are of no concern to Q, which is physically incapable of

controlling or even of noticing them. Such events may occur independently of Q whenever P engages in them. Similarly, Q may engage alone in events which are in the alphabet of Q but not of P. Thus the set of all events that are logically possible for the system is simply the union of the alphabets of the component processes

    α(P || Q) = αP ∪ αQ

This is a rare example of an operator which takes operands with different alphabets, and yields a result with yet a third alphabet. However in the case when the two operands have the same alphabet, so does the resulting combination, and (P || Q) has exactly the meaning described in the previous section.

Examples

X1 Let αNOISYVM = {coin, choc, clink, clunk, toffee}, where clink is the sound of a coin dropping into the moneybox of a noisy vending machine, and clunk is the sound made by the vending machine on completion of a transaction. The noisy vending machine has run out of toffee

    NOISYVM = (coin → clink → choc → clunk → NOISYVM)

The customer of this machine definitely prefers toffee; the curse is what he utters when he fails to get it; he then has to take a chocolate instead

    αCUST = {coin, choc, curse, toffee}

    CUST = (coin → (toffee → CUST | curse → choc → CUST))

The result of the concurrent activity of these two processes is

    (NOISYVM || CUST) = µ X • (coin → (clink → curse → choc → clunk → X
                                      | curse → clink → choc → clunk → X))

Note that the clink may occur before the curse, or the other way round. They may even occur simultaneously, and it will not matter in which order they are recorded. Note also that the mathematical formula in no way represents the fact that the customer prefers to get a toffee rather than utter a curse. The formula is an abstraction from reality, which ignores human emotions and concentrates on describing only the possibilities of occurrence and non-occurrence of events within the alphabet of the processes, whether those events are desired or not.

X2

[Diagram: a board of two rows (i = 1, 2) and three columns (j = 1, 2, 3)]

A counter starts at the middle bottom square of the board, and may move within the board either up, down, left or right. Let

    αP = {up, down}
    P = (up → down → P)
    αQ = {left, right}
    Q = (right → left → Q | left → right → Q)

The behaviour of this counter may be defined P || Q.

In this example, the alphabets αP and αQ have no event in common. Consequently, the movements of the counter are an arbitrary interleaving of actions from the process P with actions from the process Q. Such interleavings are very laborious to describe without concurrency. For example, let Rij stand for the behaviour of the counter (X2) when situated in row i and column j of the board, for i ∈ {1, 2}, j ∈ {1, 2, 3}. Then

    (P || Q) = R12

where
    R21 = (down → R11 | right → R22)
    R11 = (up → R21 | right → R12)
    R22 = (down → R12 | left → R21 | right → R23)
    R12 = (up → R22 | left → R11 | right → R13)
    R23 = (down → R13 | left → R22)
    R13 = (up → R23 | left → R12)

2.3.1 Laws

The first three laws for the extended form of concurrency are similar to those for interaction (Section 2.2.1)

L1,2 || is symmetric and associative

L3A P || STOPαP = STOPαP

2. as shown in the following example. d} ⊆ (αP ∩ αQ ). c} αQ = {b. c} P || Q = (a → c → P ) || (c → b → Q ) = a → ((c → P ) || (c → b → Q )) = a → c → (P || (b → Q )) Also P || (b → Q ) [by deﬁnition] [by L5A] [by L4A …‡] and P = (a → c → P ) Q = (c → b → Q ) . b ∈ (αQ − αP ) and {c. Example X1 Let αP = {a. Q engages alone in b.3 Concurrency 51 L3B P || RUNαP = P Let a ∈ (αP − αQ ). The following laws show the way in which P engages alone in a. These laws permit a process deﬁned by concurrency to be redeﬁned without that operator. but c and d require simultaneous participation of both P and Q L4A L4B L5A L5B (c → P ) || (c → Q ) = c → (P || Q ) (c → P ) || (d → Q ) = STOP if c ≠ d (a → P ) || (c → Q ) = a → (P || (c → Q )) (c → P ) || (b → Q ) = b → ((c → P ) || Q ) L6 (a → P ) || (b → Q ) = (a → (P || (b → Q )) | b → ((a → P ) || Q )) These laws can be generalised to deal with the general choice operator L7 Let and Then where and P = (x : A → P (x)) Q = (y : B → Q (y)) (P || Q ) = (z : C → P || Q ) C = (A ∩ B) ∪ (A − αQ ) ∪ (B − αP ) P = P (z) P =P and Q = Q (z) Q =Q if z ∈ A otherwise if z ∈ B otherwise.

           = (a → (c → P) || (b → Q)
             | b → (P || Q))                                   [by L6]
           = (a → b → ((c → P) || Q)
             | b → (P || Q))                                   [by L5B]
           = (a → b → c → (P || (b → Q))
             | b → a → c → (P || (b → Q)))                     [by ‡ above]
           = µ X • (a → b → c → X
                   | b → a → c → X)                            [since this is guarded]

Therefore
    (P || Q) = (a → c → µ X • (a → b → c → X
                              | b → a → c → X))                [by ‡ above]

2.3.2 Implementation

The implementation of the operator || is derived directly from the law L7. The alphabets of the operands are represented as finite lists of symbols, A and B. Test of membership uses the function ismember(x, A) defined in Section 1.7.

P || Q is implemented by calling a function concurrent(P, αP, αQ, Q), which is defined as follows

    concurrent(P, A, B, Q) = aux(P, Q)

    where aux(P, Q) = λ x • if P = "BLEEP or Q = "BLEEP then
                              "BLEEP
                            else if ismember(x, A) and ismember(x, B) then
                              aux(P(x), Q(x))
                            else if ismember(x, A) then
                              aux(P(x), Q)
                            else if ismember(x, B) then
                              aux(P, Q(x))
                            else
                              "BLEEP
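The same construction can be sketched in Python, under the assumption that a process is a function from an event to the next process or to the marker BLEEP, and that alphabets are given as Python sets rather than lists. This is an illustration rather than a transcription of the definition above: it tests for refusal before building the combined successor, so that an event refused by a participating operand is refused at once. The helper accepts is hypothetical.

```python
BLEEP = "BLEEP"   # marker returned when a process refuses an event

def concurrent(P, A, B, Q):
    """P || Q where A and B are the alphabets of P and Q (law L7)."""
    def step(x):
        in_a, in_b = x in A, x in B
        if not (in_a or in_b):
            return BLEEP                  # x is outside both alphabets
        p = P(x) if in_a else P           # P takes part only if x is in A
        q = Q(x) if in_b else Q           # Q takes part only if x is in B
        if p is BLEEP or q is BLEEP:
            return BLEEP                  # a required participant refuses
        return concurrent(p, A, B, q)
    return step

def P(x):
    # P = (a -> c -> P)
    return (lambda y: P if y == "c" else BLEEP) if x == "a" else BLEEP

def Q(x):
    # Q = (c -> b -> Q)
    return (lambda y: Q if y == "b" else BLEEP) if x == "c" else BLEEP

def accepts(process, trace):
    """True if `process` can engage in every event of `trace` in order."""
    for event in trace:
        process = process(event)
        if process is BLEEP:
            return False
    return True

SYSTEM = concurrent(P, {"a", "c"}, {"b", "c"}, Q)
print(accepts(SYSTEM, ["a", "c", "a", "b", "c"]))
print(accepts(SYSTEM, ["a", "c", "b", "a", "c"]))
print(accepts(SYSTEM, ["a", "b"]))
```

With P and Q taken from Example X1, the first two traces are accepted and the third is refused, as expected from (P || Q) = (a → c → µ X • (a → b → c → X | b → a → c → X)).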

Furthermore.3 X1. it follows that s αP = s αQ = s and these laws are then the same as in Section 2. in which events which are in the alphabet of both of them occur only once. and the meaning of (P || Q ) is exactly as deﬁned for interaction (Section 2.2. Thus (t αP ) is a trace of all those events in which P has participated. Then every event in t which belongs to the alphabet of P has been an event in the life of P . curse.2). then t1 αNOISYVM = coin. every event in t must be in either αP or αQ . clink ∈ traces(NOISYVM || CUST ) This shows that the curse and the clink may be recorded one after the other in either order.3 X2. but we have made the decision to provide no way of recording this. and is therefore a trace of P .3 Traces Let t be a trace of (P || Q ).3 Concurrency 53 2. click t1 αCUST = coin. curse . If αP ∩αQ = {} then the traces are pure interleavings (Section 1. a trace of (P || Q ) is a kind of interleaving of a trace of P with a trace of Q . This reasoning suggests the law L1 traces(P || Q ) = { t | (t αP ) ∈ traces(P ) ∧ (t αQ ) ∈ traces(Q ) ∧ t ∈ (αP ∪ αQ )∗ } The next law shows how the / s operator distributes through parallel composition L2 (P || Q ) / s = (P / (s αP )) || (Q / (s αQ )) When αP = αQ .3).3. In summary. curse therefore t1 ∈ traces(NOISYVM || CUST ) Similar reasoning shows that coin. By a similar argument (t αQ ) is a trace of Q .) Let t1 = coin.9. They may even occur simultaneously. At the other extreme. as shown in 2. click. Example X1 (See 2.3. where αP = αQ . and every event in t which does not belong to αP has occurred without the participation of P . every event belongs to both of the alphabets. [which is in traces(NOISYVM )] [which is in traces(CUST )] .2.

L3A If αP ∩ αQ = {}, then
    traces(P || Q) = { s | ∃ t : traces(P); u : traces(Q) • s interleaves (t, u) }

L3B If αP = αQ then
    traces(P || Q) = traces(P) ∩ traces(Q)

2.4 Pictures

A process P with alphabet {a, b, c} is pictured as a box labelled P, from which emerge a number of lines, each labelled with a different event from its alphabet (Figure 2.1).

[Figure 2.1: a box labelled P with lines a, b and c]

Similarly, Q with its alphabet {b, c, d} may be pictured as in Figure 2.2.

[Figure 2.2: a box labelled Q with lines b, c and d]

When these two processes are put together to evolve concurrently, the resulting system may be pictured as a network in which similarly labelled lines are connected, but lines labelled by events in the alphabet of only one process are left free (Figure 2.3).

[Figure 2.3: boxes P and Q joined by the lines b and c, with a and d left free]

A third process R with αR = {c, e} may be added, as shown in Figure 2.4. This diagram shows that the event c requires participation of all three processes, b requires participation of P and Q, whereas each remaining event is the sole concern of a single process. Pictures of this kind will be known as connection diagrams.

[Figure 2.4: boxes P, Q and R with lines a, b, c, d and e; c joins all three boxes and b joins P and Q]

But these pictures could be quite misleading. A system constructed from three processes is still only a single process, and may therefore be pictured as a single box (Figure 2.5). The number 60 can be constructed as the product of three other numbers (3 × 4 × 5); but after it has been so constructed it is still only a single number, and the manner of its construction is no longer relevant or even observable.

[Figure 2.5: a single box labelled (P || Q || R) with lines a, b, c, d and e]

2.5 Example: The Dining Philosophers

In ancient times, a wealthy philanthropist endowed a College to accommodate five eminent philosophers. Each philosopher had a room in which he could engage in his professional activity of thinking; there was also a common dining

room, furnished with a circular table, surrounded by five chairs, each labelled by the name of the philosopher who was to sit in it. The names of the philosophers were PHIL0, PHIL1, PHIL2, PHIL3, PHIL4, and they were disposed in this order anticlockwise around the table. To the left of each philosopher there was laid a golden fork, and in the centre stood a large bowl of spaghetti, which was constantly replenished.

A philosopher was expected to spend most of his time thinking; but when he felt hungry, he went to the dining room, sat down in his own chair, picked up his own fork on his left, and plunged it into the spaghetti. But such is the tangled nature of spaghetti that a second fork is required to carry it to the mouth. The philosopher therefore had also to pick up the fork on his right. When he was finished he would put down both his forks, get up from his chair, and continue thinking. Of course, a fork can be used by only one philosopher at a time. If the other philosopher wants it, he just has to wait until the fork is available again.

2.5.1 Alphabets

We shall now construct a mathematical model of this system. First we must select the relevant sets of events. For PHILi, the set is defined

    αPHILi = {i.sits down, i.gets up,
              i.picks up fork.i, i.picks up fork.(i ⊕ 1),
              i.puts down fork.i, i.puts down fork.(i ⊕ 1)}

where ⊕ is addition modulo 5, so i ⊕ 1 identifies the right-hand neighbour of the ith philosopher.

Note that the alphabets of the philosophers are mutually disjoint. There is no event in which they participate jointly, so there is no way whatsoever in which they can interact or communicate with each other: a realistic reflection of the behaviour of philosophers in those days.

The other actors in our little drama are the five forks, each of which bears the same number as the philosopher who owns it. A fork is picked up and put down either by this philosopher, or by his neighbour on the other side. The alphabet of the ith fork is defined

    αFORKi = {i.picks up fork.i, (i ⊖ 1).picks up fork.i,
              i.puts down fork.i, (i ⊖ 1).puts down fork.i}

where ⊖ denotes subtraction modulo 5.

Thus each event except sitting down and getting up requires participation of exactly two adjacent actors, a philosopher and a fork, as shown in the connection diagram of Figure 2.6.

[Figure 2.6: connection diagram of the five philosophers and the five forks; each philosopher has free lines for sits down and gets up, and shares its picks up fork and puts down fork lines with the two adjacent forks]

2.5.2 Behaviour

Apart from thinking and eating which we have chosen to ignore, the life of each philosopher is described as the repetition of a cycle of six events

    PHILi = (i.sits down →
             i.picks up fork.i →
             i.picks up fork.(i ⊕ 1) →
             i.puts down fork.i →
             i.puts down fork.(i ⊕ 1) →
             i.gets up → PHILi)

The role of a fork is a simple one; it is repeatedly picked up and put down by one of its adjacent philosophers (the same one on both occasions)

    FORKi = (i.picks up fork.i → i.puts down fork.i → FORKi
            | (i ⊖ 1).picks up fork.i → (i ⊖ 1).puts down fork.i → FORKi)

The behaviour of the whole College is the concurrent combination of the behaviour of each of these components

    PHILOS = (PHIL0 || PHIL1 || PHIL2 || PHIL3 || PHIL4)
    FORKS = (FORK0 || FORK1 || FORK2 || FORK3 || FORK4)
    COLLEGE = PHILOS || FORKS

An interesting variation of this story allows the philosophers to pick up their two forks in either order, or put them down in either order. Consider the behaviour of each philosopher's hand separately. Each hand is capable of picking up the relevant fork, but both hands are needed for sitting down and getting up

    αLEFTi = {i.picks up fork.i, i.puts down fork.i, i.sits down, i.gets up}
    αRIGHTi = {i.picks up fork.(i ⊕ 1), i.puts down fork.(i ⊕ 1), i.sits down, i.gets up}

    LEFTi = (i.sits down → i.picks up fork.i → i.puts down fork.i → i.gets up → LEFTi)
    RIGHTi = (i.sits down → i.picks up fork.(i ⊕ 1) → i.puts down fork.(i ⊕ 1) → i.gets up → RIGHTi)
    PHILi = LEFTi || RIGHTi

Synchronisation of sitting down and getting up by both LEFTi and RIGHTi ensures that no fork can be raised except when the relevant philosopher is seated. Apart from this, operations on the two forks are arbitrarily interleaved.

In yet another variation of the story, each fork may be picked up and put down many times on each occasion that the philosopher sits down. Thus the behaviour of the hands is modified to contain an iteration, for example

    LEFTi = (i.sits down →
             µ X • (i.picks up fork.i → i.puts down fork.i → X
                   | i.gets up → LEFTi))

2.5.3 Deadlock!

When a mathematical model had been constructed, it revealed a serious danger. Suppose all the philosophers get hungry at about the same time; they all sit down; they all pick up their own forks; and they all reach out for the other fork, which isn't there. In this undignified situation, they will all inevitably starve. Although each actor is capable of further action, there is no action which any pair of them can agree to do next.

However, our story does not end so sadly. Once the danger was detected, there were suggested many ways to avert it. For example, one of the philosophers could always pick up the wrong fork first, if only they could agree which one it should be! The purchase of a single additional fork was ruled out for similar reasons, whereas the purchase of five more forks was much too expensive. The solution finally adopted was the appointment of a footman, whose duty it was to assist each philosopher into and out of his chair. His alphabet was defined as

    ⋃ i = 0..4 {i.sits down, i.gets up}

This footman was given secret instructions never to allow more than four philosophers to be seated simultaneously. His behaviour is most simply defined by mutual recursion. Let

    U = ⋃ i = 0..4 {i.gets up}
    D = ⋃ i = 0..4 {i.sits down}

FOOTj defines the behaviour of the footman with j philosophers seated

    FOOT0 = (x : D → FOOT1)
    FOOTj = (x : D → FOOTj+1 | y : U → FOOTj−1)        for j ∈ {1, 2, 3}
    FOOT4 = (y : U → FOOT3)

A college free of deadlock is defined

    NEWCOLLEGE = (COLLEGE || FOOT0)

The edifying tale of the dining philosophers is due to Edsger W. Dijkstra. The footman is due to Carel S. Scholten.

2.5.4 Proof of absence of deadlock

In the original COLLEGE the risk of deadlock was far from obvious; the claim that NEWCOLLEGE is free from deadlock should therefore be proved with some care. What we must prove can be stated formally as

    (NEWCOLLEGE / s) ≠ STOP        for all s ∈ traces(NEWCOLLEGE)
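For this particular instance with five philosophers, the claim can also be cross-checked informally by enumerating reachable states, visiting each state once, instead of examining individual traces. The Python sketch below is such a check and not a substitute for a proof. The encoding of each component as an (alphabet, transition table, initial state) triple, the tuple form of event names, and the function names phil, fork, footman and explore are all assumptions made for this illustration.

```python
from collections import deque

N = 5  # number of philosophers and of forks

def phil(i):
    """PHILi as a six-state cycle; returns (alphabet, transitions, start)."""
    r = (i + 1) % N
    cycle = [(i, "sits down"), (i, "picks up fork", i), (i, "picks up fork", r),
             (i, "puts down fork", i), (i, "puts down fork", r), (i, "gets up")]
    return set(cycle), {k: {cycle[k]: (k + 1) % 6} for k in range(6)}, 0

def fork(i):
    """FORKi: free, held by its owner i, or held by the neighbour i - 1."""
    l = (i - 1) % N
    table = {"free": {(i, "picks up fork", i): "own", (l, "picks up fork", i): "other"},
             "own": {(i, "puts down fork", i): "free"},
             "other": {(l, "puts down fork", i): "free"}}
    return {e for moves in table.values() for e in moves}, table, "free"

def footman():
    """FOOT0 .. FOOT4: the state counts the seated philosophers (at most four)."""
    downs = [(i, "sits down") for i in range(N)]
    ups = [(i, "gets up") for i in range(N)]
    table = {j: {} for j in range(N)}
    for j in range(N):
        if j < N - 1:
            table[j].update({e: j + 1 for e in downs})
        if j > 0:
            table[j].update({e: j - 1 for e in ups})
    return set(downs + ups), table, 0

def explore(components):
    """Breadth-first search; returns (number of states, deadlocked states)."""
    events = set().union(*(c[0] for c in components))
    start = tuple(c[2] for c in components)
    seen, queue, deadlocks = {start}, deque([start]), []
    while queue:
        state = queue.popleft()
        successors = []
        for e in events:
            nxt = list(state)
            for k, (alphabet, table, _) in enumerate(components):
                if e in alphabet:                # this component must take part
                    target = table[state[k]].get(e)
                    if target is None:
                        break                    # it refuses, so e is blocked
                    nxt[k] = target
            else:
                successors.append(tuple(nxt))
        if not successors:
            deadlocks.append(state)
        for s in successors:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return len(seen), deadlocks

COLLEGE = [phil(i) for i in range(N)] + [fork(i) for i in range(N)]
NEWCOLLEGE = COLLEGE + [footman()]
for name, system in (("COLLEGE", COLLEGE), ("NEWCOLLEGE", NEWCOLLEGE)):
    count, dead = explore(system)
    print(name, "reachable states:", count, "deadlocked states:", len(dead))
```

Under this encoding the search finds a deadlocked state in COLLEGE, the one in which every philosopher holds his own fork, and none in NEWCOLLEGE. The check covers only this one finite instance.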

The proof proceeds by taking an arbitrary trace s, and showing that in all cases there is at least one event by which s can be extended and still remain in traces(NEWCOLLEGE). First we define the number of seated philosophers

    seated(s) = #(s ↾ D) − #(s ↾ U)        where U and D are defined above

Because (by 2.3.3 L1) s ↾ (U ∪ D) ∈ traces(FOOT0), we know seated(s) ≤ 4. If seated(s) ≤ 3, at least one more philosopher can sit down, so that there is no deadlock. In the remaining case that seated(s) = 4, consider the number of philosophers who are eating (with both their forks raised). If this is nonzero, then an eating philosopher can always put down his left fork. In the remaining case, that no philosopher is eating, consider the number of raised forks. If this is three or less, then one of the seated philosophers can pick up his left fork. If there are four raised forks, then the philosopher to the left of the vacant seat already has raised his left fork and can pick up his right one. If there are five raised forks, then at least one of the seated philosophers must be eating.

This proof involves analysis of a number of cases, described informally in terms of the behaviour of this particular example. Let us consider an alternative proof method: program a computer to explore all possible behaviours of the system to look for deadlock. In general, we could never know whether such a program had looked far enough to guarantee absence of deadlock. But in the case of a finite-state system like the COLLEGE it is sufficient to consider only those traces whose length does not exceed a known upper bound on the number of states. The number of states of (P || Q) does not exceed the product of the number of states of P and the number of states of Q. Since each philosopher has six states and each fork has three states, the total number of states of the COLLEGE does not exceed 6⁵ × 3⁵, or approximately 1.8 million. Since the alphabet of the footman is contained in that of the COLLEGE, the NEWCOLLEGE cannot have more states than the COLLEGE. Since in nearly every state there are two or more possible events, the number of traces that must be examined will exceed two raised to the power of 1.8 million. There is no hope that a computer will ever be able to explore all these possibilities. Proof of the absence of deadlock, even for quite simple finite processes, will remain the responsibility of the designer of concurrent systems.

2.5.5 Infinite overtaking

Apart from deadlock, there is another danger that faces a dining philosopher: that of being infinitely overtaken. Suppose that a seated philosopher has an extremely greedy left neighbour, and a rather slow left arm. Before he can pick up his left fork, his left neighbour rushes in, sits down, rapidly picks up both forks, and spends a long time eating. Eventually he puts down both forks, and leaves his seat. But then the left neighbour instantly gets hungry again,

rushes in, sits down, and rapidly snatches both forks, before his long-seated and long-suffering right neighbour gets around to picking up the fork they share. Since this cycle may be repeated indefinitely, a seated philosopher may never succeed in eating.

The correct solution to this problem is probably to regard it as insoluble, because if any philosopher is as greedy as described above, then somebody (either he or his neighbours) will inevitably spend a long time hungry. There is no clever way of ensuring general satisfaction, and the only effective solution is to buy more forks, and plenty of spaghetti. However, if it is important to guarantee that a seated philosopher will eventually eat, modify the behaviour of the footman: having helped a philosopher to his seat he waits until that philosopher has picked up both forks before he allows either of his neighbours to sit down.

But there remains a more philosophical problem about infinite overtaking. Suppose the footman conceives an irrational dislike for one of his philosophers, and persistently delays the action of escorting him to his chair, even when the philosopher is ready to engage in that event. This is a possibility that cannot be described in our conceptual framework, because we cannot distinguish it from the possibility that the philosopher himself takes an indefinitely long time to get hungry. So here is a problem, like detailed timing problems, which we have deliberately decided to ignore, or rather to delegate to a different phase of design and implementation. It is an implementor's responsibility to ensure that any desirable event that becomes possible will take place within an acceptable interval. The implementor of a conventional high-level programming language has a similar obligation not to insert arbitrary delays into the execution of a program, even though the programmer has no way of enforcing or even describing this obligation.

2.6 Change of symbol

The example of the previous section involved two collections of processes, philosophers and forks; within each collection the processes have very similar behaviour, except that the names of the events in which they engage are different. In this section we introduce a convenient method of defining groups of processes with similar behaviour. Let f be a one-one function (injection) which maps the alphabet of P onto a set of symbols A

    f : αP → A

We define the process f(P) as one which engages in the event f(c) whenever P would have engaged in c. It follows that

    αf(P) = f(αP)
    traces(f(P)) = { f∗(s) | s ∈ traces(P) }

(For the definition of f∗ see 1.9.1).
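Change of symbol is easy to sketch when a process is represented as a function from an event to the next process or to a refusal marker BLEEP. In the Python illustration below, the names rename, star and accepts, and the particular mapping F, are hypothetical. The dictionary F plays the role of f, and it must be one-one so that it can be inverted.

```python
BLEEP = "BLEEP"   # marker returned when a process refuses an event

def VMS(x):
    # VMS = (coin -> choc -> VMS)
    return (lambda y: VMS if y == "choc" else BLEEP) if x == "coin" else BLEEP

def rename(f, P):
    """f(P): engages in f(c) whenever P would have engaged in c."""
    inverse = {new: old for old, new in f.items()}
    assert len(inverse) == len(f), "f must be one-one"
    def renamed(y):
        if y not in inverse:
            return BLEEP                   # y is not in f(αP)
        following = P(inverse[y])
        return BLEEP if following is BLEEP else rename(f, following)
    return renamed

def star(f, s):
    """f*(s): apply f to every event of the trace s."""
    return [f[c] for c in s]

def accepts(process, trace):
    """True if `trace` is a trace of `process`."""
    for event in trace:
        process = process(event)
        if process is BLEEP:
            return False
    return True

F = {"coin": "token", "choc": "bar"}       # an illustrative renaming
NEWVMS = rename(F, VMS)
print(accepts(NEWVMS, star(F, ["coin", "choc", "coin"])))
print(accepts(NEWVMS, ["coin"]))
```

The first check illustrates traces(f(P)) = { f∗(s) | s ∈ traces(P) } on one trace; the second shows that the old event names are no longer in the alphabet of the renamed process.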

Examples

X1 After a few years, the price of everything goes up. To represent the effect of inflation, we define a function f by the following equations

    f(in2p) = in10p        f(large) = large
    f(in1p) = in5p         f(small) = small
    f(out1p) = out5p

The new vending machine is

    NEWVMC = f(VMC)

X2 A counter behaves like CT0 (1.1.4 X2), except that it moves right and left instead of up and down

    f(up) = right,  f(down) = left,  f(around) = around,
    LR0 = f(CT0)

The main reason for changing event names of processes in this fashion is to enable them to be composed usefully in concurrent combination.

X3 A counter moves left, right, up or down on an infinite board with boundaries at the left and at the bottom. It starts at the bottom left corner. On this square alone, it can turn around. As in 2.3 X2, vertical and horizontal movements can be modelled as independent actions of separate processes, but around requires simultaneous participation of both

    LRUD = LR0 || CT0

X4 We wish to connect two instances of COPYBIT (1.1.3 X7) in series, so that each bit output by the first is simultaneously input by the second. First, we need to change the names of the events used for internal communication; we therefore introduce two new events mid.0 and mid.1, and define the functions f and g to change the output of one process and the input of the other

    f(out.0) = g(in.0) = mid.0
    f(out.1) = g(in.1) = mid.1
    f(in.0) = in.0,    f(in.1) = in.1
    g(out.0) = out.0,  g(out.1) = out.1

The answer we want is

    CHAIN2 = f(COPYBIT) || g(COPYBIT)

Note that each output of 0 or 1 by the left operand of || is (by the definition of f and g) the very same event (mid.0 or mid.1) as the input of the same 0 or 1 by the right operand. This models the synchronised communication of binary digits on a channel which connects the two operands, as shown in Figure 2.7.

[Figure 2.7: two COPYBIT processes in series, with inputs in.0 and in.1, connecting channel mid.0 and mid.1, and outputs out.0 and out.1]

The left operand offers no choice of which value is transmitted on the connecting channel, whereas the right operand is prepared to engage in either of the events mid.0 or mid.1. It is therefore the outputting process that determines on each occasion which of these two events will occur. This method of communication between concurrent processes will be generalised in Chapter 4.

Note that the internal communications mid.0 and mid.1 are still in the alphabet of the composite processes, and can be observed (or even perhaps controlled) by its environment. Sometimes one wishes to ignore or conceal such internal events; in the general case such concealment may introduce nondeterminism, so this topic is postponed to Section 3.5.

X5 We wish to represent the behaviour of a Boolean variable used by a computer program. The events in its alphabet are

    assign0: assignment of value zero to the variable
    assign1: assignment of value one to the variable
    fetch0: access of the value of the variable at a time when it is zero
    fetch1: access of the value of the variable at a time when it is one

For example.9. this transformation preserves the structure of the tree and the important distinctness of labels on all branches leading from the same node.1.1 . An attempt to fetch an unassigned value would result in deadlock—which is probably the kindest failure mode for incorrect programs.4 X1). a picture of NEWVMC is shown in Figure 2. The tree picture of f (P ) may be constructed from the tree picture of P by simply applying the function f to the labels on all the branches.1 Laws Change of symbol by application of a one-one function does not change the structure of the behaviour of a process.6.8 2. This is reﬂected by the fact that function application distributes through all the other operators. as described in the following laws. The following auxiliary deﬁnitions are used f (B) = { f (x) | x ∈ B } f −1 is the inverse of f f ◦ g is the composition of f and g f∗ is deﬁned in Section 1. Because f is a one-one function.64 2 Concurrency The behaviour of the variable is remarkably similar to that of the drinks dispenser (1. Note that the Boolean variable refuses to give its value until after a value has been ﬁrst assigned. so we deﬁne BOOL = f (DD) where the deﬁnition of f is a trivial exercise. NEWVMC in5p in10p large small in5p small in10p in5p out5p in10p large in10p in5p in10p in5p Figure 2.8. because the simplest postmortem will pinpoint the error.

(The need for f⁻¹ in the following laws is an important reason for insisting that f is an injection.)

After change of symbol, STOP still performs no event from its changed alphabet

    L1  f(STOP_A) = STOP_f(A)

In the case of a choice, the symbols offered for selection are changed, and the subsequent behaviour is similarly changed

    L2  f(x : B → P(x)) = (y : f(B) → f(P(f⁻¹(y))))

The use of f⁻¹ on the right-hand side may need explanation. Recall that P is a function delivering a process depending on selection of some x from the set B. But the variable y on the right-hand side is selected from the set f(B). The corresponding event for P is f⁻¹(y), which is in B (since y ∈ f(B)). The behaviour of P after this event is P(f⁻¹(y)), and the actions of this process must continue to be changed by application of f.

Change of symbol simply distributes through parallel composition

    L3  f(P || Q) = f(P) || f(Q)

Change of symbol distributes in a slightly more complex way over recursion, changing the alphabet in the appropriate way

    L4  f(µ X : A • F(X)) = (µ Y : f(A) • f(F(f⁻¹(Y))))

Again, the use of f⁻¹ on the right-hand side may be puzzling. Recall that the validity of the recursion on the left-hand side requires that F is a function which takes as argument a process with alphabet A, and delivers a process with the same alphabet. On the right-hand side, Y is a variable ranging over processes with alphabet f(A), and cannot be used as an argument to F until its alphabet has been changed back to A. This is done by applying the inverse function f⁻¹. Now F(f⁻¹(Y)) has alphabet A, so an application of f will transform the alphabet to f(A), thus ensuring the validity of the recursion on the right-hand side of the law.

The composition of two changes of symbol is defined by the composition of the two symbol-changing functions

    L5  f(g(P)) = (f ∘ g)(P)

The traces of a process after change of symbol are obtained simply by changing the individual symbols in every trace of the original process

    L6  traces(f(P)) = { f*(s) | s ∈ traces(P) }

The explanation of the next and final law is similar to that of L6

    L7  f(P) / f*(s) = f(P / s)
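Law L2 and the tree picture of f(P) suggest a direct recursive relabelling. In the sketch below a finite process is represented as a nested dictionary from events to subtrees, with the empty dictionary standing for STOP; this representation, the name rename and the sample machine are assumptions made for illustration. The check inside the loop shows what goes wrong when f is not one-one: two branches from the same node would receive the same label.

```python
def rename(f, tree):
    """Tree picture of f(P): apply f to every branch label; STOP is the empty dict."""
    renamed = {}
    for event, subtree in tree.items():
        label = f[event]
        if label in renamed:
            raise ValueError(f"two branches from one node would both be labelled {label!r}")
        renamed[label] = rename(f, subtree)
    return renamed


# Hypothetical finite fragment of a machine: a coin, then a choice of chocolate.
P = {"in2p": {"large": {}, "small": {"out1p": {}}}}
f = {"in2p": "in10p", "large": "large", "small": "small", "out1p": "out5p"}

fP = rename(f, P)
```

The check only detects clashes between sibling branches, which is the property the tree picture needs; a full injectivity test on the alphabet would be stricter.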

2.6.2 Process labelling

Change of symbol is particularly useful in constructing groups of similar processes which operate concurrently in providing identical services to their common environment, but which do not interact with each other in any way at all. This means that they must all have different and mutually disjoint alphabets. To achieve this, each process is labelled by a different name; and each event of a labelled process is also labelled by its name. A labelled event is a pair l.x where l is a label, and x is the symbol standing for the event.

A process P labelled by l is denoted by

    l : P

It engages in the event l.x whenever P would have engaged in x. The function required to define l : P is

    f_l(x) = l.x    for all x in αP

and the definition of labelling is

    l : P = f_l(P)

Examples

X1 A pair of vending machines standing side by side

    (left : VMS) || (right : VMS)

The alphabets of the two processes are disjoint, and every event that occurs is labelled by the name of the machine on which it occurred. If the machines were not named before being placed in parallel, every event would require participation of both of them, and the pair would be indistinguishable from a single machine; this is a consequence of the fact that

    (VMS || VMS) = VMS

The labelling of processes permits them to be used in the manner of variables in a high-level programming language, declared locally in the block of program which uses them.

X2 The behaviour of a Boolean variable is modelled by BOOL (2.6 X5). The behaviour of a block of program is represented by a process USER. This process assigns and accesses the values of two Boolean variables named b and c. Thus αUSER includes such compound events as

    b.assign.0: to assign value zero to b
    c.fetch.1: to access the current value of c when it is one

The USER process runs in parallel with its two Boolean variables

    b : BOOL || c : BOOL || USER

Inside the USER program, the following effects may be achieved

    b := false ; P    by  (b.assign.0 → P)
    b := ¬ c ; P      by  (c.fetch.0 → b.assign.1 → P | c.fetch.1 → b.assign.0 → P)

Note how the current value of the variable is discovered by allowing the variable to make the choice between fetch.0 and fetch.1; and this choice affects in an appropriate way the subsequent behaviour of the USER.

In X2 and the following examples it would have been more convenient to define the effect of the single assignment, e.g.,

    b := false

rather than the pair of commands

    b := false ; P

which explicitly mentions the rest of the program P. The means of doing this will be introduced in Chapter 5.

X3 A USER process needs two count variables named l and m. They are initialised to 0 and 3 respectively. The USER process increments each variable by l.up or m.up, and decrements it (when positive) by l.down and m.down. A test of zero is provided by the events l.around and m.around. Thus the process CT (1.1.4 X2) can be used after appropriate labelling by l and by m

    (l : CT0 || m : CT3 || USER)

Within the USER process the following effects (expressed in conventional notation) can be achieved

    (m := m + 1 ; P)           by  (m.up → P)
    if l = 0 then P else Q     by  (l.around → P | l.down → l.up → Q)

Note how the test for zero works: an attempt is made by l.down to reduce the count by one, at the same time as attempting l.around. The count selects between these two events: if the value is zero, l.around is selected; if non-zero, the other. But in the latter case, the value of the count has been decremented, and it must immediately be restored to its original value by l.up. In the next example, restoration of the original value is more laborious.
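The test for zero in X3 can be acted out with a small object standing in for a labelled counter. This is a sketch under assumptions: the class Counter, its menu and do methods, and the string encoding of compound events such as "l.down" are all illustrative, and the counter is assumed to behave like CT_n of 1.1.4 X2.

```python
class Counter:
    """A labelled counter: offers 'around' only at zero and 'down' only when positive."""

    def __init__(self, label, value=0):
        self.label, self.value = label, value

    def menu(self):
        base = {"up", "around"} if self.value == 0 else {"up", "down"}
        return {f"{self.label}.{e}" for e in base}

    def do(self, event):
        if event not in self.menu():
            raise ValueError(f"{event} is refused")
        name = event.split(".", 1)[1]
        self.value += {"up": 1, "down": -1, "around": 0}[name]


def is_zero(counter):
    """USER side of (l.around -> P | l.down -> l.up -> Q); True selects the P branch."""
    l = counter.label
    offered = {f"{l}.around", f"{l}.down"}
    (chosen,) = offered & counter.menu()   # the counter, not the user, resolves the choice
    counter.do(chosen)
    if chosen == f"{l}.around":
        return True
    counter.do(f"{l}.up")                  # restore the value that the test decremented
    return False


l, m = Counter("l", 0), Counter("m", 3)
```

Exactly one of the two offered events is ever available, so the unpacking of the single chosen event cannot fail for a counter of this kind.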

    (m := m + l ; P)

is implemented by ADD, where ADD is defined recursively

    ADD = DOWN_0
    DOWN_i = (l.down → DOWN_{i+1} | l.around → UP_i)
    UP_0 = P
    UP_{i+1} = l.up → m.up → UP_i

The DOWN_i processes discover the initial value of l by decrementing it to zero. The UP_i processes then add the discovered value to both m and to l, thereby restoring l to its initial value and adding this value to m.

The effect of an array variable can be achieved by a collection of concurrent processes, each labelled by its index within the array.

X4 The purpose of the process EL is to record whether the event in has occurred or not. On the first occurrence of in it responds no and on each subsequent occurrence it responds yes

    αEL = {in, no, yes}
    EL = in → no → µ X • (in → yes → X)

This process can be used in an array to mimic the behaviour of a set of small integers

    SET3 = (0 : EL) || (1 : EL) || (2 : EL) || (3 : EL)

The whole array can be labelled yet again before use

    m : SET3 || USER

Each event in α(m : SET3) is a triple, e.g., m.2.in. Within the USER process, the effect of

    if 2 ∈ m then P else (m := m ∪ {2} ; Q)

may be achieved by

    m.2.in → (m.2.yes → P | m.2.no → Q)

2.6.3 Implementation

To implement symbol change in general, we need to know the inverse g of the symbol-changing function f. We also need to ensure that g will give the special answer "BLEEP when applied to an argument outside the range of f. The implementation is based upon 2.6.1 L4.

    change(g, P) = λ x • if g(x) = "BLEEP then "BLEEP
                         else if P(g(x)) = "BLEEP then "BLEEP
                         else change(g, P(g(x)))

The special case of process labelling can be implemented more simply. The compound event l.x is represented as the pair of atoms cons("l, "x). Now (l : P) is implemented by

    label(l, P) = λ y • if null(y) or atom(y) then "BLEEP
                        else if car(y) ≠ l then "BLEEP
                        else if P(cdr(y)) = "BLEEP then "BLEEP
                        else label(l, P(cdr(y)))

2.6.4 Multiple labelling

The definition of labelling can be extended to allow each event to take any label l from a set L. If P is a process, (L : P) is defined as a process which behaves exactly like P, except that it engages in the event l.c (where l ∈ L and c ∈ αP) whenever P would have done c. The choice of the label l is made independently on each occasion by the environment of (L : P).

Example

X1 A lackey is a junior footman, who helps his single master to and from his seat, and stands behind his chair while he eats

    αLACKEY = {sits down, gets up}
    LACKEY = (sits down → gets up → LACKEY)

To teach the lackey to share his services among five masters (but serving only one at a time), we define

    L = {0, 1, 2, 3, 4}
    SHARED LACKEY = (L : LACKEY)

The shared lackey could be employed to protect the dining philosophers from deadlock when the footman (2.5.3) is on holiday. Of course the philosophers may go hungrier during the holiday, since only one of them is allowed to the table at a time.

If L contains more than one label, the tree picture of L : P is similar to that for P; but it is much more bushy in the sense that there are many more branches leading from each node. For example, the picture of the LACKEY is a single trunk with no branches (Figure 2.9).

[Figure 2.9: tree picture of LACKEY, a single trunk labelled sits down, gets up, sits down]

However, the picture of {0, 1} : LACKEY is a complete binary tree (Figure 2.10); the tree for the SHARED LACKEY is even more bushy.

[Figure 2.10: tree picture of {0, 1} : LACKEY, a binary tree with branches labelled 0.sits down, 1.sits down, 0.gets up, 1.gets up]

In general, multiple labelling can be used to share the services of a single process among a number of other labelled processes, provided that the set of labels is known in advance. This technique will be exploited more fully in Chapter 6.
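The LISP-style definitions of change and label in 2.6.3 carry over almost word for word to a language with first-class functions. The sketch below is an illustrative transcription into Python, not the book's own code: a process is a function from an event to its successor process, the string "BLEEP" marks a refused event, a compound event l.x is a two-element tuple, and the helpers prefix and stop together with the sample machine are assumptions added to make the example self-contained.

```python
BLEEP = "BLEEP"


def stop(x):
    """STOP refuses every event."""
    return BLEEP


def prefix(c, p):
    """(c -> p): accepts c and then behaves like p."""
    return lambda x: p if x == c else BLEEP


def change(g, p):
    """f(p), given the inverse g of f; g must return BLEEP outside the range of f."""
    def q(x):
        y = g(x)
        if y == BLEEP:
            return BLEEP
        nxt = p(y)
        if nxt == BLEEP:
            return BLEEP
        return change(g, nxt)
    return q


def label(l, p):
    """(l : p), with the compound event l.x represented as the tuple (l, x)."""
    def q(y):
        if not (isinstance(y, tuple) and len(y) == 2):
            return BLEEP
        if y[0] != l:
            return BLEEP
        nxt = p(y[1])
        if nxt == BLEEP:
            return BLEEP
        return label(l, nxt)
    return q


# Hypothetical one-shot machine and a renaming given by its inverse g.
vm = prefix("coin", prefix("choc", stop))
g = lambda y: {"in5p": "coin", "large": "choc"}.get(y, BLEEP)
new_vm = change(g, vm)
left_vm = label("left", vm)
```

Both functions wrap the successor process again after each accepted event, which is what keeps the renaming or the label in force for the whole of the subsequent behaviour.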

2. i. S (tr αP ) αQ ) is a trace of Q .e. It follows by 2..7 Speciﬁcations 71 2. and consequently it satisﬁes S . c} and P = (a → c → P ) Q = (c → b → Q ) We wish to prove that (P || Q ) sat 0 ≤ tr ↓ a − tr ↓ b ≤ 2 The proof of 1. c} αQ = {b.1 X1) Let αP = {a.3.2 X1 can obviously be adapted to show that P sat (0 ≤ tr ↓ a − tr ↓ c ≤ 1) and Q sat (0 ≤ tr ↓ c − tr ↓ b ≤ 1) By L1 it follows that (P || Q ) .3 L1 that (tr αP ) is a trace of P . and suppose we have proved that P sat S (tr ) and that Q sat T (tr ). (tr T (tr αQ ) This argument holds for every trace of (P || Q ).3. Consequently we may deduce (P || Q ) sat (S (tr αP ) ∧ T (tr αQ )) This informal reasoning is summarised in the law L1 If and then P sat S (tr ) Q sat T (tr ) (P || Q ) sat (S (tr αP ) ∧ T (tr αQ )) Example X1 (See 2. Let tr be a trace of (P || Q ). so Similarly.10.7 Speciﬁcations Let P and Q be processes intended to run concurrently.

More powerful laws will be given in Section 3. Meanwhile. However.1 X1. as in Section 2. 2. . Example X2 The process (P || Q ) deﬁned in X1 will never stop. Then f −1 (tr ) is a trace of P . we have stated a large number of laws. Another method is to show that a process deﬁned by the parallel combinator is equivalent to a non-stopping process deﬁned without this combinator.8 Mathematical theory of deterministic processes In our description of processes. as was done in 2.72 2 Concurrency sat (0 ≤ (tr 0 ≤ (tr αP ) ↓ a − (tr αP ) ↓ c ≤ 1 ∧ αQ ) ↓ b ≤ 1) A) ↓ a = tr ↓ a whenever a ∈ A. The antecedent of L3 states that every trace of P satisﬁes S . It follows that f −1∗ (tr ) satisﬁes S . because αP ∩ αQ = {c} The proof rule for change of symbol is L3 If then P sat S (tr ) f (P ) sat S (f −1∗ (tr )) The use of f −1 in the consequent of this law may need extra explanation.5. reasoning based on these laws can never prove absence of deadlock. For a reader with the instincts of an applied mathematician or engineer. But the question also arises. then (P || Q ) never stops.4.] αQ ) ↓ c − (tr ⇒ 0 ≤ tr ↓ a − tr ↓ b ≤ 2 [since (tr Since the laws for sat allow STOP to satisfy every satisﬁable speciﬁcation. such as L2 If P and Q never stop and if (αP ∩ αQ ) contains at most one event. that may be enough. Wherever possible. which is exactly what is stated by the consequent of L3.7. Let tr be a trace of f (P ). one should appeal to some general law. such proofs involve long and tedious algebraic transformations.3. one way to eliminate the risk of stoppage is by careful proof. The laws have been justiﬁed (if at all) by informal explanations of why we should expect and want them to be true. and we have occasionally used them in proofs. are these laws in fact true? Are they even consistent? Should there be more of them? Or are they complete in the sense that they permit all true facts about processes to be proved from them? Could one manage with fewer and simpler laws? 
These are questions for which an answer must be sought in a deeper mathematical investigation.
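The specification proved for (P || Q) in 2.7 X1, and the claim of X2 that the process never stops, can be checked mechanically on a bounded set of traces. This is a sanity check rather than a proof, since only traces up to a chosen length are enumerated; the state-table encoding of P and Q and the helper names are assumptions of the sketch.

```python
P = {0: {"a": 1}, 1: {"c": 0}}          # P = (a -> c -> P)
Q = {0: {"c": 1}, 1: {"b": 0}}          # Q = (c -> b -> Q)
ALPHA_P, ALPHA_Q = {"a", "c"}, {"b", "c"}


def moves(p, q):
    """Events of (P || Q) possible in state (p, q), with the resulting states."""
    for e in sorted(ALPHA_P | ALPHA_Q):
        in_p, in_q = e in ALPHA_P, e in ALPHA_Q
        if in_p and e not in P[p]:
            continue                     # P must take part but refuses
        if in_q and e not in Q[q]:
            continue                     # Q must take part but refuses
        yield e, (P[p][e] if in_p else p), (Q[q][e] if in_q else q)


def traces(depth):
    """All traces of (P || Q) with at most `depth` events."""
    frontier = [((), 0, 0)]
    yield ()
    for _ in range(depth):
        nxt = []
        for tr, p, q in frontier:
            for e, p2, q2 in moves(p, q):
                nxt.append((tr + (e,), p2, q2))
                yield tr + (e,)
        frontier = nxt


all_traces = list(traces(10))
spec_holds = all(0 <= tr.count("a") - tr.count("b") <= 2 for tr in all_traces)
never_stops = all(list(moves(p, q)) for p in P for q in Q)
```

Here tr.count("a") plays the role of tr ↓ a. The never_stops check looks at all four state pairs, which in this small example are all reachable.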

2.8.1 The basic definitions

In constructing a mathematical model of a physical system, it is a good strategy to define the basic concepts in terms of attributes that can be directly or indirectly observed or measured. For a deterministic process P, we are familiar with two such attributes

    αP: the set of events in which the process is in principle capable of engaging
    traces(P): the set of all sequences of events in which the process can actually participate if required.

We have explained how these two sets must satisfy the three laws 1.8.1 L6, L7, L8. Consider now an arbitrary pair of sets (A, S) which satisfy these three laws. This pair uniquely identifies a process P whose traces are S, constructed according to the following definitions. Let

    P⁰ = { x | ⟨x⟩ ∈ S }

and, for all x in P⁰, let P(x) be the process whose traces are

    { t | ⟨x⟩⌢t ∈ S }

Then

    αP = A  and  P = (x : P⁰ → P(x))

Furthermore, the pair (A, S) can be recovered by the equations

    A = αP
    S = traces(x : P⁰ → P(x))

Thus there is a one-one correspondence between each process P and the pairs of sets (αP, traces(P)). In mathematics, this is a sufficient justification for identifying the two concepts, by using one of them as the definition of the other.

D0 A deterministic process is a pair

    (A, S)

where A is any set of symbols and S is any subset of A* which satisfies the two conditions

    C0  ⟨⟩ ∈ S
    C1  ∀ s, t • s⌢t ∈ S ⇒ s ∈ S

The simplest example of a process which meets this definition is the one that does nothing

    D1  STOP_A = (A, {⟨⟩})

At the other extreme there is the process that will do anything at any time

    D2  RUN_A = (A, A*)

The various operators on processes can now be formally defined by showing how the alphabet and traces of the result are derived from the alphabet and traces of the operands

    D3  (x : B → (A, S(x))) = (A, {⟨⟩} ∪ { ⟨x⟩⌢s | x ∈ B ∧ s ∈ S(x) })    provided B ⊆ A
    D4  (A, S) / s = (A, { t | (s⌢t) ∈ S })    provided s ∈ S
    D5  µ X : A • F(X) = (A, ⋃_{n≥0} traces(F^n(STOP_A)))    provided F is a guarded expression
    D6  (A, S) || (B, T) = (A ∪ B, { s | s ∈ (A ∪ B)* ∧ (s ↾ A) ∈ S ∧ (s ↾ B) ∈ T })
    D7  f(A, S) = (f(A), { f*(s) | s ∈ S })    provided f is one-one

Of course, it is necessary to prove that the right-hand sides of these definitions are actually processes, i.e., that they satisfy the conditions C0 and C1 of D0. Fortunately, that is quite easy.

In Chapter 3, it will become apparent that D0 is not a fully adequate definition of the concept of a process, because it does not represent the possibility of nondeterminism. Consequently, a more general and more complicated definition will be required. All laws for nondeterministic processes are true for deterministic processes as well. But deterministic processes obey some additional laws, for example

    P || P = P

To avoid confusion, in this book we have avoided quoting such laws; so all quoted laws may safely be applied to nondeterministic processes as well as deterministic ones (except 2.2.1 L3A, 2.2.3 L1, 2.3.1 L3A, L3B, 2.3.3 L1, L2, L3A, which are false for processes containing CHAOS (3.8)).

2.8.2 Fixed point theory

The purpose of this section is to give an outline of a proof of the fundamental theorem of recursion, that a recursively defined process (2.8.1 D5) is indeed a solution of the corresponding recursive equation, i.e.,

    µ X • F(X) = F(µ X • F(X))

The treatment follows the fixed-point theory of Scott.
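Definitions D0, D1, D3 and D4 of 2.8.1 are easy to exercise on finite trace sets, which also makes the remark about checking C0 and C1 concrete. The following is a sketch under assumptions: a process is a pair of a Python set and a set of tuples, only finite processes are covered, and the names is_process, choice and after, as well as the sample events, are illustrative.

```python
def is_process(alphabet, traces):
    """Conditions of D0: traces drawn from the alphabet, C0 (empty trace), C1 (prefix closure)."""
    return (() in traces
            and all(set(t) <= alphabet for t in traces)
            and all(t[:k] in traces for t in traces for k in range(len(t))))


def stop(alphabet):
    """D1: STOP_A = (A, {<>})."""
    return alphabet, {()}


def choice(alphabet, branches):
    """D3: branches maps each x in B to the process (A, S(x))."""
    assert set(branches) <= alphabet          # proviso B is a subset of A
    traces = {()} | {(x,) + s for x, (_, S) in branches.items() for s in S}
    return alphabet, traces


def after(process, s):
    """D4: (A, S) / s, defined only when s is a trace of the process."""
    alphabet, S = process
    assert s in S
    return alphabet, {t[len(s):] for t in S if t[:len(s)] == s}


A = {"coin", "choc"}
inner = choice(A, {"choc": stop(A)})          # choc -> STOP
vm = choice(A, {"coin": inner})               # coin -> choc -> STOP
```

With these definitions the familiar identity (x → P) / ⟨x⟩ = P can be observed directly, here as after(vm, ("coin",)) == inner.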

First, we need to specify an ordering relationship ⊑ among processes

    D1  (A, S) ⊑ (B, T) = (A = B ∧ S ⊆ T)

Two processes are comparable in this ordering if they have the same alphabet, and one of them can do everything done by the other, and maybe more. This ordering is a partial order in the sense that

    L1  P ⊑ P
    L2  P ⊑ Q ∧ Q ⊑ P ⇒ P = Q
    L3  P ⊑ Q ∧ Q ⊑ R ⇒ P ⊑ R

A chain in a partial order is an infinite sequence of elements

    { P_0, P_1, P_2, . . . }

such that

    P_i ⊑ P_{i+1}    for all i

We define the limit (least upper bound) of such a chain

    ⊔_{i≥0} P_i = (αP_0, ⋃_{i≥0} traces(P_i))

In future, we will apply the limit operator ⊔ only to sequences of processes that form a chain.

A partial order is said to be complete if it has a least element, and all chains have a least upper bound. The set of all processes with a given alphabet A forms a complete partial order (c.p.o.), since it satisfies the laws

    L4  STOP_A ⊑ P    provided αP = A
    L5  P_i ⊑ ⊔_{i≥0} P_i
    L6  (∀ i ≥ 0 • P_i ⊑ Q) ⇒ (⊔_{i≥0} P_i) ⊑ Q

Furthermore the definition of µ (2.8.1 D5) can be reformulated in terms of a limit

    L7  µ X : A • F(X) = ⊔_{i≥0} F^i(STOP_A)

A function F from one c.p.o. to another one (or the same one) is said to be continuous if it distributes over the limits of all chains, i.e.,

    F(⊔_{i≥0} P_i) = ⊔_{i≥0} F(P_i)    if { P_i | i ≥ 0 } is a chain

(All continuous functions are monotonic in the sense that

    P ⊑ Q ⇒ F(P) ⊑ F(Q)    for all P and Q,

so that the right-hand side of the previous equation is also the limit of an ascending chain.) A function G of several arguments is defined as continuous if it is continuous in each of its arguments separately, for example

    G((⊔_{i≥0} P_i), Q) = ⊔_{i≥0} G(P_i, Q)    for all Q

and

    G(Q, (⊔_{i≥0} P_i)) = ⊔_{i≥0} G(Q, P_i)    for all Q

The composition of continuous functions is also continuous; and indeed any expression constructed by application of any number of continuous functions to any number and combination of variables is continuous in each of those variables. For example, if F, G and H are continuous

    G(F(X), H(X, Y))

is continuous in X, i.e.,

    G(F(⊔_{i≥0} P_i), H((⊔_{i≥0} P_i), Y)) = ⊔_{i≥0} G(F(P_i), H(P_i, Y))    for all Y

All the operators (except /) defined in D3 to D7 are continuous in the sense defined above

    L8   (x : B → (⊔_{i≥0} P_i(x))) = ⊔_{i≥0} (x : B → P_i(x))
    L9   µ X : A • F(X, (⊔_{i≥0} P_i)) = ⊔_{i≥0} µ X : A • F(X, P_i)    provided F is continuous
    L10  (⊔_{i≥0} P_i) || Q = Q || (⊔_{i≥0} P_i) = ⊔_{i≥0} (Q || P_i)
    L11  f(⊔_{i≥0} P_i) = ⊔_{i≥0} f(P_i)

Consequently if F(X) is any expression constructed solely in terms of these operators, it will be continuous in X. Now it is possible to prove the basic fixed-point theorem

    F(µ X : A • F(X))
      = F(⊔_{i≥0} F^i(STOP_A))         [def. µ]
      = ⊔_{i≥0} F(F^i(STOP_A))         [continuity F]
      = ⊔_{i≥1} F^i(STOP_A)            [def. F^{i+1}]
      = ⊔_{i≥0} F^i(STOP_A)            [STOP_A ⊑ F(STOP_A)]
      = µ X : A • F(X)                 [def. µ]

This proof has relied only on the fact that F is continuous. The guardedness of F is necessary only to establish uniqueness of the solution.
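The chain of approximations F^i(STOP_A) in L7 can be computed explicitly for a simple guarded body. The sketch below works on trace sets only (alphabets are left implicit) and uses the illustrative body F(X) = coin → choc → X, which is an assumption of the example; the limit is approximated by a finite union, so the fixed-point equation is checked only up to a bounded trace length.

```python
def F(traces):
    """Trace set of (coin -> choc -> X), given the trace set of X."""
    return {(), ("coin",)} | {("coin", "choc") + t for t in traces}


def approximations(n):
    """Trace sets of F^0(STOP), F^1(STOP), ..., F^n(STOP)."""
    chain, current = [], {()}
    for _ in range(n + 1):
        chain.append(current)
        current = F(current)
    return chain


chain = approximations(4)
ascending = all(a <= b for a, b in zip(chain, chain[1:]))   # subset order, as in D1
limit = set().union(*chain)          # finite stand-in for the least upper bound

# F^4(STOP) describes behaviour up to 8 events, so compare F(limit) with limit up to there.
fixed_up_to_8 = {t for t in F(limit) if len(t) <= 8} == limit
```

Each application of F extends the described behaviour by two events, so the approximations agree with the eventual fixed point on ever longer traces.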

2.8.3 Unique solutions

In this section we treat more formally the reasoning given in Section 1.1.2 to show that an equation defining a process by guarded recursion has only one solution. In doing so, we shall make explicit more general conditions for uniqueness of such solutions. For simplicity, we deal only with single equations; the treatment easily extends to sets of simultaneous equations.

If P is a process and n is a natural number, we define (P ↾ n) as a process which behaves like P for its first n events, and then stops; more formally

    (A, S) ↾ n = (A, { s | s ∈ S ∧ #s ≤ n })

It follows that

    L1  P ↾ 0 = STOP
    L2  P ↾ n ⊑ P ↾ (n + 1) ⊑ P
    L3  P = ⊔_{n≥0} P ↾ n
    L4  ⊔_{n≥0} P_n = ⊔_{n≥0} (P_n ↾ n)

Let F be a monotonic function from processes to processes. F is said to be constructive if

    F(X) ↾ (n + 1) = F(X ↾ n) ↾ (n + 1)    for all X

This means that the behaviour of F(X) on its first n + 1 steps is determined by the behaviour of X on its first n steps only; so if s ≠ ⟨⟩

    s ∈ traces(F(X)) ≡ s ∈ traces(F(X ↾ (#s − 1)))

Prefixing is the primary example of a constructive function, since

    (c → P) ↾ (n + 1) = (c → (P ↾ n)) ↾ (n + 1)

General choice is also constructive

    (x : B → P(x)) ↾ (n + 1) = (x : B → (P(x) ↾ n)) ↾ (n + 1)

The identity function I is not constructive, since

    I(c → P) ↾ 1 = c → STOP
                 ≠ STOP
                 = I((c → P) ↾ 0) ↾ 1

We can now formulate the fundamental theorem

    L5  Let F be a constructive function. The equation
            X = F(X)
        has only one solution for X.

Proof: Let X be an arbitrary solution. First by induction we prove the lemma that

    X ↾ n = F^n(STOP) ↾ n

Base case.

    X ↾ 0 = STOP = STOP ↾ 0 = F^0(STOP) ↾ 0

Induction step.

    X ↾ (n + 1)
      = F(X) ↾ (n + 1)                      [since X = F(X)]
      = F(X ↾ n) ↾ (n + 1)                  [F is constructive]
      = F(F^n(STOP) ↾ n) ↾ (n + 1)          [hypothesis]
      = F(F^n(STOP)) ↾ (n + 1)              [F is constructive]
      = F^{n+1}(STOP) ↾ (n + 1)             [def. F^n]

Now we go back to the main theorem

    X = ⊔_{n≥0} (X ↾ n)                     [L3]
      = ⊔_{n≥0} F^n(STOP) ↾ n               [just proved]
      = ⊔_{n≥0} F^n(STOP)                   [L4]
      = µ X • F(X)                          [2.8.2 L7]

Thus all solutions of X = F(X) are equal to µ X • F(X); or in other words, µ X • F(X) is the only solution of the equation.

The usefulness of this theorem is much increased if we can recognise which functions are constructive and which are not. Let us define a nondestructive function G as one which satisfies

    G(P) ↾ n = G(P ↾ n) ↾ n    for all n and P.

Alphabet transformation is nondestructive in this sense, since

    f(P) ↾ n = f(P ↾ n) ↾ n

So is the identity function. Any monotonic function which is constructive is also nondestructive. But the after operator is destructive, since

    ((c → c → STOP) / ⟨c⟩) ↾ 1 = c → STOP
                               ≠ STOP
                               = (c → STOP) / ⟨c⟩
                               = (((c → c → STOP) ↾ 1) / ⟨c⟩) ↾ 1

Any composition of nondestructive functions (G and H) is also nondestructive, because

    G(H(P)) ↾ n = G(H(P) ↾ n) ↾ n = G(H(P ↾ n) ↾ n) ↾ n = G(H(P ↾ n)) ↾ n

Even more important, any composition of a constructive function with nondestructive functions is also constructive. So if all of F, G, …, H are nondestructive and just one of them is constructive, then

    F(G(…(H(X))…))

is a constructive function of X.

The above reasoning extends readily to functions of more than one argument. For example parallel composition is nondestructive (in both its arguments) because

    (P || Q) ↾ n = ((P ↾ n) || (Q ↾ n)) ↾ n

Let E be an expression containing the process variable X. Then E is said to be guarded in X if every occurrence of X in E has a constructive function applied to it, and no destructive function. Thus the following expression is constructive in X

    (c → X | d → f(X || P) | e → (f(X) || Q)) || ((d → X) || R)

The important consequence of this is that constructiveness can be defined syntactically by the following conditions for guardedness

D1 Expressions constructed solely by means of the operators concurrency, symbol change, and general choice are said to be guard-preserving.

D2 An expression which does not contain X is said to be guarded in X.

D3 A general choice

    (x : B → P(X, x))

is guarded in X if P(X, x) is guard-preserving for all x.

D4 A symbol change f(P(X)) is guarded in X if P(X) is guarded in X.

D5 A concurrent system P(X) || Q(X) is guarded in X if both P(X) and Q(X) are guarded in X.

Finally, we reach the conclusion

    L6  If E is guarded in X, then the equation
            X = E
        has a unique solution.
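The definitions of constructive and nondestructive functions in 2.8.3 can be tested on trace sets. A check of this kind on one sample process and a bounded range of n can refute a property but cannot establish it for all X, so the sketch is only a way of replaying the book's three examples (prefixing, identity, after); the helper names and the sample process c → c → STOP are assumptions.

```python
def restrict(traces, n):
    """P restricted to its first n events: keep traces of length at most n."""
    return {t for t in traces if len(t) <= n}


def prefix(c):
    """The function X |-> (c -> X) on trace sets."""
    return lambda X: {()} | {(c,) + t for t in X}


def identity(X):
    return set(X)


def after_c(X):
    """X / <c> on trace sets; meaningful only when ('c',) is a trace of X."""
    return {t[1:] for t in X if t[:1] == ("c",)}


def constructive_on(F, X, max_n):
    """F(X) restricted to n+1 equals F(X restricted to n) restricted to n+1, for n < max_n."""
    return all(restrict(F(X), n + 1) == restrict(F(restrict(X, n)), n + 1)
               for n in range(max_n))


def nondestructive_on(G, X, max_n):
    """G(X) restricted to n equals G(X restricted to n) restricted to n, for n <= max_n."""
    return all(restrict(G(X), n) == restrict(G(restrict(X, n)), n)
               for n in range(max_n + 1))


X = {(), ("c",), ("c", "c")}                       # traces of c -> c -> STOP

# The book's counterexample for the after operator, at n = 1.
lhs = restrict(after_c(X), 1)                      # traces of c -> STOP
rhs = restrict(after_c(restrict(X, 1)), 1)         # traces of STOP
```

Prefixing passes the constructive check on this sample, the identity passes only the weaker nondestructive check, and the after operator fails even that, in line with the three examples in the text.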


3 Nondeterminism

3.1 Introduction

The choice operator (x : B → P(x)) is used to define a process which exhibits a range of possible behaviours; and the concurrency operator || permits some other process to make a selection between the alternatives offered in the set B. For example, the change-giving machine CH5C (1.1.3 X2) offers its customer the choice of taking his change as three small coins and one large, or two large coins and one small.

Such processes are called deterministic, because whenever there is more than one event possible, the choice between them is determined externally by the environment of the process. It is determined either in the sense that the environment can actually make the choice, or in the weaker sense that the environment can observe which choice has been made at the very moment of the choice.

Sometimes a process has a range of possible behaviours, but the environment of the process does not have any ability to influence or even observe the selection between the alternatives. For example, a different change-giving machine may give change in either of the combinations described above; but the choice between them cannot be controlled or even predicted by its user. The choice is made, as it were internally, by the machine itself, in an arbitrary or nondeterministic fashion. The environment cannot control the choice or even observe it; it cannot find out exactly when the choice was made, although it may later infer which choice was made from the subsequent behaviour of the process.

There is nothing mysterious about this kind of nondeterminism: it arises from a deliberate decision to ignore the factors which influence the selection. For example, the combination of change given by the machine may depend on the way in which the machine has been loaded with large and small coins; but we have excluded these events from the alphabet. Thus nondeterminism is useful for maintaining a high level of abstraction in descriptions of the behaviour of physical systems and machines.

3.2 Nondeterministic or

If P and Q are processes, then we introduce the notation

    P ⊓ Q    (P or Q)

to denote a process which behaves either like P or like Q, where the selection between them is made arbitrarily, without the knowledge or control of the external environment. The alphabets of the operands are assumed to be the same

    α(P ⊓ Q) = αP = αQ

Examples

X1 A change-giving machine which always gives the right change in one of two combinations

    CH5D = (in5p → ((out1p → out1p → out1p → out2p → CH5D)
                    ⊓ (out2p → out1p → out2p → CH5D)))

X2 CH5D may give a different combination of change on each occasion of use. Here is a machine that always gives the same combination, but we do not know initially which it will be (see 1.1.2 X3, X4)

    CH5E = CH5A ⊓ CH5B

Of course, after this machine gives its first coin in change, its subsequent behaviour is entirely predictable. For this reason

    CH5D ≠ CH5E

Nondeterminism has been introduced here in its purest and simplest form by the binary operator ⊓. Of course, ⊓ is not intended as a useful operator for implementing a process. It would be very foolish to build both P and Q, put them in a black bag, make an arbitrary choice between them, and then throw the other one away!

The main advantage of nondeterminism is in specifying a process. A process specified as (P ⊓ Q) can be implemented either by building P or by building Q. The choice can be made in advance by the implementor on grounds not relevant (and deliberately ignored) in the specification, such as low cost, fast response times, or early delivery. In fact, the ⊓ operator will not often be used directly even in specifications; nondeterminism arises more naturally from use of the other operators defined later in this chapter.
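The difference between CH5D and CH5E in X1 and X2 shows up when their possible traces are enumerated over a few cycles of use. The sketch below rests on two assumptions that are not stated in this section: that the traces of P ⊓ Q are the union of the traces of P and of Q, and that CH5A and CH5B (1.1.2 X3, X4) give change as out2p, out1p, out2p and as out1p, out1p, out1p, out2p respectively, matching the two combinations in CH5D. The helper traces_of and the bound K are illustrative.

```python
from itertools import product

A = ("in5p", "out2p", "out1p", "out2p")            # one cycle in the style of CH5A (assumed)
B = ("in5p", "out1p", "out1p", "out1p", "out2p")   # one cycle in the style of CH5B (assumed)


def traces_of(cycle_sequences):
    """All prefixes of the event sequences obtained by concatenating the given cycles."""
    out = set()
    for cycles in cycle_sequences:
        flat = tuple(event for cycle in cycles for event in cycle)
        out |= {flat[:k] for k in range(len(flat) + 1)}
    return out


K = 3                                              # number of cycles explored
ch5d = traces_of(product([A, B], repeat=K))        # a fresh choice on every cycle
ch5e = traces_of([(A,) * K, (B,) * K])             # one choice, made once and for all

mixed = A + B                                      # first cycle like CH5A, second like CH5B
```

Every bounded trace of CH5E is also a trace of CH5D, but the mixed trace is possible only for CH5D. This comparison looks at traces alone and says nothing about which events a machine may refuse.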

3.2.1 Laws

The algebraic laws governing nondeterministic choice are exceptionally simple and obvious. A choice between P and P is vacuous

    L1  P ⊓ P = P    (idempotence)

It does not matter in which order the choice is presented

    L2  P ⊓ Q = Q ⊓ P    (symmetry)

A choice between three alternatives can be split into two successive binary choices. It does not matter in which way this is done

    L3  P ⊓ (Q ⊓ R) = (P ⊓ Q) ⊓ R    (associativity)

The occasion on which a nondeterministic choice is made is not significant. A process which first does x and then makes a choice is indistinguishable from one which first makes the choice and then does x

    L4  x → (P ⊓ Q) = (x → P) ⊓ (x → Q)    (distribution)

The law L4 states that the prefixing operator distributes through nondeterminism. Such operators are said to be distributive. A dyadic operator is said to be distributive if it distributes through ⊓ in both its argument positions independently. Most of the operators defined so far for processes are distributive in this sense

    L5  (x : B → (P(x) ⊓ Q(x))) = (x : B → P(x)) ⊓ (x : B → Q(x))
    L6  P || (Q ⊓ R) = (P || Q) ⊓ (P || R)
    L7  (P ⊓ Q) || R = (P || R) ⊓ (Q || R)
    L8  f(P ⊓ Q) = f(P) ⊓ f(Q)

However, the recursion operator is not distributive, except in the trivial case where the operands of ⊓ are identical. This point is simply illustrated by the difference between the two processes

    P = µ X • ((a → X) ⊓ (b → X))
    Q = (µ X • (a → X)) ⊓ (µ X • (b → X))

P can make an independent choice between a and b on each iteration, so its traces include

    ⟨a, b, b, a, b⟩

Q must make a choice between always doing a and always doing b, so its traces do not include the one displayed above. However, P may choose always to do a or always to do b, so

    traces(Q) ⊆ traces(P)

In some theories, nondeterminism is obliged to be fair, in the sense that an event that infinitely often may happen eventually must happen (though there is no limit to how long it may be delayed). In our theory, there is no such concept of fairness. Because we observe only finite traces of the behaviour of a process, if an event can be postponed indefinitely, we can never tell whether it is going to happen or not. If we want to insist that the event shall happen eventually, we must state that there is a number n such that every trace longer than n contains that event. Then the process must be designed explicitly to satisfy this constraint. For example, in the process P_0 defined below, event a must always occur within n steps of its previous occurrence

    P_i = (a → P_0) ⊓ (b → P_{i+1})
    P_n = (a → P_0)

Later, we will see that both Q and P_0 are valid implementations of P.

If fairness of nondeterminism is required, this should be specified and implemented at a separate stage, for example, by ascribing nonzero probabilities to the alternatives of a nondeterministic choice. It seems highly desirable to separate complex probabilistic reasoning from concerns about the logical correctness of the behaviour of a process.

In view of laws L1 to L3 it is useful to introduce a multiple-choice operator. Let S be a finite nonempty set

    S = {i, j, ... k}

Then we define

    ⊓x:S P(x) = P(i) ⊓ P(j) ⊓ ... ⊓ P(k)

⊓x:S P(x) is meaningless when S is either empty or infinite.

3.2.2 Implementations

As mentioned above, one of the main reasons for the introduction of nondeterminism is to abstract from details of implementation. This means that there may be many different implementations of a nondeterministic process P, each with an observably different pattern of behaviour. The differences arise from different permitted resolutions of the nondeterminism inherent in P. The choice involved may be made by the implementor before the process starts, or it may be postponed until the process is actually running. For example, one way of implementing (P ⊓ Q) is to select the first operand

    or1(P, Q) = P

Another implementation is obtained by selecting the second operand, perhaps on the grounds of greater efficiency on a particular machine

    or2(P, Q) = Q

Yet a third implementation postpones the decision until the process is running; it then allows the environment to make the choice, by selecting an event that is possible for one process but not the other. If the event is possible for both processes, the decision is again postponed

    or3(P, Q) = λ x • (if P(x) = "BLEEP then
                           Q(x)
                       else if Q(x) = "BLEEP then
                           P(x)
                       else
                           or3(P(x), Q(x)))

Here we have given three different possible implementations of the same operator. In fact there are many more: for example, an implementation may behave like or3 for the first five steps; and if all these steps are possible both for P and for Q, it then arbitrarily chooses P.

Since the designer of the process (P ⊓ Q) has no control over whether P or Q will be selected, he must ensure that his system will work correctly for both choices. If there is any risk that either P or Q will deadlock with its environment, then (P ⊓ Q) also runs that same risk. The implementation or3 is the one which minimises the risk of deadlock by delaying the choice until the environment makes it, and then selecting whichever of P or Q does not deadlock. For this reason, the definition of or3 is sometimes known as angelic nondeterminism. But the price to be paid is high in terms of efficiency: if the choice between P and Q is not made on the first step, both P and Q have to be executed concurrently until the environment chooses an event which is possible for one but not the other. In the simple but extreme case of or3(P, P), this will never happen, and the inefficiency will also be extreme.

In contrast to or3, the implementations or1 and or2 are asymmetric

    or1(P, Q) ≠ or1(Q, P)

This seems to violate law 3.2.1 L2; but this is not so. The laws apply to processes, not to any particular implementation of them. In fact they assert the identity of the set of all implementations of their left and right hand sides. For example, since or3 is symmetric,

    { or1(P, Q), or2(P, Q), or3(P, Q) } = { P, Q, or3(P, Q) }
                                        = { or2(Q, P), or1(Q, P), or3(Q, P) }
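The three implementations or1, or2 and or3 can be tried out directly. The following is only a minimal sketch in Python, assuming the book's representation of a process as a function from an event to either the next process or the marker "BLEEP; the helper names stop, prefix and accepts are illustrative assumptions and do not come from the text.

```python
BLEEP = "BLEEP"  # marker returned when a process cannot engage in an event


def stop(x):
    """STOP: cannot engage in any event."""
    return BLEEP


def prefix(c, p):
    """(c -> p): engages in c and then behaves like p (hypothetical helper)."""
    return lambda x: p if x == c else BLEEP


def or1(p, q):
    """Resolve the choice in advance in favour of the first operand."""
    return p


def or2(p, q):
    """Resolve the choice in advance in favour of the second operand."""
    return q


def or3(p, q):
    """Postpone the choice until an event distinguishes p from q."""
    def undecided(x):
        px, qx = p(x), q(x)
        if px == BLEEP:
            return qx
        if qx == BLEEP:
            return px
        return or3(px, qx)  # x is possible for both: decision postponed again
    return undecided


def accepts(proc, trace):
    """True if proc can engage in every event of trace, in order."""
    for x in trace:
        proc = proc(x)
        if proc == BLEEP:
            return False
    return True


P = prefix("a", stop)
Q = prefix("b", prefix("c", stop))

print(accepts(or1(P, Q), ["a"]), accepts(or1(P, Q), ["b"]))       # True False
print(accepts(or2(P, Q), ["a"]), accepts(or2(P, Q), ["b", "c"]))  # False True
print(accepts(or3(P, Q), ["a"]), accepts(or3(P, Q), ["b", "c"]))  # True True
```

Only or3 accepts both ⟨a⟩ and ⟨b, c⟩ here, which illustrates why it minimises the risk of deadlock; or1 and or2 each commit to one operand before the environment has offered anything.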

One of the advantages of introducing nondeterminism is to avoid the loss of symmetry that would result from selecting one of the two simple implementations, and yet to avoid the inefficiency of the symmetric implementation or3.

3.2.3 Traces

If s is a trace of P, then s is also a possible trace of (P ⊓ Q), i.e., in the case that P is selected. Similarly, if s is a trace of Q, it is also a trace of (P ⊓ Q). Conversely, each trace of (P ⊓ Q) must be a trace of one or both alternatives.

    L1  traces(P ⊓ Q) = traces(P) ∪ traces(Q)

The behaviour of (P ⊓ Q) after s is defined by whichever of P or Q could engage in s; if both could, the choice remains nondeterministic.

    L2  (P ⊓ Q) / s = Q / s                  if s ∈ (traces(Q) − traces(P))
                    = P / s                  if s ∈ (traces(P) − traces(Q))
                    = (P / s) ⊓ (Q / s)      if s ∈ (traces(P) ∩ traces(Q))

3.3 General choice

The environment of (P ⊓ Q) has no control or even knowledge of the choice that is made between P and Q, or even the time at which the choice is made. So (P ⊓ Q) is not a helpful way of combining processes, because the environment must be prepared to deal with either P or Q, and either one of them separately would be easier to deal with. We therefore introduce another operation (P □ Q), for which the environment can control which of P and Q will be selected, provided that this control is exercised on the very first action. If this action is not a possible first action of P, then Q will be selected; but if Q cannot engage initially in the action, P will be selected. If, however, the first action is possible for both P and Q, then the choice between them is nondeterministic. (Of course, if the event is impossible for both P and Q, then it just cannot happen.) As usual

    α(P □ Q) = αP = αQ

In the case that no initial event of P is also possible for Q, the general choice operator is the same as the | operator, which has been used hitherto to represent choice between different events

    (c → P □ d → Q) = (c → P | d → Q)    if c ≠ d.

However if the initial events are the same, (P □ Q) degenerates to nondeterministic choice

    (c → P □ c → Q) = (c → P ⊓ c → Q)

Here we have adopted the convention that → binds more tightly than □.

3.3.1 Laws

The algebraic laws for □ are similar to those for ⊓, and for the same reasons.

    L1–L3  □ is idempotent, symmetric, and associative.
    L4  P □ STOP = P

The following law formalises the informal definition of the operation

    L5  (x : A → P(x)) □ (y : B → Q(y)) =
            (z : (A ∪ B) →
                (if z ∈ (A − B) then
                     P(z)
                 else if z ∈ (B − A) then
                     Q(z)
                 else if z ∈ (A ∩ B) then
                     (P(z) ⊓ Q(z))))

Like all other operators introduced so far (apart from recursion), □ distributes through ⊓

    L6  P □ (Q ⊓ R) = (P □ Q) ⊓ (P □ R)

What may seem more surprising is that ⊓ distributes through □

    L7  P ⊓ (Q □ R) = (P ⊓ Q) □ (P ⊓ R)

This law states that choices made nondeterministically and choices made by the environment are independent, in the sense that the selection made by one of them does not influence the choice made by the other.

Let John be the agent which makes nondeterministic choices and Mary be the environment. On the left-hand side of the law, John chooses (⊓) between P and letting Mary choose (□) between Q and R. On the right-hand side, Mary chooses either (1) to offer John the choice between P and Q or (2) to offer John the choice between P and R.

On both sides of the equation, if John chooses P, then P will be the overall outcome. But if John does not select P, the choice between Q and R is made by Mary. Thus the results of the choice strategies described on the left- and right-hand sides of the law are always equal. Of course, the same reasoning applies to L6.

The explanation given above is rather subtle; perhaps it would be better to explain the law as the unexpected but unavoidable consequence of other more obvious definitions and laws given later in this chapter.

αP = αQ = {x. Then (P but (P Q ) || P = (P || P ) (Q || P ) = P STOP Q ) || P = (x → P ) = P Q = (y → Q ). but (P Q ) cannot. Assuming the symmetry of or . process (P Q ) may reach deadlock but process (P Q ) cannot. y} This shows that in environment P . Of course. Q (x)) 3. because each trace of one of them is also a possible trace of the other. it is also symmetrical choice(P . and conversely Every trace of (P L1 traces(P Q ) = traces(P ) ∪ traces(Q ) The next law is slightly diﬀerent from the corresponding law for L2 (P Q) / s = P / s =Q /s = (P / s) (Q / s) if s ∈ traces(P ) − traces(Q ) if s ∈ traces(Q ) − traces(P ) if s ∈ traces(P ) ∩ traces(Q ) 3.3 Traces Q ) must be a trace of P or a trace of Q . it is possible to put them in an environment in which (P Q ) can deadlock at its ﬁrst step. For example let x ≠ y and P = (x → P ). However.4 Refusals The distinction between (P Q ) and (P Q ) is quite subtle.2 Implementation The implementation of the choice operator follows closely the law L5.3. They cannot be distinguished by their traces. Q ) = λ x • if P (x) = "BLEEP then Q (x) else if Q (x) = "BLEEP then P (x) else or (P (x). even with (P Q ) we cannot be sure .88 3 Nondeterminism 3.3.

we say that X is a refusal of P .1 (below). The process STOP does nothing and refuses everything L1 refusals(STOPA ) = all subsets of A (including A itself) A process c → P refuses every set that does not contain the event c L2 refusals(c → P ) = { X | X ⊆ (αP − {c}) } . we will never know that it might have..4 Refusals 89 that deadlock will occur. there is at some time some event in which it can engage. In other words. This condition applies not only on the initial step of P but also after any possible sequence of actions of P . i. let X be a set of events which are oﬀered initially by the environment of a process P . even though the environment is ready for it. but also (as a result of some internal nondeterministic choice) it may refuse to engage in that event. it might seem more natural to use the sets of symbols which a process may be ready to accept. A process is said to be deterministic if it can never refuse any event in which it can engage.e. The set of all such refusals of P is denoted refusals(P ) Note that the refusals of a process constitute a family of sets of symbols. or more formally P is deterministic ⇒ (X ∈ refusals(P ) ≡ (X ∩ P 0 = {})) where P 0 = { x | x ∈ traces(P ) }. Thus we can deﬁne P is deterministic ≡ ∀ s : traces(P ) • (X ∈ refusals(P / s) ≡ (X ∩ (P / s)0 = {})) A nondeterministic process is one that does not enjoy this property.4. however the refusals are slightly simpler because they obey laws L9 and L10 of Section 3.4.3. This is an unfortunate complexity. Instead of refusals. The introduction of the concept of a refusal permits a clear formal distinction to be made between deterministic and nondeterministic processes. 3.1 Laws The following laws deﬁne the refusals of various simple processes. which in this context we take to have the same alphabet as P . but it does seem to be unavoidable in a proper treatment of nondeterminism. whereas the corresponding laws for ready sets would be more complicated. and if it does not occur. 
If it is possible for P to deadlock on its ﬁrst step when placed in this environment. a set is a refusal of a deterministic process only if that set contains no event in which that process can initially engage. But the mere possibility of an occurrence of deadlock is enough to distinguish (P Q ) from (P Q ). In general.
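Laws L1 and L2 and the first-step determinism condition can be checked mechanically on a small alphabet. The sketch below is an illustration only: it represents a refusal family as a Python set of frozensets, and the function names (powerset, refusals_stop, refusals_prefix, deterministic_on_first_step) are assumptions introduced for this example. The last check forms the refusals of a nondeterministic choice as the union of the operands' refusals, as stated by law L4 of this section.

```python
from itertools import combinations


def powerset(events):
    """All subsets of a finite set of events, as frozensets."""
    items = sorted(events)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


def refusals_stop(alphabet):
    """L1: STOP refuses every subset of its alphabet."""
    return powerset(alphabet)


def refusals_prefix(c, alphabet):
    """L2: (c -> P) refuses every set that does not contain c."""
    return powerset(set(alphabet) - {c})


def deterministic_on_first_step(refusals, initials, alphabet):
    """X is refused exactly when X contains no initially possible event."""
    return all((X in refusals) == X.isdisjoint(initials)
               for X in powerset(alphabet))


A = {"x", "y"}
ref_P = refusals_prefix("x", A)   # P = (x -> P)
ref_Q = refusals_prefix("y", A)   # Q = (y -> Q)

print(deterministic_on_first_step(ref_P, {"x"}, A))                 # True
print(deterministic_on_first_step(refusals_stop(A), set(), A))      # True
# Refusals of the nondeterministic choice of P and Q, taken as a union (L4):
print(deterministic_on_first_step(ref_P | ref_Q, {"x", "y"}, A))    # False
```

The last line reports False because {x} is refused by the choice even though x is a possible first event, which is exactly the situation the definition of a nondeterministic process describes.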

These two laws have a common generalisation

    L3  refusals(x : B → P(x)) = { X | X ⊆ (αP − B) }

If P can refuse X, so will P ⊓ Q if P is selected. Similarly every refusal of Q is also a possible refusal of (P ⊓ Q). These are its only refusals, so

    L4  refusals(P ⊓ Q) = refusals(P) ∪ refusals(Q)

A converse argument applies to (P □ Q). If X is not a refusal of P, then P cannot refuse X, and neither can (P □ Q). Similarly, if X is not a refusal of Q, then it is not a refusal of (P □ Q). However, if both P and Q can refuse X, so can (P □ Q)

    L5  refusals(P □ Q) = refusals(P) ∩ refusals(Q)

Comparison of L5 with L4 shows the distinction between ⊓ and □.

If P can refuse X and Q can refuse Y, then their combination (P || Q) can refuse all events refused by P as well as all events refused by Q, i.e., it can refuse the union of the two sets X and Y

    L6  refusals(P || Q) = { X ∪ Y | X ∈ refusals(P) ∧ Y ∈ refusals(Q) }

For symbol change, the relevant law is clear

    L7  refusals(f(P)) = { f(X) | X ∈ refusals(P) }

There are a number of general laws about refusals. A process can refuse only events in its own alphabet. A process deadlocks when the environment offers no events; and if a process refuses a nonempty set, it can also refuse any subset of that set. Finally, any event x which cannot occur initially may be added to any set X already refused.

    L8   X ∈ refusals(P) ⇒ X ⊆ αP
    L9   {} ∈ refusals(P)
    L10  (X ∪ Y) ∈ refusals(P) ⇒ X ∈ refusals(P)
    L11  X ∈ refusals(P) ⇒ (X ∪ {x}) ∈ refusals(P) ∨ ⟨x⟩ ∈ traces(P)

3.5 Concealment

In general, the alphabet of a process contains just those events which are considered to be relevant, and whose occurrence requires simultaneous participation of an environment. In describing the internal behaviour of a mechanism, we often need to consider events representing internal transitions of that mechanism. Such events may denote the interactions and communications between concurrently acting components from which the mechanism has been constructed, e.g., CHAIN2 (2.6 X4) and 2.6.2 X3. After construction of the mechanism, we conceal the structure of its components; and we also wish to conceal all occurrences of actions internal to the mechanism. In fact, we want these actions to occur automatically and instantaneously as soon as they can, without being observed or controlled by the environment of the process. If C is a finite set of events to be concealed in this way, then

    P \ C

is a process which behaves like P, except that each occurrence of any event in C is concealed. Clearly it is our intention that

    α(P \ C) = (αP) − C

Examples

X1 A noisy vending machine (2.3 X1) can be placed in a soundproof box

    NOISYVM \ {clink, clunk}

Its unexercised capability of dispensing toffee can also be removed from its alphabet, without affecting its actual behaviour. The resulting process is equal to the simple vending machine

    VMS = NOISYVM \ {clink, clunk, toffee}

When two processes have been combined to run concurrently, their mutual interactions are usually regarded as internal workings of the resulting systems; they are intended to occur autonomously and as quickly as possible, without the knowledge or intervention of the system's outer environment. Thus it is the symbols in the intersection of the alphabets of the two components that need to be concealed.

X2 Let

    αP = {a, c}      P = (a → c → P)
    αQ = {b, c}      Q = (c → b → Q)

as in (2.3.1 X1). The action c in the alphabet of both P and Q is now regarded as an internal action, to be concealed

    (P || Q) \ {c} = (a → c → µ X • (a → b → c → X
                                     | b → a → c → X)) \ {c}
                   = a → µ X • (a → b → X
                                | b → a → X)

3.5.1 Laws

The first laws state that concealing no symbols has no effect, and that it makes no difference in what order the symbols of a set are concealed. The remaining laws of this group show how concealment distributes through other operators.

Concealment of nothing leaves everything revealed

    L1  P \ {} = P

To conceal one set of symbols and then some more is the same as concealing them all simultaneously.

    L2  (P \ B) \ C = P \ (B ∪ C)

Concealment distributes in the familiar way through nondeterministic choice

    L3  (P ⊓ Q) \ C = (P \ C) ⊓ (Q \ C)

Concealment does not affect the behaviour of a stopped process, only its alphabet

    L4  STOP_A \ C = STOP_{A−C}

The purpose of concealment is to allow any of the concealed events to occur automatically and instantaneously, but make such occurrences totally invisible. Unconcealed events remain unchanged

    L5  (x → P) \ C = x → (P \ C)    if x ∉ C
                    = P \ C          if x ∈ C

If C contains only events in which P and Q participate independently, concealment of C distributes through their concurrent composition

    L6  If αP ∩ αQ ∩ C = {}, then
            (P || Q) \ C = (P \ C) || (Q \ C)

This is not a commonly useful law, because what we usually wish to conceal are the interactions between concurrent processes, i.e., the events of αP ∩ αQ, in which they participate jointly.

Concealment distributes in the obvious way through symbol change by a one-one function

    L7  f(P \ C) = f(P) \ f(C)

If none of the possible initial events of a choice is concealed, then the initial choice remains the same as it was before concealment

    L8  If B ∩ C = {}, then
            (x : B → P(x)) \ C = (x : B → (P(x) \ C))

Like the choice operator ⊓, the concealment of events can introduce nondeterminism. When several different concealed events can happen, it is not determined which of them will occur; but whichever does occur is concealed

    L9  If B ⊆ C, and B is finite and not empty, then
            (x : B → P(x)) \ C = ⊓x∈B (P(x) \ C)

In the intermediate case, when some of the initial events are concealed and some are not, the situation is rather more complicated. Consider the process

    (c → P | d → Q) \ C

where

    c ∈ C, d ∉ C

The concealed event c may happen immediately. In this case the total behaviour will be defined by (P \ C), and the possibility of occurrence of the event d will be withdrawn. But we cannot reliably assume that d will not happen. If the environment is ready for it, d may very well happen before the hidden event, after which the hidden event c can no longer occur. But even if d occurs, it might have been performed by (P \ C) after the hidden occurrence of c. In this case, the total behaviour is as defined by

    (P \ C) □ (d → (Q \ C))

The choice between this and (P \ C) is nondeterministic. This is a rather convoluted justification for the rather complex law

    (c → P | d → Q) \ C = (P \ C) ⊓ ((P \ C) □ (d → (Q \ C)))

Similar reasoning justifies the more general law

    L10  If C ∩ B is finite and non-empty, then
             (x : B → P(x)) \ C = Q ⊓ (Q □ (x : (B − C) → P(x)))
         where
             Q = ⊓x∈B∩C P(x) \ C

A pictorial illustration of these laws is given in Section 3.5.4.

Note that \ C does not distribute backwards through □. A counterexample is

    (c → STOP □ d → STOP) \ {c}
        = STOP ⊓ (STOP □ (d → STOP))                    [L10]
        = STOP ⊓ (d → STOP)                             [3.3.1 L4]
        ≠ d → STOP
        = STOP □ (d → STOP)
        = ((c → STOP) \ {c}) □ ((d → STOP) \ {c})

Concealment reduces the alphabet of a process. We can also define an operation which extends the alphabet of a process P by inclusion of symbols of a set B

    α(P_{+B}) = αP ∪ B
    P_{+B} = (P || STOP_B)    provided B ∩ αP = {}

None of the new events of B will ever actually occur, so the behaviour of P_{+B} is effectively the same as that of P

    L11  traces(P_{+B}) = traces(P)

Consequently, concealment of B reverses the extension of the alphabet by B

    L12  (P_{+B}) \ B = P

It is appropriate here to raise a problem that will be solved later, in Section 3.8. In simple cases, concealment distributes through recursion

    (µ X : A • (c → X)) \ {c}
        = µ X : (A − {c}) • ((c → X_{+{c}}) \ {c})
        = µ X : (A − {c}) • X                           [by L12, L5]

Thus the attempt to conceal an infinite sequence of consecutive events leads to the same unfortunate result as an infinite loop or unguarded recursion. The general name for this phenomenon is divergence.

The same problem arises even if the divergent process is infinitely often capable of some unconcealed event, for example

    (µ X • (c → X □ d → P)) \ {c}
        = µ X • ((c → X □ d → P) \ {c})
        = µ X • (X \ {c}) ⊓ ((X \ {c}) □ d → (P \ {c}))     [by L10]

Here again, the recursion is unguarded, and leads to divergence. Even though it seems that the environment is infinitely often offered the choice of selecting d, there is no way of preventing the process from infinitely often choosing to perform the hidden event instead. This possibility seems to aid in achieving the highest efficiency of implementation. It also seems to be related to our decision not to insist on fairness of nondeterminism, as discussed in Section 3.2.1. A more rigorous discussion of divergence is given in Section 3.8.

There is a sense, an important one, in which hiding is in fact fair. Let d ∈ αR, and consider the process

    ((c → a → P | d → STOP) \ {c}) || (a → R)
        = ((a → P \ {c}) ⊓ (a → P \ {c} □ d → STOP)) || (a → R)        [L10]
        = ((a → P \ {c}) || (a → R)) ⊓ ((a → P \ {c} □ d → STOP) || (a → R))
        = a → ((P \ {c}) || R)

This shows that a process which offers the choice between a hidden action c and a nonhidden one d cannot insist that the nonhidden action shall occur. If the environment (in this example, a → R) is not prepared for d, then the hidden event must occur, so that the environment has the chance to interact with the process (e.g., (a → P \ {c})) which results.

3.5.2 Implementation

For simplicity, we shall implement an operation which hides a single symbol at a time

    hide(P, c) = P \ {c}

A set of two or more symbols may be hidden by hiding one after the other, since

    P \ {c1, c2, ..., cn} = (...((P \ {c1}) \ {c2}) \ ...) \ {cn}

The simplest implementation is one that always makes the hidden event occur invisibly, whenever it can and as soon as it can

    hide(P, c) = if P(c) = "BLEEP then
                     (λ x • if P(x) = "BLEEP then
                                "BLEEP
                            else
                                hide(P(x), c))
                 else
                     hide(P(c), c)

Let us explore what happens when the hide function is applied to a process which is capable of engaging in an infinite sequence of hidden events, for example

    hide(µ X • (c → X □ d → P), c)

In this case, the test (P(c) = "BLEEP) will always yield FALSE, so the hide function will always select its else clause, thereby immediately calling itself recursively. There is no exit from this recursion, so no further communication with the outside world will ever occur. This is the penalty for attempting to implement a divergent process.

This implementation of concealment does not obey L2; indeed, the order in which it hides the symbols is significant, as shown by the example

    P = (c → STOP | d → a → STOP)

Then

    hide(hide(P, c), d) = hide(hide(STOP, c), d)
                        = STOP

and

    hide(hide(P, d), c) = hide(hide((a → STOP), d), c)
                        = (a → STOP)

But as explained in Section 3.2.2, a particular implementation of a nondeterministic operator does not have to obey the laws. It is sufficient that both the results shown above are permitted implementations of the same process

    P \ {c, d} = (STOP ⊓ (a → STOP))

3.5.3 Traces

If t is a trace of P, the corresponding trace of P \ C is obtained from t simply by removing all occurrences of any of the symbols in C. Conversely each trace of P \ C must have been obtained from some such trace of P. We can therefore state

    L1  traces(P \ C) = { t ↾ (αP − C) | t ∈ traces(P) }
        provided that ∀ s : traces(P) • ¬ diverges(P / s, C)

The condition diverges(P, C) means that P diverges immediately on concealment of C, i.e., that it can engage in an unbounded sequence of hidden events. Thus we define

    diverges(P, C) = ∀ n • ∃ s : traces(P) ∩ C* • #s > n
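The order-sensitivity of the hide function of Section 3.5.2 can be reproduced with a direct transcription. This is a sketch under the same assumption as the book's LISP-style definitions, namely that a process is a function from an event to the next process or to "BLEEP; the helpers stop, prefix, choice2 and accepts are hypothetical names introduced only for the illustration.

```python
BLEEP = "BLEEP"  # marker returned when a process cannot engage in an event


def stop(x):
    """STOP: cannot engage in any event."""
    return BLEEP


def prefix(c, p):
    """(c -> p), a hypothetical helper."""
    return lambda x: p if x == c else BLEEP


def choice2(c, p, d, q):
    """(c -> p | d -> q) for distinct events c and d, a hypothetical helper."""
    return lambda x: p if x == c else (q if x == d else BLEEP)


def hide(p, c):
    """Make the hidden event c occur invisibly whenever it can."""
    if p(c) == BLEEP:
        # c is not possible now: pass visible events through, keep hiding c.
        def visible(x):
            nxt = p(x)
            return BLEEP if nxt == BLEEP else hide(nxt, c)
        return visible
    # c is possible: perform it at once. For a divergent process this
    # recursion never terminates (Python raises RecursionError).
    return hide(p(c), c)


def accepts(proc, trace):
    """True if proc can engage in every event of trace, in order."""
    for x in trace:
        proc = proc(x)
        if proc == BLEEP:
            return False
    return True


# P = (c -> STOP | d -> a -> STOP)
P = choice2("c", stop, "d", prefix("a", stop))

print(accepts(hide(hide(P, "c"), "d"), ["a"]))  # False: behaves like STOP
print(accepts(hide(hide(P, "d"), "c"), ["a"]))  # True: behaves like (a -> STOP)
```

Both outcomes are permitted implementations of P \ {c, d} = (STOP ⊓ (a → STOP)), so the sketch shows a particular implementation breaking L2 without contradicting the law for processes.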

Corresponding to a single trace s of P \ C, there can be several traces t of the possible behaviour in which P has engaged which cannot be distinguished after the concealment, i.e., t ↾ (αP − C) = s. The next law states that after s it is not determined which of the possible subsequent behaviours of P will define the subsequent behaviour of (P \ C).

    L2  (P \ C) / s = (⊓t∈T P / t) \ C
        where T = traces(P) ∩ { t | t ↾ (αP − C) = s }
        provided that T is finite and s ∈ traces(P \ C)

These laws are restricted to the case when the process does not diverge. The restrictions are not serious, because divergence is never the intended result of the attempted definition of a process. For a fuller treatment of divergence see Section 3.8.

3.5.4 Pictures

Nondeterministic choice can be represented in a picture by a node from which emerge two or more unlabelled arrows; on reaching the node, a process passes imperceptibly along one of the emergent arrows, the choice being nondeterministic.

Thus P ⊓ Q is pictured as in Figure 3.1.

    [Figure 3.1: a node with unlabelled arrows to P and Q]

The algebraic laws governing nondeterminism assert identities between such processes, e.g., associativity of ⊓ is illustrated in Figure 3.2.

    [Figure 3.2: two equal trees over P, Q and R]

Concealment of symbols may be regarded as an operation which simply removes concealed symbols from all arrows which they label, so that these arcs turn into unlabelled arrows. The resulting nondeterminism emerges naturally, as shown in Figure 3.3.

    [Figure 3.3: (c → P | d → Q) \ {c, d} = (P \ {c, d}) ⊓ (Q \ {c, d})]

But what is the meaning of a node if some of its arcs are labelled and some are not? The answer is given by the law 3.5.1 L10. Such a node can be eliminated by redrawing as shown in Figure 3.4.

    [Figure 3.4]

It is fairly obvious that such eliminations are always possible for finite trees. They are also possible for infinite graphs, provided that the graph contains no infinite path of consecutive unlabelled arrows, as for example in Figure 3.5. Such a picture can arise only in the case of divergence, which we have already decided to regard as an error.

    [Figure 3.5]

Using this operator.6 Interleaving 99 As a result of applying the transformation L10.6 Interleaving The || operator was deﬁned in Chapter 2 in such a way that actions in the alphabet of both operands require simultaneous participation of them both. 3. In this case.3 (Figure 3. However. but if both processes could have engaged in the same action. it is possible that the node may acquire two emergent lines with the same label.6).3. This form of combination is denoted P ||| Q P interleave Q and its alphabet is deﬁned by the usual stipulation α(P ||| Q ) = αP = αQ Examples X1 A vending machine that will accept up to two coins before dispensing up to two chocolates (1. but without introducing nondeterminism.6 The pictorial representation of processes and the laws which govern them are included here as an aid to memory and understanding. it is sometimes useful to join processes with the same alphabet to operate concurrently without directly interacting or synchronising with each other. then it must have been the other one. it is possible to combine interacting processes with diﬀering alphabets into systems exhibiting concurrent activity.3 X6) (VMS ||| VMS ) = VMS2 .1. Such nodes can be eliminated by the law given at the end of Section 3. a P b a b Q b R = Q R P Figure 3. each action of the system is an action of exactly one of the processes If one of the processes cannot engage in the action. the choice between them is nondeterministic. they are not intended to be used for practical transformation or manipulation of large-scale processes. whereas the remaining actions of the system occur in an arbitrary interleaving.

X2 A footman made from four lackeys, each serving only one philosopher at a time (see Section 2.5.4).

3.6.1 Laws

    L1–L3  ||| is associative, symmetric, and distributes through ⊓
    L4  P ||| STOP = P
    L5  P ||| RUN = RUN    provided P does not diverge
    L6  (x → P) ||| (y → Q) = (x → (P ||| (y → Q)) □ y → ((x → P) ||| Q))
    L7  If   P = (x : A → P(x))
        and  Q = (y : B → Q(y))
        then P ||| Q = (x : A → (P(x) ||| Q) □ y : B → (P ||| Q(y)))

Note that ||| does not distribute through □. This is shown by the counterexample (where b ≠ c)

    ((a → STOP) ||| (b → Q □ c → R)) / ⟨a⟩
        = (b → Q □ c → R)
        ≠ ((b → Q) ⊓ (c → R))
        = ((a → STOP □ b → Q) ||| (a → STOP □ c → R)) / ⟨a⟩

On the left-hand side of this chain, the occurrence of a can involve progress only of the left operand of |||, so no nondeterminism is introduced. The left operand stops, and the choice between b and c is left open to the environment. On the right-hand side of the chain, the event a may be an event of either operand of |||, the choice being nondeterministic. Thus the environment can no longer choose whether the next event will be b or c.

L6 and L7 state that it is the environment which chooses between the initial events offered by the operands of |||. Nondeterminism arises only when the chosen event is possible for both operands.

Example

X1 Let R = (a → b → R), then

    (R ||| R)
        = (a → ((b → R) ||| R) □ a → (R ||| (b → R)))              [L6]
        = a → (((b → R) ||| R) ⊓ (R ||| (b → R)))
        = a → ((b → R) ||| R)                                      [L2]

Also

    (b → R) ||| R
        = (a → ((b → R) ||| (b → R)) □ b → (R ||| R))              [L6]
        = (a → (b → ((b → R) ||| R))
           □ b → (a → ((b → R) ||| R)))                            [as shown above]
        = µ X • (a → b → X
                 □ b → a → X)                                      [since the recursion is guarded.]

Thus (R ||| R) is identical to the example 3.5 X2. A similar proof shows that (VMS ||| VMS) = VMS2.

3.6.2 Traces and refusals

A trace of (P ||| Q) is an arbitrary interleaving of a trace from P with a trace from Q. For a definition of interleaving, see Section 1.9.3.

    L1  traces(P ||| Q) = { s | ∃ t : traces(P) • ∃ u : traces(Q) • s interleaves (t, u) }

(P ||| Q) can engage in any initial action possible for either P or Q; and it can therefore refuse only those sets which are refused by both P and Q

    L2  refusals(P ||| Q) = refusals(P □ Q)

The behaviour of (P ||| Q) after engaging in the events of the trace s is defined by the rather elaborate formula

    L3  (P ||| Q) / s = ⊓(t,u)∈T (P / t) ||| (Q / u)
        where T = { (t, u) | t ∈ traces(P) ∧ u ∈ traces(Q) ∧ s interleaves (t, u) }

This law reflects the fact that there is no way of knowing in which way a trace s of (P ||| Q) has been constructed as an interleaving of a trace from P and a trace from Q; thus after s, the future behaviour of (P ||| Q) may reflect any one of the possible interleavings. The choice between them is not known and not determined.

3.7 Specifications

In Section 3.4 we saw the need to introduce refusal sets as one of the important indirectly observable aspects of the behaviour. In specifying a process, we therefore need to describe the desired properties of its refusal sets as well as its traces. Let us use the variable ref to denote an arbitrary refusal set of a process, in the same way as we have used tr to denote an arbitrary trace. As a result, when P is a nondeterministic process the meaning of

    P sat S(tr, ref)

is revised to

    ∀ tr, ref • tr ∈ traces(P) ∧ ref ∈ refusals(P / tr) ⇒ S(tr, ref)

Examples

X1 When a vending machine has ingested more coins than it has dispensed chocolates, the customer specifies that it must not refuse to dispense a chocolate

    FAIR = (tr ↓ choc < tr ↓ coin ⇒ choc ∉ ref)

It is implicitly understood that every trace tr and every refusal ref of the specified process at all times should satisfy this specification.

X2 When a vending machine has given out as many chocolates as have been paid for, the owner specifies that it must not refuse a further coin

    PROFIT = (tr ↓ choc = tr ↓ coin ⇒ coin ∉ ref)

X3 A simple vending machine should satisfy the combined specification

    NEWVMSPEC = FAIR ∧ PROFIT ∧ (tr ↓ choc ≤ tr ↓ coin)

This specification is satisfied by VMS. It is also satisfied by a vending machine like VMS2 (1.1.3 X6) which will accept several coins in a row, and then give out several chocolates.

X4 If desired, one may place a limit on the balance of coins which may be accepted in a row

    ATMOST2 = (tr ↓ coin − tr ↓ choc ≤ 2)

X5 If desired, one can insist that the machine accept at least two coins in a row whenever the customer offers them

    ATLEAST2 = (tr ↓ coin − tr ↓ choc < 2 ⇒ coin ∉ ref)

X6 The process STOP refuses every event in its alphabet. The following predicate specifies that a process with alphabet A will never stop

    NONSTOP = (ref ≠ A)
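The revised meaning of sat, quantifying over every trace and every refusal after that trace, can be checked exhaustively for a small deterministic machine. The sketch below does this for VMS = (coin → choc → VMS) up to a bounded trace length. It is an illustration only: the bound, the helper names (count, vms_traces, vms_refusals, sat) and the encoding of refusals as frozensets are assumptions, and the refusals of VMS after a trace are taken to be the sets that exclude its single next event, as law 3.4.1 L2 suggests for a prefixed process.

```python
from itertools import combinations

ALPHABET = ("coin", "choc")


def count(tr, e):
    """tr ↓ e: number of occurrences of event e in trace tr."""
    return sum(1 for x in tr if x == e)


def FAIR(tr, ref):
    return not (count(tr, "choc") < count(tr, "coin")) or "choc" not in ref


def PROFIT(tr, ref):
    return not (count(tr, "choc") == count(tr, "coin")) or "coin" not in ref


def NEWVMSPEC(tr, ref):
    return (FAIR(tr, ref) and PROFIT(tr, ref)
            and count(tr, "choc") <= count(tr, "coin"))


def ATMOST2(tr, ref):
    return count(tr, "coin") - count(tr, "choc") <= 2


def ATLEAST2(tr, ref):
    return not (count(tr, "coin") - count(tr, "choc") < 2) or "coin" not in ref


def vms_traces(n):
    """Traces of VMS of length at most n: coin, choc, coin, choc, ..."""
    return [tuple(ALPHABET[i % 2] for i in range(k)) for k in range(n + 1)]


def vms_refusals(tr):
    """Refusals of VMS / tr: every set that excludes the next event."""
    nxt = ALPHABET[len(tr) % 2]
    rest = [e for e in ALPHABET if e != nxt]
    return [frozenset(c)
            for r in range(len(rest) + 1)
            for c in combinations(rest, r)]


def sat(spec, n):
    """Check spec(tr, ref) for every trace up to length n and every refusal."""
    return all(spec(tr, ref)
               for tr in vms_traces(n)
               for ref in vms_refusals(tr))


print(sat(NEWVMSPEC, 6))  # True
print(sat(ATMOST2, 6))    # True
print(sat(ATLEAST2, 6))   # False: after one coin, VMS refuses a second coin
```

A bounded check of this kind is not a proof, but it does show concretely how a refusal set, rather than a trace, is what separates VMS from a machine meeting ATLEAST2.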

a speciﬁcation will be written in any of the forms S .7 Speciﬁcations 103 If P sat NOTSTOP . 3. ref ). This observation will therefore be described by the speciﬁcation of P or by the speciﬁcation of Q or by both.3. .7. (P Q ) behaves either like P or like Q .8 describes how a divergent process is one that can do anything and refuse anything.4. It is also desirable to prove that a process does not diverge. choc} it follows that any process which satisﬁes NEWVMSPEC will never stop. S (tr . Therefore every observation of its behaviour will be an observation possible for P or for Q or for both. then the process is not divergent. These advantages are obtained at the cost of slightly increased complexity in proof rules and in proofs. perhaps the most important of all is the property that the process must not stop (X6). So if there is a set which cannot be refused. This justiﬁes formulation of a suﬃcient condition for non-divergence NONDIV = (ref ≠ A) Fortunately NONSTOP ≡ NONDIV so proof of absence of divergence does not entail any more work than proof of absence of deadlock. P must perform one of them. By the deﬁnition of nondeterminism. Since (see X3 above) NEWVMSPEC ⇒ ref ≠ {coin. S (tr ).1 Proofs In the following proof rules. So if we omit alphabets altogether (which we shall do in future). In all cases. the proof rule for nondeterminism has an exceptionally simple form L1 If and then P sat S Q sat T (P Q ) sat (S ∨ T ) The proof rule for STOP states that it does nothing and refuses anything L2A STOPA sat (tr = ∧ ref ⊆ A) Since refusals are always contained in the alphabet (3. it should be understood that the speciﬁcation may contain tr and ref among its free variables. according to convenience. Section 3. and if an environment allows all events in A. Consequently. These examples show how the introduction of ref into the speciﬁcation of a process permits the expression of a number of subtle but important properties.1 L8) the clause ref ⊆ A can be omitted.

the law L2A is identical to that for deterministic processes (1.10.2 L4A)

    STOP sat tr = ⟨⟩

The previous law for prefixing (1.10.2 L4B) is also still valid, but it is not quite strong enough to prove that the process cannot stop before its initial action. The rule needs to be strengthened by mention of the fact that in the initial state, when tr = ⟨⟩, the initial action cannot be refused

    L2B  If    P sat S(tr)
         then  (c → P) sat ((tr = ⟨⟩ ∧ c ∉ ref) ∨ (tr₀ = c ∧ S(tr′)))

The law for general choice (1.10.2 L4) needs to be similarly strengthened

    L2  If    ∀ x : B • P(x) sat S(tr, x)
        then  (x : B → P(x)) sat ((tr = ⟨⟩ ∧ (B ∩ ref = {})) ∨ (tr₀ ∈ B ∧ S(tr′, tr₀)))

The law for parallel composition given in 2.7 L1 is still valid, provided that the specifications make no mention of refusal sets. In order to deal correctly with refusals, a slightly more complicated law is required

    L3  If    P sat S(tr, ref)
        and   Q sat T(tr, ref)
        and   neither P nor Q diverges
        then  (P || Q) sat (∃ X, Y, ref • ref = (X ∪ Y) ∧ S(tr ↾ αP, X) ∧ T(tr ↾ αQ, Y))

The law for change of symbol needs a similar adaptation

    L4  If    P sat S(tr, ref)
        then  f(P) sat S(f⁻¹*(tr), f⁻¹(ref))    provided f is one-one.

The law for □ is surprisingly simple

    L5  If    P sat S(tr, ref)
        and   Q sat T(tr, ref)
        and   neither P nor Q diverges
        then  (P □ Q) sat (if tr = ⟨⟩ then (S ∧ T) else (S ∨ T))

Initially, when tr = ⟨⟩, a set is refused by (P □ Q) only if it is refused by both P and Q. This set must therefore be described by both their specifications. Subsequently, when tr ≠ ⟨⟩, each observation of (P □ Q) must be an observation either of P or of Q, and must therefore be described by one of their specifications (or both).

The law for interleaving does not need to mention refusal sets

    L6  If    P sat S(tr)
        and   Q sat T(tr)
        and   neither P nor Q diverges
        then  (P ||| Q) sat (∃ s, t • (tr interleaves (s, t) ∧ S(s) ∧ T(t)))

The law for concealment is complicated by the need to guard against divergence

    L7  If    P sat (NODIV ∧ S(tr, ref))
        then  (P \ C) sat ∃ s • tr = s ↾ (αP − C) ∧ S(tr, ref ∪ C)

where NODIV states that the number of hidden symbols that can occur is bounded by some function of the non-hidden symbols that have occurred

    NODIV = #(tr ↾ C) ≤ f(tr ↾ (αP − C))

where f is some total function from traces to natural numbers.

The clause ref ∪ C in the consequent of law L7 requires some explanation. It is due to the fact that P \ C can refuse a set X only when P can refuse the whole set X ∪ C, i.e., X together with all the hidden events. P \ C cannot refuse to interact with its external environment until it has reached a state in which it cannot engage in any further concealed internal activities. This kind of fairness is a most important feature of any reasonable definition of concealment, as described in Section 3.5.

The proof method for recursion (1.10.2 L6) also needs to be strengthened. Let S(n) be a predicate containing the variable n, which ranges over the natural numbers 0, 1, 2, ...

    L8  If    S(0)
        and   (X sat S(n)) ⇒ (F(X) sat S(n + 1))
        then  (µ X • F(X)) sat (∀ n • S(n))

This law is valid even for an unguarded recursion, though the strongest specification which can be proved of such a process is the vacuous specification true.

3.8 Divergence

In previous chapters, we have observed the restriction that the equations which define a process must be guarded (Section 1.1.2). This restriction has ensured that the equations have only a single solution (1.3 L2). It has also released us from the obligation of giving a meaning to the infinite recursion µ X • X.

Unfortunately, the introduction of concealment (Section 3.5) means that an apparently guarded recursion is not constructive. For example, consider the equation

    X = c → (X \ {c})+{c}

This has as solutions both (c → STOP) and (c → a → STOP), a fact which may be checked by substitution.

Consequently, any recursion equation which involves recursion under the hiding operator is potentially unguarded, and liable to have more than one solution. Which solution should be taken as the right one? We stipulate that the right solution is the least deterministic, because this allows a nondeterministic choice between all the other solutions. With this understanding, we can altogether remove the restriction that recursions must be guarded, and we can give a (possibly nondeterministic) meaning to every expression of the form µ X • F(X), where F is defined in terms of any of the operators introduced in this book (except /), and observing all alphabet constraints.

3.8.1 Laws

Since CHAOS is the most nondeterministic process it cannot be changed by adding yet further nondeterministic choices; it is therefore a zero of ⊓

    L1  P ⊓ CHAOS = CHAOS

A function of processes that yields CHAOS if any of its arguments is CHAOS is said to be strict. The above law (plus symmetry) states that ⊓ is a strict function. CHAOS is such an awful process that almost any process which is defined in terms of CHAOS is itself equal to CHAOS.

    L2  The following operations are strict:  / s,  ||,  f,  □,  \ C,  |||,  and µ X

However prefixing is not strict.

    L3  CHAOS ≠ (a → CHAOS)

because the right-hand side can be relied upon to do a before becoming completely unreliable.

As mentioned before, CHAOS is the most unpredictable and most uncontrollable of processes. There is nothing that it might not do; furthermore, there is nothing that it might not refuse to do!

    L4  traces(CHAOS_A) = A*
    L5  refusals(CHAOS_A) = all subsets of A.

3.8.2 Divergences

A divergence of a process is defined as any trace of the process after which the process behaves chaotically. The set of all divergences is defined

    divergences(P) = { s | s ∈ traces(P) ∧ (P / s) = CHAOS_αP }

It follows immediately that

    L1  divergences(P) ⊆ traces(P)

Because / t is strict,

    CHAOS / t = CHAOS

and it follows that the divergences of a process are extension-closed, in the sense that

    L2  s ∈ divergences(P) ∧ t ∈ (αP)* ⇒ (s ⌢ t) ∈ divergences(P)

Since CHAOS_A may refuse any subset of its alphabet A

    L3  s ∈ divergences(P) ∧ X ⊆ αP ⇒ X ∈ refusals(P / s)

The three laws given above state general properties of divergences of any process. The following laws show how the divergences of compound processes are determined by the divergences and traces of their components. Firstly, the process STOP never diverges

    L4  divergences(STOP) = {}

At the other extreme, every trace of CHAOS leads to CHAOS

    L5  divergences(CHAOS_A) = A*

A process defined by choice does not diverge on its first step. Consequently, its divergences are determined by what happens after the first step

    L6  divergences(x : B → P(x)) = { ⟨x⟩ ⌢ s | x ∈ B ∧ s ∈ divergences(P(x)) }

Any divergence of P is also a divergence of (P ⊓ Q) and of (P □ Q)

    L7  divergences(P ⊓ Q) = divergences(P □ Q) = divergences(P) ∪ divergences(Q)

Since || is strict, a divergence of (P || Q) starts with a trace of the nondivergent activity of both P and Q, which leads to divergence of either P or of Q (or of both)

    L8  divergences(P || Q) =
          { s ⌢ t | t ∈ (αP ∪ αQ)* ∧
                    ((s ↾ αP ∈ divergences(P) ∧ s ↾ αQ ∈ traces(Q)) ∨
                     (s ↾ αP ∈ traces(P) ∧ s ↾ αQ ∈ divergences(Q))) }

A similar explanation applies to |||

    L9  divergences(P ||| Q) =
          { u | ∃ s, t • u interleaves (s, t) ∧
                ((s ∈ divergences(P) ∧ t ∈ traces(Q)) ∨
                 (s ∈ traces(P) ∧ t ∈ divergences(Q))) }

Divergences of a process resulting from concealment include traces derived from the original divergences, plus those resulting from the attempt to conceal an infinite sequence of symbols

    L10  divergences(P \ C) =
           { (s ↾ (αP − C)) ⌢ t | t ∈ (αP − C)* ∧
                                  (s ∈ divergences(P) ∨
                                   (∀ n • ∃ u ∈ C* • #u > n ∧ (s ⌢ u) ∈ traces(P))) }

A process defined by symbol change diverges only when its argument diverges

    L11  divergences(f(P)) = { f*(s) | s ∈ divergences(P) }    provided f is one-one.

It is a shame to devote so much attention to divergence, when divergence is always something we do not want. Unfortunately, it seems to be an inevitable consequence of any efficient or even computable method of implementation. It can arise from either concealment or unguarded recursion; and it is part of the task of a system designer to prove that for his particular design the problem will not occur. In order to prove that something can't happen, we need to use a mathematical theory in which it can!

3.9 Mathematical theory of non-deterministic processes

The laws given in this chapter are distinctly more complicated than the laws given in the two earlier chapters; and the informal justifications and examples carry correspondingly less conviction. It is therefore even more important to construct a proper mathematical definition of the concept of a nondeterministic process, and prove the correctness of the laws from the definitions of the operators.

As in Section 2.8.1, a mathematical model is based on the relevant directly or indirectly observable properties of a process. These certainly include its alphabet and its traces; but for a nondeterministic process there are also its refusals (Section 3.4) and divergences (Section 3.8). In addition to refusals at the first step of a process P, it is necessary also to take into account what P may refuse after engaging in an arbitrary trace s of its behaviour. We therefore define the failures of a process as a relation (set of pairs)

    failures(P) = { (s, X) | s ∈ traces(P) ∧ X ∈ refusals(P / s) }

If (s, X) is a failure of P, this means that P can engage in the sequence of events recorded by s, and then refuse to do anything more, in spite of the fact that its environment is prepared to engage in any of the events of X. The failures of a process are more informative about the behaviour of that process than its traces or refusals, which can both be defined in terms of failures

    traces(P) = { s | ∃ X • (s, X) ∈ failures(P) } = domain(failures(P))
    refusals(P) = { X | (⟨⟩, X) ∈ failures(P) }

The various properties of traces (1.8.1 L6, L7, L8) and refusals (3.4.1 L8, L9, L10, L11) can easily be reformulated in terms of failures (see conditions C1, C2, C3, C4 under the definition D0 below).

We are now ready for the bold decision that a process is uniquely defined by the three sets specifying its alphabet, its failures, and its divergences; and conversely, any three sets which satisfy the relevant conditions uniquely define a process. We will first define the powerset of A as the set of all its subsets

    PA = { X | X ⊆ A }

D0 A process is a triple (A, F, D), where

    A is any set of symbols (for simplicity finite)
    F is a relation between A* and PA
    D is a subset of A*

provided that they satisfy the following conditions

    C1  (⟨⟩, {}) ∈ F
    C2  (s ⌢ t, X) ∈ F ⇒ (s, {}) ∈ F
    C3  (s, Y) ∈ F ∧ X ⊆ Y ⇒ (s, X) ∈ F
    C4  (s, X) ∈ F ∧ x ∈ A ⇒ (s, X ∪ {x}) ∈ F ∨ (s ⌢ ⟨x⟩, {}) ∈ F
    C5  D ⊆ domain(F)
    C6  s ∈ D ∧ t ∈ A* ⇒ s ⌢ t ∈ D
    C7  s ∈ D ∧ X ⊆ A ⇒ (s, X) ∈ F

(the last three conditions reflect the laws 3.8.2 L1, L2, L3).

The simplest process which satisfies this definition is also the worst

    D1  CHAOS_A = (A, (A* × PA), A*)

where A* × PA is the Cartesian product { (s, X) | s ∈ A* ∧ X ∈ PA }.

This is the largest process with alphabet A, since every member of A* is both a trace and a divergence, and every subset of A is a refusal after all traces.

Another simple process is defined

    D2  STOP_A = (A, {⟨⟩} × PA, {})

This process never does anything, can refuse everything, and has no divergences.

An operator is defined on processes by showing how the three sets of the result can be derived from those of their operands. Of course it is necessary to show that the result of the operation satisfies the seven conditions of D0; this proof is usually based on the assumption that its operands do so to start with.

The simplest operation to define is the nondeterministic or (⊓). Like many other operations, it is defined only for operands with the same alphabet

    D3  (A, F1, D1) ⊓ (A, F2, D2) = (A, F1 ∪ F2, D1 ∪ D2)

The resulting process can fail or diverge in all cases that either of its two operands can do so. The laws 3.2 L1, L2, L3 are direct consequences of this definition.

The definitions of all the other operators can be given similarly; but it seems slightly more elegant to write separate definitions for the alphabets, failures and divergences. The definitions of the divergences have been given in Section 3.8.2, so it remains only to define the alphabets and the failures.

    D4  If    αP(x) = A for all x
        and   B ⊆ A
        then  α(x : B → P(x)) = A.

    D5  α(P || Q) = (αP ∪ αQ)

    D6  α(f(P)) = f(α(P))

    D7  α(P □ Q) = α(P ||| Q) = αP    provided αP = αQ

    D8  α(P \ C) = αP − C

    D9  failures(x : B → P(x)) =
          { ⟨⟩, X | X ⊆ (αP − B) } ∪
          { ⟨x⟩ ⌢ s, X | x ∈ B ∧ (s, X) ∈ failures(P(x)) }

    D10  failures(P || Q) =
           { s, (X ∪ Y) | s ∈ (αP ∪ αQ)* ∧ (s ↾ αP, X) ∈ failures(P) ∧ (s ↾ αQ, Y) ∈ failures(Q) } ∪
           { s, X | s ∈ divergences(P || Q) }

    D11  failures(f(P)) = { f*(s), f(X) | (s, X) ∈ failures(P) }

    D12  failures(P □ Q) =
           { s, X | (s, X) ∈ failures(P) ∩ failures(Q) ∨
                    (s ≠ ⟨⟩ ∧ (s, X) ∈ failures(P) ∪ failures(Q)) } ∪
           { s, X | s ∈ divergences(P □ Q) }

    D13  failures(P ||| Q) =
           { s, X | ∃ t, u • s interleaves (t, u) ∧ (t, X) ∈ failures(P) ∧ (u, X) ∈ failures(Q) } ∪
           { s, X | s ∈ divergences(P ||| Q) }

    D14  failures(P \ C) =
           { s ↾ (αP − C), X | (s, X ∪ C) ∈ failures(P) } ∪
           { s, X | s ∈ divergences(P \ C) }

The explanations of these laws may be derived from the explanations of the corresponding traces and refusals, together with the laws for \.

It remains to give a definition for processes defined recursively by means of µ. The treatment is based on the same fixed point theory as Section 2.8.2, except that the definition of the ordering is different

    D15  (A, F1, D1) ⊑ (A, F2, D2) ≡ (F2 ⊆ F1 ∧ D2 ⊆ D1)

P ⊑ Q now means that Q is equal to P or better in the sense that it is less likely to diverge and less likely to fail. Q is more predictable and more controllable than P, because if Q can do something undesirable P can do it too; and if Q can refuse to do something desirable, P can also refuse. CHAOS can do anything at any time, and can refuse to do anything at any time. True to its name, it is the least predictable and controllable of all processes; or in short the worst

    L1  CHAOS ⊑ P

This ordering is clearly a partial order. In fact it is a complete partial order, with a limit operation defined in terms of the intersections of descending chains of failures and divergences

    D16  ⊔ n≥0 (A, Fn, Dn) = (A, ⋂ n≥0 Fn, ⋂ n≥0 Dn)
         provided (∀ n ≥ 0 • Fn+1 ⊆ Fn ∧ Dn+1 ⊆ Dn)

The µ operation is defined in the same way as for deterministic processes (2.8.2 L7), except for the difference in the definition of the ordering, which requires that CHAOS be used in place of STOP

    D17  µ X : A • F(X) = ⊔ i≥0 Fⁱ(CHAOS_A)

The proof that this is a solution (in fact the most nondeterministic solution) of the relevant equation is the same as that given in Section 2.8.2.

As before, the validity of the proof depends critically on the fact that all the operators used on the right-hand side of the recursion should be continuous in the appropriate ordering. Fortunately, all the operators defined in this book (except /) are continuous, and so is every formula constructed from them. In the case of the concealment operator, the requirement of continuity was one of the main motivations for the rather complicated treatment of divergence.

4 Communication

4.1 Introduction

In previous chapters we have introduced and illustrated a general concept of an event as an action without duration, whose occurrence may require simultaneous participation by more than one independently described process. In this chapter we shall concentrate on a special class of event known as a communication. A communication is an event that is described by a pair c.v where c is the name of the channel on which the communication takes place and v is the value of the message which passes. Examples of this convention have already been given in COPYBIT (1.1.3 X7) and CHAIN2 (2.6 X4).

The set of all messages which P can communicate on channel c is defined

    αc(P) = { v | c.v ∈ αP }

We also define functions which extract channel and message components of a communication

    channel(c.v) = c,    message(c.v) = v

All the operations introduced in this chapter can be defined in terms of the more primitive concepts introduced in earlier chapters, and most of the laws are just special cases of previously familiar laws. The reason for introducing special notations is that they are suggestive of useful applications and implementation methods, and that in some cases imposition of notational restrictions permits the use of more powerful reasoning methods.

4.2 Input and output

Let v be any member of αc(P). A process which first outputs v on the channel c and then behaves like P is defined

    (c!v → P) = (c.v → P)

The only event in which this process is initially prepared to engage is the communication event c.v.
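The definitions of channel, message and αc(P) above can be made concrete with a small sketch. It assumes that every event in the alphabet is a communication, and it represents c.v as a named pair; the names Event, alpha_c and ALPHABET are illustrative choices, not notation from the text.

```python
from typing import Hashable, NamedTuple, Set


class Event(NamedTuple):
    """A communication c.v: channel name c paired with message value v."""
    channel: str
    value: Hashable


def channel(event: Event) -> str:
    """channel(c.v) = c"""
    return event.channel


def message(event: Event) -> Hashable:
    """message(c.v) = v"""
    return event.value


def alpha_c(alphabet: Set[Event], c: str) -> Set[Hashable]:
    """alpha_c(P) = { v | c.v in alphabet of P }, given the alphabet as a set."""
    return {message(e) for e in alphabet if channel(e) == c}


# Hypothetical alphabet of a bit-copying process with channels "in" and "out".
ALPHABET = {Event("in", 0), Event("in", 1), Event("out", 0), Event("out", 1)}

if __name__ == "__main__":
    print(alpha_c(ALPHABET, "in"))
```

Under this representation the output prefix (c!v → P) has exactly one initial event, Event(c, v), which matches the definition (c!v → P) = (c.v → P).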

A process which is initially prepared to input any value x communicable on the channel c, and then behave like P(x), is defined

    (c?x → P(x)) = (y : { y | channel(y) = c } → P(message(y)))

Example

X1 Using the new definitions of input and output we can rewrite 1.1.3 X7

    COPYBIT = µ X • (in?x → (out!x → X))

where αin(COPYBIT) = αout(COPYBIT) = {0, 1}

We shall observe the convention that channels are used for communication in only one direction and between only two processes. A channel which is used only for output by a process will be called an output channel of that process; and one used only for input will be called an input channel. In both cases, we shall say loosely that the channel name is a member of the alphabet of the process.

When drawing a connection diagram (Section 2.4) of a process, the channels are drawn as arrows in the appropriate direction, and labelled with the name of the channel (Figure 4.1).

[Figure 4.1: connection diagram of a process P with channels left, right and down]

Let P and Q be processes, and let c be an output channel of P and an input channel of Q. When P and Q are composed concurrently in the system (P || Q), communication will occur on channel c on each occasion that P outputs a message and Q simultaneously inputs that message. An outputting process specifies a unique value for the message, whereas the inputting process is prepared to accept any communicable value. Thus the event that will actually occur is the communication c.v, where v is the value specified by the outputting process. This requires the obvious constraint that the channel c must have the same alphabet at both ends, i.e.,

    αc(P) = αc(Q)

In future, we will assume satisfaction of this constraint; and where no confusion can arise we will write αc for αc(P). An example of the working of this model for communication has been given in CHAIN2 (2.6 X4); and more interesting examples will be given in Section 4.3 and subsequent sections.

In general, the value to be output by a process is specified by means of an expression containing variables to which a value has been assigned by some previous input, as illustrated in the following examples.

Examples

X1 A process which immediately copies every message it has input from the left by outputting it to the right

    αleft(COPY) = αright(COPY)
    COPY = µ X • (left?x → right!x → X)

If αleft = {0, 1}, COPY is almost identical to COPYBIT (1.1.3 X7).

X2 A process like COPY, except that every number input is doubled before it is output

    αleft = αright = N
    DOUBLE = µ X • (left?x → right!(x + x) → X)

X3 The value of a punched card is a sequence of eighty characters, which may be read as a single value along the left channel. A process which reads cards and outputs their characters one at a time

    αleft = { s | s ∈ αright* ∧ #s = 80 }
    UNPACK = P_⟨⟩
    where  P_⟨⟩ = left?s → P_s
           P_⟨x⟩ = right!x → P_⟨⟩
           P_(⟨x⟩⌢s) = right!x → P_s

X4 A process which inputs characters one at a time from the left, and assembles them into lines of 125 characters' length. Each completed line is output on the right as a single array-valued message

    αright = { s | s ∈ αleft* ∧ #s = 125 }
    PACK = P_⟨⟩
    where  P_s = right!s → P_⟨⟩           if #s = 125
           P_s = left?x → P_(s⌢⟨x⟩)       if #s < 125

Here, P_s describes the behaviour of the process when it has input and packed the characters in the sequence s; they are waiting to be output when the line is long enough.
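As a rough illustration of examples X1 to X4, the sketch below models each process as a Python generator that consumes the sequence of messages on left and yields the sequence of messages on right. This is only an approximation made for illustration: it captures the input/output relation of each process but not synchronisation, refusals, or the interleaving of input and output events. The function names and the width parameter are assumptions, not part of the text.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def copy(left: Iterable[T]) -> Iterator[T]:
    """COPY: output each input message unchanged."""
    for x in left:
        yield x


def double(left: Iterable[int]) -> Iterator[int]:
    """DOUBLE: output x + x for each input x."""
    for x in left:
        yield x + x


def unpack(left: Iterable[Sequence[T]]) -> Iterator[T]:
    """UNPACK: input whole cards, output their characters one at a time."""
    for card in left:
        for x in card:
            yield x


def pack(left: Iterable[T], width: int = 125) -> Iterator[Tuple[T, ...]]:
    """PACK: collect inputs into s; output s as one message when #s = width."""
    s: List[T] = []
    for x in left:
        s.append(x)              # corresponds to P_s = left?x -> P_(s^<x>)
        if len(s) == width:      # corresponds to P_s = right!s -> P_<>
            yield tuple(s)
            s = []


if __name__ == "__main__":
    print(list(double([1, 2, 3])))
    print(list(pack("abcdefg", width=3)))
```

Characters still held in s when the input ends are never output, which mirrors the remark that they wait until the line is long enough.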

X5 A process which copies from left to right, except that each pair of consecutive asterisks is replaced by a single “↑”

    αleft = αright − {“↑”}
    SQUASH = µ X • left?x →
               if x ≠ “*” then (right!x → X)
               else left?y → if y = “*” then (right!“↑” → X)
                             else (right!“*” → right!y → X)

A process may be prepared initially to communicate on any one of a set of channels, leaving the choice between them to the other processes with which it is connected. For this purpose we adapt the choice notation introduced in Chapter 1. If c and d are distinct channel names

    (c?x → P(x) | d?y → Q(y))

denotes a process which initially inputs x on c and then behaves like P(x), or initially inputs y on channel d and then behaves like Q(y). The choice is determined by whichever of the corresponding outputs is ready first, as explained below.

Since we have decided to abstract from the timing of events and the speed of processes which engage in them, the last sentence of the previous paragraph may require explanation. Consider the case when the channels c and d are output channels of two other separate processes, which are independent in the sense that they do not directly or indirectly communicate with each other. The actions of these two processes are therefore arbitrarily interleaved. Thus if one process is making progress towards an output on c, and the other is making progress towards an output on d, it is not determined which of them reaches its output first. An implementor will be expected, but not compelled, to resolve this nondeterminism in favour of the first output to become ready. This policy also protects against the deadlock that will result if the second output is never going to occur, or if it can occur only after the first output, as in the case when both the channels c and d are connected to the same concurrent process, which outputs on one and then the other

    (c!2 → d!4 → P)

Thus the presentation of a choice of inputs not only protects against deadlock but also achieves greater efficiency and reduces response times to proffered communications. A traveller who is waiting for a number 127 bus will in general have to wait longer than one who is prepared to travel in either a number 19 or a number 127, whichever arrives first at the bus stop. On the assumption of random arrivals, the traveller who offers a choice will wait only half as long; paradoxically, it is as though he is waiting twice as fast! To wait for the first of many possible events is the only way of achieving this: purchase of faster computers is useless.
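The "half as long" claim about the two travellers can be checked numerically under one specific reading of "random arrivals". The sketch below assumes that each bus route arrives as an independent Poisson process with the same rate λ, so that the wait for one route is exponential with mean 1/λ and the wait for the first of two routes is exponential with mean 1/(2λ). The 10-minute mean interval, the trial count and the function name are arbitrary illustrative choices.

```python
import random
from typing import Sequence


def mean_wait(rates: Sequence[float], trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the mean wait for the first arrival among independent
    Poisson arrival streams with the given rates (arrivals per minute)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The traveller boards whichever acceptable bus arrives first.
        total += min(rng.expovariate(rate) for rate in rates)
    return total / trials


RATE = 1 / 10  # assumed: one bus per route every 10 minutes on average

only_127 = mean_wait([RATE])            # waits for a number 127 only
either_bus = mean_wait([RATE, RATE])    # accepts a number 19 or a number 127

if __name__ == "__main__":
    print(f"127 only: {only_127:.2f} min, either bus: {either_bus:.2f} min")
```

The ratio either_bus / only_127 comes out close to one half, in line with the text; with other arrival models (for example fixed timetables) the factor would generally differ.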

Examples

X6 A process which accepts input on either of the two channels left1 or left2, and immediately outputs the message to the right

    αleft1 = αleft2 = αright
    MERGE = (left1?x → right!x → MERGE
            | left2?x → right!x → MERGE)

The output of this process is an interleaving of the messages input from left1 and left2.

X7 A process that is always prepared to input a value on the left, or to output to the right a value which it has most recently input

    αleft = αright
    VAR = left?x → VAR_x
    where  VAR_x = (left?y → VAR_y
                   | right!x → VAR_x)

Here VAR_x behaves like a program variable with current value x. New values are assigned to it by communication on the left channel, and its current value is obtained by communication on the right channel. If αleft = {0, 1} the behaviour of VAR is almost identical to that of BOOL (2.6 X5).

X8 A process which inputs from up and left, outputs to down a function of what it has input, before repeating

    NODE(v) = µ X • (up?sum → left?prod → down!(sum + v × prod) → X)

X9 A process which is at all times ready to input a message on the left, and to output on its right the first message which it has input but not yet output

    BUFFER = P_⟨⟩
    where  P_⟨⟩ = left?x → P_⟨x⟩
           P_(⟨x⟩⌢s) = (left?y → P_(⟨x⟩⌢s⌢⟨y⟩)
                       | right!x → P_s)

BUFFER behaves like a queue; messages join the right-hand end of the queue and leave it from the left end, in the same order as they joined, but after a possible delay, during which later messages may join the queue.

X10 A process which behaves like a stack of messages. When empty, it responds to the signal empty. At all times it is ready to input a new message from the left and put it on top of the stack; and whenever nonempty, it is prepared to output and remove the top element of the stack

    STACK = P_⟨⟩
    where  P_⟨⟩ = (empty → P_⟨⟩ | left?x → P_⟨x⟩)
           P_(⟨x⟩⌢s) = (right!x → P_s | left?y → P_(⟨y⟩⌢⟨x⟩⌢s))

This process is very similar to the previous example, except that when empty it participates in the empty event, and that it puts newly arrived messages on the same end of the stored sequence as it takes them off. Thus if y is the newly arrived input message, and x is the message currently ready for output, the STACK stores ⟨y⟩⌢⟨x⟩⌢s but the BUFFER stores ⟨x⟩⌢s⌢⟨y⟩.

4.2.1 Implementation

In a LISP implementation of communicating processes, the event c.v is naturally represented by the dotted pair c.v, which is constructed by cons("c, v).

Input and output commands are conveniently implemented as functions which first take a channel name as argument. If the process is not prepared to communicate on the channel, it delivers the answer "BLEEP. The actual value communicated is treated separately in the next stage, as described below.

If Q is the input command (c?x → Q(x)) then Q("c) ≠ "BLEEP; instead, its result is a function which expects the input value x as its argument, and delivers as its result the process Q(x). Thus Q is implemented by calling the LISP function

    input("c, λ x • Q(x))

which is defined

    input(c, F) = λ y • if y ≠ c then "BLEEP else F

It follows that Q / ⟨c.v⟩ is represented in LISP by Q("c)(v), provided that ⟨c.v⟩ is a trace of Q.
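The two-stage representation just described translates almost directly into any language with first-class functions. The following Python sketch mirrors the LISP definition of input; the matching output command, which the text goes on to define, is included only so that the COPY process can be run end to end. The names BLEEP, input_, output and copy are illustrative, and the string "BLEEP" stands in for the LISP atom.

```python
BLEEP = "BLEEP"  # stands in for the LISP atom "BLEEP


def input_(c, F):
    """c?x -> F(x): asked about channel c, answer with the function F that
    awaits the input value; asked about any other channel, answer BLEEP."""
    return lambda y: F if y == c else BLEEP


def output(c, v, P):
    """c!v -> P: asked about channel c, answer with the pair (v, P);
    asked about any other channel, answer BLEEP."""
    return lambda y: (v, P) if y == c else BLEEP


# COPY = LABEL X . input("left", lambda x: output("right", x, X))
# The inner lambda looks up the name copy only when it is called, by which
# time the assignment below has completed, so the recursion is well defined.
copy = input_("left", lambda x: output("right", x, copy))

if __name__ == "__main__":
    after_input = copy("left")(7)        # COPY / <left.7>
    value, rest = after_input("right")   # the value output, and what follows
    print(value, rest is copy, copy("right"))
```

Here copy("left")(7) plays the role of Q("c)(v) in the text, and the first component of the pair returned by an output command is the value transmitted.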

4. Thus P is implemented by calling the LISP function output ("c.9. if αc is ﬁnite. One of the justiﬁcations for introducing specialised notation for input and output is to encourage and permit methods of implementation which are signiﬁcantly more eﬃcient.2 Input and output 119 If P is the output command (c!v → P ) then P ("c) ≠ "BLEEP .v is a trace of P . cons(x. x. If c is a channel name. P ) It follows that v = car (P ("c)). X (NIL)) else input ("left .2. In theory.v) ≠ "BLEEP for all values v in αc. its result is the pair cons(v. it would be possible to treat c. provided that c. But this would be grotesquely ineﬃcient. v. X )) X2 PACK = P (NIL) where P = LABEL X • (λ s • if length(s) = 125 then output ("right .2 Speciﬁcations In specifying the behaviour of a communicating process. NIL))))) 4. Examples X1 COPY = LABEL X • input ("left . P ). λ x • X (append(s. instead. we deﬁne (see Section 1. since the only way of ﬁnding out what value is output would be to test whether P (c. it is convenient to describe separately the sequences of messages that pass along each of the channels. P ) = λ y •if y ≠ c then "BLEEP else cons(v. The disadvantage is that the implementation of nearly all the other operators needs to be recoded in the light of this optimisations. passed as a parameter to the input and output commands. λ x • output ("right .v is represented in LISP by cdr (P ("c)). v. s. and that P / c.v as a single event.6) tr ↓ c = message∗ (tr αc) . P ) which is deﬁned output (c. until the right one is found.

It is convenient just to omit the tr ↓, and write right ≤ left instead of tr ↓ right ≤ tr ↓ left.

Another useful definition places a lower bound on the length of a prefix

    s ≤ⁿ t = (s ≤ t ∧ #t ≤ #s + n)

This means that s is a prefix of t, with not more than n items removed. The following laws are obvious and useful

    s ≤⁰ t ≡ (s = t)
    s ≤ⁿ t ∧ t ≤ᵐ u ⇒ s ≤ⁿ⁺ᵐ u
    s ≤ t ≡ ∃ n • s ≤ⁿ t

Examples

X1  COPY sat right ≤¹ left

X2  DOUBLE sat right ≤¹ double*(left)

X3  UNPACK sat right ≤ ⌢/ left
    where ⌢/⟨s0, s1, ..., sn−1⟩ = s0 ⌢ s1 ⌢ ... ⌢ sn−1    (see 1.9.2)

The specification here states that the output on the right is obtained by flattening the sequence of sequences input on the left.

X4  PACK sat ((⌢/ right ≤¹²⁵ left) ∧ (#* right ∈ {125}*))

This specification states that each element output on the right is itself a sequence of length 125, and the catenation of all these sequences is an initial subsequence of what has been input on the left.

If ⊕ is a binary operator, it is convenient to apply it distributively to the corresponding elements of two sequences. The length of the resulting sequence is equal to that of the shorter operand

    s ⊕ t = ⟨⟩                           if s = ⟨⟩ or t = ⟨⟩
          = ⟨s0 ⊕ t0⟩ ⌢ (s′ ⊕ t′)        otherwise

Clearly

    (s ⊕ t)[i] = s[i] ⊕ t[i]    for i ≤ min(#s, #t)

and

    s ≤ⁿ t ⇒ (s ⊕ u ≤ⁿ t ⊕ u) ∧ (u ⊕ s ≤ⁿ u ⊕ t)
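The bounded prefix relation and the distributed operator ⊕ are easy to experiment with on finite sequences. The sketch below is a direct transcription of the two definitions above, using Python lists for sequences; the function names is_prefix, prefix_within and pointwise are illustrative and do not come from the text.

```python
import operator
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def is_prefix(s: Sequence[T], t: Sequence[T]) -> bool:
    """s <= t: s is an initial subsequence of t."""
    return list(t[:len(s)]) == list(s)


def prefix_within(s: Sequence[T], t: Sequence[T], n: int) -> bool:
    """s <=^n t: s is a prefix of t with at most n items removed."""
    return is_prefix(s, t) and len(t) <= len(s) + n


def pointwise(op: Callable[[T, T], T], s: Sequence[T], t: Sequence[T]) -> List[T]:
    """s (+) t: apply op to corresponding elements; the result is as long
    as the shorter operand."""
    return [op(a, b) for a, b in zip(s, t)]


if __name__ == "__main__":
    left, right = [3, 1, 4, 1], [3, 1, 4]
    # COPY sat right <=^1 left: the output lags the input by at most one item.
    print(prefix_within(right, left, 1))
    print(pointwise(operator.add, [1, 2, 3], [10, 20]))
```

With these helpers the listed laws can be spot-checked on examples, for instance that prefix_within(s, t, 0) holds exactly when s equals t.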

Examples

X5 The Fibonacci sequence

    ⟨1, 1, 2, 3, 5, 8, ...⟩

is defined by the recurrence relation

    fib[0] = fib[1] = 1
    fib[i + 2] = fib[i + 1] + fib[i]

The second line can be rewritten using the ′ operator to left-shift the sequence by one place

    fib″ = fib′ + fib

The original definition of the Fibonacci sequence may be recovered from this more cryptic form by subscripting both sides of the equation

    fib″[i] = (fib′ + fib)[i]
    ⇒ fib′[i + 1] = fib′[i] + fib[i]        [1.9.4 L1]
    ⇒ fib[i + 2] = fib[i + 1] + fib[i]

Another explanation of the meaning of the equation is as a description of the infinite sum, where the left shift is clearly displayed

      ⟨1, 1, 2, 3, 5, ...⟩    = fib
    + ⟨1, 2, 3, 5, ...⟩       = fib′
    = ⟨2, 3, 5, ...⟩          = fib″

In the above discussion, fib is regarded as an infinite sequence. If s is a finite initial subsequence of fib (with #s ≥ 2) then instead of the equation we get the inequality

    s″ ≤ s′ + s

This formulation can be used to specify a process FIB which outputs the Fibonacci sequence to the right.

    FIB sat (right ≤ ⟨1, 1⟩ ∨ (⟨1, 1⟩ ≤ right ∧ right″ ≤ right′ + right))

X6 A variable with value x outputs on the right the value most recently input on the left, or x, if there is no such input. More formally, if the most recent

action was an output, then the value which was output is equal to the last item in the sequence ⟨x⟩⌢left

    VAR_x sat (channel(t̅r̅₀) = right ⇒ r̅i̅g̅h̅t̅₀ = (⟨x⟩⌢left)̅₀)

where s̅₀ is the last element of s (Section 1.9.5).

This is an example of a process that cannot be adequately specified solely in terms of the sequence of messages on its separate channels. It is also necessary to know the order in which the communications on separate channels are interleaved, for example that the latest communication is on the right. In general, this extra complexity will be necessary for processes which use the choice operator.

X7 The MERGE process produces an interleaving (Section 1.9.3) of the two sequences input on left1 and left2, buffering up to one message

    MERGE sat ∃ r • right ≤¹ r ∧ r interleaves (left1, left2)

X8  BUFFER sat right ≤ left

A process which satisfies the specification right ≤ left describes the behaviour of a transparent communications protocol, which is guaranteed to deliver on the right only those messages which have been submitted on the left, and in the same order. A protocol achieves this in spite of the fact that the place where the messages are submitted is widely separated from the place where they are received, and the fact that the communications medium which connects the two places is somewhat unreliable. Examples will be given in Section 4.4.5.

4.3 Communications

Let P and Q be processes, and let c be a channel used for output by P and for input by Q. Thus the set containing all communication events of the form c.v is within the intersection of the alphabet of P with the alphabet of Q. When these processes are composed concurrently in the system (P || Q), a communication c.v can occur only when both processes engage simultaneously in that event, i.e., whenever P outputs a value v on the channel c, and Q simultaneously inputs the same value. An inputting process is prepared to accept any communicable value, so it is the outputting process that determines which actual message value is transmitted on each occasion, as in 2.6 X4.

Thus output may be regarded as a specialised case of the prefix operator, and input a special case of choice; and this leads to the law

    L1  (c!v → P) || (c?x → Q(x)) = c!v → (P || Q(v))

Note that c!v remains on the right-hand side of this equation as an observable action in the behaviour of the system. This represents the physical possibility of tapping the wires connecting the components of a system, and of thereby

In the speciﬁcation of P . which is not always possible. c stands for the sequence of messages communicated by P on c. as shown by the law L2 ((c!v → P ) || (c?x → Q (x))) \ C = (P || Q (v)) \ C where C = { c. Let c be the name of a channel along which P and Q communicate.3 Communications 123 keeping a log of their internal communications. It follows that (P || Q ) sat (right ≤1 173 × mid) ∧ (mid ≤1 square∗ (left )) . in the speciﬁcation of Q . If desired.4.2 X6. Consequently this sequence must satisfy both the speciﬁcation of P and the speciﬁcation of Q . The speciﬁcation of the parallel composition of communicating processes takes a particularly simple form when channel names are used to denote the sequences of messages passing on them.4 and 4.v | v ∈ αc } Examples will be given in Sections 4. Fortunately.2. it is Q that determines the properties of the communications on its own channels. This channel cannot be mentioned in the speciﬁcation of Q . the sequences of messages sent and received must at all times be identical. Consider now a channel d in the alphabet of P but not of Q . by the very nature of communication. Similarly. such internal communications can be concealed by applying the concealment operator described in Section 3.5 outside the parallel composition of the two processes which communicate on the same channel. as shown by 4. The same is true for all channels in the intersection of their alphabets. so the values communicated on it are constrained only by the speciﬁcation of P . this simpliﬁcation is valid only when the speciﬁcations of P and Q are expressed wholly in terms of the channel names. c stands for the sequence of messages communicated by Q . It is also a help in reasoning about the system. when P and Q communicate on c. Consequently a speciﬁcation of the behaviour of (P || Q ) can be simply formed as the logical conjunction of the speciﬁcation of P with that of Q . However. Similarly. 
Example X1 Let P = (left ?x → mid!(x × x) → P ) Q = (mid?y → right !(173 × y) → Q ) Clearly P sat (mid ≤1 square∗ (left )) Q sat (right ≤1 173 × mid) where (173 × mid) multiples each message of mid by 173.5.

and the results must be output at the same rate as the input. the number (a × x + b × y) is to be output on the right. and compose them X21 = (left1?x → mid!(a × x) → X21) X22 = (left2?y → mid?z → right !(z + b × y) → X22) X2 = (X21 || X22) Clearly. For each x read from left1 and each y from left2. but channels in the alphabet of only one process are left free. Such systems are called data ﬂow networks. but possibly after an initial delay. We therefore deﬁne two processes. The technique is particularly eﬀective when the same calculation must be performed on each member of a stream of input data. A picture of a system of communicating processes closely represents their physical realisation. When communicating processes are connected by the concurrency operator ||. X2 sat (mid ≤1 a × left1 ∧ right ≤1 mid + b × left2) ⇒ (right ≤ a × left1 + b × left2) . An output channel of one processes is connected to a like-named input channel of the other process.2. Thus the example X1 can be drawn.2 Examples X2 Two streams of numbers are to be input from left1 and left2. as shown in Figure 4. the resulting formulae are highly suggestive of a physical implementation method in which electronic components are connected by wires along which they communicate. The speed requirement dictates that the multiplications must proceed concurrently. The purpose of such an implementation is to increase the speed with which useful results can be produced.124 4 Communication The speciﬁcation here implies right ≤ 173 × square∗ (left ) which was presumably the original intention. left P mid Q right Figure 4.

so in this case absence of deadlock cannot so easily be assured.3 Communications 125 X3 A stream of numbers is to be input on the left.7 L2). deadlock would occur rapidly.3.3 When two concurrent processes communicate with each other by output and input only on a single channel. any network of nonstopping processes which is free of cycles cannot deadlock. they cannot deadlock (compare 2. However. In proving the absence of deadlock it is often possible to ignore the content of the messages. As a result. the network of X3 contains an undirected cycle. we require that right ≤ a × left + b × left The solution can be constructed by adding a new process X23 to the solution of X2 X3 = (X2 || X23) where X23 sat (left1 ≤1 left ∧ left2 ≤1 left ) X23 can be deﬁned X23 = (left ?x → left1!x → (µ X • left ?x → left2!x → left1!x → X )) It copies from left to both left1 and left2. and on the right is output a weighted sum of consecutive pairs of input numbers. since an acyclic graph can be decomposed into subgraphs connected only by a single arrow. More precisely. A picture of the network of X3 is shown in Figure 4. and regard each communication on channel c as a single event . For example. and cyclic networks cannot be decomposed into subnetworks except with connections on two or more channels. if the two outputs left2!x → left1!x → in the loop of X3 were reversed. with weights a and b.4. left1 X21 left X23 mid right X22 left2 Figure 4. but omits the ﬁrst element in the case of left2.

The solution to the problem should therefore be designed as an iterative array with at least n elements. Thus X3 can be written in terms of these events (µ X • left1 → mid → X ) || (µ Y • left2 → mid → Y ) || (left1 → (µ Z • left2 → left1 → Z )) = µ X3 • (left1 → left2 → mid → X3) This proves that X3 cannot deadlock. The shape of the network corresponds closely to the structure of the operands and operators appearing in the expressions to be computed. It is therefore clear that at least n processors will be required to operate concurrently. These examples show how data ﬂow networks can be set up to compute one or more streams of results from one or more streams of input data. the term systolic array is often used.126 4 Communication named c. since data passes through the system much like blood through the chambers of the heart. and to introduce an iterated notation for concurrent combination ||i<n P (i) = (P (0) || P (1) || . . Communications on unconnected channels can be ignored. it is convenient to use subscripted names for channels. . Each coordinate set is to be multiplied by a ﬁxed vector V of length n. a multiplication. each containing at most one multiplication. All that is required is for j < n . using algebraic methods as in 2. and the resulting scalar product is to be output to the right. Let us replace the Σ in the speciﬁcation by its usual inductive deﬁnition mid0 = 0 ∗ midj+1 = Vj × leftj + midj right = midn Thus we have split the speciﬁcation into a conjunction of n + 1 component equations.3 X1. The speed of each individual processor is such that it takes nearly one microsecond to do an input. If the connection diagram has no directed cycles. Examples X4 The channels { leftj | j < n } are used to input the coordinates of successive points in n-dimensional space. || P (n − 1)) A regular network of this kind is known as an iterative array. an addition and an output. When these patterns are large but regular. 
or more formally right ≤ Σn−1 Vj × leftj j=0 It is speciﬁed that in each microsecond the n coordinates of one point are to be input and one scalar product is to be output.


to write a process for each equation: for j < n, we write

  MULT_0 = (µ X • mid_0!0 → X)
  MULT_{j+1} = (µ X • left_j?x → mid_j?y → mid_{j+1}!(V_j × x + y) → X)
  MULT_{n+1} = (µ X • mid_n?x → right!x → X)
  NETWORK = ||_{j<n+2} MULT_j

The connection diagram is shown in Figure 4.4.

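[Figure 4.4: connection diagram of NETWORK. MULT_0 outputs on mid_0 to MULT_1, which also inputs on left_0; the chain continues through MULT_n, which inputs on left_{n−1} and outputs on mid_n to MULT_{n+1}, which outputs on right.]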

X5 This is similar to X4, except that m different scalar products of the same coordinate sets are required almost simultaneously. Effectively, the channel left_j (for j < n) is to be used to input the jth column of an infinite array; this is to be multiplied by the (n × m) matrix M, and the ith column of the result is to be output on right_i, for i < m. In formulae

  right_i = Σ_{j<n} M_{ij} × left_j

The coordinates of the result are required as rapidly as before, so at least m × n processes are required.


The solution might find practical application in a graphics display device which automatically transforms or even rotates a two-dimensional representation of a three-dimensional object. The shape is defined by a series of points in absolute space; the iterative array applies linear transformations to compute the deflection on the x and y plates of the cathode ray tube; a third output coordinate could perhaps control the intensity of the beam.

[Figure 4.5: iterative array with n channels labelled left and m channels labelled right.]

The solution is based on Figure 4.5. Each column of this array (except the last) is modelled on the solution to X4; but it copies each value input on its horizontal input channel to its neighbour on its horizontal output channel. The processes on the right margin merely discard the values they input. It would be possible to economise by absorbing the functions of these marginal processors into their neighbours. The details of the solution are left as an exercise.


X6 The input on channel c is to be interpreted as the successive digits of a natural number C, starting from the least significant digit, and expressed with number base b. We define the value of the input number as

  C = Σ_{i≥0} c[i] × b^i

where c[i] < b for all i. Given a fixed multiplier M, the output on channel d is to be the successive digits of the product M × C. The digits are to be output after minimal delay.

Let us specify the problem more precisely. The desired output d is

  d = Σ_{i≥0} M × c[i] × b^i

The jth element of d must be the jth digit, which can be computed by the formula

  d[j] = ((Σ_{i≥0} M × c[i] × b^i) div b^j) mod b
       = (M × c[j] + z_j) mod b

where z_j = (Σ_{i<j} M × c[i] × b^i) div b^j and div denotes integer division. z_j is the carry term, and can readily be proved to satisfy the inductive definition

  z_0 = 0
  z_{j+1} = ((M × c[j] + z_j) div b)

We therefore define a process MULT1(z), which keeps the carry z as a parameter

  MULT1(z) = c?x → d!(M × x + z) mod b → MULT1((M × x + z) div b)

The initial value of z is zero, so the required solution is

  MULT = MULT1(0)
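The behaviour of MULT1 can be checked with a small Python sketch. It assumes that a generator is an acceptable stand-in for the process, with the input iterable playing the role of channel c and the yielded values the role of channel d; the helper names `to_digits` and `from_digits` are illustrative only.

```python
from typing import Iterable, Iterator, List


def mult1(digits: Iterable[int], M: int, base: int = 10) -> Iterator[int]:
    """MULT1(z): for each input digit x, output (M*x + z) mod b and keep
    (M*x + z) div b as the new carry z. The carry starts at zero."""
    z = 0
    for x in digits:
        t = M * x + z
        yield t % base
        z = t // base


def to_digits(n: int, base: int = 10) -> List[int]:
    """Digits of n, least significant first (illustrative helper)."""
    out = []
    while n > 0:
        out.append(n % base)
        n //= base
    return out or [0]


def from_digits(ds: Iterable[int], base: int = 10) -> int:
    """Inverse of to_digits (illustrative helper)."""
    return sum(d * base ** i for i, d in enumerate(ds))


# One trailing zero digit is appended so that the final carry is flushed.
c = to_digits(153091) + [0]
print(from_digits(mult1(c, 3)))
```

The trailing zero is needed because the process outputs exactly one digit per input digit; without it a non-zero final carry would never appear on d.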

X7 The problem is the same as X6, except M is a multi-digit number

  M = Σ_{i<n} M_i × b^i

A single processor can multiply only single-digit numbers. However, output is to be produced at a rate which allows only one multiplication per digit. Consequently, at least n processors are required. We will get each NODE_i to look after one digit M_i of the multiplier.


The basis of a solution is the traditional manual algorithm for multi-digit multiplication, except that the partial sums are added immediately to the next row of the table

  . . . 153091    C          the incoming number
           253    M          the multiplier
  . . . 306182    M_2 × C    computed by NODE_2
  . . . 765455    M_1 × C
  . . . 827275    25 × C     computed by NODE_1
  . . . 459273    M_0 × C
  . . . 732023    M × C      computed by NODE_0

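[Figure 4.6: the nodes NODE_{n−1}, . . . , NODE_0 arranged in a row, linked by the channels c_{n−1}, . . . , c_1 and d_{n−1}, . . . , d_1, with c_0 and d_0 as the external channels of NODE_0.]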

The nodes are connected as shown in Figure 4.6. The original input comes in on c_0 and is propagated leftward on the c channels. The partial answers are propagated rightward on the d channels, and the desired answer is output on d_0. Fortunately each node can give one digit of its result before communicating with its left neighbour. Furthermore, the leftmost node can be defined to behave like the answer to X6

  NODE_{n−1}(z) = c_{n−1}?x → d_{n−1}!(M_{n−1} × x + z) mod b →
                  NODE_{n−1}((M_{n−1} × x + z) div b)

The remaining nodes are similar, except that each of them passes the input digit to its left neighbour, and adds the result from its left neighbour to its own carry. For k < n − 1

  NODE_k(z) = c_k?x → d_k!(M_k × x + z) mod b → c_{k+1}!x → d_{k+1}?y →
              NODE_k(y + (M_k × x + z) div b)

The whole network is defined

  ||_{i<n} NODE_i(0)
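The arithmetic performed by the network of nodes can be traced with a sequential Python sketch. This is an assumption-laden simplification: the n concurrent nodes are replaced by one loop that, for each input digit, performs the step of every node in turn, and the names `node_network` and `from_digits` are illustrative. Each node k keeps its own carry z[k], outputs one digit before hearing from its left neighbour, and then adds the neighbour's digit y to its carry, as in the definition of NODE_k.

```python
from typing import Iterable, Iterator, Sequence


def node_network(c_digits: Iterable[int], m_digits: Sequence[int],
                 base: int = 10) -> Iterator[int]:
    """Yield the digits output on d_0, least significant first.

    m_digits[k] is the digit M_k looked after by NODE_k.
    """
    n = len(m_digits)
    z = [0] * n                      # carry parameter of each NODE_k
    for x in c_digits:               # x is passed leftward on every c_k
        t = [m_digits[k] * x + z[k] for k in range(n)]
        d = [tk % base for tk in t]  # digit each node outputs on d_k
        for k in range(n):
            # y is the digit received on d_{k+1}; the leftmost node has none.
            y = d[k + 1] if k < n - 1 else 0
            z[k] = y + t[k] // base
        yield d[0]


def from_digits(ds: Iterable[int], base: int = 10) -> int:
    """Value of a digit sequence, least significant first (illustrative helper)."""
    return sum(d * base ** i for i, d in enumerate(ds))


# C = 153091 and M = 253 as in the table above; three trailing zeros
# flush the carries still held in the nodes.
c = [1, 9, 0, 3, 5, 1, 0, 0, 0]
m = [3, 5, 2]
print(from_digits(node_network(c, m)))
```

The last six digits produced agree with the bottom row of the manual multiplication table (. . . 732023).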


X7 is a simple example from a class of ingenious network algorithms, in which there is an essential cycle in the directed graph of communication channels. But the statement of the problem has been much simplified by the assumption that the multiplier is known in advance and fixed for all time. In a practical application, it is much more likely that such parameters would have to be input along the same channel as the subsequent data, and would have to be reinput whenever it is required to change them. The implementation of this requires great care, but little ingenuity.

A simple implementation method is to introduce a special symbol, say reload, to indicate that the next number or numbers are to be treated as a change of parameter; and if the number of parameters is variable, an endreload symbol may also be introduced.

Example

X8 Same as X4, except that the parameters V_j are to be reloaded by the number immediately following a reload symbol. The definition of MULT_{j+1} needs to be changed to include the multiplier as parameter

  MULT_{j+1}(v) = left_j?x →
                    if x = reload then (left_j?y → MULT_{j+1}(y))
                    else (mid_j?y → mid_{j+1}!(v × x + y) → MULT_{j+1}(v))
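The reloadable multiplier of X8 can be sketched in Python as follows. The sentinel object `RELOAD` and the function name `mult_stage` are assumptions made for illustration; the text only requires that reload be a special symbol distinguishable from ordinary numbers.

```python
from typing import Any, Iterable, Iterator

RELOAD = object()   # hypothetical stand-in for the special reload symbol
_END = object()     # private marker for an exhausted channel


def mult_stage(v: int, left: Iterable[Any], mid_in: Iterable[int]) -> Iterator[int]:
    """MULT_{j+1}(v): after reload, the next value on left_j replaces v;
    otherwise combine x with the next y from mid_j and output v*x + y."""
    left_it, mid_it = iter(left), iter(mid_in)
    for x in left_it:
        if x is RELOAD:
            new_v = next(left_it, _END)
            if new_v is _END:       # reload with nothing following it
                return
            v = new_v
        else:
            y = next(mid_it, _END)
            if y is _END:           # no partial sum available
                return
            yield v * x + y


# The multiplier starts at 2 and is reloaded with 5 before the second point.
print(list(mult_stage(2, [1, RELOAD, 5, 3], [10, 20])))
```

Note that a reload consumes two inputs from left_j and none from mid_j, so the stage stays in step with its neighbour on the mid channels.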

4.4 Pipes

In this section we shall confine attention to processes with only two channels in their alphabet, namely an input channel left and an output channel right. Such processes are called pipes, and they may be pictured as in Figure 4.7.

[Figure 4.7: two pipes, P and Q, each drawn with an input channel left and an output channel right.]

The processes P and Q may be joined together so that the right channel of P is connected to the left channel of Q, and the sequence of messages output by P and input by Q on this internal channel is concealed from their common

environment. The result of the connection is denoted

  P>>Q

and may be pictured as the series shown in Figure 4.8.

[Figure 4.8: P and Q in series, with external channels left (into P) and right (out of Q); the connecting channel is unnamed.]

This connection diagram represents the concealment of the connecting channel by not giving it a name. It also shows that all messages input on the left channel of (P>>Q) are input by P, and all messages output on the right channel of (P>>Q) are output by Q. Finally (P>>Q) is itself a pipe, and may again be placed in series with other pipes

  (P>>Q)>>R,  (P>>Q)>>(R>>S),  etc.

By 4.4.1 L1 >> is associative, so in future we shall omit brackets in such a series.

The validity of chaining processes by >> depends on the obvious alphabet constraints

  α(P>>Q) = αleft(P) ∪ αright(Q)

and a further constraint states that the connected channels are capable of transmitting the same kind of message

  αright(P) = αleft(Q)

Examples

X1 A pipe which outputs each input value multiplied by four (4.2 X2)

  QUADRUPLE = DOUBLE>>DOUBLE

X2 A process which inputs cards of eighty characters and outputs their text, tightly packed into lines of 125 characters each (4.2 X3, X4)

  UNPACK>>PACK

This process is quite difficult to write using conventional structured programming techniques, because it is not clear whether the major loop should iterate once per input card, or once per output line. The problem is known by Michael Jackson as structure clash. The solution given above contains a separate loop in each of the two processes, which nicely matches the structure of the original problem.

X3 Same as X2, except that each pair of consecutive asterisks is replaced by “↑” (4.2 X5)

  UNPACK>>SQUASH>>PACK

In a conventional sequential program, this minor change in specification could cause severe problems. It is nice to avoid such problems by the simple expedient of inserting an additional process. This kind of modularity has been introduced and exploited by the designers of operating systems.

X4 Same as X2, except that the reading of cards may continue when the printer is held up, and the printing can continue when the card reader is held up (4.2 X9)

  UNPACK>>BUFFER>>PACK

The buffer holds characters which have been produced by the UNPACK process, but not yet consumed by the PACK process. They will be available for input by the PACK process during times when the UNPACK process is temporarily delayed.

The buffer thus smoothes out temporary variations in the rate of production and consumption. However it can never solve the problem of long-term mismatch between the rates of production and consumption. If the card reader is on average slower than the printer, the buffer will be nearly always empty, and no smoothing effect will be achieved. If the reader is faster, the buffer will expand indefinitely, until it consumes all available storage space.

X5 In order to avoid undesirable expansion of buffers, it is usual to limit the number of messages buffered. Even the single buffer provided by the COPY process (4.2 X1) may be adequate. Here is a version of X4 which reads one card ahead on input and buffers one line on output

  COPY>>UNPACK>>PACK>>COPY

Note the alphabets of the two instances of COPY are different, a fact which should be understood from the context in which they are placed.

X6 A double buffer, which accepts up to two messages before requiring output of the first

  COPY>>COPY

Its behaviour is similar to that of CHAIN2 (2.6 X4) and VMS2 (1.1.3 X6).

4.4.1 Laws

The most useful algebraic property of chaining is associativity

  L1 P>>(Q>>R) = (P>>Q)>>R
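Chaining of pipes has a familiar analogue in generator composition, which the following Python sketch uses to illustrate QUADRUPLE = DOUBLE>>DOUBLE and the associativity of >>. It is a rough analogy under stated assumptions: each pipe is modelled as a function from an input iterable (left) to an output iterator (right), and the helper `chain` is a name invented here. Generators are driven purely by demand from the right, so the sketch shows only the relation between input and output sequences; it does not model buffering capacity, concurrency or livelock.

```python
from typing import Callable, Iterable, Iterator

Pipe = Callable[[Iterable[int]], Iterator[int]]


def copy(left: Iterable[int]) -> Iterator[int]:
    """COPY: output each input value unchanged."""
    for x in left:
        yield x


def double(left: Iterable[int]) -> Iterator[int]:
    """DOUBLE: output each input value multiplied by two."""
    for x in left:
        yield 2 * x


def chain(*pipes: Pipe) -> Pipe:
    """P >> Q >> ...: connect the right of each pipe to the left of the next.
    The intermediate streams are local to `piped`, so they are concealed."""
    def piped(left: Iterable[int]) -> Iterator[int]:
        stream: Iterable[int] = left
        for p in pipes:
            stream = p(stream)
        return iter(stream)
    return piped


quadruple = chain(double, double)
print(list(quadruple([1, 2, 3])))

# Associativity: both bracketings relate left to right in the same way.
a = chain(chain(double, copy), double)
b = chain(double, chain(copy, double))
print(list(a([5])) == list(b([5])))
```

Because `chain` returns another function of the same type, a chained pipe can itself be chained again, mirroring the remark that (P>>Q) is itself a pipe.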

The remaining laws show how input and output can be implemented in a pipe; they enable process descriptions to be simplified by a form of symbolic execution. For example, if the process on the left of >> starts with output of a message v to the right, and the process on the right of >> starts with input from the left, the message v is transmitted from the former process to the latter; however the actual communication is concealed, as shown in the following law

  L2 (right!v → P)>>(left?y → Q(y)) = P>>Q(v)

If one of the processes is determined to communicate with the other, but the other is prepared to communicate externally, it is the external communication that takes place first, and the internal communication is saved up for a subsequent occasion

  L3 (right!v → P)>>(right!w → Q) = right!w → ((right!v → P)>>Q)

  L4 (left?x → P(x))>>(left?y → Q(y)) = left?x → (P(x)>>(left?y → Q(y)))

If both processes are prepared for external communication, then either may happen first

  L5 (left?x → P(x))>>(right!w → Q) =
       (left?x → (P(x)>>(right!w → Q))
        | right!w → ((left?x → P(x))>>Q))

The law L5 is equally valid when the operator >> is replaced by >>R>>, since pipes in the middle of a chain cannot communicate directly with the environment

  L6 (left?x → P(x))>>R>>(right!w → Q) =
       (left?x → (P(x)>>R>>(right!w → Q))
        | right!w → ((left?x → P(x))>>R>>Q))

Similar generalisations may be made to the other laws

  L7 If R is a chain of processes all starting with output to the right,
       R>>(right!w → Q) = right!w → (R>>Q)

  L8 If R is a chain of processes all starting with input from the left,
       (left?x → P(x))>>R = left?x → (P(x)>>R)

2 Implementation [L3.4. after input of its ﬁrst message. . it does so immediately.4. the process diverges (Section 3.1 X1. X2] [L5] In the implementation of (P > ) three cases are distinguished >Q 1. 4.2).5. Otherwise.4 Pipes 135 Examples X1 Let us deﬁne R(y) = (right !y → COPY )> >COPY So R(y) = (right !y → COPY )> >(left ?x → right !x → COPY ) = COPY > >(right !y → COPY ) [def COPY ] [L2] X2 COPY > >COPY = (left ?x → right !x → COPY )> >COPY = left ?x → ((right !x → COPY )> >COPY ) = left ?x → R(x) [def COPY ] [L4] [def R(x)] X3 From the last line of X1 we deduce R(y) = (left ?x → right !x → COPY )> >(right !y → COPY ) = (left ?x → ((right !x → COPY )> >(right !y → COPY )) | right !y → (COPY > >COPY )) = (left ?x → right !y → R(x) | right !y → left ?x → R(x)) This shows that a double buﬀer. The reasoning of the above proofs is very similar to that of 2. If communication can take place on the internal connecting channel. without consideration of the external environment.3. is prepared either to output that message or to input a second message before doing so. 2. if the environment is interested in communication on the left channel. this is dealt with by P . If an inﬁnite sequence of such communications is possible.

3. and so it introduces no risk of deadlock. chain(P .8) is illustrated by the trivial example P = (right !1 → P ) Q = (left ?x → Q ) (P > ) is obviously a useless process.1. then (P > ) >Q will not stop either. cdr (Q ("right )))) else if x = "left then if P (x) = "BLEEP then "BLEEP else λ y • chain(P ("left )(y).4. there is a new danger that the processes P and Q will spend the whole time communicating with each other. Q ) else "BLEEP [Case 1] [Case 2] [Case 3] 4.1. so that (P > ) never again communicates with the external world. Unfortunately. see Section 4. A less trivial example is (P > ). in that >Q like an endless loop it may consume unbounded computing resources without achieving anything. it is even worse than STOP . Or if the environment is interested in the right channel. For an explanation of the input and output operations.136 4 Communication 3.2. If both P and Q are nonstopping. this is dealt with by Q . chain(P . where >Q P = (right !1 → P | left ?x → P1(x)) Q = (left ?x → Q | right !1 → Q1) . This case of diver>Q gences (Sections 3. Q ) = if P ("right ) ≠ "BLEEP and Q ("left ) ≠ "BLEEP then chain(cdr (P ("right )).5.3 Livelock The chaining operator connects two processes by just one channel. Q ("left )(car (P ("right )))) else λx • if x = "right then if Q ("right ) = "BLEEP then "BLEEP else cons(car (Q ("right )).

To ensure this.4.1 X1. we must prove that the length of the sequence output to the right is at all times bounded above by some well-deﬁned function f of the sequence of values input from the left. right ) between the sequences of messages input on the left channel and the sequence of messages output on the right. BUFFER X2 The following are left-guarded in accordance with the original deﬁnition.4 Pipes 137 In this example. because UNPACK sat #right ≤ #( / left ) PACK sat #right ≤ #left X3 BUFFER is not right-guarded. X5. >Q Examples X1 The following are left-guarded by L1 (4. >Q Exactly the same reasoning applies to right-guardedness of the second operand of > > L3 If Q is right-guarded then (P > ) is free of livelock. since it can input arbitrarily many messages from the left without ever inputting from the right. L2 If P is left-guarded then (P > ) is free of livelock. 4. it exists even though the choice of external communication on the left and on the right is oﬀered on every possible occasion.4 Speciﬁcations A speciﬁcation of a pipe can often be expressed as a relation S (left . A simple method to prove (P > ) is free of livelock is to show that P is >Q left-guarded in the sense that it can never output an inﬁnite series of messages to the right without interspersing inputs from the left. L1 If every recursion used in the deﬁnition of P is guarded by an input from the left. When two pipes are connected in series. X2. DOUBLE . the . SQUASH . divergence derives from the mere possibility of inﬁnite internal communication. then P is left-guarded. we deﬁne P is left-guarded ≡ ∃ f • P sat (#right ≤ f (left )) Left-guardedness is often obvious from the text of P . or more formally. and even though after such an external communication the subsequent behaviour of (P > ) >Q would not diverge.4. X9) COPY .

Since the > operator cannot introduce deadlock in pipes. All that is known of the concealed sequence is that it exists. right ) Q sat T (left . Thus we explain the rule L1 If and and then P sat S (left . right ) P is left-guarded or Q is right-guarded P> >Q sat ∃ s • S (left . But we also need to avert the risk of livelock. right ) This states that the relation between left and right which is maintained by (P > ) is the normal relational composition of the relation for P with the >Q relation for Q .138 4 Communication sequence right produced by the left operand is equated with the sequence left consumed by the right operand. we > can aﬀord to omit reasoning about refusals. and this common sequence is then concealed. Examples X1 DOUBLE sat right ≤1 double∗ (left ) DOUBLE is left-guarded and right-guarded. s) ∧ T (s. so (DOUBLE > >DOUBLE ) sat ∃ s • (s ≤1 double∗ (left ) ∧ right ≤1 double∗ (s)) ≡ right ≤2 double∗ (double∗ (left )) ≡ right ≤2 quadruple∗ (left ) X2 Let us use recursion together with > to give an alternative deﬁnition of a > buﬀer BUFF = µ X • (left ?x → (X > >(right !x → COPY ))) We wish to prove that BUFF sat (right ≤ left ) Assume that X sat #left ≥ n ∨ right ≤ left We know that COPY sat right ≤ left Therefore (right !x → COPY ) .

Such a protocol consists of two processes. 3.) sat right = left = (left > ∨ left )) left ) s) ∧ (#left ≥ n ∨ right ≤ left0 ⇒ #left ≥ n + 1 ∨ right ≤ left The desired conclusion follows by the proof rule for recursive processes (3. BUFFER Buﬀers are clearly useful for storing information which is waiting to be processed. clearly (T > >R) must be a buﬀer. It follows that all buﬀers are left-guarded.4. a transmitter T and a receiver R.4. Example X1 The following processes are buﬀers COPY . But they are even more useful as speciﬁcations of the desired behaviour of a communications protocol. when non-empty. though possibly after some delay.10. furthermore. it is always ready to output on the right.7. and which meets the speciﬁcation P sat (right ≤ left ) ∧ (if right = left then left ∉ ref else right ∉ ref ) Here c ∉ ref means that the process cannot refuse to communicate on channel c (Sections 3.2 L6) cannot be used.4 Pipes 139 sat (right = left = ⇒ right ≤ x left ∨ (right ≤ x ∧ right ≤ left )) Since the right operand is right-guarded. . by L1 and the assumption (X > >(right !x → COPY )) sat (∃ s • (#left ≥ n ∨ s ≤ left ) ∧ right ≤ x ⇒ (#left ≥ n ∨ right ≤ x Therefore left ?x → (. which is free of livelock. If the protocol is correct. which is intended to deliver messages in the same order in which they have been submitted. because the recursion is not obviously guarded. 4. More formally. (COPY > >COPY ). .1 L8). BUFF .7. which are connected in series (T > >R).4).5 Buﬀers and protocols A buﬀer is a process which outputs on the right exactly the same sequence of messages as it has input from the left. we deﬁne a buﬀer to be a process P which never stops. . The simpler law (1.

> 2 > 1 )> > >T >T >WIRE > >(R1 > 2 > . > n >R >R > >R Of course. They are due to A.140 4 Communication In practice. each one using the previous layer as its communication medium Tn > . > > >(T2 > >(T1 > >WIRE > 1 )> 2 )> . Rn ) . .e. in accordance with the changed bracketing (Tn > . It is the task of the protocol designer to ensure that in spite of the bad behaviour of the wire. The following laws are useful in proving the correctness of protocols. z))> >(right !x → R(f (x. R1 ) . … (Tn . the system as a whole acts as a buﬀer. L1 If P and Q are buﬀers. In practice. (T1 . (T2 . . . . z))))) then T (z)> >R(z) is a buﬀer for all z. > n ) >R > >R The law of associativity of > guarantees that this regrouping does not change > the behaviour of the system. protocols must be more complicated than this. since singledirectional ﬂow of messages is not adequate to achieve reliable communication on an unreliable wire: it is necessary to add channels in the reverse direction. Roscoe. . BUFFER> >BUFFER . so that unacknowledged messages can be retransmitted. . when the protocol is implemented in practice. . and the messages which are sent along it are subject to corruption or loss. i. W. R2 ) . Examples X2 The following are buﬀers by L1 COPY > >COPY . which may behave not quite like a buﬀer. The following is a generalisation of L2 L3 If for some function f and for all z (T (z)> >R(z)) = (left ?x → (T (f (x. so are (P > ) and (left ?x → (P > >Q >(right !x → Q ))) L2 If T > = (left ?x → (T > >R >(right !x → R))) then (T > >R) is a buﬀer. to enable the receiver to send back acknowledgement signals for successfully transmitted messages. the wire that connects the transmitter to the receiver is quite long. (T > >WIRE > >R) is a buﬀer A protocol is usually built in a number of layers. Thus the behaviour of the wire itself can be modelled by a process WIRE . all the transmitters are collected into a single transmitter at one end and all the receivers at the other. . COPY > >BUFFER. 
BUFFER> >COPY .

1 X1 and X2 that (COPY > >COPY ) = (left ?x → (COPY > >(right !y → COPY ))) By L2 it is therefore a buﬀer. are left as an exercise.4. A decoder R reverses this translation T = left ?x → right !x → right !(1 − x) → T R = left ?x → left ?y →if y = x then FAIL else (right !x → R) where the process FAIL is left undeﬁned.4. and .9. only a single wire mid is available. Unfortunately. and the proof of their correctness.4 Pipes 141 X3 It has been shown in 4. as shown by Figure 4. 0 for each 1 input. X5 (Bit stuﬃng) The transmitter T faithfully reproduces the input bits from left to right. and this must be used for both streams of data. by L2. X6 (Line sharing) It is desired to copy data from a channel left1 to right1 and from left2 to right2.9 Messages input by T must be tagged before transmission along mid. left1 left2 T mid R right1 right2 Figure 4. 1 for each 0 input and 1. except that after three consecutive 1-bits which have been output. We wish to prove by L2 that (T > >R) is a buﬀer (T > >R) = left ?x → ((right !x → right !(1 − x) → T )> > (left ?x → left ?y → if y = x then FAIL else (right !x → R))) = left ?x → (T > if (1 − x) = x then FAIL else (right !x → R)) > = left ?x → (T > >(right !x → R)) Therefore (T > >R) is a buﬀer. This can most easily be achieved by two disjoint protocols. Thus the input 01011110 is output as 010111010. each using a diﬀerent wire. X4 (Phase encoding) A phase encoder is a process T which inputs a stream of bits. and outputs 0. it inserts a single extra 0. Thus (T > >R) must be proved to be a buﬀer. The receiver R removes these extra zeroes. The construction of T and R.

To insert buﬀers on the channels will only postpone the problem for a short while. and transmission between left2 and right2 may be seriously delayed. say m. but the recipient is not yet ready for them. and for R to send signals back to T to stop sending messages on the stream for which there seems to be little demand. This is known as ﬂow control.6. If two messages are input on left1. we use the asymmetric notation P // Q Using the concealment operator. without the permission and without the knowledge of its partner P .2 can be readily extended . the whole system will have to wait. and then α(P // Q ) = (αQ − αP ) It is usually convenient to give the subordinate process a name. which is used in the main process for all interactions with its subordinate. this can be deﬁned P // Q = (P || Q ) \ αP This notation is used only when αP ⊆ αQ . 4. Thus P serves Q as a slave or subordinate process.142 4 Communication R must untag them and output them on the corresponding right channel T = (left1?x → mid!tag1(x) → T | left2?y → mid!tag2(y) → T ) R = mid?z → if tag(z) = 1 then (right1!untag(z) → R) else (right2!untag(z) → R) This solution is quite unsatisfactory.5 Subordination Let P and Q be processes with αP ⊆ αQ In the combination (P || Q ). The process naming technique described in Section 2. while Q acts as a master of main process. The correct solution is to introduce another channel in the reverse direction. each action of P can occur only when Q permits it to occur. whereas Q can engage independently in the actions of (αQ − αP ). When communications between a subordinate process and a main process are to be concealed from their common environment.

to communicating processes, by introducing compound channel names. These take the form m.c, where m is a process name and c is the name of one of its channels. Each communication on this channel is a triple m.c.v where αm.c(m : P) = αc(P) and v ∈ αc(P).

In the construction (m : P // Q), Q communicates with P along channels with compound names of the form m.c and m.d, whereas P uses the corresponding simple channels c and d for the same communications. Thus for example

(m : (c!v → P) // (m.c?x → Q(x))) = (m : P // Q(v))

Since all these communications are concealed from the environment, the name m can never be detected from the outside; it therefore serves as a local name for the subordinate process.

Subordination can be nested, for example

(n : (m : P // Q) // R)

In this case, all occurrences of events involving the name m are concealed before the name n is attached to the remaining events, all of which are in the alphabet of Q, and not of P. There is no way that R can communicate directly with P, or even know of the existence of P or its name m.

Examples

X1 (for DOUBLE see 4.2 X2)

doub : DOUBLE // Q

The subordinate process acts as a simple subroutine called from within the main process Q. Inside Q, the value of 2 × e may be obtained by a successive output of the argument e on the left channel of doub, and input of the result on the right channel

doub.left!e → (doub.right?x → ...)

X2 One subroutine may use another as a subroutine, and do so several times

QUADRUPLE = (doub : DOUBLE // (µ X • left?x → doub.left!x → doub.right?y → doub.left!y → doub.right?z → right!z → X))
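The subroutine reading of X1 and X2 can be sketched in Python. This is only an illustration under stated assumptions: each subordinate process is modelled as a generator, and a single send call stands for the pair of communications left!e followed by right?x. The names double and quadruple are hypothetical, and DOUBLE is assumed (from its use in X1) to output twice each input.

```python
def double():
    """DOUBLE (assumed behaviour): input x on left, output 2 * x on right, repeat."""
    result = None
    while True:
        x = yield result       # left?x; the value yielded next models right!(2 * x)
        result = 2 * x


def quadruple():
    """QUADRUPLE: uses a local subordinate doub twice for every input."""
    doub = double()            # the local subordinate, invisible to the caller
    next(doub)                 # run it up to its first input
    result = None
    while True:
        x = yield result       # left?x
        y = doub.send(x)       # doub.left!x -> doub.right?y
        result = doub.send(y)  # doub.left!y -> doub.right?z, then right!z


quad = quadruple()
next(quad)                     # quad is now ready for its first call
print(quad.send(5))            # the main process obtains 4 * 5
```

The caller of quadruple never sees doub, which mirrors the remark that the name of a subordinate cannot be detected from the outside.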

This is designed itself to be used as a subroutine

quad : QUADRUPLE // Q

This version of QUADRUPLE is similar to that of 4.4 X1, but does not have the same double-buffering effect.

X3 A conventional program variable named m may be modelled as a subordinate process

m : VAR // Q

Inside the main process Q, the value of m can be assigned, read, and updated by input and output, as described in 2.6.2 X2

m := 3 ; P is implemented by (m.left!3 → P)
x := m ; P is implemented by (m.right?x → P)
m := m + 3 ; P is implemented by (m.right?y → m.left!(y + 3) → P)

A subordinate process may be used to implement a data structure with a more elaborate behaviour than just a simple variable.

X4 (see 4.2 X9)

(q : BUFFER // Q)

The subordinate process serves as an unbounded queue named q. Within Q, the output q.left!v adds v to one end of the queue, and q.right?y removes an element from the other end, and gives its value to y. If the queue is empty, the queue will not respond, and the system may deadlock.

X5 (see 4.2 X10) A stack with name st is declared

st : STACK // Q

Inside the main process Q, st.left!v can be used to push the value v onto the stack, and st.right?x will pop the top value. To deal with the possibility that the stack is empty, a choice construction can be used

(st.right?x → Q1(x) | st.empty → Q2)

If the stack is non-empty, the first alternative is selected; if empty, deadlock is avoided and the second alternative is selected.

A subordinate process with several channels may be used by several concurrent processes, provided that they do not use the same channel.
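The use of a subordinate variable (X3) and of a stack with an empty alternative (X5) might be pictured as follows. This is a rough sketch, not a definition: the class and method names are hypothetical, method calls stand in for the concealed communications, and the choice construction is modelled by passing the two continuations Q1 and Q2 as functions.

```python
class Var:
    """m : VAR, used through m.left!v (assign) and m.right?x (read)."""

    def __init__(self):
        self._value = None     # assumption: no value until the first assignment

    def left(self, v):         # m.left!v
        self._value = v

    def right(self):           # m.right?x
        return self._value


class Stack:
    """st : STACK, offering st.empty when nothing is stored."""

    def __init__(self):
        self._items = []

    def left(self, v):         # st.left!v pushes v
        self._items.append(v)

    def right_or_empty(self, q1, q2):
        # (st.right?x -> Q1(x) | st.empty -> Q2)
        if self._items:
            return q1(self._items.pop())
        return q2()


m = Var()
m.left(3)                      # m := 3
m.left(m.right() + 3)          # m := m + 3

st = Stack()
st.left("v")
first = st.right_or_empty(lambda x: ("popped", x), lambda: "was empty")
second = st.right_or_empty(lambda x: ("popped", x), lambda: "was empty")
```

The second call takes the st.empty branch instead of waiting for ever, which is the point of offering the choice.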

X6 A process Q is intended to communicate a stream of values to R; these values are to be buffered by a subordinate buffer process named b, so that output from Q will not be delayed when R is not ready for input. Q uses channel b.left for its output and R uses b.right for its input

(b : BUFFER // (Q || R))

Note that if R attempts to input from an empty buffer, the system will not necessarily deadlock; R will simply be delayed until Q next outputs a value to the buffer. (If Q and R communicate with the buffer on the same channel, then that channel must be in the alphabet of both of them; and the definition of || would require them always to communicate simultaneously the same value, which would be quite wrong.)

The subordination operator may be used to define subroutines by recursion. Each level of recursion (except the last) declares a new local subroutine to deal with the recursive call(s).

X7 (Factorial)

FAC = µ X • left?n → (if n = 0 then (right!1 → X) else (f : X // (f.left!(n − 1) → f.right?y → right!(n × y) → X)))

The subroutine FAC uses channels left and right to communicate parameters and results to its calling process; and it uses channels f.left and f.right to communicate with its subordinate process named f. In these respects it is similar to the QUADRUPLE subroutine (X2). The only difference is that the subordinate process is isomorphic to FAC itself.

This is a boringly familiar example of recursion, expressed in an unfamiliar but rather cumbersome notational framework. A less familiar idea is that of using recursion together with subordination to implement an unbounded data structure. Each level of the recursion stores a single component of the structure, and declares a new local subordinate data structure to deal with the rest.

X8 (Unbounded finite set) A process which implements a set inputs its members on its left channel. After each input, it outputs a YES if it has already input the same value, and NO otherwise. It is very similar to the set of 2.6.2 X4, except that it will store messages of any kind

SET = left?x → right!NO → (rest : SET // LOOP(x))

where

LOOP(x) = µ X • left?y → (if y = x then right!YES → X else (rest.left!y → rest.right?z → right!z → X))

The set starts empty; therefore on input of its first member x it immediately outputs NO. It then declares a subordinate process called rest, which is going to store all members of the set except x. The LOOP is designed to input subsequent members of the set. If the newly input member is equal to x, the answer YES is sent back immediately on the right channel. Otherwise, the new member is passed on for storage by rest. Whatever answer (YES or NO) is sent back by rest is passed on again, and the LOOP repeats.

X9 (Binary tree) A more efficient representation of a set is as a binary tree, which relies on some given total ordering ≤ over its elements. Each node stores its earliest inserted element, and declares two subordinate trees, one to store elements smaller than the earliest, and one to store the bigger elements. The external specification of the tree is the same as X8

TREE = left?x → right!NO → (smaller : TREE // (bigger : TREE // LOOP))

The design of the LOOP is left as an exercise.

4.5.1 Laws

The following obvious laws govern communications between a process and its subordinates. The first law describes concealed communication in each direction between the main and subordinate processes

L1A (m : (c?x → P(x))) // (m.c!v → Q) = (m : P(v)) // Q
L1B (m : (d!v → P)) // (m.d?x → Q(x)) = (m : P) // Q(v)

If b is a channel not named by m, the main process can communicate on b without affecting the subordinate

L2 (m : P // (b!e → Q)) = (b!e → (m : P // Q))

The only process capable of making a choice for a subordinate process is its main process

L3 (m : (c?x → P1(x) | d?y → P2(y))) // (m.c!v → Q) = (m : P1(v) // Q)

If two subordinate processes have the same name, one of them is inaccessible

L4 m : P // (m : Q // R) = (m : Q // R)

Usually, the order in which subordinate processes are written does not matter

L5 m : P // (n : Q // R) = n : Q // (m : P // R) provided that m and n are distinct names

The use of recursion in defining subordinate processes is sufficiently surprising to raise doubts as to whether it actually works. These doubts may be slightly alleviated by showing how the combination evolves. The example below uses a particular trace of behaviour of the process, and shows how that trace is produced. More important, it shows how other slightly differing traces cannot be produced.

Example

X1 A typical trace of SET is

s = ⟨left.1, right.NO, left.2, right.NO⟩

The value of SET / s can be calculated using laws L1A, L1B, and L2:

SET / ⟨left.1⟩ = right!NO → (rest : SET // LOOP(1))
SET / ⟨left.1, right.NO⟩ = (rest : SET // LOOP(1))
SET / ⟨left.1, right.NO, left.2⟩
= (rest : SET // (rest.left!2 → rest.right?z → right!z → LOOP(1)))
= (rest : (right!NO → (rest : SET // LOOP(2)))) // (rest.right?z → right!z → LOOP(1))
= rest : (rest : SET // LOOP(2)) // (right!NO → LOOP(1))
SET / s =

rest : (rest : SET // LOOP(2)) // LOOP(1)

It is obvious from this that ⟨left.1, right.NO, left.2, right.YES⟩ is not a trace of SET. The reader may check that

SET / s⌢⟨left.2, right.YES⟩ = SET / s

and

SET / s⌢⟨left.5, right.NO⟩ = rest : (rest : (rest : SET // LOOP(5)) // LOOP(2)) // LOOP(1)

4.5.2 Connection diagrams

A subordinate process may be drawn inside the box representing the process that uses it, as shown for 4.5 X1 in Figure 4.10. For nested subordinate processes, the boxes nest more deeply, as shown for 4.5 X2 in Figure 4.11.

A recursive process is one that is nested inside itself, like the picture of the artist's studio, in which there stands on an easel the completed painting itself, which shows on the easel a completed painting… Such a picture in practice can never be completed. Fortunately, for a process it is not necessary to complete the picture; it evolves automatically as needed during its activity. Thus (see 4.5.1 X1) we may picture successive stages in the early history of a set as shown in Figure 4.12.

If we ignore the nesting of the boxes, this can be drawn as a linear structure as in Figure 4.13. Similarly, the example TREE (4.5 X9) could be drawn as in Figure 4.14.

The connection diagrams suggest how a corresponding network might be constructed from hardware components, with boxes representing integrated circuits and arrows representing wires between them. Of course in any practical realisation, the recursion must be unfolded to some finite limit before the network can start its normal operation; and if this limit is exceeded during operation, the network cannot successfully continue.

Dynamic reallocation and reconfiguration of hardware networks is a lot more difficult than the stack-based storage allocation which makes recursion in conventional sequential programs so efficient. Nevertheless, recursion is surely justified by the aid that it gives in the invention and design of algorithms; and if not by that, then at least by the intellectual joy which it gives to those who understand it and use it.

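[Figure 4.10: connection diagram for doub : DOUBLE // Q (4.5 X1). The box for Q contains the box doub : DOUBLE, connected by doub.left and doub.right.]

[Figure 4.11: connection diagram for quad : QUADRUPLE // Q (4.5 X2). The box for Q contains the box quad, which contains a LOOP box and the box doub : DOUBLE, connected by doub.left and doub.right.]

[Figure 4.12: successive stages in the early history of a set. SET / ⟨left.1, right.NO⟩ is drawn as LOOP(1) containing rest : SET; SET / s is drawn as LOOP(1) containing LOOP(2), and so on. Channels: left, right, rest.left, rest.right.]

[Figure 4.13: the set drawn as a linear structure of boxes LOOP(x), LOOP(y), LOOP(z), with channels left and right.]

[Figure 4.14: the example TREE (4.5 X9), with channels left and right.]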
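The way the recursive SET of 4.5 X8 unfolds one level at a time can be imitated in Python. The sketch below is an illustration only: the class SetProcess and its methods are hypothetical names, and one method call stands for an input on left followed by the answer on right. Each object plays the part of one LOOP(x), holding a single member and a subordinate rest.

```python
class SetProcess:
    """One level of SET = left?x -> right!NO -> (rest : SET // LOOP(x))."""

    def __init__(self):
        self.member = None     # the x of LOOP(x), once a first member has arrived
        self.rest = None       # the subordinate set, declared on the first input

    def left(self, y):
        """Input y on left and return the answer that would be output on right."""
        if self.rest is None:              # still the empty SET
            self.member = y
            self.rest = SetProcess()       # declare rest : SET
            return "NO"
        if y == self.member:               # LOOP(x) recognises its own member
            return "YES"
        return self.rest.left(y)           # rest.left!y -> rest.right?z -> right!z

    def members(self):
        """Members in order of nesting, outermost LOOP first."""
        if self.rest is None:
            return []
        return [self.member] + self.rest.members()


s = SetProcess()
answers = [s.left(1), s.left(2), s.left(2), s.left(5)]
nesting = s.members()
```

After these four inputs the chain of objects has the same shape as the linear structure of LOOP(1), LOOP(2), LOOP(5) followed by an empty SET.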


5 Sequential Processes

5.1 Introduction

The process STOP is defined as one that never engages in any action. It is not a useful process, and probably results from a deadlock or other design error, rather than a deliberate choice of the designer. However, there is one good reason why a process should do nothing more, namely that it has already accomplished everything that it was designed to do. Such a process is said to terminate successfully. In order to distinguish between this and STOP, it is convenient to regard successful termination as a special event, denoted by the symbol ✓ (pronounced "success"). A sequential process is defined as one which has ✓ in its alphabet; and naturally this can only be the last event in which it engages. We stipulate that ✓ cannot be an alternative in the choice construct

(x : B → P(x)) is invalid if ✓ ∈ B

SKIP_A is defined as a process which does nothing but terminate successfully

αSKIP_A = A ∪ {✓}

As usual, we shall frequently omit the subscript alphabet.

Examples

X1 A vending machine that is intended to serve only one customer with chocolate or toffee and then terminate successfully

VMONE = (coin → (choc → SKIP | toffee → SKIP))

In designing a process to solve a complex task, it is frequently useful to split the task into two subtasks, one of which must be completed successfully before the other begins. If P and Q are sequential processes with the same alphabet, their sequential composition

P ; Q

is a process which first behaves like P; but when P terminates successfully, (P ; Q) continues by behaving as Q. If P never terminates successfully, neither does (P ; Q).

X2 A vending machine designed to serve exactly two customers, one after the other

VMTWO = VMONE ; VMONE

A process which repeats similar actions as often as required is known as a loop; it can be defined as a special case of recursion

*P = µ X • (P ; X)
= P ; P ; P ; ...
α(*P) = αP − {✓}

Clearly such a loop will never terminate successfully; that is why it is convenient to remove ✓ from its alphabet.

X3 A vending machine designed to serve any number of customers

VMCT = *VMONE

This is identical to VMCT (1.1.3 X3).

A sequence of symbols is said to be a sentence of a process P if P terminates successfully after engaging in the corresponding sequence of actions. The set of all such sentences is called the language accepted by P. Thus the notations introduced for describing sequential processes may also be used to define the grammar of a simple language, such as might be used for communication between a human being and a computer.

X4 A sentence of Pidgingol consists of a noun clause followed by a predicate. A predicate is a verb followed by a noun clause. A verb is either bites or scratches. The definition of a noun clause is given more formally below

αPIDGINGOL = {a, the, cat, dog, bites, scratches}
PIDGINGOL = NOUNCLAUSE ; PREDICATE
PREDICATE = VERB ; NOUNCLAUSE
VERB = (bites → SKIP | scratches → SKIP)
NOUNCLAUSE = ARTICLE ; NOUN
ARTICLE = (a → SKIP | the → SKIP)
NOUN = (cat → SKIP | dog → SKIP)

Example sentences of Pidgingol are

the cat scratches a dog
a dog bites the cat
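The grammar of X4 can be tried out with a small recogniser. The following Python sketch is one possible reading, under stated assumptions: a sequential process is represented as a function that takes the remaining words and returns what is left after successful termination, or None if it cannot proceed. The helper names prefix_choice, seq and is_sentence are hypothetical.

```python
def prefix_choice(*words):
    """(w1 -> SKIP | w2 -> SKIP | ...): accept one listed word, then terminate."""
    def process(tokens):
        if tokens and tokens[0] in words:
            return tokens[1:]          # terminated successfully; the rest is unread
        return None                    # no alternative matches the next word
    return process


def seq(p, q):
    """P ; Q: behave like P and, when P terminates successfully, continue as Q."""
    def process(tokens):
        rest = p(tokens)
        return None if rest is None else q(rest)
    return process


ARTICLE = prefix_choice("a", "the")
NOUN = prefix_choice("cat", "dog")
VERB = prefix_choice("bites", "scratches")
NOUNCLAUSE = seq(ARTICLE, NOUN)
PREDICATE = seq(VERB, NOUNCLAUSE)
PIDGINGOL = seq(NOUNCLAUSE, PREDICATE)


def is_sentence(process, text):
    """A sentence is a sequence after which the process has just terminated."""
    return process(text.split()) == []


print(is_sentence(PIDGINGOL, "the cat scratches a dog"))
print(is_sentence(PIDGINGOL, "cat bites dog"))
```

Because every alternative starts with a different word, the recogniser never needs to back up, which matches the deterministic reading of the choice construct.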

To describe languages with an unbounded number of sentences, it is necessary to use some kind of iteration or recursion.

X5 A noun clause which may contain any number of adjectives furry or prize

NOUNCLAUSE = ARTICLE ; µ X • (furry → X | prize → X | cat → SKIP | dog → SKIP)

Examples of a noun clause are

the furry furry prize dog
a dog

X6 A process which accepts any number of as followed by a b and then the same number of cs, after which it terminates successfully

AⁿBCⁿ = µ X • (b → SKIP | a → (X ; (c → SKIP)))

If a b is accepted first, the process terminates; no as and no cs are accepted, so their numbers are the same. If the second branch is taken, the accepted sentence starts with a and ends with c, and between these is the sentence accepted by the recursive call on the process X. If we assume that the recursive call accepts an equal number of as and cs, then so will the non-recursive call on AⁿBCⁿ, since it accepts just one more a at the beginning and one more c at the end.

This example shows how sequential composition, used in conjunction with recursion, can define a machine with an infinite number of states.

X7 A process which first behaves like AⁿBCⁿ, but then accepts a d followed by the same number of es

AⁿBCⁿDEⁿ = ((AⁿBCⁿ) ; d → SKIP) || CⁿDEⁿ

where CⁿDEⁿ = f(AⁿBCⁿ) for f which maps a to c, b to d, and c to e.

In this example, the process on the left of the || is responsible for ensuring an equal number of as and cs (separated by a b). It will not allow a d until the proper number of cs have arrived; but the es (which are not in its alphabet) are ignored. The process on the right of || is responsible for ensuring an equal number of es as cs. It ignores the as and the b, which are not in its alphabet. The pair of processes terminate together when they have both completed their allotted tasks.
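Examples X6 and X7 can be explored with a short Python sketch. It rests on assumptions that should be kept in mind: sentences are given as strings of single letters, the function names are hypothetical, and the parallel composition of X7 is modelled by checking the restriction of the sentence to each component's alphabet, so that each component ignores the letters it does not know.

```python
def anbcn(tokens, i=0):
    """A^n B C^n = mu X . (b -> SKIP | a -> (X ; (c -> SKIP))).

    Returns the index just after the accepted sentence, or None on failure.
    """
    if i < len(tokens) and tokens[i] == "b":
        return i + 1                           # b -> SKIP
    if i < len(tokens) and tokens[i] == "a":
        j = anbcn(tokens, i + 1)               # the recursive call on X
        if j is not None and j < len(tokens) and tokens[j] == "c":
            return j + 1                       # ; (c -> SKIP)
    return None


def accepts_anbcn(s):
    return anbcn(list(s)) == len(s)


def accepts_anbcnden(s):
    """((A^n B C^n) ; d -> SKIP) || C^n D E^n, for sentences over a..e."""
    if not set(s) <= set("abcde"):
        return False
    left = [x for x in s if x in "abcd"]       # what the left component sees
    right = [x for x in s if x in "cde"]       # what the right component sees
    ok_left = (len(left) >= 1 and left[-1] == "d"
               and anbcn(left[:-1]) == len(left) - 1)
    undo_f = {"c": "a", "d": "b", "e": "c"}    # inverse of the renaming f
    ok_right = anbcn([undo_f[x] for x in right]) == len(right)
    return ok_left and ok_right


print(accepts_anbcn("aabcc"))
print(accepts_anbcnden("aabccdee"))
```

Neither component counts all three groups of letters by itself; the equality of as, cs and es comes only from requiring both components to be satisfied by the same sentence.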

The notations for defining a language by means of an accepting process are as powerful as those of regular expressions. The use of recursion introduces some of the power of context-free grammars. A process can only define those languages that can be parsed from left to right without backtracking or look-ahead. This is because the use of the choice operator requires that the first event of each alternative is different from all its other first events. Consequently, it is not possible to use the construction of X5 to define a noun clause in which the word prize can be either a noun or an adjective or both, e.g., the prize dog, the furry prize.

The use of □ (Section 3.3) would not help, because it introduces nondeterminism, and allows an arbitrary choice of the clause which will analyse the rest of the input. If the choice is wrong, the process will deadlock before reaching the end of the input text. What is required to solve this problem is a new kind of choice operator which provides angelic nondeterminism like or3 (Section 3.2.2). This new operator requires that the two alternatives run concurrently until the environment makes the choice; its definition is left as an exercise.

Without angelic nondeterminism the language-defining method described above is not as powerful as context-free grammars, because it requires left-to-right parsability without back-tracking. However, the introduction of || permits definition of languages which are not context-free, for example X7.

X8 A process which accepts any interleaving of downs and ups, except that it terminates successfully on the first occasion that the number of downs exceeds the number of ups

POS = (down → SKIP | up → (POS ; POS))

If the first symbol is down, the task of POS is accomplished. But if the first symbol is up, it is necessary to accept two more downs than ups. The only way of achieving this is first to accept one more down than up; and then again to accept one more down than up. Thus two successive recursive calls on POS are needed, one after the other.

X9 The process C0 behaves like CT0 (1.1.4 X2)

C0 = (around → C0 | up → C1)
Cn+1 = POS ; Cn
= (POS ; ... ; POS) ; POS ; C0 for all n ≥ 0

where the bracketed group contains n copies of POS.

We can now solve the problem mentioned in 2.6.2 X3, and encountered again in 4.5 X4, that each operation on a subordinate process explicitly mentions the rest of the user process which follows it. The required effect can now be more conveniently achieved by means of SKIP and sequential composition.
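The behaviour claimed for POS in X8 can be checked on small inputs. The Python sketch below is only an illustration: pos follows the recursive definition literally, while first_excess is a hypothetical helper that states the intended property directly, namely that termination happens at the first point where the downs outnumber the ups.

```python
from itertools import product


def pos(tokens, i=0):
    """POS = (down -> SKIP | up -> (POS ; POS)).

    Returns the index just after POS terminates, or None if the input ends first.
    """
    if i >= len(tokens):
        return None
    if tokens[i] == "down":
        return i + 1                       # down -> SKIP
    if tokens[i] == "up":
        j = pos(tokens, i + 1)             # first recursive call on POS
        return None if j is None else pos(tokens, j)   # ; second call on POS
    return None


def first_excess(tokens):
    """Index just after the first point where downs exceed ups, or None."""
    balance = 0
    for k, t in enumerate(tokens):
        balance += 1 if t == "up" else -1
        if balance < 0:
            return k + 1
    return None


def agree(max_len):
    """Compare the two functions on every up/down sequence up to max_len."""
    return all(pos(list(t)) == first_excess(t)
               for n in range(max_len + 1)
               for t in product(("up", "down"), repeat=n))


print(agree(8))
```

An exhaustive comparison on short sequences is of course not a proof, but it gives some confidence that the two successive recursive calls do what the prose says.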

X10 A USER process manipulates two count variables named l and m (see 2.6.2 X3)

l : CT0 || m : CT3 || USER

The following subprocess (inside the USER) adds the current value of l to m

ADD = (l.around → SKIP | l.down → (ADD ; (m.up → l.up → SKIP)))

If the value of l is initially zero, nothing needs to be done. Otherwise, l is decremented, and its reduced value is added to m (by the recursive call to ADD). Then m is incremented once more, and l is also incremented, to compensate for the initial decrementation and bring it back to its initial value.

5.2 Laws

The laws for sequential composition are similar to those for catenation (Section 1.6.1), with SKIP playing the role of the unit

L1 SKIP ; P = P ; SKIP = P
L2 (P ; Q) ; R = P ; (Q ; R)
L3 (x : B → P(x)) ; Q = (x : B → (P(x) ; Q))

The law for the choice operator has corollaries

L4 (a → P) ; Q = a → (P ; Q)
L5 STOP ; Q = STOP

When sequential operators are composed in parallel, the combination terminates successfully just when both components do so

L6 SKIP_A || SKIP_B = SKIP_(A∪B)

A successfully terminating process participates in no further event offered by a concurrent partner

L7 ((x : B → P(x)) || SKIP_A) = (x : (B − A) → (P(x) || SKIP_A))

In a concurrent combination of a sequential with a nonsequential process, when does the combination terminate successfully? If the alphabet of the sequential process wholly contains that of its partner, termination of the partnership is determined by that of the sequential process, since the other process can do nothing when its partner is finished.

L8 STOP_A || SKIP_B = SKIP_B if ✓ ∉ A ∧ A ⊆ B.

The condition for the validity of this law is a very reasonable one, which should always be observed when ✓ is in the alphabet of only one of a pair of processes running concurrently. In this way, we avoid the problem of a process which continues after engaging in ✓.

The laws L1 to L3 may be used to prove the claim made in 5.1 X9 that C0 behaves like CT0 (1.1.4 X2). This is done by showing that C satisfies the set of guarded recursive equations used to define CT. The equation for CT0 is the same as that for C0

C0 = (around → C0 | up → C1) [def C0]

For n > 0, we need to prove

Cn = (up → Cn+1 | down → Cn−1)

Proof

LHS = POS ; Cn−1 [def Cn]
= (down → SKIP | up → POS ; POS) ; Cn−1 [def POS]
= (down → (SKIP ; Cn−1) | up → (POS ; POS) ; Cn−1) [L3]
= (down → Cn−1 | up → POS ; (POS ; Cn−1)) [L1, L2]
= (down → Cn−1 | up → POS ; Cn) [def Cn]
= RHS [def Cn]

Since Cn obeys the same set of guarded recursive equations as CTn, they are the same.

This proof has been written out in full, in order to illustrate the use of the laws, and also in order to allay suspicion of circularity. What seems most suspicious is that the proof does not use induction on n. In fact, any attempt to use induction on n will fail, because the very definition of CTn contains the process CTn+1. Fortunately, an appeal to the law of unique solutions is both simple and successful.

5.3 Mathematical treatment

The mathematical definition of sequential composition must be formulated in such a way as to ensure the truth of the laws quoted in the previous section. Special care needs to be exercised on P ; SKIP = P. As usual, the treatment of deterministic processes is much simpler, and will be completed first.
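The claim that Cn behaves like CTn can also be tested on bounded traces. The Python sketch below is an experiment, not a proof, and makes several assumptions: a process is represented as a function from an event to the next process, the strings "BLEEP" and "SUCCESS" mark a refused event and successful termination, and all function names are hypothetical. CTn is written out from the equations quoted in the proof.

```python
BLEEP = "BLEEP"        # answer given to an event the process cannot perform
SUCCESS = "SUCCESS"    # the event of successful termination


def stop(x):
    return BLEEP


def skip(x):
    return stop if x == SUCCESS else BLEEP


def choice(**branches):
    """(e1 -> P1 | e2 -> P2 | ...), each Pi supplied as a delayed call."""
    def process(x):
        return branches[x]() if x in branches else BLEEP
    return process


def sequence(p, q):
    """P ; Q: Q takes over as soon as P is able to terminate successfully."""
    if p(SUCCESS) != BLEEP:
        return q
    def process(x):
        nxt = p(x)
        return BLEEP if nxt == BLEEP else sequence(nxt, q)
    return process


def pos(x):
    """POS = (down -> SKIP | up -> (POS ; POS))."""
    if x == "down":
        return skip
    if x == "up":
        return sequence(pos, pos)
    return BLEEP


def c(n):
    """C0 = (around -> C0 | up -> C1) and Cn+1 = POS ; Cn."""
    if n == 0:
        return choice(around=lambda: c(0), up=lambda: c(1))
    return sequence(pos, c(n - 1))


def ct(n):
    """CTn as given by the guarded equations used in the proof."""
    if n == 0:
        return choice(around=lambda: ct(0), up=lambda: ct(1))
    return choice(up=lambda: ct(n + 1), down=lambda: ct(n - 1))


def traces(p, depth, events=("around", "up", "down")):
    """All traces of p of length at most depth over the given events."""
    result = {()}
    if depth > 0:
        for e in events:
            nxt = p(e)
            if nxt != BLEEP:
                result |= {(e,) + t for t in traces(nxt, depth - 1, events)}
    return result


print(traces(c(0), 6) == traces(ct(0), 6))
```

Agreement up to a fixed depth says nothing about longer traces; the appeal to unique solutions of guarded equations is what settles the matter in general.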

5.3.1 Deterministic processes

Operations on deterministic processes are defined in terms of the traces of their results. The first and only action of the process SKIP is successful termination, so it has only two traces

L0 traces(SKIP) = {⟨⟩, ⟨✓⟩}

To define sequential composition of processes, it is convenient first to define sequential composition of their individual traces. If s and t are traces and s does not contain ✓

(s ; t) = s
(s⌢⟨✓⟩) ; t = s⌢t

(see Section 1.9.7 for a fuller treatment). A trace of (P ; Q) consists of a trace of P; and if this trace ends in ✓, the ✓ is replaced by a trace of Q

L1 traces(P ; Q) = { s ; t | s ∈ traces(P) ∧ t ∈ traces(Q) }

An equivalent definition is

L1A traces(P ; Q) = { s | s ∈ traces(P) ∧ ¬ ⟨✓⟩ in s } ∪ { s⌢t | s⌢⟨✓⟩ ∈ traces(P) ∧ t ∈ traces(Q) }

This definition may be simpler to understand but it is more complicated to use.

The whole intention of the ✓ symbol is that it should terminate the process which engages in it. We therefore need the law

L2 P / s = SKIP if s⌢⟨✓⟩ ∈ traces(P)

This law is essential in the proof of P ; SKIP = P. Unfortunately, it is not in general true. For example, if

P = (SKIP_{} || c → STOP_{c})

then traces(P) = {⟨⟩, ⟨✓⟩, ⟨c⟩, ⟨c, ✓⟩, ⟨✓, c⟩} and P / ⟨⟩ ≠ SKIP, even though ⟨✓⟩ ∈ traces(P). We therefore need to impose alphabet constraints on parallel composition. (P || Q) must be regarded as invalid unless

αP ⊆ αQ ∨ αQ ⊆ αP ∨ ✓ ∈ ((αP ∩ αQ) ∪ (ᾱP ∩ ᾱQ))

where the overbar denotes the complement, so that ✓ is in both alphabets or in neither. For similar reasons, alphabet change must be guaranteed to leave ✓ unchanged, so f(P) is invalid unless f(✓) = ✓.

Furthermore, if m is a process name, we must adopt the convention that m.✓ = ✓. Finally, we must never use ✓ in the choice construct

(✓ → P | c → Q)

This restriction also rules out RUN_A when ✓ ∈ A.

5.3.2 Non-deterministic processes

Sequential composition of nondeterministic processes presents a number of problems. The first of them is that a nondeterministic process like SKIP ⊓ (c → SKIP) does not satisfy the law L2 of the previous section. A solution of this is to weaken 5.3.1 L2 to

L2A s⌢⟨✓⟩ ∈ traces(P) ⇒ (P / s) ⊑ SKIP

This means that whenever P can terminate, it can do so without offering any alternative event to the environment. To maintain the truth of L2A, all restrictions of the previous section must be observed, and also

SKIP must never appear unguarded in an operand of □
✓ must not appear in the alphabet of either operand of |||

(It is possible that a slight change to the definitions of □ and ||| might permit relaxation of these restrictions.)

In addition to the laws given earlier in this chapter, sequential composition of nondeterministic processes satisfies the following laws. Firstly, a divergent process remains divergent, no matter what is specified to happen after its successful termination

L1 CHAOS ; P = CHAOS

Sequential composition distributes through nondeterministic choice

L2A (P ⊓ Q) ; R = (P ; R) ⊓ (Q ; R)
L2B R ; (P ⊓ Q) = (R ; P) ⊓ (R ; Q)

To define (P ; Q) in the mathematical model of nondeterministic processes (Section 3.9) requires treatment of its failures and divergences. But first we describe its refusals (Section 3.4). If P can refuse X, and cannot terminate successfully, it follows that X ∪ {✓} is also a refusal of P (3.4.1 L11). In this case X is a refusal of (P ; Q). But if P offers the option of successful termination, then in (P ; Q) this transition may occur autonomously; its occurrence is concealed, and any refusal of Q is also a refusal of (P ; Q). The case where successful termination of P is nondeterministic is also treated in the definition

D1 refusals(P ; Q) = { X | (X ∪ {✓}) ∈ refusals(P) } ∪ { X | ⟨✓⟩ ∈ traces(P) ∧ X ∈ refusals(Q) }

The traces of (P ; Q) are defined in exactly the same way as for deterministic processes. The divergences of (P ; Q) are defined by the remark that it diverges whenever P diverges; or when P has terminated successfully and then Q diverges

D2 divergences(P ; Q) = { s | s ∈ divergences(P) ∧ ¬ ⟨✓⟩ in s } ∪ { s⌢t | s⌢⟨✓⟩ ∈ traces(P) ∧ ¬ ⟨✓⟩ in s ∧ t ∈ divergences(Q) }

Any failure of (P ; Q) is either a failure of P before P can terminate, or it is a failure of Q after P has terminated successfully

D3 failures(P ; Q) = { (s, X) | (s, X ∪ {✓}) ∈ failures(P) } ∪ { (s⌢t, X) | s⌢⟨✓⟩ ∈ traces(P) ∧ (t, X) ∈ failures(Q) } ∪ { (s, X) | s ∈ divergences(P ; Q) }

5.3.3 Implementation

SKIP is implemented as the process which accepts only the symbol "SUCCESS. It does not matter what it does afterwards

SKIP = λ x • if x = "SUCCESS then STOP else "BLEEP

A sequential composition behaves like the second operand if the first operand terminates; otherwise, the first operand participates in the first event, and the rest of it is composed with the second operand

sequence(P, Q) = if P("SUCCESS) ≠ "BLEEP then Q else λ x • if P(x) = "BLEEP then "BLEEP else sequence(P(x), Q)

5.4 Interrupts

In this section we define a kind of sequential composition (P △ Q) which does not depend on successful termination of P. Instead, the progress of P is just interrupted on occurrence of the first event of Q; and P is never resumed. It follows that a trace of (P △ Q) is just a trace of P up to an arbitrary point when

the interrupt occurs, followed by any trace of Q.

α(P △ Q) = αP ∪ αQ
traces(P △ Q) = { s⌢t | s ∈ traces(P) ∧ t ∈ traces(Q) }

To avoid problems, we specify that ✓ must not be in αP.

The next law states that it is the environment which determines when Q shall start, by selecting an event which is initially offered by Q but not by P

L1 (x : B → P(x)) △ Q = Q □ (x : B → (P(x) △ Q))

If (P △ Q) can be interrupted by R, this is the same as P interruptible by (Q △ R)

L2 (P △ Q) △ R = P △ (Q △ R)

Since STOP offers no first event, it can never be triggered by the environment. Similarly, if STOP is interruptible, only the interrupt can actually occur. Thus STOP is a unit of △

L3 P △ STOP = P = STOP △ P

The interrupt operator executes both of its operands at most once, so it distributes through nondeterministic choice

L4A P △ (Q ⊓ R) = (P △ Q) ⊓ (P △ R)
L4B (Q ⊓ R) △ P = (Q △ P) ⊓ (R △ P)

Finally, one cannot cure a divergent process by interrupting it; nor is it safe to specify a divergent process after the interrupt

L5 CHAOS △ P = CHAOS = P △ CHAOS

In the remainder of this section, we shall insist that the possible initial events of the interrupting process are outside the alphabet of the interrupted process. Since the occurrence of interrupt is visible and controllable by the environment, this restriction preserves determinism, and reasoning about the operators is simplified. To emphasise the preservation of determinism, we extend the definition of the choice operator. Provided that c ∉ B

(x : B → P(x) | c → Q) ≡ (x : (B ∪ {c}) → (if x = c then Q else P(x)))

and similarly for more operands.

5.4.1 Catastrophe

Let ↯ be a symbol standing for a catastrophic interrupt event, which it is reasonable to suppose would not be caused by P; more formally

↯ ∉ αP

Then a process which behaves like P up to catastrophe and thereafter like Q is defined

P ˆ↯ Q = P △ (↯ → Q)

Here Q is perhaps a process which is intended to effect a recovery after catastrophe. Note that the infix operator ˆ↯ is distinguished from the event ↯ by the circumflex. The first law is just an obvious formulation of the informal description of the operator

L1 (P ˆ↯ Q) / (s⌢⟨↯⟩) = Q for s ∈ traces(P)

In the deterministic model, this single law uniquely identifies the meaning of the operator. In a nondeterministic universe, uniqueness would require additional laws stating strictness and distributivity in both arguments.

The second law gives a more explicit description of the first and subsequent steps of the process. It shows how ˆ↯ distributes back through →

L2 (x : B → P(x)) ˆ↯ Q = (x : B → (P(x) ˆ↯ Q) | ↯ → Q)

This law too uniquely defines the operator on deterministic processes.

5.4.2 Restart

One possible response to catastrophe is to restart the original process again. Let P be a process such that ↯ ∉ αP. We specify P̂ as a process which behaves as P until ↯ occurs, and after each ↯ behaves like P from the start again. Such a process is called restartable and is defined by the simple recursion

αP̂ = αP ∪ {↯}
P̂ = µ X • (P ˆ↯ X)
= P ˆ↯ (P ˆ↯ (P ˆ↯ ...))

This is a guarded recursion, since the occurrence of X is guarded by ↯. P̂ is certainly a cyclic process (Section 1.8.3), even if P is not.

Catastrophe is not the only reason for a restart. Consider a process designed to play a game, interacting with its human opponent by means of a selection of keys on a keyboard (see the description of the interact function of Section 1.4). Humans sometimes get dissatisfied with the progress of a game, and wish to start a new game again. For this purpose, a new and special key (↯) is provided on the keyboard; depression of this key at any point in the progress of the game will restart the game. It is convenient to define a game P independently of the restart facility and then transform it into a restartable game P̂ by using the operator defined above. This idea is due to Alex Teruel.

The informal definition of P̂ is expressed by the law

L1 P̂ / s⌢⟨↯⟩ = P̂ for s ∈ traces(P)

But this law does not uniquely define P̂, since it is equally well satisfied by RUN. However, P̂ is the smallest deterministic process that satisfies L1.

5.4.3 Alternation

Suppose P and Q are processes which play games in the manner described in Section 5.4.2; and a human player wishes to play both games simultaneously, alternating between them in the same way as a chess master plays a simultaneous match by cycling round several weaker opponents. We therefore provide a new key ⓧ, which causes alternation between the two games P and Q. This is rather like an interrupt, in that the current game is interrupted at an arbitrary point; but it differs from the interrupt in that the current state of the current game is preserved, so that it can be resumed when the other game is later interrupted. The process which plays the games P and Q simultaneously is denoted (P ⓧ Q), and it is most clearly specified by the laws

L1 ⓧ ∈ (α(P ⓧ Q) − αP − αQ)
L2 (P ⓧ Q) / s = (P / s) ⓧ Q if s ∈ traces(P)
L3 (P ⓧ Q) / ⟨ⓧ⟩ = (Q ⓧ P)

We want the smallest operator that satisfies L2 and L3. A more constructive description of the operator can be derived from these laws; it shows how ⓧ distributes backward through →

L4 (x : B → P(x)) ⓧ Q = (x : B → (P(x) ⓧ Q) | ⓧ → (Q ⓧ P))

The alternation operator is useful not only for playing games. A similar facility should be provided in a "friendly" operating system for alternating between system utilities. For example, you do not wish to lose your place in the editor on switching to a "help" program, nor vice versa.

5.4.4 Checkpoints

Let P be a process which describes the behaviour of a long-lasting data base system. When lightning (↯) strikes, one of the worst responses would be to restart P in its initial state, losing all the laboriously accumulated data of the system. It would be much better to return to some recent state of the system which is known to be satisfactory. Such a state is known as a checkpoint. We therefore provide a new key ⓒ, which should be pressed only when the current state of the system is known to be satisfactory. When ↯ occurs, the most recent checkpoint is restored; or if there is no checkpoint the initial state is restored. We suppose ⓒ and ↯ are not in the alphabet of P, and define Ch(P) as the process that behaves as P, but responds in the appropriate fashion to these two events.

The informal definition of Ch(P) is most succinctly formalised in the laws

L1 Ch(P) / (s⌢⟨↯⟩) = Ch(P)    for s ∈ traces(P)
L2 Ch(P) / (s⌢⟨©⟩) = Ch(P / s)    for s ∈ traces(P)

Ch(P) can be defined more explicitly in terms of the operator Ch2(P, Q), where P is the current process and Q is the most recent checkpoint waiting to be reinstated. If catastrophe occurs before the first checkpoint, the system restarts, as described by the laws

L3 Ch(P) = Ch2(P, P)
L4 If P = (x : B → P(x)) then
   Ch2(P, Q) = (x : B → Ch2(P(x), Q) | ↯ → Ch2(Q, Q) | © → Ch2(P, P))

The law L4 is suggestive of a practical implementation method, in which the checkpointed state is stored on some cheap but durable medium such as magnetic disc or tape. When © occurs, the current state is copied as the new checkpoint; when ↯ occurs, the checkpoint is copied back as the new current state. For reasons of economy, a system implementor ensures that as much data as possible is shared between the current and the checkpoint states. Such optimisation is highly machine and application dependent; it is pleasing that the mathematics is so simple.

The checkpointing operator is useful not only for large-scale data base systems. When playing a difficult game, a human player may wish to explore a possible line of play without committing himself to it. So he presses the © key to store the current position, and if his explorations are unsuccessful, use of the ↯ key will restore the status quo.

These ideas of checkpointing have been explored by Ian Hayes.

5.4.5 Multiple checkpoints

In using a checkpointable system Ch(P) it may happen that a checkpoint is declared in error. In such cases, it may be desirable to cancel the most recent checkpoint, and go back to the one before. For this we require a system which retains two or more of the most recently checkpointed states. In principle, there is no reason why we should not define a system Mch(P) which retains all checkpoints back to the beginning of time. Each occurrence of ↯ returns to the state just before the most recent ©, rather than the state just after it. As always we insist

αMch(P) − αP = {©, ↯}

A ↯ before a © goes back to the beginning

L1 Mch(P) / (s⌢⟨↯⟩) = Mch(P)    for s ∈ traces(P)

A ↯ after a © cancels the effect of everything that has happened back to and including the most recent ©

L2 Mch(P) / (s⌢⟨©⟩⌢t⌢⟨↯⟩) = Mch(P) / s    for (s ↾ αP)⌢t ∈ traces(P)

A much more explicit description of Mch(P) can be given in terms of a binary operator Mch2(P, Q), where P is the current process and Q is the stack of checkpoints waiting to be resumed if necessary. The initial content of the stack is an infinite sequence of copies of P

L3 Mch(P) = µ X • Mch2(P, X)
          = Mch2(P, Mch(P))
          = Mch2(P, Mch2(P, Mch2(P, …)))

On occurrence of © the current state is pushed down; on occurrence of ↯ the whole stack is reinstated

L4 If P = (x : B → P(x)) then
   Mch2(P, Q) = (x : B → Mch2(P(x), Q) | © → Mch2(P, Mch2(P, Q)) | ↯ → Q)

The pattern of recursions which appear in L4 is quite ingenious, but the multiple checkpoint facility could be very expensive to implement in practice when the number of checkpoints gets large.

5.4.6 Implementation

The implementation of the various versions of interrupt is based on laws which show how the operators distribute through →. Consider for example the alternation operator (5.4.3 L4)

alternation(P, Q) = λ x • if x = ⓧ then alternation(Q, P)
                          else if P(x) = "BLEEP then "BLEEP
                          else alternation(P(x), Q)

A more surprising implementation is that of Mch (5.4.5 L3, L4)

Mch(P) = Mch2(P, Mch(P))

where Mch2(P, Q) = λ x • if x = ↯ then Q
                         else if x = © then Mch2(P, Mch2(P, Q))
                         else if P(x) = "BLEEP then "BLEEP
                         else Mch2(P(x), Q)

When this function is executed, the amount of store used grows in proportion to the number of checkpoints, and available storage is very rapidly exhausted. Of course, the storage can be reclaimed by the garbage collector on each occurrence of ↯, but that is not really much consolation. As in the case of other recursions, constraints of practical implementation enforce a finite bound on the depth. In this case, the designer should impose a limit on the number of checkpoints retained, and discard the earlier ones. But such a design is not so elegantly expressible by recursion.

5.5 Assignment

In this section we shall introduce the most important aspects of conventional sequential programming, namely assignments, conditionals, and loops. To simplify the formulation of useful laws, some unusual notations will be defined.

The essential feature of conventional computer programming is assignment. If x is a program variable and e is an expression and P a process

(x := e ; P)

is a process which behaves like P, except that the initial value of x is defined to be the initial value of the expression e. Initial values of all other variables are unchanged. Assignment by itself can be given a meaning by the definition

(x := e) = (x := e ; SKIP)

Single assignment generalises easily to multiple assignment. Let x stand for a list of distinct variables

x = x0, x1, … xn−1

Let e stand for a list of expressions

e = e0, e1, … en−1

Provided that the lengths of the two lists are the same

x := e

assigns the initial value of ei to xi, for all i. Note that all the ei are evaluated before any of the assignments are made, so that if y occurs in g

y := f ; z := g

is quite different from

y, z := f, g

Let b be an expression that evaluates to a Boolean truth value (either true or false). If P and Q are processes

P <| b |> Q    (P if b else Q)

is a process which behaves like P if the initial value of b is true, or like Q if the initial value of b is false. The notation is novel, but less cumbersome than the traditional

if b then P else Q

For similar reasons, the traditional loop

while b do Q

will be written

b ∗ Q

This may be defined by recursion

D1 b ∗ Q = µ X • ((Q ; X) <| b |> SKIP)

Examples

X1 A process that behaves like CTn (1.1.4 X2)

X1 = µ X • ((around → X | up → (n := 1 ; X)) <| n = 0 |> (up → (n := n + 1 ; X) | down → (n := n − 1 ; X)))

The current value of the count is recorded in the variable n

X2 A process that behaves like CT0

n := 0 ; X1

The initial value of the count is set to zero.

X3 A process that behaves like POS (5.1 X8)

n := 1 ; (n > 0) ∗ (up → n := n + 1 | down → n := n − 1)

Recursion has been replaced by a conventional loop.

X4 A process which divides a natural number x by a positive number y assigning the quotient to q and the remainder to r

QUOT = (q := x ÷ y ; r := x − q × y)

X5 A process with the same effect as X4, which computes the quotient by the slow method of repeated subtraction

LONGQUOT = (q := 0 ; r := x ; ((r ≥ y) ∗ (q := q + 1 ; r := r − y)))

In a previous example (4.5 X3) we have shown how the behaviour of a variable can be modelled by a subordinate process which communicates its value with the process which uses it. In this chapter, we have deliberately rejected that technique, because it does not have the properties which we would like. For example, we want

(m := 1 ; m := 1) = (m := 1)

but unfortunately

(m.left!1 → m.left!1 → SKIP) ≠ (m.left!1 → SKIP)

5.5.1 Laws

In the laws for assignment, x and y stand for lists of distinct variables; e, f(x), f(e) stand for lists of expressions, possibly containing occurrences of variables in x or y; and f(e) contains ei whenever f(x) contains xi for all indices i. For simplicity, in the following laws we shall assume that all expressions always give a result, for any values of the variables they contain.

L1 (x := x) = SKIP
L2 (x := e ; x := f(x)) = (x := f(e))
L3 If x, y is a list of distinct variables
   (x := e) = (x, y := e, y)
L4 If x, y, z are of the same length as e, f, g respectively
   (x, y, z := e, f, g) = (x, z, y := e, g, f)


Using these laws, it is possible to transform every sequence of assignments into a single assignment to a list of all the variables involved. When <| b |> is considered as a binary infix operator, it possesses several familiar algebraic properties

L5–L6 <| b |> is idempotent, associative, and distributes through <| c |>
L7 P <| true |> Q = P
L8 P <| false |> Q = Q
L9 P <| ¬b |> Q = Q <| b |> P
L10 P <| b |> (Q <| b |> R) = P <| b |> R
L11 P <| (a <| b |> c) |> Q = (P <| a |> Q) <| b |> (P <| c |> Q)
L12 x := e ; (P <| b(x) |> Q) = (x := e ; P) <| b(e) |> (x := e ; Q)
L13 (P <| b |> Q) ; R = (P ; R) <| b |> (Q ; R)
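Several of these laws can be spot-checked on a small model. The following Python sketch is not part of the original text: it assumes that a state is a dictionary from variable names to values, that an expression is a function from states to values, and that a terminating process which engages in no events is a function from states to states. The names assign, seq, cond and same are illustrative, and agreement on a finite set of test states is evidence for a law, not a proof of it.

```python
# Sketch: event-free sequential processes modelled as functions state -> state.
# A state is a dict; an expression is a function from a state to a value.
# All names are illustrative assumptions, not notation from the text.

def assign(names, exprs):
    """Multiple assignment x := e; every e_i is evaluated in the initial state."""
    def run(s):
        values = [e(s) for e in exprs]   # evaluate all expressions first
        t = dict(s)
        t.update(zip(names, values))     # then perform all the updates
        return t
    return run

def seq(p, q):
    """Sequential composition (P ; Q)."""
    return lambda s: q(p(s))

def cond(p, b, q):
    """Conditional P <| b |> Q, decided by the initial value of b."""
    return lambda s: p(s) if b(s) else q(s)

def same(p, q, states):
    """True if p and q agree on every test state."""
    return all(p(s) == q(s) for s in states)

STATES = [{"x": x, "y": y, "z": z}
          for x in range(3) for y in range(3) for z in range(3)]

# L2: (x := e ; x := f(x)) = (x := f(e)), with e = y + 1 and f(x) = 2 * x
e = lambda s: s["y"] + 1
f_of_x = lambda s: 2 * s["x"]
f_of_e = lambda s: 2 * e(s)
law_l2 = same(seq(assign(["x"], [e]), assign(["x"], [f_of_x])),
              assign(["x"], [f_of_e]), STATES)

# (y := f ; z := g) differs from (y, z := f, g) when y occurs in g
f = lambda s: s["x"]
g = lambda s: s["y"] + 1
differ = not same(seq(assign(["y"], [f]), assign(["z"], [g])),
                  assign(["y", "z"], [f, g]), STATES)

# L9: P <| not b |> Q = Q <| b |> P
b = lambda s: s["x"] > s["y"]
p = assign(["z"], [lambda s: 0])
q = assign(["z"], [lambda s: 1])
law_l9 = same(cond(p, lambda s: not b(s), q), cond(q, b, p), STATES)

# L13: (P <| b |> Q) ; R = (P ; R) <| b |> (Q ; R)
r = assign(["x"], [lambda s: s["z"]])
law_l13 = same(seq(cond(p, b, q), r), cond(seq(p, r), b, seq(q, r)), STATES)
```

Evaluating all right-hand sides before any update is what makes the multiple assignment differ from the corresponding sequence of single assignments.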

To deal effectively with assignment in concurrent processes, it is necessary to impose a restriction that no variable assigned in one concurrent process can ever be used in another. To enforce this restriction, we introduce two new categories of symbol into the alphabets of sequential processes

var(P) the set of variables that may be assigned within P
acc(P) the set of variables that may be accessed in expressions within P.

All variables which may be changed may also be accessed

var(P) ⊆ acc(P) ⊆ αP

Similarly, we define acc(e) as the set of variables appearing in e. Now if P and Q are to be joined by ||, we stipulate that

var(P) ∩ acc(Q) = var(Q) ∩ acc(P) = {}

Under this condition, it does not matter whether an assignment takes place before a parallel split, or within one of its components after they are running concurrently

L14 ((x := e ; P) || Q) = (x := e ; (P || Q))
    provided that x ⊆ var(P) − acc(Q) and acc(e) ∩ var(Q) = {}

An immediate consequence of this is

(x := e ; P) || (y := f ; Q) = (x, y := e, f ; (P || Q))
    provided that x ⊆ var(P) − acc(Q) − acc(f) and y ⊆ var(Q) − acc(P) − acc(e)


This shows how the alphabet restriction ensures that assignments within one component process of a concurrent pair cannot interfere with assignments within the other. In an implementation, sequences of assignments may be carried out either together or in any interleaving, without making any difference to the externally observable actions of the process. Finally, concurrent combination distributes through the conditional

L15 P || (Q <| b |> R) = (P || Q) <| b |> (P || R)
    provided that acc(b) ∩ var(P) = {}.

This law again states that it does not matter whether b is evaluated before or after the parallel split.

We now deal with the problem which arises when expressions are undefined for certain values of the variables they contain. If e is a list of expressions, we define D e as a Boolean expression which is true just when all the operands of e are within the domains of their operators. For example, in natural number arithmetic,

D(x ÷ y) = (y > 0)
D(y + 1, z + y) = true
D(e + f) = D e ∧ D f
D(r − y) = y ≤ r

It is reasonable to insist that D e is always defined, i.e.,

D(D e) = true

We deliberately leave completely unspecified the result of an attempt to evaluate an undefined expression: anything whatsoever may happen. This is reflected by the use of CHAOS in the following laws.

L16' (x := e) = (x := e <| D e |> CHAOS)
L17' P <| b |> Q = ((P <| b |> Q) <| D b |> CHAOS)

Furthermore, the laws L2, L5, and L12 need slight modification

L2' (x := e ; x := f(x)) = (x := f(e) <| D e |> CHAOS)
L5' (P <| b |> P) = (P <| D b |> CHAOS)

5.5.2 Specifications

A speciﬁcation of a sequential process describes not only the traces of the events which occur, but also the relationship between these traces, the initial values of the program variables, and their ﬁnal values. To denote the initial value of a program variable x, we simply use the variable name x by itself. To


denote the final value, we decorate the name with a superscript ✓, as in x✓. The value of x✓ is not observable until the process is terminated, i.e., the last event of the trace is ✓. This fact is represented by not specifying anything about x✓ unless the last event of tr is ✓.

Examples

X1 A process which performs no action, but adds one to the value of x, and terminates successfully with the value of y unchanged

tr = ⟨⟩ ∨ (tr = ⟨✓⟩ ∧ x✓ = x + 1 ∧ y✓ = y)

X2 A process which performs an event whose symbol is the initial value of the variable x, and then terminates successfully, leaving the final values of x and y equal to their initial values

tr = ⟨⟩ ∨ tr = ⟨x⟩ ∨ (tr = ⟨x, ✓⟩ ∧ x✓ = x ∧ y✓ = y)

X3 A process which stores the identity of its first event as the final value of x

#tr ≤ 2 ∧ (#tr = 2 ⇒ (tr = ⟨x✓, ✓⟩ ∧ y✓ = y))

X4 A process which divides a nonnegative x by a positive y, and assigns the quotient to q and the remainder to r

DIV = (y > 0 ⇒ tr = ⟨⟩ ∨ (tr = ⟨✓⟩ ∧ q✓ = (x ÷ y) ∧ r✓ = x − (q✓ × y) ∧ y✓ = y ∧ x✓ = x))

Without the precondition, this specification would be impossible to meet in its full generality.

X5 Here are some more complex specifications which will be used later

DIVLOOP = (tr = ⟨⟩ ∨ (tr = ⟨✓⟩ ∧ r = (q✓ − q) × y + r✓ ∧ r✓ < y ∧ x✓ = x ∧ y✓ = y))

T(n) = r < n × y


All variables in these and subsequent specifications are intended to denote natural numbers, so subtraction is undefined if the second operand is greater than the first.

We shall now formulate the laws which underlie proofs that a process satisfies its specification. Let S(x, tr, x✓) be a specification. In order to prove that SKIP satisfies this specification, clearly the specification must be true when the trace is empty; furthermore, it must be true when the trace is ⟨✓⟩ and the final values of all variables x✓ are equal to their initial values. These two conditions are also sufficient, as stated in the following law

L1 If S(x, ⟨⟩, x✓) and S(x, ⟨✓⟩, x) then SKIP sat S(x, tr, x✓)

X6 The strongest specification satisfied by SKIP is

SKIP_A sat (tr = ⟨⟩ ∨ (tr = ⟨✓⟩ ∧ x✓ = x))

where x is a list of all variables in A and x✓ is a list of their ticked variants. X6 is an immediate consequence of L1 and vice versa.

X7 We can prove that

SKIP sat (r < y ⇒ (T(n + 1) ⇒ DIVLOOP))

Proof: (1) Replacing tr by ⟨⟩ in the specification gives

r < y ∧ T(n + 1) ⇒ ⟨⟩ = ⟨⟩ ∨ …

which is a tautology. (2) Replacing tr by ⟨✓⟩ and final values by initial values gives

r < y ∧ T(n + 1) ⇒ (⟨✓⟩ = ⟨⟩ ∨ (⟨✓⟩ = ⟨✓⟩ ∧ x = x ∧ y = y ∧ r = ((q − q) × y + r) ∧ r < y))

which is also a trivial theorem. This result will be used in X10.

It is a precondition of successful assignment x := e that the expressions e on the right-hand side should be defined. In this case, if P satisfies a specification S(x), (x := e ; P) satisfies the same specification, after modification to reflect the fact that the initial value of x is e.


L2 If P sat S(x) then (x := e ; P) sat (D e ⇒ S(e))

The law for simple assignment can be derived from L2 on replacing P by SKIP, and using X6 and 5.2 L1

L2A x0 := e sat (D e ∧ tr ≠ ⟨⟩ ⇒ tr = ⟨✓⟩ ∧ x0✓ = e ∧ x1✓ = x1 ∧ …)

A consequence of L2 is that for any P, the strongest fact one can prove about (x := 1/0 ; P) is

(x := 1/0 ; P) sat true

Whatever non-vacuous goal you may wish to achieve, it cannot be achieved by starting with an illegal assignment.

Examples

X8 SKIP sat (tr ≠ ⟨⟩ ⇒ tr = ⟨✓⟩ ∧ q✓ = q ∧ r✓ = r ∧ y✓ = y ∧ x✓ = x)

therefore

(r := x − q × y ; SKIP) sat (x ≥ q × y ∧ tr ≠ ⟨⟩ ⇒ tr = ⟨✓⟩ ∧ q✓ = q ∧ r✓ = (x − q × y) ∧ y✓ = y ∧ x✓ = x)

therefore

(q := x ÷ y ; r := x − q × y) sat (y > 0 ∧ x ≥ (x ÷ y) × y ∧ tr ≠ ⟨⟩ ⇒ tr = ⟨✓⟩ ∧ q✓ = (x ÷ y) ∧ r✓ = (x − (x ÷ y) × y) ∧ y✓ = y ∧ x✓ = x)

The specification on the last line is equivalent to DIV, defined in X4.

X9 Assume

X sat (T(n) ⇒ DIVLOOP)

therefore

(r := r − y ; X) sat (y ≤ r ⇒ (r − y < n × y ⇒ (tr = ⟨⟩ ∨ tr = ⟨✓⟩ ∧ (r − y) = …)))

therefore

(q := q + 1 ; r := r − y ; X) sat (y ≤ r ⇒ (r < (n + 1) × y ⇒ DIVLOOP′))

where

DIVLOOP′ = (tr = ⟨⟩ ∨ (tr = ⟨✓⟩ ∧ (r − y) = (q✓ − (q + 1)) × y + r✓ ∧ r✓ < y ∧ x✓ = x ∧ y✓ = y))

By elementary algebra of natural numbers

y ≤ r ⇒ (DIVLOOP′ ≡ DIVLOOP)

therefore

(q := q + 1 ; r := r − y ; X) sat (y ≤ r ⇒ (T(n + 1) ⇒ DIVLOOP))

This result will be used in X10.

For general sequential composition, a much more complicated law is required, in which the traces of the components are sequentially composed, and the initial state of the second component is identical to the final state of the first component. However, the values of the variables in this intermediate state are not observable; only the existence of such values is assured

L3 If P sat S(x, tr, x✓) and Q sat T(x, tr, x✓) and P does not diverge
   then (P ; Q) sat (∃ y, s, t • tr = (s ; t) ∧ S(x, s, y) ∧ T(y, t, x✓))

In this law, x is a list of all variables in the alphabet of P and Q, x✓ is a list of their subscripted variants, and y a list of the same number of fresh variables.

The specification of a conditional is the same as that of the first component if the condition is true, and the same as that of the second component if false.

L4 If P sat S and Q sat T then (P <| b |> Q) sat ((b ∧ S) ∨ (¬b ∧ T))

An alternative form of this law is sometimes more convenient

L4A If P sat (b ⇒ S) and Q sat (¬b ⇒ S) then (P <| b |> Q) sat S

Example

X10 Let

COND = (q := q + 1 ; r := r − y ; X) <| r ≥ y |> SKIP

and

X sat (T(n) ⇒ DIVLOOP)

then

COND sat (T(n + 1) ⇒ DIVLOOP)

The two sufficient conditions for this conclusion have been proved in X7 and X9; the result follows by L4A.

The proof of a loop uses the recursive definition given in 5.5 D1, and the law for unguarded recursion (3.7.1 L8). If R is the intended specification of the loop, we must find a specification S(n) such that S(0) is always true, and also

(∀ n • S(n)) ⇒ R

A general method to construct S(n) is to find a predicate T(n, x), which describes the conditions on the initial state x such that the loop is certain to terminate in less than n repetitions. Then define

S(n) = (T(n, x) ⇒ R)

Clearly, no loop can terminate in less than no repetitions, so if T(n, x) has been correctly defined T(0, x) will be false, and consequently S(0) will be true. The result of the proof of the loop will be ∀ n • S(n), i.e.,

∀ n • (T(n, x) ⇒ R)

Since n has been chosen as a variable which does not occur in R, this is equivalent to

(∃ n • T(n, x)) ⇒ R

No stronger specification can possibly be met, since ∃ n • T(n, x) is the precondition under which the loop terminates in some finite number of iterations.

Finally, we must prove that the body of the loop meets its specification. Since the recursive equation for a loop involves a conditional, this task splits into two. Thus we derive the general law

L5 If ¬T(0, x) and T(n, x) ⇒ D b
   and SKIP sat (¬b ⇒ (T(n, x) ⇒ R))
   and (X sat (T(n, x) ⇒ R)) ⇒ ((Q ; X) sat (b ⇒ (T(n + 1, x) ⇒ R)))
   then (b ∗ Q) sat ((∃ n • T(n, x)) ⇒ R)

Example

X11 We wish to prove that the program for long division by repeated subtraction (5.5 X5) meets its specification DIV. The task splits naturally in two. The second and more difficult part is to prove that the loop meets some suitably formulated specification, namely

(r ≥ y) ∗ (q := q + 1 ; r := r − y) sat (y > 0 ⇒ DIVLOOP)

First we need to formulate the condition under which the loop terminates in less than n iterations

T(n) = r < n × y

Here T(0) is obviously false; the clause ∃ n • T(n) is equivalent to y > 0, which is the precondition under which the loop terminates. The remaining steps of the proof of the loop have already been taken in X7 and X5. The rest of the proof is a simple exercise.

The laws given in this section are designed as a calculus of total correctness for purely sequential programs, which contain no input or output. If Q is such a program, then a proof that

Q sat (P(x) ∧ tr ≠ ⟨⟩ ⇒ tr = ⟨✓⟩ ∧ R(x, x✓))    (1)

establishes that if P(x) is true of the initial values of the variables when Q is started, then Q will terminate and R(x, x✓) will describe the relationship between the initial values x and the final values x✓. Thus (P(x), R(x, x✓)) form a precondition/postcondition pair in the sense of Cliff Jones. If R(x✓) does not mention the initial values x, the assertion (1) is equivalent to

P(x) ⇒ wp(Q, R(x))

where wp is Dijkstra's weakest precondition.

Thus in the special case of noncommunicating programs, the proof methods are mathematically equivalent to ones that are already familiar, though the explicit mention of “tr = ⟨⟩” and “tr = ⟨✓⟩” makes them notationally more clumsy. This extra burden is of course necessary, and therefore more acceptable, when the methods are extended to deal with communicating sequential processes.
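The specifications used in this example can also be tested by direct execution. The Python sketch below is not part of the original text and is no substitute for the proof: it runs the loop of LONGQUOT on small natural numbers and checks the postconditions of DIV and DIVLOOP, together with the termination bound T(n) = r < n × y. The function names are illustrative, and integer division on nonnegative Python integers is assumed to stand for ÷.

```python
# Sketch: execute LONGQUOT and test DIV, DIVLOOP and T(n) on small inputs.
# The function names are illustrative assumptions.

def longquot(x, y):
    """q := 0 ; r := x ; (r >= y) * (q := q + 1 ; r := r - y), for y > 0."""
    q, r = 0, x
    iterations = 0
    while r >= y:
        n = r // y + 1               # smallest n for which T(n) holds: r < n * y
        assert r < n * y             # T(n) before the body
        q, r = q + 1, r - y          # the body of the loop
        assert r < (n - 1) * y       # T(n - 1) after the body
        iterations += 1
    return q, r, iterations

def div_holds(x, y, q, r):
    """Final-state part of DIV: q' = x ÷ y and r' = x − q' × y."""
    return q == x // y and r == x - q * y

def divloop_holds(q0, r0, y, q, r):
    """Final-state part of DIVLOOP: r = (q' − q) × y + r' and r' < y."""
    return r0 == (q - q0) * y + r and r < y

def check(limit=30):
    """Test every x < limit and every 0 < y < limit."""
    for x in range(limit):
        for y in range(1, limit):
            q, r, iterations = longquot(x, y)
            n = x // y + 1           # T(n) holds of the state on entry to the loop
            ok = (div_holds(x, y, q, r)
                  and divloop_holds(0, x, y, q, r)
                  and iterations < n)
            if not ok:
                return False
    return True
```

The precondition y > 0 appears as the lower bound of the range for y; with y = 0 the loop would not terminate, which is why ∃ n • T(n) is needed as a precondition.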

5.5.3 Implementation

The initial and final states of a sequential process can be represented as a function which maps each variable name onto its value. A sequential process is defined as a function which maps its initial states onto its subsequent behaviour. Successful termination (✓) is represented by the atom "SUCCESS. A process which is ready to terminate will accept this symbol, which it maps, not onto another process, but onto the final state of its variables.

The process SKIP takes an initial state as a parameter, accepts "SUCCESS as its only action, and delivers its initial state as its final state

SKIP = λ s • λ y • if y ≠ "SUCCESS then "BLEEP else s

An assignment is similar, except that its final state is slightly changed

assign(x, e) = λ s • λ y • if y ≠ "SUCCESS then "BLEEP else update(s, x, e)

where update(s, x, e) is the function

λ y • if y = x then eval(e, s) else s(y)

and eval(e, s) is the result of evaluating the expression e in state s. If e is undefined in state s, we do not care what happens. Here, for simplicity, we have implemented only the single assignment. Multiple assignment is a little more complicated.

To implement sequential composition, it is necessary first to test whether the first operand has successfully terminated. If so, its final state is passed on to the second operand. If not, the first action is that of the first operand

sequence(P, Q) = λ s • if P(s)("SUCCESS) ≠ "BLEEP then Q(P(s)("SUCCESS))
                       else λ y • if P(s)(y) = "BLEEP then "BLEEP
                                  else sequence(P(s)(y), Q)

The implementation of the conditional is as a conditional

condition(P, b, Q) = λ s • if eval(b, s) then P(s) else Q(s)

The implementation of the loop (b ∗ Q) is left as an exercise.
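These definitions transcribe almost directly into any language with first-class functions. The Python sketch below is not part of the original text and makes several assumptions of its own: states are dictionaries rather than functions, an expression is a Python function from states to values (so that calling it plays the role of eval), and the recursion in sequence is carried by a helper then that works on behaviours. The functions prefix and loop are illustrative additions; loop is one possible answer to the exercise, following D1.

```python
# Sketch of 5.5.3: a process is a function from an initial state to a behaviour;
# a behaviour maps an event to the next behaviour, to BLEEP, or (for SUCCESS)
# to the final state. States are dicts here, which is an assumption.

BLEEP = "BLEEP"
SUCCESS = "SUCCESS"

def skip(s):
    """SKIP accepts only SUCCESS and delivers its initial state."""
    return lambda y: s if y == SUCCESS else BLEEP

def update(s, x, value):
    t = dict(s)
    t[x] = value
    return t

def assign(x, e):
    """Single assignment x := e, where e is a function from states to values."""
    return lambda s: (lambda y: update(s, x, e(s)) if y == SUCCESS else BLEEP)

def then(behaviour, q):
    """Run q from the final state of behaviour, once behaviour terminates."""
    final = behaviour(SUCCESS)
    if final != BLEEP:
        return q(final)              # first operand is ready to terminate
    def step(y):
        nxt = behaviour(y)           # otherwise the first operand acts first
        return BLEEP if nxt == BLEEP else then(nxt, q)
    return step

def sequence(p, q):
    return lambda s: then(p(s), q)

def condition(p, b, q):
    return lambda s: p(s) if b(s) else q(s)

def loop(b, q):
    """b * Q = mu X . ((Q ; X) <| b |> SKIP), following D1 (illustrative)."""
    def x(s):
        return condition(sequence(q, x), b, skip)(s)
    return x

def prefix(c, p):
    """(c -> P): illustrative helper, not defined in the text for this model."""
    return lambda s: (lambda y: p(s) if y == c else BLEEP)

# LONGQUOT from 5.5 X5
longquot = sequence(
    assign("q", lambda s: 0),
    sequence(
        assign("r", lambda s: s["x"]),
        loop(lambda s: s["r"] >= s["y"],
             sequence(assign("q", lambda s: s["q"] + 1),
                      assign("r", lambda s: s["r"] - s["y"])))))

start = {"x": 7, "y": 2, "q": 0, "r": 0}
final = longquot(start)(SUCCESS)

# A process that performs one event before terminating
count_up = sequence(prefix("up", assign("n", lambda s: s["n"] + 1)), skip)
```

Because Python evaluates eagerly, loop recurses once per iteration, so this sketch suits only short computations.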

Note that the definition of sequence given above is more complicated than that given in Section 5.3.3, because it takes a state as its first argument, and it has to supply the state as the first argument of its operands. Unfortunately, a similar complexity has to be introduced into the definitions of all the other operators given in earlier chapters. A simpler alternative would be to model variables as subordinate processes, but this would probably be a great deal less efficient than the use of conventional random access storage. When considerations of efficiency are added to those of mathematical convenience, there are adequate grounds for introducing the assignable program variable as a new primitive concept, rather than defining it in terms of previously introduced concepts.


6 Shared Resources

6.1 Introduction

In Section 4.5 we introduced the concept of a named subordinate process (m : R), whose sole task is to meet the needs of a single main process S; and for this we have used the notation

(m : R // S)

Suppose now that S contains or consists of two concurrent processes (P || Q), and that both P and Q require the services of the same subordinate process (m : R). Unfortunately, it is not possible for P and Q both to communicate with (m : R) along the same channels, because these channels would have to be in the alphabet of both P and Q, and then the definition of || would require that communications with (m : R) take place only when both P and Q communicate the same message simultaneously, which (as explained in 4.5 X6) is far from the required effect. What is needed is some way of interleaving the communications between P and (m : R) with those between Q and (m : R). In this way, (m : R) serves as a resource shared between P and Q; each of them uses it independently, and their interactions with it are interleaved.

When the identity of all the sharing processes is known in advance, it is possible to arrange that each sharing process uses a different set of channels to communicate with the shared resource. This technique was used in the story of the dining philosophers (Section 2.5); each fork was shared between two neighbouring philosophers, and the footman was shared among all five. Another example was 4.5 X6, in which a buffer was shared between two processes, one of which used only the left channel and the other used only the right channel.

A general method of sharing is provided by multiple labelling (Section 2.6.4), which effectively creates enough separate channels for independent communication with each sharing process. Individual communications along these channels are arbitrarily interleaved. But this method requires that the names of all the sharing processes are known in advance, and so it is not adequate for a subordinate process intended to serve the needs of a main process which

splits into an arbitrary number of concurrent subprocesses. This chapter introduces techniques for sharing a resource among many processes, even when their number and identities are not known in advance. It is illustrated by examples drawn from the design of an operating system.

6.2 Sharing by interleaving

The problem described in Section 6.1 arises from the use of the combinator || to describe the concurrent behaviour of processes, and this problem can often be avoided by using instead the interleaving form of concurrency (P ||| Q). Here, P and Q have the same alphabet and their communications with external (shared) processes are arbitrarily interleaved. Of course, this prohibits direct communication between P and Q; but indirect communication can be re-established through the services of a shared subordinate process of appropriate design, as shown in 4.5 X6 and in X2 below.

Example

X1 (Shared subroutine)

doub : DOUBLE // (P ||| Q)

Here, both P and Q may contain calls on the subordinate process

(doub.left!v → doub.right?x → SKIP)

Even though these pairs of communications from P and Q are arbitrarily interleaved, there is no danger that one of the processes will accidentally obtain an answer which should have been received by the other. To ensure this, all subprocesses of the main process must observe a strict alternation of communications on the left channel with communications on the right channel of the shared subordinate. For this reason, it seems worthwhile to introduce a specialised notation, whose exclusive use will guarantee observance of the required discipline. The suggested notation is reminiscent of a traditional procedure call in a high-level language, except that the value parameters are preceded by ! and the result parameters by ?, thus

doub!x?y = (doub.left!x → doub.right?y → SKIP)

The intended effect of sharing by interleaving is illustrated by the following series of algebraic transformations. When two sharing processes both simultaneously attempt to use the shared subroutine, matched pairs of communications are taken in arbitrary order, but the components of a pair of communications with one process are never separated by a communication with

another. For convenience, we use the following abbreviations

d!v for d.left!v      within a main process
d?x for d.right?x     within a main process
!v for right!v        within a subordinate process
?x for left?x         within a subordinate process

Let

D = ?x → !(x + x) → D
P = d!3 → d?y → P(y)
Q = d!4 → d?z → Q(z)
R = (d : D // (P ||| Q))

(as in X1 above), then

P ||| Q = d!3 → ((d?y → P(y)) ||| Q) □ d!4 → (P ||| (d?z → Q(z)))    [by 3.6.1 L7]

The sharing processes each start with an output to the shared process. It is the shared process that is offered the choice between them. But the shared process is willing to accept either, so after hiding the choice becomes nondeterministic

(d : D // (P ||| Q))
  = ((d : (!3 + 3 → D)) // ((d?y → P(y)) ||| Q)) ⊓ ((d : (!4 + 4 → D)) // (P ||| (d?z → Q(z))))
  = (d : D // (P(6) ||| Q)) ⊓ (d : D // (P ||| Q(8)))    [4.5.1 L1, 3.5.1 L5, etc.]

The shared process offers its result to whichever of the sharing processes is ready to take it. Since one of these processes is still waiting for output, it is the process which provided the argument that gets the result. That is why strict alternation of output and input is so important in calling a shared subroutine.

Example

X2 (Shared data structure) In an airline flight reservation system, bookings are made by many reservation clerks, whose actions are interleaved. Each reservation adds a passenger to the flight list, and returns an indication whether that passenger was already booked or not. For this oversimplified example, the set implemented in 4.5 X8 will serve as a shared subordinate process, named by the flight number

AG109 : SET // (… (CLERK ||| CLERK ||| …) …)

Each CLERK books a passenger by the call

AG109!pass no?x

which stands for

(AG109.left!pass no → AG109.right?x → SKIP)

In these two examples, each occasion of use of the shared resource involves exactly two communications, one to send the parameters and the other to receive the results; after each pair of communications, the subordinate process returns to a state in which it is ready to serve another process, or the same one again. But frequently we wish to ensure that a whole series of communications takes place between two processes, without danger of interference by a third process. For example, a single expensive output device may have to be shared among many concurrent processes. On each occasion of use, a number of lines constituting a file must be output consecutively, without any danger of interleaving of lines sent by another process. For this purpose, the output of a file must be preceded by an acquire which obtains exclusive use of the resource; and on completion, the resource must be made available again by a release.

Examples

X3 (Shared line printer)

LP = acquire → µ X • (left?s → h!s → X | release → LP)

Here, h is the channel which connects LP to the hardware of the line printer. After acquisition, the process LP copies successive lines from its left channel to its hardware, until a release signal returns it to its original state, in which it is available for use by any other processes. This process is used as a shared resource

lp.acquire → … lp.left!“A. JONES” → … lp.left!nextline → … lp.release → …
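The effect of acquire and release in X3 can be observed in a small simulation. The Python sketch below is not part of the original text: it assumes that LP is a two-state machine, that each user process is a fixed list of events (acquire, some lines, release), and that a seeded random scheduler stands for the arbitrary interleaving of |||. The names LP.accepts, client, run and contiguous are illustrative.

```python
# Sketch: the shared line printer of X3 under arbitrary interleaving.
# The scheduler and all names are assumptions made for illustration.
import random

class LP:
    """LP = acquire -> mu X . (left?s -> h!s -> X | release -> LP)."""

    def __init__(self):
        self.held = False
        self.hardware = []               # lines output on channel h

    def accepts(self, event):
        if event[0] == "acquire":
            return not self.held         # acquire is refused while LP is in use
        return self.held                 # left and release need a prior acquire

    def do(self, event):
        assert self.accepts(event)
        if event[0] == "acquire":
            self.held = True
        elif event[0] == "release":
            self.held = False
        else:                            # ("left", line)
            self.hardware.append(event[1])

def client(lines):
    """lp.acquire -> lp.left!line -> ... -> lp.release"""
    return [("acquire",)] + [("left", line) for line in lines] + [("release",)]

def run(clients, seed):
    """Interleave the clients at random, subject to what LP will accept."""
    rng = random.Random(seed)
    lp = LP()
    pending = [list(c) for c in clients]
    while any(pending):
        ready = [c for c in pending if c and lp.accepts(c[0])]
        chosen = rng.choice(ready)
        lp.do(chosen.pop(0))
    return lp.hardware

def contiguous(output, files):
    """True if the lines of every file appear consecutively in the output."""
    for f in files:
        start = output.index(f[0])
        if output[start:start + len(f)] != f:
            return False
    return True

FILES = [["a1", "a2", "a3"], ["b1", "b2"], ["c1"]]
outputs = [run([client(f) for f in FILES], seed) for seed in range(20)]
```

As in the definition of LP, the printer does not record which process has acquired it; the discipline depends on every user beginning with acquire and ending with release.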

X4 (An improvement on X3) When a line printer is shared between many users, the length of paper containing each file must be manually detached after output from the previous and the following files. For this purpose, the printing paper is usually divided into pages, which are separated by perforations; and the hardware of the printer allows an operation throw, which moves the paper rapidly to the end of the current page (or better, to the next outward-facing fold in the paper stack). To assist in separation of output, files should begin and end on page boundaries, and a complete row of asterisks should be printed at the end of the last page of the file, and at the beginning of the first page. To prevent confusion, no complete line of asterisks is permitted to be printed in the middle of a file

    LP = (h!throw → h!asterisks → acquire → h!asterisks →
          µ X • (left?s → if s ≠ asterisks then h!s → X else X
                | release → LP))

This version of LP is used in exactly the same way as the previous one.

In the last two examples, the use of the signals acquire and release prevents arbitrary interleaving of lines from distinct files, and it does so without introducing the danger of deadlock. But if more than one resource is to be shared in this fashion, the risk of deadlock cannot be ignored.

Example

X5 (Deadlock) Ann and Mary are good friends and good cooks; they share a pot and a pan, which they acquire, use and release as they need them

    UTENSIL = (acquire → use → use → ... → release → UTENSIL)

    pot : UTENSIL // pan : UTENSIL // (ANN ||| MARY)

Ann cooks in accordance with a recipe which requires a pot first and then a pan, whereas Mary needs a pan first, then a pot

    ANN = ... pot.acquire → ... pan.acquire → ...
    MARY = ... pan.acquire → ... pot.acquire → ...

Unfortunately, they decide to prepare a meal at about the same time. Each of them acquires her first utensil; but when she needs her second utensil, she finds that she cannot have it, because it is being used by the other.
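The deadlock of X5 can be checked mechanically by enumerating every interleaving of the two cooks. The following is a minimal sketch, not part of the original text: the function name `can_deadlock` and the simplification that each cook releases all her utensils in a single final step are assumptions made for illustration.

```python
from collections import deque

def can_deadlock(orders):
    """Explore all interleavings of cooks who acquire utensils in the given
    orders and then release them all in one step (an assumed simplification).
    Return True if some reachable state leaves an unfinished cook with no move."""
    done = [len(o) + 1 for o in orders]        # progress value meaning "finished"
    start = tuple(0 for _ in orders)           # nobody holds anything yet
    seen, todo = {start}, deque([start])
    while todo:
        state = todo.popleft()
        # Utensils currently held by cooks who have not yet finished.
        held = {orders[c][k]
                for c, p in enumerate(state) if p < done[c]
                for k in range(p)}
        moves = []
        for c, p in enumerate(state):
            if p == done[c]:
                continue                       # this cook has finished
            # Either release everything, or acquire the next utensil if it is free.
            if p == len(orders[c]) or orders[c][p] not in held:
                moves.append(state[:c] + (p + 1,) + state[c + 1:])
        if not moves and any(p != done[c] for c, p in enumerate(state)):
            return True                        # somebody is stuck forever
        for nxt in moves:
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return False

ANN = ("pot", "pan")
MARY = ("pan", "pot")
opposite_orders = can_deadlock([ANN, MARY])
same_order = can_deadlock([("pan", "pot"), ("pan", "pot")])
```

With the opposite acquisition orders of ANN and MARY the search finds the state in which each cook holds one utensil and waits for the other; when both cooks acquire in the same order, no such state is reachable.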

[Figure 6.1: state-space plot of the two cooks, with Mary's actions along the horizontal axis and Ann's actions along the vertical axis.]

The story of Ann and Mary can be visualised on a two-dimensional plot (Figure 6.1), where the life of Ann is displayed along the vertical axis and Mary's life on the horizontal. The system starts in the bottom left hand corner, at the beginning of both their lives. Each time Ann performs an action, the system moves one step upward. Each time Mary performs an action, the system moves one step right. The trajectory shown on the graph shows a typical interleaving of Ann's and Mary's actions. Fortunately, this trajectory reaches the top right hand corner of the graph where both cooks are enjoying their meal.

But this happy outcome is not certain. Because they cannot simultaneously use a shared utensil, there are certain rectangular regions in the state space through which the trajectory cannot pass. For example, in one of the hatched regions both cooks would be using the pan, and this is not possible. Similarly, exclusion on the use of the pot prohibits entry into the other hatched region. Thus if the trajectory reaches the edge of one of these forbidden regions, it can only follow the edge upward (for a vertical edge) or rightward (for a horizontal edge). During this period, one of the cooks is waiting for release of a utensil by the other.

Now consider the zone marked with dots. If ever the trajectory enters this zone, it will inevitably end in deadlock at the top right hand corner of the zone. The purpose of the picture is to show that the danger of deadlock arises solely as a result of a concavity in the forbidden region which faces towards the origin: other concavities are quite safe. The picture also shows that the only sure way of preventing deadlock is to extend the forbidden region to

cover the danger zone, and so remove the concavity. One technique would be to introduce an additional artificial resource which must be acquired before either utensil, and must not be released until both utensils have been released. This solution is similar to the one imposed by the footman in the story of the dining philosophers (Section 2.5.3) where permission to sit down is a kind of resource, of which only four instances are shared among five philosophers. An easier solution is to insist that any cook who is going to want both utensils must acquire the pan first. This example is due to E. W. Dijkstra.

The easier solution suggested for the previous example generalises to any number of users, and any number of resources. Provided that there is a fixed order in which all users acquire the resources they want, there is no risk of deadlock. Users should release the resources as soon as they have finished with them; the order of release does not matter. Users may even acquire resources out of order, provided that at the time of acquisition they have already released all resources which are later in the standard ordering. Observance of this discipline of resource acquisition and release can often be checked by a visual scan of the text of the user processes.

6.3 Shared storage

The purpose of this section is to argue against the use of shared storage; the section may be omitted by those who are already convinced.

The behaviour of systems of concurrent processes can readily be implemented on a single conventional stored program computer, by a technique known as timesharing, in which a single processor executes each of the processes in alternation, with process change on occurrence of interrupt from an external device or from a regular timer. In this implementation, it is very easy to allow the concurrent processes to share locations of common storage, which are accessed and assigned simply by means of the usual machine instructions within the code for each of the processes.

A location of shared storage can be modelled in our theory as a shared variable (4.2 X7) with the appropriate symbolic name, for example

    (count : VAR // (count.left!0 → (P ||| Q)))

Shared storage must be clearly distinguished from the local storage described in 5.5. The simplicity of the laws for reasoning about sequential processes derives solely from the fact that each variable is updated by at most one process; and these laws do not deal with the many dangers that arise from arbitrary interleaving of assignments from different processes.

These dangers are most clearly illustrated by the following example.

Example

X1 (Interference) The shared variable count is used to keep a count of the total number of occurrences of some important event. On each occurrence of the event, the relevant process P or Q attempts to update the count by the pair of communications

    count.right?x; count.left!(x + 1)

Unfortunately, these two communications may be interleaved by a similar pair of communications from the other process, resulting in the sequence

    count.right?x → count.right?y → count.left!(y + 1) → count.left!(x + 1) → ...

As a consequence, the value of the count is incremented only by one instead of two. This kind of error is known as interference, and it is an easy mistake in the design of processes which share common storage. Further, the actual occurrence of the fault is highly nondeterministic; it is not reliably reproducible, and so it is almost impossible to diagnose the error by conventional testing techniques. As a result, I suspect that there are several operating systems in current use which regularly produce slightly inaccurate summaries, statistics, and accounts.

A possible solution to this problem is to make sure that no change of process takes place during a sequence of actions which must be protected from interleaving. Such a sequence is known as a critical region. On an implementation by a single processor, the required exclusion is often achieved by inhibiting all interrupts for the duration of the critical region. This solution has an undesirable effect in delaying response to interrupts; and worse, it fails completely as soon as a second processing unit is added to the computer.

A better solution was suggested by E. W. Dijkstra in his introduction of the binary exclusion semaphore. A semaphore may be described as a process which engages alternately in actions named P and V

    SEM = (P → V → SEM)

This is declared as a shared resource

    (mutex : SEM // ...)

Each process, on entry into a critical region, must send the signal mutex.P,

and on exit from the critical region must engage in the event mutex.V. Thus the critical region in which the count is incremented should appear

    mutex.P → count.right?x → count.left!(x + 1) → mutex.V → ...

Provided that all processes observe this discipline, it is impossible for two processes to interfere with each other's updating of the count. But if any process omits a P or a V, or gets them in the wrong order, the effect will be chaotic, and will risk a disastrous or (perhaps worse) a subtle error.

A much more robust way to prevent interference is to build the required protection into the very design of the shared storage, taking advantage of knowledge of its intended pattern of usage. For example, if a variable is to be used only for counting, then the operation which increments it should be a single atomic operation

    count.up

and the shared resource should be designed like CT0 (1.1.4 X2)

    count : CT0 // (... P ||| Q ...)

In fact there are good reasons for recommending that each shared resource be specially designed for its purpose, and that pure storage should not be shared in the design of a system using concurrency. This not only avoids the grave dangers of accidental interference; it also produces a design that can be implemented efficiently on networks of distributed processing elements as well as single-processor and multiprocessor computers with physically shared store.

6.4 Multiple resources

In Section 6.2, we described how a number of concurrent processes with different behaviour could share a single subordinate process. Each sharing process observes a discipline of alternating input and output, or alternating acquire and release signals, to ensure that at any given time the resource is used by at most one of the potentially sharing processes. Such resources are known as serially reusable. In this section we introduce arrays of processes to represent multiple resources with identical behaviour; and indices in the array ensure that each element communicates safely with the process that has acquired it.

We shall therefore make substantial use of indices and indexed operators, with obvious meaning. For example

    ||_{i<12} P_i = (P_0 || P_1 || ... || P_11)
    |||_{i<4} P = (P ||| P ||| P ||| P)
    ||_{i≥0} P_i = (P_0 || P_1 || ...)
    □_{i≥0} (f(i) → P_i) = (f(0) → P_0 | f(1) → P_1 | ...)

In the last example, we insist that f is a one-one function, so that the choice between the alternatives is made solely by the environment.

Examples

X1 (Re-entrant subroutine) A shared subroutine that is serially reusable can be used by only one calling process at a time. If the execution of the subroutine requires a considerable calculation, there could be corresponding delays to the calling processes. If several processors are available to perform the calculations, there is good reason to allow several instances of the subroutine to proceed concurrently on different processors. A subroutine capable of several concurrent instances is known as re-entrant, and it is defined as an array of concurrent processes

    doub : (||_{i<27} (i : DOUBLE)) // ...

A typical call of this subroutine could be

    (doub.3.left!30 → doub.3.right?y → SKIP)

The use of the index 3 ensures that the result of the call is obtained from the same instance of doub to which the arguments were sent, even though some other concurrent process may at the same time call another instance of the array, resulting in an interleaving of the messages

    doub.3.left.30, ... doub.2.left.20, ... doub.3.right.60, ... doub.2.right.40, ...

When a process calls a re-entrant subroutine, it really does not matter which element of the array responds to the call; any one that happens to be free will be equally good. So rather than specifying a particular index 2 or 3, a calling process should leave the selection arbitrary, by using the construct

    □_{i≥0} (doub.i.left!30 → doub.i.right?y → SKIP)

This still observes the essential discipline that the same index is used for sending the arguments and (immediately after) for receiving the result.
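As a rough illustration of this calling discipline, the array of 27 DOUBLE instances can be imitated with threads and queues. This is only a sketch under stated assumptions: the names `double_instance`, `call_doub` and the `free` queue of idle indices are hypothetical helpers, and the `free` queue stands in for the arbitrary choice that the CSP construct resolves by readiness.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

N = 27  # size of the array, as in doub : (||_{i<27} (i : DOUBLE))

def double_instance(left, right):
    """One element of the array: repeatedly input x on left, output x + x on right."""
    while True:
        x = left.get()
        right.put(x + x)

lefts = [queue.Queue() for _ in range(N)]
rights = [queue.Queue() for _ in range(N)]
free = queue.Queue()  # assumption: indices of instances that are currently idle

for i in range(N):
    threading.Thread(target=double_instance,
                     args=(lefts[i], rights[i]), daemon=True).start()
    free.put(i)

def call_doub(x):
    """Call any idle instance i, using the same index for argument and result."""
    i = free.get()               # wait for an idle instance; the caller does not care which
    try:
        lefts[i].put(x)          # doub.i.left!x
        return rights[i].get()   # doub.i.right?y, on the same index i
    finally:
        free.put(i)              # the instance is ready to serve another caller

# Several callers proceed concurrently without seeing each other's results.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(call_doub, [10, 20, 30]))
```

Because each caller reads its result from the same index to which it sent its argument, concurrent calls cannot receive one another's answers, whichever instances they happen to obtain.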

In the example shown above, there is an arbitrary limit of 27 simultaneous activations of the subroutine. Since it is fairly easy to arrange that a single processor divides its attention among a much larger number of processes, such arbitrary limits can be avoided by introducing an infinite array of concurrent processes

    doub : (||_{i≥0} i : D)

where D can now be designed to serve only a single call and then stop

    D = left?x → right!(x + x) → STOP

A subroutine with no bound on its re-entrancy is known as a procedure. The intention in using a procedure is that the effect of each call

    □_{i≥0} (doub.i.left!x → doub.i.right?y → SKIP)

should be identical to the call of a subordinate process D declared immediately adjacent to the call

    (doub : D // (doub.left!x → doub.right?y → SKIP))

This latter is known as a local procedure call, since it suggests execution of the procedure on the same processor as the calling process; whereas the call of a shared procedure is known as a remote call, since it suggests execution on a separate possibly distant processor. Since the effect of remote and local calls is intended to be the same, the reasons for using the remote call can only be political or economic, e.g., to keep the code of the procedure secret, or to run it on a machine with special facilities which are too expensive to provide on the machine on which the using processes run. A typical example of an expensive facility is a high-volume backing store, such as a disk or bubble memory.

X2 (Shared backing storage) A storage medium is split into B sectors which can be read and written independently. Each sector can store one block of information, which it inputs on the left and outputs on the right. Unfortunately the storage medium is implemented in a technology with destructive read-out, so that each block written can be read only once. Thus each sector behaves like COPY (4.2 X1) rather than VAR (4.2 X7). The whole backing store is an array of such sectors, indexed by numbers less than B

    BSTORE = ||_{i<B} i : COPY

This store is intended for use as a subordinate process

    (back : BSTORE // ...)

Within its main process, the store may be used by the communications

    back.i.left!bl → ... back.i.right?y → ...

The backing store may also be shared by concurrent processes. In this case, the action

    □_{i<B} (back.i.left!bl → ...)

will simultaneously acquire an arbitrary free sector with number i, and write the value of bl into it. Similarly, back.i.right?x will in a single action both read the content of sector i into x and release this sector for use on another occasion, very possibly by another process. It is this simplification that is the real motive for using COPY to model the behaviour of each sector: the story of destructive read-out is just a story.

Of course, successful sharing of this backing store requires the utmost discipline on the part of the sharing processes. A process may input from a sector only if the same process has most recently output to that very sector; and each output must eventually be followed by such an input. Failure to observe such disciplines will lead to deadlock, or even worse confusion. Methods of enforcing this discipline painlessly will be introduced after the next example, and will be extensively illustrated in the subsequent design of modules of an operating system (Section 6.5).

X3 (Two line printers) Two identical line printers are available to serve the demands of a collection of using processes. They both need the kind of protection from interleaving that was provided by LP (6.2 X4). We therefore declare an array of two instances of LP, each of which is indexed by a natural number indicating its position in the array

    LP2 = (0 : LP || 1 : LP)

This array may itself be given a name for use as a shared resource

    (lp : LP2 // ...)

Each instance of LP is now prefixed twice, once by a name and once by an index; thus communications with the using process have three or four components, e.g.,

    lp.0.acquire, lp.1.left."A. JONES", ...

As in the case of a re-entrant procedure, when a process needs to acquire one of an array of identical resources, it really cannot matter which element of the array is selected on a given occasion. Any element which is ready to respond to the acquire signal will be acceptable. A general choice construction will make the required arbitrary choice

    □_{i≥0} (lp.i.acquire → ... lp.i.left!x → ... lp.i.release → SKIP)

Here, the initial lp.i.acquire will acquire whichever of the two LP processes is ready for this event. If neither is ready, the acquiring process will wait; if both are ready, the choice between them is nondeterministic. After the initial acquisition, the bound variable i takes as its value the index of the selected resource, and all subsequent communications will be correctly directed to that same resource.

When a shared resource has been acquired for temporary use within another process, the resource is intended to behave exactly like a locally declared subordinate process, communicating only with its using subprocess. Let us therefore adapt the familiar notation for subordination, and write

    (myfile :: lp // ... myfile.left!x ...)

instead of the much more cumbersome construction

    □_{i≥0} (lp.i.acquire → ... lp.i.left!x ... lp.i.release → SKIP)

Here, the local name myfile has been introduced to stand for the indexed name lp.i, and the technicalities of acquisition and release have been conveniently suppressed. The new "::" notation is called remote subordination; it is distinguished from the familiar ":" notation in that it takes on its right, not a complete process, but the name of a remotely positioned array of processes.

X4 (Two output files) A using process requires simultaneous use of two line printers to output two files, f1 and f2

    (f1 :: lp // (f2 :: lp // ... f1.left!s1 → f2.left!s2 → ...))

Here, the using process interleaves output of lines to the two different files; but each line is printed on the appropriate printer. Of course, deadlock will be the certain result of any attempt to declare three printers simultaneously; it is also a likely result of declaring two printers simultaneously in each of two concurrent processes, as shown in the history of Ann and Mary (6.2 X5).

X5 (Scratch file) A scratch file is used for output of a sequence of blocks. When the output is complete, the file is rewound, and the entire sequence of blocks is read back from the beginning. When all the blocks have been read, the scratch file will then give only empty signals; no further reading or writing is possible. Thus a scratch file behaves like a file output to magnetic tape, which must be rewound before being read. The empty signal serves as an end-of-file marker

    SCRATCH = WRITE_⟨⟩
    WRITE_s = (left?x → WRITE_{s⌢⟨x⟩} | rewind → READ_s)
    READ_{⟨x⟩⌢s} = (right!x → READ_s)
    READ_⟨⟩ = (empty → READ_⟨⟩)
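The two phases of SCRATCH can be pictured as a small state machine. The sketch below is an illustration only: the class name `Scratch`, the sentinel `EMPTY`, and the use of exceptions are assumptions. In particular, a CSP process simply refuses an event that it is not prepared to engage in, whereas this sketch raises an error.

```python
EMPTY = object()  # assumption: a sentinel standing for the empty signal

class Scratch:
    """Write phase (WRITE_s), then read phase (READ_s) after a single rewind."""

    def __init__(self):
        self._blocks = []   # the sequence s of blocks written so far
        self._pos = None    # None while writing; read position after rewind

    def left(self, x):
        """left?x: append a block; only possible before rewind."""
        if self._pos is not None:
            raise RuntimeError("no writing after rewind")
        self._blocks.append(x)

    def rewind(self):
        """rewind: switch from the write phase to the read phase."""
        if self._pos is not None:
            raise RuntimeError("already rewound")
        self._pos = 0

    def right(self):
        """right!x while blocks remain, then the empty signal for ever after."""
        if self._pos is None:
            raise RuntimeError("rewind before reading")
        if self._pos == len(self._blocks):
            return EMPTY
        x = self._blocks[self._pos]
        self._pos += 1
        return x

f = Scratch()
f.left("block 1")
f.left("block 2")
f.rewind()
read_back = [f.right(), f.right(), f.right(), f.right()]
```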

This may conveniently be used as a simple unshared subordinate process

    (myfile : SCRATCH // ... myfile.left!v ... myfile.rewind ...
        (myfile.right?x → ... | myfile.empty → ...) ...)

It will serve later as a model for a shared process.

X6 (Scratch files on backing store) The scratch file described in X5 can be readily implemented by holding the stored sequence of blocks in the main store of a computer. But if the blocks are large and the sequence is long, this could be an uneconomic use of main store, and it would be better to store the blocks on a backing store. Since each block in a scratch file is read and written only once, a backing store (X2) with destructive read-out will suffice. An ordinary scratch file (held in main store) is used to hold the sequence of indices of the sectors of backing store on which the corresponding actual blocks of information are held; this ensures that the correct blocks are read back, and in the correct sequence

    BSCRATCH = (pagetable : SCRATCH //
        µ X • (left?x → (□_{i<B} back.i.left!x → pagetable.left!i → X)
              | rewind → pagetable.rewind →
                µ Y • (pagetable.right?i → back.i.right?x → right!x → Y
                      | pagetable.empty → empty → Y)))

BSCRATCH uses the name back to address a backing store (X2) as a subordinate process. This should be supplied

    SCRATCHB = (back : BSTORE // BSCRATCH)

SCRATCHB can be used as a simple unshared subordinate process in exactly the same way as the scratch file of X5

    (myfile : SCRATCHB // ... myfile.left!v ...)

The effect is exactly the same as use of SCRATCH, except that the maximum length of the scratch file is limited to B blocks.

X7 (Serially reused scratch files) Suppose we want to share the scratch file on backing store among a number of interleaved users, who will acquire, use, and release it one at a time in the manner of a shared line printer (6.2 X3). For this purpose, we must adapt BSCRATCH to accept acquire and release signals.

If a user releases his scratch file before reading to the end, there is a danger that the unread blocks on backing store will never be reclaimed. This danger can be averted by a loop that reads back these blocks and discards them

    SCAN = µ X • (pagetable.right?i → back.i.right?x → X
                 | pagetable.empty → SKIP)

A shared scratch file acquires its user, and then behaves as BSCRATCH. The release signal causes an interrupt (Section 5.4) to the SCAN process

    SHBSCR = acquire → (BSCRATCH △ (release → SCAN))

The serially reusable scratch file is provided by the simple loop

    *SHBSCR

which uses BSTORE as a subordinate process

    back : BSTORE // *SHBSCR

X8 (Multiplexed scratch files) In the previous two examples only one scratch file is in use at a time. A backing store is usually sufficiently large to allow many scratch files to exist simultaneously, each occupying a disjoint subset of the available sectors. The backing store can therefore be shared among an unbounded array of scratch files. Each scratch file acquires a sector when needed by outputting to it, and releases it automatically on inputting that block again. The backing store is shared by the technique of multiple labelling (Section 2.6.4), using as labels the same indices (natural numbers) which are used in constructing the array of sharing processes

    FILESYS = N : (back : BSTORE) // (||_{i≥0} i : SHBSCR)
        where N = { i | i ≥ 0 }

This filing system is intended to be used as a subordinate process, shared by interleaving among any number of users

    filesys : FILESYS // ... (USER1 ||| USER2 ||| ...)

Inside each user, a fresh scratch file can be acquired, used, and released by remote subordination

    myfile :: filesys // (... myfile.left!v ... myfile.rewind ... myfile.right?x ...)

which is intended (apart from resource limitations) to have exactly the same effect as the simple subordination of a private scratch file (X5)

    (myfile : SCRATCH // ... myfile.left!v ... myfile.rewind ... myfile.right?x ...)

The structure of the filing system (X8) and its mode of use is a paradigm solution of the problem of sharing a limited number of actual resources (sectors on backing store) among an unknown number of users. The users do not communicate directly with the resources; there is an intermediary virtual resource (the SHBSCR) which they declare and use as though it were a private subordinate process. The function of the virtual resource is twofold

(1) it provides a nice clean interface to the user; in this example, SHBSCR glues together into a single contiguous scratch file a set of sectors scattered on backing store.

(2) it guarantees a proper, disciplined access to the actual resources; for example, the process SHBSCR ensures that each user reads only from sectors allocated to that user, and cannot forget to release sectors on finishing with a scratch file.

Point (1) ensures that the discipline of Point (2) is painless.

The paradigm of actual and virtual resources is very important in the design of resource-sharing systems. The mathematical definition of the paradigm is quite complicated, since it uses an unbounded set of natural numbers to implement the necessary dynamic creation of new virtual processes, and of new channels through which to communicate with them. In a practical implementation on a computer, these would be represented by control blocks, pointers to activation records, etc. To use the paradigm effectively, it is certainly advisable to forget the implementation method. But for those who wish to understand it more fully before forgetting it, the following further explanation of X8 may be helpful.

Inside a user process a scratch file is created by remote subordination

    myfile :: filesys // (... myfile.left!v ... myfile.rewind ... myfile.right?x ...)

By definition of remote subordination this is equivalent to

    (□_{i≥0} filesys.i.acquire → ... filesys.i.left!v ... filesys.i.rewind ...
        filesys.i.right?x ... filesys.i.release → SKIP)

Thus all communications between filesys and its users begin with filesys.i ... where i is the index of the particular instance of SHBSCR which has been acquired by a particular user on a particular occasion. Furthermore, each occasion of its use is surrounded by a matching pair of signals filesys.i.acquire and filesys.i.release.

On the side of the subordinate process, each virtual scratch file begins by acquiring its user, and then continues according to the pattern of X5 and X6

    (acquire → ... left?x ... rewind ... right!v ... release ...)

All other communications of the virtual scratch file are with the subordinate BSTORE process, and are concealed from the user. Each instance of the virtual scratch file is indexed by a different index i, and then named by the name filesys. So the externally visible behaviour of each instance is

    (filesys.i.acquire → filesys.i.left?x ... filesys.i.rewind ...
        filesys.i.right!v ... filesys.i.release)

This exactly matches the user's pattern of communication as described above. The matching pairs of acquire and release signals ensure that no user can interfere with a scratch file that has been acquired by another user.

We turn now to communications within FILESYS between the virtual scratch files and the backing store. These are concealed from the user, and do not even have the name filesys attached to them. The relevant events are:

    i.back.j.left.v denotes communication of block v from the ith element of the array of scratch files to the jth sector of backing store
    i.back.j.right.v denotes a communication in the reverse direction.

Each sector of the backing store behaves like COPY. After indexing with a sector number j and naming by back, the jth sector behaves like

    µ X • (back.j.left?x → back.j.right!x → X)

After multiple labelling by natural numbers it behaves like

    µ X • (□_{i≥0} i.back.j.left?x → (□_{k≥0} k.back.j.right!x → X))

This is now ready to communicate on any occasion with any element of the array of virtual scratch files. Each individual scratch file observes the discipline of reading only from those sectors which the scratch file itself has most recently written to.

In the above description the role of the natural numbers i and j is merely to permit any scratch file to communicate with any sector on disc, and to communicate safely with the user that has acquired it. The indices therefore serve as a mathematical description of a kind of crossbar switch which is used in a telephone exchange to allow any subscriber to communicate with any other subscriber. A crude picture of this can be drawn as in Fig 6.2.
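The division of labour between actual sectors and virtual scratch files may be easier to follow in a conventional setting. The sketch below is an assumption-laden illustration, not the book's construction: `BackingStore`, `SharedScratch` and `EMPTY` are hypothetical names, the lowest-numbered free sector is chosen where the CSP text leaves the choice arbitrary, and a full store raises an error where a CSP process would wait.

```python
EMPTY = object()  # assumption: sentinel for the empty signal

class BackingStore:
    """B sectors, each like COPY: one block at a time, emptied by reading."""

    def __init__(self, b):
        self.sectors = [None] * b          # None marks a free sector

    def write_any(self, block):
        """Pick some free sector i, store the block there, and return i."""
        for i, content in enumerate(self.sectors):
            if content is None:
                self.sectors[i] = block
                return i
        raise RuntimeError("backing store full")   # a CSP process would wait here

    def read(self, i):
        """Destructive read-out: return the block and free the sector."""
        block, self.sectors[i] = self.sectors[i], None
        return block

class SharedScratch:
    """Virtual scratch file: a page table of sector indices over a shared store."""

    def __init__(self, store):
        self.store = store
        self.pagetable = []                # indices of the sectors, in write order
        self.pos = None                    # read position, set by rewind

    def left(self, x):
        self.pagetable.append(self.store.write_any(x))

    def rewind(self):
        self.pos = 0

    def right(self):
        if self.pos is None:
            raise RuntimeError("rewind before reading")
        if self.pos == len(self.pagetable):
            return EMPTY
        i = self.pagetable[self.pos]
        self.pos += 1
        return self.store.read(i)

    def release(self):
        """Like SCAN: read back and discard unread blocks so sectors are reclaimed."""
        for i in self.pagetable[self.pos or 0:]:
            self.store.read(i)
        self.pagetable, self.pos = [], None

store = BackingStore(4)
f, g = SharedScratch(store), SharedScratch(store)
f.left("a1"); g.left("b1"); f.left("a2"); g.left("b2")   # sectors are interleaved
f.rewind()
first = f.right()
f.release()                                # a2 was never read, yet its sector is freed
g.rewind()
rest = [g.right(), g.right(), g.right()]
```

Each user sees one contiguous file, although the blocks of the two files alternate on the store, and releasing early does not leak sectors.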

[Figure 6.2: COPY sectors of the backing store, connected through a crossbar to the SHBSCR scratch files, which are connected through a second crossbar to the users (USER).]

If the number of sectors in the backing store is infinite, FILESYS behaves exactly like a similarly constructed array of simple scratch files

    ||_{i≥0} i : (acquire → (SCRATCH △ (release → STOP)))

With a backing store of finite size, there is a danger of deadlock if the backing store gets full at a time when all users are still writing to their scratch files. In practice, this risk is usually reduced to insignificance by delaying acquisition of new files when the backing store is nearly full.

6.5 Operating systems

The human users of a single large computer submit programs on decks of cards for execution. The data for each program immediately follows it. The task of a batch processing operating system is to share the resources of the computer efficiently among these jobs. For this we postulate that each user's program is executed by a process called JOB, which inputs the cards of the program on channel cr.right, runs the program on the data which immediately follows it in the card reader, and outputs the results of execution on the channel lp.left. We do not need to know about the internal structure of JOB; in early days it

used to be a FORTRAN monitor system. However, we need to rely on the fact that it will terminate successfully within a reasonable interval after starting. The alphabet of JOB is therefore defined

    αJOB = {cr.right, lp.left, ✓}

If LPH represents the hardware of the line printer and CRH represents the hardware of the card reader, a single job for a single user will be executed by

    JOB1 = (cr : CRH // lp : LPH // JOB)

An operating system that runs just one job and then terminates is not much use. The simplest method of sharing a single computer among many users is to run their jobs serially, one after the other

    BATCH0 = (cr : CRH // lp : LPH // *JOB)

But this design ignores some important administrative details, such as separation of files output by each job, and separation of card decks containing each job from the previous job, so that one job cannot read the cards containing its successor. To solve these problems we use the LP process defined in 6.2 X4, and a CR process defined below (X1)

    JOBS = *((cr.acquire → lp.acquire → JOB); (cr.release → lp.release → SKIP))

    BATCH1 = (cr : CR // lp : LP // JOBS)

BATCH1 is an abstract description of the simplest viable operating system, sharing a computer among many users whose jobs are executed one at a time in succession. The operating system expedites the transition between successive jobs, and protects each job from possible interference by its predecessors.

Examples

X1 (A shared card reader) A special separator card is inserted at the front of each jobfile loaded into the card reader. The card reader is acquired to read all cards of a single jobfile and is then released. If the user attempts to read beyond a separator card, further copies of the separator card are supplied, without further input from the card reader. If the user fails to read up to the separator card, the left-over cards are flushed out. Superfluous separators are ignored. Input from the hardware is achieved by h?x.

The shared card reader needs to read one card ahead, so the value of the buffered card is used as an index

    CR = h?x → if x = separator then CR else (acquire → CR_x)

    CR_x = (right!x → h?y →
               if y ≠ separator then CR_y
               else µ X • (right!separator → X | release → CR)
           | release → µ X • (h?y → if y = separator then CR else X))

After ignoring an initial subsequence of separators, this process acquires its user and copies on its right channel the sequence of nonseparator cards which it reads from the hardware. On detecting a separator card, its value is replicated as necessary until the user releases the resource. But if the user releases the reader before the separator card is reached, the remaining cards of the deck up to the next separator card must be flushed out by reading and ignoring them.

The BATCH1 operating system is logically complete. However, as the hardware of the central processor gets faster, it outstrips the capacity of readers and printers to supply input and transmit output. In order to establish a match between input, output, and processing speeds, it is necessary to use two or more readers and printers. Since only one job at a time is being processed, the extra readers should be occupied in reading ahead the card file for the following job or jobs, and the extra printers should be occupied in printing the output file for the previous job or jobs. Each input file must therefore be held temporarily in a scratch file during the period between actual input on a card reader and its consumption by JOB; and each output file must be similarly buffered during the period between production of the lines by JOB and its actual output on the line printer. This technique is known as spooling.

The overall structure of a spooled operating system is

    OPSYS1 = insys : INSPOOL // outsys : OUTSPOOL // BATCH

Here BATCH is like BATCH1, except that it uses remote subordination to acquire any one of the waiting input files, and also to acquire an output file which is destined for subsequent printing

    BATCH = *(cr :: insys // lp :: outsys // JOB)

The spoolers are defined in the next two examples.

X2 (Spooled output) A single virtual printer uses a temporary scratch file (6.4 X5) to store blocks that have been output by its using process. When the using process signals release of the virtual printer, then an actual printer (6.4 X3) is acquired to output the content of the temporary file

    VLP = (temp : SCRATCH //
           µ X • (left?x → temp.left!x → X
                 | release → temp.rewind →
                   (actual :: lp //
                    µ Y • (temp.right?y → actual.left!y → Y
                          | temp.empty → SKIP))))

The requisite unbounded array of virtual line printers is defined

    VLPS = ||_{i≥0} i : (acquire → VLP)

Since we want the actual line printers (6.4 X3) to be used only in spooling mode, we can declare them local to the spooling system using multiple labelling to share them among all elements of the array VLPS as in 6.4 X8

    OUTSPOOL = (N : (lp : LP2) // VLPS)

X3 (Spooled input) Input spooling is very similar to output spooling, except that a real card reader is acquired first, and is released at the end of the input for a single job; a user process is then acquired to execute the job, and the contents of the cards are output to it

    VCR = temp : SCRATCH //
          (actual :: cr // (µ X • actual.right?x →
               if x = separator then SKIP else temp.left!x → X));
          (temp.rewind → acquire →
            (µ Y • (temp.right?x → right!x → Y
                   | temp.empty → right!separator → Y))
            △ (release → SKIP))

    INSPOOL = (N : cr : (0 : CR || 1 : CR)) // (||_{i≥0} i : VCR)
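The effect of the output spooler of X2 can be imitated in a few lines. This is a sketch under stated assumptions rather than a rendering of VLP itself: `LinePrinter`, `acquire_any` and `VirtualPrinter` are hypothetical names, a plain list stands in for the temporary scratch file, and an error is raised where a CSP process would wait for a free printer.

```python
class LinePrinter:
    """An actual printer: records the lines that reach the paper."""

    def __init__(self):
        self.paper = []
        self.busy = False

def acquire_any(printers):
    """Acquire whichever actual printer is free (here: the first one found)."""
    for p in printers:
        if not p.busy:
            p.busy = True
            return p
    raise RuntimeError("no free printer")   # a CSP process would wait instead

class VirtualPrinter:
    """Buffers a job's lines and prints them only when the job releases it."""

    def __init__(self, printers):
        self.printers = printers
        self.temp = []                      # stands in for temp : SCRATCH

    def left(self, line):
        self.temp.append(line)              # no actual printer is held yet

    def release(self):
        actual = acquire_any(self.printers) # acquire an actual printer only now
        actual.paper.extend(self.temp)      # copy the whole file contiguously
        actual.busy = False                 # release the actual printer
        self.temp = []

lp = [LinePrinter(), LinePrinter()]
a, b = VirtualPrinter(lp), VirtualPrinter(lp)
a.left("A1"); b.left("B1"); a.left("A2"); b.left("B2")   # output is interleaved
b.release()
a.release()
printed = lp[0].paper
```

Although the two jobs produce their lines in alternation, each file reaches the paper as an unbroken run, and an actual printer is occupied only while a complete file is being copied to it.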

The input and output spoolers now supply an unbounded number of virtual card readers and virtual line printers for use of the JOB process. As a result, it is possible for two or more JOB processes to run concurrently, sharing these virtual resources. Since no communication is required between these jobs, simple interleaving is the appropriate sharing method. This technique is known as multiprogramming; or if more than one actual hardware processor is used, it is known as multiprocessing. However, the logical effects of multiprogramming and multiprocessing are the same. Indeed, the operating system defined below has the same logical specification as OPSYS1 defined above

    OPSYS = insys : INSPOOL // outsys : OUTSPOOL // BATCH4

where

    BATCH4 = (|||_{i<4} BATCH)

In mathematics, the change to multiprogramming has been remarkably simple: historically it caused much greater agony.

In the design of the VLP process within OUTSPOOL (X2) the subordinate process SCRATCH was used to store the lines produced by each JOB until they are output on an actual printer. In general the output files are too large to be held in the main store of a computer, and should be held on a backing store as illustrated in 6.4 X8. All the temporary files should share the same backing store, so we need to replace the subordinate process

    temp : SCRATCH // . . .

within VLP by a declaration of a remote subordinate process

    temp :: filesys // . . .

and then declare the filing system (6.4 X8) as a subordinate process of the output spooler

    (filesys : FILESYS // OUTSPOOL)

If the volume of card input is significant, a similar change must be made to INSPOOL. If a separate backing store is available for this purpose, the change is easy. If not, we will need to share a single backing store between the temporary files of both the input and the output spoolers. This means that FILESYS must be declared as a subordinate process, shared by multiple labelling between both spoolers; and this involves a change in the structure of the system. We will do this redesign in a top-down fashion, trying to re-use as many of the previously defined modules as possible.

The operating system is composed from a batched multiprogramming system BATCH4, and an input-output system, serving as a subordinate process

    OP = IOSYSTEM // BATCH4

The input-output system shares a filing system between an input spooler and an output spooler

    IOSYSTEM = SH : (filesys : FILESYS) // (lp : OUTSPOOL || cr : INSPOOL)

and

    SH = { lp.i | i ≥ 0 } ∪ { cr.i | i ≥ 0 }

and OUTSPOOL and INSPOOL are the same as X2 and X3, except that temp : SCRATCH is replaced by the equivalent remote subordination temp :: filesys.

In the design of the four operating systems described in this chapter (BATCH1, OPSYS1, OPSYS, and OP) we have emphasised above all else the virtue of modularity. This means that we have been able to re-use large parts of the earlier systems within the later systems. Even more important, every decision of detail is isolated within one or two modules of the system. Consequently, if a detail must be changed, it is very easy to identify which module must be altered, and the alteration can be confined to that single module. Among the easy changes are

• the number of line printers
• the number of card readers
• the number of concurrent batches

But not all changes will be so easy: a change to the value of the separator card will affect three modules, CR (X1), INSPOOL (X3) and JOB.

There are also a number of valuable improvements to the system which would require very significant changes in its structure

1. The user jobs should also have access to the filing system, and to multiple virtual input and output devices.
2. Users' files should be stored permanently between the jobs they submit.
3. A method of checkpointing for rapid recovery from breakdown may be needed.
4. If there is a backlog of jobs input but not yet executed, some method is needed to control the order in which waiting jobs are started. This point is taken up more fully in the next section.

One of the problems encountered in making these improvements is the impossibility of sharing resources between a subordinate and its main process, in those cases where the technique of multiple labelling is not appropriate. It seems that a new definition of subordination is required, in which the alphabet of the subordinate is not a subset of the alphabet of the main process. But this is a topic which is left for future research.

6.6 Scheduling

When a limited number of resources is shared among a greater number of potential users, there will always be the possibility that some aspiring users will have to wait to acquire a resource until some other process has released it. If at the time of release there are two or more processes waiting to acquire it, the choice of which waiting process will acquire the resource has been nondeterministic in all examples given so far. In itself, this is of little concern; but suppose, by the time a resource is released again, yet another process has joined the set of waiting processes. Since the choice between waiting processes is again nondeterministic, the newly joined process may be the lucky one chosen. If the resource is heavily loaded, this may happen again and again. As a result, some of the processes may happen to be delayed forever, or at least for a wholly unpredictable and unacceptable period of time. This is known as the problem of infinite overtaking (Section 2.5.5).

One solution to the problem is to ensure that all resources are lightly used. This may be achieved either by providing more resources, or by rationing their use, or by charging a heavy price for the services provided. In fact, these are the only solutions in the case of a resource which is consistently under heavy load. Unfortunately, even a resource which is on average lightly loaded will quite often be heavily used for long periods (rush hours or peaks). The problem can sometimes be mitigated by differential charging to try to smooth the demand, but this is not always successful or even possible. During the peaks, it is inevitable that, on the average, using processes will be subject to delay. It is important to ensure that these delays are reasonably consistent and predictable: you would much prefer to know that you will be served within the hour than to wonder whether you will have to wait one minute or one day.

The task of deciding how to allocate a resource among waiting users is known as scheduling. In order to schedule successfully, it is necessary to know which processes are currently waiting for allocation of the resource. For this reason, the acquisition of a resource cannot any longer be regarded as a single atomic event. It must be split into two events

    please, which requests the allocation
    thankyou, which accompanies the actual allocation of the resource.

For each process, the period between the please and the thankyou is the period during which the process has to wait for the resource. In order to identify

the requesting process, we will index each occurrence of please, thankyou and release by a different natural number. The requesting process acquires its number on each occasion by the same construction as remote subordination (6.4 X3)

    □_{i≥0} (res.i.please; res.i.thankyou; . . . ; res.i.release → SKIP)

A simple but effective method of scheduling a resource is to allocate it to the process which has been waiting longest for it. This policy is known as first come first served (FCFS) or first in first out (FIFO). It is the queueing discipline observed by passengers who form themselves into a line at a bus stop.

In such a place as a bakery, where customers are unable or unwilling to form a line, there is an alternative mechanism to achieve the same effect. A machine is installed which issues tickets with strictly ascending serial numbers. On entry to the bakery, a customer takes a ticket. When the server is ready, he calls out the lowest ticket number of a customer who has taken a ticket but has not yet been served. This is known as the bakery algorithm, and is described more formally below. We assume that up to R customers can be served simultaneously.

Example

X1 (The bakery algorithm) We need to keep the following counts

    p  customers who have said please
    t  customers who have said thankyou
    r  customers who have released their resources

Clearly, at all times r ≤ t ≤ p. Also, at all times, p is the number that will be given to the next customer who enters the bakery, and t is the number of the next customer to be served; furthermore, p − t is the number of waiting customers, and R + r − t is the number of waiting servers. All counts are initially zero, and can revert to zero again whenever they are all equal (say, at night after the last customer has left).

One of the main tasks of the algorithm is to ensure that there is never simultaneously a free resource and a waiting customer; whenever such a situation arises, the very next event must be the thankyou of a customer obtaining the resource

    BAKERY = B_{0,0,0}

    B_{p,t,r} = if 0 < r = t = p then BAKERY
                else if R + r − t > 0 ∧ p − t > 0 then t.thankyou → B_{p,t+1,r}
                else (p.please → B_{p+1,t,r}
                     | (□_{i<t} i.release → B_{p,t,r+1}))

The bakery algorithm is due to Leslie Lamport.
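The three counts and the three-way case analysis can be tried out as a small state machine. This is a sketch under stated assumptions rather than part of the original text: the names State, enabled_events, step and run are invented here, events are modelled as (kind, number) pairs, and the reset to BAKERY is applied as soon as a state with 0 < r = t = p is reached. Like the formula, the sketch offers i.release for every i < t and relies on customers to release only once.

```python
from collections import namedtuple

State = namedtuple("State", "p t r")   # please, thankyou, release counts


def enabled_events(s, R):
    """Events that B_{p,t,r} is prepared to engage in, for R servers."""
    if R + s.r - s.t > 0 and s.p - s.t > 0:
        # a free server and a waiting customer: thankyou must come next
        return [("thankyou", s.t)]
    events = [("please", s.p)]                       # next ticket number
    events += [("release", i) for i in range(s.t)]   # any served customer
    return events


def step(s, event):
    """State reached after one enabled event."""
    kind, _ = event
    if kind == "please":
        nxt = State(s.p + 1, s.t, s.r)
    elif kind == "thankyou":
        nxt = State(s.p, s.t + 1, s.r)
    else:
        nxt = State(s.p, s.t, s.r + 1)
    if 0 < nxt.r == nxt.t == nxt.p:
        return State(0, 0, 0)          # all equal: behave like BAKERY again
    return nxt


def run(trace, R):
    """Replay a trace, checking that each event is enabled when it occurs."""
    s = State(0, 0, 0)
    for event in trace:
        assert event in enabled_events(s, R), (s, event)
        s = step(s, event)
    return s


# One server, two customers; customer 1 must wait for customer 0 to release.
final = run([("please", 0), ("thankyou", 0), ("please", 1),
             ("release", 0), ("thankyou", 1), ("release", 1)], R=1)
```

With R = 1, the state after the first please offers only thankyou 0, which is the property that a free resource and a waiting customer never coexist for longer than one event.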

7 Discussion

7.1 Introduction

The main objective of my research into communicating processes has been to find the simplest possible mathematical theory with the following desirable properties

1. It should describe a wide range of interesting computer applications, from vending machines, through process control and discrete event simulation, to shared-resource operating systems.
2. It should be capable of efficient implementation on a variety of conventional and novel computer architectures, from time-sharing computers through microprocessors to networks of communicating microprocessors.
3. It should provide clear assistance to the programmer in his tasks of specification, design, implementation, verification and validation of complex computer systems.

It is not possible to claim that all these objectives have been achieved in an optimal fashion. There is always hope that a radically different approach, or some significant change to the detailed definitions, would lead to greater success in one or more of the objectives listed above. This chapter initiates a discussion of some of the alternatives which I and others have explored, and an explanation of why I have not adopted them. It also gives me an opportunity to acknowledge the influence of the original research of other workers in the field. Finally, I hope to encourage further research both into the foundations of the subject and into its wider practical application.

7.2 Shared storage

The earliest proposals in the 1960s for the programming of concurrent operations within a single computer arose naturally from contemporaneous developments in computer architecture and operating systems. At that time processing power was scarce and expensive, and it was considered very wasteful

that a processor should have to wait while communicating with slow peripheral equipment, or even slower human beings. Consequently, cheaper special-purpose processors (channels) were provided to engage independently in input–output, thereby freeing the central processor to engage in other tasks. To keep the valuable central processor busy, a timesharing operating system would ensure that there were several complete programs in the main store of the computer, and at any time several of the programs could be using the input–output processors, while another program used the central processor. At the termination of an input–output operation, an interrupt would enable the operating system to reconsider which program should be receiving the attention of the central processor.

The scheme described above suggests that the central processor and all the channels should be connected to all the main storage of the computer; and accesses from each processor to each word of store were interleaved with those from the other processors. Nevertheless, each program under execution was usually a complete job submitted by a different user, and wholly independent of all the other jobs. For this reason, great care was expended in the design of hardware and software to divide the store into disjoint segments, one for each program, and to ensure that no program could interfere with the store of another. When it became possible to attach several independent central processors to the same computer, the effect was to increase the throughput of jobs; and if the original operating system were well structured this could be achieved with little effect on the code of the operating system, and even less on the programs for the jobs which it executed.

The disadvantages of sharing a computer among several distinct jobs were

1. The amount of storage required goes up linearly with the number of jobs executed.
2. The amount of time that each user has to wait for the result of his jobs is also increased, except for the highest priority jobs.

It therefore seems tempting to allow a single job to take advantage of the parallelism provided by the hardware of the computer, by initiating several concurrent processes within the same area of storage allocated to a single program.

7.2.1 Multithreading

The first proposal of this kind was based on a jump (go to command). If L is a label of some place in the program, the command fork L transfers control to the label L, and also allows control to pass to the next command in sequence. From then on, the effect is that two processors execute the same program at the same time; each maintains its own locus of control threading its way through the commands. Since each locus of control can fork again, this programming technique is known as multithreading.

Having provided a method for a process to split in two, some method is also required for two processes to merge into one. A very simple proposal is to provide a command join which can be executed only when two processes execute it simultaneously. The first process to reach the command must wait until another one also reaches it. After that, only a single processor goes on to execute the following commands.

In its full generality, multithreading is an incredibly complex and error-prone technique, not to be recommended in any but the smallest programs. In excuse, we may plead that it was invented before the days of structured programming, when even FORTRAN was still considered to be a high-level programming language!

A variation of the fork command is still used in the UNIX™ operating system. The fork does not mention a label. Its effect is to take a completely fresh copy of the whole of storage allocated to the program, and to allocate the copy to a new process. Both the original and the new process resume execution at the command following the fork. A facility is provided for each process to discover whether it is the parent or the offspring. The allocation of disjoint storage areas to the processes removes the main difficulties and dangers of multithreading, but it can be inefficient both in time and in space. This means that concurrency can be afforded only at the outermost (most global) level of a job, and its use on a small scale is discouraged.

7.2.2 cobegin . . . coend

A solution to the problems of multithreading was proposed by E. W. Dijkstra: make sure that after a fork the two processors execute completely different blocks of program, with no possibility of jumping between them. If P and Q are such blocks, the compound command

    cobegin P; Q coend

causes P and Q to start simultaneously, and proceed concurrently until they have both ended. Then only one process goes ahead to execute the following command. This structured command can be implemented by the unstructured fork and join commands, using labels L and J

    fork L; P; go to J; L: Q; J: join

The generalisation to more than two component processes is immediate and obvious

    cobegin P; Q; . . . ; R coend

One great advantage of this structured notation is that it is much easier to understand what is likely to happen, especially if the variables used in each of the blocks are distinct from the variables used in the others (a restriction that can be checked or enforced by a compiler for a high-level language). In

this case, the processes are said to be disjoint, and (in the absence of communication) the concurrent execution of P and Q has exactly the same effect as their sequential execution in either order

    begin P; Q end = begin Q; P end = cobegin P; Q coend

Furthermore, the proof methods for establishing correctness of the parallel composition can be even simpler than the sequential case. That is why Dijkstra's proposal forms the basis for the parallel construct in this book. The main change is notational; to avoid confusion with sequential composition, I have introduced the || operator to separate the processes; and this permits the use of simple brackets to surround the command instead of the more cumbersome cobegin . . . coend.

7.2.3 Conditional critical regions

The restriction that concurrent processes should not share variables has the consequence that they cannot communicate or interact with each other in any way, a restriction which seriously reduces the potential value of concurrency.

After reading this book, the introduction of (simulated) input and output channels may seem the obvious solution; but in earlier days, an obvious technique (suggested by the hardware of the computers) was to communicate by sharing main storage among concurrent processes. Dijkstra showed how this could be safely achieved by critical regions (Section 6.3) protected by mutual exclusion semaphores. I later proposed that this method should be formalised in the notations of a high-level programming language. A group of variables which is to be updated in critical regions within several sharing processes should be declared as a shared resource, for example

    shared n : integer
    shared position : record x, y : real end

Each critical region which updates this variable is preceded by a with clause, quoting the variable name

    with n do n := n + 1;
    with position do begin x := x + δx; y := y + δy end

The advantage of this notation is that a compiler automatically introduces the necessary semaphores, and surrounds each critical region by the necessary P and V operations. Furthermore, it can check at compile time that no shared variable is ever accessed or updated except from within a critical region protected by the relevant semaphore.

Cooperation between processes which share store sometimes requires another form of synchronisation. For example, suppose one process updates a variable with the objective that other processes should read the new value.

The other processes must not read the variable until the updating has taken place. Similarly, the first process must not update the variable again until all the other processes have read the earlier updating.

To solve this problem, a convenient facility is offered by the conditional critical region. This takes the form

    with sharedvar when condition do critical region

On entry to the critical region, the value of the condition is tested. If it is true, the critical region is executed normally. But if the condition is false, this entry into the critical region is postponed, so that other processes are permitted to enter their critical regions and update the shared variable. On completion of each such update, the condition is retested. If it has become true, the delayed process is permitted to proceed with its critical region; otherwise that process is suspended again. If more than one delayed process can proceed, the choice between them is arbitrary.

To solve the problem of updating and reading a message by many processes, declare as part of the resource an integer variable to count the number of processes that must read the message before it is updated again

    shared message : record count : integer; content : . . . end
    message.count := 0;

The updating process contains a critical region

    with message when count = 0 do
        begin content := . . . ; . . . ; count := number of readers end

Each reading process contains a critical region

    with message when count > 0 do
        begin my copy := content; count := count − 1 end

Conditional critical regions may be implemented by means of semaphores. Compared with direct use of synchronisation semaphores by the programmer, the overhead of conditional critical regions may be quite high, since the conditions of all processes waiting to enter the region must be retested on every exit from the region. Fortunately, the conditions do not have to be retested more frequently than that, because restrictions on access to shared variables ensure that the condition tested by a waiting process can change value only when the shared variable itself changes value. All other variables in the condition must

be private to the waiting process, which obviously cannot change them while it is waiting.

7.2.4 Monitors

The development of monitors was inspired by the class of SIMULA 67, which was itself a generalisation of the procedure concept of ALGOL 60. The basic insight is that all meaningful operations on data (including its initialisation and perhaps also its finalisation) should be collected together with the declaration of the structure and type of the data itself; and these operations should be invoked by procedure calls whenever required by the processes which share the data. The important characteristic of a monitor is that only one of its procedure bodies can be active at one time; even when two processes call a procedure simultaneously (either the same procedure or two different ones), one of the calls is delayed until the other is completed. Thus the procedure bodies act like critical regions protected by the same semaphore.

For example a very simple monitor acts like a count variable. In notations based on PASCAL PLUS it takes the form

    1  monitor count;
    2  var n : integer;
    3  procedure∗ up; begin n := n + 1 end;
    4  procedure∗ down; when n > 0 do begin n := n − 1 end;
    5  function∗ grounded : Boolean; begin grounded := (n = 0) end;
    6  begin n := 0;
    7  · · ·;
    8  if n ≠ 0 then print(n)
    9  end

    Line 1      declares the monitor and gives it the name count.
    2           declares the shared variable n local to the monitor; it is inaccessible except within the monitor itself.
    3, 4, 5     declare three procedures with their bodies; the asterisks ensure they can be called from the program which uses the monitor.
    6           the monitor starts execution here.
    7           the three fat dots are an inner statement, which stands for the block that is going to use this monitor.
    8           the final value of n (if non-zero) is printed on exit from the user block.
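For comparison, the count monitor can be approximated in a present-day threaded language. The sketch below is an assumption-laden illustration, not PASCAL PLUS semantics: it uses Python's threading.Condition so that one lock protects every procedure body, the when n > 0 guard becomes a retest loop, and the inner statement with its final print is left out. The class name Count and the instance name rocket follow the text; everything else is invented.

```python
import threading


class Count:
    """Sketch of the count monitor: one lock guards all procedure bodies."""

    def __init__(self):
        self._n = 0                          # local to the monitor
        self._cond = threading.Condition()   # the single shared lock

    def up(self):
        with self._cond:
            self._n += 1
            self._cond.notify_all()          # delayed callers retest n > 0

    def down(self):
        with self._cond:
            while not self._n > 0:           # "when n > 0 do"
                self._cond.wait()            # releases the lock while waiting
            self._n -= 1

    def grounded(self):
        with self._cond:
            return self._n == 0


rocket = Count()
waiter = threading.Thread(target=rocket.down)   # may be delayed until up
waiter.start()
rocket.up()
waiter.join(timeout=5)
```

Whichever of up and down happens to run first, the call of down completes only after n has become positive, so n never goes negative and the monitor ends grounded.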

A new instance of this monitor can be declared local to a block P

    instance rocket : count; P

Within the block P, the starred procedures may be called by commands

    rocket.up; . . . rocket.down; . . . ; if rocket.grounded then . . .

However an unstarred procedure or variable such as n cannot be accessed from within P and observance of this restriction is enforced by a compiler. The mutual exclusion inherent in the monitor ensures that the procedures of the monitor can be safely called by any number of processes within P, and there is no danger of interference in updating n. Note that an attempt to call rocket.down when n = 0 will be delayed until some other process within P calls rocket.up. This ensures that the value of n can never go negative.

The effect of the declaration of an instance of a monitor is explained by a variation of the copy rule for procedure calls in ALGOL 60. Firstly, a copy is taken of the text of the monitor; the using block P is copied in place of the three dots inside the monitor, and all local names of the monitor are prefixed by the name of the instance, as shown below

    rocket.n : integer;
    procedure rocket.up; begin rocket.n := rocket.n + 1 end;
    procedure rocket.down;
        when rocket.n > 0 do begin rocket.n := rocket.n − 1 end;
    function rocket.grounded : Boolean;
        begin rocket.grounded := (rocket.n = 0) end;
    begin rocket.n := 0;
        · · ·;
        if rocket.n ≠ 0 then print(rocket.n)
    end

Note how the copy rule has made it impossible for the user process to forget to initialise the value of n, or to forget to print its final value when necessary.

The inefficiency of repeated testing of entry conditions has led to the design of monitors with a more elaborate scheme of explicit waiting and explicit signalling for resumption of waiting processes. These schemes even allow a procedure call to suspend itself in the middle of its execution, after automatically releasing exclusion of the suspended process. In this way, a number of ingenious scheduling techniques can be efficiently implemented; but I now think that the extra complexity is hardly worthwhile.

7.2.5 Nested monitors

A monitor instance can be used like a semaphore to protect a single resource such as a line printer which must not be used by more than one process at a time. Such a monitor could be declared

    monitor singleresource;
    var free : Boolean;
    procedure∗ acquire; when free do free := false;
    procedure∗ release; begin free := true end;
    begin free := true; · · ·; end

However, the protection afforded by this monitor can be evaded by a process which uses the resource without acquiring it, or frustrated by one which forgets to release it afterwards. Both of these dangers can be averted by a construction similar to that of the virtual resource (6.4 X4). This takes the form of a monitor declared locally within the actual resource monitor shown above. The name of the virtual resource is starred to make it accessible for declaration by user processes. However, the stars are removed from ∗acquire and ∗release, so that these can be used only within the virtual resource monitor, and cannot be misused by other processes.

    monitor singleresource;
    free : Boolean;
    procedure acquire; when free do free := false;
    procedure release; begin free := true end;
    monitor∗ virtual;
        procedure∗ use(l : line); begin . . . end;
        begin acquire; · · ·; release end
    begin free := true; · · ·; end

An instance of this monitor is declared

    instance lpsystem : singleresource; P

A block within P which requires to output a file to a line printer is written

    instance mine : lpsystem.virtual;
    begin . . . mine.use(l1); . . . mine.use(l2); . . . end

The necessary acquisition and release of the line printer are automatically inserted by the virtual monitor before and after this user block, in a manner which prevents antisocial use of the line printer. In principle, it would be possible for the using block to split into parallel processes, all of which use the instance mine of the virtual monitor, but this is probably not the intention here. A monitor which is to be used only by a single process is known in PASCAL PLUS as an envelope; it can be implemented more efficiently without exclusion or synchronisation, and the compiler checks that it is not inadvertently shared.

The meaning of these instance declarations can be calculated by repeated application of the copy rule, with the result shown below

    var lpsystem.free : Boolean;
    procedure lpsystem.acquire;
        when lpsystem.free do lpsystem.free := false;
    procedure lpsystem.release;
        begin lpsystem.free := true end;
    begin lpsystem.free := true;
        . . .
        begin procedure mine.use(l : line); begin . . . end;
            lpsystem.acquire;
            . . . mine.use(l1); . . . mine.use(l2); . . .
            lpsystem.release;
        end
        . . .
    end

The explicit copying shown here is only for initial explanation; a more experienced programmer would never wish to see the expanded version, or even think about it.
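The bracketing of a user block by an automatic acquire and release has a close counterpart in a context manager. The following sketch is offered only as an analogy: the names SingleResource, virtual, printed and job are assumptions of the example, and Python cannot hide acquire and release as a compiler for PASCAL PLUS would, so the leading underscores express no more than a convention.

```python
import threading
from contextlib import contextmanager


class SingleResource:
    """Sketch of singleresource with a nested 'virtual' user bracket."""

    def __init__(self):
        self._free = threading.Lock()   # unlocked corresponds to free = true
        self.printed = []               # stands in for the line printer

    def _use(self, line):
        self.printed.append(line)

    @contextmanager
    def virtual(self):
        self._free.acquire()            # inserted before the user block
        try:
            yield self._use             # the only operation handed to the user
        finally:
            self._free.release()        # inserted after, even on error


lpsystem = SingleResource()


def job(name):
    with lpsystem.virtual() as use:     # like: instance mine : lpsystem.virtual
        use(name + " l1")
        use(name + " l2")


workers = [threading.Thread(target=job, args=(f"job{k}",)) for k in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

A user block cannot forget to release the printer, because the release is tied to leaving the with block rather than to an explicit call.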

These notations were used in 1975 for the description of an operating system similar to that of Section 6.5; and they were later implemented in PASCAL PLUS. Extremely ingenious effects can be obtained by a mixture of starring and nesting; but the PASCAL- and SIMULA-based notations seem rather clumsy, and explanations in terms of substitution and renaming are rather difficult to follow. It was Edsger W. Dijkstra's criticisms of these aspects that first impelled me towards the design of communicating sequential processes.

However, it is now clear from the constructions of Section 6.5 that the control of sharing requires complication, whether it is expressed within the conceptual framework of communicating processes or within the copy-rule and procedure-call semantics of PASCAL PLUS. The choice between the languages seems partly a matter of taste, or perhaps efficiency. For implementation of an operating system on a computer with shared main storage, PASCAL PLUS probably has the advantage.

7.2.6 Ada™

Facilities offered for concurrent programming in Ada are an amalgam of the remote procedure call of PASCAL PLUS, with the less structured form of communication by input and output. Processes are called tasks, and they communicate by call statements (which are like procedure calls with output and input parameters), and accept statements (which are like procedure declarations in their syntactic form and in their effect). A typical accept statement might be

    accept put(V : in integer; PREV : out integer) do PREV := K; K := V end

A corresponding call might be

    put(37, X)

The identifier put is known as an entry name.

An accept and a call statement with the same name in different tasks are executed when both processes are ready to execute them together. The effect is as follows

1. Input parameters are copied from the call to the accepted process.
2. The body of the accept statement is executed.
3. The values of the output parameters are copied back to the call.
4. Then both tasks continue execution at their next statements.

The execution of the body of an accept is known as a rendezvous, since the calling and accepting task may be thought to be executing it together. The rendezvous is an attractive feature of Ada, since it simplifies the very common practice of alternating output and input, without much complicating the case when only input or only output is required.
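The four steps of a rendezvous can be acted out with two queues. This is a rough sketch and not Ada: the class Entry with its methods call and accept is invented for the illustration, output parameters are modelled as the return value of the body, and the initial value K = 0 is an assumption, since the text does not give one.

```python
import queue
import threading


class Entry:
    """Hypothetical model of one Ada entry such as put."""

    def __init__(self):
        self._calls = queue.Queue()

    def call(self, *inputs):
        reply = queue.Queue(maxsize=1)
        self._calls.put((inputs, reply))     # step 1: inputs go to the acceptor
        return reply.get()                   # caller waits for step 3

    def accept(self, body):
        inputs, reply = self._calls.get()    # wait until some task calls
        outputs = body(*inputs)              # step 2: execute the body
        reply.put(outputs)                   # step 3: copy outputs back
        # step 4: both tasks now continue independently


def server(put, rounds):
    k = 0                                    # assumed initial value of K

    def body(v):                             # PREV := K; K := V
        nonlocal k
        prev, k = k, v
        return prev

    for _ in range(rounds):
        put.accept(body)


put = Entry()
task = threading.Thread(target=server, args=(put, 2))
task.start()
x1 = put.call(37)    # like put(37, X)
x2 = put.call(5)
task.join()
```

Each call returns the value stored by the previous one, showing output and input exchanged in a single synchronised step.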

The Ada analogue of □ is the select statement which takes the form

    select
        accept get(v : out integer) do v := B[i] end; i := i + 1; . . .
    or  accept put(v : in integer) do B[j] := v end; j := j + 1; . . .
    or  . . .
    end select

Exactly one of the alternatives separated by or will be selected for execution, depending on the choice made by the calling task(s). The remaining statements of the selected alternative after the end of the accept are executed on completion of the rendezvous, concurrently with continuation of the calling task. Selection of an alternative can be inhibited by falsity of a preceding when condition, for example

    when not full ⇒ accept . . .

This achieves the effect of a conditional critical region.

One of the alternatives in a select statement may begin with a delay instead of an accept. This alternative may be selected if no other alternative is selected during elapse of a period greater than the specified number of seconds' delay. The purpose of this is to guard against the danger that hardware or software error might cause the select statement to wait forever. Since our mathematical model deliberately abstracts from time, a delay cannot be faithfully represented, except by allowing wholly nondeterministic selection of the alternative beginning with the delay.

One of the alternatives in a select statement may be the statement terminate. This alternative is selected when all tasks which might call the given task have terminated; and then the given task terminates too. This is not as convenient as the inner statement of PASCAL PLUS, which allows the monitor to tidy up on termination.

A select statement may have an else clause, which is selected if none of the other alternatives can be selected immediately, either because all the when conditions are false, or because there is no corresponding call already waiting in some other task. This would seem to be equivalent to an alternative with zero delay.

A call statement may also be protected against arbitrary delay by a delay statement or an else clause. This may lead to some inefficiencies in implementation on a distributed network of processors.

Tasks in Ada are declared in much the same way as subordinate processes (Section 4.5); but like monitors in PASCAL PLUS, each one may serve any number of calling processes. Furthermore, the programmer must arrange for the task to terminate properly. The definition of a task is split into two parts, its specification and its body. The specification gives the task name and the names and parameter types of all entries through which the task may be called.

This is the information required by the writer of the program which uses the task, and by the compiler of that program. The body of the task defines its behaviour, and may be compiled separately from the using program.

Each task in Ada may be given a fixed priority. If several tasks are capable of making progress, but only a lesser number of processors are available, the tasks with lower priority will be neglected. The priority of execution of a rendezvous is the higher of the priorities of the calling and of the accepting tasks. The indication of priority is called a pragma; it is intended to improve critical response times compared with non-critical ones, and it is not intended to affect the logical behaviour of the program. This is an excellent idea, since it separates concern for abstract logical correctness from problems of real time response, which can often be more easily solved by experiment or by judicious selection of hardware.

Ada offers a number of additional facilities. It is possible to test how many calls are waiting on an entry. One task may abruptly terminate (abort) another task, and all tasks dependent upon it. Tasks may access and update shared variables. The effect of this is made even more unpredictable by the fact that compilers are allowed to delay, reorder, or amalgamate such updating just as if the variable were not shared. There are some additional complexities and interaction effects with other features which I have not mentioned.

Apart from the complexities listed in the preceding paragraph, tasking in Ada seems to be quite well designed for implementation and use on a multiprocessor with shared storage.

7.3 Communication

The exploration of the possibility of structuring a program as a network of communicating processes was also motivated by spectacular progress in the development of computer hardware. The advent of the microprocessor rapidly reduced the cost of processing power by several orders of magnitude. However, the power of each individual microprocessor was still rather less than that of a typical computer of the traditional and still expensive kind. So it would seem to be most economical to obtain greater power by use of several microprocessors cooperating on a single task. These microprocessors would be cheaply connected by wires, along which they could communicate with each other. Each microprocessor would have its own local main store, which it could access at high speed, thus avoiding the expensive bottleneck that tends to result from allowing only one processor at a time to access a shared store.

7.3.1 Pipes

The simplest pattern of communication between processing elements is just single-directional message passing between each process and its neighbour in a linear pipe, as described in Section 4.4. The idea was first propounded by Conway who illustrated it by examples similar to 4.4 X2 and X3, except that all

components of a pipe were expected to terminate successfully instead of running forever. He proposed that the pipe structure should be used for writing a multiple-pass compiler for a programming language. On a computer with adequate main storage, all the passes are simultaneously active, and control transfers between the passes together with the messages, thus simulating parallel execution. On a computer with less main storage, only one pass at a time is active, and sends its output to a file in backing store. On completion of each pass, the next pass starts, taking its input from the file produced by its predecessor. However, the final result of the compiler is exactly the same, in spite of the radical differences in the manner of execution. It is characteristic of a successful abstraction in programming that it permits several different implementation methods which are efficient in differing circumstances. In this case, Conway's suggestion could have been very valuable for software implementation on a computer range offering widely varying options of store size.

The pipe is also the standard method of communication in the UNIX™ operating system, where the notation '|' is used instead of '>>'.

7.3.2 Multiple buffered channels

The pipe construction allows a linear chain of processes to communicate in a single direction only, and it does not matter whether the message sequence is buffered or not. The natural generalisation of the pipe is to permit any process to communicate with any other process in either direction; and at first sight it seems equally natural to provide buffering on all the communication channels. In the design of the RC4000 operating system, a facility for buffered communication was implemented in the kernel, and was used for communication between modules providing services at a higher level. On a grander scale, a store-and-forward packet switching network, like ARPAnet in the United States, inevitably interposes buffering between the source and destination of messages.

When the pattern of communication between processes is generalised from a linear chain to a network that may be cyclic, the presence or absence of buffering can make a vital difference to the logical behaviour of the system. The presence of buffering is not always favourable: for example, it is possible to write a program that can deadlock if the length of the buffer is allowed to grow greater than five, as well as a different program that will deadlock unless the buffer length is allowed to exceed five. To avoid such irregularities, the length of all buffers should be unbounded. Unfortunately, this leads to grave problems of implementation efficiency when main storage is filled with buffered messages. The mathematical treatment is also complicated by the fact that every network is an infinite-state machine, even when the component processes are finite. Finally, for rapid and controllable interaction between humans and computers, buffers only stand in the way, since they can interpose delay between a stimulus and a response. If something goes wrong in processing a buffered stimulus, it is much more difficult to trace the fault and recover from it. Buffering is a batch-processing technique, and should be

avoided wherever fast interactions are more important than heavy processor utilisation.

7.3.3 Functional multiprocessing

A deterministic process may be defined in terms of a mathematical function from its input channels to its output channels. Each channel is identified with the indefinitely extensible sequence of messages which pass along it. Such functions are defined in the usual way by recursion on the structure of the input sequences, except that the case of an empty input sequence is not considered. For example, a process which outputs the result of multiplying each number input by n is defined

    prodₙ(left) = ⟨n × left₀⟩ ⌢ prodₙ(left′)

A function which takes two sorted streams (without duplicates) as parameters and outputs their sorted merge (suppressing duplicates) is defined

    merge2(left1, left2) =
        if left1₀ < left2₀ then ⟨left1₀⟩ ⌢ merge2(left1′, left2)
        else if left2₀ < left1₀ then ⟨left2₀⟩ ⌢ merge2(left1, left2′)
        else ⟨left2₀⟩ ⌢ merge2(left1′, left2′)

An acyclic network can be represented by a composition of such functions. For example, a function which merges three sorted input streams can be defined

    merge3(left1, left2, left3) = merge2(left1, merge2(left2, left3))

Figure 7.1 shows a connection diagram of this function.

[Figure 7.1: connection diagram of merge3, built from two merge2 processes with inputs left1, left2 and left3]
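The stream functions above translate almost directly into lazy generators. The following is a minimal sketch in Python rather than in the notation of this book: the names prod, merge2 and merge3 follow the text, while the use of generators, and of itertools.count as a source of test input, are assumptions made for illustration. As in the text, the case of an empty input sequence is not considered, so every input is assumed to be endless.

```python
from itertools import count, islice

def prod(n, left):
    """Output n times each number input on left."""
    for x in left:
        yield n * x

def merge2(left1, left2):
    """Merge two endless ascending streams, suppressing duplicates."""
    left1, left2 = iter(left1), iter(left2)
    x, y = next(left1), next(left2)      # the heads of the two inputs
    while True:
        if x < y:
            yield x
            x = next(left1)
        elif y < x:
            yield y
            y = next(left2)
        else:                            # equal heads: output one, advance both
            yield x
            x, y = next(left1), next(left2)

def merge3(left1, left2, left3):
    """Acyclic network of two merge2 processes, as in Figure 7.1."""
    return merge2(left1, merge2(left2, left3))

# Illustrative inputs: the multiples of 2, 3 and 5.
evens = prod(2, count(1))
threes = prod(3, count(1))
fives = prod(5, count(1))
first_ten = list(islice(merge3(evens, threes, fives), 10))
```

Each generator consumes its inputs only on demand, so the composition behaves like a network of processes joined by channels, although nothing here runs in parallel.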

A cyclic network can be constructed by a set of mutually recursive equations. For example, consider the problem attributed by Dijkstra to Hamming, namely to define a function which outputs in ascending sequence all numbers which have only 2, 3, and 5 as non-trivial factors. The first such number is 1; and if x is any such number, so are 2 × x, 3 × x, and 5 × x. We therefore use three processes prod2, prod3, and prod5 to generate these products, and feed them back into the process merge3, which ensures they are eventually output in ascending order (Figure 7.2).

[Figure 7.2: cyclic network in which the output of merge3, preceded by 1, is fed back through prod2, prod3 and prod5]

The function which outputs the desired result has no inputs; it is simply defined

    Hamming = ⟨1⟩ ⌢ merge3(prod2(Hamming), prod3(Hamming), prod5(Hamming))

The functional approach to multiprocessor networks is very different from that taken in this book in the following respects

1. A general implementation requires unbounded buffering on all channels.
2. Each value output into the buffer must be retained in the buffer until all the inputting processes have taken it, which they may do at different times.
3. There is no possibility for a process to wait for one of two inputs, whichever one arrives first.
4. The processes are all deterministic.

Recent research has been directed towards reducing the inefficiency of 1 and 2, and towards relaxing the restrictions 3 and 4.
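The cyclic Hamming network, and the first two respects listed above, can be made concrete in a short sketch. It is written in Python and is only one possible rendering: the list called history is a hypothetical stand-in for the buffer on the feedback channel, and the standard library function heapq.merge is used in place of merge3, with duplicates suppressed by a separate comparison.

```python
import heapq
from itertools import islice

def hamming():
    """Yield, in ascending order, the numbers whose only prime factors are 2, 3 and 5."""
    history = [1]                 # every value output so far is retained here

    def prod(n):
        # Each reader keeps its own position i in the shared buffer,
        # so the three readers take each value at different times.
        i = 0
        while True:
            yield n * history[i]
            i += 1

    yield 1
    for value in heapq.merge(prod(2), prod(3), prod(5)):
        if value != history[-1]:  # suppress duplicates such as 2 * 3 and 3 * 2
            history.append(value)
            yield value

first_twelve = list(islice(hamming(), 12))
```

The buffer grows without bound, because no value can be discarded until the slowest reader (here prod(5)) has taken it. A practical implementation would need some way to reclaim storage, which is the inefficiency noted in respects 1 and 2.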

7.3.4 Unbuffered communication

For many years now, I have chosen to take unbuffered (synchronised) communication as basic. My reasons have been

1. It matches closely a physical realisation on wires which connect processing agents. Such wires cannot store messages.
2. It matches closely the effect of calling and returning from subroutines within a single processor, copying the values of the parameters and the results.
3. When buffering is wanted, it can be implemented simply as a process; and the degree of buffering can be precisely controlled by the programmer.
4. Other disadvantages of buffers have been mentioned at the end of Section 7.3.2.

Of course, none of these arguments carry absolute conviction. For example, if buffered communication were taken as primitive, this would make no logical difference in the common case of alternation of subroutine call and return; and synchronisation can be achieved in all other cases by following every output by input of an acknowledgement, and every input by output of an acknowledgement.

7.3.5 Communicating sequential processes

This was the title of my first complete exposition of a programming language based on concurrency and communication. That early proposal differs from this book in two significant respects.

(1) Parallel composition

Channels are not named. Instead, the component processes of a parallel construction have unique names, prefixed to them by a pair of colons

    [a :: P || b :: Q || ... || c :: R]

Within process P, the command b!v outputs the value v to the process named b. This value is input by a command a?x occurring within the process Q. The process names are local to the parallel command in which they are introduced, and communications between the component processes are hidden.

The advantage of this scheme is that there is no need to introduce into the language any concept of channel or channel declaration. Furthermore, it is logically impossible to violate the restriction that a channel is between two processes only, and one of them uses it for input and the other for output. But there are some disadvantages, both in practice and in theory.

1. A serious practical disadvantage is that a subordinate process needs to know the name of its using process; this complicates construction of libraries of subordinate processes.
2. A disadvantage in the underlying mathematics is that parallel composition is an operation with a variable number of parameters and cannot be reduced to a binary associative operator like ||.

(2) Automatic termination

In the early version, all processes of a parallel command were expected to terminate. The reason for this was the hope that the correctness of a process could be specified in the same way as for a conventional program by a postcondition, i.e., a predicate intended to be true on successful termination. (That hope was never fulfilled, and other proof methods (Section 1.10) now seem more satisfactory.) The obligation that a subordinate process should terminate imposes an awkward obligation on the using process to signal its termination to all subordinates. An ad hoc convention was therefore introduced. A loop of the form

    ∗[a?x → P □ b?x → Q ...]

terminates automatically on termination of all of the processes a, b, ... from which input is requested. This enables the subordinate process to complete any necessary finalisation code before termination, a feature which had proved useful in SIMULA and PASCAL PLUS.

The trouble with this convention is that it is complicated to define and implement; and methods of proving program correctness seem no simpler with it than without. Now it seems to me better (as in Section 4.5) to relax the restriction that simple subordinate processes must terminate; and take other measures (Section 6.4) in the more complicated cases.

7.3.6 Occam

In contrast to Ada, occam is a very simple programming language, and very closely follows the principles expounded in this book. The most immediately striking differences are notational; occam syntax is designed to be composed at a screen with the aid of a syntax checking editor; it uses prefix operators instead of infix, and it uses indentation instead of brackets.

    SEQ
      P
      Q
      R                for (P; Q; R)

    PAR
      P
      Q
      R                for (P || Q || R)

    ALT
      c?x
        P
      d?y
        Q              for (c?x → P □ d?y → Q)

    IF
      B
        P
      NOT B
        Q              for (P <| B |> Q)

    WHILE B
      P                for (B ∗ P)

The ALT construct corresponds to the Ada select statement, and offers a similar range of options. Selection of an alternative may be inhibited by falsity of a Boolean condition B

    B & c?x
      P

The input may be replaced by a SKIP, in which case the alternative may be selected whenever the Boolean guard is true; or it may be replaced by a wait, which allows the alternative to be selected after elapse of a specified interval.

The occam language does not have any distinct notations for pipes (Section 4.4), subordinate processes (Section 4.5), or shared processes (Section 6.4). All the required patterns of communication must be achieved by explicit identity of channel names. To help in this, procedures may be declared with channels as parameters. For example, the simple copying process may be declared

    PROC copy(CHAN left, right) =
      WHILE TRUE
        VAR x :
        SEQ
          left?x
          right!x :
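For readers without access to occam, the copying process can be imitated with threads. The following is a rough sketch in Python, not occam semantics: the Channel class is a hypothetical helper written for this illustration, it supports only one sender and one receiver, and it approximates synchronised communication by making the sender wait until the receiver has taken the value.

```python
import queue
import threading

class Channel:
    """Hypothetical synchronised channel for one sender and one receiver."""

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)
        self._taken = threading.Semaphore(0)

    def send(self, value):
        """Analogue of c!v: hand over value, then wait until it has been taken."""
        self._slot.put(value)
        self._taken.acquire()

    def recv(self):
        """Analogue of c?x: wait for a value, then release the waiting sender."""
        value = self._slot.get()
        self._taken.release()
        return value

def copy(left, right):
    """Analogue of PROC copy: repeatedly input from left and output to right."""
    while True:
        x = left.recv()
        right.send(x)

left, right = Channel(), Channel()
# A daemon thread is used because copy, like WHILE TRUE, never terminates.
threading.Thread(target=copy, args=(left, right), daemon=True).start()

received = []
for v in [3, 1, 4]:
    left.send(v)
    received.append(right.recv())
```

Because the channel objects are passed as parameters, further copy threads can be joined end to end simply by sharing a channel between neighbours, in the same way that channel names are shared in occam.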

The double buffer COPY >> COPY can now be constructed

    CHAN mid :
    PAR
      copy(left, mid)
      copy(mid, right)

A chain of n buffers may be constructed using an array of n channels and an iterative form of the parallel construct, which constructs n − 2 processes, one for each value of i between 0 and n − 3 inclusive

    CHAN mid[n − 1] :
    PAR
      copy(left, mid[0])
      PAR i = [0 FOR n − 2]
        copy(mid[i], mid[i + 1])
      copy(mid[n − 2], right)

Because occam is intended to be implemented with static storage allocation on a fixed number of processors, the value n in the above example must be a constant. For the same reason, recursive procedures are not allowed.

A similar construction may be used to achieve the effect of subordinate processes, for example

    PROC double(left, right) =
      WHILE TRUE
        VAR x :
        SEQ
          left?x
          right!(x + x) :

This may be declared subordinate to a single using process P

    CHAN doub.left, doub.right :
    PAR
      double(doub.left, doub.right)
      P

Inside P a number is doubled by

    doub.left!4; doub.right?y ...

Processes may be shared using arrays of channels (with one element per using process) and an iterative form of the ALT construction. For example, take an integrator, which after each new input outputs the sum of all numbers

it has input so far

    CHAN add[n], integral[n] :
    PAR
      VAR sum, x :
      SEQ
        sum := 0
        ALT i = [0 FOR n]
          add[i]?x
            SEQ
              sum := sum + x
              integral[i]!sum
      PAR i = [0 FOR n]
        ... user processes ...

Like Ada, occam allows a programmer to assign relative priorities to processes combined in parallel. This is done by using PRI PAR instead of PAR; and the earlier processes in the list have higher priority. The screen-editing facilities provided with the language facilitate reordering of processes when necessary. A similar option is offered for the ALT construction, namely PRI ALT. This ensures that if more than one alternative is ready for immediate selection, the textually earliest will be chosen; otherwise the effect is the same as the simple ALT. Of course, the programmer is urged to ensure that his programs are logically correct, independent of the assignment of priorities.

There are also facilities for distributing processes among distinct processors, and for specifying which physical pins on each processor are to be used for each relevant channel of the occam program, and which pin is to be used for loading the code of the program itself.

7.4 Mathematical models

Recognition of the idea that a programming language should have a precise mathematical meaning or semantics dates from the early 1960s. The mathematics provides a secure, unambiguous, precise and stable specification of the language to serve as an agreed interface between its users and its implementors. Furthermore, it gives the only reliable grounds for a claim that different implementations are implementations of the same language. So mathematical semantics are as essential to the objective of language standardisation as measurement and counting are to the standardisation of nuts and bolts.

In the later 1960s an even more important role for mathematical semantics was recognised, that of assisting a programmer to discharge his obligation of establishing correctness of his program. Indeed R. W. Floyd suggested that the semantics be formulated as a set of valid proof rules, rather than as an explicit

mathematical model. This suggestion has been adopted in the specification of PASCAL and Euclid and Gypsy.

The early design of Communicating Sequential Processes (Section 7.3.5) had no mathematical semantics, and it left open a number of important design questions, for example

1. Is it permissible to nest one parallel command inside another?
2. If so, is it possible to write a recursive procedure which calls itself in parallel?
3. Is it theoretically possible to use output commands in guards?

The mathematical model given in this book answers "yes" to all these questions.

7.4.1 A calculus of communicating systems

The major breakthrough in the mathematical modelling of concurrency was made by Robin Milner. The objective of his investigation was to provide a framework for constructing and comparing different models, at different levels of abstraction. So he starts with the basic syntax of expressions intended to denote processes, and he defines a series of equivalences between the expressions, of which the most important are

    strong equivalence
    observational equivalence
    observational congruence

Each equivalence defines a different model of concurrency. The initials CCS usually refer to the model obtained by adopting observational congruence as the definition of equality between processes.

The basic notations of CCS are illustrated by the following correspondences

    a.P               corresponds to    a → P
    (a.P) + (b.Q)     corresponds to    (a → P | b → Q)
    NIL               corresponds to    STOP

More important than these notational distinctions are differences in the treatment of hiding. In CCS, there is a special symbol τ which stands for the occurrence of a hidden event or an internal transition. The advantage of retaining this vestigial record of a hidden event is that it can be freely used to guard recursive equations and so ensure that they have unique solutions, as described in Section 2.8.3. A second (but perhaps less significant) advantage is that processes which can engage in an unbounded sequence of τs do not all reduce to

CHAOS; so possibly useful distinctions can be drawn between divergent processes. However, CCS fails to distinguish a possibly divergent process from one that is similar in behaviour but nondivergent. I expect this would make efficient implementation of the full CCS language impossible.

CCS does not include ⊓ as a primitive operator. However, nondeterminism can be modelled by use of τ, for example

    (τ.P) + (τ.Q)     corresponds to    P ⊓ Q
    (τ.P) + (a.Q)     corresponds to    (P ⊓ (P □ (a → Q)))

But these correspondences are not exact, because in CCS nondeterminism defined by τ would not be associative, as shown by the fact that the trees in Figure 7.3 are distinct.

[Figure 7.3: tree diagrams over P, Q and R]

Furthermore, prefixing does not distribute through nondeterminism, because the trees in Figure 7.4 are distinct when P ≠ Q

[Figure 7.4: tree diagrams with branches labelled a, leading to P and Q]

These examples show that CCS makes many distinctions between processes which would be regarded as identical in this book. The reason for this is that CCS is intended to serve as a framework for a family of models, each of which may make more identifications than CCS but cannot make less. To avoid restricting the range of models, CCS makes only those identifications which seem absolutely essential. In the mathematical model of this book we have pursued exactly the opposite goal: we have made as many identifications as possible, preserving only the most essential distinctions. We have therefore a far richer set of algebraic laws. It is hoped that these laws will be practically useful in reasoning about designs and implementations; in particular, they permit more transformations and optimisations than CCS.

The basic concurrent combinator of CCS is denoted by the single bar |. It is rather more complicated than the || combinator, in that it includes aspects of hiding, nondeterminism and interleaving as well as synchronisation.

Each event in CCS has two forms, either simple (a) or overbarred (ā). When two processes are put together to run concurrently, synchronisation occurs only when one process engages in a barred event and the other engages in the corresponding simple event. Their joint participation in such an event is hidden by immediate conversion to τ. However, synchronisation is not compulsory; each of the two events can also occur visibly and independently as an interaction with the outer environment. Thus in CCS

    (a.P) | (b.Q) = a.(P | (b.Q)) + b.((a.P) | Q)
    (a.P) | (a.Q) = a.(P | (a.Q)) + a.((a.P) | Q)
    (a.P) | (ā.Q) = τ.(P | Q) + a.(P | (ā.Q)) + ā.((a.P) | Q)

Consequently, only two processes can engage in a synchronisation event; if more than two processes are ready, the choice of which pair succeeds is nondeterministic

    (a.P) | (ā.Q) | (ā.R) = τ.(P | Q | (ā.R)) + τ.(P | (ā.Q) | R)
                            + a.(P | (ā.Q) | (ā.R))
                            + ā.((a.P) | Q | (ā.R)) + ā.((a.P) | (ā.Q) | R)

Because of the extra complexity of the parallel operator, there is no need for a concealment operator. Instead, there is a restriction operator \, which simply prevents all occurrence of the relevant events, and removes them from the alphabet of the process, together with their overbarred variant. The effect is illustrated by the following laws of CCS

    (a.P) \ {a} = (ā.P) \ {a} = NIL
    (P + Q) \ {a} = (P \ {a}) + (Q \ {a})
    ((a.P) | (ā.Q)) \ {a} = τ.((P | Q) \ {a})
    ((a.P) | (ā.Q) | (ā.R)) \ {a} = τ.((P | (ā.Q) | R) \ {a}) + τ.((P | Q | (ā.R)) \ {a})

The last law above illustrates the power of the CCS parallel combinator in achieving the effect of sharing the process (a.P) among two using processes (ā.Q) and (ā.R). It was an objective of CCS to achieve the maximum expressive power with as few distinct primitive operators as possible. This is the source of the elegance and power of CCS, and greatly simplifies the investigation of families of models defined by different equivalence relations.
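The laws for | and \ quoted above can be checked mechanically for small cases. The sketch below is written in Python and uses an encoding invented for this illustration, not one taken from CCS itself: a prefixed process a.P is the pair ("a", "P"), the overbarred event ā is written "~a", τ is written "tau", and a sum is represented by the list of its summands. It assumes that every component of the parallel composition is a single prefix, and it unfolds only the first step.

```python
TAU = "tau"

def bar(action):
    """Complement of an action: a becomes ~a, and ~a becomes a."""
    return action[1:] if action.startswith("~") else "~" + action

def expand(components):
    """First-step summands of (c0 | c1 | ...), where each ci is (action, continuation)."""
    summands = []
    # Each component may perform its event visibly and independently.
    for i, (act, cont) in enumerate(components):
        rest = list(components)
        rest[i] = cont
        summands.append((act, tuple(rest)))
    # Each pair offering complementary events may synchronise, giving tau.
    for i, (act_i, cont_i) in enumerate(components):
        for j in range(i + 1, len(components)):
            act_j, cont_j = components[j]
            if act_j == bar(act_i):
                rest = list(components)
                rest[i], rest[j] = cont_i, cont_j
                summands.append((TAU, tuple(rest)))
    return summands

def restrict(summands, name):
    """Effect of \\ {name} on the first step: drop summands guarded by name or its bar."""
    return [(act, rest) for act, rest in summands if act not in (name, bar(name))]

aP, bQ, bR = ("a", "P"), ("~a", "Q"), ("~a", "R")
two_way = expand([aP, bQ])
shared = restrict(expand([aP, bQ, bR]), "a")
```

Here two_way holds the three summands of (a.P) | (ā.Q), and shared holds the two τ summands of the last restriction law, one for each using process that may win the synchronisation. The restriction still applies to the residual processes, which this sketch leaves unexpanded.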

In this book, I have taken a complementary approach. Simplicity is sought through design of a single simple model, in terms of which it is easy to define as many operators as seem appropriate to investigate a range of distinct concepts. For example, the nondeterministic choice ⊓ introduces nondeterminism in its purest form, and is quite independent of environmental choice represented by (x : B → P(x)). Similarly, || introduces concurrency and synchronisation, quite independent of nondeterminism or hiding, each of which is represented by a distinct operator. The fact that the corresponding concepts are distinct is perhaps indicated by the simplicity of the algebraic laws. A reasonably wide range of operators seems to be needed in the practical application of useful mathematical theories. Minimisation of operator sets is also useful, more especially in theoretical investigations.

Milner has introduced a form of modal logic to specify the observable behaviour of a process. The modality ⟨a⟩S describes a process which may do a and then behave as described by S, and the dual [a]S describes a process that if it starts with a must behave like S afterwards. A calculus of correctness is defined which permits a proof that a process P meets specification S, a fact which is expressed in the traditional logical notation

    P ⊨ S

The calculus is very different from that governing the sat notation, because it is based on the structure of the specification rather than the structure of the programs. For example, the rule for negation is

    If it is not true that P ⊨ F, then P ⊨ ¬F

This means that the whole process P must be written before the proof of its correctness starts. In contrast, the use of sat permits proof of the correctness of a compound process to be constructed from a proof of correctness of its parts. Furthermore, there is never a need to prove that a process does not satisfy its specification. Modal logic is a subject of great theoretical interest, but in the context of communicating processes it does not yet show much promise for useful application.

In general, equality in CCS is a strong relation, since equal processes must resemble each other both in their observable behaviour and in the structure of their hidden behaviour. CCS is therefore a good model for formulating and exploring various weaker definitions of equivalence, which ignore more aspects of the hidden behaviour. Milner accomplishes this by introducing the concept of observational equivalence. This involves definition of the set of observations or experiments that may be made on a process; then two processes are

equivalent if there is no observation that can be made of one of them but not the other, a nice application of the philosophical principle of the identity of indiscernibles. The principle was taken as the basis of the mathematical theory in this book, which equates a process with the set of observations that can be made of its behaviour. A sign of the success of the principle is that two processes P and Q are equivalent if and only if they satisfy the same specifications

    ∀S • P ⊨ S ≡ Q ⊨ S

Unfortunately, it doesn't always work as simply as this. If two processes are to be regarded as equal, the result of transforming them by the same function should also be equal, i.e.,

    (P ≡ Q) ⇒ (F(P) ≡ F(Q))

Since τ is supposed to be hidden, a natural definition of an observation would lead to the equivalence

    (τ.P) ≡ P

However, (τ.P + τ.NIL) should not be equivalent to (P + NIL), which equals P, since the former can make a nondeterministic choice to deadlock instead of behaving like P.

Milner's solution to this problem is to use the concept of congruence in place of equivalence. Among the experiments which can be performed on the process P is to place it in an environment F(P) (where F is composed of other processes by means of operators in the language), and then to observe the behaviour of this assembly. Processes P and Q are (observationally) congruent if for every F expressible in the language the process F(P) is observationally equivalent to F(Q). According to this definition τ.P is not congruent to P. The discovery of a full set of laws of congruence is a significant mathematical achievement.

The need for the extra complexity of observational congruence is due to the inability to make sufficiently penetrating observations of the behaviour of P, without placing it in an environment F(P). That is why we have had to introduce the concept of a refusal set, rather than just a refusal of a single event. The refusal set seems to be the weakest kind of observation that efficiently represents the possibility of nondeterministic deadlock; and it therefore leads to a much weaker equivalence, and to a more powerful set of algebraic laws than CCS.

The description given above has overemphasised the differences of CCS, and has overstated the case for practical application of the approach taken in this book. The two approaches share their most important characteristic, namely a sound mathematical basis for reasoning about specifications, designs and implementations; and either of them can be used for both theoretical investigations and for practical applications.


Select Bibliography

Conway, M. E. 'Design of a Separable Transition-Diagram Compiler,' Comm. ACM 6 (7), 8–15 (1963)
    The classical paper on coroutines.

Hoare, C. A. R. 'Monitors: An Operating System Structuring Concept,' Comm. ACM 17 (10), 549–557 (1974)
    A programming language feature to aid in construction of operating systems.

Hoare, C. A. R. 'Communicating Sequential Processes,' Comm. ACM 21 (8), 666–677 (1978)
    A programming language design: an early version of the design propounded in this book.

Milner, R. A Calculus of Communicating Systems, Lecture Notes in Computer Science 92, Springer Verlag, New York (1980)
    A clear mathematical treatment of the general theory of concurrency and synchronisation.

Kahn, G. 'The Semantics of a Simple Language for Parallel Programming,' in Information Processing 74, North Holland, Amsterdam, pp. 471–475 (1984)
    An elegant treatment of functional multiprogramming.

Welsh, J. and McKeag, R. M. Structured System Programming, Prentice–Hall, London, pp. 324 (1980)
    An account of PASCAL PLUS and its use in structuring an operating system and compiler.

Filman, R. E. and Friedman, D. P. Coordinated Computing, Tools and Techniques for Distributed Software, McGraw–Hill, pp. 370 (1984)
    A useful survey and further bibliography.

Dahl, O–J. 'Hierarchical Program Structures,' in Structured Programming, Academic Press, London, pp. 175–220 (1982)
    An introduction to the influential ideas of SIMULA 67.

(ANSI/MIL–STD 1815A) Ada™ Reference Manual
    Chapter 9 describes the tasking feature.

(INMOS Ltd.) occam™ Programming Manual, Prentice–Hall International, pp. 100 (1984)

Brookes, S. D., Hoare, C. A. R., and Roscoe, A. W. 'A Theory of Communicating Sequential Processes,' Journal ACM 31 (7), 560–599 (1984)
    An account of the mathematics of nondeterministic processes, but excluding divergences.

Brookes, S. D. and Roscoe, A. W. 'An Improved Failures Model for Communicating Sequential Processes,' in Proceedings NSF–SERC Seminar on Concurrency, Springer Verlag, New York, Lecture Notes in Computer Science (1985)
    The improvement is the addition of divergences.

