
Basic Concurrency Theory

Version 1.2

Hans Henrik Løvengreen

Applied Mathematics and Computer Science


Technical University of Denmark
Lyngby, Denmark
© 2002–2018 Hans Henrik Løvengreen

Applied Mathematics and Computer Science
Technical University of Denmark
Richard Petersens Plads, B. 324
DK-2800 Kongens Lyngby, Denmark
Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

1 Petri Nets 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Basic Petri Net Notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2.1 What is a Petri Net? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2.2 Markings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.3 Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.4 Multiple arcs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Modelling of Basic Concurrency Concepts . . . . . . . . . . . . . . . . . . . . . . 6
1.4.1 Sequential Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4.2 Concurrent Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4.3 Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Extended Petri Nets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.6 Further Readings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2 Transition Systems 11
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Sequential Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 Transition Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Concurrent Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3.1 Atomic Actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3.2 The Interleaving Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.3 Concurrent Programs as Transition Systems . . . . . . . . . . . . . . . . . 16
2.4 On Atomicity of Actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.5 On Memory Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.6 Bibliographical Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22


3 Safety Properties and Invariants 23


3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2.1 State Predicates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2.2 Control Predicates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2.3 History Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3 The Invariant Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.4 Proving Invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.4.1 Invariant Strengthening . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.4.2 Auxiliary Invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.4.3 Operational Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.5 Invariants for Busy-wait Synchronization . . . . . . . . . . . . . . . . . . . . . . . 30
3.5.1 Peterson’s Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.5.2 A Formal Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.6 Invariants for Semaphores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4 Liveness Properties and Temporal Logic 39


4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Temporal Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.2.1 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.2.2 Temporal Formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.2.3 Laws of Temporal Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.2.4 The Leads-to Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.3 Expressing Liveness Properties of Concurrent Programs . . . . . . . . . . . . . . 43
4.3.1 General Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.3.2 Liveness Properties of Program Constructs . . . . . . . . . . . . . . . . . 45
4.4 Proof Lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.5 Example: Critical Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.5.1 Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.5.2 Obligingness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.5.3 Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.5.4 Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.6 General Fairness Properties: Weak and Strong Fairness . . . . . . . . . . . . . 53
4.6.1 Example: Liveness of Semaphore Operations . . . . . . . . . . . . . . . . 53
4.7 Further Readings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

5 Monitor Invariants 55
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2 Monitor Invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.3 Stating Monitor Invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.4 Proving Monitor Invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.4.1 Signal and Continue Semantics . . . . . . . . . . . . . . . . . . . . . . . . 58
5.4.2 Signal and Wait Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . 60

Bibliography 61

Index 64
Preface

In this set of notes, we present some of the basic theory underlying the discipline of programming
with concurrent processes/threads.
The notes are intended to supplement a standard textbook on concurrent programming such as
Andrews’ [And00].
The foundations of the notes are formal models of concurrency represented by Petri Nets and
transition systems. Based on these, notions of safety and liveness properties are defined and
systematic techniques for proving them are presented.
An objective of these notes is to let the reader become familiar with the abstract notions under-
lying current verification tools so that he or she will have a good background for constructing
adequate models and properties when such tools are applied.
In version 1.2 a new section on memory models has been added.

Lyngby, August 2018


Hans Henrik Løvengreen

Chapter 1

Petri Nets

1.1 Introduction

In 1962, the German physicist Carl Adam Petri invented a mathematical model to describe
asynchronous phenomena. This model was quickly adopted by the computer science community
under the name of Petri Nets.

Petri Nets are a model of processes, i.e. well-defined activities with a regular behaviour. In particular, Petri Nets aim at expressing concurrency explicitly, i.e. at showing which activities may overlap in time.

Although being a mathematical model, Petri Nets enjoy a graphical representation that many
people find intuitively appealing.

There are many varieties of Petri Nets. In this note we give an introduction to the classic model of place/transition nets and show how they can be used to model concurrent processes as well as basic synchronization forms.

1.2 Basic Petri Net Notions

1.2.1 What is a Petri Net?

A Petri Net is a graph with two kinds of nodes: places, which are drawn as circles, and transitions, which are drawn as bars or boxes. The nodes are connected by arcs (or arrows) that always go from a place to a transition or vice versa. An example of a Petri Net is given below:

The purpose of a Petri Net is to describe the dynamic behaviour of a system, especially the
concurrency and synchronization aspects. By convention, places are used to represent the states
of a system and transitions are used to represent the activities in the system. Each place
represents a local sub-state of the system and each transition represents an action being a well-
understood unit of activity in the system. The arcs indicate the dependencies among states and
activities as described later.


[Diagram: a net with places p1–p5 and transitions t1–t5 connected by arcs]

Figure 1.1: A Petri Net

Mathematically, a Petri Net is a bipartite, directed graph. This may be represented by a tuple ⟨P, T, F⟩, where

P is a set of nodes called places

T is a set of nodes (P ∩ T = ∅) called transitions

F ⊆ (P × T) ∪ (T × P) is the flow relation consisting of a set of arcs.

1.2.2 Markings

A place is used to hold information about the state of a local part of a system. This information
could be propositions such as “part 3 has been completed and part 4 not yet begun” or “there
are n machines ready in Hall B”.
The information about the current state of a system is given by a marking of the net with tokens.
In a given marking, each place holds a number of tokens, possibly none. The tokens are drawn
as (bold) dots within the circles. A marking of the net of figure 1.1 is shown below.
A marking represents a global state of a system given as the totality of the local markings of
the individual places.
More formally, a marking M of a given Petri Net N = ⟨P, T, F⟩ is a function P → N (N = {0, 1, 2, . . .}) indicating the number of tokens at each place. A marked Petri Net is a Petri Net N together with an initial marking M of N. Usually an initial marking is understood when talking about a Petri Net.
One way to indicate the marking function is as a vector corresponding to a numbering of places. For example, the marking in Figure 1.2 may be written as M = (1, 1, 0, 0, 1).
The meaning of a given number of tokens on a place is related to the interpretation of the place.
If the place is to represent a proposition like “part 3 has been completed and part 4 not yet
begun”, it is natural to let one token denote that the proposition is true and no token that it is
false. With this interpretation more than one token on the place would not make sense. On the
other hand, 5 tokens on a place may represent the information that “there are 5 machines ready
in Hall B”. Thus, it depends upon the interpretation whether a given marking corresponds to a meaningful system state.

[Diagram: the net of Figure 1.1 with tokens on p1, p2 and p5, i.e. the marking M = (1, 1, 0, 0, 1)]

Figure 1.2: A Marked Petri Net

1.2.3 Behaviour

A Petri Net describes how a system behaves dynamically over time, by showing how the sub-
states and the actions of the system influence each other. This relationship is indicated by arcs
that show under which conditions actions take place and what effect they have. The change of
state is modelled by letting the initial marking evolve through “firing” of transitions according to
the following rule.

Firing rule (single transition)


For a given marking M, a transition t is enabled iff M(p) ≥ 1 for every p with (p, t) ∈ F.

An enabled transition t may be fired. By firing t, the marking M evolves into M′ as follows:

1. For every incoming arc (p, t) ∈ F, the number of tokens on p is decreased by one.

2. For every outgoing arc (t, p) ∈ F, the number of tokens on p is increased by one.

The fact that a transition t can be fired in a marking M resulting in a new marking M′ is denoted by M −t→ M′.

Firing of a transition models the occurrence of the corresponding action in the system. According
to the firing rule, the action consumes a token for each incoming arc. Since these tokens must be
present for the action to take place, the incoming arcs represent the conditions under which the
action can occur. Correspondingly, the action produces a token for each outgoing arc, which thus represents the result of the action. Of course, the meaning of “consume” and “produce” depends on
the interpretation of the states.
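The firing rule is easy to mechanize. The following sketch (in Python) implements enabledness and firing exactly as stated above; the small net used here is hypothetical, since the arcs of Figure 1.1 are not reproduced in these notes.

```python
# A minimal sketch of the firing rule for place/transition nets.

def enabled(marking, inputs):
    """A transition is enabled iff every input place holds a token."""
    return all(marking[p] >= 1 for p in inputs)

def fire(marking, inputs, outputs):
    """Firing consumes one token per incoming arc and produces one per outgoing arc."""
    m = dict(marking)
    for p in inputs:
        m[p] -= 1
    for p in outputs:
        m[p] += 1
    return m

# Hypothetical net: t1 moves a token from p1 to p2; t2 needs tokens on both p1 and p2.
net = {"t1": (["p1"], ["p2"]),
       "t2": (["p1", "p2"], ["p3"])}
M0 = {"p1": 1, "p2": 0, "p3": 0}

assert enabled(M0, net["t1"][0]) and not enabled(M0, net["t2"][0])
M1 = fire(M0, *net["t1"])        # M0 −t1→ M1
print(M1)                        # {'p1': 0, 'p2': 1, 'p3': 0}
```

Note how firing is a pure function on markings; this makes it easy to explore all executions of a net by search.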

Consider the marked Petri Net in figure 1.2. In this net, the transitions t1 , t3 and t4 are enabled.
By firing t4 , we get the new marking:

[Diagram: the net of Figure 1.1 with the new marking; one token on p1 and two tokens on p2]

Figure 1.3: The marking after firing of t4

Execution

An execution (or run) of a marked Petri Net (N , M0 ) is a (finite or infinite) sequence of transition
firings:
M0 −t0→ M1, M1 −t1→ M2, M2 −t2→ . . .

which is usually written more succinctly as:

M0 −t0→ M1 −t1→ M2 −t2→ . . .

representing a particular course of activity within the system.

Concurrency

Two or more transitions are said to be concurrently enabled in a marking M , if there are enough
tokens for all of them, that is, there must be a token for each arc leading to any one of the
transitions.
A non-empty multi-set¹ U of concurrently enabled transitions can be fired simultaneously (or together). Simultaneous firing has the same effect on the marking as firing the transitions in any order. The fact that U may be fired simultaneously in M leading to M′ is denoted by M −U→ M′. Note that the firing of a singleton transition set M −{t}→ M′ corresponds to firing the single transition M −t→ M′.

A step execution of a marked net (N, M0) is a sequence of simultaneous firings (some of which may be singletons):

M0 −U0→ M1 −U1→ M2 −U2→ . . .

A simultaneous firing of transitions models that the corresponding actions occur in parallel, i.e.
overlapping in time.
¹By using multi-sets (bags) of transitions, we allow a transition to “fire together with itself” if there are enough tokens for all its incarnations.

Conflict

In a given marking, it may be the case that a number of transitions are enabled, but not concurrently enabled. This indicates that some of the transitions require more tokens from the same place than are present, and thus the situation is called a conflict.

Consider again the marked net in Figure 1.2. Here {t1 , t4 } and {t3 , t4 } are concurrently enabled,
but not {t1 , t3 }, since t1 and t3 are in conflict.
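Concurrent enabledness, and hence conflict, can be checked by summing up the token demand per place. The sketch below mirrors the flavour of the example above: t1 and t3 compete for a single token on a shared place, so {t1, t3} is not concurrently enabled while {t1, t4} is. The arcs are hypothetical, chosen only to reproduce this pattern.

```python
from collections import Counter

# A group of transitions is concurrently enabled iff each place holds
# a token for *each* arc leading to any transition in the group.
def concurrently_enabled(marking, transitions, net):
    demand = Counter()
    for t in transitions:
        for p in net[t][0]:        # input places of t
            demand[p] += 1
    return all(marking[p] >= n for p, n in demand.items())

# Hypothetical arcs: t1 and t3 both need the single token on p1 (conflict).
net = {"t1": (["p1"], ["p3"]),
       "t3": (["p1"], ["p4"]),
       "t4": (["p2"], ["p5"])}
M = {"p1": 1, "p2": 1, "p3": 0, "p4": 0, "p5": 0}

print(concurrently_enabled(M, ["t1", "t4"], net))   # True: they draw from different places
print(concurrently_enabled(M, ["t1", "t3"], net))   # False: both need the token on p1
```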

1.2.4 Multiple arcs

A slight generalization of Petri Nets is to allow more than one arc from a particular place to a
particular transition. Likewise for arcs from transitions to places. In the mathematical model,
this can be represented by adding a component W : F → N+ (N+ = {1, 2, 3, . . .}) that gives the
weight of the flow between a place and a transition, corresponding to a number of single arcs.
Graphically, multiple arcs may be drawn explicitly, or indicated by labelling a single arc:

[Diagram: multiple arcs drawn explicitly as two parallel arcs, or indicated by a single arc labelled with its weight, e.g. 3]

Figure 1.4: Multiple Arcs

The notions of enabledness, concurrent enabledness, firing and simultaneous firing readily gen-
eralize to multiple arcs using the idea that “one arc handles one token”. We leave the precise
definitions to the reader.
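As a sketch of the “one arc handles one token” idea, the weighted rule can be written down directly: a transition t is enabled iff M(p) ≥ W(p, t) for every input place p, and firing moves W tokens per arc. The net and weights below are hypothetical.

```python
# Sketch of the weighted firing rule for multiple arcs.
W_in  = {("p1", "t"): 2}      # two arcs from p1 to t
W_out = {("t", "p2"): 3}      # three arcs from t to p2

def enabled(M, t, w_in):
    # t is enabled iff M(p) >= W(p, t) for every input place p
    return all(M[p] >= w for (p, t2), w in w_in.items() if t2 == t)

def fire(M, t, w_in, w_out):
    M = dict(M)
    for (p, t2), w in w_in.items():
        if t2 == t:
            M[p] -= w
    for (t2, p), w in w_out.items():
        if t2 == t:
            M[p] += w
    return M

M = {"p1": 2, "p2": 0}
assert enabled(M, "t", W_in)
print(fire(M, "t", W_in, W_out))   # {'p1': 0, 'p2': 3}
```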

1.3 Interpretation

As the reader may have noticed, we have been using two sets of notions. One set related to the
Petri Net itself, and one set related to the system that it is supposed to model:

Petri Net     Interpretation

net           system
place         sub-state
transition    action
arrow         condition/result
marking       global system state
firing        action occurrence
execution     course of activity

The Petri Net by itself is nothing but a formal system (or game), that is, a set of rules for what nets may look like and how they can be manipulated, totally independent of their interpretation.
Usually a Petri Net is constructed with a concrete interpretation in mind. By choosing some
meaningful names for places and transitions, the distinction between net and system may be
obliterated as shown in Figure 1.5.
[Diagram: a net with labelled places (Hungry, Clean plates, Dinner’s served, Not hungry, Dirty plates) and labelled transitions (Studies, Heats some food in the microwave, Goes to have some pizza, A friend drops in and cleans a plate, Eats)]

Figure 1.5: Interpreted Petri Net: Student’s life

1.4 Modelling of Basic Concurrency Concepts

In this section, we illustrate how a number of basic notions of concurrency can be modelled with
Petri Nets.

1.4.1 Sequential Processes

In a sequential process, actions are executed one at a time. In Figure 1.6 (a) and (b) it is shown
how two sequential processes are represented with Petri Nets. Note that the process (b) has
infinitely many different executions.
A sequential process is characterized by a marked net with only one token and where all transitions have one incoming and one outgoing arc (so that the single token is preserved). Places, however, may have any number of incoming and outgoing arcs.

1.4.2 Concurrent Processes

By considering Figure 1.6 (a) and (b) as one net rather than two, we immediately get a model of
concurrent sequential processes. We see that in the initial marking, a1 and b1 are concurrently
enabled and thus can occur in parallel. We even have that we can find executions that make any
pair (ai , bj ) concurrently enabled. This reflects that we do not make any assumptions about the
relative execution speed of the processes.

[Diagram: (a) a linear process whose single token passes through the transitions a1, a2, a3; (b) a cyclic process whose token passes through b1, b2, b3 and is returned to the start by b4]

Figure 1.6: Two Sequential Processes

Rather than having concurrency from the very beginning, a fork transition may create multiple concurrent processes from a single one. Likewise, these may be joined by a join transition at the end of the process.

1.4.3 Synchronization

By synchronization, we understand any constraint on the ordering of actions in different pro-


cesses. Below we show how the three main forms of synchronization are represented with Petri
Nets. For simplicity we use the two sequential processes of Figure 1.6, leaving out the action b4 .

Condition Synchronization

When an action in one process always has to wait for a condition that is caused by other processes,
we talk about condition synchronization. Typically the condition could be that a certain action in
another process has taken place. To model such synchronization, we introduce an auxiliary place
in-between the processes and let a token at that place indicate that the condition is satisfied. If
we want b3 always to come after a1 , this is modelled as shown below.

True Synchronization

The meaning of synchronous is “at the same time”. If we say that two actions in different
processes should always occur at the same time, it is reasonable to think that the two actions
are just different views of the same phenomenon. An example is a bank transfer in which two
parties commonly agree that money has been transferred.
In Petri Nets, the common agreement is expressed by making a common transition that spans
the two processes. E.g. if we want the actions a2 and b2 to be truly synchronized, this is modelled
by “gluing” them together into a common action ab:
It is seen that the model makes the two processes meet. More than two processes may synchronize
in this way. This form of synchronization is also known as barrier synchronization.

[Diagram: the two processes of Figure 1.6 with an auxiliary place in-between; a1 deposits a token on it, and b3 requires that token]

Figure 1.7: Condition Synchronization

[Diagram: the two processes with a2 and b2 glued together into a single transition ab spanning both processes]

Figure 1.8: True Synchronization (Barrier Synchronization)

Mutual Exclusion

The third form of synchronization between processes is mutual exclusion in which certain actions
(or sequences of actions) are not allowed to be executed at the same time. Typically this is
to prevent simultaneous access to shared resources (e.g. printers). In the Petri Net, this is
accomplished by an auxiliary place with a single token that represents the “right” to execute.
By letting the transitions that must exclude each other claim this token, the firing rules prevent
them from being concurrently enabled. E.g. mutual exclusion between a2 and b2 is modelled as
shown below.
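This modelling claim can be verified mechanically. The sketch below uses a hypothetical encoding of the mutual-exclusion net of Figure 1.9 (the exact arcs are assumptions, chosen so that a2 and b2 each claim the token on the auxiliary place m and return it) and exhaustively searches the reachable markings, confirming that a2 and b2 are never concurrently enabled.

```python
from collections import deque

# Two linear processes a1;a2;a3 and b1;b2;b3, plus a mutex place m
# that a2 and b2 must both claim and return (hypothetical encoding).
net = {
    "a1": (["pa1"], ["pa2"]), "a2": (["pa2", "m"], ["pa3", "m"]), "a3": (["pa3"], ["pa4"]),
    "b1": (["pb1"], ["pb2"]), "b2": (["pb2", "m"], ["pb3", "m"]), "b3": (["pb3"], ["pb4"]),
}
M0 = {"pa1": 1, "pa2": 0, "pa3": 0, "pa4": 0,
      "pb1": 1, "pb2": 0, "pb3": 0, "pb4": 0, "m": 1}

def enabled(M, t):
    return all(M[p] >= net[t][0].count(p) for p in net[t][0])

def fire(M, t):
    M = dict(M)
    ins, outs = net[t]
    for p in ins:  M[p] -= 1
    for p in outs: M[p] += 1
    return M

def concurrently_enabled(M, ts):
    need = {}
    for t in ts:
        for p in net[t][0]:
            need[p] = need.get(p, 0) + 1
    return all(M[p] >= n for p, n in need.items())

# Breadth-first exploration of all reachable markings.
seen, queue = set(), deque([M0])
while queue:
    M = queue.popleft()
    key = tuple(sorted(M.items()))
    if key in seen:
        continue
    seen.add(key)
    # a2 and b2 together demand two tokens on m, but m never holds more than one
    assert not concurrently_enabled(M, ["a2", "b2"])
    for t in net:
        if enabled(M, t):
            queue.append(fire(M, t))
print(len(seen), "reachable markings, mutual exclusion holds")
```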

Synchronized vs. Synchronize

Unfortunately, the term synchronization is used both as a general term and for true synchronization in particular. However, when we say that some processes are synchronized (passive form), we usually mean that they are constrained by any one of the above synchronization

[Diagram: the two processes with an auxiliary place holding a single token; both a2 and b2 claim the token and return it afterwards]

Figure 1.9: Mutual Exclusion

forms. On the other hand, when we say that some processes synchronize (active form), we
usually understand that they perform true synchronization (meet).

The Java language adds further to the confusion by using the keyword synchronized to denote
mutual exclusion.

1.5 Extended Petri Nets

In this note we have focussed on the class of Petri Nets called place/transition nets. There are
however many other forms of Petri Nets.

A simple extension is to allow inhibitor arcs that test for absence of tokens. Inhibitor arcs are drawn
as an “arrow” where the arrowhead is replaced by a small circle. Consider e.g.

[Diagram: a transition t with an ordinary arc from place p1, an inhibitor arc from place p2, and an output arc to place p3]

Figure 1.10: Inhibitor arc

Here t may fire only when there are tokens in p1 and no tokens in p2 .

Introduction of inhibitor arcs makes Petri Nets equivalent in expressive power to Turing machines.

So-called high-level Petri Nets further increase the expressiveness by allowing tokens to take values and letting transitions manipulate these. A well-developed branch of high-level nets is Coloured Petri Nets [Jen92, Jen95, Jen97], for which a number of analysis and programming tools have been developed.

Timing properties and probabilities have also been added to Petri Nets.

1.6 Further Readings

Much of the Petri Net literature over the last four decades has been devoted to studies of
the mathematical properties of the many varieties of Petri Nets. Unfortunately, only very few introductory textbooks have been published.
A survey of basic Petri Nets theory can be found in [Mur89].
A more comprehensive survey of Petri Net classes and applications of these can be found in [RR98a] and [RR98b].
The URL www.informatik.uni-hamburg.de/TGI/PetriNets/ is a good entry to the Petri Net
resources available online.
The use of Coloured Petri Nets is thoroughly described in [JK09] and a set of online tools are
available at cpntools.org.
Chapter 2

Transition Systems

2.1 Introduction

In order to formulate and prove properties of concurrent programs, we need a precise, but not
overly detailed, model of what happens when a concurrent program is executed.
In this chapter, we first introduce transition diagrams as a graphical notation for describing
sequential behaviour. We then turn to concurrent processes and observe that if actions enjoy the
property of being atomic, the concurrent execution may be represented by a sequence of actions.
We use the mathematical notion of labelled transition systems to represent this interleaving model
of concurrency. Finally, we discuss when atomicity of actions is met in usual implementations
and under which conditions the model is applicable.

2.2 Sequential Processes

A distinguished program part that is to be executed independently is said to describe a process¹.


By the process we understand the collection of all possible executions of the program part.
Usually, we are a bit sloppy and talk about the program part itself as “the process”.
The execution of a process is assumed to consist of execution of discrete actions (also known
as primitive operations or steps). At a given level of abstraction, the actions are defined as the
smallest unit of activity we talk about. Thus, actions are a relative concept.
In reality, the execution of actions takes time. If the actions of a process are always executed
without overlapping in time, we talk about a sequential process. A program or program part
that describes a sequential process is called a sequential program, respectively a sequential process (description).
Actions may be combined to composite actions. E.g. the three actions a, b, and c may be
composed sequentially to the composite action d = (a; b; c). Actions and composite actions are
sometimes known collectively as operations.
¹This abstract notion of activity mostly corresponds to the notion of so-called threads provided by operating systems and reflected in most contemporary programming languages.


Henceforth, we shall assume a concurrent program to be composed from a fixed number of


sequential process descriptions that are executed concurrently, i.e., their executions may overlap
in time. Again we shall identify a process with its description and talk about a system of
concurrent processes. Likewise, we say that “the process P has reached . . . ” rather than “the
execution of the process described by P has reached . . . ”.
We assume that a system of concurrent processes is always in some state, that determines its
further behaviour. For a usual concurrent program, the state is determined by values of the
program variables as well as the control variables that indicate how far the execution of each
process has progressed. The program variables can be divided into shared variables that are
referred to in several processes and local variables that are only used in a single process. We
sometimes use the term global state to emphasize that we talk about the full program state.
We usually assume that the execution of a concurrent program starts in a state where all pro-
gram variables have been initialized and each control variable points to the start of the process
description it is associated with².
The state is changed subject to the execution of actions. We assume that actions always termi-
nate and that the action’s effect on the state, when executed in isolation, can be described by a
statement S. Usually S is an assignment of the form x := e, where x is a program variable and
e is an expression, or a multiple assignment x̄ := ē where x̄ and ē are vectors of equal length.

2.2.1 Transition Diagrams

Before we can give a model of the execution of a concurrent program, we need a model of its
constituents, i.e. we must provide a process model.
Obviously, an instance of a sequential process execution can be modelled by a sequence of actions:

a1 , a2 , a3 , . . .

The sequence may be infinite. However, the process is not given by a single execution—the
process encompasses all the potential executions. We could then imagine the process being
modelled by the collection of all these execution sequences, but this model is not sufficient, since
we would like the choice of which action to perform next to depend on the global state (as is
the case in an if statement). We thus need to be able to prescribe both the sequence in which
actions are performed as well as the conditions under which the actions may be performed.
To express that actions can be performed only under certain conditions, we introduce the notion
of a conditional action of the form:
a: B → S
where a is a unique action label, B is a boolean predicate on the global state and S is a statement.
B is called the guard of a and indicates the conditions under which a may be executed with the
effect given by S . If a can be performed unconditionally we write just S instead of true → S .
If the statement has no effect on the variable state it can be left out: B → .
To determine the sequencing of a process’ actions we use a transition diagram (state machine,
automaton) which is a directed graph whose edges are marked with conditional actions. The
²In practice, this is usually accomplished by creating processes (threads) dynamically one by one.

nodes of the graph correspond to the (control) locations of the process. One of the nodes must be
designated as the start location. Locations without outgoing edges are called terminal locations.
To talk about the execution of a transition diagram, we think of a mark or pointer that at any
time indicates the location the process has reached. This control pointer corresponds to the
value of the control variable we have already introduced for each process. Initially the control
pointer is at the start location. At any given point of time, the process may only perform an
action if control has reached the location from which the action emanates and the condition
of the action is satisfied (if any). When an action is performed, the control pointer is moved
along the edge corresponding to the action to point at a new location. If control has reached a
terminal location, the process is said to have terminated.
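The execution rule just described can be sketched as a small interpreter: each edge carries a conditional action B → S, represented here as a pair of (guard, effect) functions on the variable state, and from the current location an edge whose guard holds is chosen non-deterministically. The two-location diagram below is hypothetical, not one from this chapter.

```python
import random

# Hypothetical diagram: at l1, keep incrementing x while x < 3,
# then move to the terminal location l2 once x >= 3.
edges = {
    "l1": [(lambda s: s["x"] < 3,  lambda s: {**s, "x": s["x"] + 1}, "l1"),
           (lambda s: s["x"] >= 3, lambda s: s,                      "l2")],
    "l2": [],   # terminal location: no outgoing edges
}

def run(start, state):
    loc = start
    while edges[loc]:                              # stop at a terminal location
        ready = [(eff, tgt) for grd, eff, tgt in edges[loc] if grd(state)]
        eff, loc2 = random.choice(ready)           # non-deterministic choice among enabled edges
        state, loc = eff(state), loc2
    return loc, state

print(run("l1", {"x": 0}))   # ('l2', {'x': 3})
```

Since the two guards at l1 are complementary, exactly one edge is enabled at any time, so this particular diagram behaves like a while statement.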

Example

Figure 2.1 shows a transition diagram that models one of the processes in Dekker’s algo-
rithm (cf. [And00, p.141]).
[Diagram: a transition diagram with locations l1–l9 and start location l1, indicated by a small arrow. Among the edges: enter1 := true from l1 to l2; the complementary tests enter2 → and ¬enter2 → out of l2; turn = 2 → from l3 to l4; enter1 := false from l4 to l5; the waiting test turn ≠ 2 → out of l5; and crit1, enter1 := false, turn := 2 along l6 → l7 → l8 → l9]

Figure 2.1: Transition Diagram for Dekker’s Algorithm

The start location l1 is here indicated with a small arrow. The actions crit1 and noncrit1 are
names for actions whose effect is assumed to be irrelevant for the properties of interest. The
effect of the other actions is indicated by assignment statements that change the program
variable state. The actions emanating from l2 are test actions that do not change the
program variable state (but determine from where the execution is to proceed!). Since the
conditions are complementary, l2 corresponds to the test of a while statement. Likewise,
the action emanating from l5 is a test action, but as is has no alternative, it indicates that
the process has to wait until the variable turn differs from 2. In the textual representation
of processes, such an isolated guard is usually denoted by await turn 6= 2.

In the example we have omitted labels for the actions as they may be uniquely identified by
their actual statements. E.g. the action ‘turn := 2’ is the one going from l8 to l9.
Often, we shall not bother to label all the locations, but just identify them through the actions.
For a given action a we let preloc(a) denote the location from which the action emanates and
let postloc(a) denote the location to which the action leads. E.g. we have that preloc(turn = 2 →) = l3 and postloc(turn = 2 →) = l4.

A transition diagram corresponds to a flowchart, but may express more complicated branching
and exhibit non-determinism. A transition diagram may be seen as a simplified notation for the
Petri Net that is obtained by taking the locations as places and explicitly inserting transition
symbols at each edge:

[Diagram: an edge labelled a between two locations in a transition diagram corresponds, in the Petri Net, to a transition a inserted between the two corresponding places]

Even though we use transition diagrams as our basic process model, we shall often describe
processes textually, using a programming language-like notation. As seen in the above example,
the correspondence between a program text and its transition diagram is usually evident.

2.3 Concurrent Processes

2.3.1 Atomic Actions

During the execution of a concurrent program, actions from different processes may be executed
overlapping in time. We have previously assumed that we know the effect of an action if it is
performed in isolation. Thus, the question is whether we can predict what will happen when
the actions are performed in parallel. To answer this question, we will turn to the concept of
atomicity characterized by:

atomic = virtually indivisible

We thus introduce the notion of an atomic action as an action that, as seen from other pro-
cesses, appears to be performed indivisibly. This means that no other process is able to detect
any intermediary states during its execution—they either find that the action has been fully
performed or not performed at all. This implies that if two or more atomic actions are
executed overlapping in time, the resulting effect on the state is as if they were executed one at
a time, in some order.

Example

If we assume that assignment of a constant to an integer variable is an atomic action, a


parallel execution of the following two actions, a and b:

a : x := 1 b : x := 4

will result in the variable x getting either the value 1 (corresponding to b; a) or 4 (corresponding to a; b) and not some combined value (like 5) or an arbitrary value. [The
computers that we normally use for concurrent program execution usually ensure this; see
Section 2.4.]

That an action is atomic is thus a property of the action relative to actions in other processes of
the given program. We shall say that a group of actions are mutually atomic if any execution in
which they overlap in time has the same effect as if they were executed in some sequential order.
A concurrent program is said to have atomic actions if any selection of actions from different
processes are mutually atomic.

In the sequel we assume that the actions of all processes for a given concurrent program have
been chosen such that they are atomic. In particular, in a transition diagram, all actions are by
definition assumed to be atomic. In textual process descriptions, the atomic actions can be
indicated by angle brackets ⟨ a ⟩.

2.3.2 The Interleaving Model

If one is not interested in program properties that are related to whether actions are actually
performed in parallel (like how fast it terminates), but solely in properties that are related to
the sequence of states the program will go through, one may use the interleaving model of the
program behaviour.

Assuming that all actions of a program are atomic, we can consider the execution of the program
as a sequence of state transitions
s0 −a0→ s1 −a1→ s2 −a2→ · · ·

where the action sequence a0 , a1 , . . . is an interleaving of the actions of the individual processes
(i.e. the sub-sequence of actions belonging to process Pi is a proper action sequence for Pi ).

Now, the question is which interleavings should be considered feasible for the program. We here
use the first principle of concurrent programming:

The speed of execution of each process is unknown

In the model this means that the set of executions of a program is given by all possible interleav-
ings of all possible sequences of actions of the processes. When constructing the interleavings,
the fact that some actions are conditional must be taken into consideration. The possible
interleavings thus in general depend on the initial program state.

Since the actions are assumed to be atomic, we see that the interleaving model has the following
fundamental property:

If the actions of the program “in reality” are performed overlapping in time, the states
that the program passes through are also passed through by one of the interleavings.

Thus, to see if the program “in reality” has a certain (state-related) property, we need only check
the property under the assumption that only one action is performed at a time.

2.3.3 Concurrent Programs as Transition Systems

In the interleaving model, the development of the total system of processes is seen as a huge
sequential process (albeit a very non-deterministic one). To represent the model formally, we
may thus use the general notion of a labelled transition system:

Definition 1 A (labelled) transition system TS is a tuple (Σ, A, T , s0 ), where:


Σ is a set of states
A is a set of actions (or labels)
T ⊆ Σ × A × Σ is the transition relation
s0 ∈ Σ is the initial state
The fact that (s, a, s′ ) ∈ T indicates that the action a can be executed in the state s, resulting
in the new state s′ . For a given TS , this fact is usually written s −a→ s′ .

An execution of TS is a finite or infinite sequence


s0 −a0→ s1 −a1→ s2 −a2→ · · ·

where s0 is the initial state and (si , ai , si+1 ) ∈ T for every i .

Any state s that appears in some execution of TS is said to be reachable. The set of reachable
states is denoted by Reach(TS ). 

The transition system may also be seen (and depicted) as a transition graph in which the states
correspond to nodes, each s −a→ s′ is a labelled arrow in the graph, and s0 is a distinguished
initial node. In this view, an execution corresponds to a finite or infinite path starting in s0 .
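For a finite system, the set Reach(TS ) can be computed by an ordinary graph search over the transition graph. The following sketch (in Python, used here purely as an illustration vehicle; the tiny system at the end is a made-up example, not from the text) represents T as a set of (s, a, s′) triples:

```python
from collections import deque

def reach(initial, transitions):
    """Compute Reach(TS) by breadth-first search over the transition graph."""
    seen = {initial}
    frontier = deque([initial])
    while frontier:
        s = frontier.popleft()
        for (src, action, dst) in transitions:
            if src == s and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen

# A tiny made-up system: state 3 can reach the others, but not vice versa.
T = {(0, 'a', 1), (1, 'b', 2), (2, 'a', 0), (3, 'b', 1)}
print(sorted(reach(0, T)))   # → [0, 1, 2]; state 3 is not reachable from 0
```

The same search underlies the explicit-state model checking mentioned in Chapter 3.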

For a concurrent program with processes P1 , P2 , . . . , Pn defined by transition diagrams, we may
construct a program transition system TS = (Σ, A, T , s0 ) as described below.

For each process Pi its set of actions is denoted Ai . We require the action sets to be disjoint
such that each action is uniquely attributed to one of the processes. The set of program actions
A is the union of all action sets Ai .
With each process Pi we associate a control variable πi . The program state space Σ is given as
all possible assignments of values to
(i) The shared variables of the program
(ii) The local variables of each process Pi
(iii) The control variables πi
Each state may be represented as a concatenated vector of values for these variables.

The initial state s0 is then the vector in which each shared and local variable has its initialization
value (possibly a default value) and each control variable πi points to the start location of Pi .

The transition relation T is the union of the transitions generated by each action in each process.

Each action a : B → S occurring in the transition diagram of Pi gives rise to a set of program
transitions, viz. the transitions s −a→ s′ where, in s, control has reached preloc(a) and B is true,
and where s′ is the result of executing S from s and assigning postloc(a) to πi .

An action a is said to be enabled in state s if there exists an s′ such that s −a→ s′ . Relating to the
transition diagram, an action is enabled if control for the process has reached its pre-location
and the condition (if any) is satisfied.
With this program model, an execution of the program transition system:

s0 −a0→ s1 −a1→ s2 −a2→ · · ·

exactly reflects an interleaving of the actions of the constituent processes. In each step of the
execution, the action ai is chosen among those actions of the individual processes that are
enabled in si .

Example

Consider a small concurrent program given by the transition diagrams:

var x , y : integer := 0;

P1 : l0 −( a1 : x := y )→ l1 −( b1 : y := x )→ l2

P2 : k0 −( a2 : y := 1 )→ k1

For this program, the state space can be given as vectors corresponding to the values of
(x , y, π1 , π2 ), where π1 and π2 are the control variables for the two processes. For example,
the initial state is given by (0, 0, l0 , k0 ).
Now, the transition system for this program can be described by the following transition
graph:

[Transition graph, listed here as its transitions:]

(0, 0, l0 , k0 ) −a1→ (0, 0, l1 , k0 )      (0, 0, l0 , k0 ) −a2→ (0, 1, l0 , k1 )
(0, 1, l0 , k1 ) −a1→ (1, 1, l1 , k1 )      (0, 0, l1 , k0 ) −a2→ (0, 1, l1 , k1 )
(0, 0, l1 , k0 ) −b1→ (0, 0, l2 , k0 )      (0, 0, l2 , k0 ) −a2→ (0, 1, l2 , k1 )
(0, 1, l1 , k1 ) −b1→ (0, 0, l2 , k1 )      (1, 1, l1 , k1 ) −b1→ (1, 1, l2 , k1 )

The graph actually represents only the reachable part of the transition relation, i.e. the
states that can be reached from the initial state (see Chapter 3). E.g. (2, 1, l1 , k0 ) −b1→
(2, 2, l2 , k0 ) also belongs to T .
We see that an execution of the composed transition system is indeed an interleaving of
actions and that the program may terminate in three different states.
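The transition graph can also be produced mechanically. The following sketch (in Python, as an illustration only; the action names a1, b1, a2 and the location encoding are taken from the example) enumerates the reachable states by search and collects the terminal states:

```python
from collections import deque

# States are tuples (x, y, pi1, pi2) with control values l0..l2 and k0..k1.
# enabled(s) lists the actions executable in s together with their successors.
def enabled(s):
    x, y, p1, p2 = s
    moves = []
    if p1 == 'l0': moves.append(('a1', (y, y, 'l1', p2)))  # a1: x := y
    if p1 == 'l1': moves.append(('b1', (x, x, 'l2', p2)))  # b1: y := x
    if p2 == 'k0': moves.append(('a2', (x, 1, p1, 'k1')))  # a2: y := 1
    return moves

# Breadth-first enumeration of the reachable states from (0, 0, l0, k0).
s0 = (0, 0, 'l0', 'k0')
seen, frontier = {s0}, deque([s0])
while frontier:
    s = frontier.popleft()
    for _, t in enabled(s):
        if t not in seen:
            seen.add(t)
            frontier.append(t)

# Terminal states: no action enabled; the program may stop in three states.
terminal = {s for s in seen if not enabled(s)}
print(len(seen), sorted(terminal))   # 9 reachable states, 3 terminal ones
```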

Evidently, it is of vital importance for soundness of the interleaving model that the actions are
atomic. In the next section we discuss which actions can be assumed to be atomic in a usual
program executed on a usual machine.

2.4 On Atomicity of Actions

When we are about to decide what to choose as actions in the interleaving model, we are in a
dilemma. On one side, we would like actions to be as “powerful” as possible to avoid considering
too many interleavings. On the other side, it is of no use to make the actions so large, that it
will be unreasonably difficult to implement them atomically.
Most processors will perform their (simple) instructions indivisibly, so machine instructions
are automatically atomic. If, for a given program, we know the processor on which it
is to run as well as the code generated by the compiler for this machine, we may use the
individual instructions of this code as actions for the program model. This, however, is not
satisfactory, since we aim at writing concurrent programs that are correct irrespective of the
machine architecture on which they are run.
Instead, we proceed as follows: On all usual machines, access (i.e. reading or writing) to a simple
program variable is performed indivisibly by one machine instruction. (In a multiprocessor
system, concurrent access to a variable will be sequentialized by a bus-arbiter.) A simple variable
is a variable that can be held within one machine word³—booleans, integers and characters are
usually simple in this sense. All other instructions (e.g. addition) are performed on local variables
(registers, stacks etc.). For instance, a statement like x := x + 1 is often implemented by three
machine instructions:

Load R,x
Incr R
Store R,x

where R is a local variable for the process (viz. a register). These three instructions correspond
to three atomic statements:

⟨ R := x ⟩
⟨ R := R + 1 ⟩
⟨ x := R ⟩

If two instances of this code are performed concurrently (with individual instances of R) from
an initial state of x = 1, we get 20 different interleavings, resulting in two different end results,
viz. x = 2 or x = 3.
Looking at the interleavings in detail, we notice that actions that refer only to local variables
(here the R’s) can be interleaved in any order; other processes cannot detect their execution.
Therefore, we conclude that such local actions may freely be joined with preceding or follow-
ing non-local actions. For the example, we may represent the machine code by the following
equivalent atomic actions:
³ For the time being, a word is usually 32 or 64 bits.

⟨ R := x + 1 ⟩
⟨ x := R ⟩

or alternatively

⟨ R := x ⟩
⟨ x := R + 1 ⟩

We now get 6 interleavings of the two processes while maintaining the two possible results for x .
If, however, we take the original statements x := x + 1 as actions, we get only two interleavings
and only one result, x = 3. Thus, the statement x := x + 1 is not atomic, if implemented in a
standard way.
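The interleaving counts claimed above (20 and 6, with results x = 2 or x = 3) are easy to verify by brute force. In the following sketch (in Python, not part of the notes' formalism), each atomic action is modelled as a function from the pair (x, R) to its new value, and all interleavings of the two processes are enumerated:

```python
from itertools import permutations

def run(schedule, programs):
    """Run one interleaving: schedule is a sequence of process ids; each
    process executes its next atomic action when scheduled."""
    x = 1                       # initial state x = 1, as in the text
    pc = [0] * len(programs)    # per-process program counters
    R = [0] * len(programs)     # per-process local registers
    for p in schedule:
        x, R[p] = programs[p][pc[p]](x, R[p])
        pc[p] += 1
    return x

def interleavings(lens):
    # All distinct interleavings of processes with the given action counts.
    seq = [p for p, n in enumerate(lens) for _ in range(n)]
    return set(permutations(seq))

fine = [lambda x, R: (x, x),       # <R := x>
        lambda x, R: (x, R + 1),   # <R := R+1>
        lambda x, R: (R, R)]       # <x := R>
scheds = interleavings([3, 3])
results = {run(s, [fine, fine]) for s in scheds}
print(len(scheds), sorted(results))    # → 20 [2, 3]

coarse = [lambda x, R: (x, x + 1),    # <R := x+1>
          lambda x, R: (R, R)]        # <x := R>
scheds2 = interleavings([2, 2])
results2 = {run(s, [coarse, coarse]) for s in scheds2}
print(len(scheds2), sorted(results2))  # → 6 [2, 3]
```

Joining the local actions thus reduces the number of interleavings from 20 to 6 without changing the set of possible results.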
As indicated above, what matters in the interleavings are the points where a process influences or
is influenced by other processes. If there is only one such point in each statement (or expression)
the effect of a concurrent execution is determined by the order of these critical points. This is
made precise in the following principle:

Rule of Critical References


A critical reference is either:
• Reading of a simple variable that may be changed by other processes.

• Writing of a simple, shared variable.


If a variable is not simple, access counts for more than one critical reference.
Assuming that access to a simple variable is atomic, we have

The evaluation of a test expression e or the execution of an assignment statement S is atomic if at most one critical reference occurs in e or S , respectively.

Note that if l is a local variable and y is a variable that may be changed by other processes, the
statement l := y + y is not atomic since the (critical) read of y occurs twice. Thus, even though
a compiler may optimize the right-hand side to 2y such that y is read only once in the resulting
code, we shall in general not rely on such optimizations.
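That l := y + y is indeed not atomic can be seen by enumerating the interleavings of its two critical reads with a single concurrent write y := 1 (a hypothetical writer process, starting from y = 0); the sketch below is in Python, as an illustration only. The outcome 1 equals 2y for no value of y, so the writer has observed an intermediate state:

```python
from itertools import permutations

# Process 1 performs the two critical reads of y in  l := y + y ;
# process 2 performs the single write  y := 1.  Initially y = 0.
results = set()
for order in set(permutations([1, 1, 2])):
    y, reads = 0, []
    for p in order:
        if p == 1:
            reads.append(y)   # one critical read of y
        else:
            y = 1             # the concurrent write  y := 1
    results.add(reads[0] + reads[1])

print(sorted(results))   # → [0, 1, 2]; the value 1 equals 2*y for no y
```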
By introduction of local variables we may always rewrite a program such that its assignment
statements and test expressions become atomic according to the above rule and thus may be
used as actions for the interleaving model.

Explicit Atomicity

To indicate that an action a is, or is assumed to be, atomic, even if it does not satisfy the Rule
of Critical References, we use the notation ⟨ a ⟩. In particular, ⟨ B → S ⟩ means that the statement
S is performed virtually indivisibly with the evaluation of B (if B holds). By marking an action
this way, we make an obligation to implement it in a special way to ensure atomicity. This
may be accomplished by synchronization, use of operating system features or special indivisible
machine instructions.

2.5 On Memory Models


In the interleaving model any write to shared variables takes effect on the global state immedi-
ately. Hence all processes will see the same history of the shared variables. This principle is also
known as sequential consistency. Actual processor and memory architectures, however, may not
provide such a simple realization.

Reorderings

Over the years much effort has been put into optimizing the execution of traditional (i.e. se-
quential) programs. This has taken place within all components of the execution chain:

• Compilers may reorder accesses to variables, hold variables temporarily or permanently
in registers, or totally eliminate variables not being assigned.
• Processors may execute instructions out of order, e.g. by loading operands in advance or
postponing storage of results. Usually, strict orders can be enforced through explicit use
of special fence or barrier instructions, though.
• Memory systems may reorder read and write requests or postpone flushing of cached
values to global memory.

These optimizations have been made under the assumption that the program is executed as a
single sequential process without any asynchronous interactions on the program state. Thus, as
long as the sequential semantics is preserved, the optimizations are considered correct. Unfor-
tunately, these assumptions do not necessarily hold for concurrent programs — especially not
when executing on a multiprocessor architecture.
As a result, the actual execution of concurrent sequential processes may not always be adequately
represented by transition systems as in the interleaving model.

Example

Consider the following concurrent program:

var x , y : integer := 0;      Shared variables
a, b : integer := 0;           Local variables of P2
process P1 ; process P2 ;
x := 1; a := y;
y := 1 b := x ;

By the interleaving model the program may terminate with the following values for (a, b):
(0, 0), (0, 1) and (1, 1). The value (1, 0), however, would not be seen, as reading y = 1
would imply that x = 1 had already been written. In an optimized execution, though, the
write to y might be flushed from the cache before that of x .
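Under the interleaving model the possible outcomes can be checked exhaustively. The sketch below (in Python, as an illustration only) enumerates the six interleavings of P1 and P2 and confirms that (1, 0) is not among the results; reordering and caching effects are exactly what this model does not capture:

```python
from itertools import permutations

# P1: x := 1; y := 1        P2: a := y; b := x     (all reads/writes atomic)
def run(order):
    x = y = a = b = 0
    pc = [0, 0]                      # program counters of P1 and P2
    for p in order:                  # order: an interleaving of 1's and 2's
        if p == 1:
            if pc[0] == 0: x = 1     # P1 first action
            else:          y = 1     # P1 second action
        else:
            if pc[1] == 0: a = y     # P2 first action
            else:          b = x     # P2 second action
        pc[p - 1] += 1
    return (a, b)

outcomes = {run(o) for o in set(permutations([1, 1, 2, 2]))}
print(sorted(outcomes))   # → [(0, 0), (0, 1), (1, 1)]; (1, 0) never occurs
```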

Of course, the interleaving model may be refined taking potential reorderings into account. For
instance, the program part a; b may be represented by two paths a, b and b, a in the transition
diagram provided that a and b are sufficiently independent (e.g. operate upon different variables).
It would, however, be extremely difficult to reason about the behaviour of such models.

Memory models

Rather than inventing a new complex execution model, one may try to establish conditions under
which the interleaving model is feasible. For modern high level languages this has resulted in a
compromise where the use of a few dedicated language notions provides the necessary control
of the execution order while still allowing the execution chain to apply optimizations at a useful
scale.
Thus, many high-level languages have now been equipped with a memory model which specifies
how accesses to the shared variables of a concurrent program are executed.
Typically, the memory model defines a happens-before relation among the actions of a program.
The relation should be a partial order with the property that if an action a is said to happen
before an action b then the effect of a (e.g. writing to a shared variable x ) is visible to the actions
b (e.g. reading of x ).
The happens-before ordering is usually based on the program order of actions in individual
processes as well as a number of inherent inter-process causalities such as start of a new process
or the use of built-in synchronization mechanisms. The relation is typically characterized by
rules such as⁴:

• Any action preceding (in program order) the creation of a new process happens before the
actions of the process.

• Any action of a process happens before the detection of its termination in another process.

• Any action that precedes the release of a lock happens before the next acquiring of that
lock.

• Etc.

In addition, the language may provide means of indicating shared variables to be treated in a
special way. For instance variables may be declared to be volatile with the effect that:

• The write of a volatile variable immediately takes global effect.

• Any action preceding the write of a volatile variable happens before any action following a
subsequent read of the variable.

Many high-level languages also provide special atomic variables/objects which, in addition to
providing extra atomic operations (such as atomic increment), in the same way enforce a happens-
before relationship among their operations.
The memory model puts obligations on the execution chain to respect the happens-before
ordering. This may be accomplished by letting the compiler itself refrain from certain reorderings
or register caching, as well as by making it issue sufficient fencing and cache-flushing instructions
in the instruction stream.
⁴ For a language with dynamic process/thread creation.

Data race freedom

The happens-before relation need not be total. In the case where a write of a shared variable
x is not in a happens-before relationship with a read of x , we have a data race in which the
value read may be stale. In such cases, the interleaving model is not sufficient and more refined
analysis must be used. The memory model of C++ even goes as far as stating that programs
with data races have undefined semantics [Wil12].
On the other hand, if a program does indeed avoid data races it is said to be correctly synchro-
nized. If so, its memory usage will be sequentially consistent and it may be safely analyzed using
the interleaving model as presented here. To check for correct synchronization, the following
rule of thumb may be of use:

Data Race Freedom


In a program, if, for every shared variable x , either
• x is declared as volatile/atomic, or

• x is accessed only within critical sections protected by a specific lock,


then the program is free from data races.

The rule is sufficient, but there may be cases where freedom from data races may be obtained
by less explicit synchronization.
Note that data race freedom only ensures that the reads and writes of simple variables are
globally ordered. It does not guarantee that the program does not suffer from race conditions
at a higher level.
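The second clause of the rule can be illustrated concretely. The sketch below is in Python (chosen here only as an illustration language; its threading.Lock plays the role of the specific lock): every access to the shared variable x happens inside a critical section guarded by one lock, so the acquire/release pairs induce happens-before edges and no increment is lost:

```python
import threading

# Shared variable x, accessed only inside critical sections guarded by
# one specific lock (the second clause of the data-race-freedom rule).
x = 0
lock = threading.Lock()

def worker(n):
    global x
    for _ in range(n):
        with lock:        # acquire ... release: happens-before edges
            x = x + 1     # read and write of x are race-free

threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(x)   # → 40000: no increment is lost
```

Note that, as the last paragraph above says, this only orders the accesses to x; higher-level race conditions are a separate concern.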

2.6 Bibliographical Notes

The notion of transition system has been the favourite, basic model of concurrency for the last 50
years because it trades concurrency for non-deterministic sequential behaviour which is simpler
to deal with.
A generalization of parallel composition of transition systems is to allow common actions that
call for true synchronization of processes. Most of the so-called process algebras originating from
[Hoa85, Mil89] are based on this interaction principle.
The form of transition diagrams is taken from [MP91].
As presented here, the model relies on the existence of atomic read and write operations on
simple data types. It is possible, however, to rely on weaker notions, taking constraints on
concurrent access into account (e.g. a single read concurrent with a single write). For modern
presentations of such algorithms, see [HS08] and [Ray13].
The idea of defining a dedicated memory model for a high-level language was first introduced
with the Java language in 1995. The definition, however, was considered faulty and it was
subsequently revised [JMM]. Following Java, several other languages have provided a memory
model like C# [Ost12] and C++ [Wil12].
Chapter 3

Safety Properties and Invariants

3.1 Introduction

Concurrent programs differ from sequential ones by being reactive, i.e. expressing an (often
infinite) activity rather than the computation of a final result. Thus, concurrent programs are not
characterized by their input/output relation, but rather by their behaviour.
Informally, properties of concurrent programs are divided into:

Safety properties that ensure that the program does nothing wrong.

Liveness properties that ensure that the program does make progress. Paired with the
safety properties, these imply that the program does something good.

In this chapter we focus on safety properties and how they can be expressed using the notion of
invariants.
Safety properties are the basic properties of a concurrent program. Typical examples of safety
properties are:

• Two processes never try to use the same resource at the same time.

• The program cannot deadlock.

• The results calculated by the program will be correct.

• The variable x never decreases.

• The light is not turned off unless the red button has been pressed.

If we assume that properties are given by boolean functions φ[α] of program executions α, we
can characterize safety properties by [Sch97]:

A property φ is a safety property if for any execution α that does not satisfy φ, there
is an execution prefix β of α such that for any further execution γ, φ does not hold
for βγ.


In words, this means that if some safety property does not hold for a given execution, then we
can identify a point at which “it goes wrong” (after β) and from that point, there is no way (γ)
the situation can be remedied.
Fortunately, many important safety properties can be specified just by giving a state predicate
that characterizes the “good states” that the execution always should stay within. Such state
predicates that must always be satisfied are called invariants.
Using history variables¹, any general safety property can be expressed by an invariant. Therefore
invariants form an important basis for correctness of concurrent programs. In this chapter, we
show how to prove invariants of concurrent programs using an inductive technique.

3.2 Notation
Definitions

To define X to denote the value of a given expression, we generally use the notation:

X = "expression"

3.2.1 State Predicates

Formal statements will be expressed in usual (first order) predicate logic. That is, we can build
logical statements about the state of a program using variables, functions and relations. These
may again be combined using the boolean connectives:
¬ negation (also seen as ∼)
∨ or
∧ and
⇒ implies (also seen as ⊃, →)
⇔ equivalence (also seen as ≡)
and quantification:
∀x : p(x ) For all x , p(x ) holds.
∃x : p(x ) There exists an x for which p(x ) holds.
By convention, the operators bind in the order shown (¬ binding strongest).
Given a state that assigns values to variables, the truth of a predicate can be defined. We use
the notation:
[[p]]s
to denote the truth value of predicate p in state s. Thus, [[p]] is a function from states to
{true, false}. This function can be defined inductively over the structure of p using the
standard interpretations of operators and relations.
Correctness properties of sequential programs will occasionally be specified using Hoare triples:
{P } S {Q }
expressing that if the program part S is started in a state satisfying P , Q will hold for the end
state when (and if) the execution of S terminates. For a more thorough treatment of Hoare
triples for concurrent programs, see e.g. [And00, Sch97].
¹ History variables are auxiliary variables recording parts of the execution history.

3.2.2 Control Predicates

To express properties of program behaviour, it is convenient to be able to refer to the control
part of the state. We assume that each process is modelled by a transition diagram as described
in Chapter 2. The control variable associated with a process Pi will be denoted πi . We only
refer to the control variables through a fixed number of control assertions. Given that l is a
control location in the transition diagram of process Pi we introduce:


at l = πi = l Pi has reached l.

We may also refer to an action:


at a = πi = preloc(a) Pi has reached a.

For processes described textually, it is similarly possible to express that the control pointer
is within a certain program construct or is right after a program construct. For a program
construct S in process Pi we may express:

at S Pi is at the beginning of S .
in S Pi is in S .
after S Pi is right after S .

at S is defined only if S has a unique first control location l and is then equivalent to at l.
Correspondingly, after S is defined only if S is always left by going to a unique control location
m outside S and is then equivalent to at m. If S comprises the atomic actions a1 , a2 , . . . , an ,
in S is formally defined by:

in S = at a1 ∨ at a2 ∨ . . . ∨ at an

For an atomic action a, we thus have:

in a ⇔ at a

In general we have that being at some construct implies being in the construct:

at S ⇒ in S

To avoid ambiguity, S may be labelled (L : S ) and the label be used in control predicates: at L,
in L, and after L.
For composite constructs there will, of course, be a tight correspondence between the control
predicates for its constituents. For instance, for l : (S1 ; S2 ; S3 ) we have:

at l ⇔ at S1
after S1 ⇔ at S2
after l ⇔ after S3
in l ⇔ in S1 ∨ in S2 ∨ in S3

When modelling textual programs by transition diagrams, we will avoid “empty” actions that
merely transfer control. E.g. for a loop w : while ⟨ B ⟩ do ⟨ S ⟩ we have:

at w ⇔ at ⟨ B → ⟩
after ⟨ S ⟩ ⇔ at w
after w ⇔ after ⟨ ¬B → ⟩

For sequentially composed program constructs S1 ; S2 ; . . . ; Sn we shall use the notation:



in Si ..Sj = in Si ∨ in Si+1 ∨ . . . ∨ in Sj

For a more formal treatment of control predicates for a textual language, we refer to [MP91].
Normally each process will have its own description, but we also allow a single description being
executed by several processes. In that case, we need to express which process is where. We do
that by qualifying the predicate with the process identification:

Pi at l = πi = l

3.2.3 History Variables

To formalize a property or make a proof, it may be necessary to introduce auxiliary history
variables that record information about the program execution that cannot be deduced from
the program's variables. History variables are also known as ghost variables.
History variables may appear only in assignments to history variables. Thereby, it is ensured that
they do not influence the program behaviour. History variables may occur in proof assertions,
of course.
Assignments to both local and shared history variables may be assumed being executed together
with preceding or following atomic actions, since these assignments are not to be implemented.

3.3 The Invariant Concept

The notion of invariance that we want to capture is as follows:

An invariant for a concurrent program is a state predicate I that is always satisfied,
i.e. the predicate holds for any state in any execution of the program.

To make this more precise, we shall assume that the behaviour of a concurrent program is
modelled by a labelled transition system reflecting the arbitrary interleaving of atomic actions.
We recall some notions of transition systems from Section 2.3.3:
An execution of a labelled transition system (Σ, A, T , s0 ) is a finite or infinite sequence
s0 −a0→ s1 −a1→ s2 −a2→ · · ·

where s0 is the initial state and (si , ai , si+1 ) ∈ T for every i .


Any state s that appears in some execution of TS is said to be reachable. The set of reachable
states is denoted by Reach(TS ).

Since the possible executions are given by the possible interleavings of actions, saying that I
must always be true amounts to saying that I must hold for all reachable states.
Definition 2 A state predicate I is an invariant of a labelled transition system TS iff:

For all s ∈ Reach(TS ) : [[I ]]s

From a set-theoretic point of view, we may thus characterize the invariant concept as follows:
Let the set of reachable states be denoted by R and the set of states satisfying I by I. That
I is an invariant of the program can thus be expressed as R ⊆ I. This is shown in figure 3.1,
where the outermost frame indicates the state space of the program, i.e. all conceivable program
states. The set R is characterized by the existence of a path of actions from the initial state s0
to any element of R and by the fact that no actions lead out of the set.

[Figure 3.1: The Invariance Concept. The outermost frame is the full state space; the set R of reachable states, containing the initial state s0 , lies inside the set I of states satisfying the invariant.]

3.4 Proving Invariants

In this section, we present a general technique to prove invariants of concurrent programs. In
the following section, we then illustrate the technique on the kind of problems for which it is
most often used in its general form, viz. to prove safety properties of programs that synchronize
by busy-wait. In later parts of these notes, we present more specialized techniques associated
with different synchronization mechanisms.
From the definition of invariants, we see that in principle we may prove an invariant by
(systematically) enumerating all the reachable states and checking that I is satisfied for each of them.
This technique is known as model-checking. There exist programs such as Spin [Hol97] that
may model-check systems with quite large, but finite, state spaces.
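The idea can be illustrated by a minimal explicit-state checker. The sketch below (in Python, as an illustration only; the strict-alternation protocol is a made-up toy example) enumerates Reach(TS ) and checks a mutual-exclusion invariant in every reachable state:

```python
from collections import deque

# A toy strict-alternation protocol (illustrative only): each process is
# either non-critical ('n') or critical ('c'); turn says who may enter.
def successors(s):
    pc1, pc2, turn = s
    succ = []
    if pc1 == 'n' and turn == 1: succ.append(('c', pc2, turn))  # P1 enters
    if pc1 == 'c':               succ.append(('n', pc2, 2))     # P1 leaves, turn := 2
    if pc2 == 'n' and turn == 2: succ.append((pc1, 'c', turn))  # P2 enters
    if pc2 == 'c':               succ.append((pc1, 'n', 1))     # P2 leaves, turn := 1
    return succ

def check_invariant(s0, successors, inv):
    """Enumerate Reach(TS) and check inv in every reachable state."""
    seen, frontier = {s0}, deque([s0])
    while frontier:
        s = frontier.popleft()
        if not inv(s):
            return False, s          # a reachable counterexample state
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return True, None

mutex = lambda s: not (s[0] == 'c' and s[1] == 'c')
print(check_invariant(('n', 'n', 1), successors, mutex))   # → (True, None)
```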
Model-checking cannot be used in general, however, since R easily may be much larger than the
tools can cope with, often infinite. In such cases two techniques can be applied:

• Reducing the infinite state space to a finite one by program abstraction where only the
parts of the program relevant for the invariant at hand are represented.

• Proving the invariant by mathematical induction.



Often the two techniques are used in combination. In this chapter, we focus on the inductive
technique.
To prove an invariant in general, we can use the following observation:

If I holds for the initial state and if it can be shown that none of the actions can
possibly falsify I , it follows by induction over the length of an execution, that the
invariant will hold after execution of an arbitrary sequence of actions. From this we
conclude that any state during any interleaving sequence will satisfy I , i.e. I is an
invariant of the program.

This observation is the basis of the following inductive proof technique:

The inductive invariance technique


Let there be given a concurrent program with atomic actions. To prove that I is an
invariant of the program it suffices to show:
1. That I holds for the initial state.

2. That any atomic action a of the program preserves I , i.e. for any state s for
which I is satisfied, it is either the case that:

• a cannot be executed in s, or
• the execution of a in state s results in a state s ′ that again satisfies I .

The satisfaction of point 1. usually follows immediately from the initialization. For point 2. we
must examine the effect of each action under the assumption that I holds and that the action is
actually executed, i.e. that control has reached the action and its condition (if any) is satisfied.
When the effect is investigated, attention must be paid to the fact that the execution changes
the values of the control predicates. For an action of the form h B → x := e i leading from
control location l to m in a process Pi we formally have to show the validity of the following
Hoare triple:
{I ∧ at l ∧ B } (x , πi ) := (e, m) {I }
In practice, a more informal, but systematic, argument can be used to show that each action
preserves the invariant. In particular, the investigation should focus on the potentially dangerous
actions, i.e. actions that may change the value of variables or control predicates that appear in
the invariant. The method is illustrated in section 3.5.
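For a finite-state program, the two proof obligations can also be checked mechanically. The following Python sketch does so for a small, hypothetical lock-based program; the state encoding, action names, and candidate invariants are ours, purely for illustration:

```python
from itertools import product

# A small, hypothetical lock-based program: two processes that take a
# lock before entering their critical section.  State: (pc0, pc1, lock).

def enter(i):
    guard  = lambda s: s[i] == "idle" and s[2] == 0
    effect = lambda s: tuple("crit" if j == i else s[j] for j in (0, 1)) + (1,)
    return guard, effect

def leave(i):
    guard  = lambda s: s[i] == "crit"
    effect = lambda s: tuple("idle" if j == i else s[j] for j in (0, 1)) + (0,)
    return guard, effect

actions  = [enter(0), leave(0), enter(1), leave(1)]
universe = list(product(["idle", "crit"], ["idle", "crit"], [0, 1]))
init     = ("idle", "idle", 0)

def inductive(I):
    if not I(init):                       # condition 1: I holds initially
        return False
    for s in universe:                    # condition 2: every action preserves I
        if I(s):
            for guard, effect in actions:
                if guard(s) and not I(effect(s)):
                    return False
    return True

mutex = lambda s: not (s[0] == "crit" and s[1] == "crit")
# Mutual exclusion strengthened with the lock state:
strong = lambda s: mutex(s) and (s[2] == 1) == (s[0] == "crit" or s[1] == "crit")

print(inductive(mutex))    # False: mutual exclusion alone is not inductive
print(inductive(strong))   # True:  the strengthened predicate is inductive
```

Note that the bare mutual-exclusion predicate fails condition 2: the state ("idle", "crit", 0) satisfies it, yet allows a step to a violating state. Only the version strengthened with the lock information is inductive, which is exactly the phenomenon discussed in section 3.4.1.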

3.4.1 Invariant Strengthening

The invariance technique is based upon a very chaotic view of the execution of the program
actions, viz. that the only constraint on which actions may be executed at a given time is given
by the information in I . This view fits well when considering actions from different processes
which may normally be arbitrarily interleaved, but the technique does not cater for the fact that
actions from the same process are actually performed in a certain order. Likewise, the technique
does not consider that actions from different processes may be synchronized as a result of the
process interaction.

Thus, it should come as no surprise that there exist predicates that are invariants, but are not
inductive, i.e. cannot be shown to satisfy condition 2. of the invariance technique. That is, one
may find a state s, and an action a such that I holds in s, but does not hold after execution of
a. This seeming violation of the invariant is illustrated in figure 3.2.

[Diagram: the set of states satisfying I contains the set R of reachable states (with initial state s0 ); the action a leads from a state s satisfying I , but outside R, to a state that does not satisfy I .]

Figure 3.2: The action a does not preserve I

As can be seen from the figure, the reason that I is an invariant anyhow is that the “dangerous”
state s will not occur in any program execution, i.e. s ∉ R. There are two ways to strengthen
the invariant to eliminate the offending state.

Very often we will be able to utilize a previously shown invariant H . If the offending state s does
not satisfy H , we know that the program will never reach s, and a does not have to preserve I
in that case. This is illustrated in figure 3.3(a).

[Diagram (a): a previously shown invariant H containing R cuts away the offending state, so a need only be examined for states satisfying both I and H . Diagram (b): a stronger invariant J , with R ⊆ J ⊆ I , is proved instead.]

Figure 3.3: Strengthening of the invariant

In some cases, it is not possible to first show another invariant that eliminates the offending
states, since an attempt to do so requires the information in I . In such cases, one may try to find
a stronger invariant J (i.e. J ⇒ I ) and prove this instead. Such a technique, where one proves
“more than necessary”, is also known from standard mathematical induction. The technique is
illustrated in figure 3.3(b).

Finally, one may be lucky that the desired invariant I follows directly from a set of invariants
already proven. This can often be shown by contradiction, i.e. by showing that the assumption
¬I leads to a contradiction with the given invariants.

3.4.2 Auxiliary Invariants

As described above, it is often possible to show a desired invariant I by proving a sequence of
auxiliary invariants I1 , I2 , . . . , each of which may be used in the proof of the subsequent ones.
There are two kinds of invariants that we shall consider as immediate, meaning that they usually
do not need further argumentation:

Local invariants for a process Pi , i.e. invariants in which only control predicates for Pi and
variables only changed by Pi occur. Such invariants are often shown by inspection of the
transition diagram or by usual sequential proof techniques.

Value invariants. By these we understand predicates that assert that a given variable may
only take certain values within its type. If the only assignments made to a variable x are
of the form x := c, where c is a constant, we may immediately assume the invariant:

x = cinit ∨ x = c1 ∨ . . . ∨ x = cn

where cinit is the initial value of x and c1 , . . . , cn are the constant values assigned to x in
the program.

For all other invariants a systematic argument based upon the invariance technique is needed.

3.4.3 Operational Proofs

By operational proofs, we understand argumentation for program properties based upon “what
will happen” during particular executions of the program. While this approach may be acceptable
for a deterministic, sequential program, it is totally unacceptable for concurrent programs. This
is, of course, partly due to the excessive number of possible executions (interleavings) and partly
due to the fact that the argument is at best based upon what is thought to be the “most critical
executions”. Many erroneous concurrent programs have been “proved” this way.
Another kind of operational proof is seen in the following technique: One assumes that I at some
time does not hold and traces the execution back to the point where it “went wrong” and then
proves that the actions violating the invariant could actually not be performed. Even though
this form is more acceptable, experience shows that it is difficult to go backwards in time to find
all possible ways to get to a certain state. Since in the end, the actions checked are the same as
in the invariance technique, the latter is recommended.

3.5 Invariants for Busy-wait Synchronization

By busy waiting we understand synchronization obtained exclusively by changing and polling
shared variables.
To prove invariants for programs using busy wait, we model the program at the level of atomic
actions. Hereby, we get a model in which the processes will exclusively contain actions of the
forms:
⟨ x := e ⟩     assignments without conditions
⟨ B → ⟩       conditional actions with no effect on the variable state (test actions)

An invariant may then be proven by applying the invariance technique more or less formally.
Below, we first show an informal, but systematic, approach.

First, we concentrate the investigation on the actions that, from a purely syntactic view, have
the potential to change the truth of the invariant. These potentially dangerous actions comprise
actions that assign to variables occurring in the invariant as well as actions that lead to or from
control locations mentioned in control predicates within the invariant.

Often some of the potentially dangerous actions may be eliminated if we can ascertain that the
invariant will trivially hold after their execution. In particular, if the invariant is of the form:
p⇒q
it will hold after actions that make p false or q true.

3.5.1 Peterson’s Algorithm

We shall illustrate the method by an example:

Example

We wish to prove mutual exclusion for Peterson’s algorithm (see also [And00, Fig. 3.6]).
For two processes, PA and PB , the algorithm is:

var tryA , tryB : boolean;
    turn : (A, B );
tryA := false; tryB := false; turn := A;

process PA ; process PB ;
repeat repeat
Non-CritA ; Non-CritB ;
tryA := true; tryB := true;
turn := B ; turn := A;
WA : while tryB ∧ turn = B do ; WB : while tryA ∧ turn = A do ;
CritA ; CritB ;
tryA := false tryB := false
forever forever

where CritA and CritB are the two critical sections. We assume that tryA , tryB , and turn
are not changed in the non-critical or critical sections.
To clearly delineate the atomic actions of the program, the corresponding transition diagrams
are shown in figure 3.4. As the conditions of the two while-statements each contain two
critical references, the tests are not atomic and must be refined. Here we have assumed
that the conditions are evaluated from the left, as shown. The critical and non-critical
sections are here represented as several anonymous actions, but could also have been modelled
as single actions.

[Figure: transition diagrams for PA and PB at the level of atomic actions. Each process cycles through Non-Crit; tryA := true (resp. tryB := true); turn := B (resp. turn := A); the refined waiting test WA (resp. WB ), which first tests the other process's try flag and, if it is set, tests turn, looping back as long as turn favours the other process; Crit; and finally tryA := false (resp. tryB := false).]

Figure 3.4: Peterson's algorithm

As tryA and tryB are changed only by PA and PB respectively, we immediately get from
the transition diagrams the following local invariants:

I1 = in WA ..CritA ⇒ tryA

I2 = in WB ..CritB ⇒ tryB

By being in WA (correspondingly WB ) we mean being at the start of the while-statement or at
the sub-test as shown in figure 3.4.
Furthermore, we immediately get the value invariant:

I3 = turn = A ∨ turn = B

Formally, mutual exclusion is expressed as:



I = ¬(in CritA ∧ in CritB )

We first try to show that I is an invariant by direct use of the invariance technique.
I is easily seen to hold initially, as none of the processes are in their critical section.
The potentially dangerous actions are those leading to and from the critical sections. It
is obvious that the invariant becomes trivially satisfied if PA leaves CritA and likewise for
PB . What remains to be examined are the actions entering the critical sections. We first
look at the actions in PA . From the transition diagram, we find two actions entering CritA .
We now try to show that they preserve I :

1. PA enters CritA finding tryB to be false. But this cannot be the case if PB is in
CritB , cf. I2 . So if PB is in CritB , this action cannot be executed, and if PB is not in
CritB before this action, it will not be after the action either, so the invariant holds
trivially. This action thus preserves the invariant in all cases.
2. PA enters CritA finding turn ≠ B . By I3 , we then have turn = A, but this is not
enough to exclude PB being in CritB before the action, and thereby I becoming false
by this action.

Even using I1 , I2 , and I3 it is not possible to show the invariance of I ; we need further
information about the program behaviour.
By studying the program and identifying the information missing in point 2. above, we are
led to prove that if PA is in WA and PB is in CritB , then turn has the value B . Adding the
symmetrical property, we thus try to show the following auxiliary invariants:

Ia = in WA ∧ in CritB ⇒ turn = B

Ib = in WB ∧ in CritA ⇒ turn = A

The potentially dangerous actions for Ia are those that enter and leave WA and CritB as
well as those changing turn. We see that leaving WA or CritB makes the left-hand side of
the implication false, whereby Ia becomes trivially true. We are thus left to consider the
following actions in PA and PB (since turn is assumed not to be changed anywhere other
than shown):

• PA enters WA by executing ⟨ turn := B ⟩. After the execution, the right-hand side of
Ia is true, so Ia is trivially preserved.
• PB enters CritB finding tryA to be false. But according to I1 , PA then cannot be
in WA before the action, and thus not after the action. The left-hand side of the
implication is therefore false after the action, so Ia holds trivially.
• PB enters CritB finding turn ≠ A. From I3 it follows that turn = B before the action.
Since turn is not changed by the test action, turn = B holds after the action as well,
hence Ia is preserved.

We have thus shown that Ia is an invariant for the program. By symmetry, we see that Ib
must be an invariant too.
We may now finish point 2. of the argument that PA preserves I :

2′ . PA enters CritA finding turn ≠ B . Before the action we have in WA . Thus, PB cannot
be in CritB before the action, since according to Ia that would imply turn = B , contradicting
the test condition. Therefore PB is not in CritB after the action, so I holds.

As I is symmetric, we may likewise show that all actions in PB preserve I .


Since I holds initially and is preserved by all actions, we conclude that I is an invariant of
the program. Thus, Peterson's algorithm ensures mutual exclusion (something that may
not be obvious at first glance).
To prove I , it was necessary to show the auxiliary invariants Ia and Ib as well as the
immediate invariants I1 , I2 , and I3 .
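The inductive proof above can be cross-checked by brute force: since the program is finite-state, we can enumerate all reachable states at the granularity of the atomic actions of figure 3.4 and test I , Ia , and Ib directly. A hypothetical Python encoding (the location names "ncs", "st", "w", "w2", and "crit" are ours):

```python
from collections import deque

# State: (pcA, pcB, tryA, tryB, turn).  Locations: "ncs" (non-critical),
# "st" (about to set turn), "w" (test other's try flag), "w2" (test turn),
# "crit" (critical section).  "in WA" corresponds to pcA in {"w", "w2"}.

def succ(s):
    pcA, pcB, tryA, tryB, turn = s
    out = []
    if pcA == "ncs":  out.append(("st", pcB, True, tryB, turn))    # tryA := true
    if pcA == "st":   out.append(("w", pcB, tryA, tryB, "B"))      # turn := B
    if pcA == "w":    out.append(("w2" if tryB else "crit", pcB, tryA, tryB, turn))
    if pcA == "w2":   out.append(("w" if turn == "B" else "crit", pcB, tryA, tryB, turn))
    if pcA == "crit": out.append(("ncs", pcB, False, tryB, turn))  # tryA := false
    if pcB == "ncs":  out.append((pcA, "st", tryA, True, turn))    # tryB := true
    if pcB == "st":   out.append((pcA, "w", tryA, tryB, "A"))      # turn := A
    if pcB == "w":    out.append((pcA, "w2" if tryA else "crit", tryA, tryB, turn))
    if pcB == "w2":   out.append((pcA, "w" if turn == "A" else "crit", tryA, tryB, turn))
    if pcB == "crit": out.append((pcA, "ncs", tryA, False, turn))
    return out

init = ("ncs", "ncs", False, False, "A")
seen, todo = {init}, deque([init])
while todo:                          # breadth-first exploration of reachable states
    for t in succ(todo.popleft()):
        if t not in seen:
            seen.add(t)
            todo.append(t)

W = {"w", "w2"}
I  = all(not (pA == "crit" and pB == "crit") for pA, pB, _, _, _ in seen)
Ia = all(turn == "B" for pA, pB, _, _, turn in seen if pA in W and pB == "crit")
Ib = all(turn == "A" for pA, pB, _, _, turn in seen if pB in W and pA == "crit")
print(I, Ia, Ib)   # all three invariants hold in every reachable state
```

This confirms the invariants but, unlike the inductive argument, gives no insight into why they hold, and it works only because the state space is finite.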

3.5.2 A Formal Notation

As it emerges from the above proof of Peterson’s algorithm, an informal, but systematic proof
that any action preserves a predicate I may become a long story. In this section we propose a
formal notation that may be used to express these arguments.
We have already mentioned that the proof obligation for an action ⟨ B → x := e ⟩ leading from
control location l to m in a process Pi is given by the pre/post specification:

{I ∧ at l ∧ B } (x , πi ) := (e, m) {I }

To prove this, we shall use a notation that is closer to the informal argumentation, which
operates with before and after values of variables and control predicates. In the context of the
execution of a particular action, we will let the usual variable names and control predicates refer
to the values in the state before the execution of the action, and let names and control predicates
marked with ′ denote the values in the state after the execution. For instance:

x        The value of x before the execution.
y ′      The value of y after the execution.
in S ′   The value of in S after the execution.

Further, we will use p ′ to indicate that all program variables x and control predicates are to be
substituted with their post-versions x ′ in the assertion p.
Using this notation, what we formally have to show for an action ⟨ B → x := e ⟩ leading from
control location l to m is:

I ∧ B ∧ at l ∧ “effect of action” ⇒ I ′

That is, if I holds before the execution and the conditions for the execution are satisfied (B ∧
at l ), then from the effect of the execution we should be able to prove that the invariant holds
in the after-state, I ′ . The effect of the action is given by:

x ′ = e ∧ at m ′ ,
y′ = y for all variables y different from x ,
q′ = q for all control predicates q not involving l or m.

This notation may now be used in a proof where, for each potentially dangerous action, a number
of steps record an argument for I ′ to hold. In such a proof, one may freely exploit the above
condition and effect assertions as well as already shown invariants. The individual proofs may
be gathered into a scheme as shown below:

Example

We will repeat the proof of Ia from the example above, this time using the formal notation.
For sake of completeness, the definition of Ia is repeated here:

Ia = in WA ∧ in CritB ⇒ turn = B

Initially we have ¬in CritA ∧ ¬in CritB ⇒ Ia .

As before, we see that leaving WA or CritB immediately makes the invariant true. We are
thus left with the following potentially dangerous actions for Ia :

Process   Action         Proof           Argument

PA        turn := B      Ia ′            turn ′ = B

PB        turn := A      ¬in CritB ′     at WB ′
                         Ia ′

PB        ¬tryA →        ¬in WA          Cond., I1
                         ¬in WA ′        in WA ′ = in WA
                         Ia ′

PB        turn ≠ A →     turn = B        Cond., I3
                         turn ′ = B      turn ′ = turn
                         Ia ′
Since all actions preserve Ia , it is an invariant of the program.

As it appears, for each potentially dangerous action in each process, we state:

1. The name of the process to which the action belongs. Can be used to distinguish otherwise
identical actions in different processes.

2. The action itself. Either given as it appears, by a label, or by an informal description. To
avoid informality, all actions could be labelled.

3. A proof consisting of a number of proof steps concluding with the post-version of the
invariant, I ′ . For each step, it is noted how the step follows from the preceding one. This
argument may be a reference to the condition of the action (Cond.), an obvious effect of
the action, a previously shown invariant, or some combination of these. Possibly there may
be a reference to a more elaborate argument. If there is a need to refer to previous steps, all
steps may be numbered, but where a step follows from the immediately preceding one (as
is usually the case) we omit the numbering. The argument for the last step is usually that
I ′ follows from the second-to-last step by using the invariant definition. This standard
argument is usually omitted too.

Notice that for the argument of a step, it is legal to refer to a previously shown invariant H
as well as its post version H ′ . One may also refer to (the pre-version of) I , but not to the
post-version I ′ , of course.

3.6 Invariants for Semaphores

For semaphores (and other synchronization primitives), the operations on them may be
modelled as conditional atomic actions. For instance, the operations V and P may be seen as
the actions:
V(S )   corresponds to   ⟨ s := s + 1 ⟩
P(S )   corresponds to   ⟨ s > 0 → s := s − 1 ⟩

where s is the semaphore value.

Proving invariants for semaphore based programs could then be carried out as shown above. Of-
ten, however, one can give a simpler argument based on the semaphore synchronization property
expressed by:

The Semaphore Invariant


For any semaphore S the following holds.

s = s0 + #V(S ) − #P(S )
s ≥ 0

where
s is the semaphore value
s0 is the initial value (≥ 0)
#V(S ) is the number of completed V-operations
#P(S ) is the number of completed P-operations

Thus, the semaphore value is defined to be the initial value plus the number of completed
V-operations in excess of the number of completed P-operations. The implementation of the
semaphore must ensure that this value never becomes negative.

Often the value of s is not used by itself, only that it is non-negative. This leads to a simplified
version of the semaphore invariant:

#P(S ) ≤ s0 + #V(S )

Example

We are given two processes PA and PB performing two operations opA and opB respectively.
We want to synchronize the two processes in such a way that opA and opB are executed in
step, i.e. such that the number of completed executions of the two operations never differ
by more than one. Below is a proposed solution to this problem using semaphores:

var SA , SB : semaphore;
SA := 0; SB := 0;

process PA ; process PB ;
repeat repeat
V(SB ); V(SA );
P(SA ); P(SB );
.. ..
. .
opA ; opB ;
.. ..
. .
forever forever

We are now going to show the correctness of this solution. If the numbers of completed
executions of opA and opB are called a and b respectively, we formally have to prove that

|a − b| ≤ 1

is an invariant. We argue for this by using the structure of the processes combined with
the semaphore invariant:
From the program structure of PA , we find that before each execution of P(SA ), PA must
have performed V(SB ). Since these operations appear only in PA , the following must be a
program invariant:
#P(SA ) ≤ #V(SB )
For the number of executions of opA we may likewise conclude:

a ≤ #P(SA )

and
#V(SB ) ≤ a + 1
Correspondingly, we may argue for the operations in PB . Altogether we have the inequalities:
a ≤ #P(SA ) ≤ #V(SB ) ≤ a + 1
b ≤ #P(SB ) ≤ #V(SA ) ≤ b + 1
Combined with the simplified semaphore invariants:

#P(SA ) ≤ #V(SA )
#P(SB ) ≤ #V(SB )

we immediately get:

a ≤b+1 and b ≤a +1

which precisely states that the numbers of operation executions cannot differ by more than
one.
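The lock-step property can also be checked experimentally by exploring all interleavings of the two processes up to a bound. In the following illustrative Python sketch, each process cycles through its V, P, and op steps, and a P-operation blocks while the semaphore value is zero:

```python
# State: (sA, sB, a, b, pcA, pcB) where sA, sB are the semaphore values,
# a, b count completed executions of opA and opB, and pc = 0, 1, 2 means
# the process is about to do its V, P, or op step respectively.

def succ(s):
    sA, sB, a, b, pA, pB = s
    out = []
    if pA == 0:            out.append((sA, sB + 1, a, b, 1, pB))   # V(SB)
    if pA == 1 and sA > 0: out.append((sA - 1, sB, a, b, 2, pB))   # P(SA), blocks if sA = 0
    if pA == 2:            out.append((sA, sB, a + 1, b, 0, pB))   # opA
    if pB == 0:            out.append((sA + 1, sB, a, b, pA, 1))   # V(SA)
    if pB == 1 and sB > 0: out.append((sA, sB - 1, a, b, pA, 2))   # P(SB), blocks if sB = 0
    if pB == 2:            out.append((sA, sB, a, b + 1, pA, 0))   # opB
    return out

BOUND = 6                                  # at most 6 completed ops per process
seen, todo = set(), [(0, 0, 0, 0, 0, 0)]
while todo:
    s = todo.pop()
    if s in seen or s[2] > BOUND or s[3] > BOUND:
        continue
    seen.add(s)
    assert abs(s[2] - s[3]) <= 1, s        # the claimed invariant |a - b| <= 1
    todo.extend(succ(s))
print("explored", len(seen), "states; |a - b| <= 1 held in all of them")
```

Such bounded exploration only tests executions up to the bound; the inequality argument above covers all executions at once.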

3.7 Notes

In this chapter we have defined safety properties and seen how they can be expressed using
invariants that must be true at any time in any execution. Invariants play a central rôle for
concurrent programs, since they capture the relationships across the processes of the program.

The invariance technique presented here is similar to the use of global invariants as described in
Andrews’ book [And00], section 2.7.3. For a much more thorough and formalized approach to
the use of global invariants for concurrent programs and the inductive invariance technique, see
[MP91, MP95].
For practical verification of invariants and other program properties, many experimental verifi-
cation tools are available. Especially for concurrent programs, the model-checker Spin [Hol04]
has been used for many real applications.
The Spin homepage is at www.spinroot.com.
Chapter 4

Liveness Properties and Temporal Logic

4.1 Introduction

As stated in Chapter 3, concurrent programs differ from sequential ones by being reactive, i.e.
expressing an (often infinite) activity rather than the computation of a final result. Thus,
concurrent programs are not characterized by their input/output relation, but rather by their
behaviour.
Informally, properties of concurrent programs are divided into:

Safety properties that ensure that the program does nothing wrong.
Liveness properties that ensure that the program does make progress. Paired with the
safety properties, these imply that the program does something good.

Safety properties can be expressed as invariants (perhaps using auxiliary variables recording the
history) and proved by explicit model-checking or by inductive arguments. Safety properties
alone, however, are not sufficient to characterize a desired behaviour. A program that halts
immediately will satisfy any invariant that holds for the initial state.
To ensure desired behaviour, safety properties must be supplemented by properties like:

• A certain part of the program will be visited again and again.


• The variable x will never become constant.
• Any press on the button Go leads to the green lamp being lit.

In general, such liveness properties refer to two or more time points of the program execution
and thus cannot be captured by single-state properties like invariants. Instead, we may use the
language of Temporal Logic which is able to express properties of sequences of states.
If, as in Chapter 3, we assume properties to be boolean functions φ[a] on executions, liveness
properties can be characterized as follows [Sch97]:

A property φ is a liveness property if for any execution prefix β we can find a further
execution γ such that φ holds for βγ.


4.2 Temporal Logic

4.2.1 Model

We assume that a concurrent program is modelled by a transition system where concurrent exe-
cution is represented by arbitrary interleaving of atomic actions. Correspondingly, the behaviour
of the program is modelled by the set of executions that can be generated by the transition sys-
tem, where each execution is of the form:
s0 −a0→ s1 −a1→ s2 −a2→ · · ·

where s0 , . . . are states and a0 , . . . are atomic actions. Each state is understood to assign values
to a set of variables.
For any terminating execution ending in state sn , we may pretend that the execution goes on
for ever by appending an “idle” action i that repeats the final state:
· · · −an−1→ sn −i→ sn −i→ sn · · ·

Finally, we assume that we are interested in properties of the state only, such that the actions
can be omitted. Thus, we consider a program Prog to be characterized by a set Exec(Prog) of
infinite state sequences:
σ = s0 , s1 , s2 , s3 , . . . .

4.2.2 Temporal Formulas

A Temporal Logic is a logic which can express properties related to the progress of time. There
are numerous kinds of Temporal Logics depending on whether time is considered continuous or
discrete, whether points or intervals of time are considered etc. In these notes, we stay within
a simple logic which can express properties of state sequences. This logic is also known as the
future fragment of linear time temporal logic.
The basic idea of temporal logic is to indicate temporal relationships among states only implicitly
through a number of temporal operators.
A temporal formula is built from usual first-order predicates on the state plus the temporal
operators ✷ (always) and ✸ (eventually). The formal syntax is given by

Temporal Logic Syntax

P ::= p State-predicate
| P ∧Q Conjunction
| P ∨Q Disjunction
| ¬P Negation
| ✷P Always
| ✸P Eventually
| ∀x : P Universal quantification
| ∃x : P Existential quantification

where p is any usual first-order predicate.



In the following, we assume that the temporal formulas are closed, i.e. do not contain any free
variables (in the usual first-order logic sense).
In general, the truth value of a formula P is defined for a given point of time i in a given
execution σ. This truth value is denoted by (σ, i ) |= P . The point of time i is to be understood
as the current reference time (“now”).
Informally, the meaning of a temporal formula is given by:

• A state predicate p is read as “p holds now”.

• ✷P is read as “P holds from now on”.

• ✸P is read as “P holds eventually (now or in the future)”.

• All boolean operators and quantifiers retain their usual reading.

This reading is captured by the following formal definition:

Temporal Logic Semantics


We assume that the semantics of a state predicate p is given by [[p]], which maps
states to Bool, according to the usual mathematical interpretation of constants,
functions and relations.

(σ, i ) |= p = [[p]]si

(σ, i ) |= P ∧ Q = (σ, i ) |= P and (σ, i ) |= Q

(σ, i ) |= P ∨ Q = (σ, i ) |= P or (σ, i ) |= Q

(σ, i ) |= ¬P = (σ, i ) ̸|= P

(σ, i ) |= ✷P = ∀j ≥ i : (σ, j ) |= P

(σ, i ) |= ✸P = ∃j ≥ i : (σ, j ) |= P

(σ, i ) |= ∀x : P = (σ, i ) |= P [c/x ] for all constants c

(σ, i ) |= ∃x : P = (σ, i ) |= P [c/x ] for some constant c

Here p[c/x ] denotes the formula obtained by substituting the term c for any free
occurrence of the variable x .

We say that a formula P holds for an execution σ, denoted by σ |= P , if it holds initially:



σ |= P = (σ, 0) |= P

We say that a formula P is valid for a program Prog, denoted by Prog |= P , if it holds for any
execution of Prog:

Prog |= P = ∀σ ∈ Exec(Prog) : σ |= P
Finally, we say that a formula P is valid, denoted by |= P , if it holds for any state sequence:

|= P = ∀σ : σ |= P

To avoid too many parentheses, we make the convention that the temporal operators bind
more strongly than the logical ones. E.g. ✷P ∧ Q is interpreted as (✷P ) ∧ Q .
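The semantic definitions above carry over almost verbatim to finite state sequences. The following Python sketch is an illustration only: on a finite sequence, ✷ and ✸ quantify over the remaining indices, which approximates the infinite-sequence semantics. Formulas are represented as nested tuples and states as dictionaries (both representations are ours):

```python
# Formulas: ("pred", f), ("not", P), ("and", P, Q), ("or", P, Q),
# ("always", P), ("eventually", P).  holds(P, sigma, i) is (sigma, i) |= P.

def holds(P, sigma, i=0):
    op = P[0]
    if op == "pred":       return P[1](sigma[i])
    if op == "not":        return not holds(P[1], sigma, i)
    if op == "and":        return holds(P[1], sigma, i) and holds(P[2], sigma, i)
    if op == "or":         return holds(P[1], sigma, i) or holds(P[2], sigma, i)
    if op == "always":     return all(holds(P[1], sigma, j) for j in range(i, len(sigma)))
    if op == "eventually": return any(holds(P[1], sigma, j) for j in range(i, len(sigma)))
    raise ValueError(op)

# sigma |= P  is  (sigma, 0) |= P:
sigma = [{"x": 0}, {"x": 1}, {"x": 3}, {"x": 2}]
pos = ("pred", lambda s: s["x"] > 0)

print(holds(("eventually", pos), sigma))              # <> x > 0   : True
print(holds(("always", pos), sigma))                  # [] x > 0   : False (x = 0 now)
print(holds(("eventually", ("always", pos)), sigma))  # <>[] x > 0 : True
```

The last formula illustrates the suffix quantification: ✸✷(x > 0) holds because x > 0 holds from the second state onwards.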

Derived temporal operators

The basic operators ✷ and ✸ may be combined. Using the semantics of the operators, the
combinations may be read as follows:

• ✷✸P is read as “P holds infinitely often”.


• ✸✷P is read as “eventually P holds forever”.

As shown by the reduction laws below, further combinations may be reduced to the above two.
Another useful derived operator is leads-to that is introduced later.

4.2.3 Laws of Temporal Logic

Based on the semantics, a number of formulas can be shown to be valid, i.e. to hold for any state
sequence. In this section, we shall look at a number of these temporal laws.
First, we have some laws that express basic properties of the temporal operators.
¬✷P ⇔ ✸¬P (4.1)
✷P ⇒ P (4.2)
P ⇒ ✸P (4.3)
✷(P ⇒ Q ) ⇒ (✷P ⇒ ✷Q ) (4.4)
✷P ∧ ✸Q ⇒ ✸(P ∧ Q ) (4.5)
Here (4.1) states that ✷ and ✸ are dual operators that may be defined in terms of each other.
The laws (4.2) and (4.3) state that the operators are reflexive, i.e. include “now”.
The following distributive laws show that ✷ distributes like the universal quantifier and ✸ like
the existential one. In combinations, it is the innermost operator that determines the
distribution properties.
✷(P ∧ Q ) ⇔ (✷P ∧ ✷Q ) (4.6)
✸(P ∨ Q ) ⇔ (✸P ∨ ✸Q ) (4.7)
✸✷(P ∧ Q ) ⇔ (✸✷P ∧ ✸✷Q ) (4.8)
✷✸(P ∨ Q ) ⇔ (✷✸P ∨ ✷✸Q ) (4.9)

The reduction laws below show that any combination of the two temporal operators can be
reduced to ✷✸ (“infinitely often”) and ✸✷:
✷✷P ⇔ ✷P (4.10)
✸✸P ⇔ ✸P (4.11)
✷✸✷P ⇔ ✸✷P (4.12)
✸✷✸P ⇔ ✷✸P (4.13)
Finally, the following law is often used to divide into cases:
✷(P ∨ Q ) ⇒ (✷P ∨ ✸Q ) (4.14)
Thus, if we know that at any time, either P or Q holds, we may divide into the cases that P
holds forever, or, if not, P must be false somewhere and then Q must hold.

4.2.4 The Leads-to Operator

A very common form of liveness properties is:

If, at any time, the system is in a state satisfying p, eventually the system will be in
a state satisfying q.

Such a property is easily expressed in temporal logic as

✷(p ⇒ ✸q)

Since properties of this form are very common, we introduce a special leads-to operator:

P ❀Q = ✷(P ⇒ ✸Q )

Here P and Q may be arbitrary temporal formulas, although usually only state predicates are
used.
A number of laws can be proved for the leads-to operator:

✷(P ⇒ Q ) ⇒ (P ❀ Q ) (4.15)
(P ❀ Q ) ∧ (Q ❀ R) ⇒ (P ❀ R) (4.16)
(P ❀ R) ∧ (Q ❀ R) ⇒ ((P ∨ Q ) ❀ R) (4.17)
(P ❀ Q ) ∨ (P ❀ R) ⇒ (P ❀ (Q ∨ R)) (4.18)

To reduce the need for parentheses, we use the convention that ❀ binds more weakly than any
other temporal or boolean operator.
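On a finite state sequence, the definition P ❀ Q = ✷(P ⇒ ✸Q ) unfolds to a simple check: at every position where p holds, q must hold at that position or a later one. A small illustrative sketch in Python (the trace and location names are invented for the example):

```python
# leads_to(p, q, sigma): wherever p holds, q must hold at that or a
# later position of the (finite) sequence -- [](p => <>q) read finitely.

def leads_to(p, q, sigma):
    return all(any(q(s) for s in sigma[i:])
               for i, s0 in enumerate(sigma) if p(s0))

# An invented entry/critical-section trace for a single process:
trace  = ["ncs", "entry", "entry", "crit", "ncs", "entry", "crit", "ncs"]
states = [{"pc": loc} for loc in trace]

in_entry = lambda s: s["pc"] == "entry"
at_crit  = lambda s: s["pc"] == "crit"

print(leads_to(in_entry, at_crit, states))   # in entry ~> at crit : True
```

Every position where the process is in entry is followed (here at positions 3 and 6) by a position where it is in crit, so the property holds for this trace.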

4.3 Expressing Liveness Properties of Concurrent Programs

In this section we see how to express a number of general liveness properties of concurrent
programs using temporal logic. We also consider how to define the basic liveness properties of
a given programming language in order to prove these properties.
We assume that we are able to express properties of a single program state using a first-order
language of state predicates. Especially we assume that the control state is captured by control
predicates like in , at , and after referring to parts of the program. For an introduction to
control predicates, see Chapter 3 or [MP91].
For the rest of this note, we talk about formulas in the context of a given program. Thus, when
we say that a formula P holds, we understand that it is valid for the given program.

4.3.1 General Properties

In this part we look at various classes of properties expressible using temporal logic. These may,
as usual, be divided into safety properties and liveness properties, where the latter may be divided
into further subclasses, of which we shall consider responsiveness, persistence, and reactivity.

Since a temporal formula talks about all states of all executions, it is obvious that the fact that
I is an invariant for a program can be stated as:

✷I

Invariants, however, may be proven by simple inductive techniques as shown in Chapter 3 and
are thus not considered further here.

Turning to liveness properties, one class consists of formulas of the form ✸✷p. Such persistence
properties state that some state eventually stabilizes. For instance, we may express that a
variable X eventually gets a positive value by:

✸✷ X > 0

Often we use persistence properties as assumptions about the behaviour of processes or we show
persistence properties as part of contradictive proofs, as described below.

The class of responsive properties consists of properties that can be expressed using the leads-to
operator (on state predicates): p ❀ q. Typically we may express that a process wanting to enter
a critical region (in entry) will eventually get there (at crit):

in entry ❀ at crit

A simple form of response properties is that of recurrence properties of the form true ❀ q. This may
be rewritten to ✷✸q. A typical example is to state that a program always returns to a special
start-point. If at start denotes that control is at that point, the desired property is expressed
by:

✷✸at start

In general, response formulas are a very important class of liveness properties, and the technique
of proof lattices described below has been developed especially for proving properties of this class.

The most expressive class of liveness properties is that of reactivity properties. These are
properties that can be expressed by formulas of the form ✷✸p ⇒ ✷✸q (or, equivalently, ✷✸p ❀ q).
Typically these properties express strong fairness: that given sufficiently many chances, some
action will take place. For instance, a strongly fair semaphore can be captured by:

at P(S ) ∧ ✷✸(s > 0) ❀ after P(S )

where s is the value of semaphore S (see also section 4.6). Usually reactivity properties are used
to express fairness properties of language constructs and not properties of programs.

4.3.2 Liveness Properties of Program Constructs

In order to prove any liveness properties of a concurrent program, the program must possess
some basic liveness properties. For the rest of these notes, we assume the basic property to be
that of fair process execution¹:

Any sequential process will infinitely often be considered for execution.

By “considered for execution” we mean that it will be checked whether the process has any
actions enabled and, if so, one of them is chosen for execution. If the language has no conditional
actions, this corresponds to executing each process on a processor with positive speed.
For a particular language, the principle of fair process execution must be expressed as liveness
properties of the basic sequential language constructs.
For a simple language with assignment, sequence, if and while, fair process execution may be
expressed by the following axioms:

• If l : x := e is an (atomic) assignment, the following axiom expresses fair process execution:

at l ❀ after l

That is, every assignment statement is eventually executed.


• For an if-statement l : if B then S1 else S2 the basic property states that the test-
condition will be evaluated:
at l ❀ (at S1 ∨ at S2 )

• Correspondingly for a while-loop w : while B do S we have that the condition is
evaluated:
at w ❀ (at S ∨ after w )

Based on these axioms, we may prove a number of derived rules that combine the structural
properties of the program with the above axioms to obtain rules that can be readily applied. Examples
of such rules are:

• Since the language is without gotos, there is the single exit rule for any language
construct S :
in S ⇒ (✷in S ∨ ✸after S )

• If we know the sequential effect of an atomic assignment l : x := e, e.g. in terms of a Hoare
triple, we may assume this effect immediately after execution of the statement:
“{p} l : x := e {q}” ✷(at l ⇒ p)
at l ❀ after l ∧ q

• For sequential composition l : (S1 ; S2 ) we have the control flow rule:
at S1 ❀ after S1 at S2 ❀ after S2
at l ❀ after l
That is, if both S1 and S2 are known to terminate, so does their sequential composition.
¹ Fair process execution is equivalent to applying weak fairness to all atomic actions (see section 4.6).

• For an if-statement l : if B then S1 else S2 we have a similar termination rule:
at S1 ❀ after S1 at S2 ❀ after S2
at l ❀ after l
In order to determine the way the branch goes, the test-condition must have a stable value:
at l ∧ ✷B ❀ at S1 ∧ B
at l ∧ ✷¬B ❀ at S2 ∧ ¬B

• Similarly, for a while loop w : while B do S we must assume stable condition (at least
whenever the test is performed) in order to determine the branching:
at w ∧ ✷(at w ⇒ B ) ❀ at S ∧ B
at w ∧ ✷(at w ⇒ ¬B ) ❀ after w ∧ ¬B
From this, it is possible to derive simplified rules that can be applied directly.
For instance, we would often like to conclude that a loop is eventually left. This may be
shown using a derived rule like the while exit rule:
in S ❀ after S
in w ∧ ✷¬B ❀ after w
That is, if we have shown that the body S always terminates and we know that control
is within w and that B is forever false, then eventually control must reach the while-test
which is then going to fail.
Also we may want to show that a process gets stuck in w . In that case, the following
derived rule is useful:
in w ∧ ✷B ❀ ✷in w

For a real language, the rules of sequential control flow will, of course, become much more
complex due to gotos, exceptions etc. However, often the intuitive understanding of the control
flow is sufficient for rigorous reasoning.
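Mechanically, applying such derived rules amounts to chaining basic leads-to facts by transitivity. A small sketch (the fact set encodes a hypothetical program, not one from these notes):

```python
def leads_to_chain(facts, start, goal):
    """Decide start leads-to goal by transitive closure over a set of
    proven basic facts, each pair (a, b) standing for a leads-to b."""
    reach, changed = {start}, True
    while changed:
        changed = False
        for a, b in facts:
            if a in reach and b not in reach:
                reach.add(b)
                changed = True
    return goal in reach

# Termination of l : (S1 ; S2): both parts terminate, and control
# flows from "after S1" to "at S2"; hence at l leads-to after l.
facts = {("at l", "at S1"), ("at S1", "after S1"),
         ("after S1", "at S2"), ("at S2", "after S2"),
         ("after S2", "after l")}
assert leads_to_chain(facts, "at l", "after l")
```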

4.4 Proof Lattices


In this section, we present a technique for proving leads-to properties.
A lattice is a directed acyclic graph with one source and one terminal node. A proof lattice is a
lattice in which the nodes are labelled by temporal formulas.
A proof lattice represents a number of temporal formulas; one for each node. For each node
labelled by R with successor nodes labelled by R1 , R2 , . . . , Rn , the corresponding “branching”
formula is²:
R ❀ R1 ∨ R2 ∨ . . . ∨ Rn
Formally, the proof lattice represents the conjunction of these “branching” formulas. By induc-
tion over the height of the lattice, using the distributive laws for leads-to, it follows that
P ❀Q
where P is the source label and Q is the terminal label.
Thus, if all the "branching" formulas hold for a given program, we may conclude that the program
has the property P ❀ Q .
² For the terminal node, this formula degenerates to true.

Example

A lattice with source P , intermediate nodes R1 , R2 , R3 , and terminal Q , in which P
branches to R1 and R2 , R1 leads to R3 , R2 branches to Q and R3 , and R3 leads to Q ,
represents the formulas:

P ❀ R1 ∨ R2
R1 ❀ R3
R2 ❀ Q ∨ R3
R3 ❀ Q

A proof lattice may be seen as a way of structuring an argument going through all principal
execution paths from a given state. The lattice decomposes the argument into a number of
proof obligations (the “branching” formulas) that may be verified separately using the temporal
properties of the program constructs. The proof obligation should be simple enough to be
provable by the rules for sequential control flow combined with previously shown invariants for
the program. This means that each proof obligation should deal with the progress of only one
of the sequential processes.
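The bookkeeping part of this technique is easily mechanized: given the lattice as a graph, the proof obligations can be generated directly. A sketch (the dictionary encoding is our own convention):

```python
def branching_formulas(lattice):
    """Given a proof lattice as a successor map {node: [successors]},
    return the 'branching' proof obligations R leads-to R1 or ... or Rn.
    The terminal node (no successors) contributes only 'true' and is skipped."""
    return [f"{node} ❀ " + " ∨ ".join(succs)
            for node, succs in lattice.items() if succs]

# The example lattice from above, with source P and terminal Q:
lattice = {"P": ["R1", "R2"], "R1": ["R3"],
           "R2": ["Q", "R3"], "R3": ["Q"], "Q": []}

# Sanity check: exactly one source and one terminal node.
successors = {t for succs in lattice.values() for t in succs}
assert len(set(lattice) - successors) == 1            # single source: P
assert sum(1 for s in lattice.values() if not s) == 1 # single terminal: Q

obligations = branching_formulas(lattice)
```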

Frames

Often a temporal proof eliminates undesired execution paths by contradiction. In a proof lattice,
this corresponds to having a certain branch that leads to the formula false. Often the single exit
rule is used such that a process either gets stuck in a certain construct, or it will eventually get
out of it:
at S ⇒ ✷in S ∨ ✸after S
In order to prove that the branch ✷in S leads to a contradiction, this assumed fact is usually
needed on the path(s) further below the branch. Rather than repeating ✷in S everywhere, we
introduce a frame as a box encompassing a number of proof lattice nodes and labelled with a
temporal formula of the form ✷P . The frame abbreviates the lattice obtained by adding (by
conjunction) the label to every node encompassed.

Example

A lattice with the chain of nodes P , ✷¬Q , ✷R, T , false, Q , in which a frame labelled
✷¬Q encompasses the nodes ✷R and T , and a nested frame labelled ✷R encompasses T ,
abbreviates the unframed lattice with the chain of nodes:

P , ✷¬Q , ✷R ∧ ✷¬Q , T ∧ ✷¬Q ∧ ✷R, false, Q

4.5 Example: Critical Region

In this section, we apply temporal logic to state principal liveness properties of critical regions
and prove these for a given program.

As is well-known, a critical region is a collection of code stretches (called critical sections) from
different processes among which mutual exclusion is required. Processes attempt to enter the
critical region by performing an entry-protocol and indicate exit of the critical region by an
exit-protocol.

Of course, the most important property of a critical region is the safety property of mutual
exclusion among the critical sections. However, this must be accompanied by liveness properties
for the critical region to be of use.

To be meaningful, any liveness property of a critical region must be based on the assumption of
termination of all critical sections:
in criti ❀ after criti

for any i .

Three degrees of liveness of critical sections are usually defined:

Obligingness. This is also known as absence of unnecessary delay. If a process is the
only one to attempt access to the critical region, it should succeed “immediately”.

Resolution. If two or more processes attempt to enter the critical region simultaneously,
one of them should succeed. This implies absence of deadlock and livelock.

Fairness. A process that attempts to enter its critical section will eventually succeed.

If we suppose that the critical region is used by processes P1 , P2 , . . . , Pn , and that criti , noncriti ,
and entryi denote the critical section, non-critical section, and entry-protocol of process Pi
respectively, then the above liveness properties for critical regions are readily expressed as:

in entryi ∧ (∀j ≠ i : ✷in noncritj ) ❀ at criti (Obl.)


∃i : in entryi ❀ ∃j : at critj (Resol.)
in entryi ❀ at criti (Fairness)

There are other equivalent formulations of these properties. For instance, obligingness may be
expressed as:
at entryi ∧ (∀j ≠ i : ✸✷in noncritj ) ❀ at criti

stating that if only the other processes eventually stop using the critical region, the i ’th process
will succeed once it has entered the entry-protocol. Also, fairness is often related to the start of
the entry-protocol:
at entryi ❀ at criti

4.5.1 Program

As an example, we now propose the following solution to the problem of establishing a critical
region for two processes.

var C1 , C2 : boolean;
C1 := false; C2 := false;

process P1 ; process P2 ;
repeat repeat
noncrit1 ; noncrit2 ;
a1 : C1 := true; a2 : C2 := true;
w1 : while C2 do ; w2 : while C1 do
crit1 ; begin
x1 : C1 := false b2 : C2 := false;
forever d2 : while C1 do ;
e2 : C2 := true;
end;
crit2 ;
x2 : C2 := false
forever

where the intended meaning is that after initialization of the shared variables, P1 and P2 are
executed concurrently.

Note that P1 has intentionally been given priority. Therefore, we can easily devise a scenario
in which P1 locks P2 out of the critical region. However, we should still have obligingness
and resolution. This will be demonstrated below.

However, before embarking on the liveness properties, we state the following invariants of the
program:

Fi = in noncriti ⇒ ¬Ci i = 1, 2

G = in w1 ⇒ C1

H = in d2 ⇒ ¬C2

Since Ci is changed only by Pi , these invariants follow by local sequential reasoning.
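Since the program is finite-state (two program counters and two booleans), these invariants — and mutual exclusion itself — can also be confirmed by exhaustively enumerating the reachable states. A sketch (our own encoding; location names follow the labels above, with "n" and "c" standing for the noncritical and critical sections):

```python
from collections import deque

# State: (pc1, pc2, C1, C2), with pc's named after the program labels.
def step(state):
    pc1, pc2, c1var, c2var = state
    succ = []
    # Process P1
    if pc1 == "n1":   succ.append(("a1", pc2, c1var, c2var))          # leave noncrit1
    elif pc1 == "a1": succ.append(("w1", pc2, True, c2var))           # C1 := true
    elif pc1 == "w1":                                                 # while C2 do skip
        succ.append(("w1" if c2var else "c1", pc2, c1var, c2var))
    elif pc1 == "c1": succ.append(("x1", pc2, c1var, c2var))          # leave crit1
    elif pc1 == "x1": succ.append(("n1", pc2, False, c2var))          # C1 := false
    # Process P2
    if pc2 == "n2":   succ.append((pc1, "a2", c1var, c2var))
    elif pc2 == "a2": succ.append((pc1, "w2", c1var, True))           # C2 := true
    elif pc2 == "w2":                                                 # while C1 do body
        succ.append((pc1, "b2" if c1var else "c2", c1var, c2var))
    elif pc2 == "b2": succ.append((pc1, "d2", c1var, False))          # C2 := false
    elif pc2 == "d2":                                                 # while C1 do skip
        succ.append((pc1, "d2" if c1var else "e2", c1var, c2var))
    elif pc2 == "e2": succ.append((pc1, "w2", c1var, True))           # C2 := true
    elif pc2 == "c2": succ.append((pc1, "x2", c1var, c2var))          # leave crit2
    elif pc2 == "x2": succ.append((pc1, "n2", c1var, False))          # C2 := false
    return succ

def reachable():
    init = ("n1", "n2", False, False)
    seen, work = {init}, deque([init])
    while work:
        for s in step(work.popleft()):
            if s not in seen:
                seen.add(s)
                work.append(s)
    return seen

states = reachable()
for pc1, pc2, c1var, c2var in states:
    assert not (pc1 == "c1" and pc2 == "c2")   # mutual exclusion
    assert not (pc1 == "n1" and c1var)         # F1
    assert not (pc2 == "n2" and c2var)         # F2
    assert not (pc1 == "w1" and not c1var)     # G
    assert not (pc2 == "d2" and c2var)         # H
```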



4.5.2 Obligingness

The weakest liveness property for critical regions is obligingness meaning that a process is
guaranteed to enter the critical region if it is the only one using it (at least for a while).

We start by proving obligingness for P2 . If P2 attempts to enter the critical region when P1 is
known not to need it anymore, P2 will succeed.

The proof is given by the following proof lattice:

(The lattice is rendered here as its “branching” formulas; the frame assumptions
✷in noncrit1 and ✷in w2 are carried implicitly through steps 2–4.)

1. in a2 ..w2 ∧ ✷in noncrit1 ❀ at crit2 ∨ ✷in w2
2. ✷in w2 ❀ ✷¬C1
3. ✷¬C1 ❀ after w2
4. after w2 ❀ false
5. false ❀ at crit2

Each step of the lattice is justified by the following annotations:

1. First, ✷in noncrit1 is assumed for the rest of the proof. Then either
crit2 is eventually entered, or P2 stays forever in the entry-protocol. Since by fair process
execution, it cannot stay at a2 , it would then have to stay within w2 . The latter is assumed
below.

2. By invariant F1 , if P1 stays forever in noncrit1 , C1 must be forever false.

3. If C1 is forever false, by fair process execution, P2 will eventually be forced to leave w2 .

4. But this contradicts the assumption ✷in w2 .

5. From false everything can be deduced.

Thus, the program is obliging towards P2 :

in a2 ..w2 ∧ ✷in noncrit1 ❀ at crit2

from which we may deduce the slightly stronger property:

at a2 ∧ ✸✷in noncrit1 ❀ at crit2



4.5.3 Resolution

We show that the program will not suffer from deadlock or livelock if the two processes attempt
to enter the critical region at the same time.

(The lattice is rendered as its “branching” formulas; the frame assumptions
✷in w1 and ✷in w2 are carried implicitly through steps 2–6.)

1. ✷in a1 ..w1 ∧ ✷in a2 ..w2 ❀ (at crit1 ∨ at crit2 ) ∨ (✷in w1 ∧ ✷in w2 )
2. ✷in w1 ∧ ✷in w2 ❀ ✷C1
3. ✷C1 ❀ ✷in d2
4. ✷in d2 ❀ ✷¬C2
5. ✷¬C2 ❀ after w1
6. after w1 ❀ false
7. false ❀ at crit1 ∨ at crit2

Annotations:

1. If both processes are in their entry-protocols, either one of them gets to the critical section,
or they both stay in their while-loops forever.

2. Henceforth both processes are assumed to stay in their while-loops. According to invariant
G, it then follows that C1 stays true forever.

3. Since P2 is assumed to be caught in w2 , the fact that ✷C1 forces it to get stuck in d2 .

4. In d2 , C2 is false according to H . Therefore ✷in d2 implies C2 false forever.

5. If C2 is false forever, P1 is eventually going to discover this and leave the while-loop w1 .

6. This would be in contradiction with the surrounding assumption ✷in w1 .

7. From false everything can be deduced.



4.5.4 Fairness

Finally, we prove that the program ensures fairness for P1 .

(The lattice is rendered as its “branching” formulas; the frame assumption
✷in w1 ∧ ✷C1 is carried implicitly through steps 4–9.)

1. at a1 ❀ at w1
2. at w1 ❀ at crit1 ∨ ✷in w1
3. ✷in w1 ❀ ✷in w1 ∧ ✷C1
4. ✷in w1 ∧ ✷C1 ❀ ✷in noncrit2 ∨ in w2
5. ✷in noncrit2 ❀ ✷¬C2
6. in w2 ❀ ✷in d2
7. ✷in d2 ❀ ✷¬C2
8. ✷¬C2 ❀ after w1
9. after w1 ❀ false
10. false ❀ at crit1

Annotations:

1. Due to fair process execution, a1 is eventually executed.


2. According to the single exit rule, either P1 stays in w1 forever or leaves it.
3. If P1 stays in w1 forever, C1 is forever true due to invariant G.
4. Assuming P1 to stay in w1 with C1 true forever, either P2 eventually reaches w2 or it
stays forever somewhere else. Since it cannot stay at a2 or x2 forever due to fair process
execution nor in the critical sections due to the basic liveness assumption, it has to stay
in noncrit2 .
5. Due to invariant F2 , if P2 stays in noncrit2 , C2 will forever be false.
6. Since C1 is assumed to remain true, P2 is eventually caught within d2 due to fair process
execution.
7. At d2 , C2 is false according to invariant H , thus C2 is false forever.
8. If C2 stays false, P1 will eventually test this and leave w1 .
9. But this contradicts the assumption above.
10. From false everything can be deduced.

Since we have proven at a1 ❀ at crit1 , the algorithm is fair towards P1 as expected.



4.6 General Fairness Properties: Weak and Strong fairness

As an alternative to defining the liveness properties of all language constructs inductively as
shown in section 4.3.2, one may apply a general notion of progress to each individual atomic
action of the program.
For an arbitrary conditional action a : ⟨ B → x := e ⟩, two general fairness principles apply.
Weak fairness asserts that an action which is constantly enabled must eventually be taken.
This notion corresponds to fair process execution as used in section 4.3.2. Here, infinitely often
the action is considered for execution, i.e. if control has reached a, the condition B is checked.
However, only if the condition is stable can it be ensured that the action is enabled at the
time of testing and hence that the action will be executed.
To express the notion of weak fairness formally, we assume that the location after a can be
reached only by execution of a³. In this case, weak fairness for action a is expressed by the
axiom:
✷(at a ∧ B ) ❀ after a
Although it may seem counter-intuitive to assume that control stays at a forever and at the
same time moves on (after a), the axiom implies that it cannot be the case that control stays
at a with B being true forever without a being taken.
If a has no alternatives (i.e. no other actions branch out from a’s pre-location), once a has been
reached, control will stay at a until a is taken. In that case, the axiom of weak fairness for a
may be simplified to:
at a ∧ ✷B ❀ after a

The principle of strong fairness, on the other hand, asserts that if the action is given sufficiently
many chances, the action will eventually get executed. This may be expressed by the axiom:

✷✸(at a ∧ B ) ❀ after a

Again, if the action has no alternatives the axiom of strong liveness simplifies to:

at a ∧ ✷✸B ❀ after a

Whereas weak fairness is readily implemented by a scheduling principle like round-robin (each
process is considered in turn), it is not feasible to apply strong fairness to all actions. Rather
the notion may be applied to selected actions and then implemented using some form of queuing
associated with these actions in combination with weak fairness. See also [And00], Chapter 2.

4.6.1 Example: Liveness of Semaphore Operations

As described in section 3.6 the operations of a semaphore S may be modelled by atomic actions:

V(S ) corresponds to ⟨ s := s + 1 ⟩
P(S ) corresponds to ⟨ s > 0 → s := s − 1 ⟩

We assume that in any concrete program, these actions have no alternatives, i.e. once control
has reached the action, it will stay there until executed.
³ If not, execution of a must be expressed by other, more complex means, e.g. the setting of a particular flag.

Now the liveness properties of the semaphore may be defined by applying the principle of either
weak or strong fairness to (any instance of) the two operations giving rise to the notions of a
weakly fair semaphore and a strongly fair semaphore respectively.
As a V-operation has no condition, application of both weak and strong fairness will simplify to
the progress property:
at V(S ) ❀ after V(S )
For a P operation, weak fairness amounts to:

at P(S ) ∧ ✷(s > 0) ❀ after P(S )

That is, the operation is only guaranteed to complete if the semaphore value stays positive.
If other processes may perform P-operations in between, the process at hand may suffer from
starvation and never get past its own P-operation.
Application of strong fairness to a P-operation, on the other hand, will guarantee that the
operation succeeds if the semaphore is signalled sufficiently often:

at P(S ) ∧ ✷✸(s > 0) ❀ after P(S )

A strongly fair semaphore may be implemented by maintaining a FIFO-queue holding the pro-
cesses currently blocked by P-operations for the particular semaphore.
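One possible realization of this scheme is sketched below in Python (the class and method names are ours; a production implementation would also handle interrupts and time-outs). V hands the “token” directly to the longest-waiting process, implementing strong fairness through FIFO queuing combined with the weak fairness of the underlying primitives:

```python
import threading
from collections import deque

class FairSemaphore:
    """A strongly fair counting semaphore: blocked P-operations are queued
    FIFO, and V hands the token directly to the longest-waiting process, so
    a waiter succeeds whenever the semaphore is signalled often enough."""
    def __init__(self, value=0):
        self._s = value
        self._lock = threading.Lock()
        self._queue = deque()                  # one Event per blocked process

    def P(self):
        with self._lock:
            if self._s > 0 and not self._queue:
                self._s -= 1
                return
            ev = threading.Event()
            self._queue.append(ev)
        ev.wait()                              # block until handed over by V

    def V(self):
        with self._lock:
            if self._queue:
                self._queue.popleft().set()    # direct hand-off: s stays 0
            else:
                self._s += 1

    def waiting(self):
        with self._lock:
            return len(self._queue)

# Three processes block on P in a known order; V releases them FIFO.
sem, order = FairSemaphore(0), []
for i in range(3):
    threading.Thread(target=lambda i=i: (sem.P(), order.append(i))).start()
    while sem.waiting() < i + 1:               # wait until process i is queued
        pass
for k in range(3):
    sem.V()
    while len(order) < k + 1:                  # wait for the released process
        pass
assert order == [0, 1, 2]                      # served in arrival order
```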

4.7 Further Readings

The idea of applying the classical mathematical notion of temporal logic to concurrent programs
is due to Amir Pnueli. Together with Zohar Manna, he gave a thorough exposition of
temporal logic and concurrency in the books [MP91] and [MP95]. The notion of proof lattices
was introduced by Susan Owicki and Leslie Lamport in [OL82].
Chapter 5

Monitor Invariants

5.1 Introduction

As described in Chapter 3, the notion of invariance is an important device for stating and proving
safety properties of concurrent programs.
In this note we focus on a particular form of invariants associated with the monitor concept. A
monitor is a program component that combines data abstraction, atomicity, and synchroniza-
tion. The combination may be implicit through syntactic constructs or explicit using library
mechanisms from within a class-like program construct.
A monitor invariant is a property of the state of the monitor taking into account that the
monitor operations can be considered atomic. Therefore, the property needs only hold in-
between monitor operations.
Monitor invariants may be expressed using various information about the monitor state. Basi-
cally, the invariant can use the monitor variables (i.e. the variables being encapsulated by the
monitor) and history variables. However, also notions such as the number of processes waiting
on various condition queues may be used.
Although a monitor invariant is inherently introverted, talking about the inner, hidden state of
the monitor, it can often be lifted to a property of the sequence of operation invocations. Knowing
the usage pattern of the monitor operations, this again can be used to deduce properties of the
whole program.

5.2 Monitor Invariants

Definition 3 A monitor invariant is a predicate on the state of the monitor that must be sat-
isfied whenever the monitor is free. A monitor is said to be free when a new process may start
executing a monitor operation. □

A direct consequence of this definition is that

If I is a monitor invariant, then a process can be certain that I holds at the start
of a monitor operation.


The exact conditions for a monitor being free depend on the particular queue semantics in use.
E.g. for Hoare’s semantics (signal and urgent wait), the monitor is not considered free as long
as there are signalling processes waiting to resume execution within the monitor.
For the prevailing signal and continue semantics (SC), however, the monitor becomes free when-
ever a process leaves the critical region by waiting on a condition queue or ending a monitor
operation.

5.3 Stating Monitor Invariants

In the following we assume a monitor construct with explicit condition queues for conditional
synchronization. Similar notions, however, may be employed for constructs with implicit queues
such as the protected objects of Ada95.
Given a monitor the following notions may be used for expressing a monitor invariant:

1. The (“global”) monitor variables.


2. History variables added to the monitor.
3. Information about the processes on the condition queues.

4. Information about the processes woken up, waiting for re-entrance (SC only).

Monitor variables and history variables are just referenced by their names as usual. To talk
about the waiting and awakened processes, the following notation is introduced:

waiting(c) The number of processes waiting on condition queue c.

woken(c) The number of processes woken up from condition queue c that have not yet
reentered the monitor (SC only).

In many cases, it is necessary to distinguish processes that have entered a condition queue from
different monitor operations or even from different points of the same operation. For this, we
allow the above attributes to be qualified with an operation name op or occurrence number op, i .
For the waiting attribute, this yields:

waitingop (c) The number of processes waiting on condition queue c having called wait in
monitor operation op.
waitingop,i (c) The number of processes waiting on condition queue c having called the i ’th
textual occurrence of wait in monitor operation op.

woken may be qualified similarly.


These attributes are, of course, related in the sense that waitingop (c) is the sum of waitingop,i (c)
for all occurrences of wait in op and waiting(c) is the sum of waitingop (c) for all operations op
in the monitor.
The attributes may be further specialized to talk about the number of processes waiting with
particular parameters etc., but we refrain from giving formal notations for this.

Example (FIFO Semaphore)

Consider the FIFO semaphore from Figure 5.3 in [And00]. Using our notation and adding
history variables to record the number of P and V operations, we get:

monitor FIFOsemaphore
var s : integer := 0; — Semaphore value
v , p : integer := 0; — History variables
pos : condition;
procedure Psem()
if s = 0 then wait(pos)
else s := s − 1;
p := p + 1
procedure Vsem()
if empty(pos) then s := s + 1
else signal(pos);
v := v + 1
end

For this monitor, we would like to prove the semaphore invariant (for initial value 0):
p ≤ v
Further we would like to ensure that the semaphore is (at least) weakly fair. Although this
is a liveness property, the essence of this is captured by the following monitor invariant:
waiting(pos) > 0 ⇒ p = v
This expresses that no processes wait unnecessarily. Combined with the standard assumption
about fair process scheduling (and absence of infinite loops), from such a pre-liveness
property, we may conclude that the monitor is weakly fair.

5.4 Proving Monitor Invariants

Since all activity within a monitor takes place under mutual exclusion, monitor invariants can
be proved using standard proof techniques for sequential programs. Thus, one approach would
be to use Hoare-logic with special rules dealing with wait and signal operations, where the
sequential flow may be interrupted.
Alternatively, one may use a more informal technique based upon operational reasoning where
the execution within the monitor is traced. Although operational reasoning was rejected for
concurrent programs in general, it is acceptable in the sequential context provided by the monitor
construct.
In general the operational technique is based upon induction over the length of the execution
within the monitor. As a base case, the invariant must hold after initialization and must be
preserved by each stretch of activity, i.e. each sequence of monitor execution that cannot be
preempted by other activity within the monitor. Since the notion of a stretch differs for different
queue semantics, we consider the rules for signal and continue (SC) separately from signal and
wait (SW).

5.4.1 Signal and Continue Semantics

For signal and continue semantics, a stretch starts by a process entering or reentering the monitor
and ends when the process leaves the monitor or calls wait.

Operational proof technique (SC)


For a monitor using signal and continue semantics, a monitor invariant I is proven
operationally by:
1. Showing that I holds after initialization. It may be used that, initially, the
condition queues are empty.

2. For each monitor operation op showing that whenever a process

(a) starts the execution of the operation op, or


(b) resumes execution after a call of wait in op

in a state where I holds, I will hold again when either:

• The process calls wait, or


• The process leaves op

As for invariants based upon fine-grained atomic actions, only stretches that are potentially
dangerous for the invariant need be investigated.
To show that a stretch preserves an invariant I , all changes of the state made during the
execution stretch must be tracked. Especially, this includes proper update of the number of
processes waiting on/being awakened from condition queues.

Example (Proof of Semaphore Safety)

To prove the invariant p ≤ v for the monitor given previously, we start by checking the
initial state. Since p = v = 0, I holds. However, we cannot prove that I is preserved by
all stretches:

• If Psem is called in a situation where s > 0 ∧ p = v , p is incremented and I violated!

Of course, this situation will never occur, but I is not strong enough to encompass this
information. It turns out that we need the following strengthened invariant:

H = p + s + woken(pos) = v ∧ s ≥ 0

from which I can be deduced.


Like I , H holds initially. We now check each operation:
Assume that H holds at start of Psem:

• If s = 0, the process leaves the monitor by wait(pos), H is unchanged.


Assume that H holds after the wait:

– At the reentry, woken(pos) is decremented by one and afterwards, p is incre-


mented by one. Thus, the equational part of H still holds. Further s is un-
changed.
• If s > 0, s is decremented and p incremented. This preserves the equational part of
H as well as s ≥ 0.

Assume that H holds at the start of Vsem:

• If empty(pos), both v and s are incremented by one, preserving the equational part
of H and ensuring s > 0.
• If ¬empty(pos), we know that signal(pos) will wake up one process, thus woken(pos)
is incremented by one. Further, v is incremented by one. Thus, both the equational
part of H and s are unchanged.

Thus, H and thereby I is an invariant of the monitor.

Of course, when proving a monitor invariant, we may utilize previously proven invariants by
assuming them to hold at the start of the stretches considered.

Example (Proof of Semaphore pre-Liveness)

To prove the property



G = waiting(pos) > 0 ⇒ p = v
we see that it holds initially as waiting(pos) = 0 (by definition). Unfortunately we find
that G is not a proper monitor invariant. Consider the following scenario:

s p v waiting(pos) woken(pos)
Initially: 0 0 0 0 0
Psem is called, waits 0 0 0 1 0
Vsem is called, signals 0 0 1 0 1
PSem is called, waits 0 0 1 1 1

Now, waiting(pos) > 0, but p ≠ v . The problem is that the awakened process has
effectively been released, but this is not yet recorded in the p variable. If we consider the
awakened processes as counting together with p, we get:

F = waiting(pos) > 0 ⇒ p + woken(pos) = v

which according to H is equivalent to



J = waiting(pos) > 0 ⇒ s = 0

J is seen to hold initially and the potentially dangerous activities are:

• Waiting on pos in Psem. This, however, occurs only under the condition s = 0 and
therefore J is preserved.

• Incrementing s in Vsem. This is done under the condition that pos is non-empty and
thus J holds at the end of Vsem.

Now F can be used to argue for weak fairness by the following argument:
Suppose that calls of the monitor operation cease (at least for a while). Assuming fair
process execution, all awakened processes will eventually reenter and leave the monitor.
Thus, eventually woken(pos) = 0. Together with F , we get the desired property G.
Therefore no process can be “forgotten” in the waiting queue of the monitor if p < v .
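The stretch-by-stretch reasoning above can be replayed mechanically by treating each stretch as an atomic update of the counters. The following sketch (our own abstraction of the monitor, ignoring process identities) reproduces the scenario table and confirms that H and J hold while G indeed fails:

```python
class FIFOSemState:
    """Abstract state of the FIFO-semaphore monitor (signal-and-continue),
    tracking only the counters that occur in the invariants."""
    def __init__(self):
        self.s = self.p = self.v = self.waiting = self.woken = 0

    def psem(self):                  # a process calls Psem
        if self.s == 0:
            self.waiting += 1        # wait(pos): the stretch ends here
        else:
            self.s -= 1
            self.p += 1

    def reenter(self):               # a woken process reenters after wait(pos)
        self.woken -= 1
        self.p += 1

    def vsem(self):                  # a process calls Vsem
        if self.waiting == 0:        # empty(pos)
            self.s += 1
        else:
            self.waiting -= 1        # signal(pos) wakes one process
            self.woken += 1
        self.v += 1

    def H(self): return self.p + self.s + self.woken == self.v and self.s >= 0
    def J(self): return self.waiting == 0 or self.s == 0
    def G(self): return self.waiting == 0 or self.p == self.v

m = FIFOSemState()
m.psem()                 # Psem is called, waits
m.vsem()                 # Vsem is called, signals
m.psem()                 # Psem is called again, waits
assert m.H() and m.J()   # the strengthened invariants hold ...
assert not m.G()         # ... while G is violated in this state
```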

Pthreads

The mutex/condition primitives of Pthreads provide a kind of signal and continue semantics.
However, the possibility of awakening more than one thread by a single signal and the risk of
spurious wake-ups renders the waiting and woken notions less useful in monitor invariants. Thus
invariants will rather have to be stated in terms of monitor and history variables and rechecking
conditions must be used to make them invariant. The effect of multiple signals and spurious
wake-ups may be reflected by adding a pseudo monitor operation of the form:

procedure spuriousc ()
signal(c)

for each condition queue c and proving these to maintain the invariant.
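The same recheck pattern applies to Python's condition variables, which, like Pthreads, permit spurious wake-ups and waking several threads at once. A minimal sketch (the one-place buffer is a hypothetical example of ours, not taken from the notes):

```python
import threading

class Slot:
    """One-place buffer: since a notify may wake several threads, or a thread
    may wake spuriously, get() retests its condition in a while-loop so that
    the intended invariant is reestablished before the operation proceeds."""
    def __init__(self):
        self._lock = threading.Lock()
        self._nonempty = threading.Condition(self._lock)
        self._item = None

    def put(self, x):
        with self._lock:
            self._item = x
            self._nonempty.notify_all()       # may wake more threads than needed

    def get(self):
        with self._lock:
            while self._item is None:         # recheck guards the invariant
                self._nonempty.wait()
            x, self._item = self._item, None
            return x

slot, out = Slot(), []
consumer = threading.Thread(target=lambda: out.append(slot.get()))
consumer.start()
slot.put(42)
consumer.join()
assert out == [42]
```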

Java Monitors

Java monitors use signal and continue semantics but have only one anonymous condition queue.
We may use waiting() and woken() to denote the number of threads on the queue or awakened
from the queue respectively. Otherwise, Java monitor invariants can be stated and proved using
the technique described above.

Since there is only a single condition queue, in many cases it will have to be shared by different
monitor operations. Therefore, invariants and proofs of Java monitor invariants will often need
to qualify the waiting and woken counts with operations as already introduced.

Furthermore, the semantics of Java monitors allows for spurious wake-ups as in Pthreads. These
may be dealt with by pseudo operations as shown above.

5.4.2 Signal and Wait Semantics

We here consider only the original Hoare semantics also known as signal and urgent wait. For
this semantics, a stretch starts by a process entering the monitor and ends when all activity
following from this has ceased. This may involve passing the execution to an awakened process
taking over the monitor or to a signalling process urgently reentering the monitor when a process
leaves it.

Operational proof technique (Hoare)


For a monitor using Hoare semantics (signal and urgent wait), a monitor invariant
I is proven operationally by:
1. Showing that I holds after initialization. It may be used that, initially, the
condition queues are empty.

2. For each monitor operation op arguing that:

If a process starts the execution of the operation op in a state where


I holds, I will hold again when the monitor becomes free, i.e. when all
activity that follows from starting this operation has ceased, including
that there are no signalling processes left.

Although Hoare semantics enables very elegant and efficient signalling, it is not in practical use
any longer and therefore no proof examples will be given in this note.
Bibliography

[And00] Gregory R. Andrews. Foundations of Multithreaded, Parallel, and Distributed Pro-


gramming. Addison-Wesley, 2000.

[Hoa85] C.A.R. Hoare. Communicating Sequential Processes. Computer Science. Prentice-Hall,


1985.

[Hol97] Gerard J. Holzmann. The model checker Spin. IEEE Transactions on Software Engi-
neering, 23(5):279–295, May 1997.

[Hol04] Gerard J. Holzmann. The Spin Model Checker: Primer and Reference Manual.
Addison-Wesley, 2004.

[HS08] Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Programming. Morgan
Kaufmann, 2008.

[Jen92] Kurt Jensen. Coloured Petri Nets: Basic Concepts, Analysis Methods, and Practical
Use, Volume 1. EATCS Monographs on Theoretical Computer Science. Springer-
Verlag, 1992.

[Jen95] Kurt Jensen. Coloured Petri Nets: Basic Concepts, Analysis Methods, and Practical
Use, Volume 2. EATCS Monographs on Theoretical Computer Science. Springer, 1995.

[Jen97] Kurt Jensen. Coloured Petri Nets: Basic Concepts, Analysis Methods, and Practical
Use, Volume 3. EATCS Monographs on Theoretical Computer Science. Springer, 1997.

[JK09] Kurt Jensen and Lars M. Kristensen. Coloured Petri Nets: Modelling and Validation
of Concurrent Systems. Springer-Verlag, 2009.

[JMM] The Java memory model. URL: www.cs.umd.edu/users/pugh/java/memoryModel/.

[Mil89] Robin Milner. Communication and Concurrency. Computer Science. Prentice-Hall,


1989.

[MP91] Zohar Manna and Amir Pnueli. The Temporal Logic of Reactive and Concurrent Sys-
tems: Specification. Springer-Verlag, 1991.

[MP95] Zohar Manna and Amir Pnueli. Temporal Verification of Reactive Systems: Safety.
Springer-Verlag, 1995.

[Mur89] Tadao Murata. Petri nets: Properties, analysis and applications. Proceedings of the
IEEE, 77(4), April 1989.


[OL82] Susan Owicki and Leslie Lamport. Proving liveness properties of concurrent programs.
ACM Transactions on Programming Languages and Systems, 4:455–495, July 1982.

[Ost12] Igor Ostrovsky. The C# memory model in theory and practice, Part 1. MSDN Magazine, 27(12), December 2012. URL: msdn.microsoft.com/en-us/magazine/jj863136.

[Ray13] Michel Raynal. Concurrent Programming: Algorithms, Principles, and Foundations. Springer-Verlag, 2013.

[RR98a] Wolfgang Reisig and Grzegorz Rozenberg, editors. Lectures on Petri Nets I: Basic Models, volume 1491 of LNCS. Springer, 1998.

[RR98b] Wolfgang Reisig and Grzegorz Rozenberg, editors. Lectures on Petri Nets II: Applications, volume 1492 of LNCS. Springer, 1998.

[Sch97] Fred B. Schneider. On Concurrent Programming. Springer, 1997.

[Wil12] Anthony Williams. C++ Concurrency in Action: Practical Multithreading. Manning Publications Co., 2012.
Index

action, 1, 11, 14
  atomic, 15
  composite, 11
  conditional, 12
  enabledness, 14
  formal effect of, 32
  potentially dangerous, 26
  test, 13
activity
  stretch, 55
algorithm
  Dekker's, 13
  Peterson's, 29
always (✷), 39
arc, 1, 2
  inhibitor, 9
  multiple, 5
arrow, 1
atomic, 15
atomic action, 15
atomicity, 15, 53

Bandera, 36
barrier synchronization, 7
busy waiting, 28

concurrent processes, 12
concurrent program, 12
concurrently enabled, 4
condition queue, 54
condition synchronization, 7
conditional action, 12
conflict (Petri Net), 5
control assertion, 23
control location, 13
control pointer, 13
control predicate, 23, 41
critical reference, 19
critical region, 46
critical section, 46

data abstraction, 53
Dekker's algorithm, 13

enabledness
  concurrent, 4
  of action, 14
  of Petri Net transition, 3
entry-protocol, 46
eventually (✸), 39
eventually forever (✸✷), 40
execution
  of Petri net, 4
  of transition diagram, 13
  of transition system, 14
  step, 4
exit-protocol, 46

fair process execution, 43
fairness
  strong, 42, 51
  weak, 42, 51
fairness (of critical region), 46
firing, 3
  simultaneous, 4
flow relation, 2
fork, 7
formal system, 6
formula
  temporal, 38
frame, 45
free (of monitor), 53

ghost variable, 24
guard, 12

history variable, 22, 24
Hoare triples, 22
holds (of temporal formula), 39

inductive predicate, 27
inductive proof, 26
infinitely often (✷✸), 40
initial state, 14
interleaving, 15
  model, 15
invariant, 22, 25, 42
  auxiliary, 28
  local, 28
  monitor, 53
  semaphore, 34
  strengthening, 27
  value, 28

Java monitor, 58
join, 7

label, 14
leads-to operator, 41
liveness property, 21, 37
local invariant, 28
location
  control, 13
  start, 13
  terminal, 13
lockout, 47

marking, 2
  initial, 2
model-checking, 25
  Bandera, 36
  Spin, 36
monitor, 53
  Java, 58
monitor invariant, 53
mutual exclusion, 8, 46
mutually atomic, 15

obligingness, 46
operation, 11
operational proof, 28
  for monitor invariant, 55

persistence property, 42
Peterson's algorithm, 29
Petri Net, 2
  coloured, 9
  high-level, 9
  marked, 2
  place/transition net, 9
place, 1, 2
place/transition net, 9
predicate
  inductive, 27
preservation (of invariant), 26
process
  sequential, 11
  sequential (Petri Net), 6
process algebra, 19
process model, 12
processes
  concurrent, 6
proof
  inductive, 26
  operational, 28
  operational (for monitor invariant), 55
proof by contradiction, 27
proof lattice, 44
property
  fairness (of critical region), 46
  liveness, 21, 37
  obligingness, 46
  persistence, 42
  reactivity, 42
  recurrence, 42
  resolution, 46
  responsiveness, 42
  safety, 21
protocol
  entry, 46
  exit, 46

reachable state, 14
reachable states, 17
reactive programs, 21
recurrence, 42
resolution, 46
responsiveness, 42
run, 4

safety property, 21
semaphore, 34
  strongly fair, 42
semaphore invariant, 34
sequential process, 11
sequential program, 11
simple variable, 18
Spin, 36
start location, 13
state, 1, 12, 14
  global, 12
  initial, 14
  reachable, 14
  sub-, 1
state predicate, 22
statement, 12
step execution, 4
stretch of activity, 55
strong fairness, 42, 51
sub-state, 1
synchronization, 7, 8, 53
  barrier, 7
  condition, 7
  true, 7
synchronize, 9
synchronous, 7

temporal formula, 38
  closed, 39
  valid, 39
  valid for program, 39
temporal laws, 40
temporal logic, 38
temporal operator
  always (✷), 39
  eventually (✸), 39
  eventually forever (✸✷), 40
  infinitely often (✷✸), 40
  leads-to (❀), 41
  reflexive, 40
terminal location, 13
test action, 13
thread, 11
token, 2
tools
  Bandera, 36
  Spin, 36
transition, 1
transition (of Petri Net), 2
transition diagram, 12, 19
transition graph, 14
transition relation, 14
transition system, 14
true synchronization, 7

validity (of temporal formula), 39
value invariant, 28
variable
  control, 12
  history, 22, 24, 53
  local, 12
  monitor, 53
  program, 12
  shared, 12
  simple, 18

weak fairness, 42, 51
