You are on page 1of 11

The Wakeup Problem

(EXTENDED ABSTRACT)

M i c h a e l J. F i s c h e r * Shlomo Moran t Steven Rudich t Gadi Taubenfeld*

Abstract failures, in which a process may become faulty at any


time during its execution, and when it fails, it simply
We study a new problem, the wakeup problem, that stops participating in the protocol.
seems to be very fundamental in distributed comput- In the wakeup problem, it is known a priori by all pro-
ing. We present efficient solutions to the problem and ceases that at least n - t processes will eventually wake
show how these solutions can be used to solve the con- up. The goal is simply to have a point in time at which
sensus problem, the leader election problem, and other the fact that at least r processes have already waked
related problems. The main question we try to answer up is known to p processes. It is not required that this
is, how much memory is needed to solve the wakeup time be the earliest possible, and faulty processes are
problem? We assume a model that captures important included in the counts of processes that have waked up
properties of real systems that have been largely ignored and that know about that fact. Note that in a solution
by previous work on cooperative problems. to the wakeup problem, at least p - t correct processes
eventually learn that at least r - t correct processes are
awake and participating in the protocol.
1 Introduction The significance of this problem is two-fold. First, it
seems generally useful to have a protocol such that after
1.1 The Wakeup Problem
a crash of the network or after a malicious attack, the
The wakeup problem is a deceptively simple new prob- remaining correct processes can figure out if sufficiently
lem that seems to be very fundamental to distributed many other processes remain active to carry out a given
computing. The goal is to design a t-resilient proto- task. Second, a solution to this problem is a useful
col for n asynchronous processes in a shared memory building block for solving other important problems (cf.
environment such that at least p processes eventually section 6).
learn that at least 7- processes have waked up and be-
gun participating in the protocol. Put another way, the 1.2 A New Model
wakeup problem with parameters n, t, ~- and p is to find
a protocol such that in any fair run of n processes with Much work to date on fault-tolerant parallel and dis-
at most t failures, at least p processes eventually know tributed systems has been generous of the class of faults
that at least r processes have taken at least one step in considered but rather strict in the requirements on the
the past. The only kind of failures we consider are crash system itself. Problems are usually studied in an un-
derlying model that is fully synchronous, provides each
*Computer Science D e p a r t m e n t , Yale University, New Haven, process with a unique name that is known to all other
CT 06520.
l C o m p u t e r Science D e p a r t m e n t , Technion, Haifa 32000, Israel. processes, and is initialized to a known state at time
t C o m p u t e r Science D e p a r t m e n t Carnegie Mellon University, zero. We argue that none of these assumptions is real-
P i t t s b u r g h , PA 15213. istic in today's computer networks, and achieving them
This work was s u p p o r t e d in p a r t by ONR contract N00014- even within a single parallel computer is becoming in-
89-J-1980, by the National Science Foundation u n d e r grant CCR-
8405478, by the Hebrew Technical Institute scholarship, by the creasingly difficult and costly. Large systems do not run
Technion V.P.R. Funds - Wellner Research Fund, a n d by the off of a single clock and hence are not synchronous. Pro-
Foundation for Research in Electronics, Computers a n d Commu- viding processes with unique id's is costly and difficult
nications, a d m i n i s t r a t e d by the Israel Academy of Sciences a n d
and greatly complicates reconfiguring the system. Fi-
Humanities.
nally, simultaneously resetting all of the computers and
communication channels in a large network to a known
initial state is virtually impossible and would rarely be
Permission to copy without fee all or part of this matertial is granted pro- done even if it were possible because of the large de-
vided that the copies are not made or distributed for direct commercial structive effects it would have on ongoing activities.
advantage, the ACM copyright notice and the title of the publication and
its date appear, and notice is given that copying is by permission of the Our new model of computation makes none of these
Association for Computing Machinery. To copy otherwise, or to republish,
requires a fee and/or specific permission.

© 1990 ACM 089791-361-2/90/0005/0106 $1.50 106


assumptions. It consists of a fully asynchronous collec- 1.3 Space Complexity Results
tion of n identical anonymous deterministic processes
The main question we try to answer is, how m a n y values
that communicate via a single finite sized shared reg-
v for the shared register are necessary and sufficient to
ister which is initially in an arbitrary unknown state.
solve the wakeup problem? The answer both gives a
Access to the shared register is via atomic "test-and-
measure of the communication-space complexity of the
set" instructions which, in a single indivisible step, read
problem and also provides a way of assessing the cost
the value in the register and then write a new value that
of achieving reliability. We give a brief overview of our
can depend on the value just read.
results below.
Assuming an arbitrary unknown initial state relates
to the notion of self-stablizing systems defined by Di-
jkstra [8]. However, Dijkstra considers only non- 1.3.1 Fault-Free Solutions
terminating control problems such as the mutual ex- First we examine what can be done in the absence of
clusion problem, whereas we show how to solve decision faults (i.e., t = 0). We present a solution to the wakeup
problems such as the wakeup, consensus and leader elec- problem in which one process learns t h a t all other pro-
tion problems, in which a process makes an irrevocable cesses are awake (i.e., p = 1 and r = n), and it uses a
decision after a finite number of steps. single 4-valued register (i.e., v -- 4). The protocol for
Before proceeding, we should address two possible achieving this is quite subtle and surprising. It can also
criticisms of shared m e m o r y models in general and our be modified to solve the leader election problem. Based
model in particular. First, most computers implement on this protocol, we construct a fault-free protocol that
only reads and writes to memory, so why do we consider reaches consensus on one out of k possible values using
atomic test-and-set instructions? One answer is that a 5-valued register. Finally, we show that there is no
large parallel systems access shared m e m o r y through fault-free solution to the wakeup problem with only two
a communication network which m a y well possess in- values (i.e., one bit) when 7" _> 3.
dependent processing power that enables it to imple-
ment more powerful primitives than just simple reads 1.3.2 Fault-Tolerant Solutions: Upper Bounds
and writes. Indeed, such machines have been seriously
proposed [23, 44]. Another answer is that part of our We start by showing that the fault-free solution which
interest is in exploring the boundary between what can uses a single 4-valued register, mentioned in the pre-
and cannot be done, and a proof of impossibility for vious section, can actually tolerate t failures for any
a machine with test-and-set access to m e m o r y shows a r _< ((2n - 2)/(2t + 1) + 1)/2. Using m a n y copies of this
fortiori the corresponding impossibility for the weaker protocol, we construct a protocol with v = 8 t+l that
read/write model. tolerates t faults when I" _< n - t . Thus, if t is a con-
A second possible criticism is that real distributed stant, then a constant sized shared m e m o r y is sufficient,
systems are built around the message-passing paradigm independent of n. However, the constant grows expo-
and that shared m e m o r y models are unrealistic for large nentially with t. An easy protocol exists with v = n that
systems. Again we have several possible answers. First, works for any t and r < n - t . This means that the above
the premise m a y not be correct. Experience is showing exponential result is only of interest for t << log n. Fi-
that message-passing systems are difficult to program, nally, we show that for any t < n / 2 , there is a t-resilient
so increasing attention is being paid to implementing solution to the wakeup problem for any 7" < [n/2J + 1,
shared m e m o r y models, either in hardware (e.g. the Flu- using a single O(t)-valued register.
ent machine [45]) or in software (e.g. the Linda system
[5]). Second, message-passing systems are themselves 1.3.3 Fault-Tolerant Solutions: A Lower Bound
an abstraction that m a y not accurately reflect the reali-
ties of the underlying hardware. For example, message- We prove that for any protocol P that solves the wakeup
passing systems typically assume infinite buffers for in- problem for parameters n, t and r, the number of shared
coming messages, yet nothing is infinite in a real system, m e m o r y values used by P is at least W a, where W =
and indeed overflow of the message buffer is one kind of ( t v ~ - t ) / ( n - t) and ~ = 1 / ( l o g 2 (n7- -~t + 3)). The proof
fault to which real systems are subject. It is difficult to is quite intricate and involves showing for any protocol
see how to study a kind of fault which is assumed away with too few m e m o r y values that there is a run in which
by the model. Finally, at the lowest level, communi- n - t processes wake up and do not fail, yet no process
cation hardware looks very much like shared memory. can distinguish that run from another in which fewer
For example, a wire from one process to another can than r wake up; hence, no process knows that r are
be thought of as a binary shared register which the first awake.
process can write (by injecting a voltage) and the second
process can read (by sensing the voltage).

107
1.4 Relation to Other Problems no process wrote into the register. A register r is said
to be local if there exists an i such t h a t r E R/ and
We establish connections between the wakeup prob-
for any j ~ i, r ~ Rj. A register is s h a r e d if it is not
lem and two fundamental problems in distributed com-
local. In this paper we restrict attention to protocols
puting: the consensus problem and the leader elec-
which have exactly one register which is shared by all
tion problem. These two problems lie at the core of
the processes (i.e., IR1 n . . . n R , I = 1) and all other
m a n y problems for fault-tolerant distributed applica-
registers are local. If S' is a prefix of S then the run
tions [1, 7, 10, 13, 16, 20, 21, 32, 34, 43, 42, 48].
(f, S') is a prefiz of (f, S), and (f, S) is an extension of
We show that: (1) any protocol that uses v values
(f, SI). For any sequence S, let Si be the subsequence
and solves the wakeup problem for t < n/2, 7" > n / 2
of S containing all events in S which involve qi.
and p = 1 can be transformed into i-resilient consensus
and leader election protocols which use 8v values; and D e f i n i t i o n : C o m p u t a t i o n s (f, S) and ( f ' , S ' ) are equiv-
(2) any l-resilient consensus and leader election protocol alent with respect to qi, denoted by (f, S) / ( f ' , S'), iff
that uses v values can be transformed into a i-resilient si =
protocol which uses 4v values and solves the wakeup
We are now ready to define the notion of knowledge in
problem for any 7" < [n/2J + 1 and p = 1.
a shared m e m o r y environment. In the following, we use
Using the first result above, we can construct effi-
predicate to mean a set of runs.
cient solutions to both the consensus and leader election
problems from solutions for the wakeup problem. The D e f i n i t i o n : For a process qi, predicate b and finite run
i
second result implies that the lower bound proved for p, process qi knows b at p iff for all p' such that p .~ p',
the wakeup problem holds for these other two problems. it is the case that p' E b.
As a consequence, the consensus and the leader election
For simplicity, we assume t h a t a process always takes
problems are space-equivalent in our model. This is
a step whenever it is scheduled. A process that takes
particularly surprising since the two problems seem so
infinitely m a n y steps in a run is said to be correct in
different. The difficulty in leader election is breaking
that run; otherwise it is faulty. We say that an infinite
symmetry, whereas consensus is inherently symmetric.
run is l-fair iff at least I processes are correct in it.

2 D e f i n i t i o n s and N o t a t i o n s 2.2 Wakeup, Consensus and Leader


Election Protocols
2.1 Protocols and Knowledge
In this subsection we formally define the notions of t-
An n-process protocol P = ( C , N , R ) consists of a resilient wakeup, consensus and leader election protocols
nonempty set C of runs, an n-tuple N = (ql . . . . ,qn) C0 < t < n). We say t h a t a p r o c e s s qi is awake in a
of process id's (or process, for short), and an n-tuple run if the run contains an event that involves qi. The
R = ( R 1 , . . . , R n ) of sets of registers. Informally, Ri predicate "at least 7- processes are awake in run p" is the
includes all the register that process qi can access. We set of all runs for which there exist v different processes
assume throughout this paper that n > 2. which are awake in the run. Note that a process that
A run is a pair of the form ( f , S ) where f is a fails after taking a step is nevertheless considered to be
function which assigns initial values to the registers in awake in the run.
R1 t.J . . . t.J Rn and S is a finite or infinite sequence of
events. (When S is finite, we also say that the run is • A wakeup protocol with p a r a m e t e r s n, t, 7" and p is
finite.) An event e = (ql, v, r, v ~) means that process qi, a protocol for n processes such that, for any ( n - t ) -
in one atomic step, first reads a value v from register fair run p, there exists a finite prefix of p in which
r and then writes a value v ~ into register r. We say at least p processes know that at least 7" processes
that the event e involves process qi and that process ql are awake in p.
performs a test-and-set operation on register r. It is easy to see that a wakeup protocol exists only
The set of runs is assumed to satisfy several proper- if max(p, 7") < n - t, and hence, from now on, we
ties; for example, it should be prefix closed. Because of assume that this is always tlle case. We also assume
lack of space, we do not give a complete list here but that min(p, 7") > 1.
point out that these properties capture the fact that
In the following, whenever we speak about a solu-
we are assuming that the processes are anonymous and
tion to the wakeup problem without mentioning p,
identically p r o g r a m m e d , are not synchronized, and that
we are assuming that p = 1.
nothing can be assumed about the initial state of the
shared memory. • A t-resilient k-consensus protocol is a protocol for
The value of a register at a finite run is the last value n processes, where each process has a local read-
that was written into that register, or its initial value if only input register and a local write-once output

108
register. For any (n - t)-fair run there exists a fi- preted as two bits which we call the "token bit" and the
nite prefix in which all the correct processes decide "See-Saw" bit. The two states of the token bit are called
on some value from a set of size k (i.e., each cor- "token present" and "no token present". We think of
rect process writes a decision value into its local a public token slot which either contains a token or is
output register), the decision values written by all empty, according to the value of the token bit. The two
processes are the same, and the decision value is states of the See-Saw bit are called "left side down" and
equal to the input value of some process. "right side down". The "See-Saw" bit describes a vir-
In the following, whenever we say "consensus" tual See-Saw which has a left and a right side. The bit
(without mentioning specific k) we mean "binary indicates which side is down (implying that the opposite
consensus", where the possible decision values are side is up).
0 and 1. Each process remembers in private m e m o r y the num-
ber of tokens it currently possess and which of four
• A t-resilient leader election protocol is a proto- states it is currently in with respect to the See-Saw:
col for n processes, where each process has a local "never been on .... on left side", "on right side", and "got
write-once output register. For any (n - t)-fair run off". A process is said to be on the up-side of the See-
there exists a finite prefix in which all the correct Saw if it is currently "on left side" and the See-Saw bit
processes decide on some value in {0, 1}, and ex- is in state "right side down", or it is currently "on right
actly one (correct or faulty) process decides on 1. side" and the See-Saw bit is in state "left side down".
T h a t process is called the leader. A process initially possesses two tokens and is in state
"never been on".
We define the protocol by a list of rules. When a pro-
3 Fault-free solutions cess is scheduled, it looks at the shared register and at
its own internal state and carries out the first applica-
In this section, we develop tile See-Saw protocol, which
ble rule, if any. If no rule is applicable, it takes a null
solves the fault-free wakeup problem using a single 4-
step which leaves its internal state and the value in the
valued shared register. Then we show how the See-Saw
shared register unchanged.
protocol can be used to solve the k-valued consensus
problem. Finally, we claim that it is impossible to solve R u l e 1: (Start of protocol) Applicable if tile scheduled
the wakeup problem using only one shared bit. process is in state "never been on". The process
To understand the See-Saw protocol, the reader gets on the up-side of the See-Saw and then flips the
should imagine a playground with a See-Saw in it. The See-Saw bit. By "get on", we mean that the process
processes will play the protocol on the See-Saw, adher- changes its state to "on left side" or "on right side"
ing to strict rules. When each process enters the play- according to whichever side is up. Since flipping
ground (wakes up), it sits on the up-side of the See-Saw the See-Saw bit causes that side to go down, the
causing it to swing to the ground. Only a process on the process ends up on the down-side of the See-Saw.
ground (or down-side) can get off and when it does the
R u l e 2: (Emitter) Applicable if the scheduled process
See-Saw must swing to the opposite orientation. These
is on the down-side of the See-Saw, has one or
rules enforce a balance invariant which says that the
more tokens, and the token slot is empty. The
number of processes on each side of the See-Saw differs
process flips the token bit (to indicate that a to-
by at most one (the heavier side always being down).
ken is present) and decrements by one the count
Each process enters the playground with two tokens.
of tokens it possesses. If its token count thereby
The protocol will force the processes on the b o t t o m of
becomes zero, the process flips the See-Saw bit and
the See-Saw to give away tokens to the processes on
gets off the See-Saw by setting its state to "got off".
the top of the See-Saw. Thus, token flow will change
direction depending on the orientation of the See-Saw. R u l e 3: (Absorber)Applicable if the scheduled process
Tokens can be neither created nor destroyed. The idea is on the up-side of the See-Saw and a token is
of the protocol is to cause tokens to concentrate in the present in the token slot. The process flips the
hands of a single process. A process seeing 2k tokens token bit (to indicate that a token is no longer
knows that at least k processes are awake. Hence, if it present) and increments by one the count of tokens
is guaranteed that eventually some process will see at it possesses.
least 27- tokens, the protocol is by definition a wakeup
protocol with p a r a m e t e r 7-, even if the process does not Note that if a scheduled process is on the down-side,
know the value of 7- and hence does not know when the has 2 k - 1 tokens, and a token is present in the token
goal has been achieved. slot, then, although no rule is applicable, the process
Following is the complete description of the See-Saw nevertheless sees a total of 2k tokens and hence knows
protocol. The 4-valued shared register is easily inter- that k processes have waked up.

109
The two main ideas behind the protocol can be stated 4 Fault-tolerant solutions
as invariants.
In this section, we explore solutions to the wakeup prob-
T O K E N INVARIANT: The number of tokens in the lem which can tolerate t > 0 process failures.
system is either 2n or 2n + 1 and does not change at The See-Saw protocol, presented in the previous sec-
any time during the protocol. (The number of tokens in
tion, cannot tolerate even a single crash failure for any
the system is the total number of tokens possessed by r > n / 3 . The reason is that the faulty process may
all of the processes, plus 1 if a token is present in the
fail after accumulating 2 n / 3 tokens, trapping two other
token bit slot.)
processes on one side of the See-Saw, each with 2 n / 3
B A L A N C E INVARIANT: The number of processes on tokens. When 7" < n / 3 , the See-Saw protocol can toler-
the left and right sides of the See-Saw is either perfectly ate at least one failure. As the parameter 7" decreases,
balanced or favors the down-side of the See-Saw by one the number of failures that the protocol can tolerate
process. increases. This fact is captured in our first theorem.
T h e o r e m 3.1: Let t = O. The See-Saw protocol uses T h e o r e m 4.1: The See-Saw protocol is a wakeup pro-
a 4-valued shared register and is a wakeup protocol for tocol for n, t, r, where r <_ ((2n - 1)/(2t + 1) + 1)/2.
n, t, 7" (and p = 1), where n and r are arbitrary and t =
We note that the See-Saw protocol can tolerate up to
O. (Note that the rules f o r the protocol do not mention n / 2 - 1 initial failures [21, 49]. In the rest of this sec-
n o r 7..)
tion, we present three t-resilient solutions to the wakeup
In applications of wakeup protocols, it is often desir- problem. Notice that when the number of failures t is
able for the processes to know the value of v so that a constant, it is possible using a constant number of
a process learning that r processes are awake can stop values for one process to learn that n - t processes are
participating in the wakeup protocol and take some ac- awake.
tion based on that knowledge. The See-Saw protocol T h e o r e m 4.2: For any t < n / 6 , there is a wakeup pro-
can be easily modified to have this property by adding tocol which uses a single 8t+l-valued register and works
a termination rule immediately after Rule 1:
f o r any 7- < n - t.
R u l e l a : ( E n d of protocol) Applicable if the scheduled T h e o r e m 4.3: For any t < n, there is a wakeup proto-
process is on the See-Saw and sees at least 2r to- col which uses a single n-valued register and works f o r
kens, where the number of tokens the process sees any r < n - t.
is the number it possesses, plus one if a token is
present in the token slot. The process thus knows T h e o r e m 4.4: For any t < n / 2 , there is a wakeup pro-
that 7. processes have waked up. It gets off the See- tocol which uses a single O(t)-valued register and works
Saw (i.e., terminates) by setting its state to "got for any 7. _< [n/2J + 1.
Off ~ ,

The See-Saw protocol can also be used to solve the 5 A Lower Bound
leader election problem by electing the first process that
sees 2n tokens. By adding a 5th value, everyone can be In this section, we establish a lower bound on the num-
informed that the leader was elected, and the leader ber of shared memory values needed to solve the wakeup
can know that everyone knows. Now, the leader can problem, where only one process is required to learn
transmit an arbitrary message, for example a consensus that r processes are awake, assuming t processes may
value, to all the other processes without using any more crash fail (thus p = 1). To simplify the presentation,
new values through a kind of serial protocol. This leads we assume that 9 < t < 2 n / 3 and 7- > n / 3 . Also, recall
to our next theorem. that we already assumed that r < n - t. For the rest of
this section, let
T h e o r e m 3.2: In the absence of faults, it is possible
to reach consensus on one of k values using a single tx/t - t t2 - 1
5-valued shared register. W - - - U -- - - (1)
n-t ' 4(n-t)'
Finally, we claim that the See-Saw protocol cannot be
improved to use only a single binary register. A slightly (2)
weaker result than Theorem 3.3 was also proved by Joe l°g2 ( tn--~t -k- 3.5)"
tlalpern [27]. The question whether 3 values suffice is Note that W < U since t > 9.
still open.
T h e o r e m 5.1: Let P be a wakeup protocol with param-
T h e o r e m 3.3: There does not exist a solution to the eters n, t and r. Let V be the set of shared m e m o r y
wakeup problem which uses only a single binary register values used by P . Then IVI > w ~.
when r > 3.

110
When we take t to be a constant fraction of n we get D e f i n i t i o n : Let V be the set of shared m e m o r y values
the following immediate corollary. of protocol P. The teachability graph G of protocol P is
the labeled directed multigraph with node set V which
C o r o l l a r y 5.1: Let P be a wakeup protocol with pa-
has an edge from node a to node b labeled with r iff
rameters n, t and v, where t --- n/c. Let V be the
a ~ b holds. (Note that there m a y be several edges
set of shared memory values used by P and let 7 =
1/(21og2(c+ 2.5)). Then, }V] = ft(n'Y). with different labels between the same two nodes. Note
also that G is finite since a : : ~ b implies that r _< [VI. )
Theorem 5.1 is immediate if V is infinite, so we as-
D e f i n i t i o n : A graph C is closed at node a w.r.t. G if a
sume throughout this section that V is finite. The proof
is in C and for every node b in G, if (a, b) is an edge of
consists of several parts. First we define a sequence of
directed graphs whose nodes are shared m e m o r y values G then b is in C.
in V. Each component C of each graph in the sequence D e f i n i t i o n : A multigraph T is terminal w.r.t. G if T is
has a cardinality k¢ and a weight w~. We establish by a subgraph of G, all of T ' s components are rooted, and
induction that k¢ _> rain(we, W) a. Finally, we argue T has a component C with root a among its minimal
that in the last graph in the sequence, every component weight components that is closed at node a w . r . t . G .
C has weight w~ > W. Hence, IV[ > k~ > W a. In the rest of the section we show that the reacha-
bility graph G of any wakeup protocol with parameters
5.1 Reachability Graphs and Terminal n , t , r has size _> W% We do that by constructing a
multigraph T which is terminal w.r.t. G and has size
Graphs
> W a. Theorem 5.1 follows from these facts.
Let V be the alphabet of the shared register. We say
that a value a E V appears m times in a given run if 5.2 Reachability Graphs
there are (at least) m different prefixes of that run where
the value of the shared register is a. The teachability graphs are defined for all protocols.
Now we concentrate on such graphs constructed from
it
a , b denotes that there exists a run in which at wakeup protocols only. We show that when the weight
most u processes participate, the initial value of of a rooted component, say C, is sufficiently small, an
the shared register is a, and the value b appears at edge exists with a label q from a root of C to a node
least once. not in C, and we can bound the size of q.
For later reference we call the following three inequal-
a : : ~ b denotes that there exists a run in which exactly
ities,
u processes participate, each process that partici-
pates takes infinitely m a n y steps, the initial value (i) pq + ( p - 1)w < n,
of the shared register is a, and the value b appears (iX) pq > n - t,
infinitely m a n y times. (iii) max(q,w) < v

Clearly, a :=~ b implies a r , b but not vice versa. Also, the zigzag inequalities. These inequalities play an im-
for every a, there exists b such that a :=~ b. portant role in our exposition.
We use the following graph-theoretic notions. A di- L e m m a 5.1: Given teachability graph G of a wakeup
rected multigraph 1 G is weakly connected if the under-
protocol P with parameters n, t, 7" and a rooted subgraph
lying undirected multigraph of G is connected. A multi- C of G with root a and weight w, if there exist positive
graph G'(V', E') is a subgraph of G(V, E) if WI _C E and integers p and q that satisfy the zigzag inequalities, then
V I _ V. A multigraph G t is a component of a multi-
for any node b of G, ira ~ b is an edge of G then b is
graph G if it is a weakly connected subgraph of G and
not in C.
for any edge (a, b) in G, either both a and b are nodes
of G' or both a and b are not in G'. A node is a rool P r o o f : We assume to the contrary that there exists p
,3f a multigraph if there is a directed path from every and q that satisfy the zigzag inequalities, and there is
other node in the multigraph to that node. A rooted an edge a ~ b s u c h that b belongs to C. Let p b e a
.Traph (rooted component) is a graph (component) with q-fair run starting from a in which exactly q processes
at least one root. A labeled multigraph is a multigraph participate and b is written infinitely often. Since b is
together with a label function that assigns a weight in N in C, there is a path from b to a such that the sum of
to each edge of G. The weight of a labeled multigraph all the labels of edges in that path is at most w and
is the sum of the weights of its edges. hence b to, a. This allows us to construct a run with
We now define the notion of a reachability, graph of a pq non-faulty processes starting with a as follows:
given protocol P.
Run q processes according to p until b is writ-
l A multigraph c a n h a v e s e v e r a l e d g e s f r o m a t o b. ten. Run w processes until a is written. (This

111
must be possible since b ~ a.) Let these w which implies (i).
processes fail. Run a second group of q pro- Finally, we show t h a t inequality (iii) is satisfied. Re-
cesses according to p until b is written. Run call that we assume t h a t t <_ 2n/3 and r > n/3. It
a second group of w processes until a is writ- follows from these assumptions t h a t r > t/2. Since
ten, and let them fail. Repeat the above until q < t/2, obviously q < r. Also, since w _< U and
the pth group of q processes have just been run t <_ 2n/3, substituting in (1) gives w < n/3, and hence
and b has again been written. At this point, w<7-. []
pq processes belong to still-active groups, and L e m m a 5.3: lf w < W , then there exists positive inte-
( p - 1)w processes have died. If any processes
gers p and q that satisfy the zigzag inequalities and
remain, let t h e m die now without taking any
steps. Now, an infinite run p~ on the active pro- ,o(n - 0
q < - - + 3. (5)
cesses can be constructed by continuing to run t+2
the first group according to p until b is writ-
ten again, then doing the same for the second
through pth groups, and repeating this cycle Proof." Recall t h a t W < U, so in particular, w < U.
forever. Let q~ be the smallest positive integer solution to (3).
It follows from L e m m a 5.2 that q' exists. Let q be the
The result is a pq-fair run. Moreover, no reliable process
smallest positive integer for which there exists a positive
can distinguish this run from p, and hence no reliable
integer p such that p and q satisfy the zigzag inequalities.
process ever knows (in p~) that more than q processes
It follows from L e m m a 5.2 that q exists and q < q'.
are awake. Also, obviously, no faulty process knows that
If q = 1 than clearly 1 < w(n - t)/(t + 2) + 2 and
more than w processes are awake. Since max(q, w) < 7"
the l e m m a holds. Assume q > 1. Since q < q' it follows
but at least pq >_ n - t >_ r processes are awake in p~,
that
this leads to a contradiction to the assumption that P ( q - 1) 2 - t ( q - 1) + w ( n - t ) > O.
is a wakeup protocol. []
Thus,
L e m m a 5.2: Assume w <_ U. Then the inequality q2 -4- 1 -4- t -4- w ( n -- t )
q < (6)
x - + - t ) _< 0 (3) t+2
has a positive integer solution. The smallest positive By L e m m a 5.2,
integer solution for (3) is
[t- 'x/t2-4w(n-t)'] < t q _ It - I /t2 - 4w(n - t)l] . (7)
q = 2 _ (4)
Since w < W, we can substitute W for w in inequality
There exists a positive integer p such that p and q satisfy (7) and get that q < IvY] < v ~ + 1. Thus, q2 <
the zigzag inequalities.
t + 2v/t + 1, so it follows from (6) and the assumption
P r o o f : We first show that (3) has a positive solution. that t > 9 that inequality (5) holds. []
Using the quadratic formula, we get that the roots of
(3) are 5.3 Terminal Graphs
t - I~t 2 - 4w(n - t)l and t + I ~ t ~ - 4w(n - t)l In this subsection, we show that the teachability graph
2 2 G of any wakeup protocol with parameters n , t , r , has
Since w < U the discriminant t 2 - 4w(n - t) > 1. Since at least one subgraph which is terminal w.r.t. G and
the value of the discriminant is less than t 2 it follows has size > W a. We first prove that the weight of any
that the roots are positive. Moreover, the difference rooted component of any terminal graph w.r.t. G has
of the two roots is at least 1; hence there is a positive weight > U. Then we show that this implies that there
integer x satisfying (3), and q is the least such integer. exists a terminal graph w.r.t. G, all of whose rooted
Moreover, since t is an integer, t/2 is either an integer or components have size > W ~.
lies exactly half way between two integers, so inequality L e m m a 5.4: Let G be the reachability graph of a
(4) holds. wakeup protocol with parameters n,t, 7" and let T be ter-
Next we show t h a t there exists a positive integer p minal w.r.t. G. Any rooted component o f T has weight
such that p and q satisfy the inequalities (i) and (it). >U.
Let p = [(n - t)/q]. The choice of p clearly satisfies
P r o o f : Assume to the contrary t h a t T has a minimal-
(it). Also from (3) it follows t h a t
weight component C of weight w < U. Then, by L e m m a
5.2, there exist positive integers q and p that satisfy the
p= < ~ + 1 < ~
-- q -- q+w zigzag inequalities. From L e m m a 5.1, there is a node b

112
not in C and an edge a ~ b in G. Therefore, T is not rooted component of T/ are both at least 2. Now, sup-
a terminal w.r.t. G, a contradiction. 1:3 pose the procedure adds an edge of label q from compo-
nent C1 ofsize kl and weight wx to component C2 of size
L e m m a 5.5: Let G be the teachability graph of a
k2 and weight w2. By step 1, the new edge emanates
wakeup protocol with parameters n , t , 7". There exists a
from a minimal weight component, so wl _< w2. The
graph T which is terminal w.r.t. G, all of whose rooted
weight w of the newly formed component is Wl + w2 + q,
components have size >_ W ~.
and the number of nodes k is kl + ks. We show now
P r o o f : The following procedure constructs T by adding that k >_ min(w, W) ~.
edges one at a time to an initial subgraph To of G un- Clearly, if w~ > W then min(w2, W) ~ = min(w, W) ~
til step 2 fails. The initial subgraph To consists of all and k2 < k, so by the induction hypothesis we are done.
the nodes of G. For each node a there is exactly one Hence, we assume that w~ < W, so also Wl < W. Since
outgoing edge a ~ b in To. We note two facts about Wl < W it follows from L e m m a 5.3 that there exist
To: (1) for every edge a ~ b, a # b, and (2) every positive integers p~ and q~ that satisfy the zigzag in-
component of To has at least one root. Fact (1) follows equalities and q' < (wl(n - - t ) ) / ( t + 2) + 3; hence by
from Lemma 5.1, choosing q = 1 and p = n - t (w = 0); Lemma 5.1 there is an edge of label q~ from any root of
while (2) follows from the fact that the outdegree of any C1 to some node not in C1. Thus, by the minimality
node is exactly one. Also, it follows from (1) that the of q (the weight of the edge in step 2), it follows that
weight and size of any component of To is at least 2. q < q' which implies that q < (wl(n - t ) ) / ( t + 2) + 3;
At any stage of the construction, every component hence,
of the graph built so far will have at least one root.
Added edges always start at a root and end at a node w = wl+w2+q (8)
of a different component. After adding an edge (a,b), _< (~-~-~+1 wx+w2+3 (9)
the components of a and b are joined together into a
single component whose root is the root of b's compo- n-t )
nent, and the weight of the new component is the sum _< (~--~-~+2.5 w l + w 2 . (10)
of the weights of the two original components plus the
label of the edge from a to b. I I I t
Let k x = W l = , a n d k 2 = w 2 =. Then k l < k 2 ' w 1 = k l
I~
,
,O
P r o c e d u r e f o r a d d i n g a n e w e d g e t o T: a n d w 2 = k 2 . We claim that
S t e p 1: Select an arbitrary component C of minimal n t
weight and an arbitrary root a of C. + 2.5) wx + w~

S t e p 2: Find the smallest integer q for which there is n-t +2.5 k1 +k~ (11)
an edge a ~ b in G such that b is not in C. This
step fails if no such edge exists. _< (k; + k;?.
S t e p 3: Place the edge a ~ b into T. It is not difficult to see that since (n - t ) / ( t + 2 ) + 3 . 5 =
,~ , I I I , .

Let Ti be a graph that is constructed after i applications 2 , equahty holds for k x = k 2. As k 2 I s increased to be
I

of the above procedure, where To is an initial graph as larger than kl, the right side increases more rapidly than
defined above. Clearly, any such sequence {T0,T1,...} the left side since /3 > 1; hence, the inequality holds.I
is finite and the last element is terminal w . r . t . G . Finally, by the induction hypothesis, kl > wx ~ = kl
I

We prove by induction on i, the number of applica- and k~ > w2 a = k 2. Hence,


tions of the procedure, that for any graph T/, all of the
components of T/ are rooted, and for any rooted com- (k'1 + k'2)z < (kl + k2) ~ = k #. (13)
ponent C it is the case that k _> rain(w, W) ~, k _> 2 Putting equations ( 8 ) - ( 1 3 ) together gives w _< k ~, so
and w >_ 2, where k is the size of C and w is its weight.
k > w ~ > min(w, W) ~. CI
This together with Lemma 5.4 and the fact that W < U
completes the proof. Theorem 5.1 follows immediately from Lemma 5.5.
Let/3 = 1/a. As discussed before, each component C
ofT0 has a root and has size k at least 2. The component
C consists of exactly k edges with label 1, so its weight 6 Relation to Other Problems
is also k. Hence, the base case holds since/3 > 1. In this section we show that there are efficient reduc-
Since To is a subgraph of ~ which also includes all
tions between the wakeup problem for r = [n/2J + 1
nodes of T/, it follows that the size and weight of any and the consensus and leader election problems. Hence,
the wakeup problem can be viewed as a fundamental

113
problem that captures the inherent difficulty of these ble to simultaneously reset all parts of the system to a
two problems. The following Lemma shows that in or- known initial state.
der to decide on some value in a t-resilient consensus Our results are interesting for several reasons:
protocol, it is always necessary (and in some cases also
sufficient) to learn first that at least t + 1 processes have • They give a quantitative measure of the cost of
waked up, and similarly in order to be elected in a t- fault-tolerance in shared memory parallel machines
resilient leader election protocol, it is always necessary in terms of communication bandwidth.
to learn that at least t + 1 processes have waked up. An • They apply to a model which more accurately re-
immediate consequence of the lemma is that there is no flects reality.
consensus or leader election protocol that can tolerate
[n/2~ failures. • They relate recent results from three different ac-
tive research areas in parallel and distributed com-
L e m m a 6.1: (I) Any t-resilient consensus (leader elec-
puting, namely:
tion) protocol is a t-resilient wakeup protocol for any
7" < t + l and p = n - t (p=l); (2) For any t < n/3, - Results in shared memory systems [2, 11, 19,
there exists t-resilient consensus and leader election pro- 31, 36, 38, 39, 46, 50, 51].
tocols which are not t-resilient wakeup protocols for any
- The theory of knowledge in distributed sys-
r>t+2.
tems [6, 14, 15, 17, 18, 22, 28, 25, 29, 30, 26,
T h e o r e m 6.1: Any protocol that solves the wakeup 33, 37, 40, 41].
problem for any t < n/2, r > n / 2 and p = 1, using a
- Self stabilizing protocols [3, 4, 8, 9, 12, 24, 35,
single v-valued shared register, can be transformed into
47].
a t-resilient consensus (leader election) protocol which
uses a single 8v-valued (4v-valued) shared register. • They give a new point of view and enable a deeper
From Theorems 6.1 and 4.4, it follows that for any understanding of some classical problems and re-
t < n/2, there is a t-resilient consensus (leader elec- sults in cooperative computing.
tion) protocol that uses an O(t)-valued shared regis-
• They are proved using techniques that will likely
ter. Next we show that the converse of Theorem 6.1 have application to other problems in distributed
also holds. T h a t is, the existence of a t-resilient con- computing.
sensus or leader election protocol which uses a single
v-valued shared register implies the existence of a t-
resilient wakeup protocol for any r < In/2] + 1 which Acknowledgement
uses a single O(v)-valued shared register.
We thank Joe Halpern for helpful discussions.
T h e o r e m 6.2: Any t-resilient protocol that solves the
consensus or leader election problem using a single
v-valued shared register can be transformed into a t- References
resilient protocol that solves the wakeup problem for any
7- _< [n/2J + 1 which uses a single 4v.valued shared reg- [1] K. Abrahamson. On achieving consensus using
ister. shared memory. In Proc. 7th A CM Syrup. on Prin-
It follows from Theorem 6.2 that the lower bound ciples of Distributed Computing, pages 291-302,
we proved in Section 5 for the wakeup problem when 1988.
r = [n/2J + 1 also applies to the consensus and leader [2] B. Bloom. Constructing two-writer atomic regis-
election problems. Finally, an immediate corollary of ters. In Proc. 6th A C M Syrup. on Principles of
Theorem 6.1 and Theorem 6.2 is that the consensus Distributed Computing, pages 249-259, 1987.
and leader election problems are space-equivalent. T h a t
is, there is a t-resilient consensus protocol that uses an [3] G. M. Brown, M. G. Gouda, and C.-L. Wu. Token
O(t)-valued shared register if["there is a t-resilient leader systems that self-stabilize. 1EEE Trans. on Com-
election protocol that uses an O(t)-valued shared regis- puters, 38(6):845-852, June 1989.
ter.
[4] J. E. Burns and J. Pachl. Uniform self-stablilizing
rings. A C M Trans. on Programming Languages and
7 Conclusions Systems, 11(2):330-344, 1989.

We study the fundamental new wakeup problem in a [5] N. Carriero and D. Gelernter. Linda in context.
new model where all processes are programmed alike, Communications of the A CM, 32(4):444-458, April
there is no global synchronization, and it is not possi- 1989.

114
[6] M. Chandy and J. Misra. How processes learn. [19] M. J. Fischer, N. A. Lynch, J. E. Burns, and
Journal of Distributed Computing, 1:40-52, 1986. A. Borodin. Distributed FIFO allocation of identi-
cal resources using small shared space. A CM Trans.
[7] E. Chang and R. Roberts. An improved algorithm on Programming Languages and Systems, 11(1):90-
for decentralized extrema-finding in circular config- 114, 1989.
uration of processes. Communications of the ACM,
22(5):281-283, 1979. [20] M. J. Fischer, N. A. Lynch, and M. Merritt. Easy
impossibility proofs for distributed consensus prob-
[8] E. W. Dijkstra. Self-stablizing systems in spite of lems. Journal of Distributed Computing, 1:26-39,
distributed control. Communications of the ACM, 1986.
17:643-644, 1974.
[21] M. J. Fischer, N. A. Lynch, and M. S. Paterson. Im-
[9] E. W. Dijkstra. A belated proof of self- possibility of distributed consensus with one faulty
stabilization. Journal of Distributed Computing, process. Journal of the ACM, 32(2):374-382, April
1:5-6, 1986. 1985.
[10] D. Dolev, C. Dwork, and L. Stockmeyer. On the [22] M.J. Fischer and L. D. Zuck. Reasoning about un-
minimal synchronism needed for distributed con- certainty in fault-tolerant distributed systems. In
sensus. Journal of the ACM, 34(1):77-97, 1987. M. Joseph, editor, Formal Techniques in Real-Time
and Fault-Tolerant Systems, pages 142-158. Lec-
[11] D. Dolev, E. Gafifi, and N. Shavit. Toward a non- ture Notes in Computer Science, vol. 331, Springer-
atomic era: /-exclusion as a test case. In Proc. 20th Verlag, 1988.
A CM Syrup. on Theory of Computing, pages 78-92,
1988. [23] A. Gottlieb, R. Grishman, C.P. Kruskal, K.P.
McAuliffe, L. Rudolph, and M. Snir. The NYU
[12] S. Dolev, A. Israeli, and S. Moran. Self stabiliza- ultracomputer--designing an MIMD parallel com-
tion of dynamic systems assuming only read write puter. IEEE Trans. on Computers, pages 175-189,
atomicity, submitted for publication, 1989. February 1984.
[13] C. Dwork, N. Lynch, and L. Stockmeyer. Consen- [24] M. G. Gouda. The stabilizing philosopher: Asym-
sus in the presence of partial synchrony. Journal of metry by memory and by action. Science of Com-
the ACM, 35(2):288-323, 1988. puter Programming, 1989. To appear.
[14] C. Dwork and Y. Moses. Knowledge and common [25] V. Hadzilacos. A knowledge theoretic analysis of
knowledge in a Byzantine environment i: Crash atomic commitment protocols. In Proc. 6th A CM
failures. In Theoretical Aspects of Reasoning about Symp. on Principles of Database Systems, pages
Knowledge: Proceedings of the 1986 Conference, 129-134, 1987.
pages 149-169. Morgan Kaufmann, 1986.
[26] J. Halpern and L. Zuck. A little knowledge goes
115] R. Fagin, Y. J. Halpern, and M. Vardi. A model a long way: Simple knowledge-based derivations
theoretic analysis of knowledge. In Proc. 25th IEEE and correctness proofs for a family of protocols. In
Syrup. on Foundations of Computer Science, pages Proc. 6lh A C M Syrup. on Principles of Distributed
268-278, 1984. Computing, pages 269-280, August 1987.

I16] M. J. Fischer. The consensus problem in unreliable [27] Y. J. Halpern. personal COmlnunicatiou.
distributed systems (a brief survey). In M. Karpin- [28] Y. J. Halpern. Reasoning about knowledge: An
sky, editor, Foundations of Computation Theory, overview. In Theoretical Aspects of Reasoning about
pages 127-140. Lecture Notes in Computer Science, Knowledge: Proceedings of the 1986 Conference,
vol. 158, Springer-Verlag, 1983. pages 1-17. Morgan Kaufrnann, 1986.
[17] M. J. Fischer and N. Immerman. Foundations of [29] Y. J. Halpern and R. Fagin. A formal model
knowledge for distributed systems. In Theoretical of knowledge, action, and communication in dis-
Aspects of Reasoning about Knowledge: Proceed- tributed systems: Preliminary report. In Proc. 4th
ings of the 1986 Conference, pages 171-185. Mor- A C M Syrup. on Principles of Distributed Comput-
gan Kaufmann, March 1986. ing, pages 224-236, 1985.
[28] M. J. Fischer and N. Immerman. Interpreting log- [30] Y. J. Halpern and Y. Moses. Knowledge and com-
ics of knowledge in propositional dynamic logic mon knowledge in a distributed environment. In
with converse. Information Processing Letters, Proc. 3rd ACM Syrup. on Principles of Distributed
25(3):175-181, May 1987. Computing, pages 50-61, 1984.

115
[31] P. M. Herlihy. Impossibility and universality re- [44] G. H. Pfister and et. al. The IBM research par-
sults for wait-free synchronization. In Proc. 7th allel processor prototype (RP3): Introduction and
A CM Symp. on Principles of Distributed Comput- architecture. In Proceedings International Confer-
ing, pages 276-290, 1988. ence on Parallel Processing, 1985.
[32] D. S. Hirschberg and J.B. Sinclair. Decentralized [45] A. G. Ranade, S. N. Bhatt, and S. L. Johnsson.
extrema-finding in circular configuration of pro- The fluent abstract machine. TechnicM Report
cesses. Communications of the ACM, 23:627-628, YALEU/DCS/TR-573, Department of Computer
1980. Science, Yale University, January 1988.

[33] S. Katz and G. Taubenfeld. What processes know: [46] G. Taubenfeld. Leader election in the presence of
Definitions and proof methods. In Proc. 5th ACM n - 1 initial failures. Information Processing Let-
Syrup. on Principles of Distributed Computing, ters, 33:25-28, 1989.
pages 249-262, August 1986.
[47] G. Taubenfeld. Self-stabilizing Petri nets. Techni-
[34] E. Korach, S. Moran, and S. Zaks. Tight lower cal Report YALEU/DCS/TR-707, Department of
and upper bounds for some distributed algorithms Computer Science, Yale University, May 1989.
for a complete network of processors. In Proc. 3rd
ACM Symp. on Principles of Distributed Comput- [483 G. Taubenfeld, S. Katz, and S. Moran. Impossibil-
ing, pages 199-207, 1984. ity results in the presence of multiple faulty pro-
cesses. In Proc. of the 9th FCT-TCS conference,
[35] H. S. M. Kruijer. Self-stabilization (in spite of dis- Bangalore, India, December 1989. Previous version
tributed control) in tree-structured systems. Infor- appeared as Technion TR-#492, January 1988.
mation Processing Letters, 2:91-95, 1979.
[49] G. Taubenfeld, S. Katz, and S. Moran. Initial
[36] L. Lamport. The mutual exclusion problem: State- failures in distributed computations. International
ment and solutions. Journal of the ACM, 33:327- Journal of Parallel Programming, 1990. To appear.
348, 1986. Previous version appeared as Technion TR-#517,
August 1988.
[37] D. Lehmann. Knowledge, common knowledge and
related puzzles. In Proc. 3rd A CM Syrup. on Prin- [50] G. Taubenfeld and S. Moran. Possibility and im-
ciples of Distributed Computing, pages 62-67, 1984. possibility results in a shared memory environment.
In J.C. Bermond and M. Raynal, editors, Proc. 3rd
[38] C. M. Lout and H. Abu-Amara. Memory re- International Workshop on Distributed Algorithms,
quirements for agreement among unreliable asyn- pages 254-267. Lecture Notes in Computer Science,
chronous processes. Advances in Computing Re- vol. 392, Springer-Verlag, 1989.
search, 4:163-183, 1988.
[5i] P. M. B. Vitanyi and B. Awerbuch. Atomic shared
[39] N. A. Lynch and M. J. Fischer. A technique for register access by asynchronous hardware. In Proc.
decomposing algorithms which use a single shared 27th IEEE Symp. on Foundations of Computer Sci-
variable. Journal of Computer and System Sci- ence, pages 223-243, 1986.
ences, 27(3):350-377, December 1983.
[40] Y. Moses and M. R. Tuttle. Programming simulta-
neous actions using common knowledge. Algorilh-
mica, 3:121-169, 1988.
[41] R. Parikh and R. Ramanujam. Distributed pro-
cesses and the logic of knowledge. In R. Parikh,
editor, Proceedings of the Workshop on Logic of
Programs, pages 256-268, 1985.
[42] M. Pease, R. Shostak, and L. Lamport. Reaching
agreement in the presence of faults. Journal of the
ACM, 27(2):228-234, 1980.
[43] G. L. Peterson. An O(nlogn) unidirectional al-
gorithm for the circular extrema problem. ACM
Trans. on Programming Languages and Sys$ems,
4(4):758-762, 1982.

116