
GAME THEORY

Hans Peters

Dries Vermeulen

NAKE
University of Maastricht


September 2005

Preface
This material has been composed for use in the Ph.D. NAKE-course Game Theory,
Fall 2005. It is based on several courses offered in the Econometrics program at the
University of Maastricht and in the Mathematics program at the RWTH Aachen,
Germany. It also borrows from several textbooks on game theory, in particular A
Primer in Game Theory by R. Gibbons.
Due to the potentially quite different interests, backgrounds and previous education
of students it is not easy to find the right level for a course like this one. We have
chosen to start from scratch and assume no more than a basic level in mathematics
and calculus. On the one hand, for students totally unacquainted with game theory
there is a lot of material compressed in the first six weeks of this course, namely this
reader consisting of two parts: Part I on Noncooperative Game Theory (about 100
pages) and Part II on Cooperative Game Theory (about 40 pages). On the other
hand, for students already more or less familiar with game theory, this first part of
the course can be seen as a repetition of the basic theory.
The second part of the course will specialize in topics treated in this reader as well
as new topics. This may depend on the interests of the students.
Hans Peters (H.Peters@ke.unimaas.nl)
Dries Vermeulen (D.Vermeulen@ke.unimaas.nl)

Contents

1 Introduction
  1.1 Noncooperative games
    1.1.1 Zerosum games
    1.1.2 Nonzerosum games
    1.1.3 Games in extensive form
  1.2 Cooperative games
    1.2.1 Transferable utility games
    1.2.2 Bargaining games
  1.3 Concluding remarks

I Noncooperative Game Theory

2 Nash Equilibrium
  2.1 Strategic Games
  2.2 Finite Two-Person Games
  2.3 Nash Equilibrium
  2.4 Applications

3 The Mixed Extension
  3.1 Mixed Strategies
  3.2 The Mixed Extension of a Finite Game
  3.3 Solving Bimatrix Games
  3.4 The Interpretation of Mixed Strategy Nash Equilibria

4 Zero-Sum Games

5 Extensive Form Games: Perfect Information
  5.1 Games in Extensive Form: the Model
  5.2 The Strategic Form of an Extensive Form Game
  5.3 Subgame Perfect Equilibria
  5.4 Applications of Backwards Induction

6 Extensive Form Games: Imperfect Information
  6.1 Information Sets
  6.2 Mixed and Behavioral Strategies
  6.3 Equilibria
  6.4 Perfect Bayesian Nash Equilibria

7 Signaling Games
  7.1 The Model
  7.2 The Beer and Quiche Game
  7.3 The Spence Signaling Model

Chapter 1

Introduction
Game Theory arose as a mathematical discipline (American Mathematical Society
code 90D). It is, however, a mathematical discipline which is inspired by economic,
political, or social problems rather than by problems from science, such as physics.
The first main work in Game Theory was the book Theory of Games and Economic
Behavior by John von Neumann and Oskar Morgenstern (Princeton University Press,
Princeton, 1944). The title of this book reflects its main source of inspiration.
Usually a distinction is made between cooperative and noncooperative game theory.
The idea is that in a cooperative game binding agreements are possible, whereas this
is not the case in a noncooperative game. The idea of binding agreements, however,
is not unambiguous. Therefore, it seems preferable to put the different modeling assumptions at the forefront. In a noncooperative game one defines the set of players,
their strategy sets, and the payoff functions, and the central solution concept is that
of Nash equilibrium, as a specific strategy combination. Thus, these games may also
be called strategy-oriented. In a cooperative game one defines the set of players and
the payoffs that can be reached by players and coalitions of players. So such a game
might also be called payoff-oriented. In a cooperative game, the underlying assumption is that solutions (e.g., payoff vectors, coalition structures) can be legalized by
binding contracts. Noncooperative game theory is mainly concerned with studying
Nash equilibrium and variations thereof. The usual approach in cooperative game
theory is the axiomatic approach, and there is not one central solution concept.
This chapter gives a brief introduction to noncooperative (Section 1.1) and cooperative
(Section 1.2) game theory. Some relations between the two approaches are mentioned
in Section 1.3. Our introduction is based on examples, mainly from economics. Each
example will consist of three parts: a story, a model, and possible solutions. We will
not be rigorous but, instead, try to convey the flavor of the associated mathematical
(game theoretical) analysis. For a more extensive and/or rigorous treatment the
reader is referred to the rest of this reader and other literature, a sample of which is
included in the list of references at the end of this chapter.
1.1 Noncooperative games

In a noncooperative game binding agreements between the players are, in principle, not possible. Therefore, a solution of such a game should be self-enforcing in some
way or another. The central solution concept, Nash equilibrium, has this property.
The examples below involve only two players. More generally, however, there can be
any finite or even uncountably infinite number of players.

1.1.1 Zerosum games

Zerosum games are games where the sum of the payoffs of the players is always equal
to zero. In the two-person case this implies that these payoffs have opposite signs.
More generally, strictly competitive games are games where the interests of the players
are strictly opposed. Zerosum games but also constant-sum games are examples of
such games. (See Chapter 4 for a further treatment.)
Zerosum games were first explicitly studied by John von Neumann (1928) in his article
Zur Theorie der Gesellschaftsspiele (Mathematische Annalen, 100, pp. 295–320) in
which he proved the famous Minimax Theorem.
The Battle of the Bismarck Sea
The story An example of a situation giving rise to a zerosum game is the Battle
of the Bismarck Sea (taken from Rasmusen, 1989). The game is set in the South Pacific in 1943. The Japanese admiral Imamura has to transport troops across the Bismarck Sea to New Guinea, and the American admiral Kenney wants to bomb the transport. Imamura has two possible choices: a shorter Northern route (2 days) or a longer Southern route (3 days), and Kenney must choose one of these routes to send
his planes to. If he chooses the wrong route he can call back the planes and send them
to the other route, but the number of bombing days is reduced by 1. We assume that
the number of bombing days represents the payoff to Kenney in a positive sense and
to Imamura in a negative sense.
A model    The Battle of the Bismarck Sea problem can be modeled as in the following table:

              North   South
    North       2       2
    South       1       3

This situation represents a game with two players, namely Kenney and Imamura.
Each player has two possible choices; Kenney (player 1) chooses a row, Imamura
(player 2) chooses a column, and these choices are to be made independently and
simultaneously. The numbers represent the payoffs to Kenney. For instance, the
number 2 up left means that if Kenney and Imamura both choose North, the payoff
to Kenney is 2 and to Imamura −2. [The convention is to let the numbers denote the
payoffs from player 2 (the column player) to player 1 (the row player).] This game is
an example of a zerosum game because the sum of the payoffs is always equal to zero.


A solution In this particular example, it does not seem difficult to predict what
will happen. By choosing North, Imamura is always at least as well off as by choosing
South, as is easily inferred from the above table of payoffs. So it is safe to assume
that Imamura chooses North, and Kenney, being able to perform this same kind of
reasoning, will then also choose North, since that is the best answer to the choice of
North by Imamura. Observe that this game is easy to analyse because one of the
players has a weakly dominant choice, i.e., a choice which is always at least as good
(giving always at least as high a payoff) as any other choice, no matter what the
opponent decides to do.
Another way to look at this game is to observe that the payoff 2 resulting from the
combination (North,North) is maximal in its column (2 ≥ 1) and minimal in its row (2 ≤ 2). So neither player has an incentive to deviate. When discussing the more general nonzerosum game below, we will see that this is exactly the definition of a Nash equilibrium. In this zerosum case, it in fact means that the row player maximizes his minimal payoff by playing the first row (because 2 = min{2, 2} ≥ 1 = min{1, 3}), and the column player (who has to pay according to our convention) minimizes the maximal amount that he has to pay (because 2 = max{2, 1} ≤ 3 = max{2, 3}).
In the following (abstract) example:

    2  3  3
    1  4  0
    1  0  4

neither player has a dominant or weakly dominant choice (row or column), but the
entry 2 (upper row, left column) is maximal in its column and minimal in its row.
Player 1 maximizes his minimal payoff whereas player 2 minimizes what he has to
pay maximally. So this is the natural solution of the game: neither player can expect
to obtain more.
In both these examples we say that the value of the game is equal to 2: this is what
each player can guarantee (receive minimally or pay maximally) in this game.
Matching Pennies
The story In the two-player game of matching pennies, both players have a coin
and simultaneously show heads or tails. If the coins match, player 2 pays one guilder
to player 1; otherwise, player 1 pays one guilder to player 2.
A model    This is a zerosum game with payoff matrix

     1   −1
    −1    1

The upper row and left column correspond to heads, the lower row and right column
to tails.
A solution Observe that in this game no player has a (weakly) dominant action,
and that there is no saddlepoint: there is no entry which is simultaneously a minimum

CHAPTER 1. INTRODUCTION

in its row and a maximum in its column. Thus, there does not seem to be a natural
way to solve the game. Von Neumann (1928) proposed to solve games like this
and zerosum games in general, by allowing the players to randomize between their choices. In the present example of matching pennies, suppose player 1 chooses heads or tails both with probability 1/2. Suppose furthermore that player 2 plays heads with probability q and tails with probability 1 − q, where 0 ≤ q ≤ 1. In that case the expected payoff for player 1 is equal to

    (1/2)[q · 1 + (1 − q) · (−1)] + (1/2)[q · (−1) + (1 − q) · 1],

which is independent of q, namely, equal to 0. So by randomizing in this way between his two choices, player 1 can guarantee to obtain 0 in expectation (of course, the actually realized outcome is always +1 or −1). Analogously, player 2, by playing heads or tails each with probability 1/2, can guarantee to pay 0 in expectation. Thus, the amount of 0 plays a role similar to that of a saddlepoint. Again, we will say that 0 is the value of this game.
Von Neumann (1928) proved that every zerosum game has a value, by proving the
Minimax Theorem, which is equivalent to the Duality Theorem of Linear Programming. In fact, linear programming can be used to solve zerosum games in general,
although for special cases with not too many choices there are other methods, for
instance geometrical methods.

1.1.2 Nonzerosum games

In a nonzerosum game the sum of the payoffs of the players does not have to be equal
to zero. In particular, the interests of the players are not per se opposed.
A coordination problem
The story This example is based on Rasmusen (1989, p. 35). Two firms (Smith
and Brown) decide whether to design the computers they sell to use large or small
floppy disks. Both players will sell more computers if their disk drives are compatible.
If they both choose for small disks the payoffs will be 2 for each. If they both choose
for large disks the payoffs will be 1 for each. If they choose different sizes the payoffs
will be −1 for each.
A model    This situation can be represented by the following table:

              Small     Large
    Small     2, 2     −1, −1
    Large    −1, −1     1, 1

In this representation player 1 (e.g., Smith) chooses a row and player 2 (Brown)
chooses a column. The payoffs are to (player 1,player 2).


Solutions Obviously, neither player has a dominant choice (row or column). The
combinations (Small,Small) and (Large,Large), however, are special in the following
sense. If player 2 believes that player 1 plays Small, the choice Small is also optimal
for player 2; and vice versa. The same holds true for the combination (Large,Large).
Such combinations are called Nash equilibria (Nash, 1951).
Suppose the game is extended by allowing randomization. Then the combination in which each player plays Small with probability 2/5 and Large with probability 3/5 is again a Nash equilibrium (in mixed strategies). This can be seen as follows. Suppose player 1 believes that player 2 will play according to these probabilities. Then player 1's first row Small yields an expected payoff of 1/5, but also the second row Large yields an expected payoff of 1/5. Hence, player 1 is indifferent between these rows, any randomization yields the same expected payoff of 1/5, and in particular the mixed strategy under consideration. An analogous argument can be given for the roles of the players reversed. So these mixed strategies are mutually optimal, i.e., they are a Nash equilibrium in mixed strategies.
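Written out, the indifference computation behind these numbers (our added check, in LaTeX notation) is

u_1(\text{Small}) = 2 \cdot \tfrac{2}{5} + (-1) \cdot \tfrac{3}{5} = \tfrac{1}{5},
\qquad
u_1(\text{Large}) = (-1) \cdot \tfrac{2}{5} + 1 \cdot \tfrac{3}{5} = \tfrac{1}{5},

and, since the game is symmetric, the same holds for player 2 against the mixed strategy (2/5, 3/5) of player 1.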
Nash (1951) showed that a Nash equilibrium in mixed strategies always exists in
games like this. The proof is based on Brouwer's or Kakutani's fixed point theorem.
More generally, the problem with Nash equilibrium is multiplicity, rather than existence. Much of the literature on noncooperative game theory is concerned with
refining the Nash equilibrium concept, in order to get rid of part or all of the multiplicity; see, in particular, van Damme (1995).
The Cournot game
The story In a standard version of the Cournot game (Cournot, 1838) two firms
sell a homogeneous product. They compete on the market by supplying quantities of
this product; the total quantity supplied (offered) determines the price at which the
market clears.
A model    The strategy sets of the players are S1 = S2 = [0, ∞). A strategy qi for firm i is interpreted as: firm i offers amount qi of the product on the market. The price-demand function (market-clearing price) is the function p = P(q1 + q2) = a − q1 − q2, where a > 0 is a given fixed number. We further assume that the marginal costs of firm i for producing the product are equal to c (so the same for both firms), where 0 ≤ c < a. The payoff functions for the players are given by the profit functions:

    K1(q1, q2) = q1(a − q1 − q2) − cq1

and

    K2(q1, q2) = q2(a − q1 − q2) − cq2.
[It is implicitly assumed here that the total quantity offered stays below a; in the
equilibrium analysis to follow this will be the case.]
Solutions    As in the previous example, a Nash equilibrium is a pair of strategies which are best replies to each other. Thus, a Nash equilibrium is a pair (q1*, q2*) such that

    K1(q1*, q2*) ≥ K1(q1, q2*)    for all q1 ∈ S1
    K2(q1*, q2*) ≥ K2(q1*, q2)    for all q2 ∈ S2.

In the economic literature, such a pair (q1*, q2*) is usually called a Cournot equilibrium. Alternatively, one may construct a model with prices instead of quantities as
strategies; then a Nash equilibrium is called a Bertrand equilibrium.
In order to calculate a Cournot equilibrium we first determine player 1's reaction function. This is obtained by solving, for each fixed q2 ≥ 0, the problem

    max_{q1 ≥ 0} K1(q1, q2).

Assuming an interior maximum, we find that this maximum is attained at q1 = (1/2)(a − c − q2). By a symmetric argument we obtain for player 2: q2 = (1/2)(a − c − q1). The Cournot equilibrium is the unique point of intersection (q1*, q2*) of the reaction functions. It is not hard to verify that

    q1* = q2* = (a − c)/3.

Another solution concept applied in duopoly theory is the concept of Stackelberg equilibrium. In such an equilibrium one player (say firm 1) is regarded as the leader while the other player (firm 2) is the follower. This means that player 1 moves first, while player 2 observes player 1's choice and reacts optimally; player 1 knows this and chooses q1 so as to maximize profits. In order to calculate the Stackelberg equilibrium we thus plug player 2's reaction function into player 1's profit function and maximize over q1, i.e.,

    max_{q1 ≥ 0} K1(q1, R(q1))

where

    R(q1) = (1/2)(a − c − q1).

If q̂1 is a solution to this maximization problem, then the pair (q̂1, q̂2), where q̂2 = R(q̂1), is a Stackelberg equilibrium. One may verify that q̂1 = (a − c)/2 and q̂2 = (a − c)/4.
4 .
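Both equilibria can be verified symbolically; the sketch below (our illustration, not part of the original text, assuming Python with the sympy library) reproduces the Cournot and the Stackelberg outcomes from the profit functions K1 and K2.

import sympy as sp

q1, q2, a, c = sp.symbols('q1 q2 a c', positive=True)
K1 = q1 * (a - q1 - q2) - c * q1
K2 = q2 * (a - q1 - q2) - c * q2

# Reaction functions from the first-order conditions.
r1 = sp.solve(sp.diff(K1, q1), q1)[0]          # (a - c - q2)/2
r2 = sp.solve(sp.diff(K2, q2), q2)[0]          # (a - c - q1)/2

# Cournot equilibrium: the intersection of the reaction functions.
print(sp.solve([q1 - r1, q2 - r2], [q1, q2]))  # q1 = q2 = (a - c)/3

# Stackelberg: the leader maximizes K1 with q2 replaced by R(q1) = r2.
q1_hat = sp.solve(sp.diff(K1.subs(q2, r2), q1), q1)[0]
print(q1_hat, sp.simplify(r2.subs(q1, q1_hat)))   # (a - c)/2 and (a - c)/4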

1.1.3 Games in extensive form

With one exception all games considered so far are static one-shot games, where the
players choose simultaneously. In particular they do not know each other's choices
(but may form beliefs about these). Such games exhibit imperfect information. The
exception is the Stackelberg equilibrium, which is not so much a different solution
concept but rather refers to a different game; the leader moves first and the follower
observes this move and then makes his choice. Such a game therefore is said to have
perfect information.
In this section, more generally, games in extensive form are considered, which enable
us to model such sequential moves.

An entry deterrence game

The story    An old question in industrial organization is whether an incumbent monopolist can maintain his position by threatening to start a price war against any
new firm that enters the market. In order to analyse this question, consider the
following game. There are two players, the entrant and the incumbent. The entrant
decides whether to Enter (E) or to Stay Out (O). If the entrant enters, the incumbent
can Collude (C) with him, or Fight (F ) by cutting the price drastically. The payoffs
are as follows. Market profits are 100 at the monopoly price and 0 at the fighting
price. Entry costs 10. Collusion shares the profits evenly.
A model The structure of this game and its possible moves and payoffs are depicted
in Figure 1.1(a). A strategy in this game is a complete plan to play the game. This
should be a plan, devised before the actual start of the game, and specifying what a
player should do in every possible contingency of the game. In this case, the entrant
has only two strategies (Enter and Stay Out) and also the incumbent firm has only
two strategies (Collude and Fight). Given these strategies, there is a corresponding
simultaneous game, called the strategic or normal form of the game, as follows:

                Collude     Fight
    Enter       40, 50     −10, 0
    Stay Out    0, 100      0, 100

Solutions From the strategic form of the game (and limiting attention to pure
strategies, i.e., without randomization) it follows that there are two Nash equilibria,
namely (Enter,Collude) and (Stay Out,Fight). In the latter Nash equilibrium one can imagine the incumbent firm threatening the potential entrant with a price war in case that firm would dare to enter. This threat, however, is not credible; once the firm enters it is better for the incumbent firm to collude. Indeed, performing backward induction in the game tree in Figure 1.1(a) yields only the equilibrium (Enter,Collude). The equilibrium (Stay Out,Fight) does not survive backward induction or, in game theoretic parlance, is not subgame perfect. Subgame perfectness (Selten, 1965, 1975)
is one of the main refinements of the Nash equilibrium concept.
Entry deterrence with incomplete information
The story Consider the following variation on the foregoing entry deterrence model.
Suppose that with 50% probability the incumbent's payoff from Fight (F) is equal to
some amount x rather than the 0 above, that both firms know this, but that the true
payoff is only observed by the entrant. This situation might arise, for instance, if the
technology or cost structure of the entrant firm is private information but both firms
would make the same estimate about the associated probabilities.
A model A representation of this game is given in Figure 1.1(b). The uncertainty
is modeled by a chance move or move of Nature at the beginning of the game. The
potential entrant observes this move but the incumbent does not, as is represented by the dashed line joining the incumbent's two decision nodes in the figure.

[Figure 1.1: Entry deterrence games. (a) The entrant chooses Enter (E) or Stay Out (StO); Stay Out yields (0, 100), and after Enter the incumbent (I) chooses Collude (C), yielding (40, 50), or Fight (F), yielding (−10, 0). (b) A move of Nature (N) first decides, with probability 50% each, whether the incumbent's payoff after Fight is 0 or x; the entry game of panel (a) then follows, with Fight yielding (−10, 0) after the one chance move and (−10, x) after the other; the incumbent's two decision nodes are joined by a dashed line.]

So this is a game of imperfect information and even of incomplete
information; the latter means that there is a move of Nature the outcome of which
is not observed by at least one player who moves after the move of Nature. The
idea of modeling incomplete information in a game in this way is basically due to
Harsanyi (1967, 1968). In this game the entrant now has four strategies, because he has two choices at each of the two possible outcomes of Nature's move. The incumbent still has two strategies; since he does not observe Nature's move his choice cannot be made dependent on it. To determine the strategic form of the game one has to take
expected outcomes. This results in the following table.
                            Collude      Fight
    Enter,Enter             40, 50      −10, x/2
    Enter,Stay Out          20, 75      −5, 50
    Stay Out,Enter          20, 75      −5, (100 + x)/2
    Stay Out,Stay Out       0, 100       0, 100

The two moves in each strategy of the entrant refer to the upper and lower moves of
Nature in Figure 1.1(b), respectively.
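For instance (our worked sample entry, in LaTeX notation), the profile ((Enter,Stay Out), Collude) gives

u_E = \tfrac{1}{2} \cdot 40 + \tfrac{1}{2} \cdot 0 = 20,
\qquad
u_I = \tfrac{1}{2} \cdot 50 + \tfrac{1}{2} \cdot 100 = 75.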
Solutions    Again there are two Nash equilibria in pure strategies, namely ((Enter,Enter),Collude) (provided x ≤ 100) and ((Stay Out,Stay Out),Fight). In particular if x is large, say x > 50, it is no longer possible to exclude the latter equilibrium by simple backward induction. The concept of sequential or perfect Bayesian equilibrium¹ (Kreps and Wilson, 1982) handles situations as this as follows. Suppose that,
after observing Enter the incumbent has a belief p that the entrant is of the 0-type
and a belief 1 − p that he is of the x-type. Given these beliefs, Fight is at least as good as Collude if the expected payoff from Fight is at least as large as the expected payoff from Collude, i.e., if (1 − p)x ≥ 50 or p ≤ (x − 50)/x. We say that ((Stay Out,Stay
Out),Fight) is a sequential equilibrium under these beliefs of the incumbent.
Later literature focuses on refinements (additional conditions) on these beliefs, e.g.,
the Intuitive Criterion (Cho and Kreps, 1987).

1.2 Cooperative games

In a cooperative game the focus is on payoffs and coalitions rather than on strategies.
The prevailing analysis has an axiomatic flavor, in contrast to the equilibrium analysis
of noncooperative theory.
The examples discussed below are confined to so-called transferable utility games and
to bargaining games.

1.2.1 Transferable utility games

A game with transferable utility or TU-game is a pair (N, v), where N := {1, 2, . . . , n}
is the set of players, and v is a map assigning a real number v(S) to every coalition
S ⊆ N, with v(∅) = 0. The number v(S) is called the worth of coalition S.
The usual assumption is that the grand coalition forms, and the basic question is how
to distribute its worth v(N). One possible answer is the core, defined by

    C(N, v) = { x ∈ ℝ^N | Σ_{i∈S} x_i ≥ v(S) for every nonempty coalition S, Σ_{i∈N} x_i = v(N) }.

The core (Gillies, 1953) of a TU-game (N, v) is the set of all distributions of v(N )
such that no coalition has an incentive to split off. The core may be large, small, or
empty. It is a polytope (defined by a number of linear inequalities) and elements of
it can be determined by Linear Programming.
Other solution concepts are the Shapley value (Shapley, 1953) and the (pre)nucleolus
(Schmeidler, 1969). Both the Shapley value and the nucleolus are single-valued and
always exist. The nucleolus has the advantage of always being in the core, provided the
core is non-empty. The Shapley value measures the average contribution of a player
to all coalitions in the game, whereas the nucleolus minimizes the maximal dissatisfaction of coalitions. For both solutions axiomatic characterizations are available.²

¹ These concepts are not equivalent in general, but they are in this specific case.
² For completeness we give here also the formal definitions. Let (N, v) be a TU-game. For a vector x ∈ ℝ^N with Σ_{i∈N} x_i = v(N) and a coalition ∅ ≠ S ≠ N let e(S, x) = v(S) − Σ_{i∈S} x_i, and let e(x) be the (2^n − 2)-dimensional vector consisting of the numbers e(S, x) rearranged in (weakly) decreasing order. Then the nucleolus of the game (N, v) is the vector that lexicographically minimizes e(·) over the set of all efficient vectors, i.e., vectors x with Σ_{i∈N} x_i = v(N).
[Figure 1.2: Three cooperating communities. Communities 1, 2 and 3 and a power source; the possible transmission links and their costs are: power–1: 100, power–2: 140, 1–2: 50, 1–3: 30, 2–3: 20.]


Three cooperating communities
The story Communities 1, 2 and 3 want to be connected with a nearby power
source. The possible transmission links and their costs are shown in Figure 1.2. Each
community can hire any of the transmission links.
A model The cost game (N, c) associated with this situation is given by N =
{1, 2, 3} and the first two lines of the next table.
    S =      {1}   {2}   {3}   {1,2}   {1,3}   {2,3}   {1,2,3}
    c(S) =   100   140   130   150     130     150     150
    v(S) =    0     0     0     90     100     120     220

The numbers c(S) are obtained by calculating the cheapest routes connecting the
communities in the coalition S with the power source. The game (N, v) in the third
line of the table is the cost savings game corresponding to (N, c), determined by
    v(S) := Σ_{i∈S} c({i}) − c(S)    for each S ∈ 2^N.

The cost savings v(S) for coalition S is the difference in costs corresponding to the
situation where all members of S work alone and the situation where all members of
S work together.
The Shapley value assigns to player i ∈ N the amount

    Σ_{S : i∉S}  [ |S|! (|N| − |S| − 1)! / |N|! ] · [ v(S ∪ {i}) − v(S) ].


               Mon   Tue   Wed
    Adams       2     4     8
    Benson     10     5     2
    Cooper     10     6     4

Table 1.1: Preferences for a dentist appointment


Solutions    The core of this game is the convex hull of the payoff vectors (100, 120, 0), (0, 120, 100), (0, 90, 130), (90, 0, 130), and (100, 0, 120). The Shapley value is the payoff vector (65, 75, 80), which is contained in the core. The nucleolus is the payoff vector (56 2/3, 76 2/3, 86 2/3).
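The permutation-based formula for the Shapley value also translates directly into a few lines of code; here is a minimal sketch (our illustration, plain Python, not part of the original text) that reproduces (65, 75, 80) for the cost savings game above.

from itertools import permutations

def shapley(players, v):
    """Shapley value; v maps frozensets of players to their worth."""
    phi = {i: 0.0 for i in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for i in order:
            # Marginal contribution of i when joining its predecessors.
            phi[i] += v[coalition | {i}] - v[coalition]
            coalition = coalition | {i}
    return {i: phi[i] / len(orders) for i in players}

v = {frozenset(): 0, frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
     frozenset({1, 2}): 90, frozenset({1, 3}): 100, frozenset({2, 3}): 120,
     frozenset({1, 2, 3}): 220}
print(shapley([1, 2, 3], v))   # {1: 65.0, 2: 75.0, 3: 80.0}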
The glove game
The story Each player either owns a right hand or a left hand glove. One pair of
gloves is worth one guilder. The players can form coalitions and pairs of gloves.
A model    Let N = {1, 2, . . . , n} be divided into two disjoint subsets L and R. Members of L possess a left hand glove, members of R a right hand glove. The situation above can be described as a TU-game (N, v) where

    v(S) := min{ |L ∩ S|, |R ∩ S| }

for each S ∈ 2^N. (The number of elements in a finite set S is denoted by |S|.)

Solutions    Suppose that 0 < |R| < |L|. The core of this game consists of one payoff vector, namely the vector x ∈ ℝ^n with xi = 0 if i ∈ L and xi = 1 if i ∈ R. This is also the nucleolus. The Shapley value also gives something to the players in L; e.g., if n = 3, R = {1} and L = {2, 3}, then player 1 obtains 2/3 and players 2 and 3 each obtain 1/6.
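The core claims can be checked mechanically as well; a self-contained sketch (our illustration, plain Python) for the case n = 3, R = {1}, L = {2, 3}:

from itertools import combinations

v = {(): 0, (1,): 0, (2,): 0, (3,): 0,
     (1, 2): 1, (1, 3): 1, (2, 3): 0, (1, 2, 3): 1}

def in_core(x, players=(1, 2, 3)):
    """x maps players to payoffs; tests efficiency and coalitional stability."""
    if abs(sum(x.values()) - v[players]) > 1e-9:
        return False
    return all(sum(x[i] for i in S) >= v[S] - 1e-9
               for r in range(1, len(players))
               for S in combinations(players, r))

print(in_core({1: 1, 2: 0, 3: 0}))         # True: the unique core vector
print(in_core({1: 2/3, 2: 1/6, 3: 1/6}))   # False: the Shapley value lies outside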
A permutation game
The story (From Curiel, 1997, p. 54) Mr. Adams, Mrs. Benson, and Mr. Cooper
have appointments with the dentist on Monday, Tuesday, and Wednesday, respectively. This schedule does not necessarily match their preferences, due to different urgencies and other factors. These preferences (expressed in numbers) are given in Table 1.1.
A model This situation gives rise to a game in which the coalitions can gain by
reshuffling their appointments. For instance, Adams (player 1) and Benson (player
2) can change their appointments and obtain a total of 14 instead of 7. A complete
description of the resulting TU-game is given in Table 1.2.

    S        {1}   {2}   {3}   {1,2}   {1,3}   {2,3}   {1,2,3}
    v(S)      2     5     4     14      18      9       24

Table 1.2: A permutation game


Solutions    The core of this game is the convex hull of the vectors (15, 5, 4), (14, 6, 4), (8, 6, 10), and (9, 5, 10). The Shapley value is the vector (9 1/2, 6 1/2, 8), and the nucleolus is the vector (11 1/2, 5 1/2, 7).

1.2.2 Bargaining games

The general model in cooperative game theory is that of a nontransferable utility game,
where the possible payoffs for each coalition are described by a set. Such games derive,
for instance, from classical exchange economies. TU-games as discussed above are a
special case, where for each coalition S the set of possible payoffs is the halfspace {x ∈ ℝ^n | Σ_{i∈S} x_i ≤ v(S)}, for a game (N, v). Another special subclass are pure bargaining games, where intermediate coalitions (coalitions with more than one but less than all players) do not play a role.
The discussion of general cooperative games is beyond the scope of this introductory
text. An example of a bargaining game, however, is discussed next.
A division problem
The story Consider the following situation. Two players have to agree on the
division of one unit of a perfectly divisible good. If they reach an agreement, say
(α, β) where α, β ≥ 0 and α + β ≤ 1, then they split up the one unit according to
this agreement; otherwise, they both get nothing. The players have preferences for
the good, described by utility functions.
The model    To fix ideas, assume that player 1 has a utility function u1(α) = α and player 2 has a utility function u2(β) = √β. Thus, a distribution (α, 1 − α) of the good leads to a corresponding pair of utilities (u1(α), u2(1 − α)) = (α, √(1 − α)). By letting α range from 0 to 1 we obtain all utility pairs corresponding to all feasible distributions of the good, as in Figure 1.3. It is assumed that also distributions summing to less than the whole unit are possible. This yields the whole shaded region.
A solution    Nash (1950) proposed the following way to solve this bargaining problem: maximize the product of the players' utilities on the shaded area. Since this maximum will be reached on the boundary, the problem is equivalent to

    max_{0 ≤ α ≤ 1}  α √(1 − α).

The maximum is obtained for α = 2/3. So the solution of the bargaining problem in utilities equals (2/3, (1/3)√3).

[Figure 1.3: A division problem. The feasible utility pairs form the region under the graph of α ↦ √(1 − α) on [0, 1].]

This implies that player 1 obtains 2/3 of the one unit of the good, whereas player 2 obtains 1/3. As described here, this solution comes out of the
blue. Nash, however, provided an axiomatic foundation for this solution (which is
usually called the Nash bargaining solution).
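For completeness, the first-order condition behind α = 2/3 (our added step, in LaTeX notation):

\frac{d}{d\alpha}\left( \alpha \sqrt{1-\alpha} \right)
  = \sqrt{1-\alpha} - \frac{\alpha}{2\sqrt{1-\alpha}}
  = \frac{2(1-\alpha) - \alpha}{2\sqrt{1-\alpha}} = 0
  \quad\Longleftrightarrow\quad \alpha = \tfrac{2}{3}.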

1.3 Concluding remarks

This introduction to game theory has necessarily been very brief and incomplete. For
one thing, it suggests a rather strict separation between cooperative and noncooperative models. In the literature, however, there are many relations. Let us mention
just a few examples.
There are many noncooperative bargaining models. Moreover, these models result in
outcomes which often are also predicted by cooperative, axiomatic models. The best
known example of this is the Rubinstein alternating offers model (Rubinstein, 1982),
the outcome of which is closely related to the Nash bargaining solution outcome. A
recent extension of this is provided by Hart and Mas-Colell (1996).
Up to now, for most cooperative solution concepts there exist noncooperative implementations. This means that a parallel noncooperative game is developed, the (Nash
or other) equilibrium outcomes of which coincide with the payoffs resulting from the
application of the cooperative solution under consideration.
Other topics on which this introduction has remained silent are the building of reputation and cooperation through repeated play, bounded rationality, evolutionary
approaches, and the role of knowledge in games, to mention just a few.

1.4 References

References from the text


Cho, I.K., and D.M. Kreps (1987): Signalling Games and Stable Equilibria, Quarterly
Journal of Economics, 102, 179–221.
Cournot, A. (1838): Recherches sur les Principes Mathématiques de la Théorie des Richesses.
English translation (1897): Researches into the Mathematical Principles of the Theory of
Wealth. New York: Macmillan.
Curiel, I. (1997): Cooperative Game Theory and Applications: Cooperative Games Arising
from Combinatorial Optimization Problems. Dordrecht: Kluwer Academic Publishers.
van Damme, E. (1995): Stability and Perfection of Nash Equilibria. Berlin: Springer Verlag.
Gillies, D.B. (1953): Some Theorems on n-Person Games. Ph.D. Thesis, Princeton: Princeton University Press.
Harsanyi, J.C. (1967, 1968): Games with Incomplete Information played by Bayesian Players, I, II, and III, Management Science, 14, 159–182, 320–334, 486–502.
Hart, S., and A. Mas-Colell (1996): Bargaining and Value, Econometrica, 64, 357–380.
Kreps, D., and R. Wilson (1982): Sequential Equilibria, Econometrica, 50, 863–894.
Nash, J.F. (1950): The Bargaining Problem, Econometrica, 18, 155–162.
Nash, J.F. (1951): Non-Cooperative Games, Annals of Mathematics, 54, 286–295.
von Neumann, J. (1928): Zur Theorie der Gesellschaftsspiele, Mathematische Annalen, 100, 295–320.
von Neumann, J., and O. Morgenstern (1944, 1947): Theory of Games and Economic Behavior. Princeton: Princeton University Press.
Rasmusen, E. (1989): Games and Information, An Introduction to Game Theory. Oxford:
Basil Blackwell.
Rubinstein, A. (1982): Perfect Equilibrium in a Bargaining Model, Econometrica, 50, 97–109.
Schmeidler, D. (1969): The Nucleolus of a Characteristic Function Game, SIAM Journal on Applied Mathematics, 17, 1163–1170.
Selten, R. (1965): Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit, Zeitschrift für die gesamte Staatswissenschaft, 121, 301–324.
Selten, R. (1975): Reexamination of the Perfectness Concept for Equilibrium Points in Extensive Games, International Journal of Game Theory, 4, 25–55.
Shapley, L.S. (1953): A Value for n-Person Games, Annals of Mathematics Studies, 28.

Further reading
Aumann, R.J., and S. Hart, eds. (1992, 1994, 2002): Handbook of Game Theory with Economic Applications, Vols. 1,2, and 3. Amsterdam: North-Holland.
Bierman, H.S., and L. Fernandez (1993): Game Theory with Economic Applications. Reading, Mass.: Addison-Wesley.
Binmore, K. (1992): Fun and Games, A Text on Game Theory. Lexington, Mass.: D.C. Heath and Company.


Brams, S.J. (1994): Theory of Moves. Cambridge, UK: Cambridge University Press.
Dixit, A., and B. Nalebuff (1991): Thinking Strategically. New York: Norton.
Friedman, J.W. (1986): Game Theory with Applications to Economics. Oxford: Oxford
University Press.
Fudenberg, D., and J. Tirole (1991): Game Theory. Cambridge, Mass.: The MIT Press.
Gardner, R. (1995): Games for Business and Economics. New York: Wiley.
Hargreaves Heap, S.P., and Y. Varoufakis (1995): Game Theory: A Critical Introduction.
London: Routledge.
Kreps, D.M. (1990): A Course in Microeconomic Theory. Princeton: Princeton University
Press.
Luce, R.D., and H. Raiffa (1957): Games and Decisions: Introduction and Critical Survey.
New York: Wiley.
Myerson, R.B. (1991): Game Theory: Analysis of Conflict. Cambridge, Mass.: Harvard
University Press.
Osborne, M.J., and A. Rubinstein (1994): A Course in Game Theory. Cambridge, Mass.:
The MIT Press.
Osborne, M.J. (2004): An Introduction to Game Theory. New York: Oxford University
Press.
Owen, G. (1995): Game Theory. New York: Academic Press.
Peters, H., and K. Vrieze (1992): A Course in Game Theory. Aachen: Verlag der Augustinus
Buchhandlung.
Shubik, M. (1982): Game Theory in the Social Sciences, Concepts and Solutions. Cambridge, Mass.: The MIT Press.
Shubik, M. (1984): A Game-Theoretic Approach to Political Economy. Cambridge, Mass.:
The MIT Press.
Sutton, J. (1986): Non-Cooperative Bargaining Theory: An Introduction, Review of Economic Studies, 53, 709–724.
Thomas, L.C. (1986): Games, Theory and Applications. Chichester: Ellis Horwood Limited.
Weibull, J.W. (1995): Evolutionary Game Theory. Cambridge, Mass.: The MIT Press.
Young, H.P. (2004): Strategic Learning and its Limits. Oxford, UK: Oxford University
Press.


Part I

Noncooperative Game Theory


Chapter 2

Nash Equilibrium
Game theory is a mathematical theory dealing with models of conflict and cooperation. In
this chapter we consider situations where several parties (called players) are involved in
a conflict (called a game). We suppose that the players simultaneously choose an action and
that the combination of actions chosen by the players determines a payoff for each player.
Nash equilibrium is a basic concept in the theory developed for such games.

2.1 Strategic Games

A strategic game is a model of interactive decision-making in which each decision maker


chooses his plan of action once and for all, and these choices are made simultaneously. The
model consists of a finite set N of players and, for each player i, a set Ai of actions and a
payoff function ui defined on the set A = A1 × A2 × · · · × An of all possible combinations of
actions of the players. Formally,

strategic game

DEFINITION

An n-person game in strategic form G consists of

• a finite set N = {1, 2, . . . , n} of players
• for each player i ∈ N a nonempty set Ai of actions available to this player
• for each player i ∈ N a (payoff) function ui defined on the set A = ∏_{j∈N} Aj; this function assigns to each n-tuple a = (a1, a2, . . . , an) of actions the real number ui(a).

We denote this game by G = ⟨N, (Aj)_{j∈N}, (uj)_{j∈N}⟩ or more shortly by G = ⟨A, u⟩.

A play of the game G proceeds as follows: each player chooses independently of his opponents
one of his possible actions; if ai denotes the action chosen by player i, then this player obtains
a payoff ui (a), where a = (a1 , a2 , . . . , an ) A is a profile of actions. Under this interpretation
each player is unaware, when choosing his action, of the choices being made by his opponents.
The fact that the payoff to a player in general depends on the actions of his opponents,
distinguishes a strategic game from a decision problem: each player may care not only about
his own action but also about the actions taken by the other players.


EXAMPLE 1    Cournot Model of Duopoly
Suppose there are two producers of mineral water. If producer i brings an amount of qi units on the market, then his costs are ci(qi) ≥ 0 units. The price of mineral water depends on the total amount q1 + q2 brought on the market and is denoted by p(q1 + q2). Following Cournot (1838) we suppose that the firms choose their quantities simultaneously.
This duopoly situation can be modeled as the two-person game in strategic form ⟨A1, A2, u1, u2⟩, where for each i

    Ai = [0, ∞)    and    ui(q1, q2) = qi p(q1 + q2) − ci(qi).

EXERCISE 1

A painting is auctioned among two bidders. The worth of the painting for
bidder i equals wi . Now each bidder, independently of the other, makes a
bid. The highest bidder obtains the painting for the amount mentioned by
the other bidder. If the two bids are the same the problem is solved with
the help of a lottery.
Describe this situation as a two-person game in strategic form.

Although in our description of a game the players choose their actions simultaneously, it is also possible to model conflict situations where players act one by one. As
the following example shows, this is just a matter of properly defining the actions of the
players.
EXAMPLE 2    A tree game
Suppose that two players are dealing with the following situation.


[Game tree: player 1 first chooses L or R; player 2 then chooses l1 or r1 (after L) or l2 or r2 (after R). Payoffs (player 1, player 2) at the leaves: (4, 1) after (L, l1), (3, 2) after (L, r1), (4, 6) after (R, l2), (5, 3) after (R, r2).]

A play proceeds as follows. First player 1 decides to go to the left (L) or to go to the right
(R). Then player 2 decides to go left or right. If both players have chosen to go left, then
player 1 obtains 4 dollars and player 2 gets 1 dollar, etc. In this situation the behavior of
player 2 clearly depends on the choice of player 1: if player 1 chooses L, then player 2 will
prefer r1 ; if player 1 chooses R, then player 2 prefers l2 . So if we want to analyze this
situation we should define the actions of player 2 in such a way that this dependence is
incorporated in one way or the other. This can be done by representing the strategy of
player 2 described before as the pair (r1 , l2 ).
This situation can be modeled as a two-person game in strategic form ⟨A1, A2, u1, u2⟩, where
A1 = {L, R}

and

A2 = {(l1 , l2 ), (l1 , r2 ), (r1 , l2 ), (r1 , r2 )}.

The payoffs are given in the following tableau:

         (l1, l2)   (l1, r2)   (r1, l2)   (r1, r2)
    L     (4, 1)     (4, 1)     (3, 2)     (3, 2)
    R     (4, 6)     (5, 3)     (4, 6)     (5, 3)
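The passage from tree to tableau is mechanical; the following sketch (our illustration, plain Python; the names leaf, A1 and A2 are ours) generates the table row by row from the leaf payoffs of the tree.

from itertools import product

# Leaf payoffs: (choice of player 1, move of player 2 at the reached node).
leaf = {('L', 'l1'): (4, 1), ('L', 'r1'): (3, 2),
        ('R', 'l2'): (4, 6), ('R', 'r2'): (5, 3)}

A1 = ['L', 'R']
A2 = list(product(['l1', 'r1'], ['l2', 'r2']))   # plans (after L, after R)

for a1 in A1:
    row = [leaf[(a1, plan[0] if a1 == 'L' else plan[1])] for plan in A2]
    print(a1, row)
# L [(4, 1), (4, 1), (3, 2), (3, 2)]
# R [(4, 6), (5, 3), (4, 6), (5, 3)]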

EXERCISE 2    Four objects O1, O2, O3 and O4 have a different worth for two players 1 and 2:

                              O1   O2   O3   O4
    worth for player 1         1    2    3    4
    worth for player 2         2    3    4    1

Player 1 starts with choosing an object. After him player 2 chooses an object, followed by player 1 who takes his second object. Finally, player 2 gets the object that is left.
Show that this situation can be formulated as a two-person game by describing the actions available to the players.

2.2 Finite Two-Person Games

For the two-person game described in Example 2 both action sets are finite sets. Therefore
we call this game a finite game.

finite game

DEFINITION

A game ⟨A, u⟩ is called a finite game if Ai is a finite set for all i.

Obviously the games described in Example 1 and Exercise 1 are not finite, while the games
introduced in Example 2 and Exercise 2 are finite.
EXAMPLE 3    Advertising
The two producers of mineral water we met in Example 1 are not allowed to cooperate. They both control one half of the market. At this moment they earn 8 units of money per month. Both producers must decide at a certain moment (independently of each other) whether or not they want to start an advertising campaign. The price of the campaign is 2 units of money. We suppose that the market share for both producers does not change if they take the same decision. In the other case the producer who has decided to start a campaign controls 75% of the market in the next month.
We can write down the consequences of the decisions of the producers in the following way:

                   don't start     start
    don't start      (8, 8)       (4, 10)
    start           (10, 4)       (6, 6)

Of course we can index the two alternatives available to the producers:

    1:  do not start an advertising campaign
    2:  start an advertising campaign

This situation can be modeled as the finite two-person game ⟨A1, A2, u1, u2⟩, where

    A1 = A2 = {1, 2}

and for i, j ∈ {1, 2}

    u1(i, j) = aij    and    u2(i, j) = bij.

Here aij and bij are the elements on position (i, j) of the matrices

    A = ( 8   4 )        B = ( 8  10 )
        ( 10  6 )            ( 4   6 )

Since the game in this example is completely determined by the two 2 × 2-matrices A and B, we call this game a 2 × 2-bimatrix game and denote it by

    (A, B) = ( (8, 8)    (4, 10) )
             ( (10, 4)   (6, 6)  )

payoff matrices

The matrices A and B are called the payoff matrices of the players.



bimatrix game

Note that any finite two-person game ⟨A1, A2, u1, u2⟩ with |A1| = m and |A2| = n can be seen as an m × n-bimatrix game. In fact in this game player 1 chooses a row and player 2 chooses a column. The general form of an m × n-bimatrix game is

    ( (a11, b11)   (a12, b12)   · · ·   (a1n, b1n) )
    ( (a21, b21)   (a22, b22)   · · ·   (a2n, b2n) )
    (     ...          ...       ...        ...    )
    ( (am1, bm1)   (am2, bm2)   · · ·   (amn, bmn) )

EXERCISE 3    Explain that the situation described in Exercise 2 can be modeled as a bimatrix game. What is the order of the game?

EXERCISE 4    Each of two firms has one job opening. Suppose that (for reasons not discussed here but relating to the value of filling each opening) the firms offer different wages: firm i ∈ {1, 2} offers the wage wi, where (1/2)w1 < w2 < 2w1. Imagine that there are two workers, each of whom can apply to only one firm. The two workers simultaneously decide whether to apply to firm 1 or to firm 2. If only one worker applies to a given firm, that worker gets the job; if both workers apply to one firm, the firm hires one worker at random and the other worker is unemployed (which has a payoff of zero).
Show that this situation can be formulated as a bimatrix game.

2.3 Nash Equilibrium

Suppose that you have to advise which action each player of a game should choose. Then of
course each player must be willing to choose the action advised by you. Thus the advised
action of a player must be a best response to the advised actions of the other players. In
that case no player can gain by unilateral deviation from the advised action. We will call
such an advice a Nash equilibrium. Formally:

Nash equilibrium

DEFINITION

A Nash equilibrium of an n-person strategic game G = ⟨A, u⟩ is a profile a* ∈ A of actions with the property that for every player i

    ui(a1*, . . . , a(i−1)*, ai*, a(i+1)*, . . . , an*) ≥ ui(a1*, . . . , a(i−1)*, ai, a(i+1)*, . . . , an*)

for all ai ∈ Ai.

So if a* is a Nash equilibrium no player i has an action yielding a higher payoff to him than ai* does, given that each opponent chooses his equilibrium action. Indeed no player can profitably deviate, given the actions of the opponents.

It is useful to describe the concept of an equilibrium in terms of best responses. In order to do so we denote for a game G = ⟨A, u⟩ the set of actions of the opponents of player i by A−i. So an element of this set is of the form

    a−i = (a1, . . . , a(i−1), a(i+1), . . . , an).

For an a−i ∈ A−i,

best response

    Bi(a−i) = { ai ∈ Ai | ui(ai, a−i) ≥ ui(a′i, a−i) for all a′i ∈ Ai }

is the set of best responses of player i against a−i. Here (a′i, a−i) is the profile obtained from a by replacing ai by a′i.
Obviously, a profile a* of actions is a Nash equilibrium if and only if

    ai* ∈ Bi(a−i*)    for all i ∈ N.

Note that (i*, j*) is a Nash equilibrium of an m × n-bimatrix game (A, B) if

    a(i*)(j*) ≥ a(i)(j*)    for all i

and

    b(i*)(j*) ≥ b(i*)(j)    for all j.

So a(i*)(j*) is the biggest number in column j* of the matrix A, while b(i*)(j*) is the biggest number in row i* of the matrix B.
EXAMPLE 4    The Prisoners' Dilemma
Two suspects in a crime are arrested and are put into separate cells. The police lack sufficient evidence to convict the suspects, unless at least one confesses. If neither confesses, the suspects will both be convicted of a minor offense and sentenced to one month in jail. If both confess then both will be sentenced to jail for six months. If only one of them confesses, he will be freed and used as a witness against the other, who will receive a sentence of nine months: six for the crime and three for obstructing justice. We have summarized the data in the following tableau:

                     don't confess     confess
    don't confess      (−1, −1)        (−9, 0)
    confess            (0, −9)         (−6, −6)

This situation can be modeled as the 2 × 2-bimatrix game

    ( (−1, −1)   (−9, 0)  )
    ( (0, −9)    (−6, −6) )
where the first row/column corresponds with the decision not to confess and the second one to the decision to confess.
It is clear that whatever one player does, the other prefers confess to don't confess. Hence (2, 2) is the unique Nash equilibrium of the game.
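The two displayed best-reply conditions are easy to check by machine; a short sketch (our illustration, plain Python, not part of the original text) that lists all pure Nash equilibria of a bimatrix game and confirms the result just obtained:

def pure_nash(A, B):
    """Pure Nash equilibria of the bimatrix game (A, B), 1-based indices."""
    m, n = len(A), len(A[0])
    eqs = []
    for i in range(m):
        for j in range(n):
            best_row = A[i][j] == max(A[k][j] for k in range(m))
            best_col = B[i][j] == max(B[i][l] for l in range(n))
            if best_row and best_col:
                eqs.append((i + 1, j + 1))
    return eqs

A = [[-1, -9], [0, -6]]    # payoff matrix of player 1
B = [[-1, 0], [-9, -6]]    # payoff matrix of player 2
print(pure_nash(A, B))     # [(2, 2)]

On the Battle of the Sexes below it returns [(1, 1), (2, 2)], and on Matching Pennies the empty list.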

EXAMPLE 5    Battle of the Sexes
On an evening a man and a woman wish to go out together. The man prefers to attend the opera but the woman likes to go to a soccer match. Both would rather spend the evening together than apart. We suppose that the preferences of the man and woman can be described with the help of the payoffs represented in the following tableau:

              opera     match
    opera    (2, 1)    (0, 0)
    match    (0, 0)    (1, 2)

This situation can be modeled as the 2 × 2-bimatrix game

    ( (2, 1)   (0, 0) )
    ( (0, 0)   (1, 2) )

In this game the man is player 1 and the woman is player 2. Furthermore, the first
row/column corresponds with the decision to attend the opera, while the other row/column
corresponds with the decision to visit a soccer match. This game has two Nash equilibria:
(1, 1) and (2, 2).

EXAMPLE 6    Matching Pennies
Each of two people chooses either Head or Tail. If the choices differ, person 1 pays person 2 one dollar; if they are the same, person 2 pays person 1 one dollar. This situation where the interests of the players are diametrically opposed can be modeled by the 2 × 2-bimatrix game

    ( (1, −1)   (−1, 1) )
    ( (−1, 1)   (1, −1) )

where the first row/column corresponds with the decision of choosing Head. This strictly competitive game has no Nash equilibrium.

EXERCISE 5

Determine a Nash equilibrium of the bimatrix game in Exercise 3 and Exercise 4, respectively.

EXERCISE 6

Determine the Nash equilibria of the 3 × 3-bimatrix game

    ( (2, 0)   (3, 4)   (1, 3) )
    ( (1, 1)   (4, 2)   (1, 2) )
    ( (2, 3)   (0, 2)   (3, 0) )

EXAMPLE 7    A hide-and-seek game
Player 2 hides in one of three rooms numbered 1, 2 and 3. Player 1 tries to guess the number of the room chosen by player 2. If his first guess is correct, he receives one dollar from player 2. If his second guess is correct he has to pay one dollar. Otherwise player 1 pays two dollars to player 2. We can summarize this game in one matrix:

                1     2     3
    (1 2 3)     1    −1    −2
    (1 3 2)     1    −2    −1
    (2 1 3)    −1     1    −2
    (2 3 1)    −2     1    −1
    (3 1 2)    −1    −2     1
    (3 2 1)    −2    −1     1

This matrix contains the payoffs to player 1, and the triple (1 2 3) for example denotes the action of player 1 where room 1 is his first guess, room 2 his second one (if necessary) and room 3 his third one (if necessary). Also this game doesn't have any Nash equilibrium.

EXAMPLE 8    Cournot Model of Duopoly
We consider a simplified version of the Cournot Model discussed in Example 1. To be precise: we will suppose that

    p(q1 + q2) = { a − q1 − q2    if q1 + q2 < a
                 { 0              otherwise,

for some positive real number a. Furthermore, we assume that ci(qi) = cqi for i = 1, 2; that is: there are no fixed costs and the marginal cost is constant at c, where 0 < c < a.
The consequence of these assumptions is that the payoff function of player 1 in the corresponding two-person game is given by

    u1(q1, q2) = q1 p(q1 + q2) − cq1

               = { q1[a − q1 − q2] − cq1     if q1 + q2 < a
                 { −cq1                      if q1 + q2 ≥ a

               = { q1[(a − c − q2) − q1]     if q1 + q2 < a
                 { −cq1                      if q1 + q2 ≥ a.

Using the fact that the (positive part of the) intersection of the graph of the payoff function of player 1 with the vertical plane with equation q2 = b, for some b ∈ (0, a − c), looks like a downward parabola in q1 (positive between q1 = 0 and q1 = a − c − b, with its top halfway in between), we find that

    B1(q2) = { 0                      if q2 ≥ a − c
             { (1/2)(a − c − q2)      if q2 < a − c

           = max{0, (1/2)(a − c − q2)}.

In a similar way one can show that

    B2(q1) = { 0                      if q1 ≥ a − c
             { (1/2)(a − c − q1)      if q1 < a − c

           = max{0, (1/2)(a − c − q1)}.

best-response set

In Figure 1 you find the two best-response sets

    B1 = {(q1, q2) | q1 ∈ B1(q2)}    and    B2 = {(q1, q2) | q2 ∈ B2(q1)}:

[FIGURE 1: The best-response sets. In the (q1, q2)-plane, B1 contains the segment q1 = (1/2)(a − c − q2) from (0, a − c) to ((1/2)(a − c), 0), and B2 contains the segment q2 = (1/2)(a − c − q1) from (0, (1/2)(a − c)) to (a − c, 0); they intersect exactly once.]

28

CHAPTER 2. NASH EQUILIBRIUM

Obviously, a Nash equilibrium of the game corresponds with an intersection of the two bestresponse sets. So in order to determine the unique equilibrium of the game we have to solve
the system of equations
(
q1 = 12 (a c q2 )
q2 = 12 (a c q1 ).

Solving this system yields the equilibrium (q1 , q2 ), where


q1 = q2 = 31 (a c).

EXERCISE 7

Consider the Cournot duopoly model where the price is given by the equation p(q1 + q2) = a − q1 − q2, but firms have asymmetric marginal costs: c1 for firm 1 and c2 for firm 2.
What is the Nash equilibrium if 0 < ci < a/2 for each firm i?
What if c1 < c2 < a but 2c2 > a + c1?

EXERCISE 8

Players 1 and 2 are bargaining over how to split one dollar. Both players
simultaneously name shares s1, s2 ∈ [0, 1] they would like to have. If s1 + s2 ≤ 1, then the players receive the shares they named, otherwise both
players receive nothing.
What are the Nash equilibria of this game?

EXERCISE 9    Consider the two-person game ⟨A1, A2, u1, u2⟩, where A1 = A2 = [0, 1) and where for (x, y) ∈ [0, 1) × [0, 1)

    u1(x, y) = x    and    u2(x, y) = y.

Prove that this game doesn't have any Nash equilibrium.


EXERCISE 10    Suppose there are n firms in the Cournot oligopoly model. Let qi denote the quantity produced by firm i. The price p depends on the aggregate quantity Q = q1 + · · · + qn on the market and is given by

    p(Q) = { a − Q    if Q < a
           { 0        otherwise.

Assume that the total cost of firm i from producing quantity qi is ci(qi) = cqi. That is, there are no fixed costs and the marginal cost is constant at c < a. Following Cournot, suppose that the firms choose their quantities simultaneously.
a) Model this situation as an n-person game.
Let (q1*, . . . , qn*) be a Nash equilibrium of this game.
b) Prove that qi* = max{0, (1/2)(a − c − Q−i*)}, where Q−i* = Σ_{j≠i} qj*.
Let S be the set of firms i such that qi* > 0.
c) Prove that qi* = qj* for all i, j ∈ S and use this fact to prove that S = {1, 2, . . . , n}.
d) Show that qi* = (a − c)/(n + 1).
EXERCISE 11    We consider two finite versions of the Cournot duopoly model.
First, suppose each firm must choose either half the monopoly quantity, qm/2 = (a − c)/4, or the Cournot equilibrium quantity qe = (a − c)/3. No other quantities are feasible.
Show that this two-action game is equivalent to the Prisoners' Dilemma: each firm has a strictly dominated strategy and both are worse off in equilibrium than they would be if they cooperated.
Second, suppose that each firm can choose either qm/2, or qe, or a third quantity q′.
Find a value for q′ such that the game is equivalent to the Cournot model in Example 8, in the sense that (qe, qe) is a unique Nash equilibrium and both firms are worse off in equilibrium than they would be if they cooperated, but neither firm has a strictly dominated strategy.

2.4 Applications

In this section, we consider a different model of how two duopolists might interact, based on
Bertrand's (1883) suggestion that firms actually choose prices rather than quantities as in Cournot's model.
EXAMPLE 9    Bertrand Model of Duopoly
We consider the case of differentiated products. If firms 1 and 2 choose prices p1 and p2, respectively, the quantity q1 that consumers demand from firm 1 satisfies

    q1 = a − p1 + b p2,

and the demand q2 from firm 2 satisfies

    q2 = a − p2 + b p1.
Here b > 0 reflects the extent to which the product of the one firm is a substitute for the
product of the other firm. (This is an unrealistic demand function because demand for the
product of the one firm is positive even when this firm charges an arbitrarily high price, provided that the other firm also charges a high enough price. As will become clear, the problem
makes sense only if b < 2.) We assume that there are no fixed costs of production and that
marginal costs are constant at c < a, and that the firms choose their prices simultaneously.
First we model the situation as a two-person game in strategic form ⟨A1, A2, u1, u2⟩. Obviously, Ai = [0, ∞) for each i. Next we take as payoff function for a firm just its profit. Hence

    u1(p1, p2) = q1(p1 − c) = (a − p1 + b p2)(p1 − c)

and

    u2(p1, p2) = q2(p2 − c) = (a − p2 + b p1)(p2 − c).

Since

    B1(p2) = (1/2)(a + b p2 + c)    and    B2(p1) = (1/2)(a + b p1 + c),

the price pair (p1*, p2*) is a Nash equilibrium if it solves the system

    p1 = (1/2)(a + b p2 + c)
    p2 = (1/2)(a + b p1 + c).

Solving this system yields

    p1* = p2* = (a + c)/(2 − b).
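The best-response formulas used above come from the first-order conditions; for firm 1 (our added intermediate step, in LaTeX notation):

\frac{\partial u_1}{\partial p_1}
  = (a - p_1 + b p_2) - (p_1 - c)
  = a + b p_2 + c - 2 p_1 = 0
  \quad\Longrightarrow\quad
  B_1(p_2) = \tfrac{1}{2}(a + b p_2 + c).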

EXERCISE 12 We analyze the Bertrand duopoly model with homogeneous products.

Suppose that the quantity q1 that consumers demand from firm 1 satisfies

    q1 = { a − p1          if p1 < p2
         { 0               if p1 > p2
         { (a − p1)/2      if p1 = p2,

while the quantity that consumers demand from firm 2 satisfies a similar equation. Suppose also that there are no fixed costs and that marginal costs are constant at c < a.
Show that if the firms choose prices simultaneously, then the unique Nash equilibrium is that both firms charge the price c.

The Tragedy of the Commons


This example illustrates the phenomenon that if citizens respond only to private incentives,
public resources will be overutilized.
EXAMPLE 10  Consider n farmers in a village. Each summer, all the farmers graze their goats on the village green. Denote the number of goats the i-th farmer owns by gi. The cost of buying and caring for a goat is c, independent of how many goats a farmer owns. The value to a farmer of grazing a goat on the green when a total of G = g1 + · · · + gn goats are grazing is v(G) per goat. Since a goat needs at least a certain amount of grass in order to survive, there is a maximum number Gmax of goats that can be grazed on the green. Obviously,

v(G) > 0 for G < Gmax  and  v(G) = 0 for G ≥ Gmax.

Also, since the first few goats have plenty of room to graze, adding one more does little harm to those already grazing, but when so many goats are grazing that they are all just barely surviving (i.e., G is just below Gmax), then adding one more dramatically harms the rest. Formally: for G < Gmax, v′(G) < 0 and v″(G) < 0. So v is a decreasing and concave function of G, as in the following picture.

[Figure: the graph of v, positive, decreasing and concave, reaching 0 at Gmax.]
During the spring, the farmers simultaneously choose how many goats to own. Assuming goats are continuously divisible, the foregoing can be modeled as an n-person game where

Ai = [0, Gmax)

and

ui(g1, . . . , gn) = gi v(g1 + · · · + gn) − cgi.

Thus, if the profile (g1*, . . . , gn*) is a Nash equilibrium of this game, then gi* must maximize the function

f : gi ↦ gi v(gi + G*−i) − cgi,

where G*−i denotes the total number of goats owned by the other farmers.

So gi* solves the equation

f′(gi*) = 0, that is, v(gi* + G*−i) + gi* v′(gi* + G*−i) − c = 0.

Apparently,

v(gi* + G*−i) + gi* v′(gi* + G*−i) − c = 0 for all i.

Hence, if we write G* instead of g1* + · · · + gn*, summing these n first-order conditions and dividing by n yields

Σi [ v(G*) + gi* v′(G*) − c ] = 0 ⟺ n v(G*) + G* v′(G*) − nc = 0 ⟺ v(G*) + (1/n) G* v′(G*) − c = 0.

In contrast, the social optimum, denoted by G**, is a solution of the problem

max_{G≥0} Gv(G) − Gc.

So G** solves the equation

v(G**) + G** v′(G**) − c = 0.

We ask you to show in an exercise that G* > G**. This means that, compared to the social optimum, too many goats are grazed in the situation corresponding to the Nash equilibrium.
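The following Python sketch illustrates this conclusion numerically for one assumed specification: v(G) = 100 − G − G² (decreasing and concave on the relevant range), c = 4 and n = 5. These numbers are ours and serve only as an example.

    def bisect(f, lo, hi, tol=1e-10):
        """Root of f on [lo, hi], assuming f(lo) > 0 > f(hi)."""
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
        return 0.5 * (lo + hi)

    n, c = 5, 4.0
    v  = lambda G: 100 - G - G**2        # assumed value function
    dv = lambda G: -1 - 2 * G            # its derivative

    # Nash equilibrium total: v(G*) + (G*/n) v'(G*) - c = 0
    G_star = bisect(lambda G: v(G) + (G / n) * dv(G) - c, 0, 9)
    # Social optimum: v(G**) + G** v'(G**) - c = 0
    G_opt  = bisect(lambda G: v(G) + G * dv(G) - c, 0, 9)

    print(round(G_star, 3), round(G_opt, 3), G_star > G_opt)   # 7.863 5.333 True

The equilibrium total G* indeed exceeds the socially optimal total G**.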

32

CHAPTER 2. NASH EQUILIBRIUM

EXERCISE 13  Show that G* > G**, where G* and G** are the quantities introduced in the foregoing example.


EXERCISE 14  Consider a population of voters uniformly distributed along the ideological spectrum represented by the interval [0, 1]. Each of the candidates for a single office simultaneously chooses a campaign platform (i.e., a point of the interval [0, 1]).

The voters observe the candidates' choices, and then each voter votes for the candidate whose platform is closest to the voter's position on the spectrum. (If there are two candidates and they choose platforms x1 = 0.3 and x2 = 0.6, for example, then all voters to the left of x = 0.45 vote for candidate 1 and all those to the right vote for candidate 2. Hence candidate 2 wins the election with 55 percent of the vote.)

Assume that
- any candidates who choose the same platform equally split the votes cast for that platform,
- ties among the leading vote-getters are resolved by coin flips.

Suppose that the candidates care only about being elected; they do not really care about their platforms at all!

a) Prove that there is only one Nash equilibrium if there are two candidates.
b) If there are three candidates, exhibit a Nash equilibrium.
EXERCISE 15  Three players announce a number in the set {1, . . . , K}. A prize of one dollar is split equally between the players whose number is closest to 2/3 of the average number.

Find the unique equilibrium of this game.

Chapter 3

The Mixed Extension


Some of the finite games we discussed in Chapter 2 did not have a Nash equilibrium. In this chapter we first introduce for finite games a new class of actions by allowing the players to randomize over their actions. This leads to a class of non-finite games which turn out to have at least one Nash equilibrium.

3.1  Mixed Strategies

In Chapter 2 we have observed that the hide-and-seek game, the 6 × 3 matrix game A whose six rows correspond to the search orders (1 2 3), (1 3 2), (2 1 3), (2 3 1), (3 1 2) and (3 2 1) of player 1 and whose three columns correspond to the rooms in which player 2 can hide, possesses no value: the lower value of that game (= −2) is not equal to its upper value (= 1). In order to fill the gap between the upper and lower value we will introduce so-called mixed strategies.

Since player 2 does not prefer any particular room, he may decide to choose each room with probability 1/3. We denote this plan or strategy by the vector (1/3, 1/3, 1/3). Such a strategy is called a mixed strategy for player 2.
More generally, a mixed strategy of player 2 is a vector

q = (q1, q2, q3),

where qj ≥ 0 represents the probability that player 2 chooses the j-th room and Σ_{j=1}^3 qj = 1. The set of all mixed strategies of player 2 is denoted as

Δ3 = {q ∈ IR³ | qj ≥ 0 and Σ_{j=1}^3 qj = 1}.

Note that the strategy e2 = (0, 1, 0) corresponds to the situation where player 2 chooses the second room (with probability one).


Analogously, the strategy space of player 1 can be extended to the set

Δ6 := {p ∈ IR⁶ | pi ≥ 0 and Σ_{i=1}^6 pi = 1}.

What is the effect of this extension on the payoffs to the players?


If player 1 chooses the strategy p = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6) and player 2 chooses the first room (which corresponds to the strategy q = e1), the expected payoff to player 1 is

(1/6)a11 + (1/6)a21 + (1/6)a31 + (1/6)a41 + (1/6)a51 + (1/6)a61 = 2/3,

where ai1 denotes the entry of A in row i of the first column.

In general, a pair (p, q) ∈ Δ6 × Δ3 of strategies generates a(n expected) payoff to player 1 equal to

p1q1 a11 + · · · + pi qj aij + · · · + p6 q3 a63 = pAq,

where pi qj is the probability of entry (i, j). In this case player 2 receives the amount −pAq.

summarized

By introducing mixed strategies the new (zero-sum strategic) game

⟨Δ6, Δ3, U⟩

arises, where for a pair (p, q) ∈ Δ6 × Δ3

U(p, q) := pAq.

This game is called the mixed extension of the matrix game A.
We will now show that player 1, by using a specific mixed strategy, can guarantee himself the amount of 2/3, while player 2 can find a mixed strategy such that he does not lose more than 2/3. This suggests that the mixed extension of the hide-and-seek game indeed has a value (and that this value is equal to 2/3).
If player 1 uses the mixed strategy p̂ = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6), then his (expected) payoff against an arbitrary strategy q of player 2 is equal to

p̂Aq = (2/3, 2/3, 2/3) · (q1, q2, q3) = (2/3)(q1 + q2 + q3) = 2/3.

Similarly, if player 2 uses the mixed strategy q̂ = (1/3, 1/3, 1/3), then his (expected) payoff against an arbitrary strategy p of player 1 is equal to −2/3.
In Chapter 4 we will continue the investigation of the class of (mixed extensions of) matrix games. In the next section we are going to introduce mixed extensions for arbitrary finite games.

3.2  The Mixed Extension of a Finite Game

After the introductory example in Section 3.1 we now come to the formal definition of mixed
strategies and the mixed extension of a finite two-person strategic game. We will also discuss
the existence of Nash equilibrium in this framework.



the strategies

As we have seen in the foregoing chapter, a finite two-person game is in fact a bimatrix game, say (A, B), where A and B are two m × n matrices. If we allow the players to randomize over their action sets {1, 2, . . . , m} and {1, 2, . . . , n} respectively, then we are in fact dealing with a new game. In this new game the action set of player 1 is

Δm = {p ∈ IRm | pi ≥ 0 for all i and Σ_{i=1}^m pi = 1}

and the action set of player 2 is

Δn = {q ∈ IRn | qj ≥ 0 for all j and Σ_{j=1}^n qj = 1}.

The elements of m and n are called mixed strategies and we suppose that the players
choose their mixed strategies independently and simultaneously.
FIGURE 1  The sets Δ2 and Δ3.

Note that pi refers to the probability that player 1 chooses the i-th row and that qj represents
the probability that player 2 chooses the j-th column.

pure strategy

DEFINITION  A strategy where a player chooses one of his actions, say the k-th one, with probability one, is also called a pure strategy, and it is denoted by ek.

the payoffs

If the players use the mixed strategies p and q, then the probability that the (i, j)-th cell of the bimatrix (A, B) will be chosen is equal to pi qj (here we use the fact that the players choose their strategies independently!). So the expected payoff to player 1 corresponding to the pair (p, q) of mixed strategies is

Σ_{i=1}^m Σ_{j=1}^n aij pi qj = pAq

and, similarly, the expected payoff to player 2 is equal to pBq. So formally,

mixed extension of a game

DEFINITION  Corresponding to an m × n-bimatrix game (A, B) we can consider the strategic game ⟨Δm, Δn, U1, U2⟩, where

U1(p, q) = pAq  and  U2(p, q) = pBq.

This game is called the mixed extension of the m × n-bimatrix game (A, B).
Because we are only interested in the mixed extension of a game, from now on we will speak
about a bimatrix game instead of the mixed extension of a bimatrix game.
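In computational terms the payoff functions are just the double sums above. Here is a minimal Python sketch, using a small hypothetical 2 × 2 game of our own choosing:

    A = [[2, 0],
         [1, 3]]   # hypothetical payoffs of player 1
    B = [[1, 4],
         [2, 0]]   # hypothetical payoffs of player 2

    def U(p, M, q):
        """Expected payoff: the double sum of M[i][j] * p_i * q_j, i.e. pMq."""
        return sum(p[i] * M[i][j] * q[j]
                   for i in range(len(p)) for j in range(len(q)))

    p, q = [0.5, 0.5], [0.25, 0.75]
    print(U(p, A, q), U(p, B, q))   # 1.5 1.875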
equilibria

Note that a pair (p*, q*) of (mixed) strategies is a Nash equilibrium of the bimatrix game (A, B) if and only if

p*Aq* ≥ pAq* for all p  and  p*Bq* ≥ p*Bq for all q.

The set of all Nash equilibria of the game (A, B) is denoted as E(A, B).


EXERCISE 1  Prove that ((1/2, 1/2), (1/2, 1/2)) is a Nash equilibrium of the 2 × 2-bimatrix game

(A, B) = [ (1, 1)  (0, 0)
           (0, 0)  (1, 1) ].

Does every mixed extension of a finite game in strategic form have a Nash equilibrium? In 1950 J.F. Nash showed that the answer to this question is affirmative. In fact he gave not one, but two proofs for the existence of Nash equilibrium, one based on Brouwer's fixed point theorem for continuous functions, and the other (simpler) one based on Kakutani's fixed point theorem for correspondences. We are not going to discuss these proofs, but we do state the result here.

THEOREM 1 (Equilibrium Point Theorem of Nash)  Every finite strategic game has a mixed strategy Nash equilibrium.

The question that now arises is: how can we compute these Nash equilibria? Both proofs
of Nash are non-constructive: they show that there is an equilibrium but not how to get it.


Nevertheless, for bimatrix games there is a method to actually compute a Nash equilibrium,
or even all Nash equilibria. In the remainder of this section we explain how this can be done.
The method is based on the following characterization of Nash equilibrium.
EXERCISE 2  Let (A, B) be an m × n-bimatrix game and let (p*, q*) ∈ Δm × Δn. Prove that

(p*, q*) ∈ E(A, B) ⟺ p*Aq* ≥ ei Aq* for all i and p*Bq* ≥ p*Bej for all j.

LEMMA 1  Let (A, B) be an m × n-bimatrix game and let q be a strategy of player 2. Then there is at least one pure strategy that is a best response against q.

PROOF  Let p be a best response against q. Then pAq ≥ ei Aq for all i. Therefore

max_k ek Aq ≤ pAq = ⟨p, Aq⟩ = p1 e1 Aq + · · · + pm em Aq ≤ p1 max_k ek Aq + · · · + pm max_k ek Aq = (p1 + · · · + pm) max_k ek Aq = max_k ek Aq.

So we may conclude that pAq = max_k ek Aq. Then however there must be at least one i such that pAq = ei Aq. Since p is a best response against q, the pure strategy ei is a best response against q too.

This lemma implies that a pure strategy that is a best response within the class of pure
strategies is also a best response within the class of all strategies. For this reason we
introduce the class of pure best responses.

pure best responses

DEFINITION  Let (A, B) be an m × n-bimatrix game and let p ∈ Δm and q ∈ Δn. Then

PB2(p) = {j | pBej = max_l pBel}

is the set of pure best responses of player 2 to p and

PB1(q) = {i | ei Aq = max_k ek Aq}

is the set of pure best responses of player 1 to q.
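These sets are easy to compute directly from the definition. The following Python sketch does so for a hypothetical 2 × 3 game; the matrices are illustrative only, and the indices are 0-based.

    A = [[3, 1, 0],
         [0, 2, 4]]      # hypothetical payoff matrix of player 1
    B = [[2, 0, 1],
         [1, 3, 0]]      # hypothetical payoff matrix of player 2

    def PB1(A, q):
        """Pure best responses of player 1: rows i maximizing e_i A q."""
        vals = [sum(a * qj for a, qj in zip(row, q)) for row in A]
        return {i for i, v in enumerate(vals) if v == max(vals)}

    def PB2(B, p):
        """Pure best responses of player 2: columns j maximizing p B e_j."""
        cols = list(zip(*B))
        vals = [sum(pi * b for pi, b in zip(p, col)) for col in cols]
        return {j for j, v in enumerate(vals) if v == max(vals)}

    print(PB1(A, [1/3, 1/3, 1/3]), PB2(B, [1/2, 1/2]))   # {1} {0, 1}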

In the following lemma we show that all best responses against a strategy can easily be determined if the pure best responses against that strategy are available. In fact a strategy is a best response if and only if all pure strategies chosen with positive probability are pure best responses. In order to make this precise we need the following definition.

carrier

DEFINITION  If p ∈ Δt is a mixed strategy, then

C(p) = {i | pi > 0}

is called the carrier of p.


LEMMA 2  Let (A, B) be an m × n-bimatrix game and let q be a strategy of player 2. Then a strategy p of player 1 is a best response against q if and only if

C(p) ⊆ PB1(q).
PROOF  (1) Suppose that p is a best response of player 1 against the strategy q of player 2. Take an i ∈ C(p), so pi > 0. Now assume that ei Aq < max_k ek Aq. Then

pAq = p1 e1 Aq + · · · + pm em Aq < p1 max_k ek Aq + · · · + pm max_k ek Aq = max_k ek Aq = pAq,

which is a contradiction. So ei Aq = max_k ek Aq, which means that i ∈ PB1(q).

(2) Suppose that C(p) ⊆ PB1(q). Then

pAq = Σ_i pi ei Aq = Σ_{i∈C(p)} pi ei Aq = Σ_{i∈C(p)} pi max_k ek Aq = max_k ek Aq.

So pAq ≥ ei Aq for all i, that is, p is a best response against q.

Since a pair of strategies, one for each player, is a Nash equilibrium if each strategy is a best response against the other one, Lemma 2 leads to the following result.

THEOREM 2  Let (A, B) be a bimatrix game. Then a strategy pair (p, q) is a Nash equilibrium if and only if

C(p) ⊆ PB1(q)  and  C(q) ⊆ PB2(p).

In the following example we will show how this result can be used.
EXAMPLE 1  Consider a 4 × 4-bimatrix game (A, B) together with the strategies p = (0, 1/3, 1/3, 1/3) and q = (1/2, 1/2, 0, 0), and suppose that

Aq = (1/2, 1, 1, 1)  and  pB = (1, 1, 1, 1).

Then PB1(q) = {2, 3, 4} and PB2(p) = {1, 2, 3, 4}. Since C(p) = {2, 3, 4} and C(q) = {1, 2}, this implies that C(p) ⊆ PB1(q) and C(q) ⊆ PB2(p). So (p, q) ∈ E(A, B).
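The test of Theorem 2 is easy to automate. Below is a Python sketch of such a check, applied to the coordination game of Exercise 1; the small tolerance eps guards against rounding errors and is our own implementation choice.

    def carrier(x, eps=1e-12):
        return {i for i, xi in enumerate(x) if xi > eps}

    def pure_best_rows(A, q, eps=1e-12):
        """Rows attaining the maximum of e_i A q (0-based indices)."""
        vals = [sum(aij * qj for aij, qj in zip(row, q)) for row in A]
        return {i for i, v in enumerate(vals) if v >= max(vals) - eps}

    def is_nash(p, q, A, B, eps=1e-12):
        """Theorem 2: C(p) must lie in PB1(q) and C(q) in PB2(p)."""
        Bt = [list(col) for col in zip(*B)]      # transpose: columns as rows
        return (carrier(p) <= pure_best_rows(A, q, eps) and
                carrier(q) <= pure_best_rows(Bt, p, eps))

    A = [[1, 0], [0, 1]]
    B = [[1, 0], [0, 1]]
    print(is_nash([0.5, 0.5], [0.5, 0.5], A, B))   # True
    print(is_nash([1.0, 0.0], [0.0, 1.0], A, B))   # False: C(p) = {0}, PB1(q) = {1}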

EXERCISE 3  Let (A, B) be an m × n-bimatrix game. Suppose that there exists a y ∈ Δn such that yn = 0 and By > Ben (i.e., there exists a mixture of the first n − 1 columns of B that is strictly better than playing the last column of B).

a) Prove that qn = 0 for each (p, q) ∈ E(A, B).

Let (A′, B′) be the bimatrix game obtained from (A, B) by deleting the last column.

b) Prove that (p, q′) ∈ E(A′, B′) ⟺ (p, q) ∈ E(A, B), where q′ is the strategy obtained from q by skipping the last element (which is equal to zero by part a)).

EXERCISE 4  Consider the 3 × 3-bimatrix game

(A, B) = [ (0, 4)  (4, 0)  (5, 3)
           (4, 0)  (0, 4)  (5, 3)
           (3, 5)  (3, 5)  (6, 6) ].

Let (p, q) ∈ E(A, B).

a) Prove that it is impossible that {1, 2} ⊆ C(p).
b) Prove that it is impossible that C(p) = {2, 3}.
c) Find all the equilibria of this game.

3.3  Solving Bimatrix Games

In the previous section you already used several ad hoc techniques to find the set of Nash equilibria of a bimatrix game. We will now present a more unified method to solve bimatrix games. Actually this technique can be used for bimatrix games of any size. However, we restrict our analysis to 2 × 3-bimatrix games because then it is still possible to visualize what is going on.

The technique for solving bimatrix games is based on an analysis of what is called the best-response correspondence. A correspondence is, as you may know already, a generalization of the notion of a function: a function assigns exactly one point to an element in its domain, while a correspondence may assign an entire set of points to an element in its domain.

correspondence

DEFINITION  Let C be a non-empty set in IRt. A correspondence

φ : C ↠ C

is a mapping that assigns to each point x ∈ C a non-empty subset φ(x) of C. For such a correspondence the set

{(x, y) ∈ C × C | y ∈ φ(x)}

graph

is called its graph.

fixed point

We call an x* such that x* ∈ φ(x*) a fixed point of the correspondence φ.

EXAMPLE 2  Let φ be the mapping that assigns to a number x ∈ [0, 1] the set

φ(x) = {y ∈ [0, 1] | y ≥ x²}.

Then φ : [0, 1] ↠ [0, 1] is a correspondence and its graph can be found in the following figure.

[Figure: the graph of the correspondence φ.]

EXERCISE 5  What are the fixed points of the correspondence φ?

EXERCISE 6  Let the correspondence φ : IR ↠ IR be defined by

φ(x) = {y ∈ [0, 1] | y ≥ x³}  for all x ∈ IR.

Draw the graph of φ and determine its fixed points. Give an example of a correspondence that does not have any fixed points.
Switching back to the setting of bimatrix games, the best-response correspondence β : Δm × Δn ↠ Δm × Δn of a bimatrix game (A, B) is defined by β(p, q) = β1(q) × β2(p), where

β1(q) := {p̄ | p̄Aq = max_p pAq}

is the set of best responses of player 1 to the strategy q and

β2(p) := {q̄ | pBq̄ = max_q pBq}

is the set of best responses of player 2 to the strategy p.


Actually this is the correspondence that John Nash introduced when he gave his second
proof of the existence of Nash equilibrium. (By the way, the term Nash equilibrium was
not coined by Nash himself. He used the term equilibrium point.)
At least it is clear that (p , q ) is an equilibrium of the game (A, B) if and only if (p , q ) is
a fixed point of the correspondence . (The proof of Nash is based on the fact that is a
type of correspondence of which Kakutanis theorem asserts that it does indeed have a fixed
point.)
What we are going to do is to use the best-response correspondences β1 and β2 to determine the set of all equilibria of a bimatrix game where one of the players has two actions and the other one has at most three actions. For such bimatrix games it is possible to make a picture of the best-response sets

B1 = {(p, q) | p ∈ β1(q)}  and  B2 = {(p, q) | q ∈ β2(p)}.



Note that for a bimatrix game (A, B)
(p, q) E(A, B)

p 1 (q) and q 2 (p)

(p, q) B1 and (p, q) B2


(p, q) B1 B2 .

EXAMPLE 3  Consider the 2 × 2-bimatrix game

(A, B) = [ (0, 9)  (3, 5)
           (1, 3)  (2, 4) ].

In order to determine the best responses of player 1 we note that

Aq = (3 − 3q1, 2 − q1)  (using q2 = 1 − q1).

As

e1 Aq = e2 Aq ⟺ 3 − 3q1 = 2 − q1 ⟺ q1 = 1/2,

it follows that

β1(q) = {e1} if q1 < 1/2,  β1(q) = Δ2 if q1 = 1/2,  β1(q) = {e2} if q1 > 1/2.

Similarly,

pB = (3 + 6p1, 4 + p1)

and

pBe1 = pBe2 ⟺ 3 + 6p1 = 4 + p1 ⟺ p1 = 1/5

imply

β2(p) = {e2} if p1 < 1/5,  β2(p) = Δ2 if p1 = 1/5,  β2(p) = {e1} if p1 > 1/5.

In the following figure you find the (intersection of the) best-response sets:

[Figure: the best-response sets B1 and B2, intersecting at the point (p, q) = ((1/5, 4/5), (1/2, 1/2)).]

So E(A, B) = { ((1/5, 4/5), (1/2, 1/2)) }.
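A quick numerical cross-check of this example in Python: the function below tests the two best-response conditions of Theorem 2 for given mixture weights p1 and q1.

    A = [[0, 3], [1, 2]]   # player 1's payoffs in Example 3
    B = [[9, 5], [3, 4]]   # player 2's payoffs

    def check(p1, q1):
        p, q = (p1, 1 - p1), (q1, 1 - q1)
        row_vals = [sum(A[i][j] * q[j] for j in range(2)) for i in range(2)]
        col_vals = [sum(p[i] * B[i][j] for i in range(2)) for j in range(2)]
        # a mixture is a best response iff it puts weight only on maximizers
        ok1 = all(abs(row_vals[i] - max(row_vals)) < 1e-9 or p[i] == 0 for i in range(2))
        ok2 = all(abs(col_vals[j] - max(col_vals)) < 1e-9 or q[j] == 0 for j in range(2))
        return ok1 and ok2

    print(check(0.2, 0.5))   # True: the equilibrium ((1/5, 4/5), (1/2, 1/2))
    print(check(0.5, 0.5))   # False: q puts weight on a non-best column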


EXERCISE 7  Consider the 3 × 3-bimatrix game

(A, B) = [ (2, 0)  (1, 1)  (4, 2)
           (3, 4)  (1, 2)  (2, 3)
           (1, 3)  (0, 2)  (3, 0) ].

Find the set of equilibria of this game.

EXAMPLE 4  We consider the 2 × 3-bimatrix game

(A, B) = [ (2, 1)  (1, 0)  (1, 1)
           (2, 0)  (1, 1)  (0, 0) ].

Note that the Nash equilibria of this game are contained in the set Δ2 × Δ3 of all possible strategy pairs. This set is the product of the sets Δ2 and Δ3 and can be represented by the following figure.

FIGURE 2  The set of all strategy pairs.

Here player 2 chooses a point in the triangle with vertices e1, e2 and e3, while player 1 chooses a point of the horizontal line segment with vertices e1 and e2.

In order to determine the best responses of player 1 we note that

Aq = (2q1 + q2 + q3, 2q1 + q2).

As e1 Aq = e2 Aq ⟺ q3 = 0, we may conclude that

β1(q) = {e1} if q3 > 0,  β1(q) = Δ2 if q3 = 0.

This yields the best-response set as depicted in the following figure:

[Figure: the best-response set B1.]



Similarly, the fact that

pB = (p1, p2, p1)

implies that

β2(p) = {e2} if p1 < p2,  β2(p) = Δ3 if p1 = p2,  β2(p) = {q | q2 = 0} if p1 > p2.

This yields the best-response set as depicted in the following figure:

[Figure: the best-response set B2.]

In the following figure you find the intersection of the two best-response sets:

FIGURE 3  The set of equilibria.

EXERCISE 8  Determine with the help of the best-response sets the set of equilibria of the game

(A, B) = [ (0, 0)  (2, 1)
           (2, 2)  (0, 2)
           (2, 2)  (0, 2) ].

Also represent the set of Nash equilibria in a figure.

3.4  The Interpretation of Mixed Strategy Nash Equilibria

Let hA1 , A2 , u1 , u2 i be a finite two-person game.


random device

In its most straightforward interpretation a mixed strategy entails a deliberate decision by a player to introduce randomness into his behavior: he uses some randomization device to select one of his actions. In fact a player, say i, is supposed to choose an element of Δ(Ai) (in the mixed extension) in the same way that he is supposed to choose an element of Ai (in the finite game).

In real life players sometimes really do introduce randomness into their behavior: players randomly bluff in card games, governments randomize tax audits, and some stores randomize discount offers.

extended game

In modeling a player's behavior as random, a mixed strategy Nash equilibrium captures the dependence of behavior on those external factors that are impossible or too costly to determine. This can be compared with the situation where the outcome of a coin toss is modeled as a random variable instead of describing this outcome as a result of the starting position and velocity of the coin, the wind speed, and so on.

To give an example, consider the mixed Nash equilibrium ((2/3, 1/3), (1/3, 2/3)) of the Battle of the Sexes (Example 5 in Chapter I). Now suppose that each player has three possible moods, determined by factors he does not understand. Each player is in each of these moods one-third of the time, independently of the other player's mood; his mood has no effect on his payoff. Assume that player 1, the man, chooses opera whenever he is in moods 1 or 2 and soccer when he is in mood 3, and that player 2, the woman, chooses opera when she is in mood 1 and soccer when she is in moods 2 or 3. This situation can be modeled as an (extended) game that has a pure equilibrium corresponding exactly to the mixed Nash equilibrium mentioned before.

beliefs

Another interpretation is that, in general, a player is not completely uncertain about the
behavior of his opponent. Often you expect your opponent to behave in a certain way on
the basis of information about the way that game or similar games were played in the past.
Or maybe the player has some experience with his opponent. The player can use this
information to form his belief about the future behavior of his opponent.
Suppose for instance that you are the goalkeeper in a soccer match and that you have to
stop a penalty of your opponent.
Your opponent has three actions:

L: to kick the ball in the left corner of the goal
M: to kick the ball through the centre of the goal
R: to kick the ball in the right corner of the goal.

Suppose you know that in foregoing matches your opponent has chosen the action R in 70 percent of the situations and the action M in 30 percent of the cases. This experience can be represented by the belief

(7/10) R + (3/10) M,

that corresponds with the vector q = (0, 3/10, 7/10).
Similarly, your opponent knows from the past that
- you decided in 50 percent of the cases to defend the corner at your left hand side (action l),
- you decided in 50 percent of the cases to defend the corner at your right hand side (action r),
- you never decided to defend the centre of the goal (action m).

This experience can be represented by the belief

(1/2) r + (1/2) l,

that corresponds with the vector p = (1/2, 0, 1/2).
Now the behavior of the two players is consistent with their beliefs if player 1 chooses an action
- in C(p) (player 2 believes he will do so)
- that is a best response against q (because player 1 believes that player 2 behaves according to q),

and if player 2 chooses an action
- in C(q) (player 1 believes he will do so)
- that is a best response against p (because player 2 believes that player 1 behaves according to p).

This means that C(p) ⊆ PB1(q) and C(q) ⊆ PB2(p), that is, the pair of beliefs forms an equilibrium.

Note that under this interpretation each player chooses a single action rather than a mixed strategy.


Chapter 4

Zero-Sum Games
In this chapter we focus on a special class of bimatrix games, namely those in which players
1 and 2 have strictly opposite preferences. This means that if player 1 prefers some outcome
a over some other outcome b, then player 2 prefers b over a. A typical example is the game
of chess. In this game there are three possible outcomes: a win for white, a tie, and a win
for black. Clearly, player white prefers a win for white over a tie, and prefers a tie over a win
for black, while player black prefers a win for black over a tie, and prefers a tie over a win for
white. Another example would be the division of a fixed amount of money, say 100 euros,
among two players. Every outcome can be represented as some pair (x, 100 − x) where x denotes the amount received by player 1, and 100 − x is the amount received by player 2. If both players are only interested in their own payoff, they have strictly opposite preferences. Namely, player 1 prefers the outcome (x, 100 − x) over (y, 100 − y) whenever x is greater than y, in which case player 2 would prefer the outcome (y, 100 − y) over (x, 100 − x).
A particularly convenient way of modeling strictly opposite preferences is by assuming that
the payoffs for player 1 and 2 always add up to zero. We refer to such games as zero-sum
games.
Definition 4.1 A zero-sum game is a bimatrix game in which
u1 (a1 , a2 ) + u2 (a1 , a2 ) = 0
for all pairs of actions (a1 , a2 ).
In order to see that zero-sum games reflect opposite preferences, consider two different action pairs (a1, a2) and (a1′, a2′), yielding two different outcomes. If player 1 strictly prefers (a1, a2) over (a1′, a2′), we must have that u1(a1, a2) > u1(a1′, a2′). But then, since u2 = −u1, it follows that u2(a1, a2) < u2(a1′, a2′) and hence player 2 strictly prefers (a1′, a2′) over (a1, a2).
As an illustration, consider the bimatrix game in Figure 1.

47

48

CHAPTER 4. ZERO-SUM GAMES

        c        d
a     1, −1   −2, 2
b     0, 0     3, −3

Figure 1
It is easily seen to be a zero-sum game since in each cell of the matrix the payoffs for both players add up to zero. Now suppose that player 1 chooses the pure strategy a. As a possible criterion for evaluating the pure strategy a, one could look at the minimum possible payoff that player 1 could possibly get by choosing a. This payoff is −2, obtained when player 2 happens to choose d, and we write

u1min(a) = −2.

So, one could say that player 1 can actually guarantee the payoff u1min(a) = −2 by choosing the pure strategy a. Similarly, the minimum possible payoff for player 1 by choosing pure strategy b is 0, obtained when player 2 happens to choose c. Hence,

u1min(b) = 0.

Is there a way for player 1 to guarantee a payoff higher than 0 in this game? Obviously, if player 1 could only choose pure strategies, this would be impossible, since the pure strategy that guarantees the most to player 1 is b, guaranteeing a payoff of 0. But perhaps player 1 can guarantee more than 0 by choosing a mixed strategy, that is, a probability distribution over the pure strategies a and b.
Take, for instance, the mixed strategy p = (1/3, 2/3), assigning probability 1/3 to a and probability 2/3 to b. If player 2 would choose c, the expected payoff for player 1 would be (1/3)·1 + (2/3)·0 = 1/3. If player 2 would choose d, the expected payoff for player 1 would be (1/3)·(−2) + (2/3)·3 = 4/3. Similarly, one can check that if player 1 chooses the mixed strategy p = (1/3, 2/3) and player 2 chooses an arbitrary mixed strategy, then the expected payoff for player 1 will always be greater than or equal to 1/3. So, we may say that

u1min(1/3, 2/3) = 1/3

is the expected payoff that player 1 can guarantee by choosing the mixed strategy p = (1/3, 2/3), and this expected payoff is obtained when player 2 happens to choose the pure strategy c. Hence, by playing the mixed strategy p = (1/3, 2/3), player 1 can guarantee more than by playing either one of the pure strategies.
Definition 4.2  Let p be a mixed strategy for player 1. Then, by

u1min(p) = min_q u1(p, q)

we denote the minimum expected payoff that player 1 can possibly get by choosing mixed strategy p.

For the mixed strategy p = (1/3, 2/3) above, we have seen that the expected payoff u1min(p) = 1/3 will be realized if player 2 chooses a pure strategy, namely c. This is not a coincidence: for any mixed strategy p for player 1, the minimum expected payoff u1min(p) can be computed by looking only at pure strategies for player 2.

Lemma 4.3  For every mixed strategy p for player 1, it holds that

u1min(p) = min_{a2∈A2} u1(p, a2).

Here, A2 denotes the set of actions, or pure strategies, for player 2.


Proof. Suppose that the mixed strategy q* for player 2 is such that u1min(p) = u1(p, q*). This means that

u1(p, q*) = min_q u1(p, q).

Take an arbitrary action a2* ∈ A2 with q*(a2*) > 0. We show that u1(p, a2*) = u1(p, q*). Suppose that this would not be true, so suppose that u1(p, a2*) ≠ u1(p, q*). Since for all pure strategies a2 ∈ A2 we have that u1(p, a2) ≥ min_q u1(p, q) = u1(p, q*), this would mean that u1(p, a2*) > u1(p, q*). But then this would imply that

u1(p, q*) = Σ_{a2∈A2} q*(a2) u1(p, a2) = q*(a2*) u1(p, a2*) + Σ_{a2≠a2*} q*(a2) u1(p, a2)
> q*(a2*) u1(p, q*) + Σ_{a2≠a2*} q*(a2) u1(p, q*) = (Σ_{a2∈A2} q*(a2)) u1(p, q*) = u1(p, q*),

which is a contradiction. Hence, we must have that u1(p, a2*) = u1(p, q*). Since u1min(p) = u1(p, q*) it follows that u1(p, a2*) = u1min(p). This completes the proof.
If player 1's decision criterion is the expected payoff u1min(p) that he can guarantee by choosing some mixed strategy p, then he would like to choose a mixed strategy p* for which u1min(p*) is maximal. Such a strategy p* is called a max-min strategy for player 1, since it maximizes the minimum possible payoff obtained by a mixed strategy.

Definition 4.4  A max-min strategy for player 1 is a mixed strategy p* such that

u1min(p*) ≥ u1min(p)

for all mixed strategies p.


In order to compute the max-min strategy for player 1 in Figure 1, choose an arbitrary mixed strategy p = (p1, 1 − p1) for player 1, assigning probability p1 to a and probability 1 − p1 to b. If player 2 chooses the pure strategy c, the expected payoff for player 1 would be

u1(p, c) = p1.

If player 2 chooses pure strategy d, the expected payoff for player 1 would be

u1(p, d) = p1·(−2) + (1 − p1)·3.

By interpreting u1(p, c) and u1(p, d) as functions of p1, and drawing these functions in the same graph, we obtain the following figure.

[Figure 2: the lines u1(p, c) and u1(p, d) as functions of p1.]
Here, u1(p, c) corresponds to the increasing line, whereas u1(p, d) corresponds to the decreasing line. By Lemma 4.3 we know that

u1min(p) = min{u1(p, c), u1(p, d)}

and hence the minimum expected payoff u1min(p) corresponds to the minimum of the two lines for every p1. That is, the graph of u1min(p) consists of the increasing line up to p1 = 1/2, and consists of the decreasing line for p1 between 1/2 and 1. But then, we see that u1min(p) has a unique maximum at p1 = 1/2. We may thus conclude that p* = (1/2, 1/2) is the unique max-min strategy for player 1 in this zero-sum game. In Figure 2, we also see that u1min(p*) = 1/2. This means that the highest expected payoff that player 1 can guarantee in the zero-sum game is equal to 1/2. We say that 1/2 is the value for player 1 in this zero-sum game. In order to guarantee his value 1/2, player 1 should play his max-min strategy p* = (1/2, 1/2).
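The whole computation can be reproduced with a few lines of Python; the grid search below is a crude but transparent substitute for reading off the kink of the two lines.

    u1 = {('a', 'c'): 1, ('a', 'd'): -2,
          ('b', 'c'): 0, ('b', 'd'): 3}

    def u1min(p1):
        """min over player 2's pure strategies of player 1's expected payoff."""
        vs_c = p1 * u1[('a', 'c')] + (1 - p1) * u1[('b', 'c')]
        vs_d = p1 * u1[('a', 'd')] + (1 - p1) * u1[('b', 'd')]
        return min(vs_c, vs_d)

    grid = [i / 1000 for i in range(1001)]
    best = max(grid, key=u1min)
    print(best, u1min(best))   # 0.5 0.5: max-min strategy (1/2, 1/2), value 1/2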
Definition 4.5  Let p* be a max-min strategy for player 1. Then, the expected payoff

v1 = u1min(p*)

that player 1 can guarantee by playing this max-min strategy is called the value for player 1 in the zero-sum game.

Note that the value v1 does not depend on which max-min strategy p* we choose: if p* and p̄ are two different max-min strategies for player 1, then we must have that u1min(p*) = u1min(p̄).

In a similar fashion, we may define max-min strategies for player 2, and the value for player 2.
Definition 4.6  Let q be a mixed strategy for player 2. Then, by

u2min(q) = min_p u2(p, q)

we denote the minimum expected payoff that player 2 can possibly get by choosing mixed strategy q.

A max-min strategy for player 2 is a mixed strategy q* such that

u2min(q*) ≥ u2min(q)

for all mixed strategies q for player 2.

The expected payoff u2min(q*) that player 2 can guarantee by playing a max-min strategy is called the value for player 2, and we denote it by v2.
In a similar way as we have done for player 1, we can now compute the max-min strategy and the value for player 2 in the zero-sum game of Figure 1. Choose an arbitrary mixed strategy q = (q1, 1 − q1) for player 2, assigning probability q1 to c and probability 1 − q1 to d. If player 1 chooses the pure strategy a, the expected payoff for player 2 would be

u2(a, q) = q1·(−1) + (1 − q1)·2.

If player 1 chooses the pure strategy b, the expected payoff for player 2 would be

u2(b, q) = q1·0 + (1 − q1)·(−3).

By interpreting these expected payoffs as functions of q1, and drawing these two functions in the same picture, we obtain the following figure.

[Figure 3: the lines u2(a, q) and u2(b, q) as functions of q1.]
Since Lemma 4.3 also applies to player 2, we know that

u2min(q) = min{u2(a, q), u2(b, q)}

and hence u2min(q), as a function of q1, corresponds to the minimum of the two lines for every q1. Thus, u2min(q) consists of the increasing line up to q1 = 5/6 and consists of the decreasing line for q1 between 5/6 and 1. The expected payoff u2min(q) obtains its maximum value at q1 = 5/6, which implies that q* = (5/6, 1/6) is the unique max-min strategy for player 2. In Figure 3 we can see that u2min(q*) = −1/2, and therefore the value for player 2 is v2 = −1/2.


In this particular example we discover the following remarkable relationship between the values for players 1 and 2: the value for player 2 is exactly the negative of the value for player 1. In other words: the highest expected payoff that player 2 can guarantee in the zero-sum game is exactly the negative of the highest expected payoff that player 1 can guarantee in the game. Is this a coincidence or is this a general property of zero-sum games? Well, we shall see below that this is in fact a general property that holds for all zero-sum games: it is always the case that v2 = −v1. This result has been shown by John von Neumann back in 1928, and constitutes one of the earliest achievements in game theory.

Definition 4.7  The expected payoff v = v1 = −v2 is called the value of the zero-sum game.

There is another remarkable property of zero-sum games that we would like to discuss. In the game of Figure 1, we have seen that players 1 and 2 both have a unique max-min strategy, namely p* = (1/2, 1/2) and q* = (5/6, 1/6). Since we have discussed the concept of Nash equilibrium in mixed strategies already, we might ask ourselves whether these max-min strategies have anything to do with the concept of Nash equilibrium in mixed strategies. By applying the method presented in Section 3 of Chapter II, we may compute all Nash equilibria in mixed strategies for the game in Figure 1, and find out that the game has a unique Nash equilibrium, in which player 1 chooses the mixed strategy (1/2, 1/2) and player 2 chooses the mixed strategy (5/6, 1/6). However, this means that in this particular zero-sum game, the set of Nash equilibria in mixed strategies coincides exactly with the set of max-min strategies for the players! Again, the question arises whether this is a coincidence, or whether this is a general property of zero-sum games. It turns out that this also is a general property: in all zero-sum games, the set of Nash equilibria in mixed strategies is exactly equal to the set of pairs of max-min strategies.
We summarize the above mentioned results in the following theorem.
Theorem 4.8  Every zero-sum game has the following properties:
(a) v1 = −v2, that is, the value for player 1 is the negative of the value for player 2;
(b) every pair of max-min strategies (p, q) yields the same expected payoffs, namely v1 for player 1 and −v1 for player 2;
(c) a pair (p, q) of mixed strategies is a Nash equilibrium if and only if both p and q are max-min strategies.
Proof. (a) From Theorem 1 in Chapter II we know that every finite strategic game has a Nash equilibrium in mixed strategies. So, the zero-sum game under consideration has at least one Nash equilibrium in mixed strategies. Choose such a Nash equilibrium (p*, q*). By the definition of a Nash equilibrium, it holds that u1(p*, q*) = max_p u1(p, q*) and u2(p*, q*) = max_q u2(p*, q). Since u1(p, q*) = −u2(p, q*) for all mixed strategies p for player 1, we have that u1(p*, q*) = max_p (−u2(p, q*)) and hence

u2(p*, q*) = min_p u2(p, q*) = u2min(q*).   (4.1)

Since v2 = max_q u2min(q) ≥ u2min(q*), it follows that

u2(p*, q*) ≤ v2.   (4.2)

On the other hand, since u2(p*, q*) = max_q u2(p*, q) and u2(p*, q) = −u1(p*, q) for all mixed strategies q for player 2, we have that u2(p*, q*) = max_q (−u1(p*, q)) and hence

u1(p*, q*) = min_q u1(p*, q) = u1min(p*).   (4.3)

Since v1 = max_p u1min(p) ≥ u1min(p*), it follows that

u1(p*, q*) ≤ v1.   (4.4)

Since the game is a zero-sum game, we must have that u1(p*, q*) + u2(p*, q*) = 0. It then follows from (4.2) and (4.4) that

0 = u1(p*, q*) + u2(p*, q*) ≤ v1 + v2

and hence

v1 + v2 ≥ 0.   (4.5)

Now, let (p̄, q̄) be a pair of max-min strategies, meaning that u1min(p̄) = v1 and u2min(q̄) = v2. Then

u1(p̄, q̄) ≥ min_q u1(p̄, q) = u1min(p̄) = v1   (4.6)

and

u2(p̄, q̄) ≥ min_p u2(p, q̄) = u2min(q̄) = v2.   (4.7)

Since u1(p̄, q̄) + u2(p̄, q̄) = 0, it follows that

0 = u1(p̄, q̄) + u2(p̄, q̄) ≥ v1 + v2,

and hence

v1 + v2 ≤ 0.   (4.8)

From (4.5) and (4.8) we may conclude that v1 + v2 = 0, and hence v1 = −v2. This completes the proof of (a).

(b) Let (p̄, q̄) be an arbitrary pair of max-min strategies. We have seen in (4.6) and (4.7) that u1(p̄, q̄) ≥ v1 and u2(p̄, q̄) ≥ v2. However, since we know that u1(p̄, q̄) + u2(p̄, q̄) = 0 and v1 + v2 = 0, it follows that

u1(p̄, q̄) = v1 and u2(p̄, q̄) = v2 = −v1.

This completes the proof of (b).

(c) Let (p*, q*) be a Nash equilibrium. Then, we know from (4.2) and (4.4) that

u1(p*, q*) ≤ v1 and u2(p*, q*) ≤ v2.

Since u1(p*, q*) + u2(p*, q*) = 0 and v1 + v2 = 0, this implies that

u1(p*, q*) = v1 and u2(p*, q*) = v2.   (4.9)

Moreover, from (4.1) and (4.3) we know that u1(p*, q*) = u1min(p*) and u2(p*, q*) = u2min(q*). Together with (4.9) this yields that u1min(p*) = v1 and u2min(q*) = v2, which means that both p* and q* are max-min strategies. So we have shown that every Nash equilibrium in mixed strategies (p*, q*) consists of max-min strategies.

Now, suppose that (p̄, q̄) is a pair of max-min strategies. Then, we know from part (b) that u1(p̄, q̄) = v1 = u1min(p̄) = min_q u1(p̄, q). Since u2(p̄, q) = −u1(p̄, q) for all mixed strategies q for player 2, this implies that

u2(p̄, q̄) = max_q u2(p̄, q).   (4.10)

By exchanging the roles of players 1 and 2, we may show in a similar way that

u1(p̄, q̄) = max_p u1(p, q̄).   (4.11)

However, (4.10) and (4.11) imply that (p̄, q̄) is a Nash equilibrium. We thus have shown that every pair (p̄, q̄) of max-min strategies constitutes a Nash equilibrium. This completes the proof of (c).
Exercise 1. Consider the following zero-sum game.

        c        d
a     6, −6    1, −1
b     2, −2    3, −3

(a) Compute the max-min strategies for players 1 and 2 and the value of the game.
(b) Compute the Nash equilibria in mixed strategies and show that they coincide with the
max-min strategies for the players.
Exercise 2. Consider the following zero-sum game.

        c        d
a     2, −2    1, −1
b     2, −2   10, −10

(a) Compute the max-min strategies for players 1 and 2 and the value of the game.
(b) Compute the Nash equilibria in mixed strategies and show that they coincide with the
max-min strategies for the players.
In order to compute the value of a zero-sum game, it is sufficient to either compute the value
for player 1 or the value for player 2, since the value for player 2 is simply the negative of the
value for player 1. This insight enables us to graphically compute the value of a zero-sum
game whenever one player has two pure strategies and the other player has an arbitrary
(possibly very large) number of pure strategies. Consider, for instance, the zero-sum game
in Figure 4 in which player 1 has three pure strategies and player 2 has two pure strategies.

        d        e
a    −1, 1    −1, 1
b     1, −1   −4, 4
c    −4, 4     1, −1

Figure 4
By using the same method as above, we may now compute the value v2 for player 2. Let q = (q1, 1 − q1) be an arbitrary mixed strategy for player 2. Then, we know that

u2min(q) = min{u2(a, q), u2(b, q), u2(c, q)}.

By interpreting u2(a, q), u2(b, q) and u2(c, q) as functions of q1, and drawing their graphs in the same picture, we obtain the following figure.

55

4
3
2
1
0
0.2

0.4

0.6

0.8

-1
Figure 5
The graph of u2min(q) thus consists of the increasing line up to q1 = 2/5, of the horizontal line between q1 = 2/5 and q1 = 3/5, and of the decreasing line for q1 ≥ 3/5. Player 2 therefore has multiple max-min strategies, namely all those mixed strategies (q1, 1 − q1) where 2/5 ≤ q1 ≤ 3/5. The value for player 2 is v2 = 1. Since v1 = −v2, we may conclude that the value for player 1 equals −1, and hence the value of the game is −1. Note that it is not necessary to compute the max-min strategies for player 1 in order to compute the value of the game. Nevertheless, we may conclude that the maximal expected payoff that player 1 can guarantee is equal to −1. So, every max-min strategy for player 1 guarantees a payoff of −1. In the game, it is easily seen that the pure strategy a for player 1 guarantees a payoff of −1, and hence we may conclude that the pure strategy a is a max-min strategy for player 1.
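Numerically, the plateau of max-min strategies shows up as follows (a Python sketch using the payoffs of Figure 4 as reconstructed above and the pure-strategy minimum of Lemma 4.3):

    u2 = {('a', 'd'): 1,  ('a', 'e'): 1,
          ('b', 'd'): -1, ('b', 'e'): 4,
          ('c', 'd'): 4,  ('c', 'e'): -1}

    def u2min(q1):
        """min over player 1's pure strategies a, b, c (Lemma 4.3 for player 2)."""
        return min(q1 * u2[(r, 'd')] + (1 - q1) * u2[(r, 'e')] for r in 'abc')

    grid = [i / 100 for i in range(101)]
    v2 = max(u2min(q) for q in grid)
    maximizers = [q for q in grid if abs(u2min(q) - v2) < 1e-9]
    print(v2, min(maximizers), max(maximizers))   # 1.0 0.4 0.6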
Exercise 3. Consider the following zero-sum game.
        c        d         e       f
a     5, −5   10, −10    0, 0    2, −2
b     2, −2    5, −5     2, −2   6, −6

Compute the value of this game. Can you compute the max-min strategies for player 1?
Can you compute the max-min strategies for player 2?
We conclude this section with a theorem on chess, known as Zermelo's theorem (1913), which is often viewed as the oldest result in game theory. Although Zermelo's original proof of 1913 is quite complex, the result is surprisingly easy to prove with the help of Theorem 4.8. The game of chess is a classical example of a zero-sum game. In this game, there are three possible outcomes: a win for white, a draw, and a win for black. If we identify player 1 with white and player 2 with black, then we may define the payoffs as follows: a win for white gives payoffs (1, −1), a draw gives payoffs (0, 0), and a win for black gives payoffs (−1, 1). In order to guarantee that the game stops after finitely many moves, we assume the following stopping rule: if the same configuration on the chess board has occurred more than

twice, the game ends in a draw. Since there are only finitely many possible configurations
on the chess board, the game must stop after finitely many moves.
As we will see in Chapter IV, the game of chess is a so-called extensive form game with
perfect information. A pure strategy for a player in this game is a complete plan which
specifies a move for this player for every possible configuration that can possibly occur on
the chess board. Theorem 1 in Chapter IV states that every extensive form game with
perfect information has a Nash equilibrium in pure strategies, so we know that the game of
chess has a Nash equilibrium in pure strategies. This insight, together with Theorem 4.8,
will be enough to prove Zermelos theorem.
Theorem 4.9 (Zermelo's Theorem)  In the game of chess, either white has a pure strategy that guarantees a win, or black has a pure strategy that guarantees a win, or both white and black have a pure strategy that guarantees at least a draw.

Note that Zermelo's theorem does not say which of the three statements above is true; it only says that one of these statements must be true. In fact, anno 2005 we still do not know which of the three statements is true.
Proof. We have seen above that the game of chess has at least one Nash equilibrium in pure strategies. Choose such a Nash equilibrium (a1*, a2*) in pure strategies. From Theorem 4.8(c) we know that a1* and a2* are max-min strategies for players 1 and 2, respectively. We distinguish three possible cases.

Case 1. (a1*, a2*) leads to a win for white. In this case, we have that u1(a1*, a2*) = 1. By Theorem 4.8(b) we then know that the value of the game is equal to 1. Hence, player 1 (white) can guarantee 1 by choosing the pure max-min strategy a1*, which means that white can guarantee a win by playing a1*.

Case 2. (a1*, a2*) leads to a draw. In this case, u1(a1*, a2*) = 0, and by Theorem 4.8(b) it follows that the value of the game is 0. Hence, both player 1 (white) and player 2 (black) can guarantee 0 by choosing the pure max-min strategies a1* and a2*, respectively. This means that both white and black can guarantee at least a draw by playing a1* and a2*, respectively.

Case 3. (a1*, a2*) leads to a win for black. In this case, u1(a1*, a2*) = −1, and by Theorem 4.8(b) it follows that the value of the game is −1. In particular, the value v2 for player 2 is 1, which means that player 2 (black) can guarantee 1 by playing the pure max-min strategy a2*. This means that black can guarantee a win by playing a2*. This completes the proof.
Exercise 4. You all probably know the game of Tic-Tac-Toe (or 'Boter, Kaas en Eieren' in Dutch). It is played on a 3 × 3 board. Player 1 starts by putting a cross on one of the nine fields. Afterwards, player 2 puts a circle on one of the eight remaining fields. Then, player 1 puts a cross on one of the seven remaining fields, and so on. If one of the two players achieves three pieces in a row (either vertically, horizontally or diagonally), this player wins. If the board is full, and no player has three pieces in a row, the game ends in a draw.
(a) Design a pure max-min strategy for player 1. Show that this max-min strategy guarantees
at least a draw to him.
(b) Show that player 1 cannot guarantee a win.
(c) What is the value of the game?

Chapter 5

Extensive Form Games: Perfect Information
In this chapter we study the model of an extensive form game or tree game with perfect information. An extensive form game reflects the sequential structure of the decision problems
faced by players in a conflict situation.
We argue that for such a game Nash equilibria are unsatisfactory since they ignore the
sequential structure of the decision problem. Therefore we discuss the alternative notion
of subgame perfect equilibrium, in which a player is required to reassess his plans as play
proceeds.

5.1  Games in Extensive Form: the Model

An extensive form game is a detailed description of the sequential structure of the decision
problems faced by players in a conflict situation. There is perfect information in such a
game if each player, when making a decision, is perfectly informed of all events that have
previously occurred.
An important tool to represent such games are trees.

tree

DEFINITION  A tree is a finite set of points, called nodes, together with a set of edges, each of which connects a pair of nodes, such that each pair of nodes is connected by exactly one path of edges.

root

The node at the top of the tree is called the root of the tree; it represents the beginning of the game.

terminal nodes

The nodes that are at the end of the tree are called the terminal nodes. They represent the possible ways the game could end.

decision nodes

The nonterminal nodes of the tree are called decision nodes. Each decision node corresponds to the moment a specific player has to move. The edges leaving that node correspond to the actions available to that player at that decision moment.
The extensive form of a finite game with perfect information is a tree and a specification of:
(1) the order of the moves in the game
(2) for every decision moment, which player has to move
(3) the choices available to a player when he has to move
(4) the payoffs for all the players.
EXAMPLE 1  As an example we consider the game Γ1:

[Game tree Γ1: player 1 moves first at node x, choosing L or R; player 2 then moves at node y1 (after L) or y2 (after R), choosing l or r. The payoffs are (3, 1) after (L, l), (1, 2) after (L, r), (2, 1) after (R, l), and (0, 0) after (R, r).]

This game models a situation where first player 1 takes a decision and then player 2, who knows what player 1 has done, takes his decision.

To be more specific: player 1 chooses an action a1 from the set {L, R}. Then player 2 observes a1 and subsequently chooses an action a2 from the set {l, r}.

The payoff ui(a1, a2) to player i can be found at the end of the game tree.

The decision nodes are called x (for player 1) and y1 and y2 (for player 2). The game has four terminal nodes.

A strategy for a player in an extensive form game is any rule that can be used to determine a move at every possible decision node of that player. So

strategy

DEFINITION  A (pure) strategy of a player in an extensive form game is a plan that specifies the action chosen by the player for each of his decision nodes.

In the game Γ1 the strategies of player 1 are L and R. For player 2 the strategies are

(l, l), (l, r), (r, l)  and  (r, r),

where (l, r) denotes the strategy where player 2 chooses l at the decision node y1 and r at the decision node y2.



EXERCISE 1  Two people use the following procedure to share two desirable identical indivisible objects. Player 1 proposes an allocation, which is accepted or rejected by player 2. If player 2 rejects, neither person receives any object. Each person cares only about the number of objects he obtains.

Model this situation as a game in extensive form. Also determine the strategies of the players of this game.

REMARK 1  The following game Γ2

[Game tree Γ2: player 1 chooses L or R at node x1; R ends the game with payoffs (1, 1); after L, player 2 moves at node y, where one action ends the game with payoffs (3, 3) and the other leads to player 1's second decision node x2, at which L′ yields (2, 0) and R′ yields (0, 2).]

illustrates an important point: a strategy specifies the action chosen by a player for each of his decision nodes, even for those decision nodes that are never reached if the strategy is followed. In this game player 1 has four strategies: (L, L′), (L, R′), (R, L′) and (R, R′). Here the first action specifies the action for the decision node x1 and the second action specifies the action for the decision node x2. Note that for the strategies (R, L′) and (R, R′) an action is specified for the decision node x2 while this node is never reached if player 1 chooses his action R at his decision node x1.

As in a strategic game we can define a mixed strategy to be a probability distribution over


the set of (pure) strategies. In extensive form games with perfect information little is added
by considering such strategies. In the next chapter, where we study extensive form games in
which the players are not perfectly informed when taking actions, mixed strategies will be
considered.

5.2  The Strategic Form of an Extensive Form Game

Once we have determined the (pure) strategies for the players of a game in extensive form,
the game can be represented as a game in strategic form. The game 1 for example can be


represented as the 2 × 4-bimatrix game

         (l, l)   (l, r)   (r, l)   (r, r)
L       (3, 1)   (3, 1)   (1, 2)   (1, 2)
R       (2, 1)   (0, 0)   (2, 1)   (0, 0)

strategic form

This game is called the strategic form of the game Γ1.


With the help of the strategic form of a game Γ we can define a Nash equilibrium of an extensive form game.

Nash equilibrium

DEFINITION  A Nash equilibrium of a game Γ in extensive form is a (pure) Nash equilibrium of the strategic form G of Γ.

By considering the bimatrix game described before we observe that the pair (R, (r, l)) is a Nash equilibrium of the game Γ1 in extensive form.
EXERCISE 2

Determine the strategic form of the game described in Exercise 1. Also find
the Nash equilibria of this game.

THEOREM 1  An extensive form game of perfect information has a (pure) Nash equilibrium.

PROOF  Since the game is finite, it has a set of penultimate nodes, i.e., nodes whose immediate successors are terminal nodes. Specify that the player who can move at each such node chooses the action leading to the successive terminal node with the highest payoff for him (in case of a tie, make an arbitrary selection). Now specify that each player at nodes whose immediate successors are penultimate nodes chooses the action that maximizes his payoff over the feasible successors, given that players at the penultimate nodes play as we have just specified. We can now roll back through the tree, specifying actions at each node. When we are done, we will have specified a strategy for each player. It is easy to check that these strategies form a Nash equilibrium.

backwards induction

The procedure used in this proof is often referred to as backwards induction.
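The procedure is easy to implement. The following Python sketch runs backwards induction on the game Γ1 of Example 1; the tree encoding (lists for decision nodes, tuples for terminal payoff vectors) is our own, and ties are broken arbitrarily by max.

    def backward_induction(node):
        """Return the payoff vector reached when every player plays optimally."""
        if isinstance(node, tuple):          # terminal node: a payoff vector
            return node
        player, children = node              # decision node: [player, {action: subtree}]
        outcomes = {a: backward_induction(child) for a, child in children.items()}
        best = max(outcomes, key=lambda a: outcomes[a][player])
        print("player", player + 1, "chooses", best)
        return outcomes[best]

    # Gamma_1: player 1 (index 0) moves first, then player 2 (index 1).
    gamma1 = [0, {'L': [1, {'l': (3, 1), 'r': (1, 2)}],
                  'R': [1, {'l': (2, 1), 'r': (0, 0)}]}]
    print(backward_induction(gamma1))        # prints the choices and then (2, 1)

The result is the outcome (2, 1) of the Nash equilibrium (R, (r, l)) found above.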


Stackelberg Model of Duopoly

EXAMPLE 2  Stackelberg (1934) proposed a dynamic model of duopoly in which a dominant (or leader) firm moves first and a subordinate (or follower) firm moves second. Following Stackelberg, we will develop the model under the assumption that the firms choose quantities as in the Cournot model in Example 8 of Chapter 2.

The timing of the game is as follows:
(1) firm 1 chooses a quantity q1 ≥ 0
(2) firm 2 observes q1 and then chooses a quantity q2 ≥ 0
(3) the payoff to firm i is given by the profit function

ui(q1, q2) = qi(a − q1 − q2) − cqi  if q1 + q2 < a,  and  ui(q1, q2) = −cqi  if q1 + q2 ≥ a,

where c is the constant marginal cost of production (no fixed costs).


In order to apply the backwards induction procedure, we first determine the reaction of the follower to an arbitrary quantity of the leader. Well, we know from Example 8 of Chapter I that the best response to a quantity q1 is given by

B2(q1) = max{0, (1/2)(a − c − q1)}.

Now the leader should anticipate that the quantity choice q1 will be met with the reaction B2(q1). Thus, the problem of the leader amounts to

max_{q1≥0} u1(q1, B2(q1)) = max_{q1≥0} q1 [a − q1 − B2(q1) − c] = max_{q1≥0} (1/2) q1 (a − c − q1).

So the backwards induction procedure leads to

q1 = (1/2)(a − c)  and  B2(q1) = (1/4)(a − c).

EXERCISE 3  Compare the position of the two firms in the Stackelberg game in the foregoing example with the Cournot game in Example 8 of Chapter 2.


5.3  Subgame Perfect Equilibria

As you can check, the game Γ3

[Game tree Γ3: player 1 chooses L or R at node x; L ends the game with payoffs (0, 2); after R, player 2 chooses at node y between l, with payoffs (−1, −1), and r, with payoffs (1, 1).]

has two Nash equilibria: (L, l) and (R, r). But one could argue that the equilibrium (L, l) is not very plausible. If action R would have been chosen by player 1, then player 2 would choose r over l, since he obtains a higher payoff by doing so. The equilibrium (L, l) is sustained by the threat of player 2 to choose l if player 1 chooses R. This threat is not credible, since player 2 has no way of committing himself to this choice. Thus player 1 can be confident that if he chooses R then player 2 will choose r; since he prefers the outcome (1, 1) to the Nash equilibrium outcome (0, 2), he has an incentive to deviate from the equilibrium and choose R.
EXERCISE 4

Discuss the plausibility of the 9 Nash equilibria of the extensive form game
introduced in Exercise 1.

In the foregoing chapter we ruled out implausible equilibria as discussed before by introducing the concept of perfect equilibria. Next we are going to define a new notion of equilibrium
that also rules out these implausible equilibria. This notion is based on the following consideration. In a game, once a decision node x is reached, the part of the game tree which
does not come after x has become irrelevant. Therefore the decision at the node x should
be based only on that part of the tree which comes after x. This subtree starting at x
constitutes a game of its own, called the subgame starting at x. An equilibrium of the game
is sensible only if it prescribes an equilibrium also in this subgame. Equilibria which possess
this property are called subgame perfect.
We begin by defining the notion of a subgame.

subgame

DEFINITION  A subgame Γ′ of an extensive form game Γ is an extensive form game satisfying the following properties:
(1) the first decision node (root) of the game Γ′ is one of the decision nodes, say x, of the game Γ
(2) it includes all the decision nodes, edges and terminal nodes following x in the game tree of Γ
(3) it includes the payoffs corresponding to the terminal nodes following x.

The subgame corresponding to a node x is denoted by Γx.

Since a strategy of a player of an extensive form game is a plan that specifies the actions
chosen by that player for each of his decision nodes, it is clear how such a strategy induces
a strategy for a subgame of that game: the specification of the actions to be chosen by the
player is restricted to the decision nodes of the subgame.

DEFINITION (subgame perfect equilibrium)
A strategy profile is called a subgame perfect equilibrium of a game Γ if for each subgame
of Γ the restriction of this profile is a Nash equilibrium for that subgame.

Because any game is a subgame of itself, a subgame perfect equilibrium is necessarily a Nash
equilibrium. If the only subgame is the whole game, the sets of Nash and subgame perfect
equilibria coincide. If there are other subgames, some Nash equilibria may fail to be
subgame perfect.
EXAMPLE 3
The game Γ3 has two subgames: the game itself and the game Γy:

    [Figure: the subgame Γy, in which player 2 chooses l or r at node y; the payoffs are
    given in the figure.]

The strategies of the subgame Γy are l and r. Since the strategy r is the only equilibrium
of the subgame Γy, (R, r) is a subgame perfect equilibrium of the game Γ3 and (L, l) is not
subgame perfect.

EXERCISE 5
Determine the subgame perfect equilibria of the game introduced in Exercise 1.

EXERCISE 6
Determine the subgame perfect equilibria of the games Γ1 and Γ2. Show that there is an
order of elimination of weakly dominated actions in the strategic form of the game Γ2 that
eliminates the unique subgame perfect equilibrium of Γ2.

EXERCISE 7
Consider the game Γ4:

    [Figure: the game tree of Γ4, a game with perfect information; the decision nodes are
    x1 and x2, and the four terminal payoff vectors are given in the figure.]



a) Determine the strategic form of this game.
b) Determine the Nash equilibria of this game.
c) Determine the subgame perfect equilibria of this game.
d) Determine the backwards-induction outcomes of this game.

5.4  Applications of Backwards Induction

In this section we discuss two examples of the backwards-induction principle.


EXAMPLE 4  Wages and Employment in a Unionized Firm
We consider a model due to Leontief (1946) of the relationship between a union that has
exclusive control over wages and a firm which has exclusive control over employment. The
union's utility function U depends on the wage w the union demands from the firm and on
the employment L. We assume that U is increasing both in w and in L. The firm's profit
function is defined by π(w, L) = R(L) − wL, where R(L) is the revenue the firm can earn
if it employs L workers. We assume that R : [0, ∞) → ℝ has the following properties:
(a) R(0) = 0
(b) R′ > 0
(c) R″ < 0
(d) R′(L) → ∞ as L ↓ 0
(e) lim_{L→∞} R′(L) = 0.
So R is an increasing and concave function as suggested in Figure 1.

Suppose the timing of the game is as follows:
(1) the union makes a wage demand w
(2) the firm observes (and accepts) w and then chooses employment L
(3) the payoffs are U(w, L) and π(w, L).
Although the functions U and R are not explicitly given, we can characterize the firm's
best response B2(w) in stage (2) to an arbitrary wage demand w by the union in stage (1).
Given w, the firm chooses B2(w) to solve

    max_{L ≥ 0} π(w, L) = max_{L ≥ 0} [R(L) − wL].

EXERCISE 8
Consider the function f : [0, ∞) → ℝ defined by

    f(L) = R(L) − wL,

where w > 0.
a) Prove that an ε > 0 exists such that f(L) > 0 for all L ∈ (0, ε]. This means that the
function has no maximum at L = 0.
In view of part a) a maximum of the function f can be found by solving the equation

    R′(L) − w = 0.

b) Prove that a solution of this equation leads to a maximum indeed.
c) Prove that this equation has at least one solution.
d) Prove that this equation is uniquely solvable.


The unique solution of the equation R′(L) = w can be found as suggested in the following
figure.

FIGURE 1  The function R
    [Figure: the graph of R; the point at which the tangent line to R has slope w
    determines the employment level B2(w).]
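This procedure is easy to carry out numerically. The sketch below uses the illustrative revenue function R(L) = 2·√L (our choice, satisfying properties (a)-(e)); since R′ is then strictly decreasing, bisection locates the unique solution of R′(L) = w guaranteed by Exercise 8.

```python
import math

# Illustrative revenue function satisfying (a)-(e): R(L) = 2*sqrt(L),
# so R'(L) = 1/sqrt(L), which decreases from +infinity to 0.
def R_prime(L):
    return 1.0 / math.sqrt(L)

def B2(w, lo=1e-12, hi=1e12):
    # R' is strictly decreasing, so R'(L) = w has a unique solution;
    # bisection finds it because R'(lo) > w > R'(hi).
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if R_prime(mid) > w:
            lo = mid   # R' still too large: the solution lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

w = 0.5
print(B2(w))   # approx 4.0, since 1/sqrt(L) = 0.5 gives L = 1/w**2 = 4
```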

Next we turn to the union's problem at stage (1). Obviously the union should anticipate
that the firm's reaction to the wage demand w will be to choose the employment level
B2(w). So the union's problem amounts to

    max_{w ≥ 0} U(w, B2(w)).

Or, to put it another way: the union would like to choose the point on the graph of the
best response function B2 with the highest possible utility. Since B2(w) → 0 as w → ∞
and B2(w) → ∞ as w ↓ 0, the graph of B2 looks like

    [Figure: the downward-sloping graph of B2 in the (w, L)-plane.]
Now let us consider the union's indifference curves (being the level curves of the function U)
as depicted in the following figure:

    [Figure: the union's indifference curves in the (w, L)-plane.]

Holding L fixed, the union does better when w is higher. So higher indifference curves
represent higher utility levels for the union. As a consequence, the union would like to
choose the wage demand w that yields the outcome (w, B2(w)) that is on the highest
possible indifference curve. Thus the union's solution is the wage demand w* such that the
indifference curve through the point (w*, B2(w*)) is tangent to the graph of B2 at that
point:

    [Figure: the tangency of an indifference curve of the union with the graph of B2 at
    the point (w*, B2(w*)).]

So (w*, B2(w*)) is the backwards-induction outcome of this game.




EXAMPLE 5  Sequential Bargaining


Players 1 and 2 are bargaining over one dollar. They alternate in making offers: first
player 1 makes a proposal that player 2 can accept or reject; if 2 rejects then 2 makes a
proposal that 1 can accept or reject; and so on. Once an offer has been rejected, it ceases
to be binding and is irrelevant to the subsequent play of the game. Each offer takes one
period, and the players discount payoffs received in later periods by the factor δ ∈ (0, 1)
per period.
We will model this situation as the following three-period bargaining game:
(1a) At the beginning of the first period, player 1 proposes to take a share s1 of the
dollar, leaving 1 − s1 for player 2.
(1b) Player 2 either accepts the offer (in which case the game ends and the payoffs s1 to
player 1 and 1 − s1 to player 2 are immediately received) or rejects the offer (in which
case play continues in the second period).
(2a) At the beginning of the second period, player 2 proposes that player 1 takes a share
s2 of the dollar, leaving 1 − s2 for player 2.
(2b) Player 1 either accepts the offer (in which case the game ends and the payoffs s2 to
player 1 and 1 − s2 to player 2 are immediately received) or rejects the offer (in which
case play continues in the third period).
(3) At the beginning of the third period, player 1 receives a share s of the dollar, leaving
1 − s for player 2, where s ∈ (0, 1) is given exogenously.
To solve for the backwards induction outcome of this game, we first compute player 2's
optimal offer if the second period is reached.

period 2
Player 1 can receive s in the third period by rejecting player 2's offer of s2 this period.
Note however that the value this period of receiving s in the next period is only δs. Thus
player 1 will accept s2 if and only if s2 ≥ δs.
The decision problem for player 2 in the second period therefore amounts to choosing between
receiving 1 − δs this period (by offering s2 = δs to player 1) and
receiving 1 − s next period (by offering player 1 any s2 < δs).
The discounted value of the latter option is δ(1 − s). Because this is less than the 1 − δs
available from the former option, player 2's optimal offer in the second period is

    s2* = δs.

period 1
In the first period player 1 knows that player 2 can receive 1 − s2* in the second period
(by rejecting player 1's offer of s1 this period). Receiving 1 − s2* in the next period has a
value of δ(1 − s2*) in this period. Thus player 2 will accept 1 − s1 if and only if

    1 − s1 ≥ δ(1 − s2*), i.e. s1 ≤ 1 − δ(1 − s2*).

The decision problem for player 1 in the first period therefore amounts to choosing between
receiving 1 − δ(1 − s2*) this period (by offering 1 − s1 = δ(1 − s2*) to player 2) and
receiving s2* in the next period (by offering any 1 − s1 < δ(1 − s2*) to player 2).
The discounted value of the latter option is δ·s2* = δ²s. Because this is less than the
1 − δ(1 − s2*) = 1 − δ + δ²s available from the former option, player 1's optimal choice in
the first period is

    s1* = 1 − δ(1 − s2*) = 1 − δ(1 − δs).

Thus, the backwards-induction outcome is: in period 1 player 1 offers δ(1 − δs) to player 2,
who accepts.
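The two steps of this backwards induction argument are easy to mechanize. The sketch below (δ = 0.9 and s = 0.4 are illustrative values, not from the text) computes s2* and s1* exactly as derived above.

```python
# Backwards induction in the three-period bargaining game
# (a sketch; delta = 0.9 and s = 0.4 are illustrative values).

def three_period_outcome(delta, s):
    # Period 2: player 1 accepts s2 iff s2 >= delta*s, so player 2 offers
    s2_star = delta * s
    # Period 1: player 2 accepts 1 - s1 iff 1 - s1 >= delta*(1 - s2*), so
    s1_star = 1 - delta * (1 - s2_star)
    return s1_star, 1 - s1_star

delta, s = 0.9, 0.4
s1, share2 = three_period_outcome(delta, s)
print(s1, share2)
# With delta = 0.9 and s = 0.4: s1* = 1 - 0.9*(1 - 0.36) = 0.424,
# and player 2 receives delta*(1 - delta*s) = 0.576.
```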

EXERCISE 9

Suppose a parent and child play the following game. First, the child takes an action A that
produces income IC(A) for the child and income IP(A) for the parent. Second, the parent
observes the incomes IC and IP and then chooses a bequest (in Dutch: legaat) B to leave
to the child. The child's payoff is U(IC + B); the parent's is V(IP − B) + kU(IC + B),
where k > 0 represents the parent's concern for the child's well-being. Assume that:
- the action is a nonnegative number A ≥ 0
- the income functions IC and IP are strictly concave and are maximized at AC > 0
  and AP > 0, respectively
- the bequest B can be positive or negative, and
- the utility functions U and V are increasing and strictly concave.
Prove the Rotten Kid Theorem: in the backwards-induction outcome, the child chooses the
action that maximizes the family's aggregate income IC + IP, even though only the parent's
payoff exhibits altruism.
EXERCISE 10
Suppose a firm wants a worker to invest in a firm-specific skill S, but the skill is too
nebulous for a court to verify whether the worker has acquired it. The firm therefore cannot
contract to repay the worker's cost of investing: even if the worker invests, the firm can
claim that the worker did not invest, and the court cannot tell whose claim is true.
Likewise, the worker cannot contract to invest if paid in advance.
It may be that the firm can use the (credible) promise of a promotion as an incentive for
the worker to invest, as follows. Suppose that there are two jobs in the firm, one easy (E)
and the other difficult (D), and that the skill is valuable on both jobs but more so on the
difficult job: yD0 < yE0 < yES < yDS, where yij is the worker's output in job i (= E or D)
when the worker's skill level is j (= 0 or S). Assume that the firm can commit to paying
different wages in the two jobs, wE and wD, but that neither wage can be less than the
worker's alternative wage, which we normalize to zero.
The timing of the game is as follows:
- At date 0 the firm chooses wE and wD and the worker observes these wages.
- At date 1 the worker joins the firm and can acquire the skill S at cost C. (We ignore
  production and wages during this first period. Since the worker has not yet acquired
  the skill, the efficient assignment is to job E.) Assume that yDS − yE0 > C, so that it
  is efficient for the worker to invest.
- At date 2 the firm observes whether the worker has acquired the skill and then decides
  whether to promote the worker to job D for the worker's second (and last) period of
  employment.
The firm's second-period profit is yij − wi when the worker is in job i and has skill level j.
The worker's payoff from being in job i in the second period is wi or wi − C, depending on
whether the worker invested in the first period.
Solve for the backwards-induction outcome.

Chapter 6

Extensive Form Games: Imperfect Information

In this chapter we deal with extensive form games in which a player, when taking an action,
may have only partial information about the actions taken previously.

6.1  Information Sets

For all the classes of games we studied before the players were in some sense not perfectly
informed when making their choices. In a strategic game a player, when taking an action,
does not know the actions taken by his opponents. In an extensive form game with perfect
information a player does not know the actions of his opponents to be taken in the future.
In the case of an extensive form game with imperfect information we will suppose that the
players may be imperfectly informed about some of the actions that have already been chosen.
To represent this type of ignorance of previous actions, we introduce the concept of an
information set for a player.

DEFINITION (information set)
An information set for a player is a subset of his decision nodes satisfying the following
properties:
(1) the player (with the move) must have the same set of feasible actions at each decision
node in the subset
(2) when a play of the game reaches a node in the information set, the player (that
currently has the move) does not know which node in the information set has been
reached.

In an extensive form game we will indicate that a subset of decision nodes constitutes an
information set by connecting the nodes by a dotted line as in the following figure:

    [Figure: two decision nodes joined by a dotted line, each with the feasible actions
    a and b.]

When a play of the game reaches this information set the only thing the player with the
move knows is that he must choose between the actions a and b. Using this notation the
Prisoners' Dilemma can be represented as an extensive form game as given in the following
figure:

    [Figure: player 1 moves at node x choosing c or nc; player 2 then moves at the
    information set {y1, y2}, again choosing between c and nc, without observing
    player 1's choice; the four terminal payoff vectors are those of the Prisoners'
    Dilemma.]

The interpretation of player 2's information set {y1, y2} is that when taking an action all
player 2 knows is that player 1 has taken his action, but player 2 does not know which one.



EXAMPLE 1
We consider the game Γ1:

    [Figure: the game tree of Γ1. Player 1 moves at x1 choosing L or R; R leads to the
    terminal payoff (6, 0). After L, player 2 moves at y choosing l or r, and then player 1
    moves again at the information set {x2, x3} choosing L′ or R′. After l the payoffs are
    (8, 0) following L′ and (0, 8) following R′; after r they are (0, 8) following L′ and
    (8, 0) following R′.]

In this game player 1 chooses an action from the set {L, R}. If player 1 chooses R the game
stops. If player 1 chooses L, then player 2 chooses an action from the set {l, r}.
Then player 1 chooses, without knowing what player 2 has done, an action from the set
{L′, R′}.
At his second move player 1 does not know which node in the set {x2, x3} has been reached:
{x2, x3} is an information set. So player 1 has two information sets: {x1} and {x2, x3}.
Player 2 has only one information set: {y}.

EXERCISE 1
Represent the following situation as an extensive form game:
(1) player 1 chooses an action from the set {L, R}
(2) player 2 observes this action and then chooses an action from the set {l, r}
(3) player 3 observes whether or not the pair of actions (R, r) has been chosen and then
chooses an action from the set {α, β}.

6.2  Mixed and Behavioral Strategies

In the foregoing chapter a pure strategy of a player appeared to be a plan specifying the
action to be taken at every decision moment at which that player has to make a move. In an
extensive form game with perfect information these moments correspond to the decision
nodes of that player. Obviously, in a game with imperfect information the decision moments
where a player has to make a move correspond to his information sets. So we have to adapt
the definition of a pure strategy as given in the foregoing chapter a little bit:

DEFINITION (pure strategy)
A pure strategy of a player in an extensive form game is a plan that specifies an action
chosen by the player for each of his information sets.

In the game Γ1 the pure strategies of player 1 are

    (L, L′), (L, R′), (R, L′) and (R, R′),

where a pair (a, b) denotes the strategy where player 1 chooses the action a at the
information set {x1} and the action b at the information set {x2, x3}.
For player 2 the strategies are l and r; these are the actions to be taken at the only
information set {y} of player 2.
As in the case of an extensive form game with perfect information, once we have determined
the pure strategies for the players in an extensive form game with imperfect information,
this game can be represented as a game in strategic form. The game Γ1, for example, can be
represented as the 4 × 2 bimatrix game

                  l          r
    (R, L′)    (6, 0)     (6, 0)
    (R, R′)    (6, 0)     (6, 0)
    (L, L′)    (8, 0)     (0, 8)
    (L, R′)    (0, 8)     (8, 0)

There are two ways to model the possibility that the players use randomization in order to
enlarge their strategic possibilities: a player may (as in a strategic game) randomly select a
pure strategy, or he may plan a collection of randomizations, one for each of the points at
which he has to take an action. Formally,

DEFINITION (mixed strategy, behavioral strategy)
A mixed strategy of a player in an extensive form game is a probability distribution over
the set of pure strategies of that player.
By contrast, a behavioral strategy specifies, for each information set of that player, a
probability distribution over the actions available at that information set.

In the game Γ1 a mixed strategy of player 1 is a probability distribution over the four pure
strategies mentioned before. Such a strategy is in fact a mixed strategy of player 1 in the
strategic form of the game Γ1.


A behavioral strategy of player 1 is, by contrast, a pair of probability distributions, one
for each information set. Such a strategy can be represented in the game tree by attaching
probabilities to the edges, as in the following figure:

    [Figure: at x1 player 1 puts probability p on L and 1 − p on R; at x2 and x3 he puts
    probability r on L′ and 1 − r on R′.]

or as in the following diagram, where p, r ∈ [0, 1]:

    player 1:   {x1}: (p, 1 − p),   {x2, x3}: (r, 1 − r)

In order to determine the expected payoff corresponding to a behavioral strategy profile b
(i.e. a combination of behavioral strategies, one for each player), we first consider for each
terminal node the probability of reaching this node when b is played.

DEFINITION (outcome)
Let b be a behavioral strategy profile. The probability distribution over the terminal nodes
generated by the profile b is called the outcome of b.
This outcome is denoted by ℙb, and ℙb(z) denotes the probability of reaching the terminal
node z when b is played.


EXAMPLE 2
We determine, for the game Γ1, the outcome of the profile consisting of the behavioral
strategies

    player 1:   {x1}: (1/2, 1/2),   {x2, x3}: (1/3, 2/3)
    player 2:   {y}: (1/4, 3/4)

In doing so the game tree appears to be helpful:

    [Figure: the game tree of Γ1 with the probabilities 1/2 and 1/2 at x1, 1/4 and 3/4 at
    y, and 1/3 and 2/3 at x2 and x3 attached to the edges; multiplying the probabilities
    along each path gives the probabilities 1/24, 2/24, 3/24 and 6/24 for the four
    terminal nodes following L.]

The probability of reaching the left-most terminal node is, for example, equal to
(1/2) · (1/4) · (1/3) = 1/24.

Once we know how to determine the outcome of a behavioral strategy profile b, it is easy to
calculate the expected payoff to a player corresponding to this profile: the expected payoff
to player i is given by

    Ui(b) := Σz ℙb(z) · ui(z),

where the sum runs over all terminal nodes z.

EXAMPLE 3
For the behavioral strategy profile introduced in Example 2 the expected payoff to player 1 is

    U1(b) = (1/2)·6 + (1/24)·8 + (1/12)·0 + (1/8)·0 + (1/4)·8 = 5 1/3

and the expected payoff to player 2 is

    U2(b) = (1/2)·0 + (1/24)·0 + (1/12)·8 + (1/8)·8 + (1/4)·0 = 1 2/3.
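These computations are easy to verify with a short script. The sketch below encodes the five terminal nodes of Γ1 with their path probabilities under the profile of Example 2 (exact arithmetic via fractions) and reproduces U1(b) = 5 1/3 and U2(b) = 1 2/3.

```python
from fractions import Fraction as F

# Terminal nodes of Gamma_1 with their path probabilities under b and payoffs.
# b: player 1 plays L w.p. 1/2 and L' w.p. 1/3; player 2 plays l w.p. 1/4.
pL, pLp, pl = F(1, 2), F(1, 3), F(1, 4)
terminals = [
    (1 - pL,                     (6, 0)),   # R
    (pL * pl * pLp,              (8, 0)),   # L, l, L'
    (pL * pl * (1 - pLp),        (0, 8)),   # L, l, R'
    (pL * (1 - pl) * pLp,        (0, 8)),   # L, r, L'
    (pL * (1 - pl) * (1 - pLp),  (8, 0)),   # L, r, R'
]
assert sum(pr for pr, _ in terminals) == 1   # the outcome is a distribution
U1 = sum(pr * u[0] for pr, u in terminals)
U2 = sum(pr * u[1] for pr, u in terminals)
print(U1, U2)   # 16/3 = 5 1/3 and 5/3 = 1 2/3
```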

Next we will show how, for a behavioral strategy of a player, a mixed strategy of that
player can be constructed in such a way that both strategies lead to the same outcome no
matter what the opponents do. Such strategies are called outcome-equivalent. Formally,


DEFINITION (outcome-equivalence)
Two (mixed or behavioral) strategies of a player are called outcome-equivalent if for every
collection of pure strategies of the other player(s) the two strategies generate the same
outcome.

Let bi be a behavioral strategy of player i in an extensive form game. If we want to
construct an outcome-equivalent mixed strategy of player i, then it is sufficient to describe
which probabilities should be assigned to the pure strategies of player i.
Well, a pure strategy specifies a unique action for each information set of player i, and the
behavioral strategy bi assigns a probability to each of these (unique) actions. Now the
weight on this pure strategy must be the product of these probabilities.
We will illustrate this method by an example.
EXAMPLE 4
We consider the behavioral strategy b1 of player 1 in the game Γ1 as represented in the
following picture:

    [Figure: the game tree of Γ1 with terminal nodes z1, ..., z5; player 1 puts probability
    1/2 on L and 1/2 on R at x1, and probability 1/3 on L′ and 2/3 on R′ at {x2, x3};
    player 2 moves at y with actions l and r.]

The weight on the pure strategy (L, L′) is (1/2) · (1/3) = 1/6. The weight on the pure
strategy (L, R′) is (1/2) · (2/3) = 1/3. The weight on the pure strategy (R, L′) is
(1/2) · (1/3) = 1/6. The weight on the pure strategy (R, R′) is (1/2) · (2/3) = 1/3. So a
mixed strategy of player 1 that is outcome-equivalent with his behavioral strategy b1 is

    σ1 = (1/6)(L, L′) + (1/3)(L, R′) + (1/6)(R, L′) + (1/3)(R, R′).


Now suppose that player 2 chooses the pure strategy l. Then the behavioral strategy profile
(b1, l) generates the following outcome:

    z1     z2     z3    z4    z5
    1/6    1/3    0     0     1/2

The pure strategy profile ((L, L′), l) leads to the terminal node z1;



the pure strategy profile ((L, R′), l) leads to the terminal node z2;
the pure strategy profiles ((R, L′), l) and ((R, R′), l) lead to the terminal node z5.
Hence the mixed strategy profile (σ1, l) generates the same outcome.
In a similar way one can show that the behavioral strategy profile (b1, r) and the mixed
strategy profile (σ1, r) generate the same outcome.
So we may conclude that b1 and σ1 are outcome-equivalent.

EXERCISE 2
Consider the one-player game tree (without payoffs):

    [Figure: a one-player game tree with terminal nodes z1, z2, z3. At x1 the player puts
    probability 1/2 on each of his two actions, one of which leads to z3 and the other to
    the second decision node x2; at x2 he puts probabilities 1/4 and 3/4 on the actions
    leading to z1 and z2.]

Determine all mixed strategies that are outcome equivalent with the behavioral strategy
represented in the picture.
For extensive form games with perfect recall (these are games in which at every decision
node the player who moves remembers whatever he knew in the past) it is possible to
construct, for a given mixed strategy of a player, an outcome-equivalent behavioral strategy
of that player. We will give a sketch of the construction procedure.
Let σi be a mixed strategy of player i. In order to define a behavioral strategy bi for
player i that is outcome equivalent with σi, let I be an information set for player i and let
a be an action that can be chosen at I.
First of all we determine the class P(I) of all pure strategies of player i such that a path
through I exists containing the edges that correspond with the pure strategy.
Next we determine within this class those pure strategies of player i such that the path
through I also contains the edge corresponding with the action a.
Then let the probability bi(a) of choosing action a at the information set I be equal to the
total weight that σi puts on the pure strategies in the subclass divided by the total weight
that σi puts on the pure strategies in the class P(I).
We will illustrate this method by an example.
EXAMPLE 5
We consider the game Γ2:

    [Figure: the game tree of Γ2. Player 1 moves at x1 choosing L or R, and later moves
    again at his information set {x2, x3}, choosing L′ or R′; the information set
    {x2, x3} can only be reached if player 1 chooses L at x1.]

We will construct a behavioral strategy b1 of player 1 that is equivalent to the mixed
strategy

    σ1 = (2/5)(R, L′) + (3/10)(R, R′) + (1/5)(L, L′) + (1/10)(L, R′).

Note that P({x1}) consists of all pure strategies, while the subclass that is relevant for L
contains the pure strategies (L, L′) and (L, R′). So

    b1(L) = σ1(L, L′) + σ1(L, R′) = 1/5 + 1/10 = 3/10   and   b1(R) = 7/10.

Note that P({x2, x3}) consists of the pure strategies (L, L′) and (L, R′), while the subclass
that is relevant for L′ only contains the pure strategy (L, L′). So

    b1(L′) = σ1(L, L′) / (σ1(L, L′) + σ1(L, R′)) = (1/5) / (1/5 + 1/10) = 2/3

and b1(R′) = 1/3.
One can easily show that the behavioral strategy b1 is equivalent to the mixed strategy σ1.
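The conditional-probability construction of this example can be written out as follows; the sketch reproduces b1(L) = 3/10 and b1(L′) = 2/3.

```python
from fractions import Fraction as F

# Mixed strategy of Example 5 for Gamma_2.
sigma1 = {("R", "L'"): F(2, 5), ("R", "R'"): F(3, 10),
          ("L", "L'"): F(1, 5), ("L", "R'"): F(1, 10)}

# At {x1} every pure strategy has a path through the information set:
b1_L = sum(w for (a, _), w in sigma1.items() if a == "L")          # 3/10
# At {x2, x3} only strategies choosing L at x1 reach the set; condition on them:
reach = {s: w for s, w in sigma1.items() if s[0] == "L"}
b1_Lp = sum(w for (_, ap), w in reach.items() if ap == "L'") / sum(reach.values())
print(b1_L, b1_Lp)   # 3/10 and 2/3
```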

In the following exercise we present a game with imperfect recall. There is only one player.
This player has to make a move twice. When he has to choose his second move he has
forgotten what his first move was.
EXERCISE 3
Consider the game

    [Figure: a one-player game tree with imperfect recall. The player first chooses L or R
    at x1; he then moves again at the information set {x2, x3}, choosing L′ or R′,
    without remembering his first choice. The terminal nodes are z1, z2, z3, z4.]

Consider the mixed strategy

    σ = (1/2)(L, L′) + (1/2)(R, R′).

Prove that the outcome (1/2, 0, 0, 1/2) generated by this mixed strategy cannot be
achieved by any behavioral strategy.
EXERCISE 4
Determine for the game Γ1 a behavioral strategy that is outcome equivalent with the mixed
strategy (1/3)(L, L′) + (2/3)(R, L′) of player 1.

6.3  Equilibria

It is quite obvious how to define equilibria in mixed and behavioral strategies.

DEFINITION (Nash equilibrium in mixed strategies)
A mixed strategy profile of a game Γ in extensive form is called a Nash equilibrium of Γ if
it is a Nash equilibrium of the strategic form G of Γ.

EXAMPLE 6
The strategy pair (e1, (3/4, 1/4)) is a mixed strategy equilibrium of (the strategic form
of) the extensive form game Γ1.

DEFINITION (Nash equilibrium in behavioral strategies)
A behavioral strategy profile b is an equilibrium for an extensive form game if, for all i,
bi is a best response to b−i.

EXAMPLE 7
We consider for the game Γ1 the behavioral strategy profile b given by

    player 1:   {x1}: (0, 1),   {x2, x3}: (1, 0)
    player 2:   {y}: (3/4, 1/4)

In doing so the game tree appears to be helpful:
    [Figure: the game tree of Γ1 with the probabilities of b attached to the edges:
    player 1 chooses R with probability 1 at x1 and L′ with probability 1 at {x2, x3};
    player 2 chooses l with probability 3/4 and r with probability 1/4 at y.]
If player 1 uses the behavioral strategy b1, then the payoff to player 2 is equal to zero for
any behavioral strategy b2.
Hence, b2 is a best response against b1.
If player 2 uses the behavioral strategy b2 and player 1 uses the behavioral strategy given
by b1(L) = p and b1(L′) = r, for some p, r ∈ [0, 1], then the payoff to player 1 is equal to

    (3/4)·p·r·8 + (1/4)·p·(1 − r)·8 + (1 − p)·6
        = 6pr + 2p(1 − r) + 6(1 − p)
        = p[6r + 2 − 2r − 6] + 6
        = 6 − 4p(1 − r).

This payoff is maximal if p = 0 or r = 1. Hence b1 is a best response against b2.
So b is an equilibrium for the game Γ1.

The behavioral strategy profile discussed in Example 7 is the profile generated by the
equilibrium for the strategic form of Γ1 discussed in Example 6.
One can show that any behavioral profile generated by an equilibrium of the strategic form
of a game Γ in extensive form is an equilibrium of Γ. Hence

THEOREM 1
A finite game in extensive form has an equilibrium in behavioral strategies.

Since the foregoing makes clear that restriction to behavioral strategies doesn't harm the
strategic possibilities of a player, we will from now on restrict ourselves to behavioral
strategies.
In order to rule out unreasonable Nash equilibria for extensive form games with perfect
information we introduced, in the foregoing chapter, subgame perfect equilibria. In order
to give a definition of subgame perfectness for extensive form games with imperfect
information, we only need to add the following condition to the definition of a subgame in
Section 3 of the foregoing chapter:
(4) if I is an information set of the game Γ and a decision node y ∈ I is also a decision
node of the subgame, then all the decision nodes contained in I are decision nodes of
the subgame.
EXAMPLE 8
The game Γ1 has only one proper subgame: the game Γy:

    [Figure: the subgame Γy. Player 2 chooses l or r at y; then player 1 chooses L′ or R′
    at the information set {x2, x3}; the payoffs are (8, 0) and (0, 8) after l, and (0, 8)
    and (8, 0) after r, following L′ and R′ respectively.]

The strategic form of this game is given by

            l          r
    L′   (8, 0)     (0, 8)
    R′   (0, 8)     (8, 0)

The only Nash equilibrium of this game is ((1/2)L′ + (1/2)R′, (1/2)l + (1/2)r).
So if b is a subgame perfect equilibrium (in behavioral strategies) of the game Γ1, then
b1(L′) = 1/2 and b2(l) = 1/2.
Now suppose that b1(L) = p. Then the expected payoff to player 1 corresponding to the
profile b is equal to

    (1/4)·p·8 + (1/4)·p·8 + (1 − p)·6 = 6 − 2p.

This payoff is maximal if p = 0. So b1(L) = 0, and the game Γ1 has only one subgame
perfect equilibrium.
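A quick check of this computation: the sketch below encodes the strategic form of the subgame Γy, uses the indifference conditions to obtain the mixed equilibrium (1/2, 1/2), and evaluates player 1's continuation payoff after L.

```python
from fractions import Fraction as F

# The subgame Gamma_y: rows L', R' for player 1, columns l, r for player 2.
A = [[(8, 0), (0, 8)],
     [(0, 8), (8, 0)]]

# In the mixed equilibrium each player is indifferent: with row probability
# rho on L', player 2 gets rho*0 + (1-rho)*8 from l and rho*8 + (1-rho)*0
# from r, so rho = 1/2; by the symmetric argument player 2 plays l w.p. 1/2.
rho = q = F(1, 2)
u1_after_L = sum(A[i][j][0]
                 * (rho if i == 0 else 1 - rho)
                 * (q if j == 0 else 1 - q)
                 for i in range(2) for j in range(2))
print(u1_after_L)   # 4: player 1's continuation payoff after choosing L
# Since choosing R yields 6 > 4, subgame perfection forces b1(L) = 0.
```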

EXAMPLE 9  Bank Runs
Two investors have each deposited 1 unit with a bank. The bank has invested these deposits
in a long-term project. If the bank is forced to liquidate its investment before the project
matures, a total of 1.5 units can be recovered. If the bank allows the investment to reach
maturity, however, the project will pay out a total of 2.5 units.
There are two dates at which the investors can make withdrawals from the bank: date 1 is
before the bank's investment matures; date 2 is after. For simplicity, assume there is no
discounting. The payoffs for the investors can be found in the following game tree, in which
the first half corresponds with the decisions at date 1 and the second part is related to
date 2.

    [Figure: the bank run game tree. At date 1 investor 1 moves at x choosing W
    (withdraw) or D (don't withdraw); investor 2 then moves at the information set
    {y1, y2} choosing w or d. If both withdraw at date 1 the payoffs are (0.75, 0.75);
    if only one withdraws they are (1, 0.5) or (0.5, 1). If neither withdraws, the date 2
    part of the tree is reached, where the same structure repeats with actions W′, D′, w′
    and d′; there the payoffs are (1.25, 1.25) if both withdraw, (1.5, 1) or (1, 1.5) if
    only one withdraws, and (1.25, 1.25) if neither withdraws.]

Note that W, W′, w and w′ represent the decision to withdraw, while D, D′, d and d′ mean
don't withdraw.
In order to determine the subgame perfect equilibria of this game, we consider the subgame
starting at date 2.
Since the action w′ strictly dominates the action d′, the pair (W′, w′) is the only Nash
equilibrium of this subgame.
It is easy to check that ((W, W′), (w, w′)) and ((D, W′), (d, w′)) are the subgame perfect
equilibria of the game. The first equilibrium leads to a return of −25 percent for both
investors, while the second equilibrium leads to a return of 25 percent.
The first of these equilibria can be interpreted as a run on the bank. If investor 1 believes
that investor 2 will withdraw at date 1, then the best response of investor 1 is to withdraw
as well, even though both investors would be better off if they waited until date 2 to
withdraw.

6.4  Perfect Bayesian Nash Equilibria

In case of a subgame perfect equilibrium of an extensive form game with perfect
information, every player's strategy should be optimal (given the other players' strategies)
at any of his decision nodes, whether or not this decision node is reached if the players
follow their strategies. Application of this idea in case of an extensive form game with
imperfect information leads to the requirement that

    every player's strategy should be optimal at each of his information sets.

This can help us to rule out unreasonable subgame perfect equilibria, as the following
example shows.
EXAMPLE 10
For the game Γ3:

    [Figure: the game tree of Γ3. Player 1 moves at x choosing L, M or R; R ends the
    game with payoff (1, 3). After L or M player 2 moves at the information set {y1, y2}
    choosing l or r: at y1 (following L) the payoffs are (2, 1) after l and (0, 0) after
    r; at y2 (following M) they are (0, 2) after l and (0, 1) after r.]

the profiles (L, l) and (R, r) are both an equilibrium. Since this game has no proper
subgames, these equilibria are even subgame perfect. The equilibrium (R, r) however is
quite unreasonable, since it prescribes at the information set of player 2 the strategy r.
This strategy is strictly dominated by the strategy l: for player 2 it is always better to
choose the strategy l.

Thus for the game in the foregoing example the requirement of local optimality formulated
at the beginning of this section works: the unreasonable equilibrium (R, r) is ruled out,
because the strategy r is not a rational option for player 2 at his information set.
Unfortunately, as the following example shows, things are less straightforward than we
could hope for.

EXAMPLE 11
For the game Γ4:

    [Figure: the game tree of Γ4. Player 1 moves at x choosing L, M or R; R ends the
    game with payoff (2, 2). After L or M player 2 moves at the information set {y1, y2}
    choosing l or r: at y1 (following L) the payoffs are (3, 1) after l and (0, 2) after
    r; at y2 (following M) they are (0, 2) after l and (1, 1) after r.]

the profile (R, r) is an equilibrium in which player 2's information set {y1, y2} is not
reached. But when this information set is reached, the strategy r is optimal only if
player 2 believes that the probability that node y1 has been reached is at least equal to the
probability that node y2 has been reached. For suppose that player 2 believes that node y1
has been reached with probability p and node y2 with probability 1 − p. Then, given this
belief, the expected payoff from playing l is

    p·1 + (1 − p)·2 = 2 − p,

while the expected payoff from playing r is

    p·2 + (1 − p)·1 = 1 + p.

Now 2 − p ≤ 1 + p if and only if p ≥ 1/2. So his optimal action depends on his belief.
This belief however cannot be derived from the equilibrium strategy R, because this
strategy assigns probability zero to his information set being reached.

In the foregoing example rational/optimal behavior at a non-singleton information set was
determined by assigning a probability to all the nodes in that information set. Therefore
we are going to assume, more generally, that each player of an extensive form game, given
a behavioral strategy profile b, assigns to each of his information sets a probability
distribution over the nodes contained in that information set. We call such a system of
probability distributions a belief system. Such a belief system is represented by the
probabilities (between square brackets) attached to the relevant nodes in the tree. Of
course these probabilities are written down only for the nodes contained in a non-singleton
information set.
Given a belief system, say μ, and a behavioral strategy profile b, we can determine for a
player his expected payoff at each of his information sets. In doing so we suppose that,
given such an information set, say I, the game starts at I. Before we make this a little
more precise, we consider an example.
EXAMPLE 12
Let the belief system μ with μ(y1) = α and μ(y2) = 1 − α be given:

    [Figure: the game tree of Γ3 with the beliefs [α] and [1 − α] attached to the nodes
    y1 and y2, and player 2 playing l with probability q and r with probability 1 − q at
    his information set.]

Then at the information set of player 2, the expected payoff to player 2 corresponding to
his behavioral strategy (q, 1 − q) and given his beliefs is equal to

    α·[q·1 + (1 − q)·0] + (1 − α)·[q·2 + (1 − q)·1] = q + 1 − α.
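This belief-weighted expected payoff is straightforward to implement. The sketch below computes player 2's expected payoff at {y1, y2} for arbitrary α and q (using player 2's payoffs at y1 and y2 as read off from Γ3) and confirms that it equals q + 1 − α.

```python
from fractions import Fraction as F

# Player 2's payoffs in Gamma_3 at the information set {y1, y2}:
# at y1 action l yields 1 and r yields 0; at y2, l yields 2 and r yields 1.
payoff = {"y1": {"l": 1, "r": 0}, "y2": {"l": 2, "r": 1}}

def expected_payoff(alpha, q):
    # belief (alpha, 1-alpha) over (y1, y2); strategy (q, 1-q) over (l, r)
    mu = {"y1": alpha, "y2": 1 - alpha}
    return sum(mu[x] * (q * payoff[x]["l"] + (1 - q) * payoff[x]["r"])
               for x in mu)

alpha, q = F(1, 3), F(1, 4)
print(expected_payoff(alpha, q), q + 1 - alpha)   # both equal 11/12
```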

In order to define, more generally, the expected payoff to player i at one of his information
sets, say I, we first introduce the probability ℙb^x(z) of reaching the terminal node z when
b is played and the game starts at x ∈ I. Then

    Σ_{z∈Z} ℙb^x(z)·ui(z)

is the expected payoff to player i if the game began in node x. By multiplying this payoff
with the probability μ(x) (that the game began at x) and summing up over all the nodes x
in I, we obtain the expected payoff to player i if the game starts at I.
Now we are able to explain when a behavioral strategy profile b prescribes rational behavior
of player i at his information set, say I: player i is rational at I if his action determined
according to b maximizes his expected payoff, given his belief with respect to the nodes of
I and given the actions after I determined according to b. This leads to the following
definition.

DEFINITION (sequentially rational)
A behavioral strategy profile b is called sequentially rational with respect to a belief
system μ if at each information set the action taken by the corresponding player according
to b maximizes his expected payoff, given the player's belief at that information set and
the players' subsequent strategies as they are given by b.

Until now we did not describe how to obtain a reasonable belief system. If an information
set I is reached given a behavioral strategy profile b, this is not a problem. In that case
we will suppose that the belief μ(y) that a node y ∈ I has been reached is just the
conditional probability

    μ(y) = P(reaching y given b | reaching I given b)
         = P(reaching y given b) / P(reaching I given b).

We consider the following example.


EXAMPLE 13
For the game Γ3 we consider the behavioral strategy b1 of player 1, where L is chosen with
probability 1/2, M is chosen with probability 1/3 and R is chosen with probability 1/6.
Since L leads to node y1 and M leads to node y2, player 2's belief about the nodes y1 and
y2 is

    μ(y1) = P(reaching {y1} given b1) / P(reaching {y1, y2} given b1)
          = (1/2) / (1/2 + 1/3) = 3/5,

and μ(y2) = 2/5.
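Bayes' rule computations like this one are one-liners; the following sketch reproduces the beliefs of this example with exact arithmetic.

```python
from fractions import Fraction as F

# Reaching probabilities under b1 of Example 13: L w.p. 1/2 leads to y1,
# M w.p. 1/3 leads to y2 (R w.p. 1/6 misses the information set).
p_y1, p_y2 = F(1, 2), F(1, 3)

mu_y1 = p_y1 / (p_y1 + p_y2)   # conditional probability of y1 given {y1, y2}
mu_y2 = p_y2 / (p_y1 + p_y2)
print(mu_y1, mu_y2)            # 3/5 and 2/5
```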

For the case in which some information set, given the behavioral strategy profile b, will not
be reached, we could try to apply the same method for all the subgames of the given
extensive form game.

DEFINITION (consistent belief system)
Let b be a behavioral profile. We call a belief system μ consistent with b if the beliefs are
determined by using conditional probabilities. If necessary this will be done with respect
to a subgame of the given game.

We will illustrate this concept in two examples.


EXAMPLE 14
For the game Γ5:

    [Figure: the game tree of Γ5. Player 1 chooses between S (ending the game) and G;
    after G player 2 chooses L or R; player 3 then moves at his two-node information
    set, with beliefs [α] at the node following L and [1 − α] at the node following R,
    choosing l or r. The payoff vectors are given in the figure.]

the strategy profile (S, L, l) is an equilibrium. Since for this equilibrium the information
set of player 3 will not be reached, we consider the subgame starting at the decision node
of player 2. Then we find that α = 1 is the only belief that is consistent with the given
profile.

EXAMPLE 15
We have modified the game Γ5 from the foregoing example by adding a third possible
action S′ for player 2. We have ignored the payoffs in this new game.

    [Figure: the modified game tree. Player 1 chooses S or G; player 2 chooses L, R or
    the new action S′, after which player 3's information set is not reached; player 3
    moves at his information set with beliefs [α] and [1 − α], choosing l or r.]

Now suppose that this game has an equilibrium of the form (S, S′, a), where a ∈ {l, r}.
Then again the information set of player 3 will not be reached. Therefore we consider the
subgame starting at the decision node of player 2. But also in this subgame and for the
restricted profile (S′, a) the information set of player 3 will not be reached. So in this case
consistency puts no restrictions on the beliefs of player 3, and any belief system is weakly
consistent with this profile.

Now we come to the definition of a perfect Bayesian Nash equilibrium.

DEFINITION (perfect Bayesian equilibrium)
A combination ⟨b, μ⟩ of a behavioral strategy profile b and a belief system μ is called a
perfect Bayesian Nash equilibrium if
(a) b is sequentially rational with respect to μ
(b) μ is consistent with b.

EXAMPLE 16
The game Γ5 in Example 14 has one proper subgame: it begins at player 2's singleton
information set. The unique equilibrium in this subgame between players 2 and 3 is (L, r).
So the unique subgame perfect equilibrium of the entire game is (G, L, r).
The combination ⟨(G, L, r), α = 1⟩ is a perfect Bayesian equilibrium.
Now we consider the strategy profile (S, L, l). Note that α = 1 is the only belief that is
consistent with this profile. However, the action l is not optimal given this belief. So the
combination ⟨(S, L, l), α = 1⟩ is not perfect Bayesian.

EXERCISE 5
Consider the extensive form game Γ:

    [Figure: player 1 moves at x choosing among three actions; player 2 moves at the
    information set {y1, y2}; the payoff vectors are given in the figure.]

a) Derive the strategic form of this game and find all the pure Nash equilibria.
b) Find the subgame perfect equilibria of Γ.
c) Find all perfect Bayesian equilibria of Γ.
EXERCISE 6
Consider the game Γ:

    [Figure: player 1 moves at x; player 2 moves at the information set {y1, y2}; the
    payoff vectors are given in the figure.]

a) Derive the strategic form of this game and find all the pure Nash equilibria.
b) Find the subgame perfect equilibria of Γ.
c) Find all perfect Bayesian equilibria of Γ.
EXERCISE 7
Determine all perfect Bayesian equilibria of the game

    [Figure: player 1 moves at x choosing among three actions (among them M and R);
    player 2 moves at the information set {y1, y2}; the payoff vectors are given in the
    figure.]

EXERCISE 8
Consider the game Γ:

    [Figure: player 1 moves first, choosing with probabilities p and 1 − p; player 2 then
    moves, choosing with probabilities q and 1 − q; player 1 moves again at a two-node
    information set with beliefs [α] and [1 − α], choosing with probabilities r and
    1 − r. The payoff vectors are given in the figure.]

a) Determine the strategic form of this game.
b) Find all mixed Nash equilibria of this strategic form game.
c) Find the subgame perfect equilibria of the game Γ.
d) Find all perfect Bayesian equilibria of the game Γ.

EXAMPLE 17
Consider the game Γ6 with three players.


    [Figure: the game tree of Γ6. Player 1 chooses a or b; player 2 then moves at a
    two-node information set with beliefs [x, 1 − x], choosing c or d; after c player 3
    moves at a two-node information set {z1, z2} with beliefs [y, 1 − y], choosing e or
    f. The payoff vectors are given in the figure.]

Our aim is to find all perfect Bayesian equilibria in this game. The question is: what is the
most practical way to do so? Unfortunately, there is no general answer to this question,
since it depends heavily on the particular game that is being considered. It turns out that
in the game Γ6 an efficient method to compute all perfect Bayesian equilibria is to
distinguish cases according to the possible behavioral strategies that player 2 can choose.
Let the behavioral strategies for the three players be given by

    b1 = α·a + (1 − α)·b,   b2 = β·c + (1 − β)·d   and   b3 = γ·e + (1 − γ)·f

respectively. Moreover, let

    [x, 1 − x]   and   [y, 1 − y]

be player 2's and player 3's respective beliefs. The question is thus: for which values of α,
β, γ, x and y is this a perfect Bayesian equilibrium? We distinguish several cases according
to the values β can possibly take.
values can possibly take.
Case 1. Suppose that β = 0. Then b2 = d. Since player 1 must choose optimally, it follows
that b1 = a. In a perfect Bayesian equilibrium, player 2's beliefs must be consistent with
player 1's behavioral strategy, and hence x = 1.
Player 2, by choosing c, gets 3(1 − γ), whereas by choosing d he gets 1. Since b2 = d must
be optimal for player 2, it must hold that 3(1 − γ) ≤ 1 and hence γ ≥ 2/3. We distinguish
two subcases.
Case 1.1. Suppose γ = 1. Then b3 = e. Player 3, by choosing e, gets an expected payoff of
3y, whereas by choosing f he gets an expected payoff of 1 − y. Since b3 = e must be
optimal for player 3, we must have that 3y ≥ 1 − y. Hence y ≥ 1/4. We thus may conclude
that the behavioral strategy profile (a, d, e) together with beliefs x = 1 and y ∈ [1/4, 1]
constitutes a family of perfect Bayesian equilibria. (Note that consistency of beliefs puts no
restriction on player 3's beliefs, since player 3's information set is reached with probability
zero under the behavioral strategy profile (a, d, e).)
Case 1.2. Suppose γ ∈ [2/3, 1). Since 0 < γ < 1, player 3 chooses both e and f with
positive probability. Therefore, both e and f must be optimal for player 3. We have seen
above that playing e yields 3y whereas playing f yields 1 − y. We thus must have that
3y = 1 − y, which means that y = 1/4. Hence, all behavioral strategy profiles
(a, d, γe + (1 − γ)f) with γ ∈ [2/3, 1), together with beliefs x = 1 and y = 1/4, are
perfect Bayesian.
Case 2. Suppose that β ∈ (0, 1). Then player 2 chooses both c and d with positive
probability, which implies that c and d must yield the same expected payoff for player 2.
Player 2, by choosing c, gets 3(1 − γ), whereas by choosing d he gets 1. It thus must be
the case that 3(1 − γ) = 1, which implies that γ = 2/3.
This, in turn, means that player 3 chooses both e and f with positive probability. Hence,
both e and f should give the same expected payoff to player 3. We have seen that player 3,
by choosing e, gets 3y, while by choosing f he gets 1 − y. Consequently, 3y = 1 − y,
implying that y = 1/4.
Let the nodes in player 3's information set be called z1 and z2, and let player 3's
information set be called I. Then

    P(reaching z1 given b1, b2) = αβ,
    P(reaching z2 given b1, b2) = (1 − α)β,
    P(reaching I given b1, b2) = β.

We assume that β > 0, and therefore P(reaching I given b1, b2) > 0. Since player 3's
beliefs [y, 1 − y] must be consistent with b1 and b2, it must hold that

    y = P(reaching z1 given b1, b2) / P(reaching I given b1, b2) = αβ/β = α.

Since we have seen that y = 1/4, it follows that α = 1/4.
In particular, player 1 chooses both a and b with positive probability. This implies that
both a and b must be optimal for player 1. Player 1, by choosing a, gets β·1 + (1 − β)·2,
whereas by choosing b he gets β·2 + (1 − β)·1. We thus must have that
β·1 + (1 − β)·2 = β·2 + (1 − β)·1, which means that β = 1/2.
Since α = 1/4, and player 2's beliefs [x, 1 − x] must be consistent with b1, it follows that
x = 1/4.
Our conclusion is that the behavioral strategy profile

    ((1/4)·a + (3/4)·b, (1/2)·c + (1/2)·d, (2/3)·e + (1/3)·f)

together with beliefs x = 1/4 and y = 1/4 is perfect Bayesian.
Case 3. Suppose that β = 1. Then b2 = c. Player 1, by choosing a, gets 1, whereas by
choosing b he gets 2. So player 1 will choose b1 = b. Since player 2's beliefs must be
consistent with b1, it follows that x = 0. As above, let I be player 3's information set, and
let z1 and z2 be the nodes in this information set. Then

    P(reaching z1 given b1, b2) = 0,
    P(reaching z2 given b1, b2) = 1,
    P(reaching I given b1, b2) = 1.

Since player 3's beliefs [y, 1 − y] must be consistent with b1 and b2, it must be that

    y = P(reaching z1 given b1, b2) / P(reaching I given b1, b2) = 0.

But given these beliefs, it is optimal for player 3 to choose b3 = f. So the behavioral
strategy profile (b, c, f) together with the beliefs x = 0, y = 0 is a perfect Bayesian
equilibrium. Since we have covered all possible cases, we have found all perfect Bayesian
equilibria in this game.
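The indifference conditions of Case 2 can be solved mechanically. The sketch below uses only the payoff expressions derived in the text (3(1 − γ) versus 1 for player 2, 3y versus 1 − y for player 3, and β·1 + (1 − β)·2 versus β·2 + (1 − β)·1 for player 1) together with the consistency requirement y = α.

```python
from fractions import Fraction as F

# Case 2 of Example 17: interior mixing forces three indifferences.
# (The payoff expressions below are the ones derived in the text.)

# Player 2 mixes: 3*(1 - gamma) = 1
gamma = 1 - F(1, 3)                  # gamma = 2/3
# Player 3 mixes: 3*y = 1 - y
y = F(1, 4)
# Consistency of player 3's beliefs gives y = alpha, hence
alpha = y                            # alpha = 1/4, and x = alpha = 1/4
# Player 1 mixes: beta*1 + (1-beta)*2 = beta*2 + (1-beta)*1
beta = F(1, 2)

print(alpha, beta, gamma)            # 1/4, 1/2, 2/3
```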
EXERCISE 9
Find all perfect Bayesian equilibria in the game

    [Figure: player 1 moves first (one of his actions is a); player 2 moves at a two-node
    information set; the payoff vectors, among them (2, 2), are given in the figure.]

(Hint: distinguish cases according to the possible beliefs that player 2 can have at his
information set.)


Chapter 7

Signaling Games

In this chapter we deal with a type of two-player extensive form games especially designed
to model situations in which only one player is informed and has the option to (partially)
reveal this information to the second player. The question in these situations is under
which circumstances the first player will use this option.

7.1  The Model

The information that is privately owned by player 1 is modelled in the game tree below,
which for clarity is now drawn horizontally instead of vertically, as a move of NATURE,
the outcome of which is known to player 1, but not to player 2.

    [Figure: the generic signaling game. NATURE moves first, choosing up (U) with
    probability π and down (D) with probability 1 − π. Player 1 observes this move and
    chooses UL or UR after up, and DL or DR after down. Player 2 only observes whether
    player 1 went left or right: he chooses lu or ld at the information set after a left
    move, and ru or rd at the information set after a right move. The eight terminal
    payoff vectors, among them (0, 1), (2, 0), (1, 1) and (3, 0), are given in the figure.]

As can be seen, NATURE moves up (U) with (fixed!) probability π and down (D) with
(fixed) probability 1 − π. (These probabilities reflect the statistical information player 2
has about the occurrence of the various types of player 1. A convenient way of thinking
about the move of NATURE is to imagine that NATURE is a third player in this game with
a fixed strategy (π, 1 − π) and payoff zero in any outcome.) These fixed probabilities with
which NATURE moves up or down are known to both players. (Player 1 may have more
than two types in a signaling game. The analysis of such signaling games is not
significantly different from the ones we will look at in this chapter, though. Therefore, for
the sake of clarity and brevity, we will only consider signaling games in which player 1 has
only two types.)

As soon as NATURE has made its move, player 1 is informed about the outcome of this
move and, depending on this outcome, he can choose between UL and UR in case the move
of NATURE was up, and between DL and DR in case the move of NATURE was down.

As soon as player 1 has made his move, it is player 2s turn. Player 2 is first informed about
the move of player 1, but only to the extent that he is told whether player 1 chose left or
right. So, player 2 does not know whether he is in the upper half of the figure or in the lower
half. Based on this information, player 2 has a choice between going up or down. Thus, he
has the choice between playing lu and ld in case player 1 played left, and ru and rd in case
player 1 played right.

Once each player has made a choice, play arrives at one of the eight end nodes of the game,
and each player receives the corresponding payoff. Then the game ends.

BEHAVIORAL STRATEGIES
Both players are of course also allowed to use mixed strategies. Player 1, e.g., is allowed
to play UL with probability 1/5 and UR with probability 4/5.

    [Figure: the signaling game tree with the behavioral strategy weights attached to the
    edges: p and 1 − p for player 1 after up, q and 1 − q after down; r and 1 − r for
    player 2 after left, s and 1 − s after right.]

Thus, a generic behavioral strategy of player 1 is of the form

    ((p, 1 − p), (q, 1 − q)).

In this notation, p is the weight that is put on UL and q the weight put on DL. Similarly,
a behavioral strategy of player 2 is of the form

    ((r, 1 − r), (s, 1 − s)),

where r is the weight put on lu and s the weight put on ru. Now it is easy to see that the
probability to get (D, DR, ru) as a realization (and (0, 1) as payoffs) is equal to
(1 − π)(1 − q)·s.
Remark  These games are called signaling games because player 1, being either of type U
or of type D, has the option to signal his type to player 2 by playing a different strategy in
the lower half of the diagram than he does in the upper half (UL combined with DR, or UR
combined with DL). This gives player 2 the opportunity to react differently to different
outcomes of the move of NATURE. This signaling is also called separating behavior of
player 1 (he separates his types for player 2). If player 1 makes the same choice in both
halves of the diagram he is said to pool his types. In other words, he does not reveal his
type to player 2.

7.2  The Beer and Quiche Game

One of the better known examples of signaling games is the beer and quiche game. It was
thought up by Cho and Kreps (1987) as a metaphor for entry deterrence.
In this game, player 1 can be of two types. Depending on the move of NATURE, he is
either surly, with probability 9/10, or a wimp, with probability 1/10.

    [Figure: the beer and quiche game tree. NATURE makes player 1 a wimp with
    probability 1/10 or surly with probability 9/10. The wimp chooses WB (beer) or WQ
    (quiche); the surly type chooses SB or SQ. Player 2 observes only the breakfast and
    chooses bf (fight) or ba (acquiesce) after beer, and qf or qa after quiche. The payoffs
    are as described below.]

Secondly, depending on his type, player 1 can choose between beer and quiche for
breakfast. ("Depending on his type" means that he is allowed to make different choices in
the upper and lower half of the figure.) Player 2 only observes what player 1 has for
breakfast, and subsequently has to decide whether he will fight player 1 or whether he will
acquiesce.
The payoffs are determined as follows. Player 1 gets 0 units if player 2 decides to fight him
and 2 units if player 2 chooses to acquiesce. On top of that, he also gets 1 unit for drinking
beer in case he is surly (surly types like beer) and 1 unit for eating quiche in case he is a
wimp (wimps like quiche). So, basically player 1 wants to stay out of trouble and to eat
whatever he feels like eating.
The payoffs for player 2, though, are geared towards competition. Player 2 gets nothing if
he acquiesces (compare this with entry deterrence). If he fights, he gets 1 unit in case
player 1 turns out to be a wimp, but he loses 1 unit if player 1 turns out to be surly. So,
basically player 2 wants to know player 1's type and subsequently make the right choice
based on this information.
PURE NASH EQUILIBRIA
As long as we are only interested in finding the pure Nash equilibria of the beer and quiche
game, we can proceed as follows. First compute the normal form of this game and then
compute the Nash equilibria of the normal form. The normal form of this game turns out
to be

                 (bf, qf)      (bf, qa)      (ba, qf)      (ba, qa)
    (WB, SB)    ( 9, −8)      ( 9, −8)      (29*, 0*)     (29, 0*)
    (WB, SQ)    ( 0, −8)      (18, 1*)      ( 2, −9)      (20, 0*)
    (WQ, SB)    (10*, −8)     (12, −9)      (28, 1*)      (30*, 0)
    (WQ, SQ)    ( 1, −8)      (21*, 0*)     ( 1, −8)      (21, 0*)

(All payoffs are multiplied by 10 for convenience.) The asterisks indicate pure best
responses against the other player's strategy. This way we can easily check that the only
Nash equilibria in pure strategies are

    ((WB, SB), (ba, qf))   and   ((WQ, SQ), (bf, qa)),

with expected payoffs 29/10 and 21/10, respectively, for player 1 and payoff 0 for player 2
in both equilibria.

Remark  Notice that in both equilibria player 1 pools his types. For this reason such
equilibria are called pooling equilibria. In this game there are apparently no (pure)
equilibria in which player 1 uses the option to separate his types for player 2 by signaling.
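The normal form above can be generated, and the pure Nash equilibria found, directly from the primitives of the game. The sketch below is one way to do this (the helper names u1, u2 and payoffs are ours, not the text's); payoffs are kept as exact fractions and not multiplied by 10.

```python
from fractions import Fraction as F
from itertools import product

P = {"W": F(1, 10), "S": F(9, 10)}          # type probabilities
meals, reactions = ("B", "Q"), ("f", "a")   # beer/quiche, fight/acquiesce

def u1(t, meal, react):
    base = 0 if react == "f" else 2          # 0 if fought, 2 if acquiesced
    taste = 1 if (t, meal) in (("S", "B"), ("W", "Q")) else 0
    return base + taste

def u2(t, react):
    if react == "a":
        return 0
    return 1 if t == "W" else -1             # fighting pays only against wimps

# Pure strategies: player 1 picks a meal per type, player 2 a reaction per meal.
S1 = list(product(meals, meals))             # (wimp's meal, surly's meal)
S2 = list(product(reactions, reactions))     # (reaction to B, reaction to Q)

def payoffs(s1, s2):
    v1 = v2 = F(0)
    for t, meal in zip(("W", "S"), s1):
        react = s2[0] if meal == "B" else s2[1]
        v1 += P[t] * u1(t, meal, react)
        v2 += P[t] * u2(t, react)
    return v1, v2

for s1 in S1:
    for s2 in S2:
        v1, v2 = payoffs(s1, s2)
        best1 = v1 == max(payoffs(t1, s2)[0] for t1 in S1)
        best2 = v2 == max(payoffs(s1, t2)[1] for t2 in S2)
        if best1 and best2:
            print("pure NE:", s1, s2)
# Prints ('B', 'B') with ('a', 'f')  -- i.e. ((WB, SB), (ba, qf)) --
# and    ('Q', 'Q') with ('f', 'a')  -- i.e. ((WQ, SQ), (bf, qa)).
```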

MIXED NASH EQUILIBRIA
In the special case of a two-person signaling game in which each player has two choices in
each information set there is a graphical way to compute all mixed Nash equilibria in
behavioral strategies. For the beer and quiche game this method looks as follows.
First we will follow player 2 in his strategic considerations. Suppose that player 1 is using
the behavioral strategy ((p, 1 − p), (q, 1 − q)). Then the probability of ending up in the
upper node of the left information set is (1/10)·p, and the probability of ending up in the
lower node of the left information set is (9/10)·q. So, the expected payoff of playing bf in
the left hand side information set is

    (1/10)·p·1 + (9/10)·q·(−1) = (1/10)·p − (9/10)·q.

The expected payoff of playing ba, though, is 0. So, playing bf is a pure best response
whenever p ≥ 9q. Graphically, this looks like

    [Figure: the (p, q)-square with the line q = p/9; above the line ba is the pure best
    response, below it bf.]

Of course, on the line p = 9q both strategies are equally good, and below that line bf is
the best pure response. Performing these computations for the right hand information set
shows that qf is a best response if p ≤ 9q − 8. Putting these results together gives

    [Figure: the (p, q)-square divided by the lines q = p/9 and q = (p + 8)/9 into three
    areas: area I below the line q = p/9, area II between the two lines, and area III
    above the line q = (p + 8)/9.]

In area I the pure best response is (bf, qa), in area II the pure best response is (ba, qa),
and in area III we get (ba, qf) as a pure best response.¹
Now suppose you're player 1, and assume that player 2 is playing the behavioral strategy
((r, 1 − r), (s, 1 − s)). Then, in case the wimpy decision node is reached, WB is a better
pure strategy than (or equally good as) WQ as soon as s − r ≥ 1/2. For the surly decision
node we get that SB is a pure best reply as soon as s − r ≥ −1/2. Graphically,

    [Figure: the (r, s)-square divided by the lines s = r + 1/2 and s = r − 1/2 into
    three areas: area I below the line s = r − 1/2, area II between the two lines, and
    area III above the line s = r + 1/2.]

In area I the pure best response is (WQ, SQ), in area II the pure best response is
(WQ, SB), and in area III we get (WB, SB) as a pure best response.
Given these figures it is easy to find all equilibria. Take for example area I of player 1. In
this area (bf, qa) is the best pure response of player 2. Now, playing (bf, qa) translates to
taking r = 1 and s = 0 in the language of (mixed) behavioral strategies. In the second
figure you can see that this behavioral strategy is in area I, on which (WQ, SQ) is a pure
best response. And that strategy (corresponding to p = 0 and q = 0) is indeed an element
of (the boundary of) area I! Thus we find that ((WQ, SQ), (bf, qa)) is a Nash equilibrium
in behavioral strategies.
It is also possible to find Nash equilibria that are not in pure strategies. Take for example
p = 0 and q = 0. Against this strategy the best replies of player 2 are of the form
((r, 1 − r), (0, 1)) where 0 ≤ r ≤ 1. Now, of these best responses, the strategies with
1/2 ≤ r ≤ 1 are in area I for player 2. So, playing p = 0 and q = 0 is a best response for
player 1 to these strategies. All in all, the strategy pairs (((0, 1), (0, 1)), ((r, 1 − r), (0, 1)))
for 1/2 ≤ r ≤ 1 are Nash equilibria.

¹ Notice that (bf, qf) doesn't occur as a best response. Further notice that on the line
separating two areas the pure best responses of both areas become best responses, as well
as all combinations thereof. For example, on the line p = 9q between areas I and II all
strategies of the form ((r, 1 − r), (0, 1)) are best responses for player 2.
PERFECT BAYESIAN EQUILIBRIA
A perfect Bayesian equilibrium is a Nash equilibrium that can be supported by beliefs for
player 2 in his information sets. A little bit more formally, let us look at the Nash
equilibrium ((WB, SB), (ba, qf)). The question is whether we can find beliefs (α, 1 − α)
and (β, 1 − β) such that
(1) these beliefs are in accordance with the probabilities to arrive in some decision node
that are generated by the strategies that are actually played, and
(2) player 2 acts optimally given these beliefs.
Well, given the Nash equilibrium, the probability to end up in the upper left decision node
of player 2 is (1/10)·1 = 1/10, and the probability to end up in the lower left node is equal
to 9/10. So, at least we get that (α, 1 − α) = (1/10, 9/10). This implies that, given this
belief, the expected payoff for player 2 when playing bf equals

    (1/10)·1 + (9/10)·(−1) = −8/10.

Doing this calculation for the pure strategy ba yields an expected payoff of 0. So, given
this belief, playing ba in his left information set is optimal for player 2.

In the information set on the right hand side things are a bit less straightforward. Since
both nodes in this information set are reached with probability zero, the requirement that
(β, 1 − β) should be in accordance with these probabilities seems meaningless. In this case
(i.e., when all nodes of an information set are reached with probability zero) we simply
leave out condition (1) and only require that the belief in this information set is such that
(2) is fulfilled. So, the question now is: can we find a belief (β, 1 − β) such that playing qf
(this being the equilibrium strategy in this information set in this case) is optimal? Well,
in that case the expected payoff of playing qf given the belief (β, 1 − β), which happens to
be equal to

    β·1 + (1 − β)·(−1) = 2β − 1,

should be larger than or equal to the expected payoff of playing qa given this belief, which
is obviously 0. So, any belief (β, 1 − β) for which β ≥ 1/2 will support the equilibrium
strategy in the right hand side information set.
EXERCISE 1
Show that ((WQ, SQ), (bf, qa)) is also perfect Bayesian.


THE INTUITIVE CRITERION
Still, all other things being equal, the equilibrium in which player 1 always takes beer for
breakfast seems to be the more intuitive one, simply because the probability of being surly,
and with that the probability of having a taste for beer during breakfast, is a lot higher
than the probability of being a wimp. Nevertheless, apparently we need fairly strong
requirements on our equilibrium before we can single out the more intuitive one. One of
the ways to do this is the intuitive criterion.
Consider the equilibrium ((WQ, SQ), (bf, qa)), shown in the figure below.

    [Figure: the beer and quiche game tree with the equilibrium ((WQ, SQ), (bf, qa))
    indicated: both types of player 1 eat quiche, player 2 acquiesces after quiche and
    would fight after beer.]
Now suppose that player 2, despite the equilibrium agreement, does observe that player 1
is drinking beer for breakfast. So, he is in the left information set, and apparently player 1
has deviated from the equilibrium agreement. Then player 2 will have to ask himself why
player 1 would do such a thing. Well, if player 1 is of the wimpy type, would he then have
any reason for such obtrusive behavior? No! Simply because the wimpy type would get
payoff 3 when he sticks to the equilibrium and takes quiche for breakfast, while he can get
at most 2 when he deviates from the equilibrium and starts drinking beer. Only the surly
type could possibly hope to get more than his equilibrium payoff (he gets 2 in equilibrium,
and might get 3 if he switches from eating quiche to drinking beer). So, the only reasonable
belief player 2 could have in such a case (seeing player 1 drink beer) is (α, 1 − α) = (0, 1).
However, bf is not a best response given this belief, and the equilibrium breaks down.
Formally, the intuitive criterion works as follows. Suppose a perfect Bayesian equilibrium is given, and suppose that player 2 has an information set that is not reached by the equilibrium play. Then
(1) for each node in this information set, player 2 has to check whether the corresponding type of player 1 has any incentive to deviate from the equilibrium agreement. In other words, does this type of player 1 have any chance of getting more than the equilibrium payoff by deviating? If not, assign probability zero to this type of player 1. (If all types in the information set get eliminated this way, the intuitive criterion does not apply, and the equilibrium survives the test.)
(2) Given the remaining types of player 1 in this information set, is there a belief of player 2 over these remaining types such that his current equilibrium strategy in this information set is a best response given this belief?
Finally, if in each information set that is reached with probability zero by the equilibrium play, the strategy of player 2 in that information set is a best response against some belief of the form given in (2), the equilibrium is said to survive the intuitive criterion.
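To see the two steps at work, the following small Python sketch runs step (1) for the equilibrium ((WQ, SQ), (bf, qa)); the payoff table for player 1 is reconstructed from the discussion above, and all names are ours.

    # Step (1) of the intuitive criterion in the beer and quiche game (a sketch).
    # u1[(type, breakfast, response)] is player 1's payoff; "f" = fight, "a" = abstain.
    u1 = {
        ("wimp",  "beer",   "f"): 0, ("wimp",  "beer",   "a"): 2,
        ("wimp",  "quiche", "f"): 1, ("wimp",  "quiche", "a"): 3,
        ("surly", "beer",   "f"): 1, ("surly", "beer",   "a"): 3,
        ("surly", "quiche", "f"): 0, ("surly", "quiche", "a"): 2,
    }

    # In ((WQ, SQ), (bf, qa)) both types eat quiche and player 2 abstains after quiche.
    eq_payoff = {"wimp": u1[("wimp", "quiche", "a")],    # 3
                 "surly": u1[("surly", "quiche", "a")]}  # 2

    # A type keeps positive probability after the deviation "beer" only if SOME
    # response of player 2 could give it more than its equilibrium payoff.
    for t in ("wimp", "surly"):
        best = max(u1[(t, "beer", r)] for r in ("f", "a"))
        print(t, "could gain by deviating to beer:", best > eq_payoff[t])

The output is False for the wimp (at most 2 against an equilibrium payoff of 3) and True for the surly type (up to 3 against 2), so the only belief surviving step (1) puts probability 0 on the wimp, exactly as argued informally above.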
EXERCISE 2

Show that the other Nash equilibrium ((WB, SB), (ba, qf)) of the beer and quiche game survives the intuitive criterion.

EXERCISE 3

Consider the following signaling game.

[Figure: a signaling game. Nature chooses between two types with probability 1/2 each; player 1, knowing the type, sends one of two signals, and player 2, observing only the signal, chooses one of two actions. The payoff pairs at the leaves, as far as recoverable from the extracted figure, are (1, 1), (2, 2), (2, 0), (0, 0), (0, 0), (1, 0), (0, 1) and (1, 1).]

(a) Compute all Nash equilibria in mixed behavioral strategies.
(b) Figure out which ones are perfect Bayesian.


EXERCISE 4

Consider the following signaling game.

[Figure: a signaling game with the same structure as in Exercise 3 (Nature chooses each of two types with probability 1/2). The payoff pairs at the leaves, as far as recoverable from the extracted figure, are (2, 2), (1, 0), (3, 1), (0, 1), (0, 0), (1, 0), (0, 0) and (1, 1).]

(a) Compute all Nash equilibria in mixed behavioral strategies. (Notice that the payoffs in this game are partly the same as in the previous game.)
(b) Which pure Nash equilibria are perfect Bayesian?
(c) Which pure Nash equilibria survive the intuitive criterion?

7.3 The Spence Signaling Model

To conclude this chapter, we study an economic application of signaling games which is
known as the Spence signaling model. In this model there is a worker who wishes to apply
for a job at a firm. The problem is that the firm must propose a wage to the worker, while being unaware of the worker's quality. Intuitively, the firm will choose a higher wage if it believes that the worker is of a higher quality. Before the firm chooses the wage, the worker has the opportunity to invest in education. The level of education is observed by the firm, and may serve as a signal of the worker's quality, since a worker of lower quality must put in more effort than a worker of higher quality to obtain the same level of education. Based on the observed level of education, the firm has beliefs about the worker's quality. Finally, the firm will propose the wage that maximizes its profits given these beliefs. The question is: what level of education should a worker of a given quality choose, and how should the firm determine the wage?
The situation may be formalized as a signaling game. The worker is player 1, and every possible quality level of the worker may be interpreted as a type. Obviously, player 1 knows his own type. The firm is player 2, who is unaware of the worker's quality, hence is unaware of player 1's type. The possible education levels that the worker may choose are the possible signals in the game. Finally, the possible wages that can be chosen by the firm are the possible actions for player 2.
More precisely, assume that the worker may be of high quality (type H) or low quality
(type L), each occurring with probability 0.5. The worker may choose any education level
e ∈ [0, ∞). The firm may choose any wage w ∈ [0, ∞). If the worker is of high quality, chooses education level e and receives wage w, the payoff for the worker is

    u₁(H, e, w) = w − e²

while the payoff for the firm will be

    u₂(H, e, w) = 2√(ew) − w.

The intuition is that e² represents the effort that a high quality worker must put in in order to obtain an education level of e, while 2√(ew) reflects the high quality worker's productivity when having education level e and receiving wage w.
If the worker is of low quality, the payoffs are

    u₁(L, e, w) = w − 2e²

and

    u₂(L, e, w) = √(ew) − w.

Hence, in comparison with the high quality worker, a low quality worker must put in more effort, namely 2e², in order to obtain the same education level e. On the other hand, the productivity of a low quality worker, √(ew), is lower than that of a high quality worker with the same education e and wage w.
Note that the worker may choose between infinitely many signals, and that the firm may choose between infinitely many wages. In contrast, the definitions and concepts developed so far in this chapter only apply to signaling games where players 1 and 2 can choose between finitely many actions. However, the concepts of pure strategies, beliefs, perfect Bayesian equilibria and the intuitive criterion can easily be adapted to the Spence signaling model.
In this game, a pure strategy for player 1 is a function E that assigns to every type t ∈ {H, L} some education level E(t) ∈ [0, ∞). Here, E(H) is the education level chosen by a high quality worker, and E(L) is the education level chosen by a low quality worker. A pure strategy for player 2 is a function W that assigns to every possible education level e that can be observed some wage W(e) ∈ [0, ∞). In order to keep things as simple as possible, we will restrict our attention to pure strategies in the analysis of the Spence signaling model. We therefore do not allow players to randomize between actions. A belief system for player 2 is a function μ that assigns to every education level e probabilities μ(e)(H) and μ(e)(L) to types H and L respectively, with μ(e)(H) + μ(e)(L) = 1. Intuitively, μ(e)(H) is the probability that the firm assigns to the event that the worker is of high quality if it observes that the worker has chosen education level e. Similarly for μ(e)(L).
A perfect Bayesian equilibrium in this game is a combination (E, W, μ) of pure strategies and beliefs such that
(1) for both types t the education level E(t) is optimal given the wage scheme W offered by the firm;
(2) for all possible education levels e that can be observed by the firm, the wage W(e) offered must be optimal given the beliefs μ(e)(H) and μ(e)(L);
(3.a) if education level e is chosen by both types, that is, if E(H) = E(L) = e, then μ(e)(H) = μ(e)(L) = 0.5;
(3.b) if education level e is only chosen by type H, that is, if E(H) = e but E(L) ≠ e, then μ(e)(H) = 1;
(3.c) if education level e is only chosen by type L, that is, if E(L) = e but E(H) ≠ e, then μ(e)(L) = 1.
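Conditions (3.a)-(3.c) are nothing but Bayes' rule applied to the pure strategy E. For instance, under a pooling strategy E(H) = E(L) = e, Bayes' rule gives

    μ(e)(H) = (0.5 · 1) / (0.5 · 1 + 0.5 · 1) = 0.5,

which is condition (3.a), while under a separating strategy the observed education level reveals the type, which gives (3.b) and (3.c).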
A perfect Bayesian equilibrium (E, W, μ) is said to survive the intuitive criterion if the beliefs have the following additional property:
(4) if an education level e is such that E(H) ≠ e, E(L) ≠ e, type H has a chance of getting more than his equilibrium payoff by choosing e instead, but type L has not, then μ(e)(H) = 1;
(5) if an education level e is such that E(H) ≠ e, E(L) ≠ e, type L has a chance of getting more than his equilibrium payoff by choosing e instead, but type H has not, then μ(e)(L) = 1.
It may be checked that there are many different perfect Bayesian equilibria in this game.
However, we shall show that all perfect Bayesian equilibria that survive the intuitive criterion
induce a unique education level for type L and a unique education level for type H. Moreover,
these perfect Bayesian equilibria have the property that the high quality worker chooses a
higher education level than the low quality worker. We call such equilibria in which different
types choose different signals separating equilibria or screening equilibria.
As a preparatory step, we first investigate which wages the firm may choose in a perfect Bayesian equilibrium. Choose a perfect Bayesian equilibrium (E, W, μ). After observing education level e, the firm has beliefs μ(e), assigning probability μ(e)(H) to type H and probability μ(e)(L) = 1 − μ(e)(H) to type L. Then the firm's expected payoff, when observing education level e, having beliefs μ(e) and choosing wage w, is given by

    u₂(e, w, μ(e)) = μ(e)(H)(2√(ew) − w) + (1 − μ(e)(H))(√(ew) − w)
                   = (1 + μ(e)(H))√(ew) − w.

Since the wage W(e) must be optimal given the beliefs μ(e), we must have that

    du₂(e, W(e), μ(e))/dw = 0.
Hence,

    (1 + μ(e)(H)) · √e / (2√(W(e))) − 1 = 0,

which yields

    W(e) = (1/4)(1 + μ(e)(H))² e.    (7.1)

(The first-order condition suffices here, since u₂ is strictly concave in w.) So, in every perfect Bayesian equilibrium, the wage offered after observing any education level e is given by W(e) = (1/4)(1 + μ(e)(H))² e.
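As a quick sanity check of (7.1), one may also maximize the firm's payoff numerically; in the following sketch the values of e and μ(e)(H) are arbitrary, and a grid search stands in for the first-order condition.

    # Numeric check of W(e) = (1/4)(1 + mu)^2 * e for one arbitrary (e, mu).
    import numpy as np

    def u2(e, w, mu):
        # The firm's expected payoff (1 + mu) * sqrt(e * w) - w derived above.
        return (1 + mu) * np.sqrt(e * w) - w

    e, mu = 2.0, 0.3
    grid = np.linspace(0.0, 4.0, 400_001)
    w_star = grid[np.argmax(u2(e, grid, mu))]
    print(w_star, 0.25 * (1 + mu) ** 2 * e)  # both approximately 0.845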

Now, choose a perfect Bayesian equilibrium that survives the intuitive criterion. First, we show that both types must choose different education levels, that is, E(H) ≠ E(L). Suppose the contrary, so assume that E(H) = E(L) = ē for some ē ∈ [0, ∞). We show that this is not possible.
By (3.a), we must have that μ(ē)(H) = μ(ē)(L) = 0.5. Hence, by (7.1) we know that

    W(ē) = (1/4)(1 + 0.5)² ē = (9/16) ē.

In the picture below, we draw the indifference curves for types H and L at the point (ē, W(ē)).

[Figure 1: the indifference curves of types H and L through the point (ē, W(ē)), together with the lines w = e and w = (1/4)e; an education level e₂ to the right of ē is marked.]
So, the indifference curve for type H contains all education-wage pairs (e, w) that yield the same payoff to type H as (ē, W(ē)). That is, it contains all pairs (e, w) with

    w − e² = W(ē) − ē².

Similarly we construct the indifference curve for type L, consisting of all pairs (e, w) with

    w − 2e² = W(ē) − 2ē².
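It is worth spelling out the property that drives the rest of the analysis: along type H's indifference curve w = e² + (W(ē) − ē²) the slope is 2e, while along type L's curve w = 2e² + (W(ē) − 2ē²) it is 4e. Hence at any common point with e > 0 type L's indifference curve is strictly steeper; this single-crossing property is what makes education a credible signal of quality.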
For every other education level e ≠ ē, the firm will always respond with a wage offer W(e) = (1/4)(1 + μ(e)(H))² e. Since μ(e)(H) always lies between 0 and 1, we have that

    W(e) ∈ [(1/4)e, e]    (7.2)

for all other education levels e that may be chosen. Here, W(e) = (1/4)e is chosen if μ(e)(L) = 1 and W(e) = e is chosen if μ(e)(H) = 1. The lines w = (1/4)e and w = e have been drawn in Figure 1.
Now, consider the education level e₂ drawn in Figure 1. We see that type L cannot obtain more than his equilibrium payoff by choosing e₂ instead of ē. Namely, type L's equilibrium payoff is given by W(ē) − 2ē². After observing e₂ the firm will choose some wage w between (1/4)e₂ and e₂. However, all the points (e₂, w) with w ∈ [(1/4)e₂, e₂] lie strictly below type L's indifference curve that goes through the equilibrium pair (ē, W(ē)). Hence, by choosing e₂ type L will always receive strictly less than his equilibrium payoff W(ē) − 2ē².
On the other hand, type H can get more than his equilibrium payoff by choosing e₂. Namely, if the firm would hold beliefs μ(e₂)(H) = 1, then the firm would choose W(e₂) = e₂, and type H would receive the payoff e₂ − e₂². Since the point (e₂, e₂) lies above type H's indifference curve through (ē, W(ē)), we have that

    e₂ − e₂² > W(ē) − ē²,

and hence type H could get more than his equilibrium payoff by choosing e₂ instead of ē.
But then, by the intuitive criterion, the firm's beliefs after observing education level e₂ must satisfy μ(e₂)(H) = 1. However, if this is true, then the firm will choose wage W(e₂) = e₂ after observing education level e₂. Since e₂ − e₂² > W(ē) − ē², this would mean that education level ē is not optimal for type H against the wage scheme W, since by choosing e₂ he would get e₂ − e₂², which is strictly more than W(ē) − ē². However, this would imply that (E, W, μ) is not a perfect Bayesian equilibrium, which is a contradiction. So, we have shown that E(H) = E(L) is not possible in a perfect Bayesian equilibrium that survives the intuitive criterion.
Now, take a perfect Bayesian equilibrium (E, W, μ) that survives the intuitive criterion. Then, necessarily, E(H) ≠ E(L). Consider type L. Let E(L) = e_L. Since L is the only type that chooses education level e_L, the firm's beliefs after observing e_L must satisfy μ(e_L)(L) = 1. Then we know that the firm will choose wage W(e_L) = (1/4)e_L. We may draw type L's indifference curve through the education-wage pair (e_L, (1/4)e_L), as is done in Figure 2.
[Figure 2: type L's indifference curve through (e_L, (1/4)e_L) and the line w = (1/4)e, crossing rather than tangent; education levels slightly below e_L yield points on the line strictly above the curve.]

Now, assume that L's indifference curve is not tangent to the line w = (1/4)e. Suppose first that the situation is as in Figure 2. Then, by choosing some education level e slightly smaller than e_L, type L would be strictly better off than by choosing e_L. Namely, by choosing e slightly smaller than e_L the firm will respond with a wage W(e) ≥ (1/4)e. Hence, the point (e, W(e)) lies strictly above the indifference curve, which means that type L could strictly improve upon his equilibrium payoff by choosing this education level e. However, this would mean that e_L would not be optimal for type L, which is a contradiction.

The second possible situation that can occur if L's indifference curve is not tangent to the line w = (1/4)e is depicted in Figure 3.
[Figure 3: type L's indifference curve through (e_L, (1/4)e_L) and the line w = (1/4)e, again crossing rather than tangent, now so that education levels slightly above e_L yield points on the line strictly above the curve.]

But then, by choosing an education level e slightly bigger than e_L, type L will be strictly better off than by choosing e_L. The argument is the same as above. This would be a contradiction, since e_L must be optimal for type L.
So, the only possibility that remains is that L's indifference curve through (e_L, (1/4)e_L) is tangent to w = (1/4)e. Since L's indifference curve is given by

    w − 2e² = (1/4)e_L − 2e_L²,

its slope at (e_L, (1/4)e_L) is 4e_L. Hence, we must have that 4e_L = 1/4, which means that

    e_L = 1/16.
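The tangency can also be verified directly: the constant of the indifference curve is (1/4) · (1/16) − 2 · (1/16)² = 1/128, so substituting w = (1/4)e into w = 2e² + 1/128 gives

    2e² − (1/4)e + 1/128 = 0,

whose discriminant (1/4)² − 4 · 2 · (1/128) = 1/16 − 1/16 = 0 shows that the line touches the curve in exactly one point, e = 1/16.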

Now, consider the indifference curve of type L through the education-wage pair (e_L, (1/4)e_L) = (1/16, 1/64), as depicted in Figure 4.

[Figure 4: type L's indifference curve through (1/16, 1/64), tangent to the line w = (1/4)e; the education level e₂ where the curve hits the line w = e is marked.]
We know from the above that L's indifference curve must be tangent to the line w = (1/4)e. Let e₂ be the education level where this indifference curve hits the line w = e. It may be verified that e₂ ≈ 0.492. Suppose that the firm would observe an education level e higher than e₂. Then, type L could never improve upon his equilibrium payoff by choosing such an e. Namely, if type L chooses e > e₂, then the firm will choose a wage w between (1/4)e and e, and hence the point (e, w) would lie strictly below the indifference curve. Hence, choosing e > e₂ will always give type L strictly less than his equilibrium payoff.
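The value of e₂ can be computed explicitly. The tangent indifference curve is w = 2e² + 1/128, so intersecting it with w = e gives

    2e² − e + 1/128 = 0,    i.e.    e = (4 ± √15)/16.

The intersection drawn in Figure 4 is the larger root, e₂ = (4 + √15)/16 ≈ 0.492; the smaller root lies to the left of the tangency point e_L = 1/16.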
We now turn to type H. Let E(H) = e_H. Since type H is the only type that chooses e_H, we must have that the firm's beliefs after observing e_H satisfy μ(e_H)(H) = 1. Hence, the firm offers the wage W(e_H) = e_H after observing education level e_H. Given that an education level e is thus rewarded with wage e, the best education level that type H could possibly choose is the one that maximizes his payoff

    e − e².

Since d(e − e²)/de = 1 − 2e, the maximum is reached at e = 1/2. We now show that type H must choose E(H) = 1/2. Suppose that type H would choose some E(H) = e_H ≠ 1/2, resulting in an equilibrium payoff e_H − e_H².
Then, if H would choose e = 1/2, type H could improve upon his equilibrium payoff if the firm would choose wage w = 1/2 after observing education level e = 1/2. On the other hand, since 1/2 > e₂ ≈ 0.492, type L can never improve upon his equilibrium payoff by choosing e = 1/2 instead of e_L (see the argument above). Hence, by the intuitive criterion, the firm's beliefs after observing education level e = 1/2 must satisfy μ(1/2)(H) = 1. However, if this is true, the firm will choose the wage w = e = 1/2 after observing education level e = 1/2. But then, type H would certainly improve upon his equilibrium payoff by choosing e = 1/2 instead of e_H, since e = 1/2 maximizes e − e². This means that e_H is not an optimal education level for type H, which is a contradiction. We must therefore conclude that type H must choose e_H = 1/2.
Summarizing, we have shown that every perfect Bayesian equilibrium that survives the intuitive criterion leads to the same education levels, namely education level 1/16 for type L and education level 1/2 for type H.
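For completeness, the corresponding wages and payoffs follow immediately from the above: the firm pays W(1/16) = 1/64 to the low type and W(1/2) = 1/2 to the high type, so that in every such equilibrium type L ends up with payoff 1/64 − 2 · (1/16)² = 1/128 and type H with payoff 1/2 − (1/2)² = 1/4.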