
Nachbar, Spring 2003

Basic Non-cooperative Game Theory: The NC-17 Version

1 Preliminary remarks.

These are notes on basic non-cooperative game theory. By “basic,” I mean “fundamental,” not “easy.” These notes are written at a ﬁrst-year graduate level. I mention only a few applications, and then only in passing. I do not discuss equilibrium reﬁnements (with the partial exception of iterated weak dominance). And I do not discuss cooperative game theory. With the exception of Section 8.9 (repeated games), I restrict formal deﬁnitions and theorems to ﬁnite games, that is, games in which there are a ﬁnite number of players and each player has only a ﬁnite number of strategies. I provide some comments on how, or whether, the deﬁnitions and theorems can be extended to inﬁnite games. And I include some examples of inﬁnite games, e.g. Example 4, the Cournot game.

2 Games in Strategic Form.

2.1 Strategic forms.

A strategic form consists of a list of players and a list, for each player, of that player's strategies, that is, a list of what that player can do in the game. Formally, let I denote the set of players and let Si denote the set of strategies available to player i ∈ I. I sometimes refer to the elements of Si as pure strategies, to distinguish them from the mixed strategies introduced in Section 3.3. As noted in Section 1, all definitions, with the exception of those in Section 8.9, assume that the game is finite: both Si and I are finite. Let N denote the number of players; N = |I|.¹ Then the set of players can be represented as I = {1, . . . , N}. A strategic form is then (I, {Si}_{i=1}^N), where {Si}_{i=1}^N = {S1, . . . , SN}.

A strategy profile s = (s1, . . . , sN) lists a strategy for each player. The set of strategy profiles is then

    S = ∏_{i=1}^N Si.

¹ In general, given a set X, let |X| denote the cardinality of X.

For future reference, I also establish notation for profiles of strategies of players other than player i. Explicitly, if s = (s1, . . . , si−1, si, si+1, . . . , sN) then s−i = (s1, . . . , si−1, si+1, . . . , sN). The set of strategy profiles for players other than i is then

    S−i = ∏_{j≠i} Sj.

By definition, if s−i = (s1, . . . , si−1, si+1, . . . , sN) then (si, s−i) = (s1, . . . , si−1, si, si+1, . . . , sN).

2.2 Strategic form games.

A strategic form becomes a strategic form game once one specifies, for each player i, a payoff function ui : S → R. A game in strategic form is written formally as (I, {Si}_{i=1}^N, {ui}_{i=1}^N).

In most economic applications, players are (or at least would like to be) subjective expected utility maximizers, and the interpretation of ui is as follows. Each strategy profile s generates an outcome. The outcome might be, for example, that player i gets a gold star while every other player gets a lump of coal. Each player has a felicity function over outcomes, and this in turn induces a felicity function ui over strategy profiles. In economic applications, it is common to assume that if the strategy profile s results in player i receiving, say, $12 then ui(s) = 12. This implicitly assumes both risk neutrality and the absence of externalities like envy or altruism.

Remark 1. In game theory as studied by evolutionary biologists, ui is a measure of reproductive fitness, such as expected numbers of children or grandchildren. I do not, however, discuss this application further.

2.3 Examples.

As discussed in Section 8.7, any non-cooperative game can be represented as a game in strategic form. But for the moment it is convenient, in terms of interpretation, to think of a strategic form game as one in which players act simultaneously. I now give some standard examples. It is often convenient to represent games via game boxes. Rather than define what a game box is, I give examples using 2 × 2 games, by which I mean games with two players and two strategies for each player.

Example 1. The game box for one version of Battle of the Sexes is in Figure 1. Players have two strategies, a and b. If both players choose a then player 1, the row player, gets 8 and player 2, the column player, gets 10. If both players choose b then player 1 gets 10 and player 2 gets 8. If player 1 chooses a and player 2 chooses b, or if player 1 chooses b and player 2 chooses a, then both players get 0. Thus, player 1 and player 2 would like to coordinate on either (a, a) or (b, b), but they disagree about which of those two is better.

         a        b
    a   8, 10    0, 0
    b   0, 0    10, 8

    Figure 1: Battle of the Sexes.

Example 2. The game box for Matching Pennies is in Figure 2. Player 1 likes the profiles (H, H) and (T, T). Player 2 likes the profiles (H, T) and (T, H). This game is called zero sum because, for any strategy profile, the sum of payoffs is zero.

         H        T
    H   1, −1   −1, 1
    T  −1, 1     1, −1

    Figure 2: Matching Pennies.

Example 3. The game box for a version of the Prisoner's Dilemma (sometimes Prisoners' Dilemma; your choice) is in Figure 3. The players can maximize the sum of their payoffs by coordinating on (C, C). But, regardless of what the opponent does, each player wants to play F.

         C        F
    C   4, 4     0, 6
    F   6, 0     1, 1

    Figure 3: A Prisoner's Dilemma.

Game boxes are convenient for small games, such as the two player, two strategy games above, but they are useless for large games. A standard example of a large game is the following.

Example 4. There are two firms producing a homogeneous product for which market demand is given by

    Q = 12 − P if P ∈ [0, 12], and Q = 0 if P > 12.

A strategy for firm i is a choice of quantity. Following tradition, I denote a typical strategy in the Cournot game as qi ("q" for "quantity"). Market price is set to clear the market, if possible, in which case Q = q1 + q2. Thus, if the strategy profile is (q1, q2), then market price is

    P(q1, q2) = 12 − (q1 + q2) if q1 + q2 ∈ [0, 12], and P(q1, q2) = 0 if q1 + q2 > 12.

Suppose that the cost of producing qi is Ci(qi). Then the profit to firm i, which I equate with firm i's payoff, is

    ui(q1, q2) = P(q1, q2)qi − Ci(qi).

For simplicity, suppose that there are no costs. One can verify that the sum of profits is maximized if total output is 6. Thus, the firms would like to coordinate on an output of 3 each. But they have incentive to cheat. For example, if player 1 were to produce 3 then player 2 would maximize its profit by producing 4.5 rather than 3.

The game just described is called the Cournot duopoly. Generalizing this game to N firms yields a Cournot oligopoly. Note that the strategy set Si = [0, ∞) is infinite. It can be made finite by limiting attention to, say, integer quantities less than or equal to 12. But it is often simplest to deal directly with the continuum case.
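The two claims just made are easy to check numerically. The following sketch, in Python, verifies that joint profit is maximized at a total output of 6, but that against an output of 3 the opposing firm prefers 4.5. The zero-cost assumption and the demand intercept of 12 are from the example; the function names are mine.

```python
# A quick numerical check of the Cournot example, assuming zero costs.

def price(q1, q2):
    """Market-clearing price: P(q1, q2) = 12 - (q1 + q2), floored at 0."""
    return max(12.0 - (q1 + q2), 0.0)

def profit(qi, qj):
    """Firm i's payoff with no production costs: P * qi."""
    return price(qi, qj) * qi

# Joint profit is maximized at total output 6, e.g. 3 units each:
print(profit(3, 3) + profit(3, 3))   # 36.0

# But against q1 = 3, firm 2's best output on a fine grid is 4.5, not 3:
best = max((k / 100 for k in range(1201)), key=lambda q: profit(q, 3))
print(best, profit(best, 3))         # 4.5  20.25  (versus 18.0 from q2 = 3)
```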

3 Basic Probability Notation.

3.1 Probability Distributions over Strategies.

Let ∆(S) denote the set of probability distributions over S.² σ ∈ ∆(S) is called a strategy distribution. σ(s) is the probability under σ of the strategy profile s. The support of σ, written supp(σ), is the set of s ∈ S that get positive probability under σ; that is, supp(σ) is the set of s such that σ(s) > 0. Say that σ is degenerate if there is some s such that σ(s) = 1. If σ(s) = 1 then I use s interchangeably with σ, in which case supp(σ) = {s}. Similar definitions hold for σi ∈ ∆(Si) and σ−i ∈ ∆(S−i).

Example 5. In the case of a 2 × 2 game, I can represent σ as in Figure 4. Here α, β, γ, δ ≥ 0 and α + β + γ + δ = 1. For example, if s = (s1, ŝ2) then σ(s) = β.

          s2      ŝ2
    s1    α       β
    ŝ1    γ       δ

    Figure 4: A strategy distribution.

Example 6. Less abstractly, consider the Battle of the Sexes game of Example 1 in Section 2.3. Then one possible σ is represented in Figure 5. Thus σ(a, a) = σ(b, b) = 1/2 and σ(a, b) = σ(b, a) = 0.

          a       b
    a    1/2      0
    b     0      1/2

    Figure 5: A strategy distribution for Battle of the Sexes.

Here are three, not mutually exclusive, interpretations of a strategy distribution as a game theory prediction.

1. Subjective. A strategy distribution σ is an analyst's subjective forecast of how the game will be played. It may be that the forecast is degenerate, meaning that the analyst is certain that some particular strategy profile s will be played. But it is easy to provide examples, such as Matching Pennies (Example 2), where it seems reasonable to make non-degenerate forecasts.

² Here and below I will use the notation ∆(·) to denote the set of probabilities over a finite set.

2. Objective. Players actually randomize, and σ is the objective probability distribution over S generated by this randomization. I discuss this further in Section 3.3.

3. Empirical. σ is an empirical distribution generated by actual play. The game is played many times, perhaps with the same players, perhaps with different players, and σ(s) is the frequency with which s occurs.

3.2 Marginal distributions.

A strategy distribution σ induces, for each player i, a marginal distribution σi ∈ ∆(Si) defined by, for each si,

    σi(si) = Σ_{s−i∈S−i} σ(si, s−i).

Example 7. Consider Figure 6. The marginal distribution σ1 is defined by σ1(s1) = 1/3 + 0 = 1/3 and σ1(ŝ1) = 5/12 + 1/4 = 2/3. The marginal distribution σ2 is defined by σ2(s2) = 1/3 + 5/12 = 3/4 and σ2(ŝ2) = 0 + 1/4 = 1/4.

          s2      ŝ2
    s1    1/3      0
    ŝ1    5/12    1/4

    Figure 6: A correlated distribution.

Example 8. Consider the Battle of the Sexes and suppose that the strategy distribution is the one given in Figure 5 (Example 6). The implied marginals are σ1(a) = σ1(b) = σ2(a) = σ2(b) = 1/2.

Example 9. Again consider the Battle of the Sexes but suppose that the strategy distribution is as in Figure 7. Even though the distribution is different than in Figure 5, the implied marginals are the same: σ1(a) = σ1(b) = σ2(a) = σ2(b) = 1/2.

          a       b
    a    1/4     1/4
    b    1/4     1/4

    Figure 7: An independent distribution.
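The marginal computation is mechanical enough that a short sketch may help. The following Python fragment recomputes the marginals of Example 7; the dictionary encoding of Figure 6 and the names are my own.

```python
# Figure 6, encoded as a dictionary; 's1hat' stands for the hatted strategy.
sigma = {('s1', 's2'): 1/3, ('s1', 's2hat'): 0.0,
         ('s1hat', 's2'): 5/12, ('s1hat', 's2hat'): 1/4}

def marginal(sigma, i):
    """sigma_i(s_i) = sum over opposing strategies of sigma(s_i, s_-i)."""
    out = {}
    for profile, prob in sigma.items():
        out[profile[i]] = out.get(profile[i], 0.0) + prob
    return out

print(marginal(sigma, 0))   # {'s1': 0.333..., 's1hat': 0.666...}
print(marginal(sigma, 1))   # {'s2': 0.75, 's2hat': 0.25}
```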

3.3 Independence and mixed strategies.

Much of game theory focuses on strategy distributions that are independent. Explicitly, σ is independent iff, for any s = (s1, . . . , sN),

    σ(s) = σ1(s1) × · · · × σN(sN).

The distribution in Example 9 is independent. The distributions in Example 6 and Example 8 are not independent; they exhibit correlation.

Suppose σ is independent. Then it is as if each player i were randomizing independently according to σi. σi is called a mixed strategy.³ (σ1, . . . , σN) is called a mixed strategy profile. If σ is independent, I sometimes abuse notation and write (σ1, . . . , σN) in place of σ. A pure strategy is just a degenerate mixed strategy: si is the mixed strategy σi with σi(si) = 1. Still assuming that σ is independent, σ−i, the marginal distribution over S−i defined by σ−i(s−i) = Σ_{si∈Si} σ(si, s−i), is itself independent; hence σ−i(s−i) = ∏_{j≠i} σj(sj), and σ(s) = σi(si)σ−i(s−i).

The standard game theory prediction, Nash equilibrium (Section 6), imposes independence, and as a consequence independence is assumed in virtually all game theory applications. One can argue that independence is without loss of generality if the game is a complete description of the strategic environment. Briefly, the argument goes as follows. If players can correlate then there must be some mechanism generating the correlation. For example, it might be that players can correlate because they can talk prior to the start of play. By embedding the correlating mechanism into the description of the game, one can generate an augmented game with the property that the correlated strategy distribution of the original game becomes an independent distribution in the augmented game. To formalize this requires an understanding of extensive forms and extensive form strategies, which I don't introduce until Section 8, so I am getting ahead of the story in even bringing up this issue now. I provide explicit examples of what I have in mind later, in Section 9. But before leaving this topic, let me note that, in practice, it may not be reasonable to view games as literally complete descriptions of the strategic environment, and in some circumstances it may be sensible to expect correlation.

³ If players actually can randomize then the true strategic form is not (I, {Si}_{i=1}^N) but rather (I, {∆(Si)}_{i=1}^N). One writes (I, {Si}_{i=1}^N) rather than (I, {∆(Si)}_{i=1}^N) only as a kind of shorthand. This has various consequences for the analysis, one of which is that ∆(Si) is infinite whereas I have assumed strategy sets are finite. But the extension of the theory to the continuum in this particular case is trivial.
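Independence is likewise easy to test mechanically: compute the marginals and check whether the joint probability factors at every profile. The sketch below, with my own names and a numerical tolerance, confirms that the distribution of Figure 7 is independent while that of Figure 5 is not.

```python
def is_independent(sigma, tol=1e-9):
    """True iff sigma(s) = sigma_1(s1) * sigma_2(s2) at every profile."""
    m1, m2 = {}, {}
    for (s1, s2), p in sigma.items():   # accumulate both marginals
        m1[s1] = m1.get(s1, 0.0) + p
        m2[s2] = m2.get(s2, 0.0) + p
    return all(abs(p - m1[s1] * m2[s2]) < tol
               for (s1, s2), p in sigma.items())

figure5 = {('a', 'a'): 0.5, ('a', 'b'): 0.0, ('b', 'a'): 0.0, ('b', 'b'): 0.5}
figure7 = {('a', 'a'): 0.25, ('a', 'b'): 0.25, ('b', 'a'): 0.25, ('b', 'b'): 0.25}
print(is_independent(figure5), is_independent(figure7))   # False True
```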

3.4 Mixed extensions.

Under the standard interpretation of game theory, player i is an expected utility maximizer with felicity function given by the payoff function ui (c.f. the discussion in Section 2.2). Thus, player i's utility over lotteries, denoted Ui, has the expected utility form: the utility from the lottery (strategy distribution) σ is

    Ui(σ) = Eσ[ui] = Σ_{s∈S} σ(s)ui(s),

where Eσ denotes expectation with respect to the probability distribution σ. It is standard practice in game theory to write ui(σ) instead of Ui(σ); ui with its domain thus extended from S to ∆(S) is called the mixed extension of ui.

If σ is independent, I often write (σi, σ−i) in place of σ. Even if σ is not independent, it may be that for some i and all s, σ(s) = σi(si)σ−i(s−i), where σ−i may be correlated. If σ satisfies this expression then I continue to write (σi, σ−i) in place of σ, and I continue to refer to σi as a mixed strategy.
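As a small illustration of the mixed extension, the following sketch computes u1(σ) for player 1 of Battle of the Sexes (Example 1) under the correlated distribution of Figure 5; the encoding is my own.

```python
u1 = {('a', 'a'): 8, ('a', 'b'): 0, ('b', 'a'): 0, ('b', 'b'): 10}
sigma = {('a', 'a'): 0.5, ('a', 'b'): 0.0, ('b', 'a'): 0.0, ('b', 'b'): 0.5}

def mixed_extension(u, sigma):
    """u_i(sigma) = sum over profiles s of sigma(s) * u_i(s)."""
    return sum(sigma[s] * u[s] for s in sigma)

print(mixed_extension(u1, sigma))   # 9.0 = (1/2)*8 + (1/2)*10
```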

4 Best Response.

4.1 Definitions.

Definition 1. σ̂i ∈ ∆(Si) is a best response to σ−i ∈ ∆(S−i) iff, for all σi ∈ ∆(Si),

    ui(σ̂i, σ−i) ≥ ui(σi, σ−i).

σ̂i ∈ ∆(Si) is a strict best response to σ−i iff, for all σi ∈ ∆(Si) with σi ≠ σ̂i,

    ui(σ̂i, σ−i) > ui(σi, σ−i).

That is, σ̂i is a best response iff it yields at least as high an expected payoff, given σ−i, as any alternative strategy, and σ̂i is a strict best response iff it yields a higher expected payoff, given σ−i, than any alternative strategy. For any σ−i ∈ ∆(S−i), let BRi(σ−i) denote the set of player i's best responses to σ−i.

One interpretation of σ−i is Bayesian: σ−i is player i's subjective belief about the behavior of the other players. Another interpretation is that σ−i is the true, objective, distribution over S−i generated by the actual behavior of the other players. It may be that the subjective σ−i and the objective σ−i are different: a Bayesian's beliefs can be wrong, even if the player is Bayesian. But regardless of what one assumes about the rationality of the players, one can always ask whether, given the payoff functions, a player's strategy is a best response in the objective sense. That is, one can ask whether players are acting as if they were fully informed and were optimizing.

4.2 Basic Facts.

I have two observations. First, as the examples of Section 4.3 illustrate, it is possible to have more than one best response to the same σ−i. There is a unique best response if and only if there is a strict best response, in which case the best response is pure (Theorem 4 below). Second, in finite games, best responses (but not necessarily strict best responses) always exist.

Theorem 1. For any σ−i, there exists a best response to σ−i.

Proof. It is a basic mathematical fact that a continuous function defined on a compact set attains a maximum. The result then follows from the fact that, for any fixed σ−i, (a) the mixed extension ui is continuous as a function of σi, and (b) the domain of this function, namely ∆(Si), is compact.

Remark 2. Theorem 1 does not fully generalize to non-finite games. Consider the following game, the Name Your Prize game. There is only one player, who names a natural number z ∈ N+ and then receives z as her payoff. In this game, there isn't any z that is a best response (I still use the term "best response" even though there are no other players) since z + 1 is better than z. The problem here is a failure of compactness: the set of all natural numbers is not bounded, hence not compact.

As a second example, consider the Bertrand duopoly game. Two firms produce a homogeneous good. Firm i's strategy is its price, pi. Firm 1's payoff is its profit, determined as follows. If p1 < p2 and p1 < 1 then the demand for firm 1 is 1 − p1. If p1 > p2 or if p1 > 1 then demand for firm 1 is zero. If, however, p1 = p2 < 1 then the firms split the market, and the demand for firm 1 is (1 − p1)/2. Finally, assume that there is zero cost of production.

It is easy to verify that if p2 > 1/2 then firm 1's best response is p1 = 1/2, which is the monopoly price. And if p2 = 0 then any p1 is a best response, since no matter what price firm 1 charges, its profits are zero. But if p2 ∈ (0, 1/2] then, strictly speaking, firm 1 has no best response. Informally, firm 1 wants to undercut firm 2's price by as little as possible. But there is no price that is "just below" p2: a price of p1 = p2 − ε/2 will always be better than a price of p1 = p2 − ε. The problem here is that firm 1's payoff function is not continuous in p1 for any p2 ∈ (0, 1); in particular, there is a discontinuity at p1 = p2. Note that this problem disappears if one considers instead (perhaps more realistically) the finite game in which there is no price between, say, 59 cents and 60 cents. The bottom line is that in some non-finite games of interest, the Bertrand duopoly being a classic example, best responses may not, strictly speaking, exist for some profiles of opposing strategies.
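The Bertrand discontinuity in Remark 2 is easy to see numerically. The sketch below, under the zero-cost assumptions of the example, shows that undercutting p2 by ε/2 always beats undercutting by ε, while the limiting price p1 = p2 does strictly worse than either; hence no best response exists for p2 ∈ (0, 1/2].

```python
def profit1(p1, p2):
    """Firm 1's Bertrand profit at prices (p1, p2), zero costs."""
    if p1 > p2 or p1 > 1:
        return 0.0
    demand = 1 - p1
    if p1 == p2:
        return p1 * demand / 2   # firms split the market
    return p1 * demand

p2 = 0.4                         # any p2 in (0, 1/2] exhibits the problem
for eps in (0.1, 0.01, 0.001):
    # undercutting by eps/2 always beats undercutting by eps ...
    print(profit1(p2 - eps, p2), "<", profit1(p2 - eps / 2, p2))
# ... yet the limit price p1 = p2 is worse than any slight undercut:
print(profit1(p2, p2))           # 0.12, versus profits approaching 0.24
```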

The linear structure of expected utility has strong implications for whether a mixed strategy can be a best response. The remaining results in this subsection all bear on this issue. Some of these results (c.f. Remark 3 below) may be counterintuitive. The first observation along these lines is that a mixed strategy is a best response to some σ−i if and only if it yields as high an expected payoff as any pure strategy.

Theorem 2. σ̂i is a best response to σ−i iff

    ui(σ̂i, σ−i) ≥ ui(si, σ−i) for any si ∈ Si.

Proof. The only if direction is immediate, since a pure strategy is just a special kind of mixed strategy. The if direction follows from the fact that the payoff to a mixed strategy is just the average of the payoffs to pure strategies. Suppose that ui(σ̂i, σ−i) ≥ ui(si, σ−i) for any si ∈ Si. Consider any σi ∈ ∆(Si). Then, multiplying by the σi(si) and adding across the si,

    Σ_{si∈Si} σi(si)ui(σ̂i, σ−i) ≥ Σ_{si∈Si} σi(si)ui(si, σ−i).   (1)

But, since Σ_{si∈Si} σi(si) = 1, the left side of (1) is just ui(σ̂i, σ−i), while the right side is ui(σi, σ−i). Hence (1) implies ui(σ̂i, σ−i) ≥ ui(σi, σ−i). Since σi was arbitrary, it follows that σ̂i ∈ BRi(σ−i), as was to be shown.

The second observation is that a mixed strategy is a best response to σ−i if and only if all the pure strategies in its support are likewise best responses.

Theorem 3. σ̂i is a best response to σ−i iff every si in the support of σ̂i is a best response to σ−i.

The intuition is much like the intuition for Theorem 2. The expected payoff from a mixed strategy σ̂i is an average of the expected payoffs from the pure strategies in its support. If s̃i is in the support of σ̂i but s̃i is not a best response to σ−i, then one could raise one's payoff average by shifting probability away from s̃i.

Proof. ⇒. I argue by contraposition. Consider any σ̂i such that σ̂i(s̃i) > 0 for some s̃i ∉ BRi(σ−i). I show that σ̂i is not a best response to σ−i. Theorem 2 implies that there is an ŝi such that ui(ŝi, σ−i) > ui(s̃i, σ−i). Define σi by shifting all probability that had been on s̃i onto ŝi. Formally, let σi(s̃i) = 0, σi(ŝi) = σ̂i(s̃i) + σ̂i(ŝi), and σi(si) = σ̂i(si) for all si other than s̃i or ŝi. Then σi(s̃i) − σ̂i(s̃i) = −σ̂i(s̃i), σi(ŝi) − σ̂i(ŝi) = σ̂i(s̃i) and, for any other si, σi(si) − σ̂i(si) = 0. Therefore,

    ui(σi, σ−i) − ui(σ̂i, σ−i) = Σ_{si∈Si} [σi(si) − σ̂i(si)] ui(si, σ−i) = σ̂i(s̃i) [ui(ŝi, σ−i) − ui(s̃i, σ−i)] > 0.

Hence σ̂i is not a best response, as was to be shown.

⇐. Suppose that si ∈ BRi(σ−i) for every si ∈ supp(σ̂i). Then, for any such si and any σi, ui(si, σ−i) ≥ ui(σi, σ−i). Since, by definition, ui(σ̂i, σ−i) = Σ_{si∈Si} σ̂i(si)ui(si, σ−i), it follows that ui(σ̂i, σ−i) ≥ ui(σi, σ−i). Since σi was arbitrary, this implies that σ̂i ∈ BRi(σ−i).

Theorem 3 implies that there is always a pure strategy best response to any σ−i. Theorem 3 also implies that if a nondegenerate mixed strategy σi is a best response to σ−i then there are multiple pure strategy best responses to σ−i. In this case, the set of best responses to σ−i, mixed as well as pure, is a continuum. Theorem 3 plays an important role in the analysis to follow. In particular, the procedure for constructing mixed strategy Nash equilibria is to determine what mixtures over actions for players other than i will yield more than one best response for player i.

The final observation is that a best response is strict if and only if it is the unique best response.

Theorem 4. σ̂i is a strict best response to σ−i iff it is the unique best response to σ−i. A strict best response to σ−i is pure.

Proof. The claim is almost immediate. ⇒. Let σ̂i be a strict best response to σ−i. Then for any σi ≠ σ̂i, ui(σ̂i, σ−i) > ui(σi, σ−i), so σi is not a best response; the uniqueness claim follows. Uniqueness together with Theorem 3 then implies that a strict best response must be pure: if it were not pure then Theorem 3 implies that there would be multiple pure best responses. ⇐. Suppose that σ̂i is the unique best response to σ−i, and consider any σi ≠ σ̂i. Since σi is not a best response, ui(σ̂i, σ−i) > ui(σi, σ−i), much as in the proof of Theorem 2. (Note that if si ∉ supp(σ̂i) then it is possible that ui(si, σ−i) < ui(σ̂i, σ−i), but in this case σ̂i(si) = 0.) Since σi was arbitrary, σ̂i is a strict best response.

Remark 3. With expected utility, there is never any strict incentive to randomize: whenever randomization is optimal, playing a pure strategy is optimal as well. One sometimes sees claims to the effect that in games like Matching Pennies (Example 2), players should randomize in order to avoid being exploited by their opponents. Theorems 3 and 4 imply that this intuition is incompatible with expected utility maximization.

4.3 Examples.

Example 10. Consider the game in Figure 8. For simplicity, I have recorded only the payoffs for player 1.

        L   R
    T   5   0
    B   2   7

    Figure 8: Payoffs are for player 1 only.

Let p be the probability that player 1 plays T and let q be the probability that player 2 plays L: p = σ1(T) and q = σ2(L). It is easy to verify

that if q > 7/10 then player 1's strict best response is p = 1 (player 1 plays T for sure). Conversely, if q < 7/10 then player 1's strict best response is p = 0. On the other hand, if q = 7/10 then every p ∈ [0, 1] is a best response.

Example 11. Consider the entry deterrence game of Figure 29 (Example 43). Let p = σ1(E) and let q = σ2(Fight). Ignoring boundary issues (e.g. p = 0), one can verify that if q < 1 − K/16 then the strict best response is p = 1, while if q > 1 − K/16 then the strict best response is p = 0, and if q = 1 − K/16 then every p ∈ [0, 1] is a best response. Conversely, for any p > 0 the strict best response is q = 0, while if p = 0, then any q is a best response. In summary: if q is small, meaning that the incumbent accommodates with high probability, then the entrant's strict best response is to enter; if q is large, meaning that the incumbent fights with high probability, then the entrant's strict best response is to stay out. Informally, if there is any chance of entry, the incumbent's strict best response is to accommodate, while if there is zero chance of entry, then any q is a best response.

Example 12. Recall the Cournot duopoly of Example 4. Note that this is an infinite game. Assume that the cost function is Ci(qi) = cqi where c > 0 and c < 12. Ignoring boundary issues (e.g. ignoring q1 + q2 ≥ 12), the payoff to player 1 from the profile (q1, σ2) is

    E_{σ2}[(12 − q1 − q2)q1 − cq1] = (12 − q1 − E_{σ2}[q2] − c)q1 = (12 − q1 − q̄2 − c)q1,

where q̄2 = E_{σ2}[q2]. Given this, it is easy to verify that, for q̄2 < 12 − c, player 1's best response is strict, hence unique, and is given by (12 − c − q̄2)/2. Conversely, if q̄2 ≥ 12 − c then the firm's best response is zero. Explicitly,

    BR1(σ2) = (12 − c − q̄2)/2 if q̄2 < 12 − c, and 0 if q̄2 ≥ 12 − c.

Similarly, if q̄1 = E_{σ1}[q1], then

    BR2(σ1) = (12 − c − q̄1)/2 if q̄1 < 12 − c, and 0 if q̄1 ≥ 12 − c.

More generally, if there are N firms, all with cost C(qi) = cqi, then one can show, using the notation q̄−i = E_{σ−i}[Σ_{j≠i} qj], that the best response to σ−i is

    BRi(σ−i) = (12 − c − q̄−i)/2 if q̄−i < 12 − c, and 0 if q̄−i ≥ 12 − c.

The fact that the best response to σ−i depends only on q̄−i is not general. It comes from the fact that profit is linear in s−i, which in turn comes from the assumption that demand is linear.

Example 13. Recall the prisoner's dilemma of Example 3. It is easy to verify that "always fink" is a best response to either "always cooperate" or "always fink." In the repeated prisoner's dilemma (Example 46), things are much more complicated. Depending on the discount factors, "always fink" may not be a best response to "tit for tat" or "grim." Explicitly, consider the payoff to "always fink" when the opponent plays "grim." The "always fink" player receives (using the payoffs of Example 3) the sequence of stage game payoffs 6, 1, 1, . . . . The repeated game payoff is then

    6 + δi + δi² + · · · = 6 + δi/(1 − δi).

In contrast, if a player plays "grim" when his opponent plays "grim" then the player receives the sequence of stage game payoffs 4, 4, 4, . . . , with mutual play of C in every period. The repeated game payoff is

    4/(1 − δi).

One can verify that the payoff to "grim" is larger than the payoff to "always fink" whenever δi > 2/5.⁴ In this respect, "grim" is a better response to "grim" than is "always fink." It is not the unique best response.⁵ If δi > 2/5 then both "always C" and "tit for tat" are also best responses to "grim." And there are many, many other best responses as well. But, in this particular example, all of the best responses yield the same path of play.

⁴ If one interprets δi as reflecting a rate of time preference ri, so that δi = 1/(1 + ri), then "grim" is a better response to "grim" than is "always fink" provided ri < 3/2 = 150%. Market interest rates (in real terms) are usually less than 10% per year. So this restriction on ri (which depends on the particular stage game payoffs we are using) seems weak.

⁵ In fact, one can show that "grim" is a best response to "grim" if this condition on δi is met.
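The discount-factor threshold in Example 13 can be verified directly. The sketch below compares the repeated game payoffs of "grim" and "always fink" against "grim" at a few values of δi, using the stage payoffs of Example 3.

```python
def grim_vs_grim(delta):
    """Discounted payoff to 'grim' against 'grim': 4 every period."""
    return 4 / (1 - delta)

def fink_vs_grim(delta):
    """Discounted payoff to 'always fink' against 'grim': 6 once, then 1 forever."""
    return 6 + delta / (1 - delta)

for delta in (0.3, 0.4, 0.5):
    print(delta, round(grim_vs_grim(delta), 3), round(fink_vs_grim(delta), 3))
# 0.3: 5.714 < 6.429   (always fink is better)
# 0.4: 6.667 = 6.667   (indifferent exactly at delta = 2/5)
# 0.5: 8.0   > 7.0     (grim is better)
```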
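Returning to the Cournot best responses of Example 12: the sketch below takes c = 2 and an illustrative two-point belief for firm 1 about firm 2's output (the belief is my own choice, not from the text) and confirms that the grid-maximizing output agrees with (12 − c − q̄2)/2, so that only the mean q̄2 matters.

```python
c = 2.0
belief = {2.0: 0.5, 6.0: 0.5}          # firm 1's belief about q2 (illustrative)
qbar2 = sum(q * p for q, p in belief.items())

def expected_profit(q1):
    # boundary issues ignored, as in the text
    return sum(p * ((12.0 - q1 - q2) * q1 - c * q1)
               for q2, p in belief.items())

best = max((k / 1000 for k in range(12001)), key=expected_profit)
print(best, (12.0 - c - qbar2) / 2)    # 3.0 3.0: only the mean qbar2 matters
```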

5 Dominance and Rationalizability.

In the next three sections I introduce three basic solution concepts used in game theory: rationalizability (this section), Nash equilibrium (Section 6), and correlated equilibrium (Section 7). Of these solution concepts, rationalizability is the least restrictive, correlated equilibrium is the next least restrictive, and Nash equilibrium is the most restrictive. I describe them out of order, with correlated equilibrium coming last, because people tend to find correlated equilibrium strange at first. Applied game theory employs Nash equilibrium (or one of its refinements, such as subgame perfect Nash equilibrium) almost exclusively. But rationalizability and correlated equilibrium play a prominent role in theoretical (as opposed to applied) and empirical game theory. I discuss some of the reasons for this in Section 9.

5.1 Strict Dominance.

Definition 2. σ̂i strictly dominates σi iff, for any σ−i ∈ ∆(S−i),

    ui(σ̂i, σ−i) > ui(σi, σ−i).

σi is strictly dominated iff there exists a σ̂i that strictly dominates σi. σ̂i is strictly dominant iff it strictly dominates every σi ≠ σ̂i.

The definition of strict dominance requires that σ̂i have a higher expected payoff than σi for all opposing σ−i. The following result establishes that it is sufficient to restrict attention to opposing profiles that are pure. This enormously simplifies the task of verifying whether some σ̂i dominates some σi.

Theorem 5. σ̂i strictly dominates σi iff, for all s−i ∈ S−i,

    ui(σ̂i, s−i) > ui(σi, s−i).

Proof. ⇒. Immediate. ⇐. Suppose that ui(σ̂i, s−i) > ui(σi, s−i) for every s−i ∈ S−i. Consider any σ−i. Then, multiplying by the σ−i(s−i) and adding across the s−i,

    Σ_{s−i∈S−i} σ−i(s−i)ui(σ̂i, s−i) > Σ_{s−i∈S−i} σ−i(s−i)ui(σi, s−i).

Hence ui(σ̂i, σ−i) > ui(σi, σ−i), as was to be shown.

The following result is trivial but still worth recording.

Theorem 6. A strategy is strictly dominant iff it is a strict best response to every opposing strategy profile. If a strictly dominant strategy exists then it is unique and it is pure.

Proof. Immediate from the definition of strict dominance and from Theorem 4.

Example 14. Consider the Prisoner's Dilemma of Example 3. For either player, C is strictly dominated by F, which is strictly dominant.

Example 15. Consider the game in Figure 9. In this example, B is strictly dominated by M. No strategy is strictly dominant. In particular, M is not strictly dominant: indeed, it is not a best response to L.

        L   R
    T   4   0
    M   3   3
    B   0   2

    Figure 9: Payoffs are for player 1 only.

Example 16. Consider the game in Figure 10. In this example, B is not strictly dominated by any pure strategy. But it is strictly dominated by some mixed strategies. In particular, B is strictly dominated by the strategy that randomizes 50:50 between T and M. This mixed strategy earns 5 in expectation against either L or R, whereas B only earns 4.

        L   R
    T  10   0
    M   0  10
    B   4   4

    Figure 10: Payoffs are for player 1 only.

Example 17. Now consider the game in Figure 11. As in the game of Figure 10, B is not a best response to any pure strategy. But, despite this, B is not strictly dominated. It is true that T does better than B against L, and it is true that M does better than B against R. But for B to be strictly dominated there would have to be a single strategy for player 1 that does better than B against both L and R, and no such strategy exists, as I discuss further in Example 19 in Section 5.2.

        L   R
    T  10   0
    M   0  10
    B   6   6

    Figure 11: Payoffs are for player 1 only.

Remark 4. A common error is to confuse strict dominance with strict best response. The statement "si is a strict best response" means that there is some σ−i such that si is the strict best response to σ−i. In contrast, the statement "si is strictly dominant" means (Theorem 6) that si is the strict best response to every σ−i. It is rare for a player to have a strictly dominant strategy. Even if si is the strict best response to some σ−i, it may not be a best response, let alone the strict best response, to other σ−i.

5.2 Never a Best Response.

Definition 3. A strategy σi is never a best response iff there does not exist any σ−i such that σi ∈ BRi(σ−i).

The examples above suggest that there may be a close connection between strict dominance and never a best response. The following theorem establishes that, indeed, a strategy is never a best response if and only if it is strictly dominated.

Theorem 7. σi is never a best response iff it is strictly dominated.

Proof. ⇐. The "if" direction is trivial. ⇒. The "only if" direction is difficult because one must find a dominating strategy. Suppose that σi is never a best response. I must find a σ̂i that strictly dominates σi. That is, I must find σ̂i such that, for all σ−i,

    ui(σ̂i, σ−i) − ui(σi, σ−i) > 0.

Let K = |Si| be the number of pure strategies, and represent σ̂i as a vector of probabilities p̂ ∈ R^K_+, with Σ_{k=1}^K p̂k = 1.⁵ Since preferences have the expected utility form,

    ui(σ̂i, σ−i) − ui(σi, σ−i) = Σ_{k=1}^K p̂k [ui(s^k_i, σ−i) − ui(σi, σ−i)].

⁵ R^K_+ = {x ∈ R^K : xk ≥ 0 ∀ k}; R^K_− = {x ∈ R^K : xk ≤ 0 ∀ k}.

For each σ−i, define v^{σ−i} ∈ R^K to be the vector whose k-th component is ui(s^k_i, σ−i) − ui(σi, σ−i). The task, then, is to find a probability vector p̂ such that, for all v^{σ−i}, p̂ · v^{σ−i} > 0.

Let V ⊂ R^K be the set of v^{σ−i}. It is easy to verify that V is convex and compact. Since σi is never a best response, Theorem 2 implies that V ∩ R^K_− = ∅. By the Separating Hyperplane Theorem, there exists a vector q ∈ R^K and an r ∈ R such that q · v > r for v ∈ V and q · x < r for x ∈ R^K_−. Since 0 ∈ R^K_−, it follows that q · v > 0 for v ∈ V and q · x ≤ 0 for all x ∈ R^K_−. Moreover, since each of the negative unit vectors, vectors of the form (0, . . . , 0, −1, 0, . . . , 0), is in R^K_−, it follows that −qk < 0 for all k, or qk > 0 for all k: q is strictly positive. In fact, q need not be a probability vector, since it need not sum to one. But this is easily fixed. Let

    p̂k = qk / Σ_{k=1}^K qk.

p̂ is a probability vector and by construction p̂ · v^{σ−i} > 0 for all σ−i. The result follows.

Example 18. Consider again the game in Figure 10. I have already shown, in Example 16, that B is strictly dominated. I now verify directly that it is never a best response. Suppose that player 1 thinks that player 2 plays L with probability q. Then the expected payoff from T is 10q, the expected payoff from M is 10 − 10q, and the expected payoff from B is 4. M is a best response (and is strictly better than B) for q ≤ 1/2 and T is a best response (and is strictly better than B) for q ≥ 1/2. Thus B is never a best response.

Example 19. Consider again the game in Figure 11. B is not a best response to any pure strategy, but B is the strict best response to the mixed strategy in which player 2 randomizes 50:50 between L and R.
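The separating hyperplane argument in the proof of Theorem 7 has a computational counterpart: testing whether a strategy is strictly dominated by some mixture can be cast as a small linear program. This LP reformulation is standard, but it is my addition, not part of the notes. The sketch below maximizes the worst-case dominance margin ε; the strategy is strictly dominated iff the optimal ε is positive. Applied to Figure 10 it recovers the 50:50 mixture of T and M from Example 16.

```python
import numpy as np
from scipy.optimize import linprog

def strictly_dominated(U, row):
    """U[k, j]: payoff of pure strategy k against opposing pure strategy j.
    Returns (dominated?, dominating mixture) for pure strategy `row`."""
    K, J = U.shape
    c = np.zeros(K + 1)
    c[-1] = -1.0                                # maximize eps
    A_ub = np.hstack([-U.T, np.ones((J, 1))])   # eps - sum_k p_k U[k,j] <= -U[row,j]
    b_ub = -U[row]
    A_eq = np.ones((1, K + 1))
    A_eq[0, -1] = 0.0                           # sum_k p_k = 1
    res = linprog(c, A_ub, b_ub, A_eq, [1.0],
                  bounds=[(0, 1)] * K + [(None, None)])
    return res.x[-1] > 1e-9, res.x[:K]

U = np.array([[10, 0], [0, 10], [4, 4]])        # Figure 10: rows T, M, B
print(strictly_dominated(U, 2))                 # (True, [0.5, 0.5, 0.0])
```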

Theorem 7 implies that whether one works with strict dominance or with never a best response is a matter of convenience. Sometimes it is easier to check for one, sometimes it is easier to check for the other. For strict dominance, it suffices (thanks to Theorem 5) to limit attention to pure opposing strategies, but one may have to use a dominating strategy that is mixed (see Example 16). For never a best response, it suffices (thanks to Theorem 2) to check whether a strategy is better than any pure strategy, but one may have to consider opposing distributions that are not pure (see Example 19).

Remark 5. Theorem 7 requires that one allow the σ−i to be correlated. If N ≥ 3 then one can construct examples in which a strategy is not a best response to any independent σ−i and yet the strategy is not strictly dominated. In such examples, the strategy is a best response to a σ−i that exhibits correlation. I discuss this issue again in Remark 6.

5.3 Rationalizability.

For the moment, take it as self evident that a strategy that is never a best response or, equivalently (Theorem 7), strictly dominated will not be played. Given this, also take it as self evident that if a pure strategy sj, for a player j ≠ i, is never a best response, then a "reasonable" belief σ−i about i's opponents puts probability zero on any s−i containing sj. With this sort of argument in mind, define recursively sequences S^0, S^1, S^2, . . . , as follows.

• S^0 = S.
• S^1_i is the subset of S^0_i consisting of strategies that are not strictly dominated in the original game. S^1 = ∏_{i=1}^N S^1_i.
• S^2_i is the subset of S^1_i consisting of strategies that are not strictly dominated in the game with strategy sets S^1_1, . . . , S^1_N. S^2 = ∏_{i=1}^N S^2_i.
• And so on.

By construction, S^0 ⊃ S^1 ⊃ S^2 ⊃ S^3 . . . . Note that S^{t+1} ⊂ S^t does not rule out S^{t+1} = S^t. Define

    S^R_i = ∩_{t=0}^∞ S^t_i

and let S^R = ∏_i S^R_i.

(Technical.) Since I am working with finite games, the set inclusions are set equalities after some point: there will be a T such that for all t > T, S^t = S^T. Hence S^R = S^T for some T. (This is not necessarily true for infinite games, as Example 23 will illustrate.)
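For finite games, the S^t recursion just defined is directly computable. The following sketch iterates deletion but, for brevity, checks domination only by pure strategies; the definition above also allows mixed dominators (recall Example 16), so this is a conservative illustration rather than a full implementation. On the Prisoner's Dilemma it reaches S^R = {F} × {F} in one round.

```python
from itertools import product

def iterated_strict_deletion(payoffs, strategies):
    """payoffs[i]: dict mapping a full strategy profile (tuple) to i's payoff."""
    S = [list(s) for s in strategies]
    changed = True
    while changed:
        changed = False
        for i in range(len(S)):
            opp_profiles = list(product(*(S[j] for j in range(len(S)) if j != i)))

            def u(si, opp):
                profile = list(opp)
                profile.insert(i, si)          # rebuild the full profile
                return payoffs[i][tuple(profile)]

            kept = [si for si in S[i]
                    if not any(all(u(ti, o) > u(si, o) for o in opp_profiles)
                               for ti in S[i] if ti != si)]
            if len(kept) < len(S[i]):
                S[i], changed = kept, True
    return S

u = {('C', 'C'): (4, 4), ('C', 'F'): (0, 6),
     ('F', 'C'): (6, 0), ('F', 'F'): (1, 1)}   # Example 3
payoffs = [{s: v[0] for s, v in u.items()},
           {s: v[1] for s, v in u.items()}]
print(iterated_strict_deletion(payoffs, [['C', 'F'], ['C', 'F']]))  # [['F'], ['F']]
```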

Definition 4. s ∈ S is rationalizable iff s ∈ S^R. si ∈ Si is rationalizable iff si ∈ S^R_i. σ ∈ ∆(S) is rationalizable iff supp(σ) ⊂ S^R.

The next theorem records that, in finite games, rationalizable strategies exist.

Theorem 8. S^R ≠ ∅.

Proof. This follows from Theorem 1 and the fact that, since the game is finite, S^R = S^T for some T.

An alternate way to characterize S^R is as follows. Say that Ŝ = ∏ Ŝi is a best response set iff, for any i, any si ∈ Ŝi can be justified as a best response to some belief over the strategies in Ŝ−i. That is, for any si ∈ Ŝi, there is a σ−i with supp(σ−i) ⊂ Ŝ−i (i.e., σ−i(s−i) = 0 for any s−i ∉ Ŝ−i) such that si ∈ BRi(σ−i).

Theorem 9. S^R is the largest best response set.

Proof. It is easy to verify that the union of all best response sets is a best response set; I refer to this as the largest best response set. It is easy to show by induction on the S^t that any best response set is a subset of S^R. It remains to show that S^R is itself a best response set. Recall that, since the game is finite, S^R = S^T for some T. Since S^{T+1} = S^T, S^T is a best response set. Hence S^R is a best response set.

Rationalizability was originally motivated primarily by introspective arguments: arguments of the form, "I should play this, because I think that he will play that, because I think that he thinks . . . ."⁶ But this sort of sophisticated reasoning is not necessary to motivate rationalizability. Many learning models, even learning models based on extremely naive behavior, predict that play eventually becomes rationalizable. The prediction that play eventually becomes rationalizable seems also to be consistent with the empirical evidence.

Example 20. Consider the game in Figure 12. It is easy to verify that R is strictly dominated by L and that no other strategy is strictly dominated. It follows that S^1 = {U, D} × {L}. It is now easy to verify that, having deleted R, D is never a best response. Hence S^R = S^2 = {U} × {L}.

        L      R
    U   4, 4   0, 1
    D   1, 8   7, 2

    Figure 12: An example to illustrate rationalizability.

⁶ The point was raised explicitly in Bernheim (1984), one of the original papers on this topic.

Example 21. Recall the Prisoner's Dilemma game of Example 3 and Example 14. Since F is strictly dominant for either player, S^R = {F} × {F}. This is unfortunate for the players because the payoff from (F, F) is only (1, 1) while the payoff from (C, C) is (4, 4).

Example 22. Recall the game Matching Pennies, introduced in Example 2. In this game no strategy is strictly dominated: S^R = S.

Example 23. Recall the Cournot duopoly of Example 4, which is an infinite game. Assume that the cost function for either firm is Ci(qi) = cqi, where c > 0 and small (less than 4). The best responses were computed in Example 12. These best responses imply that any qi greater than 6 − c/2 is never a best response and hence is strictly dominated. One round of strict dominance deletion then yields S^1 = [0, 6 − c/2] × [0, 6 − c/2]. Once one has deleted outputs above 6 − c/2, no output below 3 − c/4 is ever a best response. Thus S^2 = [3 − c/4, 6 − c/2] × [3 − c/4, 6 − c/2]. And so on. In constructing the S^t, I am eliminating strategies that are strictly dominated at that stage, which is equivalent (Theorem 7) to eliminating strategies that are never a best response at that stage. Because the strategy sets are infinite, the set inclusions S^{t+1} ⊂ S^t may be strict for all t, and that is what happens here: each S^{t+1} is a proper subset of S^t, and there is no T such that S^T = S^{T+1}. One can show that S^R = {4 − c/3} × {4 − c/3}.

Example 24. Again recall the Cournot game of Example 4, but now suppose that there are three firms instead of two, again with cost functions Ci(qi) = cqi. As before, any qi > 6 − c/2 is strictly dominated. But having deleted these strategies, no further deletion is possible: with two opponents, aggregate opposing output can be as large as 2(6 − c/2) = 12 − c, so every output in [0, 6 − c/2] remains a best response to some belief. Thus S^R = [0, 6 − c/2] × [0, 6 − c/2] × [0, 6 − c/2], and with three firms the set of rationalizable strategy profiles is almost as large as the original set of strategy profiles.
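A numerical companion to Example 23: the sketch below iterates the interval deletion for the duopoly with c = 2, tracking S^t = [lo, hi] for each firm. The intervals shrink toward 4 − c/3 ≈ 3.33 but never reach it at any finite stage, illustrating why S^{t+1} is a proper subset of S^t for all t.

```python
c = 2.0
lo, hi = 0.0, 12.0              # S^0: outputs above 12 are never profitable
for t in range(1, 9):
    # best responses to beliefs concentrated on [lo, hi]:
    lo, hi = max((12.0 - c - hi) / 2, 0.0), (12.0 - c - lo) / 2
    print(t, round(lo, 4), round(hi, 4))
# t=1: [0.0, 5.0]   = [0, 6 - c/2]
# t=2: [2.5, 5.0]   = [3 - c/4, 6 - c/2]
# ... the bounds converge to 4 - c/3 = 3.3333 but never reach it
```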

(Technical.) The analysis extends to infinite games under some circumstances. Suppose that each Si is a compact (closed and bounded) subset of R^{ki}, for some positive integer ki, and suppose that the ui are continuous. Since the strategy sets are infinite, there may be no T such that S^T = S^{T+1}. But one can show that the S^t are compact, which implies that S^R is not empty. Likewise, Theorem 9 extends to this environment: S^R is the largest best response set.

Remark 6. The definition of rationalizability that I have given corresponds to what is sometimes called correlated rationalizability. As discussed in Remark 5, the equivalence between "strictly dominated" and "never a best response" requires considering σ−i that are correlated. There may be strategies at some stages that are best responses to correlated σ−i but not to any independent σ−i. If I were to delete these as well, I would end up with a set of strategy profiles that satisfies independent rationalizability. Independent rationalizability is the version of rationalizability originally proposed by Bernheim (1984) and Pearce (1984). The set of independent rationalizable profiles is smaller than the set of correlated rationalizable strategy profiles, often strictly so. Note that if N = 2 then correlated and independent rationalizability are equivalent, since each player has only a single opponent. If N ≥ 3 then I find correlated rationalizability somewhat more persuasive than independent rationalizability. Correlated rationalizability also yields a tidier theory.

5.4 Weak dominance.

Game theory is frequently interested in weak, rather than strict, dominance.

Definition 5. σ̂i weakly dominates σi iff, for any σ−i ∈ ∆(S−i),

    ui(σ̂i, σ−i) ≥ ui(σi, σ−i),

with strict inequality for at least one σ−i. σi is weakly dominated iff there exists a σ̂i that weakly dominates σi.

There is some ambiguity in the literature as to how to define what it means for a strategy to be weakly dominant. The obvious definition is that σi is weakly dominant iff it weakly dominates every other strategy. This is a bit too strong, because it implies that a strategy cannot be weakly dominant if it has a twin, identical in every way except for name. I adopt a slightly weaker definition: a strategy is weakly dominant if it always does at least as well as any other strategy.

Definition 6. σ̂i is weakly dominant iff, for any σi and for any σ−i ∈ ∆(S−i),

    ui(σ̂i, σ−i) ≥ ui(σi, σ−i).

Under this weaker definition, however, there may be more than one weakly dominant strategy.

Much as I defined S^R via iterated deletion of strictly dominated strategies, I can define S^W via the iterated deletion of weakly dominated strategies. It should be evident that S^W ⊂ S^R. There is a well established tradition within game theory of rejecting any prediction σ for the game that puts positive probability on profiles outside of S^W. But the argument for focusing on S^W is not as secure as the argument for focusing on S^R; I discuss some of the reasons for holding this view in Section 9.

One problem with S^W is that it is not always clear how it should be defined. I have defined S^W by deleting all weakly dominated strategies for all players at each stage. In constructing S^R, it turns out not to matter whether at stage t I delete all strategies for all players that are strictly dominated at that stage, or just some of the strictly dominated strategies for some of the players. But with S^W, the order of deletion may matter: a profile s may be in S^W because it survives under one deletion order even though it fails to survive under another. One common solution is to define S^W to be, instead, the set of strategies that survive the iterated deletion of weakly dominated strategies for some deletion order, but this is somewhat arbitrary.

23 . rather than S W R . introspection. Restriction to S W . none of the usual game theory justiﬁcations (e. One can make a fairly strong (but not ironclad) case for predicting that weakly dominated strategies will be seen at most only infrequently but the exclusion of strategies that fail to survive the iterated deletion of weakly dominated strategies can be problematic.A more severe problem is that. as I discuss in Section 9. rather than S W R . as a general principle. I take the position that it is prudent to proceed on a case by case basis. these are the strategies that survive one round of weak dominance deletion followed by the iterated deletion of strictly dominated strategies. An alternative with stronger theoretical support is S W R . the empirical evidence) provide general support for focusing on S W . Rather than elevate restriction to S W . is often compelling. but not always. S W ⊂ S W R ⊂ S R .g. learning.

6 Nash Equilibrium.

6.1 Definition and interpretation.

Definition 7. σ* ∈ ∆(S) is a Nash equilibrium distribution iff σ* is independent and, for all i,

    σ*_i ∈ BRi(σ*_−i).

If σ* is a Nash equilibrium distribution then (σ*_1, . . . , σ*_N) is a Nash equilibrium. If σ*(s*) = 1 for some pure strategy profile s* then s* is a pure strategy Nash equilibrium. Otherwise, (σ*_1, . . . , σ*_N) is a mixed strategy Nash equilibrium. If supp(σ*) = S then (σ*_1, . . . , σ*_N) is a fully mixed (or completely mixed) Nash equilibrium. Otherwise, (σ*_1, . . . , σ*_N) is a partly mixed Nash equilibrium. I sometimes refer to a Nash equilibrium as simply an equilibrium. The distinction between a Nash equilibrium and a Nash equilibrium distribution may become clearer when I discuss examples in Section 6.3.

In a Nash equilibrium, each player i chooses a strategy σ*_i that is a best response to a belief σ*_−i that is correct. Note that this does not require that player i understand the game or actually have beliefs. All that is required is that player i acts as if he is best responding to beliefs that are in turn correct.

Two interpretations of a Nash equilibrium are standard. The most straightforward is that a Nash equilibrium describes actual intended behavior in the game. In the alternative interpretation, the focus is not on intended behavior but rather on some statistic characterizing play in some larger game. For example, one could imagine an N-player game being played by a large number of groups of N people. One can compute the frequency with which each strategy profile of the game occurs across the groups, and one can ask whether the resulting frequency distribution is a Nash equilibrium distribution. If each group plays the same Nash equilibrium then, with high probability (certainty in the case of a pure equilibrium), the observed frequency distribution approximately equals the associated Nash equilibrium distribution. The first interpretation is compatible with the second. But the converse claim is false: it is possible for the empirical frequency across groups to be approximately that of a Nash equilibrium even if no group plays, even approximately, a Nash equilibrium.

For either interpretation, the obvious question is why would one think that a Nash equilibrium would arise. Briefly, here are four basic stories that are often told to justify Nash equilibrium.

The most straightforward justification for Nash equilibrium is preplay agreement: before the game begins, players negotiate an agreement on how to play. A necessary condition for this agreement to make sense is that no one has incentive to deviate from the agreement so long as everyone else adheres to it. That is, a necessary condition for the agreement to make sense is that it be a Nash equilibrium.

I am glossing over some subtleties here, but the most important difficulty with preplay agreement is that, in many strategic settings, players simply do not negotiate prior to the start of the game. So some other justification is needed for Nash equilibrium.

An alternative justification is the claim that if players are rational then, even if they have no prior experience playing this particular game with this particular group of opponents, they will nevertheless play a Nash equilibrium. This justification for Nash equilibrium is problematic. As I discuss in Section 9.4, arguments that rationality implies Nash equilibrium invariably invoke strong auxiliary assumptions. There is controversy as to whether these auxiliary assumptions ought to be incorporated into the definition of rationality. There is broad, but not universal, consensus among game theorists that the answer is "no." There is, of course, quite apart from this, the issue of whether real players are rational.

The final possibility is that Nash equilibrium arises over time in a dynamic environment in which players learn and adjust. At this writing, there are many, many, different learning theories. In some, players are modeled as highly sophisticated, in others as extremely naive. Broadly, most of these learning theories suggest the following general conclusions. The learning process may not converge in any reasonable sense. It may, instead, wander around endlessly. For most learning theories, convergence obtains in some games but not in others. There are a few exceptions, learning theories that guarantee some form of convergence in all games. All of the theories that exhibit this sort of global convergence model players as comparatively unsophisticated; all learning models that yield global convergence results with sophisticated players make strong auxiliary assumptions. If the learning process converges, then it converges to an equilibrium of some sort. What sort of equilibrium varies from setting to setting: it may be a Nash equilibrium, a correlated equilibrium, or something called a self-confirming equilibrium. The definition of "converges" also varies from setting to setting.

For more on the interpretation of and justifications for Nash equilibrium and other solution concepts, see Section 9.

6.2 Existence.

The following existence theorem first appeared in Nash (1950), along with the definition of what we now call a Nash equilibrium. Nash (1950) called it an "equilibrium point."

Theorem 10. Every game has at least one Nash equilibrium.

Proof. Given the individual best response correspondences BRi, define the correspondence BR by

    BR(σ1, . . . , σN) = (BR1(σ−1), . . . , BRN(σ−N)).

BR is a correspondence from ∏_i ∆(Si) to itself. ∏_i ∆(Si) is compact and convex, and one can show that BR is convex-valued and has a closed graph. Therefore, by the Kakutani Fixed Point Theorem, there is a (σ*_1, . . . , σ*_N) such that (σ*_1, . . . , σ*_N) ∈ BR(σ*_1, . . . , σ*_N). This (σ*_1, . . . , σ*_N) is a Nash equilibrium.

Remark 7. Theorem 10 gives existence of a Nash equilibrium for any finite game. Without any further assumptions, Nash equilibrium can fail to exist in non-finite games. The "Name Your Prize" game in Remark 2 is perhaps the simplest possible example of a game without an equilibrium. Sion and Wolfe (1957) provides a considerably more subtle example of non-existence: in the Sion and Wolfe (1957) example, strategy sets are compact (in contrast to the "Name Your Prize" game) but the payoff functions are discontinuous. The bottom line is that existence theorems for non-finite games require auxiliary assumptions. Fudenberg and Tirole (1991) provides a survey of classical results. The state of the art is represented by Simon and Zame (1990), Baye, Tian, and Zhou (1993), and Reny (1999).

The following trivial observation is often useful.

Theorem 11. If σ* is a Nash equilibrium distribution then it is rationalizable.

Proof. This follows from Theorem 9 and the fact that if σ* is a Nash equilibrium distribution then supp(σ*) is a best response set.

Theorem 11 implies that if one is searching for Nash equilibria then one can abridge each strategy set Si to just S^R_i. If it is easy to compute S^R and if S^R is small relative to S then this can be a useful way to simplify the problem of computing Nash equilibria.

Remark 8. The converse of Theorem 11 is false: just because a strategy is rationalizable doesn't mean that the strategy gets positive probability in a Nash equilibrium; see Example 29.

Remark 9. Although a Nash equilibrium cannot (by Theorem 11) put positive probability on strategies that are strictly dominated, it can put positive probability on strategies that are weakly dominated. If one views such strategies as implausible then one views such equilibria as implausible. I give an example in Example 45. Deciding which, if any, Nash equilibria are plausible is a recurring theme in game theory.

6.3 Examples.

Example 25. Consider the Prisoner's Dilemma of Example 3. From Example 21, S^R = {F} × {F}. Thus, by Theorem 11, the unique Nash equilibrium is the pure strategy equilibrium (F, F). The associated Nash equilibrium distribution is given in Figure 13.

        C   F
    C   0   0
    F   0   1

    Figure 13: The Nash equilibrium distribution for the Prisoner's Dilemma.

A common error is to write the equilibrium (F, F) in Example 25 as (1, 1). This confuses the equilibrium, which is a strategy profile, with the equilibrium payoff profile. In this particular game, the mistake appears to be harmless. But in strategic form games derived from complicated extensive form games, discussed in Section 8, this sort of mistake can lead to serious error. So I take a hard line and insist that (1, 1) is not an equilibrium.

Example 26. Consider Matching Pennies, introduced in Example 2. The unique Nash equilibrium is fully mixed and has each player randomize 50:50. The associated Nash equilibrium distribution is represented in Figure 14.

        H     T
    H   1/4   1/4
    T   1/4   1/4

    Figure 14: The Nash equilibrium distribution for Matching Pennies.

It is straightforward to verify that the Nash equilibrium in Example 26 is an equilibrium, but how does one find it in the first place? Pure strategy equilibria are usually easy to compute by simply making a record of which si are best responses to which s−i. Mixed strategy equilibria, on the other hand, are a problem, and there is no known algorithm that is "good" for finding even one mixed equilibrium, let alone all mixed equilibria, in large games. I return to this issue in Section 6.4. For the case of 2 × 2 games, however, the analysis is fairly easy. One proceeds as follows. Suppose that the game is as in Figure 15.

            L                        R
    T   u1(T, L), u2(T, L)    u1(T, R), u2(T, R)
    B   u1(B, L), u2(B, L)    u1(B, R), u2(B, R)

    Figure 15: A general 2 × 2 game.

If there is a mixed equilibrium in which player 1 randomizes, then it must be that player 1 is indifferent between T and B (see Theorem 3). So, the question is, what σ2 makes player 1 indifferent? Let q be the probability that player 2 plays L: q = σ2(L). Then the expected payoff of T is

    qu1(T, L) + (1 − q)u1(T, R),

and the expected payoff to B is

    qu1(B, L) + (1 − q)u1(B, R).

If player 1 is indifferent then these are equal,

    qu1(T, L) + (1 − q)u1(T, R) = qu1(B, L) + (1 − q)u1(B, R).   (2)

Let q* be the q for which this occurs. A bit of manipulation yields

    q* = [u1(B, R) − u1(T, R)] / [u1(B, R) − u1(T, R) + u1(T, L) − u1(B, L)].

Similarly, player 2 randomizes only if she is indifferent. Let p be the probability that player 1 plays T. Then one can compute that player 2 is indifferent if p = p*, where

    p* = [u2(B, R) − u2(B, L)] / [u2(B, R) − u2(B, L) + u2(T, L) − u2(T, R)].

If there is only one fully mixed Nash equilibrium, then in this equilibrium player 1 randomizes by choosing p = p* and player 2 randomizes by choosing q = q*. If p* or q* are undefined (because the denominator is zero), or if they lie outside the relevant range (i.e., outside of (0, 1)), then the game requires more careful analysis. There may be no mixed strategy equilibria. Or there may be an infinite number of partly mixed equilibria, as in Example 45 in Section 8.8 below. Or, if the game is trivial, with ui(T, L) = ui(T, R) = ui(B, L) = ui(B, R), then the denominator may be zero even though every mixed strategy profile is a Nash equilibrium.

Remark 10. A source of confusion is that player 1's probability p* depends not on player 1's payoffs but on player 2's payoffs. Because of this, a common error is to compute p* correctly but then ascribe it incorrectly to player 2. In the actual equilibrium, player 1 plays T with probability p* and player 2 plays L with probability q*; the common error is to write that, in the mixed Nash equilibrium, player 1 plays T with probability q* and player 2 plays L with probability p*.

Remark 11. A number of textbooks instruct students to find the mixed strategy equilibrium by the following calculus-like procedure. Write out player 1's expected payoff as a function of p and q to get

    pqu1(T, L) + p(1 − q)u1(T, R) + (1 − p)qu1(B, L) + (1 − p)(1 − q)u1(B, R),

then differentiate this expression with respect to p and set this derivative equal to zero, yielding

    qu1(T, L) + (1 − q)u1(T, R) − qu1(B, L) − (1 − q)u1(B, R) = 0,

which is equivalent to equation (2). In this case, the calculus-like approach works. But I recommend that you do not use the calculus-like approach. My experience is that the calculus-like approach often leads to error. That there is something screwy with this approach should be evident from the fact that you found q* by taking the derivative with respect to p. You should find this at least somewhat troubling. Whose optimization problem are you solving, player 1's or player 2's? What is going on is the following. In taking a derivative and setting it equal to zero you are assuming that the solution is interior. But this problem is linear in p (expected utilities are always linear in the probabilities) with the constraint p ∈ [0, 1], so the solution is either p = 0, p = 1, or every p ∈ [0, 1]. The assumption that the solution is interior is therefore equivalent to an assumption that every p ∈ [0, 1] is a solution. The assumption that every p ∈ [0, 1] is a solution implies, in particular, that player 1 is indifferent between T and B. But if that is your assumption, you can skip the rigmarole of writing down the expected payoff and taking the derivative and just write down the indifference expression, equation (2), directly. Don't kid yourself that the calculus-like approach is more correct. It is just expressing the same assumption in a different, less transparent form.

Example 27. Consider the game in Figure 16. For player 1, this is the same game as that of Figure 8. This game has three Nash equilibria. Two of the Nash equilibria are pure: (T, L) and (B, R). There is also a mixed strategy Nash equilibrium in which player 1 plays T with probability 2/3 and player 2 plays L with probability 7/10. The Nash equilibrium distribution for the mixed strategy equilibrium is represented in Figure 17.

        L      R
    T   5, 4   0, 1
    B   2, 2   7, 8

    Figure 16: A game to illustrate Nash equilibrium.

        L       R
    T   14/30   6/30
    B    7/30   3/30

    Figure 17: The Nash equilibrium distribution for the mixed strategy Nash equilibrium of the game in Figure 16.

Example 28. Consider the Battle of the Sexes game of Example 1. This game has three Nash equilibria. There are two pure Nash equilibria, (a, a) and (b, b), and one mixed Nash equilibrium in which player 1 plays a with probability 4/9 and player 2 plays a with probability 5/9. The associated equilibrium distribution is given in Figure 18.

        a       b
    a   20/81   16/81
    b   25/81   20/81

    Figure 18: The strategy distribution for the mixed strategy equilibrium of Battle of the Sexes.

the set of rationalizable strategy proﬁles is large: S R = [0. 2 2. 6 − c/2] × [0. As discussed in Example 23. The converse is false: in some games there are rationalizable strategies that never get positive probability in any equilibrium. In this game. 2 0. 3 2. a b a 20/81 16/81 b 25/81 20/81 Figure 18: The strategy distribution for mixed strategy equilibrium of Battle of the Sexes. 6 − c/2] × [0. Consider the Cournot duopoly of Example 4. Now consider the three ﬁrm Cournot game of Example 24. Theorem 11 states that if a strategy gets positive probability in a Nash equilibrium then it must be rationalizable. Another way to ﬁnd this equilibrium is to restrict attention to qi < 12 − c and use the best response functions given in Example 12. At such an equilibrium. ∗ ∗ ∗ ∗ q1 = BR1 (q2 ) and q2 = BR1 (q1 ). Example 29. 0 0.L R T 14/30 6/30 B 7/30 3/30 Figure 17: The Nash equilibrium distribution for the mixed strategy Nash equilibrium of the game in Figure 16. Therefore. 2 2. This gives two linear equations in two unknowns. Hence the Nash equilibrium is q1 = q2 = 4 − c/3. Example 30. 2 3. ∗ = q ∗ = 4 − c/3. 3 3. if ﬁrms have the cost function Ci (qi ) = cqi with c > 0 then S R = ∗ ∗ {4 − c/3} × {4 − c/3}. But the unique Nash equilibrium is (T. Consider the game in Figure 19. Here. L). 0 T M B Figure 19: A game in which some of the rationalizable strategies do not appear in any Nash equilibrium. 4 2. any such equilibrium is pure. Solving yields q1 2 Example 31. If qi < 12 − c for all i then players are never indiﬀerent. 6 − 30 . is rationalizable. S R = S: every strategy L C R 4.

payoﬀs can be perturbed slightly and there will still be 2L − 1 Nash equilibria. 1) along the diagonal and (0. Example 32. So restriction to S R does not help much. it means that the maximum number of Nash equilibria is growing exponentially in the size of the game. it suggests that the problem of ﬁnding even one Nash equilibrium may. Consider a 2 × 2 game. for a large subset of opposing strategies. a best response does not exist. payoﬀs are (1. one can show that there are L pure strategy Nash equilibria. a) and (b. In ﬁnite games. More generally. Note that the issue is not whether algorithms exist for ﬁnding one equilibrium. the number of equilibria is “typically” ﬁnite and odd. is very badly behaved in that. The total number of Nash equilibria is thus 2L − 1. typically have an inﬁnite number of Nash equilibria. For ﬁnite games. however.c/2]. there exist many such algorithms.4 Counting Nash equilibria. This establishes that the general problem of calculating all of the equilibria is computationally intractable. each with L strategies) in which. Example 34. In such games. which are inﬁnite games. The game in Figure 20 has two Nash equilibria. |S| = 4. in the game box representation.2. An additional fact about Nash equilibria is sometimes of technical use. is still merely a conjecture. How many equilibria are typical in ﬁnite games? A lower bound on how large the set of Nash equilibria can possibly be in ﬁnite games is provided by L × L games (two players. if there are N ﬁrms then there is a Nash equilibrium of the ∗ Cournot game (with this particular demand function) with qi = (12 − c)/(N + 1) for all i. First. be computationally intractable. 6. The latter. or even all equilibria. Nevertheless. (a. The sense in which the number of equilibria is “typically” ﬁnite and odd is as follows. There are four pure strategy proﬁles. The problem is that the time taken by these algorithms to reach a solution can grow explosively in the size of the game. But an argument analogous to that just given for the two-ﬁrm Cournot game shows that there is a Nash equilibrium: ∗ ∗ ∗ q1 = q2 = q3 = 3 − c/4. and 31 . repeated games. I ﬁrst present some examples. Example 35. corresponding to play along the diagonal and an additional 2L − (L + 1) fully or partly mixed Nash equilibria. This example is robust. b). The game of Example 27 has three Nash equilibria. Second. which is the competitive price (recall that I have assumed no production costs). The entry deterrence game of Example 45 has an inﬁnite number of Nash equilibria. the Bertrand game does have a Nash equilibrium: both ﬁrms charge 0. As discussed in Example 48. 0) elsewhere. This is extremely bad news. The Bertrand game. Example 33. in general. introduced in Remark 2 in Section 4.

0 Figure 20: A game with two Nash equilibria. I postpone further discussion of this issue until I have discussed extensive form games.a b a 1. see Remark 12 in Section 8. especially games with a non-trivial dynamical structure. 1 0. More generally. to describe the payoﬀ functions. and so far you have found four. and the main reason why I bother to mention this topic in the ﬁrst place. One can show that the subset of RN |S| for which the number of equilibria is not ﬁnite and odd is “thin” in the sense that an arbitrarily small bump to payoﬀs can transform the game into one where the number of equilibria is ﬁnite and odd. however. 0 0. I can thus describe the payoﬀ functions as an element of R8 . not robust.8. in this sense. A practical implication of this. One can show that Example 34 and Example 35 are. The statement that the number of equilibria is “typically” ﬁnite and odd is. the set of all payoﬀ functions for a ﬁnite game form is RN |S| . 0 b 0. The deﬁnition of “typical” sketched above is not appropriate for many games. or eight payoﬀs in all. The set of all payoﬀ functions for a 2 × 2 game form is simply all of R8 . hence. subject to an important qualiﬁcation. I must specify four payoﬀs for each player. then you are probably missing at least one. 32 . is that if you are trying to ﬁnd all the equilibria of the game.

let ΣR denote the set of strategy distributions for which only rationalizable strategies get positive probability. Every ﬁnite game has at least one correlated equilibrium distribution. its geometric structure is not as tidy as that of either ΣR or ΣCorr . σN ). for all i. in turn. Given a strategy distribution σ. if σ is not independent then it does not make sense to refer to the proﬁle of marginals. not necessarily independent. . Theorem 13.2 and Section 3. σ ∈ ∆(S) is a correlated equilibrium distribution iﬀ. σi (si ) The following equilibrium concept was introduced in Aumann (1974). lies somewhere inside of ΣCorr but. let ΣNash denote the set of Nash equilibrium distributions and let ΣCorr denote the set of correlated equilibrium distributions. Thus. Deﬁnition 8. if σ and σ are both Nash equilibrium distributions then an ˆ element of ∆(ΣNash ) might put probability 1/2 on σ and probability 1/2 on σ . The proof is the same as the proof of Theorem 11.7 Correlated Equilibrium. Theorem 10 implies the following trivial corollary. ΣR constitutes either all of ∆(S) (if all strategies are rationalizable) or some lower dimensional face of ∆(S). vice versa. A distribution over Nash equilibrium distributions is an element of ∆(ΣNash ). Therefore. . Recall from Section 3. If σ is a correlated equilibrium distribution then σ is rationalizable. as a correlated equilibrium. Since a Nash equilibrium 33 . is deﬁned by σ−i|si (s−i ) = σ(si . in general. it is easy to verify that any Nash equilibrium distribution is a correlated equilibrium distribution. Theorem 12. . if σi (si ) > 0 then si ∈ BRi (σ−i|si ). ΣCorr is a convex set called a polytope (think of a cut diamond) that lies in ΣR . Proof. ˆ Example 36 will illustrate this for Battle of the Sexes. Given a game. in general. It is also possible to prove Theorem 12 by elementary means (without resorting to ﬁxed point theorems). But it is common to abuse terminology and refer to the correlated equilibrium distribution σ as a correlated equilibrium. the marginal distribution over S−i conditional on si . let σi be the marginal distribution for player i and suppose that σi (si ) > 0. for example. Since the deﬁnition of correlated equilibrium distribution allows σ to be independent.3 that if σ is not independent then the marginals σi do not have an interpretation as mixed strategies. (σ1 . . Therefore. ΣNash . Then σ−i|si . s−i ) . but not.

I abuse notation and view ∆(ΣNash ) as simply a lottery over S. All mixtures over the corresponding Nash equilibrium distributions are correlated equilibrium distributions.distribution is a lottery over S. each player gets the same expected payoﬀ) but each player gets only 40/9 ≈ 4. Example 37. Thus. The game was J D J 6. I have to choose some number to represent death. This distribution yields an expected payoﬀ of 5 Shouldn’t the payoﬀ from death be −∞? As I discuss in the notes on decision theory. players can either jump (J) or drive (D). 0 Figure 21: A game of Chicken. the distribution shown in Figure 5 (Example 6) is a correlated equilibrium distribution. In contrast. All of the Nash equilibrium distributions identiﬁed for this game in Example 28 are likewise correlated equilibrium distributions. Reducing this compound lottery. Example 36. The preferred outcome is to have the other guy jump. any probability distribution over Nash equilibrium distributions is a correlated equilibrium distribution. In addition. J) and the mixed strategy equilibrium in which each player jumps with probability 2/3. but there are also other correlated equilibrium distributions. players randomize 50/50 between the two pure strategy equilibria. 2 0. giving you a payoﬀ of 7 versus 2 for the chicken who jumps. (a. it is easy to verify that ∆(ΣNash ) ⊂ ΣCorr . a payoﬀ of −∞ for death is not as sensible as it might at ﬁrst appear.e. 7 34 . the expected payoﬀ in the mixed strategy equilibrium is also symmetric (i. the set of correlated equilibria include all mixtures over Nash equilibria. in which troubled 1950s teenagers competed to see who would be the last to jump to safety when driving cars towards a cliﬀ. which is worth zero. (D. not in general. Figure 21 shows an example of a game called Chicken. in particular. With this abuse of notation. 6 2. b). each player gets an expected payoﬀ of 9. inspired by the movie Rebel Without a Cause. the set of correlated equilibria equals the set of all mixtures over Nash equilibria: does ∆(ΣNash ) = ΣCorr ? The answer is “no. an element of ∆(ΣNash ) is a compound lottery over S.7 You can verify that this game has three Nash equilibria: (J. and here I have chosen 0. in the context of expected utility. In this distribution. If neither player jumps then both die.4. D). Under this correlated equilibrium distribution.” I will illustrate this with Example 37. In words. A natural question is whether. a) and (b. In this stylized representation. Consider once again the Battle of the Sexes game of Example 1. in fact. 7 D 7. One of them is shown in Figure 22.

The associated strategy distribution is shown in Figure 14. the set of Nash equilibrium distributions coincides with the set of correlated equilibrium distributions. so that the two together get 10. In contrast. the pure Nash equilibria give only 9 and the mixed Nash equilibria gives only 28/3 ≈ 9. Consider Matching Pennies. Example 38.3. In this game. This is also the unique correlated equilibrium distribution. So any mixture over Nash equilibria likewise yields strictly less than 10. introduced in Example 2.J D J 1/3 1/3 D 1/3 0 Figure 22: A correlated equilibrium distribution for Chicken. The unique Nash equilibrium is for each player to randomize 50/50. 35 . This establishes. among other things. that the distribution in Figure 22 cannot be generated by mixing over Nash equilibria. to each player.

at the decision node labeled x0 . To keep things simple. and chooses one of two actions.1 8. there are two players. The goal is to represent games with a dynamic structure.1 Extensive forms. an industry incumbent (player 2) and a potential entrant (player 1). Perfect information. chess for example. the nodes at which player i must act. For a formal treatment. Other nodes are then reached depending on the actions chosen. And I introduce a new “player” called Nature who can introduce randomness into the game. My discussion. therefore. Let Ai = Aix be the set of all of player i’s possible actions.1. . Xi are the decision nodes for player i. fully informed of the previous actions in the game. There is an initial decision node x0 . . is informal. For each node x ∈ Xi there is a set Aix . I start by discussing games like chess in which each player acts in turn. There is a ﬁnite set X of nodes. typically a decision node for player 1 but possibly a decision node for one of the other players. The general formalism for such games is cumbersome.8 Games in Extensive Form. see Fudenberg and Tirole (1991). 8. . Player 2 In Player 1 x0 Out x2 x1 Fight x3 Acc x4 Figure 23: An entry deterrence extensive form. the actions available to player i at x. Formalizing this turns out to be tedious and so I merely illustrate with an example. which are the points in the game at which either (a) players take actions or (b) play terminates. I then complicate things in two ways. In the game shown in Figure 23. There is a set I of players. Such games are called games of perfect information. X is partitioned into the the disjoint sets X1 . . Xτ . Xτ are the terminal nodes. I allow players to be only partially informed of previous play in the game. Player 1 moves ﬁrst. XN . Osborne and Rubinstein (1994) provides an alternative formalism. In (enter the industry) or Out (don’t 36 . An extensive form of perfect information consists of the following components. the nodes at which the game ends.

x3 . Thus Xi = h∈Hi h and. The terminal nodes have no successors and they are the only nodes with this property. x1 . In the game shown in Figure 23. play starts at x0 and ends at some terminal node. so pictures like that in Figure 23 are convenient for small extensive forms but useless for large or complicated extensive forms. h ∈ Hi if h = h then h ∩ h = ∅.enter the industry). As usual. If player 1 chooses I then play moves to the decision node labeled x1 and it is player 2’s turn.” 37 . The immediate successors of x0 are x1 and x2 . But if player 1 chooses either “top” or “middle” then play moves to one of the two decision nodes.” If he chooses “bottom” then play moves to x3 and player 2 chooses either “in” or “out. If player 2 chooses Fight (launch a price war) then play ends at the terminal node labeled x3 . 8. but he does not observe directly whether player 1’s action was “top” or “middle. The abstract formalism does not require that one be able to draw a picture to have a well deﬁned extensive form. The interpretation is that if this information set is reached then player 2 knows that he is either at decision node x1 or x2 but he does not know which node he is at. x3 is identiﬁed with the play path (x0 . labeled x1 and x2 inside the oval. I use “play path” and “terminal node” interchangeably. the set of i’s decision nodes is partitioned into information sets. One can also identify a play path by the sequence of actions taken. See Figure 24. For example. the predecessors of x4 are x1 and x0 . Just as game boxes are convenient for small games but useless for large games.” don’t launch a price war) then play ends at the terminal node labeled x4 . x3 ). Rather. In general. Every terminal node is thus uniquely identiﬁed with the play path (or path of play) taken to reach that node. passing through each node at most once. One last remark. The play path taken in a chess match is reported in this way. Finally. He may have opinions based on how he thinks player 1 would play. for any h. To capture strategic environments in which players are only partially informed of what has happened in the game to date. with dots. x2 . x1 is the immediate predecessor of x4 . I have marked the decision nodes. Hi denotes the set of information ˆ ˆ ˆ sets for player i. Fight). if player 1 chooses O then play ends at the terminal node labeled x2 . the decision node x0 is the unique node with no predecessors. The oval represents the information set containing x1 and x2 . and x4 .” All he directly observes is that player 1’s action was not “bottom. These assumptions guarantee that play cannot cycle. If player 2 chooses Acc (“Accommodate. I assume that for each player i.” “middle. Thus x3 is identiﬁed by (In. x0 is a predecessor of every other node. The successors of x0 are x1 .” after which the game is over. but not the terminal nodes. A node can have at most one immediate predecessor. Player 1 moves ﬁrst and chooses either “top.2 Information sets. I illustrate with a picture rather than provide a complete formalization.” or “bottom. As a convention.

Path of play is deﬁned for general extensive forms exactly as it is for extensive forms of perfect information. Extensive forms of perfect information are extensive forms in which all information sets are singletons. a strategy for player i speciﬁes an action at every information set.3 Extensive form strategies. 8. the action sets ˆ Aix and Aiˆ are the same. who could then execute the strategy on behalf of the player. Because the strategy speciﬁes an action for every one of i’s information sets. Formally a strategy is a function si : Hi → Ai such that. si (h) ∈ Aih . the information set containing x0 is a singleton.Player 2 top Player 1 x0 bottom middle x2 x1 x4 left center x5 x6 right left x7 center x8 right x9 In x10 x11 x3 Out Player 2 Figure 24: An extensive form with a non-trivial information set. In general. In an extensive form game. let S = 38 . One can think of a strategy as a complete set of instructions that could be handed to an agent. let Si be the set of player i’s strategies. where si (h) is the action chosen at h. the strategy provides instruction for how the agent should act no matter how the game unfolds. I therefore let Aih denote the set of actions available at every node x ∈ h. This is contrary to the intended interpretation of information sets. For any two distinct decision nodes x and x in an information set. If an extensive form is not one of perfect information then it is one of imperfect information. Exactly as for strategic forms. The reason for this is that if Aix and Aiˆ were diﬀerent x x then player i would be able to observe which node he was at merely by observing which action set was available. for any h ∈ Hi .

. the strategy proﬁle s = (s1 . Say that a node x is reachable under s if x is contained in the path of play determined by s.be the set of strategy proﬁles. sN ) determines a path of play (terminal node). or has an agent execute the plan for him. no matter how unlikely. one can model a player as if he were making all his decisions up front even though he is actually deciding dynamically. First. however. especially dynamic players. . Player i may not make up his mind about what to do at information set h ∈ Hi unless play actually reaches h. a common error in writing down extensive form strategies is to fail to specify actions for all of the player’s information sets. It is common for people just learning game theory to state that Sweet if Blue is a strategy. Player i Si Player 2 Sweet x 3 Sour x1 Blue x4 Player 1 x0 Red x2 Hot x 5 Warm6 x Cold x7 Figure 25: Player 1 has two strategies. 1 has just two strategies. I discuss this in Section 8. This interpretation seems farfetched.4. . After all. might randomize at some information sets. Consider Figure 25. Player 2 has six. the distinction between a player who decides on an extensive form strategy before play begins and a player who decides on actions dynamically as play unfolds is irrelevant. Two observations may be helpful. one of which is Sweet if Blue and Warm if Red. Player 2 has six strategies. It turns out. and the strategy can be thought of as merely recording what that action will be. Say that an information set h is reachable under s if h contains a node that is reachable under s. Blue and Red. and then simply executes this plan. Of all the concepts in game theory. the player eventually chooses some action at every information set that he reaches. One typically thinks of real people as making decisions about how to play dynamically. . the concept of extensive form strategy may be the one that causes the most confusion. as the game proceeds. the most straightforward interpretation of a strategy is that the player decides before the game is played how he will play under every possible contingency. but not quite all of game theory. Second. This is wrong because we also need to 39 . and let S−i = j=i Sj be the set of strategy proﬁles for players other than i. One may also want the strategy to reﬂect the fact that players. that for much. Given strategies si for each of the players. For much of game theory.

not just the strategy fragment Sweet if Blue. there is a positive. why specify what she will do at her x4 ? Without belaboring the point. The number of strategies for player i is the number of diﬀerent ways to assign actions to information sets. probability that player 1 will play Red. For the time being. which is |Aih |. 40 . high) and (bottom. I view the case in which player 1 is certain to play Blue as a limit case. in thinking through what she wants to do. only if she chooses top at x0 . for example. low) are distinct strategies for player 1. In reality. I need to do so. This is true even if our prediction is that player 1 will play Blue. useful relationship between the number of terminal nodes and the number of strategies. This seems bizarre. Strictly Player 2 top Player 1 x0 bottom x2 x1 left x3 high low x5 x6 x4 right Player 1 Figure 26: An extensive form in which one player can move more than once. Similarly. the answer is that in some cases it is indeed common to write both (bottom. Player 1 can get to node x4 . I know of no general. A related. One way to help avoid error in thinking about extensive form strategies is to learn how to count strategies. but more forgivable. in order to work with subgame perfection. you should view the distinction between (bottom. There are two reasons for insisting that player 2’s strategy be complete in this way. low) as a harmless nuisance. player 1 must consider how player 2 would respond if player 1 were to play Red as opposed to Blue. First. error can be illustrated using Figure 26.specify an action for player 2 should player 1 choose Red. high) and (bottom. h∈Hi Thus. That means player 1 must contemplate player 2’s strategy. and choose between high and low. low) as simply bottom. If. But there are also circumstances in which one needs to keep track of the full strategy. if possibly very small. for the extensive form in Figure 25. the number of strategies for player 2 is 2 × 3 = 6 as claimed. in Figure 26. high) and (bottom. speaking (bottom. she chooses bottom. Second. player 1 has 2 × 2 = 4 strategies. as here. which I discuss later in the course. Note that the number of terminal nodes in the game of Figure 26 is 6.

and node x2 is reached with probability 1/2. Black has 20 information sets and at each information set she has 20 actions. In eﬀect. the set of probability distributions over proﬁles for players other than i. Example 40. out in the lower information set. Node x3 is reached with probability 1/4. Player 1 has three strategies.) before play begins to determine an extensive form strategy. Suppose σ2 puts probability 1/2 on both left and right. nodes x4 and x5 are each reached with probability 1/8. the set of probability distributions over player i’s strategies. Black has 2020 ≈ 1026 strategies. and probability 1/2 on the strategy (bottom. In contrast. I say that an information set h is reachable under σ iﬀ h contains a decision node that is reachable under σ. player i tosses a coin (or a die. The ﬁrst is that player i chooses a mixed strategy. and hence information sets. low). Therefore. etc. a behavior strategy for player i is a function b σi : Hi → h∈Hi ∆(Aih ) 41 . Note that. Assuming that σ is independent.Example 39. or bottom. are reachable under σ. at her information sets. and ∆(S−i ). meaning center in the upper information set. low). then the game is over. which is simply an element σi ∈ ∆(Si ). high). one can consider ∆(S). Consider the game of Figure 24. or spins a wheel. middle. σ ∈ ∆(S) determines a probability distribution over paths of play (terminal nodes).” White moves ﬁrst. Then the distribution over terminal nodes is as follows. I give an example below. then black. Exactly as for strategic forms. all decision nodes.4 Probability distributions in extensive forms. top. I do not specify actions for x1 and x2 separately. Player 2 has six strategies. since x1 and x2 are contained within the same information set. Say that a node x is reachable under σ iﬀ x node is reachable under a strategy proﬁle s for which σ(s) > 0. high). White has 20 actions and hence 20 strategies: each of the eight pawns can move in one of two ways. Formally. there are two ways to think of how player i might randomize. consider Figure 26. the set of probability distributions over strategy proﬁles. and each of the two knights can move in one of two ways. For example. 8. and probability zero on (bottom. In this example. two move chess has 20 × 20 = 400 terminal nodes. Suppose that σ1 puts probability 1/4 on the strategy (top. The alternative way to think about randomization is that player i randomizes dynamically. ∆(Si ). then executes that strategy. out). probability 1/4 on the strategy (top. One of these is center. Consider “two move chess.

σ−i ) does. Suppose that. As before. assume that σ2 puts probability 1/2 on both left and right. 11 There has. Then the distribution over terminal nodes is as follows.9 σi (h)(ai ) is the probability of ai under the randomb ization σi (h) ∈ ∆(Aih ). information sets must be constructed so that opponents do not directly observe the mixtures chosen. σ−i ) determines a distribution over paths of play just as (σ. More formally. By b way of example. Thus. only. There is a subtlety here. σ1 puts probability 4 . The problem was ﬁrst noted in Aumann (1964). been some recent research interest in games without perfect recall. and not the randomization itself. where b by “equivalent” I mean that. rather than dynamically.5. at x 1 low. Another solution was proposed in Milgrom and Weber (1985). see Fudenberg and Tirole (1991). Finally. for any σ−i . for each information set. one is implicitly assuming that the opponents can observe at most the action taken. If an extensive form satisﬁes a property called perfect recall b then for any behavior strategy σi there is an equivalent mixed strategy σi .b b such that σi (h) ∈ ∆(Aih ). For a readable discussion of the diﬀerence between the two solutions. (σ b .8. σ−i ) generate the same probability distribution over terminal nodes. if the choice is between left and right the opponent might observe that the action was left but not whether the probabilities used for randomization were 50/50 rather than 70/30. (σi . 8 42 . Note that the example using mixed strategies and the example using behavior strategies generated the same probability distribution over terminal nodes. nodes x4 and x5 are each reached with probability 1/8. Node x3 is reached with probability 1/4. “Nature. 9 A number of authors use the term behavioral strategy rather than behavior strategy.” discussed in Section 8. at x0 . This reﬂects a general fact. which also proposed a solution. Whether one works with mixed strategies or behavior strategies is largely a matter of convenience. consider again Figure 26. The terms are equivalent. 10 Note that I have not claimed that there is a unique equivalent strategy or behavior strategy.) In assuming that a player can execute a behavior strategy. at most. A strategy si can be thought of as a degenerate behavior b strategy: if si (h) = ai then σi (h)(ai ) = 1. as play proceeds. and node x2 is reached with probability 1/2. before play begins. to determine what action is actually realized at each information set for each player. if a player can execute a behavior strategy then the true extensive form must specify. (You may wish to skip this footnote. not only the available actions but also the available probability mixtures over actions. There will be many.11 Kuhn’s theorem is another aspect of the fact that it is in many respects without loss of generality to assume that decisions about how to play are made up front. Finally. however. I mention in passing that there is a mathematical diﬃculty that arises when one tries to deﬁne behavior strategies in extensive forms with continuum action sets. for example. I spare you a deﬁnition of perfect recall. σ−i ) and (σi . It is a property satisﬁed by virtually every extensive form used in practice. for any mixed strategy there is an equivalent behavior strategy.10 This result is known as Kuhn’s Theorem and ﬁrst appeared in Kuhn (1964). σ b puts probability 1/2 on both high and 1/2 on both top and bottom and. The extensive form must also include a notional player. the actions realized. And conversely.

If he chooses up then nature randomizes 50/50 between left and right. I model various kinds of uncertainty in the game by introducing a new notional player. By asb sumption. player 0.7 Extensive form games ⇔ strategic form games. in modeling poker I view Nature as determining the hands of the players. The set of players is I. If he chooses down the game is over. Thus. and typically is assumed to exclude Nature. Henceforth I label nodes only occasionally. 16). to specify payoﬀs. It remains 43 . Nature chooses a behavior strategy σ0 that is part of the description of the extensive form. so extensive forms are transformed into extensive form games by adding payoﬀ functions vi : Xτ → R. Figure 27 shows a trivial game with nature. K is a setup or entry cost.8. recall that Xτ is the set of terminal nodes. Note that I have stopped labeling nodes explicitly. and I assume K ∈ (0. Example 42. form game determines a strategic form game as follows. Any extensive strategic form strategic form Si . called Nature. The entry deterrence extensive form of Figure 23 become an extensive form game by adding payoﬀs. 8. Just as strategic forms were transformed into strategic form games by adding payoﬀ functions: ui : S → R. if player 1 enters and player 2 accommodates the player 1 gets a payoﬀ of 16 − K while player 2 gets a payoﬀ of 16. Player 1 has two actions.5 Nature. See Figure 28. 8.6 Extensive form games. Example 41. No payoﬀ function for Nature is required. Nature’s information sets are singletons. For example. 1/2 left N up Player 1 down 1/2 right Figure 27: A game with nature and one player. The set of strategies for player i is simply the set of extensive form strategies.

. u1 (up) = 1 (3) + 1 (−1) = 1 2 2 while u2 (down) = 0. in this one player game. Fight Acc In −K. 36) Figure 28: An entry deterrence game. Consider a strategy proﬁle s = (s1 . Thus any extensive form game generates a strategic form game. Figure 30 gives the extensive form for a very simple game with one player in addition to nature. 44 . The associated strategic form game box. . As indicated.8 Nash equilibria in extensive form games. set ui (s) equal to the expectation of vi (x). . Example 43. Consider again the extensive form game of Figure 28. . 0 16 − K. 16 . but I do not pursue that issue. 36 Figure 29: An Entry Deterrence Game b If Nature is present then the true strategy proﬁle is (σ0 . If there are no moves by Nature then s determines a terminal node x ∈ Xτ . set the payoﬀ ui (s) = vi (x). . 0) (16-K. .Player 2 In Player 1 Out Fight (-K. Example 44. 8. 16) Acc (0. Deﬁnition 9. is shown in Figure 29. Similar statements hold for rationalizable strategy proﬁles and correlated equilibria. s1 . Given this distribution. . For the strategic form. Nature is equally likely to choose either of her two actions. which yields a probability distribution over terminal nodes. 36 0. sN ). Out 0. . The Nash equilibria of a game in extensive form are the Nash equilibria of the associated game in strategic form. The converse is also true. sN ). Therefore. with player 1 being row and player 2 being column.

(In. 45 . the claim must be qualiﬁed because it rests on a deﬁnition of “typical” that is not always sensible. the entrant chooses NE and the incumbent chooses Fight with probability at least 1 − K/16. with an explicit dynamic structure. and entry and accommodation – and these in turn generate the payoﬀs. There is. This game has two pure Nash equilibria. especially for games. In one distribution. Fight) as well as the partly mixed equilibria are implausible because Fight is weakly dominated by Acc (the threat of Fight is not credible). the appropriate space of payoﬀs is not R8 but R6 . if the incumbent threatens to play Fight with high enough probability then the entrant does not enter. In the latter. The fact that the entry deterrence game has an continuum of NE appears to contradict the claim made in Section 6. The associated strategic form game is exhibited in Example 43. In this sense. but these are Nash equilibria. there are three outcomes – no entry. This phenomenon generalizes. the equilibria generate only two possible distributions over outcomes. Consider the entry deterrence game of Example 42.4 that the number of Nash equilibria in a ﬁnite game is typically ﬁnite and odd. Acc) and (Out. in the entry deterrence game of Example 42. although the set of Nash equilibria is inﬁnite. and a continuum of partly mixed Nash equilibria. see Govindan and McLennan (1998). Intuitively. it is robust in R6 . Given that there are only three outcomes. like the entry deterrence game. Example 45. although not quite as completely as one might hope.1/2 N up Player 1 down 0 1/2 3 -1 Figure 30: An extensive form game with nature. one piece of good news. Notice that in the entry deterrence game. however. even though there are four pure strategy proﬁles. For example. As I stated at the time. Remark 12. Fight). While the entry deterrence game is not robust in R8 . the fact that the entry deterrence game has an inﬁnite number of equilibria is robust. entry and ﬁghting. the outcome is no entry and in the other the outcome is entry and accommodation. The pure equilibrium (Out. both degenerate.

In the repeated game.9 Repeated games. The action taken in the ﬁrst period is simply si (h0 ). As usual. . 2. Example 46. regardless of history). A is the set of action proﬁles in the stage game. But four are well known. Ai is the set of actions for player i in the stage game. . a t-period history is an element of At . Formally. tit for tat (play C initially and thereafter play in period t + 1 whatever action your opponent took in period t). . In a repeated game. People often face the same. In the repeated game based on G. 2. which is the inﬁnite history that records the action taken by each player in every period 1. literally the same people play literally the same game over and over. if either player plays F in any period then play F in every subsequent period).8. I focus exclusively on the path of play. . the set of strategies in the repeated prisoner’s dilemma is uncountably inﬁnite. Si is the set of player i’s strategies in the repeated game and S = N Si is the set of strategy proﬁles. . Given a path 46 . They are always cooperate (play C in every period. which is equivalent to the set of all information sets. The assumption of perfect monitoring means that player i’s information set in period t + 1 is uniquely identiﬁed by the t period history of the game to date. always ﬁnk (play F in every period. . To describe the outcome of the repeated game. I focus here on inﬁnitely repeated games of perfect monitoring. . t. . a pure strategy for player i is a function si : H → Ai giving player i’s action for every possible history. decision problems repeatedly. I include in H a special abstract element h0 . Thus. which is the empty history at the beginning of period 1. Unless Ai is trivial (contains only i=1 one action). at date t + 1 player i knows the actions taken by his opponents in every preceding period. time is divided into periods 1. or at least similar. 2. I refer to strategies in the stage game G as actions. A t-period history records the action taken by each player in every period 1. I discuss some alternative models of repeated interaction at the end of this subsection. . Let H be the set of all possible ﬁnite histories. . For technical convenience. As noted above. I can deﬁne strategies using histories rather than information sets. . . a path of play is an element of A∞ . regardless of history). A repeated game is an idealized representation of this sort of strategic environment. G is called the stage game or the one-shot game. Formally. and grim (play C every period provided neither player ever plays F . Fix a strategic form game G. 3. with G played in each period. To avoid confusion with strategies in the repeated game. under perfect monitoring. Because information sets are identiﬁed with histories. Si is uncountably inﬁnite. Formally. The most famous repeated game by far is the repeated prisoner’s dilemma (Example 3).

of play ζ. Repeated games with the same stage game but diﬀerent δi are diﬀerent. δi has two interpretations. I give interpretations of δi below. it may make sense to model players as knowing the history of market prices but not directly the history of their opponents’ output choices. Thus ζt ∈ A. Let vi denote player i’s payoﬀ function in the stage game. and many of these variations have been explored in the literature. referring to the second interpretation of δ given above. In particular. the formal description of the repeated game is complete. One can consider diﬀerent time structures. 1). In this case. one can consider games in which the probability of termination. payoﬀs are deﬁned as follows. One can consider ﬁnitely repeated instead of inﬁnitely repeated games. and may have much diﬀerent properties. One can consider repeated games of imperfect monitoring. There are many possible variations on the basic repeated game model. player i’s payoﬀ in the repeated game is the discounted sum of his payoﬀs in each period. let ζt be the action proﬁle taken in period t. The second interpretation is that in every period there is a constant probability 1 − δ that the game ends (in which case. 47 . Then δi = 1/(1 + ri ). ri . 1 − δ. Note that the δi are part of the deﬁnition of the repeated game. Thus. ui is the expected payoﬀ in the repeated game. suppose that player i has a rate of time preference ri and that there is a probability 1 − δ each period that the game will end. Thus. this corresponds to the case where player i is “inﬁnitely patient. in which players know their own action history but not necessarily the action histories of their opponents. as I illustrate in Example 13 and Example 48 below. which I denote ζ(s). For each player i. there is a discount factor δi ∈ (0. in a discounted repeated game.” Or one can stick within the discounting framework but allow for non-constant δi . The ﬁrst is that player i is impatient and has a rate of time preference. In a repeated Cournot oligopoly game. A strategy proﬁle s in the repeated game determines a path of play. Finally. the game eventually ends with probability 1). varies or is unknown. Then player i’s payoﬀ function in the discounted repeated game is ui : S → R given by ∞ ui (s) = t=1 t−1 δi vi (ζt (s)) This can be extended to mixed strategy proﬁles but I do not do so explicitly. One can deﬁne repeated game payoﬀs in diﬀerent ways. A common alternative is limit of means: ui (s) = lim inf T →∞ 1 T T vi (ζt (s)). t=1 Loosely. For example. Then player i’s discount factor is δi = δ/(1 + ri ). Having deﬁned strategies and payoﬀs in the repeated game. with δi = δ. These two interpretations are compatible. for example. since 1 − δ > 0.

they could alternate. see also Example 48. And ﬁnally. and so on. One can consider models in which the same players meet each period in diﬀerent games. One can consider stochastic games. Or they could play A for the ﬁrst three periods. If a∗ is a Nash equilibrium of the stage game. Moreover. Example 47.” but “grim” is a best response to “grim. But there may be many other Nash equilibria as well. This contrasts sharply with the one-shot Prisoner’s Dilemma. If there are at least two equilibria in the stage game. If the discount factors.” Hence mutual play of “always ﬁnk” is a Nash equilibrium in the repeated Prisoner’s Dilemma. Nash Equilibria in the repeated Prisoner’s Dilemma.one can consider games that take place in continuous rather than discrete time. I discuss this phenomenon in more detail later in the course.14159 . Rationalizable strategies in the repeated Prisoner’s Dilemma. where the unique rationalizable proﬁle has both players play F . Recall the repeated Prisoner’s Dilemma. If both players play “grim” then the path of play is mutual play of C. in repeated games. As discussed in Example 13. if δi > 2/5 then. Example 46. if the discount factors δi are close enough to 1 then the set of rationalizable strategies can be very large. If the δi are close to 1. for the stage game payoﬀs of Example 46. . in which the stage game played in period t depends on what has happened in previous periods. ).” Indeed. one can consider oligopoly games in which investments made in earlier periods change the cost structure of players in later periods. For example. there is an uncountable inﬁnity of equilibria in the repeated game. Example 48. B in odd periods. and hence mutual play of “grim” is a rationalizable proﬁle. Or one can consider games in which players alternate. This illustrates a general property of repeated games. if the stage game has more than one Nash equilibrium (which is not the case with the Prisoner’s Dilemma) then there are Nash equilibria in which player’s play one stage game equilibrium in some periods and other stage game equilibria in other periods. First of all. playing A in even periods. two ﬁrms with multiple divisions may compete against each other in more than one market. corresponding to the decimal expansion for π (3. then the strategy proﬁle in which each player i plays a∗ in every i period regardless of history is a Nash equilibrium in the repeated game. if there are two equilibria. this is no longer true. rationalizable proﬁles in the repeated game can generate paths of play in which players take actions that are not rationalizable in the stage game. For example. 48 . just as F is strictly dominant in the stage game (Example 21). . one can consider models in which the set of players changes from one period to the next. A and B. one can verify that “grim” is rationalizable. “always ﬁnk” is not a best response to “grim. More generally. “always ﬁnk” is a best response to “always ﬁnk. “always ﬁnk” is strictly dominant in the repeated game. As noted in Example 13. For example. A for the next four periods. B the next period. the δi are very low then. some acting in some periods and others acting in other periods.

But there may be more equilibria than even this suggests. Again consider the repeated Prisoner’s Dilemma. As also noted in Example 13, “grim” is a best response to “grim” provided δi is large enough (δi > 2/5). Therefore, depending on the value of δi , there are equilibria of the repeated game that do not correspond to repeated equilibria of the stage game. In fact, there are lots of (an uncountable inﬁnity of) such equilibria, a fact that is usually referred to as the “Folk Theorem.” I discuss the Folk Theorem later in the course.

8.10

Games of incomplete information.

One is often interested in environments in which players are asymmetrically informed about diﬀerent aspects of the game. The potential scope for asymmetry is large. For example, player 1 may know that terminal node x means that she receives $1000, but she may not know what the other players receive. And even if she does know what her opponents receive, she may not know their vi (i.e., she may not know their felicity functions). And even if she knows the vi , player 1 still faces strategic uncertainty: she does not know what strategies the other players will play. Pursuing these sorts of issues quickly leads into extremely deep philosophical waters. Games with these sorts of information asymmetries are called games of incomplete information. The standard approach within game theory, following Harsanyi (1967), has been to simplify things by modeling a game of incomplete information as a game of imperfect information in which Nature moves ﬁrst and determines player types, where a player’s type speciﬁes what the player knows at the start of the game. This representation has been shown to be almost without loss of generality. The critical qualiﬁcation is that in the imperfect information representation of the game, the probabilities used by Nature in selecting type proﬁles are part of the description of the game and as such are common knowledge among the players: the players agree on the probabilities, know that they agree on the probabilities, and so on. This is called the common prior assumption (CPA) and it is substantive. I discuss it further in Section 9.3. For more on this issue, see Dekel and Gul (1997). Remark 13. A game of incomplete information that has been represented as a game of imperfect information is often called a Bayesian game and the Nash equilibrium of such a game is often referred to as a Bayesian Nash Equilibrium. I do not use these terms, however.

49

9

9.1

**Justifying Rationalizability, Correlated Equilibrium, and Nash Equilibrium.
**

Preplay agreement.

Suppose that players meet before the game and reach an agreement as to what strategy proﬁle, or mixed strategy proﬁle, to play. A minimal condition for such an agreement to be meaningful is that, assuming that each player believes that the others will abide by the agreement, no player has a strict incentive to deviate away from it. A mixed strategy proﬁle with this property is a Nash equilibrium. So if players reach an agreement on how to play, the agreement must be a Nash equilibrium. One objection to this is that while a player might have an incentive to deviate, he won’t if the others can punish him. For example, the players might be able to write a contract with ﬁnes, enforced by a judicial system, if any player deviates. The methodological position taken in non-cooperative game theory (and this is one of the ways that non-cooperative game theory and cooperative game theory diﬀer) is that if any such method of punishment is available then it must be incorporated directly into the description of the game. Thus, for example, under the contract story the players, in eﬀect, agree to play a game in which each player’s payoﬀs have been modiﬁed to include a ﬁne for any strategy other than the agreed strategy. Similarly, if players will meet again, and if they can use the threat of retaliatory future behavior to deter deviation in today’s game, then the game must explicitly capture this repeated interaction and the agreement will be strategies in this repeated game, not just over the actions taken in a single period. Because any enforcement mechanism is supposed to be incorporated into the description of the game, one sometimes sees Nash equilibrium referred to as a self-enforcing agreement. As a motivation for Nash equilibrium, I ﬁnd preplay agreement compelling but it is subject to several qualiﬁcations. First, it is not always relevant. To take an extreme example, the entire world economy can be modeled as one gigantic game, but to this date the world population has yet to meet to agree on exactly how to play it. If one thinks that play in this gigantic game conforms, or eventually will conform, even approximately to equilibrium then one must appeal to some story other than preplay agreement. Second, if the game has multiple Nash equilibria then the negotiation over which equilibrium to play may be non-trivial. The appeal to preplay negotiation may thus end up replacing one puzzle, namely how can players reach equilibrium in a game, with another, namely how can players reach equilibrium in their negotiations about the game. And third, if players can negotiate prior to play of the game then they may be able to negotiate to change the game itself.12 One expects that players will try to

12

In principle, one can model the negotiation explicitly as taking part in an even larger game, in

50

alter games that yield outcomes that they dislike. One elementary way to change the game is to introduce a correlation device, which may be unobservable to an outside observer. As a consequence of the correlation device, play to the observer may look correlated. Put diﬀerently, from the perspective of an outside observer, the implication of preplay negotiation is, in general, not Nash equilibrium but correlated equilibrium. I will illustrate this using Battle of the Sexes, introduced in Example 1. I will refer to this game as G. Suppose that players agree to toss a fair coin before play begins and then to play (a, a) if the coin lands heads and (b, b) if the coin lands tails. This makes good sense for the players, because it will give them an expected payoﬀ of 9 each, which is the highest symmetric payoﬀ that they can receive in the game. The true game, which I call Γ, incorporates the coin toss explicitly as an initial move by Nature. Figure 31 provides one possible extensive form representation of Γ. As

Player 2 Player 1 a 1/2 heads N 1/2 tails a b Player 1 Player 2 b

a b a b a b a b

(8,10) (0,0) (0,0) (10,8) (8,10) (0,0) (0,0) (10,8)

Figure 31: An extensive form game for Battle of the Sexes with a correlating device. drawn, player 1 nominally goes ﬁrst but the information sets for player 2 imply that play is eﬀectively simultaneous.13 In this extensive form, each player has four strategies: aa, ab, ba, and bb, where ab is read, “a if heads and b if tails.” The strategic form for Γ is shown in Figure 32. For example, if player 1 chooses ab and player 2 chooses bb then when the coin lands heads they play (a, b) for a payoﬀ of (0, 0) and when the coin lands tails they play (b, b) for a payoﬀ of (10, 8). Since the

which players choose what game to play, but this approach quickly becomes sterile. 13 Alternatively, one could have player 2 nominally go ﬁrst. This multiplicity in the possible extensive form representations is irrelevant, since the strategic form of the game, given in Figure 32, is not aﬀected.

51

suppose that an outside observer cannot observe strategies in Γ. Thus. ab). ab). aa). Note that this is exactly the a b a 1/2 0 b 0 1/2 Figure 34: The induced distribution over actual play. play looks correlated. 52 . aa ab ba bb 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 aa ab ba bb Figure 33: The strategy distribution for the Nash equilibrium (ab. 4 9. Now. 4 ba bb 4. namely (ba. but. Instead.4). and (bb. 5 9. 5 ba 4. 0 ab 4.aa aa 8. as recorded in the game box. 5 bb 0. 0 5. same as Figure 5. to an outsider who can only observe play in G. 10 ab 4. 5 0. Finally. ba). 0 5. The associated Nash equilibrium distribution is shown in Figure 33. such as (ab. probability of heads is 1/2. (aa. 4 10. The induced distribution over actual play is shown in Figure 34. ab). adding a correlating device can introduce new possibilities for equilibrium behavior. aa) correspond to the original pure equilibria of the game: players eﬀectively ignore the coin toss. 9 0. 8 Figure 32: The strategic form game for Battle of the Sexes with a correlating device. note that there are other pure Nash equilibria of the game in Figure 32. since the correlating device can always be ignored. The equilibria (aa. an observer can see only what actions in G are actually played. bb). namely (ab. like ab. is a Nash equilibrium of this game. 9 5. One can easily verify that the suggested agreement. 4 5. 0 0. the expectation of this is (5. bb) and (bb. In general.

e. including games with multiple equilibria. indeed. the question of what equilibrium (if any) will be played has been replaced with another. more abstract question. the empirical evidence is inconsistent with the introspective reasoning that I am about to describe.” which prescribes how to play in every game. the calculating power of actual players is limited and. how did society settle on one particular Big Book of Game Theory in the ﬁrst place? To answer this question. implies for player behavior. The seminal paper is Aumann (1976). . but other big books of game theory are also possible. The main obstacle to this story is that the book must specify how to play in every game. . By introspection I mean reasoning of the form.3 Introspection. Note that the argument here is very similar to that for preplay agreement. 9. 53 . that strategies survive the iterated deletion of weakly dominated strategies) or that play conforms to a Nash or correlated equilibrium.e. .2 The “Big Book of Game Theory. In a game like Battle of the Sexes. and I think that my opponent will play that because I think that he thinks that I will play .analogs of the Nash equilibria of the original game are always present as well (this holds for mixed strategy equilibria as well as pure). This raises the problem of what equilibrium should be played. one must appeal to one of the other stories described in this section.14 In brief. the answer is that introspection implies that play lies in S R (i. which contradicts the hypothesis that everyone follows the book. I ﬁnd the literature on introspection nevertheless interesting. I give other cites below. at least one player has strict incentive to deviate. You may think of the “Big Book of Game Theory” as a social convention for how to play games. The main diﬀerence is that players do not negotiate before the start of each individual game. the only prescriptions that make sense are Nash equilibrium prescriptions. Otherwise. With this hypothesis. that play is rationalizable) but (this is more controversial) falls short of implying that play lies in S W (i. Of course. “I should play this because I think that my opponent will play that.” This subsection addresses what this sort of introspective reasoning. Harsanyi and Selten (1988) provides one possible resolution and can thus be considered a “Big Book of Game Theory. by players of unlimited calculating power. it is not at all obvious how this multiplicity problem should be resolved. introduced in Example 1. Thus. in part 14 What I am calling introspection the literature calls interactive epistemology. The book’s underlying hypothesis is that everyone follows the book.” Suppose that everyone learns game theory from the same book.” Harsanyi and Selten (1988) make a good case that their particular prescriptions are sensible. 9. the “Big Book of Game Theory. by deﬁnition.

But of all the material in this section, the material on introspection is both the most demanding and the most easily skipped.

9.3.1 Introspection, S^R, and S^W.

Consider players engaged in a game G. The players do not know their opponents' actions, although each may have beliefs about the others' actions, possibly based in turn on beliefs about the others' beliefs, and so on. (I refer to players as choosing actions in G rather than strategies; I want to reserve the word “strategy” for another use.) Say that a player is rational if she chooses an action that is a best response to her belief. A basic assumption of introspective arguments is that rationality is common knowledge: everyone is rational, everyone knows that everyone is rational, everyone knows that everyone knows that everyone is rational, and so on.15

Strictly speaking, the players are playing, not G, but rather a game of incomplete information arising from G. One can show that it is without loss of generality to represent the game of incomplete information by a formalism, which I call F, in which Nature chooses a profile of types, one type for each player.16 A player's type specifies (a) his belief about the other players' actions, his belief about their beliefs, and so on, (b) any other information that the player might have (such as the outcome of a coin toss), and (c) the player's action in G. The fact that a type specifies a player's action does not take away the player's free will: the type merely records what action the player eventually ends up choosing as a function of his information. There is an implicit assumption that F itself is common knowledge, where I use common knowledge in an informal sense (i.e., without reference to a meta formalism that contains F). This sort of common knowledge can be viewed as being more or less without loss of generality. In particular, the F formalism per se does not require rationality, or even approximate rationality, let alone common knowledge of rationality. Common knowledge of rationality, in contrast, is a substantive assumption. It can be defined rigorously within the F formalism just described, but I will not do so. One can show that if common knowledge of rationality holds at a type profile then the action profile associated with that type profile lies in S^R. In this sense, introspection implies rationalizability.

Introspection provides considerably less support for restricting attention to S^W. Say that a player is cautiously rational if he chooses σi only if σi is a best response to a “cautious belief,” where a belief σ−i is cautious iff supp(σ−i) = S−i; that is, σ−i is cautious iff it rules nothing out. One can show, as an analog of Theorem 7, that σi is not a best response to any cautious belief iff it is weakly dominated.

15 Informally, say that something is common knowledge if everyone knows it, everyone knows that everyone knows it, and so on. Common knowledge was first formalized in Aumann (1976).
16 See Brandenburger and Dekel (1993) but also Heifetz (1999).
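The weak dominance side of this equivalence is a small linear program: σi is weakly dominated iff some mixed strategy does at least as well against every opponent profile and strictly better against at least one, which is the same as feasibility with positive total slack. A sketch using scipy (the helper name is mine):

```python
import numpy as np
from scipy.optimize import linprog

def weakly_dominated(U, i):
    """LP test: is pure strategy i weakly dominated in payoff matrix U
    (rows = own pure strategies, columns = opponent profiles)?  Search for
    a mixed sigma with sigma'U >= U[i] in every column; strategy i is
    weakly dominated iff the maximal total slack is strictly positive."""
    n, m = U.shape
    c = -U @ np.ones(m)                  # maximize sigma's total payoff sum
    A_ub, b_ub = -U.T, -U[i]             # sigma' U >= U[i] columnwise
    A_eq, b_eq = np.ones((1, n)), [1.0]  # sigma is a probability vector
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)
    return -res.fun - U[i].sum() > 1e-9

# Example: row 1 is weakly (but not strictly) dominated by row 0.
U = np.array([[1.0, 1.0], [1.0, 0.0]])
print(weakly_dominated(U, 1), weakly_dominated(U, 0))   # True False
```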

A natural conjecture is that, just as common knowledge of rationality implies that play lies in S^R, common knowledge of cautious rationality implies that play lies in S^W. This conjecture turns out to be false. Instead, common knowledge of cautious rationality implies that players play strategies that survive one round of weak dominance deletion followed by the iterated deletion of strictly dominated strategies; see Börgers and Samuelson (1992). In this sense, introspection does not imply that play lies in S^W.

9.3.2 Introspection, Nash equilibrium, and correlated equilibrium.

Finally, I turn to the question of whether introspection implies Nash, or at least correlated, equilibrium. The answer is, “not quite.”

In some games there is a Nash equilibrium that is so compelling that it seems reasonable that players would coordinate on it even if they could not communicate beforehand. In game theory, such equilibria are referred to as focal, a term introduced in Schelling (1960). For example, in the game of Example 29, even though every strategy profile in this game is rationalizable, it seems reasonable to predict that players would select the unique Nash equilibrium, namely (T, L). Although most game theorists subscribe to the idea of a focal equilibrium, it has resisted all attempts at formalization. Even if we could agree on a definition of what “focal” means, it seems likely that, while some games would have focal equilibria, others would not. One might suppose, for example, that if a Nash equilibrium is strictly Pareto dominant then it is focal. (A strategy profile is strictly Pareto dominant if it gives every player a strictly higher payoff than that of any other strategy profile.) Consider Figure 35.

            a            b
  a     100, 100       0, 99
  b      99, 0        98, 98

Figure 35: A game with a Pareto dominant Nash equilibrium that may not be focal.

In this game, (a, a) is a strictly Pareto dominant Nash equilibrium, so presumably it is focal. But on closer examination, this is not so clear. Playing b guarantees a payoff of at least 98, whereas playing a risks 0 for a gain (relative to 98) of at most 2. So b looks more prudent than a. The equilibrium (b, b) satisfies a property that Harsanyi and Selten (1988) called risk dominance, and the “Big Book of Game Theory” proposed by Harsanyi and Selten (1988) selects (b, b), rather than (a, a), in this game.
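For 2 × 2 games, the Harsanyi–Selten risk dominance comparison reduces to comparing, at each pure equilibrium, the product of the two players' losses from unilateral deviation. A minimal sketch applying that test to the Figure 35 payoffs:

```python
# Payoffs from Figure 35 (player 1, player 2), actions a and b.
u1 = {('a', 'a'): 100, ('a', 'b'): 0,  ('b', 'a'): 99, ('b', 'b'): 98}
u2 = {('a', 'a'): 100, ('a', 'b'): 99, ('b', 'a'): 0,  ('b', 'b'): 98}

# (b,b) risk dominates (a,a) iff the product of unilateral deviation
# losses at (b,b) exceeds the corresponding product at (a,a).
product_bb = (u1['b', 'b'] - u1['a', 'b']) * (u2['b', 'b'] - u2['b', 'a'])
product_aa = (u1['a', 'a'] - u1['b', 'a']) * (u2['a', 'a'] - u2['a', 'b'])
print(product_bb, product_aa)   # 9604 vs 1: (b, b) risk dominates (a, a)
```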

This leads to the question of whether, in general games, introspection can lead to Nash, or at least correlated, equilibrium. The F formalism described above resembles a game of imperfect information. I wrote “resembles” rather than “is” for two reasons. First, F allows two players to assign different probabilities to any given type profile, whereas a game of imperfect information requires that players all assign the same probability to any given type profile. The assumption that all players use the same probability distribution over type profiles is called the common prior assumption (CPA). Second, F specifies what actions players take for each type. Suppose that CPA holds and reinterpret the part of a type that specifies an action as a suggestion of how the player should play (which the player could ignore) rather than as a record of how the player actually does play. With this interpretation, the F formalism is a game Γ of imperfect information, together with whatever behavior it generates: assuming CPA, it is thus as though F specifies both a game of imperfect information Γ and a strategy profile for Γ, where a strategy in Γ is a function that specifies a player's action in G as a function of his type. Consider the strategy profile in Γ in which each player takes the action that Nature suggests at each type, that is, each player does what Nature tells him to do. It is not difficult to show that rationality is common knowledge at every type profile in F if and only if this “do as you are told” strategy profile in Γ is a Nash equilibrium.

To illustrate, suppose that G is the game of Chicken, introduced in Example 37. The simplest interesting type structure is the following. Player i can be either of type θi^J or θi^D. There are thus four type profiles, θ^JJ, θ^JD, θ^DJ, and θ^DD, where, for example, θ^JD is shorthand for the profile (θ1^J, θ2^D). In Γ, Nature moves first and chooses the type profile according to the distribution P = (P^JJ, P^JD, P^DJ, P^DD). Each player observes his type, but not that of his opponent, and then each player simultaneously chooses an action, either J or D. The interpretation is that if player 1 is of type θ1^J then player 1 is told to play J, and likewise for the other types. Note that, once player 1 learns that his type is θ1^J, player 1 learns that the type profile is either θ^JJ or θ^JD: conditional on player 1 being of type θ1^J, the probability of player 2 being of type θ2^J is P^JJ/(P^JJ + P^JD). It follows that player 1's type records not only Nature's suggestion about how to play but also player 1's belief about player 2's type, and hence player 1's belief about player 2's belief about player 1's type, and so on.

In Γ, a strategy specifies a player's action as a function of his type. The strategy JD means play J when of type θi^J and play D when of type θi^D, that is, do what Nature tells you to do. The strategy DJ means do the opposite of what Nature tells you, and the strategies JJ and DD mean play J (respectively, D) no matter what Nature tells you. Suppose that P = (4/9, 2/9, 2/9, 1/9). The strategic form game generated by this P is shown in Figure 36.

              JJ            JD            DJ            DD
  JJ        6, 6       14/3, 19/3    10/3, 20/3        2, 7
  JD     19/3, 14/3    14/3, 14/3       3, 14/3      4/3, 14/3
  DJ     20/3, 10/3    14/3, 3        8/3, 8/3       2/3, 7/3
  DD        7, 2       14/3, 4/3      7/3, 2/3         0, 0

Figure 36: Chicken with an initial move by Nature.
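The entries of Figure 36 can be reproduced mechanically: average the Chicken stage payoffs over the type distribution P, given what each strategy does at each type. A sketch using exact rational arithmetic (helper names are mine; the stage payoffs are the ones consistent with Figure 36):

```python
from fractions import Fraction as F

# Chicken stage payoffs, consistent with the entries in Figure 36.
u = {('J', 'J'): (6, 6), ('J', 'D'): (2, 7),
     ('D', 'J'): (7, 2), ('D', 'D'): (0, 0)}
# P over type profiles (theta1, theta2), as in the text.
P = {('J', 'J'): F(4, 9), ('J', 'D'): F(2, 9),
     ('D', 'J'): F(2, 9), ('D', 'D'): F(1, 9)}

strategies = ['JJ', 'JD', 'DJ', 'DD']   # action when told J / when told D
act = lambda s, told: s[0] if told == 'J' else s[1]

def payoff(s1, s2):
    """Expected payoffs of a strategy pair in Gamma, averaging over types."""
    pairs = [(p, u[act(s1, t1), act(s2, t2)]) for (t1, t2), p in P.items()]
    return (sum(p * v[0] for p, v in pairs), sum(p * v[1] for p, v in pairs))

for s1 in strategies:
    print(s1, [payoff(s1, s2) for s2 in strategies])
# Against JD, every strategy of player 1 earns exactly 14/3, so in
# particular no deviation from (JD, JD) is profitable: "do as you are
# told" is a Nash equilibrium of Gamma.
```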

The strategic form payoffs are tedious to compute; the computation is similar to the one for Figure 32 in Section 9.1. One can readily verify that in this game (JD, JD) is a Nash equilibrium. This should not be surprising: I chose P so that the “do as you are told” Nash equilibrium of Γ corresponds exactly to the mixed strategy equilibrium of G. Nature in effect mixes on behalf of the players. There are other Nash equilibria here, such as (JJ, DD), but I want to focus on the equilibrium where players do as they are told, since it is in this equilibrium that the actions taken correspond to those specified in the original formalism.

Now ask the question: for what P is the “do as you are told” strategy profile (JD, JD) a Nash equilibrium of Γ?17 This is equivalent to asking: if CPA holds, then what P are compatible with common knowledge of rationality at every type profile? It should be fairly intuitive that any P that is a Nash equilibrium distribution works. In fact, any P that is a correlated equilibrium distribution works. For example, I can choose P using the correlated equilibrium distribution of Figure 22: P = (1/3, 1/3, 1/3, 0). Indeed, in the Chicken example, only the correlated distributions work: (JD, JD) is a Nash equilibrium of Γ iff P is a correlated equilibrium distribution of G.

This reasoning extends to general games and general type structures: CPA and common knowledge of rationality at every type profile jointly imply that the “do as you are told” strategy profile is a Nash equilibrium of Γ, and this in turn implies that the induced distribution over action profiles is that of a correlated equilibrium. Thus, for F, assuming both CPA and common knowledge of rationality at every type profile implies that P is a correlated equilibrium distribution for G. In this sense, CPA and common knowledge of rationality at every type profile imply correlated equilibrium.

But do we want to interpret this as saying that introspection implies correlated equilibrium? Of the two assumptions, CPA and common knowledge of rationality at every type profile, I view the common knowledge of rationality assumption as strong but acceptable, given that we are considering introspection by players of unlimited computing power. Moreover, the common knowledge assumption used to get correlated equilibrium is only somewhat stronger than the common knowledge assumption used to get rationalizability.18 The question of whether introspection implies correlated equilibrium thus comes down to the question of whether introspection implies CPA. It is tempting to appeal to precedent and cite the fact that CPA is a standard assumption throughout applied game theory.

17 Note that when I change P I also change the payoffs in the strategic form game shown in Figure 36.
18 Recall that common knowledge of rationality at a type profile implies rationalizability at that type profile. The rationalizability argument needs common knowledge of rationality at some type profile; the correlated equilibrium argument, in contrast, asks for common knowledge of rationality at every type profile. In the Chicken example, the particular type space used had the property that common knowledge of rationality at one type profile implied common knowledge of rationality at every type profile. But in other type spaces common knowledge of rationality could hold at some type profiles while failing at others.
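The condition “P is a correlated equilibrium distribution of G” can be checked directly through the obedience constraints: conditional on his own recommendation, neither player can gain by playing the other action. A sketch, reusing the Chicken payoffs above (the helper name is mine; the uniform distribution is included just to show a P that fails):

```python
from fractions import Fraction as F

u1 = {('J', 'J'): 6, ('J', 'D'): 2, ('D', 'J'): 7, ('D', 'D'): 0}
u2 = {(a, b): u1[b, a] for a in 'JD' for b in 'JD'}   # symmetric game

def is_correlated_eq(P):
    """Obedience check: given his own recommendation, neither player
    gains in expectation by switching to the other action."""
    for told in 'JD':
        for dev in 'JD':
            gain1 = sum(p * (u1[dev, b] - u1[told, b])
                        for (a, b), p in P.items() if a == told)
            gain2 = sum(p * (u2[a, dev] - u2[a, told])
                        for (a, b), p in P.items() if b == told)
            if gain1 > 0 or gain2 > 0:
                return False
    return True

P_corr = {('J', 'J'): F(1, 3), ('J', 'D'): F(1, 3),
          ('D', 'J'): F(1, 3), ('D', 'D'): F(0)}
P_unif = {(a, b): F(1, 4) for a in 'JD' for b in 'JD'}
print(is_correlated_eq(P_corr), is_correlated_eq(P_unif))   # True False
```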

But the objective in applied game theory is to construct models that capture useful intuitions and that have testable predictions. The objective here is different: to find out whether introspection in and of itself can yield Nash, or at least correlated, equilibrium. Given this objective, CPA has to be defended on its own terms.

One defense of CPA that I find especially seductive is the following. Why would two players assign different probabilities to how Nature assigns type profiles? Given that players have unlimited calculating power, such differences in belief must reflect differences in information (rather than calculating errors). So we should embed the formalism F inside a new, larger, formalism that explicitly models the information that players used to derive their probabilities for Nature in F. Rename the original formalism F^0 and let F^1 denote the new formalism. But then it is possible that players have different probabilities about Nature in F^1, which suggests constructing a new, still larger formalism F^2, and so on. This leads to the question: does this sequence of formalisms lead to an overarching formalism, F^∞, in which CPA holds? It turns out that this question was already answered in the original construction of F: F is F^∞ (or, rather, F is equivalent to F^∞ in an appropriate sense), and so the answer is “no.” Assuming CPA is substantive.

Whether CPA is too substantive to impose in a theory of introspection is controversial. On one side, Robert Aumann, one of the great game theorists, has taken the position that CPA is justifiable in this context; see Aumann (1987), which is the seminal paper in this area. On the other side are game theorists who argue that CPA forces coordination of belief, and that the whole point of the introspection literature was to explain such coordination, not assume it at the outset. In games with a focal equilibrium, such coordination may be natural, but in games with multiple equilibria like Battle of the Sexes (Example 1), invoking focal equilibria just takes us back to where we started. As you might infer, I am on the anti-Aumann side of this debate. For more on this topic (assuming you haven't already had more than enough), see Brandenburger (1992) (a survey written for a general economics audience and therefore relatively accessible) and Dekel and Gul (1997) (another survey, but targeted at a more sophisticated audience).

One last remark. The introspective theories I have surveyed model introspection implicitly. The theories are of the form: if players have such and such information and if they best respond, then their actions must obey such and such restrictions. An alternative is to try to model introspection explicitly. One way to do this, perhaps the most natural way, is to model each player as choosing a decision algorithm that takes as input the opponent's decision algorithm (thus allowing the player to think through the game from his opponent's perspective) and that produces as output an action in the game. Binmore (1987) pointed out that this approach quickly leads to conceptual problems. Canning (1992) formalized Binmore (1987) and showed that these problems can be avoided if one restricts the set of available algorithms. But the restriction

on decision algorithms effectively requires players to coordinate on an equilibrium before introspection begins. Thus, once again, we are back where we started.

9.4 Learning.

There are many, many competing theories of learning that differ on all sorts of dimensions. There are theories in which the same players play the same opponents repeatedly, forever, theories in which players are part of a large population and never meet opponents more than once, and theories in which players play games with their neighbors, who then play games with their neighbors, and so on. There are theories in which players are sophisticated Bayesian optimizers, theories in which players use rules of thumb to update their behavior based on their own experience, and theories in which players update their behavior by copying what some other player is doing. This is only a partial enumeration of the different modeling choices made. Fudenberg and Levine (1998) is, at this writing, the most general survey of the learning literature. Mailath (1998) provides a concise survey.

Much of the learning literature has been devoted to theories with large populations of players with simple update rules. These theories are often called “evolutionary” because they share some mathematical similarities with dynamics that arise in evolutionary biology.19 Young (1998) surveys theories in which random shocks play an important role; these theories bridge the evolutionary and non-evolutionary literatures.

9.4.1 Fictitious play.

Lest this material quickly become too abstract, I start by describing a particular learning theory called fictitious play. I do not make any claim that fictitious play is particularly realistic. But fictitious play is relatively easy to describe, and it nicely illustrates some of the more general points that I wish to make. In terms of player sophistication, fictitious play lies toward the more sophisticated end of the spectrum of learning theories. In fictitious play, players learn by playing the same game against the same opponent, over and over, forever. In each period, each player makes a forecast of the other player's action and then best responds. I focus on fictitious play in two-player games; on extending fictitious play to games with more than two players, see Fudenberg and Levine (1998).

Consider, in particular, fictitious play in Matching Pennies (Example 2). For this game, a general class of fictitious play theories can be described as follows. Player 1's forecast at the start of period 1 is (q^0, 1 − q^0), meaning that player 1 thinks that player 2 will play H with probability q^0.

19 Two good texts on evolutionary game theory are Weibull (1995) and Samuelson (1997). These texts overlap, but the focus of Weibull (1995) is on evolution in strategic form games while the focus of Samuelson (1997) is more on issues particular to games with non-trivial extensive forms.

Player 1's forecast at the start of period t + 1 is

    (q^{t+1}, 1 − q^{t+1}) = [k_2 / (t + k_2)] (q^0, 1 − q^0) + [t / (t + k_2)] (H̄_2^t, 1 − H̄_2^t),

where k_2 > 0 is a constant discussed below, H̄_2^t is the empirical frequency with which player 2 has played H to date, and 1 − H̄_2^t = T̄_2^t is the empirical frequency with which player 2 has played T. Notice that, for k_2 large, q^t approximately equals the initial forecast q^0 until t is likewise very large; k_2 can thus be thought of as measuring the strength of player 1's confidence in his initial forecast q^0. On the other hand, for any k_2, as t grows large eventually nearly all the weight is on the second term: eventually player 1's forecast is that player 2 will play H with a probability approximately equal to H̄_2^t, the empirical frequency with which player 2 has played H to date. Thus, if player 2 plays H 25% of the time then eventually q^t ≈ 1/4, regardless of k_2 or q^0. Player 2's forecast rule is similar: he has an initial forecast (p^0, 1 − p^0) and he updates in a similar way using a weight k_1.

Given his forecast for period t + 1, player i then best responds. There are two subtleties. First, strictly speaking, the players are engaged in a repeated version of Matching Pennies and therefore one must define best response in terms of this repeated game. In a repeated game, player i cares about not only how his date t action affects his date t payoff but also about how his date t action might affect his future payoffs, by affecting his opponent's future actions. It turns out that fictitious play assumes this problem away: one can show that the fictitious play forecast rule implicitly assumes that a player's action today has no effect on her opponent's future actions. So a player's best response can be computed by simply computing the best response for the current period, ignoring the fact that play will be repeated. The second subtlety is that one must specify what happens if player i turns out to be exactly indifferent. I assume that when a player is indifferent, she chooses H. It turns out that this tie breaking rule is essentially irrelevant: one can show that, for almost any values of k_1, k_2, p^0, and q^0, neither player is ever indifferent.

Now consider what happens in Matching Pennies. The unique Nash equilibrium of Matching Pennies calls for each player to randomize 50/50. But, as I have defined fictitious play, players never randomize. Nevertheless, for almost any values of k_1, k_2, p^0, and q^0, the empirical frequency of play converges to the mixed strategy Nash equilibrium distribution for Matching Pennies: the empirical frequency for each of the four strategy profiles converges to 1/4, as in Figure 26. In this sense, fictitious play predicts convergence to the Nash equilibrium of the game. Note, however, that the sense in which play in fictitious play converges to Nash equilibrium is weak: it is not that the players' actual behavior converges to that of a Nash equilibrium but only that the empirical distribution of play resembles the empirical distribution that would be generated by repeated play of the Nash equilibrium.
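A direct simulation of this process is straightforward. The sketch below uses the forecast rule displayed above, the tie breaking rule that indifferent players choose H, and the usual Matching Pennies payoffs in which player 1 wins on matches and player 2 on mismatches (Example 2 may orient the payoffs the other way, which only relabels the players); helper names and parameter values are mine.

```python
def best_response_1(q):        # q: forecast probability that player 2 plays H
    return 'H' if q >= 0.5 else 'T'     # player 1 wants to match; ties -> H

def best_response_2(p):        # p: forecast probability that player 1 plays H
    return 'T' if p > 0.5 else 'H'      # player 2 wants to mismatch; ties -> H

def fictitious_play(T=100_000, p0=0.3, q0=0.7, k1=1.0, k2=1.0):
    h1 = h2 = 0                          # times each player has played H
    freq = {}
    for t in range(T):
        q = (k2 * q0 + h2) / (k2 + t)    # player 1's forecast of player 2
        p = (k1 * p0 + h1) / (k1 + t)    # player 2's forecast of player 1
        a1, a2 = best_response_1(q), best_response_2(p)
        h1 += a1 == 'H'
        h2 += a2 == 'H'
        freq[a1, a2] = freq.get((a1, a2), 0) + 1
    return {k: v / T for k, v in freq.items()}

print(fictitious_play())   # each of the four profiles has frequency near 1/4
```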

One would like to claim that observable play looks as if it had been generated by players actually randomizing. But this, too, is not correct. It is true that observable play passes one statistical test (namely, that the empirical distribution resembles the distribution generated by repeated play of a Nash equilibrium) but it fails others. An outside observer, looking at the sequence of action profiles chosen under fictitious play, would be able to detect that players were not randomizing 50/50. For example, an observer would see that, whenever both players play H in period t, player 1 also plays H in period t + 1. In contrast, if player 1 were playing the equilibrium mixed strategy then he would play H only half the time under these (or any other) circumstances.

Based on the behavior of fictitious play in Matching Pennies, it is tempting to conjecture that fictitious play always converges, at least in a weak sense, to Nash equilibrium. This conjecture is false. First, the good news.

• If the empirical distribution of play converges then the marginals form a Nash equilibrium. (This statement is for two-player games.)

• If the empirical distribution of play converges to a degenerate distribution that assigns probability 1 to s* then s* is a pure strategy equilibrium. Conversely, if s* is a strict equilibrium and if initial beliefs are close to s* then the empirical distribution of play converges to s*. (A pure Nash equilibrium s* is strict iff each s*_i is the unique best response to s*_{−i}.)

Now the bad news.

• The empirical distribution of play may never converge. The game in Figure 37, which originated in Shapley (1962), provides an example; a simulation sketch appears after this list.

[Figure 37: A game for which fictitious play fails to converge. Player 1 chooses among T, M, B; player 2 chooses among L, C, R.]

The Nash equilibrium in this game calls for each player to randomize (1/3, 1/3, 1/3). One can show that the empirical frequency of play under fictitious play wanders around forever, never converging. This example is robust; for example, one can perturb the payoffs slightly without altering the qualitative conclusion.

• One can construct examples in which play converges, but for which the limiting empirical distribution is not a correlated, let alone a Nash, equilibrium distribution. For example, consider Figure 38.

           L       R
  T      0, 0    1, 1
  B      1, 1    0, 0

Figure 38: A game for which fictitious play can converge to a distribution that is not a Nash equilibrium distribution.

This game has two pure Nash equilibria, in which the players miscoordinate, and also a mixed Nash equilibrium in which both players randomize 50/50. If p^0 = q^0 (and k_1 = k_2) then, given the common tie breaking rule, the two forecast processes move in lockstep, both players choose the same action in every period, and the empirical distribution converges to the distribution shown in Figure 39.

           L       R
  T      1/2       0
  B       0      1/2

Figure 39: The marginals form a Nash equilibrium for the game of Figure 38 even though the distribution itself is not a Nash equilibrium distribution.

Note that the marginals are indeed 50/50, but the distribution itself, which puts all of its weight on (T, L) and (B, R), is not close to any sort of equilibrium distribution. This example looks pathological, and it is not known how robust this particular problem is.

Although play under fictitious play may fail to converge to a Nash or correlated equilibrium, play under fictitious play always converges to the rationalizable set, S^R. Because fictitious play players always best respond, they never play a strategy that is never a best response. Because of this, and the fact that forecasts under fictitious play are eventually close to the empirical frequencies of play, players eventually forecast that the probability is close to zero that their opponent will play a strategy that is never a best response. Because of this, and the fact that players best respond, players eventually never play strategies that do not survive the second round of deletion of strategies that are never a best response. And so on: the proof mirrors the iterated deletion of strategies that are never a best response. On the other hand, play under fictitious play does not necessarily converge to S^W.
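As promised after the bad news list, here is a simulation of fictitious play in the standard Shapley bimatrix (the exact payoffs in Figure 37 may be a variant, but the cycling behavior is the point). The asymmetric initial weights avoid a knife-edge symmetric start; the empirical marginals keep drifting rather than settling at (1/3, 1/3, 1/3).

```python
import numpy as np

U1 = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])   # player 1 (rows T, M, B)
U2 = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])   # player 2 (cols L, C, R)

n1 = np.array([2.0, 1.0, 1.0])   # weights on player 1's past actions
n2 = np.array([1.0, 1.0, 2.0])   # weights on player 2's past actions
for t in range(1, 100_001):
    a1 = int(np.argmax(U1 @ (n2 / n2.sum())))   # best response to forecast
    a2 = int(np.argmax((n1 / n1.sum()) @ U2))
    n1[a1] += 1
    n2[a2] += 1
    if t in (1_000, 10_000, 100_000):
        print(t, np.round(n1 / n1.sum(), 3))    # marginals keep drifting
```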

Remark 14. Fictitious play has a Bayesian interpretation. Consider again the Matching Pennies example and suppose that player 1 is certain that player 2 is playing an i.i.d. strategy, that is, a strategy of the form: play H with probability q* in every period, regardless of what has happened in the game. Suppose that player 1 does not know q* but believes that it is drawn from the uniform distribution on [0, 1]. It is not hard to show that player 1's forecast for period t + 1 is then exactly as described above, with q^0 = 1/2 and k_2 = 2. (But player 1 need not know the details of exactly how player 2 assigns probability: any Beta distribution for q* implies a q^0, a k_2 > 0, and a forecast rule of the sort described.) And similarly for player 2. Fictitious play players are thus “as if” Bayesian. But they aren't very sophisticated: each player is certain that the other is playing an i.i.d. strategy, even though neither is, in fact, i.i.d. Indeed, if player 1 believes that player 2 is playing an i.i.d. strategy then player 1 is certain that player 2 is not optimizing. Actually, the situation is worse than this suggests. Suppose that player 1 thinks (correctly) that player 2 thinks that player 1 is playing an i.i.d. strategy; given this correct belief about player 2's belief, player 1 can deduce that player 2's optimal play is not i.i.d., contradicting player 1's own belief about player 2. It is one thing for a learning theory to model an optimizing player as thinking his opponent might not be optimizing; it is quite another for a learning theory to model an optimizing player as being certain that his opponent is not optimizing. The natural response to this inconsistency in the Bayesian interpretation of fictitious play is to enrich each player's belief by including more complicated strategies along with the i.i.d. strategies. But as we include more complicated strategies in a player's belief, we also make her best response more complicated, so that the richer theory may still be inconsistent. Nachbar (1997) shows that the inconsistency exhibited by fictitious play is, in fact, a general feature of Bayesian and “as if” Bayesian learning theories.
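The “as if” claim in Remark 14 is easy to verify for Beta priors: the Bayesian posterior-predictive probability of H coincides with the fictitious play forecast under the parameter mapping q^0 = α/(α + β), k_2 = α + β. A quick numerical check (helper names are mine):

```python
# With a Beta(alpha, beta) prior on q*, the predicted probability of H
# after observing h heads in t plays is (alpha + h) / (alpha + beta + t).
def beta_forecast(alpha, beta, h, t):
    return (alpha + h) / (alpha + beta + t)

# The fictitious play forecast rule from the text.
def fp_forecast(q0, k2, h, t):
    hbar = h / t if t else 0.0
    return (k2 / (t + k2)) * q0 + (t / (t + k2)) * hbar

alpha, beta = 1, 1                               # uniform prior on [0, 1]
q0, k2 = alpha / (alpha + beta), alpha + beta    # q0 = 1/2, k2 = 2
for h, t in [(0, 0), (3, 10), (25, 100)]:
    assert abs(beta_forecast(alpha, beta, h, t)
               - fp_forecast(q0, k2, h, t)) < 1e-12
```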

9.4.2 Learning, S^R, and S^W.

I spend the remainder of this section surveying what I believe to be the main lessons of the learning literature. Most, but not quite all, learning theories predict that play eventually lies in S^R, that is, that play is eventually rationalizable. Thus, in many learning theories, rationality is not necessary to justify rationalizability: even if players are extremely naive, their long run behavior can look as though it were generated by players of unlimited calculating power. The connection between learning theories and S^W (and other, related, concepts) is complicated and only partially understood. For many but not all theories, if play converges to something then that something is either in S^W or is close (possibly only in a very weak sense) to something in S^W. But it is also quite possible for play to cycle, spending part of the time in S^W, then veering outside of S^W, then returning at a later date, and so on, without ever settling down. And in evolutionary theories it is possible for play to converge to something not in S^W; see Binmore and Samuelson (1999).

The statement that play eventually lies in S^R (or, for that matter, in S^W) is subject to several caveats. First, if players only ε-optimize, strategies that are not “too” strictly dominated can survive indefinitely. Second, learning theories invariably embed the original game, call it G, in a larger game, call it Γ. I mention two consequences of this. First, convergence to rationalizable behavior means convergence to behavior that is rationalizable in Γ, not necessarily to behavior that is rationalizable in G. One standard example is the following. Consider two players playing the Prisoner's Dilemma (Example 21) over and over. As noted in Example 21, the only rationalizable action in the Prisoner's Dilemma played only once is F, so one might naively conjecture that learning theories predict that eventually both players play F in every period. But, strictly speaking, the players are engaged, not in the Prisoner's Dilemma (G), but in the repeated Prisoner's Dilemma (Γ), and, as discussed in Example 47, for a discount factor close enough to 1, playing C in some or even all periods along the path of play is rationalizable in the repeated game. So it is possible for a learning theory to forecast play of C even though C is not rationalizable in G. Second, convergence to equilibrium must be understood as being with respect to equilibria of Γ, not necessarily to equilibria of G. For example, suppose that two players play Battle of the Sexes (Example 1) repeatedly. Then one possible outcome is that play alternates between the two pure Nash equilibria, with (a, a) played in even periods and (b, b) played in odd periods. The empirical distribution over actions in the original game is then the one given in Figure 34. That is, the players are actually engaged in a Nash equilibrium of Γ but an observer sees a correlated distribution of G: depending on one's perspective (G or Γ), play can look correlated.

9.4.3 Learning, Nash equilibrium, and correlated equilibrium.

Learning theories typically do not guarantee convergence to Nash equilibrium, even in a weak sense. A more typical result is that play converges to Nash equilibrium in some situations but not in others. Convergence to play of a strict Nash equilibrium (“strict Nash equilibrium” was defined in Section 9.4.1) is usually fairly easy to obtain; convergence of the empirical distribution to that of a mixed equilibrium is harder, and play can fail to converge at all. Convergence to correlated equilibrium is somewhat easier to obtain. Foster and Vohra (1997), Fudenberg and Levine (1999), and Hart and Mas-Colell (2000) provide related but distinct learning theories in which the empirical distribution of play eventually converges to the set of correlated equilibrium distributions, Σ^Corr. (The empirical distribution can wander around within Σ^Corr, without ever settling down.) See also Hart and Mas-Colell (1999).
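For concreteness, here is a compressed sketch of the adaptive procedure of Hart and Mas-Colell (2000), specialized to a two-action stage game (the Chicken payoffs from Section 9.3.2 are reused). Each period a player switches away from her previous action with probability proportional to her positive conditional regret; MU is the paper's inertia parameter (any sufficiently large value works), and the helper names are mine.

```python
import random

u1 = {('J', 'J'): 6, ('J', 'D'): 2, ('D', 'J'): 7, ('D', 'D'): 0}
u2 = {(a, b): u1[b, a] for a in 'JD' for b in 'JD'}
MU, T = 20.0, 100_000            # inertia parameter and horizon

def regret_matching(seed=0):
    rng = random.Random(seed)
    a1, a2 = 'J', 'J'                                 # arbitrary start
    r1 = {(a, b): 0.0 for a in 'JD' for b in 'JD'}    # summed regrets a -> b
    r2 = dict(r1)
    freq = {}
    for t in range(1, T + 1):
        def draw(last, r):
            # switch with probability (positive average regret) / MU
            other = 'D' if last == 'J' else 'J'
            p_switch = max(r[last, other], 0.0) / (t * MU)
            return other if rng.random() < p_switch else last
        a1, a2 = draw(a1, r1), draw(a2, r2)
        freq[a1, a2] = freq.get((a1, a2), 0) + 1
        for b in 'JD':            # update conditional regrets for played actions
            r1[a1, b] += u1[b, a2] - u1[a1, a2]
            r2[a2, b] += u2[a1, b] - u2[a1, a2]
    return {k: v / T for k, v in freq.items()}

print(regret_matching())   # empirical frequencies approach the correlated
                           # equilibrium set of Chicken
```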

Here are a few additional comments.

1. As illustrated in Section 9.4.1, if one's standard of convergence is that the empirical frequency of play converges to that of a Nash equilibrium, then it is possible to get convergence to a mixed strategy Nash equilibrium without the players actually playing mixed strategies. This sort of phenomenon, in which play looks at least somewhat random to an outside observer even though the players are not, strictly speaking, randomizing, shows up in various guises throughout the learning literature. For this reason, I find the classical interpretation of a mixed strategy, as representing explicit randomization, needlessly limiting.

2. There may be Nash equilibria of Γ in which the outcomes look completely unlike anything that could arise as an equilibrium outcome of G. The standard example is the repeated Prisoner's Dilemma: as discussed in Example 48, there may be equilibria in which players play C in every period along the path of play, whereas only F is played in the unique Nash equilibrium of the stage game.

3. Learning theories that exhibit global convergence to Nash or at least correlated equilibrium typically model players as unsophisticated. Learning theories with sophisticated players (Bayesian optimizing players) have, to date, been able to establish global convergence only at the price of imposing strong, equilibrium or equilibrium-like, assumptions at the outset. For example, there is a learning literature with sophisticated players, in which players know their own payoff function for the stage game but not the payoff functions of their opponents, that gets a form of global convergence, but it assumes that players are, in effect, already in equilibrium within a repeated game of incomplete information. The difficulty in establishing global convergence in learning theories with sophisticated players may be related to the inconsistency, discussed in Remark 14, inherent in Bayesian theories of learning in repeated games.

4. In games with non-trivial extensive forms, Nash equilibrium requires that players behave as if they were best responding to their opponents' extensive form strategies. This is extremely demanding because it requires that forecasts be correct not only at information sets along the path of play (at which players may already have observations from previous experiences with the game) but also at information sets off the path of play (at which players may have few if any observations). An alternative, weaker, equilibrium concept called self-confirming equilibrium requires that players behave as if they were best responding to forecasts that were correct along the path of play but not necessarily off the path of play; see Fudenberg and Levine (1993a).20 Any Nash equilibrium is a self-confirming equilibrium, but not necessarily vice versa. Fudenberg and Levine (1993b) provides an example of a

learning theory that yields convergence to self-confirming equilibrium but not necessarily to Nash equilibrium. For two-player games, the distinction between Nash equilibrium and self-confirming equilibrium is largely irrelevant: if one looks at distributions over terminal nodes (which is all an observer typically sees), then any distribution that can be generated by a self-confirming equilibrium can also be generated by a Nash equilibrium. But this is no longer true for games with more than two players: there exist simple three-player examples in which there is a self-confirming equilibrium that reaches a terminal node that could not have been reached under any Nash equilibrium.

20 Solution concepts similar to self-confirming equilibrium have been proposed, in different contexts, by a number of authors. In addition to Fudenberg and Levine (1993a), which coined the term, see Hahn (1977) and Kalai and Lehrer (1993).

9.5 Empirical Evidence.

The empirical evidence on game theory is drawn partly from laboratory experiments and partly from real world data. The literature is vast and growing rapidly, and my discussion is brief. Surveys include Roth (1995) and Crawford (1997).

1. The evidence suggests convergence to Nash, or at least correlated, equilibrium in many games and settings, but also failure of convergence in other games and other settings.

2. In games like Matching Pennies, in which the unique equilibrium involves randomization, if convergence to Nash equilibrium obtains at all it is typically in the weak sense discussed in connection with fictitious play (Section 9.4.1). Some of the debate in this literature has hinged on the correct definition of “converge.” Papers that define convergence using empirical frequencies sometimes find evidence of correlation.

3. There is evidence that, in at least some games, players carry out one or perhaps two rounds of strict dominance deletion before choosing their own action. The evidence is not consistent with most players carrying out more than two rounds of strict dominance deletion. But there is evidence of eventual convergence to S^R as players grow more experienced.

4. There is a large body of evidence that players persistently play weakly dominated strategies. Much of this evidence is associated with a stylized bargaining game called the ultimatum game. There is continuing controversy over exactly how this evidence should be interpreted.

5. The evidence is, as one would expect, consistent with players learning over time, which is consistent with the discussion in Section 9.4.

Naively, one would expect that Nash or correlated equilibrium would be a good predictor of long run behavior in simple games but possibly not in complicated games. But Nash equilibrium sometimes fails to predict behavior well even in games that, one would think, are simple. Conversely, there is real world evidence from auctions, which are very complicated games, that players execute strategies that are consistent with equilibrium. The classic cite here is Hendricks and Porter (1988).

References

Aumann, R. (1964): “Mixed and Behaviour Strategies in Infinite Extensive Games,” in Advances in Game Theory, ed. by M. Dresher, L. S. Shapley, and A. W. Tucker, Annals of Mathematics Studies. Princeton, NJ: Princeton University Press, 627–650.

Aumann, R. (1974): “Subjectivity and Correlation in Randomized Strategies,” Journal of Mathematical Economics, 1, 67–96.

Aumann, R. (1976): “Agreeing to Disagree,” Annals of Statistics, 4, 1236–1239.

Aumann, R. (1987): “Correlated Equilibrium as an Expression of Bayesian Rationality,” Econometrica, 55, 1–18.

Baye, M., G. Tian, and J. Zhou (1993): “Characterizations of the Existence of Equilibria in Games with Discontinuous and Non-quasiconcave Payoffs,” Review of Economic Studies, 60, 935–948.

Bernheim, B. D. (1984): “Rationalizable Strategic Behavior,” Econometrica, 52(4), 1007–1028.

Binmore, K. (1987): “Modeling Rational Players, Part I,” Economics and Philosophy, 3, 179–214.

Binmore, K., and L. Samuelson (1999): “Evolutionary Drift and Equilibrium Selection,” Review of Economic Studies, 66(2), 363–393.

Börgers, T., and L. Samuelson (1992): “‘Cautious’ Utility Maximization and Iterated Weak Dominance,” International Journal of Game Theory, 21, 13–25.

Brandenburger, A. (1992): “Knowledge and Equilibrium in Games,” Journal of Economic Perspectives, 6(4), 83–101.

Brandenburger, A., and E. Dekel (1993): “Hierarchies of Beliefs and Common Knowledge,” Journal of Economic Theory, 59, 189–198.

Canning, D. (1992): “Rationality, Computability, and Nash Equilibrium,” Econometrica, 60, 877–888.

Crawford, V. (1997): “Theory and Experiment in the Analysis of Strategic Interaction,” in Advances in Economics and Econometrics: Theory and Applications, vol. 1, ed. by D. Kreps and K. Wallis. Cambridge, UK: Cambridge University Press.

Dekel, E., and F. Gul (1997): “Rationality and Knowledge in Game Theory,” in Advances in Economics and Econometrics: Theory and Applications, vol. 1, ed. by D. Kreps and K. Wallis. Cambridge, UK: Cambridge University Press.

Foster, D., and R. Vohra (1997): “Calibrated Learning and Correlated Equilibrium,” Games and Economic Behavior, 21, 40–55.

Fudenberg, D., and D. Levine (1993a): “Self-Confirming Equilibrium,” Econometrica, 61(3), 523–545.

Fudenberg, D., and D. Levine (1993b): “Steady State Learning and Nash Equilibrium,” Econometrica, 61(3), 547–574.

Fudenberg, D., and D. Levine (1998): Theory of Learning in Games. Cambridge, MA: MIT Press.

Fudenberg, D., and D. Levine (1999): “Conditional Universal Consistency,” Games and Economic Behavior, 29, 104–130.

Fudenberg, D., and J. Tirole (1991): Game Theory. Cambridge, MA: MIT Press.

Govindan, S., and A. McLennan (1998): “On the Generic Finiteness of Equilibrium Outcome Distributions in Game Forms,” University of Minnesota.

Hahn, F. (1977): “Exercises in Conjectural Equilibrium,” Scandinavian Journal of Economics, 79, 210–226.

Harsanyi, J. (1967): “Games with Incomplete Information Played by Bayesian Players, Parts I, II, III,” Management Science, 14, 159–182, 320–334, 486–502.

Harsanyi, J., and R. Selten (1988): A General Theory of Equilibrium Selection in Games. Cambridge, MA: MIT Press.

Hart, S., and A. Mas-Colell (1999): “A General Class of Adaptive Strategies.”

Hart, S., and A. Mas-Colell (2000): “A Simple Adaptive Procedure Leading to Correlated Equilibrium,” Econometrica, 68(5), 1127–1150.

Heifetz, A. (1999): “How Canonical is the Canonical Model? A Comment on Aumann's Interactive Epistemology,” The Hebrew University of Jerusalem.

Hendricks, K., and R. Porter (1988): “An Empirical Study of an Auction with Asymmetric Information,” American Economic Review, 78(5), 865–883.

Kalai, E., and E. Lehrer (1993): “Subjective Equilibrium in Repeated Games,” Econometrica, 61(5), 1231–1240.

Kuhn, H. W. (1953): “Extensive Games and the Problem of Information,” in Contributions to the Theory of Games, Volume II, ed. by H. W. Kuhn and A. W. Tucker, Annals of Mathematics Studies 28. Princeton, NJ: Princeton University Press, 193–216.

Mailath, G. (1998): “Do People Play Nash Equilibrium? Lessons from Evolutionary Game Theory,” Journal of Economic Literature, 36, 1347–1374.

Mas-Colell, A., M. Whinston, and J. Green (1995): Microeconomic Theory. New York, NY: Oxford University Press.

Milgrom, P., and R. Weber (1985): “Distributional Strategies for Games with Incomplete Information,” Mathematics of Operations Research, 10, 619–632.

Nachbar, J. (1997): “Prediction, Optimization, and Learning in Repeated Games,” Econometrica, 65, 275–309.

Nash, J. (1950): “Equilibrium Points in n-Person Games,” Proceedings of the National Academy of Sciences, 36, 48–49.

Osborne, M., and A. Rubinstein (1994): A Course in Game Theory. Cambridge, MA: MIT Press.

Pearce, D. (1984): “Rationalizable Strategic Behavior and the Problem of Perfection,” Econometrica, 52(4), 1029–1050.

Reny, P. (1999): “On the Existence of Pure and Mixed Strategy Nash Equilibria in Discontinuous Games,” Econometrica, 67(5), 1029–1056.

Roth, A. (1995): “Introduction to Experimental Economics,” in Handbook of Experimental Economics, ed. by J. Kagel and A. Roth. Princeton, NJ: Princeton University Press.

Samuelson, L. (1997): Evolutionary Games and Equilibrium Selection. Cambridge, MA: MIT Press.

Schelling, T. (1960): The Strategy of Conflict. Cambridge, MA: Harvard University Press.

Shapley, L. (1962): “On the Nonconvergence of Fictitious Play,” Discussion Paper RM-3026, RAND.

Simon, L., and W. Zame (1990): “Discontinuous Games and Endogenous Sharing Rules,” Econometrica, 58, 861–872.

Sion, M., and P. Wolfe (1957): “On a Game Without a Value,” in Contributions to the Theory of Games, Volume III, ed. by M. Dresher, A. W. Tucker, and P. Wolfe, Annals of Mathematics Studies. Princeton, NJ: Princeton University Press, 299–306.

Weibull, J. (1995): Evolutionary Game Theory. Cambridge, MA: MIT Press.

Young, H. P. (1998): Individual Strategy and Social Structure: An Evolutionary Theory of Institutions. Princeton, NJ: Princeton University Press.

- GameTheoryNC17