
Solutions to Problem Set 4

David Jimenez-Gomez, 14.11 Fall 2014

Due on 11/7. If you are working with a partner, you and your partner may turn in a single copy
of the problem set. Please show your work and acknowledge any additional resources consulted.
Questions marked with an (∗) are intended for math-and-game-theory-heads who are interested
in deeper, formal exploration, perhaps as preparation for grad school. The questions typically
demonstrate the robustness of the results from class or other problems, and the answers do not
change the interpretation of those results. Moreover, this material will not play a large role on the
exam and tends to be worth relatively little on the problem sets. Some folks might consequently
prefer to skip these problems.

1 Grim Trigger in the Repeated Prisoner’s Dilemma (70 points)


In one instance of the prisoner’s dilemma, each player chooses whether to pay some cost c > 0 in
order to confer a benefit b > c onto the other player. The payoffs from a single iteration of this
prisoner’s dilemma are therefore:

              Cooperate         Defect
Cooperate     (b − c, b − c)    (−c, b)
Defect        (b, −c)           (0, 0)

The repeated prisoner’s dilemma (see Section 5 of the Game Theory handout on Repeated Games for details) is built out of several stages, each of which is a copy of the
above game. At the end of each stage, the two players repeat the prisoner’s dilemma again with
probability δ, where 0 ≤ δ ≤ 1. A strategy in the repeated prisoner’s dilemma is a rule which
determines whether a player will cooperate or defect in each given stage. This rule may depend on
which round it is, and on either player’s actions in previous rounds.
For example, the grim trigger strategy is described by the following rule: cooperate if neither
player has ever defected, and defect otherwise. The goal of this problem is to show that the
strategy pair in which both players play grim trigger is a Nash equilibrium if δ > c/b.

1. Suppose that player 1 and player 2 are both following the grim trigger strategy. What actions
will be played in each stage of the repeated game? What are the payoffs to players 1 and 2
in each stage?
Answer: In each stage of the repeated game, players 1 and 2 will cooperate. The payoffs
will then be b − c to each player in a given stage.

2. Using your result from part 1, write down the expected payoff to player 1 from the entire
repeated prisoner’s dilemma in terms of c, b, and δ.

Hint: Remember that, if |δ| < 1:

a + aδ + aδ² + aδ³ + … = a/(1 − δ)

Answer: The payoff to player 1 is:

(b − c) + (b − c)δ + (b − c)δ² + … = (b − c)/(1 − δ)
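
As a quick numerical sanity check (not part of the original solution), one can verify the closed form in Python; the values of b, c, and δ below are purely illustrative:

    # Truncated discounted sum of the per-stage payoff (b - c) vs. the
    # closed form (b - c)/(1 - delta).
    b, c, delta = 3.0, 1.0, 0.5      # illustrative values with b > c > 0
    partial = sum((b - c) * delta**t for t in range(200))
    closed_form = (b - c) / (1 - delta)
    print(partial, closed_form)      # both print 4.0 (up to rounding)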
3. Now we will check whether player 1 can improve his payoff by deviating from the grim trigger
strategy. Argue that we only need to check the case where player 1 plays all-D, that is, player
1 defects in every round.
Answer: If player 1 deviates, then he must defect in some round. If he has an incentive to
defect in some round k, then, because the continuation game from round k looks identical to
the game from round 1, he also has an incentive to defect in the first round. But if player 1
defects in the first round, then player 2 defects forever, so it cannot be optimal for player 1
to cooperate in any later round. Thus, if player 1 has an incentive to deviate to any strategy
at all, he also has an incentive to deviate to all-D.

4. Suppose that player 2 plays grim trigger and player 1 deviates from grim trigger and plays
all-D. What is the total payoff to player 1 from the entire repeated prisoner’s dilemma?
Answer: Player 1 receives payoff b in the first round; from round 2 onward player 2’s grim
trigger is triggered and both players defect, so player 1 receives 0 in each subsequent round.
His total payoff is therefore b.

5. For grim trigger to be a Nash equilibrium, we need that the payoff to player 1 from playing
grim trigger is greater than or equal to the payoff to player 1 from playing all-D, assuming
player 2’s strategy is fixed.
Using your results from parts 2 and 4, write down an inequality that must be satisfied in order
for grim trigger to be a Nash equilibrium. Simplify this inequality to obtain the condition
δ > c/b.
Answer: We have:

b < (b − c)/(1 − δ)  ⇒  1 − δ < (b − c)/b  ⇒  1 − (b − c)/b < δ  ⇒  c/b < δ

This is the desired result.
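
A small Python sketch (illustrative values, not part of the formal argument) confirms that the comparison between grim trigger and all-D flips exactly at δ = c/b:

    # Against grim trigger, cooperating forever pays (b - c)/(1 - delta),
    # while deviating to all-D pays b.
    b, c = 3.0, 1.0                   # threshold c/b = 1/3
    for delta in (0.2, 1/3, 0.5, 0.9):
        grim = (b - c) / (1 - delta)
        print(delta, grim > b)        # False at and below c/b, True above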

6. (∗) - 10 points. Show that the Grim Trigger is a Subgame Perfect equilibrium in addition to
being a Nash equilibrium [Hint: use the one-stage deviation principle]

Answer. To show that the Grim Trigger is SPE, we need to show that players do not have
incentives to deviate even in histories which are impossible when both players follow the strat-
egy. Namely, we have to consider histories where at least one player has defected.

Following the same logic as in part 3, we only need to make sure that player 1 does not have an
incentive to play C if somebody ever played D. If player 1 sticks to the Grim Trigger strategy
and plays D, she obtains 0. If she deviates to C only this period, and later conforms to Grim
Trigger, she obtains −c (because player 2 plays D, since he is using Grim Trigger and we are
in a history where somebody defected). Therefore, the deviation is not profitable in these
histories; we already knew deviations were not profitable in histories where both players
have always cooperated. Therefore, by the one-stage deviation principle, the Grim Trigger is
an SPE.

So far we have focused on the Grim Trigger because it is a relatively simple strategy to under-
stand, but not necessarily because we think it is used in practice. Importantly, many of the
insights we have learned from studying the Grim Trigger generalize to any Nash equilibrium.

7. (∗) - 10 points. Show that in any Nash equilibrium in which both players play C at each
period, player 2 must cooperate less in the future if player 1 were to deviate and play D at
any period instead of C. Interpret this result in terms of ‘reciprocity,’ as discussed in lecture.

Answer. Let s be a Nash equilibrium where both players choose C at each period. That
means that player 1’s payoff from s is (b − c)/(1 − δ). Let W be the total payoff that player 1
would get from round 2 onwards if she chose D in the first period; then her payoff for playing
D would be b + δW. In order for s to be a Nash equilibrium it must be the case that

(b − c)/(1 − δ) ≥ b + δW.

Now, if player 2 cooperated at all future rounds, then W would be at least (b − c)/(1 − δ),
and possibly more. In that case, for s to be a Nash equilibrium, it must be the case that

(b − c)/(1 − δ) ≥ b + δ(b − c)/(1 − δ)  ⟺  b − c ≥ b,
which is a contradiction! We obtained this contradiction by assuming that player 2 con-
tinued to cooperate at all periods after player 1 defected; therefore it must be the case that
player 2 defects in at least one period after player 1 defects, proving our claim.
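
A numerical illustration of the contradiction (illustrative values; this is only a sanity check of the algebra, not part of the proof):

    # If player 2 kept cooperating after a defection, the deviator's
    # continuation payoff would be W = (b - c)/(1 - delta), and defecting
    # would strictly beat cooperating (since b > b - c).
    b, c, delta = 3.0, 1.0, 0.8
    W = (b - c) / (1 - delta)
    cooperate = (b - c) / (1 - delta)              # payoff from C forever
    defect = b + delta * W                         # payoff from D with no punishment
    print(cooperate, defect, cooperate >= defect)  # approx. 10.0, 11.0, False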

2 No Cooperation for Small δ (50 points)
In lecture, we argued that cooperative equilibria exist in the repeated prisoner’s dilemma if and
only if δ > c/b. In problem 1, you showed that we can have a Nash equilibrium in which both players
always cooperate (specifically, the equilibrium in which both players play grim trigger) if δ > c/b. In
this problem, we will show that if δ < c/b, then the only Nash equilibrium is (all-D, all-D). That is,
cooperative equilibria exist only if δ > c/b. Combined, your responses to these two questions thus
provide a complete proof of our claim from lecture.

1. Suppose that the strategy pair (s1 , s2 ) is a Nash equilibrium, and let U1 (s1 , s2 ) and U2 (s1 , s2 )
be the payoffs to players 1 and 2, respectively. Show that U1 (s1 , s2 ) ≥ 0 and U2 (s1 , s2 ) ≥ 0.
Answer: Suppose either player received some negative payoff Ui(s1, s2) < 0. Then player
i could improve his payoff by deviating to all-D, since defecting in every round costs nothing
and so guarantees a payoff Ui(all-D, s−i) ≥ 0. Hence player i has an incentive to deviate,
contradicting that (s1, s2) is Nash. We conclude that each player’s payoff is nonnegative.

2. Notice that, in each round of the prisoner’s dilemma, the sum of the payoffs to players 1 and
2 is either 2(b − c), b − c, or 0. Show that, for any strategy pair (s1, s2),

U1(s1, s2) + U2(s1, s2) ≤ 2(b − c)/(1 − δ).

Answer: The sum of the payoffs in each round is 2(b − c), b − c, or 0. In every case, the
sum of the payoffs is at most 2(b − c). Summing over all rounds,

U1(s1, s2) + U2(s1, s2) ≤ 2(b − c) + 2(b − c)δ + … = 2(b − c)/(1 − δ)    (1)

3. Now assume δ < c/b. Using your results from part 2, show that U1(s1, s2) + U2(s1, s2) < 2b for
any strategy pair (s1 , s2 ). Use this to conclude that, if (s1 , s2 ) is a Nash equilibrium, at least
one player receives total payoff less than b.
Answer: If δ < c/b, then 1/(1 − δ) < b/(b − c) by algebra. Plugging this into the result of part 2,

U1(s1, s2) + U2(s1, s2) ≤ 2(b − c)/(1 − δ) < 2(b − c) · b/(b − c) = 2b    (2)

Then at least one player receives total payoff less than b (if both received payoff at least b,
the sum would be at least 2b, contradicting (2)).
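
A quick numerical check of this bound (illustrative values only):

    # When delta < c/b, the joint payoff bound 2(b - c)/(1 - delta) stays
    # strictly below 2b.
    b, c = 3.0, 1.0                   # threshold c/b = 1/3
    for delta in (0.1, 0.2, 0.3):     # all below c/b
        bound = 2 * (b - c) / (1 - delta)
        print(delta, bound, bound < 2 * b)   # True in every case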

4. Suppose that, when players 1 and 2 play s1 and s2 , both players cooperate in some round
k. Without loss of generality, we may assume that k = 1 (otherwise we apply the argument
from parts 1-3 to the subgame starting at round k, introducing a factor of δ^(k−1)). Using your
result from part 3, show that one of the players can improve his payoff by deviating.
Answer: Suppose, per part 3, that player 1 is the player receiving total payoff less than b.
If players 1 and 2 cooperate in round 1, player 1 can improve his overall payoff by playing
all-D: he earns b in round 1 (since player 2 cooperates there) and at least 0 in every later
round, for a total of at least b. Thus (s1, s2) cannot be a Nash equilibrium.

5. Next we need to rule out the possibility of a round in which one player cooperates and the
other defects. Repeat the argument of part 2 using the additional result that players 1 and
2 never simultaneously cooperate (so the sum of their payoffs in a given round is either b − c
or 0). Show that U1(s1, s2) + U2(s1, s2) ≤ (b − c)/(1 − δ).
Answer: Since players 1 and 2 never simultaneously cooperate, we have that the sum of
their payoffs in a given round is at most b − c. Thus by the argument above,

U1(s1, s2) + U2(s1, s2) ≤ (b − c) + (b − c)δ + … = (b − c)/(1 − δ)    (3)

6. Again assume that δ < c/b. Use your results from parts 1 and 5 to conclude that each player’s
payoff is less than b; that is, U1 (s1 , s2 ) < b and U2 (s1 , s2 ) < b.
Answer: Since we have also shown 1/(1 − δ) < b/(b − c), we have:

U1(s1, s2) + U2(s1, s2) ≤ (b − c)/(1 − δ) < (b − c) · b/(b − c) = b    (4)
Since the payoffs are nonnegative, both players receive payoff less than b.

7. Now suppose that, in the first round, player 1 cooperates and player 2 defects. By your
reasoning from part 6, player 2 receives total payoff less than b. Show that player 2 can
improve his payoff by deviating, so that (s1, s2) is not a Nash equilibrium.
Answer: Under (s1, s2), player 2 defects in the first round and receives payoff b there, but
by part 6 his total payoff over the whole game is less than b. Deviating to all-D would earn
him b in round 1 (player 1 cooperates) and at least 0 thereafter, for a total of at least b. Thus
player 2 has an incentive to deviate, and we conclude that (s1, s2) is not a Nash equilibrium.

Using this proof by contradiction, you have shown that a strategy pair (s1, s2) which involves
cooperation in any period cannot be a Nash equilibrium if δ < c/b. It follows that (all-D, all-D) is
the only equilibrium in this case.

3 Incorporating Altruism’s Quirks: Observability and Inattention to Efficacy (50 points)

Consider the following simple twist to the repeated PD: in the first period, agents are not paying
full attention to the game, and with probability 1 − p they do not observe which actions were played.
In all remaining periods, both agents are paying attention and know which actions were played. Note
that when p = 1, we are back to the standard case of the repeated PD. However, when p < 1, it
may be that actions were not observed in the first period.

1. For an arbitrary p, we define Grim Trigger as the strategy that cooperates if the player never
observed D, and defects otherwise. We know from lecture that when p = 1, the strategy

profile where both players use the Grim Trigger is a Nash equilibrium when δ ≥ c/b. Show
that both players using Grim Trigger is a Nash equilibrium if p ≥ (1 − δ)/δ · c/(b − c).

[Hint: Things become much easier if you define a new variable W which is the total payoff
player i will receive from period 2 and onwards if both play according to the strategy profile.
That allows us to write payoffs as Ui (s) = b − c + δW ].

Answer. Consider the deviation which plays D in the first period. With probability p the
defection is observed, player 2 defects forever, and the continuation payoff from period 2
onwards is 0; with probability 1 − p it is not observed, and the continuation payoff is W.
This deviation will not be profitable if

b − c + δW ≥ b + δ(1 − p)W,

which is equivalent to p ≥ c/(δW). Now, taking into account that W = (b − c)/(1 − δ), we get
that the deviation is not profitable when

p ≥ (1 − δ)/δ · c/(b − c).
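
A short Python sketch (with illustrative parameter values) verifies that the deviation comparison flips at this threshold:

    # Sticking with grim trigger pays b - c + delta*W; a first-period
    # defection pays b + delta*(1 - p)*W, where W = (b - c)/(1 - delta).
    b, c, delta = 3.0, 1.0, 0.6
    W = (b - c) / (1 - delta)
    p_star = (1 - delta) / delta * c / (b - c)    # threshold from the text
    for p in (p_star - 0.1, p_star + 0.1):
        cooperate = b - c + delta * W
        deviate = b + delta * (1 - p) * W
        print(p, cooperate >= deviate)            # False below p*, True above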

2. (∗) - 5 points. Show that when p < (1 − δ)/δ · c/(b − c), at least one player must play D in
the first period. [Hint: in addition to W, define a new variable V as the payoff player i will
get from period 2 onwards if he deviates from the strategy profile. Also, notice there is always
a player whose utility is at most (b − c)/(1 − δ).]

Answer. We know that U1(s) + U2(s) ≤ 2(b − c)/(1 − δ). Therefore there is at least one player i
such that Ui(s) ≤ (b − c)/(1 − δ). Let V be the payoff player i would get from round 2 onwards
if player −i observed him playing D, and W be the payoff player i would get from period 2
onwards if player −i did not observe him playing D. Then player i will defect if

b − c + δW < b + δ(1 − p)W + δpV,    (5)

or equivalently

δpW < c + δpV.    (6)

Now, we know that W ≤ (b − c)/(1 − δ) and V ≥ 0, and therefore Equation 6 holds whenever
p < (1 − δ)/δ · c/(b − c).

3. In light of these results, discuss the connection between altruism and observability. How
does this relate to the ‘observability’ experimental results discussed in class, such as the “eye
spots” experiment?

Answer. These results show that observability is fundamental for players to be able to co-
operate. If player 2 cannot observe the actions of player 1, then player 2 cannot use punishments
to incentivize player 1 to cooperate - and therefore cooperation breaks down.

Therefore, we would expect players to cooperate more when they are being observed. The
eye spots experiment shows that even a cue of observability is enough to obtain this effect.

4. Now suppose that in all periods, in addition to D and C, there is an extra action E, which
has the same payoffs as playing C, except it costs an extra e to play and it yields an extra 2e
to the other player, for e ≥ 0. In the first period, we now suppose that, with probability 1 − p,
players cannot tell whether the other player chose C or E. In period 2 and subsequent periods,
players can always tell which action their opponent chose. Suppose that δ ≥ (c + e)/(b + 2e).

Consider the Efficient Grim Trigger strategy, which plays E if nothing other than E has been
observed, and D otherwise. Show that the strategy profile where both players use the Efficient
Grim Trigger strategy is a Nash equilibrium if

p ≥ (1 − δ)e / (δ(b + e − c))

Answer. Consider first the deviation where player 1 plays C in the first round. This deviation
is not profitable if

b + e − c + δW ≥ b + 2e − c + δ(1 − p)W,

or equivalently

p ≥ e/(δW).

Taking into account W = (b + e − c)/(1 − δ), the last expression becomes

p ≥ (1 − δ)e / (δ(b + e − c)).    (7)

Therefore when Equation 7 holds, playing C is not a profitable deviation. Now, let’s consider
a deviation to D. This deviation is always observable, and it will not be profitable whenever

b + e − c + δW ≥ b + 2e,

and taking into account W = (b + e − c)/(1 − δ) and solving for δ we find that D is not profitable if

δ ≥ (c + e)/(b + 2e),

which was assumed in the description of the problem.
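
Both deviation conditions can be sanity-checked numerically; the parameter values below are illustrative, chosen so that δ ≥ (c + e)/(b + 2e) holds:

    # Efficient Grim Trigger: check the C-deviation threshold p* and the
    # D-deviation condition, with W = (b + e - c)/(1 - delta).
    b, c, e, delta = 3.0, 1.0, 0.5, 0.7    # (c + e)/(b + 2e) = 0.375 <= delta
    W = (b + e - c) / (1 - delta)
    stay = b + e - c + delta * W            # payoff from following the strategy
    p_star = (1 - delta) * e / (delta * (b + e - c))
    for p in (p_star - 0.05, p_star + 0.05):
        dev_C = b + 2*e - c + delta * (1 - p) * W
        print(p, stay >= dev_C)             # False below p*, True above
    dev_D = b + 2*e                         # a defection is observed for sure
    print(stay >= dev_D)                    # True, since delta >= (c + e)/(b + 2e)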

5. (∗) - 5 points. Show that when p < (1 − δ)e/(δ(b + e − c)), there is no Nash equilibrium in
which both players play E in the first round.

Answer. We will prove this using a similar method as in part 2. We know that U1(s) + U2(s) ≤
2(b + e − c)/(1 − δ). Therefore there is at least one player i such that Ui(s) ≤ (b + e − c)/(1 − δ).
Let V be the payoff player i would get on round 2 and onwards if player −i observed him
playing C, and W be the payoff player i would get on period 2 and onwards if player −i
observed him playing E (or could not differentiate between C and E). Then player i will play C if

b + e − c + δW < b + 2e − c + δ(1 − p)W + δpV,    (8)

or equivalently

δpW < e + δpV.    (9)

Now, we know that W ≤ (b + e − c)/(1 − δ) and V ≥ 0, and therefore Equation 9 holds whenever
p < (1 − δ)e/(δ(b + e − c)). This proves that C is a profitable deviation, and therefore player i
will not play E in the first round.
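
Again, a quick numerical check (illustrative values) of inequality (9) at the largest possible W and V = 0:

    # Below p*, delta*p*W < e even for W at its upper bound, so (9) holds
    # and deviating to C is strictly profitable.
    b, c, e, delta = 3.0, 1.0, 0.5, 0.7
    W_max = (b + e - c) / (1 - delta)
    p_star = (1 - delta) * e / (delta * (b + e - c))
    for p in (0.5 * p_star, 1.5 * p_star):
        print(p, delta * p * W_max < e)     # True below p*, False above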

6. Connect your answer to 4 with what you learned in lecture about the interaction between ob-
servability, efficiency, and altruism. In particular: what happens as the efficiency parameter
e and the observability parameter p change?

Answer. Our answer to 4 suggests that players will choose the efficient altruistic action
E whenever the other player can observe whether the altruistic action is efficient (i.e. can
distinguish C from E) with high enough probability. When the other player cannot differentiate
C and E (i.e. p = 0), then neither player will ever play E. We saw in class that people
seem oblivious to the effect of their contributions (whether they are saving 1,000 or 100,000
lives; whether their donations are matched, etc.) most of the time - we could interpret that
as p = 0, because others do not know how efficient the donation was. But we also saw an
experiment in which people donated more efficiently when another person could observe
both the donation and how efficient it was - that would correspond to a high p, which in
our model corresponds to E being played.

In particular, as e increases, (1 − δ)e/(δ(b + e − c)) also increases, which makes Equation 7 less
likely to hold (and so less likely that E will be played in the first round). Recall that players pay an
extra cost of e to give an extra benefit of 2e. As e increases, the interpretation is that the
efficient action, while being more efficient, is also more costly for ourselves. For example, as e
gets high, people might need to do extensive research in order to find the efficient NGOs,
etc. This makes it less likely that they will donate efficiently. As p increases, Equation 7 is
more likely to hold (and so more likely that E will be played in the first round). We already
explained the intuition for this: as p increases, it is easier to observe whether C or E was
played - for example if, whenever a donation is matched, it is posted to the person’s Facebook
wall.

4 Costly Signaling as an Extensive Form Game of Incomplete Information (30 points (∗))

We’ve seen that when we represent sequential move (extensive form) games as simultaneous move
(matrix form) games, we can lose meaningful information. So far, we have analyzed costly signaling
as a matrix form game. We now confirm that this did not somehow yield misleading results by
analyzing it as an extensive form game.
We start by rigorously defining a simplified version of the game presented in problem 1 of
problem set 3. For the rest of this problem, assume the following:

• There are two types of senders: good and bad. A fraction p = 1/3 of senders are good (and
1 − p are bad). There is one type of receiver.

• There are two levels of signal, which we call s0 , s1 .

• For good types, sending these signals costs 0 and 1 respectively. For bad types, sending these
signals costs 0 and 6.

• Senders receive a payoff of 5 if receivers accept them. Receivers receive a payoff of 10 upon
accepting a good sender, and a payoff of −10 upon accepting a bad sender.

The game proceeds as follows: (1) To model random assignment of sender’s type, we assume
that Chance moves first in the game. Chance has two possible actions, {Good, Bad}, and chooses
Good with probability p = 1/3. (2) After that, the sender sends a signal (knowing her type; i.e.
knowing what Chance chose). (3) Finally the receiver chooses whether to accept or reject, without
knowing what action Chance took, but knowing the signal that the sender sent.

1. Write this game formally (refer to the appropriate section of the Game Theory handout), and
draw the game tree. [Hint: the game tree will look very similar to the one for the beer-quiche

game in the Game Theory handout]

The set of players is N = {Sender, Receiver}. The set of actions is A_Sender,h = {s0, s1} at all
histories where the Sender moves, and A_Receiver,h = {Accept, Reject} at all histories at which
the Receiver moves. The player function is P(∅) = Chance, P(Good) = P(Bad) = Sender,
and P(h) = Receiver for all other non-terminal histories. The chance probabilities are
µ(Good|∅) = 1/3 and µ(Bad|∅) = 2/3.

The game tree (not reproduced here) looks like the beer-quiche tree in the Game Theory
handout: Chance moves first, then the Sender signals knowing her type, and dotted lines
indicate that two histories are in the same part of the Receiver’s partition.

2. Prove that each of the following is a Perfect Bayesian equilibrium of the game:

(a) Efficient separating: good senders send signal s1 , bad senders send signal s0 , and re-
ceivers accept signal s1 .

Answer. First of all, we need to define the assessment, which in this case will be
βR(Good|s1) = 1 and βR(Good|s0) = 0. This assessment is consistent, because it is
derived from Bayes’ rule.

Next we check sequential rationality. Note that no sender has an incentive to deviate.
The good type’s payoff is 5 − 1 = 4, and she would get 0 if she deviated to s0 (no signaling
cost, but she would be rejected); the bad type’s payoff is 0, and she would get 5 − 6 = −1 if
she deviated to s1.

Next, the receiver does not have a profitable deviation either. Her payoff is (1/3) · 10 = 10/3.
If she deviated to
• accepting all, her payoff would be 10(1/3 − 2/3) = −10/3,
• rejecting all, her payoff would be 0,
• accepting s0 and rejecting s1, her payoff would be (2/3)(−10) = −20/3.
Because no player has an incentive to deviate, the strategy profile is sequentially ratio-
nal. That, together with the assessment being consistent, shows that they constitute a
PBE.
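
The receiver’s comparison can be verified directly (values from the problem statement; the check itself is ours, not part of the solution):

    # Receiver's expected payoffs in the efficient separating profile, p = 1/3.
    p_good = 1/3
    accept_s1 = p_good * 10                          # accept exactly the good senders
    accept_all = p_good * 10 + (1 - p_good) * (-10)  # accept everyone
    reject_all = 0.0                                 # reject everyone
    accept_s0 = (1 - p_good) * (-10)                 # accept exactly the bad senders
    print(accept_s1, accept_all, reject_all, accept_s0)  # 10/3 is the maximum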

(b) Pooling with rejection: good and bad senders send s0 and receivers never accept any
signal.

Answer. First we define the assessment: βR(Good|s0) = 1/3. However, no sender sends
s1 in equilibrium. Because of that, we need to consider a possible tremble: for example,
the tremble where the bad sender sends s1 with probability ε. Then, we can apply
Bayes’ rule to the tremble, and obtain

βR(Bad|s1) = P(s1|Bad) / (P(s1|Bad) + P(s1|Good)) = ε/(ε + 0) = 1.
This is a tremble because as ε → 0, it converges to our strategy profile (where the bad
sender never sends s1). Because the assessment is derived from Bayes’ rule whenever
possible, it is consistent.

Next, let’s show that the strategy profile is sequentially rational. The payoff for the re-
ceiver is 0; if she deviated and accepted s0, her payoff would be 10(1/3 − 2/3) = −10/3,
so the deviation is not profitable.

The payoff for the good and bad types is 0. If either of them deviated and sent s1,
they would incur a cost and be rejected, so they would have a negative payoff. Therefore
neither type has a profitable deviation.

Because the assessment is consistent and the strategy profile is sequentially rational,
they constitute a PBE.

3. Suppose the fraction of good senders increases to 90%. Is pooling with rejection still an equi-
librium (i.e. if both types of senders send s0 , would receivers still be better off rejecting all
senders)? Show that there is an alternate equilibrium in which receivers accept all senders.
Call this equilibrium “pooling with acceptance.”

Answer. When the fraction of good senders increases to 90%, any consistent assessment will
have βR (good|s0 ) = .9. Therefore, the receiver has a profitable deviation, to accept s0 , which
yields payoff 10(.9 − .1) = 8 > 0.

Next, we will show that pooling with acceptance is a PBE. Let βR(Bad|s1) = 1. Again,
no sender sends s1 in equilibrium. Because of that, we need to consider a possible tremble:
the tremble where the bad sender sends s1 with probability ε. Then, we can apply Bayes’
rule to the tremble, and obtain βR(Bad|s1) = 1. As we have shown above in part 2b, this
assessment is consistent.

We have already shown that accepting s0 is optimal for the receiver (payoff 8 > 0), and
given βR(Bad|s1) = 1, rejecting the off-path signal s1 is optimal as well. By reasoning
similar to part 2b, neither sender has an incentive to deviate. This shows the strategy
profile is sequentially rational, and pooling with acceptance is a PBE.

4. Now suppose that the cost of sending s1 for the good type increases to 4. Are the efficient
separating, pooling with acceptance, and pooling with rejection PBE for p = 1/3? How about
p = .9? What if the cost had increased to 6 instead, for p = 1/3 and p = .9?

Answer. [Complete but somewhat less formal - any answer at the same level of formality
from students should get full credit]
When the cost for the good type increases to 4, all the reasoning in the previous parts still
holds (in the efficient separating profile, the good type’s payoff falls from 4 to 1, which is still
positive), and therefore

(a) When p = 1/3, the efficient separating and pooling with rejection are PBE; pooling with
acceptance is not a PBE
(b) When p = .9, the efficient separating and pooling with acceptance are PBE; pooling
with rejection is not a PBE

However, when the cost for the good type increases to 6, sending s0 becomes a dominant
strategy for her (the signal costs 6, while acceptance is worth only 5). Therefore efficient
separating is never a PBE, and:

(a) When p = 1/3, pooling with rejection is PBE; pooling with acceptance is not a PBE
(b) When p = .9, pooling with acceptance is a PBE; pooling with rejection is not a PBE

5. Panchanathan and Boyd. First consider possible deviations. Starting with the collective
game, a deviation takes the form of a defection. If a player defects here, then they have no in-
centive to cooperate for the rest of the game, as they will be shunned; as such, they will defect
in all future rounds as well. We will term this strategy All-D. Given that a player cooperates
in the collective game, we consider another possible deviation: a defection in the mutual aid
game in the first round where the player is not needy is the sole reasonable deviation. Given
that a player intends to defect at some point, if it is profitable for them at an arbitrary stage, it
will also be profitable in the first possible round. Any other deviation will lead to a payoff
less than these two deviations.

Now calculate the payoffs for these three strategies (Shunner, All-D, Mutual Defect (MD)):

U(Shunner) = B − C + [(n − 1)/n · (b − c)] / (1 − ω)
U(All-D) = B
U(MD) = B − C + [(n − 1)/n · b] / (1 − ω/n)

Next we compare the payoff of Shunner to the payoffs from deviations. First, deviating to All-D:

B − C + [(n − 1)/n · (b − c)] / (1 − ω) > B
⟹ [(n − 1)/n · (b − c)] / (1 − ω) > C

This matches the first condition from our assumptions. Next consider a deviation to MD:

B − C + [(n − 1)/n · (b − c)] / (1 − ω) > B − C + [(n − 1)/n · b] / (1 − ω/n)
⟹ [(n − 1)/n · (b − c)] / (1 − ω) > [(n − 1)/n · b] / (1 − ω/n)
⟹ (b − c)/(1 − ω) > b/(1 − ω/n)

This matches our second condition. For neither deviation to be profitable, both conditions
must be satisfied. As such, all-Shunner is a NE iff these conditions are met.
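
The two conditions can be checked numerically; the parameter values below are hypothetical, chosen only to satisfy both conditions:

    # B, C are the collective-action payoffs; b, c, omega, n come from the
    # mutual aid game (omega is the continuation probability, n the group size).
    B, C, b, c, omega, n = 5.0, 2.0, 3.0, 1.0, 0.9, 10
    shunner = B - C + ((n - 1) / n) * (b - c) / (1 - omega)
    all_d = B
    md = B - C + ((n - 1) / n) * b / (1 - omega / n)
    print(shunner, all_d, md)                # 21.0, 5.0, ~5.97
    print(shunner > all_d and shunner > md)  # True: no profitable deviation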

6. Introduction to Information Structures

(a) Player one can see two states, red and blue. Player two can see green or orange. A state-
dependent strategy for P1 is: A when red, B when blue. For P2: A when green, B
when orange.
(b) The players’ payoffs depend upon the proportion of the time each state is realized.
Player 1: q · b + r · d + s · c
Player 2: q · c + r · d + s · b
(c) To show this is an ESDS we will show that there are no better choices in each individual
state. Suppose P1 sees red; then P1 knows with certainty that the state is q and P2 will
see green and therefore play A. As such the best choice is A. Suppose P1 sees blue,
then with probability r/(r + s) the state is r, and P2 will play A. As such P2’s probability of
playing A is less than p by assumption, and P1 will optimally play B. Therefore P1
doesn’t have an incentive to deviate.
Now consider P2: when P2 sees orange, they know for certain that P1 will play B, so B is
their best option. Suppose they see green; then the state is either q or r. The probability
of it being state q is q/(q + r); since this is more than p, P2’s best decision is A.
Then neither player will deviate and this is a BNE.

7. Using Information Structures to Understand Higher-Order Beliefs

(a) i. Given that P2 has seen blue, the state is r or s. With probability r/(r + s) > p, P1 will
play A; therefore P1 benefits from playing A here.

ii. Now P2 knows P1 will play A s/(s + t) > p of the time, and is therefore also incentivized
to play A as well.
iii. Lastly, P1 knows that P2 will always play A; as such, P1’s best decision here is to
play A as well.
(b) i. Player 2 thinks P1 sees either blue or yellow.
ii. P2 thinks that P1 thinks that P2 sees green or orange.
iii. P2 thinks that P1 thinks that P2 will do A if it is green.
iv. P2 should determine the relative probabilities of r, s, and t and use them to optimize
their payoffs. Playing A always would be a simple way to ensure that the players always
coordinate.
