Outline:
Stage game payoffs (row player's payoff listed first):

         C       D
    C   2, 2    0, 3
    D   3, 0    1, 1
Suppose the game is repeated N times.
Define: The payoff of a trajectory is the sum of its stage payoffs.
Example (N = 3): Action sequence {(C, C), (C, D), (D, D)} results in payoffs of
(2 + 0 + 1, 2 + 3 + 1)
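The sum in the example can be checked mechanically. The following is a minimal sketch; the names `STAGE_PAYOFFS` and `trajectory_payoff` are illustrative and not from the slides.

```python
# Stage-game payoffs (player 1, player 2), keyed by (player 1 action, player 2 action).
STAGE_PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}

def trajectory_payoff(actions):
    """Sum each player's stage payoffs along a sequence of action profiles."""
    total_1 = sum(STAGE_PAYOFFS[a][0] for a in actions)
    total_2 = sum(STAGE_PAYOFFS[a][1] for a in actions)
    return total_1, total_2

print(trajectory_payoff([("C", "C"), ("C", "D"), ("D", "D")]))  # (3, 6)
```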
With N = 3, there is 1 empty history, 4 one-stage histories, and 16 two-stage histories, for a total of
1 + 4 + 16 = 21
subhistories.
Subgame perfect equilibria and backwards induction
Infinite horizon and discounting
{(C, C), (C, C), (C, C), (D, C), (D, D), (D, D), ...}
Aside: An alternative (and equivalent) interpretation of discounting is that the game ends after
every turn with probability $1 - \delta$.
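As a rough illustration of discounting, the sketch below evaluates the discounted average payoff $(1-\delta)\sum_t \delta^t u(t)$ for a payoff stream that follows a finite prefix and then stays constant forever. The function name and its arguments are assumptions made for this example; the normalization by $(1-\delta)$ matches the one used for the periodic sequences later in these notes.

```python
def discounted_average(prefix, tail, delta):
    """(1 - delta) * sum_t delta**t * u(t), where u(t) follows `prefix`
    and then equals the constant `tail` in every later stage."""
    head = sum(delta**t * u for t, u in enumerate(prefix))
    # Geometric tail starting at stage T = len(prefix).
    tail_sum = tail * delta**len(prefix) / (1 - delta)
    return (1 - delta) * (head + tail_sum)

# Player 1's stream for (C,C), (C,C), (C,C), (D,C), (D,D), (D,D), ...
print(discounted_average([2, 2, 2, 3], tail=1, delta=0.5))
```

With this normalization a constant stream of 2 has discounted average 2 for any discount factor, which makes repeated game payoffs directly comparable to stage payoffs.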
Infinite horizon strategies, best response, and NE
Tit-for-Tat:
$a_i(t) = a_{-i}(t-1)$
Limited Punishment:
Play C unless....
If the opponent ever plays D, then play D for k turns and revert to C afterwards.
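These strategies can be sketched as functions from a history to an action. The code below is one possible reading, not the slides' own formalization: a history is assumed to be a list of (own action, opponent action) pairs, Tit-for-Tat is assumed to open with C, and Limited Punishment is assumed to ignore defections that occur while a punishment phase is running. The helpers `play` and `deviate_once` are hypothetical.

```python
def tit_for_tat(history):
    """Cooperate first (an assumption), then copy the opponent's last action."""
    return "C" if not history else history[-1][1]

def limited_punishment(history, k):
    """Play C unless a punishment phase of k turns is still running."""
    remaining = 0  # punishment turns still to be played
    for _, opp in history:
        if remaining > 0:
            remaining -= 1   # that turn was a punishment turn
        elif opp == "D":
            remaining = k    # defection in a normal turn starts a punishment
    return "D" if remaining > 0 else "C"

def play(strategy_1, strategy_2, rounds):
    """Run both strategies against each other and return the action path."""
    h1, h2, path = [], [], []
    for _ in range(rounds):
        a1, a2 = strategy_1(h1), strategy_2(h2)
        path.append((a1, a2))
        h1.append((a1, a2))  # each player records (own, opponent)
        h2.append((a2, a1))
    return path

def deviate_once(history):
    """Hypothetical opponent: defect in the first stage only."""
    return "D" if not history else "C"

print(play(deviate_once, lambda h: limited_punishment(h, k=2), rounds=4))
# [('D', 'C'), ('C', 'D'), ('C', 'D'), ('C', 'C')]
```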
As before, a strategy profile $(s_1, s_2)$ is a NE if each strategy is a best response to the
other.
Simple example: Best response to always C is always D.
Best response to Grim Trigger
where
$u_1(a(\tau)) \le 1$ for all $\tau \ge T + 1$;
therefore, player 1 is forced to play
$a_1(\tau) = D$ for all $\tau \ge T + 1$.
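A worked comparison makes the role of the discount factor concrete. The derivation below is a sketch that assumes the stage payoffs of the prisoner's dilemma above, discounted average payoffs, and a deviation in the first stage (by stationarity a later deviation gives the same comparison from that stage on).

```latex
\begin{align*}
\text{cooperate forever:}\quad
  & (1-\delta)(2 + 2\delta + 2\delta^2 + \dots) = 2 \\
\text{deviate, then play } D \text{ forever:}\quad
  & (1-\delta)(3 + \delta + \delta^2 + \dots) = 3(1-\delta) + \delta = 3 - 2\delta \\
\text{no profitable deviation:}\quad
  & 2 \ge 3 - 2\delta \iff \delta \ge \tfrac{1}{2}
\end{align*}
```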
Best response to Limited Punishment
$(1-\delta)(2 + 2\delta + 2\delta^2 + \dots + 2\delta^k)$
vs
$(1-\delta)(3 + \delta + \delta^2 + \dots + \delta^k)$
or
$2(1-\delta^{k+1})$ vs $2(1-\delta) + (1-\delta^{k+1})$
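The comparison can be evaluated numerically for particular values of the discount factor and the punishment length. This is a minimal sketch of exactly the two expressions above (cooperate through the window vs. deviate once and sit out k punishment turns); the function names are illustrative.

```python
def cooperate_value(delta, k):
    """Payoff over the k+1 stage window when cooperating: 2(1 - delta^(k+1))."""
    return 2 * (1 - delta ** (k + 1))

def deviate_value(delta, k):
    """Payoff over the same window when deviating once:
    2(1 - delta) + (1 - delta^(k+1))."""
    return 2 * (1 - delta) + (1 - delta ** (k + 1))

def cooperation_is_best(delta, k):
    return cooperate_value(delta, k) >= deviate_value(delta, k)

for k, delta in [(1, 0.9), (2, 0.5), (2, 0.7)]:
    print(k, delta, cooperation_is_best(delta, k))
```

Rearranging the inequality gives $\delta + \delta^2 + \dots + \delta^k \ge 1$, so a one-turn punishment (k = 1) never deters deviation for $\delta < 1$, while k = 2 suffices once $\delta$ is at least about 0.618.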
Best response to Tit-for-Tat
{2, 2, ..., 2}
Feasible discounted average payoffs
We have seen that a discounted average payoff of (2, 2) can be supported by a NE
(for suitable $\delta$).
A discounted average payoff of (1, 1) can also be supported by a NE (both players always
defect).
Conclusion: The set of NE payoffs of the repeated game with discounting is richer than that of the
one-shot game.
Next questions:
What is the complete set of repeated game discounted average payoffs?
What is the complete set of NE payoffs?
Feasible payoffs, cont
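$$(1-\delta)(a + \delta b + \delta^2 c + \delta^3 a + \delta^4 b + \delta^5 c + \dots)$$
$$= (1-\delta)\bigl(a(1 + \delta^3 + \delta^6 + \dots) + \delta b(1 + \delta^3 + \delta^6 + \dots) + \delta^2 c(1 + \delta^3 + \delta^6 + \dots)\bigr)$$
$$= \frac{1-\delta}{1-\delta^3}\,(a + \delta b + \delta^2 c)$$
Fact:
$$\lim_{\delta \to 1} \frac{1-\delta}{1-\delta^3} = \frac{1}{3}$$
Therefore,
$$\lim_{\delta \to 1} \frac{1-\delta}{1-\delta^3}\,(a + \delta b + \delta^2 c) = \frac{a+b+c}{3}$$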
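The closed form and its limit can be sanity-checked numerically. The sketch below compares the closed form with a truncated sum of the period-3 stream and prints values as the discount factor approaches 1; the function names and the sample values a = 3, b = 0, c = 2 are chosen only for illustration.

```python
def periodic_closed_form(a, b, c, delta):
    """(1 - delta) / (1 - delta^3) * (a + delta*b + delta^2*c)."""
    return (1 - delta) / (1 - delta**3) * (a + delta * b + delta**2 * c)

def periodic_truncated(a, b, c, delta, horizon=3000):
    """Direct (1 - delta) * sum over the first `horizon` stages of a, b, c, a, ...
    Accurate only when delta**horizon is negligible."""
    cycle = (a, b, c)
    return (1 - delta) * sum(delta**t * cycle[t % 3] for t in range(horizon))

for delta in (0.9, 0.99, 0.999):
    print(delta, periodic_closed_form(3, 0, 2, delta))
print("limit:", (3 + 0 + 2) / 3)
```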
The result generalizes to other higher-order periodic sequences.
Fact: The set of feasible repeated game payoffs is the set of weighted averages of the one-shot game
payoffs.
Repeated prisoner's dilemma: Take weighted averages of (0,3), (2,2), (1,1), and (3,0).
[Figure: the four stage payoff points; axes: Player 1 payoff, Player 2 payoff.]
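A weighted average of the stage payoffs is straightforward to compute. The sketch below uses an illustrative function `weighted_average` whose argument maps action profiles to nonnegative weights summing to 1; equal weights on (C, D) and (D, C) correspond to the limiting payoff of a path that alternates between those two profiles.

```python
STAGE_PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}

def weighted_average(weights):
    """Convex combination of stage payoffs for the given profile weights."""
    if abs(sum(weights.values()) - 1.0) > 1e-9 or min(weights.values()) < 0:
        raise ValueError("weights must be nonnegative and sum to 1")
    u1 = sum(w * STAGE_PAYOFFS[p][0] for p, w in weights.items())
    u2 = sum(w * STAGE_PAYOFFS[p][1] for p, w in weights.items())
    return u1, u2

print(weighted_average({("C", "D"): 0.5, ("D", "C"): 0.5}))  # (1.5, 1.5)
```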
Lower bound on achievable payoff
[Figure: payoff points (0,3), (2,2), (1,1), (3,0); axes: Player 1 payoff, Player 2 payoff.]
Greedy strategy
Notation:
$a_i(t) = s_i(h(t))$: The action of player i at time t, $a_i(t)$, is a function of the repeated
game strategy, $s_i(\cdot)$, evaluated at the history at time t, $h(t)$.
$b_i(a_{-i})$: the stage game best response function
$B_i(s_{-i})$: the repeated game best response function
Minimax payoff:
Suppose the opponents commit to strategy $s_{-i}$.
What is the guaranteed payoff of $s_i = B_i(s_{-i})$?
Lower bound: Define the greedy response, $G_i(s_{-i})$:
i.e., the repeated game payoff using a repeated game best response is at least the
repeated game payoff using a day-by-day greedy response.
Why aren't these equal?!
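The greedy response and the resulting bound can plausibly be written as follows. This is a reconstruction from the notation above, not the slide's own formula, and the symbol $U_i$ for the repeated game payoff is an assumption introduced here.

```latex
\begin{align*}
G_i(s_{-i}):\quad & a_i(t) = b_i\bigl(a_{-i}(t)\bigr), \qquad a_{-i}(t) = s_{-i}(h(t)) \\
& U_i\bigl(B_i(s_{-i}),\, s_{-i}\bigr) \;\ge\; U_i\bigl(G_i(s_{-i}),\, s_{-i}\bigr)
\end{align*}
```

For instance, against grim trigger the greedy response defects in every stage, since D is the stage game best response to either action, whereas cooperating throughout earns more when $\delta$ is close to 1.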
Minimax payoff
Nash folk theorem
Nash folk theorem: Let $(w_1, w_2)$ be payoff levels that are i) feasible and ii) at least
the minimax levels $(\ell_1, \ell_2)$. Then for $\delta$ sufficiently close to 1, there exists a NE with payoffs
$(w_1, w_2)$.
This is the converse of the statement on the previous slide.
Statement holds for more than 2 players.
Idea of proof:
Both players agree on an action path that leads to $(w_1, w_2)$.
Apply grim trigger if any player deviates from the agreed-upon path.
Best response to grim trigger is to stick to the path.
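The minimax levels that enter the theorem can be computed directly for the prisoner's dilemma stage game. The sketch below restricts attention to pure actions, which is an assumption of this example; the names `pure_minimax` and `meets_minimax` are illustrative.

```python
STAGE_PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}
ACTIONS = ("C", "D")

def pure_minimax(player):
    """Lowest payoff the opponent can hold `player` (0 or 1) to, given that
    the player best-responds in the stage game (pure actions only)."""
    def payoff(own, opp):
        profile = (own, opp) if player == 0 else (opp, own)
        return STAGE_PAYOFFS[profile][player]
    return min(max(payoff(own, opp) for own in ACTIONS) for opp in ACTIONS)

def meets_minimax(w):
    """Condition ii) of the theorem: each w_i is at least the minimax level."""
    return all(w[i] >= pure_minimax(i) for i in (0, 1))

print(pure_minimax(0), pure_minimax(1))  # 1 1
print(meets_minimax((2, 2)), meets_minimax((0, 3)))  # True False
```

Condition i), feasibility, is not checked here; it would require testing whether the target lies in the set of weighted averages of the stage payoffs.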
Subgame perfect?