
ECE 594D Game Theory and Multiagent Systems

Takehome Final

Policy:
You must show your work & explain your answer to receive credit for a problem.
Time limit: 3 consecutive hours from the start of the exam
Acceptable material: Course textbook, course notes/homeworks, Matlab
Unacceptable material: Internet, alternative notes/books
Absolutely no collaboration. All questions must be directed to me.
Due: March 14, 2016 by 4:00 pm at my office (HFH 5161).
Grading Breakdown: Written questions
Question 1: 25 pts
Question 2: 40 pts
Question 3: 35 pts
Date/Time Started:

Date/Time Finished:
I hereby acknowledge that I completed this exam in accordance with the policy set above
(sign and date).

1. Two players are to choose between two movies. Player 1 (row) reads movie reviews and has a better
idea of which one is better. Player 2 (col) ignores movie reviews. The players' preferences reflect a desire
to see the better movie but also to go together. This scenario may be described as a Bayesian game
as follows:
States: θ1 and θ2
State-dependent payoff matrices:

  State θ1:               State θ2:
         A      B                A      B
    A  3, 3   2, 0          A  1, 1   0, 2
    B  0, 2   1, 1          B  2, 0   4, 4

Types & beliefs (i.e., Pr(θ) given the type) over the states {θ1, θ2}:

Player 1 (row) has 2 types, described by t1 = τ1(θ) as follows:
  the type τ1(θ1) has beliefs {2/3, 1/3}
  the type τ1(θ2) has beliefs {1/3, 2/3}
Player 2 (col) has only 1 type, described by t2 = τ2(θ) as follows:
  τ2(θ1) = τ2(θ2) (a single type), with beliefs {1/2, 1/2}

(a) Build the entire best response map for column player 2 (over pure strategies)
(b) Derive all Bayes-Nash equilibria (over pure strategies)
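
As a numerical cross-check for parts (a)-(b), the expected payoffs behind player 2's best-response map can be tabulated with a short script. A minimal Python sketch follows (the dictionary layout, state labels "theta1"/"theta2", and variable names are my own; the payoffs and beliefs are those stated above, and the same tabulation is easy to do in Matlab):

from itertools import product

ACTIONS = ("A", "B")

# u[state][(a1, a2)] = (payoff to player 1, payoff to player 2)
u = {
    "theta1": {("A", "A"): (3, 3), ("A", "B"): (2, 0),
               ("B", "A"): (0, 2), ("B", "B"): (1, 1)},
    "theta2": {("A", "A"): (1, 1), ("A", "B"): (0, 2),
               ("B", "A"): (2, 0), ("B", "B"): (4, 4)},
}
p2_belief = {"theta1": 0.5, "theta2": 0.5}  # player 2's single type

# A pure strategy for player 1 assigns an action to each of his two types; since
# the type is pinned down by the realized state, s1 = (action when the state is
# theta1, action when the state is theta2).
for s1 in product(ACTIONS, repeat=2):
    for a2 in ACTIONS:
        exp_u2 = sum(p2_belief[th] * u[th][(s1[k], a2)][1]
                     for k, th in enumerate(("theta1", "theta2")))
        print(f"player 1 plays {s1}, player 2 plays {a2}: E[u2] = {exp_u2:.2f}")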

2. Consider the following class of resource allocation problems.

A set of agents N = {1, 2}.


A finite set of resources R = {r1 , r2 , r3 }.
An anonymous welfare function for each resource r of the form Wr : {0, 1, 2} → ℝ. The specific
welfare functions are:
  Wr1(0) = 0,  Wr1(1) = v1,  Wr1(2) = 1.5 v1
  Wr2(0) = 0,  Wr2(1) = v2,  Wr2(2) = 1.5 v2
  Wr3(0) = 0,  Wr3(1) = v3,  Wr3(2) = 1.5 v3
where v1, v2, v3 ≥ 0 are unknown.
A finite action set for each player, Ai ⊆ R. Note that an action ai ∈ Ai corresponds to just a single
resource r ∈ R. Let A := A1 × A2 represent the set of joint actions.
A system-level welfare function of the form

  W(a) = Σ_{r ∈ R} Wr(|a|_r)

where |a|_r represents the number of players that chose resource r in the action profile a, i.e.,

  |a|_r = |{j ∈ N : r ∈ aj}|

The goal of this question is to understand how available utility design methodologies impact the
efficiency of the resulting pure Nash equilibria.

(a) Consider the special case where A1 = A2 = {r1, r2} (i.e., no resource r3 for this specific problem);
however, v1 and v2 are unknown. Suppose each agent is assigned the marginal contribution utility.
What is the price of anarchy for the class of games induced by the set of feasible values v1, v2 ≥ 0?
(b) Consider the special case where A1 = {r1, r2} and A2 = {r2, r3}; however, v1, v2, and v3 are
unknown. Suppose each agent is assigned the marginal contribution utility. What is the price of
anarchy for the class of games induced by the set of feasible values v1, v2, v3 ≥ 0?
(c) Consider the special case where A1 = {r1, r2} and A2 = {r2, r3}; however, v1, v2, and v3 are
unknown. Suppose each agent is assigned the Shapley value utility. What is the price of anarchy
for the class of games induced by the set of feasible values v1, v2, v3 ≥ 0?
(d) Consider the special case where A1 = A2 = {r1, r2} (i.e., no resource r3 for this specific problem)
and v1 = 1 and v2 = 0.6. Compare and contrast the resulting behavior associated with log-
linear learning for both the marginal contribution utility design and the Shapley value utility
design. Provide a characterization of the limiting behavior of log-linear learning for both cases
with temperature values τ ∈ {2, 0.5, 0.1}.

Hint: For each scenario (a)-(c), try various values for v1, v2, and v3 to gain insight into the
structural form of the worst-case situations.
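
In the spirit of this hint, the following minimal Python sketch sweeps a grid of v values, computes the pure Nash equilibria of the game induced by the marginal contribution utility, and records the worst observed ratio W(NE)/W(opt). The helper names, the sampling grid, and the tie tolerance are my own choices; for parts (b)-(c), swap in the other action sets and the Shapley value utility, which for these anonymous welfare functions reduces to the equal split Wr(|a|_r)/|a|_r.

from itertools import product

def welfare(v, profile):
    # System welfare W(a) with W_r(0) = 0, W_r(1) = v_r, W_r(2) = 1.5 * v_r.
    total = 0.0
    for r, vr in v.items():
        k = sum(1 for a in profile if a == r)
        total += 0.0 if k == 0 else (vr if k == 1 else 1.5 * vr)
    return total

def marginal_contribution(v, profile, i):
    # U_i(a) = W(a) - W(a with player i removed).
    without_i = tuple(a for j, a in enumerate(profile) if j != i)
    return welfare(v, profile) - welfare(v, without_i)

def pure_nash(v, action_sets, utility):
    # All pure Nash equilibria of the induced game (ties count as equilibria).
    equilibria = []
    for profile in product(*action_sets):
        stable = all(
            utility(v, tuple(alt if j == i else a for j, a in enumerate(profile)), i)
            <= utility(v, profile, i) + 1e-12
            for i, Ai in enumerate(action_sets) for alt in Ai)
        if stable:
            equilibria.append(profile)
    return equilibria

# Part (a): A1 = A2 = {r1, r2}; sweep v1, v2 over a coarse grid and track the
# worst welfare ratio attained at any pure Nash equilibrium.
action_sets = [("r1", "r2"), ("r1", "r2")]
worst = 1.0
for v1 in (x / 10 for x in range(1, 31)):
    for v2 in (x / 10 for x in range(1, 31)):
        v = {"r1": v1, "r2": v2, "r3": 0.0}
        opt = max(welfare(v, p) for p in product(*action_sets))
        for ne in pure_nash(v, action_sets, marginal_contribution):
            worst = min(worst, welfare(v, ne) / opt)
print("worst observed W(NE)/W(opt) on the sampled grid:", worst)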

3. One role for learning algorithms in distributed control is to guide the decisions of players in real time.
However, in many settings players may not have the ability to select any particular action in their
action set at any given time. For example, consider the problem of multi-vehicle motion control where
an agent's action set represents discrete spatial locations. Here, mobility limitations restrict the ability
to traverse from one location to another in a given time period.
To formalize this notion of constrained action sets, consider the following process: Let a(t - 1) be the
joint action at time t - 1. With constrained action sets, the set of actions available to player i at time
t is a function of his action at time t - 1 and will be denoted as Ci(ai(t - 1)) ⊆ Ai. For example,
consider the following two-player identical interest game with payoffs

                        Player 2
                   b1          b2          b3
  Player 1  a1    0, 0        0, 0        9, 9
            a2   10, 10    -10, -10    -10, -10

and constrained action sets that satisfy

C1 (a1 ) = {a1 , a2 }, C1 (a2 ) = {a1 , a2 }

C2 (b1 ) = {b1 , b2 }, C2 (b2 ) = {b1 , b2 , b3 }, C2 (b3 ) = {b2 , b3 }

Standard log-linear learning assumes an agent can access any action at each iteration, i.e., Ci(ai) = Ai
for all actions ai ∈ Ai. The goal of this problem is to understand how mobility limitations impact
the resulting behavior associated with log-linear learning. Here, we consider two variations of
log-linear learning which can accommodate constrained action sets.

Variation #1: At each time t, the updating player i plays a strategy pi(t) ∈ ∆(Ai) where

  p_i^{a_i}(t) = exp((1/τ) U_i(a_i, a_{-i}(t-1))) / Σ_{ā_i ∈ C_i(a_i(t-1))} exp((1/τ) U_i(ā_i, a_{-i}(t-1)))
                 for any action a_i ∈ C_i(a_i(t-1)),
  p_i^{a_i}(t) = 0    for any action a_i ∉ C_i(a_i(t-1)).                                    (1)

Variation #2: At each time t, the updating player i selects one trial action â_i (uniformly) from
his constrained action set C_i(a_i(t-1)) ⊆ Ai. Player i plays a strategy pi(t) ∈ ∆(Ai) where

  p_i^{a_i}(t) = exp((1/τ) U_i(a_i, a_{-i}(t-1))) / Σ_{ā_i ∈ {a_i(t-1), â_i}} exp((1/τ) U_i(ā_i, a_{-i}(t-1)))
                 for any action a_i ∈ {a_i(t-1), â_i},
  p_i^{a_i}(t) = 0    for any action a_i ∉ {a_i(t-1), â_i}.                                  (2)

(a) Characterize the behavior of log-linear learning with Variation #1 as τ → 0+ for the above
example.
(b) Characterize the behavior of log-linear learning with Variation #2 as τ → 0+ for the above
example.

Hint: It may be useful to think of e^{1/τ} as 1/ε for some small ε > 0 (equivalently, e^{-1/τ} as ε).
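
For intuition before taking the τ → 0+ limit, one can compute the stationary distribution of the induced Markov chain at moderate temperatures. Below is a minimal Python sketch for Variation #1; it assumes (as in standard log-linear learning) that a single player is selected uniformly at random to update each period, and the variable names are my own. Variation #2 follows by replacing each update step with a uniform trial action from Ci(ai(t-1)) and a binary log-linear choice.

import numpy as np

A1, A2 = ["a1", "a2"], ["b1", "b2", "b3"]
U = {("a1", "b1"): 0, ("a1", "b2"): 0, ("a1", "b3"): 9,
     ("a2", "b1"): 10, ("a2", "b2"): -10, ("a2", "b3"): -10}  # identical interest payoff
C1 = {"a1": ["a1", "a2"], "a2": ["a1", "a2"]}
C2 = {"b1": ["b1", "b2"], "b2": ["b1", "b2", "b3"], "b3": ["b2", "b3"]}
states = [(x, y) for x in A1 for y in A2]

def transition_matrix(tau):
    P = np.zeros((len(states), len(states)))
    for s, (x, y) in enumerate(states):
        # Player 1 updates (prob 1/2): log-linear over C1(x) with y held fixed.
        w = np.array([np.exp(U[(xx, y)] / tau) for xx in C1[x]])
        for xx, pr in zip(C1[x], w / w.sum()):
            P[s, states.index((xx, y))] += 0.5 * pr
        # Player 2 updates (prob 1/2): log-linear over C2(y) with x held fixed.
        w = np.array([np.exp(U[(x, yy)] / tau) for yy in C2[y]])
        for yy, pr in zip(C2[y], w / w.sum()):
            P[s, states.index((x, yy))] += 0.5 * pr
    return P

def stationary(P):
    # Left eigenvector of P for the eigenvalue closest to 1, normalized.
    vals, vecs = np.linalg.eig(P.T)
    v = np.abs(np.real(vecs[:, np.argmin(np.abs(vals - 1.0))]))
    return v / v.sum()

# For very small tau the chain is nearly reducible and floating-point eigensolvers
# degrade; the tau -> 0+ limit asked for in parts (a)-(b) is better handled
# analytically (e.g., using the hint above).
for tau in [2.0, 1.0, 0.7]:
    pi = stationary(transition_matrix(tau))
    print(f"tau = {tau}: " + ", ".join(f"{s}:{m:.3f}" for s, m in zip(states, pi)))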
