
Eric Rasmusen, Erasmuse@indiana.edu

Mathematical Appendix

This appendix has three purposes: to remind some readers of the deﬁnitions of terms they

have seen before, to give other readers an idea of what the terms mean, and to list a

few theorems for reference. In accordance with these limited purposes, some terms such

as “boundary point” are left undeﬁned. For fuller exposition, see Rudin (1964) on real

analysis, Debreu’s Theory of Value (1959), and Chiang (1984) and Takayama (1985) on

mathematics for economists. Intriligator (1971) and Varian (1992) both have good math-

ematical appendices and are strong in discussing optimization, and Kamien & Schwartz

(1991) covers maximizing by choice of functions. Border’s 1985 book is entirely about ﬁxed

point theorems. Stokey & Lucas (1989) is about dynamic programming. Fudenberg &

Tirole (1991a) is the best source of mathematical theorems for use in game theory.

The web is very useful for mathematical deﬁnitions. See http://en.wikipedia.org,

http://mathworld.wolfram.com, and http://planetmath.org.

*A.1 Notation

Σ Summation. Σ_{i=1}^3 x_i = x_1 + x_2 + x_3.

Π Product. Π_{i=1}^3 x_i = x_1 x_2 x_3.

|x| Absolute value of x. If x ≥ 0 then |x| = x and if x < 0 then |x| = −x.

| “Such that,” “given that,” or “conditional upon.” {x | x < 3} denotes the set of real numbers less than three. Prob(x|y < 5) denotes the probability of x given that y is less than 5.

: “Such that.” {x : x < 3} denotes the set of real numbers less than three. The colon is a synonym for |.

R^n The set of n-dimensional vectors of real numbers (integers, fractions, and the least upper bounds of any subsets thereof).

{x, y, z} A set of elements x, y, and z. The set {3, 5} consists of two elements, 3 and 5.

∈ “Is an element of.” a ∈ {2, 5} means that a takes either the value 2 or 5.

⊂ Set inclusion. If X = {2, 3, 4} and Y = {2, 4}, then Y ⊂ X because Y is a subset of X.

[x, y] The closed interval with endpoints x and y. The interval [0, 1000] is the set {x | 0 ≤ x ≤ 1000}. Square brackets are also used as delimiters.

(x, y) The open interval with endpoints x and y. The interval (0, 1000) is the set {x | 0 < x < 1000}. (0, 1000] would be a half-open interval, the set {x | 0 < x ≤ 1000}. Parentheses are also used as delimiters.

x! x-factorial. x! = x(x − 1)(x − 2) · · · (2)(1). 4! = 4(3)(2)(1) = 24.

C(a, b) The number of unordered combinations of b elements from a set with a elements (the binomial coefficient, usually typeset with a above b in large parentheses). C(a, b) = a!/(b!(a − b)!), so C(4, 3) = 4!/(3!(4 − 3)!) = 24/6 = 4. (See combination and permutation below.)

× The Cartesian product. X × Y is the set of points (x, y), where x ∈ X and y ∈ Y.

ε An arbitrarily small positive number. If my payoff from both Left and Right equals 10, I am indifferent between them; if my payoff from Left is changed to 10 + ε, I prefer Left.

∼ We say that X ∼ F if the random variable X is distributed according to distribution F.

∃ “There exists...” ∃x > 0 : 9 − x^2 = 0.

∀ “For all...” ∀x ∈ [0, 3], x^2 < 10.

≡ “Equals by definition.” “For clarity, let us define the average income x ≡ θ^(w−1)/((a − 1)^2 + b^2 + c) for use in the expressions below.”

→ If f maps space X into space Y then f : X → Y .

df/dx, d^2f/dx^2 The first and second derivatives of a function. If f(x) = x^2 then df/dx = 2x and d^2f/dx^2 = 2.

f′, f″ The first and second derivatives of a function. If f(x) = x^2 then f′ = 2x and f″ = 2. Primes are also used on variables (not functions) for other purposes: x′ and x″ might denote two particular values of x.

∂f/∂x, ∂^2f/∂x∂y Partial derivatives of a function. If f(x, y) = x^2 y then ∂f/∂x = 2xy and ∂^2f/∂x∂y = 2x.

y_{−i} The set y minus element i. If y = {y_1, y_2, y_3}, then y_{−2} = {y_1, y_3}.

Max(x, y) The maximum of two numbers x and y. Max(8, 24) = 24.

Min(x, y) The minimum of two numbers x and y. Min(5, 3) = 3.

⌈x⌉ Ceiling(x). A number rounded up to the nearest integer. ⌈4.2⌉ = 5. This notation is not well known in economics.

⌊x⌋ Floor(x). A number rounded down to the nearest integer. ⌊6.9⌋ = 6. This notation is not well known in economics.


Sup X The supremum (least upper bound) of set X. If X = {x | 0 ≤ x < 1000}, then sup X = 1000. The supremum is useful because sometimes, as here, no maximum exists.

Inf X The infimum (greatest lower bound) of set X. If X = {x | 0 ≤ x < 1000}, then inf X = 0.

Argmax The argument that maximizes a function. If e* = argmax EU(e), then e* is the value of e that maximizes the function EU(e). The argmax of f(x) = x − x^2 is 1/2.

Maximum The greatest value that a function can take. Maximum(x − x^2) = 1/4.

Minimum The least value that a function can take. Minimum(−5 + x^2) = −5.

*A.2 The Greek Alphabet

A α alpha

B β beta

Γ γ gamma

∆ δ delta

E ϵ or ε epsilon

Z ζ zeta

H η eta

Θ θ theta

I ι iota

K κ kappa

Λ λ lambda

M µ mu

N ν nu

Ξ ξ xi

O o omicron

Π π pi

P ρ rho

Σ σ sigma

T τ tau

Υ υ upsilon

Φ φ phi

X χ chi

Ψ ψ psi

Ω ω omega

*A.3 Glossary

almost always See “generically.”


annuity A riskless security paying a constant amount each year for a given period of years,

with the amount conventionally paid at the end of each year.

closed A closed set in R^n includes its boundary points. The set {x : 0 ≤ x ≤ 1000} is closed.

combination The number of unordered sets of b elements from a set with a elements, denoted C(a, b) = a!/(b!(a − b)!). If we form sets of 2 elements from the set A = {w, x, y, z}, the possibilities are {w, x}, {w, y}, {w, z}, {x, y}, {x, z}, {y, z}. Thus, C(4, 2) = 4!/(2!(4 − 2)!) = 24/4 = 6. (See permutation for the ordered version.)
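Both counts are available in Python's standard library (math.comb and math.perm, Python 3.8+), and the underlying collections can be enumerated with itertools; a quick check of the example above:

```python
import math
from itertools import combinations, permutations

A = ["w", "x", "y", "z"]

# Unordered selections of 2 elements: C(4, 2) = 4!/(2! * 2!) = 6.
combos = list(combinations(A, 2))
n_combos = math.comb(4, 2)

# Ordered selections of 2 elements (see the permutation entry): 4!/2! = 12.
perms = list(permutations(A, 2))
n_perms = math.perm(4, 2)

print(len(combos), n_combos)  # 6 6
print(len(perms), n_perms)    # 12 12
```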

compact If set X in R^n is closed and bounded, then X is compact. Outside of Euclidean space, however, a set being closed and bounded does not guarantee compactness.

complete metric space A metric space that includes the limits of all possible Cauchy

sequences. All compact metric spaces and all Euclidean spaces are complete.

concave function The continuous function f(x) defined on interval X is concave if for all elements w and z of X, f(0.5w + 0.5z) ≥ 0.5f(w) + 0.5f(z). If f maps R into R, is twice differentiable, and is concave, then f″ ≤ 0. See Figure 1.

Figure 1: Concavity and Convexity


continuous function Let d(x, y) represent the distance between points x and y. The function f is continuous if for every ε > 0 there exists a δ(ε) > 0 such that d(x, y) < δ(ε) implies d(f(x), f(y)) < ε.

continuum A continuum is a closed interval of the real line, or a set that can be mapped

one-to-one onto such an interval.

contraction The mapping f(x) is said to be a contraction if there exists a number c < 1 such that for the metric d of the space X,

d(f(x), f(y)) ≤ c · d(x, y), for all x, y ∈ X. (1)
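A quick numerical illustration (my own example, not from the text): iterating a contraction converges to its unique fixed point, which is the content of the contraction-mapping (Banach) theorem.

```python
def f(x):
    # |f(x) - f(y)| = 0.5 * |x - y|, so f is a contraction with c = 0.5.
    return 0.5 * x + 1.0

x = 100.0          # arbitrary starting point
for _ in range(60):
    x = f(x)       # each step halves the distance to the fixed point

# The unique fixed point solves x = 0.5x + 1, i.e. x = 2.
print(x)  # → 2.0 (up to rounding)
```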

convex function The continuous function f(x) defined on interval X is convex if for all elements w and z of X, f(0.5w + 0.5z) ≤ 0.5f(w) + 0.5f(z). See Figure 1. Convex functions are only loosely related to convex sets.

convex set Set X is convex if for any two of its elements w and z and any real number t with 0 ≤ t ≤ 1, the point tw + (1 − t)z is also in X.

correspondence A correspondence is a mapping that maps each point to one or more

other points, as opposed to a function, which only maps to one.

domain The domain of a mapping is the set of elements it maps from: something like the land that it can alter as it pleases. (The mapping maps from the domain onto the range.)

Leibniz’s integral rule This is the rule for differentiation under the integral sign.

∂/∂z ∫_{a(z)}^{b(z)} f(x, z) dx = f(b(z), z) ∂b(z)/∂z − f(a(z), z) ∂a(z)/∂z + ∫_{a(z)}^{b(z)} ∂f(x, z)/∂z dx. (2)
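Leibniz's rule can be verified numerically for a simple integrand. In this sketch (my own example, not from the text) f(x, z) = xz, a(z) = 0, and b(z) = z, so the integral is z^3/2 and both sides of the rule should equal 3z^2/2.

```python
def integral(func, a, b, n=10_000):
    # Simple trapezoid rule.
    h = (b - a) / n
    s = 0.5 * (func(a) + func(b))
    for k in range(1, n):
        s += func(a + k * h)
    return s * h

f = lambda x, z: x * z          # integrand
a = lambda z: 0.0               # lower limit
b = lambda z: z                 # upper limit, depends on z

def G(z):                       # G(z) = ∫_{a(z)}^{b(z)} f(x, z) dx = z**3 / 2
    return integral(lambda x: f(x, z), a(z), b(z))

z, h = 2.0, 1e-4
lhs = (G(z + h) - G(z - h)) / (2 * h)   # numerical dG/dz

# Right-hand side of Leibniz's rule: boundary terms plus the integral of ∂f/∂z.
# Here db/dz = 1, da/dz = 0, and ∂f/∂z = x.
rhs = f(b(z), z) * 1.0 - f(a(z), z) * 0.0 + integral(lambda x: x, a(z), b(z))

print(lhs, rhs)   # both ≈ 3*z**2/2 = 6
```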

function If f maps each point in X to exactly one point in Y , f is called a function. The

two mappings in Figure 1 are functions, but the mapping in Figure 2 is not.

generically If a fact is true on set X generically, “except on a set of measure zero,” or “almost always,” then it is false only on a subset of points Z that have the property that if a point is randomly chosen using a density function with support X, a point in Z is chosen with probability zero. This implies that if the fact is false on z ∈ R^n and z is perturbed by adding a random amount ε, the fact is true on z + ε with probability one.

integration by parts This is a technique to rearrange integrals so they can be solved more easily. It uses the formula

∫_{z=a}^{b} g(z)h′(z) dz = [g(z)h(z)]_{z=a}^{b} − ∫_{z=a}^{b} h(z)g′(z) dz. (3)

To derive this, differentiate g(z)h(z) using the product rule, integrate each side of the equation, and rearrange.
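As a numerical check (my own example, not from the text), take g(z) = z and h′(z) = e^z on [0, 1]; both sides of the formula should equal ∫_0^1 z e^z dz = 1.

```python
import math

def integral(func, a, b, n=10_000):
    # Simple trapezoid rule.
    h = (b - a) / n
    s = 0.5 * (func(a) + func(b))
    for k in range(1, n):
        s += func(a + k * h)
    return s * h

# g(z) = z and h'(z) = e^z, so h(z) = e^z and g'(z) = 1.
lhs = integral(lambda z: z * math.exp(z), 0.0, 1.0)

# g(z)h(z) evaluated from 0 to 1, minus the integral of h(z)g'(z):
boundary = 1.0 * math.exp(1.0) - 0.0 * math.exp(0.0)
rhs = boundary - integral(lambda z: math.exp(z), 0.0, 1.0)

print(lhs, rhs)   # both ≈ 1
```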


Lagrange multiplier The Lagrange multiplier λ is the marginal value of relaxing a constraint in an optimization problem. If the problem is {Maximize_x x^2 subject to x ≤ 5}, then λ = 2x* = 10.

lattice A lattice is a partially ordered set (the ≥ ordering is deﬁned) where for any two

elements a and b, the values inf(a, b) and sup(a, b) are also in the set. A lattice is

complete if the inﬁmum and supremum of each of its subsets are in the lattice.

list (or n-Tuple) A set of n elements in which the element positions are ordered. (Up, Down, Up) is a list, but (Down, Up, Up) and (Up, Down) are different lists. See set and multiset.

lower semicontinuous correspondence The correspondence φ is lower semicontinuous at the point x_0 if

x_n → x_0, y_0 ∈ φ(x_0), implies ∃y_n ∈ φ(x_n) such that y_n → y_0, (4)

which means that associated with every x sequence leading to x_0 is a y sequence leading to its image. See Figure 2. This idea is not as important as upper semicontinuity.

maximand A maximand is what is being maximized. In the problem “Maximize f(x, θ)

by choice of x”, the maximand is f.

mean-preserving spread See the Risk section below.

measure zero See “generically.”

metric The function d(w, z) defined over elements of set X is a metric if (1) d(w, z) > 0 if w ≠ z and d(w, z) = 0 if and only if w = z; (2) d(w, z) = d(z, w); and (3) d(w, z) ≤ d(w, y) + d(y, z) for points w, y, z ∈ X.

metric space Set X is a metric space if it is associated with a metric that deﬁnes the

distance between any two of its elements.

multiset A set of n elements in which position in the listing does not matter but multiplicity does. (Up, Down, Up) is a multiset, and (Down, Up, Up) is the same multiset, but (Up, Down) is a different multiset. See list and set.

n-Tuple (or Tuple). See list.

one-to-one The mapping f : X → Y is one-to-one if every point in set X maps to a different point in Y, so x_1 ≠ x_2 implies f(x_1) ≠ f(x_2). An example is f(x) = x/2 with X = [0, 1] and Y = [0, 2].

onto The mapping f : X → Y is onto Y if every point in Y is mapped onto by some point in X. An example is f(x) = x^2 with X = [−1, 1] and Y = [0, 1].

open In the space R^n, an open set is one that includes none of its boundary points. The set {x : 0 < x < 1000} is open; the set {x : 0 ≤ x < 1000} is neither open nor closed, since it includes one of its boundary points but not the other. In more general spaces, an open set is a member of a topology.


permutation The number of lists (sets with an order) of b elements from a set with a elements, which equals a!/(a − b)!. If we form lists of 2 elements from the set A = {w, x, y, z}, the possibilities are {w, x}, {x, w}, {w, y}, {y, w}, {w, z}, {z, w}, {x, y}, {y, x}, {x, z}, {z, x}, {y, z}, {z, y}. The number of these is 4!/(4 − 2)! = 24/2 = 12. (See combination for the unordered version.)

perpetuity A riskless security paying a constant amount each year in perpetuity, with

the amount conventionally paid at the end of each year.

quasi-concave The continuous function f is quasi-concave if for w ≠ z, f(0.5w + 0.5z) > min[f(w), f(z)], or, equivalently, if the set {x ∈ X | f(x) > b} is convex for any number b. Every concave function is quasi-concave, but not every quasi-concave function is concave.

quasilinear utility A utility function is quasilinear in variable w if a monotonic transformation can make it linear in w and w is separable from all other variables in the utility function. u(w, x) = w + √x and u(w, x) = log(w + √x) are quasilinear; u(w, x) = wx and u(w, x) = log(w) + √x are not.

range The range of a mapping is the set of elements to which it maps: the property over which it can spew its output. (The mapping maps from the domain onto the range.)

risk See the Risk section below.

set A set is a collection of objects in which position of listing and multiplicity do not matter. {Up, Down}, {Down, Up}, and {Up, Down, Down} are all the same set of 2 elements. See list and multiset.

stochastic dominance See the Risk section below.

strict The word “strict” is used in a variety of contexts to mean that a relationship does not hold with equality or is not arbitrarily close to being violated. If function f is concave and f′ > 0, then f″ ≤ 0, but if f is strictly concave, then f″ < 0. The opposite of “strictly” is “weakly.” The word “strong” is sometimes used as a synonym for “strict.”

supermodular See the Supermodularity section below.

support The support of a probability distribution F(x) is the closure of the set of values

of x such that the density is positive. If each output between 0 and 20 has a pos-

itive probability density, and no other output does, then the support of the output

distribution is [0,20].

topology Besides denoting a ﬁeld of mathematics, a topology is a collection of subsets of

a space called “open sets” that includes (1) the entire space and the empty set, (2)

the intersection of any ﬁnite number of open sets, and (3) the union of any number

of open sets. In a metric space, the metric “induces” a topology by deﬁning an open

set. Imposing a topology on a space is something like defining which elements are close to each other, which is easy to do for R^n but not for every space (e.g., spaces consisting of functions or of game trees).


upper semicontinuous correspondence The correspondence φ : X → Y is upper semicontinuous at point x_0 if

x_n → x_0, y_n ∈ φ(x_n), y_n → y_0, implies y_0 ∈ φ(x_0), (5)

which means that every sequence of points in φ(x) leads to a point also in φ(x). See Figure 2. An alternative definition, appropriate only if Y is compact, is that φ is upper semicontinuous if the set of points {x, φ(x)} is closed.

Figure 2 Upper Semicontinuity

vector A vector is a list (a set with order) that has a certain structure I will not describe here, but which, for example, a list of real numbers, a point in R^n, will satisfy. The point (2.5, 3, −4) is a vector in R^3.

weak The word “weak” is used in a variety of contexts to mean that a relationship might hold with equality or be on a borderline. If f is concave and f′ > 0, then f″ ≤ 0, but to say that f is weakly concave, while technically adding nothing to the meaning, emphasizes that f″ = 0 under some or all parameters. The opposite of “weak” is “strict” or “strong.”

*A.4 Formulas and Functions

log(xy) = log(x) + log(y).

log(x^2) = 2 log(x).

a^x = (e^{log(a)})^x.

e^{rt} = (e^r)^t.

e^{a+b} = e^a e^b.

a > b ⇒ ka < kb, if k < 0.

The Quadratic Formula: Let ax^2 + bx + c = 0. Then x = (−b ± √(b^2 − 4ac))/(2a).

Derivatives

f(x)      f′(x)
x^a       a x^{a−1}
1/x       −1/x^2
1/x^2     −2/x^3
e^x       e^x
e^{rx}    r e^{rx}
log(ax)   1/x
log(x)    1/x
a^x       a^x log(a)
f(g(x))   f′(g(x)) g′(x)

Determinants

| a_11  a_12 |
| a_21  a_22 |  =  a_11 a_22 − a_21 a_12.

| a_11  a_12  a_13 |
| a_21  a_22  a_23 |  =  a_11 a_22 a_33 − a_11 a_23 a_32 + a_12 a_23 a_31 − a_12 a_21 a_33 + a_13 a_21 a_32 − a_13 a_22 a_31.
| a_31  a_32  a_33 |

Table 1: Some Useful Functional Forms


f(x)       f′(x)          f″(x)            Slope for x > 0        Curvature
log(x)     1/x            −1/x^2           increasing             concave
√x         1/(2√x)        −1/(4x^{3/2})    increasing             concave
x^2        2x             2                increasing             convex
1/x        −1/x^2         2/x^3            decreasing             convex
7 − x^2    −2x            −2               decreasing             concave
7x − x^2   7 − 2x         −2               increasing/decreasing  concave

The signs of derivatives can be confusing. The function f(x) = x^2 is increasing at an increasing rate, but the function f(x) = 1/x is decreasing at a decreasing rate, even though f″ > 0 in each case.

*A.5 Probability Distributions

The definitive listing of probability distributions and their characteristics is the three-volume series of Johnson & Kotz (1970). A few major distributions are listed here. A probability distribution is the same as a cumulative density function for a continuous distribution. Any single value has infinitesimal probability if the distribution is continuous (unless there is a probability atom there), so rather than speaking of the probability of a value, we speak of the probability of the value being in an interval, or of the density at a single value.

The Exponential Distribution

The exponential distribution, which has the set of nonnegative real numbers as its support, has the density function for mean λ of

f(x) = e^{−x/λ}/λ. (6)

The cumulative density function is

F(x) = 1 − e^{−x/λ}. (7)

The Uniform Distribution

A variable is uniformly distributed over support X if each point in X has equal probability. If the support is [α, β], the mean is (α + β)/2, the density is

f(x) = 0 for x < α;  1/(β − α) for α ≤ x ≤ β;  0 for x > β, (8)

and the cumulative density function is

F(x) = 0 for x < α;  (x − α)/(β − α) for α ≤ x ≤ β;  1 for x > β. (9)

The Normal Distribution

The normal distribution is a two-parameter single-peaked distribution which has as its support the entire real line. The density function for mean µ and variance σ^2 is

f(x) = (1/√(2πσ^2)) e^{−(x−µ)^2/(2σ^2)}. (10)

The cumulative density function is the integral of this, often denoted Φ(x), which cannot be simplified analytically. You can find values at websites such as http://www.math2.org/math/stat/distributions/z-dist.htm or from a spreadsheet program.

The Lognormal Distribution

If log(x) has a normal distribution, x has a lognormal distribution. This is a skewed distribution which has the set of positive real numbers as its support, since the logarithm of a negative number is not defined. If log(x) has mean µ and variance σ^2, the mean of x is e^{µ + σ^2/2}.
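These densities and cumulative functions are easy to sanity-check numerically. The sketch below (parameter values are my own, illustrative choices) integrates the exponential density with a trapezoid rule and compares the result with F(x) = 1 − e^{−x/λ}; Python's standard library also supplies the normal CDF Φ via statistics.NormalDist.

```python
import math
from statistics import NormalDist

lam = 2.0   # mean of the exponential distribution

def exp_density(x):
    return math.exp(-x / lam) / lam

def exp_cdf_numeric(x, n=20_000):
    # Trapezoid-rule integral of the density from 0 to x.
    h = x / n
    s = 0.5 * (exp_density(0.0) + exp_density(x))
    for k in range(1, n):
        s += exp_density(k * h)
    return s * h

x = 3.0
F_closed = 1.0 - math.exp(-x / lam)   # formula (7)
F_numeric = exp_cdf_numeric(x)

# The standard normal CDF Φ; by symmetry Φ(0) = 0.5.
phi_zero = NormalDist(mu=0.0, sigma=1.0).cdf(0.0)

print(F_closed, F_numeric, phi_zero)
```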

*A.6 Supermodularity

Suppose that there are N players in a game, subscripted by i and j, and that player i has a strategy consisting of s_i elements, subscripted by s and t, so his strategy is the vector y^i = (y^i_1, . . . , y^i_{s_i}). Let his strategy set be S^i and his payoff function be π^i(y^i, y^{−i}; z), where z represents a fixed parameter. We say that the game is a supermodular game if the following four conditions are satisfied for every player i = 1, . . . , N:

(A1) S^i is a complete lattice.

(A2) π^i : S → R ∪ {−∞} is order semicontinuous in y^i for fixed y^{−i}, and order continuous in y^{−i} for fixed y^i, and has a finite upper bound.

(A3) π^i is supermodular in y^i, for fixed y^{−i}. For all strategy profiles y and y′ in S,

π^i(y) + π^i(y′) ≤ π^i(supremum{y, y′}) + π^i(infimum{y, y′}). (11)

(A4) π^i has increasing differences in y^i and y^{−i}. For all y^i ≥ y^i′, the difference π^i(y^i, y^{−i}) − π^i(y^i′, y^{−i}) is nondecreasing in y^{−i}.

In addition, it is sometimes useful to use a fifth assumption:

(A5) π^i has increasing differences in y^i and z for fixed y^{−i}; for all y^i ≥ y^i′, the difference π^i(y^i, y^{−i}, z) − π^i(y^i′, y^{−i}, z) is nondecreasing with respect to z.

The conditions for smooth supermodularity are:

(A1′) The strategy set is an interval in R^{s_i}:

S^i = [y̲^i, ȳ^i]. (12)

(A2′) π^i is twice continuously differentiable on S^i.

(A3′) (Supermodularity) Increasing one component of player i’s strategy does not decrease the net marginal benefit of any other component: for all i, and all s and t such that 1 ≤ s < t ≤ s_i,

∂^2 π^i / (∂y^i_s ∂y^i_t) ≥ 0. (13)

(A4′) (Increasing differences in one’s own and other strategies) Increasing one component of i’s strategy does not decrease the net marginal benefit of increasing any component of player j’s strategy: for all i ≠ j, and all s and t such that 1 ≤ s ≤ s_i and 1 ≤ t ≤ s_j,

∂^2 π^i / (∂y^i_s ∂y^j_t) ≥ 0. (14)

The fifth assumption becomes

(A5′) (Increasing differences in one’s own strategies and parameters) Increasing parameter z does not decrease the net marginal benefit to player i of any component of his own strategy: for all i, and all s such that 1 ≤ s ≤ s_i,

∂^2 π^i / (∂y^i_s ∂z) ≥ 0. (15)
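The smooth cross-partial conditions can be checked numerically for a given payoff function. This sketch is my own example, not from the text: it assumes a two-player game with one-dimensional strategies and payoff π_i = y_i y_j − 0.5 y_i², whose cross-partial is identically 1 ≥ 0, so strategies are complements.

```python
def pi_i(y_i, y_j):
    # Player i's payoff (assumed, illustrative): complementarity comes
    # from the y_i * y_j term.
    return y_i * y_j - 0.5 * y_i ** 2

def cross_partial(f, y_i, y_j, h=1e-4):
    # Central-difference estimate of d^2 f / (dy_i dy_j).
    return (f(y_i + h, y_j + h) - f(y_i + h, y_j - h)
            - f(y_i - h, y_j + h) + f(y_i - h, y_j - h)) / (4 * h * h)

# The cross-partial of y_i*y_j - 0.5*y_i**2 is 1 everywhere, so the
# increasing-differences condition holds at every point of this grid.
grid = [0.0, 0.5, 1.0, 2.0]
values = [cross_partial(pi_i, a, b) for a in grid for b in grid]
print(min(values))   # ≈ 1, so the condition holds on the grid
```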

Theorem 1

If the game is supermodular, there exists a largest and smallest Nash equilibrium in pure

strategies.

Theorem 1 is useful because it shows (a) existence of an equilibrium in pure strategies,

and (b) if there are at least two equilibria (note that the largest and smallest equilibria

might be the same strategy proﬁle), then two of them can be ranked in the magnitudes of

the components of each player’s equilibrium strategy.

Theorem 2

If the game is supermodular and assumption (A5′) is satisfied, then the largest and smallest equilibria are nondecreasing functions of the parameter z.

Theorem 3

If a game is supermodular, then for each player there is a largest and smallest serially

undominated strategy, where both of these strategies are pure.

Theorem 4

Let y̲^i denote the smallest element of player i’s strategy set S^i in a supermodular game. Let ȳ* and y̲* denote two equilibria, with ȳ* ≥ y̲*, so ȳ* is the “big” equilibrium. Then,

1. If π^i(y^i, y^{−i}) is increasing in y^{−i}, then π^i(ȳ*) ≥ π^i(y̲*).

2. If π^i(y^i, y^{−i}) is decreasing in y^{−i}, then π^i(ȳ*) ≤ π^i(y̲*).

3. If the condition in (1) holds for a subset N_1 of players, and the condition in (2) holds for the remainder of the players, then the big equilibrium ȳ* is the best equilibrium for players in N_1 and the worst for the remaining players, and the small equilibrium y̲* is the worst equilibrium for players in N_1 and the best for the remaining players.

The theorems here are taken from Milgrom & Roberts (1990). Theorem 1 is their

corollary to Theorem 5. Theorem 2 is their Theorem 6 and corollary. Theorem 3 is their

Theorem 5, and Theorem 4 is their Theorem 7. For more on supermodularity, see Milgrom &

Roberts (1990), Fudenberg & Tirole (1991, pp. 489-497), or Vives’s 2005 survey article.

For a mathematician’s view, see Topkis’s 1998 Supermodularity and Complementarity.

*A.7 Fixed Point Theorems

Fixed point theorems say that various kinds of mappings from one set to another result in at least one point being mapped back onto itself. The most famous fixed point theorem is Brouwer’s Theorem, illustrated in Figure 3. I will use a formulation from page 952 of Mas-Colell, Whinston & Green (1994).

Figure 3 A Mapping with Three Fixed Points

The Brouwer Fixed Point Theorem. Suppose that set A in R^N is nonempty, compact, and convex; and that f : A → A is a continuous function from A into itself. (“Compact” means closed and bounded, in Euclidean space.) Then f has a fixed point; that is, there is an x in A such that x = f(x).

The usefulness of ﬁxed point theorems is that an equilibrium is a ﬁxed point. Consider

equilibrium prices. Let p be a point in N-space consisting of one price for each good. Let

P be the set of all possible price points. P will be convex and compact if we limit it

to ﬁnite prices. Agents in the economy look at p and make decisions about consumption

and output. These decisions change p into f(p). An equilibrium is a point p∗ such that

f(p∗) = p∗. If you can show that f is continuous, you can show that p∗ exists.

This is also true for Nash equilibria. Let s be a point in N-space consisting of

one strategy for each player – a strategy proﬁle. Let S be the set of all possible strategy

proﬁles. This will be compact and convex if we allow mixed strategies (for convexity) and

if strategy sets are closed and bounded. Each strategy proﬁle s will cause each player to

react by choosing his best response f(s). A Nash equilibrium is s∗ such that f(s∗) = s∗.

If you can show that f is continuous – which you can do if payoﬀ functions are continuous

– you can show that s∗ exists.

The Brouwer theorem is useful in itself, and conveys the intuition of ﬁxed point the-

orems, but to prove existence of prices in general equilibrium and existence of Nash equi-

librium in game theory requires the Kakutani ﬁxed point theorem. That is because the

mappings involved are not one-to-one functions, but one-to-many-point correspondences.

In general equilibrium, one ﬁrm might be indiﬀerent between producing various amounts

of output. In game theory, one player might have two best responses to another player’s

strategy.

The Kakutani Fixed Point Theorem (Kakutani [1941]). Suppose that set A in R^N is a nonempty, compact, convex set and that f : A → A is an upper hemicontinuous correspondence from A into itself, with the property that the set f(x) is nonempty and convex for every x. Then f has a fixed point; that is, there is an x in A such that x is one element of f(x).

Other ﬁxed point theorems exist for other kinds of mappings — for example for a

mapping from a set of functions back into itself. In deciding which theorem to use, care

must be taken to identify the mathematical nature of the set of strategy proﬁles and the

smoothness of the best response functions.
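As an illustration of equilibrium-as-fixed-point, here is a sketch with assumed linear Cournot payoffs (my own example, not from the text): iterating the joint best-response map converges to its fixed point, which is the Nash equilibrium of the quantity-setting game.

```python
a_demand, c_cost = 10.0, 1.0   # assumed: inverse demand P = a - Q, marginal cost c

def best_response(q_other):
    # Maximizes (a - q_i - q_other - c) * q_i over q_i >= 0.
    return max(0.0, (a_demand - c_cost - q_other) / 2.0)

q1, q2 = 0.0, 0.0
for _ in range(100):
    # Apply the joint best-response map; a fixed point of this map
    # is a Nash equilibrium.
    q1, q2 = best_response(q2), best_response(q1)

# Symmetric Cournot-Nash quantity: (a - c)/3 = 3.
print(q1, q2)
```

The best-response map here has slope −1/2, so it is a contraction and the iteration converges from any starting point; with discontinuous payoffs or multi-valued best responses this simple iteration can fail, which is why the Kakutani theorem is needed in general.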

*A.8 Genericity

Suppose we have a space X consisting of the interval between 0 and 100 on the real

line, [0, 100], and a function f such that f(x) = 3 except that f(15) = 5. We can then say

any of the following:

1 f(x) = 3 except on a set of measure zero.

2 f(x) = 3 except on a null set.

3 Generically, f(x) = 3.


4 f(x) = 3 almost always.

5 The set of x such that f(x) = 3 is dense in X.

6 The set of x such that f(x) = 3 has full measure.

These all convey the idea that if parameters are picked using a continuous random density, f(x) will be 3 with probability one, and any other value of f(x) is very special in that sense. If you start with point x, and add a small random perturbation ε, then with probability one, f(x + ε) = 3. So unless there is some special reason for x to take the particular value of 15, you can count on observing f(x) = 3.

Statements like these always depend on the deﬁnition of the space X. If, instead, we

took a space Y consisting of the integers between 0 and 100, which is 0,1,2,...,100, then it is

not true that “f(x) = 3 except on a set of measure zero.” Instead, if x is chosen randomly,

there is a 1/101 probability that x = 15 and f(x) = 5.

The concept of “a set of measure zero” becomes more diﬃcult to implement if the

space X is not just a ﬁnite interval. I have not deﬁned the concept in these notes; I have

just pointed to usage. This, however, is enough to be useful to you. A course in real

analysis would teach you the deﬁnitions. As with the concepts of “closed” and “bounded”,

complications can arise even in economic applications because of inﬁnite spaces and in

dealing with spaces of functions, game tree branchings, or other such objects.

Now, let us apply the idea to games. Here is an example of a theorem that uses

genericity.

Theorem: “Generically, all ﬁnite games of perfect information have a unique subgame

perfect equilibrium.”

Proof. A game of perfect information has no simultaneous moves, and consists of a tree

in which each player moves in sequence. Since the game is ﬁnite, each path through the

tree leads to an end node. For each end node, consider the decision node just before it.

The player making the decision there has a ﬁnite number N of choices, since this is a ﬁnite

game. Denote the payoffs from these choices as (P_1, P_2, ..., P_N). This set of payoffs has

a unique maximum, because generically no two payoﬀs will be equal. (If they were, and

you perturbed the payoﬀs a little, with probability one they would no longer be equal, so

games with equal payoﬀs have measure zero.) The player will pick the action with the

biggest payoﬀ. Every subgame perfect equilibrium must specify that the players choose

those actions, since they are the unique Nash strategies in the subgames at the end of the

game.

Next, consider the next-to-last decision nodes. The player making the decision at such

a node has a ﬁnite number of choices, and using the payoﬀs determined from the optimal

choice of ﬁnal moves, he will ﬁnd that some move has the maximum payoﬀ. The subgame

perfect equilibrium must specify that move. Continue this procedure until you reach the

very ﬁrst move of the game. The player there will ﬁnd that some one of his ﬁnite moves has

the largest payoﬀ, and he will pick that one move. Each player will have one best action

choice at each node, and so the equilibrium will be unique. Q.E.D.


Genericity entered this as the condition that we are ignoring special games in the

theorem’s statement – games that have tied payoﬀs. Whether those are really special or

not depends on the context.
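The backward-induction argument in the proof can be sketched in code. The tree representation and helper names below are my own; payoffs are drawn from a continuous distribution, so ties occur with probability zero and the computed subgame perfect equilibrium is (generically) unique.

```python
import random

random.seed(0)  # any seed; continuous payoffs mean ties have probability zero

def solve(node, player):
    """Backward induction: return (payoff_vector, move) chosen at this node."""
    if isinstance(node, tuple):          # leaf: (payoff_player0, payoff_player1)
        return node, None
    # Internal node: a list of subtrees. The mover picks the child that
    # maximizes his own payoff; players alternate down the tree.
    best = None
    for move, child in enumerate(node):
        payoffs, _ = solve(child, 1 - player)
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, move)
    return best

def random_tree(depth):
    if depth == 0:
        return (random.random(), random.random())   # continuous payoffs: no ties
    return [random_tree(depth - 1) for _ in range(2)]

tree = random_tree(3)
payoffs, first_move = solve(tree, player=0)
print(payoffs, first_move)
```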

*A.9 Discounting

A model in which the action takes place in real time must specify whether payments

and receipts are valued less if they are made later, i.e., whether they are discounted.

Discounting is measured by the discount rate or the discount factor.

The discount rate, r, is the extra fraction of a unit of value needed to compensate for

delaying receipt by one period.

The discount factor, δ, is the equivalent in present units of value of one unit to be received

one period from the present.

The discount rate is analogous to the interest rate, and in some models the interest

rate determines the discount rate. The discount factor represents exactly the same idea as

the discount rate, and δ = 1/(1 + r). Models use r or δ depending on notational convenience.

Not discounting is equivalent to r = 0 and δ = 1, so the notation includes zero discounting

as a special case.

Whether to put discounting into a model involves two questions. The ﬁrst is whether

the added complexity will be accompanied by a change in the results or by a surprising

demonstration of no change in the results. A second, more speciﬁc question is whether the

events of the model occur in real time, so that discounting is appropriate. The bargaining

game of Alternating Oﬀers from Section 12.3 can be interpreted in two ways. One way is

that the players make all their oﬀers and counteroﬀers between dawn and dusk of a single

day, so essentially no real time has passed. The other way is that each oﬀer consumes a

week of time, so that the delay before the bargain is reached is important to the players.

Discounting is appropriate only in the second interpretation.

Discounting has two important sources: time preference and a probability that the game might end, represented by the rate of time preference, ρ, and the probability each period that the game ends, θ. It is usually assumed that ρ and θ are constant. If they both take the value zero, the player does not care whether his payments are scheduled now or ten years from now. Otherwise, a player is indifferent between x/(1 + ρ) now and x guaranteed to be paid one period later. With probability (1 − θ) the game continues and the later payment is actually made, so the player is indifferent between (1 − θ)x/(1 + ρ) now and the promise of x to be paid one period later contingent upon the game still continuing. The discount factor is therefore

δ = 1/(1 + r) = (1 − θ)/(1 + ρ). (16)

Table 2 summarizes the implications of discounting for the value of payment streams of
various kinds. We will not go into how these are derived, but they all stem from the basic
fact that a dollar paid in the future is worth δ dollars now. Continuous time models usually
refer to rates of payment rather than lump sums, so the discount factor is not so useful a
concept, but discounting works the same way as in discrete time except that payments are
continuously compounded. For a full explanation, see a finance text (e.g., Appendix A of
Copeland & Weston [1988]).

Table 2 Discounting

                                                        Discounted Value

Payoff Stream                                    r-notation               δ-notation
                                                 (discount rate)          (discount factor)

x at the end of one period                       x/(1+r)                  δx
x at the end of each period in perpetuity        x/r                      δx/(1−δ)
x at the start of each period in perpetuity      x + x/r                  x/(1−δ)
x at the end of each period up through T:
  (first formula)                                Σ_{t=1}^{T} x/(1+r)^t    Σ_{t=1}^{T} δ^t x
  (second formula)                               (x/r)[1 − 1/(1+r)^T]     [δx/(1−δ)](1 − δ^T)
x at time t in continuous time                   xe^{−rt}                 —
Flow of x per period up to time T,
  in continuous time                             ∫_0^T xe^{−rt} dt        —
Flow of x per period in perpetuity,
  in continuous time                             x/r                      —
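The discrete-time entries of Table 2 can be confirmed numerically. The sketch below, with arbitrary illustrative values of x, r, and T (not from the text), checks that the two annuity formulas agree with the direct sum of discounted payments, and that the r- and δ-notation perpetuity formulas coincide.

```python
# Sketch: numerically confirm the discrete-time entries of Table 2
# (illustrative values of x, r, T).
x, r, T = 10.0, 0.07, 25
delta = 1 / (1 + r)

# First annuity formula: direct sum of discounted end-of-period payments.
direct = sum(x / (1 + r) ** t for t in range(1, T + 1))

# Second annuity formula: closed form (x/r) * (1 - 1/(1+r)^T).
closed_r = (x / r) * (1 - 1 / (1 + r) ** T)

# delta-notation version: (delta*x/(1-delta)) * (1 - delta^T).
closed_d = (delta * x / (1 - delta)) * (1 - delta ** T)

assert abs(direct - closed_r) < 1e-9
assert abs(direct - closed_d) < 1e-9

# Perpetuities: end-of-period x/r, start-of-period x + x/r,
# and their delta-notation equivalents.
assert abs(x / r - delta * x / (1 - delta)) < 1e-9
assert abs((x + x / r) - x / (1 - delta)) < 1e-9
```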

The way to remember the formula for an annuity over a period of time is to use the
formulas for a payment at a certain time in the future and for a perpetuity. A stream of
x paid at the end of each year is worth x/r. A payment of Y at the end of period T has a
present value of Y/(1+r)^T. Thus, if at the start of period T you must pay out a perpetuity
of x at the end of each year, the present value of that payment is (x/r)(1/(1+r))^T. One may
also view a stream of payments each year from the present until period T as the same
thing as owning a perpetuity but having to give away a perpetuity in period T. This leaves
a present value of (x/r)[1 − (1/(1+r))^T], which is the second formula for an annuity given in
Table 2. Figure 4 illustrates this approach to annuities and shows how it can also be used
to value a stream of income that starts at period S and ends at period T.
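The perpetuity-difference argument can be checked numerically. This is a sketch with arbitrary values of x, r, S, and T; the last assertion illustrates the Figure 4 idea of valuing a stream that runs between two future dates as a difference of two annuities.

```python
# Sketch: an annuity through T equals a perpetuity now minus a
# perpetuity handed over at the start of period T (arbitrary values).
x, r, T = 8.0, 0.04, 30

perpetuity_now = x / r
# Perpetuity given away at period T, discounted back to the present.
perpetuity_at_T = (x / r) * (1 / (1 + r)) ** T

annuity = perpetuity_now - perpetuity_at_T

# Same number as summing the T discounted payments directly.
direct = sum(x / (1 + r) ** t for t in range(1, T + 1))
assert abs(annuity - direct) < 1e-9

# A stream paid in periods S+1 through T: difference of two annuities.
S = 10
stream = (x / r) * ((1 / (1 + r)) ** S - (1 / (1 + r)) ** T)
assert abs(stream - sum(x / (1 + r) ** t for t in range(S + 1, T + 1))) < 1e-9
```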


Figure 4: Discounting

Discounting will be left out of most dynamic games in this book, but it is an especially

important issue in inﬁnitely repeated games, and is discussed further in Section 5.2.

*A.10 Risk

We say that a player is risk averse if his utility function is strictly concave in money,

which means that he has diminishing marginal utility of money. He is risk neutral if his

utility function is linear in money. The qualiﬁer “in money” is used because utility may be

a function of other variables too, such as eﬀort.

We say that probability distribution F dominates distribution G in the sense of ﬁrst-

order stochastic dominance if the cumulative probability that the variable will take a

value less than x is greater for G than for F, i.e. if

for any x, F(x) ≤ G(x), (17)

and (17) is a strong inequality for at least one value of x. The distribution F dominates
G in the sense of second-order stochastic dominance if the area under the cumulative
distribution G up to x is greater than the area under F, i.e. if

    for any x, ∫_{−∞}^{x} F(y)dy ≤ ∫_{−∞}^{x} G(y)dy,     (18)

and (18) is a strong inequality for some value of x. Equivalently, F dominates G if, limiting
U to increasing functions for first-order dominance and increasing concave functions for
second-order dominance,

    for all functions U, ∫_{−∞}^{+∞} U(x)dF(x) > ∫_{−∞}^{+∞} U(x)dG(x).     (19)

If F is a ﬁrst-order dominant gamble, it is preferred by all players; if F is a second-order

dominant gamble, it is preferred by all risk-averse players. If F is ﬁrst-order dominant it

is second-order dominant, but not vice versa.

Milgrom (1981b) has used stochastic dominance to carefully define what we mean by
good news. Let θ be a parameter about which the news is received in the form of message
x or y, and let utility be increasing in θ. The message x is more favorable than y (is “good
news”) if for every possible nondegenerate prior F(θ), the posterior F(θ|x) first-order
dominates F(θ|y).

Rothschild & Stiglitz (1970) shows how two gambles can be related in other ways equiv-
alent to second-order dominance, the most important of which is the mean-preserving
spread. Informally, a mean-preserving spread is a density function which transfers prob-
ability mass from the middle of a distribution to its tails. More formally, for discrete
distributions placing sufficient probability on the four points a1, a2, a3, and a4:

A mean-preserving spread is a set of four locations a1 < a2 < a3 < a4 and four
probabilities γ1 ≥ 0, γ2 ≤ 0, γ3 ≤ 0, γ4 ≥ 0 such that −γ1 = γ2, γ3 = −γ4, and
Σ_i γ_i a_i = 0.

Figure 5: Mean-Preserving Spreads

Figure 5 shows how this works. Panel (a) shows the original distribution with solid
bars. The mean is 0.1(2) + 0.3(3) + 0.3(4) + 0.2(5) + 0.1(6), which is 3.9. The spread has
mean 0.1(1) − 0.3(3) − 0.1(4) + 0.2(6), which is 0, so it is mean-preserving. Panel (b) shows
the resulting spread-out distribution.

The deﬁnition can be extended to continuous distributions, and can be alternatively

deﬁned by taking probability mass from one point in the middle and moving it to the sides;

panel (c) of Figure 5 shows an example. See Rasmusen & Petrakis (1992), which also, with

Leshno, Levy, & Spector (1997), ﬁxes an error in the original Rothschild & Stiglitz proof.
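A spread satisfying the four-point definition can be applied to the panel (a) distribution and checked numerically. The γ values below are one admissible choice, assumed purely for illustration; they are not necessarily the exact spread drawn in Figure 5.

```python
# Sketch: apply a mean-preserving spread to the panel (a) distribution.
original = {2: 0.1, 3: 0.3, 4: 0.3, 5: 0.2, 6: 0.1}

# Four locations a1 < a2 < a3 < a4 with -g1 = g2, g3 = -g4, sum(g*a) = 0.
# (Illustrative choice, not necessarily the spread of Figure 5.)
spread = {1: +0.1, 3: -0.1, 5: -0.2, 6: +0.2}
assert abs(sum(g * a for a, g in spread.items())) < 1e-12

new = dict(original)
for a, g in spread.items():
    new[a] = new.get(a, 0.0) + g

def mean(dist):
    return sum(a * p for a, p in dist.items())

assert abs(sum(new.values()) - 1.0) < 1e-12        # still a distribution
assert all(p >= -1e-12 for p in new.values())      # no negative mass
assert abs(mean(new) - mean(original)) < 1e-12     # mean preserved at 3.9
```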

Hazard Rates

The hazard rate has been important in the analysis above. Suppose we have a density
f(v) for a buyer’s value for an object being sold, with cumulative distribution F(v) on
support [v̲, v̄]. The hazard rate h(v) is defined as

    h(v) = f(v)/(1 − F(v))     (20)

What this means is that h(v) is the probability density of v for the distribution which is
like F(v) except cutting off the lower values, so its support is limited to [v, v̄]. In economic
terms, h(v) is the probability density for the value equalling v given that we know that the
value equals at least v.

For most distributions we use, the hazard rate is increasing, including the uniform,

normal, logistic, and exponential distributions, and any distribution with increasing density

over its support (see Bagnoli & Bergstrom [1994]). Figure 6 shows three of them.
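As a small sketch of definition (20), the hazard rates of two of the named distributions can be computed directly. Note that the exponential’s hazard works out to the constant 1/λ, so it is increasing only in the weak sense.

```python
# Sketch: hazard rates h(v) = f(v)/(1 - F(v)) for two named distributions.
import math

def hazard_uniform(v):
    # Uniform on [0, 1]: f(v) = 1, F(v) = v for v in [0, 1).
    return 1.0 / (1.0 - v)

def hazard_exponential(v, lam=2.0):
    # Exponential with mean lam: f(v) = (1/lam) e^{-v/lam}.
    f = math.exp(-v / lam) / lam
    F = 1.0 - math.exp(-v / lam)
    return f / (1.0 - F)

# The uniform hazard is strictly increasing over its support...
hs = [hazard_uniform(v) for v in [0.1, 0.3, 0.5, 0.7, 0.9]]
assert all(a < b for a, b in zip(hs, hs[1:]))

# ...while the exponential hazard is constant at 1/lam = 0.5.
assert all(abs(hazard_exponential(v) - 0.5) < 1e-12 for v in [0.0, 1.0, 5.0])
```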


Figure 6: Three Densities To Illustrate Hazard Rates



Genericity entered this as the condition that we are ignoring special games in the theorem's statement – games that have tied payoffs. Whether those are really special or not depends on the context.

*A.9 Discounting

A model in which the action takes place in real time must specify whether payments and receipts are valued less if they are made later, i.e., whether they are discounted. Whether to put discounting into a model involves two questions. The first is whether the added complexity will be accompanied by a change in the results or by a surprising demonstration of no change in the results. A second, more specific question is whether the events of the model occur in real time, so that discounting is appropriate.

The bargaining game of Alternating Offers from Section 12.3 can be interpreted in two ways. One way is that the players make all their offers and counteroffers between dawn and dusk of a single day, so essentially no real time has passed. The other way is that each offer consumes a week of time, so that the delay before the bargain is reached is important to the players. Discounting is appropriate only in the second interpretation.

Discounting is measured by the discount rate or the discount factor. The discount rate, r, is the extra fraction of a unit of value needed to compensate for delaying receipt by one period. The discount rate is analogous to the interest rate, and in some models the interest rate determines the discount rate. The discount factor, δ, is the equivalent in present units of value of one unit to be received one period from the present. The discount factor represents exactly the same idea as the discount rate, and δ = 1/(1+r). Not discounting is equivalent to r = 0 and δ = 1, so the notation includes zero discounting as a special case. Models use r or δ depending on notational convenience.

Discounting has two important sources: time preference and a probability that the game might end, represented by the rate of time preference, ρ, and the probability each period that the game ends, θ. It is usually assumed that ρ and θ are constant. If they both take the value zero, the player does not care whether his payments are scheduled now or ten years from now. Otherwise, a player is indifferent between x/(1+ρ) now and x guaranteed to be paid one period later. With probability (1 − θ) the game continues and the later payment is actually made, so the player is indifferent between (1 − θ)x/(1 + ρ) now and the promise of x to be paid one period later contingent upon the game still continuing. The discount factor is therefore

δ = 1/(1 + r) = (1 − θ)/(1 + ρ).   (16)

Table 2 summarizes the implications of discounting for the value of payment streams of various kinds. We will not go into how these are derived, but they all stem from the basic fact that a dollar paid in the future is worth δ dollars now. Continuous time models usually refer to rates of payment rather than lump sums, so the discount factor is not so useful a concept, but discounting works the same way as in discrete time except that payments are continuously compounded.
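Equation (16) is easy to check numerically. A small sketch, with invented values for ρ and θ:

```python
# Discount factor combining the rate of time preference rho with the
# per-period probability theta that the game ends (equation 16):
#   delta = (1 - theta) / (1 + rho)
rho, theta = 0.05, 0.10          # illustrative values, not from the text

delta = (1 - theta) / (1 + rho)
r = 1 / delta - 1                # the equivalent discount rate, since delta = 1/(1+r)

# A promise of x one period later, contingent on the game continuing,
# is worth delta * x now -- the same as (1 - theta) * x / (1 + rho).
x = 100.0
print(delta, r, delta * x)
```

Setting ρ = θ = 0 gives δ = 1 and r = 0, the no-discounting special case mentioned in the text.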

Table 2: Discounting

Payoff stream                                     Discounted value
                                                  r-notation               δ-notation
                                                  (discount rate)          (discount factor)

x at the end of one period                        x/(1+r)                  δx
x at the end of each period in perpetuity         x/r                      δx/(1−δ)
x at the start of each period in perpetuity       x + x/r                  x/(1−δ)
x at the end of each period up through T
  (first formula)                                 Σ_{t=1}^{T} x/(1+r)^t    Σ_{t=1}^{T} δ^t x
x at the end of each period up through T
  (second formula)                                (x/r)(1 − 1/(1+r)^T)     (δx/(1−δ))(1 − δ^T)
x at time t in continuous time                    xe^{−rt}                 —
Flow of x per period up to time T,
  in continuous time                              ∫_0^T xe^{−rt} dt        —
Flow of x per period in perpetuity,
  in continuous time                              x/r                      —

The way to remember the formula for an annuity over a period of time is to use the formulas for a payment at a certain time in the future and for a perpetuity. A stream of x paid at the end of each year is worth x/r. A payment of Y at the end of period T has a present value of Y/(1+r)^T. One may also view a stream of payments each year from the present until period T as the same thing as owning a perpetuity but having to give away a perpetuity in period T. Thus, if at the start of period T you must pay out a perpetuity of x at the end of each year, the present value of that payment is (x/r)/(1+r)^T. This leaves a present value of (x/r)(1 − (1/(1+r))^T), which is the second formula for an annuity given in Table 2. Figure 4 illustrates this approach to annuities and shows how it can also be used to value a stream of income that starts at period S and ends at period T. For a full explanation, see a finance text (e.g., Appendix A of Copeland & Weston [1988]).
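The annuity identity in Table 2 can be confirmed by brute force: summing the discounted payments term by term gives the same value as the perpetuity-minus-delayed-perpetuity formula. The numbers below are invented for illustration.

```python
# Check the annuity identity from Table 2:
#   sum_{t=1}^{T} x/(1+r)^t  ==  (x/r) * (1 - 1/(1+r)**T)
# i.e. an annuity through T equals a perpetuity owned now minus a
# perpetuity given away at period T.
x, r, T = 100.0, 0.08, 20        # illustrative values

brute = sum(x / (1 + r) ** t for t in range(1, T + 1))

perpetuity = x / r
delayed_perpetuity = (x / r) / (1 + r) ** T
closed_form = perpetuity - delayed_perpetuity

print(brute, closed_form)        # the two agree up to rounding
```

The same decomposition values a stream running from period S to period T: subtract the perpetuity given away at S from the one given away at T.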

Figure 4: Discounting

Discounting will be left out of most dynamic games in this book, but it is an especially important issue in infinitely repeated games.

*A.10 Risk

We say that a player is risk averse if his utility function is strictly concave in money, which means that he has diminishing marginal utility of money. He is risk neutral if his utility function is linear in money. The qualifier "in money" is used because utility may be a function of other variables too, such as effort, and is discussed further in Section 5.

We say that probability distribution F dominates distribution G in the sense of first-order stochastic dominance if the cumulative probability that the variable will take a value less than x is greater for G than for F, i.e., if for any x,

F(x) ≤ G(x),   (17)

and (17) is a strong inequality for at least one value of x.

The distribution F dominates G in the sense of second-order stochastic dominance if the area under the cumulative distribution G up to x is greater than the area under F. Equivalently, F dominates G if, for any x,

∫_{−∞}^{x} F(y)dy ≤ ∫_{−∞}^{x} G(y)dy,   (18)

and (18) is a strong inequality for some value of x. Equivalently,

∫_{−∞}^{+∞} U(x)dF(x) > ∫_{−∞}^{+∞} U(x)dG(x)   (19)

for all functions U, limiting U to increasing functions for first-order dominance and increasing concave functions for second-order dominance.
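Conditions (17) and (18) can be checked mechanically for distributions on a finite grid. A minimal sketch: the two CDFs are invented for illustration, and a running sum stands in for the integral on a unit-spaced grid.

```python
# First- and second-order stochastic dominance on a common finite grid.
# F dominates G first-order if F(x) <= G(x) everywhere, strictly somewhere;
# second-order replaces the CDFs by their running sums (areas under them).
from itertools import accumulate

def first_order_dominates(F, G):
    return (all(f <= g for f, g in zip(F, G))
            and any(f < g for f, g in zip(F, G)))

def second_order_dominates(F, G):
    IF, IG = list(accumulate(F)), list(accumulate(G))   # areas under the CDFs
    return (all(a <= b for a, b in zip(IF, IG))
            and any(a < b for a, b in zip(IF, IG)))

# CDFs of two invented distributions on the grid points 1, 2, 3, 4.
F = [0.1, 0.3, 0.6, 1.0]
G = [0.2, 0.5, 0.8, 1.0]

print(first_order_dominates(F, G))    # True: F puts more weight on high values
print(second_order_dominates(F, G))   # True: implied by first-order dominance
```

The example also illustrates the one-way implication in the text: first-order dominance of F over G forces second-order dominance, but not vice versa.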

If F is a first-order dominant gamble, it is preferred by all players; if F is a second-order dominant gamble, it is preferred by all risk-averse players. If F is first-order dominant it is second-order dominant, but not vice versa.

Milgrom (1981b) has used stochastic dominance to carefully define what we mean by good news. Let θ be a parameter about which the news is received in the form of message x or y, and let utility be increasing in θ. The message x is more favorable than y (is "good news") if for every possible nondegenerate prior for F(θ), the posterior F(θ|x) first-order dominates F(θ|y).

Rothschild & Stiglitz (1970) shows how two gambles can be related in other ways equivalent to second-order dominance, the most important of which is the mean-preserving spread. Informally, a mean-preserving spread is a density function which transfers probability mass from the middle of a distribution to its tails. More formally, for discrete distributions placing sufficient probability on the four points a1, a2, a3, and a4, a mean-preserving spread is a set of four locations a1 < a2 < a3 < a4 and four probabilities γ1 ≥ 0, γ2 ≤ 0, γ3 ≤ 0, γ4 ≥ 0 such that −γ1 = γ2, γ3 = −γ4, and Σi γi ai = 0.

Figure 5: Mean-Preserving Spreads

Figure 5 shows how this works. Panel (a) shows the original distribution with solid bars. The mean is 0.1(2) + 0.3(3) + 0.3(4) + 0.2(5) + 0.1(6), which is 3.9.
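The four-point definition can be checked mechanically. A sketch, using locations and γ's invented here (not necessarily those of Figure 5) that satisfy the constraints −γ1 = γ2, γ3 = −γ4, and Σ γi·ai = 0:

```python
# A mean-preserving spread: probability shifts gamma at four locations
# a1 < a2 < a3 < a4, taking mass from the middle points a2, a3 and
# putting it on the tail points a1, a4, leaving the mean unchanged.
a = [1, 3, 5, 6]                 # invented locations, a1 < a2 < a3 < a4
gamma = [0.1, -0.1, -0.2, 0.2]   # invented, with gamma2 = -gamma1, gamma3 = -gamma4

assert gamma[1] == -gamma[0] and gamma[2] == -gamma[3]
mean_shift = sum(g * loc for g, loc in zip(gamma, a))
print(mean_shift)                # 0: the spread preserves the mean

# Apply it to a distribution on the points 1..6 (probabilities as in the
# Figure 5 example, with zero mass initially at 1):
p0 = {1: 0.0, 2: 0.1, 3: 0.3, 4: 0.3, 5: 0.2, 6: 0.1}
p = dict(p0)
for loc, g in zip(a, gamma):
    p[loc] += g

mean_before = sum(loc * w for loc, w in p0.items())
mean_after = sum(loc * w for loc, w in p.items())
print(mean_before, mean_after)   # both 3.9
```

The perturbed distribution is riskier but has the same mean, which is why every risk-averse player dislikes the spread while a risk-neutral player is indifferent.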

3(. v]. ﬁxes an error in the original Rothschild & Stiglitz proof. and any distribution with increasing density over its support (see Bagnoli & Bergstrom [1994]). & Spector (1997). normal.1(4) + 0. For most distributions we use. The hazard rate h(v) is deﬁned as h(v) = f (v) 1 − F (v) (20) What this means is that h(v) is the probability density of v for the distribution which is like F (v) except cutting oﬀ the lower values. and exponential distributions. Levy. panel (c) of Figure 5 shows an example. 521 . with Leshno. including the uniform. which also.1(1) − 0. with cumulative distribution F (v) on support [v. so its support is limited to [v. Panel (b) shows the resulting spread-out distribution. Hazard Rates The hazard rate has been important in the analysis above. which is 0. The deﬁnition can be extended to continuous distributions.3) − 0.2(6). Suppose we have a density f (v) for a buyer’s value for an object being sold. In economic terms. logistic. the hazard rate is increasing. h(v) is the probability density for the valuing equalling v given that we know that value equals at least v. Figure 6 shows three of them. v]. and can be alternatively deﬁned by taking probability mass from one point in the middle and moving it to the sides. so it is mean-preserving. See Rasmusen & Petrakis (1992).mean 0.

Figure 6: Three Densities To Illustrate Hazard Rates
