
Asset Pricing

Teaching Notes

João Pedro Pereira


Nova School of Business and Economics
Universidade Nova de Lisboa
joao.pereira@novasbe.pt
http://docentes.fe.unl.pt/~jpereira/

June 18, 2015


Contents

1 Introduction 5

2 Choice theory 7
2.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 The utility function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.1 Choice under certainty . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.2 Choice under uncertainty . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.3 Interpretation of utility numbers . . . . . . . . . . . . . . . . . . . 11
2.3 Risk aversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.1 Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 Measures of risk aversion . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.3 Risk neutrality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4 Important utility functions . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.5 Certainty Equivalent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.6 Stochastic dominance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.6.1 First Order Stochastic Dominance . . . . . . . . . . . . . . . . . . 18
2.6.2 Second Order Stochastic Dominance . . . . . . . . . . . . . . . . . 19
2.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

3 Portfolio choice 24
3.1 Canonical portfolio problem . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 Analysis of the optimal portfolio choice . . . . . . . . . . . . . . . . . . . 26
3.2.1 Risk aversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2.2 Wealth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Canonical portfolio problem for N > 1 . . . . . . . . . . . . . . . . . . . . 31
3.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

4 Portfolio choice for Mean-Variance investors 35


4.1 Mean-Variance preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.1.1 Quadratic utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.1.2 Normal returns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.1.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.2 Review: Mean-Variance frontier with 2 stocks . . . . . . . . . . . . . . . . 39


4.3 Setup for general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41


4.3.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.3.2 Brief notions of matrix calculus . . . . . . . . . . . . . . . . . . . . 41
4.4 Frontier with N risky assets . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.4.1 Efficient portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.4.2 Frontier equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.4.3 Global minimum variance portfolio . . . . . . . . . . . . . . . . . . 45
4.5 Frontier with N risky assets and 1 risk-free asset . . . . . . . . . . . . . . 45
4.5.1 Efficient portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.5.2 Frontier equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.5.3 Tangency portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.6 Optimal portfolio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.7 Additional properties of frontier portfolios . . . . . . . . . . . . . . . . . . 50
4.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

5 Capital Asset Pricing Model 54


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.2 Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.3 Important results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.3.1 Capital Market Line . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.3.2 Security Market Line . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.4 Other remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

6 Arbitrage Pricing Theory and Factor Models 61


6.1 Factor Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.2 Example of simple factor structure: Market Model . . . . . . . . . . . . . 63
6.2.1 Return generating process . . . . . . . . . . . . . . . . . . . . . . . 63
6.2.2 Application: the Covariance matrix is simplified . . . . . . . . . . 63
6.2.3 Implication: Diversification eliminates Specific risk . . . . . . . . . 64
6.2.4 Another interpretation of the CAPM β . . . . . . . . . . . . . . . 65
6.3 Pricing equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
6.3.1 Exact factor pricing with one factor . . . . . . . . . . . . . . . . . 67
6.3.2 Exact factor pricing with more than one factor . . . . . . . . . . . 68
6.3.3 Approximate factor pricing . . . . . . . . . . . . . . . . . . . . . . 70
6.4 How to identify the factors . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.4.2 Fama and French model . . . . . . . . . . . . . . . . . . . . . . . . 71
6.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.5.1 Fund performance . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.5.2 Market neutral strategy . . . . . . . . . . . . . . . . . . . . . . . . 74
6.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

7 Pricing in Complete Markets 77


7.1 Basic and Complex securities . . . . . . . . . . . . . . . . . . . . . . . . . 77
7.2 Computing AD prices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.3 Complete Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
7.3.1 Price of complex securities . . . . . . . . . . . . . . . . . . . . . . . 79
7.3.2 Quick test for market completeness . . . . . . . . . . . . . . . . . . 80
7.4 Risk-Neutral Pricing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
7.4.1 Price of complex securities . . . . . . . . . . . . . . . . . . . . . . . 81
7.4.2 Fundamental theorems . . . . . . . . . . . . . . . . . . . . . . . . . 83
7.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
7.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

8 Consumption-Based Asset Pricing 86


8.1 The investor’s problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
8.2 Fundamental Asset Pricing Equation . . . . . . . . . . . . . . . . . . . . . 88
8.3 Relation to Arrow-Debreu Securities . . . . . . . . . . . . . . . . . . . . . 89
8.4 Relation to the Risk-Neutral measure . . . . . . . . . . . . . . . . . . . . 90
8.5 Risk Premiums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
8.6 Consumption CAPM (CCAPM) . . . . . . . . . . . . . . . . . . . . . . . 92
8.7 The CAPM reloaded . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
8.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
8.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

9 Conclusion 99

Bibliography 100

A Background Review 102


A.1 Math Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
A.1.1 Logarithm and Exponential . . . . . . . . . . . . . . . . . . . . . . 102
A.1.2 Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
A.1.3 Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
A.1.4 Means and Variances . . . . . . . . . . . . . . . . . . . . . . . . . . 106
A.2 Undergraduate Finance Review . . . . . . . . . . . . . . . . . . . . . . . . 107
A.2.1 Financial Markets and Instruments . . . . . . . . . . . . . . . . . . 107
A.2.2 Time value of money . . . . . . . . . . . . . . . . . . . . . . . . . . 109
A.2.3 Risk and Return . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
A.2.4 Equilibrium and No Arbitrage . . . . . . . . . . . . . . . . . . . . 111

B Solutions to Problems 112


Chapter 1

Introduction

These notes follow Danthine and Donaldson (2005) closely, though we will use other
sources as needed. We will start by analyzing individual choices and portfolio decisions.
Then, we will study the prices that result from the interaction of many individuals in
the market.

To motivate the work to come, consider the following question:

What is the role of financial markets?

Answer: allowing the desynchronization of agents’ income and consumption. Example:
buy a house now and pay for it during the next 20 years. This is achieved by
trading financial securities with financial institutions.

Preference for smooth consumption

Financial economists see the world in two dimensions. It is useful to understand why
agents want to dissociate consumption and income across these two dimensions.

1. Time Dimension. Most people prefer to smooth their consumption through
their life cycle. Usually, consumption is higher than income during early years
of life (buy the house), then people save during active life (y > c), finally people
consume their savings after retirement (y = 0, c > 0).

2. Risk Dimension. The future is uncertain. At any point in the future, one of
many states of nature will be realized.[1] Most people want to smooth consumption
across the different possibilities that may arise. That’s why people buy health
insurance (to be able to consume even if they stop working) or fire insurance for
the new house (avoid low consumption in the “burned to the ground” state of
nature).

[1] A state of nature is a complete description of a possible scenario for the future across all the
dimensions relevant for the problem at hand.

Financial assets serve precisely to move consumption through time and across states
of nature.

Modelling the preference for smoothness

Financial economics builds on the fact that people have a preference for smoothness, as
just mentioned. How to model this preference for smoothness, also called risk aversion?

Consider two assets that offer two different consumption plans:

                 asset 1   asset 2
time/state 1        4         3
time/state 2        4         5

Since investors like smoothness, they must prefer asset 1.[2] Let U(c) be the utility
function, i.e., it tells us how much the investor likes consumption c. The utility function
must thus satisfy

$$U(4) + U(4) > U(3) + U(5) \;\Leftrightarrow\; U(4) > \tfrac{1}{2} U(3) + \tfrac{1}{2} U(5)$$

What shape must U(.) have to satisfy this condition?[3] Plot it:

[Blank axes for the reader's plot: U(c) on the vertical axis, c on the horizontal axis]

[2] Suppose your employer offers you the following salary scheme: under scheme 1, you get
$4,000 per month; under scheme 2, you get $3,000 if it rains or $5,000 if it is sunny. Which
scheme would you take?
[3] Answer: It must be strictly concave.

Chapter 2

Choice theory

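1. Under certain conditions, investors’ preferences can be represented
by a utility function,

$$x \succeq y \;\Leftrightarrow\; E[U(x)] \ge E[U(y)]$$

2. Typical utility functions:

$$\begin{aligned}
U(w) &= \ln(w) && \text{[CRRA]} \\
U(w) &= w^{1-\gamma}/(1-\gamma) && \text{[CRRA]} \\
U(w) &= -\exp(-\alpha w) && \text{[CARA]} \\
U(w) &= aw - bw^2
\end{aligned}$$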

2.1 Motivation

We want to find a method to choose between risky assets. Consider the following simple
example:

Example 2.1.1. There are 3 assets and 2 equally likely possible states of
nature in the future:

            t=0               t=1
                    state θ = 1   state θ = 2
asset 1    -100         100           120
asset 2    -100          91           131
asset 3    -100         100           140


Which asset would you rather have? In this case, the choice is easy. Asset 3
clearly dominates the other assets, since it pays at least as much in all states of
nature, and strictly more in some states. This is an example of state-by-state
dominance.

State-by-state dominance is the strongest possible form of dominance. We can safely
assume that all rational agents will always prefer asset 3.[1]

However, the world is not that simple and we will not usually be able to use this
concept to make choices. (Is it likely we will observe a market like in this example? Why
not?)

Suppose now that asset 3 does not exist. Do you prefer asset 1 or asset 2? The
choice is not obvious... To understand the choices people make in the real world we need
better machinery: utility theory.

2.2 The utility function

To be able to represent agents’ preferences by a formal mathematical object like a
function, we need to make precise assumptions about how people make choices.[2]

2.2.1 Choice under certainty

We start by postulating the existence of a preference relation. For two consumption
bundles a and b (two vectors with the amount of consumption of each good), we either
say that

    $a \succ b$      a is strictly preferred to b
    $a \sim b$       a is indifferent to b
    $a \succeq b$    a is strictly preferred or indifferent to b (a not worse than b)

We make the following economic rationality assumptions:

A1: Every investor possesses a complete preference relation. I.e., he must be able to
state a preference for all a and b.

[1] More precisely, we are assuming agents to be nonsatiated in consumption (always like more
consumption).
[2] People have wasted time thinking about reformulating the canonical portfolio problem just
because they were not aware of the axioms that lead to an expected utility representation.

A2: The preference relation satisfies the property of transitivity:

$$\forall a, b, c: \quad a \succeq b \text{ and } b \succeq c \;\Rightarrow\; a \succeq c$$

A3: The preference relation is continuous.[3]

Under these circumstances, we can now state the following useful theorem:
Theorem 2.2.1. Assumptions A1–3 are sufficient to guarantee the existence of a
continuous function $u : \mathbb{R}^N \to \mathbb{R}$ such that, for any consumption bundles a and b,
$$a \succeq b \;\Leftrightarrow\; u(a) \ge u(b)$$
This real-valued function u is called a utility function.

Note that the notion of consumption bundle used in the theorem is quite general.
Different elements of the bundle may represent the consumption of the same good in
different time periods or in different states of nature.

2.2.2 Choice under uncertainty

Even though the previous theorem is quite general, we want to extend it in a way that
captures uncertainty explicitly and separates utility from probabilities.
Definition (Lottery). The simple lottery (x, y, π) is a gamble that offers payoff x with
probability π and payoff y with probability 1 − π.

This notion of lottery is quite general. The payoffs x and y can represent monetary
or consumption amounts. If there is no uncertainty, we can write
(x, y, 1) = x
The payoffs can themselves be other lotteries, leading to compound lotteries. For exam-
ple, if y = (y1 , y2 , τ ), we will have
(x, y, π) = (x, (y1 , y2 , τ ), π)
We assume that the agent is able to “work out” the probability tree and only cares about
the final outcomes.4

Assume the following axioms:


[3] Technical assumption. See Danthine and Donaldson (2005) for details on this and Huang
and Litzenberger (1988) for further technical details.
[4] A lottery is the simplest example of a random variable. Stock prices are random variables,
so you can see where we are going.

B1: There exists a preference relation $\succeq$, defined on lotteries, which is complete,
transitive, and continuous.

Since the consumption bundles in theorem 2.2.1 were general enough to include
consumption in different states of nature, it can be applied here to ensure that there
exists a utility function U() defined on lotteries. To get an expected utility representation
of preferences, we need the following crucial axiom:

B2: Independence of irrelevant alternatives. Let (x, y, π) and (x, z, π) be any two
lotteries. Then,
$$y \succeq z \;\Leftrightarrow\; (x, y, \pi) \succeq (x, z, \pi)$$

In other words, x is irrelevant; including it does not change the investor’s preferences
about y and z.

This axiom is not trivial and has been strongly contested. One well-known violation is
the Allais Paradox.[5] This and other violations have led to the exploration of alternatives
to the expected utility framework, namely to the growing field of Behavioral Finance.
Despite this, recall that the goal of financial economics is to understand the aggregate
market behavior and not individual behavior. At this point, expected utility is the most
useful framework.

We now get to the punchline:

Theorem 2.2.2 (Expected Utility Theorem). If axioms B1–2 hold, then there exists a
real-valued function U, defined on the space of lotteries, such that the preference relation
can be represented as an expected utility, that is, for any lotteries x and y,

$$x \succeq y \;\Leftrightarrow\; E[U(x)] \ge E[U(y)]$$

The function U(), defined over lotteries, is called a von Neumann-Morgenstern (vNM)
utility function.[6]

[5] Allais Paradox. Given the four lotteries defined below, most people show the following
preferences:
$$L_1 = (\$10000, \$0, 0.10) \prec L_2 = (\$15000, \$0, 0.09)$$
and
$$L_3 = (\$10000, \$0, 1.00) \succ L_4 = (\$15000, \$0, 0.90)$$
However, given that $L_1 = (L_3, \$0, 0.1)$ and $L_2 = (L_4, \$0, 0.1)$, with $0 the irrelevant alternative,
the independence axiom would imply $L_3 \succ L_4 \Rightarrow L_1 \succ L_2$!
[6] This designation is sometimes confusing. Some people define U := E[U()] and call this U
the vNM utility function, while others use the name vNM for the u() defined on sure things.
Nonetheless, it is always used in the context of preferences that have an expected utility
representation (theorem 2.2.2).


Note that x and y can be lotteries with multiple outcomes. Denoting by $x_s$ the
outcome in state s that occurs with probability $\pi_s$,[7] we have
$$E[U(x)] = \begin{cases}
\sum_s U(x_s)\,\pi_s & x \text{ is a discrete r.v.} \\
\int_s U(x_s)\,\pi_s\,ds & x \text{ is a continuous r.v.}
\end{cases}$$


Example 2.2.1. Let U (x) = x. Choose between assets 1 and 2 in example
2.1.1.

Example 2.2.2. Now consider another investor with $U(x) = x^{1-2}/(1-2) = -1/x$.
(It will soon become clear that this investor is very similar to the previous
one, though a little bit more risk averse). Check that this investor prefers the
other asset.
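
The comparison asked for in Examples 2.2.1 and 2.2.2 can be checked numerically. The following Python sketch computes E[U(x)] for assets 1 and 2 of Example 2.1.1 under the two utility functions as printed above; the helper names (`expected_utility`, `preferred_asset`) are hypothetical and not part of the notes.

```python
# Payoffs at t=1 from Example 2.1.1; the two states are equally likely.
payoffs = {"asset 1": [100, 120], "asset 2": [91, 131]}
probs = [0.5, 0.5]


def expected_utility(utility, outcomes, probabilities):
    """Return E[U(x)] = sum_s U(x_s) * pi_s for a discrete lottery."""
    return sum(p * utility(x) for x, p in zip(outcomes, probabilities))


def preferred_asset(utility):
    """Name of the asset with the highest expected utility."""
    return max(payoffs, key=lambda name: expected_utility(utility, payoffs[name], probs))


def linear(x):
    """U(x) = x, as printed in Example 2.2.1."""
    return x


def power_gamma_2(x):
    """U(x) = x^(1-2)/(1-2) = -1/x, from Example 2.2.2."""
    return -1 / x


for label, u in [("U(x) = x", linear), ("U(x) = -1/x", power_gamma_2)]:
    for name, outcomes in payoffs.items():
        print(f"{label:12s} {name}: E[U] = {expected_utility(u, outcomes, probs):.6f}")
    print(f"{label:12s} prefers {preferred_asset(u)}")
```

The two investors rank the assets in opposite order, which is the point of Example 2.2.2.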

2.2.3 Interpretation of utility numbers

The numbers returned by the utility function do not have any meaning per se, as the
following proposition makes clear.
Proposition 2.2.1. If U (x) is a vNM utility function for a given preference relation,
then V (x) = aU (x) + b, a > 0, is also a vNM utility function for the same preference
relation, that is,
E[U (x)] ≥ E[U (y)] ⇒ E[V (x)] ≥ E[V (y)]

Proof.
E[U (x)] ≥ E[U (y)] ⇒ aE[U (x)] + b ≥ aE[U (y)] + b, since a > 0
⇒ E[aU (x) + b] ≥ E[aU (y) + b]
⇒ E[V (x)] ≥ E[V (y)]


Example 2.2.3. Suppose a different investor has utility V (x) = 1 + 2 x. His
choice between assets 1 and 2 (from example 2.1.1) will be the same as the
choice of the investor with U (x) = x. (Check it!)

Hence, the utility function serves only to rank the choices under consideration. The
precise magnitude of the number does not have any meaning.

[7] More often, especially in probability classes, the state of nature is denoted by ω ∈ Ω, and
the probability measure by P(ω).

2.3 Risk aversion

2.3.1 Concepts

Consider an investor with wealth Y. Consider also the fair gamble, or lottery, L =
(+h, −h, 1/2).
Definition (Risk aversion). An investor displays risk aversion if he wishes to avoid a
fair gamble, i.e., $Y \succ Y + L$.

This implies that the utility function of a risk-averse agent must satisfy

$$E[U(Y)] > E[U(Y + L)] \;\Rightarrow\; U(Y) > \tfrac{1}{2} U(Y + h) + \tfrac{1}{2} U(Y - h)$$

This inequality is satisfied for all wealth levels if the utility function is strictly
concave.[8] Plot it:

[Blank axes for the reader's plot: U(Y) on the vertical axis, Y on the horizontal axis]

For twice differentiable utility functions, the sufficient condition for concavity is
that $U''(Y) < 0$. This means that $U'(Y)$ is decreasing in wealth. This important
economic concept is called decreasing marginal utility. As wealth increases, the utility
from additional consumption decreases. “When I am starving, a sandwich tastes great,
while when I am almost satiated I don’t care about another sandwich.”

[8] This is formally justified by Jensen’s inequality: $E[g(X)] \le g(E[X])$, for concave g. If g is
strictly concave, the inequality is strict. For the utility function in particular,
$E[U(Y + L)] < U(E[Y + L]) = U(E[Y] + E[L]) = U(Y + 0) = U(Y)$.

2.3.2 Measures of risk aversion

We would like to compare utility functions and say which one is more risk averse. Toward
this end, we define the following measures of risk aversion:

Absolute Risk Aversion: $ARA(Y) \equiv -\dfrac{U''(Y)}{U'(Y)}$

Relative Risk Aversion: $RRA(Y) \equiv -Y\,\dfrac{U''(Y)}{U'(Y)}$

Interpretation of ARA. Let π(Y, h) be the probability of the favorable outcome
at which the investor with wealth Y is indifferent between accepting or rejecting the
lottery L = (+h, −h, π()). Note that h is an amount of money. It can be shown that
$$\pi(Y, h) \cong \frac{1}{2} + \frac{1}{4}\, h \cdot ARA(Y) \tag{2.1}$$
The favorable odds requested increase with the amount at stake h. More importantly,
the higher the ARA, the more favorable odds the investor demands to accept the lottery.

Example 2.3.1. A commonly used utility function is $U(Y) = -\exp(-\gamma Y)$,
which is known for having constant ARA, i.e., ARA = γ.[9] For this investor,
$$\pi(Y, h) \cong \frac{1}{2} + \frac{1}{4}\, h \gamma$$
The higher the degree of ARA (parameter γ), the higher the favorable odds
requested (π). However, π does not depend on the level of wealth Y. Is this
particular utility function $U(Y) = -\exp(-\gamma Y)$ a good description of human
behavior?
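
For the exponential utility of Example 2.3.1, the indifference probability can also be computed exactly and compared with approximation (2.1). The Python sketch below does this by solving U(Y) = πU(Y + h) + (1 − π)U(Y − h) for π. The value γ = 0.01 and the wealth and stake levels are hypothetical, chosen only for illustration, and the function names are not from the notes.

```python
import math


def indifference_probability(utility, wealth, stake):
    """Solve U(Y) = pi*U(Y+h) + (1-pi)*U(Y-h) for pi."""
    up = utility(wealth + stake)
    down = utility(wealth - stake)
    return (utility(wealth) - down) / (up - down)


def approx_probability(stake, ara):
    """Approximation (2.1): pi ~ 1/2 + (1/4) * h * ARA."""
    return 0.5 + 0.25 * stake * ara


GAMMA = 0.01  # hypothetical coefficient of absolute risk aversion


def cara(y):
    """Exponential utility U(Y) = -exp(-gamma * Y)."""
    return -math.exp(-GAMMA * y)


for wealth in (100.0, 1000.0):
    for stake in (1.0, 10.0, 50.0):
        exact = indifference_probability(cara, wealth, stake)
        approx = approx_probability(stake, GAMMA)
        print(f"Y={wealth:7.1f}  h={stake:5.1f}  exact={exact:.6f}  approx={approx:.6f}")
```

Under these assumptions the exact probability is the same at both wealth levels, and the approximation deteriorates as the stake h grows, as expected from a Taylor expansion around h = 0.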

We now derive equation (2.1).

Proof. π(Y, h) must be such that

$$\begin{aligned}
\pi:\quad & Y \sim Y + L \\
\Rightarrow\; & E[U(Y)] = E[U(Y + L)] \\
\Rightarrow\; & U(Y) = \pi U(Y + h) + (1 - \pi) U(Y - h)
\end{aligned}$$

Expanding U(Y + h) and U(Y − h) in Taylor series around Y, we get[10]

$$\begin{aligned}
U(Y + h) &= U(Y) + h U'(Y) + \tfrac{1}{2} h^2 U''(Y) + O(h^3) \\
U(Y - h) &= \dots
\end{aligned}$$

Ignoring terms of higher order, replacing both these approximations in the previous
equation, and canceling terms, we get equation (2.1).

[9] This is the only utility function with constant ARA. To see this, write
$-U''(Y)/U'(Y) = \gamma \Rightarrow U''(Y) + \gamma U'(Y) = 0$, which is a homogeneous linear
differential equation of the second order with constant coefficients. The two special
solutions are $U_1 = 1$ and $U_2 = \exp(-\gamma Y)$ and the general solution is thus
$U(Y) = c_1 + c_2 \exp(-\gamma Y)$. With $c_2 < 0$, so that U is increasing, this is a positive
linear transformation of $U(Y) = -\exp(-\gamma Y)$, therefore representing the same preferences.
Thanks to Diogo Bessam for pointing this out.
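
The cancellation in the last step can be written out as follows. This is a sketch only: it assumes the standard second-order expansion for U(Y − h), which the notes leave to the reader, and it drops all terms of higher order.

```latex
\begin{align*}
U(Y-h) &\approx U(Y) - hU'(Y) + \tfrac{1}{2}h^2 U''(Y) \\
U(Y) &\approx \pi\left[U(Y) + hU'(Y) + \tfrac{1}{2}h^2U''(Y)\right]
      + (1-\pi)\left[U(Y) - hU'(Y) + \tfrac{1}{2}h^2U''(Y)\right] \\
     &= U(Y) + (2\pi - 1)\,hU'(Y) + \tfrac{1}{2}h^2U''(Y) \\
\Rightarrow\quad (2\pi-1)\,hU'(Y) &\approx -\tfrac{1}{2}h^2U''(Y) \\
\Rightarrow\quad \pi &\approx \frac{1}{2} + \frac{1}{4}\,h\left(-\frac{U''(Y)}{U'(Y)}\right)
      = \frac{1}{2} + \frac{1}{4}\,h\cdot ARA(Y)
\end{align*}
```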

Interpretation of RRA. Now we define a gamble in terms of a proportion of
the investor’s initial wealth. Specifically, we set h = θY, and the lottery becomes
L = (θY, −θY, π()). π(Y, θ) is the probability of the favorable outcome at which the
investor is indifferent between accepting or rejecting the lottery. It can be shown that
$$\pi(Y, \theta) \cong \frac{1}{2} + \frac{1}{4}\, \theta \cdot RRA(Y) \tag{2.2}$$
The favorable odds requested increase with the proportion of wealth at stake θ. More
importantly, the higher the RRA, the more favorable odds the investor demands to accept
the lottery.

Example 2.3.2. An important utility function is $U(Y) = Y^{1-\gamma}/(1-\gamma)$, which
is known for having constant RRA, i.e., RRA = γ.[11] For this investor,
$$\pi(Y, \theta) \cong \frac{1}{2} + \frac{1}{4}\, \theta \gamma \tag{2.3}$$
The higher the degree of RRA (parameter γ), the higher the favorable odds
requested (π). Again, π does not depend on the level of wealth Y. It depends
only on the proportion of wealth θ at stake.
We do like this! Historically, stock returns look stationary (same mean through
time), while aggregate wealth has been increasing. Thus, investors must require
an expected return that cannot depend on the amount of wealth at risk. (Note
that the expected return is determined by π.) The utility function with constant
RRA (RRA = γ only, Y does not show up) is consistent with these empirical
facts.[12]

[10] Taylor series: $f(x) = f(a) + f'(a)(x - a) + \tfrac{1}{2} f''(a)(x - a)^2 + \dots + \tfrac{1}{n!} f^{(n)}(a)(x - a)^n + \dots$
[11] This is the only utility function with constant RRA. To see this, write
$-Y\,U''(Y)/U'(Y) = \gamma \Rightarrow U''(Y) + \frac{\gamma}{Y} U'(Y) = 0$, which is a homogeneous
linear differential equation of the second order. One specific solution is
$U_1 = Y^{1-\gamma}/(1-\gamma)$ (check that it satisfies the equation). The second linearly
independent solution is given by
$U_2 = U_1 \int \frac{\exp\{-\int \gamma/Y \, dY\}}{(U_1)^2} \, dY = -1$. The general
solution is thus $U(Y) = c_1 Y^{1-\gamma}/(1-\gamma) - c_2$, a linear transformation of
$U(Y) = Y^{1-\gamma}/(1-\gamma)$, therefore representing the same preferences. Again, thanks to
Diogo Bessam for pointing this out.

The proof of equation (2.2) is left as an exercise.

2.3.3 Risk neutrality

Risk-neutral investors don’t care about risk. Their utility function is linear:

U (Y ) = a + bY, b>0

Check that ARA = 0 and RRA = 0, which implies π(Y, h) = π(Y, θ) = 1/2. Hence,
risk neutral investors are indifferent to fair games (i.e., symmetrical games with 50–50
chances).

They will always choose the asset with highest expected payoff, regardless of its risk.

2.4 Important utility functions

The most common utility functions are the following:

Name          U(Y) =                        Restrictions     ARA     RRA
                                            on parameters

Log           $\ln(Y)$                      na

Power         $Y^{1-\gamma}/(1-\gamma)$

Exponential   $-\exp(-\alpha Y)$

Quadratic     $aY - bY^2$


Complete the table. In particular, define the restrictions on parameters s.t. the
functions are proper utility functions, i.e., $U' > 0$ and $U'' < 0$. Note that the quadratic
utility function also needs a restriction on the domain (Y < . . . ). Also, compute the
ARA and RRA functions, and classify the corresponding utility as increasing, decreasing,
or constant ARA/RRA.

As mentioned above, the power (and log) utility are considered “good” utility func-
tions. Typical values for the degree of risk-aversion are γ = 1, 2, 3, 5. The other two
utility functions are not so good descriptors of human behavior (as you can see by the
ARA and RRA functions you got). As we will see in later sections, the exponential utility
is used because it simplifies the calculations when asset returns are normally distributed,
and the quadratic utility simplifies them even further for any distribution.
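
One way to check the ARA and RRA functions you derive for the table is to evaluate them numerically. The Python sketch below estimates U' and U'' by central finite differences and prints ARA and RRA at a few wealth levels. The parameter values (γ = 2, α = 0.5, a = 10, b = 0.1), the wealth grid, and the function names are hypothetical choices for illustration, not values from the notes.

```python
import math


def derivatives(utility, y, step=1e-3):
    """Central finite-difference estimates of U'(y) and U''(y)."""
    first = (utility(y + step) - utility(y - step)) / (2 * step)
    second = (utility(y + step) - 2 * utility(y) + utility(y - step)) / step**2
    return first, second


def ara(utility, y):
    """Absolute risk aversion: -U''(y) / U'(y)."""
    first, second = derivatives(utility, y)
    return -second / first


def rra(utility, y):
    """Relative risk aversion: -y * U''(y) / U'(y)."""
    return y * ara(utility, y)


# Hypothetical parameter values, chosen so that U' > 0 and U'' < 0 on the grid below.
gamma, alpha, a, b = 2.0, 0.5, 10.0, 0.1

utilities = {
    "log": math.log,
    "power": lambda y: y ** (1 - gamma) / (1 - gamma),
    "exponential": lambda y: -math.exp(-alpha * y),
    "quadratic": lambda y: a * y - b * y ** 2,
}

for name, u in utilities.items():
    for y in (1.0, 2.0, 4.0):
        print(f"{name:12s} Y={y:.1f}  ARA={ara(u, y):.4f}  RRA={rra(u, y):.4f}")
```

Reading down each block of output shows whether ARA and RRA rise, fall, or stay flat as wealth increases, which is the classification the exercise asks for.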

2.5 Certainty Equivalent

Consider an investor with initial wealth Y . Consider a gamble Z = (Z1 , Z2 , π). How
much is this risky asset worth?

Definition (Certainty Equivalent). CE(Y, Z), the certainty equivalent of the risky in-
vestment Z, is the certain amount of money which provides the same utility as the
gamble, i.e.,
E[U (Y + Z)] = U (Y + CE)

The investor is indifferent between receiving CE(Y, Z) for sure and playing the gam-
ble Z. In other words, if the investor owns the asset, he is willing to sell it at a price
equal to the certainty equivalent. The CE is useful to compare different assets in more
intuitive terms (money, instead of utility numbers).

Note that a risk-averse agent will always value an asset at something less than its
expected payoff: CE < E[Z].[13]

[12] Thinking about the cross section of assets, note that (2.3) allows different assets to have
different expected returns: π increases with θ, and thus the expected return also increases with
θ. Does this make sense? Think about risk!
[13] Let Z be any random variable. Since U is strictly concave ($U'' < 0$), from Jensen’s inequality,

$$E[U(Y + Z)] < U(E[Y + Z]) = U(Y + E[Z])$$

Hence, from the definition of CE,

$$U(Y + CE) < U(Y + E[Z])$$

Since U is increasing ($U' > 0$), we must have

$$Y + CE < Y + E[Z] \;\Rightarrow\; CE < E[Z]$$


Example 2.5.1. The investor has log utility and initial wealth Y = 1000. The
risky investment is Z = (200, 0, 0.5). Compute the CE:

$$\begin{aligned}
& E[U(Y + Z)] = U(Y + CE) \\
\Rightarrow\; & \dots \\
\Rightarrow\; & CE = 95.45
\end{aligned}$$

Why is the investor willing to accept less than the expected value of the gamble,
i.e., why is CE = 95.45 < E[Z] = 100? Risk aversion.
Plot the utility function, marking the points Y + Z1, Y + Z2, Y + EZ, Y + CE.

[Blank axes for the reader's plot: U(Y) on the vertical axis, Y on the horizontal axis]

Consider now a fair gamble:

Example 2.5.2. The investor has log utility and initial wealth Y = 100. The
risky prospect is Z = (20, −20, 0.5). We get:

$$\begin{aligned}
& E[U(Y + Z)] = U(Y + CE) \\
\Rightarrow\; & \tfrac{1}{2} \ln(120) + \tfrac{1}{2} \ln(80) = \ln(100 + CE) \\
\Rightarrow\; & CE = -2.02
\end{aligned}$$

What does it mean for the CE to be negative? Plot the utility function, marking
the points Y + Z1, Y + Z2, Y + EZ, Y + CE.
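
The two certainty equivalents above can be reproduced numerically. The following Python sketch solves E[U(Y + Z)] = U(Y + CE) by applying the inverse of U to the expected utility; the function name `certainty_equivalent` and its signature are hypothetical, and the approach assumes the inverse utility is available in closed form (exp for log utility).

```python
import math


def certainty_equivalent(utility, inverse_utility, wealth, outcomes, probabilities):
    """Solve E[U(Y + Z)] = U(Y + CE) for CE, given the inverse of U."""
    expected_u = sum(p * utility(wealth + z) for z, p in zip(outcomes, probabilities))
    return inverse_utility(expected_u) - wealth


# Example 2.5.1: log utility, Y = 1000, Z = (200, 0, 0.5)
ce_251 = certainty_equivalent(math.log, math.exp, 1000, [200, 0], [0.5, 0.5])

# Example 2.5.2: log utility, Y = 100, Z = (20, -20, 0.5)
ce_252 = certainty_equivalent(math.log, math.exp, 100, [20, -20], [0.5, 0.5])

print(f"Example 2.5.1: CE = {ce_251:.2f}")
print(f"Example 2.5.2: CE = {ce_252:.2f}")
```

For log utility the calculation reduces to a geometric mean of the final wealth levels minus initial wealth, which is why a fair gamble yields a negative CE.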

2.6 Stochastic dominance


We now reverse gears and look for circumstances where the ranking among random
variables is preference free, that is, where we do not need to specify a utility function. We
will develop two concepts of dominance that are weaker, thus more broadly applicable,
than state-by-state dominance.

2.6.1 First Order Stochastic Dominance

Consider two assets, X1 , X2 , with the following payoffs:

                          Payoff
State (s)    Prob(s)     X1      X2
    1          0.4       10      10
    2          0.4      100     100
    3          0.2      100    2000

Clearly, all rational investors prefer X2 : it at least matches X1 and has a positive
probability of exceeding it.

To formalize this intuition, let $F_i(x)$ denote the cumulative distribution function of
$X_i$, that is, $F_i(x) = \text{Prob}[X_i \le x]$.
Definition (1SD). $F_a(x) \text{ 1SD } F_b(x) \;\Leftrightarrow\; F_a(x) \le F_b(x), \ \forall x$

Plot the two distribution functions in the example and check that $F_2(x) \le F_1(x), \ \forall x$.
Note that if the distribution function of X2 is always below that of X1, then the
probability of X2 exceeding a given payoff is always at least as large, that is,
$$F_2(x) \le F_1(x) \;\Rightarrow\; 1 - F_2(x) \ge 1 - F_1(x) \;\Rightarrow\; \text{Prob}[X_2 > x] \ge \text{Prob}[X_1 > x], \ \forall x$$

The usefulness of this concept comes from the following theorem:


Theorem 2.6.1.
$$F_a(x) \text{ 1SD } F_b(x) \;\Leftrightarrow\; E_a[U(x)] \ge E_b[U(x)] \text{ for all nondecreasing } U$$
where $E_i$ is the expectation under the distribution of i,
$E_i[U(x)] = \int U(x)\, dF_i(x) = \int U(x) f_i(x)\, dx$.

Hence, all nonsatiable investors prefer asset X2 .

Note that 1SD is not the same as state-by-state dominance. See exercise 4.8 in
Danthine and Donaldson (2005).
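
The 1SD check for the example above can be sketched in Python. Each asset is represented as a list of (payoff, probability) pairs, a representation chosen here for illustration; the function names are hypothetical. Because the distribution functions of discrete payoffs are step functions, comparing them at every payoff where either one jumps is enough.

```python
def cdf(dist, x):
    """F(x) = Prob[X <= x] for a discrete distribution given as (payoff, prob) pairs."""
    return sum(p for payoff, p in dist if payoff <= x)


def dominates_1sd(dist_a, dist_b, tol=1e-12):
    """True if F_a(x) <= F_b(x) at every payoff where either CDF jumps."""
    grid = sorted({payoff for payoff, _ in dist_a + dist_b})
    return all(cdf(dist_a, x) <= cdf(dist_b, x) + tol for x in grid)


# Payoffs and state probabilities from the table in section 2.6.1
x1 = [(10, 0.4), (100, 0.4), (100, 0.2)]
x2 = [(10, 0.4), (100, 0.4), (2000, 0.2)]

print("X2 1SD X1:", dominates_1sd(x2, x1))
print("X1 1SD X2:", dominates_1sd(x1, x2))
```

The small tolerance `tol` only guards against floating-point rounding when probabilities are summed.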
2.6. Stochastic dominance 19

2.6.2 Second Order Stochastic Dominance

1SD is still a very strong condition, thus not applicable to most situations. If we add
the assumption of risk aversion, we get the much more useful concept of Second Order
Stochastic Dominance (2SD).

Consider the following investments:

        X3                    X4
  Payoff    Prob        Payoff    Prob
     4      0.25           1      1/3
     5      0.50           6      1/3
     9      0.25           8      1/3

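Plot the two distribution functions. Even though neither investment first-order
stochastically dominates the other, intuitively X3 “looks” better. To make this precise:
Definition (2SD).
$$F_a(x) \text{ 2SD } F_b(x) \;\Leftrightarrow\; \int_{-\infty}^{x} F_a(s)\, ds \le \int_{-\infty}^{x} F_b(s)\, ds, \ \forall x \;\Leftrightarrow\; \int_{-\infty}^{x} [F_b(s) - F_a(s)]\, ds \ge 0, \ \forall x$$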

That is, at any point the accumulated difference between Fb and Fa must be nonnegative.
Note that 1SD implies 2SD, but the converse is not true.

In the plot of the previous example, this basically means that the area of the difference
where F3 > F4 is “small”. To make this a bit more precise, we can compute the integrals
at all relevant jump points.

 x    F3(x)   $\int_0^x F_3(s)ds$   F4(x)   $\int_0^x F_4(s)ds$   $\int_0^x F_4(s)ds - \int_0^x F_3(s)ds$
 1    0.00          0               1/3           0                   0 ≥ 0
 4    0.25          0               1/3           1                   1 ≥ 0
 5    0.75          0.25            1/3           4/3                 13/12 ≥ 0
 6    0.75          1.00            2/3           5/3                 2/3 ≥ 0
 8    0.75          2.50            3/3           3                   0.50 ≥ 0
 9    1.00          3.25            3/3           4                   0.75 ≥ 0

The last column shows that $\int_{-\infty}^{x} [F_4(s) - F_3(s)]\, ds \ge 0, \ \forall x$. (After x = 9, the difference
between the two integrals will always be 0.75 ≥ 0.)

All risk averse investors will prefer X3 , as the following theorem shows.
Theorem 2.6.2.
$$F_a(x) \text{ 2SD } F_b(x) \;\Leftrightarrow\; E_a[U(x)] \ge E_b[U(x)] \text{ for all nondecreasing and concave } U$$

Note that risk aversion is enough, i.e., we do not have to assume a specific utility
function.
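
The integrals in the table for X3 and X4 can be reproduced with a short Python sketch. It relies on the identity that, for a discrete payoff, the integral of F(s) up to x equals the sum of p · max(x − payoff, 0); exact fractions are used so that values such as 13/12 come out exactly. The list-of-pairs representation and the function names are hypothetical choices for illustration.

```python
from fractions import Fraction


def integrated_cdf(dist, x):
    """Integral of F(s) ds up to x, computed as sum of p * max(x - payoff, 0)."""
    return sum(p * max(x - payoff, 0) for payoff, p in dist)


def dominates_2sd(dist_a, dist_b):
    """True if the integrated CDF of a never exceeds that of b at any jump point.

    Both integrated CDFs are piecewise linear with kinks only at the payoffs,
    so checking the payoffs themselves is sufficient.
    """
    grid = sorted({payoff for payoff, _ in dist_a + dist_b})
    return all(integrated_cdf(dist_a, x) <= integrated_cdf(dist_b, x) for x in grid)


# (payoff, probability) pairs from the table in section 2.6.2
x3 = [(4, Fraction(1, 4)), (5, Fraction(1, 2)), (9, Fraction(1, 4))]
x4 = [(1, Fraction(1, 3)), (6, Fraction(1, 3)), (8, Fraction(1, 3))]

for x in (1, 4, 5, 6, 8, 9):
    diff = integrated_cdf(x4, x) - integrated_cdf(x3, x)
    print(f"x={x}: int F3 = {integrated_cdf(x3, x)}, int F4 = {integrated_cdf(x4, x)}, diff = {diff}")

print("X3 2SD X4:", dominates_2sd(x3, x4))
print("X4 2SD X3:", dominates_2sd(x4, x3))
```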

Mean preserving spread. The concept of 2SD is even more useful to understand
the tradeoff between risk and return.

Definition. Suppose there exists a random variable Z s.t. Xb = Xa +Z, with E[Z|Xa ] =
0 for all values of Xa . Then, we say that Xb is a mean preserving spread of Xa . (Or Fb
or fb is a m.p.s. of Fa or fa ).

Note that Xb has the same mean as Xa , but it is more noisy, i.e., risky. Intuitively, all
risk averse investors should prefer the payoff with less risk, Xa . The following theorem
justifies this intuition:

Theorem 2.6.3. Let Fa(x) and Fb(x) be two distribution functions with identical means.
Then,
$$F_a(x) \text{ 2SD } F_b(x)$$
$$\Updownarrow$$
$$F_b \text{ is a mean preserving spread of } F_a$$
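
A small numerical illustration of a mean preserving spread, as a sketch under assumed numbers: the payoffs of Xa and the noise Z below are hypothetical, and Z is taken to be independent of Xa with mean zero, which implies E[Z|Xa] = 0.

```python
import math
from itertools import product

Xa = {4: 0.5, 6: 0.5}      # hypothetical payoff distribution
Z = {-1: 0.5, 1: 0.5}      # zero-mean noise, independent of Xa


def spread(xa, z):
    """Distribution of Xb = Xa + Z when Z is independent of Xa."""
    xb = {}
    for (x, px), (e, pe) in product(xa.items(), z.items()):
        xb[x + e] = xb.get(x + e, 0.0) + px * pe
    return xb


def expect(dist, f=lambda v: v):
    """E[f(X)] for a discrete distribution {value: probability}."""
    return sum(p * f(v) for v, p in dist.items())


Xb = spread(Xa, Z)
print("means:", expect(Xa), expect(Xb))
print("E[ln X]:", expect(Xa, math.log), expect(Xb, math.log))
print("E[sqrt X]:", expect(Xa, math.sqrt), expect(Xb, math.sqrt))
```

Both payoffs have the same mean, but the two concave utility functions tried here (log and square root) rank Xa above Xb, as the theorem predicts for every risk-averse investor.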

Mean-Variance criterion. This popular investment criterion states that: (i) for
two investments with the same mean, investors prefer the one with smaller variance; (ii)
for two investments with the same variance, investors prefer the one with higher mean.
We will discuss later the exact conditions for this criterion to be true. For now, note
that theorem 2.6.3 helps to explain part (i).

2.7 Exercises
Ex. 1 — (This is problem 3.1. in Danthine and Donaldson (2005))
Utility function. Under certainty, any increasing monotone transformation of a utility
function is also a utility function representing the same preferences. Under uncertainty,
we must restrict this statement to linear transformations if we are to keep the same
preference representation.
Check it with this example. Assume an initial utility function attributes the following
values to 3 perspectives:
B u(B) = 100
M u(M) = 10
P u(P) = 50

a. Check that with this initial utility function, the lottery L = (B, M, 0.50) ≻ P.
b. The proposed transformations are f(x) = a + bx, a ≥ 0, b > 0 and g(x) = ln(x).
Check that under f, L ≻ P, but that under g, P ≻ L.

Ex. 2 — (This is problem 3.3. in Danthine and Donaldson (2005))


Inter-temporal consumption. Consider a two-date economy and an agent with utility
function over consumption:
$$U(c) = \frac{c^{1-\gamma}}{1-\gamma}, \qquad \gamma > 0$$
at each period. Define the inter-temporal utility function as V(c1, c2) = U(c1) + U(c2).
Show that the agent will always prefer a smooth consumption stream to a more variable
one with the same mean, that is,
$$U(\bar{c}) + U(\bar{c}) > U(c_1) + U(c_2), \qquad \text{if } \bar{c} = \frac{c_1 + c_2}{2}$$

1. Start by showing that the utility function U is concave.


2. Then, show the required relation geometrically.
3. Finally, do the proof formally.
Hint: use the following definition of a concave function. A function $f: \mathbb{R}^N \to \mathbb{R}$
is concave if

$$f(ax + (1-a)y) \ge a f(x) + (1-a) f(y), \quad \forall x, y \in \mathbb{R}^N \text{ and } \forall a \in [0, 1]$$

Ex. 3 — An agent with wealth = 100 is faced with the following game: with probability
1/2 his wealth will increase to 200; with probability 1/2 it will decrease to 0. Complete
the following sentence:
If the agent is a risk-________ he is willing to pay some money to play
this game, whereas if he is risk-________ he is willing to pay some money
to avoid the game.

Ex. 4 — The ARA and RRA measures have the first derivative of the utility function
in the denominator. Why? Hint: read Danthine and Donaldson (2005)

Ex. 5 — Prove equation (2.2).

Ex. 6 — Complete the table in section 2.4 and plot the utility functions.

Ex. 7 — The CRRA utility function is usually presented as


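$$U(W) = \begin{cases} \ln(W), & \gamma = 1 \\ \dfrac{W^{1-\gamma}}{1-\gamma}, & \gamma > 1 \end{cases}$$

because ln(W) is “almost” the limiting case as γ → 1. More precisely, the true limit is
$\lim_{\gamma \to 1} \frac{W^{1-\gamma} - 1}{1-\gamma} = \ln(W)$.
1. Explain why $U_1(W) = \frac{W^{1-\gamma}}{1-\gamma}$ and $U_2(W) = \frac{W^{1-\gamma} - 1}{1-\gamma}$ represent exactly the same
preferences.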

2. Prove that
$$\lim_{\gamma \to 1} \frac{W^{1-\gamma} - 1}{1-\gamma} = \ln(W)$$
Hint: L’Hôpital’s rule.

Ex. 8 — Consider the utility function U(Y) = 5 + 10Y². What does it imply in terms
of risk-taking behavior? Would it be economically reasonable to model an investor’s
behavior with this utility function?

Ex. 9 — An investor has an initial wealth of Y = 10. To play a game where he could
win or lose 5% of his wealth, he demands π = 0.6, where π is the probability of the
favorable outcome (winning 5%). Nonetheless, if his wealth were Y = 1000, he would
still demand the same π = 0.6 to play the game.
1. What can you say about the risk characteristics of this investor? (One sentence
answer).
2. Give an example of a utility function consistent with this behavior.

Ex. 10 — The risk-aversion characteristics of an investor can be described by two
functions: ARA and RRA.
1. Give a very brief definition in words of these two measures.
2. What does it mean to say that an investor has increasing ARA? Does it make
intuitive sense? Give an example of a utility function with this characteristic.
3. Give an example of a utility function with constant RRA (compute the actual
coefficient of RRA).

Ex. 11 — An investor with initial wealth Y0 = 100 is faced with the following lottery:
win 20 with 0.3 probability; lose 20 with 0.7 probability. The utility function is U(W) =
ln(W ). What is the Certainty Equivalent of this lottery? What does this number mean?

Ex. 12 — Consider the following risky investment: Z = (100, 0, 0.5). The investor has
log utility, U = ln(Y ).
1. If the initial wealth is Y = 100, what is the certainty equivalent of the gamble?
2. If the initial wealth is Y = 1, what is the certainty equivalent of the gamble?
3. Explain in simple terms the change in CE.

Ex. 13 — Exercise 4.5 in Danthine and Donaldson (2005, p.354)

Ex. 14 — Exercise 4.7 in Danthine and Donaldson (2005, p.355). They meant to refer
to table 4.2.

Ex. 15 — Exercise 4.8 in Danthine and Donaldson (2005, p.355). Be careful in distin-
guishing between states of nature and distributions defined over payoffs.

Ex. 16 — Consider two assets with returns ra ∼ N (0.1, 0.2) and rb ∼ N (0.1, 0.3). An
investor has the utility function U (W ) = −exp(−γW ). Which asset does the investor
prefer?
Chapter 3

Portfolio choice

1. The investor’s typical problem is

   $$\max_a \; E[U(Y)]$$

2. It can be solved explicitly if we assume either:

   1. Quadratic utility, or
   2. CARA utility and normal returns.

3.1 Canonical portfolio problem

This section analyzes the problem of an investor that must decide how much to invest
in a risky asset. Consider the following notation:¹

a ≡ amount (in $) to invest in a risky portfolio
r̃ ≡ uncertain rate of return on the risky portfolio
rf ≡ risk-free (certain) rate of return
Y0 ≡ initial wealth
Ỹ1 ≡ terminal wealth
   = a(1 + r̃) + (Y0 − a)(1 + rf) = Y0(1 + rf) + a(r̃ − rf)

The investor’s problem is

$$\max_a \; E[U(\tilde{Y}_1)] \qquad (3.1)$$

¹ Tildes denote random variables. We’ll drop them when it is clear which variables are random.

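The (necessary) first order condition for a maximum is

$$\text{foc:} \quad \frac{d}{da} E[U(\tilde{Y}_1)] = 0 \;\Leftrightarrow\; E\left[\frac{dU(.)}{d\tilde{Y}_1}\,(\tilde{r} - r_f)\right] = 0$$

and the (sufficient) second order condition is

$$\text{soc:} \quad \frac{d^2}{da^2} E[U(\tilde{Y}_1)] < 0 \;\Leftrightarrow\; E\left[\frac{d^2U(.)}{d\tilde{Y}_1^2}\,(\tilde{r} - r_f)^2\right] < 0$$

which is true if the investor is risk averse (U″ < 0).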

Example 3.1.1. Assume U = 11Y − 5Y², with Y0 = $1. Let rf = 0,
E[r] = 0.1, Var[r] = 0.2². Recall Var[x] = E[x²] − E[x]². Use the foc to get
the optimal amount invested in the risky asset:
foc:

... a = $0.2
Use the soc to check that this is indeed a maximum:
soc:
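
One way to check the numbers in this example is sketched below. It is not the notes' own derivation: it assumes the foc is expanded as E[(lin − 2·quad·Y1)(r − rf)] = 0 for a generic quadratic utility U(Y) = lin·Y − quad·Y², where `lin` and `quad` are illustrative parameter names, and then solves the resulting linear equation for a.

```python
def optimal_amount_quadratic(lin, quad, y0, rf, mean_r, var_r):
    """Solve E[U'(Y1)(r - rf)] = 0 for U(Y) = lin*Y - quad*Y**2,
    with Y1 = y0*(1 + rf) + a*(r - rf)."""
    base = y0 * (1 + rf)                    # riskless part of terminal wealth
    premium = mean_r - rf                   # E[r - rf]
    second_moment = var_r + premium ** 2    # E[(r - rf)^2] = Var + mean^2
    return (lin - 2 * quad * base) * premium / (2 * quad * second_moment)


a_hat = optimal_amount_quadratic(lin=11, quad=5, y0=1.0, rf=0.0,
                                 mean_r=0.1, var_r=0.2 ** 2)

# soc: E[U''(Y1)(r - rf)^2] with U'' = -2*quad, a constant.
soc = -2 * 5 * (0.2 ** 2 + 0.1 ** 2)

print(round(a_hat, 6), soc < 0)
```

With the example's inputs the function returns a = 0.2, in line with the value stated above, and the soc expression is negative.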

The analysis of the optimality conditions produces the following important theorem:

Theorem 3.1.1. Let â denote the solution to problem (3.1) and assume the investor is
nonsatiable (U′ > 0) and risk-averse (U″ < 0). Then

â > 0 ⇔ E[r] > rf
â = 0 ⇔ E[r] = rf
â < 0 ⇔ E[r] < rf

The theorem says that a risk-averse investor will only invest in the risky asset (stocks)
if its expected return is higher than the risk-free rate. Conversely, if this is the case
( E[r] > rf ), then the investor will always participate in the stock market (even if with
just a tiny amount of money).

Example 3.1.2. Suppose U(Y) = ln(Y). For simplicity, assume the risky
return is the simple lottery (r2, r1, π). Further assume r2 > rf > r1 (why?).
The problem is thus

$$\max_a \; E[\ln(\tilde{Y}_1)]$$

The foc is

$$E\left[\frac{r - r_f}{Y_0(1 + r_f) + a(r - r_f)}\right] = 0$$

or, given the two possible states,

$$\pi\,\frac{r_2 - r_f}{Y_0(1 + r_f) + a(r_2 - r_f)} + (1 - \pi)\,\frac{r_1 - r_f}{Y_0(1 + r_f) + a(r_1 - r_f)} = 0$$

which after some algebra is

$$\frac{a}{Y_0} = \frac{(1 + r_f)(E[r] - r_f)}{-(r_1 - r_f)(r_2 - r_f)}$$

Check that the sign of the rhs depends on the sign of E[r] − rf (the denominator
is positive because r2 > rf > r1). In particular,
if E[r] − rf > 0, we get a/Y0 > 0, as in theorem 3.1.1. Note also the following
intuitive results:
1) The fraction of wealth invested in the risky asset (a/Y0) increases with the
return premium (E[r] − rf);
2) The fraction of wealth invested in the risky asset (a/Y0) decreases with the
return “dispersion” around rf, (−(r1 − rf)(r2 − rf)).
Lastly, note that the fraction of wealth invested in the risky asset (a/Y0) does
not depend on the level of wealth (there is no Y0 on the rhs). This result is
specific to the CRRA utility function as described in a theorem below.²
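
The closed-form expression of this example can be checked against the foc numerically. The sketch below is illustrative only: the function names and the lottery parameters (r1 = −0.2, r2 = 0.3, π = 0.5, rf = 0.02) are hypothetical and are not taken from the notes.

```python
def log_utility_fraction(r1, r2, pi, rf):
    """a/Y0 for U = ln(Y) and the lottery (r2, r1, pi): r2 with prob. pi."""
    expected_r = pi * r2 + (1 - pi) * r1
    return (1 + rf) * (expected_r - rf) / (-(r1 - rf) * (r2 - rf))


def foc_log(a, y0, r1, r2, pi, rf):
    """Left-hand side of the foc, E[(r - rf) / Y1]."""
    base = y0 * (1 + rf)
    return (pi * (r2 - rf) / (base + a * (r2 - rf))
            + (1 - pi) * (r1 - rf) / (base + a * (r1 - rf)))


params = dict(r1=-0.2, r2=0.3, pi=0.5, rf=0.02)   # hypothetical lottery
w = log_utility_fraction(**params)
for y0 in (1.0, 250.0):
    # The same fraction w satisfies the foc at every wealth level.
    print(y0, w, foc_log(w * y0, y0, **params))
```

The foc residual is zero (up to rounding) at both wealth levels, which is the wealth-independence noted at the end of the example.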

3.2 Analysis of the optimal portfolio choice

3.2.1 Risk aversion

We now relate the portfolio decision to the risk aversion of the investor.

The following theorem states, quite intuitively, that a more risk averse individual
will invest less in the stock market:
² See the numerical examples in Danthine and Donaldson (2005) for further interpretation.

Theorem 3.2.1. Let â denote the solution to problem (3.1).

$$\forall Y > 0,\ ARA_{inv1}(Y) > ARA_{inv2}(Y) \implies \hat{a}_{inv1} < \hat{a}_{inv2}$$

Furthermore, since $ARA_{inv1}(Y) > ARA_{inv2}(Y) \Leftrightarrow RRA_{inv1}(Y) > RRA_{inv2}(Y)$, we also
have

$$\forall Y > 0,\ RRA_{inv1}(Y) > RRA_{inv2}(Y) \implies \hat{a}_{inv1} < \hat{a}_{inv2}$$

Let’s check this result:

Example 3.2.1. Assume rf = 0.05 and r = (r2 = 0.4, r1 = −0.2, 1/2). For
U(Y) = ln(Y), we can use the results in the last example to get

â/Y0 = 0.6

Now consider the power utility function U(Y) = Y^(1−γ)/(1 − γ), with γ = 3.
Note that it has both higher RRA (3 > 1) and ARA (3/Y > 1/Y ). Check
(end-of-chapter exercise 18) that the optimal portfolio decision for this utility
function is
â/Y0 = 0.198
Hence, this more risk-averse agent invests a smaller percentage of his wealth in
the risky asset. The initial wealth (Y0 ) is the same for both investors, so the
money invested (â) is also smaller, as the theorem stated.
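
Both numbers in this example can be reproduced with a few lines of code. The sketch below relies on a rearrangement of the two-state foc for power utility that is not spelled out in the notes and should be treated as an assumption to verify by hand: writing k = [π(r2 − rf) / ((1 − π)(rf − r1))]^(1/γ), the foc gives a/Y0 = (1 + rf)(k − 1) / ((r2 − rf) − k(r1 − rf)). Setting γ = 1 recovers the log-utility case.

```python
def crra_fraction(gamma, r1, r2, pi, rf):
    """a/Y0 solving the two-state foc for U = Y**(1-gamma)/(1-gamma);
    gamma = 1 corresponds to log utility."""
    up, down = r2 - rf, r1 - rf             # excess returns: up > 0 > down
    k = (pi * up / ((1 - pi) * -down)) ** (1 / gamma)
    return (1 + rf) * (k - 1) / (up - k * down)


log_case = crra_fraction(1, r1=-0.2, r2=0.4, pi=0.5, rf=0.05)
power_case = crra_fraction(3, r1=-0.2, r2=0.4, pi=0.5, rf=0.05)
print(round(log_case, 3), round(power_case, 3))
```

The more risk-averse investor (γ = 3) holds the smaller fraction, consistent with theorem 3.2.1.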

3.2.2 Wealth

We now analyze the portfolio decision as the initial wealth changes. We might expect
wealthier investors to put more money in the stock market. However, the result is not
so simple; it depends on the characteristics of the specific utility function.

Absolute Risk Aversion

Theorem 3.2.2. Let â = â(Y0) denote the solution to problem (3.1). Then,

(Decreasing ARA) ARA′(Y) < 0 ⇒ â′(Y0) > 0
(Constant ARA)   ARA′(Y) = 0 ⇒ â′(Y0) = 0
(Increasing ARA) ARA′(Y) > 0 ⇒ â′(Y0) < 0

DARA. If the investor has decreasing absolute risk aversion (DARA), he is willing to
put more money at risk as he becomes wealthier. Recall that power utility has DARA
(ARA(Y) = γ/Y). (Is this reasonable behavior?)

CARA. The second case, constant absolute risk aversion (CARA) is also important
because the exponential utility satisfies this condition. Recall that

U(Y) = − exp(−αY) ⇒ ARA(Y) = α ⇒ ARA′(Y) = 0

The theorem states that this investor will put the same amount of money in the risky
asset regardless of how much wealth he has. (Is this a reasonable description of investors’
behavior?)

Illustration: solving the problem for CARA

Let’s verify the CARA case of the theorem. The portfolio problem is

$$\max_a \; E[-\exp(-\alpha Y_1)] \qquad (3.2)$$

with Y1 = Y0(1 + rf) + a(r − rf). The foc is

$$E[\alpha (r - r_f) \exp(-\alpha Y_1)] = 0 \qquad (3.3)$$

which cannot be solved explicitly for a without further assumptions! To proceed, we
consider two alternatives.

1. Implicit Function Theorem


Even though we cannot explicitly solve the problem, we can still describe the
optimal solution using a very useful trick in economics: the Implicit Function
Theorem.³ Intuitively, this theorem says the following. Suppose the (implicit)
function y = y(x) is the solution to some equation, that is, f(x, y) = 0. More
precisely, as we change x, y(x) adjusts to keep f at 0, f(x, y) ≡ 0. We can thus
conclude that f does not change, i.e., its total differential is zero. Therefore,

$$df(x, y) = 0 \;\Rightarrow\; \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = 0 \;\Rightarrow\; \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}$$

³ Implicit Function Theorem. Consider the equation f(y, x1, ..., xm) = 0 and the solution
(ȳ, x̄1, ..., x̄m). If ∂f(ȳ, x̄)/∂y ≠ 0, then there exists an implicit function y = y(x1, ..., xm)
that satisfies the equation for every (x1, ..., xm) in the neighborhood of (x̄1, ..., x̄m), i.e.,
f(y(x1, ..., xm), x1, ..., xm) = 0. Furthermore, the partial derivatives are given by

$$\frac{\partial y(\bar{x}_1, \dots, \bar{x}_m)}{\partial x_i} = -\frac{\partial f(\bar{y}, \bar{x}_1, \dots, \bar{x}_m)/\partial x_i}{\partial f(\bar{y}, \bar{x}_1, \dots, \bar{x}_m)/\partial y}$$

Going back to the maximization problem, â = â(Y0) is the implicit function that
guarantees that the lhs of (3.3) is always zero. We can thus take the total differential
of the foc and get

$$\frac{d\hat{a}(Y_0)}{dY_0} = -\frac{\partial E[\dots]/\partial Y_0}{\partial E[\dots]/\partial a} = -\frac{(1 + r_f)\,\alpha\,\overbrace{E[\alpha (r - r_f) e^{-\alpha Y_1}]}^{=0\ \text{(foc)}}}{\underbrace{E[\alpha^2 (r - r_f)^2 e^{-\alpha Y_1}]}_{>0}} = 0$$

Hence, the amount invested in the risky asset does not change with the investor’s
wealth, as the theorem claimed. Furthermore, the implicit function theorem al-
lowed us to check this without solving the maximization problem explicitly.
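
The same conclusion can be checked numerically for a specific return distribution. The following is a sketch under assumptions that are not in the notes: the risky return is the two-state lottery of example 3.2.1 (r2 = 0.4, r1 = −0.2, π = 1/2, rf = 0.05), α = 0.5 is an arbitrary choice, and the foc (3.3) is solved by bisection on a search interval [0, 10] that happens to bracket the root for these numbers.

```python
import math


def foc_cara(a, y0, alpha, r1, r2, pi, rf):
    """E[alpha (r - rf) exp(-alpha Y1)] for the lottery (r2, r1, pi)."""
    total = 0.0
    for r, p in ((r2, pi), (r1, 1 - pi)):
        y1 = y0 * (1 + rf) + a * (r - rf)
        total += p * alpha * (r - rf) * math.exp(-alpha * y1)
    return total


def solve_by_bisection(f, lo, hi, tol=1e-10):
    """Root of a decreasing function f on [lo, hi] with f(lo) > 0 > f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def optimal_cara_amount(y0, alpha=0.5, r1=-0.2, r2=0.4, pi=0.5, rf=0.05):
    return solve_by_bisection(
        lambda a: foc_cara(a, y0, alpha, r1, r2, pi, rf), 0.0, 10.0)


amounts = [optimal_cara_amount(y0) for y0 in (1.0, 10.0, 100.0)]
print(amounts)
```

The three optimal amounts coincide up to the bisection tolerance, even though wealth varies by a factor of 100.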

2. Normal returns
To get an explicit closed-form solution to problem (3.2) we need an additional
assumption. It is this assumption that justifies the wide use of exponential utility.
Assume the return on the risky asset is normally distributed, r ∼ N(µ, σ²). Then,
next period’s wealth is also normally distributed, Y1 ∼ N(Y0(1 + rf) + a(µ − rf), a²σ²).
Using the moment generating function for the normal distribution,⁴
we can simplify the portfolio problem:

$$\max_a \; E[-\exp(-\alpha Y_1)] = \max_a \; -\exp\left(-\alpha [Y_0(1 + r_f) + a(\mu - r_f)] + \tfrac{1}{2}\alpha^2 a^2 \sigma^2\right)$$

that is, the rhs does not have E[.]. We can thus solve the maximization problem
and get a closed-form solution for a. Exercise 24 asks you to do these final steps.
Check that the final expression for a does not depend on Y0 , as the theorem
stated. To summarize, even though the exponential utility is not the best intuitive
description of human behavior, it is very useful if we assume that returns are
normally distributed.
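
The moment generating function property used above, E[exp(−γX)] = exp(−γm + ½γ²s²) for X ∼ N(m, s²), can be sanity-checked by numerical integration. The sketch below is only an illustration: the parameter values are hypothetical, and the trapezoid rule over m ± 12s is one simple choice of quadrature.

```python
import math


def normal_pdf(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))


def expected_exp(gamma, m, s, n=20000, width=12.0):
    """Trapezoid-rule approximation of E[exp(-gamma X)], X ~ N(m, s^2)."""
    lo, hi = m - width * s, m + width * s
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-gamma * x) * normal_pdf(x, m, s)
    return total * h


def mgf_formula(gamma, m, s):
    return math.exp(-gamma * m + 0.5 * gamma ** 2 * s ** 2)


gamma, m, s = 2.0, 0.1, 0.2   # hypothetical values
print(expected_exp(gamma, m, s), mgf_formula(gamma, m, s))
```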
⁴ If X ∼ N(m, s²), then $E[e^{-\gamma X}] = \exp\left(-\gamma m + \tfrac{1}{2}\gamma^2 s^2\right)$, for any γ.

Relative Risk Aversion

We can also characterize the optimal portfolio choice in terms of the relative risk aversion
measure, RRA. Define ŵ ≡ â/Y0 , the optimal proportion of wealth invested in the risky
asset, or the optimal portfolio weight in the risky asset.
Theorem 3.2.3. Express the solution to problem (3.1) as a fraction of wealth, ŵ(Y0) ≡
â(Y0)/Y0. Then,
(Decreasing RRA) RRA′(Y) < 0 ⇒ ŵ′(Y0) > 0
(Constant RRA)   RRA′(Y) = 0 ⇒ ŵ′(Y0) = 0
(Increasing RRA) RRA′(Y) > 0 ⇒ ŵ′(Y0) < 0

For example, if the investor has decreasing RRA, he will invest a higher proportion
of wealth in the risky asset as he becomes wealthier. The most interesting case is perhaps
the constant relative risk aversion (CRRA) case, as it characterizes the power and log
utility functions. These investors always invest the same fraction of their wealth in the
stock market, regardless of their initial wealth.⁵

Example 3.2.2. Consider U = ln(Y). Define w ≡ a/Y0, the fraction of
wealth invested in the risky asset. The investor’s problem is to

$$\max_w \; E[\ln(Y_1)]$$

with Y1 = Y0(1 + rf) + wY0(r − rf). Writing the foc and using the implicit
function theorem, we can show that (end-of-chapter exercise 19)

$$\frac{d\hat{w}}{dY_0} = 0$$

That is, the optimal fraction does not change with wealth.
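
The constant-RRA case can also be illustrated numerically without the implicit function theorem. This is a sketch under assumed inputs, not the derivation requested in the exercise: it reuses the two-state lottery of example 3.2.1, solves the foc in w by bisection on [0, 1] (a bracket that fits these numbers), and repeats the calculation at very different wealth levels.

```python
def foc_power(w, y0, gamma, r1, r2, pi, rf):
    """E[U'(Y1) Y0 (r - rf)] with U'(Y) = Y**(-gamma); gamma = 1 is log utility."""
    total = 0.0
    for r, p in ((r2, pi), (r1, 1 - pi)):
        y1 = y0 * (1 + rf) + w * y0 * (r - rf)
        total += p * y1 ** (-gamma) * y0 * (r - rf)
    return total


def optimal_weight(y0, gamma, r1=-0.2, r2=0.4, pi=0.5, rf=0.05):
    lo, hi = 0.0, 1.0          # assumed bracket: foc > 0 at 0 and < 0 at 1
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if foc_power(mid, y0, gamma, r1, r2, pi, rf) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


weights = [optimal_weight(y0, gamma=3.0) for y0 in (1.0, 100.0, 1e6)]
print(weights, optimal_weight(1.0, gamma=1.0))
```

The optimal weight is the same at every wealth level, for both the log (γ = 1) and the power (γ = 3) investor.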
⁵ This theorem can also be expressed in terms of $\eta \equiv \frac{d\hat{a}/\hat{a}}{dY_0/Y_0}$, the wealth elasticity of the
investment in the risky asset:
(Decreasing RRA) RRA′(Y) < 0 ⇒ η > 1
(Constant RRA)   RRA′(Y) = 0 ⇒ η = 1
(Increasing RRA) RRA′(Y) > 0 ⇒ η < 1

To see that increasing ŵ(Y0) ≡ â(Y0)/Y0 is the same as η > 1, note

$$\frac{d}{dY_0}[\hat{w}(Y_0)] = \frac{d}{dY_0}\left[\frac{\hat{a}(Y_0)}{Y_0}\right] > 0 \;\Leftrightarrow\; \frac{d\hat{a}}{dY_0}\frac{1}{Y_0} - \frac{\hat{a}}{Y_0^2} > 0 \;\Leftrightarrow\; \frac{d\hat{a}}{dY_0} > \frac{\hat{a}}{Y_0} \;\Leftrightarrow\; \frac{d\hat{a}/\hat{a}}{dY_0/Y_0} > 1$$

and similarly for the other cases.

3.3 Canonical portfolio problem for N > 1

Now we generalize the portfolio choice problem. There are N risky assets and 1 risk-free
asset. Terminal wealth is

$$\tilde{Y}_1 = Y_0(1 + r_f) + \sum_{i=1}^{N} a_i(\tilde{r}_i - r_f)$$

The investor’s problem is thus

$$\max_{\{a_1,\dots,a_N\}} \; E\left[U\left(Y_0(1 + r_f) + \sum_{i=1}^{N} a_i(\tilde{r}_i - r_f)\right)\right]$$

It will be convenient to choose weights instead of $ values. We thus define wi ≡ ai/Y0
and write $Y_1 = Y_0(1 + r_f) + \sum_{i=1}^{N} w_i Y_0(\tilde{r}_i - r_f)$. The investor’s problem can thus be
rewritten as

$$\max_{\{w_1,\dots,w_N\}} \; E\left[U\left(Y_0\left[(1 + r_f) + \sum_{i=1}^{N} w_i(\tilde{r}_i - r_f)\right]\right)\right]$$

Define r̃p to be the return on the portfolio:

$$\tilde{r}_p := w_f r_f + \sum_{i=1}^{N} w_i \tilde{r}_i$$

Imposing the constraint that the weights must add up to one, we have that

$$\tilde{r}_p = \left(1 - \sum_{i=1}^{N} w_i\right) r_f + \sum_{i=1}^{N} w_i \tilde{r}_i = r_f + \sum_{i=1}^{N} w_i(\tilde{r}_i - r_f)$$

Hence, the portfolio problem can also be written as

$$\max_{\{w_1,\dots,w_N\}} \; E[U(Y_0(1 + \tilde{r}_p))]$$

Unfortunately, this problem is hard to solve without some simplifying assumptions.
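
The equivalence between the formulation in $ amounts and the one in weights is easy to confirm for one realization of returns. The sketch below uses hypothetical numbers and illustrative function names.

```python
def portfolio_return(weights, returns, rf):
    """r_p = rf + sum_i w_i (r_i - rf); the risk-free weight is 1 - sum(w)."""
    return rf + sum(w * (r - rf) for w, r in zip(weights, returns))


def terminal_wealth_from_amounts(y0, amounts, returns, rf):
    """Y1 = Y0 (1 + rf) + sum_i a_i (r_i - rf)."""
    return y0 * (1 + rf) + sum(a * (r - rf) for a, r in zip(amounts, returns))


y0, rf = 1000.0, 0.02
weights = [0.3, 0.5]                 # hypothetical risky weights (w_f = 0.2)
returns = [0.10, -0.05]              # one hypothetical realization of returns
amounts = [w * y0 for w in weights]  # a_i = w_i * Y0

rp = portfolio_return(weights, returns, rf)
print(rp, y0 * (1 + rp), terminal_wealth_from_amounts(y0, amounts, returns, rf))
```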

3.4 Exercises
Ex. 17 — State the investor’s problem (expression 3.1) in words.

Ex. 18 — Check the results in example 3.2.1. The final expression is in the book; you
just need to do the intermediate calculations. Caution: the expression in the book is
correct, but the number is not (at least I get a different answer: a/Y = 0.198 instead of
0.24).

Ex. 19 — Check the results in example 3.2.2, ie, do the intermediate computations.

Ex. 20 — Consider the standard portfolio choice between a risk-free asset and a risky
stock. An investor with initial wealth $1000 makes an optimal choice to allocate $400
to the stock. We know that if the same investor had an initial wealth larger than $1000,
he would allocate more than $400 to the stock.
1. This investor has ________ (decreasing / constant / increasing) ARA.
2. Give an example of a utility function consistent with this behavior.

Ex. 21 — Consider the utility function U(Y) = −e^(−gY), where g is a constant parameter.
1. Compute the ARA and RRA coefficients.
2. Interpret in words the result obtained for ARA (relate it to a simple lottery and
to the portfolio choice problem).

Ex. 22 — Consider the canonical portfolio choice problem with 1 risky asset (with
random return r) and 1 risk-free asset (with return rf ). The investor chooses the amount
of money (a) to invest in the risky asset.
1. Write the problem explicitly for an investor with U (Y ) = − exp(−αY ), where Y
is the wealth.
2. If the risk-free rate increases, what should happen to the amount invested in the
risky asset? Explain intuitively (5 lines).
3. Show it explicitly. Hint: compute da/drf and determine its sign.

Ex. 23 — There is a risk-free and a risky asset. The investor chooses the amount
invested in the risky asset, a, to solve max_a E[U(Y1)], where Y1 is next period’s wealth.
Assume a regular utility function (U′ > 0, U″ < 0).
1. In general, what can you say about the sign of da/dY0?
2. Assume U(Y) = −e^(−αY). Compute da/dY0.

Ex. 24 — Consider the standard portfolio choice problem

$$\max_a \; E[-\exp(-\gamma Y_1)]$$

where next-period’s wealth is Y1 = Y0(1 + rf) + a(r − rf), and the return on the risky
asset is normally distributed, r ∼ N(µ, σ²). Compute the explicit optimal amount to
invest in the risky asset (a).

Hint. Use the following property of the normal distribution (called moment generating
function): If X ∼ N(m, s²), then $E[e^{-\gamma X}] = \exp\left(-\gamma m + \tfrac{1}{2}\gamma^2 s^2\right)$, for any γ.


Ex. 25 — Computing returns with dividends.

Consider the following daily closing prices and dividends (D) for two stocks (in $):

               Stock A        Stock B
day    t      Pt     Dt      Pt     Dt
fri    0      10     –       10     –
mon    1      11     –       11     –
tue    2      10     –       10     –
wed    3      11     –       11     1.1
thu    4       9     –        9     –
fri    5      12     –       12     –

Note that when a stock pays dividends, the return should be computed as $r_t = \frac{P_t + D_t}{P_{t-1}} - 1$.
1. Compute daily returns for these two stocks. Compute also the weekly returns
assuming that the dividends are reinvested in the stock. This is a standard
assumption, so use the standard formula, $1 + r_{0,T} = (1 + r_{0,1})(1 + r_{1,2}) \cdots (1 + r_{T-1,T})$.
Note: this is usually called Holding Period Return in databases such as CRSP
or DataStream.
2. Suppose you invested $4,000 in A and $6,000 in B in the beginning of the week.
Compute the portfolio return over this week. (Use the weekly returns already
computed and apply the standard formula for the portfolio return).
3. Since we assume that dividends are reinvested in the stock, we may end up with
more shares than we started with. How many shares of each stock do you have
at the beginning of the week? How many shares do you have at the end of the
week?
Note: to check that you have the right answer, compute the terminal value of the
portfolio by doing V5 = PA,5 NA,5 + PB,5 NB,5 , where N is the number of shares
that you got. It should imply the same weekly return as in the previous question.
4. Again, the way weekly returns were computed assumes that dividends are reinvested
in the stock. Hence, while for the stock without dividends (A) we have

$$r_{A,week} = P_5/P_0 - 1$$
$$0.2 = 12/10 - 1$$

the same is no longer true for the dividend-paying stock (stock B)

$$r_{B,week} \ne P_5/P_0 - 1$$
$$0.32 \ne 12/10 - 1$$

Hence, databases usually also show an adjusted price, $P^a$, that can be used to
compute returns without having to know the dividends. The true return from
market closes plus dividends must equal the return with adjusted closes:

$$\frac{P_t + D_t}{P_{t-1}} - 1 = \frac{P^a_t}{P^a_{t-1}} - 1$$

Fix the last price $P^a_5 = P_5 = 12$. Compute the adjusted prices for the previous
days for both stocks.
(Check my website for an exercise with data from finance.yahoo.com)
Chapter 4

Portfolio choice for Mean-Variance investors

1. Quadratic utility or Normal returns imply mean-variance preferences, E[U] = f(µp, σp²).

2. The optimal investment opportunities are described by the mean-variance frontier.

3. The investor’s portfolio choice problem with N > 1 risky assets can be solved explicitly.

These concepts were developed by Harry Markowitz in 1952 and they are still the
benchmark for optimal portfolio allocation.

4.1 Mean-Variance preferences

The general portfolio problem (N > 1) is hard to solve unless we make one of the
simplifying assumptions below. Either one of these assumptions will lead to mean-variance
preferences, that is, to investors that care only about the first two moments of
Y1 or rp.¹

Expand U(Ỹ1) around E(Ỹ1). To simplify the notation, let Y ≡ Y1.

$$U(Y) = U(EY) + U'(EY) \cdot (Y - EY) + \tfrac{1}{2} U''(EY) \cdot (Y - EY)^2 + \text{remainder}$$

¹ Note that the two are related: E[Y1] = Y0(1 + E[rp]) and Var[Y1] = Y0² Var[rp].

Taking expectations on both sides we get

$$EU(Y) = U(EY) + U'(EY) \cdot E[(Y - EY)] + \tfrac{1}{2} U''(EY) \cdot E[(Y - EY)^2] + E[\text{remainder}]$$

or, simplifying,

$$EU(Y) = U(EY) + \tfrac{1}{2} U''(EY) \cdot Var(Y) + E[\text{remainder}]$$

Note that this expression, EU (Y ), is what the investor maximizes in his portfolio prob-
lem. The question is thus “under what conditions can we say that E[remainder] = 0, or
at least that E[remainder] itself depends only on the first two moments of wealth?”

4.1.1 Quadratic utility

If the utility function is quadratic, U = aY − bY², all derivatives of order higher than 2
are null, thus remainder = 0. Therefore, we have an exact expression:

$$EU(Y) = U(EY) + \tfrac{1}{2} U''(EY) \cdot Var(Y) \qquad (4.1)$$

and the portfolio problem becomes quite simple to solve.
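
Equation (4.1) can be verified on any discrete wealth distribution. The sketch below uses a hypothetical three-point distribution and the quadratic utility of example 3.1.1 (U = 11Y − 5Y²); the variable names `lin` and `quad` stand for the coefficients a and b and are chosen only to avoid clashing with the portfolio amount a.

```python
def expect(dist, f):
    """E[f(Y)] for a discrete distribution {value: probability}."""
    return sum(p * f(y) for y, p in dist.items())


lin, quad = 11.0, 5.0                        # U(Y) = lin*Y - quad*Y^2


def U(y):
    return lin * y - quad * y ** 2


wealth = {0.7: 0.3, 0.9: 0.4, 1.05: 0.3}     # hypothetical distribution of Y
mean = expect(wealth, lambda y: y)
var = expect(wealth, lambda y: (y - mean) ** 2)

exact = expect(wealth, U)                            # E[U(Y)] computed directly
from_moments = U(mean) + 0.5 * (-2 * quad) * var     # rhs of (4.1), U'' = -2*quad
print(exact, from_moments)
```

The two numbers agree up to rounding, so only the mean and variance of wealth matter for this investor.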

Drawbacks of quadratic utility. Quadratic utility has IARA, which is not very
reasonable. Furthermore, in practical applications we have to be careful defining the
parameters a and b such that we only use the range of wealth where U is increasing.

4.1.2 Normal returns

Alternatively, we can assume that stock returns are normally distributed. Note that if
rp ∼ N , then the wealth is also normally distributed, Y ≡ Y0 (1 + rp ) ∼ N .

For a normal distribution, all higher-order central moments are either zero or a
function of the variance:

$$E[(Y - EY)^n] = \begin{cases} 0, & n \text{ odd} \\ \dfrac{n!}{(n/2)!}\left(\tfrac{1}{2} Var[Y]\right)^{n/2}, & n \text{ even} \end{cases}$$

These are the terms in E[remainder]. Hence,

$$EU(Y) = U(EY) + \tfrac{1}{2} U''(EY) \cdot Var(Y) + f(Var\,Y)$$

that is, the investor’s objective function depends only on the first two moments.
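
As a quick check of the moment formula, the sketch below evaluates it for a few orders and compares the results with the familiar values Var, 3·Var² and 15·Var³ for n = 2, 4, 6. The function name and the variance value are illustrative.

```python
import math


def normal_central_moment(n, var):
    """E[(Y - EY)^n] for a normal Y with variance var."""
    if n % 2 == 1:
        return 0.0
    half = n // 2
    return math.factorial(n) / math.factorial(half) * (0.5 * var) ** half


var = 0.04   # hypothetical variance (standard deviation 0.2)
for n in range(1, 7):
    print(n, normal_central_moment(n, var))
```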

Advantages of normality

We are considering the case where the investor can combine several assets into a portfolio.
If we start by assuming that the return on individual assets is determined by their means
and variances, we need to make sure that the return on any combination of these assets
(portfolio) is also determined by the mean and variance only. The Normal distribution
satisfies this additivity requirement (in fact, it is the only distribution with finite variance
that does so).

To see this, let V denote the value of a portfolio with N assets, and wi denote the
percentage of wealth invested in each asset. The portfolio return is just the weighted-
average of individual returns:

$$r_p := V_1/V_0 - 1 = \sum_{i=1}^{N} a_i(1 + r_i)/V_0 - 1 = \sum_{i=1}^{N} w_i(1 + r_i) - \sum_{i=1}^{N} w_i = \sum_{i=1}^{N} w_i r_i$$

Since the sum of normally distributed random variables also follows a normal distribu-
tion, if we assume that each stock has a normal distribution, then the portfolio return
is also normally distributed: ri ∼ N ⇒ rp ∼ N .

Drawbacks of normality

The returns we are considering here are discrete returns, defined as:

r: P1 = P0 (1 + r)

Since the Normal distribution has support ℝ, saying that r ∼ N is the same as saying
that prices can be negative. This is an unrealistic description for assets with limited
liability, such as stocks and bonds, where the worst that can happen is bankruptcy, in
which case P1 = 0 and r = −100%.²

² We can go around this issue by using instead continuously-compounded returns:

$$z: \quad P_1 = P_0 e^{z} \;\Leftrightarrow\; z = \ln(P_1/P_0)$$

This guarantees P1 > 0, ∀z ∈ ℝ. We can thus safely assume z ∼ N. Continuous returns
are very convenient for time-series aggregation in multiperiod settings. If short-horizon returns
are normally distributed, then the long-horizon return, $z_{0,T}$, is also normally distributed:
$$z_{0,T} = \ln(P_T/P_0) = \ln\left(\frac{P_T}{P_{T-1}} \cdot \frac{P_{T-1}}{P_{T-2}} \cdots \frac{P_2}{P_1} \cdot \frac{P_1}{P_0}\right) = z_{0,1} + z_{1,2} + \cdots + z_{T-2,T-1} + z_{T-1,T} \sim N$$
For cross-section aggregation, the expression is a bit more cumbersome:
$$z_p := \ln(V_1/V_0) = \ln\left(\sum_{i=1}^{N} a_i e^{z_i}/V_0\right) = \ln\left(\sum_{i=1}^{N} w_i e^{z_i}\right) \quad \text{or,} \quad e^{z_p} = \sum_{i=1}^{N} w_i e^{z_i}$$
Normality is not preserved.

Empirical evidence

It is an empirical question whether normality is a reasonable first approximation to
security returns. The answer is yes, the normal distribution is a useful approximation,
particularly for returns measured over long horizons, such as one year.

If we were interested in high-frequency returns, then the normal assumption would
be more questionable, due to the following empirical facts:

1. Short-term daily returns have fat tails, that is, empirical returns have more kurtosis
than the normal distribution.

2. Short-term daily returns (especially for stock indices) are skewed to the left, that
is, extremely bad returns are more likely than under a true normal distribution.

Fortunately, these problems are less severe at longer horizons, say monthly or yearly.
Hence, since the portfolio problem we are considering here typically has a long horizon,
normality is a reasonable assumption.

Note that, despite the caveats above, the normal distribution is still the benchmark
and the workhorse in finance. For instance, J.P.Morgan/Reuters’ RiskMetrics
system (which outputs Value-at-Risk estimates) assumes that even daily returns are normally
distributed (see J.P. Morgan, 1996).

4.1.3 Conclusion

Either assuming quadratic utility or normal returns, we conclude that the investor maximizes
a function of the mean (µ := E[r]) and variance (σ² := Var[r]) of the return on
the portfolio:

$$\max \; \underbrace{E[U(Y)]}_{f(\mu_p,\,\sigma_p^2)}$$

Quite intuitively, it can be shown that the objective function increases with the
expected return, df/dµp > 0, and decreases with the standard-deviation, df/dσp < 0.³
This leads to two important results.
³ For quadratic utility, this follows directly from taking derivatives of (4.1). For normal returns,
standardize the portfolio returns: $s = \frac{r_p - \mu}{\sigma} \sim N(0, 1)$. Then, the function to be maximized is
$f := E[U(r_p)] = \int U(r)\,p(r)\,dr = \int U(\sigma s + \mu)\,p(s)\,ds$, where p(.) is the Normal pdf. Thus,
$df/d\mu = \int U'(.)\,p(s)\,ds > 0$, since U′ > 0. Also, $df/d\sigma = \int U'(.)\,s\,p(s)\,ds < 0$, since U″ < 0 means
that U′ is decreasing, which implies that for each ±s pair the negative s gets more weight. See
appendix 6.1 in Danthine and Donaldson (2005) for illustrations. To be precise, the investor
maximizes EU(Y1), not EU(rp), but the derivatives have the same sign since Y1 = Y0(1 + rp).

Mean-Variance dominance. Asset a mean-variance dominates asset b iff:


µa ≥ µb and σa < σb
or µa > µb and σa ≤ σb

All mean-variance investors prefer asset a. This implies that, for a given level of
variance (mean), all mean-variance investors prefer the portfolio with the largest (smallest)
return (risk).

Optimal portfolio. It can be shown that a mean-variance investor will choose his
portfolio through the following program:⁴

$$\max_{\{w_1,\dots,w_N\}} \; \mu_p - \frac{g}{2}\sigma_p^2$$

That is, his objective function trades off mean against variance. The parameter g determines
how much the investor dislikes variance, i.e., how risk-averse he is.

4.2 Review: Mean-Variance frontier with 2 stocks

This section analyzes the investment opportunity set for an investor with mean variance
preferences (by one of the two possible assumptions in section 4.1). The goal is to
develop intuition for the diversification effect with just two stocks. The following sections
consider the portfolio problem in full generality.

Suppose there are just two risky assets (stocks). The investor only cares about the
mean and variance of the return on the portfolio formed by these two assets:

$$\mu_p \equiv E\left[\sum_{i=1}^{2} w_i r_i\right] = w_1\mu_1 + (1 - w_1)\mu_2$$

and

$$\sigma_p^2 \equiv Var\left[\sum_{i=1}^{2} w_i r_i\right] = w_1^2\sigma_1^2 + (1 - w_1)^2\sigma_2^2 + 2w_1(1 - w_1)\sigma_1\sigma_2\rho$$

where ρ is the correlation coefficient between r1 and r2 (recall −1 ≤ ρ ≤ +1). The
opportunity set depends critically on this correlation.

⁴ Assume a quadratic utility function over returns: $U(r_p) = r_p - \frac{g}{2}(r_p - \mu_p)^2$. Then, $E[U(r_p)] = \mu_p - \frac{g}{2}\sigma_p^2$.
Note that $U(r_p) = r_p - \frac{g}{2}(r_p - \mu_p)^2$ reflects the same preferences as $U(r_p) = const_1\, r_p - const_2\, r_p^2$.

The main point we want to illustrate in this section is the diversification effect.
Whereas the expected return on the portfolio is the weighted average of expected returns
on the individual assets, the same is not true for the risk. In fact, the standard-deviation
of the portfolio is typically less than the weighted average of the individual standard-
deviations. This is the gain from diversifying the portfolio. The smaller the correlation
coefficient, the greater the benefits from diversification.
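
The diversification effect is easy to see with numbers. The sketch below evaluates the two-asset formulas for the portfolio mean and standard deviation at several correlations; the asset parameters (µ1 = 8%, µ2 = 12%, σ1 = 15%, σ2 = 25%) and the equal weights are hypothetical.

```python
import math


def portfolio_mean(w1, mu1, mu2):
    return w1 * mu1 + (1 - w1) * mu2


def portfolio_std(w1, s1, s2, rho):
    var = (w1 ** 2 * s1 ** 2 + (1 - w1) ** 2 * s2 ** 2
           + 2 * w1 * (1 - w1) * s1 * s2 * rho)
    return math.sqrt(max(var, 0.0))   # guard against tiny negative rounding


mu1, mu2, s1, s2 = 0.08, 0.12, 0.15, 0.25   # hypothetical assets
w1 = 0.5
weighted_avg_std = w1 * s1 + (1 - w1) * s2

print("mean:", portfolio_mean(w1, mu1, mu2))
for rho in (1.0, 0.3, 0.0, -1.0):
    print(rho, portfolio_std(w1, s1, s2, rho), "vs weighted average", weighted_avg_std)
```

The mean is the same for every ρ, while the standard deviation equals the weighted average only at ρ = 1 and falls as the correlation decreases.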

Perfect positive correlation (ρ = 1). There is no gain from diversification since
the assets are essentially identical (the return on one asset is a linear function of the
other). The portfolio standard-deviation is equal to the weighted average of the two
standard-deviations
σp = w1σ1 + (1 − w1)σ2
which means that all the possible portfolios lie on the straight line between the two assets
(in (σ, µ)-space); see figure 6.2 in Danthine and Donaldson (2005).

Imperfect correlation (−1 < ρ < +1). Now we have the diversification benefit.
At each level of µp, the corresponding σp is less than in the ρ = 1 case. This is because
σp² increases in ρ (∂σp²/∂ρ = 2w1w2σ1σ2 > 0). See figure 6.3 in Danthine and Donaldson
(2005) and appendix 6.2 for a formal proof.

Note that only the portfolios on the upper part of the curve are efficient, that is, they
(mean-variance) dominate the ones on the lower part of the curve.

Perfect negative correlation (ρ = −1). For this (theoretical) case we would be
able to construct a risk-free asset. See figure 6.4 in the book.

1 Risk-free and 1 risky asset. If one asset is risk-free (σ1 = 0), we have σ12 = 0
and σp = w2 σ2 . The opportunity set is again linear — figure 6.5 in the book.

Extension to N risky assets. Intuitively, this analysis can be generalized to 3 risky assets by taking one of the possible previous portfolios and a new 3rd asset. Proceeding with these iterations, we could get to N risky assets. The minimum variance frontier will have the shape in figure 6.6 in the book. We will derive this carefully in the next section.

Extension to N risky assets plus 1 risk-free asset. The investor will pick one particular portfolio on the mean-variance frontier (the tangency portfolio) to combine with the risk-free asset. The straight line going through rf and µT is the efficient frontier. See figure 6.6. Again, this will be derived below.

The fact that all investors will invest in the same two assets (the risk-free and the
tangency portfolio), even though in different proportions, is known as the two fund
theorem or the separation theorem.

4.3 Setup for general case

4.3.1 Notation

Let r be the (N × 1) vector of returns on the N risky assets. Define the vector of expected returns:
$$\bar r := E[r] = \begin{pmatrix} E[r_1] \\ \vdots \\ E[r_N] \end{pmatrix}$$
Let the covariance matrix be
$$V := \operatorname{Cov}(r) = \begin{pmatrix} & \vdots & \\ \cdots & \sigma_{ij} & \cdots \\ & \vdots & \end{pmatrix}$$
Let 1 be an (N × 1) vector of ones. Let µ (scalar) be the required return on the portfolio. The choice variable is the vector of portfolio weights:
$$w = \begin{pmatrix} w_1 \\ \vdots \\ w_N \end{pmatrix}$$

4.3.2 Brief notions of matrix calculus

For a scalar-valued function $f(x_1, \ldots, x_n)$, the gradient is
$$\frac{\partial f(x)}{\partial x} = \begin{pmatrix} \partial f / \partial x_1 \\ \vdots \\ \partial f / \partial x_n \end{pmatrix}$$

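Let a be an (n × 1) vector of constants and A an (n × n) symmetric matrix of constants. Some useful rules are:
$$\frac{d(a'x)}{dx} = a$$
and
$$\frac{d(x'Ax)}{dx} = 2Ax$$
where $x'Ax$ is a scalar, being the product of (1 × n), (n × n), and (n × 1) terms.

To check the second rule, consider
$$A = \begin{pmatrix} 1 & 3 \\ 3 & 4 \end{pmatrix}$$
Note that $x'Ax = x_1^2 + 4x_2^2 + 6x_1x_2$. Thus,
$$\frac{d(x'Ax)}{dx} = \begin{pmatrix} 2x_1 + 6x_2 \\ 6x_1 + 8x_2 \end{pmatrix} = \begin{pmatrix} 2 & 6 \\ 6 & 8 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 2Ax$$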
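
The second rule can also be checked numerically. The following is a minimal sketch, assuming a central finite-difference approximation to the gradient; the evaluation point and the helper names `quad_form` and `numerical_gradient` are hypothetical and only serve the illustration.

```python
def quad_form(A, x):
    """Return x'Ax for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def numerical_gradient(f, x, h=1e-6):
    """Central finite-difference approximation to the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad

A = [[1.0, 3.0], [3.0, 4.0]]   # the symmetric matrix used in the text
x = [0.7, -1.2]                # arbitrary evaluation point (an assumption)

numeric = numerical_gradient(lambda v: quad_form(A, v), x)
analytic = [2.0 * sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]  # 2Ax

print("numeric :", numeric)
print("analytic:", analytic)
```

The two vectors should agree up to rounding error, since central differences are exact for quadratic functions.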

4.4 Frontier with N risky assets

4.4.1 Efficient portfolio

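The variance of the return on a portfolio ($r_p = w'r$) is given by
$$\operatorname{Var}[w'r] = w'Vw$$

The program to find the minimum-variance portfolio, for a given expected return µ, is thus:
$$\begin{aligned} \underset{w}{\text{minimize}} \quad & \tfrac{1}{2}\, w'Vw \\ \text{s.t.} \quad & w'\bar r = \mu \\ & w'\mathbf{1} = 1 \end{aligned} \tag{4.2}$$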

This is a constrained optimization problem. To solve it, define the Lagrangian
$$L = \tfrac{1}{2}\, w'Vw + \lambda(\mu - w'\bar r) + \gamma(1 - w'\mathbf{1})$$
where the scalars λ and γ are Lagrange multipliers.

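The first-order conditions are:
$$\frac{dL}{dw} = Vw - \lambda \bar r - \gamma \mathbf{1} = 0 \quad \text{(N eqns)} \tag{4.3}$$
$$\frac{dL}{d\lambda} = \mu - w'\bar r = 0 \quad \text{(1 eqn)} \tag{4.4}$$
$$\frac{dL}{d\gamma} = 1 - w'\mathbf{1} = 0 \quad \text{(1 eqn)} \tag{4.5}$$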

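The foc for w can be rewritten as
$$\begin{aligned} Vw &= \lambda \bar r + \gamma \mathbf{1} \\ \Rightarrow\; V^{-1}Vw &= V^{-1}(\lambda \bar r + \gamma \mathbf{1}) \\ \Rightarrow\; w &= \lambda V^{-1}\bar r + \gamma V^{-1}\mathbf{1} \end{aligned} \tag{4.6}$$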

But this is not over yet because we don’t know the value of the multipliers.

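Pre-multiplying (4.6) by $\bar r'$ and using the foc for λ we get
$$\bar r'w = \lambda(\bar r'V^{-1}\bar r) + \gamma(\bar r'V^{-1}\mathbf{1}) \;\Rightarrow\; \mu = \lambda(\bar r'V^{-1}\bar r) + \gamma(\bar r'V^{-1}\mathbf{1}) \tag{4.7}$$

Pre-multiplying again (4.6) by $\mathbf{1}'$ and using the foc for γ we get
$$\mathbf{1}'w = \lambda(\mathbf{1}'V^{-1}\bar r) + \gamma(\mathbf{1}'V^{-1}\mathbf{1}) \;\Rightarrow\; 1 = \lambda(\mathbf{1}'V^{-1}\bar r) + \gamma(\mathbf{1}'V^{-1}\mathbf{1}) \tag{4.8}$$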

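Equations (4.7) and (4.8) form a system of two (scalar) equations that can be solved for the two unknown Lagrange multipliers:
$$\begin{cases} \mu = \lambda B + \gamma A \\ 1 = \lambda A + \gamma C \end{cases} \;\Rightarrow\; \begin{cases} \gamma = \dfrac{B - A\mu}{D} \\[1ex] \lambda = \dfrac{C\mu - A}{D} \end{cases}$$
where we defined the scalars $A := \mathbf{1}'V^{-1}\bar r$, $B := \bar r'V^{-1}\bar r$, $C := \mathbf{1}'V^{-1}\mathbf{1}$, and $D := BC - A^2$. Since the matrix of covariances (V) is positive definite and thus also $V^{-1}$, we have that B > 0 and C > 0 (see footnote 5). It can also be shown that D > 0.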
5
We say that the matrix A is positive (semi)definite if $x'Ax > 0$ ($\geq 0$) for all nonzero x. The covariance matrix is PD because the variance of a portfolio must be positive, $\operatorname{Var}[w'r] = w'Vw > 0$. In general, a covariance matrix need only be PSD, but this would mean that we might be able to construct a risk-free portfolio using only stocks, $\operatorname{Var}[w'r] = w'Vw = 0$. This is typically not the case, so we assume that V is PD.
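
One way to see that D > 0 is the following short argument. It is a sketch that assumes $\bar r$ is not proportional to $\mathbf{1}$ (the assets do not all have the same expected return), so that the vector $A\bar r - B\mathbf{1}$ is nonzero and the positive definiteness of $V^{-1}$ applies:

```latex
\begin{align*}
0 &< (A\bar r - B\mathbf{1})'\, V^{-1}\, (A\bar r - B\mathbf{1}) \\
  &= A^2\,(\bar r' V^{-1} \bar r) - 2AB\,(\mathbf{1}' V^{-1} \bar r) + B^2\,(\mathbf{1}' V^{-1} \mathbf{1}) \\
  &= A^2 B - 2A^2 B + B^2 C \\
  &= B\,(BC - A^2) = BD .
\end{align*}
```

Since B > 0, it follows that D > 0.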

Plugging these numbers back into (4.6), we get the final answer:
$$w^* = \frac{C\mu - A}{D}\, V^{-1}\bar r + \frac{B - A\mu}{D}\, V^{-1}\mathbf{1} \tag{4.9}$$

This equation is a closed formula for the efficient portfolio with return µ, that is, for the portfolio with the smallest variance among all portfolios with return µ. You can double check that we do indeed get the required return, i.e. $E[r_p^*] \equiv w^{*\prime}\bar r = \mu$. The portfolio variance can be computed as $\operatorname{Var}[r_p^*] \equiv \operatorname{Var}(w^{*\prime} r) = w^{*\prime} V w^*$. By varying µ and computing the respective $w^*$ and $\operatorname{Var}[r_p^*]$, we can plot the frontier of the investment opportunity set.

Example 4.4.1. Assume that there are only 2 risky assets with E[r1] = 15%, σ1 = 25%, E[r2] = 10%, σ2 = 20%, and zero correlation. First, check that
$$A = 4.9, \quad B = 0.61, \quad C = 41, \quad D = 1$$
Hint: see the formula sheet for an easy way to invert a diagonal matrix.
Second, if we require say an expected return of µ = 0.14, the optimal portfolio from the formula above is
$$w^* = \ldots = \begin{pmatrix} 0.8 \\ 0.2 \end{pmatrix}$$
We can check that $E[r_p^*] \equiv w^{*\prime}\bar r = 0.14$. The risk of the portfolio is
$$\operatorname{Var}[r_p^*] = w^{*\prime} V w^* = 0.0416 \;\Rightarrow\; \sigma_p^* = 0.204$$
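
The numbers in this example can be reproduced with a few lines of code. The sketch below is one possible way to do it, not the only one: it relies on the zero-correlation assumption of the example, so that V is diagonal and its inverse is simply the vector of reciprocal variances. All variable names are illustrative.

```python
import math

mean = [0.15, 0.10]                     # expected returns from Example 4.4.1
sigma = [0.25, 0.20]                    # standard deviations
v_inv = [1.0 / s ** 2 for s in sigma]   # diagonal of V^{-1} (valid only for zero correlation)

A = sum(vi * m for vi, m in zip(v_inv, mean))       # 1'V^{-1}r
B = sum(vi * m * m for vi, m in zip(v_inv, mean))   # r'V^{-1}r
C = sum(v_inv)                                      # 1'V^{-1}1
D = B * C - A ** 2

mu = 0.14                               # required expected return
lam = (C * mu - A) / D                  # Lagrange multiplier lambda
gam = (B - A * mu) / D                  # Lagrange multiplier gamma
w = [lam * vi * m + gam * vi for vi, m in zip(v_inv, mean)]   # equation (4.9)

exp_ret = sum(wi * m for wi, m in zip(w, mean))
var = sum((wi * s) ** 2 for wi, s in zip(w, sigma))  # w'Vw for a diagonal V

print("A, B, C, D =", round(A, 4), round(B, 4), round(C, 4), round(D, 4))
print("w =", [round(wi, 4) for wi in w])
print("E[rp] =", round(exp_ret, 4), " Var =", round(var, 4), " sigma =", round(math.sqrt(var), 4))
```

For correlated assets the same formulas apply, but V would have to be inverted as a full matrix.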

4.4.2 Frontier equation

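If we work out the $\operatorname{Var}[r_p^*] = w^{*\prime} V w^*$ expression, we arrive at the following equation for the mean-variance frontier:
$$\operatorname{Var}[r_p^*] = \frac{C}{D}\left(\mu - \frac{A}{C}\right)^2 + \frac{1}{C}$$
which is a parabola in $(\operatorname{Var}[r_p], E[r_p])$-space (see footnote 6).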

Example 4.4.2. Continuing the previous example, check that we get the same
Var[rp∗ ] for µ = 0.14
6
The frontier is a hyperbola in (σp , E[rp ])-space.

4.4.3 Global minimum variance portfolio

From this equation, we can immediately identify the global minimum variance portfolio:
$$E[r_{mvp}] = A/C, \qquad \operatorname{Var}[r_{mvp}] = 1/C$$

The set of portfolios located on the mean-variance frontier with E[rp ] > A/C is called
the efficient frontier.7

Example 4.4.3. For the previous example, check that

E[rmvp ] = 0.1195
Var[rmvp ] = 0.0244 ⇒ σmvp = 0.1562
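
Examples 4.4.2 and 4.4.3 can be checked with the frontier equation directly. This is a small sketch that takes the scalars A, B, C from Example 4.4.1 as given; the function name `frontier_variance` is hypothetical.

```python
import math

A, B, C = 4.9, 0.61, 41.0     # scalars from Example 4.4.1
D = B * C - A ** 2

def frontier_variance(mu):
    """Minimum variance attainable for expected return mu (risky assets only)."""
    return (C / D) * (mu - A / C) ** 2 + 1.0 / C

var_014 = frontier_variance(0.14)       # should match Var[rp*] in Example 4.4.1
mu_mvp, var_mvp = A / C, 1.0 / C        # global minimum variance portfolio

print("Var at mu = 0.14:", round(var_014, 4))
print("E[r_mvp] =", round(mu_mvp, 4))
print("Var[r_mvp] =", round(var_mvp, 4), " sigma_mvp =", round(math.sqrt(var_mvp), 4))
```

Evaluating `frontier_variance` on a grid of µ values gives the points needed to plot the frontier.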

4.5 Frontier with N risky assets and 1 risk-free asset

4.5.1 Efficient portfolio

In addition to the N risky assets of the previous section, we now consider one additional risk-free asset with (known) return rf. Let w be the (N × 1) vector of weights in the risky assets as defined before. The proportion of wealth invested in the risk-free asset is thus what is left, $w_f = 1 - w'\mathbf{1}$. Therefore, the expected return on a given portfolio is
$$\begin{aligned} E[r_p] &= w'\bar r + w_f r_f \\ &= w'\bar r + (1 - w'\mathbf{1}) r_f \end{aligned}$$

Note that the second equation already imposes that the weights add up to 1.

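The program to find the minimum-variance portfolio, for a given expected return µ, is now
$$\begin{aligned} \underset{w}{\text{minimize}} \quad & \tfrac{1}{2}\, w'Vw \\ \text{s.t.} \quad & w'\bar r + (1 - w'\mathbf{1}) r_f = \mu \end{aligned} \tag{4.10}$$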
7
Different authors use slightly different names for all these “frontiers”. So make sure you understand the concepts well (what dominates what).

The solution is:
$$w^* = \frac{\mu - r_f}{H}\, V^{-1}(\bar r - r_f \mathbf{1}) \tag{4.11}$$
where the scalar $H := (\bar r - r_f \mathbf{1})' V^{-1} (\bar r - r_f \mathbf{1}) = B - 2Ar_f + Cr_f^2 > 0$. The scalars A, B, C are as defined above.

Example 4.5.1. Continuing the previous two-stock example, further assume rf = 0.04. First, check that
$$H = 0.2836$$
Second, if we require an expected return of µ = 0.14, the optimal portfolio from the formula above is
$$w^* = \ldots = \begin{pmatrix} 0.6206 \\ 0.5289 \end{pmatrix}$$
We can check that $E[r_p^*] \equiv w^{*\prime}\bar r + w_f r_f = 0.14$. The risk of the portfolio is
$$\operatorname{Var}[r_p^*] = w^{*\prime} V w^* = 0.0353 \;\Rightarrow\; \sigma_p^* = 0.1878$$
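
As before, the example can be verified in code. The sketch below again assumes zero correlation (diagonal V), as in the example, and uses illustrative variable names.

```python
import math

mean = [0.15, 0.10]
sigma = [0.25, 0.20]
rf, mu = 0.04, 0.14

v_inv = [1.0 / s ** 2 for s in sigma]        # diagonal of V^{-1} (zero correlation)
excess = [m - rf for m in mean]              # r - rf*1
H = sum(vi * e * e for vi, e in zip(v_inv, excess))

# Equation (4.11): weights in the risky assets.
w = [(mu - rf) / H * vi * e for vi, e in zip(v_inv, excess)]
w_f = 1.0 - sum(w)                           # what is left goes into the risk-free asset

exp_ret = sum(wi * m for wi, m in zip(w, mean)) + w_f * rf
var = sum((wi * s) ** 2 for wi, s in zip(w, sigma))

print("H =", round(H, 4))
print("w =", [round(wi, 4) for wi in w], " w_f =", round(w_f, 4))
print("E[rp] =", round(exp_ret, 4), " Var =", round(var, 4), " sigma =", round(math.sqrt(var), 4))
```

Note that the risky weights add up to more than 1 here, so the weight in the risk-free asset is negative: the investor borrows at rf to reach the required return.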

4.5.2 Frontier equation

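To plot the mean-variance frontier, we can again compute $w^*$ and the respective $\operatorname{Var}[r_p^*]$ for different values of µ. Alternatively, we can compute an explicit expression for $\operatorname{Var}[r_p^*]$:
$$\begin{aligned} \operatorname{Var}[r_p^*] \equiv \operatorname{Var}(w^{*\prime} r) &= w^{*\prime} V w^* \\ &= \left(\frac{\mu - r_f}{H}\right)^2 \left[V^{-1}(\bar r - r_f \mathbf{1})\right]' V \left[V^{-1}(\bar r - r_f \mathbf{1})\right] \\ &= \left(\frac{\mu - r_f}{H}\right)^2 \underbrace{(\bar r - r_f \mathbf{1})' (V^{-1})' (\bar r - r_f \mathbf{1})}_{=H} \end{aligned}$$

Note that V is symmetric, thus $(V^{-1})' = (V')^{-1} = V^{-1}$. Finally, the mean-variance frontier with a risk-free asset is:
$$\operatorname{Var}[r_p^*] = \frac{(\mu - r_f)^2}{H} \tag{4.12}$$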
This draws two straight lines in (σp , r̄p )-space (an exercise will ask you to check this with real data). The one that goes through rf and the tangency portfolio (ie, the set of portfolios with E[rp ] > rf ) is the efficient frontier (see footnote 8):
$$\mu = r_f + \sigma_p \sqrt{H} \tag{4.13}$$
8
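Equation (4.12) implies
$$\sigma_p = \frac{\mu - r_f}{\sqrt{H}} \;\text{ or }\; -\sigma_p = \frac{\mu - r_f}{\sqrt{H}} \quad\Rightarrow\quad \mu = r_f + \sigma_p\sqrt{H} \;\text{ or }\; \mu = r_f - \sigma_p\sqrt{H}$$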

Example 4.5.2. Check that we get the same Var[rp∗ ] for µ = 0.14

4.5.3 Tangency portfolio

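We can compute the precise coordinates of the tangency portfolio T by noting that it is the only frontier portfolio composed only of risky assets, i.e. $\mathbf{1}'w_T^* = 1$. We can use (4.11) to find the corresponding expected return ($\mu_T$):
$$\mathbf{1}'w_T^* = 1 \;\Rightarrow\; \frac{\mu_T - r_f}{H}\, \mathbf{1}'V^{-1}(\bar r - r_f \mathbf{1}) = 1 \;\Rightarrow\; \mu_T = \frac{H}{A - Cr_f} + r_f$$
where the last step uses $\mathbf{1}'V^{-1}(\bar r - r_f \mathbf{1}) = A - Cr_f$.

Plugging back into (4.11) we obtain an explicit expression for the weights in the tangency portfolio:
$$w_T^* = \frac{V^{-1}(\bar r - r_f \mathbf{1})}{A - Cr_f}$$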

Example 4.5.3. Continuing the previous example, check that
$$w_T^* = \ldots = \begin{pmatrix} 0.5399 \\ 0.4601 \end{pmatrix}$$
Thus,
$$E[r_T] = 0.1270, \qquad \sigma_T = 0.1634$$

Example 4.5.4. Two-fund separation: Find the linear combination of T and rf that will give E[rp] = 0.14. Check that the weights in the two stocks are equal to the ones obtained above using (4.11).
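
The last two examples can be worked through numerically as follows. This is a sketch under the same zero-correlation assumption as before; the variable `x` (the share of wealth placed in T) and the other names are hypothetical.

```python
import math

mean = [0.15, 0.10]
sigma = [0.25, 0.20]
rf, mu = 0.04, 0.14

v_inv = [1.0 / s ** 2 for s in sigma]        # diagonal of V^{-1} (zero correlation)
excess = [m - rf for m in mean]
A = sum(vi * m for vi, m in zip(v_inv, mean))
C = sum(v_inv)
H = sum(vi * e * e for vi, e in zip(v_inv, excess))

# Tangency portfolio: weights, expected return, standard deviation.
denom = A - C * rf
w_T = [vi * e / denom for vi, e in zip(v_inv, excess)]
mu_T = rf + H / denom
sigma_T = math.sqrt(sum((wi * s) ** 2 for wi, s in zip(w_T, sigma)))

# Two-fund separation: put a share x in T and (1 - x) in the risk-free asset
# so that x * mu_T + (1 - x) * rf = mu.
x = (mu - rf) / (mu_T - rf)
w_combo = [x * wi for wi in w_T]

# Direct solution from equation (4.11), for comparison.
w_direct = [(mu - rf) / H * vi * e for vi, e in zip(v_inv, excess)]

print("w_T =", [round(wi, 4) for wi in w_T], " mu_T =", round(mu_T, 4), " sigma_T =", round(sigma_T, 4))
print("x =", round(x, 4))
print("combo :", [round(wi, 4) for wi in w_combo])
print("direct:", [round(wi, 4) for wi in w_direct])
```

The share x comes out larger than 1, which is consistent with the negative risk-free weight found in Example 4.5.1.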

We are interested in the line with positive slope, $\mu = r_f + \sigma_p\sqrt{H}$, which under “normal” circumstances will be the tangent line. More precisely, the tangency portfolio is located on the upper limb of the hyperbola if $r_f < E[r_{mvp}] = A/C$. If the reverse is true, the tangency portfolio is located on the lower limb. Further, if $r_f = A/C$, there is no finite point of tangency. However, note that from theorem 3.1.1, the equilibrium case under the CAPM model (section 5) must be $r_f < E[r_{mvp}]$ (otherwise, there would be no demand for the risky assets). Hence, in equilibrium the frontier is given by $\mu = r_f + \sigma_p\sqrt{H}$. See Huang and Litzenberger (1988) or Ingersoll (1987) for details.

4.6 Optimal portfolio

The particular portfolio on the efficient frontier that the investor picks depends on his level of risk aversion. Given that the investor has mean-variance preferences, he chooses his optimal portfolio weights by
$$\text{maximize} \quad E[r_p] - \frac{g}{2}\operatorname{Var}[r_p]$$
where g is a constant parameter and rp denotes the return on the portfolio.

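Assuming that there are N risky assets plus one risk-free asset, the problem in matrix notation is
$$\begin{aligned} \underset{w}{\text{maximize}} \quad & w'\bar r + w_f r_f - \frac{g}{2}\, w'Vw \\ \text{s.t.} \quad & 1 = w'\mathbf{1} + w_f \end{aligned}$$
or
$$\underset{w}{\text{maximize}} \quad w'\bar r + (1 - w'\mathbf{1}) r_f - \frac{g}{2}\, w'Vw$$
The foc is
$$\bar r - r_f \mathbf{1} - gVw = 0$$
which implies the solution
$$w^* = \frac{1}{g}\, V^{-1}(\bar r - r_f \mathbf{1}) \tag{4.14}$$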

Example 4.6.1. Assume that g = 5, rf = 4%, and that there are only 2
risky assets with E[r1 ] = 15%, σ1 = 25%, E[r2 ] = 10%, σ2 = 20%, and
zero correlation. Compute the exact expected return and standard-deviation of
the optimal portfolio. Hint: see the formula sheet for an easy way to invert a
diagonal matrix.
Solution: Using the formula above,
$$w^* = \ldots = \begin{pmatrix} 0.352 \\ 0.300 \end{pmatrix}$$
and $w_f = 0.348$. Thus,
$$E[r_p] = w'\bar r + (1 - w'\mathbf{1}) r_f = 9.67\%$$
$$\operatorname{Var}[r_p] = w'Vw = 0.0113 \;\Rightarrow\; \sigma_p = 10.65\%$$
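
The solution can be reproduced with the short sketch below, which applies (4.14) under the zero-correlation assumption of the example (diagonal V). Variable names are illustrative.

```python
import math

g, rf = 5.0, 0.04
mean = [0.15, 0.10]
sigma = [0.25, 0.20]

v_inv = [1.0 / s ** 2 for s in sigma]         # diagonal of V^{-1} (zero correlation)
excess = [m - rf for m in mean]

w = [vi * e / g for vi, e in zip(v_inv, excess)]   # equation (4.14)
w_f = 1.0 - sum(w)                                 # weight in the risk-free asset

exp_ret = sum(wi * m for wi, m in zip(w, mean)) + w_f * rf
std = math.sqrt(sum((wi * s) ** 2 for wi, s in zip(w, sigma)))

print("w =", [round(wi, 4) for wi in w], " w_f =", round(w_f, 4))
print("E[rp] =", round(exp_ret, 4), " sigma_p =", round(std, 4))
```

A larger g (more risk aversion) scales all risky weights down proportionally and shifts wealth into the risk-free asset.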

We can verify that this portfolio is efficient, ie, that it actually lies on the mean-variance frontier. Write
$$E[r_p] = w'\bar r + (1 - w'\mathbf{1}) r_f = (\bar r - r_f \mathbf{1})'w + r_f$$

For this investor’s optimal portfolio, using $w^*$ from (4.14),
$$E[r_p^*] = \frac{1}{g}\,(\bar r - r_f \mathbf{1})' V^{-1} (\bar r - r_f \mathbf{1}) + r_f = \frac{1}{g} H + r_f$$
where we also used $H := (\bar r - r_f \mathbf{1})' V^{-1} (\bar r - r_f \mathbf{1})$.

There are two alternatives now:

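1. Plug $E[r_p^*]$ in the formula for frontier portfolios (4.11) and show that the portfolio is the same one as the investor chose:
$$\begin{aligned} w &= \frac{\mu - r_f}{H}\, V^{-1}(\bar r - r_f \mathbf{1}) \\ &= \frac{\frac{1}{g} H + r_f - r_f}{H}\, V^{-1}(\bar r - r_f \mathbf{1}) \\ &= \frac{1}{g}\, V^{-1}(\bar r - r_f \mathbf{1}) \end{aligned}$$
which is indeed the same as (4.14).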

2. Alternatively, we can show that the investor’s portfolio verifies the equation for the efficient frontier (4.13). Start by computing the portfolio variance, using $w^*$ from (4.14),
$$\operatorname{Var}[r_p^*] = (w^*)' V w^* = \frac{1}{g^2} H$$
Then, plug this variance into (4.13):
$$\begin{aligned} \mu &= r_f + \sigma_p \sqrt{H} \\ &= r_f + \sqrt{\frac{1}{g^2} H}\, \sqrt{H} \\ &= r_f + \frac{1}{g} H \end{aligned}$$
which is indeed the expected return on the investor’s portfolio. Note that this second alternative is a bit more complete, since it explicitly shows that the investor’s portfolio lies on the upper part of the mean-variance frontier, ie, that it is efficient.
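
The second alternative can also be checked numerically for several values of g. The sketch below reuses the two-stock data of Example 4.6.1 (zero correlation, so V is diagonal); the helper name `optimal_portfolio` and the grid of g values are assumptions made for the illustration.

```python
import math

mean = [0.15, 0.10]
sigma = [0.25, 0.20]
rf = 0.04

v_inv = [1.0 / s ** 2 for s in sigma]        # diagonal of V^{-1} (zero correlation)
excess = [m - rf for m in mean]
H = sum(vi * e * e for vi, e in zip(v_inv, excess))

def optimal_portfolio(g):
    """Expected return and standard deviation of the portfolio in (4.14)."""
    w = [vi * e / g for vi, e in zip(v_inv, excess)]
    exp_ret = rf + sum(wi * e for wi, e in zip(w, excess))
    std = math.sqrt(sum((wi * s) ** 2 for wi, s in zip(w, sigma)))
    return exp_ret, std

gaps = []
for g in (2.0, 5.0, 8.0):
    exp_ret, std = optimal_portfolio(g)
    on_frontier = rf + std * math.sqrt(H)    # equation (4.13) evaluated at sigma_p
    gaps.append(abs(exp_ret - on_frontier))
    print(f"g = {g}: E[rp] = {exp_ret:.6f}, frontier value = {on_frontier:.6f}")
```

For every g the two numbers should coincide up to rounding, so each investor ends up on the same straight line, only at a different point.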

4.7 Additional properties of frontier portfolios

We now derive a relation that will be used to prove the CAPM in the next chapter. We
want to do the derivation right now to stress that the part done here is just math, not
economics. In other words, it does not depend on any model of market equilibrium.

Define:
p ≡ a frontier portfolio (still assume there is a risk-free asset)
a ≡ any portfolio, not necessarily on the frontier (possibly a single asset), but without the risk-free asset.

The covariance between the two portfolios is given by (exercise 30 at the end shows this):
$$\operatorname{Cov}(r_a, r_p) = w_a' V w_p$$
Since p is a frontier portfolio, $w_p$ is given by (4.11). Hence,
$$\begin{aligned} \operatorname{Cov}(r_a, r_p) &= w_a' V \left[\frac{\mu - r_f}{H}\, V^{-1}(\bar r - r_f \mathbf{1})\right] \\ &= \frac{\mu - r_f}{H}\, w_a'(\bar r - r_f \mathbf{1}) \\ &= \frac{E[r_p] - r_f}{H}\, (E[r_a] - r_f) \end{aligned}$$
since $\mu = E[r_p]$, $w_a'\bar r = E[r_a]$, and $w_a'\mathbf{1} = 1$. Solving for $E[r_a] - r_f$ and using (4.12) for H,
$$\begin{aligned} E[r_a] - r_f &= \frac{H \operatorname{Cov}(r_a, r_p)}{E[r_p] - r_f} \\ &= \frac{\operatorname{Cov}(r_a, r_p)}{\operatorname{Var}[r_p]}\, (E[r_p] - r_f) \end{aligned} \tag{4.15}$$

Note that all we did so far was to characterize the relation between a frontier portfolio (p) and any other asset (a). Since p can be any frontier portfolio, the previous relation applies in particular to the Tangency portfolio:
$$E[r_a] - r_f = \frac{\operatorname{Cov}(r_a, r_T)}{\operatorname{Var}[r_T]}\, (E[r_T] - r_f) \tag{4.16}$$
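
Relation (4.15) can be checked numerically with the two-stock data used in the earlier examples. In the sketch below, p is the frontier portfolio with µ = 0.14 from Example 4.5.1 and a is stock 1 held alone; both choices, like the variable names, are illustrative assumptions. V is diagonal because the example assumes zero correlation.

```python
mean = [0.15, 0.10]
sigma = [0.25, 0.20]
rf, mu_p = 0.04, 0.14

var_diag = [s ** 2 for s in sigma]           # diagonal of V (zero correlation)
excess = [m - rf for m in mean]
H = sum(e * e / v for e, v in zip(excess, var_diag))

# Frontier portfolio p: risky weights from equation (4.11).
w_p = [(mu_p - rf) / H * e / v for e, v in zip(excess, var_diag)]

# Portfolio a: stock 1 alone (weights add up to 1, no risk-free asset).
w_a = [1.0, 0.0]

cov_ap = sum(wa * v * wp for wa, v, wp in zip(w_a, var_diag, w_p))   # w_a' V w_p
var_p = sum(wp * v * wp for v, wp in zip(var_diag, w_p))             # w_p' V w_p

lhs = sum(wa * m for wa, m in zip(w_a, mean)) - rf   # E[ra] - rf
rhs = cov_ap / var_p * (mu_p - rf)                   # right-hand side of (4.15)

print("E[ra] - rf =", round(lhs, 6))
print("Cov/Var * (E[rp] - rf) =", round(rhs, 6))
```

The equality holds for any choice of `w_a` whose elements add up to 1, which is the sense in which the relation is "just math".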

4.8 Exercises
Ex. 26 — Consider the quadratic utility function $U(W) = a + bW + cW^2$, where W is the terminal wealth and a, b, c are constants. Assume that $W = W_0(1 + r_p)$, where W0 is the initial wealth and the rate of return on the portfolio is normally distributed, $r_p \sim N(\mu, \sigma^2)$. (Note that the normality assumption is a bit of an overkill; we only need quadratic utility.) Show that the investor only cares about the first two moments of returns, i.e., write E[U(W)] as an explicit function of µ and σ (and the constants a, b, c, W0).

Ex. 27 — Normal returns for PSI20. Download the file “PSI20.xls” from my website.
It has daily and monthly closing prices for the Portuguese Stock Index 20.
Note: If you do this in Matlab, you may want to use my DescStats.m function (also
posted on the website).
1. Compute daily continuously-compounded returns. Compute the mean, variance,
skewness, and kurtosis of the distribution. Does it look normal?
2. Do the same for monthly returns.

Ex. 28 — Mean-Variance Frontier. Assume there are N risky assets and that there
is no risk-free asset. Formulate the problem of finding the minimum-variance portfolio
for a given level of return. State in words what the objective and the restrictions mean.
Solve for the optimal portfolio weights. Note: The goal of this exercise is for you to go
through all the intermediate calculations in detail.

Ex. 29 — Solve problem (4.10), ie, show the intermediate steps that lead to (4.11).

Ex. 30 — Let rp and rq denote the returns on two portfolios. By definition, the covariance between these returns is given by $\operatorname{Cov}(r_p, r_q) := E[(r_p - E[r_p])(r_q - E[r_q])]$. Starting from this definition, show that the covariance can also be computed as $w_p' V w_q$, where $w_i$ is the N by 1 vector of weights in portfolio i and $V := \operatorname{Cov}(r) = E[(r - E[r])(r - E[r])']$ is the N by N covariance matrix of individual stock returns. Hint: write $r_i = w_i' r$, i = p, q.

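Ex. 31 — An investor has mean-variance preferences and thus chooses his optimal portfolio weights (w, an N by 1 vector) by solving:
$$\begin{aligned} \underset{w}{\text{maximize}} \quad & E[r_p] - \frac{g}{2}\operatorname{Var}[r_p] \\ \text{s.t.} \quad & w'\mathbf{1} = 1 \end{aligned}$$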

where rp is the return on the portfolio, g is a constant parameter, and 1 a vector of ones.
There is no risk-free asset. To simplify the notation, denote by
V := Cov(r), the covariance matrix, and
r̄ := E[r], the vector of expected returns,
where r is the random vector of returns on the N risky assets.
1. Solve for the optimal w∗ .
Hints: First write E[rp] and Var[rp] in matrix notation, ie, using w, r̄, and V. To simplify the notation, use the scalars A, B, and C (as defined in section 4.4) along the calculations whenever possible.

2. The rest of the exercise will help you to show that the portfolio just found is
mean-variance efficient. Start by computing its expected return.
3. Now look at the solution for an efficient portfolio (equation 4.9):
$$w^* = \frac{C\mu - A}{D}\, V^{-1}\bar r + \frac{B - A\mu}{D}\, V^{-1}\mathbf{1}$$
Plug in the expected return found in part (2.) and verify that the resulting w∗ is identical to the one found in part (1.). (This shows that the solution to the initial problem is indeed mean-variance efficient.)

Ex. 32 — There are N risky assets and 1 risk-free asset. Consider the standard portfolio choice problem, $\text{maximize}_w\, E[U(Y_1)]$, where the terminal wealth is $Y_1 = Y_0(1 + r_p)$. All risky assets follow a normal distribution and thus the return on the portfolio is also normally distributed, $r_p \sim N(E[r_p], \operatorname{Var}[r_p])$. The utility function is $U(Y) = -\exp(-bY)$, where b is a constant parameter. Compute the optimal weights in the risky assets, w∗ (an N by 1 vector).
Hint: Start by writing E[rp] and Var[rp] in matrix notation. Then, write the distribution of Y1. Then, use the moment generating function to simplify the objective function.

Ex. 33 — Frontier with Industry Portfolios. Download the file 10_Industry_Portfolios.xls from my website. It has monthly returns on 10 industry portfolios (from K. French’s website, http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/index.html).
1. Ignoring the risk-free asset, draw the frontier in mean-std space.
2. Now consider a risk-free rate of rf = 0.4% (this is the 1-month TBill rate at the end of the sample, as you can check on French’s website). Draw the efficient frontier (do it on the same figure as 1; you should get something like fig 6.6 in Danthine and Donaldson (2005)).
3. Compute the tangency portfolio (weights, expected return, standard deviation)
and plot it in the figure.
4. An investor has mean-variance preferences and thus chooses his optimal portfolio weights by
$$\text{maximize} \quad E[r_p] - \frac{g}{2}\operatorname{Var}[r_p]$$
where g is a constant parameter and rp denotes the return on the portfolio. The solution is
$$w^* = \frac{1}{g}\, V^{-1}(\bar r - r_f \mathbf{1})$$
Assume that g = 8 and that the investor has $1 Million to invest. Compute
the amount of money that the investor should put in each of the 10 industry
portfolios and in the risk-free asset. Plot the optimal portfolio in the same figure
as the previous questions.
5. Find the value of the parameter g that would make the investor optimally choose
the Tangency portfolio.

Ex. 34 — No short selling. Use the same data as in the previous exercise. Consider the
same investor as in question 4, ie, mean-variance preferences with g = 8. Assume that
the investor cannot short sell any of the stock portfolios. Compute the optimal amount
of money that the investor should put in each of the 10 industry portfolios and in the
risk-free asset. Compute the expected return and standard-deviation of the optimal
portfolio. Plot the new optimal portfolio in the same figure as the other questions in the
previous exercise.
Hint: There is no closed form solution. Look for ways to solve the problem numerically.
Matlab and other software (like EXCEL) do this.
Chapter 5

Capital Asset Pricing Model

The CAPM states that the market portfolio is mean-variance efficient. For any asset,
$$E[r_j] = r_f + \beta_j (E[r_M] - r_f)$$

5.1 Introduction

Our goal is to understand why different assets have different average returns. The CAPM
proposes a very precise answer to this question.

The value of any asset is the present value, or discounted value, of its future cash flows. The CAPM gives us a formula for the discount rate. Hence, it is used every day by corporations and investors to price investment projects, stocks, mutual funds, etc.

The CAPM is an equilibrium model that results directly from assuming that all
investors are mean-variance optimizers. It was developed simultaneously in three papers
by Sharpe in 1964, Lintner in 1965, and Mossin in 1966.

5.2 Derivation

We make the following assumptions:

A1: All investors have mean-variance preferences.

A2: There is a risk-free asset with return rf .


A3: Investors have homogeneous expectations. This means that everybody has the
same beliefs about the return distribution of every asset.

These assumptions immediately imply the following results:

1. The efficient frontier (namely, the straight line through rf and T ) is the same for
every investor.

2. Two fund separation: every investor allocates his wealth between two portfolios:
the risk-free asset and the Tangency portfolio.

3. In equilibrium, all risky assets must belong to T.

To see this, suppose that IBM is not in T ($w^T_{IBM} = 0$). Then, there would be no demand for this stock ($w^i_{IBM} = w^T_{IBM} = 0$, for every investor i). We would thus have Demand ≠ Supply, which is not an equilibrium. Therefore, in equilibrium, $w_j^T > 0$, ∀ asset j.

4. Furthermore, for every asset, the weight in T must be the same as in the whole market:
$$w_j^T = \frac{\text{Market Cap}_j}{\sum_j \text{Market Cap}_j} =: w_j^M, \quad \forall \text{ asset } j$$
If we all put 2% of our (risky) money into IBM stock, then IBM will have 2% of all money invested in the stock market, meaning that the market capitalization of IBM will be worth 2% of the whole market capitalization (see footnote 1). In other words, $w^T_{IBM} = w^{Market}_{IBM}$.

5. Hence, the Market portfolio is the Tangency portfolio, M = T . This is the eco-
nomic content of the CAPM. In one sentence, the CAPM states that the Market
portfolio is mean-variance efficient.

5.3 Important results

Once we have the economic result that M is on the efficient frontier, we can use the statistical relations derived in section 4.4, replacing T with M.
1
Different investors put different amounts of money at risk, ie, in the tangency portfolio. But
from these amounts, each investor allocates the same 2% to IBM.

5.3.1 Capital Market Line

When we use M instead of T, the efficient frontier is called Capital Market Line:

[Figure: the Capital Market Line, plotted with E[r] on the vertical axis and σ on the horizontal axis.]

All individual optimal portfolios plot along the CML. For an efficient portfolio p (ie, p ∈ CML),
$$\text{CML}: \quad E[r_p] = r_f + \underbrace{\frac{E[r_M] - r_f}{\sigma_M} \times \underbrace{\sigma_p}_{\text{“quantity” of risk}}}_{\text{“reward” for risk}}$$

Recall that p is a combination of the risk-free and the market portfolio, thus σp = wM σM .
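
The CML can be used to translate a risk target into an allocation and an expected return. The sketch below is a minimal illustration; the function name `cml_allocation` and all input numbers are hypothetical and are not the data of any exercise in these notes.

```python
def cml_allocation(exp_market, sigma_market, rf, target_sigma):
    """Return (weight in the market portfolio, expected return) for a CML portfolio."""
    w_market = target_sigma / sigma_market          # from sigma_p = w_M * sigma_M
    exp_portfolio = rf + (exp_market - rf) / sigma_market * target_sigma
    return w_market, exp_portfolio

# Hypothetical inputs: E[rM] = 8%, sigma_M = 16%, rf = 2%, target sigma_p = 12%.
w_m, exp_p = cml_allocation(0.08, 0.16, 0.02, 0.12)
print("weight in market portfolio:", round(w_m, 4))
print("weight in risk-free asset :", round(1.0 - w_m, 4))
print("expected return           :", round(exp_p, 4))
```

The same expected return results from weighting the two assets directly, w_M E[rM] + (1 − w_M) rf, which is a useful consistency check.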

Application: exercise 35.

5.3.2 Security Market Line

Substituting M for T in (4.16), we have that for any asset j (not necessarily on the CML)
$$E[r_j] - r_f = \frac{\operatorname{Cov}(r_j, r_M)}{\operatorname{Var}[r_M]}\, (E[r_M] - r_f)$$
or, defining $\beta_j \equiv \frac{\operatorname{Cov}(r_j, r_M)}{\operatorname{Var}[r_M]}$,
$$\text{SML}: \quad E[r_j] = r_f + \beta_j (E[r_M] - r_f)$$

Note that this applies to every single asset or portfolio.



[Figure: the Security Market Line, plotted with E[r] on the vertical axis and β on the horizontal axis.]

The SML says that the risk premium on any asset, E[rj ] − rf , depends only on one
factor: the market. More precisely, it is a linear function of the relevant measure of risk,
βj . The slope of the line, E[rM ] − rf , is called the “market risk premium”. This risk
premium is the same for every asset.

Valuing Risky Cash Flows

The price of an asset is the present value of its future cash flows discounted at the
appropriate rate. The CAPM is commonly used to provide us with that discount rate.
This amounts to requiring that the asset give us an expected return equal to the SML
formula.
Formally, let p be the price of the asset and $\widetilde{CF}$ the random cash flow to be generated one period from now. The random return on this project is
$$\tilde r = \frac{\widetilde{CF}}{p} - 1$$
which has expectation
$$E[\tilde r] = \frac{E[\widetilde{CF}]}{p} - 1$$
Using the CAPM, $E[r] = r_f + \beta (E[r_M] - r_f)$, we get
$$p = \frac{E[\widetilde{CF}]}{1 + r_f + \beta (E[r_M] - r_f)}$$

In practice, this valuation method is extended informally to assets with cash flows
over multiple periods. Further, it is also applied to nontraded assets by using betas of
similar traded assets.

Example 5.3.1. A media company is considering going into the cell phone
business. By investing $100M today, it is expecting to receive $20M in 1 year,
$30M in 2 yrs, and $90M in 3 yrs. Telecom companies have an average beta
of 0.7. The risk-free rate is 3% and the average market risk premium is 6%.
Should the company expand its operations into the cell phone business?
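
One way to approach this example is sketched below. It rests on assumptions that go beyond what the example states: the average telecom beta of 0.7 is used as the beta of the project, the 6% is read as E[rM] − rf, and the one-period CAPM rate is applied informally to every year, as described above for multi-period cash flows. Amounts are in $ millions and the variable names are illustrative.

```python
rf, beta, market_premium = 0.03, 0.7, 0.06
discount_rate = rf + beta * market_premium        # SML required return

investment = 100.0                                # paid today
cash_flows = [20.0, 30.0, 90.0]                   # expected, received in years 1, 2, 3

present_value = sum(cf / (1.0 + discount_rate) ** t
                    for t, cf in enumerate(cash_flows, start=1))
npv = present_value - investment

print("discount rate :", round(discount_rate, 4))
print("present value :", round(present_value, 2))
print("NPV           :", round(npv, 2))
```

Under these assumptions the discount rate is 7.2% and the net present value is positive, which would favor the expansion.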

Application to stock pricing: exercise 36.

Economic interpretation of β

Suppose asset a has more risk than asset b, ie βa > βb . According to the SML, this will
lead to E[ra ] > E[rb ]. What is the economic intuition for this?

According to the CAPM, all investors hold the market portfolio. Hence, they are
happy when the market goes up, unhappy when it goes down. But recall that marginal
utility is decreasing. This means that the investor is really interested in additional
payoffs in bad times (low market returns) and less enthusiastic about additional payoffs
in good times (high market returns). Therefore, investors like assets with low covariance
with the market. βj := cov(rj , rM )/var(rM ) is precisely a (standardized) measure of
this covariance. Asset b has higher payoffs when the market is in relatively poor states,
making it more desirable. Hence, investors are willing to hold asset b at a lower expected
return. Equivalently, they will pay a higher price for b.

This intuition is extremely important — it is the core of asset pricing. We’ll come
back to it again (formally) in chapter 8.

5.4 Other remarks

Note the following remarks about the CAPM:

1. The CAPM is a model of partial equilibrium only. That is, important aspects of
the economy like production, consumption, etc, do not appear in the model. As a
result, the interest rate rf is exogenous.

2. The CAPM is a cross-sectional model, that is, it expresses a relation between the
returns on all assets at some point in time.

3. There is no time in the model. We can interpret it as a single-period model, though the exact length of the period is left unspecified (the asset return can be over a day or over a year).

4. Investors don’t put their money only in stocks. Thus, the true market portfolio
of the CAPM should include all assets in the economy. In a famous critique,
Roll (1977) argues that the true market portfolio is not observable. Moreover, the
return on a stock market index may not even be a good proxy for the return on the
aggregate wealth portfolio. According to Jagannathan and McGrattan (1995), “in
the United States, only one-third of nongovernmental tangible assets are owned by
the corporate sector, and only one-third of corporate assets are financed by equity.
Furthermore, intangible assets, like human capital, are not captured by stock
market indexes.” For example, the biggest asset for most families is their house;
also, most people get more income from labor than from their financial assets.
To summarize, the validity of the CAPM depends very much on the particular
proxy used for the market return. In practice, this means that if you estimate the
required return for Microsoft using two different proxies for the market (say, the
SP500 and the NASDAQ indices), you may get two significantly different numbers.

Nevertheless, the CAPM is still the center of equilibrium asset pricing. It helps us
understand the risk-return tradeoff by specifying exactly what is the risk factor that
matters — the market.

Its empirical validity is still being debated. Overall, it is a good model to describe av-
erage returns of different assets over long periods of time. See the survey in Jagannathan
and McGrattan (1995).

5.5 Exercises
Ex. 35 — CML. You expect the stock market to go up by 10% over the next year.
The standard deviation of the market return is 20%. You can buy 1 year government
bonds yielding 4%. If you have $100,000 to invest and you are willing to tolerate a risk
(standard deviation) of 15%, what is the best allocation of your money? How much
money do you expect to have one year from now?

Ex. 36 — Industry-type application of the CAPM. Suppose E[rM] = 10% and rf = 4%. You estimate stock a will pay a dividend of $2 one year from now. After that, you expect dividends to grow at 5% per year. You also estimate the beta of the stock to be βa = 0.9. What is the equilibrium price of the stock?
Note: recall that the present value of a stream of dividends growing at rate g is $P_0 = \frac{D_1}{r - g}$, where r is the discount rate. Thus, you just need to use the CAPM to estimate the required discount rate for stock a.

Ex. 37 — Portfolio β. Suppose you can only buy two securities: asset a, with βa = 1.2;
and the risk-free asset. Your goal is to have a portfolio with a beta of 0.9.
1. Show that the beta of a portfolio equals the weighted average of individual security betas:
$$\beta_p = \sum_{i=1}^{N} w_i \beta_i$$
where wi are the portfolio weights.
Hint: Start from the definition $\beta_p := \operatorname{Cov}(r_p, r_M) / \operatorname{Var}(r_M)$, with $r_p = \sum_{i=1}^{N} w_i r_i$. Recall the following properties of covariance (uppercase letters are random variables; lowercase are constants):
$$\operatorname{Cov}(aX, bY) = ab \operatorname{Cov}(X, Y)$$
$$\operatorname{Cov}(aX, bY + cZ) = \operatorname{Cov}(aX, bY) + \operatorname{Cov}(aX, cZ)$$

2. Compute the portfolio weights that will achieve your goal.

Ex. 38 — The Security Market Line gives the expected return for any portfolio: r̄p =
rf + βp (r̄M − rf ). Under what condition will the Capital Market Line give the same r̄p ?

Ex. 39 — The equation for the SML is

E[rj ] = rf + βj ( E[rM ] − rf )

Explain what part of this equation is “just mathematics” (or mean-variance optimiza-
tion) and what part is an economic model of market equilibrium.

Ex. 40 — Make sure you understand the economic interpretation of β. Write in your
own words why βa < βb ⇒ E[ra ] < E[rb ].
Chapter 6

Arbitrage Pricing Theory and Factor Models

If there are no arbitrage opportunities,
$$E[r_j] = r_f + \sum_{k=1}^{K} \beta_{jk} (E[F_k] - r_f), \quad \forall j$$

Pricing by (no) arbitrage is based on the assumption that there are no arbitrage
opportunities in the market, that is, it is not possible to make money at zero risk and
zero cost. The goal is to obtain pricing relations with as few assumptions as possible
(namely we will not have to assume any utility function).

Arbitrage techniques allow us to relate the prices of a set of assets to the prices of
another set of basic fundamental assets (eg, the price of a stock option is a function of
the price of the underlying stock). In particular, the Arbitrage Pricing Theory (APT),
developed by Ross (1976), explains the returns on all stocks as a function of a (small)
set of fundamental “risk factors”.

6.1 Factor Structure

APT starts from a statistical characterization of realized returns. It is an empirical fact that stock returns move together to some extent. There is comovement at the market level, at the industry level, etc. This suggests that there are just a few “forces” that drive stock returns.


We assume that the return on stock j is generated by K random variables called risk factors:
$$r_j = a_j + \sum_{k=1}^{K} \beta_{jk} F_k + \varepsilon_j, \qquad j = 1, 2, \ldots, n \tag{6.1}$$

Note that the factors are common to all stocks, ie, they are pervasive risk factors. The
parameters βjk , called factor loadings, measure the sensitivity of security j to factor
k. The random variable εj is the residual, ie, the part of return not explained by the
common factors ( Var[εj ] is idiosyncratic risk).

The point is to have the number of factors much smaller than the number of stocks
(K < n). Most empirical applications find that the number of factors ranges from 1 to
5. Hence, the theory is not vacuous.

The goal of the APT is to derive a relation about expected returns, E[r]. In other
words, to see how expected returns are related to the pervasive risk factors. The intuition
is that idiosyncratic risk can be diversified away and thus should not be priced; investors
only care about the covariance with the pervasive risk factors.

Assumptions:

A1: E[εj ] = 0, ∀j

A2: Cov(Fk , εj ) = 0, ∀k, j

A3: Cov(εj , εi ) = 0, ∀j ≠ i

Assumptions 1 and 2 are relatively innocuous (they can even be imposed by construction
if we estimate the parameters aj , βj1 , . . . , βjK in (6.1) by OLS regression). A3 is the
critical assumption that gives economic content to this model. It says that we only
need K factors to explain all commonality in returns. Whatever is not explained by the
factors (εj ) is specific to asset j and has nothing in common with the residuals from
other assets. In other words, we were able to identify all the pervasive risk factors.

A meaningful factor structure must therefore have two properties:

(1) movements in the factors should explain a substantial fraction of the movement of the
returns on the priced assets;
(2) the unexplained parts of the returns on the priced assets should be uncorrelated
across these priced assets.
Remark. Equation (6.1) is sometimes stated as deviations from means. Take expectations
on both sides to get E[r_j] = a_j + \sum_{k=1}^{K} \beta_{jk} E[F_k]. Plug the resulting value for a_j
into (6.1) to get

    r_j = E[r_j] + \sum_{k=1}^{K} \beta_{jk} \hat{F}_k + \varepsilon_j

with F̂k ≡ Fk − E[Fk ] and thus E[F̂k ] = 0. Stock returns deviate from their means as
a result of unexpected realizations of risk factors. Note that this is just a mathematical
manipulation of (6.1); it is still not saying anything about E[rj ].

6.2 Example of simple factor structure: Market Model

6.2.1 Return generating process

One important example of a simple factor structure is the Market Model. This model
states that there is just one factor, the market. Formally,

rj = aj + βj rM + εj , ∀j (6.2)

If we estimate this regression by OLS we get the CAPM beta, βj = Cov(rj , rM )/ Var(rM ),
and we guarantee A1–2 are true. Again A3 is the critical assumption. In this context,
it says that the market return is enough to capture all the common movement between
stock returns.1

6.2.2 Application: the Covariance matrix is simplified

The factor structure is really a restriction on the covariance matrix of returns. The
computation of real-life covariance matrices is a challenging problem. Note that a covariance
matrix has (N^2 − N)/2 different covariances plus N variances. With say N=100 stocks
in your portfolio, you need to estimate 5,050 different parameters. If the market model
is true, the covariance matrix is much simpler. Using A1–3, we can show

    diagonal:      \sigma_j^2 = \beta_j^2 \sigma_M^2 + \sigma_{\varepsilon_j}^2      (6.3)
    off diagonal:  \sigma_{ij} = \beta_i \beta_j \sigma_M^2                          (6.4)

Hence, we only need to compute N betas, plus N+1 variances. For N=100, we only need
to estimate 201 parameters to get the full covariance matrix.2
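
As a rough illustration of (6.3) and (6.4), the following sketch builds the covariance matrix implied by the market model and counts parameters for N = 100. The function name `market_model_cov` and the three-stock inputs are hypothetical, chosen only for illustration; they are not taken from any data set.

```python
import numpy as np

def market_model_cov(betas, resid_vars, market_var):
    """Covariance matrix implied by the market model, equations (6.3)-(6.4)."""
    betas = np.asarray(betas, dtype=float)
    resid_vars = np.asarray(resid_vars, dtype=float)
    # Every entry gets beta_i * beta_j * sigma_M^2; the diagonal also gets
    # the residual variance of each stock (this is where A3 is used).
    return market_var * np.outer(betas, betas) + np.diag(resid_vars)

# Hypothetical inputs for three stocks (illustrative only).
betas = [0.9, 1.2, 1.0]
resid_vars = [0.05**2, 0.10**2, 0.08**2]
market_var = 0.20**2

cov = market_model_cov(betas, resid_vars, market_var)
print(cov)

# Parameter counts for N = 100 stocks.
n = 100
full_params = n * (n - 1) // 2 + n   # unrestricted: covariances + variances
model_params = n + (n + 1)           # N betas + N residual variances + market variance
print(full_params, model_params)
```

The design choice here is to store only the betas and residual variances and to rebuild the full matrix on demand, which mirrors the parameter saving discussed in the text.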
1 If this were true, CAPM would be the end of asset pricing. It isn’t.
2 This simplification motivated the use of the market model when computer power was scarce.
Nowadays, we no longer have to accept the extreme simplification (A3) of this model and better
models are being developed. Nonetheless, obtaining a good estimate of a large covariance matrix
is still hard and a lot of research is still going on.

More importantly, imposing a factor structure may help to get more meaningful esti-
mates of the covariances. For example, suppose that we want to estimate the covariance
between stock A and B. Assume that during the early part of the sample period there
were rumors that A was going to acquire B, which led to a decrease in the price of
A and an increase in B. Later, the rumors were strongly denied by the CEOs of both
companies, which led to a reversal in prices (A back up, B back down). If we use a simple
historical estimate, we are going to get a strong negative correlation between A and B.
And if we then use this estimate in a portfolio allocation rule, we are probably going to
get big allocations to A and B in order to reduce the total risk of the portfolio. However,
this historical correlation is spurious, not likely to be a good predictor of what is going
to happen to the two companies in the future. If instead we estimate the correlations
with the market model, we are forcing this specific event, unrelated to the market, to go
to the residuals of (6.2). If the estimated betas are small, we will then forecast that the
correlation of these two stocks is low. This is likely to lead to a better allocation rule
going forward.

6.2.3 Implication: Diversification eliminates Specific risk

From (6.2) and using assumption A2 (Cov(rM , εj ) = 0), the total risk of a single stock
is

    \underbrace{\sigma_j^2}_{\text{total risk}} = \underbrace{\beta_j^2 \sigma_M^2}_{\text{systematic risk}} + \underbrace{\sigma_{\varepsilon_j}^2}_{\text{nonsystematic risk}}

However, the nonsystematic risk (also called specific or unique or idiosyncratic risk,
or diversifiable risk) can be easily diversified away by holding a large portfolio. That
is, a well-diversified portfolio only has systematic risk (also called market risk or non-
diversifiable risk).

Proof. Consider a portfolio of N securities. Its return is


    r_p = \sum_{j=1}^{N} w_j r_j = \sum_{j=1}^{N} w_j a_j + \sum_{j=1}^{N} w_j \beta_j r_M + \sum_{j=1}^{N} w_j \varepsilon_j

The variance is
   
    Var[r_p] = Var\left[ \sum_{j=1}^{N} w_j \beta_j r_M \right] + Var\left[ \sum_{j=1}^{N} w_j \varepsilon_j \right]

where we used A2 to remove the Cov(rM , εj ). Using A3 to remove Cov(εi , εj ), we get


    Var[r_p] = Var(\beta_p r_M) + \sum_{j=1}^{N} w_j^2 Var(\varepsilon_j)

where we also used \beta_p = \sum_{j=1}^{N} w_j \beta_j.

We now show that the second term goes to zero in a large, well-diversified portfolio.
Set wj = 1/N so that the portfolio is well-diversified. Assume that the residual variance
is the same for all assets: Var[εj ] = v, ∀j.3 We get
    \sum_{j=1}^{N} w_j^2 Var(\varepsilon_j) = \frac{1}{N^2} \sum_{j=1}^{N} v = \frac{v}{N} \xrightarrow{N \to \infty} 0

Hence, nonsystematic risk can be eliminated through diversification. In standard notation,

    \sigma_p^2 \xrightarrow{N \to \infty} \beta_p^2 \sigma_M^2

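In graphical terms,

[Figure: portfolio variance σ² (vertical axis) against the number of securities N (horizontal axis). Specific risk (σ²_{εp}) shrinks as N grows; market risk (β_p² σ_M²) remains.]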

Hence, we should expect that E[rp ] depends only on βp .
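
The v/N result in the proof can be checked numerically. The sketch below assumes, as in the proof, equal weights and a common residual variance v; the value of v is hypothetical.

```python
def specific_variance(n_assets, resid_var):
    """Residual variance of an equal-weighted portfolio: sum of (1/N)^2 * v, i.e. v / N."""
    weight = 1.0 / n_assets
    return sum(weight**2 * resid_var for _ in range(n_assets))

v = 0.10**2  # hypothetical residual variance, identical across stocks

specific = {n: specific_variance(n, v) for n in (1, 10, 100, 1000)}
for n, s2 in specific.items():
    print(f"N = {n:5d}   specific variance = {s2:.6f}")
```

The systematic term β_p² σ_M² does not depend on N in this setup, so only the printed specific variance falls as the portfolio grows.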

6.2.4 Another interpretation of the CAPM β

In the CAPM, every investor holds a well-diversified portfolio, namely the Market portfolio
(ε_M ≡ 0, σ²_{ε_M} = 0). Hence, there is no reward for the nonsystematic risk of a
security. Only the systematic risk of each stock is rewarded.4 β is the measure of market
risk (high β implies high systematic risk). The higher the β, the higher the expected
return (through the SML). Hence, β can be interpreted as the measure of the risk that
matters, i.e., of the risk that carries a risk premium in the CAPM: market risk.

3 The argument also works if we assume only that the variances are bounded, Var[εj ] ≤ c < ∞, ∀j.
4 To find how much reward we can get for the systematic part, we find the return of an efficient
portfolio with risk only σj = βj σM . That is, we plug this quantity of risk into the CML equation.
Note that this produces the SML. Hence, the SML only rewards systematic risk.

Example 6.2.1. This example shows that nonsystematic risk is not rewarded.
Assume that returns are generated by the market model. In particular, the returns on stocks
A and B are generated by:

    r_a = 0.004 + 0.9 r_M + ε_a,     σ_{ε_a} = 0.05
    r_b = −0.008 + 1.2 r_M + ε_b,    σ_{ε_b} = 0.1

The return on the market has a mean of 10% and standard deviation of 20%.
The risk-free rate is 4%. The CAPM correctly explains expected returns in this
economy.
Part 1 - An under-diversified portfolio.
Suppose we form an equal-weighted portfolio p of these two stocks (wa = wb =
0.5). Note that this portfolio is not well diversified since it only has two stocks.
The portfolio beta is

βp = . . .

and using the CAPM we get the expected return for this portfolio

SML: E[rp ] = . . .

Part 2 - Risk decomposition.


Using the assumptions of the market model, the systematic risk is

    \beta_p^2 \sigma_M^2 = ...

and the nonsystematic risk is

    \sigma_{\varepsilon_p}^2 = \sum_{j=1}^{2} w_j^2 Var(\varepsilon_j) = ...

As expected, the portfolio has unique risk because it is not well diversified.
Hence, the total risk of the portfolio is

    \sigma_p^2 = \beta_p^2 \sigma_M^2 + \sigma_{\varepsilon_p}^2 = ...
    ⇒ \sigma_p = 21.7%

Part 3 - A well-diversified portfolio.
Consider another efficient portfolio q located on the CML, with total risk equal
to the systematic risk of p: \sigma_q^2 = \beta_p^2 \sigma_M^2 = 0.0441. Recall that portfolios on the
CML have no specific risk: \sigma_{\varepsilon_q}^2 = 0. Hence,

    \sigma_q^2 = 0.0441 + 0 ⇒ \sigma_q = 21%

Using the CML equation, its expected return is

CML: E[rq ] = . . .

the same as E[rp ]!


Part 4 - Conclusion.
The 0.7 percentage points of σp that correspond to the unique risk of p do not earn any reward, ie,
the CAPM only rewards market (ie, non-diversifiable) risk. Graphically, while
q sits on the CML, p is to the right of the CML. In simple terms, buying only
stocks a and b is not the best way to get an expected return of 10.3%.
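
The arithmetic of this example can be traced with a short script. This is only a sketch under one assumption: the 0.05 and 0.1 given for stocks A and B are read as residual standard deviations, which is the reading consistent with the 21.7% total risk stated above. All variable names are illustrative.

```python
import math

rf, mean_m, sigma_m = 0.04, 0.10, 0.20
w = {"a": 0.5, "b": 0.5}
beta = {"a": 0.9, "b": 1.2}
sigma_eps = {"a": 0.05, "b": 0.10}   # assumed to be residual standard deviations

# Part 1: portfolio beta and SML expected return.
beta_p = sum(w[j] * beta[j] for j in w)
exp_rp = rf + beta_p * (mean_m - rf)

# Part 2: risk decomposition (A3 removes the residual covariance).
systematic = beta_p**2 * sigma_m**2
specific = sum(w[j]**2 * sigma_eps[j]**2 for j in w)
sigma_p = math.sqrt(systematic + specific)

# Part 3: efficient portfolio q on the CML with only the systematic risk of p.
sigma_q = math.sqrt(systematic)
exp_rq = rf + (mean_m - rf) / sigma_m * sigma_q

print(f"beta_p = {beta_p:.2f}, E[r_p] = {exp_rp:.3f}, E[r_q] = {exp_rq:.3f}")
print(f"sigma_p = {sigma_p:.3f}, sigma_q = {sigma_q:.3f}")
```

Running it shows E[r_p] and E[r_q] coinciding while σ_p exceeds σ_q, which is the point of Part 4.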

6.3 Pricing equation

We start with the exact version of the APT. We assume that εj ≡ 0, ∀j. This provides
all the necessary intuition for the general case in the last section. We follow Huang and
Litzenberger (1988).

6.3.1 Exact factor pricing with one factor

Assume 1 single factor exactly generates all returns:

rj = aj + βj F, ∀j

Construct a portfolio p by investing in the risk-free asset and the factor itself with
the following weights:

    p: \begin{pmatrix} w_f \\ w_F \end{pmatrix} = \begin{pmatrix} 1 - \beta_j \\ \beta_j \end{pmatrix}

The return on this portfolio is thus

rp = wf rf + wF F = (1 − βj )rf + βj F

This portfolio has the same return, state-by-state, as stock j, except for the intercept.

If there are no arbitrage opportunities, the intercepts must be the same:

aj = (1 − βj )rf

To see why this must be so, suppose aj > (1 − βj )rf , ie, j is a better investment. Then,
short $1 of the portfolio and buy $1 of j. This costs nothing and guarantees a sure profit
of aj − (1 − βj )rf > 0. This is called a free lunch. If instead aj < (1 − βj )rf , short j and
buy p for another free lunch. There cannot be such arbitrage opportunities in financial
markets. Check exercise 42.
Replacing aj in the return-generating process, we get

rj = rf + βj (F − rf )

Taking expectations on both sides we get

E[rj ] = rf + βj ( E[F ] − rf ), ∀j

The asset risk premium ( E[rj ] − rf ) depends on the factor risk premium ( E[F ] − rf )
and the asset’s loading on the factor (βj ). The factor’s risk premium is exogenous. Once
we know this single “price”, we can price all other assets in the economy.

Note the similarity with the CAPM. The CAPM basically says that the unique risk
factor is the Market. Replacing F with rM in the previous equation produces the CAPM.
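
The free-lunch argument of this subsection can be sketched in a few lines. The numbers below are hypothetical (they are not those of exercise 42), and the function names are illustrative. The point is that the long-short payoff is the same for every realization of F.

```python
def arbitrage_profit(a_j, beta_j, rf):
    """Sure profit per $1 long in stock j and $1 short in the replicating portfolio p."""
    return a_j - (1.0 - beta_j) * rf

def long_short_payoff(a_j, beta_j, rf, factor):
    """Payoff of (long j, short p) for one realization of the factor F."""
    r_j = a_j + beta_j * factor
    r_p = (1.0 - beta_j) * rf + beta_j * factor
    return r_j - r_p

# Hypothetical numbers: the no-arbitrage intercept would be (1 - 0.5) * 0.02 = 0.01,
# so an intercept of 0.03 leaves a mispricing of 0.02.
a_j, beta_j, rf = 0.03, 0.5, 0.02

payoffs = [long_short_payoff(a_j, beta_j, rf, f) for f in (-0.30, 0.0, 0.10, 0.50)]
print(payoffs)
print(arbitrage_profit(a_j, beta_j, rf))
```

Because the factor exposure cancels exactly, the position has zero cost and a payoff that does not vary across states, which is what makes it an arbitrage rather than a bet.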

6.3.2 Exact factor pricing with more than one factor

Now consider an exact K-factor structure:


    r_j = a_j + \sum_{k=1}^{K} \beta_{jk} F_k,   ∀j

The argument is identical to the single factor case. Construct a portfolio p by
investing in the risk-free asset and the factors themselves with the following weights:

    p: \begin{pmatrix} w_f \\ w_{F_1} \\ \vdots \\ w_{F_K} \end{pmatrix} = \begin{pmatrix} 1 - \sum_{k=1}^{K} \beta_{jk} \\ \beta_{j1} \\ \vdots \\ \beta_{jK} \end{pmatrix}

The return on this portfolio is thus

    r_p = \left( 1 - \sum_{k=1}^{K} \beta_{jk} \right) r_f + \sum_{k=1}^{K} \beta_{jk} F_k

The no arbitrage condition is:


    a_j = \left( 1 - \sum_{k=1}^{K} \beta_{jk} \right) r_f

Replacing aj in the return-generating process and taking expectations, we get

    E[r_j] = r_f + \sum_{k=1}^{K} \beta_{jk} ( E[F_k] - r_f ),   ∀j      (6.5)

The risk premiums on the K exogenous sources of risk now determine the expected
returns on all securities.

Extensions

There are two important extensions to model (6.5):

1. Factors are excess returns. Suppose all factors are returns on long-short portfolios
with zero price. For example, market minus risk-free rate (like in the CAPM) or
portfolio A minus portfolio B (like in the Fama-French model in section 6.4.2).
Then, the model is

        E[r_j^e] = \sum_{k=1}^{K} \beta_{jk} E[F_k^e],   ∀j

   where F_k^e is the excess return on factor k and r_j^e is the excess return on asset
   j. For a single stock j, E[r_j^e] = E[r_j] − r_f; for a long-short portfolio p, E[r_p^e] =
   E[r_long] − E[r_short].

2. Nontraded factors. So far we assumed that factors are returns, i.e., factors are
based on portfolios that we can buy or sell. If the factors are not returns on
traded portfolios (e.g., industrial production), the model is

        E[r_j] = r_f + \sum_{k=1}^{K} \beta_{jk} \lambda_k,   ∀j

The difference is that the risk premium on each factor is no longer its mean.
Instead, the risk premium on factor k is given by the free parameter λk that we
need to estimate.

See Cochrane (2005) for proofs and details.



6.3.3 Approximate factor pricing

We now consider the general K-factor structure with noise in (6.1):


    r_j = a_j + \sum_{k=1}^{K} \beta_{jk} F_k + \varepsilon_j,   j = 1, 2, ..., n

For this case, we will only be able to get a limiting result as the number of stocks
increases. That is, APT is an approximation.

We have to consider a different arbitrage concept. An asymptotic arbitrage opportunity
exists if we can construct a (large) portfolio satisfying the following conditions: (1)
zero cost; (2) strictly positive expected return; (3) negligible variance. This is almost a
free lunch.

If there is no such arbitrage opportunity, then a linear pricing relation will hold
approximately for most of the assets in a large economy:

    E[r_j] ≈ r_f + \sum_{k=1}^{K} \beta_{jk} ( E[F_k] - r_f ),   ∀j

The approximation is in the sense that


    \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} \left( E[r_j] - r_f - \sum_{k=1}^{K} \beta_{jk} ( E[F_k] - r_f ) \right)^2 = 0

The model prices most of the assets “correctly” and all of the assets together with
a negligible mean square error. However, it can be arbitrarily bad at pricing a finite
number of the assets.

For an intuitive proof see Danthine and Donaldson (2005). A somewhat better proof
is in Cvitanić and Zapatero (2004, p.436). Rigorous proofs are in Ingersoll (1987, p.172)
and Huang and Litzenberger (1988, p.106).

6.4 How to identify the factors

6.4.1 Overview

The major drawback of the APT is that the theory does not say what the factors should
be. Hence, identifying the factors has been an empirical quest that is not yet over. Again,
the goal is to identify a small number of factors that describe all stock returns.

There are several approaches:

• Statistical factors. Using factor analysis and principal components analysis, re-
searchers have concluded that 3 to 5 factors are enough to describe the returns on
most stocks.

• Economically meaningful factors. The idea is to test whether relevant macroeconomic
variables are good risk factors. There is a big literature on this. One
important example is Chen, Roll, and Ross (1986). They identify the following
factors: industrial production, credit spread, term spread.

• Financially meaningful factors. The factors are constructed as the return on a
(meaningful) portfolio of financial assets. The most important model nowadays is
the Fama and French 3 factor model (described below). More recently, Momentum
has also been considered a risk factor.

6.4.2 Fama and French model

Fama and French (1993) propose the following 3-factor asset pricing model:

E[rj ] − rf = βjM ( E[rM ] − rf ) + βjs E[SMB] + βjh E[HML] (6.6)

where the loadings (βjM , βjs , βjh ) are the slopes in the time-series regression5

rj − rf = aj + βjM (rM − rf ) + βjs SMB + βjh HML + εj (6.7)

To form the two new factors, FF divide all firms into six buckets depending on their
size (market equity, ME) and the ratio of book equity to market equity (BE/ME):6

                                       ME below 50th pct    ME above 50th pct
    BE/ME above 70th pct               Small Value          Big Value
    BE/ME between 30th and 70th pct    Small Neutral        Big Neutral
    BE/ME below 30th pct               Small Growth         Big Growth

“Small” stocks have ME smaller than the median ME. Typically, small stocks perform
better than what the CAPM predicts (this is a so-called anomaly).
5 The set up looks slightly different from our previous return generating process, but they are
equivalent. To see this, write the exact version of (6.7) as rj = âj + βjM rM + βjs (SMB + rf ) +
βjh (HML + rf ), with âj := aj + rf − βjM rf − βjs rf − βjh rf . Apply the no arbitrage condition
and take expectations to get (6.6).
6 See the details in http://mba.tuck.dartmouth.edu/pages/faculty/ken.french

“Value” stocks have BE/ME higher than the 70th BE/ME percentile; their book-
to-market ratio is High. “Growth” stocks have BE/ME lower than the 30th BE/ME
percentile; their book-to-market ratio is Low. Typically, BE/ME is high when the ME
(denominator) is low. This happens when the firm has had low returns and is now near
financial distress. Nonetheless, most of these firms usually rebound and thus, if you hold
a large portfolio of these firms, you end up making more money than their CAPM beta
would suggest (another CAPM anomaly).

Each month, the factors are computed in the following way:

• SMB (Small Minus Big) is the average return on the three small portfolios minus
the average return on the three big portfolios,
SMB = 1/3 (Small Value + Small Neutral + Small Growth)
- 1/3 (Big Value + Big Neutral + Big Growth)
Historically, the SMB portfolio generated an annual return somewhere between
1.5% and 3%. This is the size premium.
• HML (High Minus Low) is the average return on the two value portfolios minus
the average return on the two growth portfolios,
HML = 1/2 (Small Value + Big Value) - 1/2 (Small Growth + Big Growth)
Historically, the HML portfolio generated an annual return somewhere between
3.5% and 5%. This is the value premium.
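
The two factor definitions above translate directly into code. The following is a minimal sketch for a single month; the dictionary keys and the return numbers are made up for illustration and are not actual data.

```python
def smb_hml(r):
    """SMB and HML for one month from the six size/book-to-market portfolio returns."""
    smb = (r["small_value"] + r["small_neutral"] + r["small_growth"]) / 3 \
        - (r["big_value"] + r["big_neutral"] + r["big_growth"]) / 3
    hml = (r["small_value"] + r["big_value"]) / 2 \
        - (r["small_growth"] + r["big_growth"]) / 2
    return smb, hml

# Hypothetical monthly returns on the six portfolios (made-up numbers).
month = {
    "small_value": 0.020, "small_neutral": 0.015, "small_growth": 0.010,
    "big_value": 0.012, "big_neutral": 0.010, "big_growth": 0.008,
}

smb, hml = smb_hml(month)
print(f"SMB = {smb:.4f}, HML = {hml:.4f}")
```

Note that the neutral portfolios enter SMB but not HML, exactly as in the definitions in the text.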

This model has had considerable empirical success in explaining CAPM anomalies
(portfolios that don’t plot on the SML) and in capturing the variation in the cross-section
of expected returns. Thus, Fama and French (1996) argue that SMB and HML mimic
combinations of two underlying risk factors of special concern to investors.

6.5 Applications

6.5.1 Fund performance

One important question in finance is how to assess the performance of a fund manager.
We cannot just look at raw realized returns because we want to distinguish stock-picking
skills from simple risk taking (if we see a big return, was it because the manager was
able to identify mispriced stocks or was it because he took large risks and got lucky?).

Therefore, we need to compute risk adjusted returns, that is, we need to measure the
difference between the empirical realized returns and the returns “appropriate” for the
risk of the fund. This difference is called Jensen’s alpha.

We have two models to adjust returns for risk:



CAPM

To evaluate the performance of fund p, estimate the following time-series regression:

(rp − rf )t = αp + βp (rM − rf )t + εpt (6.8)

This is the standard regression to estimate the CAPM beta. Now, we are also interested
in the intercept. According to the CAPM, αp = 0. Graphically, a positive Jensen’s alpha
implies that the portfolio lies above the SML:

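[Figure: the SML with β on the horizontal axis and E[r] on the vertical axis; a portfolio with positive alpha plots above the line.]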

If αp is (significantly) positive, we can conclude that the fund returns are higher
than what its level of risk would require (according to the CAPM). In other words, the
manager has skill.

FF3

If we don’t believe that CAPM is a good model to adjust returns for risk, we can use
the Fama-French model. Run the regression

(rp − rf )t = αp + βpM (rM − rf )t + βps SMBt + βph HMLt + εpt

Again, if αp > 0 (statistically), the manager has skill. Note that the βpM estimator
that comes out of this regression is not the CAPM beta (due to the presence of other
regressors).
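
Both regressions can be estimated with ordinary least squares. The sketch below uses NumPy's `np.linalg.lstsq` on simulated, noise-free data with a known alpha and beta, so the estimates can be checked by eye. The helper name `jensen_alpha` and all numbers are hypothetical, and the sketch does not compute standard errors, which are needed to judge whether alpha is significantly different from zero.

```python
import numpy as np

def jensen_alpha(excess_fund, excess_factors):
    """OLS of fund excess returns on a constant and factor returns.

    excess_factors has shape (T,) for one factor or (T, K) for K factors.
    Returns (alpha, betas).
    """
    y = np.asarray(excess_fund, dtype=float)
    f = np.asarray(excess_factors, dtype=float).reshape(len(y), -1)
    design = np.column_stack([np.ones(len(y)), f])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

# Simulated data (hypothetical): 60 months, true alpha 0.002, true beta 1.1, no noise.
rng = np.random.default_rng(0)
mkt_excess = rng.normal(0.005, 0.04, size=60)
fund_excess = 0.002 + 1.1 * mkt_excess

alpha, betas = jensen_alpha(fund_excess, mkt_excess)
print(alpha, betas)
```

For the three-factor version, pass a (T, 3) array whose columns are the market excess return, SMB and HML; the first element of `betas` is then the market loading from the multi-factor regression, not the CAPM beta.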

See my website for an application (homework) with real data.

There is a huge literature on fund performance and research is still going on. For a
survey of the evidence and its implication on the Efficient Market Hypothesis see Malkiel
(2005).

Remark on CAPM’s Beta estimation

Equation (6.8) is considered a better way to estimate the CAPM beta than the market
model regression. If the interest rate is constant, both lead to the same beta:

    r_{pt} − r_f = α_p + β_p (r_{Mt} − r_f) + ε_t
    ⇒ r_{pt} = r_f (1 − β_p) + α_p + β_p r_{Mt} + ε_t
    ⇒ r_{pt} = \bar{α}_p + β_p r_{Mt} + ε_t

with the interest rate folded into the intercept, \bar{α}_p := r_f (1 − β_p) + α_p. However, in practice
interest rates are not constant and the two regressions lead to different beta estimates.
The CAPM is really mute about “statistical” issues (it is assumed that investors know
the true parameter values). But we can nonetheless argue that regression (6.8) is more
in the spirit of the CAPM. This is because we are interested in explaining excess returns
(remember rf is exogenous).

Consider an example to magnify the potential differences. Suppose we estimate the
market model with the following raw returns (in %):

t rM ri
1 6 8
2 5 3

We get r_it = −0.22 + 5 r_Mt (with returns expressed in decimals, ie, an intercept of −22%)
and thus (wrongly) infer that this security has a very high
beta, βi = 5. Now we take into account the interest rate in each of those periods and
estimate (6.8):

t rM ri rf rM − rf ri − rf
1 6 8 7 -1 1
2 5 3 4 1 -1

We get (ri − rf )t = 0 − 1(rM − rf )t , and now (correctly) conclude that this is actually a
negative beta security, βi = −1. This makes sense, since security i is acting as a hedge
against excess market returns. In other words, a market return of 6% is “bad times” if
rf = 7%, whereas 5% is “good times” if the risk-free rate is only rf = 4%.
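
The two regressions in this example can be reproduced with a line fit through the two observations. This sketch uses `np.polyfit`, which returns the slope first and the intercept second for a degree-one fit; returns are kept in percent, as in the tables.

```python
import numpy as np

r_m = np.array([6.0, 5.0])   # market return, in %
r_i = np.array([8.0, 3.0])   # security return, in %
r_f = np.array([7.0, 4.0])   # risk-free rate, in %

# Market model on raw returns: np.polyfit(x, y, 1) returns [slope, intercept].
beta_raw, alpha_raw = np.polyfit(r_m, r_i, 1)

# Regression (6.8) on excess returns.
beta_excess, alpha_excess = np.polyfit(r_m - r_f, r_i - r_f, 1)

print(f"raw:    beta = {beta_raw:.2f}, intercept = {alpha_raw:.2f}%")
print(f"excess: beta = {beta_excess:.2f}, intercept = {alpha_excess:.2f}%")
```

With only two observations the fitted line passes through both points, so the sign flip in beta comes entirely from subtracting a risk-free rate that moved more than the market did.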

6.5.2 Market neutral strategy

This investment strategy is typical of many hedge funds — see Bodie, Kane, and Marcus
(2005, sec 10.4).

A portfolio manager has identified an underpriced portfolio p with the following
characteristics:

    (r_p − r_f) = 0.04 + 1.4 (r_{SP500} − r_f) + ε_p

The manager is very confident about this alpha of 4%.

However, even if the manager is right, he may lose money if the whole market turns
down. He would like to exploit the relative mispricing of p, regardless of what happens
to the market.

The solution is to construct a tracking portfolio (T) that matches the systematic
component of p. It must therefore have a beta of 1.4, which requires w_{SP500} = 1.4 and
w_f = −0.4. The return on the tracking portfolio is thus

    r_T = 1.4 r_{SP500} − 0.4 r_f ⇒ (r_T − r_f) = 1.4 (r_{SP500} − r_f)

The investment strategy is to go long (buy) on p and go short (sell) on T. The
combined portfolio C thus has a return of

    r_C = r_p − r_T = (r_p − r_f) − (r_T − r_f) = 0.04 + ε_p

This combined position is thus market neutral. Regardless of what happens to the
market, the manager earns 4%.7
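
A small sketch makes the market neutrality concrete: for any market outcome, the long-short return equals alpha plus the residual. The risk-free rate and the market scenarios below are hypothetical, and the residual is set to zero by default to isolate the market effect.

```python
def combined_return(r_mkt, rf, eps_p=0.0, alpha=0.04, beta=1.4):
    """Return on long p / short tracking portfolio T for one market outcome."""
    r_p = rf + alpha + beta * (r_mkt - rf) + eps_p
    r_t = beta * r_mkt + (1.0 - beta) * rf   # w_SP500 = 1.4, w_f = -0.4
    return r_p - r_t

rf = 0.03  # hypothetical risk-free rate

outcomes = {r_mkt: combined_return(r_mkt, rf) for r_mkt in (-0.30, 0.0, 0.20)}
for r_mkt, r_c in outcomes.items():
    print(f"market return {r_mkt:+.2f}  ->  combined return {r_c:.4f}")
```

Passing a nonzero `eps_p` shows the remaining exposure: the combined return moves one-for-one with the residual of p.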

6.6 Exercises
Ex. 41 — Assume the market model is true. Show that the covariance matrix is:

    diagonal:      \sigma_j^2 = \beta_j^2 \sigma_M^2 + \sigma_{\varepsilon_j}^2
    off diagonal:  \sigma_{ij} = \beta_i \beta_j \sigma_M^2

Ex. 42 — Stock returns are generated by the following exact market model:

rj = aj + βj rM (6.9)

The risk-free rate is 4%. After a careful analysis, you identify stock a whose return is
generated by
ra = 0.01 + 0.9rM (6.10)
Can you become filthy rich? Explain how (quantify the profit).
7 Note that there is still some residual risk, ε_p. This will be small if the single market factor
explains rp well. In practice, we typically need more factors.

Ex. 43 — The Fama-French model states that returns can be explained by a three
factor model. The first factor is the market (as in the CAPM). Briefly explain what the
other two new factors are (ie, name them, describe how they are constructed, and what
they measure).

Ex. 44 — See my webpage for exercises on beta estimation and fund performance.
Chapter 7

Pricing in Complete Markets

Two general pricing frameworks in complete markets:

• Arrow-Debreu pricing: p = \sum_{s=1}^{S} x(s) \cdot p^{ad}(s)
• Risk-Neutral pricing: p = \frac{E^Q[x]}{1 + r_f}

The Arrow-Debreu pricing framework is a very general setup. It gives us intuition
and helps us understand all other pricing models. In a sense, it is the mother of all asset
pricing models. It can be set from an equilibrium or an arbitrage perspective. Here, we
follow the second approach. The Risk-Neutral pricing framework is essentially equivalent
to AD pricing. RN is the center of mathematical finance and derivatives pricing.

7.1 Basic and Complex securities


Definition (Arrow-Debreu security). An Arrow-Debreu (AD) security pays 1 unit of
consumption, or one unit of currency, in one single state of nature and nothing in the
other states. For example, the AD security for state s, with price p^{ad}_s, produces the
following payoffs:

    State   Payoff
    1       0
    2       0
    ⋮       ⋮
    s       1
    ⋮       ⋮
    S       0


An AD security is also called a basic or primitive security, a state-contingent claim (the
payoff is contingent on the realized state of nature), or simply a contingent claim or
state claim.

Definition (Complex security). A complex security is an asset that pays off in more
than one state of nature. Examples: a stock; a bond; a stock option.

Example 7.1.1. A portfolio with one share of each of the S different AD
securities is in fact a risk-free security (pays one unit of currency regardless of
the state). Its price must be

    p_f = \sum_{s=1}^{S} 1 × p^{ad}_s

Since we prefer to speak of the risk-free rate instead of price, use p_f = \frac{1}{1 + r_f} to
get:

    \frac{1}{1 + r_f} = \sum_{s=1}^{S} p^{ad}_s

Note that the payoffs of complex securities also depend on the realized state of nature,
so they are also called contingent claims (sorry, but I did not create these terms...)

7.2 Computing AD prices

In reality, we only observe the prices of complex securities. So, we need to extract the
implicit AD prices from the complex securities’ prices.

Example 7.2.1. There are 3 states and 3 assets with the following payoffs:

    State   Asset a   Asset b   Asset c
    1       3         1         2
    2       2         1         0
    3       1         1         2

    Price   1.5       0.8       0.8

To find the AD price for the first state, p^{ad}_1, we find the portfolio of complex
securities that replicates the AD security:

    \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} w_a + \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} w_b + \begin{pmatrix} 2 \\ 0 \\ 2 \end{pmatrix} w_c

resulting in the linear system

    1 = 3 w_a + w_b + 2 w_c             w_a = 0.5
    0 = 2 w_a + w_b              ⇒      w_b = −1
    0 = w_a + w_b + 2 w_c               w_c = 0.25

Note that the w above represent quantities, not percentage weights. The AD
price must thus be

    p^{ad}_1 = w_a p_a + w_b p_b + w_c p_c = 0.15

Proceeding in the same way for the other two states, we get the corresponding
AD prices. Check that p^{ad}_2 = 0.40, p^{ad}_3 = 0.25.
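
The three replication problems can be solved at once. The sketch below uses `np.linalg.solve` with the identity matrix on the right-hand side, so that column s of `W` holds the quantities of assets a, b, c that replicate the AD security for state s; the variable names are illustrative.

```python
import numpy as np

# Payoff matrix: states in rows, assets a, b, c in columns.
X = np.array([[3.0, 1.0, 2.0],
              [2.0, 1.0, 0.0],
              [1.0, 1.0, 2.0]])
prices = np.array([1.5, 0.8, 0.8])

# Solve X W = I: column s of W replicates the AD security for state s.
W = np.linalg.solve(X, np.eye(3))

# Each AD price is the cost of its replicating portfolio.
ad_prices = W.T @ prices
print(W[:, 0])      # quantities replicating the state-1 AD security
print(ad_prices)
```

Solving against the identity is equivalent to inverting X, but it keeps the link to the replication argument visible: each column is one replicating portfolio.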

7.3 Complete Markets

7.3.1 Price of complex securities

Definition (Complete market). The market is complete if there exists one AD price for
each possible state of nature, that is, if we can compute p^{ad}_s, ∀s.

The reason why market completeness is important is that in arbitrage-free complete
markets every financial contract has a unique arbitrage-free price. This is very useful
for pricing derivatives.

Proposition 7.3.1. If the market is complete, any complex security (ie, any cash flow
stream) can be replicated and priced as a portfolio of AD securities.

The Arrow-Debreu pricing formula for any complex security is

    p = \sum_{s=1}^{S} x(s) \cdot p^{ad}(s)      (7.1)

where x(s) denotes the payoff in state s.

Example 7.3.1. Consider the market from the previous example. Consider an
additional security with the following payoffs:1

    State   Payoff for asset d
    1       2
    2       1
    3       0

Since the market is complete, this security can be replicated as a portfolio of
AD securities:

    \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + 1 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + 0 \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}

Its price must be

    p_d = 2 p^{ad}_1 + 1 p^{ad}_2 + 0 = 2 × 0.15 + 1 × 0.4 = 0.7

Otherwise, there would be arbitrage opportunities.

7.3.2 Quick test for market completeness

Given a payoff matrix, how can we know if all AD prices exist, ie the market is complete,
before having to go through the calculations? First, note that if there are more states
than complex securities, S > N , the market is incomplete. Second, if S = N , the test is
given by the following proposition.
Proposition 7.3.2. The market is complete if
i) N = S
ii) The N complex securities are linearly independent.

Intuitively, the market with N = S is complete if the N securities are truly different
from each other.

Formally, note that in the previous example, we computed each AD price by finding
a vector of weights satisfying

    a_s = X w ⇒ w = X^{-1} a_s

where X is the (S by N) matrix of payoffs for the complex securities and a_s is the vector
of payoffs for the AD security (1 in state s, zeros everywhere else).2

1 Note that this security is a call option on asset a with a strike price of 1: call(s) = max[x_a(s) − 1, 0].
2 We can find the whole matrix of weights at once by doing I = XW ⇒ W = X^{-1}. The
S-by-1 vector of AD prices is thus P^{ad} = W′P, with P being the N-by-1 vector of security prices.

We will be able to replicate all S AD payoffs if the N complex securities span the entire S-dimensional
space, ℝ^S. Hence, we are really asking whether X has full rank (ie, all columns are
linearly independent). The following result from linear algebra is helpful:

Proposition 7.3.3. X has full rank if and only if |X| ≠ 0.

By the way, note that |X| ≠ 0 guarantees that X^{-1} exists and thus that the previous
equation has a solution.

Example 7.3.2. From the previous example,

    X = \begin{pmatrix} 3 & 1 & 2 \\ 2 & 1 & 0 \\ 1 & 1 & 2 \end{pmatrix}

Computing its determinant,3

    |X| = 4 ≠ 0

Hence the market is complete (thus we can compute all AD prices).
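
The same test can be run numerically. This sketch checks the determinant and the rank of the payoff matrix with NumPy; in floating point, comparing the rank to the number of states is usually more robust than testing whether the determinant is exactly zero.

```python
import numpy as np

X = np.array([[3.0, 1.0, 2.0],
              [2.0, 1.0, 0.0],
              [1.0, 1.0, 2.0]])

det_X = np.linalg.det(X)
rank_X = np.linalg.matrix_rank(X)

# Completeness test of Proposition 7.3.2: N = S and linearly independent columns.
is_complete = X.shape[0] == X.shape[1] and rank_X == X.shape[0]
print(det_X, rank_X, is_complete)
```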

7.4 Risk-Neutral Pricing

7.4.1 Price of complex securities

Define

    \pi^Q(s) := \frac{p^{ad}(s)}{\sum_{s=1}^{S} p^{ad}(s)}

Note that all π^Q(s) are positive4 and sum to 1, so they form a legitimate set of probabilities.

3 The determinant of a square matrix A of size K is:

    |A| = \sum_{k=1}^{K} a_{ik} (-1)^{i+k} |A_{-ik}|,   for any row i

where a_{ik} is the ik-th element of A and A_{-ik} is what’s left of A after deleting the row and column
that go through a_{ik}. |A| can also be computed along any column instead; pick the row or column
with most zeroes.
4 If some p^{ad}(s) ≤ 0 there would be an arbitrage opportunity.

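From (7.1), the price of a complex security is given by

    p = \sum_{s=1}^{S} p^{ad}(s) x(s)
      = \left( \sum_{s=1}^{S} p^{ad}(s) \right) \cdot \left( \frac{1}{\sum_{s=1}^{S} p^{ad}(s)} \right) \cdot \left( \sum_{s=1}^{S} p^{ad}(s) x(s) \right)
      = \left( \sum_{s=1}^{S} p^{ad}(s) \right) \cdot \sum_{s=1}^{S} \left( \frac{p^{ad}(s)}{\sum_{s=1}^{S} p^{ad}(s)} \right) x(s)
      = \frac{1}{1 + r_f} \sum_{s=1}^{S} \pi^Q(s) x(s)

where we also used \sum_{s=1}^{S} p^{ad}(s) = \frac{1}{1 + r_f}. The risk-neutral pricing formula is thus

    p = \frac{E^Q[x]}{1 + r_f}      (7.2)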

where EQ means that we take the expectation using the probabilities π Q (s).

This is called risk-neutral pricing because we are discounting the expected cash flow
at the risk-free rate. Very important: this does not mean that the investor is risk-neutral.
All we did was to distort the expected cash flow by using the artificial probabilities π Q .
This distortion captures the risk aversion of the investor, so that (7.2) produces the real
price of the security.

Example 7.4.1. Using the market from the previous examples,

    \pi^Q(s) = \frac{p^{ad}(s)}{\sum_{s=1}^{S} p^{ad}(s)} = \begin{pmatrix} ... \\ ... \\ ... \end{pmatrix}

The risk-free rate is

    1 + r_f = \frac{1}{\sum_{s=1}^{S} p^{ad}(s)} = ...

Hence, the price of the call option from example 7.3.1 is

    p = \frac{E^Q[x]}{1 + r_f}
      = ...
      = 0.7
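
The steps of risk-neutral pricing can be traced in code. The sketch below starts from the AD prices of example 7.2.1 and the payoffs of asset d from example 7.3.1, normalizes the AD prices into probabilities, backs out the risk-free rate, and applies (7.2). Variable names are illustrative.

```python
import numpy as np

ad_prices = np.array([0.15, 0.40, 0.25])   # AD prices from example 7.2.1
payoff_d = np.array([2.0, 1.0, 0.0])       # payoffs of asset d from example 7.3.1

pi_q = ad_prices / ad_prices.sum()         # risk-neutral probabilities
rf = 1.0 / ad_prices.sum() - 1.0           # risk-free rate implied by the AD prices
price_d = pi_q @ payoff_d / (1.0 + rf)     # equation (7.2)

print(pi_q, rf, price_d)
```

The resulting price agrees with the 0.7 obtained from the Arrow-Debreu formula (7.1), as it must, since (7.2) is only a rearrangement of (7.1).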

7.4.2 Fundamental theorems

The function Q that defines the probabilities π Q (s) is known as Risk-Neutral probability
measure, or Subjective probability measure, or Equivalent Martingale Measure. Formally,
Definition. (Risk-Neutral probability) A probability measure Q is a Risk-Neutral prob-
ability measure if:
i) π Q (s) > 0, ∀s, and
ii) Equation (7.2) holds for all securities.
Theorem 7.4.1. (First fundamental theorem of mathematical finance) There exists a
risk-neutral probability measure Q if and only if there are no arbitrage opportunities.

There are several definitions of arbitrage. The one we are considering here is the
following:
Definition. (Arbitrage) There is an arbitrage opportunity if we can create a portfolio
with the following characteristics:
i) p ≤ 0, and
ii) x(s) ≥ 0, for all s.
(A more precise definition should exclude the case p = 0 with x(s) = 0, ∀s, but this is
obviously a useless portfolio.)

The following example illustrates the theorem: if there are arbitrage opportunities,
there is no Q.

Example 7.4.2. Consider a different market:


State Asset 1 Asset 2 Asset 3
1 3 1 2
2 2 1 0
3 1 1 2

Price 0.7 0.8 0.8


Clearly, there is an arbitrage opportunity since asset 1 is always better than 2,
but costs less.
Computing AD prices (exercise),
 
−0.25
pad =  0.40 
0.65
again pad (1) = −0.25 signalling arbitrage.
This would imply π Q (1) = −0.25/0.8 = −0.3125 < 0, which does not satisfying
the strict positivity requirement. Hence, there is no risk-neutral measure.

However, the theorem does not say that the measure is unique. In incomplete mar-
kets, we may have many risk-neutral measures (and thus many prices). See example 11.3
in Danthine and Donaldson (2005). The measure and thus security prices are guaranteed
to be unique only in complete markets, as the following theorem states:

Theorem 7.4.2. (Second fundamental theorem of mathematical finance) Assume that
the market is arbitrage free. Then, the market is complete if and only if the risk-neutral
measure is unique.

7.5 Conclusion

If the market is complete we can:

• combine the existing complex securities to generate any payoff (ie, the existing
securities span the space of all possible payoffs);

• recover all AD prices or π Q probabilities;

• use the AD prices or π Q probabilities to price any new security (though this new
security would be redundant).

A good example is the Black-Scholes option pricing model. The market formed
by the stock and the risk-free asset is complete. Thus, we can use the stock and the
risk-free asset to replicate and price the stock option.

We can also interpret the APT in this context. The factors are like AD securities
that span the whole set of (redundant) stocks. Hence, we are able to impose no arbitrage
conditions and get the results in chapter 6.

7.6 Exercises
Ex. 45 — Define the following concepts in your own words:
1. Arrow-Debreu security.
2. Complete market.

Ex. 46 — Consider the following payoff matrix (states in rows, securities in columns):
 
\[ X = \begin{bmatrix} 3 & 7 & 8 \\ 1 & 2 & 9 \\ 7 & 16 & 25 \end{bmatrix} \]

Is the market complete?



Ex. 47 — Consider the following payoff matrix (states in rows, securities in columns):
 
\[ X = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 1 \\ 3 & 4 & 5 \end{bmatrix} \]

The prices of the complex securities (columns) are: p1 = 1, p2 = 1.2, p3 = 1.5.


1. Compute the three Arrow-Debreu prices.
2. Find the price of a new security with payoff [3, 2, 7]'.

Ex. 48 — (Risk-Neutral Pricing) There are 3 states of nature and 3 complex securities.
The payoff matrix is:

Payoff
State Asset 1 Asset 2 Asset 3
s=1 3 4 2
s=2 0 2 1
s=3 0 0 1
Price 1.2 1.8 1.2

1. Consider a new fourth security with payoff = [2, 10, 4]’. Compute its price using
the Risk-Neutral pricing method.
2. Does the price you just computed assume that investors are risk neutral? Explain
briefly (5 lines).
3. Suppose a Bank is willing to sell you this new security for p4 = 2.5. What would
you do? (Hint: define a trading strategy with the Arrow-Debreu securities and
quantify your profit. Note: you may also define the trading strategy on the
original 3 complex securities, though this requires more work.)
Chapter 8

Consumption-Based Asset Pricing

The fundamental asset pricing equation is

\[ p_t = E_t[m_{t+1} x_{t+1}] \]

with \( m_{t+1} \equiv \delta U'(c_{t+1})/U'(c_t) \).

The consumption asset pricing framework links the financial market to the real side
of the economy. Namely, we will be able to link consumption to expected stock returns.
This approach to asset pricing based on first principles is much more solid (utility should
be written over consumption, not wealth, because most people are not like Uncle Scrooge
and don’t enjoy swimming in their coins). The next generation of asset pricing models
will probably be some form of consumption asset pricing. It is currently a very active
area of research. A lot of the material in this section comes from Cochrane (2005).

8.1 The investor’s problem

Consider a 2 period consumption model. The investor chooses the quantity (z) of the
security to buy today (t) to maximize the utility of consumption (c). The problem is
thus:

\[ \begin{aligned}
\max_{z} \quad & U(c_t) + E_t[\delta U(c_{t+1})] \\
\text{s.t.} \quad & c_t + z p_t = e_t \\
& c_{t+1} = e_{t+1} + z x_{t+1}
\end{aligned} \tag{8.1} \]


where pt is the price of the security and xt+1 is the total payoff (xt+1 includes dividends,
xt+1 = pt+1 + dt+1 ) and e is an exogenous endowment the investor receives each period.
The parameter δ ≤ 1 captures impatience. The expectation Et [.] is conditional on time-t
information.

Substituting the constraints into the objective function, we get the following
first-order condition:

\[ \underbrace{-\frac{dU(c_t)}{dz}}_{\text{U loss for addit. unit of security}}
 = E_t\Big[\delta \underbrace{\frac{dU(c_{t+1})}{dz}}_{\text{U gain for addit. unit of security}}\Big] \]

The investor buys more or less of the asset until this foc holds. That is, until the
marginal utility loss for consuming less today (on the lhs) equals the (discounted, ex-
pected) marginal utility gain from consuming more tomorrow (on the rhs).

The foc can be further written as

\[ U'(c_t) p_t = E_t[\delta U'(c_{t+1}) x_{t+1}] \]

\[ \Rightarrow p_t = E_t\left[\delta \frac{U'(c_{t+1})}{U'(c_t)} x_{t+1}\right] \]

Remark. If there is more than one asset, similar foc hold for each asset,

\[ p^j_t = E_t\left[\delta \frac{U'(c_{t+1})}{U'(c_t)} x^j_{t+1}\right], \quad \forall j \]

Remark. The investor’s problem can be set in a more general and realistic framework. If
we consider a representative agent (represents the average of all agents in the economy),
his problem is

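\[ \begin{aligned}
\max_{\{z_t\}_{t=1}^{\infty}} \quad & E_0\left(\sum_{t=0}^{\infty} \delta^t U(c_t)\right) \\
\text{s.t.} \quad & c_t + z_{t+1} p_t = z_t x_t, \quad \forall t
\end{aligned} \]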

where zt+1 is the quantity of the security to hold from t to t+1. The problem is set in an
Exchange Economy where total output (GDP) is random and exogenous. The output is
distributed through dividends (replacing the endowments in the previous formulation),
which are included in the sequence {xt }. This is the famous “Lucas Tree” model, de-
veloped in 1978. The point is that this general formulation leads exactly to the same
first-order condition as (8.1).

Remark. No-trade equilibrium. Note that we will not solve the foc until the end, ie, we
will not try to find z. This is an equilibrium model, so we must have Demand = Supply.
In other words, it must be the case that z ≡ 1 (total supply of the asset is normalized
to 1). This is because there is only one investor (the representative agent), thus there
is no one else left for him to trade with. Hence, the model does not describe traded
quantities. (Note that the CAPM and the APT also do not; we need microstructure
models for this.)

8.2 Fundamental Asset Pricing Equation

The first order condition is the central equation in asset pricing. It is more conveniently
written as
\[ p_t = E_t[m_{t+1} x_{t+1}] \tag{8.2} \]
with
\[ m_{t+1} \equiv \delta \frac{U'(c_{t+1})}{U'(c_t)} \tag{8.3} \]

The random variable m is called the stochastic discount factor (SDF), pricing kernel, or
marginal rate of substitution. The important point is that one single m prices all assets.

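Example 8.2.1. Suppose \( U(c) = \frac{c^{1-\gamma}}{1-\gamma} \). Then,

\[ m_{t+1} = \delta \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma} \]

and this single random variable prices all assets,

\[ p^j_t = E_t\left[\delta \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma} x^j_{t+1}\right], \quad \forall j \]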

In practice, the consumption stream is exogenous (must equal aggregate consumption


in the economy), so the goal of asset pricing is to find a specification for m (ie, for U )
that makes (8.2) consistent with observed stock returns.
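
As a rough numerical illustration of Example 8.2.1, the sketch below evaluates the power-utility SDF in two consumption states and prices a payoff with (8.2). All numbers (δ = 0.98, γ = 2, the consumption levels, and the probabilities) are hypothetical values chosen for this sketch, not data from the notes.

```python
def power_utility_sdf(delta, gamma, c_today, c_next):
    """SDF m = delta * (c_next / c_today)^(-gamma) for U(c) = c^(1-gamma)/(1-gamma)."""
    return delta * (c_next / c_today) ** (-gamma)


# Hypothetical inputs (assumptions for illustration only).
delta, gamma = 0.98, 2.0
c_today = 100.0
probs = [0.5, 0.5]
c_next = [95.0, 105.0]      # low- and high-consumption states
payoff = [1.0, 1.0]         # a risk-free payoff of 1 in every state

m = [power_utility_sdf(delta, gamma, c_today, c) for c in c_next]

# p = E[m x]; for a payoff of 1 in every state this is E[m] = 1 / Rf.
price = sum(p * ms * x for p, ms, x in zip(probs, m, payoff))

print("SDF by state:", m)
print("price of the risk-free payoff:", price)
```

The SDF is larger in the low-consumption state, which is the state where marginal utility is high.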

8.3 Relation to Arrow-Debreu Securities


To interpret the SDF, we can relate it to AD securities. Consider a finite number
of possible states of nature, S. Equation (8.2) can be written as (dropping the time
subscripts)
\[ p = \sum_{s=1}^{S} \pi(s) m(s) x(s) \]
where p is the price today, x(s) is next-period’s payoff if state s occurs, and π(s) is the
probability of state s.

In the AD setup, the price of a security is


\[ p = \sum_{s=1}^{S} p^{ad}(s) x(s) = \sum_{s=1}^{S} \pi(s) \frac{p^{ad}(s)}{\pi(s)} x(s) \]

where pad (s) is today’s price of an AD security that pays 1 in state s.

Hence, the SDF is related to AD prices as

\[ m(s) = \frac{p^{ad}(s)}{\pi(s)} \tag{8.4} \]

Example 8.3.1. Consider the same market as in example 7.2.1. We now also
include the probabilities for each state:

State Asset 1 Asset 2 Asset 3 π(s)


1 3 1 2 0.2
2 2 1 0 0.3
3 1 1 2 0.5

Price 1.5 0.8 0.8

Recall that we had already computed AD prices: \(p^{ad} = [0.15, 0.40, 0.25]\). The
SDF is thus the following vector:

\[ m = \begin{bmatrix} 0.15/0.2 \\ 0.40/0.3 \\ 0.25/0.5 \end{bmatrix}
     = \begin{bmatrix} 0.75 \\ 1.33 \\ 0.50 \end{bmatrix} \]

Suppose we want to price a new security (as in example 7.3.1) with payoffs

\[ x_{t+1} = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} \]

Using (8.2) we get

pt = Et [mt+1 xt+1 ]
= 0.2 ∗ (0.75 ∗ 2) + 0.3 ∗ (1.33 ∗ 1) + 0.50 ∗ (0.50 ∗ 0)
= 0.7

which is the same number we got in example 7.3.1.
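
The arithmetic of Example 8.3.1 can be reproduced with a short script. This is a minimal sketch using only the numbers given in the example; the variable names are illustrative.

```python
# Data from Example 8.3.1.
probs = [0.2, 0.3, 0.5]          # physical probabilities pi(s)
p_ad = [0.15, 0.40, 0.25]        # Arrow-Debreu prices
x_new = [2.0, 1.0, 0.0]          # payoff of the new security

# Equation (8.4): m(s) = p_ad(s) / pi(s).
m = [q / pi for q, pi in zip(p_ad, probs)]

# Equation (8.2) on a finite state space: p = sum_s pi(s) m(s) x(s).
price_sdf = sum(pi * ms * xs for pi, ms, xs in zip(probs, m, x_new))

# Direct AD pricing: p = sum_s p_ad(s) x(s). Both routes must agree.
price_ad = sum(q * xs for q, xs in zip(p_ad, x_new))

print("SDF:", [round(ms, 4) for ms in m])
print("price via SDF:", round(price_sdf, 4))
print("price via AD prices:", round(price_ad, 4))
```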

8.4 Relation to the Risk-Neutral measure

Equation (8.2) can be written as (dropping the time subscripts)


\[ p = \sum_{s=1}^{S} \pi^P(s) m(s) x(s) \]

where π P (s) is the objective or physical probability of state s.

Using risk-neutral pricing, the price of a security is (equation 7.2):


\[ p = \frac{E^Q[x]}{1 + r_f} = \sum_{s=1}^{S} \frac{\pi^Q(s)}{1 + r_f} x(s) \]

Comparing the two equations, we conclude that

π Q (s) = (1 + rf ) · π P (s)m(s)

This equation says a lot. Risk aversion is equivalent to worrying about unpleasant
states. People who report high subjective probabilities (Q) for unpleasant events like
market crashes may not have irrational expectations; they may simply be reporting the
risk-neutral probabilities. These are proportional to the product π P × m, hence they are
high if either: the event is truly highly probable (high π P (s)); or it is improbable but has
disastrous consequences (high m(s)). (See Cochrane, 2005, p.53)

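Example 8.4.1. Continuing the previous example, we have \(1 + r_f = 1/0.8\)
and (multiplying element by element)

\[ \pi^Q = \frac{1}{0.8} \cdot \begin{bmatrix} 0.2 \\ 0.3 \\ 0.5 \end{bmatrix}
   \cdot \begin{bmatrix} 0.75 \\ 1.33 \\ 0.50 \end{bmatrix}
   = \begin{bmatrix} 0.1875 \\ 0.5000 \\ 0.3125 \end{bmatrix} \]

the same numbers we got in example 7.4.1.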
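
The change of measure in Example 8.4.1 can be checked numerically. The sketch below uses the probabilities and SDF from the example; the ratio it prints at the end is the state-by-state quotient of risk-neutral to physical probabilities.

```python
# Data from Examples 8.3.1 and 8.4.1.
pi_p = [0.2, 0.3, 0.5]                      # physical probabilities
m = [0.15 / 0.2, 0.40 / 0.3, 0.25 / 0.5]    # SDF by state

# E[m] = 1 / (1 + rf), see equation (8.6).
expected_m = sum(p * ms for p, ms in zip(pi_p, m))
gross_rf = 1.0 / expected_m

# pi_Q(s) = (1 + rf) * pi_P(s) * m(s).
pi_q = [gross_rf * p * ms for p, ms in zip(pi_p, m)]

# Ratio pi_Q(s) / pi_P(s) = (1 + rf) * m(s).
ratio = [q / p for q, p in zip(pi_q, pi_p)]

print("1 + rf:", round(gross_rf, 4))
print("pi_Q:", [round(q, 4) for q in pi_q])
print("pi_Q / pi_P:", [round(r, 4) for r in ratio])
```

States with a high SDF receive more risk-neutral weight than physical weight, and the risk-neutral probabilities still sum to one.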

For future reference, we can rewrite the previous equation as

\[ m(s) = \frac{1}{1 + r_f} \frac{\pi^Q(s)}{\pi^P(s)} \]

The expression \( \pi^Q(s)/\pi^P(s) \) is called the Radon-Nikodym derivative of Q w.r.t. P. You'll see
a lot of this in option pricing.

8.5 Risk Premiums

The consumption model provides the fundamental economic intuition to understand


why different assets have different prices or expected returns. We will show that it is
the correlation between the random common SDF (m) and the asset-specific payoff (x)
that generates asset-specific risk corrections.

We can write the foc in return form. For every asset j,

\[ p^j_t = E_t[m_{t+1} x^j_{t+1}] \]

\[ \Rightarrow 1 = E_t\left[m_{t+1} \frac{x^j_{t+1}}{p^j_t}\right] \equiv E_t\left[m_{t+1} R^j_{t+1}\right] \tag{8.5} \]

where \(R^j_{t+1}\) is the gross return on the asset.
Consider in particular a risk-free security with return \(R^f_{t+1}\) (not random and known
at time t):

\[ 1 = E_t\left[m_{t+1} R^f_{t+1}\right] = E_t[m_{t+1}] R^f_{t+1}
   \;\Rightarrow\; E_t[m_{t+1}] = 1/R^f_{t+1} \tag{8.6} \]

Using the definition of covariance, Cov(x, y) = E[xy] − E[x] E[y], and (8.6) we can
write (8.2) as

\[ \begin{aligned}
p_t &= E_t[m_{t+1} x_{t+1}] \\
\Rightarrow p_t &= E_t[m_{t+1}] E_t[x_{t+1}] + Cov_t(m_{t+1}, x_{t+1}) \\
\Rightarrow p_t &= \frac{E_t[x_{t+1}]}{R^f_{t+1}} + Cov_t(m_{t+1}, x_{t+1})
\end{aligned} \]

The first term is the value of the asset if investors were risk-neutral. The second term
is a risk adjustment. If the payoff covaries positively with the sdf the security price will
be higher (returns will be lower).

To see why this is so, write the sdf explicitly:

\[ p_t = \frac{E_t[x_{t+1}]}{R^f_{t+1}} + \frac{\delta}{U'(c_t)} Cov_t(U'(c_{t+1}), x_{t+1}) \tag{8.7} \]

Recall that marginal utility (\(U'\)) is decreasing (\(U'' < 0\)). Investors like smooth
consumption. If an asset pays off well when consumption is low (marginal utility is high), it will
help to smooth consumption. Thus investors are willing to pay a high price for it (the
covariance term is positive); equivalently, demand a low return.[1]
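
To see the two terms of the price decomposition with numbers, the sketch below reuses the market of Example 8.3.1 and the new security with payoff [2, 1, 0]. The helper `expect` is a hypothetical name introduced only for this illustration.

```python
# Data from Example 8.3.1.
probs = [0.2, 0.3, 0.5]
m = [0.15 / 0.2, 0.40 / 0.3, 0.25 / 0.5]   # SDF by state
x = [2.0, 1.0, 0.0]                        # payoff of the new security


def expect(values):
    """Expectation under the physical probabilities (hypothetical helper)."""
    return sum(p * v for p, v in zip(probs, values))


e_m = expect(m)
e_x = expect(x)
gross_rf = 1.0 / e_m                                   # equation (8.6)

# Cov(m, x) = E[mx] - E[m] E[x].
cov_mx = expect([ms * xs for ms, xs in zip(m, x)]) - e_m * e_x

risk_neutral_value = e_x / gross_rf                    # first term
price = risk_neutral_value + cov_mx                    # first term + risk adjustment

print("E[x] / Rf      :", round(risk_neutral_value, 4))
print("Cov(m, x)      :", round(cov_mx, 4))
print("price p = E[mx]:", round(price, 4))
```

Here the covariance term is positive: the security pays most in the states where the SDF is high, so it sells for more than its discounted expected payoff.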

This intuition can be restated in return form. Starting from (8.5) and using (8.6),

\[ \begin{aligned}
1 &= E_t\left[m_{t+1} R^j_{t+1}\right] \\
\Rightarrow 1 &= \frac{E_t[R^j_{t+1}]}{R^f_{t+1}} + Cov_t(m_{t+1}, R^j_{t+1}) \\
\Rightarrow E_t[R^j_{t+1}] - R^f_{t+1} &= -R^f_{t+1} Cov_t(m_{t+1}, R^j_{t+1})
\end{aligned} \]

Writing the sdf explicitly and using net returns instead of gross returns (R = 1 + r),

\[ E_t[r^j_{t+1}] - r^f_{t+1} = -(1 + r^f_{t+1}) \frac{\delta}{U'(c_t)} Cov_t(U'(c_{t+1}), r^j_{t+1}) \tag{8.8} \]

Again, the same intuition applies. If an asset has a positive covariance with marginal
utility (negative covariance with consumption), its risk premium will be low. Investors
are willing to hold this asset at a low return (high price) because it smooths consumption.
On the other hand, if the asset has a high correlation with consumption (pays off well
when you are wealthy, pays off badly when you are poor), it will only contribute to making
consumption more volatile, thus investors require a higher return premium to hold it
(lower price).

Check exercise 52.

8.6 Consumption CAPM (CCAPM)

To give a more familiar look to equation (8.8), we can specialize the consumption model
to the case of quadratic utility. This leads to the Consumption Capital Asset Pricing
Model (CCAPM).[2]

[1] Insurance is an extreme example. We are happy to hold insurance even though its expected
return is negative.

[2] Breeden (1979) derives the model in continuous time, which amounts to assuming that only
the first two moments of returns matter.

Assume \(U(c) = ac - \frac{b}{2}c^2\). It follows that \(U'(c) = a - bc\). Substituting into (8.8),

\[ \begin{aligned}
E_t[r^j_{t+1}] - r^f_{t+1} &= -(1 + r^f_{t+1}) \frac{\delta}{a - bc_t} Cov_t(a - bc_{t+1}, r^j_{t+1}) \\
&= \frac{(1 + r^f_{t+1})\delta b}{a - bc_t} Cov_t(c_{t+1}, r^j_{t+1})
\end{aligned} \tag{8.9} \]

Denote by \(r^{\hat{c}}\) the return on the portfolio most highly correlated with consumption
growth. Since this is a traded security, it must also satisfy equation (8.9),

\[ E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1} = \frac{(1 + r^f_{t+1})\delta b}{a - bc_t} Cov_t(c_{t+1}, r^{\hat{c}}_{t+1}) \]

\[ \Rightarrow \frac{(1 + r^f_{t+1})\delta b}{a - bc_t} = \frac{E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1}}{Cov_t(c_{t+1}, r^{\hat{c}}_{t+1})} \]

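Replacing back into (8.9),

\[ \begin{aligned}
E_t[r^j_{t+1}] - r^f_{t+1} &= \frac{E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1}}{Cov_t(c_{t+1}, r^{\hat{c}}_{t+1})} Cov_t(c_{t+1}, r^j_{t+1}) \\
&= \frac{Cov_t(c_{t+1}, r^j_{t+1}) / Var_t(c_{t+1})}{Cov_t(c_{t+1}, r^{\hat{c}}_{t+1}) / Var_t(c_{t+1})} \left( E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1} \right)
\end{aligned} \]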

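Defining the consumption beta of security i to be

\[ \beta_{i,c} \equiv \frac{Cov_t(r^i_{t+1}, c_{t+1})}{Var_t(c_{t+1})} \]

we get the CCAPM:

\[ E_t[r^j_{t+1}] - r^f_{t+1} = \frac{\beta_{j,c}}{\beta_{\hat{c},c}} \left( E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1} \right) \tag{8.10} \]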

To interpret this equation, suppose \(\beta_{\hat{c},c} = 1\) (\(\hat{c}\) mimics c perfectly). We get a direct
analogue to the CAPM,

\[ E_t[r^j_{t+1}] - r^f_{t+1} = \beta_{j,c} \left( E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1} \right) \]

However, the market risk premium is now replaced by the excess return on the consump-
tion portfolio and the relevant risk measure is the consumption beta of j. A security
with high consumption beta must have a high expected return. This is because it pays
off well when consumption is already high (low marginal utility), but pays off badly
when consumption is low (high marginal utility). Hence, we get the same intuition as
in (8.8).

8.7 The CAPM reloaded

Consumption asset pricing has not been very successful empirically, presumably because
the sdf depends on marginal utility (\(m_{t+1} = \delta U'(c_{t+1})/U'(c_t)\)), which is not easy to
measure empirically. We know neither the true utility function nor the values of its
parameters, and even consumption data has its problems. Beta asset pricing models
(CAPM, APT) thus have the upper hand in empirical applications nowadays.

However, all asset pricing models are nested in the fundamental asset pricing equation
(8.2). The models differ by proposing different, easier to measure, proxies for marginal
utility.

The CAPM is the special case where

\[ m_{t+1} = a - bR^M_{t+1} \tag{8.11} \]

Marginal utility is proxied by the return on the market portfolio. In the CAPM, the
investor holds the market portfolio, hence a higher \(R^M_{t+1}\) allows for higher consumption,
which means lower marginal utility. It is the return on the market that describes whether
the typical investor is happy or unhappy. \(R^M_{t+1}\) is perfectly negatively correlated with
\(m_{t+1}\). Schematically,

R^M_{t+1}    c_{t+1}    U'(c_{t+1})    m_{t+1}
    ↗           ↗            ↘            ↘
    ↘           ↘            ↗            ↗

To show that (8.11) implies the CAPM pricing relation (SML), start by writing the
fundamental pricing equation in return form, as in the derivation of (8.8):

\[ 1 = E_t\left[m_{t+1} R^j_{t+1}\right]
   \;\Rightarrow\; E_t[R^j_{t+1}] - R^f_{t+1} = -R^f_{t+1} Cov_t(m_{t+1}, R^j_{t+1}) \]

For \(m_{t+1} = a - bR^M_{t+1}\), we get

\[ \begin{aligned}
E_t[R^j_{t+1}] - R^f_{t+1} &= -R^f_{t+1} Cov_t(a - bR^M_{t+1}, R^j_{t+1}) \\
&= R^f_{t+1} b \, Cov_t(R^M_{t+1}, R^j_{t+1})
\end{aligned} \]

Since this model applies to any asset, it also applies to the Market itself:

\[ E_t[R^M_{t+1}] - R^f_{t+1} = R^f_{t+1} b \, Cov_t(R^M_{t+1}, R^M_{t+1})
   \;\Rightarrow\; \frac{E_t[R^M_{t+1}] - R^f_{t+1}}{Var_t(R^M_{t+1})} = R^f_{t+1} b \]

Replacing in the previous equation for any asset j,

\[ \begin{aligned}
E_t[R^j_{t+1}] - R^f_{t+1} &= \frac{E_t[R^M_{t+1}] - R^f_{t+1}}{Var_t(R^M_{t+1})} Cov_t(R^M_{t+1}, R^j_{t+1}) \\
&= \frac{Cov_t(R^M_{t+1}, R^j_{t+1})}{Var_t(R^M_{t+1})} \left( E_t[R^M_{t+1}] - R^f_{t+1} \right)
\end{aligned} \]

which is the CAPM. In the more standard notation with beta, net (instead of gross)
returns, and stressing that R is for any asset j,

E[rj ] − rf = βj ( E[rM ] − rf ), ∀j

Alternative proof (Cochrane, 2005)

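To show that (8.11) implies the CAPM pricing relation (SML), we start by determining
the constants a and b. In this proof the discount factor is written as \(m = a + bR^M\), so
b here has the opposite sign to the b in (8.11). First, the model must price the risk-free asset:

\[ \begin{aligned}
1 &= E[mR^f] \\
\Rightarrow 1 &= E[(a + bR^M)R^f] \\
\Rightarrow a &= \frac{1 - bR^f E[R^M]}{R^f}
\end{aligned} \]

Second, since the model applies to any asset, it also applies to \(R^M\) itself:

\[ \begin{aligned}
1 &= E[mR^M] \\
\Rightarrow 1 &= E[(a + bR^M)R^M] \\
\Rightarrow 1 &= a E[R^M] + b E[(R^M)^2]
\end{aligned} \]

Using the previous expression for a and the fact that \(Var(x) = E[x^2] - (E[x])^2\),

\[ \begin{aligned}
\Rightarrow 1 &= \frac{1 - bR^f E[R^M]}{R^f} E[R^M] + b E[(R^M)^2] \\
\Rightarrow R^f &= (1 - bR^f E[R^M]) E[R^M] + bR^f E[(R^M)^2] \\
\Rightarrow R^f - E[R^M] &= -bR^f (E[R^M])^2 + bR^f E[(R^M)^2] \\
\Rightarrow b &= -\frac{E[R^M] - R^f}{R^f \, Var[R^M]}
\end{aligned} \]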

We can now show that the fundamental asset pricing equation with m = a + bRM

implies the CAPM. Starting from (8.5), using (8.6), and the expression above for b,

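\[ \begin{aligned}
1 &= E[mR] \\
\Rightarrow 1 &= b \, Cov(R^M, R) + E[R]/R^f \\
\Rightarrow 1 &= -\frac{E[R^M] - R^f}{R^f \, Var[R^M]} Cov(R^M, R) + E[R]/R^f \\
\Rightarrow E[R] - R^f &= \frac{Cov(R^M, R)}{Var[R^M]} \left( E[R^M] - R^f \right)
\end{aligned} \]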
which is the CAPM. In the more standard notation with beta, net (instead of gross)
returns, and stressing that R is for any asset j,

E[rj ] − rf = βj ( E[rM ] − rf ), ∀j
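
The algebra of the alternative proof can be sanity-checked numerically. The sketch below builds the SDF \(m = a + bR^M\) from the formulas for a and b derived above, prices an arbitrary payoff with p = E[mx], and confirms that the resulting return satisfies the beta relation. The probabilities, market returns, risk-free rate, and payoff are hypothetical numbers chosen only for this illustration.

```python
# Hypothetical three-state economy (assumed values, not from the notes).
probs = [0.3, 0.4, 0.3]
r_m = [0.90, 1.05, 1.20]     # gross market return by state
r_f = 1.02                   # gross risk-free return
x = [1.0, 2.0, 4.0]          # payoff of an arbitrary asset j


def expect(values):
    return sum(p * v for p, v in zip(probs, values))


def cov(u, v):
    return expect([a * b for a, b in zip(u, v)]) - expect(u) * expect(v)


e_rm = expect(r_m)
var_rm = cov(r_m, r_m)

# Constants from the alternative proof (sign convention m = a + b * R^M).
b = -(e_rm - r_f) / (r_f * var_rm)
a = (1.0 - b * r_f * e_rm) / r_f
m = [a + b * r for r in r_m]

# Price the asset with p = E[m x] and form its gross return.
price = expect([ms * xs for ms, xs in zip(m, x)])
r_j = [xs / price for xs in x]

beta_j = cov(r_m, r_j) / var_rm
lhs = expect(r_j) - r_f               # expected excess return of asset j
rhs = beta_j * (e_rm - r_f)           # beta times market risk premium

print("a, b:", a, b)
print("E[R_j] - R_f      :", lhs)
print("beta_j (E[R_M]-R_f):", rhs)
```

With these inputs the SDF stays positive in every state, but nothing in the linear form guarantees that in general.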

8.8 Conclusion
The fundamental asset pricing equation,

\[ p = E[mx], \qquad m = \delta U'(c_{t+1})/U'(c_t), \]

is the basic framework that should be able to answer all asset
pricing questions. However, if we specialize the model to quadratic utility (CCAPM) or
even power utility, the model does not match the empirical stock returns data.

Hence, beta or factor pricing models (CAPM, APT, FF3) are currently better em-
pirical alternatives. The point to note is that these beta models are specific cases of the
general consumption framework. They are just using proxies for marginal utility that
are easier to measure. For instance, the CAPM is the special case where m = a − bRM .

8.9 Exercises
Ex. 49 — In a 2 period consumption model, the investor chooses the quantity (x) of
the security to buy today (t) to maximize the utility of consumption (c). The problem
is thus:

\[ \begin{aligned}
\max_{x} \quad & E_t[U(c_t) + \delta U(c_{t+1})] \\
\text{s.t.} \quad & c_t = e_t - x p_t \\
& c_{t+1} = e_{t+1} + x v_{t+1}
\end{aligned} \]
where pt is the price of the security, vt+1 its terminal payoff, and e is an exogenous
endowment the investor receives each period.

1. Write the first-order condition.


2. Write the second-order condition. What condition on the utility function will
ensure that we are at a maximum?
3. Compute the pricing kernel for the utility function \( U(c) = \frac{c^{1-\gamma}}{1-\gamma} \).

Ex. 50 — There are 3 states of nature and 3 complex securities. The payoff matrix is:

                Payoff                     Prob
State   Asset 1   Asset 2   Asset 3       π(s)
s=1        3         4         1          0.25
s=2        0         2         1          0.50
s=3        0         0         1          0.25
Price     1.2       1.8       0.7

1. Is the market complete?


2. Compute the Arrow-Debreu prices.
3. Consider a new fourth security with payoff = [2, 10, 4]’. Compute its price.
4. Compute the value of the pricing kernel at each state.
5. Compute the price of the new fourth security defined above using the pricing
kernel (recall p = E[m × payoff]).

Ex. 51 — The fundamental pricing equation can be written in return form as 1 =


\(E_t[m_{t+1} R_{t+1}]\), where \(m_{t+1}\) is the pricing kernel and \(R_{t+1}\) is the gross return.
1. A risk-free security costs 1 and pays a gross return of \(R^f_{t+1}\), known at time t.
Write the pricing equation for this security.
2. Manipulate the fundamental pricing equation for a risky security to get the excess
return \(E_t[R_{t+1}] - R^f_{t+1}\) (on the left-hand side) explained by the covariance between
marginal utility and returns (on the right-hand side).
3. Explain in words the economic meaning of the previous equation.

Ex. 52 — (Risk Premiums) There are 2 states of nature and 2 assets. The payoffs and
consumption in the next period can be:

Payoff
State Asset 1 Asset 2 Consumption Probability
s=1 10 20 100 0.5
s=2 20 10 150 0.5

Consumption today is c0 = 100. The representative investor has log utility and is
indifferent between consuming the same amount today or in 1 period.
1. Use the fundamental asset pricing equation to compute the price of the two assets.

2. Using only words, provide intuition for why one asset is more expensive than the
other.
3. Now use equations and numbers to explain rigorously the price differences. (Hint:
manipulate the fundamental equation so that price equals two terms: the first is
an “intuitive” price; the second is a risk adjustment. Compute the values and
explain in words what the numbers mean.)

Ex. 53 — “The stochastic discount factor is always positive.” True or False?

Ex. 54 — Consider a 2 period consumption model. The investor chooses the quantity
of the stock (\(z^s\)) and the quantity of a risky corporate bond (\(z^b\)) to buy today (t). The
problem is thus:

\[ \begin{aligned}
\max_{z^s, z^b} \quad & U(c_t) + E_t[\delta U(c_{t+1})] \\
\text{s.t.} \quad & c_t + z^s p^s_t + z^b p^b_t = W_t \\
& c_{t+1} = z^s x^s_{t+1} + z^b x^b_{t+1}
\end{aligned} \tag{8.12} \]

where \(p^j_t\) is the price of security j and \(x^j_{t+1}\) is its total payoff (j = s, b). \(W_t\) is the
exogenous initial wealth of the investor.
1. Write the first order conditions for this problem.
2. Assume U (c) = ln(c). Write the pricing kernel for this utility function.
3. Assume δ = 0.99 and \(c_t = 1000\). There are 4 possible states of nature tomorrow.
Consumption and the payoffs of the bond are the following:
State (s)   Prob[s]   c_{t+1}   x^b_{t+1}
1           0.1       900       0
2           0.2       1000      95
3           0.5       1100      100
4           0.2       1200      100
Compute the price of the bond, \(p^b_t\).
Chapter 9

Conclusion

Overview of asset pricing frameworks, models, and applications.

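[Diagram: overview of asset pricing frameworks. Recoverable content, by column:]

Framework                        Model                    Pricing formula                              Applications
max E[U(Y)] + Equilibrium        CAPM                     p = E[x]/(1 + E[r]), E[r] from SML           1. Stock pricing; 2. Corporate Projects; 3. Fund Performance
Factor Model + No Arbitrage      APT                      p = E[x]/(1 + E[r]), ex: E[r] from FF3       1. Covariance Matrix; 2. Hedging strategies
Complete Market + No Arbitrage   AD pricing, RN pricing   p = Σ_s x(s) p^ad(s);  p = E^Q[x]/(1 + r_f)  Derivatives pricing
max E[U(c)] + Equilibrium        General SDF, CCAPM       p = E[mx]                                    Future AP models

Links shown in the diagram: m = a − bR^M (general SDF to CAPM); m = p^ad ./ π (general SDF
to AD pricing); U'(c) = a − bc (general SDF to CCAPM); f = r^M (APT to CAPM).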

“Mind what you have learned. Save you it can.”

Bibliography

Bodie, Z., A. Kane, and A. Marcus, 2005, Investments. McGraw-Hill, 6th edn.

Breeden, D., 1979, “An Intertemporal Asset Pricing Model with Stochastic Consumption
and Investment Opportunities,” Journal of Financial Economics, 7, 265–296.

Chen, N.-F., R. Roll, and S. A. Ross, 1986, “Economic Forces and the Stock Market,”
Journal of Business, 59, 383–403.

Chiang, A. C., 1984, Fundamental Methods of Mathematical Economics. McGraw-Hill.

Cochrane, J. H., 2005, Asset Pricing. Princeton University Press.

Cvitanić, J., and F. Zapatero, 2004, Introduction to the Economics and Mathematics of
Financial Markets. The MIT Press.

Danthine, J.-P., and J. B. Donaldson, 2005, Intermediate Financial Theory. Elsevier
Academic Press, 2nd edn.

Fama, E. F., and K. R. French, 1993, “Common Risk Factors in the Returns on Stocks
and Bonds,” Journal of Financial Economics, 33, 3–56.

Fama, E. F., and K. R. French, 1996, “Multifactor Explanations of Asset Pricing Anomalies,”
Journal of Finance, 51(1), 55–84.

Huang, C.-f., and R. H. Litzenberger, 1988, Foundations for Financial Economics.
Prentice-Hall.

Ingersoll, J. E., 1987, Theory of Financial Decision Making. Rowman and Littlefield.

Jagannathan, R., and E. R. McGrattan, 1995, “The CAPM Debate,” Federal Reserve
Bank of Minneapolis Quarterly Review, 19(4), 2–17.

J.P. Morgan, 1996, Risk Metrics — Technical Document. J.P. Morgan.

Malkiel, B. G., 2005, “Reflections on the Efficient Market Hypothesis: 30 Years Later,”
Financial Review, 40, 1–9.


Roll, R., 1977, “A Critique of the Asset Pricing Theory’s Test — Part I: On past and
potential testability of the theory,” Journal of Financial Economics.

Ross, S. A., 1976, “The Arbitrage Theory of Capital Asset Pricing,” Journal of Economic
Theory, 13, 341–360.
Appendix A

Background Review

A.1 Math Review

A.1.1 Logarithm and Exponential

Definition. \( \ln x = y \Leftrightarrow e^y = x \)

The log function is increasing,

\[ (\ln x)' = 1/x > 0 \]

and concave,

\[ (\ln x)'' = (1/x)' = -x^{-2} = -1/x^2 < 0 \]

Plot it:

[Blank axes for the plot: y (vertical) against x (horizontal).]


The exponential function \(y = e^x\) is increasing, but not concave. Since we will be
interested in increasing and also concave functions, we will use \(y = -e^{-x}\). Plot \(y = e^x\),
\(y = e^{-x}\), and \(y = -e^{-x}\):

[Blank axes for the plot: y (vertical) against x (horizontal).]

A.1.2 Derivatives

Basic rules

Let f and g be functions of x. Let a be a constant.

Function    Derivative
a f         a f'
f g         f' g + f g'
f / g       (f' g − f g') / g^2
f^a         a f' f^(a−1)
e^f         f' e^f
a^g         g' a^g ln a
f^g         g f' f^(g−1) + g' f^g ln f
ln x        1/x
ln f        f' / f

Chain rule

\[ \frac{df(g(x))}{dx} = \frac{df}{dg} \frac{dg}{dx} = f'(g) g'(x) \]

The following examples are from Chiang (1984, p.170).



Example A.1.1. Let \(z = 3y^2\), with \(y = 2x + 5\). Note that a change in x
causes a change in y, which in turn causes a change in z, like a chain reaction,
hence the name. Applying the rule,

\[ \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} = (6y) \times 2 = 12(2x + 5) \]

We can check this result by replacing \(y = 2x + 5\) in z, and computing dz/dx
directly: dz/dx = 24x + 60.

The rule becomes useful for complicated functions like:

Example A.1.2. With \(z = (x^2 + 3x - 2)^{17}\), computing dz/dx directly would
require a lot of work. Instead, we set \(z = y^{17}\), with \(y = x^2 + 3x - 2\), and apply
the chain rule:

\[ \frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} = 17y^{16}(2x + 3) = 17(x^2 + 3x - 2)^{16}(2x + 3) \]

Implicit Function Theorem

Given the equation f(x, y) = 0, then

\[ \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} \]

The following examples are from Chiang (1984, p.208).

Example A.1.3. Let \(f(x, y) := y - 3x^4 = 0\) (which implicitly defines \(y = 3x^4\)).
Applying the IFT,

\[ \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} = 12x^3 \]

which can be checked by computing directly \( \frac{dy}{dx} = \frac{d(3x^4)}{dx} = 12x^3 \).

The rule becomes useful for complicated equations like:

Example A.1.4. Let \(f(x, y, w) := y^3x^2 + w^3 + yxw - 3 = 0\). Note that we
cannot write explicitly y = y(x, w). Still, we can use the IFT to compute

\[ \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} = -\frac{2y^3x + yw}{3y^2x^2 + xw} \]

Taylor expansion

\[ f(x) = f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2 + \dots + \frac{1}{n!} f^{(n)}(a)(x - a)^n + \dots \]

This allows us to express a (sufficiently smooth) function f as a polynomial.

The following examples are from Chiang (1984, p.259).

Example A.1.5. Consider the quadratic function \(f(x) = 5 + 2x + x^2\). Note
that this is already a polynomial, so Taylor's rule will take us back to the original
function. Just for illustration:

\[ f(x) = 5 + 2a + a^2 + (2 + 2a)(x - a) + \tfrac{1}{2} \times 2(x - a)^2 + 0 = 5 + 2x + x^2 \]

Taylor's rule is typically used to approximate a function by a low-degree polynomial.

Example A.1.6. We can use Taylor's rule to approximate the quadratic function
in the previous example by a linear function:

\[ f(x) \approx 5 + 2a + a^2 + (2 + 2a)(x - a) \]

For example, around a = 1 we have the approximation \(f(x) \approx 4x + 4\). (Plot it!)
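
A quick way to see how good the linear approximation in Example A.1.6 is: evaluate both functions near a = 1. The sketch below is illustrative only; the function names are made up for this purpose.

```python
def f(x):
    """The quadratic from Example A.1.5."""
    return 5 + 2 * x + x ** 2


def f_prime(x):
    """Its first derivative."""
    return 2 + 2 * x


def linear_approx(x, a):
    """First-order Taylor approximation of f around a."""
    return f(a) + f_prime(a) * (x - a)


a = 1.0
for x in [0.9, 1.0, 1.1, 2.0]:
    exact = f(x)
    approx = linear_approx(x, a)
    # For this quadratic the approximation error is exactly (x - a)^2.
    print(f"x={x:.1f}  f(x)={exact:.4f}  approx={approx:.4f}  error={exact - approx:.4f}")
```

The error grows with the squared distance from a, which is why the approximation is only useful locally.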

A.1.3 Optimization

The following examples are from Chiang (1984, p.370).

Consider the following problem:

\[ \begin{aligned}
\max_{x_1, x_2} \quad & x_1 x_2 + 2x_1 \\
\text{s.t.} \quad & 4x_1 + 2x_2 = 60
\end{aligned} \]

(This can be seen as maximizing the utility of consuming two goods, subject to a budget
restriction.)

There are two equivalent ways to solve this problem:



Option 1: Substitution Substituting the constraint \(x_2 = 30 - 2x_1\) into the function,
we get

\[ \max_{x_1} \quad x_1(30 - 2x_1) + 2x_1 \]

The first order condition for an optimum is

\[ d\big(x_1(30 - 2x_1) + 2x_1\big)/dx_1 = 0 \;\Rightarrow\; x_1 = 8 \]

and thus \(x_2 = 14\).

Option 2: Lagrangian For more complicated problems, the Lagrangian is more
useful:

\[ L = x_1 x_2 + 2x_1 + \lambda(4x_1 + 2x_2 - 60) \]

We now have 3 foc:

\[ \begin{cases}
dL/dx_1 = x_2 + 2 + 4\lambda = 0 \\
dL/dx_2 = x_1 + 2\lambda = 0 \\
dL/d\lambda = 4x_1 + 2x_2 - 60 = 0
\end{cases}
\;\Rightarrow\;
\begin{cases}
x_1 = 8 \\
x_2 = 14 \\
\lambda = -4
\end{cases} \]

As expected, the solution is the same.

Note that we can also write the lagrangian as

L = x1 x2 + 2x1 − λ(4x1 + 2x2 − 60)

or
L = x1 x2 + 2x1 + λ(−4x1 − 2x2 + 60)
These will change the sign of the multiplier, λ, but will not change the values of the
choice variables, x1 and x2 .
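
Both solution methods can be verified with a few lines of code. The sketch below solves the substituted first-order condition by hand (32 − 4x1 = 0), backs out λ from the Lagrangian conditions with the first sign convention, and compares the objective at nearby feasible points. Function names are illustrative.

```python
def objective(x1, x2):
    return x1 * x2 + 2 * x1


def x2_from_budget(x1):
    """Solve the constraint 4*x1 + 2*x2 = 60 for x2."""
    return (60 - 4 * x1) / 2


# Option 1: after substitution the objective is 32*x1 - 2*x1^2, so the foc is 32 - 4*x1 = 0.
x1_star = 32 / 4
x2_star = x2_from_budget(x1_star)

# Option 2: from dL/dx2 = x1 + 2*lambda = 0 (with L = f + lambda * (4*x1 + 2*x2 - 60)).
lam = -x1_star / 2
dL_dx1 = x2_star + 2 + 4 * lam          # should be zero at the optimum

# Compare with nearby points that also satisfy the budget constraint.
best = objective(x1_star, x2_star)
neighbours = [objective(x1_star + dx, x2_from_budget(x1_star + dx)) for dx in (-0.5, 0.5)]

print("x1*, x2*, lambda:", x1_star, x2_star, lam)
print("dL/dx1 at the solution:", dL_dx1)
print("objective at optimum vs neighbours:", best, neighbours)
```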

A.1.4 Means and Variances

For an intuitive review of random variables and their moments, consider the following
returns on two stocks:

month ra rb
1 0 0
2 0.05 0.1
3 0 0
4 -0.05 -0.1
5 0 0

Plot these time series.

We can easily see that

\[ E[r_a] = E[r_b] = 0 \]

and

\[ Var[r_a] < Var[r_b] \]

Furthermore, it should be clear that the two stocks are perfectly correlated,

\[ \rho(a, b) := \frac{Cov(a, b)}{\sqrt{Var(a) \, Var(b)}} = 1 \]

As an exercise, assume that each value is equally likely (ie, each observation has 0.2
probability) and compute the variances, the covariance, and check that the correlation
coefficient is indeed 1.

A.2 Undergraduate Finance Review

A good reference for undergraduate finance is Bodie, Kane, and Marcus (2005).

A.2.1 Financial Markets and Instruments

Money Market

Short-term market. Instruments:

• Treasury Bills
• Certificates of Deposit
• Commercial paper
• LIBOR market
• EURIBOR (Euro Interbank Offered Rate)

Bonds

Bonds are debt instruments. The typical bond has a fixed (known) coupon rate and
is fully amortized at maturity. Bonds can be issued by corporations and governments
(Treasury Bills, T. Notes, T. Bonds).

Stocks

Stocks represent ownership in a corporation. Shareholders vote to elect the board at


an annual meeting. Each stock receives a (variable, unknown) dividend each year (or
quarter). However, a stock is the residual claim on the value of the corporation, meaning
that shareholders will only receive a dividend after all other liabilities have been paid.

Stock Indexes

Uses:

• Track average returns.

• Comparing performance of managers.

• Base of derivatives

Examples:

• Dow Jones Industrial Average (30 Stocks)


• Standard & Poors 500 Composite
• NASDAQ Composite
• Nikkei 225
• FTSE
• Dax
• PSI20

Derivatives

Examples: Forward, Futures, Options, Swaps, etc. Value depends on underlying asset.
Used to hedge risks or speculate.

Short selling

Purpose: to profit from a decline in the price of a stock or security.

Mechanics:

1. Borrow stock through a dealer.

2. Sell it and deposit proceeds and margin in an account.



3. To close out the position: buy the stock and return to the party from which it
was borrowed.

Short Selling Puzzle. Most stocks are easy to short sell. However, investors do very
little short selling.

A.2.2 Time value of money

$1 today is worth more than $1 tomorrow. Assume a risk-free interest rate of 5% per
year. The Present Value of $1 to be received for sure in one year is

PV = $1 / 1.05 ≈ $0.95 < $1

We are indifferent between receiving $0.95 today or $1 in one year.

Receiving $10 per year for the next 2 years is equivalent to having today

PV = $10 / 1.05 + $10 / 1.05^2

Example A.2.1. A one-month TBill sells for 99.6737% (of par value). The
one-month risk-free interest rate is

\[ 99.6737 = \frac{100}{1 + r_{1m}} \;\Rightarrow\; r_{1m} = 0.003274 = 0.3274\% = 32.74\text{ bps} \]

Interest rates are usually expressed on an annual basis. There are two options:

Annual Percentage Rate: \( r^{APR}_{1m} = 0.3274\% \times 12 = 3.9288\% \)
Equivalent Annual Rate: \( r^{EAR}_{1m} = (1 + 0.003274)^{12} - 1 \approx 4\% \)
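
The conversions in Example A.2.1 can be reproduced directly. This is a minimal sketch; variable names are illustrative.

```python
price = 99.6737            # T-bill price, in % of par
par = 100.0

# One-month rate implied by price = par / (1 + r).
r_1m = par / price - 1.0

# Two annualisation conventions from the example.
apr = r_1m * 12                      # simple (no compounding)
ear = (1.0 + r_1m) ** 12 - 1.0       # monthly compounding

print(f"one-month rate: {r_1m:.6f} ({r_1m * 1e4:.2f} bps)")
print(f"APR: {apr:.4%}")
print(f"EAR: {ear:.4%}")
```

The EAR exceeds the APR because it includes interest earned on interest.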

Some useful formulas:

Annuity Present value (t = 0) of $1 received during T periods (from t = 1 to t = T),
discounted at a rate of r:

\[ AF(T, r) := \frac{1 - (1 + r)^{-T}}{r} \]

Example A.2.2. An 8-yr Treasury Bond pays annual coupons at 6%. The
risk-free term structure is flat at 5%. The price of the Bond is

\[ P = 6\% \times AF(8, 5\%) + \frac{100\%}{1.05^8} \]

Perpetuity Present value (t = 0) of $c received forever (from t = 1 to ∞), discounted
at an interest rate of r:

\[ PV = c/r \]

Perpetuity with growth Present value (t = 0) of \(\{c_t\}_{t=1}^{\infty}\) with \(c_{t+1} = c_t(1 + g)\),
discounted at an interest rate of r:

\[ PV = c_1/(r - g) \]

Example A.2.3. A stock will pay a dividend of $2 in one year. Dividends are
expected to grow at 6% forever. The required return on the stock is 10%. Its
fair value today is

P = $2 / (0.10 − 0.06) = $50
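
The annuity and growing-perpetuity formulas above can be cross-checked against brute-force discounting. The sketch below applies them to Examples A.2.2 and A.2.3; the function names and the 2000-term truncation are choices made for this illustration.

```python
def annuity_factor(periods, rate):
    """Present value of 1 received at t = 1..periods, discounted at `rate`."""
    return (1.0 - (1.0 + rate) ** (-periods)) / rate


def growing_perpetuity(first_cash_flow, rate, growth):
    """PV = c1 / (r - g); only meaningful when rate > growth."""
    if rate <= growth:
        raise ValueError("rate must exceed growth for the sum to converge")
    return first_cash_flow / (rate - growth)


# Example A.2.2: 8-year bond, 6% annual coupon, flat 5% term structure (price in % of par).
bond_price = 6.0 * annuity_factor(8, 0.05) + 100.0 / 1.05 ** 8

# Cross-check the annuity factor against the explicit sum of discount factors.
af_by_sum = sum(1.0 / 1.05 ** t for t in range(1, 9))

# Example A.2.3: dividend of 2 next year, growing at 6%, discounted at 10%.
stock_value = growing_perpetuity(2.0, 0.10, 0.06)

# Cross-check with a long (truncated) sum of discounted growing dividends.
truncated_sum = sum(2.0 * 1.06 ** (t - 1) / 1.10 ** t for t in range(1, 2001))

print("annuity factor AF(8, 5%):", round(annuity_factor(8, 0.05), 4))
print("bond price (% of par):", round(bond_price, 4))
print("stock value:", round(stock_value, 4), "truncated sum:", round(truncated_sum, 4))
```

The bond prices above par because its coupon rate exceeds the discount rate.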

A.2.3 Risk and Return

Risk-return tradeoff

Statistics on annual returns on US assets for 1926–2002 (in %):

Asset           Mean    Std Dev    Risk Premium
Small Stocks    17.7    39.3       13.9
Large Stocks    12.0    20.6        8.2
LT Gov Bonds     5.7     8.2        1.9
T-Bills          3.8     3.2         –

More risk is compensated with higher returns. But what exactly explains these risk
premiums? Most of this course is about explaining differences in risk premiums.

Other kinds of risk

Current research is trying to understand and model other sources of risk:

Liquidity risk: The risk of not being able to trade immediately at a fair price.

Credit risk: The risk of not receiving promised payments.



A.2.4 Equilibrium and No Arbitrage

Financial models can be classified into two categories: Equilibrium and Arbitrage.

Definition (Arbitrage). Arbitrage is the possibility to make money without any risk.

In financial markets there are no arbitrage opportunities. After all, it only takes
a few “sharks” to constantly monitor the markets and quickly eliminate any arbitrage
opportunity. Hence, in modeling financial markets, we always assume that there is no
arbitrage.

Pricing by arbitrage can only give relative values, i.e., it uses the (given) prices of
some basic assets to explain the prices of other securities (sometimes called “ketchup
economics”). Nonetheless, arbitrage models require fewer assumptions and are more
applicable in practice.

Definition (Equilibrium). The market for an asset is in equilibrium if the supply equals
the demand for that asset.

The demand is the result of many investors making optimal choices, i.e., buying the
quantity that optimizes their well-being (subject to some restrictions). In most cases of
financial models, the supply is taken as exogenous.

Equilibrium models aim for a complete theory of value, ie, they start from primitives
(investors’ preferences, firms’ technology, market structure, etc) and get to prices. The
goal is to understand how prices (or risk premiums) depend on the fundamental char-
acteristics of the economy. They are thus more general than arbitrage models, though
harder to implement.
Remark. If the market is in Equilibrium, then there are No Arbitrage opportunities
(everybody is maximizing, so there can be no easy way to make money). However,
the reverse is not true. Hence, Equilibrium is a stronger condition. The advantage of
requiring only No Arbitrage is that we need to make fewer assumptions.

Suggestion: read sections 2.1–2.3 in Danthine and Donaldson (2005) for an intro-
duction to the valuation methods we will be studying in this course.
Appendix B

Solutions to Problems

Answer (Ex. 3) — lover; averse.

Answer (Ex. 4) — To preserve those measures under linear transformations of the
utility function. If we used only the second derivative (e.g., $ARA^* = -U''$), then for
example $\ln W$ and $a + b \ln W$ (with $b > 0$) would have different $ARA^*$ and $RRA^*$ (check this). This
is not desirable because $\ln W$ and $a + b \ln W$ represent exactly the same preferences (i.e.,
the same person).

Answer (Ex. 5) — The indifference probability is such that
\[ U(Y) = \pi U(Y + \theta Y) + (1 - \pi) U(Y - \theta Y) \]
Expanding $U(Y + \theta Y)$ and $U(Y - \theta Y)$ in Taylor series around $Y$, we get
\begin{align*}
U(Y + \theta Y) &\cong U(Y) + \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \\
U(Y - \theta Y) &\cong U(Y) - \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y)
\end{align*}
Replacing back in the previous equation and canceling terms produces the required
relation:
\begin{align*}
U(Y) &\cong \pi \left[ U(Y) + \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \right]
      + (1 - \pi) \left[ U(Y) - \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \right] \\
\Rightarrow U(Y) &\cong \pi \left[ U(Y) + \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y)
      - U(Y) + \theta Y U'(Y) - \tfrac{1}{2}(\theta Y)^2 U''(Y) \right] \\
      &\quad + U(Y) - \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \\
\Rightarrow 0 &\cong \pi \, 2\theta Y U'(Y) - \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \\
\Rightarrow \pi &\cong \frac{1}{2} - \frac{\theta^2 Y^2 U''(Y)}{4 \theta Y U'(Y)} \\
\Rightarrow \pi &\cong \frac{1}{2} + \frac{1}{4}\, \theta \, RRA(Y)
\end{align*}
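
The quality of this second-order approximation can be illustrated numerically. The sketch below assumes log utility (so RRA = 1), for which the exact indifference probability has a closed form because ln Y cancels from both sides of the indifference condition. The function names are illustrative only.

```python
import math


def exact_pi_log(theta: float) -> float:
    """Exact indifference probability under U = ln(Y).

    From ln Y = pi*ln(Y(1+theta)) + (1-pi)*ln(Y(1-theta)), ln Y cancels, so
    0 = pi*ln(1+theta) + (1-pi)*ln(1-theta).
    """
    return -math.log(1 - theta) / (math.log(1 + theta) - math.log(1 - theta))


def approx_pi(theta: float, rra: float) -> float:
    """Taylor approximation: pi ~ 1/2 + theta*RRA/4."""
    return 0.5 + 0.25 * theta * rra


for theta in (0.01, 0.05, 0.10, 0.25):
    print(theta, round(exact_pi_log(theta), 6), round(approx_pi(theta, 1.0), 6))
```

As expected for a Taylor expansion, the two values agree closely for small gambles and drift apart as theta grows.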

Answer (Ex. 6) —

Name          U(W)                          Restrictions on parameters                     ARA                                RRA
Log           $\ln(W)$                      n/a                                            $1/W$ (decreasing)                 $1$ (constant)
Power         $W^{1-\gamma}/(1-\gamma)$     $\gamma > 0$                                   $\gamma/W$ (decreasing)            $\gamma$ (constant)
Exponential   $-\exp(-\alpha W)$            $\alpha > 0$                                   $\alpha$ (constant)                $\alpha W$ (increasing)
Quadratic     $aW - bW^2$                   $b > 0$ ($\Rightarrow U'' < 0$);               $\frac{2b}{a-2bW}$ (increasing(1)) $\frac{2bW}{a-2bW}$ (increasing(1))
                                            $W < \frac{a}{2b}$ ($\Rightarrow U' > 0$);
                                            $a > 0$ ($\Rightarrow U' > 0$ on $W > 0$)

(1) Check that $\frac{dARA}{dW} > 0$ and $\frac{dRRA}{dW} > 0$.

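Answer (Ex. 7) —
1) $U_2$ is a linear transformation of $U_1$, hence represents the same preferences (see
proposition 2.2.1).
2) Use L'Hôpital's rule to get
\[ \lim_{\gamma \to 1} \frac{W^{1-\gamma} - 1}{1 - \gamma}
 = \lim_{\gamma \to 1} \frac{\frac{d}{d\gamma}\left(W^{1-\gamma} - 1\right)}{\frac{d}{d\gamma}(1 - \gamma)}
 = \lim_{\gamma \to 1} \frac{-W^{1-\gamma} \ln W}{-1} = \ln(W) \]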
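
The limit can also be seen numerically. The following sketch evaluates the shifted power utility for values of γ approaching 1 and compares it with ln(W); the wealth level W = 2 is an arbitrary illustrative choice.

```python
import math


def shifted_power_utility(W: float, gamma: float) -> float:
    """(W^(1-gamma) - 1) / (1 - gamma), defined for gamma != 1."""
    return (W ** (1 - gamma) - 1) / (1 - gamma)


W = 2.0  # arbitrary wealth level, for illustration only
for gamma in (1.5, 1.1, 1.01, 1.001, 1.0001):
    print(gamma, shifted_power_utility(W, gamma), math.log(W))
```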

Answer (Ex. 8) — $U' = 20Y$, $U'' = 20 > 0$, hence the investor is risk-loving. Most
investors demand a premium to bear risk (are risk-averse), hence this utility would not
be a reasonable assumption.

Answer (Ex. 9) —
1) Constant RRA.
2) $\ln(Y)$ or $Y^{1-g}/(1-g)$

Answer (Ex. 10) —
1) The ARA measures the willingness to take gambles defined in absolute terms (money).
The RRA measures the willingness to take gambles defined in percentage of wealth.
2) It means that as his wealth increases, he becomes less willing to take a gamble defined
in absolute terms. For example, consider a fair gamble of winning or losing $100. As
the investor’s wealth increases from, say, $1 to $2000, the investor becomes less willing
to risk $100. It does not seem a reasonable assumption. An example is $U = aW - bW^2$ (show that
$dARA/dW > 0$).
3) $U = \frac{W^{1-g}}{1-g}$, $RRA = g$.

Answer (Ex. 11) — $E\,U(Y + L) = U(Y + CE) \Rightarrow 0.3 \ln 120 + 0.7 \ln 80 = \ln(100 + CE)
\Rightarrow CE = -9.65$. The investor is indifferent between playing the game or reducing
his wealth to 90.35 for sure.
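
The certainty equivalent in this answer can be reproduced in a few lines. The sketch below uses the numbers that appear in the answer (wealth of 100, terminal wealth of 120 with probability 0.3 and 80 with probability 0.7, log utility); the variable names are illustrative.

```python
import math

Y = 100.0                                  # initial wealth
lottery = [(0.3, 120.0), (0.7, 80.0)]      # (probability, terminal wealth)

# Expected utility of terminal wealth under U = ln
expected_utility = sum(p * math.log(w) for p, w in lottery)

# Solve ln(Y + CE) = expected_utility for CE
CE = math.exp(expected_utility) - Y

print(round(CE, 2), round(Y + CE, 2))
```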

Answer (Ex. 12) —


1) CE = 41.42
2) CE = 9.05
3) At higher values of wealth, the investor is less risk averse, hence he only trades the
gamble for a high value (41.42) closer to the expected value (50). At lower wealth levels
(1), the investor is more risk averse, thus willing to trade the gamble for a smaller sure
amount (9.05). Another way to say this is the following: At low wealth (1), marginal
utility is very high, that is, the investor is desperate for more food. Thus, if he owned
the gamble, he would be willing to sell it for a sure amount as low as 9.05. If instead he
was fat (Y = 100) and not so desperate for more food (low marginal utility), he would
not mind taking the gamble himself, that is, he would only sell it for a high price (41.42).

Answer (Ex. 14) — No. The sign reverses.

x     $F_3(x)$   $\int_0^x F_3(s)ds$   $F_4(x)$   $\int_0^x F_4(s)ds$   $\int_0^x F_4(s)ds - \int_0^x F_3(s)ds$
1     0          0                     1/3        0                     0 ≥ 0
3     0.25       0                     1/3        2/3                   2/3 ≥ 0
4     0.75       0.25                  1/3        1                     0.75 ≥ 0
6     0.75       1.75                  2/3        5/3                   −1/12 ≤ 0
8     0.75       3.25                  3/3        3
12    1.00
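
The integrated CDFs in the table can be recomputed with exact fractions. The sketch below rests on an assumption: the two distributions are read off the jumps of F3 and F4 in the table (outcomes 3, 4, 12 with probabilities 1/4, 1/2, 1/4 for asset 3; outcomes 1, 6, 8 with probability 1/3 each for asset 4), since the exercise statement is not reproduced here. It uses the identity ∫₀ˣ F(s) ds = E[max(x − Y, 0)] for a nonnegative payoff Y.

```python
from fractions import Fraction as Fr

# Distributions inferred from the jumps of F3 and F4 in the table (assumption).
asset3 = {3: Fr(1, 4), 4: Fr(1, 2), 12: Fr(1, 4)}
asset4 = {1: Fr(1, 3), 6: Fr(1, 3), 8: Fr(1, 3)}


def integrated_cdf(dist, x):
    """Integral of the CDF from 0 to x, computed as E[max(x - Y, 0)]."""
    return sum(p * max(x - y, 0) for y, p in dist.items())


grid = [1, 3, 4, 6, 8]
diffs = [integrated_cdf(asset4, x) - integrated_cdf(asset3, x) for x in grid]

for x, d in zip(grid, diffs):
    print(x, integrated_cdf(asset3, x), integrated_cdf(asset4, x), d)
```

The difference changes sign between x = 4 and x = 6, which is why neither asset dominates the other in the second-order sense.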

Answer (Ex. 16) — The distribution of b is a mean-preserving spread of a. Thus,
a SSD b (second-order stochastic dominance). All risk-averse investors prefer a. This
investor is risk averse (compute $U''$), hence also prefers a.

Answer (Ex. 17) — Possible answer: The investor chooses his portfolio allocation by
maximizing the expected utility of terminal wealth.

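Answer (Ex. 18) — The problem is $\max_a E\left[\frac{Y_1^{1-\gamma}}{1-\gamma}\right]$, with $Y_1 = Y_0(1 + r_f) + a(r - r_f)$.
The foc is
\begin{align*}
& \pi [Y_0(1 + r_f) + a(r_2 - r_f)]^{-\gamma}(r_2 - r_f)
  + (1 - \pi)[Y_0(1 + r_f) + a(r_1 - r_f)]^{-\gamma}(r_1 - r_f) = 0 \\
\Rightarrow\ & \left\{ \pi [Y_0(1 + r_f) + a(r_2 - r_f)]^{-\gamma}(r_2 - r_f) \right\}^{-1/\gamma}
  = \left\{ -(1 - \pi)[Y_0(1 + r_f) + a(r_1 - r_f)]^{-\gamma}(r_1 - r_f) \right\}^{-1/\gamma} \\
\Rightarrow\ & [Y_0(1 + r_f) + a(r_2 - r_f)] \cdot [\pi(r_2 - r_f)]^{-1/\gamma}
  = [Y_0(1 + r_f) + a(r_1 - r_f)] \cdot [-(1 - \pi)(r_1 - r_f)]^{-1/\gamma} \\
\Rightarrow\ & [Y_0(1 + r_f) + a(r_2 - r_f)] \cdot [(1 - \pi)(r_f - r_1)]^{1/\gamma}
  = [Y_0(1 + r_f) + a(r_1 - r_f)] \cdot [\pi(r_2 - r_f)]^{1/\gamma} \\
\Rightarrow\ & Y_0(1 + r_f) \cdot [(1 - \pi)(r_f - r_1)]^{1/\gamma} + a(r_2 - r_f) \cdot [(1 - \pi)(r_f - r_1)]^{1/\gamma} \\
  & \quad = Y_0(1 + r_f) \cdot [\pi(r_2 - r_f)]^{1/\gamma} + a(r_1 - r_f) \cdot [\pi(r_2 - r_f)]^{1/\gamma} \\
\Rightarrow\ & \frac{a}{Y_0(1 + r_f)}
  = \frac{[(1 - \pi)(r_f - r_1)]^{1/\gamma} - [\pi(r_2 - r_f)]^{1/\gamma}}
         {(r_1 - r_f)[\pi(r_2 - r_f)]^{1/\gamma} - (r_2 - r_f)[(1 - \pi)(r_f - r_1)]^{1/\gamma}}
\end{align*}
which is the same as (5.4) in Danthine and Donaldson (2005). Plugging in the numbers
we get
\[ \frac{a}{Y_0} = 0.198 \]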
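
One way to gain confidence in the closed form is to plug it back into the first-order condition. The sketch below does this for hypothetical parameter values (they are not the numbers of the exercise, which are not reproduced in this answer); the function names are illustrative.

```python
def optimal_share(pi, r1, r2, rf, gamma):
    """a / Y0 from the closed form above; assumes r1 < rf < r2."""
    A = ((1 - pi) * (rf - r1)) ** (1 / gamma)
    B = (pi * (r2 - rf)) ** (1 / gamma)
    ratio = (A - B) / ((r1 - rf) * B - (r2 - rf) * A)   # a / (Y0 (1 + rf))
    return ratio * (1 + rf)


def foc(a, Y0, pi, r1, r2, rf, gamma):
    """Left-hand side of the first-order condition for power utility."""
    wealth_up = Y0 * (1 + rf) + a * (r2 - rf)
    wealth_down = Y0 * (1 + rf) + a * (r1 - rf)
    return (pi * wealth_up ** (-gamma) * (r2 - rf)
            + (1 - pi) * wealth_down ** (-gamma) * (r1 - rf))


# Hypothetical parameters, chosen only to illustrate the check.
pi, r1, r2, rf, gamma, Y0 = 0.5, -0.10, 0.25, 0.05, 2.0, 1.0

share = optimal_share(pi, r1, r2, rf, gamma)
residual = foc(share * Y0, Y0, pi, r1, r2, rf, gamma)

print(round(share, 4), residual)
```

The residual should be zero up to floating-point error, and the share does not depend on Y0, consistent with constant relative risk aversion.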

Answer (Ex. 19) — The foc is
\[ E[Y_0(r - r_f)/Y_1] = 0 \]
with $Y_1 = Y_0(1 + r_f) + wY_0(r - r_f)$.
Applying the implicit function theorem we get
\begin{align*}
\frac{d\hat{w}}{dY_0}
 &= -\frac{\frac{d}{dY_0} E[Y_0(r - r_f)/Y_1]}{\frac{d}{dw} E[Y_0(r - r_f)/Y_1]} \\
 &= \frac{E[(r - r_f)/Y_1 - Y_0(r - r_f)(1 + r_f + w(r - r_f))/Y_1^2]}{E[Y_0^2(r - r_f)^2/Y_1^2]} \\
 &= \frac{E[(r - r_f)/Y_1 - (r - r_f)/Y_1]}{E[Y_0^2(r - r_f)^2/Y_1^2]} \\
 &= 0
\end{align*}
Note that the denominator is strictly positive as long as $r - r_f \neq 0$ in some states, which
is always the case in reality. (In fact, if r is continuous, then $Prob[r = r_f] = 0$, and
thus the integral in the denominator is not affected by this event. This sentence is not
a required part of the course.)
Alternatively, note that the foc can be further simplified:
\[ E[Y_0(r - r_f)/Y_1] = E\left[\frac{r - r_f}{1 + r_f + w(r - r_f)}\right] = 0 \]
It does not depend on $Y_0$, hence we immediately get
$\frac{d\hat{w}}{dY_0} = -\frac{d\,foc/dY_0}{\ldots} = 0$ (check the
denominator is not zero, just to be sure).

Answer (Ex. 20) —
1) Decreasing.
2) $U(Y) = \frac{Y^{1-g}}{1-g}$

Answer (Ex. 21) —
1) $ARA = g$, $RRA = gY$.
2) ARA is constant. Consider a gamble with only two possible outcomes expressed in
monetary units. With constant ARA, the probability of the good outcome the investor
requires to play the game does not depend on the wealth level. Constant ARA also
implies that, in a portfolio choice problem, the optimal amount invested in the risky
asset does not change with the wealth level.

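Answer (Ex. 22) —
1)
\[ \max_a \; \{ E[-\exp(-\alpha Y_1)] \} \]
with $Y_1 = Y_0(1 + r_f) + a(r - r_f)$.
2) The risky asset becomes relatively less attractive (less excess return for the same
variance), hence a should decrease.
3) The foc is:
\[ E[\alpha(r - r_f)\exp(-\alpha Y_1)] = 0 \]
Using the Implicit Function Theorem,
\begin{align*}
\frac{da(Y_0)}{dr_f}
 &= -\frac{\partial E[\ldots]/\partial r_f}{\partial E[\ldots]/\partial a} \\
 &= -\frac{E[-\alpha e^{-\alpha Y_1} - \alpha^2 (r - r_f)(Y_0 - a) e^{-\alpha Y_1}]}{E[-\alpha^2 (r - r_f)^2 e^{-\alpha Y_1}]} \\
 &= \frac{\overbrace{E[\alpha e^{-\alpha Y_1}]}^{>0} + \alpha (Y_0 - a) \overbrace{E[\alpha (r - r_f) e^{-\alpha Y_1}]}^{=0 \ (foc)}}
         {\underbrace{E[-\alpha^2 (r - r_f)^2 e^{-\alpha Y_1}]}_{<0}} \\
 &< 0
\end{align*}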

Answer (Ex. 23) —


1) Nothing; depends on the particular utility function (example: positive for decreasing
ARA, but negative for increasing ARA)
2) da/dY0 = 0, see the application of the IFT below equation (3.3).

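Answer (Ex. 24) — $Y_1 \sim N\left(Y_0(1 + r_f) + a(\mu - r_f),\ a^2\sigma^2\right)$. The problem is thus
\[ \max_a \; -\exp\left( -\gamma [Y_0(1 + r_f) + a(\mu - r_f)] + \tfrac{1}{2}\gamma^2 a^2 \sigma^2 \right) \]
foc:
\[ -\left( -\gamma(\mu - r_f) + a\gamma^2\sigma^2 \right) \exp(.) = 0 \]
Since $\exp(x) > 0,\ \forall x$,
\begin{align*}
&\Rightarrow -\gamma(\mu - r_f) + a\gamma^2\sigma^2 = 0 \\
&\Rightarrow a = \frac{\mu - r_f}{\gamma\sigma^2}
\end{align*}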
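
The closed-form solution can be checked against a brute-force search. The sketch below evaluates the objective on a grid of candidate amounts a around the analytical optimum; all parameter values are hypothetical and chosen only for illustration.

```python
import math


def objective(a, Y0, rf, mu, sigma, gamma):
    """Expected CARA utility of normally distributed terminal wealth."""
    mean = Y0 * (1 + rf) + a * (mu - rf)
    variance = a ** 2 * sigma ** 2
    return -math.exp(-gamma * mean + 0.5 * gamma ** 2 * variance)


# Hypothetical parameters (not taken from the exercise).
Y0, rf, mu, sigma, gamma = 100.0, 0.02, 0.08, 0.20, 0.05

a_star = (mu - rf) / (gamma * sigma ** 2)      # closed-form optimum

# Grid of candidates centred on a_star, in steps of 0.5 monetary units.
grid = [a_star + 0.5 * k for k in range(-20, 21)]
best = max(grid, key=lambda a: objective(a, Y0, rf, mu, sigma, gamma))

print(a_star, best)
```

Note that a_star does not involve Y0: with constant ARA the amount invested in the risky asset is independent of wealth.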

Answer (Ex. 25) —

1) The returns are:

Stock A Stock B
day t rt rt
fri 0 – –
mon 1 0.1000 0.1000
tue 2 -0.0909 -0.0909
wed 3 0.1000 0.2100
thu 4 -0.1818 -0.1818
fri 5 0.3333 0.3333

weekly returns 0.2000 0.3200

2) The return on the portfolio over this week is

rp,week = 0.4 × 0.2 + 0.6 × 0.32 = 27.2%

3) $N_{A,0} = N_{A,5} = 4{,}000/10 = 400$ and $N_{B,0} = 600$. On Wednesday, we receive a
dividend of $\$1.1 \times 600 = \$660$, which allows us to buy $\$660/\$11 = 60$ more shares of
stock B. Hence, $N_{B,3} = N_{B,5} = 600 + 60 = 660$. (We can check that the terminal value
of this portfolio is $V_5 = 12 \times 400 + 12 \times 660 = \$12{,}720$, which implies a weekly return of
$12720/10000 - 1 = 27.2\%$.)
4) The adjusted prices are:

Stock A Stock B
day t Pt Pta Pt Pta
fri 0 10 10 10 9.09
mon 1 11 11 11 10
tue 2 10 10 10 9.09
wed 3 11 11 11 11
thu 4 9 9 9 9
fri 5 12 12 12 12
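
The return calculations in this answer can be reproduced as follows. This is a sketch based on the prices in the tables above and on the $1.1 dividend paid by stock B on Wednesday; the function names and the list layout are illustrative choices.

```python
prices_A = [10, 11, 10, 11, 9, 12]       # fri(0), mon, tue, wed, thu, fri(5)
prices_B = [10, 11, 10, 11, 9, 12]
dividends_A = [0, 0, 0, 0, 0, 0]
dividends_B = [0, 0, 0, 1.1, 0, 0]       # dividend paid on Wednesday (t = 3)


def daily_returns(prices, dividends):
    """Total return each day: (P_t + D_t) / P_{t-1} - 1."""
    return [(prices[t] + dividends[t]) / prices[t - 1] - 1
            for t in range(1, len(prices))]


def compound(returns):
    """Compound a sequence of simple returns into one holding-period return."""
    total = 1.0
    for r in returns:
        total *= 1 + r
    return total - 1


week_A = compound(daily_returns(prices_A, dividends_A))
week_B = compound(daily_returns(prices_B, dividends_B))
portfolio = 0.4 * week_A + 0.6 * week_B   # initial weights: $4,000 in A, $6,000 in B

print(round(week_A, 4), round(week_B, 4), round(portfolio, 4))
```

Compounding the daily total returns of stock B implicitly assumes that the dividend is reinvested in the stock, which matches the share count calculation in part 3.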

Answer (Ex. 26) — We have $E[W] = W_0(1 + \mu)$ and $Var(W) = W_0^2\sigma^2$. Thus,
\[ E[U] = a + bE[W] + cE[W^2] = a + bE[W] + c\left(Var(W) + (E[W])^2\right)
        = a + bW_0(1 + \mu) + c\left(W_0^2\sigma^2 + W_0^2(1 + \mu)^2\right) \]

Answer (Ex. 27) —


1) Daily returns

Mean 0.0003
Stdev 0.0105
Skew(Nrm=0) -0.6292
Kurt(Nrm=3) 10.2897
Test[H0:Normal]
Jarque-Bera[Pvalue] 0.0000
Clearly, there are fat tails (high kurtosis). Normality is rejected (JB test; not covered
in class).
2) Monthly returns
Mean 0.0068
Stdev 0.0614
Skew(Nrm=0) -0.1338
Kurt(Nrm=3) 3.8117
Test[H0:Normal]
Jarque-Bera[Pvalue] 0.1418
At the monthly horizon, the problem is much less severe. There is also less skewness.
Normality is not rejected. (The JB test is asymptotic but we only have 145 monthly
observations. Statistical purists might ask for additional finite sample tests. We did not
cover any of these in class; I don’t expect you to know this.)

Answer (Ex. 28) — As done in section 4.4 of these notes.

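Answer (Ex. 29) — The lagrangian is
\[ L = \tfrac{1}{2} w'Vw - \lambda\left(w'\bar{r} + (1 - w'\mathbf{1})r_f - \mu\right) \]
and the first-order conditions are
\begin{align*}
\frac{dL}{dw} &= Vw - \lambda(\bar{r} - r_f\mathbf{1}) = 0 && (N \text{ eqns}) \\
\frac{dL}{d\lambda} &= w'\bar{r} + (1 - w'\mathbf{1})r_f - \mu = 0 && (1 \text{ eqn})
\end{align*}
The foc for w can be written as:
\begin{align*}
& w = \lambda V^{-1}(\bar{r} - r_f\mathbf{1}) \\
\Rightarrow\ & (\bar{r} - r_f\mathbf{1})'w = \lambda \underbrace{(\bar{r} - r_f\mathbf{1})'V^{-1}(\bar{r} - r_f\mathbf{1})}_{\equiv H} \\
\Rightarrow\ & (\bar{r} - r_f\mathbf{1})'w = \lambda H
\end{align*}
The foc for λ implies
\begin{align*}
& w'(\bar{r} - r_f\mathbf{1}) + r_f - \mu = 0 \\
\Rightarrow\ & (\bar{r} - r_f\mathbf{1})'w = \mu - r_f
\end{align*}
Plugging this expression for $(\bar{r} - r_f\mathbf{1})'w$ into the previous equation, we find the value of
the multiplier:
\begin{align*}
& \mu - r_f = \lambda H \\
\Rightarrow\ & \lambda = \frac{\mu - r_f}{H}
\end{align*}
Substituting this value of λ in the foc for w we get (4.11):
\begin{align*}
& w = \lambda V^{-1}(\bar{r} - r_f\mathbf{1}) \\
\Rightarrow\ & w = \frac{\mu - r_f}{H} V^{-1}(\bar{r} - r_f\mathbf{1})
\end{align*}
Additionally, we can also check that H is indeed as defined in the text:
\begin{align*}
H &\equiv (\bar{r} - r_f\mathbf{1})'V^{-1}(\bar{r} - r_f\mathbf{1}) \\
  &= \bar{r}'V^{-1}(\bar{r} - r_f\mathbf{1}) - r_f\mathbf{1}'V^{-1}(\bar{r} - r_f\mathbf{1}) \\
  &= \bar{r}'V^{-1}\bar{r} - r_f\bar{r}'V^{-1}\mathbf{1} - r_f\mathbf{1}'V^{-1}\bar{r} + r_f^2\mathbf{1}'V^{-1}\mathbf{1} \\
  &= B - 2r_fA + r_f^2C
\end{align*}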

Answer (Ex. 30) — Let $m := E[r]$. $E[(r_p - E[r_p])(r_q - E[r_q])] = E[(w_p'r - w_p'm)(w_q'r - w_q'm)]
= w_p' E[(r - m)(r - m)'] w_q$. By definition, $V := Cov(r) := E[(r - m)(r - m)']$,
hence the result follows.

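Answer (Ex. 31) —
1) $L = w'\bar{r} - \frac{g}{2}w'Vw + m(w'\mathbf{1} - 1)$
foc m: $w'\mathbf{1} = 1$.
foc w: $w = V^{-1}(\bar{r} + m\mathbf{1})/g$.
Use the foc for m to get $\mathbf{1}'w = (\mathbf{1}'V^{-1}\bar{r} + m\mathbf{1}'V^{-1}\mathbf{1})/g = 1 \Rightarrow m = (g - A)/C$. Plug
back in foc w to get:
\[ w^* = \frac{1}{g}V^{-1}\bar{r} + \frac{g - A}{gC}V^{-1}\mathbf{1} \]
2)
\[ E[r_p] = \bar{r}'w^* = B/g + A/C - A^2/(gC) \]
3)
\[ w = \frac{C\left(B/g + A/C - A^2/(gC)\right) - A}{D}V^{-1}\bar{r}
     + \frac{B - A\left(B/g + A/C - A^2/(gC)\right)}{D}V^{-1}\mathbf{1} \]
Simplifying all the scalars,
\begin{align*}
w^* &= \frac{CB/g - A^2/g}{D}V^{-1}\bar{r} + \frac{BC - ABC/g - A^2 + A^3/g}{DC}V^{-1}\mathbf{1} \\
w^* &= \frac{(BC - A^2)/g}{D}V^{-1}\bar{r} + \frac{(BC - A^2) - (BC - A^2)A/g}{DC}V^{-1}\mathbf{1} \\
w^* &= \frac{D/g}{D}V^{-1}\bar{r} + \frac{D - DA/g}{DC}V^{-1}\mathbf{1} \\
w^* &= \frac{1}{g}V^{-1}\bar{r} + \frac{g - A}{gC}V^{-1}\mathbf{1}
\end{align*}
so we do indeed recover the expression in part 1.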

Answer (Ex. 32) — Using standard matrix notation, $E[r_p] = w'\bar{r} + (1 - w'\mathbf{1})r_f$ and
$Var[r_p] = w'Vw$.
Since $Y_1 = Y_0(1 + r_p)$ and $r_p$ is normally distributed, we have that $Y_1$ also follows a
normal distribution with the following parameters:
\[ Y_1 \sim N\left( Y_0[1 + w'\bar{r} + (1 - w'\mathbf{1})r_f],\ Y_0^2 w'Vw \right) \]
Using the moment generating function for the normal distribution, the objective function
becomes
\[ E[-\exp(-bY_1)] = -\exp\left( -bY_0[1 + w'\bar{r} + (1 - w'\mathbf{1})r_f] + \tfrac{1}{2}b^2Y_0^2 w'Vw \right) \]
The investor problem is thus
\[ \max_w \; -\exp\left( -bY_0[1 + w'\bar{r} + (1 - w'\mathbf{1})r_f] + \tfrac{1}{2}b^2Y_0^2 w'Vw \right) \]
The foc is
\begin{align*}
& -\left( -bY_0(\bar{r} - \mathbf{1}r_f) + \tfrac{1}{2}b^2Y_0^2\, 2Vw \right)\exp(.) = 0 \\
\Rightarrow\ & b^2Y_0^2 Vw = bY_0(\bar{r} - \mathbf{1}r_f) \\
\Rightarrow\ & w = \frac{1}{bY_0}V^{-1}(\bar{r} - \mathbf{1}r_f)
\end{align*}

Answer (Ex. 33) — (Will be posted on my website)

Answer (Ex. 34) — moneyp =



125199.55
0.16
0.00
113651.25
0.00
58178.46
0.65
84884.29
0.00
0.07
moneyrf = 618085.58
rp = 0.0064
stdp = 0.0174
Remark:
With this portfolio, the investor attains the following expected utility: $\max E[U] =
0.0064 - \frac{8}{2}(0.0174)^2 = 0.0052102$. If you got a different w, check your maximum expected
utility. If it is higher than this, please let me know. Different software may use different
algorithms and thus give different answers.

Answer (Ex. 35) — Buy an efficient (CML) portfolio with σp = 0.15. The weights
are: σp = wM σM ⇒ wM = 0.75 and wf = 0.25. Thus, put $75,000 in the stock market
and $25,000 in the risk-free bond. The expected return is E[rp ] = 0.25∗0.04+0.75∗0.1 =
0.085, thus we expect to have $108,500 in 1 year.

Answer (Ex. 36) — $P_a = \frac{2}{(0.04 + 0.06 \times 0.9) - 0.05} \approx \$45.45$

Answer (Ex. 37) —
1) Developing the definition of beta,
\begin{align*}
\beta_p := Cov(r_p, r_M)/Var(r_M) &= Cov\left(\sum_{i=1}^{N} w_i r_i,\ r_M\right)/Var(r_M) \\
 &= \sum_{i=1}^{N} w_i\, Cov(r_i, r_M)/Var(r_M) = \sum_{i=1}^{N} w_i\beta_i
\end{align*}
2) $0.9 = 0 + w_a \times 1.2 \Rightarrow w_a = 0.75$, and $w_f = 0.25$

Answer (Ex. 38) — The portfolio must be efficient, ie, a combination of the risk-free
asset and the market. Hence, it must have corr(rp , rM ) = 1. (Check that plugging this
in the SML you get the CML).

Answer (Ex. 39) — From mean-variance optimization, we can write

E[rj ] = rf + βj ( E[rp ] − rf )

where p is any frontier portfolio, and in particular we can choose p = T (this is just
math). The economic content comes from realizing that if all investors are identical
(mean-var preferences + homogeneous expectations), we must have T = M . Thus, the
economic part of the equation is to use M instead of p.

Answer (Ex. 41) — Use the covariance properties to get
\[ \sigma_{ij} = \beta_j\beta_i\sigma_M^2 + \beta_j Cov(r_m, \varepsilon_i) + \beta_i Cov(r_m, \varepsilon_j) + Cov(\varepsilon_i, \varepsilon_j) \]
On the diagonal ($i = j$), use A2 to get (6.3). Off the diagonal ($i \neq j$), use A2 and A3 to get
(6.4).

Answer (Ex. 42) — Suppose we replicate the random part of a by creating a portfolio
($w_f = 1 - \beta_a$, $w_M = \beta_a$). Its return is $r_p = (1 - \beta_a)r_f + \beta_a r_M$, which matches a, except for
the intercept. Since $a_a = 0.01 > (1 - \beta_a)r_f = 0.004$, there is an arbitrage opportunity.
Short sell $1 of p and buy $1 of a. This guarantees a sure profit of 0.6%. Doing this
arbitrage as much as possible will make me extremely rich.

Answer (Ex. 43) — Small-minus-Big (SMB) is the difference between the return of a
portfolio of small stocks and the return of a portfolio of large stocks. It measures the
size premium, the additional return required for investing in small firms.
High-minus-Low (HML) is the difference between the return of a portfolio of firms
with high BE/ME ("value") and the return of a portfolio of firms with low BE/ME
("growth"). It measures the value premium, i.e., the additional return required to invest
in firms with a low market value relative to book value, which are typically firms that
have had low returns and are now at risk of bankruptcy.

Answer (Ex. 46) — No, |X| = 0.

Answer (Ex. 47) —


1) q1 = 0.05, q2 = 0.1, q3 = 0.25.
2) p = 2.1

Answer (Ex. 48) —
1) Given the simple structure of X, we can find the AD prices almost directly:
\begin{align*}
p_1 &= 3p_1^{ad} \Rightarrow 1.2 = 3p_1^{ad} \Rightarrow p_1^{ad} = 0.4 \\
p_2 &= 4p_1^{ad} + 2p_2^{ad} \Rightarrow 1.8 = 4 \times 0.4 + 2p_2^{ad} \Rightarrow p_2^{ad} = 0.1 \\
p_3 &= 2p_1^{ad} + p_2^{ad} + p_3^{ad} \Rightarrow p_3^{ad} = 0.3
\end{align*}
The risk-free rate is $1 + r_f = 1/\sum_s p_s^{ad} = 1/0.8 = 1.25$. The RN prob are
$\pi^Q(s) = p^{ad}(s)/\sum_s p_s^{ad} = [0.5,\ 0.125,\ 0.375]'$. Hence, the price of the new security is
\[ p_4 = \frac{E^Q[x]}{1 + r_f} = \frac{0.5 \times 2 + 0.125 \times 10 + 0.375 \times 4}{1.25} = 3 \]
2) No. Risk aversion is impounded in the risk-neutral probabilities, $\pi^Q$.
3) There is an arbitrage opportunity. Since the bank is selling the security cheap, I
should buy it. The replicating portfolio using AD-securities consists of 2 units of AD(1),
10 units of AD(2), and 4 units of AD(3). The price of this portfolio is 3 (as in the
previous question). Hence, we pay 2.5 to the bank and sell the replicating portfolio in
the market for 3. The profit is 0.5 today. One period from now, my payoff is 0 regardless
of the state (an arbitrage).
Note: the portfolio with the AD-securities “means” the following portfolio in the complex
securities: $[2, 10, 4]' = Xq \Rightarrow q = [-6, 3, 4]'$. That is, sell 6 units of asset 1, buy 3 units
of asset 2, buy 4 of asset 3. The price of this portfolio equals 3. Selling the replicating
portfolio means to buy 6 units of asset 1, sell 3 units of 2, and sell 4 units of 3.
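
The pricing steps in this answer can be verified with a short script. The sketch below takes the payoff structure implied by the three pricing equations in part 1 (an inference, since the exercise statement is not reproduced here) and checks the Arrow-Debreu price, the risk-neutral price, and the replicating portfolio q = [−6, 3, 4]. Variable names are illustrative.

```python
# Rows = complex assets, columns = states; read off the equations in part 1.
payoffs = [
    [3, 0, 0],   # asset 1
    [4, 2, 0],   # asset 2
    [2, 1, 1],   # asset 3
]
ad_prices = [0.4, 0.1, 0.3]   # Arrow-Debreu prices found in part 1
new_payoff = [2, 10, 4]       # payoff of the security to be priced


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


asset_prices = [dot(row, ad_prices) for row in payoffs]

discount = sum(ad_prices)                      # price of a sure payoff of 1
gross_rf = 1 / discount                        # 1 + rf
rn_probs = [q / discount for q in ad_prices]   # risk-neutral probabilities

price_ad = dot(new_payoff, ad_prices)               # pricing with AD prices
price_rn = dot(new_payoff, rn_probs) / gross_rf     # pricing with RN probabilities

# Replicating portfolio in the complex securities.
q = [-6, 3, 4]
replicated = [sum(q[i] * payoffs[i][s] for i in range(3)) for s in range(3)]
price_replication = dot(q, asset_prices)

print(asset_prices, gross_rf, rn_probs, price_ad, price_rn, replicated, price_replication)
```

All three routes (AD prices, risk-neutral expectation, replication) give the same price, as no-arbitrage requires.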

Answer (Ex. 49) —
1) $E_t[-p_tU'(c_t) + \delta U'(c_{t+1})v_{t+1}] = 0$ or $p_t = E_t\left[\delta \frac{U'(c_{t+1})}{U'(c_t)} v_{t+1}\right]$
2) $E_t[p_t^2U''(c_t) + \delta U''(c_{t+1})v_{t+1}^2] < 0$. Need $U'' < 0$, i.e., risk aversion.
3) $m = \delta \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma}$

Answer (Ex. 50) —
1) Yes, determinant = 6, so the assets are linearly independent.
2) This can be solved with the general method, i.e., finding replicating weights for each
AD security. However, given the simple structure of X, we can find the AD prices almost
directly:
\begin{align*}
p_1 &= 3p_1^{ad} \Rightarrow 1.2 = 3p_1^{ad} \Rightarrow p_1^{ad} = 0.4 \\
p_2 &= 4p_1^{ad} + 2p_2^{ad} \Rightarrow 1.8 = 4 \times 0.4 + 2p_2^{ad} \Rightarrow p_2^{ad} = 0.1 \\
p_3 &= p_1^{ad} + p_2^{ad} + p_3^{ad} \Rightarrow 0.7 = 0.4 + 0.1 + p_3^{ad} \Rightarrow p_3^{ad} = 0.2
\end{align*}
3) $p = [2, 10, 4] \times [0.4, 0.1, 0.2]' = 2.6$
4) $m = q ./ \pi = [1.6, 0.2, 0.8]'$ (element-wise division)
5) $p = \sum_s \pi(s)\, m(s)\, payoff(s) = 2.6$

Answer (Ex. 51) —
1) $1 = E_t[m_{t+1}R^f_{t+1}] = R^f_{t+1}E_t[m_{t+1}]$
2) $1 = cov(m, R) + \frac{1}{R^f}E[R] \Rightarrow E[R] = R^f(1 - cov(m, R))$, and finally,
\begin{align*}
E_t[R_{t+1}] - R^f_{t+1} &= -R^f_{t+1}\, cov_t\left(\delta U'(c_{t+1})/U'(c_t),\ R_{t+1}\right) \\
 &= -R^f_{t+1} \frac{\delta}{U'(c_t)}\, cov_t\left(U'(c_{t+1}),\ R_{t+1}\right)
\end{align*}
3) Investors are willing to pay a high price (demand a low excess return) for securities
that have high covariance with marginal utility. This makes sense since these securities
pay off exactly when the investor values the payoff most (i.e., when he has high marginal
utility).

Answer (Ex. 52) —
1) For log utility and δ = 1, we have $m = \delta U'(c_1)/U'(c_0) = c_0/c_1 = [100/100,\ 100/150]' =
[1,\ 2/3]'$. The asset prices are thus
\begin{align*}
p_1 &= \sum_s \pi(s)m(s)x(s) = 1 \times 10/2 + 2/3 \times 20/2 = 11.67 \\
p_2 &= \sum_s \pi(s)m(s)x(s) = 1 \times 20/2 + 2/3 \times 10/2 = 13.33
\end{align*}
2) Asset 2 is more expensive because it has a high payoff in bad times (high marginal
utility, low consumption $c_1 = 100$). Equivalently, asset 1 is cheap because its high payoff
occurs in an already good state (low marginal utility, high consumption).
3) $p = E[mx] \Rightarrow p_t = \frac{E_t[x_{t+1}]}{R^f_{t+1}} + Cov_t(m_{t+1}, x_{t+1})$. The first term is the price of the asset
if investors were risk neutral. The second term is a risk adjustment: if the payoff has a
high covariance with the sdf or marginal utility (meaning low covariance with consumption), then it
will pay off precisely when the investor is in most need. Its price will thus be high. For
the example, note that $R^f = 1/E[m] = 1/0.8333 = 1.2$. Hence,

                    $E^P[x]/R^f$    $Cov(m, x)$
p1 = 11.6667   =    +12.5000        −0.8333
p2 = 13.3333   =    +12.5000        +0.8333

Without risk-aversion, both assets would have the same price (12.5). However, risk-
aversion makes the price of asset 2 increase by 0.83.
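
The prices and their decomposition can be reproduced with a few lines of code. The sketch below uses the numbers that appear in this answer (two equally likely states, c0 = 100, c1 = 100 or 150, δ = 1, log utility); the function names are illustrative.

```python
probs = [0.5, 0.5]
c0 = 100.0
c1 = [100.0, 150.0]
delta = 1.0

# Stochastic discount factor under log utility: m = delta * c0 / c1
m = [delta * c0 / c for c in c1]


def expect(values):
    return sum(p * v for p, v in zip(probs, values))


def price(x):
    """p = E[m x]."""
    return expect([mi * xi for mi, xi in zip(m, x)])


def decompose(x):
    """Split the price into E[x]/Rf and Cov(m, x)."""
    gross_rf = 1 / expect(m)
    risk_neutral_part = expect(x) / gross_rf
    covariance = price(x) - expect(m) * expect(x)
    return risk_neutral_part, covariance


x1 = [10.0, 20.0]   # pays more in the good (high consumption) state
x2 = [20.0, 10.0]   # pays more in the bad (low consumption) state

for x in (x1, x2):
    print(round(price(x), 4), [round(v, 4) for v in decompose(x)])
```

Both assets have the same expected payoff, so the entire price difference comes from the covariance term.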

Answer (Ex. 53) — True. $m = \delta U'(c_{t+1})/U'(c_t)$. Since marginal utility is always
positive, the sdf is always positive.

Answer (Ex. 54) —
1) The 2 foc are:
\begin{align*}
z^s &: \quad p^s_t = E_t\left[\delta U'(c_{t+1})/U'(c_t)\, x^s_{t+1}\right] \\
z^b &: \quad p^b_t = E_t\left[\delta U'(c_{t+1})/U'(c_t)\, x^b_{t+1}\right]
\end{align*}
2) $U' = 1/c$, hence $m_{t+1} = \delta c_t/c_{t+1}$.
3) The pricing kernel is
\[ m_{t+1} = 0.99 \times 1000/c_{t+1} = [1.1000,\ 0.9900,\ 0.9000,\ 0.8250]' \]
Using the foc for $z^b$,
\[ p^b_t = E_t[m_{t+1}x^b_{t+1}] = \sum_{s=1}^{4} Prob(s)\, m(s)\, x^b(s) = 80.31 \]
