Asset Pricing

Teaching Notes
João Pedro Pereira
Finance Department
ISCTE Business School - Lisbon
joao.pereira@iscte.pt
www.iscte.pt/~jpsp
September 9, 2013
Contents
1 Introduction
2 Choice theory
2.1 Motivation
2.2 The utility function
2.2.1 Choice under certainty
2.2.2 Choice under uncertainty
2.2.3 Interpretation of utility numbers
2.3 Risk aversion
2.3.1 Concepts
2.3.2 Measures of risk aversion
2.3.3 Risk neutrality
2.4 Important utility functions
2.5 Certainty Equivalent
2.6 Stochastic dominance
2.6.1 First Order Stochastic Dominance
2.6.2 Second Order Stochastic Dominance
2.7 Exercises
3 Portfolio choice
3.1 Canonical portfolio problem
3.2 Analysis of the optimal portfolio choice
3.2.1 Risk aversion
3.2.2 Wealth
3.3 Canonical portfolio problem for N > 1
3.4 Exercises
4 Portfolio choice for Mean-Variance investors
4.1 Mean-Variance preferences
4.1.1 Quadratic utility
4.1.2 Normal returns
4.1.3 Conclusion
4.2 Review: Mean-Variance frontier with 2 stocks
4.3 Setup for general case
4.3.1 Notation
4.3.2 Brief notions of matrix calculus
4.4 Frontier with N risky assets
4.4.1 Efficient portfolio
4.4.2 Frontier equation
4.4.3 Global minimum variance portfolio
4.5 Frontier with N risky assets and 1 risk-free asset
4.5.1 Efficient portfolio
4.5.2 Frontier equation
4.5.3 Tangency portfolio
4.6 Optimal portfolio
4.7 Additional properties of frontier portfolios
4.8 Exercises
5 Capital Asset Pricing Model
5.1 Introduction
5.2 Derivation
5.3 Important results
5.3.1 Capital Market Line
5.3.2 Security Market Line
5.4 Other remarks
5.5 Exercises
6 Arbitrage Pricing Theory and Factor Models
6.1 Factor Structure
6.2 Example of simple factor structure: Market Model
6.2.1 Return generating process
6.2.2 Application: the Covariance matrix is simplified
6.2.3 Implication: Diversification eliminates Specific risk
6.2.4 Another interpretation of the CAPM β
6.3 Pricing equation
6.3.1 Exact factor pricing with one factor
6.3.2 Exact factor pricing with more than one factor
6.3.3 Approximate factor pricing
6.4 How to identify the factors
6.4.1 Overview
6.4.2 Fama and French model
6.5 Applications
6.5.1 Fund performance
6.5.2 Market neutral strategy
6.6 Exercises
7 Pricing in Complete Markets
7.1 Basic and Complex securities
7.2 Computing AD prices
7.3 Complete Markets
7.3.1 Price of complex securities
7.3.2 Quick test for market completeness
7.4 Risk-Neutral Pricing
7.4.1 Price of complex securities
7.4.2 Fundamental theorems
7.5 Conclusion
7.6 Exercises
8 Consumption-Based Asset Pricing
8.1 The investor's problem
8.2 Fundamental Asset Pricing Equation
8.3 Relation to Arrow-Debreu Securities
8.4 Relation to the Risk-Neutral measure
8.5 Risk Premiums
8.6 Consumption CAPM (CCAPM)
8.7 The CAPM reloaded
8.8 Conclusion
8.9 Exercises
9 Conclusion
Bibliography
A Background Review
A.1 Math Review
A.1.1 Logarithm and Exponential
A.1.2 Derivatives
A.1.3 Optimization
A.1.4 Means and Variances
A.2 Undergraduate Finance Review
A.2.1 Financial Markets and Instruments
A.2.2 Time value of money
A.2.3 Risk and Return
A.2.4 Equilibrium and No Arbitrage
B Solutions to Problems
Chapter 1
Introduction
These notes follow Danthine and Donaldson (2005) closely, though we will use other
sources as needed. We will start by analyzing individual choices and portfolio decisions.
Then, we will study the prices that result from the interaction of many individuals in
the market.
To motivate the work to come, consider the following question:
What is the role of financial markets?
Answer: to allow agents to desynchronize their income and consumption. Example: buy a house now and pay for it during the next 20 years. This is achieved by trading financial securities with financial institutions.
Preference for smooth consumption
Financial economists see the world in two dimensions. It is useful to understand why
agents want to dissociate consumption and income across these two dimensions.
1. Time Dimension. Most people prefer to smooth their consumption through
their life cycle. Usually, consumption is higher than income during early years
of life (buy the house), then people save during active life (y > c), finally people
consume their savings after retirement (y = 0, c > 0).
2. Risk Dimension. The future is uncertain. At any point in the future, one of many states of nature will be realized.[1] Most people want to smooth consumption across the different possibilities that may arise. That's why people buy health insurance (to be able to consume even if they stop working) or fire insurance for the new house (avoid low consumption in the "burned to the ground" state of nature).
[1] A state of nature is a complete description of a possible scenario for the future across all the dimensions relevant for the problem at hand.
Financial assets serve precisely to move consumption through time and across states
of nature.
Modelling the preference for smoothness
Financial economics builds on the fact that people have a preference for smoothness, as
just mentioned. How to model this preference for smoothness, also called risk aversion?
Consider two assets that offer two different consumption plans:
               asset 1   asset 2
time/state 1      4         3
time/state 2      4         5
Since investors like smoothness, they must prefer asset 1.[2] Let U(c) be the utility function, i.e., it tells us how much the investor likes consumption c. The utility function must thus satisfy
$$U(4) + U(4) > U(3) + U(5) \;\Leftrightarrow\; U(4) > \tfrac{1}{2}U(3) + \tfrac{1}{2}U(5)$$
What shape must U(.) have to satisfy this condition?[3] Plot it:
[Figure: empty axes for the reader's plot; horizontal axis c, vertical axis U(c).]
[2] Suppose your employer offers you the following salary scheme: under scheme 1, you get $4,000 per month; under scheme 2, you get $3,000 if it rains or $5,000 if it is sunny. Which scheme would you take?
[3] Answer: It must be strictly concave.
Chapter 2
Choice theory
1. Under certain conditions, investors’ preferences can be represented
by a utility function,
x ≽ y ⇔E[U(x)] ≥ E[U(y)]
2. Typical utility functions:
$U(w) = \ln(w)$ [CRRA]
$U(w) = w^{1-\gamma}/(1-\gamma)$ [CRRA]
$U(w) = -\exp(-\alpha w)$ [CARA]
$U(w) = aw - bw^2$
2.1 Motivation
We want to find a method to choose between risky assets. Consider the following simple
example:
Example 2.1.1. There are 3 assets and 2 equally likely possible states of
nature in the future:
           t = 0    t = 1          t = 1
                    state θ = 1    state θ = 2
asset 1    -1000    1030           1050
asset 2    -1000    1012           1070
asset 3    -1000    1030           1100
Which asset would you rather have? In this case, the choice is easy. Asset 3 clearly dominates the other assets, since it pays at least as much in all states of nature, and strictly more in some states. This is an example of state-by-state dominance.
State-by-state dominance is the strongest possible form of dominance. We can safely assume that all rational agents will always prefer asset 3.[1]
However, the world is not that simple and we will not usually be able to use this concept to make choices. (Is it likely we will observe a market like in this example? Why not?)
Suppose now that asset 3 does not exist. Do you prefer asset 1 or asset 2? The choice is not obvious. To understand the choices people make in the real world we need better machinery: utility theory.
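The state-by-state comparison in example 2.1.1 can be sketched in a few lines of Python. The helper name `dominates` and the dictionary layout are illustrative assumptions, not part of the notes.

```python
# Payoffs at t = 1 in states 1 and 2, taken from example 2.1.1.
# All three assets cost the same at t = 0, so payoffs can be compared directly.
payoffs = {
    "asset 1": (1030, 1050),
    "asset 2": (1012, 1070),
    "asset 3": (1030, 1100),
}

def dominates(a, b):
    """True if payoff vector a state-by-state dominates b:
    at least as much in every state and strictly more in some state."""
    at_least = all(x >= y for x, y in zip(a, b))
    strictly = any(x > y for x, y in zip(a, b))
    return at_least and strictly

for name in ("asset 1", "asset 2"):
    print("asset 3 dominates", name, ":", dominates(payoffs["asset 3"], payoffs[name]))

# Assets 1 and 2 cannot be ranked this way: each pays more in one state.
print("asset 1 dominates asset 2:", dominates(payoffs["asset 1"], payoffs["asset 2"]))
print("asset 2 dominates asset 1:", dominates(payoffs["asset 2"], payoffs["asset 1"]))
```

Because neither of assets 1 and 2 dominates the other, ranking them requires information about preferences, which is what the utility function provides.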
2.2 The utility function
To be able to represent agents' preferences by a formal mathematical object like a function, we need to make precise assumptions about how people make choices.[2]
2.2.1 Choice under certainty
We start by postulating the existence of a preference relation. For two consumption
bundles a and b (two vectors with the amount of consumption of each good), we either
say that
a ≻ b a is strictly preferred to b
a ∼ b a is indifferent to b
a ≽ b a is strictly preferred or indifferent to b (a not worse than b)
We make the following economic rationality assumptions:
A1: Every investor possesses a complete preference relation, i.e., he must be able to state a preference for all a and b.
A2: The preference relation satisfies the property of transitivity:
∀a, b, c: a ≽ b and b ≽ c ⇒ a ≽ c
A3: The preference relation is continuous.[3]
Under these circumstances, we can now state the following useful theorem:
Theorem 2.2.1. Assumptions A1–3 are sufficient to guarantee the existence of a continuous function $u : \mathbb{R}^N \to \mathbb{R}$ such that, for any consumption bundles a and b,
a ≽ b ⇔ u(a) ≥ u(b)
This real-valued function u is called a utility function.
Note that the notion of consumption bundle used in the theorem is quite general. Different elements of the bundle may represent the consumption of the same good in different time periods or in different states of nature.
[1] More precisely, we are assuming agents to be nonsatiated in consumption (they always like more consumption).
[2] People have wasted time thinking about reformulating the canonical portfolio problem just because they were not aware of the axioms that lead to an expected utility representation.
2.2.2 Choice under uncertainty
Even though the previous theorem is quite general, we want to extend it in a way that captures uncertainty explicitly and separates utility from probabilities.
Definition (Lottery). The simple lottery (x, y, π) is a gamble that offers payoff x with probability π and payoff y with probability 1 − π.
This notion of lottery is quite general. The payoffs x and y can represent monetary
or consumption amounts. If there is no uncertainty, we can write
(x, y, 1) = x
The payoffs can themselves be other lotteries, leading to compound lotteries. For example, if $y = (y_1, y_2, \tau)$, we will have
$$(x, y, \pi) = (x, (y_1, y_2, \tau), \pi)$$
We assume that the agent is able to "work out" the probability tree and only cares about the final outcomes.[4]
[3] Technical assumption. See Danthine and Donaldson (2005) for details on this and Huang and Litzenberger (1988) for further technical details.
[4] A lottery is the simplest example of a random variable. Stock prices are random variables, so you can see where we are going.
Assume the following axioms:
B1: There exists a preference relation ≽, defined on lotteries, which is complete, transitive, and continuous.
Since the consumption bundles in theorem 2.2.1 were general enough to include consumption in different states of nature, the theorem can be applied here to ensure that there exists a utility function U() defined on lotteries. To get an expected utility representation of preferences, we need the following crucial axiom:
B2: Independence of irrelevant alternatives. Let (x, y, π) and (x, z, π) be any two
lotteries. Then,
y ≽ z ⇔(x, y, π) ≽ (x, z, π)
In other words, x is irrelevant; including it does not change the investor’s preferences
about y and z.
This axiom is not trivial and has been strongly contested. One well-known violation is the Allais Paradox.[5] This and other violations have led to the exploration of alternatives to the expected utility framework, namely to the growing field of Behavioral Finance. Despite this, recall that the goal of financial economics is to understand aggregate market behavior and not individual behavior. At this point, expected utility is the most useful framework.
We now get to the punchline:
Theorem 2.2.2 (Expected Utility Theorem). If axioms B1–2 hold, then there exists a
real-valued function U, defined on the space of lotteries, such that the preference relation
can be represented as an expected utility, that is, for any lotteries x and y,
x ≽ y ⇔E[U(x)] ≥ E[U(y)]
The function U(), defined over lotteries, is called a von Neumann-Morgenstern (vNM) utility function.[6]
[5] Allais Paradox. Given the four lotteries defined below, most people show the following preferences:
L1 = ($10000, $0, 0.10) ≺ L2 = ($15000, $0, 0.09)
and
L3 = ($10000, $0, 1.00) ≻ L4 = ($15000, $0, 0.90)
However, given that L1 = (L3, $0, 0.1) and L2 = (L4, $0, 0.1), with $0 the irrelevant alternative, the independence axiom would imply L3 ≻ L4 ⇒ L1 ≻ L2!
[6] This designation is sometimes confusing. Some people define U := E[U()] and call this U the vNM utility function, while others use the name vNM for the u() defined on sure things. Nonetheless, it is always used in the context of preferences that have an expected utility representation (theorem 2.2.2).
Note that x and y can be lotteries with multiple outcomes. Denoting by $x_s$ the outcome in state s that occurs with probability $\pi_s$,[7] we have
$$E[U(x)] = \begin{cases} \sum_s U(x_s)\,\pi_s & \text{if } x \text{ is a discrete r.v.} \\ \int_s U(x_s)\,\pi_s\,ds & \text{if } x \text{ is a continuous r.v.} \end{cases}$$
Example 2.2.1. Let $U(x) = \sqrt{x}$. Choose between assets 1 and 2 in example 2.1.1.
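The discrete expected-utility formula can be applied directly to example 2.1.1. The sketch below assumes square-root utility, U(x) = sqrt(x), and the two equally likely states of that example; the function name `expected_utility` is an illustrative choice.

```python
import math

probs = (0.5, 0.5)  # two equally likely states of nature
payoffs = {"asset 1": (1030, 1050), "asset 2": (1012, 1070)}

def expected_utility(U, outcomes, probs):
    """E[U(x)] = sum over states s of U(x_s) * pi_s for a discrete lottery."""
    return sum(p * U(x) for x, p in zip(outcomes, probs))

eu = {name: expected_utility(math.sqrt, x, probs) for name, x in payoffs.items()}
for name, value in eu.items():
    print(f"{name}: E[U] = {value:.4f}")
print("preferred:", max(eu, key=eu.get))
```

Under these assumptions the expected utility of asset 2 is slightly higher than that of asset 1, so this particular investor would pick asset 2. A different utility function could reverse the choice.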
2.2.3 Interpretation of utility numbers
The numbers returned by the utility function do not have any meaning per se, as the
following proposition makes clear.
Proposition 2.2.1. If U(x) is a vNM utility function for a given preference relation,
then V (x) = aU(x) + b, a > 0, is also a vNM utility function for the same preference
relation, that is,
E[U(x)] ≥ E[U(y)] ⇒E[V (x)] ≥ E[V (y)]
Proof.
E[U(x)] ≥ E[U(y)] ⇒aE[U(x)] +b ≥ aE[U(y)] +b, since a > 0
⇒E[aU(x) +b] ≥ E[aU(y) +b]
⇒E[V (x)] ≥ E[V (y)]
Example 2.2.2. Suppose a different investor has utility $V(x) = 1 + 2\sqrt{x}$. His choice between assets 1 and 2 (from example 2.1.1) will be the same as the choice of the investor with $U(x) = \sqrt{x}$. (Check it!)
Hence, the utility function serves only to rank the choices under consideration. The precise magnitude of the number does not have any meaning. Note, however, that only positive linear transformations are guaranteed to preserve the expected-utility ranking; it is in this sense that vNM utility is said to be cardinal.
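Proposition 2.2.1 can be checked numerically on the assets of example 2.1.1. This is a minimal sketch, assuming U(x) = sqrt(x) and V(x) = 1 + 2 sqrt(x) as in example 2.2.2; the helper names are illustrative.

```python
import math

probs = (0.5, 0.5)                       # equally likely states
asset1, asset2 = (1030, 1050), (1012, 1070)

def expected_utility(U, outcomes):
    return sum(p * U(x) for x, p in zip(outcomes, probs))

U = math.sqrt

def V(x):
    # V = a*U + b with a = 2 > 0 and b = 1
    return 1 + 2 * math.sqrt(x)

prefers_1_under_U = expected_utility(U, asset1) >= expected_utility(U, asset2)
prefers_1_under_V = expected_utility(V, asset1) >= expected_utility(V, asset2)
same_ranking = prefers_1_under_U == prefers_1_under_V
print("same ranking under U and V:", same_ranking)
```

The ranking agrees because E[V(x)] = 1 + 2 E[U(x)], which is exactly the step used in the proof.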
[7] More often, especially in probability classes, the state of nature is denoted by ω ∈ Ω, and the probability measure by P(ω).
2.3 Risk aversion
2.3.1 Concepts
Consider an investor with wealth Y. Consider also the fair gamble, or lottery, L = (+h, −h, 1/2).
Definition (Risk aversion). An investor displays risk aversion if he wishes to avoid a fair gamble, i.e., Y ≻ Y + L.
This implies that the utility function of a risk-averse agent must satisfy
$$E[U(Y)] > E[U(Y+L)] \;\Rightarrow\; U(Y) > \tfrac{1}{2}U(Y+h) + \tfrac{1}{2}U(Y-h)$$
This inequality is satisfied for all wealth levels if the utility function is strictly concave.[8] Plot it:
[Figure: empty axes for the reader's plot; horizontal axis Y, vertical axis U(Y).]
For twice differentiable utility functions, a sufficient condition for strict concavity is that $U''(Y) < 0$. This means that $U'(Y)$ is decreasing in wealth. This important economic concept is called decreasing marginal utility. As wealth increases, the utility from additional consumption decreases. "When I am starving, a sandwich tastes great, while when I am almost satiated I don't care about another sandwich."
[8] This is formally justified by Jensen's inequality: $E[g(X)] \le g(E[X])$, for concave g. If g is strictly concave, the inequality is strict. For the utility function in particular, $E[U(Y+L)] < U(E[Y+L]) = U(E[Y] + E[L]) = U(Y + 0) = U(Y)$.
2.3.2 Measures of risk aversion
We would like to compare utility functions and say which one is more risk averse. Toward
this end, we define the following measures of risk aversion:
Absolute Risk Aversion: $ARA(Y) \equiv -\dfrac{U''(Y)}{U'(Y)}$
Relative Risk Aversion: $RRA(Y) \equiv -Y\,\dfrac{U''(Y)}{U'(Y)}$
Interpretation of ARA. Let π(Y, h) be the probability of the favorable outcome at which the investor with wealth Y is indifferent between accepting or rejecting the lottery L = (+h, −h, π()). Note that h is an amount of money. It can be shown that
$$\pi(Y, h) \approx \frac{1}{2} + \frac{1}{4}\,h \cdot ARA(Y) \qquad (2.1)$$
The favorable odds requested increase with the amount at stake h. More importantly, the higher the ARA, the more favorable odds the investor demands to accept the lottery.
Example 2.3.1. A commonly used utility function is U(Y) = −exp(−γY), which is known for having constant ARA, i.e., ARA = γ.[9] For this investor,
$$\pi(Y, h) \approx \frac{1}{2} + \frac{1}{4}\,h\gamma$$
The higher the degree of ARA (parameter γ), the higher the favorable odds requested (π). However, π does not depend on the level of wealth Y. Is this particular utility function U(Y) = −exp(−γY) a good description of human behavior?
[9] This is the only utility function with constant ARA. To see this, write $-U''(Y)/U'(Y) = \gamma \Rightarrow U''(Y) + \gamma U'(Y) = 0$, which is a homogeneous linear differential equation of the second order with constant coefficients. The two special solutions are $U_1 = 1$ and $U_2 = \exp(-\gamma Y)$ and the general solution is thus $U(Y) = c_1 + c_2 \exp(-\gamma Y)$. For $c_2 < 0$, this is a positive linear transformation of $U(Y) = -\exp(-\gamma Y)$, therefore representing the same preferences. Thanks to Diogo Bessam for pointing this out.
We now derive equation (2.1).
Proof. π(Y, h) must be such that
$$\begin{aligned} \pi :\; & Y \sim Y + L \\ \Rightarrow\; & E[U(Y)] = E[U(Y+L)] \\ \Rightarrow\; & U(Y) = \pi U(Y+h) + (1-\pi)U(Y-h) \end{aligned}$$
Expanding U(Y + h) and U(Y − h) in Taylor series around Y, we get[10]
$$U(Y+h) = U(Y) + hU'(Y) + \tfrac{1}{2}h^2 U''(Y) + O(h^3)$$
$$U(Y-h) = \ldots$$
Ignoring terms of higher order, replacing both these approximations in the previous equation, and canceling terms, we get equation (2.1).
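Equation (2.1) can be compared with the exact indifference probability, obtained by solving U(Y) = πU(Y + h) + (1 − π)U(Y − h) for π. The sketch below does this for the exponential utility of example 2.3.1; the values γ = 0.01 and h = 10 and the function name `indifference_prob` are assumptions chosen for illustration.

```python
import math

def indifference_prob(U, Y, h):
    """Solve U(Y) = pi*U(Y+h) + (1-pi)*U(Y-h) for pi."""
    return (U(Y) - U(Y - h)) / (U(Y + h) - U(Y - h))

gamma = 0.01                                # assumed ARA coefficient

def U(y):
    return -math.exp(-gamma * y)            # CARA utility, ARA = gamma

h = 10.0                                    # assumed amount at stake
for Y in (100.0, 500.0):
    exact = indifference_prob(U, Y, h)
    approx = 0.5 + 0.25 * h * gamma         # equation (2.1) with ARA = gamma
    print(f"Y = {Y:5.0f}: exact pi = {exact:.5f}, approximation = {approx:.5f}")
```

For a small stake relative to 1/γ the two numbers are very close, and the exact probability is the same at both wealth levels, as the example claims.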
Interpretation of RRA. Now we define a gamble in terms of a proportion of the investor's initial wealth. Specifically, we set h = θY, and the lottery becomes L = (θY, −θY, π()). π(Y, θ) is the probability of the favorable outcome at which the investor is indifferent between accepting or rejecting the lottery. It can be shown that
$$\pi(Y, \theta) \approx \frac{1}{2} + \frac{1}{4}\,\theta \cdot RRA(Y) \qquad (2.2)$$
The favorable odds requested increase with the proportion of wealth at stake θ. More importantly, the higher the RRA, the more favorable odds the investor demands to accept the lottery.
Example 2.3.2. An important utility function is $U(Y) = Y^{1-\gamma}/(1-\gamma)$, which is known for having constant RRA, i.e., RRA = γ.[11] For this investor,
$$\pi(Y, \theta) \approx \frac{1}{2} + \frac{1}{4}\,\theta\gamma \qquad (2.3)$$
The higher the degree of RRA (parameter γ), the higher the favorable odds requested (π). Again, π does not depend on the level of wealth Y. It depends only on the proportion of wealth θ at stake.
We do like this! Historically, stock returns look stationary (same mean through time), while aggregate wealth has been increasing. Thus, investors must require an expected return that cannot depend on the amount of wealth at risk. (Note that the expected return is determined by π.) The utility function with constant RRA (RRA = γ only, Y does not show up) is consistent with these empirical facts.[12]
[10] Taylor series: $f(x) = f(a) + f'(a)(x-a) + \frac{1}{2}f''(a)(x-a)^2 + \cdots + \frac{1}{n!}f^{(n)}(a)(x-a)^n + \ldots$
[11] This is the only utility function with constant RRA. To see this, write $-Y\,U''(Y)/U'(Y) = \gamma \Rightarrow U''(Y) + \frac{\gamma}{Y}U'(Y) = 0$, which is a homogeneous linear differential equation of the second order. One specific solution is $U_1 = Y^{1-\gamma}/(1-\gamma)$ (check that it satisfies the equation). The second linearly independent solution is given by $U_2 = U_1 \int \frac{\exp\{-\int \gamma/Y\,dY\}}{(U_1)^2}\,dY = -1$. The general solution is thus $U(Y) = c_1 Y^{1-\gamma}/(1-\gamma) - c_2$, a linear transformation of $U(Y) = Y^{1-\gamma}/(1-\gamma)$, therefore representing the same preferences. Again, thanks to Diogo Bessam for pointing this out.
The proof of equation (2.2) is left as an exercise.
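Equation (2.3) can also be compared with the exact indifference probability for power utility. This is a sketch under assumed values γ = 3 and θ = 0.10; the function name `indifference_prob` is illustrative.

```python
def indifference_prob(U, Y, h):
    """Solve U(Y) = pi*U(Y+h) + (1-pi)*U(Y-h) for pi."""
    return (U(Y) - U(Y - h)) / (U(Y + h) - U(Y - h))

gamma = 3.0                                   # assumed RRA coefficient

def U(y):
    return y ** (1 - gamma) / (1 - gamma)     # power utility, RRA = gamma

theta = 0.10                                  # assumed share of wealth at stake
for Y in (100.0, 10_000.0):
    exact = indifference_prob(U, Y, theta * Y)
    approx = 0.5 + 0.25 * theta * gamma       # equation (2.3)
    print(f"Y = {Y:8.0f}: exact pi = {exact:.5f}, approximation = {approx:.5f}")
```

The exact probability is the same for both wealth levels, because the factor $Y^{1-\gamma}$ cancels in the ratio; only the proportion θ matters.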
2.3.3 Risk neutrality
Risk-neutral investors don’t care about risk. Their utility function is linear:
U(Y ) = a +bY, b > 0
Check that ARA = 0 and RRA = 0, which implies π(Y, h) = π(Y, θ) = 1/2. Hence, risk neutral investors are indifferent to fair games (i.e., symmetrical games with 50–50 chances).
They will always choose the asset with highest expected payoff, regardless of its risk.
2.4 Important utility functions
The most common utility functions are the following:
Name          U(Y)               Restrictions on parameters   ARA   RRA
Log           ln(Y)              n/a
Power         Y^(1−γ)/(1−γ)
Exponential   −exp(−αY)
Quadratic     aY − bY^2
Complete the table. In particular, define the restrictions on parameters such that the functions are proper utility functions, i.e., $U' > 0$ and $U'' < 0$. Note that the quadratic utility function also needs a restriction on the domain (Y < . . . ). Also, compute the ARA and RRA functions, and classify the corresponding utility as increasing, decreasing, or constant ARA/RRA.
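Answers to the table can be sanity-checked numerically. The sketch below approximates the derivatives in the ARA and RRA definitions with central finite differences and confirms the two facts already stated in examples 2.3.1 and 2.3.2 (constant ARA for the exponential, constant RRA for the power utility). The parameter values, step size, and function names are assumptions for illustration.

```python
import math

def ara(U, Y, eps=1e-4):
    """Absolute risk aversion -U''(Y)/U'(Y), via central finite differences."""
    d1 = (U(Y + eps) - U(Y - eps)) / (2 * eps)
    d2 = (U(Y + eps) - 2 * U(Y) + U(Y - eps)) / eps ** 2
    return -d2 / d1

def rra(U, Y, eps=1e-4):
    """Relative risk aversion -Y*U''(Y)/U'(Y)."""
    return Y * ara(U, Y, eps)

alpha, gamma = 0.5, 3.0                      # assumed parameter values

def exponential(y):
    return -math.exp(-alpha * y)

def power(y):
    return y ** (1 - gamma) / (1 - gamma)

for Y in (1.0, 2.0, 4.0):
    print(f"Y = {Y}: ARA(exponential) = {ara(exponential, Y):.4f}, "
          f"RRA(power) = {rra(power, Y):.4f}")
```

The same two functions can be pointed at the log and quadratic utilities to check a hand-derived answer at a few wealth levels.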
As mentioned above, the power (and log) utility are considered "good" utility functions. Typical values for the degree of risk aversion are γ = 1, 2, 3, 5. The other two utility functions are not such good descriptors of human behavior (as you can see from the ARA and RRA functions you got). As we will see in later sections, the exponential utility is used because it simplifies the calculations when asset returns are normally distributed, and the quadratic utility simplifies them even further for any distribution.
2.5 Certainty Equivalent
How much is an investor willing to pay for a risky asset? Consider an investor with initial wealth Y. Consider a gamble $Z = (Z_1, Z_2, \pi)$.
Definition (Certainty Equivalent). CE(Y, Z), the certainty equivalent of the risky investment Z, is the certain amount of money which provides the same utility as the gamble, i.e.,
$$E[U(Y + Z)] = U(Y + CE)$$
The investor is indifferent between receiving CE(Y, Z) for sure and playing the gamble Z. In other words, if the investor owns the asset, he is willing to sell it at a price equal to the certainty equivalent. The CE is useful to compare different assets in more intuitive terms (money, instead of utility numbers).
Note that a risk-averse agent will always value an asset at something less than its expected payoff: CE < E[Z].[13]
[12] Thinking about the cross section of assets, note that (2.3) allows different assets to have different expected returns: π increases with θ, and thus the expected return also increases with θ. Does this make sense? Think about risk!
[13] Let Z be any random variable. Since U is strictly concave ($U'' < 0$), from Jensen's inequality,
$$E[U(Y + Z)] < U(E[Y + Z]) = U(Y + E[Z])$$
Hence, from the definition of CE,
$$U(Y + CE) < U(Y + E[Z])$$
Since U is increasing ($U' > 0$), we must have
$$Y + CE < Y + E[Z] \;\Rightarrow\; CE < E[Z]$$
Example 2.5.1. The investor has log utility and initial wealth Y = 1000. The risky investment is Z = (200, 0, 0.5). Compute the CE:
$$E[U(Y + Z)] = U(Y + CE) \;\Rightarrow\; \ldots \;\Rightarrow\; CE = 95.45$$
Why is the investor willing to accept less than the expected value of the gamble, i.e., why is CE = 95.45 < E[Z] = 100? Risk aversion.
Plot the utility function, marking the points $Y + Z_1$, $Y + Z_2$, $Y + E[Z]$, $Y + CE$.
[Figure: empty axes for the reader's plot; horizontal axis Y, vertical axis U(Y).]
Consider now a fair gamble:
Example 2.5.2. The investor has log utility and initial wealth Y = 100. The risky prospect is Z = (20, −20, 0.5). We get:
$$E[U(Y + Z)] = U(Y + CE) \;\Rightarrow\; \tfrac{1}{2}\ln(120) + \tfrac{1}{2}\ln(80) = \ln(100 + CE) \;\Rightarrow\; CE = -2.02$$
What does it mean for the CE to be negative? Plot the utility function, marking the points $Y + Z_1$, $Y + Z_2$, $Y + E[Z]$, $Y + CE$.
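The certainty equivalents in examples 2.5.1 and 2.5.2 follow from inverting the utility function: CE = U^{-1}(E[U(Y + Z)]) − Y. A minimal sketch for log utility is given below; the function name `certainty_equivalent` and its argument layout are illustrative assumptions.

```python
import math

def certainty_equivalent(Y, outcomes, probs, U=math.log, U_inv=math.exp):
    """CE solving E[U(Y + Z)] = U(Y + CE), i.e. CE = U_inv(E[U(Y + Z)]) - Y.
    Defaults to log utility, whose inverse is the exponential."""
    eu = sum(p * U(Y + z) for z, p in zip(outcomes, probs))
    return U_inv(eu) - Y

ce_1 = certainty_equivalent(1000, (200, 0), (0.5, 0.5))    # example 2.5.1
ce_2 = certainty_equivalent(100, (20, -20), (0.5, 0.5))    # example 2.5.2
print(f"example 2.5.1: CE = {ce_1:.2f}")   # 95.45, as in the text
print(f"example 2.5.2: CE = {ce_2:.2f}")   # -2.02, as in the text
```

For other utility functions, pass the function and its inverse explicitly through the `U` and `U_inv` arguments.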
2.6 Stochastic dominance
We now reverse gears and look for circumstances where the ranking among random
variables is preference free, that is, where we do not need to specify a utility function. We
will develop two concepts of dominance that are weaker, thus more broadly applicable,
than state-by-state dominance.
2.6.1 First Order Stochastic Dominance
Consider two assets, $X_1$ and $X_2$, with the following payoffs:
State (s)   Prob(s)   Payoff X_1   Payoff X_2
1           0.4       10           10
2           0.4       100          100
3           0.2       100          2000
Clearly, all rational investors prefer $X_2$: it at least matches $X_1$ and has a positive probability of exceeding it.
To formalize this intuition, let $F_i(x)$ denote the cumulative distribution function of $X_i$, that is, $F_i(x) = \text{Prob}[X_i \le x]$.
Definition (1SD). $F_a(x)$ 1SD $F_b(x)$ ⇔ $F_a(x) \le F_b(x), \; \forall x$
Plot the two distribution functions in the example and check that $F_2(x) \le F_1(x), \; \forall x$.
Note that if the distribution function of $X_2$ never lies above that of $X_1$, then the probability of $X_2$ exceeding a given payoff is always at least as large, that is,
$$F_2(x) \le F_1(x) \;\Rightarrow\; 1 - F_2(x) \ge 1 - F_1(x) \;\Rightarrow\; \text{Prob}[X_2 > x] \ge \text{Prob}[X_1 > x], \; \forall x$$
The usefulness of this concept comes from the following theorem:
Theorem 2.6.1.
$$F_a(x) \text{ 1SD } F_b(x) \;\Leftrightarrow\; E_a[U(x)] \ge E_b[U(x)] \text{ for all nondecreasing } U$$
where $E_i$ is the expectation under the distribution of i, $E_i[U(x)] = \int U(x)\,dF_i(x) = \int U(x) f_i(x)\,dx$.
Hence, all nonsatiable investors prefer asset $X_2$.
Note that 1SD is not the same as state-by-state dominance. See exercise 4.8 in
Danthine and Donaldson (2005).
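The 1SD definition can be checked mechanically for the two assets in the table. The sketch below uses exact fractions to avoid rounding issues when comparing probabilities; the function names are illustrative assumptions.

```python
from fractions import Fraction as Fr

# (payoff, probability) pairs from the table of X_1 and X_2
X1 = [(10, Fr(2, 5)), (100, Fr(2, 5)), (100, Fr(1, 5))]
X2 = [(10, Fr(2, 5)), (100, Fr(2, 5)), (2000, Fr(1, 5))]

def cdf(dist, x):
    """F(x) = Prob[X <= x] for a discrete distribution."""
    return sum(p for payoff, p in dist if payoff <= x)

def first_order_dominates(a, b):
    """a 1SD b iff F_a(x) <= F_b(x) for all x. Both CDFs are step functions,
    so it is enough to compare them at the jump points of either one."""
    grid = sorted({payoff for payoff, _ in a + b})
    return all(cdf(a, x) <= cdf(b, x) for x in grid)

print("X2 1SD X1:", first_order_dominates(X2, X1))
print("X1 1SD X2:", first_order_dominates(X1, X2))
```

Here the check succeeds for X_2 over X_1 and fails in the other direction, because F_1(100) = 1 exceeds F_2(100) = 0.8.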
2.6.2 Second Order Stochastic Dominance
1SD is still a very strong condition, thus not applicable to most situations. If we add
the assumption of risk aversion, we get the much more useful concept of Second Order
Stochastic Dominance (2SD).
Consider the following investments:
       X_3                 X_4
Payoff    Prob      Payoff    Prob
4         0.25      1         1/3
5         0.50      6         1/3
9         0.25      8         1/3
Plot the two distribution functions. Even though neither investment 1SD the other, intuitively $X_3$ "looks" better. To make this precise:
Definition (2SD).
$$F_a(x) \text{ 2SD } F_b(x) \;\Leftrightarrow\; \int_{-\infty}^{x} F_a(s)\,ds \le \int_{-\infty}^{x} F_b(s)\,ds, \; \forall x \;\Leftrightarrow\; \int_{-\infty}^{x} [F_b(s) - F_a(s)]\,ds \ge 0, \; \forall x$$
That is, at any point the accumulated difference between $F_b$ and $F_a$ must be nonnegative. Note that 1SD implies 2SD, but the converse is not true.
In the plot of the previous example, this basically means that the area of the difference where $F_3 > F_4$ is "small". To make this a bit more precise, we can compute the integrals at all relevant jump points.
x    F_3(x)   ∫_0^x F_3(s)ds   F_4(x)   ∫_0^x F_4(s)ds   ∫_0^x F_4(s)ds − ∫_0^x F_3(s)ds
1    0.00     0                1/3      0                0 ≥ 0
4    0.25     0                1/3      1                1 ≥ 0
5    0.75     0.25             1/3      4/3              13/12 ≥ 0
6    0.75     1.00             2/3      5/3              2/3 ≥ 0
8    0.75     2.50             3/3      3                0.50 ≥ 0
9    1.00     3.25             3/3      4                0.75 ≥ 0
The last column shows that $\int_{-\infty}^{x} [F_4(s) - F_3(s)]\,ds \ge 0, \; \forall x$. (After x = 9, the difference between the two integrals will always be 0.75 ≥ 0.)
All risk averse investors will prefer $X_3$, as the following theorem shows.
Theorem 2.6.2.
$$F_a(x) \text{ 2SD } F_b(x) \;\Leftrightarrow\; E_a[U(x)] \ge E_b[U(x)] \text{ for all nondecreasing and concave } U$$
Note that risk aversion is enough, i.e., we do not have to assume a specific utility
function.
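The table of integrals for X_3 and X_4 can be reproduced with a short script. The sketch relies on the identity that, for a discrete random variable, the integral of F(s) up to x equals E[max(x − X, 0)], and it treats the three probabilities of X_4 as exactly 1/3. Function names are illustrative assumptions.

```python
from fractions import Fraction as Fr

# (payoff, probability) pairs; X_4 probabilities taken as exactly 1/3
X3 = [(4, Fr(1, 4)), (5, Fr(1, 2)), (9, Fr(1, 4))]
X4 = [(1, Fr(1, 3)), (6, Fr(1, 3)), (8, Fr(1, 3))]

def integrated_cdf(dist, x):
    """Integral of F(s) ds from -infinity to x, equal to E[max(x - X, 0)]."""
    return sum(p * max(x - payoff, 0) for payoff, p in dist)

def second_order_dominates(a, b):
    """a 2SD b iff the integrated CDF of a never exceeds that of b.
    Both integrals are piecewise linear with kinks at the payoffs, and their
    difference is constant beyond the largest payoff, so the kinks suffice."""
    grid = sorted({payoff for payoff, _ in a + b})
    return all(integrated_cdf(a, x) <= integrated_cdf(b, x) for x in grid)

grid = sorted({payoff for payoff, _ in X3 + X4})
diffs = [integrated_cdf(X4, x) - integrated_cdf(X3, x) for x in grid]
for x, d in zip(grid, diffs):
    print(f"x = {x}: integral of (F_4 - F_3) = {d}")
print("X3 2SD X4:", second_order_dominates(X3, X4))
```

The printed differences match the last column of the table (0, 1, 13/12, 2/3, 1/2, 3/4), and the final value 3/4 equals E[X_3] − E[X_4] = 5.75 − 5.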
Mean preserving spread. The concept of 2SD is even more useful to understand
the tradeoff between risk and return.
Definition. Suppose there exists a random variable Z s.t. $X_b = X_a + Z$, with $E[Z|X_a] = 0$ for all values of $X_a$. Then, we say that $X_b$ is a mean preserving spread of $X_a$. (Or $F_b$ or $f_b$ is a m.p.s. of $F_a$ or $f_a$.)
Note that $X_b$ has the same mean as $X_a$, but it is noisier, i.e., riskier. Intuitively, all risk averse investors should prefer the payoff with less risk, $X_a$. The following theorem justifies this intuition:
Theorem 2.6.3. Let $F_a(x)$ and $F_b(x)$ be two distribution functions with identical means. Then,
$$F_a(x) \text{ 2SD } F_b(x) \;\Leftrightarrow\; F_b \text{ is a mean preserving spread of } F_a$$
Mean-Variance criterion. This popular investment criterion states that: (i) for
two investments with the same mean, investors prefer the one with smaller variance; (ii)
for two investments with the same variance, investors prefer the one with higher mean.
We will discuss later the exact conditions for this criterion to be true. For now, note
that theorem 2.6.3 helps to explain part (i).
2.7 Exercises
Ex. 1 — (This is problem 3.1. in Danthine and Donaldson (2005))
Utility function. Under certainty, any increasing monotone transformation of a utility
function is also a utility function representing the same preferences. Under uncertainty,
we must restrict this statement to linear transformations if we are to keep the same
preference representation.
Check it with this example. Assume an initial utility function attributes the following
values to 3 perspectives:
B u(B) = 100
M u(M) = 10
P u(P) = 50
a. Check that with this initial utility function, the lottery L = (B, M, 0.50) ≻ P.
b. The proposed transformations are f(x) = a + bx, a ≥ 0, b > 0 and g(x) = ln(x).
Check that under f, L ≻ P, but that under g, P ≻ L.
Ex. 2 — (This is problem 3.3. in Danthine and Donaldson (2005))
Inter-temporal consumption. Consider a two-date economy and an agent with utility
function over consumption:
\[U(c) = \frac{c^{1-\gamma}}{1-\gamma}, \quad \gamma > 0\]
at each period. Define the inter-temporal utility function as \(V(c_1, c_2) = U(c_1) + U(c_2)\). Show that the agent will always prefer a smooth consumption stream to a more variable one with the same mean, that is,
\[U(\bar c) + U(\bar c) > U(c_1) + U(c_2), \quad \text{if } \bar c = \frac{c_1 + c_2}{2}\]
1. Start by showing that the utility function U is concave.
2. Then, show the required relation geometrically.
3. Finally, do the proof formally.
Hint: use the following definition of a concave function. A function \(f : \mathbb{R}^N \to \mathbb{R}^1\) is concave if
\[f(ax + (1-a)y) \ge a f(x) + (1-a) f(y), \quad \forall x, y \in \mathbb{R}^N \text{ and } \forall a \in [0, 1]\]
Ex. 3 — An agent with wealth = 100 is faced with the following game: with probability
1/2 his wealth will increase to 200; with probability 1/2 it will decrease to 0. Complete
the following sentence:
If the agent is a risk-______ he is willing to pay some money to play this game, whereas if he is risk-______ he is willing to pay some money to avoid the game.
Ex. 4 — The ARA and RRA measures have the first derivative of the utility function
in the denominator. Why? Hint: read Danthine and Donaldson (2005)
Ex. 5 — Prove equation (2.2).
Ex. 6 — Complete the table in section 2.4 and plot the utility functions.
Ex. 7 — The CRRA utility function is usually presented as
\[U(W) = \begin{cases} \ln(W), & \gamma = 1 \\ W^{1-\gamma}/(1-\gamma), & \gamma > 1 \end{cases}\]
because \(\ln(W)\) is “almost” the limiting case as \(\gamma \to 1\). More precisely, the true limit is \(\lim_{\gamma \to 1} \frac{W^{1-\gamma} - 1}{1-\gamma} = \ln(W)\).
1. Explain why \(U_1(W) = \frac{W^{1-\gamma}}{1-\gamma}\) and \(U_2(W) = \frac{W^{1-\gamma} - 1}{1-\gamma}\) represent exactly the same preferences.
2. Prove that
\[\lim_{\gamma \to 1} \frac{W^{1-\gamma} - 1}{1-\gamma} = \ln(W)\]
Hint: L’Hôpital’s rule.
Ex. 8 — Consider the utility function \(U(Y) = 5 + 10Y^2\). What does it imply in terms of risk-taking behavior? Would it be economically reasonable to model an investor’s behavior with this utility function?
Ex. 9 — An investor has an initial wealth of Y = 10. To play a game where he could win or lose 5% of his wealth, he demands π = 0.6, where π is the probability of the favorable outcome (winning 5%). Nonetheless, if his wealth were Y = 1000, he would still demand the same π = 0.6 to play the game.
1. What can you say about the risk characteristics of this investor? (One sentence answer).
2. Give an example of a utility function consistent with this behavior.
Ex. 10 — The risk-aversion characteristics of an investor can be described by two
functions: ARA and RRA.
1. Give a very brief definition in words of these two measures.
2. What does it mean to say that an investor has increasing ARA? Does it make intuitive sense? Give an example of a utility function with this characteristic.
3. Give an example of a utility function with constant RRA (compute the actual coefficient of RRA).
Ex. 11 — An investor with initial wealth \(Y_0 = 100\) is faced with the following lottery: win 20 with 0.3 probability; lose 20 with 0.7 probability. The utility function is \(U(W) = \ln(W)\). What is the Certainty Equivalent of this lottery? What does this number mean?
Ex. 12 — Consider the following risky investment: Z = (100, 0, 0.5). The investor has
log utility, U = ln(Y ).
1. If the initial wealth is Y = 100, what is the certainty equivalent of the gamble?
2. If the initial wealth is Y = 1, what is the certainty equivalent of the gamble?
3. Explain in simple terms the change in CE.
Ex. 13 — Exercise 4.5 in Danthine and Donaldson (2005, p.354)
Ex. 14 — Exercise 4.7 in Danthine and Donaldson (2005, p.355). They meant to refer
to table 4.2.
Ex. 15 — Exercise 4.8 in Danthine and Donaldson (2005, p.355). Be careful in distinguishing between states of nature and distributions defined over payoffs.
Ex. 16 — Consider two assets with returns \(r_a \sim N(0.1, 0.2)\) and \(r_b \sim N(0.1, 0.3)\). An investor has the utility function \(U(W) = -\exp(-\gamma W)\). Which asset does the investor prefer?
Chapter 3
Portfolio choice
1. The investor’s typical problem is \(\max_a\ E[U(Y)]\).
2. It can be solved explicitly if we assume either:
   1. Quadratic utility, or
   2. CARA utility and normal returns.
3.1 Canonical portfolio problem
This section analyzes the problem of an investor who must decide how much to invest in a risky asset. Consider the following notation [1]:
\(a\) ≡ amount (in $) to invest in a risky portfolio
\(\tilde r\) ≡ uncertain rate of return on the risky portfolio
\(r_f\) ≡ risk-free (certain) rate of return
\(Y_0\) ≡ initial wealth
\(\tilde Y_1\) ≡ terminal wealth
\[\tilde Y_1 = a(1 + \tilde r) + (Y_0 - a)(1 + r_f) = Y_0(1 + r_f) + a(\tilde r - r_f)\]
The investor’s problem is
\[\max_a\ E[U(\tilde Y_1)] \tag{3.1}\]
[1] Tildes denote random variables. We’ll drop them when it is clear which variables are random.
The (necessary) first order condition for a maximum is
\[\text{foc:}\quad \frac{d}{da} E[U(\tilde Y_1)] = 0 \iff E\left[\frac{dU(\cdot)}{d\tilde Y_1}(\tilde r - r_f)\right] = 0\]
and the (sufficient) second order condition is
\[\text{soc:}\quad \frac{d^2}{da^2} E[U(\tilde Y_1)] < 0 \iff E\left[\frac{d^2U(\cdot)}{d\tilde Y_1^2}(\tilde r - r_f)^2\right] < 0\]
which is true if the investor is risk averse (\(U'' < 0\)).
Example 3.1.1. Assume \(U = 11Y - 5Y^2\), with \(Y_0\) = $1. Let \(r_f = 0\), \(E[r] = 0.1\), \(\mathrm{Var}[r] = 0.2^2\). Recall \(\mathrm{Var}[x] = E[x^2] - E[x]^2\). Use the foc to get the optimal amount invested in the risky asset:
foc:
... a = $0.2
(For more real-life numbers, suppose the initial wealth was one million dollars. Then, the optimal amount to invest in the risky asset would be $200 000.)
Use the soc to check that this is indeed a maximum:
soc:
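As a check on the number quoted in example 3.1.1, the foc can be evaluated directly. The sketch below uses only the values given in the example; the variable names are chosen for this illustration. With \(r_f = 0\) and \(U'(Y) = 11 - 10Y\), the foc \(E[U'(Y_1)\,r] = 0\) becomes \((11 - 10Y_0)E[r] - 10\,a\,E[r^2] = 0\).

```python
# Inputs from example 3.1.1.
Y0 = 1.0
rf = 0.0
E_r = 0.1
var_r = 0.2 ** 2

# Second moment from Var[x] = E[x^2] - E[x]^2.
E_r2 = var_r + E_r ** 2

# U(Y) = 11Y - 5Y^2  =>  U'(Y) = 11 - 10Y and U''(Y) = -10.
# With rf = 0, Y1 = Y0 + a*r, so the foc E[(11 - 10*Y1) * r] = 0 reads
# (11 - 10*Y0) * E[r] - 10 * a * E[r^2] = 0.
a_opt = (11 - 10 * Y0) * E_r / (10 * E_r2)

# soc: E[U''(Y1) * r^2] = -10 * E[r^2], negative for any a.
soc = -10 * E_r2

print(a_opt, soc)
```

The second-order expression is negative regardless of \(a\), so the stationary point is a maximum.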
The analysis of the optimality conditions produces the following important theorem:
Theorem 3.1.1. Let \(\hat a\) denote the solution to problem (3.1) and assume the investor is nonsatiable (\(U' > 0\)) and risk-averse (\(U'' < 0\)). Then
\[\hat a > 0 \iff E[r] > r_f\]
\[\hat a = 0 \iff E[r] = r_f\]
\[\hat a < 0 \iff E[r] < r_f\]
The theorem says that a risk-averse investor will only invest in the risky asset (stocks) if its expected return is higher than the risk-free rate. Conversely, if this is the case (\(E[r] > r_f\)), then the investor will always participate in the stock market (even if with just a tiny amount of money).
Example 3.1.2. Suppose \(U(Y) = \ln(Y)\). For simplicity, assume the risky return is the simple lottery \((r_2, r_1, \pi)\). Further assume \(r_2 > r_f > r_1\) (why?). The problem is thus
\[\max_a\ E[\ln(\tilde Y_1)]\]
The foc is
\[E\left[\frac{r - r_f}{Y_0(1 + r_f) + a(r - r_f)}\right] = 0\]
or, given the two possible states,
\[\pi\,\frac{r_2 - r_f}{Y_0(1 + r_f) + a(r_2 - r_f)} + (1 - \pi)\,\frac{r_1 - r_f}{Y_0(1 + r_f) + a(r_1 - r_f)} = 0\]
which after some algebra is
\[\frac{a}{Y_0} = \frac{(1 + r_f)(E[r] - r_f)}{-(r_1 - r_f)(r_2 - r_f)}\]
Check that the sign of the rhs depends on the sign of \(E[r] - r_f\). In particular, if \(E[r] - r_f > 0\), we get \(a/Y_0 > 0\), as in theorem 3.1.1. Note also the following intuitive results:
1) The fraction of wealth invested in the risky asset (\(a/Y_0\)) increases with the return premium (\(E[r] - r_f\));
2) The fraction of wealth invested in the risky asset (\(a/Y_0\)) decreases with the return “dispersion” around \(r_f\), (\(-(r_1 - r_f)(r_2 - r_f)\)).
Lastly, note that the fraction of wealth invested in the risky asset (\(a/Y_0\)) does not depend on the level of wealth (there is no \(Y_0\) on the rhs). This result is specific to the CRRA utility function as described in a theorem below. [2]
[2] See the numerical examples in Danthine and Donaldson (2005) for further interpretation.
3.2 Analysis of the optimal portfolio choice
3.2.1 Risk aversion
We now relate the portfolio decision to the risk aversion of the investor.
The following theorem states, quite intuitively, that a more risk averse individual will invest less in the stock market:
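Theorem 3.2.1. Let \(\hat a\) denote the solution to problem (3.1).
\[\forall Y > 0,\quad \mathrm{ARA}_{inv1}(Y) > \mathrm{ARA}_{inv2}(Y) \implies \hat a_{inv1} < \hat a_{inv2}\]
Furthermore, since \(\mathrm{ARA}_{inv1}(Y) > \mathrm{ARA}_{inv2}(Y) \iff \mathrm{RRA}_{inv1}(Y) > \mathrm{RRA}_{inv2}(Y)\), we also have
\[\forall Y > 0,\quad \mathrm{RRA}_{inv1}(Y) > \mathrm{RRA}_{inv2}(Y) \implies \hat a_{inv1} < \hat a_{inv2}\]
Let’s check this result: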
Example 3.2.1. Assume \(r_f = 0.05\) and \(r = (r_2 = 0.4,\ r_1 = -0.2,\ 1/2)\). For \(U(Y) = \ln(Y)\), we can use the results in the last example to get
\[\hat a / Y_0 = 0.6\]
Now consider the power utility function \(U(Y) = Y^{1-\gamma}/(1-\gamma)\), with \(\gamma = 3\). Note that it has both higher RRA (3 > 1) and ARA (3/Y > 1/Y). Check (end-of-chapter exercise 18) that the optimal portfolio decision for this utility function is
\[\hat a / Y_0 = 0.198\]
Hence, this more risk-averse agent invests a smaller percentage of his wealth in the risky asset. The initial wealth (\(Y_0\)) is the same for both investors, so the money invested (\(\hat a\)) is also smaller, as the theorem stated.
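Both numbers in example 3.2.1 can be checked numerically. The sketch below is one possible way to do it, not necessarily the route taken in the book: for CRRA utility and a two-state return, the foc \(\pi x_2 Y_{1,2}^{-\gamma} + (1-\pi) x_1 Y_{1,1}^{-\gamma} = 0\) (with \(x_i = r_i - r_f\)) pins down the ratio of terminal wealth across states, which can then be solved for \(a/Y_0\). The function name `crra_weight` is hypothetical.

```python
def crra_weight(gamma, r1, r2, pi, rf):
    """Optimal a/Y0 for CRRA utility with the two-state lottery (r2, r1, pi).

    Assumes r2 > rf > r1. gamma = 1 corresponds to log utility.
    """
    x1, x2 = r1 - rf, r2 - rf
    # From the foc: (Y1 in bad state / Y1 in good state)^gamma
    #             = (1 - pi) * (-x1) / (pi * x2)
    k = ((1 - pi) * (-x1) / (pi * x2)) ** (1.0 / gamma)
    # Solve W + a*x1 = k * (W + a*x2) for a, with W = Y0 * (1 + rf).
    return (1 + rf) * (1 - k) / (k * x2 - x1)

# Numbers from example 3.2.1.
w_log = crra_weight(1, r1=-0.2, r2=0.4, pi=0.5, rf=0.05)
w_power = crra_weight(3, r1=-0.2, r2=0.4, pi=0.5, rf=0.05)

print(round(w_log, 3), round(w_power, 3))
```

For \(\gamma = 1\) the function reduces to the closed form of example 3.1.2, which gives a quick consistency check.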
3.2.2 Wealth
We now analyze the portfolio decision as the initial wealth changes. We might expect
wealthier investors to put more money in the stock market. However, the result is not
so simple; it depends on the characteristics of the specific utility function.
Absolute Risk Aversion
Theorem 3.2.2. Let \(\hat a = \hat a(Y_0)\) denote the solution to problem (3.1). Then,
(Decreasing ARA) \(\mathrm{ARA}'(Y) < 0 \Rightarrow \hat a'(Y_0) > 0\)
(Constant ARA) \(\mathrm{ARA}'(Y) = 0 \Rightarrow \hat a'(Y_0) = 0\)
(Increasing ARA) \(\mathrm{ARA}'(Y) > 0 \Rightarrow \hat a'(Y_0) < 0\)
DARA. If the investor has decreasing absolute risk aversion (DARA), he is willing to put more money at risk as he becomes wealthier. Recall that power utility has DARA (\(\mathrm{ARA}(Y) = \gamma/Y\)). (Is this reasonable behavior?)
CARA. The second case, constant absolute risk aversion (CARA), is also important because the exponential utility satisfies this condition. Recall that
\[U(Y) = -\exp(-\alpha Y) \Rightarrow \mathrm{ARA}(Y) = \alpha \Rightarrow \mathrm{ARA}'(Y) = 0\]
The theorem states that this investor will put the same amount of money in the risky asset regardless of how much wealth he has. (Is this a reasonable description of investors’ behavior?)
Illustration: solving the problem for CARA
Let’s verify the CARA case of the theorem. The portfolio problem is
\[\max_a\ \{E[-\exp(-\alpha Y_1)]\} \tag{3.2}\]
with \(Y_1 = Y_0(1 + r_f) + a(r - r_f)\). The foc is
\[E[\alpha(r - r_f)\exp(-\alpha Y_1)] = 0 \tag{3.3}\]
which cannot be solved explicitly for \(a\) without further assumptions! To proceed, we consider two alternatives.
1. Implicit Function Theorem
Even though we cannot explicitly solve the problem, we can still describe the optimal solution using a very useful trick in economics: the Implicit Function Theorem. [3] Intuitively, this theorem says the following. Suppose the (implicit) function \(y = y(x)\) is the solution to some equation, that is, \(f(x, y) = 0\). More precisely, as we change \(x\), \(y(x)\) adjusts to keep \(f\) at 0, \(f(x, y) \equiv 0\). We can thus conclude that \(f\) does not change, i.e., its total differential is zero. Therefore,
\[df(x, y) = 0 \iff \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy = 0 \iff \frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}\]
Going back to the maximization problem, \(\hat a = \hat a(Y_0)\) is the implicit function that guarantees that the lhs of (3.3) is always zero. We can thus take the total differential of the foc and get
\[\frac{d\hat a(Y_0)}{dY_0} = -\frac{\partial E[\dots]/\partial Y_0}{\partial E[\dots]/\partial a} = -\frac{(1 + r_f)\,\overbrace{E[\alpha(r - r_f)e^{-\alpha Y_1}]}^{=0\ \text{(foc)}}}{\underbrace{E[\alpha^2(r - r_f)^2 e^{-\alpha Y_1}]}_{>0}} = 0\]
Hence, the amount invested in the risky asset does not change with the investor’s wealth, as the theorem claimed. Furthermore, the implicit function theorem allowed us to check this without solving the maximization problem explicitly.
[3] Implicit Function Theorem. Consider the equation \(f(y, x_1, \dots, x_m) = 0\) and the solution \((\bar y, \bar x_1, \dots, \bar x_m)\). If \(\partial f(\bar y, \bar x)/\partial y \ne 0\), then there exists an implicit function \(y = y(x_1, \dots, x_m)\) that satisfies the equation for every \((x_1, \dots, x_m)\) in the neighborhood of \((\bar x_1, \dots, \bar x_m)\), i.e., \(f(y(x_1, \dots, x_m), x_1, \dots, x_m) = 0\). Furthermore, the partial derivatives are given by
\[\frac{\partial y(\bar x_1, \dots, \bar x_m)}{\partial x_i} = -\frac{\partial f(\bar y, \bar x_1, \dots, \bar x_m)/\partial x_i}{\partial f(\bar y, \bar x_1, \dots, \bar x_m)/\partial y}\]
2. Normal returns
To get an explicit closed-form solution to problem (3.2) we need an additional assumption. It is this assumption that justifies the wide use of exponential utility. Assume the return on the risky asset is normally distributed, \(r \sim N(\mu, \sigma^2)\). Then, next period’s wealth is also normally distributed, \(Y_1 \sim N(Y_0(1 + r_f) + a(\mu - r_f),\ a^2\sigma^2)\). Using the moment generating function for the normal distribution [4], we can simplify the portfolio problem:
\[\max_a\ \{E[-\exp(-\alpha Y_1)]\} = \max_a \left\{-\exp\left(-\alpha[Y_0(1 + r_f) + a(\mu - r_f)] + \tfrac{1}{2}\alpha^2 a^2 \sigma^2\right)\right\}\]
that is, the rhs does not have \(E[\cdot]\). We can thus solve the maximization problem and get a closed-form solution for \(a\). Exercise 24 asks you to do these final steps. Check that the final expression for \(a\) does not depend on \(Y_0\), as the theorem stated. To summarize, even though the exponential utility is not the best intuitive description of human behavior, it is very useful if we assume that returns are normally distributed.
[4] If \(X \sim N(m, s^2)\), then \(E[e^{-\gamma X}] = \exp\left(-\gamma m + \tfrac{1}{2}\gamma^2 s^2\right)\), for any \(\gamma\).
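The CARA case of theorem 3.2.2 can also be illustrated numerically. The following is a sketch under assumptions that are not part of the text above: the risky return is the two-state lottery of example 3.2.1, \(\alpha = 0.05\) is an arbitrary choice, and the foc (3.3) is solved by plain bisection (the helper names `foc` and `solve_a` are made up). For a two-state return the foc can be solved by hand, which gives a closed form to compare against.

```python
import math

def foc(a, y0, alpha, rf, states):
    """Left-hand side of (3.3) for a discrete return distribution."""
    total = 0.0
    for r, p in states:
        y1 = y0 * (1 + rf) + a * (r - rf)
        total += p * alpha * (r - rf) * math.exp(-alpha * y1)
    return total

def solve_a(y0, alpha, rf, states, lo=0.0, hi=1000.0):
    """Bisection on the foc, which is strictly decreasing in a."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if foc(mid, y0, alpha, rf, states) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha, rf = 0.05, 0.05                 # alpha is an illustrative assumption
states = [(0.4, 0.5), (-0.2, 0.5)]     # (return, probability) pairs

# Solve the same problem for three very different wealth levels.
solutions = [solve_a(y0, alpha, rf, states) for y0 in (10.0, 100.0, 1000.0)]

# Two-state closed form: a = ln(pi*x2 / ((1-pi)*(-x1))) / (alpha*(x2 - x1)).
x2, x1 = 0.4 - rf, -0.2 - rf
closed_form = math.log(0.5 * x2 / (0.5 * -x1)) / (alpha * (x2 - x1))

print(solutions, closed_form)
```

The three solutions coincide because the factor \(e^{-\alpha Y_0(1+r_f)}\) multiplies every term of the foc and therefore drops out.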
Relative Risk Aversion
We can also characterize the optimal portfolio choice in terms of the relative risk aversion measure, RRA. Define \(\hat w \equiv \hat a/Y_0\), the optimal proportion of wealth invested in the risky asset, or the optimal portfolio weight in the risky asset.
Theorem 3.2.3. Express the solution to problem (3.1) as a fraction of wealth, \(\hat w(Y_0) \equiv \hat a(Y_0)/Y_0\). Then,
(Decreasing RRA) \(\mathrm{RRA}'(Y) < 0 \Rightarrow \hat w'(Y_0) > 0\)
(Constant RRA) \(\mathrm{RRA}'(Y) = 0 \Rightarrow \hat w'(Y_0) = 0\)
(Increasing RRA) \(\mathrm{RRA}'(Y) > 0 \Rightarrow \hat w'(Y_0) < 0\)
For example, if the investor has decreasing RRA, he will invest a higher proportion of wealth in the risky asset as he becomes wealthier. The most interesting case is perhaps the constant relative risk aversion (CRRA) case, as it characterizes the power and log utility functions. These investors always invest the same fraction of their wealth in the stock market, regardless of their initial wealth. [5]
Example 3.2.2. Consider \(U = \ln(Y)\). Define \(w \equiv a/Y_0\), the fraction of wealth invested in the risky asset. The investor’s problem is to
\[\max_w\ E[\ln(Y_1)]\]
with \(Y_1 = Y_0(1 + r_f) + wY_0(r - r_f)\). Writing the foc and using the implicit function theorem, we can show that (end-of-chapter exercise 19)
\[\frac{d\hat w}{dY_0} = 0\]
That is, the optimal fraction does not change with wealth.
[5] This theorem can also be expressed in terms of \(\eta \equiv \frac{d\hat a/\hat a}{dY_0/Y_0}\), the wealth elasticity of the investment in the risky asset:
(Decreasing RRA) \(\mathrm{RRA}'(Y) < 0 \Rightarrow \eta > 1\)
(Constant RRA) \(\mathrm{RRA}'(Y) = 0 \Rightarrow \eta = 1\)
(Increasing RRA) \(\mathrm{RRA}'(Y) > 0 \Rightarrow \eta < 1\)
To see that increasing \(\hat w(Y_0) \equiv \hat a(Y_0)/Y_0\) is the same as \(\eta > 1\), note
\[\frac{d}{dY_0}[\hat w(Y_0)] = \frac{d}{dY_0}\left[\frac{\hat a(Y_0)}{Y_0}\right] > 0 \iff \frac{d\hat a}{dY_0}\frac{1}{Y_0} - \frac{\hat a}{Y_0^2} > 0 \iff \frac{d\hat a}{dY_0} > \frac{\hat a}{Y_0} \iff \frac{d\hat a/\hat a}{dY_0/Y_0} > 1\]
and similarly for the other cases.
3.3 Canonical portfolio problem for N > 1
Now we generalize the portfolio choice problem. There are N risky assets and 1 risk-free asset. Terminal wealth is
\[\tilde Y_1 = Y_0(1 + r_f) + \sum_{i=1}^{N} a_i(\tilde r_i - r_f)\]
The investor’s problem is thus
\[\max_{\{a_1, \dots, a_N\}}\ E\left[U\left(Y_0(1 + r_f) + \sum_{i=1}^{N} a_i(\tilde r_i - r_f)\right)\right]\]
It will be convenient to choose weights instead of $ values. We thus define \(w_i \equiv a_i/Y_0\) and write \(Y_1 = Y_0(1 + r_f) + \sum_{i=1}^{N} w_i Y_0(\tilde r_i - r_f)\). The investor’s problem can thus be rewritten as
\[\max_{\{w_1, \dots, w_N\}}\ E\left[U\left(Y_0\left[(1 + r_f) + \sum_{i=1}^{N} w_i(\tilde r_i - r_f)\right]\right)\right]\]
Define \(\tilde r_p\) to be the return on the portfolio:
\[\tilde r_p := w_f r_f + \sum_{i=1}^{N} w_i \tilde r_i\]
Imposing the constraint that the weights must add up to one, we have that
\[\tilde r_p = \left(1 - \sum_{i=1}^{N} w_i\right) r_f + \sum_{i=1}^{N} w_i \tilde r_i = r_f + \sum_{i=1}^{N} w_i(\tilde r_i - r_f)\]
Hence, the portfolio problem can also be written as
\[\max_{\{w_1, \dots, w_N\}}\ E[U(Y_0(1 + \tilde r_p))]\]
Unfortunately, this problem is hard to solve without some simplifying assumptions.
3.4 Exercises
Ex. 17 — State the investor’s problem (expression 3.1) in words.
Ex. 18 — Check the results in example 3.2.1. The final expression is in the book; you
just need to do the intermediate calculations. Caution: the expression in the book is
correct, but the number is not (at least I get a different answer: a/Y = 0.198 instead of
0.24).
Ex. 19 — Check the results in example 3.2.2, i.e., do the intermediate computations.
Ex. 20 — Consider the standard portfolio choice between a risk-free asset and a risky
stock. An investor with initial wealth $1000 makes an optimal choice to allocate $400
to the stock. We know that if the same investor had an initial wealth larger than $1000,
he would allocate more than $400 to the stock.
1. This investor has ______ (decreasing / constant / increasing) ARA.
2. Give an example of a utility function consistent with this behavior.
Ex. 21 — Consider the utility function \(U(Y) = -e^{-gY}\), where \(g\) is a constant parameter.
1. Compute the ARA and RRA coefficients.
2. Interpret in words the result obtained for ARA (relate it to a simple lottery and
to the portfolio choice problem).
Ex. 22 — Consider the canonical portfolio choice problem with 1 risky asset (with random return \(r\)) and 1 risk-free asset (with return \(r_f\)). The investor chooses the amount of money (\(a\)) to invest in the risky asset.
1. Write the problem explicitly for an investor with \(U(Y) = -\exp(-\alpha Y)\), where \(Y\) is the wealth.
2. If the risk-free rate increases, what should happen to the amount invested in the risky asset? Explain intuitively (5 lines).
3. Show it explicitly. Hint: compute \(da/dr_f\) and determine its sign.
Ex. 23 — There is a risk-free and a risky asset. The investor chooses the amount invested in the risky asset, \(a\), to maximize \(E[U(Y_1)]\), where \(Y_1\) is next period’s wealth. Assume a regular utility function (\(U' > 0\), \(U'' < 0\)).
1. In general, what can you say about the sign of \(da/dY_0\)?
2. Assume \(U(Y) = -e^{-\alpha Y}\). Compute \(da/dY_0\).
Ex. 24 — Consider the standard portfolio choice problem
\[\max_a\ E[-\exp(-\gamma Y_1)]\]
where next-period’s wealth is \(Y_1 = Y_0(1 + r_f) + a(r - r_f)\), and the return on the risky asset is normally distributed, \(r \sim N(\mu, \sigma^2)\). Compute the explicit optimal amount to invest in the risky asset (\(a\)).
Hint. Use the following property of the normal distribution (called moment generating function): If \(X \sim N(m, s^2)\), then \(E[e^{-\gamma X}] = \exp\left(-\gamma m + \tfrac{1}{2}\gamma^2 s^2\right)\), for any \(\gamma\).
Ex. 25 — Computing returns with dividends.
Consider the following daily closing prices and dividends (D) for two stocks (in $):

| day | t | Stock A: \(P_t\) | Stock A: \(D_t\) | Stock B: \(P_t\) | Stock B: \(D_t\) |
|---|---|---|---|---|---|
| fri | 0 | 10 | – | 10 | – |
| mon | 1 | 11 | – | 11 | – |
| tue | 2 | 10 | – | 10 | – |
| wed | 3 | 11 | – | 11 | 1.1 |
| thu | 4 | 9 | – | 9 | – |
| fri | 5 | 12 | – | 12 | – |

Note that when a stock pays dividends, the return should be computed as \(r_t = \frac{P_t + D_t}{P_{t-1}} - 1\).
1. Compute daily returns for these two stocks. Compute also the weekly returns assuming that the dividends are reinvested in the stock. This is a standard assumption, so use the standard formula, \(1 + r_{0,T} = (1 + r_{0,1})(1 + r_{1,2}) \dots (1 + r_{T-1,T})\).
Note: this is usually called Holding Period Return in databases such as CRSP or DataStream.
2. Suppose you invested $4,000 in A and $6,000 in B in the beginning of the week. Compute the portfolio return over this week. (Use the weekly returns already computed and apply the standard formula for the portfolio return).
3. Since we assume that dividends are reinvested in the stock, we may end up with more shares than we started with. How many shares of each stock do you have at the beginning of the week? How many shares do you have at the end of the week?
Note: to check that you have the right answer, compute the terminal value of the portfolio by doing \(V_5 = P_{A,5}N_{A,5} + P_{B,5}N_{B,5}\), where \(N\) is the number of shares that you got. It should imply the same weekly return as in the previous question.
4. Again, the way weekly returns were computed assumes that dividends are reinvested in the stock. Hence, while for the stock without dividends (A) we have
\[r_{A,week} = P_5/P_0 - 1\]
\[0.2 = 12/10 - 1\]
the same is no longer true for the dividend-paying stock (stock B)
\[r_{B,week} \ne P_5/P_0 - 1\]
\[0.32 \ne 12/10 - 1\]
Hence, databases usually also show an adjusted price, \(P^a\), that can be used to compute returns without having to know the dividends. The true return from market closes plus dividends must equal the return with adjusted closes:
\[\frac{P_t + D_t}{P_{t-1}} - 1 = \frac{P^a_t}{P^a_{t-1}} - 1\]
Fix the last price \(P^a_5 = P_5 = 12\). Compute the adjusted prices for the previous days for both stocks.
(Check my website for an exercise with data from finance.yahoo.com)
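The two weekly returns quoted in part 4 of exercise 25 (0.2 for A and 0.32 for B) can be confirmed by compounding the daily returns. This is a small sketch using the table’s data; the function name `holding_period_return` is chosen for this illustration, and missing dividends are coded as zero.

```python
def holding_period_return(prices, dividends):
    """Compound daily returns r_t = (P_t + D_t) / P_{t-1} - 1 over the sample."""
    gross = 1.0
    for t in range(1, len(prices)):
        gross *= (prices[t] + dividends[t]) / prices[t - 1]
    return gross - 1.0

# Closing prices for days t = 0..5 (identical for both stocks in the table).
prices = [10, 11, 10, 11, 9, 12]
div_a = [0, 0, 0, 0, 0, 0]        # stock A pays no dividends
div_b = [0, 0, 0, 1.1, 0, 0]      # stock B pays 1.1 on day 3

hpr_a = holding_period_return(prices, div_a)
hpr_b = holding_period_return(prices, div_b)

print(round(hpr_a, 4), round(hpr_b, 4))
```

The gap between the two numbers comes entirely from the day-3 dividend, which the compounding formula treats as reinvested at that day’s close.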
Chapter 4
Portfolio choice for
Mean-Variance investors
1. Quadratic utility or Normal returns imply mean-variance preferences, \(E[U] = f(\mu_p, \sigma_p^2)\).
2. The optimal investment opportunities are described by the mean-variance frontier.
3. The investor’s portfolio choice problem with N > 1 risky assets can be solved explicitly.
These concepts were developed by Harry Markowitz in 1952 and they are still the
benchmark for optimal portfolio allocation.
4.1 Mean-Variance preferences
The general portfolio problem (N > 1) is hard to solve unless we make one of the simplifying assumptions below. Either one of these assumptions will lead to mean-variance preferences, that is, to investors that care only about the first two moments of \(Y_1\) or \(r_p\). [1]
Expand \(U(\tilde Y_1)\) around \(E(\tilde Y_1)\). To simplify the notation, let \(Y \equiv Y_1\).
\[U(Y) = U(EY) + U'(EY) \cdot (Y - EY) + \tfrac{1}{2} \cdot U''(EY) \cdot (Y - EY)^2 + \text{remainder}\]
Taking expectations on both sides we get
\[EU(Y) = U(EY) + U'(EY) \cdot E[(Y - EY)] + \tfrac{1}{2} \cdot U''(EY) \cdot E[(Y - EY)^2] + E[\text{remainder}]\]
or, simplifying,
\[EU(Y) = U(EY) + \tfrac{1}{2} \cdot U''(EY) \cdot \mathrm{Var}(Y) + E[\text{remainder}]\]
Note that this expression, \(EU(Y)\), is what the investor maximizes in his portfolio problem. The question is thus “under what conditions can we say that E[remainder] = 0, or at least that E[remainder] itself depends only on the first two moments of wealth?”
[1] Note that the two are related: \(E[Y_1] = Y_0(1 + E[r_p])\) and \(\mathrm{Var}[Y_1] = Y_0^2\,\mathrm{Var}[r_p]\).
4.1.1 Quadratic utility
If the utility function is quadratic, \(U = aY - bY^2\), all derivatives of order higher than 2 are null, thus remainder = 0. Therefore, we have an exact expression:
\[EU(Y) = U(EY) + \tfrac{1}{2} \cdot U''(EY) \cdot \mathrm{Var}(Y) \tag{4.1}\]
and the portfolio problem becomes quite simple to solve.
Drawbacks of quadratic utility. Quadratic utility has IARA, which is not very
reasonable. Furthermore, in practical applications we have to be careful defining the
parameters a and b such that we only use the range of wealth where U is increasing.
4.1.2 Normal returns
Alternatively, we can assume that stock returns are normally distributed. Note that if \(r_p \sim N\), then the wealth is also normally distributed, \(Y \equiv Y_0(1 + r_p) \sim N\).
For a normal distribution, all higher-order central moments are either zero or a function of the variance:
\[E[(Y - EY)^n] = \begin{cases} 0, & n \text{ odd} \\ \dfrac{n!}{(n/2)!}\left(\tfrac{1}{2}\mathrm{Var}[Y]\right)^{n/2}, & n \text{ even} \end{cases}\]
These are the terms in E[remainder]. Hence,
\[EU(Y) = U(EY) + \tfrac{1}{2} \cdot U''(EY) \cdot \mathrm{Var}(Y) + f(\mathrm{Var}\,Y)\]
that is, the investor’s objective function depends only on the first two moments.
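A quick way to get a feel for the moment formula is to evaluate it for small \(n\) and compare with the familiar values \(\sigma^2\), \(3\sigma^4\) and \(15\sigma^6\). The sketch below only evaluates the formula as stated; the function name and the choice \(\sigma = 0.2\) are illustrative assumptions.

```python
from math import factorial

def normal_central_moment(n, var):
    """n-th central moment of a normal variable with variance var."""
    if n % 2 == 1:
        return 0.0  # odd central moments vanish by symmetry
    return factorial(n) / factorial(n // 2) * (var / 2) ** (n // 2)

var = 0.2 ** 2  # illustrative variance

m2 = normal_central_moment(2, var)   # should equal var
m3 = normal_central_moment(3, var)   # should equal 0
m4 = normal_central_moment(4, var)   # should equal 3 * var^2
m6 = normal_central_moment(6, var)   # should equal 15 * var^3

print(m2, m3, m4, m6)
```

Every even moment is a power of the variance times a constant, which is why the remainder term adds no information beyond the first two moments.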
Advantages of normality
We are considering the case where the investor can combine several assets into a portfolio.
If we start by assuming that the return on individual assets is determined by their means
and variances, we need to make sure that the return on any combination of these assets
(portfolio) is also determined by the mean and variance only. The Normal distribution
satisfies this additivity requirement (in fact, it is the only distribution with finite variance
that does so).
To see this, let \(V\) denote the value of a portfolio with N assets, and \(w_i\) denote the percentage of wealth invested in each asset. The portfolio return is just the weighted average of individual returns:
\[r_p := V_1/V_0 - 1 = \sum_{i=1}^{N} a_i(1 + r_i)/V_0 - 1 = \sum_{i=1}^{N} w_i(1 + r_i) - \sum_{i=1}^{N} w_i = \sum_{i=1}^{N} w_i r_i\]
Since the sum of (jointly) normally distributed random variables also follows a normal distribution, if we assume that each stock has a normal distribution, then the portfolio return is also normally distributed: \(r_i \sim N \Rightarrow r_p \sim N\).
Drawbacks of normality
The returns we are considering here are discrete returns, defined as:
\[r:\ P_1 = P_0(1 + r)\]
Since the Normal distribution has support on all of \(\mathbb{R}\), saying that \(r \sim N\) is the same as saying that prices can be negative. This is an unrealistic description for assets with limited liability, such as stocks and bonds, where the worst that can happen is bankruptcy, in which case \(P_1 = 0\) and \(r = -100\%\). [2]
[2] We can go around this issue by using instead continuously-compounded returns:
\[z:\ P_1 = P_0 e^{z} \iff z = \ln(P_1/P_0)\]
This guarantees \(P_1 > 0,\ \forall z \in \mathbb{R}\). We can thus safely assume \(z \sim N\). Continuous returns are very convenient for time-series aggregation in multiperiod settings. If short-horizon returns are normally distributed, then the long-horizon return, \(z_{0,T}\), is also normally distributed:
\[z_{0,T} = \ln(P_T/P_0) = \ln\left(\frac{P_T}{P_{T-1}} \cdot \frac{P_{T-1}}{P_{T-2}} \cdots \frac{P_2}{P_1} \cdot \frac{P_1}{P_0}\right) = z_{0,1} + z_{1,2} + \dots + z_{T-2,T-1} + z_{T-1,T} \sim N\]
For cross-section aggregation, the expression is a bit more cumbersome:
\[z_p := \ln(V_1/V_0) = \ln\left(\sum_{i=1}^{N} a_i e^{z_i}/V_0\right) = \ln\left(\sum_{i=1}^{N} w_i e^{z_i}\right), \quad \text{or} \quad e^{z_p} = \sum_{i=1}^{N} w_i e^{z_i}\]
Normality is not preserved.
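The two aggregation properties of continuously-compounded returns can be seen in a few lines. This is a sketch with made-up inputs: the price path reuses stock A from exercise 25, while the two log returns and the weights in the cross-section part are arbitrary illustrative numbers.

```python
import math

# Time-series aggregation: log returns add up over time.
prices = [10, 11, 10, 11, 9, 12]   # stock A from exercise 25
log_returns = [math.log(prices[t] / prices[t - 1]) for t in range(1, len(prices))]
total_log_return = sum(log_returns)           # z_{0,T}
direct_log_return = math.log(prices[-1] / prices[0])

# Cross-section aggregation: the portfolio log return is NOT the
# weighted average of the individual log returns.
z = [0.10, -0.05]     # illustrative log returns of two assets
w = [0.4, 0.6]        # illustrative portfolio weights
z_p = math.log(sum(wi * math.exp(zi) for wi, zi in zip(w, z)))
weighted_avg = sum(wi * zi for wi, zi in zip(w, z))

print(total_log_return, direct_log_return)
print(z_p, weighted_avg)
```

The portfolio log return exceeds the weighted average because the exponential is convex, which is the reason the sum-of-normals argument does not carry over to the cross-section.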
Empirical evidence
It is an empirical question whether normality is a reasonable first approximation to
security returns. The answer is yes, the normal distribution is a useful approximation,
particularly for returns measured over long horizons, such as one year.
If we were interested in high-frequency returns, then the normal assumption would
be more questionable, due to the following empirical facts:
1. Short-term daily returns have fat tails, that is, empirical returns have more kurtosis
than the normal distribution.
2. Short-term daily returns (especially for stock indices) are skewed to the left, that
is, extremely bad returns are more likely than under a true normal distribution.
Fortunately, these problems are less severe at longer horizons, say monthly or yearly.
Hence, since the portfolio problem we are considering here typically has a long horizon,
normality is a reasonable assumption.
Note that, despite the caveats above, the normal distribution is still the benchmark and the workhorse in finance. For instance, J.P.Morgan/Reuters’ RiskMetrics system (which outputs Value-at-Risk estimates) assumes that even daily returns are normally distributed (see J.P. Morgan, 1996).
4.1.3 Conclusion
Either assuming quadratic utility or normal returns, we conclude that the investor maximizes a function of the mean (\(\mu := E[r]\)) and variance (\(\sigma^2 := \mathrm{Var}[r]\)) of the return on the portfolio:
\[\text{maximize}\ \underbrace{E[U(Y)]}_{f(\mu_p,\ \sigma_p^2)}\]
Quite intuitively, it can be shown that the objective function increases with the expected return, \(df/d\mu_p > 0\), and decreases with the standard-deviation, \(df/d\sigma_p < 0\). [3] This leads to two important results.
[3] For quadratic utility, this follows directly from taking derivatives of (4.1). For normal returns, standardize the portfolio returns: \(s_p = \frac{r_p - \mu}{\sigma} \sim N(0, 1)\). Then, the function to be maximized is \(f := E[U(r_p)] = \int U(r)p(r)\,dr = \int U(\sigma s + \mu)p(s)\,ds\), where \(p(\cdot)\) is the Normal pdf. Thus, \(df/d\mu = \int U'(\cdot)p(s)\,ds > 0\), since \(U' > 0\). Also, \(df/d\sigma = \int U'(\cdot)\,s\,p(s)\,ds < 0\), since \(U'' < 0\) means that \(U'\) is decreasing, which implies that for each ±s pair the negative s gets more weight. See appendix 6.1 in Danthine and Donaldson (2005) for illustrations. To be precise, the investor maximizes \(EU(Y_1)\), not \(EU(r_p)\), but the derivatives have the same sign since \(Y_1 = Y_0(1 + r_p)\).
Mean-Variance dominance. Asset a mean-variance dominates asset b iff:
\[\mu_a \ge \mu_b \text{ and } \sigma_a < \sigma_b \quad \text{or} \quad \mu_a > \mu_b \text{ and } \sigma_a \le \sigma_b\]
All mean-variance investors prefer asset a. This implies that, for a given level of variance, all mean-variance investors prefer the portfolio with the largest expected return, and, for a given level of mean, they prefer the portfolio with the smallest risk.
Optimal portfolio. It can be shown that a mean-variance investor will choose his portfolio through
\[\max_{\{w_1, \dots, w_N\}}\ \mu_p - \frac{g}{2}\sigma_p^2\]
That is, his objective function trades off mean against variance. The parameter \(g\) determines how much the investor dislikes variance, i.e., how risk-averse he is.
4.2 Review: Mean-Variance frontier with 2 stocks
This section analyzes the investment opportunity set for an investor with mean variance
preferences (by one of the two possible assumptions in section 4.1). The goal is to
develop intuition for the diversification effect with just two stocks. The following sections
consider the portfolio problem in full generality.
Suppose there are just two risky assets (stocks). The investor only cares about the mean and variance of the return on the portfolio formed by these two assets:
\[\mu_p \equiv E\left[\sum_{i=1}^{2} w_i r_i\right] = w_1\mu_1 + (1 - w_1)\mu_2\]
and
\[\sigma_p^2 \equiv \mathrm{Var}\left[\sum_{i=1}^{2} w_i r_i\right] = w_1^2\sigma_1^2 + (1 - w_1)^2\sigma_2^2 + 2w_1(1 - w_1)\sigma_1\sigma_2\rho\]
where \(\rho\) is the correlation coefficient between \(r_1\) and \(r_2\) (recall \(-1 \le \rho \le +1\)). The opportunity set depends critically on this correlation.
The main point we want to illustrate in this section is the diversification effect.
Whereas the expected return on the portfolio is the weighted average of expected returns
on the individual assets, the same is not true for the risk. In fact, the standard-deviation
of the portfolio is typically less than the weighted average of the individual standard-
deviations. This is the gain from diversifying the portfolio. The smaller the correlation
coefficient, the greater the benefits from diversification.
Perfect positive correlation (\(\rho = 1\)). There is no gain from diversification since the assets are essentially identical (the return on one asset is a linear function of the other). The portfolio standard-deviation is equal to the weighted average of the two standard-deviations
\[\sigma_p = w_1\sigma_1 + (1 - w_1)\sigma_2\]
which means that all the possible portfolios lie on the straight line between the two assets (in \(\sigma, \mu\) space); see figure 6.2 in Danthine and Donaldson (2005).
Imperfect correlation (\(-1 < \rho < +1\)). Now we have the diversification benefit. At each level of \(\mu_p\), the corresponding \(\sigma_p\) is less than in the \(\rho = 1\) case. This is because \(\sigma_p^2\) increases in \(\rho\) (\(\partial\sigma_p^2/\partial\rho = 2w_1w_2\sigma_1\sigma_2 > 0\)). See figure 6.3 in Danthine and Donaldson (2005) and appendix 6.2 for a formal proof.
Note that only the portfolios on the upper part of the curve are efficient, that is, they (mean-variance) dominate the ones on the lower part of the curve.
Perfect negative correlation (ρ = −1). For this (theoretical) case we would be
able to construct a risk-free asset. See figure 6.4 in the book.
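The three correlation cases can be compared numerically with the two-asset variance formula. The sketch below uses illustrative inputs that are not from the text (\(\sigma_1 = 0.2\), \(\sigma_2 = 0.3\), \(w_1 = 0.6\)); the weight is chosen as \(\sigma_2/(\sigma_1 + \sigma_2)\) so that the \(\rho = -1\) case produces the risk-free combination.

```python
import math

def portfolio_sd(w1, s1, s2, rho):
    """Standard deviation of a two-asset portfolio with weights w1 and 1 - w1."""
    var = (w1 ** 2 * s1 ** 2
           + (1 - w1) ** 2 * s2 ** 2
           + 2 * w1 * (1 - w1) * s1 * s2 * rho)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(var, 0.0))

s1, s2 = 0.2, 0.3        # illustrative standard deviations
w1 = s2 / (s1 + s2)      # = 0.6, the zero-risk weight when rho = -1

weighted_avg = w1 * s1 + (1 - w1) * s2
sd_pos = portfolio_sd(w1, s1, s2, 1.0)    # no diversification gain
sd_zero = portfolio_sd(w1, s1, s2, 0.0)   # some diversification gain
sd_neg = portfolio_sd(w1, s1, s2, -1.0)   # risk eliminated

print(weighted_avg, sd_pos, sd_zero, sd_neg)
```

Only the \(\rho = 1\) case reproduces the weighted average of the standard deviations; any lower correlation gives a smaller portfolio risk for the same weights.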
1 Risk-free and 1 risky asset. If one asset is risk-free (\(\sigma_1 = 0\)), we have \(\sigma_{12} = 0\) and \(\sigma_p = w_2\sigma_2\). The opportunity set is again linear (see figure 6.5 in the book).
Extension to N risky assets. Intuitively, this analysis can be generalized to 3 risky
assets by taking one of the possible previous portfolios and a new 3rd asset. Proceeding
with these iterations, we could get to N risky assets. The minimum variance frontier
will have the shape in figure 6.6 in the book. We will derive this carefully in the next
section.
Extension to N risky assets plus 1 risk-free asset. The investor will pick one
particular portfolio on the mean-variance frontier (the tangency portfolio) to combine
with the risk-free asset. The straight line going through $r_f$ and the tangency portfolio (with expected return $\mu_T$) is the efficient
frontier. See figure 6.6. Again, this will be derived below.
The fact that all investors will invest in the same two assets (the risk-free and the
tangency portfolio), even though in different proportions, is known as the two fund
theorem or the separation theorem.
4.3 Setup for general case
4.3.1 Notation
Let $r$ be the $(N \times 1)$ vector of returns on the $N$ risky assets. Define the vector of expected
returns:
\[ \bar r := E[r] = \begin{bmatrix} E[r_1] \\ \vdots \\ E[r_N] \end{bmatrix} \]
Let the covariance matrix be
\[ V := \mathrm{Cov}(r) = \begin{bmatrix} & \vdots & \\ \cdots & \sigma_{ij} & \cdots \\ & \vdots & \end{bmatrix} \]
Let $\mathbf{1}$ be an $(N \times 1)$ vector of ones. Let $\mu$ (scalar) be the required return on the portfolio.
The choice variable is the vector of portfolio weights:
\[ w = \begin{bmatrix} w_1 \\ \vdots \\ w_N \end{bmatrix} \]
4.3.2 Brief notions of matrix calculus
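For a scalar-valued function $f(x_1, \ldots, x_n)$, the gradient is
\[ \frac{\partial f(x)}{\partial x} = \begin{bmatrix} \partial f/\partial x_1 \\ \vdots \\ \partial f/\partial x_n \end{bmatrix} \]
Let $a$ be an $(n \times 1)$ vector of constants and $A$ an $(n \times n)$ symmetric matrix of constants. Some
useful rules are:
\[ \frac{d(a'x)}{dx} = a \]
and
\[ \frac{d(x'Ax)}{dx} = 2Ax \]
where $x'Ax$ is a scalar, since its dimensions are $(1 \times n)(n \times n)(n \times 1)$.
To check the second rule, consider
\[ A = \begin{bmatrix} 1 & 3 \\ 3 & 4 \end{bmatrix} \]
Note that $x'Ax = x_1^2 + 4x_2^2 + 6x_1x_2$. Thus,
\[ \frac{d(x'Ax)}{dx} = \begin{bmatrix} 2x_1 + 6x_2 \\ 6x_1 + 8x_2 \end{bmatrix} = \begin{bmatrix} 2 & 6 \\ 6 & 8 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 2Ax \]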
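The rule $d(x'Ax)/dx = 2Ax$ can also be checked numerically. The following is a minimal sketch, not part of the original notes: it uses the same matrix $A$ as above, an arbitrarily chosen test point, and central finite differences (the function names are hypothetical).

```python
def quad_form(x1, x2):
    """x'Ax for A = [[1, 3], [3, 4]]."""
    return x1**2 + 4 * x2**2 + 6 * x1 * x2

def analytic_gradient(x1, x2):
    """2Ax for the same A."""
    return (2 * x1 + 6 * x2, 6 * x1 + 8 * x2)

def numeric_gradient(f, x1, x2, h=1e-6):
    """Central finite-difference approximation of the gradient of f."""
    d1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
    d2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
    return (d1, d2)

point = (0.7, -1.3)  # arbitrary test point (assumption)
exact = analytic_gradient(*point)
approx = numeric_gradient(quad_form, *point)
print("analytic:", exact)
print("numeric: ", approx)
```

The two gradients should agree up to rounding error, since central differences are exact for quadratic functions apart from floating-point noise.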
4.4 Frontier with N risky assets
4.4.1 Efficient portfolio
The variance of the return on a portfolio ($r_p = w'r$) is given by
\[ \mathrm{Var}[w'r] = w'Vw \]
The program to find the minimum-variance portfolio, for a given expected return $\mu$,
is thus:
\[ \begin{aligned} \min_{w} \quad & \tfrac{1}{2}\, w'Vw \\ \text{s.t.} \quad & w'\bar r = \mu \\ & w'\mathbf{1} = 1 \end{aligned} \tag{4.2} \]
This is a constrained optimization problem. To solve it, define the Lagrangian
\[ L = \tfrac{1}{2}\, w'Vw + \lambda(\mu - w'\bar r) + \gamma(1 - w'\mathbf{1}) \]
where the scalars $\lambda$ and $\gamma$ are Lagrange multipliers.
The first-order conditions are:
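\[ \frac{dL}{dw} = Vw - \lambda \bar r - \gamma \mathbf{1} = 0 \quad (N \text{ eqns}) \tag{4.3} \]
\[ \frac{dL}{d\lambda} = \mu - w'\bar r = 0 \quad (1 \text{ eqn}) \tag{4.4} \]
\[ \frac{dL}{d\gamma} = 1 - w'\mathbf{1} = 0 \quad (1 \text{ eqn}) \tag{4.5} \]
The foc for $w$ can be rewritten as
\[ \begin{aligned} & Vw = \lambda \bar r + \gamma \mathbf{1} \\ \Rightarrow\ & V^{-1}Vw = V^{-1}(\lambda \bar r + \gamma \mathbf{1}) \\ \Rightarrow\ & w = \lambda V^{-1}\bar r + \gamma V^{-1}\mathbf{1} \end{aligned} \tag{4.6} \]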
But this is not over yet because we don’t know the value of the multipliers.
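Pre-multiplying (4.6) by $\bar r'$ and using the foc for $\lambda$ we get
\[ \begin{aligned} & \bar r'w = \lambda(\bar r'V^{-1}\bar r) + \gamma(\bar r'V^{-1}\mathbf{1}) \\ \Rightarrow\ & \mu = \lambda(\bar r'V^{-1}\bar r) + \gamma(\bar r'V^{-1}\mathbf{1}) \end{aligned} \tag{4.7} \]
Pre-multiplying (4.6) again by $\mathbf{1}'$ and using the foc for $\gamma$ we get
\[ \begin{aligned} & \mathbf{1}'w = \lambda(\mathbf{1}'V^{-1}\bar r) + \gamma(\mathbf{1}'V^{-1}\mathbf{1}) \\ \Rightarrow\ & 1 = \lambda(\mathbf{1}'V^{-1}\bar r) + \gamma(\mathbf{1}'V^{-1}\mathbf{1}) \end{aligned} \tag{4.8} \]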
Equations (4.7) and (4.8) form a system of two (scalar) equations that can be solved
for the two unknown Lagrange multipliers:
\[ \begin{cases} \mu = \lambda B + \gamma A \\ 1 = \lambda A + \gamma C \end{cases} \quad\Leftrightarrow\quad \begin{cases} \gamma = \dfrac{B - A\mu}{D} \\[1ex] \lambda = \dfrac{C\mu - A}{D} \end{cases} \]
where we defined the scalars $A := \mathbf{1}'V^{-1}\bar r$, $B := \bar r'V^{-1}\bar r$, $C := \mathbf{1}'V^{-1}\mathbf{1}$, and $D := BC - A^2$. Since the matrix of covariances ($V$) is positive definite and thus also $V^{-1}$, we
have that $B > 0$ and $C > 0$.[4] It can also be shown that $D > 0$.

[4] We say that a matrix $M$ is positive definite (semidefinite) if $x'Mx > 0$ ($\geq 0$) for all nonzero $x$. The
covariance matrix is PD because the variance of a portfolio must be positive, $\mathrm{Var}[w'r] = w'Vw > 0$. In general, a covariance matrix need only be PSD, but this would mean that we might be able
to construct a risk-free portfolio using only stocks, $\mathrm{Var}[w'r] = w'Vw = 0$. This is typically not
the case, so we assume that $V$ is PD.
Plugging these numbers back into (4.6), we get the final answer:
\[ w^* = \frac{C\mu - A}{D}\, V^{-1}\bar r + \frac{B - A\mu}{D}\, V^{-1}\mathbf{1} \tag{4.9} \]
This equation is a closed formula for the efficient portfolio with return $\mu$, that is,
for the portfolio with the smallest variance among all portfolios with return $\mu$. You can
double check that we do indeed get the required return, i.e. $E[r_p^*] \equiv w^{*\prime}\bar r = \mu$. The
portfolio variance can be computed as $\mathrm{Var}[r_p^*] \equiv \mathrm{Var}(w^{*\prime}r) = w^{*\prime}Vw^*$. By varying $\mu$
and computing the respective $w^*$ and $\mathrm{Var}[r_p^*]$, we can plot the frontier of the investment
opportunity set.
Example 4.4.1. Assume that there are only 2 risky assets with $E[r_1] = 15\%$,
$\sigma_1 = 25\%$, $E[r_2] = 10\%$, $\sigma_2 = 20\%$, and zero correlation. First, check that
\[ A = 4.9, \quad B = 0.61, \quad C = 41, \quad D = 1 \]
Hint: see the formula sheet for an easy way to invert a diagonal matrix.
Second, if we require say an expected return of $\mu = 0.14$, the optimal portfolio
from the formula above is
\[ w^* = \ldots = \begin{bmatrix} 0.8 \\ 0.2 \end{bmatrix} \]
We can check that $E[r_p^*] \equiv w^{*\prime}\bar r = 0.14$. The risk of the portfolio is
\[ \mathrm{Var}[r_p^*] = w^{*\prime}Vw^* = 0.0416 \ \Rightarrow\ \sigma_p^* = 0.204 \]
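The numbers in Example 4.4.1 can be reproduced with a few lines of code. The following is a minimal sketch, assuming NumPy is available; the function name `frontier_weights` is our own choice and not part of the notes. It implements formula (4.9) directly for a general covariance matrix.

```python
import numpy as np

def frontier_weights(r_bar, V, mu):
    """Minimum-variance weights for a target expected return mu, following (4.9)."""
    ones = np.ones(len(r_bar))
    V_inv_r = np.linalg.solve(V, r_bar)   # V^{-1} r_bar
    V_inv_1 = np.linalg.solve(V, ones)    # V^{-1} 1
    A = ones @ V_inv_r
    B = r_bar @ V_inv_r
    C = ones @ V_inv_1
    D = B * C - A**2
    lam = (C * mu - A) / D                # Lagrange multiplier lambda
    gam = (B - A * mu) / D                # Lagrange multiplier gamma
    return lam * V_inv_r + gam * V_inv_1, (A, B, C, D)

# Data from Example 4.4.1 (zero correlation, so V is diagonal)
r_bar = np.array([0.15, 0.10])
V = np.diag([0.25**2, 0.20**2])

w, (A, B, C, D) = frontier_weights(r_bar, V, mu=0.14)
var_p = w @ V @ w
print("A, B, C, D:", A, B, C, D)
print("weights:", w)
print("mean:", w @ r_bar, "variance:", var_p, "std:", np.sqrt(var_p))
```

Using `np.linalg.solve` instead of explicitly inverting $V$ is a design choice for numerical stability; with a diagonal $V$ both approaches give the same result.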
4.4.2 Frontier equation
If we work out the $\mathrm{Var}[r_p^*] = w^{*\prime}Vw^*$ expression, we arrive at the following equation for
the mean-variance frontier:
\[ \mathrm{Var}[r_p^*] = \frac{C}{D}\left(\mu - \frac{A}{C}\right)^2 + \frac{1}{C} \]
which is a parabola in $(\mathrm{Var}[r_p], E[r_p])$-space.[5]
Example 4.4.2. Continuing the previous example, check that we get the same
$\mathrm{Var}[r_p^*]$ for $\mu = 0.14$.

[5] The frontier is a hyperbola in $(\sigma_p, E[r_p])$-space.
4.4.3 Global minimum variance portfolio
From the frontier equation in section 4.4.2, we can immediately identify the global minimum variance portfolio:
\[ \begin{cases} E[r_{mvp}] = A/C \\ \mathrm{Var}[r_{mvp}] = 1/C \end{cases} \]
The set of portfolios located on the mean-variance frontier with $E[r_p] > A/C$ is called
the efficient frontier.[6]
Example 4.4.3. For the previous example, check that
\[ E[r_{mvp}] = 0.1195, \qquad \mathrm{Var}[r_{mvp}] = 0.0244 \ \Rightarrow\ \sigma_{mvp} = 0.1562 \]
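Examples 4.4.2 and 4.4.3 can be checked in the same way. The sketch below assumes NumPy and uses names of our own choosing. It evaluates the frontier equation and the global minimum variance portfolio for the two-asset data. The weights `w_mvp` are obtained by setting $\mu = A/C$ in (4.9), which gives $\lambda = 0$ and $\gamma = 1/C$; this intermediate step is not spelled out in the text.

```python
import numpy as np

# Data from Example 4.4.1
r_bar = np.array([0.15, 0.10])
V = np.diag([0.25**2, 0.20**2])
ones = np.ones(2)
V_inv = np.linalg.inv(V)

A = ones @ V_inv @ r_bar
B = r_bar @ V_inv @ r_bar
C = ones @ V_inv @ ones
D = B * C - A**2

def frontier_variance(mu):
    """Variance of the frontier portfolio with expected return mu."""
    return (C / D) * (mu - A / C) ** 2 + 1.0 / C

# Global minimum variance portfolio
mu_mvp, var_mvp = A / C, 1.0 / C
w_mvp = V_inv @ ones / C   # (4.9) evaluated at mu = A/C

print("Var at mu = 0.14:", frontier_variance(0.14))
print("MVP mean, variance, std:", mu_mvp, var_mvp, np.sqrt(var_mvp))
print("MVP weights:", w_mvp)
```

To plot the frontier, one could evaluate `frontier_variance` on a grid of values for $\mu$ and draw $\sqrt{\mathrm{Var}}$ against $\mu$.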
4.5 Frontier with N risky assets and 1 risk-free asset
4.5.1 Efficient portfolio
In addition to the $N$ risky assets of the previous section, we now consider one additional
risk-free asset with (known) return $r_f$. Let $w$ be the $(N \times 1)$ vector of weights in the risky
assets as defined before. The proportion of wealth invested in the risk-free asset is thus
what is left, $w_f = 1 - w'\mathbf{1}$. Therefore, the expected return on a given portfolio is
\[ E[r_p] = w'\bar r + w_f r_f = w'\bar r + (1 - w'\mathbf{1})\, r_f \]
Note that the second equality already imposes that the weights add up to 1.
The program to find the minimum-variance portfolio, for a given expected return $\mu$,
is now
\[ \begin{aligned} \min_{w} \quad & \tfrac{1}{2}\, w'Vw \\ \text{s.t.} \quad & w'\bar r + (1 - w'\mathbf{1})\, r_f = \mu \end{aligned} \tag{4.10} \]
[6] Different authors use slightly different names for all these “frontiers”, so make sure you
understand the concepts well (what dominates what).
The solution is:
\[ w^* = \frac{\mu - r_f}{H}\, V^{-1}(\bar r - r_f \mathbf{1}) \tag{4.11} \]
where the scalar $H := (\bar r - r_f\mathbf{1})'V^{-1}(\bar r - r_f\mathbf{1}) = B - 2Ar_f + Cr_f^2 > 0$. The scalars
$A$, $B$, $C$ are as defined above.
Example 4.5.1. Continuing the previous two-stock example, further assume
$r_f = 0.04$. First, check that
\[ H = 0.2836 \]
Second, if we require an expected return of $\mu = 0.14$, the optimal portfolio
from the formula above is
\[ w^* = \ldots = \begin{bmatrix} 0.6206 \\ 0.5289 \end{bmatrix} \]
We can check that $E[r_p^*] \equiv w^{*\prime}\bar r + w_f r_f = 0.14$. The risk of the portfolio is
\[ \mathrm{Var}[r_p^*] = w^{*\prime}Vw^* = 0.0353 \ \Rightarrow\ \sigma_p^* = 0.1878 \]
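Example 4.5.1 can be verified with a short script. This is a sketch under the same assumptions as before (NumPy available, function name `frontier_weights_rf` hypothetical); it implements (4.11) and recovers the weight in the risk-free asset as the residual.

```python
import numpy as np

def frontier_weights_rf(r_bar, V, r_f, mu):
    """Risky-asset weights from (4.11); 1 - sum(w) is held in the risk-free asset."""
    excess = r_bar - r_f                        # r_bar - r_f * 1
    V_inv_excess = np.linalg.solve(V, excess)   # V^{-1}(r_bar - r_f * 1)
    H = excess @ V_inv_excess
    return (mu - r_f) / H * V_inv_excess, H

# Data from Examples 4.4.1 and 4.5.1
r_bar = np.array([0.15, 0.10])
V = np.diag([0.25**2, 0.20**2])
r_f = 0.04

w, H = frontier_weights_rf(r_bar, V, r_f, mu=0.14)
w_f = 1.0 - w.sum()
mean_p = w @ r_bar + w_f * r_f
var_p = w @ V @ w
print("H:", H)
print("risky weights:", w, "risk-free weight:", w_f)
print("mean:", mean_p, "variance:", var_p, "std:", np.sqrt(var_p))
```

Note that the risky weights add up to more than 1 here, so `w_f` is negative: to reach a 14% expected return the investor borrows at the risk-free rate.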
4.5.2 Frontier equation
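To plot the mean-variance frontier, we can again compute $w^*$ and the respective $\mathrm{Var}[r_p^*]$
for different values of $\mu$. Alternatively, we can compute an explicit expression for $\mathrm{Var}[r_p^*]$:
\[ \begin{aligned}
\mathrm{Var}[r_p^*] \equiv \mathrm{Var}(w^{*\prime}r) &= w^{*\prime}Vw^* \\
&= \left(\frac{\mu - r_f}{H}\right)^2 \left[V^{-1}(\bar r - r_f\mathbf{1})\right]' V \left[V^{-1}(\bar r - r_f\mathbf{1})\right] \\
&= \left(\frac{\mu - r_f}{H}\right)^2 \underbrace{(\bar r - r_f\mathbf{1})'(V^{-1})'(\bar r - r_f\mathbf{1})}_{=H}
\end{aligned} \]
Note that $V$ is symmetric, thus $(V^{-1})' = (V')^{-1} = V^{-1}$. Finally, the mean-variance
frontier with a risk-free asset is:
\[ \mathrm{Var}[r_p^*] = \frac{(\mu - r_f)^2}{H} \tag{4.12} \]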
This draws two straight lines in $(\sigma_p, \bar r_p)$-space (an exercise will ask you to check this
with real data). The one that goes through $r_f$ and the tangency portfolio (ie, the set of
portfolios with $E[r_p] > r_f$) is the efficient frontier:[7]
\[ \mu = r_f + \sigma_p \sqrt{H} \tag{4.13} \]

[7] Equation (4.12) implies
\[ \sigma_p = \frac{\mu - r_f}{\sqrt{H}} \ \text{ or } \ -\sigma_p = \frac{\mu - r_f}{\sqrt{H}} \quad\Rightarrow\quad \mu = r_f + \sigma_p\sqrt{H} \ \text{ or } \ \mu = r_f - \sigma_p\sqrt{H} \]
Example 4.5.2. Check that we get the same $\mathrm{Var}[r_p^*]$ for $\mu = 0.14$.
4.5.3 Tangency portfolio
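We can compute the precise coordinates of the tangency portfolio $T$ by noting that it
is the only frontier portfolio composed only of risky assets, i.e. $\mathbf{1}'w_T^* = 1$. We can use
(4.11) to find the corresponding expected return ($\mu_T$):
\[ \mathbf{1}'w_T^* = 1 \ \Leftrightarrow\ \frac{\mu_T - r_f}{H}\, \mathbf{1}'V^{-1}(\bar r - r_f\mathbf{1}) = 1 \ \Rightarrow\ \mu_T = \frac{H}{A - Cr_f} + r_f \]
where we used $\mathbf{1}'V^{-1}(\bar r - r_f\mathbf{1}) = A - Cr_f$.
Plugging back into (4.11) we obtain an explicit expression for the weights in the tangency
portfolio:
\[ w_T^* = \frac{V^{-1}(\bar r - r_f\mathbf{1})}{A - Cr_f} \]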
Example 4.5.3. Continuing the previous example, check that
\[ w_T^* = \ldots = \begin{bmatrix} 0.5399 \\ 0.4601 \end{bmatrix} \]
Thus,
\[ E[r_T] = 0.1270, \qquad \sigma_T = 0.1634 \]
Example 4.5.4. Two-fund separation: Find the linear combination of $T$ and
$r_f$ that will give $E[r_p] = 0.14$.
Check that the weights in the two stocks are equal to the ones obtained above
using (4.11).
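Examples 4.5.3 and 4.5.4 can be checked numerically along the following lines. This is only a sketch (NumPy assumed, variable names our own): it computes the tangency portfolio and then mixes it with the risk-free asset to hit the 14% target, comparing the result with the weights from (4.11).

```python
import numpy as np

# Data from Examples 4.4.1 and 4.5.1
r_bar = np.array([0.15, 0.10])
V = np.diag([0.25**2, 0.20**2])
r_f = 0.04

excess = r_bar - r_f
V_inv_excess = np.linalg.solve(V, excess)
H = excess @ V_inv_excess

# Tangency portfolio: the normalizing constant 1'V^{-1}(r_bar - r_f 1) equals A - C r_f
w_T = V_inv_excess / V_inv_excess.sum()
mu_T = w_T @ r_bar
sigma_T = np.sqrt(w_T @ V @ w_T)

# Two-fund separation: put a share x in T and 1 - x in the risk-free asset
mu_target = 0.14
x = (mu_target - r_f) / (mu_T - r_f)
w_two_fund = x * w_T

# Direct computation from (4.11) for comparison
w_direct = (mu_target - r_f) / H * V_inv_excess

print("w_T:", w_T, "mu_T:", mu_T, "sigma_T:", sigma_T)
print("share in T:", x)
print("two-fund weights:", w_two_fund, "direct weights:", w_direct)
```

The share `x` comes out above 1 for this target, which again corresponds to borrowing at the risk-free rate in order to hold more than 100% of wealth in the tangency portfolio.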
[7, continued] We are interested in the line with positive slope, $\mu = r_f + \sigma_p\sqrt{H}$, which under “normal” circumstances will be the tangent line. More precisely, the tangency portfolio is located on the upper
limb of the hyperbola if $r_f < E[r_{mvp}] = A/C$. If the reverse is true, the tangency portfolio is
located on the lower limb. Further, if $r_f = A/C$, there is no finite point of tangency. However,
note that from theorem 3.1.1, the equilibrium case under the CAPM (chapter 5) must be
$r_f < E[r_{mvp}]$ (otherwise, there would be no demand for the risky assets). Hence, in equilibrium
the frontier is given by $\mu = r_f + \sigma_p\sqrt{H}$. See Huang and Litzenberger (1988) or Ingersoll (1987)
for details.
4.6 Optimal portfolio
The particular portfolio on the efficient frontier that the investor picks depends on his
level of risk aversion. Given that the investor has mean-variance preferences, he chooses
his optimal portfolio weights by
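\[ \max \ E[r_p] - \frac{g}{2}\, \mathrm{Var}[r_p] \]
where $g$ is a constant parameter and $r_p$ denotes the return on the portfolio.
Assuming that there are $N$ risky assets plus one risk-free asset, the problem in matrix
notation is
\[ \begin{aligned} \max_{w} \quad & w'\bar r + w_f r_f - \frac{g}{2}\, w'Vw \\ \text{s.t.} \quad & 1 = w'\mathbf{1} + w_f \end{aligned} \]
or
\[ \max_{w} \quad w'\bar r + (1 - w'\mathbf{1})\, r_f - \frac{g}{2}\, w'Vw \]
The foc is
\[ \bar r - r_f\mathbf{1} - gVw = 0 \]
which implies the solution
\[ w^* = \frac{1}{g}\, V^{-1}(\bar r - r_f\mathbf{1}) \tag{4.14} \]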
Example 4.6.1. Assume that $g = 5$, $r_f = 4\%$, and that there are only 2
risky assets with $E[r_1] = 15\%$, $\sigma_1 = 25\%$, $E[r_2] = 10\%$, $\sigma_2 = 20\%$, and
zero correlation. Compute the exact expected return and standard-deviation of
the optimal portfolio. Hint: see the formula sheet for an easy way to invert a
diagonal matrix.
Solution:
Using the formula above, $w^* = \ldots = \begin{bmatrix} 0.352 \\ 0.300 \end{bmatrix}$ and $w_f = 0.348$.
Thus,
\[ E[r_p] = w'\bar r + (1 - w'\mathbf{1})\, r_f = 9.67\% \]
\[ \mathrm{Var}[r_p] = w'Vw = 0.0113 \ \Rightarrow\ \sigma_p = 10.65\% \]
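The solution of Example 4.6.1 follows directly from (4.14) and can be reproduced as below. This is a minimal sketch assuming NumPy; the variable names are our own.

```python
import numpy as np

# Data from Example 4.6.1
g = 5.0
r_f = 0.04
r_bar = np.array([0.15, 0.10])
V = np.diag([0.25**2, 0.20**2])   # zero correlation

# Optimal risky weights from (4.14); the remainder goes to the risk-free asset
w = np.linalg.solve(V, r_bar - r_f) / g
w_f = 1.0 - w.sum()

mean_p = w @ r_bar + w_f * r_f
var_p = w @ V @ w
sigma_p = np.sqrt(var_p)

print("risky weights:", w, "risk-free weight:", w_f)
print("E[r_p]:", mean_p, "Var[r_p]:", var_p, "sigma_p:", sigma_p)
```

A higher value of `g` scales all risky weights down proportionally and moves wealth into the risk-free asset, without changing the relative composition of the risky part.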
We can verify that this portfolio is efficient, ie, that it actually lies on the mean-variance frontier. Write
\[ E[r_p] = w'\bar r + (1 - w'\mathbf{1})\, r_f = (\bar r - r_f\mathbf{1})'w + r_f \]
For this investor’s optimal portfolio, using $w^*$ from (4.14),
\[ E[r_p^*] = (\bar r - r_f\mathbf{1})'\, \frac{1}{g}\, V^{-1}(\bar r - r_f\mathbf{1}) + r_f = \frac{1}{g}H + r_f \]
where we also used $H := (\bar r - r_f\mathbf{1})'V^{-1}(\bar r - r_f\mathbf{1})$.
There are two alternatives now:
1. Plug $E[r_p^*]$ in the formula for frontier portfolios (4.11) and show that the portfolio
is the same one as the investor chose:
\[ w = \frac{\mu - r_f}{H}\, V^{-1}(\bar r - r_f\mathbf{1}) = \frac{\frac{1}{g}H + r_f - r_f}{H}\, V^{-1}(\bar r - r_f\mathbf{1}) = \frac{1}{g}\, V^{-1}(\bar r - r_f\mathbf{1}) \]
which is indeed the same as (4.14).
2. Alternatively, we can show that the investor’s portfolio verifies the equation for
the efficient frontier (4.13). Start by computing the portfolio variance, using $w^*$
from (4.14),
\[ \mathrm{Var}[r_p^*] = (w^*)'Vw^* = \frac{1}{g^2}H \]
Then, plug this variance into (4.13):
\[ \mu = r_f + \sigma_p\sqrt{H} = r_f + \sqrt{\frac{1}{g^2}H}\,\sqrt{H} = r_f + \frac{1}{g}H \]
which is indeed the expected return on the investor’s portfolio. Note that this
second alternative is a bit more complete, since it explicitly shows that the investor’s
portfolio lies on the upper part of the mean-variance frontier, ie, that it is efficient.
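The second alternative can also be checked numerically for several levels of risk aversion. The sketch below reuses the two-asset data from Example 4.6.1; NumPy, the helper name `optimal_portfolio`, and the particular values of $g$ are assumptions made for illustration.

```python
import numpy as np

r_bar = np.array([0.15, 0.10])
V = np.diag([0.25**2, 0.20**2])
r_f = 0.04

excess = r_bar - r_f
V_inv_excess = np.linalg.solve(V, excess)
H = excess @ V_inv_excess

def optimal_portfolio(g):
    """Mean and standard deviation of the portfolio chosen via (4.14)."""
    w = V_inv_excess / g
    mean = w @ r_bar + (1.0 - w.sum()) * r_f
    sigma = np.sqrt(w @ V @ w)
    return mean, sigma

for g in (2.0, 5.0, 8.0):
    mean, sigma = optimal_portfolio(g)
    on_line = r_f + sigma * np.sqrt(H)   # right-hand side of (4.13)
    print(f"g = {g}: E[r_p] = {mean:.4f}, frontier value = {on_line:.4f}")
```

For every positive $g$ the two printed numbers should coincide, which is the numerical counterpart of the algebra above.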
4.7 Additional properties of frontier portfolios
We now derive a relation that will be used to prove the CAPM in the next chapter. We
want to do the derivation right now to stress that the part done here is just math, not
economics. In other words, it does not depend on any model of market equilibrium.
Define:
p ≡ a frontier portfolio (still assume there is a risk-free asset)
a ≡ any portfolio, not necessarily on the frontier (possibly a single asset), but
without the risk-free asset.
The covariance between the two portfolios is given by (exercise 30 at the end shows
this):
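\[ \mathrm{Cov}(r_a, r_p) = w_a'Vw_p \]
Since $p$ is a frontier portfolio, $w_p$ is given by (4.11). Hence,
\[ \begin{aligned}
\mathrm{Cov}(r_a, r_p) &= w_a'V\left[\frac{\mu - r_f}{H}\, V^{-1}(\bar r - r_f\mathbf{1})\right] \\
&= \frac{\mu - r_f}{H}\, w_a'(\bar r - r_f\mathbf{1}) \\
&= \frac{E[r_p] - r_f}{H}\, (E[r_a] - r_f)
\end{aligned} \]
since $\mu = E[r_p]$, $w_a'\bar r = E[r_a]$, and $w_a'\mathbf{1} = 1$. Solving for $E[r_a] - r_f$ and using (4.12) for
$H$,
\[ E[r_a] - r_f = \frac{H\, \mathrm{Cov}(r_a, r_p)}{E[r_p] - r_f} = \frac{\mathrm{Cov}(r_a, r_p)}{\mathrm{Var}[r_p]}\, (E[r_p] - r_f) \tag{4.15} \]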
Note that all we did so far was to characterize the relation between a frontier portfolio
($p$) and any other asset ($a$). Since $p$ can be any frontier portfolio, the previous relation
applies in particular to the tangency portfolio:
\[ E[r_a] - r_f = \frac{\mathrm{Cov}(r_a, r_T)}{\mathrm{Var}[r_T]}\, (E[r_T] - r_f) \tag{4.16} \]
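Relation (4.16) is pure algebra, so it should hold for any inputs. The following sketch checks it numerically with NumPy. The three-asset expected returns, the covariance matrix, the risk-free rate, and the portfolio `w_a` are all hypothetical numbers chosen for illustration; they do not come from the notes.

```python
import numpy as np

# Hypothetical inputs (assumptions, not from the notes)
r_bar = np.array([0.15, 0.10, 0.08])
V = np.array([[0.0625, 0.0100, 0.0050],
              [0.0100, 0.0400, 0.0040],
              [0.0050, 0.0040, 0.0225]])
r_f = 0.04

# Tangency portfolio: V^{-1}(r_bar - r_f 1), normalized so that weights sum to 1
z = np.linalg.solve(V, r_bar - r_f)
w_T = z / z.sum()
mu_T = w_T @ r_bar
var_T = w_T @ V @ w_T

# An arbitrary portfolio of risky assets, not necessarily on the frontier
w_a = np.array([0.2, 0.5, 0.3])
beta_a = (w_a @ V @ w_T) / var_T      # Cov(r_a, r_T) / Var[r_T]
lhs = w_a @ r_bar - r_f               # E[r_a] - r_f
rhs = beta_a * (mu_T - r_f)           # right-hand side of (4.16)
print("portfolio a:", lhs, rhs)

# The same relation asset by asset (w_a equal to a unit vector)
betas = (V @ w_T) / var_T
print("single assets:", r_bar - r_f, betas * (mu_T - r_f))
```

Both comparisons should match up to rounding error, regardless of which weights `w_a` are chosen, as long as they sum to 1.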
4.8 Exercises
Ex. 26 — Consider the quadratic utility function $U(W) = a + bW + cW^2$, where $W$
is the terminal wealth and $a, b, c$ are constants. Assume that $W = W_0(1 + r_p)$, where
$W_0$ is the initial wealth and the rate of return on the portfolio is normally distributed,
$r_p \sim N(\mu, \sigma^2)$. (Note that the normality assumption is a bit of an overkill; we only
need quadratic utility). Show that the investor only cares about the first two moments
of returns, i.e., write $E[U(W)]$ as an explicit function of $\mu$ and $\sigma$ (and the constants
$a, b, c, W_0$).
Ex. 27 — Normal returns for PSI20. Download the file “PSI20.xls” from my website.
It has daily and monthly closing prices for the Portuguese Stock Index 20.
Note: If you do this in Matlab, you may want to use my DescStats.m function (also
posted on the website).
1. Compute daily continuously-compounded returns. Compute the mean, variance,
skewness, and kurtosis of the distribution. Does it look normal?
2. Do the same for monthly returns.
Ex. 28 — Mean-Variance Frontier. Assume there are N risky assets and that there
is no risk-free asset. Formulate the problem of finding the minimum-variance portfolio
for a given level of return. State in words what the objective and the restrictions mean.
Solve for the optimal portfolio weights. Note: The goal of this exercise is for you to go
through all the intermediate calculations in detail.
Ex. 29 — Solve problem (4.10), ie, show the intermediate steps that lead to (4.11).
Ex. 30 — Let $r_p$ and $r_q$ denote the returns on two portfolios. By definition, the
covariance between these returns is given by $\mathrm{Cov}(r_p, r_q) := E[(r_p - E[r_p])(r_q - E[r_q])]$.
Starting from this definition, show that the covariance can also be computed as $w_p'Vw_q$,
where $w_i$ is the $N$ by 1 vector of weights in portfolio $i$ and $V := \mathrm{Cov}(r) = E[(r - E[r])(r - E[r])']$ is the $N$ by $N$ covariance matrix of individual stock returns. Hint: write
$r_i = w_i'r$, $i = p, q$.
Ex. 31 — An investor has mean-variance preferences and thus chooses his optimal
portfolio weights ($w$, an $N$ by 1 vector) by solving:
\[ \begin{aligned} \max_{w} \quad & E[r_p] - \frac{g}{2}\, \mathrm{Var}[r_p] \\ \text{s.t.} \quad & w'\mathbf{1} = 1 \end{aligned} \]
where $r_p$ is the return on the portfolio, $g$ is a constant parameter, and $\mathbf{1}$ a vector of ones.
There is no risk-free asset. To simplify the notation, denote by
$V := \mathrm{Cov}(r)$, the covariance matrix, and
$\bar r := E[r]$, the vector of expected returns,
where $r$ is the random vector of returns on the $N$ risky assets.
1. Solve for the optimal $w^*$.
Hints: First write $E[r_p]$ and $\mathrm{Var}[r_p]$ in matrix notation, ie, using $w$, $\bar r$, and $V$.
To simplify the notation, use the scalars $A$, $B$, and $C$ (as defined in section 4.4)
along the calculations whenever possible.
2. The rest of the exercise will help you to show that the portfolio just found is
mean-variance efficient. Start by computing its expected return.
3. Now look at the solution for an efficient portfolio (equation 4.9):
\[ w^* = \frac{C\mu - A}{D}\, V^{-1}\bar r + \frac{B - A\mu}{D}\, V^{-1}\mathbf{1} \]
Plug in the expected return found in part (2.) and verify that the resulting $w^*$
is identical to the one found in part (1.). (This shows that the solution to the
initial problem is indeed mean-variance efficient.)
Ex. 32 — There are $N$ risky assets and 1 risk-free asset. Consider the standard portfolio choice problem, $\max_w E[U(Y_1)]$, where the terminal wealth is $Y_1 = Y_0(1 + r_p)$.
All risky assets follow a normal distribution and thus the return on the portfolio is
also normally distributed, $r_p \sim N(E[r_p], \mathrm{Var}[r_p])$. The utility function is $U(Y) = -\exp(-bY)$, where $b$ is a constant parameter. Compute the optimal weights in the
risky assets, $w^*$ (an $N$ by 1 vector).
Hint: Start by writing $E[r_p]$ and $\mathrm{Var}[r_p]$ in matrix notation. Then, write the distribution
of $Y_1$. Then, use the moment generating function to simplify the objective function.
Ex. 33 — Frontier with Industry Portfolios. Download the file 10_Industry_Portfolios.xls
from my website. It has monthly returns on 10 industry portfolios (from K. French’s
website, http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/index.html).
1. Ignoring the risk-free asset, draw the frontier in mean-std space.
2. Now consider a risk-free rate of $r_f = 0.4\%$ (this is the 1-month TBill rate at the end
of the sample, as you can check on French’s website). Draw the efficient frontier
(do it on the same figure as 1; you should get something like fig 6.6 in Danthine
and Donaldson (2005)).
3. Compute the tangency portfolio (weights, expected return, standard deviation)
and plot it in the figure.
4. An investor has mean-variance preferences and thus chooses his optimal portfolio
weights by
\[ \max \ E[r_p] - \frac{g}{2}\, \mathrm{Var}[r_p] \]
where $g$ is a constant parameter and $r_p$ denotes the return on the portfolio. The
solution is
\[ w^* = \frac{1}{g}\, V^{-1}(\bar r - r_f\mathbf{1}) \]
Assume that g = 8 and that the investor has $1 Million to invest. Compute
the amount of money that the investor should put in each of the 10 industry
portfolios and in the risk-free asset. Plot the optimal portfolio in the same figure
as the previous questions.
5. Find the value of the parameter g that would make the investor optimally choose
the Tangency portfolio.
Ex. 34 — No short selling. Use the same data as in the previous exercise. Consider the
same investor as in question 4, ie, mean-variance preferences with g = 8. Assume that
the investor cannot short sell any of the stock portfolios. Compute the optimal amount
of money that the investor should put in each of the 10 industry portfolios and in the
risk-free asset. Compute the expected return and standard-deviation of the optimal
portfolio. Plot the new optimal portfolio in the same figure as the other questions in the
previous exercise.
Hint: There is no closed form solution. Look for ways to solve the problem numerically.
Matlab and other software (like EXCEL) do this.
Chapter 5
Capital Asset Pricing Model
The CAPM states that the market portfolio is mean-variance efficient.
For any asset,
\[ E[r_j] = r_f + \beta_j\, (E[r_M] - r_f) \]
5.1 Introduction
Our goal is to understand why different assets have different average returns. The CAPM
proposes a very precise answer to this question.
The value of any asset is the present value, or discounted value, of its future cash
flows. The CAPM gives us a formula for the discount rate. Hence, it is used every day
by corporations and investors to price investment projects, stocks, mutual funds, etc.
The CAPM is an equilibrium model that results directly from assuming that all
investors are mean-variance optimizers. It was developed simultaneously in three papers
by Sharpe in 1964, Lintner in 1965, and Mossin in 1966.
5.2 Derivation
We make the following assumptions:
A1: All investors have mean-variance preferences.
A2: There is a risk-free asset with return $r_f$.
A3: Investors have homogeneous expectations. This means that everybody has the
same beliefs about the return distribution of every asset.
These assumptions immediately imply the following results:
1. The efficient frontier (namely, the straight line through $r_f$ and $T$) is the same for
every investor.
2. Two fund separation: every investor allocates his wealth between two portfolios:
the risk-free asset and the Tangency portfolio.
3. In equilibrium, all risky assets must belong to T.
To see this, suppose that IBM is not in $T$ ($w^T_{IBM} = 0$). Then, there would be
no demand for this stock ($w^i_{IBM} = w^T_{IBM} = 0$, for every investor $i$). We would
thus have Demand $\neq$ Supply, which is not equilibrium. Therefore, in equilibrium,
$w^T_j > 0$, $\forall$ asset $j$.
4. Furthermore, for every asset, the weight in $T$ must be the same as in the whole
market:
\[ w^T_j = \frac{\text{Market Cap}_j}{\sum_k \text{Market Cap}_k} =: w^M_j, \quad \forall \text{ asset } j \]
If we all put 2% of our (risky) money into IBM stock, then IBM will have 2% of
all money invested in the stock market, meaning that the market capitalization
of IBM will be worth 2% of the whole market capitalization.[1] In other words,
$w^T_{IBM} = w^{Market}_{IBM}$.
5. Hence, the Market portfolio is the Tangency portfolio, M = T. This is the eco-
nomic content of the CAPM. In one sentence, the CAPM states that the Market
portfolio is mean-variance efficient.
5.3 Important results
Once we have the economic result that $M$ is on the efficient frontier, we can use the
statistical relations derived in chapter 4 (sections 4.5 and 4.7), substituting $M$ for $T$.

[1] Different investors put different amounts of money at risk, ie, in the tangency portfolio. But
from these amounts, each investor allocates the same 2% to IBM.
5.3.1 Capital Market Line
When we use M instead of T, the efficient frontier is called Capital Market Line:
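[Figure: the Capital Market Line, plotted with σ on the horizontal axis and E[r] on the vertical axis.]
All individual optimal portfolios plot along the CML. For an efficient portfolio $p$ (ie,
$p \in$ CML),
\[ \text{CML}: \quad E[r_p] = r_f + \underbrace{\frac{E[r_M] - r_f}{\sigma_M}}_{\text{“reward” for risk}} \times \underbrace{\sigma_p}_{\text{“quantity” of risk}} \]
Recall that $p$ is a combination of the risk-free and the market portfolio, thus $\sigma_p = w_M \sigma_M$.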
Application: exercise 35. ◃
5.3.2 Security Market Line
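Substituting $M$ for $T$ in (4.16), we have that for any asset $j$ (not necessarily on the CML)
\[ E[r_j] - r_f = \frac{\mathrm{Cov}(r_j, r_M)}{\mathrm{Var}[r_M]}\, (E[r_M] - r_f) \]
or, defining $\beta_j := \frac{\mathrm{Cov}(r_j, r_M)}{\mathrm{Var}[r_M]}$,
\[ \text{SML}: \quad E[r_j] = r_f + \beta_j\, (E[r_M] - r_f) \]
Note that this applies to every single asset or portfolio.
[Figure: the Security Market Line, plotted with β on the horizontal axis and E[r] on the vertical axis.]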
The SML says that the risk premium on any asset, $E[r_j] - r_f$, depends only on one
factor: the market. More precisely, it is a linear function of the relevant measure of risk,
$\beta_j$. The slope of the line, $E[r_M] - r_f$, is called the “market risk premium”. This risk
premium is the same for every asset.
Valuing Risky Cash Flows
The price of an asset is the present value of its future cash flows discounted at the
appropriate rate. The CAPM is commonly used to provide us with that discount rate.
This amounts to requiring that the asset give us an expected return equal to the SML
formula.
Formally, let $p$ be the price of the asset and $\widetilde{CF}$ the random cash flow to be generated
one period from now. The random return on this project is
\[ \tilde r = \frac{\widetilde{CF}}{p} - 1 \]
which has expectation
\[ E[\tilde r] = \frac{E[\widetilde{CF}]}{p} - 1 \]
Using the CAPM, $E[r] = r_f + \beta\,(E[r_M] - r_f)$, we get
\[ p = \frac{E[\widetilde{CF}]}{1 + r_f + \beta\,(E[r_M] - r_f)} \]
In practice, this valuation method is extended informally to assets with cash flows
over multiple periods. Further, it is also applied to nontraded assets by using betas of
similar traded assets.
Example 5.3.1. A media company is considering going into the cell phone
business. By investing $100M today, it is expecting to receive $20M in 1 year,
$30M in 2 yrs, and $90M in 3 yrs. Telecom companies have an average beta
of 0.7. The risk-free rate is 3% and the average market risk premium is 6%.
Should the company expand its operations into the cell phone business?
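One possible way to work through Example 5.3.1 is sketched below. It follows the informal multi-period extension mentioned above: every cash flow is discounted at the same CAPM rate, using the average telecom beta as a proxy for the project beta. The function names are hypothetical, and the single-rate discounting is an assumption of this sketch rather than a result derived in the notes.

```python
def capm_rate(r_f, beta, market_premium):
    """Required return from the SML: r_f + beta * (E[r_M] - r_f)."""
    return r_f + beta * market_premium

def npv(rate, outlay, cash_flows):
    """Net present value of cash flows received at the end of years 1, 2, ..."""
    present_value = sum(cf / (1.0 + rate) ** t
                        for t, cf in enumerate(cash_flows, start=1))
    return present_value - outlay

# Inputs from Example 5.3.1 (amounts in $ millions)
rate = capm_rate(r_f=0.03, beta=0.7, market_premium=0.06)
value = npv(rate, outlay=100.0, cash_flows=[20.0, 30.0, 90.0])

print(f"discount rate = {rate:.4f}")
print(f"NPV = {value:.2f} million")
```

Under these assumptions the discount rate is 7.2% and the NPV is positive (roughly $17.8M), which would favor the expansion.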
Application to stock pricing: exercise 36. ◃
Economic interpretation of β
Suppose asset $a$ has more risk than asset $b$, ie $\beta_a > \beta_b$. According to the SML, this will
lead to $E[r_a] > E[r_b]$. What is the economic intuition for this?
According to the CAPM, all investors hold the market portfolio. Hence, they are
happy when the market goes up, unhappy when it goes down. But recall that marginal
utility is decreasing. This means that the investor is really interested in additional
payoffs in bad times (low market returns) and less enthusiastic about additional payoffs
in good times (high market returns). Therefore, investors like assets with low covariance
with the market. $\beta_j := \mathrm{Cov}(r_j, r_M)/\mathrm{Var}(r_M)$ is precisely a (standardized) measure of
this covariance. Asset $b$ has higher payoffs when the market is in relatively poor states,
making it more desirable. Hence, investors are willing to hold asset $b$ at a lower expected
return. Equivalently, they will pay a higher price for $b$.
This intuition is extremely important — it is the core of asset pricing. We’ll come
back to it again (formally) in chapter 8.
5.4 Other remarks
Note the following remarks about the CAPM:
1. The CAPM is a model of partial equilibrium only. That is, important aspects of
the economy like production, consumption, etc, do not appear in the model. As a
result, the interest rate $r_f$ is exogenous.
2. The CAPM is a cross-sectional model, that is, it expresses a relation between the
returns on all assets at some point in time.
3. There is no time in the model. We can interpret it as a single-period model,
though the exact length of the period is left unspecified (the asset return can be
over a day or over a year).
4. Investors don’t put their money only in stocks. Thus, the true market portfolio
of the CAPM should include all assets in the economy. In a famous critique,
Roll (1977) argues that the true market portfolio is not observable. Moreover, the
return on a stock market index may not even be a good proxy for the return on the
aggregate wealth portfolio. According to Jagannathan and McGrattan (1995), “in
the United States, only one-third of nongovernmental tangible assets are owned by
the corporate sector, and only one-third of corporate assets are financed by equity.
Furthermore, intangible assets, like human capital, are not captured by stock
market indexes.” For example, the biggest asset for most families is their house;
also, most people get more income from labor than from their financial assets.
To summarize, the validity of the CAPM depends very much on the particular
proxy used for the market return. In practice, this means that if you estimate the
required return for Microsoft using two different proxies for the market (say, the
SP500 and the NASDAQ indices), you may get two significantly different numbers.
Nevertheless, the CAPM is still the center of equilibrium asset pricing. It helps us
understand the risk-return tradeoff by specifying exactly what is the risk factor that
matters — the market.
Its empirical validity is still being debated. Overall, it is a good model to describe av-
erage returns of different assets over long periods of time. See the survey in Jagannathan
and McGrattan (1995).
5.5 Exercises
Ex. 35 — CML. You expect the stock market to go up by 10% over the next year.
The standard deviation of the market return is 20%. You can buy 1 year government
bonds yielding 4%. If you have $100,000 to invest and you are willing to tolerate a risk
(standard deviation) of 15%, what is the best allocation of your money? How much
money do you expect to have one year from now?
Ex. 36 — Industry-type application of the CAPM. Suppose $E[r_M] = 10\%$ and $r_f = 4\%$. You estimate stock $a$ will pay a dividend of \$2 one year from now. After that, you
expect dividends to grow at 5% per year. You also estimate the beta of the stock to be
$\beta_a = 0.9$. What is the equilibrium price of the stock?
Note: recall that the present value of a stream of dividends growing at rate $g$ is $P_0 = \frac{D_1}{r - g}$,
where $r$ is the discount rate. Thus, you just need to use the CAPM to estimate the
required discount rate for stock $a$.
Ex. 37 — Portfolio β. Suppose you can only buy two securities: asset $a$, with $\beta_a = 1.2$;
and the risk-free asset. Your goal is to have a portfolio with a beta of 0.9.
1. Show that the beta of a portfolio equals the weighted average of individual security betas:
\[ \beta_p = \sum_{i=1}^{N} w_i \beta_i \]
where $w_i$ are the portfolio weights. Hint: Start from the definition $\beta_p := \mathrm{Cov}(r_p, r_M)/\mathrm{Var}(r_M)$, with $r_p = \sum_{i=1}^{N} w_i r_i$, and use the properties of covariance (see the formula sheet).
2. Compute the portfolio weights that will achieve your goal.
Ex. 38 — The Security Market Line gives the expected return for any portfolio: $\bar r_p = r_f + \beta_p\,(\bar r_M - r_f)$. Under what condition will the Capital Market Line give the same $\bar r_p$?
Ex. 39 — The equation for the SML is
\[ E[r_j] = r_f + \beta_j\, (E[r_M] - r_f) \]
Explain what part of this equation is “just mathematics” (or mean-variance optimiza-
tion) and what part is an economic model of market equilibrium.
Ex. 40 — Make sure you understand the economic interpretation of β. Write in your
own words why $\beta_a < \beta_b \Rightarrow E[r_a] < E[r_b]$.
Chapter 6
Arbitrage Pricing Theory and Factor Models
If there are no arbitrage opportunities,
\[ E[r_j] = r_f + \sum_{k=1}^{K} \beta_{jk}\, (E[F_k] - r_f), \quad \forall j \]
Pricing by (no) arbitrage is based on the assumption that there are no arbitrage
opportunities in the market, that is, it is not possible to make money at zero risk and
zero cost. The goal is to obtain pricing relations with as few assumptions as possible
(namely we will not have to assume any utility function).
Arbitrage techniques allow us to relate the prices of a set of assets to the prices of
another set of basic fundamental assets (eg, the price of a stock option is a function of
the price of the underlying stock). In particular, the Arbitrage Pricing Theory (APT),
developed by Ross (1976), explains the returns on all stocks as a function of a (small)
set of fundamental “risk factors”.
6.1 Factor Structure
APT starts from a statistical characterization of realized returns. It is an empirical fact
that stock returns move together to some extent. There is comovement at the
market level, at the industry level, etc. This suggests that there are just a few “forces”
that drive stock returns.
We assume that the return on stock $j$ is generated by $K$ random variables called risk
factors:
\[ r_j = a_j + \sum_{k=1}^{K} \beta_{jk} F_k + \varepsilon_j, \quad j = 1, 2, \ldots, n \tag{6.1} \]
Note that the factors are common to all stocks, ie, they are pervasive risk factors. The
parameters $\beta_{jk}$, called factor loadings, measure the sensitivity of security $j$ to factor
$k$. The random variable $\varepsilon_j$ is the residual, ie, the part of return not explained by the
common factors ($\mathrm{Var}[\varepsilon_j]$ is idiosyncratic risk).
The point is to have the number of factors much smaller than the number of stocks
(K < n). Most empirical applications find that the number of factors ranges from 1 to
5. Hence, the theory is not vacuous.
The goal of the APT is to derive a relation about expected returns, E[r]. In other
words, to see how expected returns are related to the pervasive risk factors. The intuition
is that idiosyncratic risk can be diversified away and thus should not be priced; investors
only care about the covariance with the pervasive risk factors.
Assumptions:
A1: $E[\varepsilon_j] = 0$, $\forall j$
A2: $\mathrm{Cov}(F_k, \varepsilon_j) = 0$, $\forall k, j$
A3: $\mathrm{Cov}(\varepsilon_j, \varepsilon_i) = 0$, $\forall j \neq i$
Assumptions 1 and 2 are innocuous since they can be imposed by construction (we just
need to estimate the parameters $a_j, \beta_{j1}, \ldots, \beta_{jK}$ in (6.1) by OLS regression).
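To see why A1 and A2 hold by construction, consider the following sketch. It simulates a one-factor version of (6.1) and estimates it by OLS with NumPy; the simulated parameters (intercept, loading, volatilities, sample size) are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5_000                                        # number of simulated periods

# Simulated data: one factor and one stock with a "true" loading of 1.3
F = rng.normal(0.01, 0.05, size=T)
r = 0.002 + 1.3 * F + rng.normal(0.0, 0.03, size=T)

# OLS regression of r on a constant and the factor
X = np.column_stack([np.ones(T), F])
coef, *_ = np.linalg.lstsq(X, r, rcond=None)
a_hat, beta_hat = coef
resid = r - X @ coef

mean_resid = resid.mean()                        # sample analogue of A1
cov_resid_factor = np.cov(F, resid)[0, 1]        # sample analogue of A2

print("a_hat, beta_hat:", a_hat, beta_hat)
print("mean of residuals:", mean_resid)
print("cov(factor, residuals):", cov_resid_factor)
```

The residual mean and its sample covariance with the factor are zero up to floating-point error, whatever the data. This is a mechanical property of OLS with an intercept, which is why only A3 carries economic content.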
A3 is the critical assumption that gives economic content to this model. It says that
we only need K factors to explain all commonality in returns. Whatever is not explained
by the factors ($\varepsilon_j$) is specific to asset $j$ and has nothing in common with the residuals
from other assets. In other words, we were able to identify all the pervasive risk factors.
A meaningful factor structure must therefore have two properties:
(1) the movements of the factors should explain a substantial fraction of the movement of the
returns on the priced assets;
(2) the unexplained parts of the returns on the priced assets should be uncorrelated
across these priced assets.
Remark. Equation (6.1) is sometimes stated as deviations from means. Take expectations on both sides to get \(E[r_j] = a_j + \sum_{k=1}^{K} \beta_{jk} E[F_k]\). Plug the resulting value for \(a_j\) into (6.1) to get
\[ r_j = E[r_j] + \sum_{k=1}^{K} \beta_{jk} \hat{F}_k + \varepsilon_j \]
with \(\hat{F}_k \equiv F_k - E[F_k]\) and thus \(E[\hat{F}_k] = 0\). Stock returns deviate from their means as a result of unexpected realizations of risk factors. Note that this is just a mathematical manipulation of (6.1); it is still not saying anything about \(E[r_j]\).
6.2 Example of simple factor structure: Market Model
6.2.1 Return generating process
One important example of a simple factor structure is the Market Model. This model
states that there is just one factor, the market. Formally,
\[ r_j = a_j + \beta_j r_M + \varepsilon_j, \qquad \forall j \tag{6.2} \]
If we estimate this regression by OLS we get the CAPM beta, \(\beta_j = \mathrm{Cov}(r_j, r_M)/\mathrm{Var}(r_M)\), and we guarantee A1–2 are true. Again A3 is the critical assumption. In this context, it says that the market return is enough to capture all the common movement between stock returns.[1]
6.2.2 Application: the Covariance matrix is simplified
The factor structure is really a restriction on the covariance matrix of returns. The com-
putation of real-life covariance matrices is a challenging problem. Note that a covariance
matrix has \((N^2 - N)/2\) different covariances plus N variances. With say N=100 stocks
in your portfolio, you need to estimate 5,050 different parameters. If the market model
is true, the covariance matrix is much simpler. Using A1–3, we can show
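\[ \text{diagonal:} \quad \sigma_j^2 = \beta_j^2 \sigma_M^2 + \sigma_{\varepsilon_j}^2 \tag{6.3} \]
\[ \text{off diagonal:} \quad \sigma_{ij} = \beta_i \beta_j \sigma_M^2 \tag{6.4} \]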
Hence, we only need to compute N betas, plus N+1 variances. For N=100, we only need
to estimate 201 parameters to get the full covariance matrix.[2]
[1] If this were true, CAPM would be the end of asset pricing. It isn't.
[2] This simplification motivated the use of the market model when computer power was scarce. Nowadays, we no longer have to accept the extreme simplification (A3) of this model and better models are being developed. Nonetheless, obtaining a good estimate of a large covariance matrix is still hard and a lot of research is still going on.
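As a rough illustration of (6.3) and (6.4), the following Python sketch builds the covariance matrix implied by the market model and compares the parameter counts. The function names and the numerical inputs are hypothetical, chosen only for this illustration; they are not part of the notes.

```python
import numpy as np

def market_model_cov(betas, var_market, resid_vars):
    """Covariance matrix implied by the market model, eqs. (6.3)-(6.4)."""
    betas = np.asarray(betas, dtype=float)
    resid_vars = np.asarray(resid_vars, dtype=float)
    # beta_i * beta_j * sigma_M^2 everywhere; by A3 the residual variances
    # only show up on the diagonal.
    return var_market * np.outer(betas, betas) + np.diag(resid_vars)

def n_params_full(n):
    # (N^2 - N)/2 covariances plus N variances
    return (n * n - n) // 2 + n

def n_params_market_model(n):
    # N betas, N residual variances and 1 market variance
    return 2 * n + 1

# Hypothetical inputs (two stocks), for illustration only
cov = market_model_cov(betas=[0.9, 1.2], var_market=0.04,
                       resid_vars=[0.0025, 0.01])
print(cov)
print(n_params_full(100), n_params_market_model(100))
```

The two counts printed at the end correspond to the 5,050 and 201 parameters mentioned in the text for N=100.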
More importantly, imposing a factor structure may help to get more meaningful esti-
mates of the covariances. For example, suppose that we want to estimate the covariance
between stock A and B. Assume that during the early part of the sample period there
were rumors that A was going to acquire B, which led to a decrease in the price of
A and an increase in B. Later, the rumors were strongly denied by the CEOs of both
companies, which led to a reversal in prices (A back up, B back down). If we use a simple
historical estimate, we are going to get a strong negative correlation between A and B.
And if we then use this estimate in a portfolio allocation rule, we are probably going to
get big allocations to A and B in order to reduce the total risk of the portfolio. However,
this historical correlation is spurious, not likely to be a good predictor of what is going
to happen to the two companies in the future. If instead we estimate the correlations
with the market model, we are forcing this specific event, unrelated to the market, to go
to the residuals of (6.2). If the estimated betas are small, we will then forecast that the
correlation of these two stocks is low. This is likely to lead to a better allocation rule
going forward.
6.2.3 Implication: Diversification eliminates Specific risk
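From (6.2) and using assumption A2 (\(\mathrm{Cov}(r_M, \varepsilon_j) = 0\)), the total risk of a single stock is
\[ \underbrace{\sigma_j^2}_{\text{total risk}} = \underbrace{\beta_j^2 \sigma_M^2}_{\text{systematic risk}} + \underbrace{\sigma_{\varepsilon_j}^2}_{\text{nonsystematic risk}} \]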
However, the nonsystematic risk (also called specific or unique or idiosyncratic risk,
or diversifiable risk) can be easily diversified away by holding a large portfolio. That
is, a well-diversified portfolio only has systematic risk (also called market risk or non-
diversifiable risk).
Proof. Consider a portfolio of N securities. Its return is
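\[ r_p = \sum_{j=1}^{N} w_j r_j = \sum_{j=1}^{N} w_j a_j + \sum_{j=1}^{N} w_j \beta_j r_M + \sum_{j=1}^{N} w_j \varepsilon_j \]
The variance is
\[ \mathrm{Var}[r_p] = \mathrm{Var}\left[ \sum_{j=1}^{N} w_j \beta_j r_M \right] + \mathrm{Var}\left[ \sum_{j=1}^{N} w_j \varepsilon_j \right] \]
where we used A2 to remove the \(\mathrm{Cov}(r_M, \varepsilon_j)\). Using A3 to remove \(\mathrm{Cov}(\varepsilon_i, \varepsilon_j)\), we get
\[ \mathrm{Var}[r_p] = \mathrm{Var}(\beta_p r_M) + \sum_{j=1}^{N} w_j^2 \, \mathrm{Var}(\varepsilon_j) \]
where we also used \(\beta_p = \sum_{j=1}^{N} w_j \beta_j\).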
We now show that the second term goes to zero in a large, well-diversified portfolio.
Set \(w_j = 1/N\) so that the portfolio is well-diversified. Assume that the residual variance is the same for all assets: \(\mathrm{Var}[\varepsilon_j] = v, \ \forall j\).[3] We get
\[ \sum_{j=1}^{N} w_j^2 \, \mathrm{Var}(\varepsilon_j) = \frac{1}{N^2} \sum_{j=1}^{N} v = \frac{v}{N} \xrightarrow{N \to \infty} 0 \]
Hence, nonsystematic risk can be eliminated through diversification. In standard notation,
\[ \sigma_p^2 \xrightarrow{N \to \infty} \beta_p^2 \sigma_M^2 \]
In graphical terms,
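[Figure: portfolio variance \(\sigma^2\) (vertical axis) against the number of securities N (horizontal axis), split into market risk (\(\beta_p^2 \sigma_M^2\)) and specific risk (\(\sigma_{\varepsilon_p}^2\)).]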
Hence, we should expect that \(E[r_p]\) depends only on \(\beta_p\).
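The limiting argument can be checked numerically. The sketch below assumes equal weights and a common residual variance v, exactly as in the proof; the values chosen for \(\beta_p\), \(\sigma_M^2\) and v are hypothetical.

```python
def portfolio_variance(n, beta_p=1.0, var_market=0.04, resid_var=0.01):
    """Variance split of an equal-weighted portfolio of n stocks
    under the market model (hypothetical default inputs)."""
    systematic = beta_p ** 2 * var_market
    # sum over n stocks of (1/n)^2 * v equals v / n
    specific = resid_var / n
    return systematic, specific

for n in (1, 10, 100, 1000):
    systematic, specific = portfolio_variance(n)
    print(n, systematic, specific, systematic + specific)
```

The systematic part stays fixed while the specific part shrinks in proportion to 1/N, which is the pattern the proof describes.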
6.2.4 Another interpretation of the CAPM β
In the CAPM, every investor holds a well-diversified portfolio, namely the Market portfolio (\(\varepsilon_M \equiv 0\), \(\sigma_{\varepsilon_M}^2 = 0\)). Hence, there is no reward for the nonsystematic risk of a security. Only the systematic risk of each stock is rewarded.[4] \(\beta\) is the measure of market risk (high \(\beta\) implies high systematic risk). The higher the \(\beta\), the higher the expected return (through the SML). Hence, \(\beta\) can be interpreted as the measure of the risk that matters, i.e., of the risk that carries a risk premium in the CAPM: market risk.
[3] The argument also works if we assume only that the variances are bounded, \(\mathrm{Var}[\varepsilon_j] \leq c < \infty, \ \forall j\).
[4] To find how much reward we can get for the systematic part, we find the return of an efficient portfolio with risk only \(\sigma_j = \beta_j \sigma_M\). That is, we plug this quantity of risk into the CML equation. Note that this produces the SML. Hence, the SML only rewards systematic risk.
Example 6.2.1. This example shows that nonsystematic risk is not rewarded.
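Assume that returns are generated by the market model. In particular, the returns on stocks A and B are generated by:
\[ r_a = 0.004 + 0.9\, r_M + \varepsilon_a, \qquad \sigma_{\varepsilon_a} = 0.05 \]
\[ r_b = -0.008 + 1.2\, r_M + \varepsilon_b, \qquad \sigma_{\varepsilon_b} = 0.1 \]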
The return on the market has a mean of 10% and standard deviation of 20%.
The risk-free rate is 4%. The CAPM correctly explains expected returns in this
economy.
Part 1 - An under-diversified portfolio.
Suppose we form an equal-weighted portfolio p of these two stocks (\(w_a = w_b = 0.5\)). Note that this portfolio is not well diversified since it only has two stocks. The portfolio beta is
\[ \beta_p = \dots \]
and using the CAPM we get the expected return for this portfolio
\[ \text{SML}: \quad E[r_p] = \dots \]
Part 2 - Risk decomposition.
Using the assumptions of the market model, the systematic risk is
\[ \beta_p^2 \sigma_M^2 = \dots \]
and the nonsystematic risk is
\[ \sigma_{\varepsilon_p}^2 = \sum_{j=1}^{2} w_j^2 \, \mathrm{Var}(\varepsilon_j) = \dots \]
As expected, the portfolio has unique risk because it is not well diversified. Hence, the total risk of the portfolio is
\[ \sigma_p^2 = \beta_p^2 \sigma_M^2 + \sigma_{\varepsilon_p}^2 = \dots \quad \Rightarrow \quad \sigma_p = 21.7\% \]
Part 3 - A well-diversified portfolio.
Consider another efficient portfolio q located on the CML, with total risk equal to the systematic risk of p: \(\sigma_q^2 = \beta_p^2 \sigma_M^2 = 0.0441\). Recall that portfolios on the CML have no specific risk: \(\sigma_{\varepsilon_q}^2 = 0\). Hence,
\[ \sigma_q^2 = 0.0441 + 0 \quad \Rightarrow \quad \sigma_q = 21\% \]
Using the CML equation, its expected return is
\[ \text{CML}: \quad E[r_q] = \dots \]
the same as \(E[r_p]\)!
Part 4 - Conclusion.
The 0.7% of \(\sigma_p\) corresponding to the unique risk of p does not get any reward, i.e., the CAPM only rewards market (i.e., non-diversifiable) risk. Graphically, while q sits on the CML, p is to the right of the CML. In simple terms, buying only stocks a and b is not the best way to get an expected return of 10.3%.
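The blanks in Example 6.2.1 can be checked with a few lines of Python. This is only a sketch that plugs the data of the example into the formulas already stated (the SML, the risk decomposition, and the CML written as \(E[r_q] = r_f + \frac{E[r_M] - r_f}{\sigma_M}\sigma_q\)); the variable names are made up for the illustration.

```python
import math

# Data from Example 6.2.1
r_f, mu_m, sigma_m = 0.04, 0.10, 0.20
betas = {"a": 0.9, "b": 1.2}
resid_sd = {"a": 0.05, "b": 0.10}
weights = {"a": 0.5, "b": 0.5}

# Part 1: portfolio beta and SML expected return
beta_p = sum(weights[j] * betas[j] for j in weights)
exp_r_p = r_f + beta_p * (mu_m - r_f)

# Part 2: risk decomposition
systematic = beta_p ** 2 * sigma_m ** 2
specific = sum(weights[j] ** 2 * resid_sd[j] ** 2 for j in weights)
sigma_p = math.sqrt(systematic + specific)

# Part 3: efficient portfolio q on the CML with the same systematic risk
sigma_q = math.sqrt(systematic)
exp_r_q = r_f + (mu_m - r_f) / sigma_m * sigma_q

print(beta_p, exp_r_p, systematic, specific, sigma_p, sigma_q, exp_r_q)
```

The printed values should agree with the figures quoted in the example (systematic risk 0.0441, \(\sigma_p\) of about 21.7%, \(\sigma_q\) of 21%, and an expected return of 10.3% for both p and q).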
6.3 Pricing equation
We start with the exact version of the APT. We assume that \(\varepsilon_j \equiv 0, \ \forall j\). This provides
all the necessary intuition for the general case in the last section. We follow Huang and
Litzenberger (1988).
6.3.1 Exact factor pricing with one factor
Assume 1 single factor exactly generates all returns:
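\[ r_j = a_j + \beta_j F, \qquad \forall j \]
Construct a portfolio p by investing in the risk-free asset and the factor itself with the following weights:
\[ p: \quad \begin{bmatrix} w_f \\ w_F \end{bmatrix} = \begin{bmatrix} 1 - \beta_j \\ \beta_j \end{bmatrix} \]
The return on this portfolio is thus
\[ r_p = w_f r_f + w_F F = (1 - \beta_j) r_f + \beta_j F \]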
This portfolio has the same return, state-by-state, as stock j, except for the intercept.
If there are no arbitrage opportunities, the intercepts must be the same:
\[ a_j = (1 - \beta_j) r_f \]
To see why this must be so, suppose \(a_j > (1 - \beta_j) r_f\), i.e., j is a better investment. Then, short $1 of the portfolio and buy $1 of j. This costs nothing and guarantees a sure profit of \(a_j - (1 - \beta_j) r_f > 0\). This is called a free lunch. If instead \(a_j < (1 - \beta_j) r_f\), short j and buy p for another free lunch. There cannot be such arbitrage opportunities in financial markets. Check exercise 42. ◃
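The free-lunch argument can be made concrete with a small sketch. The numbers below are hypothetical (they are not those of exercise 42), and the function name is made up; the point is only that the long-short payoff does not depend on the realized factor.

```python
def arbitrage_profit(a_j, beta_j, r_f, factor_return):
    """Payoff per $1 of buying stock j and shorting the replicating
    portfolio p (weights 1 - beta_j in r_f and beta_j in the factor)."""
    r_j = a_j + beta_j * factor_return
    r_p = (1 - beta_j) * r_f + beta_j * factor_return
    return r_j - r_p

# Hypothetical stock: a_j = 0.02 exceeds (1 - 0.5) * 0.03 = 0.015
for F in (-0.30, 0.0, 0.10, 0.50):
    print(F, round(arbitrage_profit(0.02, 0.5, 0.03, F), 6))
```

Whatever value F takes, the profit equals \(a_j - (1 - \beta_j) r_f\), here 0.005 per dollar, at zero initial cost.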
Replacing \(a_j\) in the return-generating process, we get
\[ r_j = r_f + \beta_j (F - r_f) \]
Taking expectations on both sides we get
\[ E[r_j] = r_f + \beta_j (E[F] - r_f), \qquad \forall j \]
The asset risk premium (\(E[r_j] - r_f\)) depends on the factor risk premium (\(E[F] - r_f\)) and the asset’s loading on the factor (\(\beta_j\)). The factor’s risk premium is exogenous. Once we know this single “price”, we can price all other assets in the economy.
Note the similarity with the CAPM. The CAPM basically says that the unique risk factor is the Market. Replacing F with \(r_M\) in the previous equation produces the CAPM.
Remark. Here we implicitly assumed that F is the return on some traded financial
portfolio. If the factor is not traded, we replace F by a factor mimicking portfolio, that is, a portfolio x satisfying \(a_x = 0\) and \(\beta_x = 1\) as closely as possible.
6.3.2 Exact factor pricing with more than one factor
Now consider an exact K-factor structure:
\[ r_j = a_j + \sum_{k=1}^{K} \beta_{jk} F_k, \qquad \forall j \]
The argument is identical to the single factor case. Construct a portfolio p by
investing in the risk-free asset and the factors themselves with the following weights:
\[ p: \quad \begin{bmatrix} w_f \\ w_{F_1} \\ \vdots \\ w_{F_K} \end{bmatrix} = \begin{bmatrix} 1 - \sum_{k=1}^{K} \beta_{jk} \\ \beta_{j1} \\ \vdots \\ \beta_{jK} \end{bmatrix} \]
The return on this portfolio is thus
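\[ r_p = \left( 1 - \sum_{k=1}^{K} \beta_{jk} \right) r_f + \sum_{k=1}^{K} \beta_{jk} F_k \]
The no arbitrage condition is:
\[ a_j = \left( 1 - \sum_{k=1}^{K} \beta_{jk} \right) r_f \]
Replacing \(a_j\) in the return-generating process and taking expectations, we get
\[ E[r_j] = r_f + \sum_{k=1}^{K} \beta_{jk} \left( E[F_k] - r_f \right), \qquad \forall j \tag{6.5} \]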
The risk premiums on the K exogenous sources of risk now determine the expected
returns on all securities.
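As a minimal sketch of how (6.5) is used, the function below computes an expected return from the factor loadings and the factor means. It assumes traded factors, as in the derivation above; the function name and the inputs are hypothetical.

```python
def apt_expected_return(r_f, betas, factor_means):
    """Expected return from eq. (6.5), assuming the factors are
    returns on traded portfolios."""
    return r_f + sum(b * (m - r_f) for b, m in zip(betas, factor_means))

# Hypothetical two-factor economy
print(apt_expected_return(0.04, [1.0, 0.5], [0.10, 0.06]))

# With a single factor equal to the market, (6.5) collapses to the CAPM
print(apt_expected_return(0.04, [1.05], [0.10]))
```

In the first call the premium is 1.0 times 6% plus 0.5 times 2%, added to the 4% risk-free rate.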
Extensions
There are two important extensions to model (6.5):
1. Factors are excess returns. Suppose all factors are returns on long-short portfolios
with zero price. For example, market minus risk-free rate (like in the CAPM) or
portfolio A minus portfolio B (like in the Fama-French model in section 6.4.2).
Then, the model is
\[ E[r_j^e] = \sum_{k=1}^{K} \beta_{jk} E[F_k^e], \qquad \forall j \]
where \(F_k^e\) is the excess return on factor k and \(r_j^e\) is the excess return on asset j. For a single stock j, \(E[r_j^e] = E[r_j] - r_f\); for a long-short portfolio p, \(E[r_p^e] = E[r_{long}] - E[r_{short}]\).
2. Nontraded factors. So far we assumed that factors are returns, i.e., factors are
based on portfolios that we can buy or sell. If the factors are not returns on
traded portfolios (e.g., industrial production), the model is
\[ E[r_j] = r_f + \sum_{k=1}^{K} \beta_{jk} \lambda_k, \qquad \forall j \]
The difference is that the risk premium on each factor is no longer its mean. Instead, the risk premium on factor k is given by the free parameter \(\lambda_k\) that we need to estimate.
See Cochrane (2005) for proofs and details.
6.3.3 Approximate factor pricing
We now consider the general K-factor structure with noise in (6.1):
\[ r_j = a_j + \sum_{k=1}^{K} \beta_{jk} F_k + \varepsilon_j, \qquad j = 1, 2, \dots, n \]
For this case, we will only be able to get a limiting result as the number of stocks
increases. That is, APT is an approximation.
We have to consider a different arbitrage concept. An asymptotic arbitrage opportu-
nity exists if we can construct a (large) portfolio satisfying the following conditions: (1)
zero cost; (2) strictly positive expected return; (3) negligible variance. This is almost a
free lunch.
If there is no such arbitrage opportunity, then a linear pricing relation will hold
approximately for most of the assets in a large economy:
\[ E[r_j] \approx r_f + \sum_{k=1}^{K} \beta_{jk} \left( E[F_k] - r_f \right), \qquad \forall j \]
The approximation is in the sense that
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} \left[ E[r_j] - r_f - \sum_{k=1}^{K} \beta_{jk} \left( E[F_k] - r_f \right) \right]^2 = 0 \]
The model prices most of the assets “correctly” and all of the assets together with
a negligible mean square error. However, it can be arbitrarily bad at pricing a finite
number of the assets.
For an intuitive proof see Danthine and Donaldson (2005). A somewhat better proof
is in Cvitanić and Zapatero (2004, p.436). Rigorous proofs are in Ingersoll (1987, p.172)
and Huang and Litzenberger (1988, p.106).
6.4 How to identify the factors
6.4.1 Overview
The major drawback of the APT is that the theory does not say what the factors should
be. Hence, identifying the factors has been an empirical quest that is not yet over. Again,
the goal is to identify a small number of factors that describe all stock returns.
There are several approaches:
• Statistical factors. Using factor analysis and principal components analysis, re-
searchers have concluded that 3 to 5 factors are enough to describe the returns on
most stocks.
• Economically meaningful factors. The idea is to test whether relevant macroe-
conomic variables are good risk factors. There is a big literature on this. One
important example is Chen, Roll, and Ross (1986). They identify the following
factors: industrial production, credit spread, term spread.
• Financially meaningful factors. The factors are constructed as the return on a
(meaningful) portfolio of financial assets. The most important model nowadays is
the Fama and French 3 factor model (described below). More recently, Momentum
has also been considered a risk factor.
6.4.2 Fama and French model
Fama and French (1993) propose the following 3-factor asset pricing model:
\[ E[r_j] - r_f = \beta_{jM} \left( E[r_M] - r_f \right) + \beta_{js} E[SMB] + \beta_{jh} E[HML] \tag{6.6} \]
where the loadings (\(\beta_{jM}, \beta_{js}, \beta_{jh}\)) are the slopes in the time-series regression[5]
\[ r_j - r_f = a_j + \beta_{jM} (r_M - r_f) + \beta_{js} SMB + \beta_{jh} HML + \varepsilon_j \tag{6.7} \]
To form the two new factors, FF divide all firms into six buckets depending on their
size (market equity, ME) and the ratio of book equity to market equity (BE/ME):[6]

                              ME < 50th ME prct    ME > 50th ME prct
  BE/ME > 70th BE/ME prct     Small Value          Big Value
  (in between)                Small Neutral        Big Neutral
  BE/ME < 30th BE/ME prct     Small Growth         Big Growth

“Small” stocks have ME smaller than the median ME. Typically, small stocks perform
better than what the CAPM predicts (this is a so-called anomaly).
[5] The set up looks slightly different from our previous return generating process, but they are equivalent. To see this, write the exact version of (6.7) as \(r_j = \hat{a}_j + \beta_{jM} r_M + \beta_{js} (SMB + r_f) + \beta_{jh} (HML + r_f)\), with \(\hat{a}_j := a_j + r_f - \beta_{jM} r_f - \beta_{js} r_f - \beta_{jh} r_f\). Apply the no arbitrage condition and take expectations to get (6.6).
[6] See the details in http://mba.tuck.dartmouth.edu/pages/faculty/ken.french
“Value” stocks have BE/ME higher than the 70th BE/ME percentile; their book-
to-market ratio is High. “Growth” stocks have BE/ME lower than the 30th BE/ME
percentile; their book-to-market ratio is Low. Typically, BE/ME is high when the ME
(denominator) is low. This happens when the firm has had low returns and is now near
financial distress. Nonetheless, most of these firms usually rebound and thus, if you hold
a large portfolio of these firms, you end up making more money than their CAPM beta
would suggest (another CAPM anomaly).
Each month, the factors are computed in the following way:
• SMB (Small Minus Big) is the average return on the three small portfolios minus
the average return on the three big portfolios,
SMB = 1/3 (Small Value + Small Neutral + Small Growth)
- 1/3 (Big Value + Big Neutral + Big Growth)
Historically, the SMB portfolio generated an annual return somewhere between
1.5% and 3%. This is the size premium.
• HML (High Minus Low) is the average return on the two value portfolios minus
the average return on the two growth portfolios,
HML = 1/2 (Small Value + Big Value) - 1/2 (Small Growth + Big Growth)
Historically, the HML portfolio generated an annual return somewhere between
3.5% and 5%. This is the value premium.
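The two factor definitions above translate directly into code. The sketch below takes the returns of the six portfolios for one month and returns SMB and HML; the function name and the monthly returns are hypothetical, used only to show the arithmetic.

```python
def smb_hml(r):
    """r maps each of the six portfolio names to its return for the month."""
    small = (r["Small Value"] + r["Small Neutral"] + r["Small Growth"]) / 3
    big = (r["Big Value"] + r["Big Neutral"] + r["Big Growth"]) / 3
    value = (r["Small Value"] + r["Big Value"]) / 2
    growth = (r["Small Growth"] + r["Big Growth"]) / 2
    return small - big, value - growth

# Hypothetical returns for one month
month = {
    "Small Value": 0.030, "Small Neutral": 0.020, "Small Growth": 0.010,
    "Big Value": 0.020, "Big Neutral": 0.015, "Big Growth": 0.010,
}
smb, hml = smb_hml(month)
print(smb, hml)
```

Note that the neutral portfolios enter SMB but not HML, as in the formulas above.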
This model has had considerable empirical success in explaining CAPM anomalies
(portfolios that don’t plot on the SML) and in capturing the variation in the cross-section
of expected returns. Thus, Fama and French (1996) argue that SMB and HML mimic
combinations of two underlying risk factors of special concern to investors.
6.5 Applications
6.5.1 Fund performance
One important question in finance is: How to assess the performance of a fund manager?
We cannot just look at raw realized returns because we want to distinguish stock-picking
skills from simple risk taking (if we see a big return, was it because the manager was
able to identify mispriced stocks or was it because he took large risks and got lucky?)
Therefore, we need to compute risk adjusted returns, that is, we need to measure the
difference between the empirical realized returns and the returns “appropriate” for the
risk of the fund. This difference is called Jensen’s alpha.
We have two models to adjust returns for risk:
CAPM
To evaluate the performance of fund p, estimate the following time-series regression:
\[ (r_p - r_f)_t = \alpha_p + \beta_p (r_M - r_f)_t + \varepsilon_{pt} \tag{6.8} \]
This is the standard regression to estimate the CAPM beta. Now, we are also interested
in the intercept. According to the CAPM, \(\alpha_p = 0\). Graphically, a positive Jensen’s alpha implies that the portfolio lies above the SML:
[Figure: the SML, with \(\beta\) on the horizontal axis and \(E[r]\) on the vertical axis.]
If \(\alpha_p\) is (significantly) positive, we can conclude that the fund returns are higher than what its level of risk would require (according to the CAPM). In other words, the manager has skill.
FF3
If we don’t believe that CAPM is a good model to adjust returns for risk, we can use
the Fama-French model. Run the regression
\[ (r_p - r_f)_t = \alpha_p + \beta_{pM} (r_M - r_f)_t + \beta_{ps} SMB_t + \beta_{ph} HML_t + \varepsilon_{pt} \]
Again, if \(\alpha_p > 0\) (statistically), the manager has skill. Note that the \(\beta_{pM}\) estimator that comes out of this regression is not the CAPM beta (due to the presence of other regressors).
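Both regressions can be run with ordinary least squares. The sketch below uses NumPy's `numpy.linalg.lstsq` on simulated data; the data-generating numbers, the random seed, and the function name `jensen_alpha` are all assumptions made for the illustration, and the sketch only produces point estimates (judging whether alpha is statistically positive would also require its standard error).

```python
import numpy as np

def jensen_alpha(fund_excess, factor_excess):
    """OLS of fund excess returns on a constant and the factor columns.
    Returns (alpha, betas)."""
    y = np.asarray(fund_excess, dtype=float)
    F = np.asarray(factor_excess, dtype=float)
    if F.ndim == 1:
        F = F[:, None]          # single factor: make it a column
    X = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]

# Simulated monthly data (hypothetical parameters)
rng = np.random.default_rng(0)
T = 240
mkt = rng.normal(0.005, 0.04, T)   # market excess return
smb = rng.normal(0.002, 0.03, T)
hml = rng.normal(0.003, 0.03, T)
fund = 0.002 + 1.1 * mkt + 0.3 * smb - 0.2 * hml + rng.normal(0, 0.01, T)

alpha_capm, beta_capm = jensen_alpha(fund, mkt)                      # eq. (6.8)
alpha_ff3, betas_ff3 = jensen_alpha(fund, np.column_stack([mkt, smb, hml]))
print(alpha_capm, beta_capm)
print(alpha_ff3, betas_ff3)
```

With real data, `fund`, `mkt`, `smb` and `hml` would be replaced by the observed time series of excess returns and factor returns.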
See my website for an application (homework) with real data.
There is a huge literature on fund performance and research is still going on. For a
survey of the evidence and its implication on the Efficient Market Hypothesis see Malkiel
(2005).
Remark on CAPM’s Beta estimation
Equation (6.8) is considered a better way to estimate the CAPM beta than the market
model regression. If the interest rate is constant, both lead to the same beta:
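\[ \begin{aligned} & r_{pt} - r_f = \alpha_p + \beta_p (r_{Mt} - r_f) + \varepsilon_t \\ \Rightarrow \ & r_{pt} = r_f (1 - \beta_p) + \alpha_p + \beta_p r_{Mt} + \varepsilon_t \\ \Rightarrow \ & r_{pt} = \bar{\alpha}_p + \beta_p r_{Mt} + \varepsilon_t \end{aligned} \]
with the interest rate folded into the intercept, \(\bar{\alpha}_p := r_f (1 - \beta_p) + \alpha_p\). However, in practice interest rates are not constant and the two regressions lead to different beta estimates.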
The CAPM is really mute about “statistical” issues (it is assumed that investors know
the true parameter values). But we can nonetheless argue that regression (6.8) is more
in the spirit of the CAPM. This is because we are interested in explaining excess returns
(remember \(r_f\) is exogenous).
Consider an example to magnify the potential differences. Suppose we estimate the
market model with the following raw returns (in %):
  t    r_M    r_i
  1     6      8
  2     5      3
We get \(r_{it} = -22 + 5\, r_{Mt}\) (in %, i.e., an intercept of −0.22 in decimal terms) and thus (wrongly) infer that this security has a very high beta, \(\beta_i = 5\). Now we take into account the interest rate in each of those periods and estimate (6.8):
  t    r_M    r_i    r_f    r_M − r_f    r_i − r_f
  1     6      8      7        −1            1
  2     5      3      4         1           −1
We get \((r_i - r_f)_t = 0 - 1 \, (r_M - r_f)_t\), and now (correctly) conclude that this is actually a negative beta security, \(\beta_i = -1\). This makes sense, since security i is acting as a hedge against excess market returns. In other words, a market return of 6% is “bad times” if \(r_f = 7\%\), whereas 5% is “good times” if the risk-free rate is only \(r_f = 4\%\).
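The two fits in this example are easy to reproduce. The sketch below uses `numpy.polyfit` with degree 1 on the two observations, with returns kept in percent as in the tables; with only two data points the line passes through both exactly.

```python
import numpy as np

# Returns in percent, from the tables above
r_m = np.array([6.0, 5.0])
r_i = np.array([8.0, 3.0])
r_f = np.array([7.0, 4.0])

# Market model on raw returns
slope_raw, intercept_raw = np.polyfit(r_m, r_i, 1)

# Regression (6.8) on excess returns
slope_ex, intercept_ex = np.polyfit(r_m - r_f, r_i - r_f, 1)

print(slope_raw, intercept_raw)
print(slope_ex, intercept_ex)
```

The raw regression gives a slope of 5, while the excess-return regression gives a slope of −1 with a zero intercept.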
6.5.2 Market neutral strategy
This investment strategy is typical of many hedge funds — see Bodie, Kane, and Marcus
(2005, sec 10.4).
A portfolio manager has identified an underpriced portfolio p with the following
characteristics:
\[ (r_p - r_f) = 0.04 + 1.4 \, (r_{SP500} - r_f) + \varepsilon_p \]
The manager is very confident about this alpha of 4%.
However, even if the manager is right, he may lose money if the whole market turns down. He would like to exploit the relative mispricing of p, regardless of what happens to the market.
The solution is to construct a tracking portfolio (T) that matches the systematic
component of p. It must therefore have a beta of 1.4, which requires \(w_{SP500} = 1.4\) and \(w_f = -0.4\). The return on the tracking portfolio is thus
\[ r_T = 1.4 \, r_{SP500} - 0.4 \, r_f \quad \Rightarrow \quad (r_T - r_f) = 1.4 \, (r_{SP500} - r_f) \]
The investment strategy is to go long (buy) on p and go short (sell) on T. The
combined portfolio C thus has a return of
\[ r_C = r_p - r_T = (r_p - r_f) - (r_T - r_f) = 0.04 + \varepsilon_p \]
This combined position is thus market neutral. Regardless of what happens to the market, the manager earns 4%.[7]
6.6 Exercises
Ex. 41 — Assume the market model is true. Show that the covariance matrix is:
\[ \text{diagonal:} \quad \sigma_j^2 = \beta_j^2 \sigma_M^2 + \sigma_{\varepsilon_j}^2 \]
\[ \text{off diagonal:} \quad \sigma_{ij} = \beta_i \beta_j \sigma_M^2 \]
Ex. 42 — Stock returns are generated by the following exact market model:
\[ r_j = a_j + \beta_j r_M \tag{6.9} \]
The risk-free rate is 4%. After a careful analysis, you identify stock a whose return is
generated by
\[ r_a = 0.01 + 0.9 \, r_M \tag{6.10} \]
Can you become filthy rich? Explain how (quantify the profit).
[7] Note that there is still some residual risk, \(\varepsilon_p\). This will be small if the single market factor explains \(r_p\) well. In practice, we typically need more factors.
Ex. 43 — The Fama-French model states that returns can be explained by a three
factor model. The first factor is the market (as in the CAPM). Briefly explain what the
other two new factors are (ie, name them, describe how they are constructed, and what
they measure).
Ex. 44 — See my webpage for exercises on beta estimation and fund performance.
Chapter 7
Pricing in Complete Markets
Two general pricing frameworks in complete markets:
• Arrow-Debreu pricing: \(p = \sum_{s=1}^{S} x(s) \cdot p^{ad}(s)\)
• Risk-Neutral pricing: \(p = \frac{E^Q[x]}{1 + r_f}\)
The Arrow-Debreu pricing framework is a very general setup. It gives us intuition
and helps us understand all other pricing models. In a sense, it is the mother of all asset
pricing models. It can be set from an equilibrium or an arbitrage perspective. Here, we
follow the second approach. The Risk-Neutral pricing framework is essentially equivalent
to AD pricing. RN is the center of mathematical finance and derivatives pricing.
7.1 Basic and Complex securities
Definition (Arrow-Debreu security). An Arrow-Debreu (AD) security pays 1 unit of
consumption, or one unit of currency, in one single state of nature and nothing else in
other states. For example, the AD security for state s, with price \(p^{ad}_s\), produces the following payoffs:
  State   Payoff
  1       0
  2       0
  ...     ...
  s       1
  ...     ...
  S       0
An AD security is also called a basic or primitive security, a state-contingent claim (the
payoff is contingent on the realized state of nature), or simply a contingent claim or
state claim.
Definition (Complex security). A complex security is an asset that pays off in more
than one state of nature. Examples: a stock; a bond; a stock option.
Example 7.1.1. A portfolio with one share of each of the S different AD
securities is in fact a risk-free security (pays one unit of currency regardless of
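the state). Its price must be
\[ p_f = \sum_{s=1}^{S} 1 \times p^{ad}_s \]
Since we prefer to speak of the risk-free rate instead of price, divide through by \(p_f\) and use \(p_f = \frac{1}{1 + r_f}\) to get:
\[ 1 = \sum_{s=1}^{S} (1 + r_f) \times p^{ad}_s \quad \Leftrightarrow \quad \frac{1}{1 + r_f} = \sum_{s=1}^{S} p^{ad}_s \]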
Note that the payoffs of complex securities also depend on the realized state of nature,
so they are also called contingent claims (sorry, but I did not create these terms...)
7.2 Computing AD prices
In reality, we only observe the prices of complex securities. So, we need to extract the
implicit AD prices from the complex securities’ prices.
Example 7.2.1. There are 3 states and 3 assets with the following payoffs:
  State   Asset a   Asset b   Asset c
  1       3         1         2
  2       2         1         0
  3       1         1         2
  Price   1.5       0.8       0.8
To find the AD price for the first state, \(p^{ad}_1\), we find the portfolio of complex securities that replicates the AD security:
\[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} w_a + \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} w_b + \begin{bmatrix} 2 \\ 0 \\ 2 \end{bmatrix} w_c \]
resulting in the linear system
\[ \begin{cases} 1 = 3 w_a + w_b + 2 w_c \\ 0 = 2 w_a + w_b \\ 0 = w_a + w_b + 2 w_c \end{cases} \quad \Leftrightarrow \quad \begin{cases} w_a = 0.5 \\ w_b = -1 \\ w_c = 0.25 \end{cases} \]
Note that the w above represent quantities, not percentage weights. The AD price must thus be
\[ p^{ad}_1 = w_a p_a + w_b p_b + w_c p_c = 0.15 \]
Proceeding in the same way for the other two states, we get the corresponding AD prices. Check that \(p^{ad}_2 = 0.40\), \(p^{ad}_3 = 0.25\).
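The replication in Example 7.2.1 is a small linear system, so it can be solved numerically. The sketch below uses NumPy with the payoff matrix (states in rows, assets in columns) and prices from the example; the variable names are made up. It first solves for the state-1 quantities and then recovers all three AD prices at once from \(X^\top p^{ad} = P\), which says that each asset's price is its payoffs valued at the AD prices.

```python
import numpy as np

# Payoffs: rows are states 1..3, columns are assets a, b, c
X = np.array([[3.0, 1.0, 2.0],
              [2.0, 1.0, 0.0],
              [1.0, 1.0, 2.0]])
P = np.array([1.5, 0.8, 0.8])   # prices of assets a, b, c

# Quantities of a, b, c that replicate the state-1 AD security
w1 = np.linalg.solve(X, np.array([1.0, 0.0, 0.0]))
p_ad_1 = w1 @ P

# All AD prices at once: each asset price equals sum_s X[s, i] * p_ad[s]
p_ad = np.linalg.solve(X.T, P)

print(w1, p_ad_1)
print(p_ad)
```

The second solve should return the three AD prices stated in the example, 0.15, 0.40 and 0.25.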
7.3 Complete Markets
7.3.1 Price of complex securities
Definition (Complete market). The market is complete if there exists one AD price for
each possible state of nature, that is, if we can compute \(p^{ad}_s, \ \forall s\).
The reason why market completeness is important is that in arbitrage-free complete
markets every financial contract has a unique arbitrage-free price. This is very useful
for pricing derivatives.
Proposition 7.3.1. If the market is complete, any complex security (ie, any cash flow
stream) can be replicated and priced as a portfolio of AD securities.
The Arrow-Debreu pricing formula for any complex security is
\[ p = \sum_{s=1}^{S} x(s) \cdot p^{ad}(s) \tag{7.1} \]
where x(s) denotes the payoff in state s.
Example 7.3.1. Consider the market from the previous example. Consider an
additional security with the following payoffs:[1]
  State   Payoff for asset d
  1       2
  2       1
  3       0
Since the market is complete, this security can be replicated as a portfolio of AD securities:
\[ \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 1 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 0 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]
Its price must be
\[ p_d = 2 p^{ad}_1 + 1 p^{ad}_2 + 0 = 2 \times 0.15 + 1 \times 0.4 = 0.7 \]
Otherwise, there would be arbitrage opportunities.
7.3.2 Quick test for market completeness
Given a payoff matrix, how can we know if all AD prices exist, ie the market is complete,
before having to go through the calculations? First, note that if there are more states
than complex securities, S > N, the market is incomplete. Second, if S = N, the test is
given by the following proposition.
Proposition 7.3.2. The market is complete if
i) N = S
ii) The N complex securities are linearly independent.
Intuitively, the market with N = S is complete if the N securities are truly different
from each other.
Formally, note that in the previous example, we computed each AD price by finding
a vector of weights satisfying
\[ a_s = X w \quad \Rightarrow \quad w = X^{-1} a_s \]
where X is the (S by N) matrix of payoffs for the complex securities and \(a_s\) is the vector of payoffs for the AD security (1 in state s, zeros everywhere else).[2] We will be able to replicate all S AD payoffs if the N complex securities span the entire S-dimensional space, \(\mathbb{R}^S\). Hence, we are really asking whether X has full rank (i.e., all columns are linearly independent). The following result from linear algebra is helpful:
Proposition 7.3.3. X has full rank if and only if \(|X| \neq 0\).
By the way, note that \(|X| \neq 0\) guarantees that \(X^{-1}\) exists and thus that the previous equation has a solution.
[1] Note that this security is a call option on asset a (the first asset) with a strike price of 1: \(\text{call}(s) = \max[x_a(s) - 1, 0]\).
[2] We can find the whole matrix of weights at once by doing \(I = XW \Rightarrow W = X^{-1}\). The S-by-1 vector of AD prices is thus \(P^{ad} = W^{\top} P\), with P being the N-by-1 vector of security prices.
Example 7.3.2. From the previous example,
\[ X = \begin{bmatrix} 3 & 1 & 2 \\ 2 & 1 & 0 \\ 1 & 1 & 2 \end{bmatrix} \]
Computing its determinant,[3]
\[ |X| = 4 \neq 0 \]
Hence the market is complete (thus we can compute all AD prices).
7.4 Risk-Neutral Pricing
7.4.1 Price of complex securities
Define
\[ \pi^Q(s) := \frac{p^{ad}(s)}{\sum_{s'=1}^{S} p^{ad}(s')} \]
Note that all \(\pi^Q(s)\) are positive[4] and sum to 1, so they form a legitimate set of probabilities.
[3] The determinant of a square matrix A of size K is:
\[ |A| = \sum_{k=1}^{K} a_{ik} (-1)^{i+k} |A_{-ik}|, \quad \text{for any row } i \]
where \(a_{ik}\) is the ik-th element of A and \(A_{-ik}\) is what’s left of A after deleting the row and column that go through \(a_{ik}\). \(|A|\) can also be computed along any column instead; pick the row or column with the most zeros.
[4] If some \(p^{ad}(s) \leq 0\) there would be an arbitrage opportunity.
From (7.1), the price of a complex security is given by
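\[ \begin{aligned} p &= \sum_{s=1}^{S} p^{ad}(s) \, x(s) \\ &= \left( \sum_{s'=1}^{S} p^{ad}(s') \right) \cdot \left( \frac{1}{\sum_{s'=1}^{S} p^{ad}(s')} \right) \cdot \left( \sum_{s=1}^{S} p^{ad}(s) \, x(s) \right) \\ &= \left( \sum_{s'=1}^{S} p^{ad}(s') \right) \cdot \left( \sum_{s=1}^{S} \frac{p^{ad}(s)}{\sum_{s'=1}^{S} p^{ad}(s')} \, x(s) \right) \\ &= \frac{1}{1 + r_f} \sum_{s=1}^{S} \pi^Q(s) \, x(s) \end{aligned} \]
where we also used \(\sum_{s=1}^{S} p^{ad}(s) = \frac{1}{1 + r_f}\). The risk-neutral pricing formula is thus
\[ p = \frac{E^Q[x]}{1 + r_f} \tag{7.2} \]
where \(E^Q\) means that we take the expectation using the probabilities \(\pi^Q(s)\).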
This is called risk-neutral pricing because we are discounting the expected cash flow
at the risk-free rate. Very important: this does not mean that the investor is risk-neutral.
All we did was to distort the expected cash flow by using the artificial probabilities \(\pi^Q\).
This distortion captures the risk aversion of the investor, so that (7.2) produces the real
price of the security.
Example 7.4.1. Using the market from the previous examples,
\[ \pi^Q(s) = \frac{p^{ad}(s)}{\sum_{s'=1}^{S} p^{ad}(s')} = \begin{cases} \dots \\ \dots \\ \dots \end{cases} \]
The risk-free rate is
\[ 1 + r_f = \frac{1}{\sum_{s=1}^{S} p^{ad}(s)} = \dots \]
Hence, the price of the call option from example 7.3.1 is
\[ p = \frac{E^Q[x]}{1 + r_f} = \dots = 0.7 \]
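One way to check the blanks in Example 7.4.1 is to run the definitions directly. The sketch below takes the AD prices found in Example 7.2.1 and the payoff of asset d from Example 7.3.1; the variable names are made up for the illustration.

```python
# AD prices from Example 7.2.1 and payoff of asset d from Example 7.3.1
p_ad = [0.15, 0.40, 0.25]
payoff_d = [2.0, 1.0, 0.0]

total = sum(p_ad)                      # equals 1 / (1 + r_f)
pi_q = [p / total for p in p_ad]       # risk-neutral probabilities
r_f = 1 / total - 1

expected_q = sum(pi * x for pi, x in zip(pi_q, payoff_d))
price = expected_q / (1 + r_f)         # eq. (7.2)

print(pi_q, r_f, expected_q, price)
```

The resulting price should match the 0.7 obtained with the Arrow-Debreu formula (7.1), since the two formulas are the same sum written in two ways.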
7.4.2 Fundamental theorems
The function Q that defines the probabilities \(\pi^Q(s)\) is known as Risk-Neutral probability measure, or Subjective probability measure, or Equivalent Martingale Measure. Formally,
Definition. (Risk-Neutral probability) A probability measure Q is a Risk-Neutral probability measure if:
i) \(\pi^Q(s) > 0, \ \forall s\), and
ii) Equation (7.2) holds for all securities.
Theorem 7.4.1. (First fundamental theorem of mathematical finance) There exists a
risk-neutral probability measure Q if and only if there are no arbitrage opportunities.
There are several definitions of arbitrage. The one we are considering here is the
following:
Definition. (Arbitrage) There is an arbitrage opportunity if we can create a portfolio
with the following characteristics:
i) p ≤ 0, and
ii) x(s) ≥ 0, for all s.
(A more precise definition should exclude the case p = 0 with x(s) = 0, ∀s, but this is
obviously a useless portfolio.)
The following example illustrates one direction of the theorem: if there are arbitrage opportunities, there is no Q.
Example 7.4.2. Consider a different market:
  State   Asset 1   Asset 2   Asset 3
  1       3         1         2
  2       2         1         0
  3       1         1         2
  Price   0.7       0.8       0.8
Clearly, there is an arbitrage opportunity since asset 1 always pays at least as much as asset 2 (and strictly more in states 1 and 2), but costs less.
Computing AD prices (exercise),
\[ p^{ad} = \begin{bmatrix} -0.25 \\ 0.40 \\ 0.65 \end{bmatrix} \]
where \(p^{ad}(1) = -0.25 < 0\) signals arbitrage.
This would imply \(\pi^Q(1) = -0.25/0.8 = -0.3125 < 0\), which does not satisfy the strict positivity requirement. Hence, there is no risk-neutral measure.
However, the theorem does not say that the measure is unique. In incomplete mar-
kets, we may have many risk-neutral measures (and thus many prices). See example 11.3
in Danthine and Donaldson (2005). The measure and thus security prices are guaranteed
to be unique only in complete markets, as the following theorem states:
Theorem 7.4.2. (Second fundamental theorem of mathematical finance) Assume that
the market is arbitrage free. Then, the market is complete if and only if the risk-neutral
measure is unique.
7.5 Conclusion
If the market is complete we can:
• combine the existing complex securities to generate any payoff (ie, the existing
securities span the space of all possible payoffs);
• recover all AD prices or $\pi^Q$ probabilities;
• use the AD prices or $\pi^Q$ probabilities to price any new security (though this new
security would be redundant).
A good example is the Black-Scholes option pricing model. The market formed
by the stock and the risk-free asset is complete. Thus, we can use the stock and the
risk-free asset to replicate and price the stock option.
We can also interpret the APT in this context. The factors are like AD securities
that span the whole set of (redundant) stocks. Hence, we are able to impose no arbitrage
conditions and get the results in chapter 6.
7.6 Exercises
Ex. 45 — Define the following concepts in your own words:
1. Arrow-Debreu security.
2. Complete market.
Ex. 46 — Consider the following payoff matrix (states in rows, securities in columns):
$$X = \begin{bmatrix} 3 & 7 & 8 \\ 1 & 2 & 9 \\ 7 & 16 & 25 \end{bmatrix}$$
Is the market complete?
Ex. 47 — Consider the following payoff matrix (states in rows, securities in columns):
$$X = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 1 \\ 3 & 4 & 5 \end{bmatrix}$$
The prices of the complex securities (columns) are: $p_1 = 1$, $p_2 = 1.2$, $p_3 = 1.5$.
1. Compute the three Arrow-Debreu prices.
2. Find the price of a new security with payoff $[3, 2, 7]'$.
Ex. 48 — (Risk-Neutral Pricing) There are 3 states of nature and 3 complex securities.
The payoff matrix is:
Payoff
State Asset 1 Asset 2 Asset 3
s = 1 3 4 2
s = 2 0 2 1
s = 3 0 0 1
Price 1.2 1.8 1.2
1. Consider a new fourth security with payoff = [2, 10, 4]’. Compute its price using
the Risk-Neutral pricing method.
2. The price you just computed assumes that investors are risk neutral? Explain
briefly (5 lines).
3. Suppose a Bank is willing to sell you this new security for $p_4 = 2.5$. What would
you do? (Hint: define a trading strategy with the Arrow-Debreu securities and
quantify your profit. Note: you may also define the trading strategy on the
original 3 complex securities, though this requires more work.)
Chapter 8
Consumption-Based Asset Pricing
The fundamental asset pricing equation is
$$p_t = E_t[m_{t+1} x_{t+1}]$$
with $m_{t+1} \equiv \delta U'(c_{t+1})/U'(c_t)$.
The consumption asset pricing framework links the financial market to the real side
of the economy. Namely, we will be able to link consumption to expected stock returns.
This approach to asset pricing based on first principles is much more solid (utility should
be written over consumption, not wealth, because most people are not like Uncle Scrooge
and don’t enjoy swimming in their coins). The next generation of asset pricing models
will probably be some form of consumption asset pricing. It is currently a very active
area of research. A lot of the material in this section comes from Cochrane (2005).
8.1 The investor’s problem
Consider a 2 period consumption model. The investor chooses the quantity (z) of the
security to buy today (t) to maximize the utility of consumption (c). The problem is
thus:
$$\max_{z} \; U(c_t) + E_t[\delta U(c_{t+1})] \tag{8.1}$$
$$\text{s.t.} \quad c_t + z p_t = e_t$$
$$c_{t+1} = e_{t+1} + z x_{t+1}$$
where $p_t$ is the price of the security, $x_{t+1}$ is the total payoff ($x_{t+1}$ includes dividends,
$x_{t+1} = p_{t+1} + d_{t+1}$), and $e$ is an exogenous endowment the investor receives each period.
The parameter $\delta \le 1$ captures impatience. The expectation $E_t[\cdot]$ is conditional on time-t
information.
Substituting the constraints into the objective function, we get the following first-order
condition:
$$\underbrace{-\frac{dU(c_t)}{dz}}_{\text{U loss for addit. unit of security}} = E_t\Bigg[\underbrace{\delta \frac{dU(c_{t+1})}{dz}}_{\text{U gain for addit. unit of security}}\Bigg]$$
The investor buys more or less of the asset until this foc holds. That is, until the
marginal utility loss for consuming less today (on the lhs) equals the (discounted, ex-
pected) marginal utility gain from consuming more tomorrow (on the rhs).
The foc can be further written as
$$U'(c_t) p_t = E_t[\delta U'(c_{t+1}) x_{t+1}]$$
$$\Rightarrow p_t = E_t\left[\delta \frac{U'(c_{t+1})}{U'(c_t)} x_{t+1}\right]$$
Remark. If there is more than one asset, similar foc hold for each asset,
$$p^j_t = E_t\left[\delta \frac{U'(c_{t+1})}{U'(c_t)} x^j_{t+1}\right], \quad \forall j$$
Remark. The investor’s problem can be set in a more general and realistic framework. If
we consider a representative agent (represents the average of all agents in the economy),
his problem is
$$\max_{\{z_t\}_{t=1}^{\infty}} \; E_0\left[\sum_{t=0}^{\infty} \delta^t U(c_t)\right]$$
$$\text{s.t.} \quad c_t + z_{t+1} p_t = z_t x_t, \quad \forall t$$
where $z_{t+1}$ is the quantity of the security to hold from t to t+1. The problem is set in an
Exchange Economy where total output (GDP) is random and exogenous. The output is
distributed through dividends (replacing the endowments in the previous formulation),
which are included in the sequence $\{x_t\}$. This is the famous “Lucas Tree” model, developed
in 1978. The point is that this general formulation leads exactly to the same
first-order condition as (8.1).
Remark. No-trade equilibrium. Note that we will not solve the foc until the end, ie, we
will not try to find z. This is an equilibrium model, so we must have Demand = Supply.
In other words, it must be the case that z ≡ 1 (total supply of the asset is normalized
to 1). This is because there is only one investor (the representative agent), thus there
is no one else left for him to trade with. Hence, the model does not describe traded
quantities. (Note that the CAPM and the APT also do not; we need microstructure
models for this.)
8.2 Fundamental Asset Pricing Equation
The first order condition is the central equation in asset pricing. It is more conveniently
written as
$$p_t = E_t[m_{t+1} x_{t+1}] \tag{8.2}$$
with
$$m_{t+1} \equiv \delta \frac{U'(c_{t+1})}{U'(c_t)} \tag{8.3}$$
The random variable m is called stochastic discount factor (SDF), pricing kernel, or
marginal rate of substitution. The important point is that one single m prices all assets.
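Example 8.2.1. Suppose $U(c) = \frac{c^{1-\gamma}}{1-\gamma}$. Then,
$$m_{t+1} = \delta \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma}$$
and this single random variable prices all assets,
$$p^j_t = E_t\left[\delta \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma} x^j_{t+1}\right], \quad \forall j$$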
In practice, the consumption stream is exogenous (must equal aggregate consumption
in the economy), so the goal of asset pricing is to find a specification for m (ie, for U)
that makes (8.2) consistent with observed stock returns.
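To make the power-utility SDF of Example 8.2.1 concrete, the sketch below evaluates $m_{t+1} = \delta (c_{t+1}/c_t)^{-\gamma}$ in three states. All numbers (δ = 0.99, γ = 2, and the consumption levels) and the function name `power_utility_sdf` are hypothetical, chosen only for illustration; they are not taken from the notes.

```python
def power_utility_sdf(c_next, c_now, delta, gamma):
    """SDF implied by U(c) = c**(1 - gamma) / (1 - gamma)."""
    return delta * (c_next / c_now) ** (-gamma)

# Hypothetical inputs, for illustration only.
delta, gamma = 0.99, 2.0
c_now = 100.0
c_next_by_state = {"recession": 90.0, "normal": 100.0, "boom": 110.0}

sdf = {state: power_utility_sdf(c, c_now, delta, gamma)
       for state, c in c_next_by_state.items()}

for state, m in sdf.items():
    print(f"{state:9s} m = {m:.4f}")
```

The SDF is largest in the state where consumption falls, so payoffs delivered in that state are valued most.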
8.3 Relation to Arrow-Debreu Securities
To interpret the SDF, we can relate it to AD securities. Consider a finite number
of possible states of nature, S. Equation (8.2) can be written as (dropping the time
subscripts)
$$p = \sum_{s=1}^{S} \pi(s) m(s) x(s)$$
where p is the price today, x(s) is next-period’s payoff if state s occurs, and π(s) is the
probability of state s.
In the AD setup, the price of a security is
$$p = \sum_{s=1}^{S} p_{ad}(s) x(s) = \sum_{s=1}^{S} \pi(s) \frac{p_{ad}(s)}{\pi(s)} x(s)$$
where $p_{ad}(s)$ is today’s price of an AD security that pays 1 in state s.
Hence, the SDF is related to AD prices as
$$m(s) = \frac{p_{ad}(s)}{\pi(s)} \tag{8.4}$$
Example 8.3.1. Consider the same market as in example 7.2.1. We now also
include the probabilities for each state:
State Asset 1 Asset 2 Asset 3 π(s)
1 3 1 2 0.2
2 2 1 0 0.3
3 1 1 2 0.5
Price 1.5 0.8 0.8
Recall that we had already computed AD prices: $p_{ad} = [0.15, 0.40, 0.25]$. The
SDF is thus the following vector:
$$m = \begin{bmatrix} 0.15/0.2 \\ 0.40/0.3 \\ 0.25/0.5 \end{bmatrix} = \begin{bmatrix} 0.75 \\ 1.33 \\ 0.50 \end{bmatrix}$$
Suppose we want to price a new security (as in example 7.3.1) with payoffs
$$x_{t+1} = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix}$$
Using (8.2) we get
$$p_t = E_t[m_{t+1} x_{t+1}] = 0.2 \times (0.75 \times 2) + 0.3 \times (1.33 \times 1) + 0.50 \times (0.50 \times 0) = 0.7$$
which is the same number we got in example 7.3.1.
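The calculation in Example 8.3.1 can be reproduced in a few lines. This is a minimal sketch with variable names of our own choosing: it builds the SDF from equation (8.4) and prices each security with $p = E[mx]$, which should also return the quoted prices of the three original assets.

```python
probs = [0.2, 0.3, 0.5]      # pi(s)
p_ad = [0.15, 0.40, 0.25]    # AD prices from the example

payoffs = {
    "asset 1": [3, 2, 1],
    "asset 2": [1, 1, 1],
    "asset 3": [2, 0, 2],
    "new security": [2, 1, 0],
}

# Equation (8.4): m(s) = p_ad(s) / pi(s)
sdf = [q / pi for q, pi in zip(p_ad, probs)]

def price(payoff):
    """p = E[m x] = sum over states of pi(s) * m(s) * x(s)."""
    return sum(pi * m * x for pi, m, x in zip(probs, sdf, payoff))

for name, x in payoffs.items():
    print(f"{name}: {price(x):.4f}")
```

Because π(s) cancels in π(s)·m(s), the result is the same as pricing directly with the AD prices; the SDF simply repackages them per unit of probability.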
8.4 Relation to the Risk-Neutral measure
Equation (8.2) can be written as (dropping the time subscripts)
$$p = \sum_{s=1}^{S} \pi^P(s) m(s) x(s)$$
where $\pi^P(s)$ is the objective or physical probability of state s.
Using risk-neutral pricing, the price of a security is (equation 7.2):
$$p = \frac{E^Q[x]}{1 + r_f} = \sum_{s=1}^{S} \frac{\pi^Q(s)}{1 + r_f} x(s)$$
Comparing the two equations, we conclude that
$$\pi^Q(s) = (1 + r_f) \cdot \pi^P(s) m(s)$$
This equation says a lot. Risk aversion is equivalent to worrying about unpleasant
states. People that report high subjective probabilities (Q) for unpleasant events like
market crashes may not have irrational expectations; they may simply be reporting the
risk-neutral probabilities. These are the product $\pi^P \times m$, hence they are high if either:
the event is truly highly probable (high $\pi^P(s)$); or it is improbable but has disastrous
consequences (high m(s)). (See Cochrane, 2005, p.53)
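Example 8.4.1. Continuing the previous example, we have $1 + r_f = 1/0.8$
and (multiplying the vectors state by state)
$$\pi^Q = \frac{1}{0.8} \cdot \begin{bmatrix} 0.2 \\ 0.3 \\ 0.5 \end{bmatrix} \cdot \begin{bmatrix} 0.75 \\ 1.33 \\ 0.50 \end{bmatrix} = \begin{bmatrix} 0.1875 \\ 0.5000 \\ 0.3125 \end{bmatrix}$$
the same numbers we got in example 7.4.1.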
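The change of measure in Example 8.4.1 can be sketched as follows. The variable names are ours; the inputs are the physical probabilities and AD prices of the running example. The code recovers the gross risk-free rate from $E[m] = 1/(1 + r_f)$ and then applies $\pi^Q(s) = (1 + r_f)\,\pi^P(s)\,m(s)$.

```python
probs_p = [0.2, 0.3, 0.5]    # physical probabilities pi^P(s)
p_ad = [0.15, 0.40, 0.25]    # AD prices from the running example
payoff = [2, 1, 0]           # new security from Example 8.3.1

sdf = [q / pi for q, pi in zip(p_ad, probs_p)]

# E[m] = 1 / (1 + r_f), so the gross risk-free rate is its reciprocal.
expected_m = sum(pi * m for pi, m in zip(probs_p, sdf))
gross_rf = 1.0 / expected_m

# Risk-neutral probabilities: pi^Q(s) = (1 + r_f) * pi^P(s) * m(s)
probs_q = [gross_rf * pi * m for pi, m in zip(probs_p, sdf)]

# Risk-neutral pricing: p = E^Q[x] / (1 + r_f)
price_q = sum(q * x for q, x in zip(probs_q, payoff)) / gross_rf

print("gross risk-free rate:", round(gross_rf, 4))
print("pi^Q:", [round(q, 4) for q in probs_q])
print("price under Q:", round(price_q, 4))
```

State 2 has a physical probability of 0.3 but a risk-neutral probability of 0.5, because its SDF value is the highest of the three states.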
For future reference, we can rewrite the previous equation as
$$m(s) = \frac{1}{1 + r_f} \frac{\pi^Q(s)}{\pi^P(s)}$$
The expression $\frac{\pi^Q(s)}{\pi^P(s)}$ is called the Radon-Nikodym derivative of Q w.r.t. P. You’ll see
a lot of this in option pricing.
8.5 Risk Premiums
The consumption model provides the fundamental economic intuition to understand
why different assets have different prices or expected returns. We will show that it is
the correlation between the random common SDF (m) and the asset-specific payoff (x)
that generates asset-specific risk corrections.
We can write the foc in return form. For every asset j,
$$p^j_t = E_t[m_{t+1} x^j_{t+1}]$$
$$\Rightarrow 1 = E_t\left[m_{t+1} \frac{x^j_{t+1}}{p^j_t}\right] \equiv E_t\left[m_{t+1} R^j_{t+1}\right] \tag{8.5}$$
where $R^j_{t+1}$ is the gross return on the asset.
Consider in particular a risk-free security with return $R^f_{t+1}$ (not random and known
at time t):
$$1 = E_t\left[m_{t+1} R^f_{t+1}\right] = E_t[m_{t+1}] R^f_{t+1} \;\Rightarrow\; E_t[m_{t+1}] = 1/R^f_{t+1} \tag{8.6}$$
Using the definition of covariance, Cov(x, y) = E[xy] − E[x] E[y], and (8.6) we can
write (8.2) as
$$p_t = E_t[m_{t+1} x_{t+1}]$$
$$\Rightarrow p_t = E_t[m_{t+1}] E_t[x_{t+1}] + \mathrm{Cov}_t(m_{t+1}, x_{t+1})$$
$$\Rightarrow p_t = \frac{E_t[x_{t+1}]}{R^f_{t+1}} + \mathrm{Cov}_t(m_{t+1}, x_{t+1})$$
The first term is the value of the asset if investors were risk-neutral. The second term
is a risk adjustment. If the payoff covaries positively with the sdf the security price will
be higher (returns will be lower).
To see why this is so, write the sdf explicitly:
$$p_t = \frac{E_t[x_{t+1}]}{R^f_{t+1}} + \frac{\delta}{U'(c_t)} \mathrm{Cov}_t(U'(c_{t+1}), x_{t+1}) \tag{8.7}$$
Recall that marginal utility ($U'$) is decreasing ($U'' < 0$). Investors like smooth consumption.
If an asset pays off well when consumption is low (marginal utility is high), it will
help to smooth consumption. Thus investors are willing to pay a high price for it (the
covariance term is positive); equivalently, demand a low return.[1]
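As a numerical illustration of the decomposition of the price into a risk-neutral value plus a covariance term, the sketch below reuses the SDF and the new security from Example 8.3.1. The helper `expect` and the variable names are ours, not the author's.

```python
probs = [0.2, 0.3, 0.5]
sdf = [0.15 / 0.2, 0.40 / 0.3, 0.25 / 0.5]   # m(s) from Example 8.3.1
payoff = [2, 1, 0]                           # new security from Example 8.3.1

def expect(values):
    """Expectation under the physical probabilities."""
    return sum(p * v for p, v in zip(probs, values))

e_m = expect(sdf)        # equals 1 / R^f by (8.6)
e_x = expect(payoff)
cov_mx = expect([m * x for m, x in zip(sdf, payoff)]) - e_m * e_x

risk_neutral_value = e_x * e_m   # E[x] / R^f
price = risk_neutral_value + cov_mx

print(f"E[x]/Rf   = {risk_neutral_value:.4f}")
print(f"Cov(m, x) = {cov_mx:.4f}")
print(f"price     = {price:.4f}")
```

Here the payoff is zero in the state with the lowest SDF value, so the covariance with m is positive and the security is worth more than its discounted expected payoff.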
This intuition can be restated in return form. Starting from (8.5) and using (8.6),
$$1 = E_t\left[m_{t+1} R^j_{t+1}\right]$$
$$\Rightarrow 1 = \frac{E_t[R^j_{t+1}]}{R^f_{t+1}} + \mathrm{Cov}_t(m_{t+1}, R^j_{t+1})$$
$$\Rightarrow E_t[R^j_{t+1}] - R^f_{t+1} = -R^f_{t+1} \mathrm{Cov}_t(m_{t+1}, R^j_{t+1})$$
Writing the sdf explicitly and using net returns instead of gross returns ($R = 1 + r$),
$$E_t[r^j_{t+1}] - r^f_{t+1} = -(1 + r^f_{t+1}) \frac{\delta}{U'(c_t)} \mathrm{Cov}_t(U'(c_{t+1}), r^j_{t+1}) \tag{8.8}$$
Again, the same intuition applies. If an asset has a positive covariance with marginal
utility (negative covariance with consumption), its risk premium will be low. Investors
are willing to hold this asset at a low return (high price) because it smooths consumption.
On the other hand, if the asset has a high correlation with consumption (pays off well
when you are wealthy, pays off badly when you are poor), it will only contribute to making
consumption more volatile, thus investors require a higher return premium to hold it
(lower price).
Check exercise 52. ◃
8.6 Consumption CAPM (CCAPM)
To give a more familiar look to equation (8.8), we can specialize the consumption model
to the case of quadratic utility. This leads to the Consumption Capital Asset Pricing
Model (CCAPM).[2]
Footnote 1: Insurance is an extreme example. We are happy to hold insurance even though its expected
return is negative.
Footnote 2: Breeden (1979) derives the model in continuous time, which amounts to assuming that only
the first two moments of returns matter.
Assume $U(c) = ac - \frac{b}{2}c^2$. It follows that $U'(c) = a - bc$. Substituting into (8.8),
$$E_t[r^j_{t+1}] - r^f_{t+1} = -(1 + r^f_{t+1}) \frac{\delta}{a - bc_t} \mathrm{Cov}_t(a - bc_{t+1}, r^j_{t+1})$$
$$= \frac{(1 + r^f_{t+1})\delta b}{a - bc_t} \mathrm{Cov}_t(c_{t+1}, r^j_{t+1}) \tag{8.9}$$
Denote by $r^{\hat{c}}$ the return on the portfolio most highly correlated with consumption
growth. Since this is a traded security, it must also satisfy equation (8.9),
$$E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1} = \frac{(1 + r^f_{t+1})\delta b}{a - bc_t} \mathrm{Cov}_t(c_{t+1}, r^{\hat{c}}_{t+1})$$
$$\Leftrightarrow \frac{(1 + r^f_{t+1})\delta b}{a - bc_t} = \frac{E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1}}{\mathrm{Cov}_t(c_{t+1}, r^{\hat{c}}_{t+1})}$$
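Replacing back into (8.9),
$$E_t[r^j_{t+1}] - r^f_{t+1} = \frac{E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1}}{\mathrm{Cov}_t(c_{t+1}, r^{\hat{c}}_{t+1})} \mathrm{Cov}_t(c_{t+1}, r^j_{t+1})$$
$$= \frac{\mathrm{Cov}_t(c_{t+1}, r^j_{t+1}) / \mathrm{Var}_t(c_{t+1})}{\mathrm{Cov}_t(c_{t+1}, r^{\hat{c}}_{t+1}) / \mathrm{Var}_t(c_{t+1})} \left( E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1} \right)$$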
Defining the consumption beta of security i to be
$$\beta_{i,c} \equiv \frac{\mathrm{Cov}_t(r^i_{t+1}, c_{t+1})}{\mathrm{Var}_t(c_{t+1})}$$
we get the CCAPM:
$$E_t[r^j_{t+1}] - r^f_{t+1} = \frac{\beta_{j,c}}{\beta_{\hat{c},c}} \left( E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1} \right) \tag{8.10}$$
To interpret this equation, suppose $\beta_{\hat{c},c} = 1$ ($\hat{c}$ mimics c perfectly). We get a direct
analogue to the CAPM,
$$E_t[r^j_{t+1}] - r^f_{t+1} = \beta_{j,c} \left( E_t[r^{\hat{c}}_{t+1}] - r^f_{t+1} \right)$$
However, the market risk premium is now replaced by the excess return on the consump-
tion portfolio and the relevant risk measure is the consumption beta of j. A security
with high consumption beta must have a high expected return. This is because it pays
off well when consumption is already high (low marginal utility), but pays off badly
when consumption is low (high marginal utility). Hence, we get the same intuition as
in (8.8).
8.7 The CAPM reloaded
Consumption asset pricing has not been very successful empirically, presumably because
the sdf depends on marginal utility ($m_{t+1} = \delta U'(c_{t+1})/U'(c_t)$), which is not easy to
measure empirically. We don’t know the true utility function, nor the values of its
parameters, and even consumption data have their problems. Beta asset pricing models
(CAPM, APT) thus have the upper hand in empirical applications nowadays.
However, all asset pricing models are nested in the fundamental asset pricing equation
(8.2). The models differ by proposing different, easier to measure, proxies for marginal
utility.
The CAPM is the special case where
$$m_{t+1} = a - bR^M_{t+1} \tag{8.11}$$
Marginal utility is proxied by the return on the market portfolio. In the CAPM, the
investor holds the market portfolio, hence a higher $R^M_{t+1}$ allows for higher consumption,
which means lower marginal utility. It is the return on the market that describes whether
the typical investor is happy or unhappy. $R^M_{t+1}$ is perfectly negatively correlated with
$m_{t+1}$. Schematically,
R^M_{t+1}    c_{t+1}    U'(c_{t+1})    m_{t+1}
   ↗            ↗            ↘            ↘
   ↘            ↘            ↗            ↗
To show that (8.11) implies the CAPM pricing relation (SML), start by writing the
fundamental pricing equation in return form, as in the derivation of (8.8):
$$1 = E_t\left[m_{t+1} R^j_{t+1}\right]$$
$$\Rightarrow E_t[R^j_{t+1}] - R^f_{t+1} = -R^f_{t+1} \mathrm{Cov}_t(m_{t+1}, R^j_{t+1})$$
For $m_{t+1} = a - bR^M_{t+1}$, we get
$$E_t[R^j_{t+1}] - R^f_{t+1} = -R^f_{t+1} \mathrm{Cov}_t(a - bR^M_{t+1}, R^j_{t+1})$$
$$= R^f_{t+1}\, b\, \mathrm{Cov}_t(R^M_{t+1}, R^j_{t+1})$$
Since this model applies to any asset, it also applies to the Market itself:
$$E_t[R^M_{t+1}] - R^f_{t+1} = R^f_{t+1}\, b\, \mathrm{Cov}_t(R^M_{t+1}, R^M_{t+1})$$
$$\Leftrightarrow \frac{E_t[R^M_{t+1}] - R^f_{t+1}}{\mathrm{Var}_t(R^M_{t+1})} = R^f_{t+1}\, b$$
Replacing in the previous equation for any asset j,
$$E_t[R^j_{t+1}] - R^f_{t+1} = \frac{E_t[R^M_{t+1}] - R^f_{t+1}}{\mathrm{Var}_t(R^M_{t+1})} \mathrm{Cov}_t(R^M_{t+1}, R^j_{t+1})$$
$$= \frac{\mathrm{Cov}_t(R^M_{t+1}, R^j_{t+1})}{\mathrm{Var}_t(R^M_{t+1})} \left( E_t[R^M_{t+1}] - R^f_{t+1} \right)$$
which is the CAPM. In the more standard notation with beta, net (instead of gross)
returns, and stressing that R is for any asset j,
$$E[r^j] - r^f = \beta_j \left( E[r^M] - r^f \right), \quad \forall j$$
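The derivation can be checked on a small discrete example. The sketch below is an illustration under made-up inputs: the state probabilities, market returns, risk-free rate, and asset payoff are all hypothetical. It calibrates a and b in $m = a - bR^M$ so that the SDF prices the risk-free asset and the market, prices an arbitrary payoff with $p = E[mx]$, and confirms that the resulting expected return lies on the SML.

```python
# Hypothetical three-state economy, for illustration only.
probs = [0.25, 0.50, 0.25]
market = [0.90, 1.10, 1.30]    # gross market return R^M by state
gross_rf = 1.02                # gross risk-free return R^f
payoff = [1.0, 2.0, 4.0]       # payoff of some asset j

def expect(values):
    return sum(p * v for p, v in zip(probs, values))

def cov(u, v):
    return expect([a * b for a, b in zip(u, v)]) - expect(u) * expect(v)

e_rm = expect(market)
var_rm = cov(market, market)

# From the text: R^f * b = (E[R^M] - R^f) / Var(R^M), and E[m] = 1 / R^f.
b = (e_rm - gross_rf) / (gross_rf * var_rm)
a = 1.0 / gross_rf + b * e_rm
sdf = [a - b * rm for rm in market]

# Price asset j with p = E[m x], then form its gross returns.
price_j = expect([m * x for m, x in zip(sdf, payoff)])
ret_j = [x / price_j for x in payoff]

beta_j = cov(market, ret_j) / var_rm
lhs = expect(ret_j) - gross_rf       # expected excess return of asset j
rhs = beta_j * (e_rm - gross_rf)     # beta times the market risk premium

print(f"a = {a:.4f}, b = {b:.4f}")
print(f"E[R_j] - Rf = {lhs:.6f}")
print(f"beta * (E[R_M] - Rf) = {rhs:.6f}")
```

Changing `payoff` changes the price and the beta, but the two printed premiums should continue to agree, since the SML follows from $1 = E[mR]$ for any asset priced by this m.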
Alternative proof (Cochrane, 2005)
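To show that (8.11) implies the CAPM pricing relation (SML), we start by determining
the constants a and b. In this proof the SDF is written as $m = a + bR^M$, so b here has the
opposite sign to the b in (8.11) and will turn out to be negative. First, the model must
price the risk-free asset:
$$1 = E[mR^f]$$
$$\Rightarrow 1 = E[(a + bR^M)R^f]$$
$$\Rightarrow a = \frac{1 - bR^f E[R^M]}{R^f}$$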
Second, since the model applies to any asset, it also applies to $R^M$ itself:
$$1 = E[mR^M]$$
$$\Rightarrow 1 = E[(a + bR^M)R^M]$$
$$\Rightarrow 1 = a E[R^M] + b E[(R^M)^2]$$
Using the previous expression for a and the fact that $\mathrm{Var}(x) = E[x^2] - (E x)^2$,
$$\Rightarrow 1 = \frac{1 - bR^f E[R^M]}{R^f} E[R^M] + b E[(R^M)^2]$$
$$\Rightarrow R^f = (1 - bR^f E[R^M]) E[R^M] + bR^f E[(R^M)^2]$$
$$\Rightarrow R^f - E[R^M] = -bR^f (E[R^M])^2 + bR^f E[(R^M)^2]$$
$$\Rightarrow b = -\frac{E[R^M] - R^f}{R^f \mathrm{Var}[R^M]}$$
We can now show that the fundamental asset pricing equation with $m = a + bR^M$
implies the CAPM. Starting from (8.5), using (8.6), and the expression above for b,
$$1 = E[mR]$$
$$\Rightarrow 1 = b\, \mathrm{Cov}(R^M, R) + E[R]/R^f$$
$$\Rightarrow 1 = -\frac{E[R^M] - R^f}{R^f \mathrm{Var}[R^M]} \mathrm{Cov}(R^M, R) + E[R]/R^f$$
$$\Rightarrow E[R] - R^f = \frac{\mathrm{Cov}(R^M, R)}{\mathrm{Var}[R^M]} \left( E[R^M] - R^f \right)$$
which is the CAPM. In the more standard notation with beta, net (instead of gross)
returns, and stressing that R is for any asset j,
$$E[r^j] - r^f = \beta_j \left( E[r^M] - r^f \right), \quad \forall j$$
8.8 Conclusion
The fundamental asset pricing equation,
$$p = E[mx], \qquad m = \delta U'(c_{t+1})/U'(c_t)$$
is the basic framework that should be able to answer all asset
pricing questions. However, if we specify the model to quadratic utility (CCAPM) or
even power utility, the model does not match the empirical stock returns data.
Hence, beta or factor pricing models (CAPM, APT, FF3) are currently better empirical
alternatives. The point to note is that these beta models are specific cases of the
general consumption framework. They are just using proxies for marginal utility that
are easier to measure. For instance, the CAPM is the special case where $m = a - bR^M$.
8.9 Exercises
Ex. 49 — In a 2 period consumption model, the investor chooses the quantity (x) of
the security to buy today (t) to maximize the utility of consumption (c). The problem
is thus:
$$\max_{x} \; E_t[U(c_t) + \delta U(c_{t+1})]$$
$$\text{s.t.} \quad c_t = e_t - x p_t$$
$$c_{t+1} = e_{t+1} + x v_{t+1}$$
where $p_t$ is the price of the security, $v_{t+1}$ its terminal payoff, and e is an exogenous
endowment the investor receives each period.
1. Write the first-order condition.
2. Write the second-order condition. What condition on the utility function will
ensure that we are at a maximum?
3. Compute the pricing kernel for the utility function $U(c) = \frac{c^{1-\gamma}}{1-\gamma}$.
Ex. 50 — There are 3 states of nature and 3 complex securities. The payoff matrix is:
         Payoff                       Prob
State    Asset 1  Asset 2  Asset 3    π(s)
s = 1    3        4        1          0.25
s = 2    0        2        1          0.50
s = 3    0        0        1          0.25
Price    1.2      1.8      0.7
1. Is the market complete?
2. Compute the Arrow-Debreu prices.
3. Consider a new fourth security with payoff = [2, 10, 4]’. Compute its price.
4. Compute the value of the pricing kernel at each state.
5. Compute the price of the new fourth security defined above using the pricing
kernel (recall p = E[m×payoff]).
Ex. 51 — The fundamental pricing equation can be written in return form as 1 =
$E_t[m_{t+1} R_{t+1}]$, where $m_{t+1}$ is the pricing kernel and $R_{t+1}$ is the gross return.
1. A risk-free security costs 1 and pays a gross return of $R^f_{t+1}$, known at time t.
Write the pricing equation for this security.
2. Manipulate the fundamental pricing equation for a risky security to get the excess
return $E_t[R_{t+1}] - R^f_{t+1}$ (on the left-hand side) explained by the covariance between
marginal utility and returns (on the right-hand side).
3. Explain in words the economic meaning of the previous equation.
Ex. 52 — (Risk Premiums) There are 2 states of nature and 2 assets. The payoffs and
consumption in the next period can be:
Payoff
State Asset 1 Asset 2 Consumption Probability
s = 1 10 20 100 0.5
s = 2 20 10 150 0.5
Consumption today is $c_0 = 100$. The representative investor has log utility and is
indifferent between consuming the same amount today or in 1 period.
1. Use the fundamental asset pricing equation to compute the price of the two assets.
2. Using only words, provide intuition for why one asset is more expensive than the
other.
3. Now use equations and numbers to explain rigorously the price differences. (Hint:
manipulate the fundamental equation so that price equals two terms: the first is
an “intuitive” price; the second is a risk adjustment. Compute the values and
explain in words what the numbers mean.)
Ex. 53 — “The stochastic discount factor is always positive.” True or False?
Ex. 54 — Consider a 2 period consumption model. The investor chooses the quantity
of the stock ($z_s$) and the quantity of a risky corporate bond ($z_b$) to buy today (t). The
problem is thus:
$$\max_{z_s, z_b} \; U(c_t) + E_t[\delta U(c_{t+1})] \tag{8.12}$$
$$\text{s.t.} \quad c_t + z_s p^s_t + z_b p^b_t = W_t$$
$$c_{t+1} = z_s x^s_{t+1} + z_b x^b_{t+1}$$
where $p^j_t$ is the price of the security and $x^j_{t+1}$ is the total payoff (j = s, b). $W_t$ is the
exogenous initial wealth of the investor.
1. Write the first order conditions for this problem.
2. Assume U(c) = ln(c). Write the pricing kernel for this utility function.
3. Assume δ = 0.99 and $c_t = 1000$. There are 4 possible states of nature tomorrow.
Consumption and the payoffs of the bond are the following:
State(s)   Prob[s]   c_{t+1}   x^b_{t+1}
1          0.1       900       0
2          0.2       1000      95
3          0.5       1100      100
4          0.2       1200      100
Compute the price of the bond, $p^b_t$.
Chapter 9
Conclusion
Overview of asset pricing frameworks, models, and applications.
[Diagram: overview of asset pricing frameworks, models, and applications. The recoverable labels are:
Frameworks: max E[U(Y)] + Equilibrium; Factor Model + No Arbitrage; Complete Market + No Arbitrage; max E[U(c)] + Equilibrium.
Models: CAPM (p = E[x]/(1 + E[r]), with E[r] from the SML); APT (p = E[x]/(1 + E[r]), ex: E[r] from FF3); AD pricing (p = Σ x(s) p_ad(s)); RN pricing (p = E^Q[x]/(1 + r_f)); General SDF (p = E[mx]); CCAPM.
Arrow labels: U'(c) = a − bc; m = p_ad./π; m = a − bR^M; f = r_M.
Applications: 1. Stock pricing, 2. Corporate Projects, 3. Fund Performance; 1. Covariance Matrix, 2. Hedging strategies; Derivatives pricing; Future AP models.]
“Mind what you have learned. Save you it can.”
Bibliography
Bodie, Z., A. Kane, and A. Marcus, 2005, Investments. McGraw-Hill, 6th edn.
Breeden, D., 1979, “An Intertemporal Asset Pricing Model with Stochastic Consumption
and Investment Opportunities,” Journal of Financial Economics, 7, 265–296.
Chen, N.-F., R. Roll, and S. A. Ross, 1986, “Economic Forces and the Stock Market,”
Journal of Business, 59, 383–403.
Chiang, A. C., 1984, Fundamental Methods of Mathematical Economics. McGraw-Hill.
Cochrane, J. H., 2005, Asset Pricing. Princeton University Press.
Cvitanić, J., and F. Zapatero, 2004, Introduction to the Economics and Mathematics of
Financial Markets. The MIT Press.
Danthine, J.-P., and J. B. Donaldson, 2005, Intermediate Financial Theory. Elsevier
Academic Press, 2nd edn.
Fama, E. F., and K. R. French, 1993, “Common Risk Factors in the Returns on Stocks
and Bonds,” Journal of Financial Economics, 33, 3–56.
Fama, E. F., and K. R. French, 1996, “Multifactor explanations of Asset Pricing Anomalies,”
Journal of Finance, 51(1), 55–84.
Huang, C.-f., and R. H. Litzenberger, 1988, Foundations for Financial Economics.
Prentice-Hall.
Ingersoll, J. E., 1987, Theory of Financial Decision Making. Rowman and Littlefield.
Jagannathan, R., and E. R. McGrattan, 1995, “The CAPM Debate,” Federal Reserve
Bank of Minneapolis Quarterly Review, 19(4), 2–17.
J.P. Morgan, 1996, Risk Metrics — Technical Document. J.P. Morgan.
Malkiel, B. G., 2005, “Reflections on the Efficient Market Hypothesis: 30 Years Later,”
Financial Review, 40, 1–9.
Roll, R., 1977, “A Critique of the Asset Pricing Theory’s Test — Part I: On past and
potential testability of the theory,” Journal of Financial Economics.
Ross, S. A., 1976, “The Arbitrage Theory of Capital Asset Pricing,” Journal of Economic
Theory, 13, 341–360.
Appendix A
Background Review
A.1 Math Review
A.1.1 Logarithm and Exponential
Definition. $\ln x = y \Leftrightarrow e^y = x$
The log function is increasing,
$$(\ln x)' = 1/x > 0$$
and concave,
$$(\ln x)'' = (1/x)' = -x^{-2} = -1/x^2 < 0$$
Plot it:
[Blank axes for the plot: x (horizontal), y (vertical).]
The exponential function $y = e^x$ is increasing, but not concave. Since we will be
interested in increasing and also concave functions, we will use $y = -e^{-x}$. Plot $y = e^x$,
$y = e^{-x}$, and $y = -e^{-x}$:
[Blank axes for the plot: x (horizontal), y (vertical).]
A.1.2 Derivatives
Basic rules
Let f and g be functions of x. Let a be a constant.
Function     Derivative
$af$         $af'$
$fg$         $f'g + fg'$
$f/g$        $(f'g - fg')/g^2$
$f^a$        $af'f^{a-1}$
$e^f$        $f'e^f$
$a^g$        $g'a^g \ln a$
$f^g$        $gf'f^{g-1} + g'f^g \ln f$
$\ln x$      $1/x$
$\ln f$      $f'/f$
Chain rule
$$\frac{df(g(x))}{dx} = \frac{df}{dg}\frac{dg}{dx} = f'(g)\,g'(x)$$
The following examples are from Chiang (1984, p.170).
Example A.1.1. Let $z = 3y^2$, with $y = 2x + 5$. Note that a change in x
causes a change in y, which in turn causes a change in z, like a chain reaction,
hence the name. Applying the rule,
$$\frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx} = (6y) \times 2 = 12(2x + 5)$$
We can check this result by replacing $y = 2x + 5$ in z, and computing dz/dx
directly: $dz/dx = 24x + 60$.
The rule becomes useful for complicated functions like:
Example A.1.2. With $z = (x^2 + 3x - 2)^{17}$, computing dz/dx directly would
require a lot of work. Instead, we set $z = y^{17}$, with $y = x^2 + 3x - 2$, and apply
the chain rule:
$$\frac{dz}{dx} = \frac{dz}{dy}\frac{dy}{dx} = 17y^{16}(2x + 3) = 17(x^2 + 3x - 2)^{16}(2x + 3)$$
Implicit Function Theorem
Given the equation $f(x, y) = 0$, then
$$\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}$$
The following examples are from Chiang (1984, p.208).
Example A.1.3. Let $f(x, y) := y - 3x^4 = 0$ (which implicitly defines $y = 3x^4$).
Applying the IFT,
$$\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} = 12x^3$$
which can be checked by computing directly
$$\frac{dy}{dx} = \frac{d(3x^4)}{dx} = 12x^3$$
The rule becomes useful for complicated equations like:
Example A.1.4. Let $f(x, y, w) := y^3x^2 + w^3 + yxw - 3 = 0$. Note that we
cannot write explicitly $y = y(x, w)$. Still, we can use the IFT to compute
$$\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y} = -\frac{2y^3x + yw}{3y^2x^2 + xw}$$
Taylor expansion
$$f(x) = f(a) + f'(a)(x - a) + \frac{1}{2}f''(a)(x - a)^2 + \cdots + \frac{1}{n!}f^{(n)}(a)(x - a)^n + \ldots$$
This allows us to express a (sufficiently smooth) function f as a polynomial.
The following examples are from Chiang (1984, p.259).
Example A.1.5. Consider the quadratic function $f(x) = 5 + 2x + x^2$. Note
that this is already a polynomial, so Taylor’s rule will take us back to the original
function. Just for illustration:
$$f(x) = 5 + 2a + a^2 + (2 + 2a)(x - a) + \tfrac{1}{2} \times 2(x - a)^2 + 0 = 5 + 2x + x^2$$
Taylor’s rule is typically used to approximate a function by a low-degree polynomial.
Example A.1.6. We can use Taylor’s rule to approximate the quadratic function
in the previous example by a linear function:
$$f(x) \approx 5 + 2a + a^2 + (2 + 2a)(x - a)$$
For example, around a = 1 we have the approximation $f(x) \approx 4x + 4$. (Plot
it!) ◃
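To get a feel for how good the linear approximation is, the short sketch below compares f with its first-order Taylor expansion around a = 1 at a few points. The function names are ours. For this quadratic the approximation error is exactly $(x - a)^2$, the omitted second-order term.

```python
def f(x):
    return 5 + 2 * x + x ** 2

def f_prime(x):
    return 2 + 2 * x

def linear_taylor(x, a):
    """First-order Taylor approximation of f around a."""
    return f(a) + f_prime(a) * (x - a)

a = 1.0
for x in (0.5, 0.9, 1.0, 1.1, 2.0):
    approx = linear_taylor(x, a)
    print(f"x = {x:.1f}  f(x) = {f(x):.4f}  approx = {approx:.4f}  "
          f"error = {f(x) - approx:.4f}")
```

The approximation is close near a and deteriorates quickly as x moves away from it.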
A.1.3 Optimization
The following examples are from Chiang (1984, p.370).
Consider the following problem:
$$\max_{x_1, x_2} \; x_1 x_2 + 2x_1$$
$$\text{s.t.} \quad 4x_1 + 2x_2 = 60$$
(This can be seen as maximizing the utility of consuming two goods, subject to a budget
restriction.)
There are two equivalent ways to solve this problem:
Option 1: Substitution Substituting the constraint $x_2 = 30 - 2x_1$ into the function,
we get
$$\max_{x_1} \; x_1(30 - 2x_1) + 2x_1$$
The first order condition for an optimum is
$$\frac{d\left(x_1(30 - 2x_1) + 2x_1\right)}{dx_1} = 0 \Rightarrow x_1 = 8$$
and thus $x_2 = 14$.
Option 2: Lagrangian For more complicated problems, the Lagrangian is more
useful:
$$L = x_1 x_2 + 2x_1 + \lambda(4x_1 + 2x_2 - 60)$$
We now have 3 foc:
$$\begin{cases} dL/dx_1 = x_2 + 2 + 4\lambda = 0 \\ dL/dx_2 = x_1 + 2\lambda = 0 \\ dL/d\lambda = 4x_1 + 2x_2 - 60 = 0 \end{cases} \Leftrightarrow \begin{cases} x_1 = 8 \\ x_2 = 14 \\ \lambda = -4 \end{cases}$$
As expected, the solution is the same.
Note that we can also write the lagrangian as
$$L = x_1 x_2 + 2x_1 - \lambda(4x_1 + 2x_2 - 60)$$
or
$$L = x_1 x_2 + 2x_1 + \lambda(-4x_1 - 2x_2 + 60)$$
These will change the sign of the multiplier, λ, but will not change the values of the
choice variables, $x_1$ and $x_2$.
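Both solution methods can be sanity-checked numerically. The sketch below is only an illustration, not how one would solve such problems in practice: it does a crude grid search along the budget line (the grid step of 0.01 is an arbitrary choice) and then plugs the reported solution into the three first-order conditions of the Lagrangian.

```python
def objective(x1, x2):
    return x1 * x2 + 2 * x1

def x2_from_budget(x1):
    """Solve the constraint 4*x1 + 2*x2 = 60 for x2."""
    return (60 - 4 * x1) / 2

# Crude grid search over x1 in [0, 15] along the budget line.
grid = [i / 100 for i in range(0, 1501)]
best_x1 = max(grid, key=lambda x1: objective(x1, x2_from_budget(x1)))
best_x2 = x2_from_budget(best_x1)
print("grid search:", best_x1, best_x2, objective(best_x1, best_x2))

# First-order conditions of L = x1*x2 + 2*x1 + lam*(4*x1 + 2*x2 - 60)
x1, x2, lam = 8.0, 14.0, -4.0
foc = (x2 + 2 + 4 * lam, x1 + 2 * lam, 4 * x1 + 2 * x2 - 60)
print("foc residuals:", foc)
```

All three residuals are zero at the reported solution, and the grid search lands on the same point.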
A.1.4 Means and Variances
For an intuitive review of random variables and their moments, consider the following
returns on two stocks:
month   r_a     r_b
1       0       0
2       0.05    0.1
3       0       0
4       -0.05   -0.1
5       0       0
Plot these time series. ◃
We can easily see that
$$E[r_a] = E[r_b] = 0$$
and
$$\mathrm{Var}[r_a] < \mathrm{Var}[r_b]$$
Furthermore, it should be clear that the two stocks are perfectly correlated,
$$\rho(a, b) := \frac{\mathrm{Cov}(a, b)}{\sqrt{\mathrm{Var}(a)\,\mathrm{Var}(b)}} = 1$$
As an exercise, assume that each value is equally likely (ie, each observation has 0.2
probability) and compute the variances, the covariance, and check that the correlation
coefficient is indeed 1.
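Once you have worked the exercise by hand, the following sketch can be used to check the numbers. It treats the five observations as equally likely outcomes (probability 0.2 each), as the exercise suggests; the helper names `mean` and `cov` are ours.

```python
from math import sqrt

r_a = [0.0, 0.05, 0.0, -0.05, 0.0]
r_b = [0.0, 0.10, 0.0, -0.10, 0.0]
prob = 1 / len(r_a)   # each observation equally likely (0.2)

def mean(x):
    return sum(prob * v for v in x)

def cov(x, y):
    mx, my = mean(x), mean(y)
    return sum(prob * (a - mx) * (b - my) for a, b in zip(x, y))

var_a = cov(r_a, r_a)
var_b = cov(r_b, r_b)
cov_ab = cov(r_a, r_b)
rho = cov_ab / sqrt(var_a * var_b)

print(f"Var[r_a] = {var_a:.4f}, Var[r_b] = {var_b:.4f}")
print(f"Cov = {cov_ab:.4f}, rho = {rho:.4f}")
```

Since r_b is exactly twice r_a in every month, the variance of b is four times that of a and the correlation is one.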
A.2 Undergraduate Finance Review
A good reference for undergraduate finance is Bodie, Kane, and Marcus (2005).
A.2.1 Financial Markets and Instruments
Money Market
Short-term market. Instruments:
• Treasury Bills
• Certificates of Deposit
• Commercial paper
• LIBOR market
• EURIBOR (Euro Interbank Offered Rate)
Bonds
Bonds are debt instruments. The typical bond has a fixed (known) coupon rate and
is fully amortized at maturity. Bonds can be issued by corporations and governments
(Treasury Bills, T. Notes, T. Bonds).
Stocks
Stocks represent ownership in a corporation. Shareholders vote to elect the board at
an annual meeting. Each stock receives a (variable, unknown) dividend each year (or
quarter). However, a stock is the residual claim on the value of the corporation, meaning
that shareholders will only receive a dividend after all other liabilities have been paid.
Stock Indexes
Uses:
• Track average returns.
• Comparing performance of managers.
• Base of derivatives
Examples:
• Dow Jones Industrial Average (30 Stocks)
• Standard & Poors 500 Composite
• NASDAQ Composite
• Nikkei 225
• FTSE
• Dax
• PSI20
Derivatives
Examples: Forward, Futures, Options, Swaps, etc. Value depends on underlying asset.
Used to hedge risks or speculate.
Short selling
Purpose: to profit from a decline in the price of a stock or security.
Mechanics:
1. Borrow stock through a dealer.
2. Sell it and deposit proceeds and margin in an account.
3. To close out the position: buy the stock and return to the party from which it
was borrowed.
Short Selling Puzzle. Most stocks are easy to short sell. However, investors do very
little short selling.
A.2.2 Time value of money
$1 today is worth more than $1 tomorrow. Assume a risk-free interest rate of 5% per
year. The Present Value of $1 to be received for sure in one year is
$$PV = \frac{\$1}{1.05} = \$0.95 < \$1$$
We are indifferent between receiving $0.95 today or $1 in one year.
Receiving $10 per year for the next 2 years is equivalent to having today
$$PV = \frac{\$10}{1.05} + \frac{\$10}{1.05^2}$$
Example A.2.1. A one-month TBill sells for 99.6737% (of par value). The
one-month risk-free interest rate is
$$99.6737 = \frac{100}{1 + r_{1m}} \Rightarrow r_{1m} = 0.003274 = 0.3274\% = 32.74\text{bps}$$
Interest rates are usually expressed on an annual basis. There are two options:
Annual Percentage rate: $r^{APR}_{1m} = 0.3274\% \times 12 = 3.9288\%$
Equivalent Annual rate: $r^{EAR}_{1m} = (1 + 0.003274)^{12} - 1 = 4\%$
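The conversions in Example A.2.1 can be reproduced as follows. This is a small sketch with variable names of our own; tiny differences from the figures in the text can arise because the text rounds the monthly rate before annualizing.

```python
price = 99.6737   # T-bill price, in % of par
par = 100.0

r_month = par / price - 1          # one-month rate
apr = 12 * r_month                 # annual percentage rate (simple scaling)
ear = (1 + r_month) ** 12 - 1      # equivalent annual rate (compounding)

print(f"monthly rate: {r_month:.6f} ({r_month * 1e4:.2f} bps)")
print(f"APR: {apr:.4%}")
print(f"EAR: {ear:.4%}")
```

The EAR exceeds the APR because it accounts for interest earned on interest over the twelve months.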
Some useful formulas:
Annuity Present value (t = 0) of $1 received during T periods (from t = 1 to t = T),
discounted at a rate of r:
$$AF(T, r) := \frac{1 - (1 + r)^{-T}}{r}$$
Example A.2.2. An 8-yr Treasury Bond pays annual coupons at 6%. The
risk-free term structure is flat at 5%. The price of the Bond is
$$P = 6\% \times AF(8, 5\%) + \frac{100\%}{1.05^8}$$
Perpetuity Present value (t = 0) of $c received forever (from t = 1 to ∞), discounted
at an interest rate of r:
PV = c/r
Perpetuity with growth Present value (t = 0) of $\{c_t\}_{t=1}^{\infty}$ with $c_{t+1} = c_t(1 + g)$,
discounted at an interest rate of r (with r > g):
$$PV = c_1/(r - g)$$
Example A.2.3. A stock will pay a dividend of $2 in one year. Dividends are
expected to grow at 6% forever. The required return on the stock is 10%. Its
fair value today is
$$P = \frac{2}{0.10 - 0.06} = \$50$$
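The three formulas above translate directly into code. The sketch below uses function names of our own choosing, applies them to Examples A.2.2 and A.2.3, and cross-checks the annuity factor against a brute-force sum of discount factors.

```python
def annuity_factor(periods, rate):
    """PV of 1 received at the end of each of `periods` periods."""
    return (1 - (1 + rate) ** (-periods)) / rate

def perpetuity(cash_flow, rate):
    """PV of a constant cash flow received forever, starting in one period."""
    return cash_flow / rate

def growing_perpetuity(first_cash_flow, rate, growth):
    """PV of a cash flow growing at `growth` forever; requires rate > growth."""
    if rate <= growth:
        raise ValueError("rate must exceed growth")
    return first_cash_flow / (rate - growth)

# Example A.2.2: 8-year bond, 6% annual coupon, flat 5% term structure (% of par).
bond_price = 6.0 * annuity_factor(8, 0.05) + 100.0 / 1.05 ** 8

# Example A.2.3: dividend of 2 next year, 6% growth, 10% required return.
stock_value = growing_perpetuity(2.0, 0.10, 0.06)

# Cross-check: the annuity factor is the sum of the individual discount factors.
af_by_sum = sum(1 / 1.05 ** t for t in range(1, 9))

print(f"AF(8, 5%) = {annuity_factor(8, 0.05):.4f} (by summation: {af_by_sum:.4f})")
print(f"bond price = {bond_price:.4f}% of par")
print(f"stock value = {stock_value:.2f}")
```

The bond prices above par because its coupon rate is higher than the discount rate.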
A.2.3 Risk and Return
Risk-return tradeoff
Statistics on annual returns on US assets for 1926–2002 (in %):
Asset          Mean   Std Dev   Risk Premium
Small Stocks   17.7   39.3      13.9
Large Stocks   12.0   20.6       8.2
LT Gov Bonds    5.7    8.2       1.9
T-Bills         3.8    3.2       –
More risk is compensated with higher returns. But what exactly explains these risk
premiums? Most of this course is about explaining differences in risk premiums.
Other kinds of risk
Current research is trying to understand and model other sources of risk:
Liquidity risk: The risk of not being able to trade immediately at a fair price.
Credit risk: The risk of not receiving promised payments.
A.2.4 Equilibrium and No Arbitrage
Financial models can be classified into two categories: Equilibrium and Arbitrage.
Definition (Arbitrage). Arbitrage is the possibility to make money without any risk.
In financial markets there are no arbitrage opportunities. After all, it only takes
a few “sharks” to constantly monitor the markets and quickly eliminate any arbitrage
opportunity. Hence, in modeling financial markets, we always assume that there is no
arbitrage.
Pricing by arbitrage can only give relative values, i.e., it uses the (given) prices of
some basic assets to explain the prices of other securities (sometimes called “ketchup
economics”). Nonetheless, arbitrage models require fewer assumptions and are more
applicable in practice.
Definition (Equilibrium). The market for an asset is in equilibrium if the supply equals
the demand for that asset.
The demand is the result of many investors making optimal choices, i.e., buying the
quantity that optimizes their well-being (subject to some restrictions). In most cases of
financial models, the supply is taken as exogenous.
Equilibrium models aim for a complete theory of value, i.e., they start from primitives
(investors’ preferences, firms’ technology, market structure, etc.) and get to prices. The
goal is to understand how prices (or risk premiums) depend on the fundamental
characteristics of the economy. They are thus more general than arbitrage models, though
harder to implement.
Remark. If the market is in Equilibrium, then there are No Arbitrage opportunities
(everybody is maximizing, so there can be no easy way to make money). However,
the reverse is not true. Hence, Equilibrium is a stronger condition. The advantage of
requiring only No Arbitrage is that we need to make fewer assumptions.
Suggestion: read sections 2.1–2.3 in Danthine and Donaldson (2005) for an
introduction to the valuation methods we will be studying in this course.
Appendix B
Solutions to Problems
Answer (Ex. 3) — lover; averse.
Answer (Ex. 4) — To preserve those measures under linear transformations of the
utility function. If we used only the second derivative (e.g., \( ARA' = -U'' \)), then, for
example, \( \ln W \) and \( a + b \ln W \) would have different \( ARA' \) and \( RRA' \) (check this). This
is not desirable because \( \ln W \) and \( a + b \ln W \) represent exactly the same preferences (i.e.,
the same person).
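Answer (Ex. 5) — The indifference probability is such that
\[ U(Y) = \pi U(Y + \theta Y) + (1 - \pi) U(Y - \theta Y) \]
Expanding \( U(Y + \theta Y) \) and \( U(Y - \theta Y) \) in Taylor series around Y, we get
\[ U(Y + \theta Y) \approx U(Y) + \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \]
\[ U(Y - \theta Y) \approx U(Y) - \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \]
Replacing back in the previous equation and canceling terms produces the required
relation:
\[ U(Y) \approx \pi \left[ U(Y) + \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \right] + (1 - \pi) \left[ U(Y) - \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \right] \]
\[ \Rightarrow U(Y) \approx \pi \left[ U(Y) + \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) - U(Y) + \theta Y U'(Y) - \tfrac{1}{2}(\theta Y)^2 U''(Y) \right] + U(Y) - \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \]
\[ \Rightarrow 0 \approx \pi \left[ 2 \theta Y U'(Y) \right] - \theta Y U'(Y) + \tfrac{1}{2}(\theta Y)^2 U''(Y) \]
\[ \Rightarrow \pi \approx \frac{1}{2} - \frac{\theta^2 Y^2 U''(Y)}{4 \theta Y U'(Y)} \]
\[ \Rightarrow \pi \approx \frac{1}{2} + \frac{1}{4} \theta \, RRA(Y) \]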
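As a quick sanity check of this approximation, the sketch below compares it with the exact indifference probability for log utility (RRA = 1), where the exact value can be solved in closed form. The gamble size θ = 10% is a hypothetical choice for illustration, not a number from the exercise.

```python
import math


def exact_pi_log(theta):
    """Exact indifference probability for U = ln.

    Solves ln(Y) = pi*ln(Y(1+theta)) + (1-pi)*ln(Y(1-theta));
    the wealth level Y cancels out.
    """
    up = math.log(1 + theta)
    down = math.log(1 - theta)
    return -down / (up - down)


def approx_pi(theta, rra):
    """Second-order approximation: pi = 1/2 + (1/4) * theta * RRA."""
    return 0.5 + 0.25 * theta * rra


theta = 0.10  # hypothetical gamble: win or lose 10% of wealth
pi_exact = exact_pi_log(theta)
pi_approx = approx_pi(theta, rra=1.0)
print(f"exact: {pi_exact:.5f}, approximation: {pi_approx:.5f}")
```

For small θ the two values are very close; the gap widens as the gamble gets larger, since the Taylor expansion drops higher-order terms.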
Answer (Ex. 6) — .
Name          U(W) =              Restrictions on parameters            ARA                          RRA
Log           ln(W)               n/a                                   1/W (decreasing)             1 (constant)
Power         W^(1−γ)/(1−γ)       γ > 0                                 γ/W (decreasing)             γ (constant)
Exponential   −exp(−αW)           α > 0                                 α (constant)                 αW (increasing)
Quadratic     aW − bW^2           b > 0 (⇒ U′′ < 0)                     2b/(a − 2bW) (increasing(1)) 2bW/(a − 2bW) (increasing(1))
                                  W < a/(2b) (⇒ U′ > 0)
                                  a > 0 (⇒ U′ > 0 on W > 0)
(1) Check that dARA/dW > 0 and dRRA/dW > 0.
Answer (Ex. 7) — .
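1) \( U_2 \) is a linear transformation of \( U_1 \), hence represents the same preferences (see
proposition 2.2.1).
2) Use L’Hôpital’s rule to get
\[ \lim_{\gamma \to 1} \frac{W^{1-\gamma} - 1}{1 - \gamma} = \lim_{\gamma \to 1} \frac{\frac{d}{d\gamma}\left(W^{1-\gamma} - 1\right)}{\frac{d}{d\gamma}(1 - \gamma)} = \lim_{\gamma \to 1} \frac{-W^{1-\gamma} \ln W}{-1} = \ln(W) \]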
Answer (Ex. 8) — \( U' = 20Y \), \( U'' = 20 > 0 \), hence the investor is risk-loving. Most
investors demand a premium to bear risk (are risk-averse), hence this utility would not
be a reasonable assumption.
Answer (Ex. 9) — .
1) Constant RRA.
2) \( \ln(Y) \) or \( Y^{1-g}/(1-g) \)
Answer (Ex. 10) — .
1) The ARA measures the willingness to take gambles defined in absolute terms (money).
The RRA measures the willingness to take gambles defined as a percentage of wealth.
2) It means that as his wealth increases, he becomes less willing to take a gamble defined
in absolute terms. For example, consider a fair gamble of winning or losing $100. As
the investor’s wealth increases from, say, $1 to $2000, the investor becomes less willing
to risk $100. This does not seem a reasonable assumption. An example is \( U = aW - bW^2 \) (show that
\( dARA/dW > 0 \)).
3) \( U = \frac{W^{1-g}}{1-g} \), \( RRA = g \).
Answer (Ex. 11) — EU(Y + L) = U(Y + CE) ⇒ 0.3 ln 120 + 0.7 ln 80 = ln(100 +
CE) ⇒CE = −9.65. The investor is indifferent between playing the game or reducing
his wealth to 90.35 for sure.
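The certainty equivalent in Ex. 11 can be reproduced numerically. This is a minimal sketch using the numbers in the answer above (wealth of 100, a 30% chance of gaining 20, a 70% chance of losing 20, log utility); the variable names are illustrative.

```python
import math

wealth = 100.0
lottery = [(0.3, 20.0), (0.7, -20.0)]  # (probability, gain or loss)

# Expected utility of final wealth under U = ln
expected_utility = sum(p * math.log(wealth + x) for p, x in lottery)

# Invert U to find the sure wealth with the same utility, then subtract wealth
ce = math.exp(expected_utility) - wealth
expected_payoff = sum(p * x for p, x in lottery)

print(f"CE = {ce:.2f}, expected payoff = {expected_payoff:.2f}")
```

The certainty equivalent lies below the expected payoff of the gamble, as expected for a risk-averse investor.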
Answer (Ex. 12) — .
1) CE = 41.42
2) CE = 9.05
3) At higher values of wealth, the investor is less risk averse, hence he only trades the
gamble for a high value (41.42) closer to the expected value (50). At lower wealth levels
(1), the investor is more risk averse, thus willing to trade the gamble for a smaller sure
amount (9.05). Another way to say this is the following: At low wealth (1), marginal
utility is very high, that is, the investor is desperate for more food. Thus, if he owned
the gamble, he would be willing to sell it for a sure amount as low as 9.05. If instead he
was fat (Y = 100) and not so desperate for more food (low marginal utility), he would
not mind taking the gamble himself, that is, he would only sell it for a high price (41.42).
Answer (Ex. 14) — No. The sign reverses.
x    F3(x)   ∫0^x F3(s)ds   F4(x)   ∫0^x F4(s)ds   ∫0^x F4(s)ds − ∫0^x F3(s)ds
1    0       0              1/3     0              0       ≥ 0
3    0.25    0              1/3     2/3            2/3     ≥ 0
4    0.75    0.25           1/3     1              0.75    ≥ 0
6    0.75    1.75           2/3     5/3            −1/12   ≤ 0
8    0.75    3.25           3/3     3
12   1.00
Answer (Ex. 16) — The distribution of b is a mean-preserving spread of a. Thus,
a 2SD b. All risk-averse investors prefer a. This investor is risk averse (compute \( U'' \)),
hence also prefers a.
Answer (Ex. 17) — Possible answer: The investor chooses his portfolio allocation by
maximizing the expected utility of terminal wealth.
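Answer (Ex. 18) — The problem is \( \max_a E\left[\frac{Y_1^{1-\gamma}}{1-\gamma}\right] \), with \( Y_1 = Y_0(1 + r_f) + a(r - r_f) \).
The foc is
\[ \pi \left[Y_0(1 + r_f) + a(r_2 - r_f)\right]^{-\gamma}(r_2 - r_f) + (1 - \pi)\left[Y_0(1 + r_f) + a(r_1 - r_f)\right]^{-\gamma}(r_1 - r_f) = 0 \]
\[ \Rightarrow \left\{ \pi \left[Y_0(1 + r_f) + a(r_2 - r_f)\right]^{-\gamma}(r_2 - r_f) \right\}^{-1/\gamma} = \left\{ -(1 - \pi)\left[Y_0(1 + r_f) + a(r_1 - r_f)\right]^{-\gamma}(r_1 - r_f) \right\}^{-1/\gamma} \]
\[ \Rightarrow \left[Y_0(1 + r_f) + a(r_2 - r_f)\right] \cdot \left[\pi(r_2 - r_f)\right]^{-1/\gamma} = \left[Y_0(1 + r_f) + a(r_1 - r_f)\right] \cdot \left[-(1 - \pi)(r_1 - r_f)\right]^{-1/\gamma} \]
\[ \Rightarrow \left[Y_0(1 + r_f) + a(r_2 - r_f)\right] \cdot \left[(1 - \pi)(r_f - r_1)\right]^{1/\gamma} = \left[Y_0(1 + r_f) + a(r_1 - r_f)\right] \cdot \left[\pi(r_2 - r_f)\right]^{1/\gamma} \]
\[ \Rightarrow Y_0(1 + r_f)\left[(1 - \pi)(r_f - r_1)\right]^{1/\gamma} + a(r_2 - r_f)\left[(1 - \pi)(r_f - r_1)\right]^{1/\gamma} = Y_0(1 + r_f)\left[\pi(r_2 - r_f)\right]^{1/\gamma} + a(r_1 - r_f)\left[\pi(r_2 - r_f)\right]^{1/\gamma} \]
\[ \Rightarrow \frac{a}{Y_0(1 + r_f)} = \frac{\left[(1 - \pi)(r_f - r_1)\right]^{1/\gamma} - \left[\pi(r_2 - r_f)\right]^{1/\gamma}}{(r_1 - r_f)\left[\pi(r_2 - r_f)\right]^{1/\gamma} - (r_2 - r_f)\left[(1 - \pi)(r_f - r_1)\right]^{1/\gamma}} \]
which is the same as (5.4) in Danthine and Donaldson (2005). Plugging in the numbers
we get
\[ \frac{a}{Y_0} = 0.198 \]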
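The closed-form solution can be verified by substituting it back into the first-order condition. The sketch below does this in Python. The exercise's own inputs are not reproduced in this appendix, so the parameter values here are hypothetical and the result will not match the 0.198 reported above; the function names are also illustrative.

```python
def crra_risky_share(pi, r1, r2, rf, gamma):
    """a / (Y0 * (1 + rf)) from the closed-form solution.

    State 2 (probability pi) pays r2 > rf; state 1 (probability 1 - pi) pays r1 < rf.
    """
    A = (pi * (r2 - rf)) ** (1 / gamma)
    B = ((1 - pi) * (rf - r1)) ** (1 / gamma)
    return (B - A) / ((r1 - rf) * A - (r2 - rf) * B)


def foc(a, Y0, pi, r1, r2, rf, gamma):
    """Left-hand side of the first-order condition; zero at the optimum."""
    wealth_up = Y0 * (1 + rf) + a * (r2 - rf)
    wealth_down = Y0 * (1 + rf) + a * (r1 - rf)
    return (pi * wealth_up ** (-gamma) * (r2 - rf)
            + (1 - pi) * wealth_down ** (-gamma) * (r1 - rf))


# Hypothetical inputs, chosen only for illustration
Y0, pi, r1, r2, rf, gamma = 1.0, 0.5, -0.10, 0.30, 0.05, 2.0

a_star = crra_risky_share(pi, r1, r2, rf, gamma) * Y0 * (1 + rf)
residual = foc(a_star, Y0, pi, r1, r2, rf, gamma)
print(f"a* = {a_star:.4f}, foc residual = {residual:.2e}")
```

A residual of essentially zero confirms that the algebra above inverts the first-order condition correctly for these inputs.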
Answer (Ex. 19) — The foc is
\[ E\left[Y_0(r - r_f)/Y_1\right] = 0 \]
with \( Y_1 = Y_0(1 + r_f) + wY_0(r - r_f) \).
Applying the implicit function theorem we get
\[ \frac{d\hat{w}}{dY_0} = -\frac{\frac{d}{dY_0} E\left[Y_0(r - r_f)/Y_1\right]}{\frac{d}{dw} E\left[Y_0(r - r_f)/Y_1\right]} = \frac{E\left[(r - r_f)/Y_1 - Y_0(r - r_f)(1 + r_f + w(r - r_f))/Y_1^2\right]}{E\left[Y_0^2(r - r_f)^2/Y_1^2\right]} \]
\[ = \frac{E\left[(r - r_f)/Y_1 - (r - r_f)/Y_1\right]}{E\left[Y_0^2(r - r_f)^2/Y_1^2\right]} = 0 \]
Note that the denominator is strictly positive as long as \( r - r_f \neq 0 \) in some states, which
is always the case in reality. (In fact, if r is continuous, then \( Prob[r = r_f] = 0 \), and
thus the integral in the denominator is not affected by this event. This sentence is not
a required part of the course.)
Alternatively, note that the foc can be further simplified:
\[ E\left[Y_0(r - r_f)/Y_1\right] = E\left[\frac{r - r_f}{1 + r_f + w(r - r_f)}\right] = 0 \]
It does not depend on \( Y_0 \), hence we immediately get \( \frac{d\hat{w}}{dY_0} = -\frac{d\,foc/dY_0}{\ldots} = 0 \) (check the
denominator is not zero, just to be sure).
Answer (Ex. 20) — .
1) Decreasing.
2) \( U(Y) = \frac{Y^{1-g}}{1-g} \)
Answer (Ex. 21) — .
1) ARA = g, RRA = gY .
2) ARA is constant. Consider a gamble with only two possible outcomes expressed in
monetary units. With constant ARA, the probability of the good outcome the investor
requires to play the game does not depend on the wealth level. Constant ARA also
implies that, in a portfolio choice problem, the optimal amount invested in the risky
asset does not change with the wealth level.
Answer (Ex. 22) — .
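1)
\[ \max_a \; E\left[-\exp(-\alpha Y_1)\right] \]
with \( Y_1 = Y_0(1 + r_f) + a(r - r_f) \).
2) The risky asset becomes relatively less attractive (less excess return for the same
variance), hence a should decrease.
3) The foc is:
\[ E\left[\alpha(r - r_f)\exp(-\alpha Y_1)\right] = 0 \]
Using the Implicit Function Theorem,
\[ \frac{da(Y_0)}{dr_f} = -\frac{\partial E[\ldots]/\partial r_f}{\partial E[\ldots]/\partial a} = -\frac{E\left[-\alpha e^{-\alpha Y_1} - \alpha^2(r - r_f)(Y_0 - a)e^{-\alpha Y_1}\right]}{E\left[-\alpha^2(r - r_f)^2 e^{-\alpha Y_1}\right]} \]
\[ = \frac{\overbrace{E\left[\alpha e^{-\alpha Y_1}\right]}^{>0} + \alpha(Y_0 - a)\overbrace{E\left[\alpha(r - r_f)e^{-\alpha Y_1}\right]}^{=0 \text{ (foc)}}}{\underbrace{E\left[-\alpha^2(r - r_f)^2 e^{-\alpha Y_1}\right]}_{<0}} < 0 \]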
Answer (Ex. 23) — .
1) Nothing; depends on the particular utility function (example: positive for decreasing
ARA, but negative for increasing ARA)
2) \( da/dY_0 = 0 \), see the application of the IFT below equation (3.3).
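Answer (Ex. 24) — \( Y_1 \sim N\left(Y_0(1 + r_f) + a(\mu - r_f), \; a^2\sigma^2\right) \). The problem is thus
\[ \max_a \; -\exp\left\{-\gamma\left[Y_0(1 + r_f) + a(\mu - r_f)\right] + \tfrac{1}{2}\gamma^2 a^2 \sigma^2\right\} \]
foc:
\[ -\left\{-\gamma(\mu - r_f) + a\gamma^2\sigma^2\right\}\exp(\cdot) = 0 \]
Since \( \exp(x) > 0, \; \forall x \),
\[ \Rightarrow -\gamma(\mu - r_f) + a\gamma^2\sigma^2 = 0 \]
\[ \Rightarrow a = \frac{\mu - r_f}{\gamma\sigma^2} \]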
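The optimal amount can be checked by evaluating the objective directly around the closed-form solution. The sketch below uses hypothetical parameter values (none are given in this answer) and illustrative function names.

```python
import math


def cara_normal_objective(a, Y0, rf, mu, sigma, gamma):
    """Expected CARA utility when terminal wealth is normally distributed."""
    mean = Y0 * (1 + rf) + a * (mu - rf)
    variance = a ** 2 * sigma ** 2
    return -math.exp(-gamma * mean + 0.5 * gamma ** 2 * variance)


def optimal_amount(rf, mu, sigma, gamma):
    """Closed-form optimum: a = (mu - rf) / (gamma * sigma^2)."""
    return (mu - rf) / (gamma * sigma ** 2)


# Hypothetical inputs, for illustration only
Y0, rf, mu, sigma, gamma = 10.0, 0.02, 0.08, 0.20, 0.5

a_star = optimal_amount(rf, mu, sigma, gamma)
best = cara_normal_objective(a_star, Y0, rf, mu, sigma, gamma)
below = cara_normal_objective(a_star - 0.1, Y0, rf, mu, sigma, gamma)
above = cara_normal_objective(a_star + 0.1, Y0, rf, mu, sigma, gamma)
print(f"a* = {a_star:.2f}; objective at a*-0.1, a*, a*+0.1: {below:.6f}, {best:.6f}, {above:.6f}")
```

Note that `optimal_amount` does not take initial wealth as an argument: with constant ARA the amount invested in the risky asset does not depend on Y0.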
Answer (Ex. 25) — .
1) The returns are:
day   t   Stock A r_t   Stock B r_t
fri   0   –             –
mon   1    0.1000        0.1000
tue   2   -0.0909       -0.0909
wed   3    0.1000        0.2100
thu   4   -0.1818       -0.1818
fri   5    0.3333        0.3333
weekly returns   0.2000   0.3200
2) The return on the portfolio over this week is
\[ r_{p,week} = 0.4 \times 0.2 + 0.6 \times 0.32 = 27.2\% \]
3) \( N_{A,0} = N_{A,5} = 4{,}000/10 = 400 \) and \( N_{B,0} = 600 \). On Wednesday, we receive a
dividend of $1.1 × 600 = $660, which allows us to buy $660/$11 = 60 more shares of
stock B. Hence, \( N_{B,3} = N_{B,5} = 600 + 60 = 660 \). (We can check that the terminal value
of this portfolio is \( V_5 = 12 \times 400 + 12 \times 660 = \$12{,}720 \), which implies a weekly return of
12720/10000 − 1 = 27.2%.)
4) The adjusted prices (\( P^a_t \)) are:
day   t   Stock A P_t   Stock A P^a_t   Stock B P_t   Stock B P^a_t
fri   0   10            10              10            9.09
mon   1   11            11              11            10
tue   2   10            10              10            9.09
wed   3   11            11              11            11
thu   4   9             9               9             9
fri   5   12            12              12            12
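The Stock B columns can be reproduced with a few lines of code. This is a sketch assuming the $1.1 dividend is paid on Wednesday (t = 3), as stated in part 3; the backward adjustment used here is one common convention, chosen because it reproduces the adjusted prices in the table.

```python
prices_b = [10.0, 11.0, 10.0, 11.0, 9.0, 12.0]   # Stock B prices, t = 0..5
dividends_b = [0.0, 0.0, 0.0, 1.1, 0.0, 0.0]     # $1.1 dividend paid at t = 3

# Daily total returns: (P_t + D_t) / P_{t-1} - 1
returns = [(prices_b[t] + dividends_b[t]) / prices_b[t - 1] - 1
           for t in range(1, len(prices_b))]

# Weekly return from compounding the daily returns
weekly = 1.0
for r in returns:
    weekly *= 1 + r
weekly -= 1

# Adjusted prices: work backwards from the last price so that ratios of
# adjusted prices reproduce the total (dividend-inclusive) returns
adjusted = prices_b[:]
for t in range(len(prices_b) - 1, 0, -1):
    adjusted[t - 1] = adjusted[t] / (1 + returns[t - 1])

print([round(r, 4) for r in returns])
print(round(weekly, 4))
print([round(p, 2) for p in adjusted])
```

With adjusted prices, the weekly return can be read off directly as the last adjusted price divided by the first, minus one.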
Answer (Ex. 26) — We have \( E[W] = W_0(1 + \mu) \) and \( Var(W) = W_0^2\sigma^2 \). Thus,
\[ E[U] = a + bE[W] + cE[W^2] = a + bE[W] + c\left(Var(W) + (E[W])^2\right) = a + bW_0(1 + \mu) + c\left(W_0^2\sigma^2 + W_0^2(1 + \mu)^2\right) \]
Answer (Ex. 27) — .
1) Daily returns
Mean 0.0003
Stdev 0.0105
Skew(Nrm=0) -0.6292
Kurt(Nrm=3) 10.2897
Test[H0:Normal]
Jarque-Bera[Pvalue] 0.0000
Clearly, there are fat tails (high kurtosis). Normality is rejected (JB test; not covered
in class).
2) Monthly returns
Mean 0.0068
Stdev 0.0614
Skew(Nrm=0) -0.1338
Kurt(Nrm=3) 3.8117
Test[H0:Normal]
Jarque-Bera[Pvalue] 0.1418
At the monthly horizon, the problem is much less severe. There is also less skewness.
Normality is not rejected. (The JB test is asymptotic but we only have 145 monthly
observations. Statistical purists might ask for additional finite sample tests. We did not
cover any of these in class; I don’t expect you to know this.)
Answer (Ex. 28) — As done in section 4.4 of these notes.
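Answer (Ex. 29) — The lagrangian is
\[ L = \tfrac{1}{2} w'Vw - \lambda\left(w'\bar{r} + (1 - w'\mathbf{1})r_f - \mu\right) \]
and the first-order conditions are
\[ \frac{dL}{dw} = Vw - \lambda(\bar{r} - r_f\mathbf{1}) = 0 \quad (N \text{ eqns}) \]
\[ \frac{dL}{d\lambda} = w'\bar{r} + (1 - w'\mathbf{1})r_f - \mu = 0 \quad (1 \text{ eqn}) \]
The foc for w can be written as:
\[ w = \lambda V^{-1}(\bar{r} - r_f\mathbf{1}) \]
\[ \Rightarrow (\bar{r} - r_f\mathbf{1})'w = \lambda \underbrace{(\bar{r} - r_f\mathbf{1})'V^{-1}(\bar{r} - r_f\mathbf{1})}_{\equiv H} \]
\[ \Rightarrow (\bar{r} - r_f\mathbf{1})'w = \lambda H \]
The foc for λ implies
\[ w'(\bar{r} - r_f\mathbf{1}) + r_f - \mu = 0 \]
\[ \Rightarrow (\bar{r} - r_f\mathbf{1})'w = \mu - r_f \]
Plugging this expression for \( (\bar{r} - r_f\mathbf{1})'w \) into the previous equation, we find the value of
the multiplier:
\[ \mu - r_f = \lambda H \Rightarrow \lambda = \frac{\mu - r_f}{H} \]
Substituting this value of λ in the foc for w we get (4.11):
\[ w = \lambda V^{-1}(\bar{r} - r_f\mathbf{1}) \Rightarrow w = \frac{\mu - r_f}{H} V^{-1}(\bar{r} - r_f\mathbf{1}) \]
Additionally, we can also check that H is indeed as defined in the text:
\[ H \equiv (\bar{r} - r_f\mathbf{1})'V^{-1}(\bar{r} - r_f\mathbf{1}) \]
\[ = \bar{r}'V^{-1}(\bar{r} - r_f\mathbf{1}) - r_f\mathbf{1}'V^{-1}(\bar{r} - r_f\mathbf{1}) \]
\[ = \bar{r}'V^{-1}\bar{r} - r_f\bar{r}'V^{-1}\mathbf{1} - r_f\mathbf{1}'V^{-1}\bar{r} + r_f^2\mathbf{1}'V^{-1}\mathbf{1} \]
\[ = B - 2r_f A + r_f^2 C \]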
Answer (Ex. 30) — Let \( m := E[r] \). Then
\[ E\left[(r_p - E[r_p])(r_q - E[r_q])\right] = E\left[(w_p'r - w_p'm)(w_q'r - w_q'm)\right] = w_p' E\left[(r - m)(r - m)'\right] w_q \]
By definition, \( V := Cov(r) := E[(r - m)(r - m)'] \), hence the result follows.
Answer (Ex. 31) — .
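1) \( L = w'\bar{r} - \frac{g}{2}w'Vw + m(w'\mathbf{1} - 1) \)
foc m: \( w'\mathbf{1} = 1 \).
foc w: \( w = V^{-1}(\bar{r} + m\mathbf{1})/g \).
Use the foc for m to get \( \mathbf{1}'w = (\mathbf{1}'V^{-1}\bar{r} + m\mathbf{1}'V^{-1}\mathbf{1})/g = 1 \Rightarrow m = (g - A)/C \). Plug
back in foc w to get:
\[ w^* = \frac{1}{g}V^{-1}\bar{r} + \frac{g - A}{gC}V^{-1}\mathbf{1} \]
2)
\[ E[r_p] = \bar{r}'w^* = B/g + A/C - A^2/(gC) \]
3)
\[ w^* = \frac{C\left[B/g + A/C - A^2/(gC)\right] - A}{D}V^{-1}\bar{r} + \frac{B - A\left[B/g + A/C - A^2/(gC)\right]}{D}V^{-1}\mathbf{1} \]
Simplifying all the scalars,
\[ w^* = \frac{CB/g - A^2/g}{D}V^{-1}\bar{r} + \frac{BC - ABC/g - A^2 + A^3/g}{DC}V^{-1}\mathbf{1} \]
\[ w^* = \frac{(BC - A^2)/g}{D}V^{-1}\bar{r} + \frac{(BC - A^2) - (BC - A^2)A/g}{DC}V^{-1}\mathbf{1} \]
\[ w^* = \frac{D/g}{D}V^{-1}\bar{r} + \frac{D - DA/g}{DC}V^{-1}\mathbf{1} \]
\[ w^* = \frac{1}{g}V^{-1}\bar{r} + \frac{g - A}{gC}V^{-1}\mathbf{1} \]
so we do indeed recover the result in 1).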
Answer (Ex. 32) — Using standard matrix notation, \( E[r_p] = w'\bar{r} + (1 - w'\mathbf{1})r_f \) and
\( Var[r_p] = w'Vw \).
Since \( Y_1 = Y_0(1 + r_p) \) and \( r_p \) is normally distributed, we have that \( Y_1 \) also follows a
normal distribution with the following parameters:
\[ Y_1 \sim N\left(Y_0\left[1 + w'\bar{r} + (1 - w'\mathbf{1})r_f\right], \; Y_0^2 w'Vw\right) \]
Using the moment generating function for the normal distribution, the objective function
becomes
\[ E\left[-\exp(-bY_1)\right] = -\exp\left\{-bY_0\left[1 + w'\bar{r} + (1 - w'\mathbf{1})r_f\right] + \tfrac{1}{2}b^2Y_0^2 w'Vw\right\} \]
The investor problem is thus
\[ \max_w \; -\exp\left\{-bY_0\left[1 + w'\bar{r} + (1 - w'\mathbf{1})r_f\right] + \tfrac{1}{2}b^2Y_0^2 w'Vw\right\} \]
The foc is
\[ -\left\{-bY_0(\bar{r} - \mathbf{1}r_f) + \tfrac{1}{2}b^2Y_0^2 \, 2Vw\right\}\exp(\cdot) = 0 \]
\[ \Rightarrow b^2Y_0^2 Vw = bY_0(\bar{r} - \mathbf{1}r_f) \]
\[ \Rightarrow w = \frac{1}{bY_0}V^{-1}(\bar{r} - \mathbf{1}r_f) \]
Answer (Ex. 33) — (Will be posted on my website)
Answer (Ex. 34) —
moneyp = [125199.55, 0.16, 0.00, 113651.25, 0.00, 58178.46, 0.65, 84884.29, 0.00, 0.07]
moneyrf = 618085.58
rp = 0.0064
stdp = 0.0174
Remark:
With this portfolio, the investor attains the following expected utility: \( \max E[U] = 0.0064 - \frac{8}{2}(0.0174)^2 = 0.0052102 \). If you got a different w, check your maximum expected
utility. If it is higher than this, please let me know. Different software may use different
algorithms and thus give different answers.
Answer (Ex. 35) — Buy an efficient (CML) portfolio with \( \sigma_p = 0.15 \). The weights
are: \( \sigma_p = w_M\sigma_M \Rightarrow w_M = 0.75 \) and \( w_f = 0.25 \). Thus, put $75,000 in the stock market
and $25,000 in the risk-free bond. The expected return is \( E[r_p] = 0.25 \times 0.04 + 0.75 \times 0.1 = 0.085 \),
thus we expect to have $108,500 in 1 year.
Answer (Ex. 36) — \( P_a = \frac{2}{(0.04 + 0.06 \times 0.9) - 0.05} \approx \$45.45 \)
Answer (Ex. 37) — .
1) Developing the definition of beta,
\[ \beta_p := \frac{Cov(r_p, r_M)}{Var(r_M)} = \frac{Cov\left(\sum_{i=1}^{N} w_i r_i, \; r_M\right)}{Var(r_M)} = \sum_{i=1}^{N} w_i \frac{Cov(r_i, r_M)}{Var(r_M)} = \sum_{i=1}^{N} w_i \beta_i \]
2) \( 0.9 = 0 + w_a \times 1.2 \Rightarrow w_a = 0.75 \), and \( w_f = 0.25 \)
Answer (Ex. 38) — The portfolio must be efficient, i.e., a combination of the risk-free
asset and the market. Hence, it must have \( corr(r_p, r_M) = 1 \). (Check that plugging this
into the SML gives the CML.)
Answer (Ex. 39) — From mean-variance optimization, we can write
\[ E[r_j] = r_f + \beta_j\left(E[r_p] - r_f\right) \]
where p is any frontier portfolio, and in particular we can choose p = T (this is just
math). The economic content comes from realizing that if all investors are identical
(mean-var preferences + homogeneous expectations), we must have T = M. Thus, the
economic part of the equation is to use M instead of p.
Answer (Ex. 41) — Use the covariance properties to get
\[ \sigma_{ij} = \beta_j\beta_i\sigma_M^2 + \beta_j Cov(r_m, \varepsilon_i) + \beta_i Cov(r_m, \varepsilon_j) + Cov(\varepsilon_i, \varepsilon_j) \]
In the diagonal (i = j), use A2 to get (6.3). Off diagonal (i ≠ j), use A2 and A3 to get
(6.4).
Answer (Ex. 42) — Suppose we replicate the random part of a by creating a portfolio
\( (w_f = 1 - \beta_a, \; w_M = \beta_a) \). Its return is \( r_p = (1 - \beta_a)r_f + \beta_a r_M \), which matches a, except for
the intercept. Since \( a_a = 0.01 > (1 - \beta_a)r_f = 0.004 \), there is an arbitrage opportunity.
Short sell $1 of p and buy $1 of a. This guarantees a sure profit of 0.6%. Doing this
arbitrage as much as possible will make me extremely rich.
Answer (Ex. 43) — Small-minus-Big (SMB) is the difference between the return of a
portfolio of small stocks and the return of a portfolio of large stocks. It measures the
size premium, the additional return required for investing in small firms.
High-minus-Low (HML) is the difference between the return of a portfolio of firms
with high BE/ME ("value") and the return of a portfolio of firms with low BE/ME
("growth"). It measures the value premium, i.e., the additional return required to invest
in firms with a low market value relative to book value, which are typically firms that have had
low returns and are now at risk of bankruptcy.
Answer (Ex. 46) — No, |X| = 0.
Answer (Ex. 47) — .
1) \( q_1 = 0.05, \; q_2 = 0.1, \; q_3 = 0.25 \).
2) p = 2.1
Answer (Ex. 48) — .
1) Given the simple structure of X, we can find the AD prices almost directly:
\[ p_1 = 3p^{ad}_1 \Rightarrow 1.2 = 3p^{ad}_1 \Rightarrow p^{ad}_1 = 0.4 \]
\[ p_2 = 4p^{ad}_1 + 2p^{ad}_2 \Rightarrow 1.8 = 4 \times 0.4 + 2p^{ad}_2 \Rightarrow p^{ad}_2 = 0.1 \]
\[ p_3 = 2p^{ad}_1 + p^{ad}_2 + p^{ad}_3 \Rightarrow p^{ad}_3 = 0.3 \]
The risk-free rate is \( 1 + r_f = 1/\sum_s p^{ad}_s = 1/0.8 = 1.25 \). The RN probabilities are
\( \pi^Q(s) = p^{ad}(s)/\sum_s p^{ad}_s = [0.5, \; 0.125, \; 0.375]' \). Hence, the price of the new security is
\[ p_4 = \frac{E^Q[x]}{1 + r_f} = \frac{0.5 \times 2 + 0.125 \times 10 + 0.375 \times 4}{1.25} = 3 \]
2) No. Risk aversion is impounded in the risk-neutral probabilities, \( \pi^Q \).
3) There is an arbitrage opportunity. Since the bank is selling the security cheap, I
should buy it. The replicating portfolio using AD-securities consists of 2 units of AD(1),
10 units of AD(2), and 4 units of AD(3). The price of this portfolio is 3 (as in the
previous question). Hence, we pay 2.5 to the bank and sell the replicating portfolio in
the market for 3. The profit is 0.5 today. One period from now, my payoff is 0 regardless
of the state (an arbitrage).
Note: the portfolio with the AD-securities “means” the following portfolio in the complex
securities: \( [2, 10, 4]' = Xq \Rightarrow q = [-6, 3, 4] \). That is, sell 6 units of asset 1, buy 3 units
of asset 2, buy 4 of asset 3. The price of this portfolio equals 3. Selling the replicating
portfolio means to buy 6 units of asset 1, sell 3 units of 2, and sell 4 units of 3.
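The pricing steps in Ex. 48 can be reproduced as follows. This is a minimal sketch that takes the Arrow-Debreu prices from part 1, the payoff [2, 10, 4] of the new security, and the bank's quote of 2.5 from part 3 as given; the variable names are illustrative.

```python
ad_prices = [0.4, 0.1, 0.3]     # Arrow-Debreu prices found in part 1
payoff = [2.0, 10.0, 4.0]       # state-contingent payoff of the new security
bank_quote = 2.5                # price at which the bank sells the security

discount = sum(ad_prices)       # price today of a sure payoff of 1
gross_rf = 1 / discount         # 1 + r_f
rn_probs = [q / discount for q in ad_prices]

# Two equivalent ways of pricing the new security
price_rn = sum(p * x for p, x in zip(rn_probs, payoff)) / gross_rf
price_ad = sum(q * x for q, x in zip(ad_prices, payoff))

# Buy from the bank, sell the replicating portfolio in the market
arbitrage_profit = price_ad - bank_quote

print(gross_rf, rn_probs, price_rn, price_ad, arbitrage_profit)
```

Both routes give the same price because the risk-neutral probabilities are just the AD prices rescaled to sum to one.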
Answer (Ex. 49) — .
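1) \( E_t\left[-p_t U'(c_t) + \delta U'(c_{t+1})v_{t+1}\right] = 0 \) or \( p_t = E_t\left[\delta\frac{U'(c_{t+1})}{U'(c_t)}v_{t+1}\right] \)
2) \( E_t\left[p_t^2 U''(c_t) + \delta U''(c_{t+1})v_{t+1}^2\right] < 0 \). Need \( U'' < 0 \), i.e., risk-aversion.
3) \( m = \delta\left(\frac{c_{t+1}}{c_t}\right)^{-\gamma} \)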
Answer (Ex. 50) — .
1) Yes, determinant = 6, so the assets are linearly independent.
2) This can be solved with the general method, i.e., finding replicating weights for each
AD security. However, given the simple structure of X, we can find the AD prices almost
directly:
\[ p_1 = 3p^{ad}_1 \Rightarrow 1.2 = 3p^{ad}_1 \Rightarrow p^{ad}_1 = 0.4 \]
\[ p_2 = 4p^{ad}_1 + 2p^{ad}_2 \Rightarrow 1.8 = 4 \times 0.4 + 2p^{ad}_2 \Rightarrow p^{ad}_2 = 0.1 \]
\[ p_3 = p^{ad}_1 + p^{ad}_2 + p^{ad}_3 \Rightarrow 0.7 = 0.4 + 0.1 + p^{ad}_3 \Rightarrow p^{ad}_3 = 0.2 \]
3) \( p = [2, 10, 4] \, [0.4, 0.1, 0.2]' = 2.6 \)
4) \( m = q ./ \pi = [1.6, 0.2, 0.8]' \) (element-wise division)
5) \( p = \sum_s \pi(s)m(s)\,payoff(s) = 2.6 \)
Answer (Ex. 51) — .
1) \( 1 = E_t\left[m_{t+1}R^f_{t+1}\right] = R^f_{t+1}E_t[m_{t+1}] \)
2) \( 1 = cov(m, R) + \frac{1}{R^f}E[R] \Rightarrow E[R] = R^f(1 - cov(m, R)) \), and finally,
\[ E_t[R_{t+1}] - R^f_{t+1} = -R^f_{t+1}\,cov_t\left(\delta U'(c_{t+1})/U'(c_t), \; R_{t+1}\right) = -R^f_{t+1}\frac{\delta}{U'(c_t)}\,cov_t\left(U'(c_{t+1}), \; R_{t+1}\right) \]
3) Investors are willing to pay a high price (demand a low excess return) for securities
that have high covariance with marginal utility. This makes sense since these securities
pay off exactly when the investor values the payoff most (i.e., when he has high marginal
utility).
Answer (Ex. 52) — .
1) For log utility and δ = 1, we have \( m = \delta U'(c_1)/U'(c_0) = c_0/c_1 = [100/100, \; 100/150]' = [1, \; 2/3]' \).
The asset prices are thus
\[ p_1 = \sum_s \pi(s)m(s)x(s) = 1 \times 10/2 + 2/3 \times 20/2 = 11.67 \]
\[ p_2 = \sum_s \pi(s)m(s)x(s) = 1 \times 20/2 + 2/3 \times 10/2 = 13.33 \]
2) Asset 2 is more expensive because it has a high payoff in bad times (high marginal
utility, low consumption \( c_1 = 100 \)). Equivalently, asset 1 is cheap because its high payoff
occurs in an already good state (low marginal utility, high consumption).
3) \( p = E[mx] \Rightarrow p_t = \frac{E_t[x_{t+1}]}{R^f_{t+1}} + Cov_t(m_{t+1}, x_{t+1}) \). The first term is the price of the asset
if investors were risk neutral. The second term is a risk adjustment: if the payoff has a
high covariance with the sdf or marginal utility (meaning low covariance with consumption), then it
will pay off precisely when the investor is in most need. Its price will thus be high. For
the example, note that \( R^f = 1/E[m] = 1/0.8333 = 1.2 \). Hence,
                   E^P[x]/R^f    Cov(m, x)
p_1 = 11.6667  =   +12.5000      −0.8333
p_2 = 13.3333  =   +12.5000      +0.8333
Without risk-aversion, both assets would have the same price (12.5). However,
risk-aversion makes the price of asset 2 increase by 0.83.
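The decomposition of each price into a risk-neutral part and a covariance adjustment can be verified with a short script. The sketch below uses the inputs from this answer (two equally likely states, consumption of 100 today and 100 or 150 tomorrow, log utility, δ = 1); the names are illustrative.

```python
probs = [0.5, 0.5]                 # two equally likely states
c0, c1 = 100.0, [100.0, 150.0]     # consumption today and in each state tomorrow
delta = 1.0
payoffs = {"asset 1": [10.0, 20.0], "asset 2": [20.0, 10.0]}

# Log utility: U'(c) = 1/c, so m = delta * c0 / c1 state by state
m = [delta * c0 / c for c in c1]


def expect(values):
    """Expectation under the physical probabilities."""
    return sum(p * v for p, v in zip(probs, values))


gross_rf = 1 / expect(m)

results = {}
for name, x in payoffs.items():
    price = expect([mi * xi for mi, xi in zip(m, x)])   # p = E[m x]
    risk_neutral_part = expect(x) / gross_rf            # E[x] / R^f
    covariance = price - expect(m) * expect(x)          # Cov(m, x)
    results[name] = (price, risk_neutral_part, covariance)
    print(f"{name}: {price:.4f} = {risk_neutral_part:.4f} + ({covariance:+.4f})")
```

The covariance term is negative for asset 1 (it pays more in the high-consumption state) and positive for asset 2, which matches the table above.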
Answer (Ex. 53) — True. \( m = \delta U'(c_{t+1})/U'(c_t) \). Since marginal utility is always
positive, the sdf is always positive.
Answer (Ex. 54) — .
1) The 2 foc are:
\[ z^s: \quad p^s_t = E_t\left[\delta U'(c_{t+1})/U'(c_t)\,x^s_{t+1}\right] \]
\[ z^b: \quad p^b_t = E_t\left[\delta U'(c_{t+1})/U'(c_t)\,x^b_{t+1}\right] \]
2) \( U' = 1/c \), hence \( m_{t+1} = \delta c_t/c_{t+1} \).
3) The pricing kernel is
\[ m_{t+1} = 0.99 \times 1000/c_{t+1} = [1.1000, \; 0.9900, \; 0.9000, \; 0.8250]' \]
Using the foc for \( z^b \),
\[ p^b_t = E_t\left[m_{t+1}x^b_{t+1}\right] = \sum_{s=1}^{4} Prob(s)\,m(s)\,x^b(s) = 80.31 \]