Claus Munk
Aarhus University
email: cmu@econ.au.dk
this version: January 8, 2009
Contents
1 Introduction and overview
1.1 What is modern asset pricing?
1.2 Elements of asset pricing models
1.2.1 Assets
1.2.2 Investors
1.2.3 Equilibrium
1.2.4 The time span of the model
1.3 Some stylized empirical facts about asset returns
1.4 The organization of this book
1.5 Prerequisites
2 Uncertainty, information, and stochastic processes
2.1 Introduction
2.2 Probability space
2.3 Information
2.4 Stochastic processes: definition, notation, and terminology
2.5 Some discrete-time stochastic processes
2.6 Continuous-time stochastic processes
2.6.1 Brownian motions
2.6.2 Diffusion processes
2.6.3 Itô processes
2.6.4 Jump processes
2.6.5 Stochastic integrals
2.6.6 Itô's Lemma
2.6.7 The geometric Brownian motion
2.7 Multidimensional processes
2.7.1 Two-dimensional processes
2.7.2 K-dimensional processes
2.8 Exercises
3 Assets, portfolios, and arbitrage
3.1 Introduction
3.2 Assets
3.2.1 The one-period framework
3.2.2 The discrete-time framework
3.2.3 The continuous-time framework
3.3 Portfolios and trading strategies
3.3.1 The one-period framework
3.3.2 The discrete-time framework
3.3.3 The continuous-time framework
3.4 Arbitrage
3.4.1 The one-period framework
3.4.2 The discrete-time and continuous-time frameworks
3.4.3 Continuous-time doubling strategies
3.5 Redundant assets
3.6 Marketed dividends and market completeness
3.6.1 The one-period framework
3.6.2 Multiperiod framework
3.6.3 Discussion
3.7 Concluding remarks
3.8 Exercises
4 State prices
4.1 Introduction
4.2 Definitions and immediate consequences
4.2.1 The one-period framework
4.2.2 The discrete-time framework
4.2.3 The continuous-time framework
4.3 Properties of state-price deflators
4.3.1 Existence
4.3.2 Uniqueness
4.3.3 Convex combinations of state-price deflators
4.3.4 The candidate deflator
$\beta_i = \mathrm{Cov}[R_i, R_M] / \mathrm{Var}[R_M]$. Only the risk correlated with the market will give a risk premium in
terms of a higher expected return (assuming the market-beta is positive). The remaining risk can
be diversified away and is therefore not priced in equilibrium. In principle, the market portfolio
includes all assets, not only traded financial assets but also nontraded assets like the human capital
(value of labor income) of all individuals. However, the market portfolio is typically approximated
by a broad stock index, although this approximation is not necessarily very precise.
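As a small illustration of the beta formula above, the following Python sketch estimates a market-beta from two return series by replacing the covariance and the variance with their sample counterparts. The function name `market_beta` and all return numbers are hypothetical and chosen only for the illustration; they are not taken from any data set.

```python
from statistics import mean


def market_beta(asset_returns, market_returns):
    """Sample estimate of beta_i = Cov[R_i, R_M] / Var[R_M]."""
    if len(asset_returns) != len(market_returns) or len(asset_returns) < 2:
        raise ValueError("need two equally long series with at least 2 observations")
    mean_i = mean(asset_returns)
    mean_m = mean(market_returns)
    # Sums of cross-products and squares; the common 1/(n-1) factor of the
    # sample covariance and sample variance cancels in the ratio.
    cov_im = sum((ri - mean_i) * (rm - mean_m)
                 for ri, rm in zip(asset_returns, market_returns))
    var_m = sum((rm - mean_m) ** 2 for rm in market_returns)
    return cov_im / var_m


# Hypothetical returns, made up for illustration only.
market = [0.10, -0.05, 0.20, 0.00]
asset = [2 * r + 0.01 for r in market]  # moves twice as much as the market

beta = market_beta(asset, market)
print(round(beta, 6))  # 2.0
```

By construction the hypothetical asset moves two-for-one with the market, so its estimated beta is 2, while the market's beta with respect to itself is 1.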
The CAPM has been very successful as a pedagogical tool for presenting and quantifying the
trade-off between risk and (expected) return, and it has also been widely used in practical
applications. It captures some important characteristics of the pricing in financial markets in a rather
simple way. However, the CAPM is insufficient in many aspects and it is based on a number of
unrealistic assumptions. Here is a partial list of problems with the CAPM:
1. The original CAPM is formulated and derived in a one-period world where assets and
investors are only modeled over one common period. In applications, it is implicitly assumed
that the CAPM repeats itself period by period, which intuitively demands some sort of
independence between the pricing mechanisms in different periods, which in turn requires the
unrealistic assumption that the demand and supply of agents living for several periods are
the same in all periods.
2. The CAPM is not designed for capturing variations in asset prices over time and cannot do
so.
3. Typical derivations of the CAPM assume that all asset returns over the fixed period are
normally distributed. For assets with limited liability you cannot lose more than you have
invested, so the rate of return cannot be lower than $-100\%$, which is inconsistent with the
normal distribution that associates a positive probability to any return between $-\infty$ and
$+\infty$. Empirical studies show that for many assets the normal distribution is not even a good
approximation of the return distribution.
4. The true market portfolio contains many unobservable assets, so how should you find the
expected return and variance on the market portfolio and its covariances with all individual
assets?
5. The CAPM is really quite unsuccessful in explaining empirical asset returns. Differences in
market-betas cannot explain observed differences in average returns of stocks.
6. The CAPM is not a full asset pricing model in the sense that it does not say anything
about what the return on the risk-free asset or the expected return on the market portfolio
should be. And it does not offer any insight into the links between financial markets and
macroeconomic variables like consumption, production, and inflation.
The purpose of this book is to develop a deeper understanding of asset pricing than the CAPM
can offer.
When an investor purchases a given asset, she obtains the right to receive the future payments
of the asset. For many assets the size of these future payments is uncertain at the time of purchase
since it may depend on the overall state of the economy and/or the state of the issuer of the asset
at the payment dates. Risk-averse investors will value a payment of a given size more highly if
they receive it in a bad state than in a good state. This is captured by the term state price
introduced by Arrow (1953). A state price for a given state at a given future point in time indicates
how much investors are willing to sacrifice today in return for an extra payment of one unit in that
future state. Presumably investors will value a given payment in a given state the same no matter
which asset the payment comes from. Therefore state prices are valid for all assets. The value of
any specific asset is determined by the general state prices in the market and the state-contingent
future payments of the asset. Modern asset pricing theory is based on models of the possible states
and the associated state prices.
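The valuation principle in the preceding paragraph can be sketched in a few lines of Python for a one-period setting with finitely many states: the value of an asset is the sum over states of the state price times the payment in that state. The three states (bad, normal, good), the state prices, the payments, and the function name `price_from_state_prices` are all hypothetical and serve only as an illustration under these assumptions.

```python
def price_from_state_prices(state_prices, payments):
    """Value today = sum over states of (state price * payment in that state)."""
    if len(state_prices) != len(payments):
        raise ValueError("need one payment per state")
    return sum(psi * d for psi, d in zip(state_prices, payments))


# Hypothetical economy with three equally likely states: bad, normal, good.
# A unit payment is valued most highly in the bad state.
state_prices = [0.40, 0.30, 0.25]

riskfree = [1.0, 1.0, 1.0]    # pays 1 in every state
stock = [0.5, 1.0, 1.5]       # pays most in the good state
insurance = [1.5, 1.0, 0.5]   # pays most in the bad state

p_riskfree = price_from_state_prices(state_prices, riskfree)
p_stock = price_from_state_prices(state_prices, stock)
p_insurance = price_from_state_prices(state_prices, insurance)
print(round(p_riskfree, 3), round(p_stock, 3), round(p_insurance, 3))  # 0.95 0.875 1.025
```

With equally likely states the stock and the insurance asset have the same expected payment of 1, but the asset paying most in the bad state is worth more, in line with the reasoning about risk-averse investors above.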
The well-being of individuals will depend on their consumption of goods throughout their lives.
By trading financial assets they can move consumption opportunities from one point in time to
another and from one state of the world to another. The preferences for consumption of individuals
determine their demand for various assets and thereby the equilibrium prices of these assets. Hence,
the state price for any given state must be closely related to the individuals' (marginal) utility of
consumption in that state. Many modern asset pricing theories and models are based on this link
between asset prices and consumption.
1.2 Elements of asset pricing models
1.2.1 Assets
For potential investors the important characteristics of a financial asset or any other investment
opportunity are its current price and its future payments which the investor will be entitled to if she
buys the asset. Stocks deliver dividends to owners. The dividends will surely depend on the
well-being of the company. Bonds deliver coupon payments and repayments of the outstanding debt,
usually according to some predetermined schedule. For bonds issued by most governments, you
might consider these payments to be certain, i.e. risk-free. On the other hand, if the government
bond promises certain dollar payments, you will not know how many consumption goods you
will be able to buy for these dollar payments, that is the payments are risky in real terms. The
payments of bonds issued by corporations are also uncertain. The future payments of derivatives
such as forwards, futures, options, and swaps depend on the evolution of some underlying random
variable and therefore are also uncertain.
Let us simply refer to the payments of any asset as dividends. More precisely, a dividend
means the payment of a given asset at a given point in time. The uncertain dividend of an asset
at a given point in time is naturally modeled by a random variable. If an asset provides the owner
with payments at several points in time, we need a collection of random variables (one for each
payment date) to represent all the dividends. Such a collection of random variables is called a
stochastic process. A stochastic process is therefore the natural way to represent the uncertain
flow of dividends of an asset over time. We will refer to the stochastic process representing the
dividends of an asset as the dividend process of the asset.
1.2.2 Investors
In reality, only a small part of the trading in financial markets is executed directly by individuals
while the majority of trades are executed by corporations and financial institutions such as pension
funds, insurance companies, banks, broker firms, etc. However, these institutional investors trade
on behalf of individuals, either customers or shareholders. Productive firms issue stocks and
corporate bonds to finance investments in production technology they hope will generate high
earnings and, consequently, high returns to their owners in future years. In the end, the decisions
taken at the company level are also driven by the desires of individuals to shift consumption
opportunities across time and states. In our basic models we will assume that all investors are
individuals and ignore the many good reasons for the existence of various intermediaries. For
example, we will assume that assets are traded without transaction costs. We will also ignore
taxes and the role of the government and central banks in the financial markets. Some authors
use the term "agent" or "investor" instead of "individual", maybe in order to indicate that some
investment decisions are taken by other decision-making units than individual human beings.
How should we represent an individual in an asset pricing model? We will assume that
individuals basically care about their consumption of goods and services throughout their life. The
consumption of a given individual at a future point in time is typically uncertain and we will
therefore represent it by a random variable. The consumption of an individual at all future dates is
represented by a stochastic process, the consumption process of the individual. Although real-life
economies offer a large variety of consumption goods and services, in our basic models we will
assume that there is only one good available for consumption and that each individual only cares
about her own consumption and not the consumption of other individuals. The single
consumption good is assumed to be perishable, i.e. it cannot be stored or resold but has to be consumed
immediately. In more advanced models discussed in later chapters we will in fact relax these
assumptions and allow for multiple consumption goods, e.g. we will introduce a durable good (like
a house), and we will also discuss models in which the well-being of an individual also depends on
what other individuals consume, which is often referred to as the "keeping up with the Joneses"
property. Both extensions turn out to be useful in bringing our theoretical models closer to real-life
financial data but it is preferable to understand the simpler models first. Of course, the well-being
of an individual will also be affected by the number of hours she works, the physical and mental
challenges offered by her position, etc., but such issues will also be ignored in basic models.
We will assume that each individual is endowed with some current wealth and some future
income stream from labor, gifts, inheritance, etc. For most individuals the future income will be
uncertain. The income of an individual at a given future point in time is thus represented by
a random variable and the income at all future dates is represented by a stochastic process, the
income process. We will assume that the income process is exogenously given and hence ignore
labor supply decisions.
If the individual cannot make investments at all (not even save current wealth), it will be
impossible for her to currently consume more than her current wealth and impossible to consume
more at a future point in time than her income at that date. Financial markets allow the individual
to shift consumption opportunities from one point in time to another, e.g. from working life to
retirement, and from one state of the world to another, e.g. from a state in which income is
extremely high to a state where income is extremely low. The prices of financial assets define
the prices of shifting consumption through time and states of the world. The individuals' desire
to shift consumption through time and states will determine the demand and supply and hence
the equilibrium prices of the financial assets. To study asset pricing we therefore have to model
how individuals choose between different, uncertain consumption processes. The preferences for
consumption of an individual are typically modeled by a utility function. Since this is a text on
asset pricing, we are not primarily interested in deriving the optimal consumption stream and the
associated optimal strategy for trading financial assets. However, since asset prices are set by the
decisions of individuals, we will have to discuss some aspects of optimal consumption and trading.
1.2.3 Equilibrium
For any given asset, i.e. any given dividend process, our aim is to characterize the reasonable
price or the set of reasonable prices. A price is considered reasonable if the price is an equilibrium
price. An equilibrium is characterized by two conditions: (1) supply equals demand for any asset,
i.e. markets clear, (2) any investor is satisfied with her current position in the assets given her
personal situation and the asset prices. Associated with any equilibrium is a set of prices for all
assets and, for each investor, a trading strategy and the implied consumption strategy.
1.2.4 The time span of the model
As discussed above, the important ingredients of all basic asset pricing models are the dividends
of the assets available for trade and the utility functions, current wealth, and future incomes of
the individuals that can trade the assets. We will discuss asset pricing in three types of models:
1. one-period model: all action takes place at two points in time, the beginning of the
period (time 0) and the end of the period (time 1). Assets pay dividends only at the end
of the period and are traded only at the beginning of the period. The aim of the model
is to characterize the prices of the assets at the beginning of the period. Individuals have
some initial beginning-of-period wealth and (maybe) some end-of-period income. They can
consume at both points in time.
2. discrete-time model: all action takes place at a finite number of points in time. Let us
denote the set of these time points by $\mathcal{T} = \{0, 1, 2, \dots, T\}$. Individuals can trade at any of
these time points, except at $T$, and consume at any time $t \in \mathcal{T}$. Assets can pay dividends at
any time in $\mathcal{T}$, except time 0. Assuming that the price of a given asset at a given point in time
is ex-dividend, i.e. the value of future dividends excluding any dividend at that point in time,
prices are generally nontrivial at all but the last point in time. We aim at characterizing
these prices.
3. continuous-time model: individuals can consume at any point in time in an interval
$\mathcal{T} = [0, T]$. Assets pay dividends in the interval $(0, T]$ and can be traded in $[0, T)$.
Ex-dividend asset prices are nontrivial in $[0, T)$. Again, our aim is to characterize these prices.
In a one-period setting there is uncertainty about the state of the world at the end of the period.
The dividends of financial assets and the incomes of the individuals at the end of the period will
generally be unknown at the beginning of the period and thus modeled as random variables. Any
quantity that depends on either the dividends or the incomes will also be a random variable. For example,
this will be the case for the end-of-period value of portfolios and the end-of-period consumption of
individuals.
Both the discrete-time model and the continuous-time model are multiperiod models and can
potentially capture the dynamics of asset prices. In both cases, $T$ denotes some terminal date in
the sense that we will not model what happens after time $T$. We assume that $T < \infty$ but under
some technical conditions the analysis extends to $T = \infty$.
Financial markets are by nature dynamic and should therefore be studied in a multiperiod
setting. One-period models should serve only as a pedagogical first step in the derivation of the
more appropriate multiperiod models. Indeed, many of the important conclusions derived in
one-period models carry over to multiperiod models. Other conclusions do not. And some issues
cannot be meaningfully studied in a one-period framework.
It is not easy to decide on whether to use a discrete-time or a continuous-time framework for
studying multiperiod asset pricing. Both model types have their virtues and drawbacks. Both
model types are applied in theoretical research and real-life applications. We will therefore consider
both modeling frameworks. The basic asset pricing results in the early chapters will be derived
in both settings. Some more specific asset pricing models discussed in later chapters will only be
presented in one of these frameworks. Some authors prefer to use a discrete-time model, others
prefer a continuous-time model. It is comforting that, for most purposes, both models will result
in identical or very similar conclusions.
At first, you might think that the discrete-time framework is more realistic. However, in
real-life economies individuals can in fact consume and adjust portfolios at virtually any point in time.
Individuals are certainly not restricted to consume and trade at a finite set of prespecified points in
time. Of course no individual will trade financial assets continuously due to the existence of explicit
and implicit costs of such transactions. But even if we take such costs into account, the frequency
and exact timing of actions can be chosen by each individual. If we are really concerned about
transaction costs, it would be better to include those in a continuous-time modeling framework.
Many people will find discrete-time models easier to understand than continuous-time models
and if you want to compare theoretical results with actual data it will usually be an advantage if
the model is formulated with a period length closely linked to the data frequency. On the other
hand, once you have learned how to deal with continuous-time stochastic processes, many results
are clearer and more elegantly derived in continuous-time models than in discrete-time models.
The analytical virtues of continuous-time models are basically due to the well-developed theory of
stochastic calculus for continuous-time stochastic processes, but also due to the fact that integrals
are easier to deal with than discrete sums, differential equations are easier to deal with than
difference equations, etc.
1.3 Some stylized empirical facts about asset returns
This section will summarize some general empirical findings about asset returns. Before doing
that, we need to make clear what "return" means and how we should compute the average return.
The percentage return or net rate of return on a given asset between time $t - \Delta t$ and time $t$ is
$$r_t = \frac{D_t + P_t - P_{t-\Delta t}}{P_{t-\Delta t}} = \frac{D_t + P_t}{P_{t-\Delta t}} - 1,$$
where $D_t$ is the total dividends paid out over the period for each unit of the asset owned, $P_t$ is
the unit price of the asset at time $t$, and $P_{t-\Delta t}$ is the unit price of the asset at time $t - \Delta t$. Given
a time series of such returns $r_1, r_2, \dots, r_T$ over time periods of equal length $\Delta t$, you can compute
the arithmetic average (net rate of) return as
$$\bar{r}_{\text{arith}} = \frac{1}{T} \sum_{t=1}^{T} r_t.$$
This may be a good measure of the return you can expect over any period of length $\Delta t$. On the
other hand, it is not necessarily a good measure of the return you can expect to get over longer
periods. This is due to the fact that returns are compounded. If you invested 1 unit of account
(i.e. 1 dollar if dividends and prices are measured in dollars) at the starting date of the time series
and kept reinvesting dividends by purchasing additional units of the asset, then you will end up
with
$$(1 + r_1)(1 + r_2) \cdots (1 + r_T)$$
after the last period. The geometric average (rate of) return is computed as
$$\bar{r}_{\text{geo}} = \left[(1 + r_1)(1 + r_2) \cdots (1 + r_T)\right]^{1/T} - 1.$$
The geometric average is never higher than the arithmetic average, and it is strictly lower unless
all returns in the series are identical. The difference will be larger for a very variable series of returns.
As a simple example, assume that the percentage return of an asset is 100% in year 1 and −50%
in year 2. Then the arithmetic average return is (100% − 50%)/2 = 25% and the geometric average
return is $[(1 + 1)(1 - 0.5)]^{1/2} - 1 = 1 - 1 = 0$, i.e. 0%. An investment of 1 in the beginning of
the first year will have grown to 2 at the end of year 1 and then dropped to 2 × 0.5 = 1 at the end of
year 2. The zero return over the two periods is better reflected by the geometric average than the
arithmetic average.
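The return definitions and the two-year example above can be reproduced with a short Python sketch. The function names are hypothetical and the code is only meant as an illustration of the formulas; it assumes that no return is below −100%, so that the compounded growth factor is non-negative.

```python
import math


def net_return(dividend, price_end, price_start):
    """Net rate of return r_t = (D_t + P_t) / P_{t - dt} - 1."""
    return (dividend + price_end) / price_start - 1.0


def arithmetic_average(returns):
    """Arithmetic average of a series of net returns."""
    return sum(returns) / len(returns)


def geometric_average(returns):
    """Geometric average: [(1 + r_1) ... (1 + r_T)]^(1/T) - 1."""
    growth = math.prod(1.0 + r for r in returns)  # value of 1 unit invested
    return growth ** (1.0 / len(returns)) - 1.0


# The example from the text: +100% in year 1 and -50% in year 2.
returns = [1.0, -0.5]
arith = arithmetic_average(returns)
geo = geometric_average(returns)
print(arith, geo)  # 0.25 0.0
```

For instance, `net_return(0.0, 2.0, 1.0)` gives the 100% return of year 1 for an asset without dividends whose price doubles from 1 to 2.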
Campbell (2003) lists the following stylized facts of U.S. asset returns after World War II (up
to around 1998), where average returns are computed as a geometric average:
1. The annualized real stock return was 8.1% on average with a standard deviation of 15.6%.
2. The annualized real return on a 3-month Treasury bill (U.S. government bond) was 0.9% on
average. The standard deviation of the ex post real return was 1.7%, but much of this is
due to short-run inflation risk and the standard deviation of the ex ante real interest rate is
considerably smaller.
3. Real dividend growth at an annual frequency has a standard deviation of 6%.
4. Quarterly real dividend growth and real stock returns have a correlation of only 0.03, but the
correlation increases with the measurement period up to a correlation of 0.47 at a 4-year
horizon.
5. Excess returns on stocks over Treasury bills are forecastable. The log price-dividend ratio
can forecast 10% of the variance of the excess return over a 1-year horizon, 22% over a 2-year
horizon, and 38% over a 4-year horizon.
6. The real interest rate has a positive autocorrelation of 0.5 using quarterly data.
Although the exact estimates vary, the above stylized facts are fairly consistent across countries.
The book Triumph of the Optimists by Dimson, Marsh, and Staunton (2002) provides an abundance
of information concerning the historical performance of financial markets in 16 countries. Table 1.1
is a slightly edited version of Table 4-2 in that book. It shows that in any of these countries the
average return on stocks is higher than the average return on long-term bonds, which again is higher
than the average short-term interest rate. The same ranking holds for the standard deviations.
If you take the standard deviation as a measure of risk, the historical estimates thus confirm the
conventional wisdom that higher average returns come with higher risk.
Dimson, Marsh, and Staunton (2002) also report the best and worst realized real returns for
each country and asset class over the period covered, cf. their Tables 4-3 and 6-1. For example,
over the past century the annual return on the U.S. stock market has varied from a low of −38.0%
in 1931 to a high of 56.8% in 1933. For the German stock market the minimum return of −89.6% in
1948 was immediately followed by the maximum return of 155.9% in 1949. For long-term bonds,
the annual U.S. real returns have varied from −19.3% in 1918 to 35.1% in 1982. The fluctuations
around the mean are thus substantial.
Ibbotson Associates Inc. publishes an annual book with a lot of statistics on the U.S. financial
markets. The cross correlations of annual nominal returns on different asset classes in the U.S. are
shown in Table 1.2. The cross correlations of annual real returns can be seen in Table 1.3. Note
a fairly high correlation between the real returns on the two stock classes and between the real
returns on the three bond classes, whereas the correlations between any bond class and any stock
class are modest.
Table 1.4 shows the autocorrelations of the annual returns in the different asset classes. The
autocorrelation of the short-term interest rate measured by the return on short-term government
bonds is quite high, while the autocorrelation for stock returns is close to zero. Autocorrelations
can be significant for very short intraday periods.
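To make the notion of autocorrelation concrete, here is a minimal Python sketch of the standard sample autocorrelation of a return series at a given lag. The function name `autocorrelation` and the two short series are hypothetical, and the source of the tables does not state which estimator was used for the reported numbers, so this is one common choice rather than a reproduction of those calculations.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given lag.

    Uses the common estimator with the full-sample mean and the
    full-sample sum of squared deviations in the denominator.
    """
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    m = sum(series) / n
    denom = sum((x - m) ** 2 for x in series)
    num = sum((series[t] - m) * (series[t - lag] - m) for t in range(lag, n))
    return num / denom


# Hypothetical series, made up for illustration only.
persistent = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]       # slowly moving, like a short rate
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # sign flips every period

print(autocorrelation(persistent))   # 0.5
print(autocorrelation(alternating))  # -0.8333333333333334
```

A slowly moving series has a clearly positive first-order autocorrelation, whereas a series that reverses sign every period has a strongly negative one; a value near zero means that this period's return says little about the next.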
According to Cont (2000) return distributions of individual stocks (and some other assets) exhibit
the following characteristics:
1. Heavy tails: a higher frequency of very low and of very high returns compared to the normal
distribution. However, increasing the length of the period over which returns are computed,
the distribution comes closer to a normal distribution. So the shape of the return distribution
varies with the period length.
2. Gain/loss asymmetry: higher frequency of large negative returns than of large positive
returns.
3. High variability of returns over any period length.
                          Stocks           Bonds           Bills
Country             Mean  StdDev    Mean  StdDev    Mean  StdDev
Australia            9.0    17.7     1.9    13.0     0.6     5.6
Belgium              4.8    22.8     0.3    12.1     0.0     8.2
Canada               7.7    16.8     2.4    10.6     1.8     5.1
Denmark              6.2    20.1     3.3    12.5     3.0     6.4
France               6.3    23.1     0.1    14.4     2.6    11.4
Germany              8.8    32.3     0.3    15.9     0.1    10.6
Ireland              7.0    22.2     2.4    13.3     1.4     6.0
Italy                6.8    29.4     0.8    14.4     2.9    12.0
Japan                9.3    30.3     1.3    20.9     0.3    14.5
The Netherlands      7.7    21.0     1.5     9.4     0.8     5.2
South Africa         9.1    22.8     1.9    10.6     1.0     6.4
Spain                5.8    22.0     1.9    12.0     0.6     6.1
Sweden               9.9    22.8     3.1    12.7     2.2     6.8
Switzerland          6.9    20.4     3.1     8.0     1.2     6.2
United Kingdom       7.6    20.0     2.3    14.5     1.2     6.6
United States        8.7    20.2     2.1    10.0     1.0     4.7
Table 1.1: Means and standard deviations (StdDev) of real asset returns in different countries over
the period 1900-2000. Numbers are in percentage terms. The returns used are arithmetic annual
returns. Bonds mean long-term (approximately 20-year) government bonds, while bills mean
short-term (approximately 1-month) government bonds. The bond and bill statistics for Germany exclude
the years 1922-23, where the hyperinflation resulted in a loss of 100 percent for German bond investors.
For Swiss stocks the data series begins in 1911. Source: Table 4-2 in Dimson, Marsh, and Staunton
(2002).
                     bills    gov. bonds   corp. bonds  large stocks  small stocks     inflation
bills                 1.00          0.24          0.22          0.03          0.10          0.41
gov. bonds            0.24          1.00          0.91          0.16          0.00          0.14
corp. bonds           0.22          0.91          1.00          0.24          0.09          0.15
large stocks          0.03          0.16          0.24          1.00          0.79          0.03
small stocks          0.10          0.00          0.09          0.79          1.00          0.08
inflation             0.41          0.14          0.15          0.03          0.08          1.00
Table 1.2: Cross correlations of annual nominal returns between bills (short-term government bonds),
long-term government bonds, long-term corporate bonds, stocks in companies with large capitalization,
stocks in companies with small capitalization, and the inflation rate. Data from the United States in
the period 1926-2000. Source: Ibbotson Associates (2000).
                     bills    gov. bonds   corp. bonds  large stocks  small stocks
bills                 1.00          0.58          0.59          0.12          0.06
gov. bonds            0.58          1.00          0.95          0.24          0.04
corp. bonds           0.59          0.95          1.00          0.30          0.12
large stocks          0.12          0.24          0.30          1.00          0.79
small stocks          0.06          0.04          0.12          0.79          1.00
Table 1.3: Cross correlations of annual real returns between bills (short-term government bonds),
long-term government bonds, long-term corporate bonds, stocks in companies with large capitalization, and
stocks in companies with small capitalization. Data from the United States in the period 1926-2000.
Source: Ibbotson Associates (2000).
              nominal return     real return
bills                   0.92            0.67
gov. bonds              0.06            0.03
corp. bonds             0.07            0.18
large stocks            0.00            0.00
small stocks            0.08            0.05
Table 1.4: Autocorrelations of annual nominal and real returns on bills (short-term government bonds),
long-term government bonds, long-term corporate bonds, stocks in companies with large capitalization,
and stocks in companies with small capitalization. Data from the United States in the period 1926-2000.
Source: Ibbotson Associates (2000).
4. Volatility clustering: positive autocorrelation in volatilities over several days.
5. Volatility tends to be negatively correlated with the return.
1.4 The organization of this book
The remainder of this book is organized as follows. Chapter 2 discusses how to represent uncertainty
and information flow in asset pricing models. It also introduces stochastic processes and some key
results on how to deal with stochastic processes, which we will use in later chapters.
Chapter 3 shows how we can model financial assets and their dividends as well as how we can
represent portfolios and trading strategies. It also defines the important concepts of arbitrage,
redundant assets, and market completeness.
Chapter 4 defines the key concept of a state-price deflator in one-period models, in discrete-time
multi-period models, and in continuous-time models. A state-price deflator is one way to
represent the general pricing mechanism of a financial market. We can price any asset given
the state-price deflator and the dividend process of that asset. Conditions for the existence and
uniqueness of a state-price deflator are derived as well as a number of useful properties of
state-price deflators. We will also briefly discuss alternative ways of representing the general market
pricing mechanism, e.g. through a set of risk-neutral probabilities.
The state-price deflator and therefore asset prices are ultimately determined by the supply and
demand of investors. Chapter 5 studies how we can represent the preferences of investors. We
discuss when preferences can be represented by expected utility, how we can measure the risk
aversion of an individual, and introduce some frequently used utility functions. In Chapter 6 we
investigate how individual investors will make decisions on consumption and investments. We set
up the utility maximization problem for the individual and characterize the solution for different
relevant specifications of preferences. The solution gives an important link between state-price
deflators (and, thus, the prices of financial assets) and the optimal decisions at the individual level.
Chapter 7 deals with the market equilibrium. We will discuss when market equilibria are Pareto
efficient and when we can think of the economy as having only one, representative individual instead
of many individuals.
Chapter 8 further explores the link between individual consumption choice and asset prices.
The very general Consumption-based Capital Asset Pricing Model (CCAPM) is derived. A simple
version of the CCAPM is confronted with data and a number of extensions are discussed.
Chapter 9 studies the so-called factor models of asset pricing where one or multiple factors
govern the state-price deflators and thus asset prices and returns. Some empirically successful
factor models are described. It is also shown how pricing factors can be identified theoretically as
a special case of the general CCAPM.
While Chapters 8 and 9 mostly focus on explaining the expected excess return of risky assets,
most prominently stocks, Chapter 10 explores the implications of general asset pricing theory for
bond prices and the term structure of interest rates. It also critically reviews some traditional
hypotheses on the term structure of interest rates.
Chapter 11 shows how the information in a state-price deflator equivalently can be represented
by the price of one specific asset and an appropriately risk-adjusted probability measure. This turns
out to be very useful when dealing with derivative securities, which is the topic of Chapter 12.
Each chapter ends with a number of exercises, which either illustrate the concepts and conclusions
of the chapter or provide additional related results.
1.5 Prerequisites
We will study asset pricing with the well-established scientific approach: make precise definitions
of concepts, clear statements of assumptions, and formal derivations of results. This requires
extensive use of mathematics, but not very complicated mathematics. Concepts, assumptions, and
results will all be accompanied by financial interpretations. Examples will be used for illustrations.
The main mathematical disciplines we will apply are linear algebra, optimization, and probability
theory. Linear algebra and optimization are covered by many good textbooks on mathematics
for economics such as the companion books by Sydsaeter and Hammond (2005) and Sydsaeter,
Hammond, Seierstad, and Strom (2005), and, of course, by many more general textbooks on
mathematics. Probability theory is usually treated in separate textbooks... A useful reference
manual is Sydsaeter, Strom, and Berck (2000).
Probability theory. Appendix A gives a review of main concepts and definitions in probability
theory. Appendix B summarizes some important results on the lognormal distribution which we
shall frequently apply.
Linear algebra, vectors, and matrices. We will frequently use vectors and matrices to represent
a lot of information in a compact manner. For example, we will typically use a vector to
represent the prices of a number of assets and use a matrix to represent the dividends of different
assets in different states of the world. Therefore some basic knowledge of how to handle vectors
and matrices (so-called linear algebra) is needed.
We will use boldface symbols like $\mathbf{x}$ to denote vectors, and vectors are generally assumed to be
column vectors. Matrices will be indicated by double underlining like $\underline{\underline{A}}$. We will use the symbol
$^\top$ to denote the transpose of a vector or a matrix. The following is an incomplete and relatively
unstructured list of basic properties of vectors and matrices.
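Given two vectors $\mathbf{x} = (x_1, \dots, x_n)^\top$ and $\mathbf{y} = (y_1, \dots, y_n)^\top$ of the same dimension, the dot product is $\mathbf{x} \cdot \mathbf{y} = \sum_{i=1}^n x_i y_i$.
[...] The transpose of an $m \times n$ matrix $\underline{\underline{A}} = [A_{ij}]$ is the $n \times m$ matrix $\underline{\underline{A}}^\top = [A_{ji}]$ with columns and
rows interchanged.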
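Given an $m \times n$ matrix $\underline{\underline{A}} = [A_{ij}]$ and an $n \times p$ matrix $\underline{\underline{B}} = [B_{kl}]$, the product $\underline{\underline{A}}\,\underline{\underline{B}}$ is the $m \times p$
matrix with $(i, j)$'th entry given by $(\underline{\underline{A}}\,\underline{\underline{B}})_{i,j} = \sum_{k=1}^{n} A_{ik} B_{kj}$. In particular with $m = p = 1$, we see
that for two vectors $\mathbf{x} = (x_1, \dots, x_n)^\top$ and $\mathbf{y} = (y_1, \dots, y_n)^\top$, we have $\mathbf{x}^\top \mathbf{y} = \mathbf{x} \cdot \mathbf{y}$, so the matrix
product generalizes the dot product for vectors.
$(\underline{\underline{A}}\,\underline{\underline{B}})^\top = \underline{\underline{B}}^\top \underline{\underline{A}}^\top$.
An $n \times n$ matrix $\underline{\underline{A}}$ is said to be non-singular if there exists a matrix $\underline{\underline{A}}^{-1}$ so that $\underline{\underline{A}}\,\underline{\underline{A}}^{-1} = \underline{\underline{I}}$,
where $\underline{\underline{I}}$ is the $n \times n$ identity matrix, and then $\underline{\underline{A}}^{-1}$ is called the inverse of $\underline{\underline{A}}$.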
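A $2 \times 2$ matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is non-singular if $ad - bc \neq 0$ and the inverse is then given by
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$
If $\underline{\underline{A}}$ is non-singular, then $\underline{\underline{A}}^\top$ is non-singular and $(\underline{\underline{A}}^\top)^{-1} = (\underline{\underline{A}}^{-1})^\top$.
If $\underline{\underline{A}}$ and $\underline{\underline{B}}$ are non-singular matrices of appropriate dimensions,
$$(\underline{\underline{A}}\,\underline{\underline{B}})^{-1} = \underline{\underline{B}}^{-1} \underline{\underline{A}}^{-1}. \tag{1.2}$$
If $\mathbf{a}$ and $\mathbf{x}$ are vectors of the same dimension, then
$$\frac{\partial (\mathbf{a}^\top \mathbf{x})}{\partial \mathbf{x}} = \frac{\partial (\mathbf{x}^\top \mathbf{a})}{\partial \mathbf{x}} = \mathbf{a}.$$
If $\mathbf{x}$ is a vector of dimension $n$ and $\underline{\underline{A}}$ is an $n \times n$ matrix, then
$$\frac{\partial (\mathbf{x}^\top \underline{\underline{A}}\,\mathbf{x})}{\partial \mathbf{x}} = (\underline{\underline{A}} + \underline{\underline{A}}^\top)\,\mathbf{x}.$$
In particular, if $\underline{\underline{A}}$ is symmetric, i.e. $\underline{\underline{A}}^\top = \underline{\underline{A}}$, we have
$$\frac{\partial (\mathbf{x}^\top \underline{\underline{A}}\,\mathbf{x})}{\partial \mathbf{x}} = 2\underline{\underline{A}}\,\mathbf{x}.$$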
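The matrix rules listed above can be checked numerically. The following is only a minimal sketch in plain Python: the helper names (`transpose`, `matmul`, `inverse_2x2`, `quad_form_gradient`, `central_difference`) and the example matrices are illustrative assumptions, not part of the text. Exact fractions are used so that the identities hold with equality rather than up to rounding.

```python
from fractions import Fraction as Fr

def transpose(A):
    """Interchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product: entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(A):
    """Inverse of a 2x2 matrix via the ad - bc formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def quad_form(x, A):
    """The scalar x^T A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def quad_form_gradient(x, A):
    """The claimed gradient (A + A^T) x."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) for i in range(n)]

def central_difference(x, A, h=Fr(1, 10)):
    """Central differences of x^T A x; exact for a quadratic function."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((quad_form(up, A) - quad_form(down, A)) / (2 * h))
    return grad

# Hypothetical example data (both matrices have determinant 1).
A = [[Fr(2), Fr(1)], [Fr(5), Fr(3)]]
B = [[Fr(1), Fr(4)], [Fr(2), Fr(9)]]
x = [Fr(3), Fr(-2)]

print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))
print(inverse_2x2(matmul(A, B)) == matmul(inverse_2x2(B), inverse_2x2(A)))
print(central_difference(x, A) == quad_form_gradient(x, A))
```

Note that `A` is deliberately not symmetric, so the gradient check exercises the general formula rather than the symmetric special case.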
Optimization. Optimization problems arise naturally in all parts of economics, also in finance.
To study asset pricing theory, we will have to study how individual investors make decisions about
consumption and investment. Assuming that the well-being of an individual can be represented
by some sort of utility function, we will have to maximize utility subject to various constraints,
e.g. a budget constraint. Therefore we have to apply results on constrained optimization. The
main approach for solving constrained optimization problems is the Lagrange approach; see, e.g.,
Sydsaeter and Hammond (2005).
CHAPTER 2
Uncertainty, information, and stochastic processes
2.1 Introduction
2.2 Probability space
Any model with uncertainty refers to a probability space $(\Omega, \mathcal{F}, \mathbb{P})$, where
$\Omega$ is the state space of possible outcomes. An element $\omega \in \Omega$ represents a possible realization
of all uncertain objects of the model. An event is a subset of $\Omega$.
$\mathcal{F}$ is a $\sigma$-algebra in $\Omega$, i.e. a collection of subsets of $\Omega$ with the properties
(i) $\Omega \in \mathcal{F}$,
(ii) for any set $F$ in $\mathcal{F}$, the complement $F^c \equiv \Omega \setminus F$ is also in $\mathcal{F}$,
(iii) if $F_1, F_2, \dots \in \mathcal{F}$, then the union $\bigcup_{n=1}^{\infty} F_n$ is in $\mathcal{F}$.
$\mathcal{F}$ is the collection of all events that can be assigned a probability.
$\mathbb{P}$ is a probability measure, i.e. a function $\mathbb{P} : \mathcal{F} \to [0, 1]$ with $\mathbb{P}(\Omega) = 1$ and the property that
$\mathbb{P}\left(\bigcup_{m=1}^{\infty} A_m\right) = \sum_{m=1}^{\infty} \mathbb{P}(A_m)$ for any sequence $A_1, A_2, \dots$ of disjoint events.
An uncertain object can be formally modeled as a random variable on the probability space.
A random variable $X$ on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is a real-valued function on $\Omega$ which is
$\mathcal{F}$-measurable in the sense that for any interval $I \subseteq \mathbb{R}$, the set $\{\omega \in \Omega \mid X(\omega) \in I\}$ belongs to $\mathcal{F}$,
i.e. we can assign a probability to the event that the random variable takes on a value in $I$.
What is the relevant state space for an asset pricing model? A state $\omega \in \Omega$ represents a possible
realization of all relevant uncertain objects over the entire time span of the model. In one-period
models dividends, incomes, etc. are realized at time 1. A state defines realized values of all the
dividends and incomes at time 1. In multi-period models a state defines dividends, incomes, etc.
at all points in time considered in the model, i.e. all $t \in \mathcal{T}$, where either $\mathcal{T} = \{0, 1, 2, \dots, T\}$
or $\mathcal{T} = [0, T]$. The state space must include all the possible combinations of realizations of the
uncertain objects that may affect the pricing of the assets. These uncertain objects include
(a) all the future dividends of all assets, (b) all the future
incomes of all individuals, and (c) any other initially unknown variables that may affect prices,
e.g. variables that contain information about the future development in dividends or income. The
state space $\Omega$ therefore has to be large. If you want to allow for continuous random variables,
for example dividends that are normally distributed, you will need an infinite state space. If you
restrict all dividends, incomes, etc. to be discrete random variables, i.e. variables with a finite
number of possible realizations, you can do with a finite state space. For some purposes we need
to distinguish between an infinite state space and a finite state space.
We will sometimes assume a finite state space in which case we will take it to be $\Omega = \{1, 2, \dots, S\}$
so that there are $S$ possible states of which exactly one will be realized. An event is then simply
a subset of $\Omega$ and $\mathcal{F}$ is the collection of all subsets of $\Omega$. The probability measure $\mathbb{P}$ is defined
by the state probabilities $p_\omega$ [...], every set $F \in \mathbf{F}_{t'}$ is a subset of
some set in $\mathbf{F}_t$.
An alternative way of representing the information flow is in terms of an information filtration,
i.e. a sequence $(\mathcal{F}_t)_{t \in \mathcal{T}}$ of sigma-algebras on $\Omega$. Given a partition $\mathbf{F}_t$ of $\Omega$, we can construct a
sigma-algebra $\mathcal{F}_t$ as the set of all unions of (countably many) sets in $\mathbf{F}_t$, including the empty union,
i.e. the empty set $\emptyset$. Where $\mathbf{F}_t$ contains only the disjoint decidable events at time $t$, $\mathcal{F}_t$ contains
all decidable events at time $t$. For our simple two-period example above we get
$$\mathcal{F}_0 = \{\emptyset, \Omega\},$$
$$\mathcal{F}_1 = \{\emptyset, \{1, 2\}, \{3, 4, 5\}, \{6\}, \{1, 2, 3, 4, 5\}, \{1, 2, 6\}, \{3, 4, 5, 6\}, \Omega\},$$
while $\mathcal{F}_2$ becomes the collection of all possible subsets of $\Omega$. In a general multi-period model we
write $(\mathcal{F}_t)_{t \in \mathcal{T}}$ for the information filtration. We will always assume that the time 0 information is
trivial, corresponding to $\mathcal{F}_0 = \{\emptyset, \Omega\}$, and that all uncertainty is resolved at or before the final date
so that $\mathcal{F}_T$ is the set of all possible subsets of $\Omega$. The fact that we learn more and more about the
true state as time goes by implies that we must have $\mathcal{F}_t \subseteq \mathcal{F}_{t'}$ whenever $t < t'$
[...] $t' > t$, we have
$$\mathrm{E}_t\left[\mathrm{E}_{t'}[X]\right] = \mathrm{E}_t[X].$$
Loosely speaking, the theorem says that what you expect today of some variable that will be
realized in two days is equal to what you expect today that you will expect tomorrow about the
same variable.
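The construction of a sigma-algebra from a partition, as used for the two-period example, can be sketched in a few lines of Python. This is only an illustration for a finite state space; the function name `sigma_algebra_from_partition` and the variable names are assumptions made here.

```python
from itertools import combinations

def sigma_algebra_from_partition(partition):
    """All unions of blocks of a finite partition, including the empty union."""
    blocks = [frozenset(block) for block in partition]
    events = set()
    for r in range(len(blocks) + 1):
        for chosen in combinations(blocks, r):
            events.add(frozenset().union(*chosen))
    return events

# Partitions of the state space {1, ..., 6} in the two-period example.
partition_t0 = [{1, 2, 3, 4, 5, 6}]
partition_t1 = [{1, 2}, {3, 4, 5}, {6}]
partition_t2 = [{1}, {2}, {3}, {4}, {5}, {6}]

F0 = sigma_algebra_from_partition(partition_t0)
F1 = sigma_algebra_from_partition(partition_t1)
F2 = sigma_algebra_from_partition(partition_t2)

# The number of decidable events grows as information arrives.
print(len(F0), len(F1), len(F2))
# The filtration property: earlier sigma-algebras are contained in later ones.
print(F0 <= F1 <= F2)
```

A partition with $k$ blocks yields $2^k$ events, so the three sigma-algebras here contain 2, 8, and 64 events.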
[Figure 2.1: An example of a two-period economy. The event tree runs from $F_0 = \Omega$ at $t = 0$ through $F_{11} = \{1, 2\}$, $F_{12} = \{3, 4, 5\}$, and $F_{13} = \{6\}$ at $t = 1$ to the single-state events $F_{21}, \dots, F_{26}$ at $t = 2$. State probabilities: $p_1 = 0.24$, $p_2 = 0.06$, $p_3 = 0.04$, $p_4 = 0.16$, $p_5 = 0.20$, $p_6 = 0.30$.]
[Figure 2.2: The multinomial tree version of the two-period economy in Figure 2.1. Conditional branch probabilities: from $F_0$ to $F_{11}$, $F_{12}$, $F_{13}$: 0.3, 0.4, 0.3; from $F_{11}$ to $F_{21}$, $F_{22}$: 0.8, 0.2; from $F_{12}$ to $F_{23}$, $F_{24}$, $F_{25}$: 0.1, 0.4, 0.5; from $F_{13}$ to $F_{26}$: 1.0.]
We can define conditional variances, covariances, and correlations from the conditional expectation
exactly as one defines (unconditional) variances, covariances, and correlations from (unconditional)
expectations:
$$\mathrm{Var}_t[X] = \mathrm{E}_t\left[(X - \mathrm{E}_t[X])^2\right] = \mathrm{E}_t[X^2] - (\mathrm{E}_t[X])^2,$$
$$\mathrm{Cov}_t[X, Y] = \mathrm{E}_t\left[(X - \mathrm{E}_t[X])(Y - \mathrm{E}_t[Y])\right] = \mathrm{E}_t[XY] - \mathrm{E}_t[X]\,\mathrm{E}_t[Y],$$
$$\mathrm{Corr}_t[X, Y] = \frac{\mathrm{Cov}_t[X, Y]}{\sqrt{\mathrm{Var}_t[X]\,\mathrm{Var}_t[Y]}}.$$
Again the conditioning on time $t$ information is indicated by a $t$ subscript.
In the two-period model of Figures 2.1 and 2.2 suppose we have an asset paying state-dependent
dividends at time 1 and 2 as depicted in Figures 2.3 and 2.4. Then the expected time 2 dividend
computed at time 0 is
$$\mathrm{E}[D_2] = \mathrm{E}_0[D_2] = 0.24 \cdot 0 + 0.06 \cdot 20 + 0.04 \cdot 10 + 0.16 \cdot 5 + 0.2 \cdot 20 + 0.3 \cdot 20 = 12.4.$$
What is the expected time 2 dividend computed at time 1? It will depend on the information
available at time 1. If the information corresponds to the event $F_{11} = \{1, 2\}$, the expected dividend
is
$$\mathrm{E}_1[D_2] = 0.8 \cdot 0 + 0.2 \cdot 20 = 4.$$
If the information corresponds to the event $F_{12} = \{3, 4, 5\}$, the expected dividend is
$$\mathrm{E}_1[D_2] = 0.1 \cdot 10 + 0.4 \cdot 5 + 0.5 \cdot 20 = 13.$$
If the information corresponds to the event $F_{13} = \{6\}$, the expected dividend is
$$\mathrm{E}_1[D_2] = 1.0 \cdot 20 = 20.$$
The time 1 expectation of the time 2 dividend is therefore a random variable which is measurable
with respect to the information at time 1. Note that the time 0 expectation of that random variable
is
$$\mathrm{E}\left[\mathrm{E}_1[D_2]\right] = 0.3 \cdot 4 + 0.4 \cdot 13 + 0.3 \cdot 20 = 12.4 = \mathrm{E}[D_2],$$
consistent with the Law of Iterated Expectations.
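The numbers in this example can be reproduced with a short script. The sketch below uses the state probabilities and time 2 dividends from the example; the function names `expectation` and `conditional_expectation` are illustrative choices, not notation from the text.

```python
# State probabilities and time 2 dividends from the two-period example.
probs = {1: 0.24, 2: 0.06, 3: 0.04, 4: 0.16, 5: 0.20, 6: 0.30}
dividend_t2 = {1: 0, 2: 20, 3: 10, 4: 5, 5: 20, 6: 20}
partition_t1 = [{1, 2}, {3, 4, 5}, {6}]

def expectation(values, probs):
    """Unconditional expectation over a finite state space."""
    return sum(probs[w] * values[w] for w in probs)

def conditional_expectation(values, probs, partition):
    """Conditional expectation given a partition, returned as a random
    variable (one value per state) that is constant on each block."""
    result = {}
    for block in partition:
        p_block = sum(probs[w] for w in block)
        cond = sum(probs[w] * values[w] for w in block) / p_block
        for w in block:
            result[w] = cond
    return result

e0 = expectation(dividend_t2, probs)
e1 = conditional_expectation(dividend_t2, probs, partition_t1)
e0_of_e1 = expectation(e1, probs)

print(round(e0, 10))                                      # time 0 expectation
print(round(e1[1], 10), round(e1[3], 10), round(e1[6], 10))  # one value per time 1 event
print(round(e0_of_e1, 10))                                # equals the time 0 expectation
```

Representing the conditional expectation as a dictionary over states makes its measurability visible: all states in the same time 1 event carry the same value.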
2.4 Stochastic processes: definition, notation, and terminology
In one-period models all uncertain objects can be represented by a random variable. For example
the dividend (at time 1) of a given asset is a random variable. In multi-period models we have
to keep track of dividends, asset prices, consumption, portfolios, (labor) income, etc., throughout
the time set $\mathcal{T}$. For example the dividend of a given asset, say asset $i$, at a particular future date
$t \in \mathcal{T}$ can be represented by a random variable $D_{it}$. Recall that, formally, a random variable is
a function from the state space $\Omega$ into $\mathbb{R}$, the set of real numbers. To represent the dividends of
an asset throughout all dates, we need a collection of random variables, one for each date. Such a
collection is called a stochastic process. (We will often just write "process" instead of "stochastic
process.") The dividend of asset $i$ is thus represented by a stochastic process $D_i = (D_{it})_{t \in \mathcal{T}}$, where
each $D_{it}$ is a random variable. We can form multi-dimensional stochastic processes by stacking
one-dimensional stochastic processes. For example, we can represent the dividends of $I$ assets by
an $I$-dimensional stochastic process $D = (D_t)_{t \in \mathcal{T}}$, where $D_t = (D_{1t}, \dots, D_{It})^\top$.
[Figure 2.3: The dividends of an asset in the two-period economy. The values shown in the figure are:
state   probability   t = 0   t = 1   t = 2
1       0.24          0       10      0
2       0.06          0       10      20
3       0.04          0       20      10
4       0.16          0       20      5
5       0.20          0       20      20
6       0.30          0       30      20
]
[Figure 2.4: The multinomial tree version of the dividend process in Figure 2.3. From the dividend 0 at $t = 0$, the time 1 dividend is 10, 20, or 30 with probabilities 0.3, 0.4, and 0.3. From 10, the time 2 dividend is 0 (probability 0.8) or 20 (0.2); from 20 it is 10 (0.1), 5 (0.4), or 20 (0.5); from 30 it is 20 (1.0).]
In general, for any $t \in \mathcal{T}$, the dividend at time $t$ will be known at time $t$, but not before. The
random variable $D_{it}$ is then said to be $\mathcal{F}_t$-measurable, where $\mathcal{F}_t$ is the sigma-algebra representing
the information at time $t$. If this information is equivalently represented by a partition
$\mathbf{F}_t = \{F_{t1}, F_{t2}, \dots\}$, measurability means that $D_{it}$ is constant on each of the elements $F_{tj}$ of the partition.
Note that this is indeed the case in our example in Figure 2.3. If $D_{it}$ is $\mathcal{F}_t$-measurable for any $t \in \mathcal{T}$,
the stochastic process $D_i = (D_{it})_{t \in \mathcal{T}}$ is said to be adapted to the information filtration $(\mathcal{F}_t)_{t \in \mathcal{T}}$.
Since the dividends of assets and the income of individuals are assumed to be part of the exogenous
uncertainty, it is natural to assume that dividend processes and income processes are adapted to
the information filtration. In fact, most concrete models write down stochastic processes for all the
exogenous variables and define the information filtration as the smallest filtration to which these
exogenous processes are adapted.
An individual can choose how much to consume and which portfolio to invest in at any point in
time, of course subject to a budget constraint and other feasibility constraints. The consumption
and portfolio chosen at a given future date are likely to depend on the income the individual has
received up to that point and her information about her future income and the future dividends
of assets. The consumption rate at a given future date is therefore a random variable and the
consumption rates at all dates constitute a stochastic process, the consumption process. Similarly,
the portfolios chosen at different dates form a stochastic process, the portfolio process or so-called
trading strategy. Representing consumption and investments by stochastic processes does not mean
that we treat them as being exogenously given and purely determined by chance, but simply that
the individual will condition her consumption and portfolio decisions on the information received.
Since we assume that the underlying model of uncertainty includes all the uncertainty relevant for
the decisions of the individuals, it is natural to require that consumption and portfolio processes
are adapted to the information filtration.
Now consider prices. In a multi-period model we need to keep track of prices at all points in
time. The price of a given asset at a given point in time depends on the supply and demand for
that asset of all individuals, which again depends on the information individuals have at that time.
Hence, we can also represent prices by adapted stochastic processes. To sum up, all the stochastic
processes relevant for our purposes will be adapted to the information filtration representing the
resolution of all the relevant uncertainty.
Next, we introduce some further terminology often used in connection with stochastic processes.
A stochastic process $X = (X_t)_{t \in \mathcal{T}}$ is said to be a martingale (relative to the probability measure
$\mathbb{P}$ and the information filtration $(\mathcal{F}_t)_{t \in \mathcal{T}}$), if for all $t, t' \in \mathcal{T}$ with $t < t'$
$$\mathrm{E}_t[X_{t'}] = X_t,$$
i.e. no change in the value is expected.
A sample path of a stochastic process $X$ is the collection of realized values $(X_t(\omega))_{t \in \mathcal{T}}$ for a
given outcome $\omega \in \Omega$. The value space of a stochastic process is the smallest set $S$ with the
property that $\mathbb{P}(X_t \in S) = 1$ for all $t \in \mathcal{T}$. If the value space has countably many elements,
the stochastic process is called a discrete-value process. Otherwise, it is called a continuous-value
process. Of course, if the state space $\Omega$ is finite, all processes will be discrete-value processes. If
you want to model continuous-value processes, you need an infinite state space.
For the modeling of most time-varying economic objects it seems reasonable to use continuous-value
processes. Admittedly, stock prices are quoted on exchanges as multiples of some smallest
possible unit (0.01 currency units in many countries) and interest rates are rounded off to some
number of decimals, but the set of possible values of such objects is approximated very well
by an interval in $\mathbb{R}$ (maybe $\mathbb{R}_+$ or $\mathbb{R}$ itself). Also, the mathematics involved in the analysis of
continuous-value processes is simpler and more elegant than the mathematics for discrete-value
processes. However, there are economic objects that can only take on a very limited set of values.
For these objects discrete-value processes should be used. An example is the credit ratings assigned
by Moody's and similar agencies to debt issues of corporations.
As time goes by, we can observe the evolution in the object which the stochastic process describes.
At any given time $t'$, the previous values $(X_t)_{t \in [0, t')}$, where $X_t \in S$, will be known (at least in the
models we consider). These values constitute the history of the process up to time $t'$. The future
values are still stochastic.
As time passes we will typically revise our expectations of the future values of the process or,
more precisely, revise the probability distribution we attribute to the value of the process at any
future point in time, cf. the discussion in the previous section. Suppose we stand at time $t$ and
consider the value of a process $X$ at a future time $t' \in \mathcal{T}$ with
$t < t'$ [...]
[...] $\frac{1}{\sqrt{2\pi}}\, e^{-a^2/2}\, da, \quad t = 1, \dots, T,$
where $N(\cdot)$ is the cumulative distribution function for an $N(0, 1)$ variable. Probabilities of other
events will follow from the above. The information at time $t$ is represented by the smallest $\sigma$-algebra
with respect to which the random variables $\varepsilon_1, \dots, \varepsilon_t$ are measurable.
Stochastic processes for dividends etc. can be defined relative to the assumed exogenous shocks.
It is easy to obtain non-zero means, non-unit variances, and dependencies across time. A discrete-time
stochastic process $X = (X_t)_{t \in \mathcal{T}}$ is typically specified by the initial value $X_0$ (a constant in
$\mathbb{R}$) and the increments over each period, i.e. $\Delta X_{t+1} \equiv X_{t+1} - X_t$ for each $t = 0, 1, \dots, T - 1$. The
increments $\Delta X_{t+1}$ are defined in terms of the exogenous shocks $\varepsilon_1, \dots, \varepsilon_{t+1}$, which implies that the
process $X$ is adapted.
Let us look at some discrete-time stochastic processes frequently used in asset pricing models. In
all the examples we assume that the exogenous shocks $\varepsilon_1, \dots, \varepsilon_T$ are independent, one-dimensional
$N(0, 1)$ distributed random variables.
Random walk: $\Delta X_{t+1} = \sigma \varepsilon_{t+1}$ or, equivalently, $X_{t+1} = X_0 + \sigma(\varepsilon_1 + \varepsilon_2 + \dots + \varepsilon_{t+1})$. Here
$\sigma$ is a positive constant. A random walk is a Markov process since only the current value $X_t$ and
not the previous values $X_{t-1}, X_{t-2}, \dots$ affect $X_{t+1}$. Since the expected change over any single
period is zero, the expected change over any time interval will be zero, so a random walk is a
martingale. Conditionally on $X_t$, $X_{t+1}$ is normally distributed with mean $X_t$ and variance $\sigma^2$.
$X_{t+1}$ is unconditionally (i.e. seen from time 0) normally distributed with mean $X_0$ and variance
$(t + 1)\sigma^2$.
Random walk with drift: $\Delta X_{t+1} = \mu + \sigma \varepsilon_{t+1}$, where $\mu$ is a constant (the drift rate) and $\sigma$ is
a positive constant. A random walk with drift is also a Markov process. The expected change over
any single period is $\mu$, so unless $\mu = 0$ and we are back to the random walk (without drift), the
process $X = (X_t)_{t \in \mathcal{T}}$ is not a martingale. Conditionally on $X_t$, $X_{t+1}$ is normally distributed with
mean $\mu + X_t$ and variance $\sigma^2$. $X_{t+1}$ is unconditionally normally distributed with mean $X_0 + (t + 1)\mu$
and variance $(t + 1)\sigma^2$.
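The unconditional mean and variance of a random walk with drift can be illustrated by simulation. The following is a minimal sketch; the parameter values, the seed, and the function name `simulate_random_walk_with_drift` are arbitrary choices made for illustration.

```python
import random
import statistics

def simulate_random_walk_with_drift(x0, mu, sigma, periods, rng):
    """Build one path by adding the increment mu + sigma * shock each period,
    where the shocks are independent N(0, 1) draws."""
    path = [x0]
    for _ in range(periods):
        shock = rng.gauss(0.0, 1.0)
        path.append(path[-1] + mu + sigma * shock)
    return path

# Hypothetical parameter values.
x0, mu, sigma, periods = 1.0, 0.1, 0.2, 10
rng = random.Random(12345)

# Terminal values of many independent paths.
terminal = [
    simulate_random_walk_with_drift(x0, mu, sigma, periods, rng)[-1]
    for _ in range(20000)
]

sample_mean = statistics.fmean(terminal)
sample_var = statistics.variance(terminal)
theoretical_mean = x0 + periods * mu      # X_0 + (number of periods) * mu
theoretical_var = periods * sigma ** 2    # (number of periods) * sigma^2

print(sample_mean, theoretical_mean)
print(sample_var, theoretical_var)
```

With 20,000 paths the sample moments should lie close to the theoretical values, with the remaining gap reflecting sampling error only. Setting `mu = 0.0` gives the random walk without drift.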
Autoregressive processes: A process $X = (X_t)_{t \in \mathcal{T}}$ with
$$\Delta X_{t+1} = (1 - \rho)(\mu - X_t) + \sigma \varepsilon_{t+1},$$
where $\rho \in (-1, 1)$, is said to be an autoregressive process of order 1 or simply an AR(1) process.
It is a Markov process since only the current value $X_t$ affects the distribution of the future value.
The expected change over the period is positive if $X_t < \mu$ and negative if $X_t > \mu$. In any case, the
value $X_{t+1}$ is expected to be closer to $\mu$ than $X_t$ is. The process is pulled towards $\mu$ and therefore
the process is said to be mean-reverting. This is useful for modeling the dynamics of variables
that tend to vary with the business cycle around some long-term average. Note, however, that extreme
shocks may cause the process to be pushed further away from $\mu$.
The covariance between the subsequent values $X_t$ and
$X_{t+1} = X_t + (1 - \rho)(\mu - X_t) + \sigma \varepsilon_{t+1} = \rho X_t + (1 - \rho)\mu + \sigma \varepsilon_{t+1}$ is
$$\mathrm{Cov}[X_t, X_{t+1}] = \mathrm{Cov}[X_t, \rho X_t] = \rho \,\mathrm{Var}[X_t]$$
so that $\rho = \mathrm{Cov}[X_t, X_{t+1}] / \mathrm{Var}[X_t]$ is the so-called autocorrelation parameter. Solving backwards,
we find
$$\begin{aligned}
X_{t+1} &= \rho X_t + (1 - \rho)\mu + \sigma \varepsilon_{t+1} \\
&= \rho\left(\rho X_{t-1} + (1 - \rho)\mu + \sigma \varepsilon_t\right) + (1 - \rho)\mu + \sigma \varepsilon_{t+1} \\
&= \rho^2 X_{t-1} + (1 + \rho)(1 - \rho)\mu + \sigma \varepsilon_{t+1} + \rho \sigma \varepsilon_t \\
&= \dots \\
&= \rho^{k+1} X_{t-k} + (1 + \rho + \dots + \rho^k)(1 - \rho)\mu + \sigma \varepsilon_{t+1} + \rho \sigma \varepsilon_t + \dots + \rho^k \sigma \varepsilon_{t-k+1} \\
&= \rho^{k+1} X_{t-k} + (1 - \rho^{k+1})\mu + \sigma \sum_{j=0}^{k} \rho^j \varepsilon_{t+1-j}.
\end{aligned}$$
More generally, a process $X = (X_t)_{t \in \mathcal{T}}$ with
$$X_{t+1} = \mu + \rho_1 (X_t - \mu) + \rho_2 (X_{t-1} - \mu) + \dots + \rho_l (X_{t-l+1} - \mu) + \sigma \varepsilon_{t+1}$$
is said to be an autoregressive process of order $l$ or simply an AR($l$) process. If the order is higher
than 1, the process is not a Markov process.
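The backward solution of the AR(1) recursion can be verified numerically by comparing a step-by-step iteration with the closed-form expression. This is a sketch under assumed parameter values; the function names `ar1_step` and `ar1_closed_form` are illustrative.

```python
import random

def ar1_step(x, mu, rho, sigma, shock):
    """One AR(1) step: the increment is (1 - rho)(mu - x) + sigma * shock."""
    return x + (1.0 - rho) * (mu - x) + sigma * shock

def ar1_closed_form(x_start, mu, rho, sigma, shocks):
    """Value after len(shocks) steps from x_start, using the backward solution.
    `shocks` is in chronological order, so the most recent shock (weight
    rho**0) is the last element of the list."""
    n = len(shocks)
    weighted = sum(rho ** j * shocks[n - 1 - j] for j in range(n))
    return rho ** n * x_start + (1.0 - rho ** n) * mu + sigma * weighted

# Hypothetical parameters and a fixed seed for reproducibility.
mu, rho, sigma = 0.04, 0.8, 0.01
rng = random.Random(2024)
shocks = [rng.gauss(0.0, 1.0) for _ in range(25)]

x_iterated = 0.10
for shock in shocks:
    x_iterated = ar1_step(x_iterated, mu, rho, sigma, shock)

x_closed = ar1_closed_form(0.10, mu, rho, sigma, shocks)
print(abs(x_iterated - x_closed) < 1e-12)
```

The closed form also makes the mean reversion explicit: the weight $\rho^{k+1}$ on the starting value shrinks towards zero as the number of steps grows, so the influence of the starting value dies out.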
(G)ARCH processes: ARCH is short for Autoregressive Conditional Heteroskedasticity. An
ARCH($l$) process $X = (X_t)_{t \in \mathcal{T}}$ is defined by
$$X_{t+1} = \mu + \sigma_{t+1} \varepsilon_{t+1},$$
where
$$\sigma_{t+1}^2 = \delta + \sum_{i=1}^{l} \alpha_i \varepsilon_{t+1-i}^2.$$
The conditional variance depends on squares of the previous $l$ shock terms.
GARCH is short for Generalized Autoregressive Conditional Heteroskedasticity. A GARCH($l, m$)
process $X = (X_t)_{t \in \mathcal{T}}$ is defined by
$$X_{t+1} = \mu + \sigma_{t+1} \varepsilon_{t+1},$$
where
$$\sigma_{t+1}^2 = \delta + \sum_{i=1}^{l} \alpha_i \varepsilon_{t+1-i}^2 + \sum_{j=1}^{m} \beta_j \sigma_{t+1-j}^2.$$
ARCH and GARCH processes are often used for detailed modeling of stock market volatility.
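As a rough illustration, a GARCH(1,1) process of the form above can be simulated as follows. The sketch takes the ARCH term to be the squared shock $\varepsilon_t^2$, which is an assumption about how the definition above should be read; many references instead use the squared residual $(\sigma_t \varepsilon_t)^2$. The coefficient names `delta`, `alpha`, `beta`, the starting variance, and all parameter values are hypothetical.

```python
import math
import random

def simulate_garch11(mu, delta, alpha, beta, initial_variance, periods, rng):
    """Simulate X = mu + sigma * shock with the variance recursion
    variance_next = delta + alpha * shock**2 + beta * variance."""
    values, variances = [], []
    variance = initial_variance
    for _ in range(periods):
        shock = rng.gauss(0.0, 1.0)
        values.append(mu + math.sqrt(variance) * shock)
        variances.append(variance)
        # Update the conditional variance for the next period.
        variance = delta + alpha * shock ** 2 + beta * variance
    return values, variances

rng = random.Random(7)
values, variances = simulate_garch11(
    mu=0.0, delta=0.02, alpha=0.05, beta=0.9,
    initial_variance=0.04, periods=500, rng=rng,
)
print(min(variances), max(variances))
```

A large shock raises the next period's conditional variance, and a `beta` close to one makes that increase die out slowly, which gives the clustering of volatility that these models are designed to capture.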
More generally we can define an adapted process $X = (X_t)_{t \in \mathcal{T}}$ by the initial value $X_0$ and the
equation
$$\Delta X_{t+1} = \mu(X_t, \dots, X_0) + \sigma(X_t, \dots, X_0)\,\varepsilon_{t+1}, \quad t = 0, 1, \dots, T - 1, \tag{2.1}$$
where $\mu$ and $\sigma$ are real-valued functions. If $\varepsilon_{t+1} \sim N(0, 1)$, the conditional distribution of $X_{t+1}$
given $X_t$ is a normal distribution with mean $X_t + \mu(X_t, \dots, X_0)$ and variance $\sigma(X_t, \dots, X_0)^2$.
We can write the stochastic processes introduced above in a different way that will ease the
transition to continuous-time processes. Let $z = (z_t)_{t \in \mathcal{T}}$ denote a unit random walk starting at
zero, i.e. a process with the properties
(i) $z_0 = 0$,
(ii) $\Delta z_{t+1} \equiv z_{t+1} - z_t \sim N(0, 1)$ for all $t = 0, 1, \dots, T - 1$,
(iii) $\Delta z_1, \Delta z_2, \dots, \Delta z_T$ are independent.
Then we can define a random walk with drift as a process $X$ with
$$\Delta X_{t+1} = \mu + \sigma \Delta z_{t+1},$$
an AR(1) process is defined by
$$\Delta X_{t+1} = (1 - \rho)(\mu - X_t) + \sigma \Delta z_{t+1},$$
and a general adapted process is defined by
$$\Delta X_{t+1} = \mu(X_t, \dots, X_0) + \sigma(X_t, \dots, X_0)\,\Delta z_{t+1}.$$
We will see very similar equations in continuous time.
2.6 Continuous-time stochastic processes
2.6.1 Brownian motions
In the continuous-time asset pricing models we will consider in this book, the basic uncertainty in
the economy is represented by the evolution of a standard Brownian motion. A (one-dimensional)
stochastic process $z = (z_t)_{t \in [0, T]}$ is called a standard Brownian motion, if it satisfies the
following conditions:
(i) $z_0 = 0$,
(ii) for all $t, t' \geq 0$ with $t < t'$: $z_{t'} - z_t \sim N(0, t' - t)$ [...]
[...] $\varepsilon_1$ and $\varepsilon_2$ defined by
$$\varepsilon_1 = \sqrt{-2 \ln U_1}\, \sin(2\pi U_2), \qquad \varepsilon_2 = \sqrt{-2 \ln U_1}\, \cos(2\pi U_2)$$
will be independent standard normally distributed random numbers. This is the so-called Box-Muller transformation.
See e.g. Press, Teukolsky, Vetterling, and Flannery (1992, Sec. 7.2).
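The Box-Muller transformation is short enough to write out directly. The sketch below assumes that $U_1$ and $U_2$ are independent draws from the uniform distribution on the unit interval, which is the standard setting for this transformation; the function name `box_muller` and the seed are illustrative.

```python
import math
import random
import statistics

def box_muller(u1, u2):
    """Map two uniform numbers (u1 in (0, 1], u2 in [0, 1)) to two
    independent standard normal numbers."""
    if not 0.0 < u1 <= 1.0:
        raise ValueError("u1 must lie in (0, 1] so that log(u1) is defined")
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.sin(angle), radius * math.cos(angle)

rng = random.Random(99)
draws = []
for _ in range(20000):
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1].
    e1, e2 = box_muller(1.0 - rng.random(), rng.random())
    draws.extend([e1, e2])

# Sample mean and variance should be close to 0 and 1.
print(statistics.fmean(draws), statistics.variance(draws))
```

In practice one would normally call a library routine such as `random.gauss`; writing out the transformation mainly shows how normal shocks can be built from uniform random numbers.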
[Figure 2.5: A simulated sample path of a standard Brownian motion based on 200 subintervals.]
is also infinite. Intuitively, these properties are due to the fact that the size of the increment of a
standard Brownian motion over an interval of length $\Delta t$ is proportional to $\sqrt{\Delta t}$, in the sense that
the standard deviation of the increment equals $\sqrt{\Delta t}$. When $\Delta t$ is close to zero, $\sqrt{\Delta t}$ is significantly
larger than $\Delta t$, so the changes are large relative to the length of the time interval over which the
changes are measured.
The expected change in an object described by a standard Brownian motion equals zero and
the variance of the change over a given time interval equals the length of the interval. This can
easily be generalized. As before let $z = (z_t)_{t \geq 0}$ be a one-dimensional standard Brownian motion
and define a new stochastic process $X = (X_t)_{t \geq 0}$ by
$$X_t = X_0 + \mu t + \sigma z_t, \quad t \geq 0, \tag{2.2}$$
where $X_0$, $\mu$, and $\sigma$ are constants. The constant $X_0$ is the initial value for the process $X$. It
follows from the properties of the standard Brownian motion that, seen from time 0, the value $X_t$
is normally distributed with mean $X_0 + \mu t$ and variance $\sigma^2 t$, i.e. $X_t \sim N(X_0 + \mu t, \sigma^2 t)$.
The change in the value of the process between two arbitrary points in time $t$ and $t'$, where
$t < t'$, is given by
$$X_{t'} - X_t = \mu(t' - t) + \sigma(z_{t'} - z_t).$$
The change over an infinitesimally short interval $[t, t + \Delta t]$ with $\Delta t \to 0$ is often written as
$$dX_t = \mu\, dt + \sigma\, dz_t, \tag{2.3}$$
where $dz_t$ can loosely be interpreted as a $N(0, dt)$-distributed random variable. To give this a
precise mathematical meaning, it must be interpreted as a limit of the expression
$$X_{t + \Delta t} - X_t = \mu \Delta t + \sigma(z_{t + \Delta t} - z_t)$$
for $\Delta t \to 0$. The process $X$ is called a generalized Brownian motion or a generalized Wiener
process. This is basically the continuous-time version of a random walk with drift. The parameter
$\mu$ reflects the expected change in the process per unit of time and is called the drift rate or simply
the drift of the process. The parameter $\sigma$ reflects the uncertainty about the future values of the
process. More precisely, $\sigma^2$ reflects the variance of the change in the process per unit of time and
is often called the variance rate of the process. $\sigma$ is a measure for the standard deviation of the
change per unit of time and is referred to as the volatility of the process.
[Figure 2.6: Simulation of a generalized Brownian motion with $\mu = 0.2$ and $\sigma = 0.5$ or $\sigma = 1.0$. The
straight line shows the trend corresponding to $\sigma = 0$. The simulations are based on 200 subintervals.]
A generalized Brownian motion inherits many of the characteristic properties of a standard
Brownian motion. For example, a generalized Brownian motion is also a Markov process, and the
sample paths of a generalized Brownian motion are also continuous and nowhere differentiable.
However, a generalized Brownian motion is not a martingale unless $\mu = 0$. The sample paths can
be simulated by choosing time points $0 \equiv t_0 < t_1 < \dots < t_n$ and iteratively computing
$$X_{t_i} = X_{t_{i-1}} + \mu(t_i - t_{i-1}) + \varepsilon_i \sigma \sqrt{t_i - t_{i-1}}, \quad i = 1, \dots, n,$$
where $\varepsilon_1, \dots, \varepsilon_n$ are independent draws from a standard normal distribution. Figure 2.6 shows
simulated sample paths for two different values of $\sigma$ but the same $\mu$. The paths are drawn using
the same sequence of random numbers $\varepsilon_i$ so that they are directly comparable. The straight line
represents the deterministic trend of the process, which corresponds to imposing the condition $\sigma = 0$
and hence ignoring the uncertainty. The parameter $\mu$ determines the trend, and the parameter $\sigma$
determines the size of the fluctuations around the trend.
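The simulation recipe above translates directly into code. The following sketch mimics the setup of Figure 2.6 (drift 0.2, volatilities 0.5 and 1.0, 200 subintervals on the unit interval, one common sequence of shocks); the function name `simulate_generalized_bm`, the seed, and the equidistant time grid are assumptions made for illustration.

```python
import math
import random

def simulate_generalized_bm(x0, mu, sigma, time_grid, shocks):
    """Iterate X_{t_i} = X_{t_{i-1}} + mu * dt + shock_i * sigma * sqrt(dt)
    along the given time grid, with dt = t_i - t_{i-1}."""
    path = [x0]
    for i in range(1, len(time_grid)):
        dt = time_grid[i] - time_grid[i - 1]
        path.append(path[-1] + mu * dt + shocks[i - 1] * sigma * math.sqrt(dt))
    return path

n = 200
time_grid = [i / n for i in range(n + 1)]
rng = random.Random(2009)
shocks = [rng.gauss(0.0, 1.0) for _ in range(n)]  # shared by all paths

trend = simulate_generalized_bm(0.0, 0.2, 0.0, time_grid, shocks)
path_low = simulate_generalized_bm(0.0, 0.2, 0.5, time_grid, shocks)
path_high = simulate_generalized_bm(0.0, 0.2, 1.0, time_grid, shocks)

print(trend[-1], path_low[-1], path_high[-1])
```

Because the same shocks are reused, the deviation of the high-volatility path from the trend is exactly twice that of the low-volatility path at every time point, which is what makes the two paths directly comparable.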
If the parameters $\mu$ and $\sigma$ are allowed to be time-varying in a deterministic way, the process
$X$ is said to be a time-inhomogeneous generalized Brownian motion. In differential terms such a
process can be written as
$$dX_t = \mu(t)\, dt + \sigma(t)\, dz_t. \tag{2.4}$$
Over a very short interval $[t, t + \Delta t]$ the expected change is approximately $\mu(t)\Delta t$, and the variance
of the change is approximately $\sigma(t)^2 \Delta t$. More precisely, the increment over any interval $[t, t']$ is
given by
$$X_{t'} - X_t = \int_t^{t'} \mu(u)\, du + \int_t^{t'} \sigma(u)\, dz_u. \tag{2.5}$$
The last integral is a so-called stochastic integral, which we will define (although not rigorously)
and describe in a later section. There we will also state a theorem, which implies that, seen from
time $t$, the integral $\int_t^{t'} \sigma(u)\, dz_u$ is a normally distributed random variable with mean zero and
variance $\int_t^{t'} \sigma(u)^2\, du$.
2.6.2 Diffusion processes
For both standard Brownian motions and generalized Brownian motions, the future value is normally
distributed and can therefore take on any real value, i.e. the value space is equal to $\mathbb{R}$.
Many economic variables can only have values in a certain subset of $\mathbb{R}$. For example, prices of
financial assets with limited liability are non-negative. The evolution in such variables cannot be
well represented by the stochastic processes studied so far. In many situations we will instead use
so-called diffusion processes.
A (one-dimensional) diffusion process is a stochastic process $X = (X_t)_{t \geq 0}$ for which the change
over an infinitesimally short time interval $[t, t + dt]$ can be written as
$$dX_t = \mu(X_t, t)\, dt + \sigma(X_t, t)\, dz_t, \tag{2.6}$$
where $z$ is a standard Brownian motion, but where the drift $\mu$ and the volatility $\sigma$ are now functions
of time and the current value of the process.[2] This expression generalizes (2.3), where $\mu$ and $\sigma$
were assumed to be constants, and (2.4), where $\mu$ and $\sigma$ were functions of time only. An equation
like (2.6), where the stochastic process enters both sides of the equality, is called a stochastic
differential equation. Hence, a diffusion process is a solution to a stochastic differential equation.
If both functions $\mu$ and $\sigma$ are independent of time, the diffusion is said to be time-homogeneous,
otherwise it is said to be time-inhomogeneous. For a time-homogeneous diffusion
process, the distribution of the future value will only depend on the current value of the process
and how far into the future we are looking, not on the particular point in time we are standing
at. For example, the distribution of $X_{t+\delta}$ given $X_t = x$ will only depend on $x$ and $\delta$, but not on $t$.
This is not the case for a time-inhomogeneous diffusion, where the distribution will also depend
on $t$.
In the expression (2.6) one may think of $dz_t$ as being $N(0, dt)$-distributed, so that the mean and variance of the change over an infinitesimally short interval $[t, t+dt]$ are given by
\[ \mathrm{E}_t[dX_t] = \mu(X_t, t)\,dt, \qquad \mathrm{Var}_t[dX_t] = \sigma(X_t, t)^2\,dt. \]
To be more precise, the change in a diffusion process over any interval $[t, t']$ is
\[ X_{t'} - X_t = \int_t^{t'} \mu(X_u, u)\,du + \int_t^{t'} \sigma(X_u, u)\,dz_u. \tag{2.7} \]
Here the integrand of the first integral $\int_t^{t'} \mu(X_u, u)\,du$ depends on the values $X_u$ for $u \in [t, t']$, which are generally unknown at time $t$. It is therefore natural to define the integral $\int_t^{t'} \mu(X_u, u)\,du$ as the random variable which in state $\omega$ has the value $\int_t^{t'} \mu(X_u(\omega), u)\,du$, which is now just the integration of a real-valued function of time. The other integral $\int_t^{t'} \sigma(X_u, u)\,dz_u$ is a so-called stochastic integral, which we will discuss in Section 2.6.5.
$^2$ For the process $X$ to be mathematically meaningful, the functions $\mu(x, t)$ and $\sigma(x, t)$ must satisfy certain conditions. See e.g. Øksendal (1998, Ch. 7) and Duffie (2001, App. E).
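We will often use the informal and intuitive differential notation (2.6). The drift rate $\mu(X_t, t)$ and the variance rate $\sigma(X_t, t)^2$ are really the limits
\[ \mu(X_t, t) = \lim_{\Delta t \to 0} \frac{\mathrm{E}_t[X_{t+\Delta t} - X_t]}{\Delta t}, \qquad \sigma(X_t, t)^2 = \lim_{\Delta t \to 0} \frac{\mathrm{Var}_t[X_{t+\Delta t} - X_t]}{\Delta t}. \]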
A diffusion process is a Markov process, as can be seen from (2.6), since both the drift and the volatility only depend on the current value of the process and not on previous values. A diffusion process is not a martingale, unless the drift $\mu(X_t, t)$ is zero for all $X_t$ and $t$. A diffusion process will have continuous, but nowhere differentiable sample paths. The value space for a diffusion process and the distribution of future values will depend on the functions $\mu$ and $\sigma$. In Section 2.6.7 we will give an example of a diffusion process often used in financial modeling, the so-called geometric Brownian motion. Other diffusion processes will be used in later chapters.
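The differential notation (2.6) also suggests a simple way of approximating a diffusion on a time grid: replace $dt$ by a small step and $dz_t$ by an $N(0, \Delta t)$ draw. The following is a minimal sketch of that idea (an Euler-type discretization, which is not described in the text itself). The function name `euler_path` and the mean-reverting example coefficients are illustrative assumptions only.

```python
import math
import random

def euler_path(mu, sigma, x0, t_end, n_steps, rng):
    """Approximate one path of dX = mu(X, t) dt + sigma(X, t) dz on [0, t_end]."""
    dt = t_end / n_steps
    x, t = x0, 0.0
    path = [x]
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment, N(0, dt)
        x = x + mu(x, t) * dt + sigma(x, t) * dz
        t += dt
        path.append(x)
    return path

# Hypothetical example: drift pulling X towards 1, constant volatility 0.2.
rng = random.Random(0)
path = euler_path(lambda x, t: 0.5 * (1.0 - x), lambda x, t: 0.2, 1.0, 1.0, 200, rng)
print(len(path), path[-1])
```

Because the drift and volatility are passed as functions of the current value and time only, the sketch also makes the Markov property visible: each step uses nothing but the current state.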
2.6.3 Itô processes
It is possible to define even more general continuous-path processes than those in the class of diffusion processes. A (one-dimensional) stochastic process $X_t$ is said to be an Itô process, if the local increments are of the form
\[ dX_t = \mu_t\,dt + \sigma_t\,dz_t, \tag{2.8} \]
where the drift $\mu$ and the volatility $\sigma$ themselves are stochastic processes. A diffusion process is the special case where the values of the drift $\mu_t$ and the volatility $\sigma_t$ are given as functions of $t$ and $X_t$. For a general Itô process, the drift and volatility may also depend on past values of the $X$ process and also on past and current values of other adapted processes. It follows that Itô processes are generally not Markov processes. They are generally not martingales either, unless $\mu_t$ is identically equal to zero (and $\sigma_t$ satisfies some technical conditions). The processes $\mu$ and $\sigma$ must satisfy certain regularity conditions for the $X$ process to be well-defined. We will refer the reader to Øksendal (1998, Ch. 4) for these conditions. The expression (2.8) gives an intuitive understanding of the evolution of an Itô process, but it is more precise to state the evolution in the integral form
\[ X_{t'} - X_t = \int_t^{t'} \mu_u\,du + \int_t^{t'} \sigma_u\,dz_u. \tag{2.9} \]
Again the first integral can be defined state-by-state and the second integral is a stochastic integral.
2.6.4 Jump processes
Above we have focused on processes having sample paths that are continuous functions of time, so that one can depict the evolution of the process by a continuous curve. Stochastic processes which have sample paths with discontinuities (jumps) also exist. The jumps of such processes are often modeled by Poisson processes or related processes. It is well-known that large, sudden movements in financial variables occur from time to time, for example in connection with stock market crashes. There may be many explanations of such large movements, for example a large unexpected change in the productivity in a particular industry or the economy in general, perhaps due to a technological breakthrough. Another source of sudden, large movements is changes in the political or economic environment such as unforeseen interventions by the government or central bank. Stock market crashes are sometimes explained by the bursting of a bubble (which does not necessarily conflict with the usual assumption of rational investors). Whether such sudden, large movements can be explained by a sequence of small continuous movements in the same direction or whether jumps have to be included in the models is an empirical question, which is still open. While jump processes may be relevant for many purposes, they are also more difficult to deal with than processes with continuous sample paths, so that it will probably be best to study models without jumps first. This book will only address continuous-path processes. An overview of financial models with jump processes is given by Cont and Tankov (2004).
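2.6.5 Stochastic integrals
In (2.7) and (2.9) and similar expressions a term of the form $\int_t^{t'} \sigma_u\,dz_u$ appears. An integral of this type is called a stochastic integral or an Itô integral. For given $t < t'$, the stochastic integral $\int_t^{t'} \sigma_u\,dz_u$ is a random variable. Assuming that $\sigma_u$ is known at time $u$, the value of the integral becomes known at time $t'$. If the integrand is a constant $\sigma$, we define the stochastic integral $\int_t^{t'} \sigma\,dz_u$ simply as
\[ \int_t^{t'} \sigma\,dz_u = \sigma\,(z_{t'} - z_t). \]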
We can easily generalize that to a piecewise constant integrand. Assume that there are points in time $t = t_0 < t_1 < \dots < t_n = t'$, so that $\sigma_u$ is constant on each subinterval $[t_i, t_{i+1})$. The stochastic integral is then defined by
\[ \int_t^{t'} \sigma_u\,dz_u = \sum_{i=0}^{n-1} \sigma_{t_i} \left( z_{t_{i+1}} - z_{t_i} \right). \tag{2.10} \]
If the integrand process $\sigma$ is not piecewise constant, there will exist a sequence of piecewise constant processes $\sigma^{(1)}, \sigma^{(2)}, \dots$, which converges to $\sigma$. For each of the processes $\sigma^{(m)}$, the integral $\int_t^{t'} \sigma^{(m)}_u\,dz_u$ is defined as above. The integral $\int_t^{t'} \sigma_u\,dz_u$ is then defined as a limit of the integrals of the approximating processes:
\[ \int_t^{t'} \sigma_u\,dz_u = \lim_{m \to \infty} \int_t^{t'} \sigma^{(m)}_u\,dz_u. \tag{2.11} \]
We will not discuss exactly how this limit is to be understood and which integrand processes we can allow. Again the interested reader is referred to Øksendal (1998). The distribution of the integral $\int_t^{t'} \sigma_u\,dz_u$ will, of course, depend on the integrand process and can generally not be completely characterized, but the following theorem gives the mean and the variance of the integral:
Theorem 2.2 The stochastic integral $\int_t^{t'} \sigma_u\,dz_u$ has the following properties:
\[ \mathrm{E}_t\left[ \int_t^{t'} \sigma_u\,dz_u \right] = 0, \qquad \mathrm{Var}_t\left[ \int_t^{t'} \sigma_u\,dz_u \right] = \int_t^{t'} \mathrm{E}_t[\sigma_u^2]\,du. \]
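Proof: Suppose that $\sigma$ is piecewise constant and divide the interval $[t, t']$ into subintervals defined by the time points $t = t_0 < t_1 < \dots < t_n = t'$ so that $\sigma$ is constant on each subinterval. Then
\[ \mathrm{E}_t\left[ \int_t^{t'} \sigma_u\,dz_u \right] = \sum_{i=0}^{n-1} \mathrm{E}_t\left[ \sigma_{t_i} \left( z_{t_{i+1}} - z_{t_i} \right) \right] = \sum_{i=0}^{n-1} \mathrm{E}_t\left[ \sigma_{t_i}\, \mathrm{E}_{t_i}\left[ z_{t_{i+1}} - z_{t_i} \right] \right] = 0, \]
using the Law of Iterated Expectations. For the variance we have
\[ \mathrm{Var}_t\left[ \int_t^{t'} \sigma_u\,dz_u \right] = \mathrm{E}_t\left[ \left( \int_t^{t'} \sigma_u\,dz_u \right)^2 \right] - \left( \mathrm{E}_t\left[ \int_t^{t'} \sigma_u\,dz_u \right] \right)^2 = \mathrm{E}_t\left[ \left( \int_t^{t'} \sigma_u\,dz_u \right)^2 \right] \]
and
\[ \mathrm{E}_t\left[ \left( \int_t^{t'} \sigma_u\,dz_u \right)^2 \right] = \mathrm{E}_t\left[ \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \sigma_{t_i} \sigma_{t_j} (z_{t_{i+1}} - z_{t_i})(z_{t_{j+1}} - z_{t_j}) \right] = \sum_{i=0}^{n-1} \mathrm{E}_t\left[ \sigma_{t_i}^2 (z_{t_{i+1}} - z_{t_i})^2 \right] = \sum_{i=0}^{n-1} \mathrm{E}_t\left[ \sigma_{t_i}^2 \right] (t_{i+1} - t_i) = \int_t^{t'} \mathrm{E}_t[\sigma_u^2]\,du. \]
Here the cross terms with $i \neq j$ vanish by the Law of Iterated Expectations, since the later of the two Brownian increments has zero mean conditional on everything known at its starting time. If $\sigma$ is not piecewise constant, we can approximate it by a piecewise constant process and take appropriate limits. □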
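Theorem 2.2 can be checked numerically for a simple stochastic integrand. The sketch below takes $\sigma_u = z_u$ on $[0, 1]$ (an illustrative choice, not one made in the text), approximates the integral by the left-endpoint sums of (2.10), and compares the sample mean and variance with the theoretical values $0$ and $\int_0^1 \mathrm{E}[z_u^2]\,du = \int_0^1 u\,du = 1/2$. The function name, the number of steps, and the number of paths are assumptions for illustration.

```python
import math
import random

def ito_integral_of_z(n_steps, rng, t_end=1.0):
    """Left-endpoint sum approximating the integral of z_u dz_u over [0, t_end]."""
    dt = t_end / n_steps
    z, total = 0.0, 0.0
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))
        total += z * dz   # integrand evaluated at the left endpoint t_i
        z += dz
    return total

rng = random.Random(42)
samples = [ito_integral_of_z(50, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
print(f"sample mean {mean:.4f}, sample variance {var:.4f}")
```

With a finite grid the sample variance targets $\sum_i t_i\,\Delta t$, which is slightly below $1/2$ and approaches it as the grid is refined.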
If the integrand is a deterministic function of time, $\sigma(u)$, the integral will be normally distributed, so that the following result holds:
Theorem 2.3 If $z$ is a Brownian motion, and $\sigma(u)$ is a deterministic function of time, the random variable $\int_t^{t'} \sigma(u)\,dz_u$ is normally distributed with mean zero and variance $\int_t^{t'} \sigma(u)^2\,du$.
Proof: We present a sketch of the proof. Dividing the interval $[t, t']$ into subintervals defined by the time points $t = t_0 < t_1 < \dots < t_n = t'$, we can approximate the integral with the sum
\[ \int_t^{t'} \sigma(u)\,dz_u \approx \sum_{i=0}^{n-1} \sigma(t_i) \left( z_{t_{i+1}} - z_{t_i} \right). \]
The increment of the Brownian motion over any subinterval is normally distributed with mean zero and a variance equal to the length of the subinterval. Furthermore, the different terms in the sum are mutually independent. It is well-known that a sum of independent normally distributed random variables is itself normally distributed, and that the mean of the sum is equal to the sum of the means, which in the present case yields zero. Due to the independence of the terms in the sum, the variance of the sum is also equal to the sum of the variances, i.e.
\[ \mathrm{Var}_t\left[ \sum_{i=0}^{n-1} \sigma(t_i) \left( z_{t_{i+1}} - z_{t_i} \right) \right] = \sum_{i=0}^{n-1} \sigma(t_i)^2\, \mathrm{Var}_t\left[ z_{t_{i+1}} - z_{t_i} \right] = \sum_{i=0}^{n-1} \sigma(t_i)^2 (t_{i+1} - t_i), \]
which is an approximation of the integral $\int_t^{t'} \sigma(u)^2\,du$. The result now follows from an appropriate limit where the subintervals shrink to zero length. □
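As a small worked illustration of Theorem 2.3, take the deterministic integrand $\sigma(u) = u$ (chosen here purely for illustration). The variance integral can then be computed in closed form:
\[ \int_t^{t'} u\,dz_u \sim N\left( 0, \int_t^{t'} u^2\,du \right) = N\left( 0, \frac{(t')^3 - t^3}{3} \right). \]
In particular, $\int_0^1 u\,dz_u$ is normally distributed with mean zero and variance $1/3$.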
Note that the process $y = (y_t)_{t \geq 0}$ defined by $y_t = \int_0^t \sigma_u\,dz_u$ is a martingale, since
\[ \mathrm{E}_t[y_{t'}] = \mathrm{E}_t\left[ \int_0^{t'} \sigma_u\,dz_u \right] = \mathrm{E}_t\left[ \int_0^{t} \sigma_u\,dz_u + \int_t^{t'} \sigma_u\,dz_u \right] = \mathrm{E}_t\left[ \int_0^{t} \sigma_u\,dz_u \right] + \mathrm{E}_t\left[ \int_t^{t'} \sigma_u\,dz_u \right] = \int_0^{t} \sigma_u\,dz_u = y_t, \]
so that the expected future value is equal to the current value. More generally, $y_t = y_0 + \int_0^t \sigma_u\,dz_u$, for some constant $y_0$, is a martingale. The converse is also true in the sense that any martingale can be expressed as a stochastic integral. This is the so-called martingale representation theorem:
Theorem 2.4 Suppose the process $M = (M_t)$ is a martingale with respect to a probability measure under which $z = (z_t)$ is a standard Brownian motion. Then a unique adapted process $\theta = (\theta_t)$ exists such that
\[ M_t = M_0 + \int_0^t \theta_u\,dz_u \]
for all $t$.
For a mathematically more precise statement of the result and a proof, see Øksendal (1998, Thm. 4.3.4).
Now that the stochastic integral with respect to the standard Brownian motion has been defined, we can also define stochastic integrals with respect to other stochastic processes. For example, if $X_t$ is a diffusion given by $dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dz_t$ and $\alpha = (\alpha_t)_{t \in [0,T]}$ is a sufficiently nice stochastic process, we can define
\[ \int_0^t \alpha_u\,dX_u = \int_0^t \alpha_u\, \mu(X_u, u)\,du + \int_0^t \alpha_u\, \sigma(X_u, u)\,dz_u. \]
2.6.6 Itô's Lemma
In continuous-time models a stochastic process for the dynamics of some basic quantity is often taken as given, while other quantities of interest can be shown to be functions of that basic variable. To determine the dynamics of these other variables, we shall apply Itô's Lemma, which is basically the chain rule for stochastic processes. We will state the result for a function of a general Itô process, although we will frequently apply the result for the special case of a function of a diffusion process.
Theorem 2.5 Let $X = (X_t)_{t \geq 0}$ be a real-valued Itô process with dynamics
\[ dX_t = \mu_t\,dt + \sigma_t\,dz_t, \]
where $\mu$ and $\sigma$ are real-valued processes, and $z$ is a one-dimensional standard Brownian motion. Let $g(X, t)$ be a real-valued function which is two times continuously differentiable in $X$ and continuously differentiable in $t$. Then the process $y = (y_t)_{t \geq 0}$ defined by
\[ y_t = g(X_t, t) \]
is an Itô process with dynamics
\[ dy_t = \left( \frac{\partial g}{\partial t}(X_t, t) + \frac{\partial g}{\partial X}(X_t, t)\,\mu_t + \frac{1}{2} \frac{\partial^2 g}{\partial X^2}(X_t, t)\,\sigma_t^2 \right) dt + \frac{\partial g}{\partial X}(X_t, t)\,\sigma_t\,dz_t. \tag{2.12} \]
The proof of Itô's Lemma is based on a Taylor expansion of $g(X_t, t)$ combined with appropriate limits, but a formal proof is beyond the scope of this presentation. Once again, we refer to Øksendal (1998) and similar textbooks. The result can also be written in the following way, which may be easier to remember:
\[ dy_t = \frac{\partial g}{\partial t}(X_t, t)\,dt + \frac{\partial g}{\partial X}(X_t, t)\,dX_t + \frac{1}{2} \frac{\partial^2 g}{\partial X^2}(X_t, t)\,(dX_t)^2. \tag{2.13} \]
Here, in the computation of $(dX_t)^2$, one must apply the rules $(dt)^2 = dt \cdot dz_t = 0$ and $(dz_t)^2 = dt$, so that
\[ (dX_t)^2 = (\mu_t\,dt + \sigma_t\,dz_t)^2 = \mu_t^2 (dt)^2 + 2 \mu_t \sigma_t\,dt \cdot dz_t + \sigma_t^2 (dz_t)^2 = \sigma_t^2\,dt. \]
The intuition behind these rules is as follows: When $dt$ is close to zero, $(dt)^2$ is far less than $dt$ and can therefore be ignored. Since $dz_t \sim N(0, dt)$, we get $\mathrm{E}[dt \cdot dz_t] = dt \cdot \mathrm{E}[dz_t] = 0$ and $\mathrm{Var}[dt \cdot dz_t] = (dt)^2\, \mathrm{Var}[dz_t] = (dt)^3$, which is also very small compared to $dt$ and is therefore ignorable. Finally, we have $\mathrm{E}[(dz_t)^2] = \mathrm{Var}[dz_t] + (\mathrm{E}[dz_t])^2 = dt$, and it can be shown that$^3$ $\mathrm{Var}[(dz_t)^2] = 2 (dt)^2$. For $dt$ close to zero, the variance is therefore much less than the mean, so $(dz_t)^2$ can be approximated by its mean $dt$.
In standard mathematics, the differential of a function $y = g(t, X)$ where $t$ and $X$ are real variables is defined as $dy = \frac{\partial g}{\partial t}\,dt + \frac{\partial g}{\partial X}\,dX$. When $X$ is an Itô process, (2.13) shows that we have to add a second-order term.
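A short worked example may help to see the second-order term at work. Take $X_t = z_t$ (so that $\mu_t = 0$ and $\sigma_t = 1$) and $g(X, t) = X^2$; this particular choice is ours, for illustration. Then $\partial g / \partial t = 0$, $\partial g / \partial X = 2X$, and $\partial^2 g / \partial X^2 = 2$, so Itô's Lemma gives
\[ d(z_t^2) = \left( 0 + 2 z_t \cdot 0 + \tfrac{1}{2} \cdot 2 \cdot 1^2 \right) dt + 2 z_t \cdot 1\,dz_t = dt + 2 z_t\,dz_t, \]
or, in integral form with $z_0 = 0$,
\[ z_t^2 = t + 2 \int_0^t z_u\,dz_u. \]
The ordinary chain rule would give only $2 z_t\,dz_t$; the extra $dt$ is exactly the second-order term. Taking expectations and using that the stochastic integral has mean zero recovers $\mathrm{E}[z_t^2] = t$.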
2.6.7 The geometric Brownian motion
The geometric Brownian motion is an important example of a diffusion process. A stochastic process $X = (X_t)_{t \geq 0}$ is said to be a geometric Brownian motion if it is a solution to the stochastic differential equation
\[ dX_t = \mu X_t\,dt + \sigma X_t\,dz_t, \tag{2.14} \]
where $\mu$ and $\sigma$ are constants. The initial value for the process is assumed to be positive, $X_0 > 0$. A geometric Brownian motion is the particular diffusion process that is obtained from (2.6) by inserting $\mu(X_t, t) = \mu X_t$ and $\sigma(X_t, t) = \sigma X_t$.
The expression (2.14) can be rewritten as
\[ \frac{dX_t}{X_t} = \mu\,dt + \sigma\,dz_t, \]
which is the relative (percentage) change in the value of the process over the next infinitesimally short time interval $[t, t+dt]$. If $X_t$ is the price of a traded asset, then $dX_t / X_t$ is the rate of return on the asset over the next instant. The constant $\mu$ is the expected rate of return per period, while $\sigma$ is the standard deviation of the rate of return per period. In this context it is often $\mu$ which is called the drift (rather than $\mu X_t$) and $\sigma$ which is called the volatility (rather than $\sigma X_t$). Strictly speaking, one must distinguish between the relative drift and volatility ($\mu$ and $\sigma$, respectively) and the absolute drift and volatility ($\mu X_t$ and $\sigma X_t$, respectively). An asset with a constant expected rate of return and a constant relative volatility has a price that follows a geometric Brownian motion. For example, such an assumption is used for the stock price in the famous Black-Scholes-Merton model for stock option pricing, cf. Chapter 12. In the framework of consumption-based capital asset pricing models it is often assumed that the aggregate consumption in the economy follows a geometric Brownian motion, cf. Chapter 8.
$^3$ This is based on the computation $\mathrm{Var}[(z_{t+\Delta t} - z_t)^2] = \mathrm{E}[(z_{t+\Delta t} - z_t)^4] - \left( \mathrm{E}[(z_{t+\Delta t} - z_t)^2] \right)^2 = 3 (\Delta t)^2 - (\Delta t)^2 = 2 (\Delta t)^2$ and a passage to the limit.
Next, we will find an explicit expression for $X_t$, i.e. we will find a solution to the stochastic differential equation (2.14). We can then also determine the distribution of the future value of the process. We apply Itô's Lemma with the function $g(x, t) = \ln x$ and define the process $y_t = g(X_t, t) = \ln X_t$. Since
\[ \frac{\partial g}{\partial t}(X_t, t) = 0, \qquad \frac{\partial g}{\partial x}(X_t, t) = \frac{1}{X_t}, \qquad \frac{\partial^2 g}{\partial x^2}(X_t, t) = -\frac{1}{X_t^2}, \]
we get from Theorem 2.5 that
\[ dy_t = \left( 0 + \frac{1}{X_t}\, \mu X_t - \frac{1}{2} \frac{1}{X_t^2}\, \sigma^2 X_t^2 \right) dt + \frac{1}{X_t}\, \sigma X_t\,dz_t = \left( \mu - \frac{1}{2} \sigma^2 \right) dt + \sigma\,dz_t. \]
Hence, the process $y_t = \ln X_t$ is a generalized Brownian motion. In particular, we have
\[ y_{t'} - y_t = \left( \mu - \frac{1}{2} \sigma^2 \right) (t' - t) + \sigma (z_{t'} - z_t), \]
which implies that
\[ \ln X_{t'} = \ln X_t + \left( \mu - \frac{1}{2} \sigma^2 \right) (t' - t) + \sigma (z_{t'} - z_t). \]
Taking exponentials on both sides, we get
\[ X_{t'} = X_t \exp\left\{ \left( \mu - \frac{1}{2} \sigma^2 \right) (t' - t) + \sigma (z_{t'} - z_t) \right\}. \tag{2.15} \]
This is true for all $t' > t \geq 0$. In particular,
\[ X_t = X_0 \exp\left\{ \left( \mu - \frac{1}{2} \sigma^2 \right) t + \sigma z_t \right\}. \]
Since exponentials are always positive, we see that $X_t$ can only have positive values, so that the value space of a geometric Brownian motion is $S = (0, \infty)$.
Suppose now that we stand at time $t$ and have observed the current value $X_t$ of a geometric Brownian motion. Which probability distribution is then appropriate for the uncertain future value, say at time $t'$? Since $z_{t'} - z_t \sim N(0, t' - t)$, we see from (2.15) that the future value $X_{t'}$ (conditional on $X_t$) will be lognormally distributed. The probability density function for $X_{t'}$ (given $X_t$) is given by
\[ f(x) = \frac{1}{x \sqrt{2 \pi \sigma^2 (t' - t)}} \exp\left\{ -\frac{1}{2 \sigma^2 (t' - t)} \left( \ln\left( \frac{x}{X_t} \right) - \left( \mu - \frac{1}{2} \sigma^2 \right) (t' - t) \right)^2 \right\}, \qquad x > 0, \]
and the mean and variance are
\[ \mathrm{E}_t[X_{t'}] = X_t\, e^{\mu (t' - t)}, \qquad \mathrm{Var}_t[X_{t'}] = X_t^2\, e^{2 \mu (t' - t)} \left[ e^{\sigma^2 (t' - t)} - 1 \right], \]
cf. Appendix B.
[Figure 2.7 omitted: two simulated paths, labeled sigma = 0.2 and sigma = 0.5, plotted against time from 0 to 1.]
Figure 2.7: Simulation of a geometric Brownian motion with initial value $X_0 = 100$, relative drift rate $\mu = 0.1$, and a relative volatility of $\sigma = 0.2$ and $\sigma = 0.5$, respectively. The smooth curve shows the trend corresponding to $\sigma = 0$. The simulations are based on 200 subintervals of equal length, and the same sequence of random numbers has been used for the two $\sigma$-values.
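Paths can be simulated by recursively computing either
\[ X_{t_i} = X_{t_{i-1}} + \mu X_{t_{i-1}} (t_i - t_{i-1}) + \sigma X_{t_{i-1}}\, \varepsilon_i \sqrt{t_i - t_{i-1}} \]
or, more accurately,
\[ X_{t_i} = X_{t_{i-1}} \exp\left\{ \left( \mu - \frac{1}{2} \sigma^2 \right) (t_i - t_{i-1}) + \sigma\, \varepsilon_i \sqrt{t_i - t_{i-1}} \right\}. \]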
Figure 2.7 shows a single simulated sample path for $\sigma = 0.2$ and a sample path for $\sigma = 0.5$. For both sample paths we have used $\mu = 0.1$ and $X_0 = 100$, and the same sequence of random numbers.
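The two recursions can be sketched in a few lines of code. The block below is an illustrative implementation, not the code used to produce Figure 2.7: the function name `simulate_gbm`, the seed, and the use of standard normal draws for the shocks are assumptions. It uses the parameter values quoted for the figure and reuses one shock sequence for both volatilities.

```python
import math
import random

def simulate_gbm(x0, mu, sigma, t_end, shocks, exact=True):
    """Simulate a geometric Brownian motion on an equally spaced grid.

    shocks: standard normal draws, one per subinterval.
    exact=True uses the exponential recursion, exact=False the additive one.
    """
    dt = t_end / len(shocks)
    path = [x0]
    for eps in shocks:
        x = path[-1]
        if exact:
            x = x * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * eps * math.sqrt(dt))
        else:
            x = x + mu * x * dt + sigma * x * eps * math.sqrt(dt)
        path.append(x)
    return path

rng = random.Random(2009)  # arbitrary seed, chosen for reproducibility
shocks = [rng.gauss(0.0, 1.0) for _ in range(200)]
low = simulate_gbm(100.0, 0.1, 0.2, 1.0, shocks)
high = simulate_gbm(100.0, 0.1, 0.5, 1.0, shocks)
trend = simulate_gbm(100.0, 0.1, 0.0, 1.0, shocks)  # sigma = 0 gives the smooth trend
print(low[-1], high[-1], trend[-1])
```

The exponential recursion reproduces the exact distribution of the process at the grid points and can never produce a negative value, whereas the additive recursion is only an approximation and can in principle turn negative after a large negative shock.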
We will consider other specific diffusions in later chapters, when we need them. For example, we shall use the Ornstein-Uhlenbeck process defined by
\[ dX_t = \kappa (\theta - X_t)\,dt + \sigma\,dz_t, \]
which is the continuous-time equivalent of the discrete-time AR(1) process, and the square root process defined by
\[ dX_t = \kappa (\theta - X_t)\,dt + \sigma \sqrt{X_t}\,dz_t. \]
Such processes are used, among other things, to model the dynamics of interest rates.
2.7 Multidimensional processes
So far we have only considered one-dimensional processes, i.e. processes with a value space which is $\mathbb{R}$ or a subset of $\mathbb{R}$. In most asset pricing models we need to keep track of several processes, e.g. dividend and price processes for different assets, and we will often be interested in covariances and correlations between different processes.
If the exogenous shocks in the model are one-dimensional, then increments over the smallest time interval considered in the model will be perfectly correlated. In a discrete-time model where the exogenous shocks $\varepsilon_1, \dots, \varepsilon_T$ are one-dimensional, changes in any two processes between two subsequent points in time, say $t$ and $t+1$, will be perfectly correlated. For example, suppose $X$ and $Y$ are two general processes defined by
\[ X_{t+1} = \mu_{Xt} + \sigma_{Xt}\, \varepsilon_{t+1}, \qquad Y_{t+1} = \mu_{Yt} + \sigma_{Yt}\, \varepsilon_{t+1}, \]
where $\varepsilon_{t+1}$ has a mean of zero and a variance of one. $\mu_{Xt}$, $\mu_{Yt}$, $\sigma_{Xt}$, and $\sigma_{Yt}$ are known at time $t$ and reflect the conditional means and standard deviations of the increments to the processes over the next period. By using the properties of covariances and variances we get
\[ \mathrm{Cov}_t[X_{t+1}, Y_{t+1}] = \sigma_{Xt} \sigma_{Yt}\, \mathrm{Var}_t[\varepsilon_{t+1}] = \sigma_{Xt} \sigma_{Yt}, \]
\[ \mathrm{Var}_t[X_{t+1}] = \sigma_{Xt}^2, \qquad \mathrm{Var}_t[Y_{t+1}] = \sigma_{Yt}^2, \]
\[ \mathrm{Corr}_t[X_{t+1}, Y_{t+1}] = 1. \]
Increments in two processes over more than one subperiod are generally not perfectly correlated even with a one-dimensional shock.
In a continuous-time model where the exogenous shock process $z = (z_t)_{t \in [0,T]}$ is one-dimensional, the instantaneous increments of any two processes will be perfectly correlated. For example, if we consider the two Itô processes $X$ and $Y$ defined by
\[ dX_t = \mu_{Xt}\,dt + \sigma_{Xt}\,dz_t, \qquad dY_t = \mu_{Yt}\,dt + \sigma_{Yt}\,dz_t, \]
then $\mathrm{Cov}_t[dX_t, dY_t] = \sigma_{Xt} \sigma_{Yt}\,dt$ so that the instantaneous correlation becomes
\[ \mathrm{Corr}_t[dX_t, dY_t] = \frac{\mathrm{Cov}_t[dX_t, dY_t]}{\sqrt{\mathrm{Var}_t[dX_t]\, \mathrm{Var}_t[dY_t]}} = \frac{\sigma_{Xt} \sigma_{Yt}\,dt}{\sqrt{\sigma_{Xt}^2\,dt\; \sigma_{Yt}^2\,dt}} = 1. \]
Increments over any non-infinitesimal time interval are generally not perfectly correlated, i.e. for any $h > 0$ a correlation like $\mathrm{Corr}_t[X_{t+h} - X_t, Y_{t+h} - Y_t]$ is typically different from one but close to one for small $h$.
To obtain non-perfectly correlated changes over the shortest time period considered by the model we need an exogenous shock of a dimension higher than one, i.e. a shock vector. One can without loss of generality assume that the different components of this shock vector are mutually independent and generate non-perfect correlations between the relevant processes by varying the sensitivities of those processes towards the different exogenous shocks. We will first consider the case of two processes and later generalize.
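2.7.1 Two-dimensional processes
In the discrete-time example above, we can avoid the perfect correlation between $X_{t+1}$ and $Y_{t+1}$ by introducing a second shock. Suppose that
\[ X_{t+1} = \mu_{Xt} + \sigma_{X1t}\, \varepsilon_{1,t+1} + \sigma_{X2t}\, \varepsilon_{2,t+1} \]
and
\[ Y_{t+1} = \mu_{Yt} + \sigma_{Y1t}\, \varepsilon_{1,t+1} + \sigma_{Y2t}\, \varepsilon_{2,t+1}, \]
where $\varepsilon_{1,t+1}$ and $\varepsilon_{2,t+1}$ are independent and have zero mean and unit variance. Now the covariance is given by
\[ \mathrm{Cov}_t[X_{t+1}, Y_{t+1}] = \sigma_{X1t} \sigma_{Y1t} + \sigma_{X2t} \sigma_{Y2t}, \]
while the variances are
\[ \mathrm{Var}_t[X_{t+1}] = \sigma_{X1t}^2 + \sigma_{X2t}^2, \qquad \mathrm{Var}_t[Y_{t+1}] = \sigma_{Y1t}^2 + \sigma_{Y2t}^2. \]
The correlation is thus
\[ \mathrm{Corr}_t[X_{t+1}, Y_{t+1}] = \frac{\sigma_{X1t} \sigma_{Y1t} + \sigma_{X2t} \sigma_{Y2t}}{\sqrt{(\sigma_{X1t}^2 + \sigma_{X2t}^2)(\sigma_{Y1t}^2 + \sigma_{Y2t}^2)}}. \]
Since
\[ \sigma_{X1t}^2 \sigma_{Y2t}^2 + \sigma_{X2t}^2 \sigma_{Y1t}^2 - 2 \sigma_{X1t} \sigma_{Y1t} \sigma_{X2t} \sigma_{Y2t} = (\sigma_{X1t} \sigma_{Y2t} - \sigma_{X2t} \sigma_{Y1t})^2 \geq 0, \]
it follows that
\[ (\sigma_{X1t} \sigma_{Y1t} + \sigma_{X2t} \sigma_{Y2t})^2 \leq (\sigma_{X1t}^2 + \sigma_{X2t}^2)(\sigma_{Y1t}^2 + \sigma_{Y2t}^2), \]
and therefore the squared correlation is generally smaller than one. The correlation will equal $-1$ or $+1$ only if $\sigma_{X1t} \sigma_{Y2t} = \sigma_{X2t} \sigma_{Y1t}$, i.e. only if the two processes load on the two shocks in the same proportions (for example when $\sigma_{X1t} = \sigma_{Y1t} = 0$ or $\sigma_{X2t} = \sigma_{Y2t} = 0$), so that we are effectively back to the single shock case.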
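The correlation formula is easy to evaluate for concrete shock sensitivities. The following sketch (the function name and the numerical sensitivities are made up for illustration) computes the conditional correlation from the two sensitivity vectors. The second call uses proportional sensitivities, for which the inequality above holds with equality and the squared correlation reaches one.

```python
import math

def shock_correlation(sx, sy):
    """Conditional correlation of two increments given their shock sensitivities."""
    cov = sum(a * b for a, b in zip(sx, sy))
    var_x = sum(a * a for a in sx)
    var_y = sum(b * b for b in sy)
    return cov / math.sqrt(var_x * var_y)

rho_general = shock_correlation((1.0, 0.0), (0.6, 0.8))       # different directions
rho_proportional = shock_correlation((1.0, 1.0), (2.0, 2.0))  # same direction
rho_opposite = shock_correlation((1.0, 1.0), (-2.0, -2.0))    # opposite direction
print(rho_general, rho_proportional, rho_opposite)
```

The same function works unchanged for more than two shocks, since it only uses inner products of the sensitivity vectors.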
Similarly, in the continuous-time setting we add a standard Brownian motion so that
\[ dX_t = \mu_{Xt}\,dt + \sigma_{X1t}\,dz_{1t} + \sigma_{X2t}\,dz_{2t}, \qquad dY_t = \mu_{Yt}\,dt + \sigma_{Y1t}\,dz_{1t} + \sigma_{Y2t}\,dz_{2t}, \]
where $z_1 = (z_{1t})$ and $z_2 = (z_{2t})$ are independent standard Brownian motions. This generates an instantaneous covariance of $\mathrm{Cov}_t[dX_t, dY_t] = (\sigma_{X1t} \sigma_{Y1t} + \sigma_{X2t} \sigma_{Y2t})\,dt$, instantaneous variances of $\mathrm{Var}_t[dX_t] = \left( \sigma_{X1t}^2 + \sigma_{X2t}^2 \right) dt$ and $\mathrm{Var}_t[dY_t] = \left( \sigma_{Y1t}^2 + \sigma_{Y2t}^2 \right) dt$, and thus an instantaneous correlation of
\[ \mathrm{Corr}_t[dX_t, dY_t] = \frac{\sigma_{X1t} \sigma_{Y1t} + \sigma_{X2t} \sigma_{Y2t}}{\sqrt{(\sigma_{X1t}^2 + \sigma_{X2t}^2)(\sigma_{Y1t}^2 + \sigma_{Y2t}^2)}}, \]
which again can be anywhere in the interval $[-1, +1]$.
The shock coefficients $\sigma_{X1t}$, $\sigma_{X2t}$, $\sigma_{Y1t}$, and $\sigma_{Y2t}$ determine the two instantaneous variances and the instantaneous correlation. But many combinations of the four shock coefficients will give rise to the same variances and correlation. We have one degree of freedom in fixing the shock coefficients. For example, we can put $\sigma_{X2t} \equiv 0$, which has the nice implication that it will simplify various expressions and interpretations. If we thus write the dynamics of $X$ and $Y$ as
\[ dX_t = \mu_{Xt}\,dt + \sigma_{Xt}\,dz_{1t}, \qquad dY_t = \mu_{Yt}\,dt + \sigma_{Yt} \left[ \rho_t\,dz_{1t} + \sqrt{1 - \rho_t^2}\,dz_{2t} \right], \]
$\sigma_{Xt}^2$ and $\sigma_{Yt}^2$ are the variance rates of $X_t$ and $Y_t$, respectively, while the covariance is $\mathrm{Cov}_t[dX_t, dY_t] = \rho_t \sigma_{Xt} \sigma_{Yt}\,dt$. If $\sigma_{Xt}$ and $\sigma_{Yt}$ are both positive, then $\rho_t$ will be the instantaneous correlation between the two processes $X$ and $Y$.
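This parametrization is also the standard recipe for generating correlated Brownian increments from independent ones. The sketch below (function names, seed, and the value $\rho = 0.7$ are illustrative assumptions) draws two independent $N(0, dt)$ increments, combines them with loadings $\rho$ and $\sqrt{1 - \rho^2}$, and estimates the resulting correlation from the sample.

```python
import math
import random

def correlated_increments(rho, dt, n, rng):
    """Draw n pairs (dx, dy) of Brownian increments with correlation rho."""
    pairs = []
    for _ in range(n):
        dz1 = rng.gauss(0.0, math.sqrt(dt))
        dz2 = rng.gauss(0.0, math.sqrt(dt))
        dx = dz1
        dy = rho * dz1 + math.sqrt(1.0 - rho * rho) * dz2
        pairs.append((dx, dy))
    return pairs

def sample_correlation(pairs):
    """Plain sample correlation coefficient of a list of pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)

pairs = correlated_increments(0.7, 0.01, 20000, random.Random(7))
estimate = sample_correlation(pairs)
print(estimate)
```

Both increments keep variance $dt$, since $\rho^2 + (1 - \rho^2) = 1$, so only the dependence between them is changed.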
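In many continuous-time models, one stochastic process is defined in terms of a function of two other, not necessarily perfectly correlated, stochastic processes. For that purpose we need the following two-dimensional version of Itô's Lemma.
Theorem 2.6 Suppose $X = (X_t)$ and $Y = (Y_t)$ are two stochastic processes with dynamics
\[ dX_t = \mu_{Xt}\,dt + \sigma_{X1t}\,dz_{1t} + \sigma_{X2t}\,dz_{2t}, \qquad dY_t = \mu_{Yt}\,dt + \sigma_{Y1t}\,dz_{1t} + \sigma_{Y2t}\,dz_{2t}, \tag{2.16} \]
where $z_1 = (z_{1t})$ and $z_2 = (z_{2t})$ are independent standard Brownian motions. Let $g(X, Y, t)$ be a real-valued function for which all the derivatives $\frac{\partial g}{\partial t}$, $\frac{\partial g}{\partial X}$, $\frac{\partial g}{\partial Y}$, $\frac{\partial^2 g}{\partial X^2}$, $\frac{\partial^2 g}{\partial Y^2}$, and $\frac{\partial^2 g}{\partial X \partial Y}$ exist and are continuous. Then the process $W = (W_t)$ defined by $W_t = g(X_t, Y_t, t)$ is an Itô process with
\[ \begin{aligned} dW_t = {} & \left( \frac{\partial g}{\partial t} + \frac{\partial g}{\partial X}\, \mu_{Xt} + \frac{\partial g}{\partial Y}\, \mu_{Yt} + \frac{1}{2} \frac{\partial^2 g}{\partial X^2} \left( \sigma_{X1t}^2 + \sigma_{X2t}^2 \right) + \frac{1}{2} \frac{\partial^2 g}{\partial Y^2} \left( \sigma_{Y1t}^2 + \sigma_{Y2t}^2 \right) + \frac{\partial^2 g}{\partial X \partial Y} \left( \sigma_{X1t} \sigma_{Y1t} + \sigma_{X2t} \sigma_{Y2t} \right) \right) dt \\ & + \left( \frac{\partial g}{\partial X}\, \sigma_{X1t} + \frac{\partial g}{\partial Y}\, \sigma_{Y1t} \right) dz_{1t} + \left( \frac{\partial g}{\partial X}\, \sigma_{X2t} + \frac{\partial g}{\partial Y}\, \sigma_{Y2t} \right) dz_{2t}, \end{aligned} \tag{2.17} \]
where the dependence of all the partial derivatives on $(X_t, Y_t, t)$ has been notationally suppressed.
Alternatively, the result can be written more compactly as
\[ dW_t = \frac{\partial g}{\partial t}\,dt + \frac{\partial g}{\partial X}\,dX_t + \frac{\partial g}{\partial Y}\,dY_t + \frac{1}{2} \frac{\partial^2 g}{\partial X^2}\,(dX_t)^2 + \frac{1}{2} \frac{\partial^2 g}{\partial Y^2}\,(dY_t)^2 + \frac{\partial^2 g}{\partial X \partial Y}\,(dX_t)(dY_t), \tag{2.18} \]
where it is understood that $(dt)^2 = dt \cdot dz_{1t} = dt \cdot dz_{2t} = dz_{1t} \cdot dz_{2t} = 0$, while $(dz_{1t})^2 = (dz_{2t})^2 = dt$ as in the one-dimensional case.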
Example 2.1 Suppose that the dynamics of $X$ and $Y$ are given by (2.16) and $W_t = X_t Y_t$. In order to find the dynamics of $W$, we apply the above version of Itô's Lemma with the function $g(X, Y) = XY$. The relevant partial derivatives are
\[ \frac{\partial g}{\partial t} = 0, \quad \frac{\partial g}{\partial X} = Y, \quad \frac{\partial g}{\partial Y} = X, \quad \frac{\partial^2 g}{\partial X^2} = 0, \quad \frac{\partial^2 g}{\partial Y^2} = 0, \quad \frac{\partial^2 g}{\partial X \partial Y} = 1. \]
Hence,
\[ dW_t = Y_t\,dX_t + X_t\,dY_t + (dX_t)(dY_t). \tag{2.19} \]
In particular, if the dynamics of $X$ and $Y$ are written in the form
\[ dX_t = X_t \left[ m_{Xt}\,dt + v_{X1t}\,dz_{1t} + v_{X2t}\,dz_{2t} \right], \qquad dY_t = Y_t \left[ m_{Yt}\,dt + v_{Y1t}\,dz_{1t} + v_{Y2t}\,dz_{2t} \right], \tag{2.20} \]
we get
\[ dW_t = W_t \left[ \left( m_{Xt} + m_{Yt} + v_{X1t} v_{Y1t} + v_{X2t} v_{Y2t} \right) dt + (v_{X1t} + v_{Y1t})\,dz_{1t} + (v_{X2t} + v_{Y2t})\,dz_{2t} \right]. \tag{2.21} \]
For the special case where both $X$ and $Y$ are geometric Brownian motions, so that $m_X$, $m_Y$, $v_{X1}$, $v_{X2}$, $v_{Y1}$, and $v_{Y2}$ are all constants, it follows that $W_t = X_t Y_t$ is also a geometric Brownian motion. □
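Example 2.2 Define $W_t = X_t / Y_t$. In this case we need to apply Itô's Lemma with the function $g(X, Y) = X/Y$, which has derivatives
\[ \frac{\partial g}{\partial t} = 0, \quad \frac{\partial g}{\partial X} = \frac{1}{Y}, \quad \frac{\partial g}{\partial Y} = -\frac{X}{Y^2}, \quad \frac{\partial^2 g}{\partial X^2} = 0, \quad \frac{\partial^2 g}{\partial Y^2} = 2 \frac{X}{Y^3}, \quad \frac{\partial^2 g}{\partial X \partial Y} = -\frac{1}{Y^2}. \]
Then
\[ \begin{aligned} dW_t & = \frac{1}{Y_t}\,dX_t - \frac{X_t}{Y_t^2}\,dY_t + \frac{X_t}{Y_t^3}\,(dY_t)^2 - \frac{1}{Y_t^2}\,(dX_t)(dY_t) \\ & = W_t \left[ \frac{dX_t}{X_t} - \frac{dY_t}{Y_t} + \left( \frac{dY_t}{Y_t} \right)^2 - \frac{dX_t}{X_t} \frac{dY_t}{Y_t} \right]. \end{aligned} \tag{2.22} \]
In particular, if the dynamics of $X$ and $Y$ are given by (2.20), the dynamics of $W_t = X_t / Y_t$ become
\[ dW_t = W_t \left[ \left( m_{Xt} - m_{Yt} + (v_{Y1t}^2 + v_{Y2t}^2) - (v_{X1t} v_{Y1t} + v_{X2t} v_{Y2t}) \right) dt + (v_{X1t} - v_{Y1t})\,dz_{1t} + (v_{X2t} - v_{Y2t})\,dz_{2t} \right]. \tag{2.23} \]
Note that for the special case where both $X$ and $Y$ are geometric Brownian motions, $W = X/Y$ is also a geometric Brownian motion. □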
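The relative drift and shock sensitivities in (2.21) and (2.23) are simple functions of the inputs, which makes them easy to tabulate. The sketch below encodes the two formulas; the function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
def product_dynamics(m_x, v_x, m_y, v_y):
    """Relative drift and shock sensitivities of W = X * Y, following (2.21)."""
    drift = m_x + m_y + sum(a * b for a, b in zip(v_x, v_y))
    sens = tuple(a + b for a, b in zip(v_x, v_y))
    return drift, sens

def ratio_dynamics(m_x, v_x, m_y, v_y):
    """Relative drift and shock sensitivities of W = X / Y, following (2.23)."""
    drift = (m_x - m_y
             + sum(b * b for b in v_y)
             - sum(a * b for a, b in zip(v_x, v_y)))
    sens = tuple(a - b for a, b in zip(v_x, v_y))
    return drift, sens

# Hypothetical inputs: X has drift 8% and sensitivities (0.2, 0.1),
# Y has drift 3% and sensitivities (0.1, 0.0).
prod_drift, prod_sens = product_dynamics(0.08, (0.2, 0.1), 0.03, (0.1, 0.0))
ratio_drift, ratio_sens = ratio_dynamics(0.08, (0.2, 0.1), 0.03, (0.1, 0.0))
print(prod_drift, prod_sens)
print(ratio_drift, ratio_sens)
```

A quick sanity check on (2.23) is to take the ratio of a process to itself: the drift and both sensitivities then cancel, as they must for the constant process $W = 1$.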
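We can apply the two-dimensional version of Itô's Lemma to prove the following useful result relating expected discounted values and the drift rate.
Theorem 2.7 Under suitable regularity conditions, the relative drift rate of an Itô process $X = (X_t)$ is given by the process $m = (m_t)$ if and only if $X_t = \mathrm{E}_t\left[ X_T \exp\left\{ -\int_t^T m_s\,ds \right\} \right]$.
Proof: Suppose first that the relative drift rate is given by $m$ so that $dX_t = X_t [m_t\,dt + v_t\,dz_t]$. Let us use Itô's Lemma to identify the dynamics of the process $W_t = X_t \exp\left\{ -\int_0^t m_s\,ds \right\}$ or $W_t = X_t Y_t$, where $Y_t = \exp\left\{ -\int_0^t m_s\,ds \right\}$. Note that $dY_t = -Y_t m_t\,dt$ so that $Y$ is a locally deterministic stochastic process. From Example 2.1, the dynamics of $W$ become
\[ dW_t = W_t \left[ (m_t - m_t + 0)\,dt + v_t\,dz_t \right] = W_t v_t\,dz_t. \]
Since $W$ has zero drift, it is a martingale. It follows that $W_t = \mathrm{E}_t[W_T]$, i.e. $X_t \exp\left\{ -\int_0^t m_s\,ds \right\} = \mathrm{E}_t\left[ X_T \exp\left\{ -\int_0^T m_s\,ds \right\} \right]$ and hence $X_t = \mathrm{E}_t\left[ X_T \exp\left\{ -\int_t^T m_s\,ds \right\} \right]$.
If, on the other hand, $X_t = \mathrm{E}_t\left[ X_T \exp\left\{ -\int_t^T m_s\,ds \right\} \right]$ for all $t$, then the absolute drift of $X$ follows from this computation:
\[ \begin{aligned} \frac{1}{\Delta t}\, \mathrm{E}_t[X_{t+\Delta t} - X_t] & = \frac{1}{\Delta t}\, \mathrm{E}_t\left[ \mathrm{E}_{t+\Delta t}\left[ X_T e^{-\int_{t+\Delta t}^T m_s\,ds} \right] - \mathrm{E}_t\left[ X_T e^{-\int_t^T m_s\,ds} \right] \right] \\ & = \frac{1}{\Delta t}\, \mathrm{E}_t\left[ X_T e^{-\int_{t+\Delta t}^T m_s\,ds} - X_T e^{-\int_t^T m_s\,ds} \right] \\ & = \mathrm{E}_t\left[ X_T e^{-\int_t^T m_s\,ds}\; \frac{e^{\int_t^{t+\Delta t} m_s\,ds} - 1}{\Delta t} \right] \\ & \to m_t\, \mathrm{E}_t\left[ X_T e^{-\int_t^T m_s\,ds} \right] = m_t X_t \quad \text{as } \Delta t \to 0, \end{aligned} \]
so that the relative drift rate equals $m_t$. □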
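As a brief illustration of Theorem 2.7, consider the special case of a constant relative drift rate, $m_s = m$ for all $s$ (an assumption made here only for the example). The discount factor is then deterministic and can be taken outside the expectation:
\[ X_t = \mathrm{E}_t\left[ X_T\, e^{-m (T - t)} \right] = e^{-m (T - t)}\, \mathrm{E}_t[X_T] \quad \Longleftrightarrow \quad \mathrm{E}_t[X_T] = X_t\, e^{m (T - t)}. \]
For a geometric Brownian motion with $m = \mu$ this agrees with the conditional mean $\mathrm{E}_t[X_{t'}] = X_t e^{\mu (t' - t)}$ stated in Section 2.6.7.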
2.7.2 K-dimensional processes
Simultaneously modeling the dynamics of many economic quantities requires the use of many shocks to those quantities. For that purpose we will work with vectors of shocks. In particular for continuous-time modeling, we will represent shocks to the economy by a vector standard Brownian motion. We define this below and state Itô's Lemma for processes of a general dimension.
A K-dimensional standard Brownian motion $z = (z_1, \dots, z_K)^\top$ is a stochastic process where the individual components $z_i$ are mutually independent one-dimensional standard Brownian motions. If we let $0 = (0, \dots, 0)^\top$ [...] $\geq 0$ with $t < t'$: $z_{t'} - z_t \sim N(0, (t' - t)$ [...]
\[ dX_{it} = \mu_i(X_t, t)\,dt + \sigma_i(X_t, t)^\top dz_t = \mu_i(X_t, t)\,dt + \sum_{k=1}^K \sigma_{ik}(X_t, t)\,dz_{kt}, \qquad i = 1, \dots, K, \tag{2.25} \]
where $\sigma_i(X_t, t)^\top$ [...]
\[ \begin{aligned} \mathrm{Cov}_t[dX_{it}, dX_{jt}] & = \mathrm{Cov}_t\left[ \sum_{k=1}^K \sigma_{ik}(X_t, t)\,dz_{kt},\; \sum_{l=1}^K \sigma_{jl}(X_t, t)\,dz_{lt} \right] \\ & = \sum_{k=1}^K \sum_{l=1}^K \sigma_{ik}(X_t, t)\, \sigma_{jl}(X_t, t)\, \mathrm{Cov}_t[dz_{kt}, dz_{lt}] \\ & = \sum_{k=1}^K \sigma_{ik}(X_t, t)\, \sigma_{jk}(X_t, t)\,dt \\ & = \sigma_i(X_t, t)^\top \sigma_j(X_t, t)\,dt, \qquad i, j = 1, \dots, K, \end{aligned} \]
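where we have applied the usual rules for covariances and the independence of the components of $z$. In particular, the variance of the change in the $i$th component process over an infinitesimal period is given by
\[ \mathrm{Var}_t[dX_{it}] = \mathrm{Cov}_t[dX_{it}, dX_{it}] = \sum_{k=1}^K \sigma_{ik}(X_t, t)^2\,dt = \| \sigma_i(X_t, t) \|^2\,dt, \qquad i = 1, \dots, K. \]
The volatility of the $i$th component is given by $\| \sigma_i(X_t, t) \|$. The variance-covariance matrix of changes of $X_t$ over the next instant is $\Sigma(X_t, t)\,dt = \sigma(X_t, t)\, \sigma(X_t, t)^\top dt$. The correlation between instantaneous increments in two component processes is
\[ \mathrm{Corr}_t[dX_{it}, dX_{jt}] = \frac{\sigma_i(X_t, t)^\top \sigma_j(X_t, t)\,dt}{\sqrt{\| \sigma_i(X_t, t) \|^2\,dt\; \| \sigma_j(X_t, t) \|^2\,dt}} = \frac{\sigma_i(X_t, t)^\top \sigma_j(X_t, t)}{\| \sigma_i(X_t, t) \|\, \| \sigma_j(X_t, t) \|}, \]
which can be any number in $[-1, 1]$ depending on the elements of $\sigma_i$ and $\sigma_j$.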
Similarly, we can define a K-dimensional Itô process $X = (X_1, \dots, X_K)^\top$ to be a process with increments of the form
\[ dX_t = \mu_t\,dt + \sigma_t\,dz_t, \tag{2.26} \]
where $\mu = (\mu_t)$ is a K-dimensional stochastic process and $\sigma = (\sigma_t)$ is a stochastic process with values in the space of $K \times K$ matrices.
Next, we state a multi-dimensional version of Itô's Lemma, where a one-dimensional process is defined as a function of time and a multi-dimensional process.
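Theorem 2.8 Let $X = (X_t)_{t \geq 0}$ be an Itô process in $\mathbb{R}^K$ with dynamics $dX_t = \mu_t\,dt + \sigma_t\,dz_t$ or, equivalently,
\[ dX_{it} = \mu_{it}\,dt + \sigma_{it}^\top dz_t = \mu_{it}\,dt + \sum_{k=1}^K \sigma_{ikt}\,dz_{kt}, \qquad i = 1, \dots, K, \]
where $z_1, \dots, z_K$ are independent standard Brownian motions, and $\mu_i$ and $\sigma_{ik}$ are well-behaved stochastic processes.
Let $g(X, t)$ be a real-valued function for which all the derivatives $\frac{\partial g}{\partial t}$, $\frac{\partial g}{\partial X_i}$, and $\frac{\partial^2 g}{\partial X_i \partial X_j}$ exist and are continuous. Then the process $y = (y_t)_{t \geq 0}$ defined by $y_t = g(X_t, t)$ is also an Itô process with dynamics
\[ \begin{aligned} dy_t = {} & \left( \frac{\partial g}{\partial t}(X_t, t) + \sum_{i=1}^K \frac{\partial g}{\partial X_i}(X_t, t)\, \mu_{it} + \frac{1}{2} \sum_{i=1}^K \sum_{j=1}^K \frac{\partial^2 g}{\partial X_i \partial X_j}(X_t, t)\, \gamma_{ijt} \right) dt \\ & + \sum_{i=1}^K \frac{\partial g}{\partial X_i}(X_t, t)\, \sigma_{i1t}\,dz_{1t} + \dots + \sum_{i=1}^K \frac{\partial g}{\partial X_i}(X_t, t)\, \sigma_{iKt}\,dz_{Kt}, \end{aligned} \tag{2.27} \]
where $\gamma_{ij} = \sigma_{i1} \sigma_{j1} + \dots + \sigma_{iK} \sigma_{jK}$ is the covariance between the processes $X_i$ and $X_j$.
The result can also be written as
\[ dy_t = \frac{\partial g}{\partial t}(X_t, t)\,dt + \sum_{i=1}^K \frac{\partial g}{\partial X_i}(X_t, t)\,dX_{it} + \frac{1}{2} \sum_{i=1}^K \sum_{j=1}^K \frac{\partial^2 g}{\partial X_i \partial X_j}(X_t, t)\,(dX_{it})(dX_{jt}), \tag{2.28} \]
where in the computation of $(dX_{it})(dX_{jt})$ one must use the rules $(dt)^2 = dt \cdot dz_{it} = 0$ for all $i$, $dz_{it} \cdot dz_{jt} = 0$ for $i \neq j$, and $(dz_{it})^2 = dt$ for all $i$. Alternatively, the result can be expressed using vector and matrix notation:
\[ dy_t = \left( \frac{\partial g}{\partial t}(X_t, t) + \left( \frac{\partial g}{\partial X}(X_t, t) \right)^\top \mu_t + \frac{1}{2} \operatorname{tr}\left( \sigma_t^\top \left[ \frac{\partial^2 g}{\partial X^2}(X_t, t) \right] \sigma_t \right) \right) dt + \left( \frac{\partial g}{\partial X}(X_t, t) \right)^\top \sigma_t\,dz_t, \tag{2.29} \]
where
\[ \frac{\partial g}{\partial X}(X_t, t) = \begin{pmatrix} \frac{\partial g}{\partial X_1}(X_t, t) \\ \vdots \\ \frac{\partial g}{\partial X_K}(X_t, t) \end{pmatrix}, \qquad \frac{\partial^2 g}{\partial X^2}(X_t, t) = \begin{pmatrix} \frac{\partial^2 g}{\partial X_1^2}(X_t, t) & \frac{\partial^2 g}{\partial X_1 \partial X_2}(X_t, t) & \dots & \frac{\partial^2 g}{\partial X_1 \partial X_K}(X_t, t) \\ \frac{\partial^2 g}{\partial X_2 \partial X_1}(X_t, t) & \frac{\partial^2 g}{\partial X_2^2}(X_t, t) & \dots & \frac{\partial^2 g}{\partial X_2 \partial X_K}(X_t, t) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 g}{\partial X_K \partial X_1}(X_t, t) & \frac{\partial^2 g}{\partial X_K \partial X_2}(X_t, t) & \dots & \frac{\partial^2 g}{\partial X_K^2}(X_t, t) \end{pmatrix}, \]
and tr denotes the trace of a square matrix, i.e. the sum of the diagonal elements. For example, $\operatorname{tr}(A) = \sum_{i=1}^K A_{ii}$.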
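The matrix form of Itô's Lemma translates directly into code. The following sketch evaluates the drift and the coefficients on $dz_1, \dots, dz_K$ of $y_t = g(X_t, t)$ at one point in time, given the partial derivatives, the drift vector, and the sensitivity matrix. The function name and all numerical inputs are hypothetical. The example uses $g(X_1, X_2) = X_1 X_2$, so the drift can be cross-checked against the product rule $dW = Y\,dX + X\,dY + (dX)(dY)$ of Example 2.1.

```python
def ito_drift_and_diffusion(g_t, grad, hess, mu, sigma):
    """Drift and dz-coefficients of y = g(X, t) from the matrix form of Ito's Lemma.

    grad: list of first derivatives, hess: matrix of second derivatives,
    mu: drift vector of X, sigma: sensitivity matrix of X (rows = components).
    """
    k = len(mu)
    n_shocks = len(sigma[0])
    # tr(sigma^T H sigma) equals the sum over i, j of H_ij * (sigma sigma^T)_ij
    trace_term = sum(
        hess[i][j] * sum(sigma[i][m] * sigma[j][m] for m in range(n_shocks))
        for i in range(k) for j in range(k)
    )
    drift = g_t + sum(grad[i] * mu[i] for i in range(k)) + 0.5 * trace_term
    diffusion = [sum(grad[i] * sigma[i][m] for i in range(k)) for m in range(n_shocks)]
    return drift, diffusion

# g(X1, X2) = X1 * X2 evaluated at the hypothetical point X = (2, 3)
x = [2.0, 3.0]
grad = [x[1], x[0]]
hess = [[0.0, 1.0], [1.0, 0.0]]
mu = [0.1, 0.2]
sigma = [[0.3, 0.0], [0.1, 0.4]]
drift, diffusion = ito_drift_and_diffusion(0.0, grad, hess, mu, sigma)
print(drift, diffusion)
```

The trace is computed through the identity $\operatorname{tr}(\sigma^\top H \sigma) = \sum_{i,j} H_{ij} (\sigma \sigma^\top)_{ij}$, which avoids forming the matrix products explicitly and keeps the sketch free of external libraries.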
The probabilistic properties of a K-dimensional diffusion process are completely specified by the drift function $\mu$ and the variance-covariance function $\Sigma$. The values of the variance-covariance function are symmetric and positive-definite matrices. Above we had $\Sigma = \sigma \sigma^\top$ for a general $(K \times K)$-matrix $\sigma$. But from linear algebra it is well-known that a symmetric and positive-definite matrix can be written as
ii
= 1;
ij
=
i
k=1
ik
jk
V
i
V
j
,
ji
=
ij
, i < j.
2.8 Exercises
EXERCISE 2.1 In the two-period economy illustrated in Figures 2.1 and 2.2 consider an asset paying a dividend at time 2 given by
\[ D_2 = \begin{cases} 0, & \text{for } \omega = 3, \\ 5, & \text{for } \omega \in \{1, 2, 4\}, \\ 10, & \text{for } \omega \in \{5, 6\}. \end{cases} \]
(a) What is the expectation at time 0 of $D_2$? What is the expectation at time 1 of $D_2$? Verify that the Law of Iterated Expectations holds for these expectations.
(b) What is the variance at time 0 of $D_2$? What is the variance at time 1 of $D_2$? Confirm that $\mathrm{Var}[D_2] = \mathrm{E}[\mathrm{Var}_1[D_2]] + \mathrm{Var}[\mathrm{E}_1[D_2]]$.
EXERCISE 2.2 Let $X = (X_t)$ and $Y = (Y_t)$ be the price processes of two assets with no intermediate dividends and assume that
\[ dX_t = X_t \left[ 0.05\,dt + 0.1\,dz_{1t} + 0.2\,dz_{2t} \right], \]
\[ dY_t = Y_t \left[ 0.07\,dt + 0.3\,dz_{1t} - 0.1\,dz_{2t} \right]. \]
(a) What is the expected rate of return of each of the two assets?
(b) What is the return variance and volatility of each of the two assets?
(c) What is the covariance and the correlation between the returns on the two assets?
EXERCISE 2.3 Suppose $X = (X_t)$ is a geometric Brownian motion, $dX_t = \mu X_t\,dt + \sigma X_t\,dz_t$. What are the dynamics of the process $y = (y_t)$ defined by $y_t = (X_t)^n$? What can you say about the distribution of future values of the $y$ process?
EXERCISE 2.4 Suppose that the continuous-time stochastic process $X = (X_t)$ is defined as
\[
X_t = \frac{1}{2} \int_0^t \lambda_s^2\,ds + \int_0^t \lambda_s\,dz_s,
\]
where $z = (z_t)$ is a one-dimensional standard Brownian motion and $\lambda = (\lambda_t)$ is some "nice" stochastic process.
(a) Argue that $dX_t = \frac{1}{2}\lambda_t^2\,dt + \lambda_t\,dz_t$.
(b) Suppose that the continuous-time stochastic process $\xi = (\xi_t)$ is defined as $\xi_t = \exp\{-X_t\}$. Show that $d\xi_t = -\lambda_t \xi_t\,dz_t$.
EXERCISE 2.5 (Adapted from Björk (2004).) Define the process $y = (y_t)$ by $y_t = z_t^4$, where $z = (z_t)$ is a standard Brownian motion. Find the dynamics of $y$. Show that
\[
y_t = 6 \int_0^t z_s^2\,ds + 4 \int_0^t z_s^3\,dz_s.
\]
Show that $E[y_t] \equiv E[z_t^4] = 3t^2$.
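The identity $E[z_t^4] = 3t^2$ can be checked numerically before it is proved. The following is only a Monte Carlo sketch using the fact that $z_t \sim N(0, t)$; the function name `fourth_moment_mc`, the sample size, and the seed are arbitrary choices made for this illustration.

```python
import random

def fourth_moment_mc(t, n_paths=200_000, seed=0):
    """Monte Carlo estimate of E[z_t^4] for a standard Brownian motion z."""
    rng = random.Random(seed)
    sd = t ** 0.5                      # z_t is N(0, t), so its std. dev. is sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        z_t = rng.gauss(0.0, sd)
        total += z_t ** 4
    return total / n_paths

t = 2.0
estimate = fourth_moment_mc(t)
exact = 3 * t ** 2
print(estimate, exact)
```

The estimate fluctuates around the exact value with a sampling error that shrinks like one over the square root of the number of draws.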
EXERCISE 2.6 (Adapted from Björk (2004).) Define the process $y = (y_t)$ by $y_t = e^{a z_t}$, where $a$ is a constant and $z = (z_t)$ is a standard Brownian motion. Find the dynamics of $y$. Show that
\[
y_t = 1 + \frac{1}{2} a^2 \int_0^t y_s\,ds + a \int_0^t y_s\,dz_s.
\]
Define $m(t) = E[y_t]$. Show that $m$ satisfies the ordinary differential equation
\[
m'(t) = \frac{1}{2} a^2 m(t), \qquad m(0) = 1.
\]
Show that $m(t) = e^{a^2 t/2}$ and conclude that
\[
E\left[e^{a z_t}\right] = e^{a^2 t/2}.
\]
CHAPTER 3
Assets, portfolios, and arbitrage
3.1 Introduction
This chapter shows how to model assets and portfolios of assets in one- and multi-period models with uncertainty. The important concepts of arbitrage, redundant assets, and market completeness are introduced.
3.2 Assets
An asset is characterized by its dividends and its price. We will always assume that the dividends of the basic assets are non-negative and that there is a positive probability of a positive dividend at the terminal date of the model. Then we can safely assume (since equilibrium prices will be arbitrage-free; see precise definition below) that the prices of the basic assets are always positive. We assume without loss of generality that assets pay no dividends at time 0. The price of an asset at a given point in time is exclusive of any dividend payment at that time, i.e. prices are ex-dividend. At the last point in time considered in the model, all assets must have a zero price. We assume throughout that $I$ basic assets are traded.
3.2.1 The one-period framework
In a one-period model any asset $i$ is characterized by its time 0 price $P_i$ and its time 1 dividend $D_i$, which is a random variable. If the realized state is $\omega \in \Omega$, asset $i$ will give a dividend of $D_i(\omega)$. We can gather all the prices in the $I$-dimensional vector $P = (P_1, \dots, P_I)^\top$
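\[
\prod_{m=1}^{n} \left( 1 + \frac{D_{i,t+m}}{P_{i,t+m}} \right) = \left( 1 + \frac{D_{i,t+1}}{P_{i,t+1}} \right) \cdots \left( 1 + \frac{D_{i,t+n}}{P_{i,t+n}} \right)
\]
units at time $t+n$. The gross rate of return on asset $i$ between time $t$ and time $t+n$ is therefore
\[
R_{i,t,t+n} = \frac{A_{t+n} P_{i,t+n}}{P_{it}} = \frac{P_{i,t+n}}{P_{it}} \prod_{m=1}^{n} \left( 1 + \frac{D_{i,t+m}}{P_{i,t+m}} \right).
\]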
Again, we can define the corresponding net rate of return or log-return.
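The multi-period gross return with dividends reinvested can be computed by tracking the number of units held. The sketch below is a minimal illustration with made-up prices and dividends along one realized path; the function name `gross_return_with_reinvestment` and the input layout are assumptions of this example.

```python
def gross_return_with_reinvestment(prices, dividends):
    """Gross return from t to t+n when every dividend buys more of the asset.

    prices[m]    : ex-dividend price at time t+m, m = 0, ..., n
    dividends[m] : dividend per unit at time t+m, m = 1, ..., n
                   (dividends[0] is unused and only keeps the indices aligned)
    """
    units = 1.0
    for m in range(1, len(prices)):
        # A dividend D per unit buys D/P extra units per unit held.
        units *= 1.0 + dividends[m] / prices[m]
    return units * prices[-1] / prices[0]

# Hypothetical path: three periods, dividends paid at t+1 and t+3.
prices = [100.0, 104.0, 102.0, 110.0]
dividends = [0.0, 2.0, 0.0, 3.0]
R = gross_return_with_reinvestment(prices, dividends)
print(R)
```

Subtracting one gives the net rate of return over the whole period and taking the logarithm gives the log-return.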
In the discrete-time framework an asset is said to be risk-free if the dividend at any time $t = 1, \dots, T$ is already known at time $t-1$, no matter what information is available at time $t-1$. The risk-free gross return between time $t$ and $t+1$ is denoted by $R^f_t$ so that the subscript indicates the point in time at which the return will be known to investors, not the point in time at which the return is realized. Before time $t$, the risk-free return $R^f_t$ for the period beginning at $t$ is not necessarily known. Risk-free rates fluctuate over time so they are only risk-free in the short run. If you roll over in one-period risk-free investments from time $t$ to $t+n$, the total gross rate of return will be
\[
R^f_{t,t+n} = \prod_{m=0}^{n-1} R^f_{t+m}.
\]
Note that this return is risky seen at, or before, time $t$. An asset that provides a risk-free return from time $t$ to time $t+n$ is a default-free, zero-coupon bond maturing at time $t+n$. If it has a face value of 1 and its time $t$ price is denoted by $B^{t+n}_t$, you will get a gross rate of return of $1/B^{t+n}_t$.
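The two ways of investing "risk-free" over $n$ periods can be contrasted with a few lines of code. This is only a sketch with hypothetical numbers: the one-period gross returns are one possible realization (only the first is known at time $t$), whereas the zero-coupon bond price is known at time $t$.

```python
import math

def rollover_gross_return(one_period_gross_returns):
    """Product of the realized one-period risk-free gross returns."""
    return math.prod(one_period_gross_returns)

def zero_coupon_gross_return(bond_price):
    """Gross return from buying a unit-face-value zero-coupon bond and holding it to maturity."""
    return 1.0 / bond_price

# Hypothetical data for a three-period horizon.
realized_short_returns = [1.02, 1.03, 1.01]
R_roll = rollover_gross_return(realized_short_returns)
R_bond = zero_coupon_gross_return(0.94)
print(R_roll, R_bond)
```

Only `R_bond` is known when the investment is made; `R_roll` depends on future one-period rates.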
3.2.3 The continuous-time framework
In a continuous-time model over the time span $\mathcal{T} = [0, T]$, the price of an asset $i$ is represented by an adapted stochastic process $P_i = (P_{it})_{t \in [0,T]}$. In practice, no assets pay dividends continuously. However, for computational purposes it is sometimes useful to approximate a stream of frequent dividend payments by a continuous-time dividend process. On the other hand, a reasonable model should also allow for assets paying lump-sum dividends. We could capture both through a process $D_i = (D_{it})_{t \in [0,T]}$ where $D_{it}$ is the (undiscounted) sum of all dividends of asset $i$ up to and including time $t$. The total dividend received in a small interval $[t, t+dt]$ would then be $dD_{it}$ and the dividend yield would be $dD_{it}/P_{it}$. A lump-sum dividend at time $t$ would correspond to a jump in $D_{it}$. For notational simplicity we assume that the basic assets of the economy pay a lump-sum dividend only at time $T$. The terminal dividend of asset $i$ is modeled by a random variable $D_{iT}$, assumed non-negative with a positive probability of a positive value. Up to time $T$, dividends are paid out continuously. In general, the dividend yield can then be captured by a specification like $dD_{it}/P_{it} = \delta_{it}\,dt + \nu_{it}\,dz_{it}$ for some one- or multi-dimensional standard Brownian motion $z_i$, but for simplicity we assume that there is no uncertainty about the dividend yield over the next instant, i.e. $\nu_{it} = 0$. To sum up, the dividends of asset $i$ are represented by a dividend yield process $\delta_i = (\delta_{it})_{t \in [0,T]}$ and a terminal lump-sum dividend $D_{iT}$.
Next consider returns. If we buy one unit of asset $i$ at time $t$ and keep reinvesting the continuous dividends by purchasing extra (fractions of) units of the asset, how many units will we end up with at time $t' > t$? Divide the interval $[t, t']$ into $N$ subintervals of length $\Delta t$. We can then proceed as in the discrete-time case discussed above. Let $A_{t+n\Delta t}$ denote the number of units of the asset we will have immediately after time $t + n\Delta t$. We start with $A_t = 1$ unit. At $t + \Delta t$ we receive a dividend of $\delta_{it} P_{i,t+\Delta t} \Delta t$ which we spend on buying $\delta_{it} \Delta t$ extra assets, bringing our holdings up to $A_{t+\Delta t} = 1 + \delta_{it} \Delta t$. At time $t + 2\Delta t$ we receive a total dividend of $A_{t+\Delta t}\, \delta_{i,t+\Delta t} P_{i,t+2\Delta t} \Delta t$, which will buy us $A_{t+\Delta t}\, \delta_{i,t+\Delta t} \Delta t$ extra units. Our total is now $A_{t+2\Delta t} = A_{t+\Delta t} + A_{t+\Delta t}\, \delta_{i,t+\Delta t} \Delta t$. Continuing like this we will find that for any $s = t + n\Delta t$ for some integer $n < N$, our total holdings immediately after time $s + \Delta t$ are given by $A_{s+\Delta t} = A_s + A_s \delta_{is} \Delta t$ so that
\[
\frac{A_{s+\Delta t} - A_s}{\Delta t} = \delta_{is} A_s.
\]
If we go to the continuous-time limit and let $\Delta t \to 0$, the left-hand side will approach the derivative $A'_s$ and we see that $A_s$ must satisfy the differential equation $A'_s = \delta_{is} A_s$ as well as the initial condition $A_t = 1$. The solution is
\[
A_s = \exp\left\{ \int_t^s \delta_{iu}\,du \right\}.
\]
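The passage to the limit can be illustrated numerically. The sketch below assumes, purely for illustration, a constant dividend yield, in which case the discrete reinvestment rule gives $(1 + \delta \Delta t)^N$ units and the limit is $e^{\delta (t'-t)}$; the function name and the parameter values are hypothetical.

```python
import math

def units_after_reinvestment(delta, horizon, n_steps):
    """Units held after reinvesting a constant dividend yield over n_steps subintervals."""
    dt = horizon / n_steps
    units = 1.0                       # A_t = 1
    for _ in range(n_steps):
        units += units * delta * dt   # A_{s+dt} = A_s + A_s * delta * dt
    return units

delta, horizon = 0.04, 5.0            # hypothetical yield and holding period
for n in (1, 10, 100, 10_000):
    print(n, units_after_reinvestment(delta, horizon, n))

limit = math.exp(delta * horizon)     # continuous-time solution of A' = delta * A
print("limit", limit)
```

As the number of subintervals grows, the discrete holdings increase towards the exponential limit.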
So investing one unit in asset $i$ at time $t$ and continuously reinvesting the dividends, we will end up with $A_{t'} = \exp\left\{ \int_t^{t'} \delta_{iu}\,du \right\}$ units of the asset at time $t'$. The gross rate of return on asset $i$ over the interval $[t, t']$ is therefore $R_{i,t,t'} = \exp\left\{ \int_t^{t'} \delta_{iu}\,du \right\} P_{i,t'}/P_{it}$. The net rate of return per time period is $r_{i,t,t'} = (R_{i,t,t'} - 1)/(t' - t)$. Letting $t' \to t$, we get the instantaneous net rate of return per time period
\[
r_{it} = \frac{1}{dt} \frac{dP_{it}}{P_{it}} + \delta_{it}.
\]
The first term on the right-hand side is the instantaneous percentage capital gain (still unknown at time $t$), the second term is the dividend yield (assumed to be known at time $t$).
We will typically write the dynamics of the price of an asset $i$ as
\[
dP_{it} = P_{it} \left[ \mu_{it}\,dt + \sigma_{it}^\top dz_t \right], \tag{3.3}
\]
where $z = (z_t)$ is a standard Brownian motion of dimension $d$ representing shocks to the prices, $\mu_{it}$ is then the expected capital gain so that the total expected net rate of return per time period is $\mu_{it} + \delta_{it}$, while $\sigma_{it}$ is the vector of sensitivities of the price with respect to the exogenous shocks. The volatility is the standard deviation of instantaneous relative price changes, which is $\|\sigma_{it}\| = \left( \sum_{j=1}^{d} \sigma_{ijt}^2 \right)^{1/2}$. Using Itô's Lemma exactly as in Section 2.6.7, one finds that
\[
P_{i,t'} = P_{it} \exp\left\{ \int_t^{t'} \left( \mu_{iu} - \frac{1}{2} \|\sigma_{iu}\|^2 \right) du + \int_t^{t'} \sigma_{iu}^\top dz_u \right\}. \tag{3.4}
\]
Therefore, the gross rate of return on asset $i$ between $t$ and $t'$ is
\[
R_{i,t,t'} = \exp\left\{ \int_t^{t'} \delta_{iu}\,du \right\} \frac{P_{i,t'}}{P_{it}} = \exp\left\{ \int_t^{t'} \left( \delta_{iu} + \mu_{iu} - \frac{1}{2} \|\sigma_{iu}\|^2 \right) du + \int_t^{t'} \sigma_{iu}^\top dz_u \right\}.
\]
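For a short period, i.e. with $t' = t + \Delta t$ for $\Delta t$ small, we have
\[
R_{i,t,t+\Delta t} \approx \exp\left\{ \left( \delta_{it} + \mu_{it} - \frac{1}{2} \|\sigma_{it}\|^2 \right) \Delta t + \sigma_{it}^\top \Delta z_t \right\},
\]
where $\Delta z_t = z_{t+\Delta t} - z_t$ is $d$-dimensional and normally distributed with mean vector 0 and variance-covariance matrix $I \cdot \Delta t$ (here $I$ is the $d \times d$ identity matrix). The gross rate of return is thus approximately lognormally distributed with mean
\[
E_t[R_{i,t,t+\Delta t}] \approx \exp\left\{ \left( \delta_{it} + \mu_{it} - \frac{1}{2} \|\sigma_{it}\|^2 \right) \Delta t \right\} E_t\left[ \exp\left\{ \sigma_{it}^\top \Delta z_t \right\} \right] = e^{(\delta_{it} + \mu_{it}) \Delta t},
\]
cf. Appendix B.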
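The log-return is
\[
\ln R_{i,t,t'} = \int_t^{t'} \left( \delta_{iu} + \mu_{iu} - \frac{1}{2} \|\sigma_{iu}\|^2 \right) du + \int_t^{t'} \sigma_{iu}^\top dz_u
\]
with mean and variance given by
\[
E_t[\ln R_{i,t,t'}] = \int_t^{t'} \left( \delta_{iu} + \mu_{iu} - \frac{1}{2} \|\sigma_{iu}\|^2 \right) du \approx \left( \delta_{it} + \mu_{it} - \frac{1}{2} \|\sigma_{it}\|^2 \right) \Delta t,
\]
\[
\operatorname{Var}_t[\ln R_{i,t,t'}] = \int_t^{t'} E_t\left[ \|\sigma_{iu}\|^2 \right] du \approx \|\sigma_{it}\|^2 \Delta t,
\]
according to Theorem 2.2, and the approximations are useful for a short time period $t' - t = \Delta t$.
Note that the expected log-return is (approximately) $\left( \delta_{it} + \mu_{it} - \frac{1}{2} \|\sigma_{it}\|^2 \right) \Delta t$, whereas the log of the expected gross return is (approximately) $(\delta_{it} + \mu_{it}) \Delta t$. The difference is due to the concavity of the log function so that Jensen's Inequality implies that $E[\ln X] \leq \ln E[X]$ with equality only for the case of a deterministic $X$.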
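The gap between the expected log-return and the log of the expected gross return can be seen in a small simulation. The sketch below assumes a single shock ($d = 1$) and constant coefficients over the interval; the parameter values, sample size, and seed are hypothetical choices for this illustration.

```python
import math
import random

# Hypothetical constant coefficients over one month.
mu, delta, sigma, dt = 0.08, 0.02, 0.20, 1.0 / 12.0

rng = random.Random(1)
n = 200_000
# Simulated log-returns: (delta + mu - sigma^2/2) dt + sigma * (z_{t+dt} - z_t)
log_returns = [
    (delta + mu - 0.5 * sigma ** 2) * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
    for _ in range(n)
]

mean_log = sum(log_returns) / n
log_mean_gross = math.log(sum(math.exp(x) for x in log_returns) / n)

print("expected log-return    ", mean_log, "theory", (delta + mu - 0.5 * sigma ** 2) * dt)
print("log of expected return ", log_mean_gross, "theory", (delta + mu) * dt)
print("Jensen gap             ", log_mean_gross - mean_log, "theory", 0.5 * sigma ** 2 * dt)
```

The simulated gap is close to half the variance of the log-return, which is exactly the Jensen correction in the lognormal case.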
We can write the dynamics of all $I$ prices compactly as
\[
dP_t = \operatorname{diag}(P_t) \left[ \mu_t\,dt + \sigma_t\,dz_t \right], \tag{3.5}
\]
where $\operatorname{diag}(P_t)$ is defined in (3.1), $\mu_t$ is the vector $(\mu_{1t}, \dots, \mu_{It})^\top$ and $\sigma_t$ is the $I \times d$ matrix whose $i$'th row is $\sigma_{it}^\top$.
A risk-free asset is an asset where the rate of return over the next instant is always known. We can think of this as an asset with a constant price and a continuous dividend yield process $r^f = (r^f_t)_{t \in [0,T]}$ or as an asset with a zero continuous dividend yield and a price process accumulating the interest rate payments, $P^f_t = \exp\left\{ \int_0^t r^f_s\,ds \right\}$.
3.3 Portfolios and trading strategies
Individuals can trade assets at all time points of the model, except for the last date. The combination of holdings of different assets at a given point in time is called a portfolio. We assume that there are no restrictions on the portfolios that investors may form and that there are no trading costs.
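3.3.1 The one-period framework
In a one-period model an individual chooses a portfolio $\theta = (\theta_1, \dots, \theta_I)^\top$ at time 0 with $\theta_i$ being the number of units held of asset $i$. It is not possible to rebalance the portfolio but the individual simply cashes in the dividends at time 1. Let $D^\theta$ denote the dividend of the portfolio $\theta$, so that in state $\omega$
\[
D^\theta(\omega) = \sum_{i=1}^{I} \theta_i D_i(\omega) = \theta \cdot D(\omega),
\]
i.e. $D^\theta = \theta \cdot D$.
Denote the price or value of a portfolio $\theta$ by $P^\theta$. We assume that prices are linear so that
\[
P^\theta = \sum_{i=1}^{I} \theta_i P_i = \theta \cdot P.
\]
This is called the Law of One Price. Since we ignore transaction costs, any candidate for an equilibrium pricing system will certainly have this property. In Section 3.4 we will discuss the link between the Law of One Price and the absence of arbitrage.
The fraction of the total portfolio value invested in asset $i$ is then $\pi_i = \theta_i P_i / P^\theta$. With the portfolio weight vector $\pi = (\pi_1, \dots, \pi_I)^\top$ and $\mathbf{1} = (1, \dots, 1)^\top$, we have $\pi \cdot \mathbf{1} = \sum_{i=1}^{I} \pi_i = 1$. Note that $\operatorname{diag}(P)\,\mathbf{1} = P$ and thus $P^\theta = P^\top \theta = (\operatorname{diag}(P)\,\mathbf{1})^\top \theta = \mathbf{1}^\top \operatorname{diag}(P)\,\theta$ so that
\[
\pi = \frac{\operatorname{diag}(P)\,\theta}{P \cdot \theta} = \frac{\operatorname{diag}(P)\,\theta}{\mathbf{1}^\top \operatorname{diag}(P)\,\theta}. \tag{3.6}
\]
Given $\theta$ and the price vector $P$ we can derive $\pi$. Conversely, given $\pi$, the total portfolio value $P^\theta$, and the price vector $P$, we can derive $\theta$. We therefore have two equivalent ways of representing a portfolio.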
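The two representations of a portfolio can be converted into each other in a few lines. The following is a minimal sketch with hypothetical prices and holdings; the function names `weights_from_units` and `units_from_weights` are chosen for this example only.

```python
def weights_from_units(theta, prices):
    """Portfolio weights pi_i = theta_i * P_i / (theta . P)."""
    value = sum(t * p for t, p in zip(theta, prices))
    if value == 0:
        raise ValueError("portfolio value is zero; weights are undefined")
    return [t * p / value for t, p in zip(theta, prices)]

def units_from_weights(pi, prices, value):
    """Units held theta_i = pi_i * (portfolio value) / P_i."""
    return [w * value / p for w, p in zip(pi, prices)]

# Hypothetical three-asset example with a short position in asset 2.
prices = [50.0, 20.0, 10.0]
theta = [2.0, -1.0, 4.0]
pi = weights_from_units(theta, prices)              # portfolio value is 120
theta_back = units_from_weights(pi, prices, 120.0)
print(pi, sum(pi), theta_back)
```

Note that the weights are undefined for a zero-value portfolio, which is why the total value has to be supplied when going back from weights to units.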
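The gross rate of return on a portfolio $\theta$ is the random variable
\[
R^\theta = \frac{D^\theta}{P^\theta} = \frac{\sum_{i=1}^{I} \theta_i D_i}{P^\theta} = \frac{\sum_{i=1}^{I} \theta_i P_i R_i}{P^\theta} = \sum_{i=1}^{I} \frac{\theta_i P_i}{P^\theta} R_i = \sum_{i=1}^{I} \pi_i R_i = \pi \cdot R, \tag{3.7}
\]
where $\pi_i$ is the portfolio weight of asset $i$. We observe that the gross return on a portfolio is just a weighted average of the gross rates of return on the assets in the portfolio. Similarly for the net rate of return since $\sum_{i=1}^{I} \pi_i = 1$ and thus
\[
r^\theta = R^\theta - 1 = \left( \sum_{i=1}^{I} \pi_i R_i \right) - 1 = \sum_{i=1}^{I} \pi_i (R_i - 1) = \sum_{i=1}^{I} \pi_i r_i = \pi \cdot r,
\]
where $r = (r_1, \dots, r_I)^\top$
\[
D^\theta_t = \theta_{t-1} \cdot D_t - (\theta_t - \theta_{t-1}) \cdot P_t = \theta_{t-1} \cdot (P_t + D_t) - \theta_t \cdot P_t, \quad t = 1, 2, \dots, T-1. \tag{3.8}
\]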
We can think of this as a budget constraint saying that the sum of the withdrawn dividend and our additional investment $(\theta_t - \theta_{t-1}) \cdot P_t$ has to equal the dividends we receive from the current portfolio. The terminal dividend is
\[
D^\theta_T = \theta_{T-1} \cdot D_T. \tag{3.9}
\]
Given the Law of One Price, the initial price of the trading strategy is $P^\theta = \theta_0 \cdot P_0$. We can let $D^\theta_0 = -P^\theta$ so that the dividend process $D^\theta$ is defined at all $t \in \mathcal{T}$.
For $t = 1, \dots, T$ define
\[
V^\theta_t = \theta_{t-1} \cdot (P_t + D_t),
\]
which is the time $t$ value of the portfolio chosen at the previous trading date. This is the value of the portfolio just after dividends are received at time $t$ and before the portfolio is rebalanced. Define $V^\theta_0 = \theta_0 \cdot P_0$. We call $V^\theta = (V^\theta_t)_{t \in \mathcal{T}}$ the value process of the trading strategy $\theta$. According to (3.8), we have $V^\theta_t = D^\theta_t + \theta_t \cdot P_t$ for $t = 1, \dots, T$ and, in particular, $V^\theta_T = D^\theta_T$. The change in the value of the trading strategy between two adjacent dates is
\[
\begin{aligned}
V^\theta_{t+1} - V^\theta_t &= \theta_t \cdot (P_{t+1} + D_{t+1}) - \theta_{t-1} \cdot (P_t + D_t) \\
&= \theta_t \cdot (P_{t+1} + D_{t+1}) - D^\theta_t - \theta_t \cdot P_t \\
&= \theta_t \cdot (P_{t+1} - P_t + D_{t+1}) - D^\theta_t.
\end{aligned} \tag{3.10}
\]
The first term on the right-hand side of the last expression is the net return on the portfolio $\theta_t$ from time $t$ to $t+1$, the latter term is the net dividend we have withdrawn at time $t$.
The trading strategy is said to be self-financing if all the intermediate dividends are zero, i.e. if $D^\theta_t = 0$ for $t = 1, \dots, T-1$. Using (3.8) this means that
\[
(\theta_t - \theta_{t-1}) \cdot P_t = \theta_{t-1} \cdot D_t, \quad t = 1, \dots, T-1.
\]
The left-hand side is the extra investment due to the rebalancing at time $t$, the right-hand side is the dividend received at time $t$. A self-financing trading strategy requires an initial investment of $P^\theta = \theta_0 \cdot P_0$ and generates a terminal dividend of $D^\theta_T = \theta_{T-1} \cdot D_T$. At the intermediate dates no money is invested or withdrawn so increasing the investment in some assets must be fully financed by dividends or selling off other assets. If $\theta$ is self-financing, we have $V^\theta_t = P^\theta_t \equiv \theta_t \cdot P_t$, the time $t$ price of the portfolio $\theta_t$, for $t = 0, 1, \dots, T-1$. Moreover, the change in the value of the trading strategy is just the net return, cf. (3.10).
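The budget constraint and the self-financing condition can be checked along a single realized path of prices and dividends. The sketch below is only an illustration under that simplification (a trading strategy in the text is a stochastic process, so the check would have to hold on every path); all function names and numbers are hypothetical.

```python
def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))

def strategy_dividends(theta, prices, dividends):
    """Dividends [D_0, ..., D_T] of a trading strategy along one path.

    theta[t]     : portfolio chosen at time t, t = 0, ..., T-1
    prices[t]    : ex-dividend price vector at time t, t = 0, ..., T
    dividends[t] : dividend vector at time t (dividends[0] is unused)
    """
    T = len(prices) - 1
    out = [-dot(theta[0], prices[0])]               # initial cost as a negative dividend
    for t in range(1, T):
        cum_value = dot(theta[t - 1], [p + d for p, d in zip(prices[t], dividends[t])])
        out.append(cum_value - dot(theta[t], prices[t]))
    out.append(dot(theta[T - 1], dividends[T]))     # terminal dividend
    return out

def is_self_financing(theta, prices, dividends, tol=1e-12):
    """True if all intermediate dividends vanish along this path."""
    return all(abs(d) <= tol for d in strategy_dividends(theta, prices, dividends)[1:-1])

# Hypothetical two-asset, two-period path.
prices = [[10.0, 5.0], [11.0, 4.0], [0.0, 0.0]]
dividends = [[0.0, 0.0], [1.0, 0.0], [12.0, 6.0]]
rebalanced = [[1.0, 2.0], [0.0, 5.0]]    # time 1 value of 20 is fully reinvested
buy_and_hold = [[1.0, 2.0], [1.0, 2.0]]  # the dividend of 1 at time 1 is withdrawn

print(strategy_dividends(rebalanced, prices, dividends), is_self_financing(rebalanced, prices, dividends))
print(strategy_dividends(buy_and_hold, prices, dividends), is_self_financing(buy_and_hold, prices, dividends))
```

In the first strategy the dividend received at time 1 exactly finances the rebalancing; in the second the unchanged portfolio leaves the dividend as a withdrawal.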
Portfolio weights...
Returns...
3.3.3 The continuous-time framework
Also in the continuous-time framework with $\mathcal{T} = [0, T]$ a trading strategy is an $I$-dimensional adapted stochastic process $\theta = (\theta_t)_{t \in \mathcal{T}}$ where $\theta_t = (\theta_{1t}, \dots, \theta_{It})^\top$ denotes the portfolio held at time $t$. The value of the trading strategy at time $t$ is
\[
V^\theta_t = \theta_t \cdot P_t, \quad t < T \tag{3.11}
\]
and $V^\theta_T = \theta_T \cdot D_T \equiv D^\theta_T$, the terminal lump-sum dividend. The time 0 value is the cost of initiating the trading strategy, which we can think of as a negative initial dividend, $D^\theta_0 = -V^\theta_0 = -\theta_0 \cdot P_0$.
We assume that between time 0 and time $T$ no lump-sum dividends can be withdrawn from the investment but funds can be withdrawn at a continuous rate as represented by the process $\alpha^\theta = (\alpha^\theta_t)$ describing the rate with which we withdraw funds from our investments. Intermediate lump-sum withdrawals could be allowed at the expense of additional notational complexity. For later use note that an application of Itô's Lemma implies that the increment to the value process is given by
\[
dV^\theta_t = \theta_t \cdot dP_t + d\theta_t \cdot P_t + d\theta_t \cdot dP_t. \tag{3.12}
\]
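Assume for a moment that we do not change our portfolio over a small time interval $[t, t + \Delta t]$. The total funds withdrawn over this interval are $\alpha^\theta_t \Delta t$. Let $\Delta\theta_{t+\Delta t} = \theta_{t+\Delta t} - \theta_t$ and $\Delta P_{t+\Delta t} = P_{t+\Delta t} - P_t$. The total dividends received from the portfolio $\theta_t$ over the interval are $\sum_{i=1}^{I} \theta_{it} \delta_{it} P_{it} \Delta t$, which we can rewrite as $\theta_t^\top \operatorname{diag}(P_t)\,\delta_t \Delta t$, where $\operatorname{diag}(P_t)$ is the matrix defined in (3.1). Then the budget constraint over this interval is
\[
\alpha^\theta_t \Delta t + \Delta\theta_{t+\Delta t} \cdot P_{t+\Delta t} = \theta_t^\top \operatorname{diag}(P_t)\,\delta_t \Delta t.
\]
The left-hand side is the sum of the funds we withdraw and the net extra investment. The right-hand side is the funds we receive in dividends. Let us add and subtract $\Delta\theta_{t+\Delta t} \cdot P_t$ in the above equation. Rearranging we obtain
\[
\alpha^\theta_t \Delta t + \Delta\theta_{t+\Delta t} \cdot \Delta P_{t+\Delta t} + \Delta\theta_{t+\Delta t} \cdot P_t = \theta_t^\top \operatorname{diag}(P_t)\,\delta_t \Delta t.
\]
The equivalent equation for an infinitesimal interval $[t, t + dt]$ is
\[
\alpha^\theta_t\,dt + d\theta_t \cdot dP_t + d\theta_t \cdot P_t = \theta_t^\top \operatorname{diag}(P_t)\,\delta_t\,dt.
\]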
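Using this, we can rewrite the value dynamics in (3.12) as
\[
dV^\theta_t = \theta_t \cdot dP_t + \theta_t^\top \operatorname{diag}(P_t)\,\delta_t\,dt - \alpha^\theta_t\,dt.
\]
Substituting in (3.5), this implies that
\[
dV^\theta_t = \theta_t^\top \operatorname{diag}(P_t) \left[ (\mu_t + \delta_t)\,dt + \sigma_t\,dz_t \right] - \alpha^\theta_t\,dt. \tag{3.13}
\]
As discussed earlier we can define a portfolio weight vector
\[
\pi_t = \frac{\operatorname{diag}(P_t)\,\theta_t}{P_t \cdot \theta_t} = \frac{\operatorname{diag}(P_t)\,\theta_t}{V^\theta_t}.
\]
The value dynamics can therefore be rewritten as
\[
dV^\theta_t = V^\theta_t\,\pi_t^\top \left[ (\mu_t + \delta_t)\,dt + \sigma_t\,dz_t \right] - \alpha^\theta_t\,dt. \tag{3.14}
\]
A trading strategy is called self-financing if no funds are withdrawn, i.e. $\alpha^\theta_t \equiv 0$. In that case the value dynamics is simply
\[
dV^\theta_t = \theta_t^\top \operatorname{diag}(P_t) \left[ (\mu_t + \delta_t)\,dt + \sigma_t\,dz_t \right]. \tag{3.15}
\]
This really means that for any $t \in (0, T)$,
\[
\begin{aligned}
V^\theta_t &= \theta_0 \cdot P_0 + \int_0^t \theta_s^\top \operatorname{diag}(P_s) \left[ (\mu_s + \delta_s)\,ds + \sigma_s\,dz_s \right] \\
&= \theta_0 \cdot P_0 + \int_0^t \theta_s^\top \operatorname{diag}(P_s)\,(\mu_s + \delta_s)\,ds + \int_0^t \theta_s^\top \operatorname{diag}(P_s)\,\sigma_s\,dz_s.
\end{aligned} \tag{3.16}
\]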
Returns...
3.4 Arbitrage
We have already made an assumption about prices, namely that prices obey the Law of One Price, i.e. prices are linear. We will now make the slightly stronger assumption that prices are set so that there is no arbitrage. An arbitrage is basically a risk-free profit.
3.4.1 The one-period framework
In the one-period framework we define an arbitrage as a portfolio $\theta$ satisfying one of the following two conditions:
(i) $P^\theta < 0$ and $D^\theta \geq 0$;
(ii) $P^\theta \leq 0$ and $D^\theta \geq 0$ with $\mathbb{P}\left( D^\theta > 0 \right) > 0$.
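In an economy with finitely many states, each having positive probability, the two conditions can be checked mechanically for a given portfolio. The following is a minimal sketch under that assumption; the function `is_arbitrage`, the tolerance, and the numerical examples are hypothetical and not taken from the text.

```python
def is_arbitrage(theta, prices, dividend_matrix, tol=1e-12):
    """Check conditions (i) and (ii) for a portfolio in a finite-state economy.

    dividend_matrix[i][s] is the dividend of asset i in state s; every state
    is assumed to have strictly positive probability.
    """
    price = sum(t * p for t, p in zip(theta, prices))
    n_states = len(dividend_matrix[0])
    payoff = [
        sum(theta[i] * dividend_matrix[i][s] for i in range(len(theta)))
        for s in range(n_states)
    ]
    if any(x < -tol for x in payoff):
        return False                  # a negative dividend in some state rules out both cases
    if price < -tol:
        return True                   # condition (i)
    return price <= tol and any(x > tol for x in payoff)   # condition (ii)

# Type (i): asset 2 pays twice asset 1 but costs less than twice as much.
D1 = [[1.0, 1.0], [2.0, 2.0]]
print(is_arbitrage([-2.0, 1.0], [1.0, 1.9], D1))   # sell the cheap payoff, buy the expensive one
# Type (ii): zero cost, non-negative dividend, strictly positive in state 2.
D2 = [[1.0, 1.0], [2.0, 3.0]]
print(is_arbitrage([-2.0, 1.0], [1.0, 2.0], D2))
```

A portfolio with a strictly positive price is never an arbitrage in this sense, whatever its dividends.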
Here $D^\theta$ is the random variable that represents the dividend of the portfolio $\theta$. The inequality $D^\theta \geq 0$ means that the dividend will be non-negative no matter which state is realized, i.e. $D^\theta(\omega) \geq 0$ for all $\omega \in \Omega$. (In a finite-state economy this can be replaced by the condition $D^\theta \geq 0$ on the dividend vector, which means that all elements of the vector are non-negative. The condition $\mathbb{P}\left( D^\theta > 0 \right) > 0$ can be replaced by the condition $D^\theta$
< P. Then an arbitrage can be formed by purchasing the portfolio for the price of P
is $0.9\,\theta_1 + 1.6\,\theta_2$.
Consider the portfolio (2, 1)
(i) $V^\theta_0 < 0$ and $V^\theta_T \geq 0$ with probability one,
(ii) $V^\theta_0 \leq 0$, $V^\theta_T \geq 0$ with probability one, and $V^\theta_T > 0$ with strictly positive probability.
As we have seen above, $V^\theta_T = D^\theta_T$ and $V^\theta_0 = -D^\theta_0$ for self-financing trading strategies. We can therefore equivalently define an arbitrage in terms of dividends. A self-financing trading strategy is an arbitrage if it generates non-negative initial and terminal dividends with one of them being strictly positive with a strictly positive probability. Due to our assumptions on the dividends of the individual assets, the absence of arbitrage will imply that the prices of individual assets are strictly positive.
Ruling out arbitrages defined in (i) and (ii) will also rule out shorter term risk-free gains. Suppose for example that we can construct a trading strategy with a non-positive initial value (i.e. a non-positive price), always non-negative values, and a strictly positive value at some time $t < T$. Then this strictly positive value can be invested in any asset until time $T$ generating a strictly positive terminal value with a strictly positive probability. The focus on self-financing trading strategies is therefore no restriction. Note that the definition of an arbitrage implies that a self-financing trading strategy with a terminal dividend of zero (in any state) must have a value process identically equal to zero.
3.4.3 Continuous-time doubling strategies
In a continuous-time setting it is theoretically possible to construct some strategies that generate something for nothing. These are the so-called doubling strategies, which were apparently first mentioned in a finance setting by Harrison and Kreps (1979). Think of a series of coin tosses numbered $n = 1, 2, \dots$. The $n$'th coin toss takes place at time $1 - 1/n \in [0, 1)$. In the $n$'th toss, you get $\alpha 2^{n-1}$ if heads comes up, and lose $\alpha 2^{n-1}$ otherwise, where $\alpha$ is some positive number. You stop betting the first time heads comes up. Suppose heads comes up the first time in toss number $(k+1)$. Then in the first $k$ tosses you have lost a total of $\alpha(1 + 2 + \dots + 2^{k-1}) = \alpha(2^k - 1)$. Since you win $\alpha 2^k$ in toss number $k+1$, your total profit will be $\alpha 2^k - \alpha(2^k - 1) = \alpha$. Since the probability that heads comes up eventually is equal to one, you will gain $\alpha$ with probability one. The gain is obtained before time 1 and can be made as large as desired by increasing $\alpha$.
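The mechanics of the doubling strategy, and the size of the interim losses it requires, can be seen in a short simulation. This is only a sketch: the stake parameter is called `alpha` here, the fair coin is simulated with Python's standard generator, and the cap on the number of tosses exists only to keep the loop finite.

```python
import random

def doubling_profit(alpha, rng, max_tosses=1000):
    """Play the doubling strategy once; return (profit, tosses, largest interim loss)."""
    lost = 0.0
    for n in range(1, max_tosses + 1):
        stake = alpha * 2 ** (n - 1)
        if rng.random() < 0.5:        # heads: win the stake and stop
            return stake - lost, n, lost
        lost += stake                 # tails: lose the stake and double up
    raise RuntimeError("no heads within max_tosses")

rng = random.Random(42)
results = [doubling_profit(1.0, rng) for _ in range(10_000)]
profits = [r[0] for r in results]
worst_interim_loss = max(r[2] for r in results)
print(min(profits), max(profits), worst_interim_loss)
```

Every run ends with the same profit, but the largest loss carried before the winning toss grows without bound as more runs are simulated, which is what a lower bound on the value process rules out.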
Similar strategies, with future dividends appropriately discounted, can be constructed in continuous-time models of financial markets (at least if a risk-free asset is traded) but are clearly impossible to implement in real life. As shown by Dybvig and Huang (1988), doubling strategies can be ruled out by requiring that trading strategies have values that are bounded from below, i.e. that some constant $K$ exists such that $V^\theta_t \geq -K$ for all $t$. A trading strategy satisfying such a condition is said to be credit-constrained. A lower bound is reasonable since nobody can borrow an infinite amount of money. If you have a limited borrowing potential, the doubling strategy described above cannot be implemented. If you have no future income at all, $K = 0$ seems reasonable. An alternative way of eliminating doubling strategies is to impose the condition that the value process of the trading strategy has finite variance, cf. Duffie (2001). For a doubling strategy the variance of the value process is in fact infinite.
It seems evident that any greedy investor would implement a doubling strategy, if possible, since the investor will make a positive net return with a probability of one in finite time. However, Omberg (1989) shows that a doubling strategy may in fact generate an expected utility of minus infinity for risk-averse investors. In some events of zero probability a doubling strategy may result in outcomes associated with a utility of minus infinity. When multiplying zero and minus infinity in order to compute the expected utility, the result is indeterminate. Omberg computes the actual expected utility of the doubling strategy by taking an appropriate limit and finds that this is minus infinity for commonly used utility functions that are unbounded from below. Although this questions the above definition of an arbitrage, we will stick to that definition, which is also the standard of the literature.
In the rest of this book we will, often implicitly, assume that some conditions are imposed so that doubling strategies are not implementable or that nobody wants to implement them.
3.5 Redundant assets
An asset is said to be redundant if its dividends can be replicated by a trading strategy in other assets.
For example, in the one-period framework asset $i$ is redundant if a portfolio $\theta = (\theta_1, \dots, \theta_I)^\top$ exists with $\theta_i = 0$ and
\[
D_i = D^\theta \equiv \theta_1 D_1 + \dots + \theta_{i-1} D_{i-1} + \theta_{i+1} D_{i+1} + \dots + \theta_I D_I.
\]
Recall that the dividends are random variables so the above equation really means that
\[
D_i(\omega) = \theta_1 D_1(\omega) + \dots + \theta_{i-1} D_{i-1}(\omega) + \theta_{i+1} D_{i+1}(\omega) + \dots + \theta_I D_I(\omega), \quad \forall \omega \in \Omega.
\]
Such a portfolio is called a replicating portfolio for asset $i$.
If an asset $i$ is redundant, its price follows immediately from the Law of One Price:
\[
P_i = \theta_1 P_1 + \dots + \theta_{i-1} P_{i-1} + \theta_{i+1} P_{i+1} + \dots + \theta_I P_I.
\]
We can thus focus on pricing the non-redundant assets; the prices of all the other assets, the redundant assets, then follow.
Note that the number of non-redundant assets cannot exceed the number of states. If there are more assets than states, there will be some redundant asset.
            state-contingent dividend
            state 1    state 2    state 3
Asset 1        1          1          1
Asset 2        0          1          2
Asset 3        4          0          1
Asset 4        9          0          1
Table 3.1: The state-contingent dividends of the assets considered in Example 3.1.
Example 3.1 Consider a one-period economy with three possible end-of-period states and four traded assets. The dividends are given in Table 3.1. With four assets and three states at least one asset is redundant. The dividend vector of asset 4 can be written as a non-trivial linear combination of the dividend vectors of assets 1, 2, and 3 since
\[
\begin{pmatrix} 9 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} + 2 \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}.
\]
A portfolio of one unit of asset 1, minus one unit of asset 2, and two units of asset 3 perfectly replicates the dividend of asset 4, which is therefore redundant. In terms of random variables, we have the relation
\[
D_1 - D_2 + 2 D_3 = D_4
\]
among the dividends of the four assets. On the other hand, asset 1 is redundant since it can be perfectly replicated by a portfolio of one unit of asset 2, minus two units of asset 3, and one unit of asset 4. Similarly, asset 2 is redundant and asset 3 is redundant. Hence, any one of the four assets can be removed without affecting the set of dividend vectors that can be generated by forming portfolios. Note that once one of the assets has been removed, none of the three remaining assets will be redundant anymore. Whether an asset is redundant or not depends on the set of other assets available for trade. This implies that we must remove redundant assets one by one: first we remove one redundant asset, then we look for another asset which is still redundant; if we find one, we can remove that, etc. □
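The claims in the example can be verified with exact arithmetic. The sketch below uses the dividends of Table 3.1 and a small Gaussian-elimination routine written for this illustration (the function `rank` is not a library call) to confirm the replication of asset 4 and to check that any three of the four assets are non-redundant.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination in exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue                  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Rows are assets 1-4, columns are states 1-3 (Table 3.1).
D = [[1, 1, 1], [0, 1, 2], [4, 0, 1], [9, 0, 1]]

replica = [D[0][s] - D[1][s] + 2 * D[2][s] for s in range(3)]
print(replica == D[3])                # asset 4 is replicated by (1, -1, 2) in assets 1-3
print(rank(D))                        # rank of the full dividend matrix
print([rank(D[:j] + D[j + 1:]) for j in range(4)])   # rank after removing each asset in turn
```

Since the rank stays at three whichever asset is removed, the three remaining assets are always linearly independent, in line with the remark that none of them is then redundant.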
In the multi-period model an asset is said to be redundant if its dividend process can be generated by a trading strategy in the other assets. In the discrete-time framework, asset $i$ is redundant if there exists a trading strategy $\theta$ with $\theta_{it} = 0$ for all $t$ and all $\omega \in \Omega$ so that
\[
D_{it} = D^\theta_t \equiv \theta_{t-1} \cdot (D_t + P_t) - \theta_t \cdot P_t, \quad t = 1, \dots, T.
\]
Such a $\theta$ is called a replicating trading strategy for asset $i$.
Just as in the one-period setting, redundant assets are uniquely priced by no-arbitrage.
Theorem 3.1 If $\theta$ is a replicating trading strategy for asset $i$, the unique arbitrage-free price of asset $i$ at any time $t$ is
\[
P_{it} = \theta_t \cdot P_t.
\]
Proof: The trading strategy $\hat\theta$ defined by $\hat\theta_t = \theta_t - e_i$, where $e_i = (0, \dots, 0, 1, 0, \dots, 0)^\top$, is self-financing and $V^{\hat\theta}_T = 0$. No-arbitrage implies that $V^{\hat\theta}_t = 0$. The result now follows since $V^{\hat\theta}_t = \hat\theta_t \cdot P_t = \theta_t \cdot P_t - P_{it}$. □
The above definition of redundancy can be generalized to a continuous-time setting, where the theorem is also valid.
The theorem is useful for the pricing of derivatives and applied, e.g., in the Cox, Ross, and Rubinstein (1979) binomial model (see Exercise 3.1 at the end of this chapter) and the Black and Scholes (1973) continuous-time model for the pricing of stock options.
3.6 Marketed dividends and market completeness
By forming portfolios and trading strategies investors can generate other dividends than those of the individual basic assets. Any dividend that can be generated by trading the basic assets is said to be a marketed dividend. If all the dividends you can think of are marketed, the financial market is said to be complete. Otherwise the financial market is said to be incomplete. We will see in later chapters that some important results will depend on whether the financial market is complete or incomplete. Below we provide formal definitions and characterize complete markets.
3.6.1 The one-period framework
In the one-period framework dividends are random variables. A random variable is said to be a marketed dividend or spanned by traded assets if it is identical to the dividend of some portfolio of the traded assets. The dividend of a portfolio $\theta$ is the random variable $D^\theta = \theta \cdot D$. The set of marketed dividends is thus
\[
\mathcal{M} = \{ x \mid x = \theta \cdot D \text{ for some portfolio } \theta \}. \tag{3.17}
\]
Note that $\mathcal{M}$ is a subset of the set $\mathcal{L}$ of all random variables on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with finite variance.
If some of the basic assets are redundant, they will not help us in generating dividends. Suppose that there are $k \leq I$ non-redundant assets. The $k$-dimensional random variable of dividends of these assets is denoted by $\hat{D}$ and a portfolio of these assets is denoted by a $k$-dimensional vector $\hat\theta$. We can obtain exactly the same dividends by using only the non-redundant assets as by using all assets so
\[
\mathcal{M} = \left\{ x \mid x = \hat\theta \cdot \hat{D} \text{ for some portfolio } \hat\theta \right\}. \tag{3.18}
\]
The financial market is said to be complete if any dividend is marketed, i.e. if any random variable $x$ is the dividend of some portfolio. In symbols, the market is complete if $\mathcal{M} = \mathcal{L}$.
With $k$ non-redundant assets, the set of marketed dividends will be a $k$-dimensional linear subspace in $\mathcal{L}$. You have $k$ choice variables, namely how much to invest in each of the $k$ non-redundant assets. The dimension of $\mathcal{L}$ is the number of possible states. For each state $\omega$ you have to make sure that the dividend of the portfolio will equal the desired dividend $x(\omega)$. Whether the market is complete or not is therefore determined by the relation between the number of states and the number of non-redundant assets. If the state space is infinite, the market is clearly incomplete.
With a finite state space $\Omega = \{1, 2, \dots, S\}$, there can be at most $S$ non-redundant assets, i.e. $k \leq S$. If $k < S$, the market will be incomplete. If $k = S$, the market will be complete. Details follow now: With a finite state space dividends can be represented by $S$-dimensional vectors and the dividends of all the $I$ basic assets by the $I \times S$ dividend matrix $D$. A marketed dividend is then an $S$-dimensional vector $x$ for which a portfolio $\theta$ can be found such that
\[
D^\top \theta = x.
\]
Removing redundant assets we eliminate rows in the dividend matrix $D$. As long as one of the assets is redundant, the rows of $D$ will be linearly dependent. The maximum number of linearly independent rows of a matrix is called the rank of the matrix. It can be shown that this is also the maximum number of linearly independent columns of the matrix. With $k$ non-redundant assets, the rank of $D$ is equal to $k$. Removing the rows corresponding to the redundant assets from $D$ we obtain a matrix $\hat{D}$ of dimension $k \times S$, where $k \leq S$ since we cannot have more than $S$ linearly independent $S$-dimensional vectors. Then the set of marketed dividend vectors is
\[
\mathcal{M} = \left\{ \hat{D}^\top \hat\theta \mid \hat\theta \in \mathbb{R}^k \right\}
\]
since we can attain the same dividend vectors by forming portfolios of only the non-redundant assets as by forming portfolios of all the assets.
In the finite-state economy, the market is complete if $\mathcal{M} = \mathbb{R}^S$, i.e. any state-contingent dividend can be generated by forming portfolios of the traded assets. The market is complete if and only if for any $x \in \mathbb{R}^S$, we can find $\theta \in \mathbb{R}^I$ such that
\[
D^\top \theta = x.
\]
Market completeness is thus a question of when we can solve $S$ equations in $I$ unknowns. From linear algebra we have the following result:
Theorem 3.2 With a finite state space, $\Omega = \{1, 2, \dots, S\}$, the market is complete if and only if the rank of the $I \times S$ dividend matrix $D$ is equal to $S$.
Clearly, a necessary (but not sufficient) condition for a complete market is that $I \geq S$, i.e. that there are at least as many assets as states. If the market is complete, the pruned dividend matrix
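$\hat{D}$ is a non-singular $S \times S$ matrix, so the portfolio generating a given dividend vector $x$ is $\hat\theta = \left( \hat{D}^\top \right)^{-1} x$. In the present case, we have
\[
D = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 4 & 0 & 1 \end{pmatrix}, \qquad
D^\top = \begin{pmatrix} 1 & 0 & 4 \\ 1 & 1 & 0 \\ 1 & 2 & 1 \end{pmatrix}, \qquad
\left( D^\top \right)^{-1} = \begin{pmatrix} 0.2 & 1.6 & -0.8 \\ -0.2 & -0.6 & 0.8 \\ 0.2 & -0.4 & 0.2 \end{pmatrix}.
\]
For example, the portfolio providing a dividend vector of $(5, -10, 5)^\top$ is given by
\[
\theta = \begin{pmatrix} 0.2 & 1.6 & -0.8 \\ -0.2 & -0.6 & 0.8 \\ 0.2 & -0.4 & 0.2 \end{pmatrix} \begin{pmatrix} 5 \\ -10 \\ 5 \end{pmatrix} = \begin{pmatrix} -19 \\ 9 \\ 6 \end{pmatrix}.
\]
□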
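The portfolio in the example can be recomputed by solving the linear system $D^\top \theta = x$ directly. The sketch below assumes the three-asset dividend matrix with rows $(1,1,1)$, $(0,1,2)$, $(4,0,1)$ and takes the target dividend vector to be $(5, -10, 5)^\top$; the Gauss-Jordan routine `solve` is written for this illustration and expects a non-singular square system.

```python
from fractions import Fraction

def solve(a, b):
    """Solve a * theta = b by Gauss-Jordan elimination in exact arithmetic.

    a must be square and non-singular; a singular matrix makes the pivot
    search fail with StopIteration.
    """
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[pivot] = m[pivot], m[c]
        m[c] = [v / m[c][c] for v in m[c]]          # scale the pivot row to a leading 1
        for i in range(n):
            if i != c:
                f = m[i][c]
                m[i] = [vi - f * vc for vi, vc in zip(m[i], m[c])]
    return [row[-1] for row in m]

D = [[1, 1, 1], [0, 1, 2], [4, 0, 1]]     # rows: assets, columns: states
D_T = [list(col) for col in zip(*D)]      # transpose: one equation per state
x = [5, -10, 5]                           # target state-contingent dividend (assumed signs)
theta = solve(D_T, x)
payoff = [sum(theta[i] * D[i][s] for i in range(3)) for s in range(3)]
print(theta, payoff)
```

Solving the system avoids forming the inverse explicitly, and the recomputed state-by-state payoff provides a direct check that the portfolio delivers the target dividend.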
3.6.2 Multiperiod framework
In a multiperiod setting the market is said to be complete if for any random variable X there
is a trading strategy such that D
T
= X in all states. This implies that any terminal payment
you may think of can be obtained by some trading strategy. If you think of a payment before the
terminal date T, this can be transformed into a terminal payment by investing it in a given asset
until time T. Hence the denition covers all relevant dates and the market will thus be complete
if any adapted stochastic process can be generated by some trading strategy in the given assets.
More formally, let L denote the set of all random variables (with nite variance) whose outcome
can be determined from the exogenous shocks to the economy over the entire time span T. On
the other hand, let M denote the set of possible time T values that can be generated by forming
selfnancing trading strategies in the nancial market, i.e.
M =
_
V
T
[ selfnancing
_
.
Of course, for any trading strategy the terminal value V
T
is a random variable, whose outcome
is not determined until time T. Imposing relevant technical conditions on trading strategies, the
terminal value will have nite variance, so M is always a subset of L. If, in fact, M is equal to L,
the nancial market is complete. If not, it is said to be incomplete.
How many assets do we need in order to have a complete market? In the one-period model,
Theorem 3.2 tells us that with $S$ possible states we need at least $S$ sufficiently different assets for
the market to be complete. To generalize this result to the multiperiod setting we must be careful.
Consider once again the two-period model illustrated in Figures 2.1 and 2.2 in Chapter 2. Here
there are six possible outcomes, i.e. $\Omega$ has six elements. Hence, one might think that we need access
to trade in at least six sufficiently different assets in order for the market to be complete. This is
not correct. We can do with fewer assets. This is based on two observations: (i) the uncertainty
is not revealed completely at once, but little by little, and (ii) we can trade dynamically in the
assets. In the example there are three possible transitions of the economy from time 0 to time 1.
From our one-period analysis we know that three sufficiently different assets are enough to span
this uncertainty. From time 1 to time 2 there are either two, three, or one possible transition
of the economy, depending on which state the economy is in at time 1. At most, we need three
sufficiently different assets to span the uncertainty over this period. In total, we can generate any
dividend process if we just have access to three sufficiently different assets in both periods.
In the one-period model, "sufficiently different" means that the matrix of the dividends of the
assets in a given period is of full rank. In the discrete-time multiperiod model the payment of a
given asset at the end of each period is the sum of the price and the dividend in that period. The
relevant matrix is therefore the matrix of possible values of price plus dividend at the end of the
period. This matrix tells us how the assets will react to the exogenous shocks over that subperiod.
If we have at least as many assets that respond sufficiently differently to the shocks as we have
possible realizations of these shocks, we can completely hedge the shocks.
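
The rank condition for a single subperiod can be checked mechanically. The following is only a minimal sketch, assuming a finite number of successor states; the function names and the payoff numbers are hypothetical and chosen purely for illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for col in range(n_cols):
        # Look for a row at or below position `found` with a non-zero entry in this column.
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def spans_all_states(payoffs):
    """payoffs[i][s] is the price plus dividend of asset i in successor state s."""
    n_states = len(payoffs[0])
    return rank(payoffs) == n_states


# Hypothetical end-of-period values (price plus dividend) in three successor states.
three_assets = [[1, 1, 1], [0, 1, 2], [4, 0, 1]]
two_assets = [[1, 1, 1], [0, 1, 2]]
redundant = [[1, 1, 1], [0, 1, 2], [1, 2, 3]]  # third row = first row + second row

complete_3 = spans_all_states(three_assets)
complete_2 = spans_all_states(two_assets)
complete_r = spans_all_states(redundant)
```

In this sketch three assets with linearly independent payoffs span the three successor states, whereas two assets, or three assets of which one is redundant, do not.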
The continuous-time financial market with uncertainty generated by a $d$-dimensional standard
Brownian motion is complete if there are $d + 1$ sufficiently different assets traded at all times.
For example, this is true with an instantaneously risk-free asset plus $d$ risky assets having a price
sensitivity matrix $\underline{\underline{\sigma}}_t$ of rank $d$. The formal proof of this result is pretty complicated and will
not be given here. We refer the interested reader to Harrison and Pliska (1981, 1983) and Duffie
(2001). However, the result is quite intuitive given the following observations:
• For continuous changes over an instant, only means and variances matter.
• We can approximate the $d$-dimensional shock $d\boldsymbol{z}_t$ by a random variable that takes on $d + 1$
possible values and has the same mean and variance as $d\boldsymbol{z}_t$.
For example, a one-dimensional shock $dz_t$ has mean zero and variance $dt$. This is also true
for a random variable which equals $\sqrt{dt}$ with a probability of 1/2 and equals $-\sqrt{dt}$ with a
probability of 1/2.
• With continuous trading, we can adjust our exposure to the exogenous shocks every instant.
Over each instant we can thus think of the model with uncertainty generated by a $d$-dimensional
standard Brownian motion as a discrete-time model with $d+1$ states. Therefore it only takes $d+1$
sufficiently different assets to complete the market. For example, if all market prices are affected
by a single shock, a one-dimensional Brownian motion, the market will be complete if you can
always find two assets with different sensitivities towards that shock. One of the assets could be a
risk-free asset. Note however that when counting the number of shocks you should include shocks
to variables that contain information about future asset prices.
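
As a quick check of the one-dimensional two-point approximation mentioned in the observations above, write $\varepsilon$ for the random variable taking the values $\pm\sqrt{dt}$ with probability 1/2 each (the symbol $\varepsilon$ is introduced here only for this illustration). Then
$$
\mathrm{E}[\varepsilon] = \tfrac{1}{2}\sqrt{dt} + \tfrac{1}{2}\left(-\sqrt{dt}\right) = 0,
\qquad
\mathrm{Var}[\varepsilon] = \mathrm{E}[\varepsilon^2] - \left(\mathrm{E}[\varepsilon]\right)^2 = \tfrac{1}{2}\,dt + \tfrac{1}{2}\,dt = dt,
$$
so $\varepsilon$ matches the mean and variance of $dz_t$ with only $d + 1 = 2$ possible values.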
Let us take a simple, concrete example. Suppose that assets pay no intermediate dividends and
that the price dynamics of an arbitrary asset, asset $i$, is given by
$$
dP_{it} = P_{it}\left[\mu_i\,dt + \sigma_i\,dz_t\right],
$$
where $z = (z_t)$ is a one-dimensional standard Brownian motion and $\mu_i$, $\sigma_i$ are constants. Suppose
that assets $i$ and $j$ have different sensitivities towards the shock, i.e. that $\sigma_i \neq \sigma_j$. From (3.14), it
follows that a self-financing trading strategy with a weight of $\pi_t$ on asset $i$ and $1 - \pi_t$ on asset $j$
will generate a value process $V_t$ with dynamics
$$
dV_t = V_t\left[\left(\pi_t(\mu_i - \mu_j) + \mu_j\right)dt + \left(\pi_t(\sigma_i - \sigma_j) + \sigma_j\right)dz_t\right].
$$
We can obtain any desired sensitivity towards the shock. The portfolio weight
$\pi_t = (\nu_t - \sigma_j)/(\sigma_i - \sigma_j)$ will generate a portfolio return sensitivity of $\nu_t$.
Now suppose that each $\mu_i$ is a function of some
variable $x_t$ and that the dynamics of $x_t$ is affected by another shock represented by a standard
Brownian motion $\hat{z} = (\hat{z}_t)$ independent of $z$, e.g.
$$
dx_t = m(x_t)\,dt + v(x_t)\left(\rho\,dz_t + \sqrt{1 - \rho^2}\,d\hat{z}_t\right).
$$
Then the full uncertainty of the model is generated by a two-dimensional shock $(z, \hat{z})$. Since the
instantaneous price changes are affected only by $z$, the market is now incomplete. You cannot
hedge against the shock process $\hat{z}$.
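
The weight that delivers a desired sensitivity in the two-asset example above can be computed directly. The following is a small sketch under the assumptions of that example (one shock, two assets with different constant sensitivities); the function names and numerical values are hypothetical.

```python
def weight_for_sensitivity(target, sigma_i, sigma_j):
    """Weight on asset i (with 1 - weight on asset j) giving portfolio sensitivity `target`."""
    if sigma_i == sigma_j:
        raise ValueError("assets must have different sensitivities towards the shock")
    return (target - sigma_j) / (sigma_i - sigma_j)


def portfolio_sensitivity(weight, sigma_i, sigma_j):
    """Sensitivity of the portfolio value towards the shock, read off the dz-term of dV."""
    return weight * (sigma_i - sigma_j) + sigma_j


# Hypothetical sensitivities of the two assets.
sigma_i, sigma_j = 0.3, 0.1

w_target = weight_for_sensitivity(0.2, sigma_i, sigma_j)    # roughly one half in each asset
w_riskfree = weight_for_sensitivity(0.0, sigma_i, sigma_j)  # short asset i to remove the shock
check = portfolio_sensitivity(w_riskfree, sigma_i, sigma_j)
```

Setting the target sensitivity to zero produces a locally risk-free position, which illustrates why two assets with different sensitivities are enough to complete a market driven by a single shock.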
For a moment think about a continuous-time model where there can be a jump in the value
of one of the key variables. Suppose that in case of a jump the variable can jump to $K$ different
values. The change over an instant in such a variable cannot be represented only by the expectation
and the variance of the jump. To obtain any desired exposure to the jump risk you need $K$ assets
which react sufficiently differently to the jump. Typical models with jump risk assume that the
size of the jump is normally or lognormally distributed. In both cases there are infinitely many
possible realizations of the jump size and, consequently, a market with finitely many assets will be
incomplete.
3.6.3 Discussion
As we shall see in later sections, some fundamental results require that the market is complete. It
is therefore relevant to assess the realism of this property. Market completeness means that every
risk is traded. Individuals can insure against all risks (relevant for asset prices) through trading
in financial assets. Clearly, individuals would like to have that opportunity, so if the market is
incomplete there will be an incentive to create new non-redundant assets that will help complete
the market. In fact, at least part of the many new assets that have been introduced in the financial
markets over the last decades do help complete the market, e.g. assets with dividends depending on
the stock market volatility, the weather at a given location, or the number of natural catastrophes.
On the other hand, some risks are difficult to market. For example, due to the obvious information
asymmetry, it is unlikely that individual labor income risk can be fully insured in the financial
market. Maybe you would like to obtain full income insurance, but who should provide that, given
that you probably know a lot more about your potential income than anybody else, and that you
can influence your own income while others cannot? Hence, we should not expect that real-life
financial markets are complete in the strict sense.
If the market is incomplete, it is not possible to adjust your exposure to one or several shocks
included in the model. But maybe investors do not care about those shocks. Then the model is
said to be effectively complete. Before solving for prices and optimal decisions of the individuals,
it is generally impossible to decide whether an incomplete market is really effectively complete.
We will return to this discussion and some formal results on that in Chapter 7.
3.7 Concluding remarks
To be added...
3.8 Exercises
EXERCISE 3.1 Consider a one-period model with only two possible end-of-period states. Three
assets are traded in an arbitrage-free market. Asset 1 is a risk-free asset with a price of 1 and an
end-of-period dividend of $R^f$, the risk-free gross rate of return. Asset 2 has a price of $S$ and offers
a dividend of $uS$ in state 1 and $dS$ in state 2.
(a) Show that if the inequality $d < R^f < u$ does not hold, there will be an arbitrage.
Asset 3 is a call option on asset 2 with an exercise price of $K$. The dividend of asset 3 is therefore
$C_u \equiv \max(uS - K, 0)$ in state 1 and $C_d \equiv \max(dS - K, 0)$ in state 2.
(b) Show that a portfolio consisting of $\theta_1$ units of asset 1 and $\theta_2$ units of asset 2, where
$$
\theta_1 = (R^f)^{-1}\,\frac{uC_d - dC_u}{u - d}, \qquad \theta_2 = \frac{C_u - C_d}{(u - d)S},
$$
will generate the same dividend as the option.
(c) Show that the no-arbitrage price of the option is given by
$$
C = (R^f)^{-1}\left(qC_u + (1 - q)C_d\right),
$$
where $q = (R^f - d)/(u - d)$.
EXERCISE 3.2 Imagine a one-period economy with two possible end-of-period states that are
equally likely. Two assets are traded. Asset 1 has an initial price of 1 and pays off 1 in state 1 and
2 in state 2. Asset 2 has an initial price of 3 and gives a payoff of 2 in state 1 and a payoff $k$ in
state 2, where $k$ is some constant.
(a) Argue that if $k = 4$, the Law of One Price does not hold. Is the Law of One Price violated
for other values of $k$?
(b) For what values of $k$ is the market complete?
(c) For what values of $k$ is the market free of arbitrage?
(d) Assume $k = 8$. Is it possible to obtain a risk-free dividend? If so, what is the risk-free rate?
EXERCISE 3.3 Verify Equation (3.4).
EXERCISE 3.4 In a one-period two-state economy the risk-free interest rate over the period is
25%. An asset that pays out 100 in state 1 and 200 in state 2 trades at a price of 110.
(a) What is the no-arbitrage price of a second risky asset that pays out 200 in state 1 and 100
in state 2?
(b) If this second risky asset trades at a higher price than what you computed in (a), how can
you obtain a risk-free profit?
CHAPTER 4
State prices
4.1 Introduction
If you want to price a set of assets, you could take them one by one and evaluate the dividends of
each asset separately. However, to evaluate all assets in a consistent way (e.g. avoiding arbitrage)
it is a better strategy first to figure out what your general pricing rule should be and subsequently
apply that rule to any given dividend stream. The general pricing rule can be represented by
a state-price deflator, which is the topic of this chapter. Basically, a state-price deflator contains
information about the valuation of additional payments in different states and at different points in
time. Combining that with the state- and time-dependent dividends of any asset, you can compute
a value or price of that asset.
Section 4.2 defines the state-price deflator in each of our general frameworks (one-period, discrete-time,
continuous-time) and derives some immediate consequences for prices and expected returns.
Further important properties of state-price deflators are obtained in Section 4.3. Section 4.4 contains
examples of concrete distributional assumptions on the state-price deflator and the dividends
that will lead to a closed-form expression for the present value of the dividends. Section 4.5 explains
the difference between real and nominal state-price deflators. Finally, Section 4.6 gives a
preview of some alternative ways of representing the information in a state-price deflator. These
alternatives are preferable for some purposes and will be studied in more detail in later chapters.
The concept of state prices was introduced and studied by Arrow (1951, 1953, 1971), Debreu
(1954), Negishi (1960), and Ross (1978).
4.2 Definitions and immediate consequences
This section gives a formal definition of a state-price deflator. Some authors use the name stochastic
discount factor, event-price deflator, or pricing kernel instead of state-price deflator.
4.2.1 The one-period framework
A state-price deflator is a random variable $\zeta$ with the properties that
(i) $\zeta$ has finite variance,
(ii) $\zeta > 0$, i.e. $\zeta(\omega) > 0$ for all states $\omega \in \Omega$,
(iii) the prices of the $I$ basic assets are given by
$$
P_i = \mathrm{E}[\zeta D_i], \quad i = 1, 2, \dots, I, \tag{4.1}
$$
or, more compactly, $\boldsymbol{P} = \mathrm{E}[\zeta \boldsymbol{D}]$.
The finite variance assumptions on both the state-price deflator and the dividends ensure that the
expectation $\mathrm{E}[\zeta D_i]$ is finite.
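We get a similar pricing equation for portfolios:
$$
P^{\boldsymbol{\theta}} = \sum_{i=1}^{I} \theta_i P_i = \sum_{i=1}^{I} \theta_i\,\mathrm{E}[\zeta D_i]
= \mathrm{E}\left[\zeta \left(\sum_{i=1}^{I} \theta_i D_i\right)\right] = \mathrm{E}\left[\zeta D^{\boldsymbol{\theta}}\right]. \tag{4.2}
$$
In terms of gross rates of return, $R_i = D_i/P_i$, a state-price deflator $\zeta$ has the property that
$$
1 = \mathrm{E}[\zeta R_i], \quad i = 1, 2, \dots, I, \tag{4.3}
$$
or, $\boldsymbol{1} = \mathrm{E}[\zeta \boldsymbol{R}]$ in vector notation.
For a risk-free portfolio with a dividend of 1, the price $P^f$ is
$$
P^f = \mathrm{E}[\zeta]
$$
and the risk-free gross rate of return is thus
$$
R^f = \frac{1}{P^f} = \frac{1}{\mathrm{E}[\zeta]}. \tag{4.4}
$$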
Exploiting the definition of a covariance, the pricing condition (4.1) can be rewritten as
$$
P_i = \mathrm{E}[\zeta]\,\mathrm{E}[D_i] + \mathrm{Cov}[D_i, \zeta]. \tag{4.5}
$$
A dividend of a given size is valued more highly in a state for which the state-price deflator is high
than in a state where the deflator is low. If a risk-free asset is traded, we can rewrite this as
$$
P_i = \frac{\mathrm{E}[D_i] + R^f\,\mathrm{Cov}[D_i, \zeta]}{R^f}, \tag{4.6}
$$
i.e. the value of a future dividend is given by the expected dividend adjusted by a covariance term,
discounted at the risk-free rate. If the dividend is positively [negatively] covarying with the state-price
deflator, the expected dividend is adjusted upwards [downwards]. Defining the dividend-beta
of asset $i$ with respect to the state-price deflator as
$$
\beta[D_i, \zeta] = \frac{\mathrm{Cov}[D_i, \zeta]}{\mathrm{Var}[\zeta]} \tag{4.7}
$$
and $\eta = -\mathrm{Var}[\zeta]/\mathrm{E}[\zeta] < 0$, we can rewrite the above pricing equation as
$$
P_i = \frac{\mathrm{E}[D_i] - \beta[D_i, \zeta]\,\eta}{R^f}. \tag{4.8}
$$
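
With a finite state space the equivalent pricing expressions (4.1), (4.5), and (4.8) are easy to compare numerically. The sketch below is only illustrative: the state probabilities, the deflator values, and the dividend are hypothetical numbers, and the helper functions are not part of the text.

```python
def expectation(probs, values):
    """Probability-weighted average over a finite state space."""
    return sum(p * v for p, v in zip(probs, values))


def covariance(probs, x, y):
    """Covariance of two random variables given state by state."""
    ex, ey = expectation(probs, x), expectation(probs, y)
    return sum(p * (a - ex) * (b - ey) for p, a, b in zip(probs, x, y))


probs = [0.5, 0.25, 0.25]   # hypothetical state probabilities
zeta = [0.6, 0.8, 1.2]      # hypothetical state-price deflator
dividend = [4.0, 0.0, 1.0]  # hypothetical dividend in each state

mean_zeta = expectation(probs, zeta)
mean_div = expectation(probs, dividend)

# Eq. (4.1): P = E[zeta * D]
p_direct = expectation(probs, [z * d for z, d in zip(zeta, dividend)])

# Eq. (4.4): gross risk-free rate
risk_free_gross = 1.0 / mean_zeta

# Eq. (4.5): P = E[zeta] E[D] + Cov[D, zeta]
p_cov = mean_zeta * mean_div + covariance(probs, dividend, zeta)

# Eqs. (4.7)-(4.8): dividend-beta form with eta = -Var[zeta] / E[zeta]
var_zeta = covariance(probs, zeta, zeta)
beta = covariance(probs, dividend, zeta) / var_zeta
eta = -var_zeta / mean_zeta
p_beta = (mean_div - beta * eta) / risk_free_gross
```

All three routes give the same price for this dividend, as they must, since (4.5) and (4.8) are only rearrangements of (4.1).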
Some basic finance textbooks suggest that a future uncertain dividend can be valued by taking
the expected dividend and discounting it by a discount rate reflecting the risk of the dividend. For
asset $i$, this discount rate $\hat{R}_i$ is implicitly defined by $P_i = \mathrm{E}[D_i]/\hat{R}_i$ and combining this with the
above equations, we must have
$$
\hat{R}_i = \frac{R^f\,\mathrm{E}[D_i]}{\mathrm{E}[D_i] - \beta[D_i, \zeta]\,\eta}
= R^f\,\frac{1}{1 - \dfrac{\beta[D_i, \zeta]\,\eta}{\mathrm{E}[D_i]}}. \tag{4.9}
$$
The risk-adjusted discount rate does not depend on the scale of the dividend in the sense that the
risk-adjusted discount rate for a dividend of $kD_i$ for any constant $k$ is the same as for a dividend
of $D_i$. Note that the risk-adjusted discount rate will be smaller than $R^f$ if $\beta[D_i, \zeta]\,\eta/\mathrm{E}[D_i] < 0$
which, since $\eta < 0$, is the case if $\beta[D_i, \zeta]$ and $\mathrm{E}[D_i]$ are of the same sign. And if
$0 > \mathrm{E}[D_i]/\beta[D_i, \zeta] > \eta$, the risk-adjusted gross discount rate will be negative! While this possibility is rarely acknowledged
in textbooks, it is not really surprising. Some assets or investments will have a negative expected
future dividend but still a positive value today. This could be the case for insurance contracts that
typically have $\mathrm{E}[D_i] < 0$ and may have $\beta[D_i, \zeta] > 0$. The lesson here is to be careful if you want
to value assets by discounting expected dividends by risk-adjusted discount rates.
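
A two-state example makes the possibility of a negative risk-adjusted discount rate concrete. The numbers below are hypothetical and chosen only so that the dividend has a negative mean but pays off in the state where the deflator is high; this is a sketch, not a calibrated insurance contract.

```python
probs = [0.5, 0.5]      # hypothetical state probabilities
zeta = [0.5, 1.5]       # hypothetical state-price deflator (state 2 is the "bad" state)
dividend = [-2.0, 1.0]  # pays a premium in the good state, receives 1 in the bad state


def expectation(values):
    """Expectation under the two-state probabilities above."""
    return sum(p * v for p, v in zip(probs, values))


expected_dividend = expectation(dividend)                          # negative
price = expectation([z * d for z, d in zip(zeta, dividend)])       # positive, by (4.1)
risk_free_gross = 1.0 / expectation(zeta)

# Implicit definition of the risk-adjusted discount rate: P = E[D] / R_hat
risk_adjusted_rate = expected_dividend / price

# The same number from (4.9), using the dividend-beta and eta
var_zeta = expectation([z * z for z in zeta]) - expectation(zeta) ** 2
beta = (price - expectation(zeta) * expected_dividend) / var_zeta
eta = -var_zeta / expectation(zeta)
rate_from_4_9 = risk_free_gross / (1.0 - beta * eta / expected_dividend)
```

Here the expected dividend is negative while the price is positive, so the gross discount rate that reconciles the two has to be negative, even though the risk-free gross rate equals one.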
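For the gross rate of return, $R_i = D_i/P_i$, we get
$$
1 = \mathrm{E}[\zeta]\,\mathrm{E}[R_i] + \mathrm{Cov}[R_i, \zeta],
$$
implying that
$$
\mathrm{E}[R_i] = \frac{1}{\mathrm{E}[\zeta]} - \frac{\mathrm{Cov}[R_i, \zeta]}{\mathrm{E}[\zeta]}. \tag{4.10}
$$
If a risk-free asset is available, the above equation specializes to
$$
\mathrm{E}[R_i] - R^f = -\frac{\mathrm{Cov}[R_i, \zeta]}{\mathrm{E}[\zeta]}, \tag{4.11}
$$
which again can be rewritten as
$$
\mathrm{E}[R_i] - R^f = \frac{\mathrm{Cov}[R_i, \zeta]}{\mathrm{Var}[\zeta]}\left(-\frac{\mathrm{Var}[\zeta]}{\mathrm{E}[\zeta]}\right) = \beta[R_i, \zeta]\,\eta, \tag{4.12}
$$
where the return-beta $\beta[R_i, \zeta]$ is defined as $\mathrm{Cov}[R_i, \zeta]/\mathrm{Var}[\zeta]$, corresponding to the definition of
the market-beta of an asset in the traditional CAPM. An asset with a positive [negative] return-beta
with respect to the state-price deflator will have an expected return smaller [larger] than the
risk-free return. Of course, we can also write the covariance as the product of the correlation and
the standard deviations so that
$$
\mathrm{E}[R_i] - R^f = -\rho[R_i, \zeta]\,\sigma[R_i]\,\frac{\sigma[\zeta]}{\mathrm{E}[\zeta]}, \tag{4.13}
$$
and the Sharpe ratio of asset $i$ is
$$
\frac{\mathrm{E}[R_i] - R^f}{\sigma[R_i]} = -\rho[R_i, \zeta]\,\frac{\sigma[\zeta]}{\mathrm{E}[\zeta]}. \tag{4.14}
$$
In particular, we can see that the maximal Sharpe ratio is the ratio $\sigma[\zeta]/\mathrm{E}[\zeta]$.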
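
The Sharpe ratio bound can be illustrated on a small finite state space. The following sketch uses hypothetical probabilities, deflator values, and dividends; the third dividend is constructed as a constant minus the deflator, so its return is perfectly negatively correlated with the deflator and should attain the bound.

```python
import math


def mean(probs, values):
    return sum(p * v for p, v in zip(probs, values))


def std(probs, values):
    m = mean(probs, values)
    return math.sqrt(sum(p * (v - m) ** 2 for p, v in zip(probs, values)))


def sharpe_ratio(probs, zeta, dividend):
    """(E[R] - R^f) / sigma[R] for an asset priced by P = E[zeta * D]."""
    price = mean(probs, [z * d for z, d in zip(zeta, dividend)])
    gross_returns = [d / price for d in dividend]
    risk_free_gross = 1.0 / mean(probs, zeta)
    return (mean(probs, gross_returns) - risk_free_gross) / std(probs, gross_returns)


probs = [0.5, 0.25, 0.25]  # hypothetical state probabilities
zeta = [0.6, 0.8, 1.2]     # hypothetical state-price deflator

bound = std(probs, zeta) / mean(probs, zeta)  # sigma[zeta] / E[zeta]

dividends = {
    "A": [0.0, 1.0, 2.0],             # pays most where the deflator is high
    "B": [4.0, 0.0, 1.0],
    "C": [2.0 - z for z in zeta],     # perfectly negatively correlated with zeta
}
sharpe = {name: sharpe_ratio(probs, zeta, d) for name, d in dividends.items()}
```

In this sketch asset A has a negative Sharpe ratio because it pays off mainly in the high-deflator state, asset B lies strictly inside the bound, and asset C reaches it.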
The expressions involving expected returns and return-betas above are not directly useful if
you want to value a future dividend. For that purpose the equations with expected dividends
and dividend-betas are superior. On the other hand the return expressions are better for empirical
studies, where you have a historical record of observations of returns and, for example, of a potential
state-price deflator.
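Example 4.1 Let $\zeta$ be a state-price deflator and consider a dividend given by
$$
D_i = a + b\zeta + \varepsilon,
$$
where $a$, $b$ are constants, and where $\varepsilon$ is a random variable with mean zero and $\mathrm{Cov}[\varepsilon, \zeta] = 0$. What
is the price of this dividend? We can use the original pricing condition to get
$$
P_i = \mathrm{E}[D_i \zeta] = \mathrm{E}[(a + b\zeta + \varepsilon)\,\zeta] = a\,\mathrm{E}[\zeta] + b\,\mathrm{E}[\zeta^2],
$$
using $\mathrm{E}[\varepsilon\zeta] = \mathrm{Cov}[\varepsilon, \zeta] + \mathrm{E}[\varepsilon]\,\mathrm{E}[\zeta] = 0 + 0 = 0$. Alternatively, we can compute
$$
\mathrm{E}[D_i] = a + b\,\mathrm{E}[\zeta], \qquad \mathrm{Cov}[D_i, \zeta] = b\,\mathrm{Var}[\zeta],
$$
and use (4.5) to get
$$
\begin{aligned}
P_i &= \mathrm{E}[\zeta]\left(a + b\,\mathrm{E}[\zeta]\right) + b\,\mathrm{Var}[\zeta] \\
&= a\,\mathrm{E}[\zeta] + b\,\mathrm{E}[\zeta^2],
\end{aligned}
$$
using the identity $\mathrm{Var}[\zeta] = \mathrm{E}[\zeta^2] - (\mathrm{E}[\zeta])^2$. $\Box$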
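Lognormal state-price deflator and return
In many specific models the state-price deflator is assumed to have a lognormal distribution. It
follows from Appendix B that
$$
\mathrm{E}[\zeta] = \mathrm{E}\left[e^{\ln \zeta}\right] = e^{\mathrm{E}[\ln \zeta] + \frac{1}{2}\mathrm{Var}[\ln \zeta]}.
$$
In particular, if a risk-free asset exists, the continuously compounded risk-free rate of return is
$$
r^f = \ln R^f = -\ln \mathrm{E}[\zeta] = -\mathrm{E}[\ln \zeta] - \frac{1}{2}\mathrm{Var}[\ln \zeta]. \tag{4.15}
$$
For a risky asset with a gross return $R_i$ so that the state-price deflator and the gross return $R_i$
are jointly lognormally distributed, the product $\zeta R_i$ will also be lognormally distributed. Hence,
we find
$$
1 = \mathrm{E}[\zeta R_i] = \mathrm{E}\left[e^{\ln \zeta + \ln R_i}\right]
= \exp\left\{\mathrm{E}[\ln \zeta + \ln R_i] + \frac{1}{2}\mathrm{Var}[\ln \zeta + \ln R_i]\right\},
$$
and thus
$$
\begin{aligned}
0 &= \mathrm{E}[\ln \zeta + \ln R_i] + \frac{1}{2}\mathrm{Var}[\ln \zeta + \ln R_i] \\
&= \mathrm{E}[\ln \zeta] + \frac{1}{2}\mathrm{Var}[\ln \zeta] + \mathrm{E}[\ln R_i] + \frac{1}{2}\mathrm{Var}[\ln R_i] + \mathrm{Cov}[\ln \zeta, \ln R_i].
\end{aligned}
$$
If we let $r_i = \ln R_i$ denote the continuously compounded rate of return on asset $i$ and apply (4.15),
we obtain
$$
\mathrm{E}[r_i] - r^f = -\mathrm{Cov}[\ln \zeta, r_i] - \frac{1}{2}\mathrm{Var}[r_i]. \tag{4.16}
$$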
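
To see the orders of magnitude involved, here is a small worked example with purely hypothetical moments. Suppose $\mathrm{E}[\ln \zeta] = -0.04$, $\mathrm{Var}[\ln \zeta] = 0.04$, $\mathrm{Cov}[\ln \zeta, r_i] = -0.03$, and $\mathrm{Var}[r_i] = 0.04$. Then (4.15) and (4.16) give
$$
r^f = 0.04 - \tfrac{1}{2}\cdot 0.04 = 0.02,
\qquad
\mathrm{E}[r_i] - r^f = 0.03 - \tfrac{1}{2}\cdot 0.04 = 0.01,
$$
so the asset, whose log-return covaries negatively with the log-deflator, has an expected continuously compounded return of 3%, one percentage point above the risk-free rate.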
Excess returns
Some empirical tests of asset pricing models focus on excess returns, where returns on all assets
are measured relative to the return on a fixed benchmark asset or portfolio. Let $\bar{R}$ be the return
on the benchmark. The excess return on asset $i$ is then $R^e_i \equiv R_i - \bar{R}$. Since the above equations
hold for both asset $i$ and the benchmark, we get that
$$
\mathrm{E}[\zeta R^e_i] = 0
$$
and
$$
\mathrm{E}[R^e_i] = -\frac{\mathrm{Cov}[R^e_i, \zeta]}{\mathrm{E}[\zeta]} = \beta[R^e_i, \zeta]\,\eta, \tag{4.17}
$$
where $\beta[R^e_i, \zeta] = \beta[R_i, \zeta] - \beta[\bar{R}, \zeta]$.
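State-price vectors with a finite state space
If the state space is finite, $\Omega = \{1, 2, \dots, S\}$, we can alternatively represent a general pricing rule
by a state-price vector, which is an $S$-dimensional vector $\boldsymbol{\psi} = (\psi_1, \dots, \psi_S)^\top$ with strictly
positive elements such that
$$
P_i = \boldsymbol{\psi} \cdot \boldsymbol{D}_i = \sum_{\omega=1}^{S} \psi_\omega D_{i\omega}, \quad i = 1, 2, \dots, I, \tag{4.18}
$$
or, more compactly, $\boldsymbol{P} = \underline{\underline{D}}\,\boldsymbol{\psi}$, where $\underline{\underline{D}}$ is the dividend matrix of all the basic assets.
Suppose we can construct an Arrow-Debreu asset for state $\omega$, i.e. a portfolio paying 1 in state $\omega$
and nothing in all other states. The price of this portfolio will be equal to $\psi_\omega$, the state price
for state $\omega$. In particular, a risk-free dividend of 1 has the price $\sum_{\omega=1}^{S} \psi_\omega$, so the
risk-free gross rate of return is
$$
R^f = \frac{1}{\sum_{\omega=1}^{S} \psi_\omega} = \frac{1}{\boldsymbol{\psi} \cdot \boldsymbol{1}}.
$$
There is a one-to-one correspondence between state-price vectors and state-price deflators. With
a finite state space a state-price deflator is equivalent to a vector $\boldsymbol{\zeta} = (\zeta_1, \zeta_2, \dots, \zeta_S)^\top$ and we can
rewrite (4.1) as
$$
P_i = \sum_{\omega=1}^{S} p_\omega \zeta_\omega D_{i\omega}, \quad i = 1, 2, \dots, I,
$$
where $p_\omega$ is the probability of state $\omega$. We can then define a state-price vector $\boldsymbol{\psi}$ by
$$
\psi_\omega = \zeta_\omega p_\omega. \tag{4.19}
$$
Conversely, given a state-price vector this equation defines a state-price deflator. With infinitely
many states we cannot meaningfully define state-price vectors but we can still define state-price
deflators in terms of random variables.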
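Example 4.2 Consider the same market as in Example 3.1. Suppose there is a state-price vector
$\boldsymbol{\psi} = (0.3, 0.2, 0.3)^\top$. With state probabilities of 0.5, 0.25, and 0.25, respectively, the
corresponding state-price deflator follows from (4.19):
$$
\zeta_1 = \frac{0.3}{0.5} = 0.6, \qquad \zeta_2 = \frac{0.2}{0.25} = 0.8, \qquad \zeta_3 = \frac{0.3}{0.25} = 1.2. \qquad \Box
$$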
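
The correspondence (4.19) between state-price vectors and state-price deflators is a simple element-wise rescaling. The sketch below reproduces the numbers of Example 4.2; the function names are hypothetical, and the probabilities are the ones implied by the denominators in that example.

```python
def deflator_from_state_prices(psi, probs):
    """zeta_w = psi_w / p_w for each state w, i.e. (4.19) solved for zeta."""
    return [s / p for s, p in zip(psi, probs)]


def state_prices_from_deflator(zeta, probs):
    """psi_w = zeta_w * p_w for each state w, i.e. (4.19)."""
    return [z * p for z, p in zip(zeta, probs)]


psi = [0.3, 0.2, 0.3]      # state-price vector from Example 4.2
probs = [0.5, 0.25, 0.25]  # state probabilities implied by Example 4.2

zeta = deflator_from_state_prices(psi, probs)
psi_back = state_prices_from_deflator(zeta, probs)

# A risk-free dividend of 1 costs the sum of the state prices.
risk_free_gross = 1.0 / sum(psi)
```

The round trip recovers the original state prices, and the implied gross risk-free rate is the reciprocal of the sum of the state prices, which equals the reciprocal of the expected deflator.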
4.2.2 The discrete-time framework
In the discrete-time multiperiod framework with time set $\mathcal{T} = \{0, 1, 2, \dots, T\}$ a state-price deflator
is an adapted stochastic process $\zeta = (\zeta_t)_{t \in \mathcal{T}}$ such that
(i) $\zeta_0 = 1$,
(ii) $\zeta_t > 0$ for all $t = 1, 2, \dots, T$,
(iii) for any $t \in \mathcal{T}$, $\zeta_t$ has finite variance,
(iv) for any basic asset $i = 1, \dots, I$ and any $t \in \mathcal{T}$, the price satisfies
$$
P_{it} = \mathrm{E}_t\left[\sum_{s=t+1}^{T} D_{is}\,\frac{\zeta_s}{\zeta_t}\right]. \tag{4.20}
$$
The condition (i) is just a normalization. The condition (iii) is purely technical and will ensure that
some relevant expectations exist. Condition (iv) gives the price at time $t$ in terms of all the future
dividends and the state-price deflator. This condition will also hold for all trading strategies.
The pricing condition implies a link between the price of an asset at two different points in time,
say $t < t'$. Applying (4.20) at time $t'$, we have
$$
P_{it'} = \mathrm{E}_{t'}\left[\sum_{s=t'+1}^{T} D_{is}\,\frac{\zeta_s}{\zeta_{t'}}\right].
$$
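We can now rewrite the price $P_{it}$ as follows:
$$
\begin{aligned}
P_{it} &= \mathrm{E}_t\left[\sum_{s=t+1}^{T} D_{is}\,\frac{\zeta_s}{\zeta_t}\right] \\
&= \mathrm{E}_t\left[\sum_{s=t+1}^{t'} D_{is}\,\frac{\zeta_s}{\zeta_t} + \sum_{s=t'+1}^{T} D_{is}\,\frac{\zeta_s}{\zeta_t}\right] \\
&= \mathrm{E}_t\left[\sum_{s=t+1}^{t'} D_{is}\,\frac{\zeta_s}{\zeta_t} + \frac{\zeta_{t'}}{\zeta_t}\sum_{s=t'+1}^{T} D_{is}\,\frac{\zeta_s}{\zeta_{t'}}\right] \\
&= \mathrm{E}_t\left[\sum_{s=t+1}^{t'} D_{is}\,\frac{\zeta_s}{\zeta_t} + \frac{\zeta_{t'}}{\zeta_t}\,\mathrm{E}_{t'}\left[\sum_{s=t'+1}^{T} D_{is}\,\frac{\zeta_s}{\zeta_{t'}}\right]\right] \\
&= \mathrm{E}_t\left[\sum_{s=t+1}^{t'} D_{is}\,\frac{\zeta_s}{\zeta_t} + P_{it'}\,\frac{\zeta_{t'}}{\zeta_t}\right].
\end{aligned} \tag{4.21}
$$
Here the fourth equality follows from the Law of Iterated Expectations, Theorem 2.1. Conversely,
Equation (4.21) implies Equation (4.20).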
A particularly simple version of (4.21) occurs for $t' = t + 1$:
$$
P_{it} = \mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\left(P_{i,t+1} + D_{i,t+1}\right)\right], \tag{4.22}
$$
or in terms of the gross rate of return $R_{i,t+1} = (P_{i,t+1} + D_{i,t+1})/P_{it}$:
$$
1 = \mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\,R_{i,t+1}\right]. \tag{4.23}
$$
These equations show that the ratio $\zeta_{t+1}/\zeta_t$ acts as a one-period state-price deflator between time $t$
and $t + 1$ in the sense of the definition in the one-period framework, except that the price at the
end of the period (zero in the one-period case) is added to the dividend. Of course, given the state-price
deflator process $\zeta = (\zeta_t)$ we know the state-price deflators $\zeta_{t+1}/\zeta_t$ for each of the subperiods.
(Note that these will depend on the realized value $\zeta_t$, which is part of the information available at
time $t$.) Conversely, a sequence of one-period state-price deflators $\zeta_{t+1}/\zeta_t$ and the normalization
$\zeta_0 = 1$ define the entire state-price deflator process.
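
The equivalence of the multiperiod condition (4.20) and the one-period recursion (4.22) can be checked on a small event tree. The following is only a sketch: the two-period binomial tree, its conditional probabilities, the one-period deflators, and the dividends are all hypothetical.

```python
# children[node] lists (child, conditional probability, one-period deflator zeta_child / zeta_node).
children = {
    "0": [("u", 0.5, 0.8), ("d", 0.5, 1.1)],
    "u": [("uu", 0.5, 0.7), ("ud", 0.5, 1.2)],
    "d": [("du", 0.5, 0.9), ("dd", 0.5, 1.0)],
}

# Dividend paid by the asset at each node after time 0.
dividend = {"u": 1.0, "d": 0.5, "uu": 12.0, "ud": 10.0, "du": 10.0, "dd": 8.0}


def price_recursive(node):
    """Eq. (4.22): P_t = E_t[(zeta_{t+1} / zeta_t) (P_{t+1} + D_{t+1})]; terminal price is zero."""
    return sum(
        prob * ratio * (price_recursive(child) + dividend[child])
        for child, prob, ratio in children.get(node, [])
    )


def price_direct(node, path_prob=1.0, path_deflator=1.0):
    """Eq. (4.20): sum over all future nodes of probability * (zeta_s / zeta_t) * D_s."""
    total = 0.0
    for child, prob, ratio in children.get(node, []):
        p, z = path_prob * prob, path_deflator * ratio
        total += p * z * dividend[child]
        total += price_direct(child, p, z)
    return total


p0_recursive = price_recursive("0")
p0_direct = price_direct("0")
```

Both routes give the same time 0 price. The direct sum multiplies the one-period deflators along each path, which is exactly how a sequence of one-period deflators and the normalization at time 0 build up the full deflator process.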
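If an asset provides a risk-free gross rate of return $R^f_t$ over the period between $t$ and $t + 1$, that
return will be known at time $t$ and we get that
$$
1 = \mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\,R^f_t\right] = R^f_t\,\mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right] \tag{4.24}
$$
and hence
$$
R^f_t = \left(\mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right]\right)^{-1}. \tag{4.25}
$$
Similar to the one-period case, the above equations lead to an expression for the expected excess
return of an asset over a single period:
$$
\mathrm{E}_t[R_{i,t+1}] - R^f_t
= -\frac{\mathrm{Cov}_t\left[R_{i,t+1}, \frac{\zeta_{t+1}}{\zeta_t}\right]}{\mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right]}
= \beta_t\left[R_{i,t+1}, \frac{\zeta_{t+1}}{\zeta_t}\right]\eta_t, \tag{4.26}
$$
where
$$
\beta_t\left[R_{i,t+1}, \frac{\zeta_{t+1}}{\zeta_t}\right]
= \frac{\mathrm{Cov}_t\left[R_{i,t+1}, \frac{\zeta_{t+1}}{\zeta_t}\right]}{\mathrm{Var}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right]},
\qquad
\eta_t = -\frac{\mathrm{Var}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right]}{\mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right]}.
$$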
The same relation holds for net rates of return. Just substitute $R_{i,t+1} = 1 + r_{i,t+1}$ and
$R^f_t = 1 + r^f_t$ on the left-hand side and observe that the ones cancel. On the right-hand side use that
$\mathrm{Cov}_t[R_{i,t+1}, x] = \mathrm{Cov}_t[1 + r_{i,t+1}, x] = \mathrm{Cov}_t[r_{i,t+1}, x]$ for any random variable $x$. Therefore we can
work with gross or net returns as we like in such expressions. The conditional Sharpe ratio of
asset $i$ becomes
$$
\frac{\mathrm{E}_t[R_{i,t+1}] - R^f_t}{\sigma_t[R_{i,t+1}]}
= -\rho_t\left[R_{i,t+1}, \frac{\zeta_{t+1}}{\zeta_t}\right]
\frac{\sigma_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right]}{\mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right]}. \tag{4.27}
$$
The maximal conditional Sharpe ratio is
$\sigma_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right] / \mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right]$. These expressions are useful for
empirical purposes.
In a situation where the next-period state-price deflator $\zeta_{t+1}/\zeta_t$ and the next-period gross return
$R_{i,t+1} = e^{r_{i,t+1}}$ on a risky asset are jointly lognormally distributed conditional on time $t$ information,
it follows from our one-period analysis that the expected excess continuously compounded
rate of return is
$$
\mathrm{E}_t[r_{i,t+1}] - r^f_t = -\mathrm{Cov}_t\left[\ln\left(\frac{\zeta_{t+1}}{\zeta_t}\right), r_{i,t+1}\right] - \frac{1}{2}\mathrm{Var}_t[r_{i,t+1}], \tag{4.28}
$$
where
$$
r^f_t = \ln R^f_t = -\ln \mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t}\right]
= -\mathrm{E}_t\left[\ln\left(\frac{\zeta_{t+1}}{\zeta_t}\right)\right] - \frac{1}{2}\mathrm{Var}_t\left[\ln\left(\frac{\zeta_{t+1}}{\zeta_t}\right)\right] \tag{4.29}
$$
is the continuously compounded risk-free rate of return over the next period.
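For the valuation of a future dividend stream we have to apply (4.20). Applying the covariance
definition once again, we can rewrite the price as
$$
P_{it} = \sum_{s=t+1}^{T}\left(\mathrm{E}_t[D_{is}]\,\mathrm{E}_t\left[\frac{\zeta_s}{\zeta_t}\right] + \mathrm{Cov}_t\left[D_{is}, \frac{\zeta_s}{\zeta_t}\right]\right).
$$
By definition of a state-price deflator, a zero-coupon bond maturing at time $s$ with a face value
of 1 will have a time $t$ price of
$$
B^s_t = \mathrm{E}_t\left[\frac{\zeta_s}{\zeta_t}\right].
$$
Define the corresponding (annualized) yield $y^s_t$ by
$$
B^s_t = \frac{1}{(1 + y^s_t)^{s-t}} \quad\Leftrightarrow\quad y^s_t = (B^s_t)^{-1/(s-t)} - 1. \tag{4.30}
$$
Note that this is a risk-free rate of return between time $t$ and time $s$. Now we can rewrite the
above price expression as
$$
P_{it} = \sum_{s=t+1}^{T} B^s_t\left(\mathrm{E}_t[D_{is}] + \frac{\mathrm{Cov}_t\left[D_{is}, \frac{\zeta_s}{\zeta_t}\right]}{B^s_t}\right)
= \sum_{s=t+1}^{T} \frac{\mathrm{E}_t[D_{is}] + \dfrac{\mathrm{Cov}_t\left[D_{is}, \frac{\zeta_s}{\zeta_t}\right]}{\mathrm{E}_t\left[\frac{\zeta_s}{\zeta_t}\right]}}{(1 + y^s_t)^{s-t}}. \tag{4.31}
$$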
Each dividend is valued by discounting an appropriately risk-adjusted expected dividend by the
risk-free return over the period. This generalizes the result in the one-period framework.
Some authors formulate the important pricing condition (iv) as follows: for all basic assets
$i = 1, \dots, I$ the state-price deflated gains process $G^\zeta_i = (G^\zeta_{it})_{t \in \mathcal{T}}$ defined by
$$
G^\zeta_{it} = \sum_{s=1}^{t} D_{is}\zeta_s + P_{it}\zeta_t \tag{4.32}
$$
is a martingale. This means that for $t < t'$, $G^\zeta_{it} = \mathrm{E}_t[G^\zeta_{it'}]$, i.e.
$$
\sum_{s=1}^{t} D_{is}\zeta_s + P_{it}\zeta_t = \mathrm{E}_t\left[\sum_{s=1}^{t'} D_{is}\zeta_s + P_{it'}\zeta_{t'}\right].
$$
Subtracting the sum on the left-hand side and dividing by $\zeta_t$ yield (4.21).
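4.2.3 The continuous-time framework
Similar to the discrete-time framework we define a state-price deflator in the continuous-time
framework as an adapted stochastic process $\zeta = (\zeta_t)$ with
(i) $\zeta_0 = 1$,
(ii) $\zeta_t > 0$ for all $t \in [0, T]$,
(iii) for each $t$, $\zeta_t$ has finite variance,
(iv) for any basic asset $i = 1, \dots, I$ and any $t \in [0, T)$, the price satisfies
$$
P_{it} = \mathrm{E}_t\left[\int_t^T \delta_{is} P_{is}\,\frac{\zeta_s}{\zeta_t}\,ds + D_{iT}\,\frac{\zeta_T}{\zeta_t}\right]. \tag{4.33}
$$
Under technical conditions, Equation (4.33) will also hold for all trading strategies. As in the
discrete-time case the pricing equation (4.33) implies that for any $t < t' < T$
$$
P_{it} = \mathrm{E}_t\left[\int_t^{t'} \delta_{is} P_{is}\,\frac{\zeta_s}{\zeta_t}\,ds + P_{it'}\,\frac{\zeta_{t'}}{\zeta_t}\right]. \tag{4.34}
$$
Again, the condition (iv) can be reformulated as follows: for all basic assets $i = 1, \dots, I$ the
state-price deflated gains process $G^\zeta_i = (G^\zeta_{it})_{t \in \mathcal{T}}$ defined by
$$
G^\zeta_{it} =
\begin{cases}
\displaystyle\int_0^t \delta_{is} P_{is}\zeta_s\,ds + P_{it}\zeta_t & \text{for } t < T, \\[2ex]
\displaystyle\int_0^T \delta_{is} P_{is}\zeta_s\,ds + D_{iT}\zeta_T & \text{for } t = T
\end{cases} \tag{4.35}
$$
is a martingale.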
If we invest in one unit of asset $i$ at time $t$ and keep reinvesting the continuously paid dividends
in the asset, we will end up at time $T$ with $\exp\left\{\int_t^T \delta_{iu}\,du\right\}$ units of the asset, cf. the argument in
Section 3.2.3. Therefore, we have the following relation:
$$
P_{it} = \mathrm{E}_t\left[e^{\int_t^T \delta_{iu}\,du}\,D_{iT}\,\frac{\zeta_T}{\zeta_t}\right]. \tag{4.36}
$$
As in the discrete-time case we can derive information about the short-term risk-free rate of
return and the expected returns on the risky assets from the state-price deflator. First we do this
informally by considering a discrete-time approximation with period length $\Delta t$ and let $\Delta t \to 0$ at
some point. In such an approximate model the risk-free one-period gross rate of return satisfies
$$
\frac{1}{R^f_t} = \mathrm{E}_t\left[\frac{\zeta_{t+\Delta t}}{\zeta_t}\right],
$$
according to (4.25). In terms of the annualized continuously compounded one-period risk-free rate
$r^f_t$, the left-hand side is given by $\exp\{-r^f_t \Delta t\}$. Subtracting one on both sides of the equation and
changing signs, we get
$$
1 - e^{-r^f_t \Delta t} = -\mathrm{E}_t\left[\frac{\zeta_{t+\Delta t} - \zeta_t}{\zeta_t}\right].
$$
Dividing by $\Delta t$ and letting $\Delta t \to 0$ we get
$$
r^f_t = -\lim_{\Delta t \to 0} \frac{1}{\Delta t}\,\mathrm{E}_t\left[\frac{\zeta_{t+\Delta t} - \zeta_t}{\zeta_t}\right]
= -\frac{1}{dt}\,\mathrm{E}_t\left[\frac{d\zeta_t}{\zeta_t}\right].
$$
The left-hand side is the continuously compounded short-term risk-free interest rate. The right-hand
side is minus the relative drift of the state-price deflator.
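To obtain an expression for the current expected return on a risky asset start with the equivalent
of (4.23) in the approximating discrete-time model, i.e.
$$
\mathrm{E}_t\left[\frac{\zeta_{t+\Delta t}}{\zeta_t}\,R_{i,t+\Delta t}\right] = 1,
$$
which implies that
$$
\mathrm{E}_t[R_{i,t+\Delta t}]\,\mathrm{E}_t\left[\frac{\zeta_{t+\Delta t}}{\zeta_t}\right] - 1
= -\mathrm{Cov}_t\left[R_{i,t+\Delta t}, \frac{\zeta_{t+\Delta t}}{\zeta_t}\right].
$$
For small $\Delta t$, we have $\mathrm{E}_t\left[\frac{\zeta_{t+\Delta t}}{\zeta_t}\right] \approx e^{-r^f_t \Delta t}$. The gross return is
$$
R_{i,t+\Delta t} = e^{\int_t^{t+\Delta t} \delta_{iu}\,du}\,\frac{P_{i,t+\Delta t}}{P_{it}}
\approx \exp\left\{\left(\delta_{it} + \mu_{it} - \frac{1}{2}\|\boldsymbol{\sigma}_{it}\|^2\right)\Delta t + \boldsymbol{\sigma}_{it}^\top \Delta \boldsymbol{z}_t\right\},
$$
so that $\mathrm{E}_t[R_{i,t+\Delta t}] \approx e^{(\delta_{it} + \mu_{it})\Delta t}$ and
$$
\begin{aligned}
\mathrm{Cov}_t\left[R_{i,t+\Delta t}, \frac{\zeta_{t+\Delta t}}{\zeta_t}\right]
&= e^{\int_t^{t+\Delta t} \delta_{iu}\,du}\,\mathrm{Cov}_t\left[\frac{P_{i,t+\Delta t}}{P_{it}}, \frac{\zeta_{t+\Delta t}}{\zeta_t}\right] \\
&= e^{\int_t^{t+\Delta t} \delta_{iu}\,du}\,\mathrm{Cov}_t\left[\frac{P_{i,t+\Delta t} - P_{it}}{P_{it}}, \frac{\zeta_{t+\Delta t}}{\zeta_t}\right].
\end{aligned}
$$
If we substitute these expressions into the previous equation, use $e^x \approx 1 + x$ on the left-hand side,
divide by $\Delta t$, and let $\Delta t \to 0$ we arrive at
$$
\delta_{it} + \mu_{it} - r^f_t
= -\lim_{\Delta t \to 0} e^{\int_t^{t+\Delta t} \delta_{iu}\,du}\,\frac{1}{\Delta t}\,\mathrm{Cov}_t\left[\frac{P_{i,t+\Delta t} - P_{it}}{P_{it}}, \frac{\zeta_{t+\Delta t}}{\zeta_t}\right]
= -\frac{1}{dt}\,\mathrm{Cov}_t\left[\frac{dP_{it}}{P_{it}}, \frac{d\zeta_t}{\zeta_t}\right]. \tag{4.37}
$$
The left-hand side is the expected excess rate of return over the next instant. The right-hand side
is minus the current rate of covariance between the return on the asset and the relative change of the
state-price deflator.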
Now let us give a more rigorous treatment. Write the dynamics of a state-price deflator as
$$
d\zeta_t = -\zeta_t\left[m_t\,dt + \boldsymbol{v}_t^\top d\boldsymbol{z}_t\right] \tag{4.38}
$$
for some relative drift $m$ and some sensitivity vector $\boldsymbol{v}$. First focus on the risk-free asset.
By the pricing condition in the definition of a state-price deflator, the process $G^\zeta_f$ defined by
$G^\zeta_{ft} = \zeta_t \exp\left\{\int_0^t r^f_u\,du\right\}$ has to be a martingale, i.e. have a zero drift. By Itô's Lemma,
$$
dG^\zeta_{ft} = G^\zeta_{ft}\left[(-m_t + r^f_t)\,dt - \boldsymbol{v}_t^\top d\boldsymbol{z}_t\right],
$$
so we conclude that $m_t = r^f_t$, i.e. the relative drift of a state-price deflator is equal to the negative
of the continuously compounded short-term risk-free interest rate.
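Next, for any risky asset $i$ the process $G^\zeta_i$ defined by (4.35) must be a martingale. From Itô's
Lemma and the dynamics of $P_i$ and $\zeta$ given in (3.3) and (4.38), we get
$$
\begin{aligned}
dG^\zeta_{it} &= \delta_{it} P_{it}\zeta_t\,dt + \zeta_t\,dP_{it} + P_{it}\,d\zeta_t + (d\zeta_t)(dP_{it}) \\
&= P_{it}\zeta_t\left[\left(\mu_{it} + \delta_{it} - m_t - \boldsymbol{\sigma}_{it}^\top \boldsymbol{v}_t\right)dt + \left(-\boldsymbol{v}_t + \boldsymbol{\sigma}_{it}\right)^\top d\boldsymbol{z}_t\right].
\end{aligned}
$$
With a risk-free asset, we know that $m_t = r^f_t$, so setting the drift equal to zero, we conclude that
the equation
$$
\mu_{it} + \delta_{it} - r^f_t = \boldsymbol{\sigma}_{it}^\top \boldsymbol{v}_t \tag{4.39}
$$
must hold for any asset $i$. This is equivalent to (4.37). In compact form, the condition on $\boldsymbol{v}$ can
be written
$$
\boldsymbol{\mu}_t + \boldsymbol{\delta}_t - r^f_t \boldsymbol{1} = \underline{\underline{\sigma}}_t \boldsymbol{v}_t. \tag{4.40}
$$
A (nice) process $\boldsymbol{\lambda} = (\boldsymbol{\lambda}_t)$ satisfying this equation (with $\boldsymbol{\lambda}_t$ in place of $\boldsymbol{v}_t$) is called a market price of risk. If the price of
the $i$th asset is only sensitive to the $j$th exogenous shock, Equation (4.39) reduces to
$$
\mu_{it} + \delta_{it} - r^f_t = \sigma_{ijt}\lambda_{jt},
$$
implying that
$$
\lambda_{jt} = \frac{\mu_{it} + \delta_{it} - r^f_t}{\sigma_{ijt}}.
$$
Therefore, $\lambda_{jt}$ is the compensation in terms of excess expected return per unit of risk stemming
from the $j$th exogenous shock. This explains the name market price of risk. Summing up, the
dynamics of a state-price deflator is of the form
$$
d\zeta_t = -\zeta_t\left[r^f_t\,dt + \boldsymbol{\lambda}_t^\top d\boldsymbol{z}_t\right], \tag{4.41}
$$
where $\boldsymbol{\lambda}$ is a market price of risk. This implies that
$$
\zeta_s = \zeta_t \exp\left\{-\int_t^s r^f_u\,du - \frac{1}{2}\int_t^s \|\boldsymbol{\lambda}_u\|^2\,du - \int_t^s \boldsymbol{\lambda}_u^\top d\boldsymbol{z}_u\right\} \tag{4.42}
$$
for any $s > t$.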
If we know or specify the dynamics of the state-price deflator and the dividend stream from an
asset, we can in principle price the asset using (4.33). However, only in special cases is it possible
to compute the price in closed form. See examples in Section 4.4. In other cases one can use Monte
Carlo simulation to approximate the relevant expectation.
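
A Monte Carlo valuation along these lines can be sketched for the simplest possible case. The example below assumes a constant risk-free rate, a single shock with a constant market price of risk, and an asset paying a certain dividend of 1 at time $T$ with no intermediate dividends; all parameter values and function names are hypothetical. In this special case the exact price $e^{-rT}$ is known, so it serves as a check on the simulation.

```python
import math
import random


def simulate_deflator(r, lam, horizon, rng):
    """Draw zeta_T / zeta_0 from (4.42) with constant r^f = r and constant scalar lambda = lam."""
    z = rng.gauss(0.0, math.sqrt(horizon))  # z_T - z_0 is normal with mean 0 and variance T
    return math.exp(-r * horizon - 0.5 * lam ** 2 * horizon - lam * z)


def monte_carlo_price(terminal_dividend, r, lam, horizon, n_paths=100_000, seed=0):
    """Approximate E[D_T * zeta_T / zeta_0] by an average over simulated paths.

    `terminal_dividend` maps the simulated deflator to the dividend; here the
    dividend is allowed to depend on the state only through the deflator.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        zeta_T = simulate_deflator(r, lam, horizon, rng)
        total += terminal_dividend(zeta_T) * zeta_T
    return total / n_paths


r, lam, horizon = 0.03, 0.3, 1.0  # hypothetical parameters
estimate = monte_carlo_price(lambda zeta_T: 1.0, r, lam, horizon)
exact = math.exp(-r * horizon)    # zero-coupon bond price in this special case
```

With more realistic specifications (stochastic interest rates, state-dependent market prices of risk, path-dependent dividends) the deflator and the dividend would have to be simulated jointly over a time grid, but the estimator has the same structure.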
4.3 Properties of state-price deflators
After defining state-price deflators, two questions arise naturally: When does a state-price deflator
exist? When is it unique? We will answer these questions in the two following subsections.
4.3.1 Existence
Here is the answer to the existence question:
Theorem 4.1 A state-price deflator exists if and only if prices admit no arbitrage.
Since we have already concluded that we should only consider no-arbitrage prices, we can safely
assume the existence of a state-price deflator.
Let us take the easy part of the proof first: if a state-price deflator exists, prices do not admit
arbitrage. Let us just think of the one-period framework so that the state-price deflator is a strictly
positive random variable $\zeta$ and the price of a random dividend of $D_i$ is $P_i = \mathrm{E}[\zeta D_i]$. If $D_i$ is
non-negative in all states, it is clear that $\zeta D_i$ will be non-negative in all states and, consequently, the
expectation of $\zeta D_i$ will be non-negative. This rules out arbitrage of type (i), cf. the definition
of arbitrage in Section 3.4. If, furthermore, the set of states $A = \{\omega \in \Omega : D_i(\omega) > 0\}$ has
strictly positive probability, it is clear that $\zeta D_i$ will be strictly positive on a set of strictly positive
probability and otherwise non-negative, so the expectation of $\zeta D_i$ must be strictly positive. This
rules out type (ii) arbitrage. The same argument applies to the discrete-time framework. In the
continuous-time setting the argument should be slightly adjusted in order to incorporate the lower
bound on the value process that will rule out doubling strategies. It is not terribly difficult, but
involves local martingales and supermartingales which we will not discuss here. The interested
reader is referred to Duffie (2001, p. 105).
How can we show the other and more important implication that no arbitrage guarantees the
existence of a state-price vector or state-price deflator? In Chapter 6 we will do that by constructing
a state-price deflator from the solution to the utility maximization problem of any individual. The
solution will only exist in absence of arbitrage. It is possible to prove the existence of a state-price
deflator without formally introducing individuals and solving their utility maximization problems.
The alternative proof is based on the so-called Separating Hyperplane Theorem and involves no
economics.
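In the one-period framework with a finite state space $\Omega = \{1, 2, \dots, S\}$, the alternative argument
goes as follows. A given portfolio $\boldsymbol{\theta}$ has an initial dividend of $-P^{\boldsymbol{\theta}}$ and a terminal dividend given by
the $S$-dimensional vector $\boldsymbol{D}^{\boldsymbol{\theta}}$. Let $M$ denote the set of all pairs $(-P^{\boldsymbol{\theta}}, \boldsymbol{D}^{\boldsymbol{\theta}})$
that is generated by all portfolios, i.e. all $\boldsymbol{\theta} \in \mathbb{R}^I$. Observe that $M$ is a closed and convex subset of
$L \equiv \mathbb{R} \times \mathbb{R}^S$. Let $K$ be the positive orthant of $L$, i.e. $K \equiv \mathbb{R}_+ \times \mathbb{R}^S_+$, where $\mathbb{R}_+ = [0, \infty)$. $K$ is also
a closed and convex subset of $L$. Note that there is no arbitrage if and only if the only common
element of $K$ and $M$ is the zero element $(0, \boldsymbol{0})$.
Assume no arbitrage, i.e. $K \cap M = \{(0, \boldsymbol{0})\}$. By the Separating Hyperplane Theorem (see,
e.g., Rockafellar 1970) there exists a nonzero linear functional $F : L \to \mathbb{R}$ with the property
that $F(z) = 0$ for all $z \in M$ and $F(x) > 0$ for all nonzero $x \in K$. Hence, we can find a
strictly positive $\varphi_0$ in $\mathbb{R}$ and an $S$-dimensional vector $\boldsymbol{\varphi}$ with strictly positive elements such that
$F(d_0, \boldsymbol{d}) = \varphi_0 d_0 + \boldsymbol{\varphi} \cdot \boldsymbol{d}$ for all $(d_0, \boldsymbol{d})$ in $L$. Since $(-P^{\boldsymbol{\theta}}, \boldsymbol{D}^{\boldsymbol{\theta}}) \in M$ for any portfolio $\boldsymbol{\theta}$, we have
$$
0 = F(-P^{\boldsymbol{\theta}}, \boldsymbol{D}^{\boldsymbol{\theta}}) = -\varphi_0 P^{\boldsymbol{\theta}} + \boldsymbol{\varphi} \cdot \boldsymbol{D}^{\boldsymbol{\theta}},
$$
and hence
$$
P^{\boldsymbol{\theta}} = \frac{1}{\varphi_0}\,\boldsymbol{\varphi} \cdot \boldsymbol{D}^{\boldsymbol{\theta}}.
$$
The vector $\boldsymbol{\psi} = \boldsymbol{\varphi}/\varphi_0$ is therefore a state-price vector.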
Just assuming that prices obey the law of one price would give us a vector $\psi$ satisfying $\psi \cdot D_i = P_i$ for all $i$. Imposing the stronger no-arbitrage condition ensures that we can find a strictly positive vector $\psi$ with that property, i.e. a state-price vector. Dividing by state probabilities, $\zeta_\omega = \psi_\omega / p_\omega$, we obtain a state-price deflator.
Example 4.3 Consider again the market in Example 3.1 and ignore asset 4, which is redundant. Suppose the market prices of the three remaining assets are 1.1, 2.2, and 0.6, respectively. Can you find a state-price vector $\psi$? The only candidate is the solution to the equation system $\underline{D}\psi = P$, i.e.
$$\psi = \underline{D}^{-1} P = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 4 & 0 & 1 \end{pmatrix}^{-1} \begin{pmatrix} 1.1 \\ 2.2 \\ 0.6 \end{pmatrix} = \begin{pmatrix} -0.1 \\ 0.2 \\ 1 \end{pmatrix},$$
which is not strictly positive. Hence there is no state-price vector for this market. Then there must be an arbitrage, but where? We can see that the three assets are priced such that the implicit value of an Arrow-Debreu asset for state 1 is negative. The portfolio of the three assets that replicates this Arrow-Debreu asset is given by
$$\theta = \left(\underline{D}^\top\right)^{-1} e_1 = \begin{pmatrix} 0.2 & 1.6 & -0.8 \\ -0.2 & -0.6 & 0.8 \\ 0.2 & -0.4 & 0.2 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0.2 \\ -0.2 \\ 0.2 \end{pmatrix},$$
which indeed has a price of $0.2 \cdot 1.1 - 0.2 \cdot 2.2 + 0.2 \cdot 0.6 = -0.1$. You get 0.1 today for an asset that pays you one if state 1 is realized and nothing in other states. This is clearly an arbitrage. □
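The numbers in Example 4.3 can be checked with a few lines of code. The following is only a minimal sketch using exact rational arithmetic from the Python standard library; the helper `solve` is a hypothetical name introduced here for illustration and is not part of the text.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(A)
    # Augmented matrix [A | b] with exact rational entries.
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Normalise the pivot row, then eliminate the column from all other rows.
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

# Dividend matrix: one row per asset, one column per state.
D = [[1, 1, 1],
     [0, 1, 2],
     [4, 0, 1]]
P = ["1.1", "2.2", "0.6"]                 # prices as strings so they convert exactly

psi = solve(D, P)                          # candidate state-price vector, D psi = P
D_T = [list(col) for col in zip(*D)]       # transpose of D
theta = solve(D_T, [1, 0, 0])              # portfolio replicating the state-1 Arrow-Debreu asset
price = sum(t * F(p) for t, p in zip(theta, P))

print([float(v) for v in psi])             # first element is negative
print([float(v) for v in theta], float(price))
```

The negative first element of `psi` and the negative price of the replicating portfolio are two views of the same arbitrage.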
4.3.2 Uniqueness
Here is a general result on the uniqueness of state-price deflators:
Theorem 4.2 Assume prices admit no arbitrage. Then there is a unique state-price deflator if and only if the market is complete. If the market is incomplete, several state-price deflators exist.
In a one-period framework the theorem can be explained as follows. In the absence of arbitrage, some state-price deflator $\zeta$ exists. If $\varepsilon$ is a random variable with $E[\varepsilon D_i] = 0$ for all $i$, then
$$E[(\zeta + \varepsilon) D_i] = E[\zeta D_i] + E[\varepsilon D_i] = E[\zeta D_i] = P_i.$$
If $\zeta + \varepsilon$ is strictly positive with finite variance, then $\zeta + \varepsilon$ will be a valid state-price deflator. When can we find such an $\varepsilon$? If the market is complete, the dividends of the basic assets span all random variables so it will be impossible to find a random variable $\varepsilon$ not identically equal to zero with $E[\varepsilon D_i] = 0$. Therefore $\zeta$ must be the only state-price deflator.
Let us be a bit more precise. Consider, for simplicity, a one-period framework with a finite state space and look for a state-price vector $\psi$, i.e. a strictly positive solution to the equation system $\underline{D}\psi = P$. If the market is complete and there are no redundant assets, the dividend matrix $\underline{D}$ is an $S \times S$ nonsingular matrix so the only solution to the equation system is
$$\psi^* = \underline{D}^{-1} P.$$
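If there are redundant assets, they will be uniquely priced by no-arbitrage so it will be sufficient to look for solutions to $\hat{\underline{D}}\psi = \hat{P}$, where $\hat{\underline{D}}$ is the dividend matrix and $\hat{P}$ the price vector of the non-redundant assets. In the case of a complete market the matrix $\hat{\underline{D}}$ is nonsingular. The unique solution to the equation system is then
$$\psi^* = \hat{\underline{D}}^{-1} \hat{P}. \tag{4.43}$$
Whether the market is complete or not, the $S$-dimensional vector
$$\psi^* = \hat{\underline{D}}^\top \left(\hat{\underline{D}}\hat{\underline{D}}^\top\right)^{-1} \hat{P} \tag{4.44}$$
is a solution since
$$\hat{\underline{D}}\psi^* = \hat{\underline{D}}\hat{\underline{D}}^\top \left(\hat{\underline{D}}\hat{\underline{D}}^\top\right)^{-1} \hat{P} = \hat{P}.$$
If $\psi^*$ is strictly positive, we can therefore conclude that it will be a state-price vector. Note that $\psi^* = \hat{\underline{D}}^\top \hat{\theta}^*$ with $\hat{\theta}^* = \left(\hat{\underline{D}}\hat{\underline{D}}^\top\right)^{-1} \hat{P}$, i.e. $\psi^*$ is the dividend generated by a portfolio of the non-redundant assets. If the market is complete and arbitrage-free, $\psi^*$ will be strictly positive. Why? Since the market is complete, we can construct an Arrow-Debreu asset for any state. The dividend vector of the Arrow-Debreu asset for state $\omega$ is $e_\omega = (0, \dots, 0, 1, 0, \dots, 0)^\top$, where the 1 is the $\omega$'th element of the vector. The price of this portfolio is $\psi^* \cdot e_\omega = \psi^*_\omega$. To avoid arbitrage, $\psi^*_\omega$ must be strictly positive. Hence $\psi^*$ is a state-price vector if the market is complete.
In fact, if the market is complete, the vectors in (4.43) and (4.44) coincide. Both satisfy the pricing equations, so both assign the same price to the Arrow-Debreu asset for state $\omega$, and that price is the $\omega$'th element of the vector. This argument works for any $\omega$. Therefore the two vectors are identical.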
Now let us turn to state-price deflators. Due to the one-to-one correspondence between state-price vectors and state-price deflators we expect to find similar results. Define the $S$-dimensional vector $\zeta^*$ by
$$\zeta^* = \hat{\underline{D}}^\top \left(E\left[\hat{D}\hat{D}^\top\right]\right)^{-1} \hat{P}. \tag{4.45}$$
To see the meaning of this, let us for simplicity assume that none of the basic assets are redundant so that $\zeta^* = \underline{D}^\top (E[DD^\top])^{-1} P$. Recall that $D$ is the $I$-dimensional random variable for which the $i$'th component is given by the random dividend of asset $i$. Hence, $DD^\top$ is an $I \times I$ matrix of random variables with the $(i, j)$'th entry given by $D_i D_j$, i.e. the product of the random dividend of asset $i$ and the random dividend of asset $j$. The expectation of a matrix of random variables is equal to the matrix of expectations of the individual random variables. So $E[DD^\top]$ is also an $I \times I$ matrix. For the general case we see from the definition that $\zeta^*$ is the dividend vector generated by the portfolio
$$\hat{\theta}^* = \left(E\left[\hat{D}\hat{D}^\top\right]\right)^{-1} \hat{P}$$
of the non-redundant assets. We can think of $\zeta^*$ as a random variable given by
$$\zeta^* = \hat{D}^\top \left(E\left[\hat{D}\hat{D}^\top\right]\right)^{-1} \hat{P}. \tag{4.46}$$
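We can see that
$$E\left[\hat{D}\zeta^*\right] = E\left[\hat{D}\hat{D}^\top \left(E\left[\hat{D}\hat{D}^\top\right]\right)^{-1} \hat{P}\right] = E\left[\hat{D}\hat{D}^\top\right] \left(E\left[\hat{D}\hat{D}^\top\right]\right)^{-1} \hat{P} = \hat{P}.$$
It follows that $\zeta^*$ is a state-price deflator if it is strictly positive. With a finite state space, $E\left[\hat{D}\hat{D}^\top\right] = \hat{\underline{D}}\,\mathrm{diag}(p)\,\hat{\underline{D}}^\top$ and the state-price vector associated with a given state-price deflator $\zeta$ is $\mathrm{diag}(p)\zeta$, cf. (4.19). With a complete market, $\hat{\underline{D}}$ is a nonsingular $S \times S$ matrix so
$$\zeta^* = \hat{\underline{D}}^\top \left(E\left[\hat{D}\hat{D}^\top\right]\right)^{-1} \hat{P} = \hat{\underline{D}}^\top \left(\hat{\underline{D}}\,\mathrm{diag}(p)\,\hat{\underline{D}}^\top\right)^{-1} \hat{P} = \hat{\underline{D}}^\top \left(\hat{\underline{D}}^\top\right)^{-1} [\mathrm{diag}(p)]^{-1} \hat{\underline{D}}^{-1} \hat{P} = [\mathrm{diag}(p)]^{-1} \hat{\underline{D}}^{-1} \hat{P},$$
and the state-price vector associated with $\zeta^*$ is
$$\mathrm{diag}(p)\zeta^* = \hat{\underline{D}}^{-1} \hat{P} = \psi^*.$$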
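If the market is complete and arbitrage-free we have identified the unique state-price vector and the unique state-price deflator. Note that $\psi^*$ and $\zeta^*$ are defined in terms of the prices of the basic assets. Observing the prices and state-contingent dividends of the basic assets, we can extract the state-price deflator. If you want to compute prices of the basic assets from their state-contingent dividends, $\psi^*$ and $\zeta^*$ are of no help since they are themselves defined from those prices.
Example 4.4 Consider again the three-asset market of Example 4.3, but now with prices 0.8, 0.8, and 1.5. We can compute $\psi^*$ as
$$\psi^* = \underline{D}^{-1} P = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 4 & 0 & 1 \end{pmatrix}^{-1} \begin{pmatrix} 0.8 \\ 0.8 \\ 1.5 \end{pmatrix} = \begin{pmatrix} 0.3 \\ 0.2 \\ 0.3 \end{pmatrix},$$
which is consistent with the results of Example 4.2. The portfolio generating this dividend vector is $\theta = \left(\underline{D}^\top\right)^{-1}\psi^* = (0.14, 0.06, 0.04)^\top$.
Since the market is complete, the unique state-price deflator is $\zeta^*$ with $\zeta^*_\omega = \psi^*_\omega / p_\omega$. From Example 4.2, we have $\zeta^*_1 = 0.6$, $\zeta^*_2 = 0.8$, $\zeta^*_3 = 1.2$. The portfolio generating this dividend is $\theta^* = \left(\underline{D}^\top\right)^{-1}\zeta^* = (0.44, 0.36, 0.04)^\top$.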
Suppose now that only assets 1 and 2 were traded with the same prices as above, $P_1 = P_2 = 0.8$. Then
$$\underline{D} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \end{pmatrix}, \qquad \underline{D}\,\underline{D}^\top = \begin{pmatrix} 3 & 3 \\ 3 & 5 \end{pmatrix}, \qquad \left(\underline{D}\,\underline{D}^\top\right)^{-1} = \begin{pmatrix} \frac{5}{6} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{pmatrix}$$
and we get
$$\psi^* = \underline{D}^\top \left(\underline{D}\,\underline{D}^\top\right)^{-1} P = \begin{pmatrix} \frac{4}{15} \\ \frac{4}{15} \\ \frac{4}{15} \end{pmatrix},$$
which is strictly positive and therefore a state-price vector. It is the dividend of the portfolio that only consists of $4/15 \approx 0.2667$ units of asset 1. But we can find many other state-price vectors. We have to look for strictly positive solutions $(\psi_1, \psi_2, \psi_3)^\top$ of the two equations
$$\psi_1 + \psi_2 + \psi_3 = 0.8, \qquad \psi_2 + 2\psi_3 = 0.8.$$
Subtracting one equation from the other we see that we need to have $\psi_1 = \psi_3$. Any vector of the form $\psi = (a, 0.8 - 2a, a)^\top$ with $0 < a < 0.4$ will be a valid state-price vector. (This includes, of course, the state-price vector in the three-asset market.) All these vectors will generate the same price on any marketed dividend but different prices on non-marketed dividends. For example, the value of the dividend of asset 3, which we now assume is not traded, will be $\psi \cdot (4, 0, 1)^\top = 5a$, which can be anywhere in the interval $(0, 2)$.

                            state 1   state 2   state 3   expectation
probabilities                 0.5       0.25      0.25
state-contingent values
  $D_1^2$                      1         1         1         1
  $D_1 D_2$                    0         1         2         0.75
  $D_2^2$                      0         1         4         1.25
Table 4.1: Computation of expectations for $\zeta^*$ in Example 4.4.
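Let us compute $\zeta^*$. The computations of the elements of $E[DD^\top]$ are given in Table 4.1. We get
$$E[DD^\top] = \begin{pmatrix} 1 & 0.75 \\ 0.75 & 1.25 \end{pmatrix}, \qquad \left(E[DD^\top]\right)^{-1} = \begin{pmatrix} 1.8182 & -1.0909 \\ -1.0909 & 1.4545 \end{pmatrix},$$
$$\theta^* = \left(E[DD^\top]\right)^{-1} P = \begin{pmatrix} 0.5818 \\ 0.2909 \end{pmatrix}, \qquad \zeta^* = \underline{D}^\top \theta^* = \begin{pmatrix} 0.5818 \\ 0.8727 \\ 1.1636 \end{pmatrix}.$$
Since $\zeta^* > 0$, it is a valid state-price deflator. (Note that this is not the state-price deflator associated with the state-price vector $\psi^*$ computed above.) We have infinitely many state-price deflators for this market. Given any state-price vector $\psi = (a, 0.8 - 2a, a)^\top$ with $0 < a < 0.4$, the associated state-price deflator is given by $\zeta_\omega = \psi_\omega / p_\omega$, i.e.
$$\zeta = \begin{pmatrix} a/0.5 \\ (0.8 - 2a)/0.25 \\ a/0.25 \end{pmatrix} = \begin{pmatrix} 2a \\ 3.2 - 8a \\ 4a \end{pmatrix}.$$
Letting $b = 2a$, any state-price deflator is of the form $\zeta = (b, 3.2 - 4b, 2b)^\top$ for $0 < b < 0.8$. □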
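The two-asset part of Example 4.4 can be reproduced numerically. The sketch below is illustrative only: it uses exact fractions from the Python standard library, and the helper names `solve2`, `payoff`, `prices_from_psi` and `prices_from_zeta` are hypothetical names chosen here, not notation from the text.

```python
from fractions import Fraction as F

p = [F(1, 2), F(1, 4), F(1, 4)]            # state probabilities
D = [[F(1), F(1), F(1)],                   # asset 1 dividends in states 1..3
     [F(0), F(1), F(2)]]                   # asset 2 dividends in states 1..3
P = [F(4, 5), F(4, 5)]                     # prices 0.8 and 0.8

def solve2(A, b):
    """Solve a 2x2 system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def payoff(theta):
    """State-by-state dividend D^T theta of a portfolio theta."""
    return [sum(theta[i] * D[i][w] for i in range(2)) for w in range(3)]

# psi* = D^T (D D^T)^{-1} P
DDt = [[sum(D[i][w] * D[j][w] for w in range(3)) for j in range(2)] for i in range(2)]
psi_star = payoff(solve2(DDt, P))

# zeta* = D^T (E[D D^T])^{-1} P, with the expectation taken under p
EDDt = [[sum(p[w] * D[i][w] * D[j][w] for w in range(3)) for j in range(2)] for i in range(2)]
theta_star = solve2(EDDt, P)
zeta_star = payoff(theta_star)

def prices_from_psi(psi):
    """Prices psi . D_i of the traded assets."""
    return [sum(psi[w] * D[i][w] for w in range(3)) for i in range(2)]

def prices_from_zeta(zeta):
    """Prices E[zeta D_i] of the traded assets."""
    return [sum(p[w] * zeta[w] * D[i][w] for w in range(3)) for i in range(2)]

print([float(v) for v in psi_star])
print([float(v) for v in theta_star], [float(v) for v in zeta_star])
```

Both candidates reprice the two traded assets, but `zeta_star` is not the deflator obtained by dividing `psi_star` by the state probabilities, which is the point made in the parenthetical remark of the example.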
In the multi-period discrete-time framework all the above observations and conclusions hold in each period.
Now consider the continuous-time framework and assume that an instantaneously risk-free asset is traded. We have seen above that state-price deflators are closely related to market prices of risk. Whenever we have a market price of risk $\lambda = (\lambda_t)$, i.e. a nice process satisfying
$$\mu_t + \delta_t - r^f_t \mathbf{1} = \underline{\sigma}_t \lambda_t, \tag{4.40}$$
then a state-price deflator can be defined by $\zeta_0 = 1$ and
$$d\zeta_t = -\zeta_t \left[ r^f_t\, dt + \lambda_t^\top dz_t \right], \tag{4.47}$$
or, equivalently,
$$\zeta_t = \exp\left\{ -\int_0^t r^f_s\, ds - \frac{1}{2}\int_0^t \|\lambda_s\|^2\, ds - \int_0^t \lambda_s^\top dz_s \right\}. \tag{4.48}$$
The number of state-price deflators is therefore determined by the number of solutions to (4.40), which again depends on the rank of the matrix $\underline{\sigma}_t$.
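Suppose the rank of $\underline{\sigma}_t$ equals $k$ for all $t$. If $k < d$, there are several solutions to (4.40). We can write one solution as
$$\lambda^*_t = \hat{\underline{\sigma}}_t^\top \left(\hat{\underline{\sigma}}_t \hat{\underline{\sigma}}_t^\top\right)^{-1} \left(\hat{\mu}_t + \hat{\delta}_t - r^f_t \mathbf{1}\right), \tag{4.49}$$
where $\hat{\underline{\sigma}}_t$ is the $k \times d$ matrix obtained from $\underline{\sigma}_t$ by removing rows corresponding to redundant assets, i.e. rows that can be written as a linear combination of other rows in the matrix. Similarly, $\hat{\mu}_t$ and $\hat{\delta}_t$ are the $k$-dimensional vectors that are left after deleting from $\mu_t$ and $\delta_t$, respectively, the elements corresponding to the redundant assets. In the special case where $k = d$, we have
$$\lambda^*_t = \hat{\underline{\sigma}}_t^{-1} \left(\hat{\mu}_t + \hat{\delta}_t - r^f_t \mathbf{1}\right).$$
Let $\zeta^*$ be the state-price deflator associated with $\lambda^*$, i.e.
$$\zeta^*_t = \exp\left\{ -\int_0^t r^f_s\, ds - \frac{1}{2}\int_0^t \|\lambda^*_s\|^2\, ds - \int_0^t (\lambda^*_s)^\top dz_s \right\}. \tag{4.50}$$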
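In the one-period framework the (candidate) state-price deflator $\zeta^*$ was the dividend of a portfolio of the basic assets. In the continuous-time framework $\zeta^*$ is similarly related to a trading strategy. Consider the self-financing trading strategy given by the fractions of wealth
$$\pi^*_t = \left(\hat{\underline{\sigma}}_t \hat{\underline{\sigma}}_t^\top\right)^{-1} \left(\hat{\mu}_t + \hat{\delta}_t - r^f_t \mathbf{1}\right)$$
in the non-redundant assets and the fraction $1 - (\pi^*_t)^\top \mathbf{1}$ in the instantaneously risk-free asset. The dynamics of the value $V^*_t$ of this trading strategy is given by
$$dV^*_t = V^*_t \left[ \left( r^f_t + (\pi^*_t)^\top \left(\hat{\mu}_t + \hat{\delta}_t - r^f_t \mathbf{1}\right) \right) dt + (\pi^*_t)^\top \hat{\underline{\sigma}}_t\, dz_t \right] = V^*_t \left[ \left( r^f_t + \|\lambda^*_t\|^2 \right) dt + (\lambda^*_t)^\top dz_t \right], \tag{4.51}$$
cf. (3.14). It can be shown that $\pi^*_t$ is the trading strategy with the highest expected continuously compounded growth rate, i.e. the trading strategy maximizing $E[\ln(V_T/V_0)]$, and it is therefore referred to as the growth-optimal trading strategy. Consequently, $\lambda^*_t$ defined in (4.49) is the relative sensitivity vector of the value of the growth-optimal trading strategy. One can show that $\zeta^*_t = V^*_0/V^*_t$ (see Exercise 4.11), so we have a state-price deflator defined in terms of the value of a trading strategy.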
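If $\varepsilon = (\varepsilon_t)$ is a nice $d$-dimensional stochastic process with $\hat{\underline{\sigma}}_t \varepsilon_t = 0$ for all $t$, then $\lambda_t = \lambda^*_t + \varepsilon_t$ defines a market price of risk since
$$\hat{\underline{\sigma}}_t \lambda_t = \hat{\underline{\sigma}}_t (\lambda^*_t + \varepsilon_t) = \hat{\underline{\sigma}}_t \lambda^*_t + \hat{\underline{\sigma}}_t \varepsilon_t = \hat{\underline{\sigma}}_t \lambda^*_t = \hat{\mu}_t + \hat{\delta}_t - r^f_t \mathbf{1}.$$
If the market is incomplete, it will be possible to find such an $\varepsilon_t$ and, hence, there will be more than one state-price deflator.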
An example of an incomplete market is a market where the traded assets are only immediately affected by $j < d$ of the $d$ exogenous shocks. Decomposing the $d$-dimensional standard Brownian motion $z$ into $(Z, \hat{Z})$, where $Z$ is $j$-dimensional and $\hat{Z}$ is $(d - j)$-dimensional, the dynamics of the traded risky assets can be written as
$$dP_t = \mathrm{diag}(P_t) \left[ \mu_t\, dt + \underline{\sigma}_t\, dZ_t \right].$$
For example, the dynamics of $r^f_t$, $\mu_t$, or $\underline{\sigma}_t$ may be affected by the non-traded risks $\hat{Z}$, representing non-hedgeable risk in interest rates, expected returns, and volatilities and correlations, respectively. Or other variables important for the investor, e.g. his labor income, may be sensitive to $\hat{Z}$. Let us assume for simplicity that there are $j$ risky assets and the $j \times j$ matrix $\underline{\sigma}_t$ is nonsingular (i.e. there are no redundant assets). Then we can define a unique market price of risk associated with the traded risks by the $j$-dimensional vector
$$\Lambda_t = \left(\underline{\sigma}_t\right)^{-1} \left(\mu_t + \delta_t - r^f_t \mathbf{1}\right),$$
but for any well-behaved $(d - j)$-dimensional process $\hat{\Lambda}$, the process $\lambda = (\Lambda, \hat{\Lambda})$ will be a market price of risk for all risks. Each choice of $\hat{\Lambda}$ generates a valid market price of risk process and hence a valid state-price deflator.
4.3.3 Convex combinations of state-price deflators
A convex combination of some objects (such as vectors, random variables, stochastic processes, etc.) $x_1, x_2, \dots, x_L$ is given by
$$x = \sum_{l=1}^{L} \alpha_l x_l,$$
where $\alpha_1, \dots, \alpha_L$ are positive constants summing up to one. The following theorem says that if you have a number of state-price deflators, any convex combination of those deflators will also be a state-price deflator. This is true in the one-period, the discrete-time, and the continuous-time framework. In particular, it tells you that once you have two different state-price deflators you can generate infinitely many state-price deflators. The proof of the theorem is left as Exercise 4.3.
Theorem 4.3 If $\zeta_1, \dots, \zeta_L$ are state-price deflators, and $\alpha_1, \dots, \alpha_L > 0$ with $\sum_{l=1}^{L} \alpha_l = 1$, then the convex combination
$$\zeta = \sum_{l=1}^{L} \alpha_l \zeta_l$$
is also a state-price deflator.
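Theorem 4.3 can be illustrated with the one-parameter family of state-price deflators found in the two-asset market of Example 4.4. The following sketch is not a proof; the function names and the particular weights and parameter values are assumptions made here for illustration.

```python
from fractions import Fraction as F

p = [F(1, 2), F(1, 4), F(1, 4)]            # state probabilities of Example 4.4
D = [[1, 1, 1], [0, 1, 2]]                 # dividends of the two traded assets
P = [F(4, 5), F(4, 5)]                     # their prices

def deflator(b):
    """Member of the family zeta = (b, 3.2 - 4b, 2b) from Example 4.4."""
    b = F(b)
    return [b, F(16, 5) - 4 * b, 2 * b]

def is_state_price_deflator(zeta):
    """Strictly positive and E[zeta D_i] = P_i for both traded assets."""
    positive = all(z > 0 for z in zeta)
    priced = all(sum(p[w] * zeta[w] * D[i][w] for w in range(3)) == P[i]
                 for i in range(2))
    return positive and priced

def convex_combination(deflators, weights):
    """Weighted sum with strictly positive weights that sum to one."""
    assert all(a > 0 for a in weights) and sum(weights) == 1
    return [sum(a * z[w] for a, z in zip(weights, deflators)) for w in range(3)]

zetas = [deflator("0.2"), deflator("0.5"), deflator("0.7")]
alphas = [F(1, 2), F(1, 3), F(1, 6)]       # hypothetical weights
mix = convex_combination(zetas, alphas)

print(all(is_state_price_deflator(z) for z in zetas), is_state_price_deflator(mix))
```

In this family the combination is again a member, with parameter equal to the weighted average of the individual parameters.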
4.3.4 The candidate deflator $\zeta^*$
Let us first compute the return associated with the dividend $\zeta^*$. This return turns out to be important in later sections. For notational simplicity suppose that no assets are redundant so that
$$\zeta^* = D^\top \left(E[DD^\top]\right)^{-1} P, \qquad \theta^* = \left(E[DD^\top]\right)^{-1} P.$$
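Let us first rewrite $\zeta^*$ and $\theta^*$ in terms of the gross returns of the assets instead of prices and dividends. Since $D = \mathrm{diag}(P) R$, we get
$$\left(E[DD^\top]\right)^{-1} = \left(E[\mathrm{diag}(P) RR^\top \mathrm{diag}(P)]\right)^{-1} = \left(\mathrm{diag}(P)\, E[RR^\top]\, \mathrm{diag}(P)\right)^{-1} = [\mathrm{diag}(P)]^{-1} \left(E[RR^\top]\right)^{-1} [\mathrm{diag}(P)]^{-1}, \tag{4.52}$$
using the facts that prices are non-random and that $(AB)^{-1} = B^{-1}A^{-1}$ for nonsingular matrices $A$ and $B$. Consequently,
$$\theta^* = \left(E[DD^\top]\right)^{-1} P = [\mathrm{diag}(P)]^{-1} \left(E[RR^\top]\right)^{-1} \mathbf{1},$$
applying that $[\mathrm{diag}(P)]^{-1} P = \mathbf{1}$, and
$$\zeta^* = D^\top \left(E[DD^\top]\right)^{-1} P = D^\top [\mathrm{diag}(P)]^{-1} \left(E[RR^\top]\right)^{-1} \mathbf{1} = \left([\mathrm{diag}(P)]^{-1} D\right)^\top \left(E[RR^\top]\right)^{-1} \mathbf{1} = R^\top \left(E[RR^\top]\right)^{-1} \mathbf{1} = \mathbf{1}^\top \left(E[RR^\top]\right)^{-1} R,$$
using (3.2) and various rules from matrix algebra. The portfolio weight vector is obtained by normalizing the vector $\mathrm{diag}(P)\theta^*$ of amounts invested so that the weights sum to one. Since
$$\mathrm{diag}(P)\theta^* = \mathrm{diag}(P)[\mathrm{diag}(P)]^{-1} \left(E[RR^\top]\right)^{-1} \mathbf{1} = \left(E[RR^\top]\right)^{-1} \mathbf{1},$$
we get
$$\pi^* = \frac{\left(E[RR^\top]\right)^{-1} \mathbf{1}}{\mathbf{1}^\top \left(E[RR^\top]\right)^{-1} \mathbf{1}}. \tag{4.53}$$
The gross return on this portfolio is
$$R^* = (\pi^*)^\top R = \frac{\mathbf{1}^\top \left(E[RR^\top]\right)^{-1} R}{\mathbf{1}^\top \left(E[RR^\top]\right)^{-1} \mathbf{1}}. \tag{4.54}$$
This is the gross return corresponding to the dividend $\zeta^*$.
We can also compute $R^*$ directly as $\zeta^*$ divided by the price of $\zeta^*$ (well-defined since it is a dividend), i.e. $R^* = \zeta^*/P(\zeta^*)$. We can rewrite $\zeta^*$ as
$$\zeta^* = \mathbf{1}^\top \left(E[RR^\top]\right)^{-1} R,$$
and the price of $\zeta^*$ is
$$P(\zeta^*) = E[\zeta^* \zeta^*] = E\left[ P^\top \left(E[DD^\top]\right)^{-1} DD^\top \left(E[DD^\top]\right)^{-1} P \right] = P^\top \left(E[DD^\top]\right)^{-1} P = \mathbf{1}^\top \left(E[RR^\top]\right)^{-1} \mathbf{1}.$$
Dividing $\zeta^*$ by $P(\zeta^*)$ we get (4.54).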
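In Exercise 4.4 you are asked to show the properties collected in the following lemma:
Lemma 4.1 $R^*$ has the following properties:
$$E[R^*] = \frac{\mathbf{1}^\top \left(E[RR^\top]\right)^{-1} E[R]}{\mathbf{1}^\top \left(E[RR^\top]\right)^{-1} \mathbf{1}}, \tag{4.55}$$
$$E[(R^*)^2] = \frac{1}{\mathbf{1}^\top \left(E[RR^\top]\right)^{-1} \mathbf{1}} = \frac{1}{P(\zeta^*)}, \tag{4.56}$$
$$E[R^* R_i] = E[(R^*)^2], \quad i = 1, \dots, N. \tag{4.57}$$
In particular,
$$\frac{E[R^*]}{E[(R^*)^2]} = \mathbf{1}^\top \left(E[RR^\top]\right)^{-1} E[R]. \tag{4.58}$$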
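The properties in Lemma 4.1 can be checked numerically in a small market. The sketch below is an illustration only, not the solution to Exercise 4.4: it reuses the two-asset market of Example 4.4 as an assumed test case and uses exact fractions so that the identities hold without rounding.

```python
from fractions import Fraction as F

p = [F(1, 2), F(1, 4), F(1, 4)]                        # state probabilities
D = [[F(1), F(1), F(1)], [F(0), F(1), F(2)]]           # dividends of assets 1 and 2
P = [F(4, 5), F(4, 5)]                                 # prices
S, I = 3, 2

R = [[D[i][w] / P[i] for w in range(S)] for i in range(I)]   # gross returns by state

def E(x):
    """Expectation of a state-contingent quantity under p."""
    return sum(pw * xw for pw, xw in zip(p, x))

# Second-moment matrix E[R R^T] and the solution x of E[R R^T] x = 1 (Cramer's rule, 2x2).
M = [[E([R[i][w] * R[j][w] for w in range(S)]) for j in range(I)] for i in range(I)]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
x = [(M[1][1] - M[0][1]) / det, (M[0][0] - M[1][0]) / det]

pi_star = [xi / sum(x) for xi in x]                                      # weights (4.53)
R_star = [sum(pi_star[i] * R[i][w] for i in range(I)) for w in range(S)] # return (4.54)

mean = E(R_star)
second_moment = E([r * r for r in R_star])
cross = [E([R_star[w] * R[i][w] for w in range(S)]) for i in range(I)]

print([float(v) for v in pi_star], float(mean), float(second_moment))
```

The list `cross` holds $E[R^* R_i]$ for the two assets; by (4.57) both entries should coincide with `second_moment`.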
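Any random variable $\zeta$ that satisfies $P_i = E[\zeta D_i]$ for all assets $i$ can be decomposed as
$$\zeta = E[\zeta] + \left(P - E[\zeta] E[D]\right)^\top \underline{\Sigma}_D^{-1} \left(D - E[D]\right) + \varepsilon, \tag{4.59}$$
where $\underline{\Sigma}_D = \mathrm{Var}[D]$ and $\varepsilon$ is a random variable with $E[D\varepsilon] = 0$ and $E[\varepsilon] = 0$. In particular $\mathrm{Cov}[D, \varepsilon] = 0$. So taking variances we get
$$\begin{aligned} \mathrm{Var}[\zeta] &= \left(P - E[\zeta] E[D]\right)^\top \underline{\Sigma}_D^{-1}\, \mathrm{Var}\left[D - E[D]\right] \underline{\Sigma}_D^{-1} \left(P - E[\zeta] E[D]\right) + \mathrm{Var}[\varepsilon] \\ &\geq \left(P - E[\zeta] E[D]\right)^\top \underline{\Sigma}_D^{-1} \left(P - E[\zeta] E[D]\right) \\ &= \left(\mathbf{1} - E[\zeta] E[R]\right)^\top \underline{\Sigma}^{-1} \left(\mathbf{1} - E[\zeta] E[R]\right), \end{aligned}$$
where $\underline{\Sigma} = \mathrm{Var}[R]$. The possible combinations of expectation and standard deviation of state-price deflators form a hyperbolic region in $(E[\zeta], \sigma[\zeta])$-space. This result is due to Hansen and Jagannathan (1991) and the right-hand side of the above inequality (or the boundary of the hyperbolic region) is called the Hansen-Jagannathan bound. In Exercise 4.5 you are asked to show that $\zeta^*$ satisfies
$$\zeta^* = E[\zeta^*] + \left(P - E[\zeta^*] E[D]\right)^\top \underline{\Sigma}_D^{-1} \left(D - E[D]\right). \tag{4.60}$$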
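Note that no $\varepsilon$ is added on the right-hand side. It follows that $\zeta^*$ satisfies the Hansen-Jagannathan bound with equality.
4.4 Multi-period valuation models
Consider an asset paying a single dividend $D_T$ at time $T > t$ and assume that $D_T = E_t[D_T]\, e^X$, where $X \sim N(-\frac{1}{2}\sigma_X^2, \sigma_X^2)$, cf. Theorem B.2 in Appendix B. Also assume that $\zeta_T = \zeta_t e^Y$, where $X$ and $Y$ are jointly normally distributed with correlation $\rho$ and $Y \sim N(\mu_Y, \sigma_Y^2)$. Then the present value of the dividend is
$$P_t = E_t\left[ D_T \frac{\zeta_T}{\zeta_t} \right] = E_t\left[ E_t[D_T]\, e^X e^Y \right] = E_t[D_T]\, E_t\left[ e^{X+Y} \right].$$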
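Since $X + Y \sim N(-\frac{1}{2}\sigma_X^2 + \mu_Y,\ \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y)$, we get
$$E_t\left[ e^{X+Y} \right] = \exp\left\{ -\frac{1}{2}\sigma_X^2 + \mu_Y + \frac{1}{2}\left( \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y \right) \right\} = \exp\left\{ \mu_Y + \frac{1}{2}\sigma_Y^2 + \rho\sigma_X\sigma_Y \right\}$$
and thus
$$P_t = E_t[D_T] \exp\left\{ \mu_Y + \frac{1}{2}\sigma_Y^2 + \rho\sigma_X\sigma_Y \right\}. \tag{4.61}$$
The exponential term on the right-hand side of that equation is the appropriately risk-adjusted discount factor for the given dividend.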
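In a continuous-time setting the distributional assumption on $Y = \ln(\zeta_T/\zeta_t)$ is satisfied if the market price of risk vector $\lambda$ is a constant and the short-term risk-free rate is a constant $r^f$ since then
$$Y = -\left( r^f + \frac{1}{2}\|\lambda\|^2 \right)(T - t) - \lambda^\top \int_t^T dz_u \sim N\left( -\left( r^f + \frac{1}{2}\|\lambda\|^2 \right)(T - t),\ \|\lambda\|^2 (T - t) \right),$$
in which case we can rewrite the present value as
$$P_t = E_t[D_T] \exp\left\{ -r^f (T - t) + \rho\sigma_X\sigma_Y \right\}.$$
If we substitute in $\sigma_Y = \|\lambda\|\sqrt{T - t}$ and write $\sigma_X^2 = \sigma_d^2 (T - t)$, where $\sigma_d^2$ is the annualized variance of the dividend, we get a present value of
$$P_t = E_t[D_T] \exp\left\{ -r^f (T - t) + \rho\sigma_d\|\lambda\| (T - t) \right\} = E_t[D_T] \exp\left\{ -\left( r^f - \rho\sigma_d\|\lambda\| \right)(T - t) \right\}.$$
The expected dividend is discounted with a risk-adjusted percentage interest rate, which equals the risk-free rate minus $\rho\sigma_d\|\lambda\|$.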
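The two discounting formulas above can be compared in code. This is a minimal sketch under the constant-parameter assumptions stated in the text; the function names and the numerical inputs are hypothetical choices made here for illustration.

```python
import math

def present_value_general(expected_dividend, mu_Y, sigma_Y, sigma_X, rho):
    """Equation (4.61): E_t[D_T] * exp(mu_Y + 0.5*sigma_Y^2 + rho*sigma_X*sigma_Y)."""
    return expected_dividend * math.exp(mu_Y + 0.5 * sigma_Y**2 + rho * sigma_X * sigma_Y)

def present_value_constant(expected_dividend, r_f, rho, sigma_d, lam_norm, horizon):
    """Constant r_f and constant market price of risk with norm lam_norm:
    discount at the risk-adjusted rate r_f - rho*sigma_d*lam_norm over T - t."""
    risk_adjusted_rate = r_f - rho * sigma_d * lam_norm
    return expected_dividend * math.exp(-risk_adjusted_rate * horizon)

# Hypothetical inputs: a dividend whose shocks are negatively correlated with the deflator.
exp_div, r_f, rho, sigma_d, lam_norm, tau = 100.0, 0.03, -0.5, 0.2, 0.4, 5.0

# Map the continuous-time parameters into the moments of Y = ln(zeta_T / zeta_t).
mu_Y = -(r_f + 0.5 * lam_norm**2) * tau
sigma_Y = lam_norm * math.sqrt(tau)
sigma_X = sigma_d * math.sqrt(tau)

pv_general = present_value_general(exp_div, mu_Y, sigma_Y, sigma_X, rho)
pv_constant = present_value_constant(exp_div, r_f, rho, sigma_d, lam_norm, tau)
pv_riskfree = exp_div * math.exp(-r_f * tau)
print(pv_general, pv_constant, pv_riskfree)
```

With a negative correlation the risk-adjusted rate exceeds the risk-free rate, so the present value falls below the value obtained by risk-free discounting.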
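4.4.1 Discrete-time valuation
Consider a discrete-time multi-period framework and assume that the state-price deflator $\zeta = (\zeta_t)$ satisfies
$$E_t\left[ \frac{\zeta_{t+1}}{\zeta_t} \right] = \mu_\zeta, \qquad \mathrm{Var}_t\left[ \frac{\zeta_{t+1}}{\zeta_t} \right] = \sigma_\zeta^2$$
for each $t$. In particular, $E_t\left[ (\zeta_{t+1}/\zeta_t)^2 \right] = \sigma_\zeta^2 + \mu_\zeta^2$. Consider an uncertain stream of dividends $D = (D_t)$, where the dividend growth rate is given by
$$\frac{D_{t+1}}{D_t} = a + b\, \frac{\zeta_{t+1}}{\zeta_t} + \varepsilon_{t+1}, \quad t = 0, 1, \dots, T - 1,$$
where $\varepsilon_1, \dots, \varepsilon_T$ are independent with $E_t[\varepsilon_{t+1}] = 0$ and $E_t[\varepsilon_{t+1}\zeta_{t+1}] = 0$ for all $t$.
First note that
$$E_t\left[ \frac{D_{t+1}}{D_t} \frac{\zeta_{t+1}}{\zeta_t} \right] = E_t\left[ \left( a + b\, \frac{\zeta_{t+1}}{\zeta_t} + \varepsilon_{t+1} \right) \frac{\zeta_{t+1}}{\zeta_t} \right] = a\, E_t\left[ \frac{\zeta_{t+1}}{\zeta_t} \right] + b\, E_t\left[ \left( \frac{\zeta_{t+1}}{\zeta_t} \right)^2 \right] = a\mu_\zeta + b\left( \mu_\zeta^2 + \sigma_\zeta^2 \right) \equiv A$$
for every $t = 0, 1, \dots, T - 1$. Together with the Law of Iterated Expectations this implies that for each $s > t$
$$\begin{aligned} E_t\left[ \frac{D_s}{D_t} \frac{\zeta_s}{\zeta_t} \right] &= E_t\left[ \frac{D_{s-1}}{D_t} \frac{\zeta_{s-1}}{\zeta_t} \frac{D_s}{D_{s-1}} \frac{\zeta_s}{\zeta_{s-1}} \right] = E_t\left[ \frac{D_{s-1}}{D_t} \frac{\zeta_{s-1}}{\zeta_t}\, E_{s-1}\left[ \frac{D_s}{D_{s-1}} \frac{\zeta_s}{\zeta_{s-1}} \right] \right] \\ &= A\, E_t\left[ \frac{D_{s-1}}{D_t} \frac{\zeta_{s-1}}{\zeta_t} \right] = \dots = A^{s-t}. \end{aligned}$$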
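The price-dividend ratio of the asset is therefore
$$\frac{P_t}{D_t} = E_t\left[ \sum_{s=t+1}^{T} \frac{D_s}{D_t} \frac{\zeta_s}{\zeta_t} \right] = \sum_{s=t+1}^{T} E_t\left[ \frac{D_s}{D_t} \frac{\zeta_s}{\zeta_t} \right] = \sum_{s=t+1}^{T} A^{s-t} = A + A^2 + \dots + A^{T-t} = \frac{A}{1 - A}\left( 1 - A^{T-t} \right).$$
The price is thus
$$P_t = D_t\, \frac{A}{1 - A}\left( 1 - A^{T-t} \right).$$
If $|A| < 1$ and you let $T \to \infty$, i.e. assume dividends continue in all future, you get
$$P_t = D_t\, \frac{A}{1 - A}.$$
In particular, the price-dividend ratio is a constant.¹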
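The closed-form price-dividend ratio can be compared with the direct sum of the terms $A^{s-t}$. The following sketch is illustrative; the function names and the parameter values for $a$, $b$, $\mu_\zeta$ and $\sigma_\zeta$ are assumptions chosen here only so that $|A| < 1$.

```python
def one_period_factor(a, b, mu_zeta, sigma_zeta):
    """A = a*mu_zeta + b*(mu_zeta^2 + sigma_zeta^2)."""
    return a * mu_zeta + b * (mu_zeta**2 + sigma_zeta**2)

def price_dividend_ratio(A, periods=None):
    """A/(1-A) * (1 - A^(T-t)); periods=None means an infinite horizon (needs |A| < 1)."""
    if periods is None:
        if abs(A) >= 1:
            raise ValueError("the infinite-horizon sum converges only for |A| < 1")
        return A / (1 - A)
    if A == 1:
        return float(periods)          # each of the T-t terms equals one
    return A / (1 - A) * (1 - A**periods)

# Hypothetical parameters of the dividend growth process and the deflator.
A = one_period_factor(a=1.5, b=-0.5, mu_zeta=0.95, sigma_zeta=0.2)

horizon = 30
direct_sum = sum(A**n for n in range(1, horizon + 1))
print(A, price_dividend_ratio(A, horizon), direct_sum, price_dividend_ratio(A))
```

The finite-horizon ratio approaches the infinite-horizon value $A/(1-A)$ from below as the horizon grows when $0 < A < 1$.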
4.4.2 Continuous-time valuation: The PDE approach
4.4.3 Continuous-time valuation: Affine models
4.5 Nominal and real state-price deflators
It is important to distinguish between real and nominal dividends and prices. A nominal dividend [price] is the dividend [price] in units of a given currency, e.g. US dollars or Euros. The corresponding real dividend [price] is the number of units of consumption goods which can be purchased for the nominal dividend [price]. For simplicity assume that the economy only offers a single consumption good and let $F_t$ denote the price of the good in currency units at time $t$. (More broadly we can think of $F_t$ as the value of the Consumer Price Index at time $t$.) A nominal dividend of $\tilde{D}_t$ then corresponds to a real dividend of $D_t = \tilde{D}_t/F_t$. A nominal price of $\tilde{P}_t$ corresponds to a real price of $P_t = \tilde{P}_t/F_t$.
A state-price deflator basically links future dividends to current prices. We can define a nominal state-price deflator so that the basic pricing condition holds for nominal prices and dividends and, similarly, define a real state-price deflator so that the basic pricing condition holds for real prices and dividends. If we continue to indicate nominal quantities by a tilde and real quantities without a tilde, the definitions of state-price deflators given earlier in this chapter characterize real state-price deflators.
¹ This has some similarity to the Gordon Growth Model, which was developed by Gordon (1962) and is apparently still frequently applied by many financial analysts. That model assumes that expected dividends grow forever at a constant rate of $g$ and that the appropriate risk-adjusted percentage discount rate is a constant $\bar{r} > g$. Then the present value of all future dividends is
$$P_t = \sum_{s=t+1}^{\infty} D_t (1 + g)^{s-t} (1 + \bar{r})^{-(s-t)} = D_t \sum_{u=1}^{\infty} \left( \frac{1 + g}{1 + \bar{r}} \right)^u = D_t\, \frac{1 + g}{\bar{r} - g},$$
so that the price-dividend ratio is constant.
Consider a multi-period discrete-time economy where $\zeta = (\zeta_t)$ is a real state-price deflator so that, in particular,
$$P_{it} = E_t\left[ \sum_{s=t+1}^{T} D_{is} \frac{\zeta_s}{\zeta_t} \right],$$
cf. (4.20). Substituting in $P_{it} = \tilde{P}_{it}/F_t$ and $D_{is} = \tilde{D}_{is}/F_s$ and multiplying through by $F_t$, we obtain
$$\tilde{P}_{it} = E_t\left[ \sum_{s=t+1}^{T} \tilde{D}_{is} \frac{\zeta_s/F_s}{\zeta_t/F_t} \right].$$
Now it is clear that a nominal state-price deflator $\tilde{\zeta} = (\tilde{\zeta}_t)$ should be defined from a real state-price deflator $\zeta = (\zeta_t)$ as
$$\tilde{\zeta}_t = \frac{\zeta_t}{F_t}, \quad \text{all } t \in \mathcal{T}. \tag{4.62}$$
Then the nominal state-price deflator will link nominal dividends to nominal prices in the same way that a real state-price deflator links real dividends to real prices. This relation also works in the continuous-time framework. Note that the nominal state-price deflator is positive with an initial value of $\tilde{\zeta}_0 = 1/F_0$.
In a discrete-time framework the gross nominal return on asset $i$ between time $t$ and time $t + 1$ is $\tilde{R}_{i,t+1} = (\tilde{P}_{i,t+1} + \tilde{D}_{i,t+1})/\tilde{P}_{it}$. The link between the gross real return and the gross nominal return follows from
$$R_{i,t+1} = \frac{P_{i,t+1} + D_{i,t+1}}{P_{it}} = \frac{\tilde{P}_{i,t+1}/F_{t+1} + \tilde{D}_{i,t+1}/F_{t+1}}{\tilde{P}_{it}/F_t} = \frac{\tilde{P}_{i,t+1} + \tilde{D}_{i,t+1}}{\tilde{P}_{it}} \frac{F_t}{F_{t+1}} = \tilde{R}_{i,t+1} \frac{F_t}{F_{t+1}}. \tag{4.63}$$
In terms of the net rates of return $r_{i,t+1} = R_{i,t+1} - 1$, $\tilde{r}_{i,t+1} = \tilde{R}_{i,t+1} - 1$, and the percentage inflation rate $\varphi_{t+1} = F_{t+1}/F_t - 1$ we have
$$1 + r_{i,t+1} = \frac{1 + \tilde{r}_{i,t+1}}{1 + \varphi_{t+1}},$$
which implies that
$$r_{i,t+1} = \frac{1 + \tilde{r}_{i,t+1}}{1 + \varphi_{t+1}} - 1 = \frac{\tilde{r}_{i,t+1} - \varphi_{t+1}}{1 + \varphi_{t+1}} \approx \tilde{r}_{i,t+1} - \varphi_{t+1}. \tag{4.64}$$
The above equations show how to obtain real returns from nominal returns and inflation. Given a time series of nominal returns and inflation, it is easy to compute the corresponding time series of real returns.
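The conversion from nominal to real returns in (4.64) takes only a few lines of code. The sketch below uses made-up return and inflation series; the function names are hypothetical and chosen here for illustration.

```python
def real_return(nominal_return, inflation):
    """Exact real net return: (1 + nominal) / (1 + inflation) - 1, cf. (4.64)."""
    return (1 + nominal_return) / (1 + inflation) - 1

def real_return_approx(nominal_return, inflation):
    """First-order approximation: nominal return minus inflation."""
    return nominal_return - inflation

# Hypothetical annual net nominal returns and inflation rates.
nominal = [0.08, 0.03, -0.02]
inflation = [0.03, 0.02, 0.01]

real = [real_return(r, phi) for r, phi in zip(nominal, inflation)]
approx = [real_return_approx(r, phi) for r, phi in zip(nominal, inflation)]
for exact, rough in zip(real, approx):
    print(round(exact, 6), round(rough, 6))
```

The approximation error equals the exact real return times the inflation rate, so it is small when both are small.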
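The realized gross inflation rate $F_{t+1}/F_t$ is generally not known in advance. Therefore the real return on a nominally risk-free asset is generally stochastic (and conversely). The link between the nominally risk-free gross return $\tilde{R}^f_t$ and the real risk-free gross return $R^f_t$ is
$$\begin{aligned} \frac{1}{\tilde{R}^f_t} = E_t\left[ \frac{\tilde{\zeta}_{t+1}}{\tilde{\zeta}_t} \right] &= E_t\left[ \frac{\zeta_{t+1}}{\zeta_t} \frac{F_t}{F_{t+1}} \right] \\ &= E_t\left[ \frac{\zeta_{t+1}}{\zeta_t} \right] E_t\left[ \frac{F_t}{F_{t+1}} \right] + \mathrm{Cov}_t\left[ \frac{\zeta_{t+1}}{\zeta_t}, \frac{F_t}{F_{t+1}} \right] \\ &= \frac{1}{R^f_t}\, E_t\left[ \frac{F_t}{F_{t+1}} \right] + \mathrm{Cov}_t\left[ \frac{\zeta_{t+1}}{\zeta_t}, \frac{F_t}{F_{t+1}} \right]. \end{aligned}$$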
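In order to obtain a simpler link between the nominal and the real risk-free returns, we specialize to a lognormal setting. Assume that the next-period state-price deflator $\zeta_{t+1}/\zeta_t$ and the gross realized inflation $F_{t+1}/F_t$ are given by
$$\frac{\zeta_{t+1}}{\zeta_t} = e^{-a_t + b_t \varepsilon_{t+1}}, \qquad \frac{F_{t+1}}{F_t} = e^{h_t + k_t \left( \rho_t \varepsilon_{t+1} + \sqrt{1 - \rho_t^2}\, \eta_{t+1} \right)}, \tag{4.65}$$
where $a = (a_t)$, $b = (b_t)$, $h = (h_t)$, $k = (k_t)$, and $\rho = (\rho_t)$ are adapted processes, and where the shocks $\varepsilon_{t+1}$ and $\eta_{t+1}$ are independent and identically $N(0, 1)$ distributed. Note that
$$E_t\left[ \ln\left( \frac{\zeta_{t+1}}{\zeta_t} \right) \right] = -a_t, \qquad \mathrm{Var}_t\left[ \ln\left( \frac{\zeta_{t+1}}{\zeta_t} \right) \right] = b_t^2.$$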
Using Theorem B.2 in Appendix B, the gross real risk-free return becomes
$$R^f_t = \left( E_t\left[ \frac{\zeta_{t+1}}{\zeta_t} \right] \right)^{-1} = \left( e^{-a_t + \frac{1}{2} b_t^2} \right)^{-1} = e^{a_t - \frac{1}{2} b_t^2}$$
so that the continuously compounded real risk-free rate is
$$r^f_t = \ln R^f_t = a_t - \frac{1}{2} b_t^2. \tag{4.66}$$
Similarly, the expectation and the variance of the log inflation rate are
$$E_t\left[ \ln\left( \frac{F_{t+1}}{F_t} \right) \right] = h_t, \qquad \mathrm{Var}_t\left[ \ln\left( \frac{F_{t+1}}{F_t} \right) \right] = k_t^2,$$
implying an expected gross inflation of
$$E_t\left[ \frac{F_{t+1}}{F_t} \right] = e^{h_t + \frac{1}{2} k_t^2}.$$
Assuming that $b_t$ and $k_t$ are positive, $\rho_t$ is the correlation between the log real state-price deflator and the log inflation rate.
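Due to the lognormality assumption we can directly compute the expectation of the nominal state-price deflator over the next period:
$$\begin{aligned} E_t\left[ \frac{\tilde{\zeta}_{t+1}}{\tilde{\zeta}_t} \right] = E_t\left[ \frac{\zeta_{t+1}}{\zeta_t} \frac{F_t}{F_{t+1}} \right] &= E_t\left[ e^{-a_t + b_t \varepsilon_{t+1}}\, e^{-h_t - k_t \rho_t \varepsilon_{t+1} - k_t \sqrt{1 - \rho_t^2}\, \eta_{t+1}} \right] \\ &= e^{-a_t - h_t}\, E_t\left[ e^{(b_t - k_t \rho_t) \varepsilon_{t+1}} \right] E_t\left[ e^{-k_t \sqrt{1 - \rho_t^2}\, \eta_{t+1}} \right] \\ &= e^{-a_t - h_t}\, e^{\frac{1}{2}(b_t - k_t \rho_t)^2}\, e^{\frac{1}{2} k_t^2 (1 - \rho_t^2)} \\ &= e^{-a_t - h_t + \frac{1}{2} b_t^2 + \frac{1}{2} k_t^2 - \rho_t b_t k_t}. \end{aligned}$$
The gross nominal risk-free return is thus
$$\tilde{R}^f_t = \left( E_t\left[ \frac{\tilde{\zeta}_{t+1}}{\tilde{\zeta}_t} \right] \right)^{-1} = e^{a_t + h_t - \frac{1}{2} b_t^2 - \frac{1}{2} k_t^2 + \rho_t b_t k_t},$$
and the continuously compounded nominal risk-free rate is
$$\tilde{r}^f_t = \ln \tilde{R}^f_t = a_t + h_t - \frac{1}{2} b_t^2 - \frac{1}{2} k_t^2 + \rho_t b_t k_t,$$
which we can rewrite using (4.66) as
$$\begin{aligned} \tilde{r}^f_t &= r^f_t + h_t - \frac{1}{2} k_t^2 + \rho_t b_t k_t \\ &= r^f_t + \ln\left( E_t\left[ \frac{F_{t+1}}{F_t} \right] \right) - \mathrm{Var}_t\left[ \ln\left( \frac{F_{t+1}}{F_t} \right) \right] + \mathrm{Cov}_t\left[ \ln\left( \frac{\zeta_{t+1}}{\zeta_t} \right), \ln\left( \frac{F_{t+1}}{F_t} \right) \right]. \end{aligned} \tag{4.67}$$
The nominal risk-free rate equals the real risk-free rate plus log-expected inflation minus inflation variance and adjusted by an inflation risk premium.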
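The closed-form nominal rate can be cross-checked by evaluating the expectation of the nominal state-price deflator numerically. The following is only a sketch: it assumes the lognormal specification with log deflator growth $-a_t + b_t\varepsilon_{t+1}$ and log inflation $h_t + k_t(\rho_t\varepsilon_{t+1} + \sqrt{1-\rho_t^2}\,\eta_{t+1})$, the parameter values are hypothetical, and the trapezoidal quadrature helper is an illustration device introduced here rather than anything from the text.

```python
import math

def normal_expectation(f, lo=-12.0, hi=12.0, n=24000):
    """E[f(Z)] for Z ~ N(0,1), computed by the trapezoidal rule on [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z = lo + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2 * math.pi)

def nominal_rate_closed_form(a, b, h, k, rho):
    """Real rate a - b^2/2 plus h - k^2/2 + rho*b*k, cf. (4.66) and (4.67)."""
    real_rate = a - 0.5 * b**2
    return real_rate + h - 0.5 * k**2 + rho * b * k

def nominal_rate_numerical(a, b, h, k, rho):
    """-ln E[(zeta_{t+1}/zeta_t) * (F_t/F_{t+1})]; the expectation factorises
    because the two standard normal shocks are independent."""
    e_eps = normal_expectation(lambda z: math.exp((b - k * rho) * z))
    e_eta = normal_expectation(lambda z: math.exp(-k * math.sqrt(1 - rho**2) * z))
    return -math.log(math.exp(-a - h) * e_eps * e_eta)

# Hypothetical one-period parameters.
params = dict(a=0.03, b=0.3, h=0.02, k=0.05, rho=-0.4)
print(nominal_rate_closed_form(**params), nominal_rate_numerical(**params))
```

With a negative correlation between the log deflator and log inflation, the covariance term lowers the nominal rate relative to the real rate plus $h_t - \frac{1}{2}k_t^2$.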
We can obtain a similar but more generally valid expression in a continuous-time framework. Let $\zeta = (\zeta_t)$ denote a real state-price deflator, which evolves over time according to
$$d\zeta_t = -\zeta_t \left[ r^f_t\, dt + \lambda_t^\top dz_t \right],$$
where $r^f = (r^f_t)$ is the short-term real interest rate and $\lambda = (\lambda_t)$ is the market price of risk. Assume that the dynamics of the price of the consumption good can be written as
$$dF_t = F_t \left[ \mu_{\varphi t}\, dt + \sigma_{\varphi t}^\top dz_t \right]. \tag{4.68}$$
We can interpret $\varphi_{t+dt} = dF_t/F_t$ as the realized inflation rate over the next instant, $\mu_{\varphi t} = E_t[\varphi_{t+dt}]/dt$ as the expected inflation rate, and $\sigma_{\varphi t}$ as the sensitivity vector of the inflation rate.
Consider now a nominal bank account which over the next instant promises a risk-free monetary return represented by the nominal short-term interest rate $\tilde{r}^f_t$. If we let $\tilde{N}_t$ denote the time $t$ dollar value of such an account, we have that
$$d\tilde{N}_t = \tilde{r}^f_t \tilde{N}_t\, dt.$$
The real price of this account is $N_t = \tilde{N}_t/F_t$, since this is the number of units of the consumption good that has the same value as the account. An application of Itô's Lemma implies a real price dynamics of
$$dN_t = N_t \left[ \left( \tilde{r}^f_t - \mu_{\varphi t} + \|\sigma_{\varphi t}\|^2 \right) dt - \sigma_{\varphi t}^\top dz_t \right]. \tag{4.69}$$
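The Itô step behind (4.69) can be spelled out as follows. This is a sketch of the standard calculation, writing $\mu_{\varphi t}$ and $\sigma_{\varphi t}$ for the drift and the sensitivity vector of the consumer price $F_t$ and $\tilde{N}_t$ for the nominal account value; these symbol choices are assumptions where the notation is not otherwise fixed.

```latex
% Step 1: dynamics of 1/F_t from Ito's Lemma applied to g(F) = 1/F,
% using (dF_t)^2 = F_t^2 \|\sigma_{\varphi t}\|^2 dt.
\begin{aligned}
d\left(\frac{1}{F_t}\right)
  &= -\frac{1}{F_t^2}\, dF_t + \frac{1}{F_t^3}\, (dF_t)^2 \\
  &= \frac{1}{F_t}\left[\left(-\mu_{\varphi t} + \|\sigma_{\varphi t}\|^2\right) dt
     - \sigma_{\varphi t}^\top dz_t\right].
\end{aligned}

% Step 2: product rule for N_t = \tilde{N}_t \cdot (1/F_t).
% \tilde{N}_t has no dz-term, so the cross-variation term vanishes.
\begin{aligned}
dN_t
  &= \frac{1}{F_t}\, d\tilde{N}_t + \tilde{N}_t\, d\left(\frac{1}{F_t}\right) \\
  &= N_t\, \tilde{r}^f_t\, dt
     + N_t\left[\left(-\mu_{\varphi t} + \|\sigma_{\varphi t}\|^2\right) dt
     - \sigma_{\varphi t}^\top dz_t\right] \\
  &= N_t\left[\left(\tilde{r}^f_t - \mu_{\varphi t} + \|\sigma_{\varphi t}\|^2\right) dt
     - \sigma_{\varphi t}^\top dz_t\right].
\end{aligned}
```

The extra drift term $\|\sigma_{\varphi t}\|^2$ is the convexity correction from inverting the price level, which is why inflation variance shows up in the relation between nominal and real rates.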
Note that the real return on this instantaneously nominally risk-free asset, $dN_t/N_t$, is risky. Since the percentage sensitivity vector is given by $-\sigma_{\varphi t}$, the expected return is given by the real short rate plus $-\sigma_{\varphi t}^\top \lambda_t$. Comparing this with the drift term in the equation above, we have that
$$\tilde{r}^f_t - \mu_{\varphi t} + \|\sigma_{\varphi t}\|^2 = r^f_t - \sigma_{\varphi t}^\top \lambda_t.$$
Consequently the nominal short-term interest rate is given by
$$\tilde{r}^f_t = r^f_t + \mu_{\varphi t} - \|\sigma_{\varphi t}\|^2 - \sigma_{\varphi t}^\top \lambda_t, \tag{4.70}$$
i.e. the nominal short rate is equal to the real short rate plus the expected inflation rate minus the variance of the inflation rate minus a risk premium. The first three terms on the right-hand side are clearly analogous to those in the discrete-time relation (4.67). The continuous-time equivalent of the last term in (4.67) is
$$\mathrm{Cov}_t[d(\ln \zeta_t), d(\ln F_t)] = \mathrm{Cov}_t\left[ -\lambda_t^\top dz_t,\ \sigma_{\varphi t}^\top dz_t \right] = -\sigma_{\varphi t}^\top \lambda_t\, dt,$$
which corresponds to the last term in (4.70). The discrete-time relation (4.67) and the continuous-time relation (4.70) are therefore completely analogous, but the discrete-time relation could only be derived in a lognormal setting.
The presence of the last two terms in (4.70) invalidates the Fisher relation, which says that the nominal interest rate is equal to the sum of the real interest rate and the expected inflation rate. The Fisher hypothesis will hold if and only if the inflation rate is instantaneously risk-free. In Chapter 10 we will discuss the link between real and nominal interest rates and yields in more detail.
Individuals should primarily be concerned about real values since, in the end, they should care
about the number of goods they can consume. Therefore, most theoretical asset pricing models
make predictions about expected real returns.
4.6 A preview of alternative formulations
The previous sections show that a state-price deflator is a good way to represent the market-wide pricing mechanism in a financial market. Paired with characteristics of any individual asset, the state-price deflator leads to the price of the asset. This section shows that we can capture the same information in other ways. The alternative representations can be preferable for some specific purposes and we will return to them in later chapters. Here we will only give a preview. For simplicity we keep the discussion in a one-period framework.
4.6.1 Risk-neutral probabilities
Suppose that a risk-free dividend can be constructed and that it provides a gross return of $R^f$. A probability measure $Q$ is called a risk-neutral probability measure if the following conditions are satisfied:
(i) $P$ and $Q$ are equivalent, i.e. attach zero probability to the same events;
(ii) the random variable $dQ/dP$ (explained below) has finite variance;
(iii) the price of any asset $i = 1, \dots, I$ is given by
$$P_i = E^Q\left[ (R^f)^{-1} D_i \right] = (R^f)^{-1} E^Q[D_i], \tag{4.71}$$
i.e. the price of any asset equals the expected discounted dividend using the risk-free interest rate as the discount rate and the risk-neutral probabilities when computing the expectation.
The risk-free return is not random and can therefore be moved in and out of expectations as in the above equation. Given the return (or, equivalently, the price) of the risk-free asset, all the market-wide pricing information is captured by a risk-neutral probability measure.
In the case of a finite state space $\Omega = \{1, 2, \dots, S\}$, a probability measure $Q$ is fully characterized by the state probabilities $q_\omega$, $\omega = 1, \dots, S$. Equivalence of $P$ and $Q$ requires $q_\omega > 0$ for all $\omega$. With finite $\Omega$ the pricing equation in (iii) can be written as $P_i = (R^f)^{-1} \sum_{\omega=1}^{S} q_\omega D_{i\omega}$.
Why is $Q$ called a risk-neutral probability measure? Since the gross return on asset $i$ is $R_i = D_i/P_i$, we can rewrite (4.71) as
$$E^Q[R_i] = R^f, \tag{4.72}$$
i.e. all assets have an expected return equal to the risk-free return under the risk-neutral probability measure. If all investors were risk-neutral, they would rank assets according to their expected returns only and the market could only be in equilibrium if all assets had the same expected returns. The definition of a risk-neutral probability measure $Q$ thus implies that asset prices in the real world are just as they would have been in an economy in which all individuals are risk-neutral and the state probabilities are given by $Q$. The price adjustments for risk are thus incorporated in the risk-neutral probabilities.
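Next, let us explore the link between risk-neutral probability measures and state prices. First, assume a finite state space. Given a state-price vector $\psi$ and the associated state-price deflator $\zeta$, we can define
$$q_\omega = \frac{\psi_\omega}{\sum_{s=1}^{S} \psi_s} = R^f \psi_\omega = R^f p_\omega \zeta_\omega, \quad \omega = 1, \dots, S.$$
All the $q_\omega$'s are strictly positive and sum to one so they define an equivalent probability measure. Furthermore, (4.18) implies that
$$P_i = \psi \cdot D_i = \sum_{\omega=1}^{S} \psi_\omega D_{i\omega} = \sum_{\omega=1}^{S} (R^f)^{-1} q_\omega D_{i\omega} = E^Q\left[ (R^f)^{-1} D_i \right],$$
so $Q$ is indeed a risk-neutral probability measure. Note that $q_\omega > p_\omega$ if and only if $\zeta_\omega > (R^f)^{-1} = E[\zeta]$, i.e. if the value of the state-price deflator for state $\omega$ is higher than average.
The change of measure from the real-world probability measure $P$ to the risk-neutral probability measure $Q$ is given by the ratios $\xi_\omega \equiv q_\omega/p_\omega = R^f \zeta_\omega$, which have a $P$-expectation of one:
$$\sum_{\omega=1}^{S} p_\omega \xi_\omega = \sum_{\omega=1}^{S} p_\omega R^f \zeta_\omega = R^f \sum_{\omega=1}^{S} p_\omega \zeta_\omega = 1.$$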
When the state space is infinite, state-price deflators still make sense. Given a state-price deflator $\zeta$, we can define a risk-neutral probability measure $Q$ by the random variable
$$\xi = \frac{dQ}{dP} = R^f \zeta.$$
Conversely, given a risk-neutral probability measure $Q$ and the risk-free gross return $R^f$, we can define a state-price deflator $\zeta$ by
$$\zeta = (R^f)^{-1} \frac{dQ}{dP}.$$
In the case of a finite state space, the risk-neutral probability measure is given by $\xi_\omega = q_\omega/p_\omega$, $\omega = 1, \dots, S$, and we can construct a state-price vector $\psi$ and a state-price deflator $\zeta$ as
$$\psi_\omega = (R^f)^{-1} q_\omega, \qquad \zeta_\omega = (R^f)^{-1} \xi_\omega = (R^f)^{-1} \frac{q_\omega}{p_\omega}, \quad \omega = 1, \dots, S.$$
We summarize the above observations in the following theorem:
Theorem 4.4 Assume that a risk-free asset exists. Then there is a one-to-one correspondence between state-price deflators and risk-neutral probability measures.
Combining this result with Theorems 4.1 and 4.2, we reach the next conclusion.
Theorem 4.5 Assume that a risk-free asset exists. Prices admit no arbitrage if and only if a risk-neutral probability measure exists. The market is complete if and only if there is a unique risk-neutral probability measure. If the market is complete and arbitrage-free, the unique risk-neutral probability measure $Q$ is characterized by $dQ/dP = R^f \zeta^*$, where $\zeta^*$ is given by (4.46).
Risk-neutral probabilities are especially useful for the pricing of derivative assets. In Chapter 11 we will generalize the definition of risk-neutral probabilities to multi-period settings and we will also define other probability measures that are useful in derivative pricing.
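The passage from state prices to risk-neutral probabilities can be illustrated with the complete three-asset market of Example 4.4, where the state-price vector is $(0.3, 0.2, 0.3)$ and asset 1 is risk-free. The sketch below is only an illustration with exact fractions; all variable names are chosen here.

```python
from fractions import Fraction as F

p = [F(1, 2), F(1, 4), F(1, 4)]              # real-world state probabilities
psi = [F(3, 10), F(1, 5), F(3, 10)]          # state-price vector from Example 4.4
D = [[1, 1, 1], [0, 1, 2], [4, 0, 1]]        # dividends, one row per asset
P = [F(4, 5), F(4, 5), F(3, 2)]              # asset prices

R_f = 1 / sum(psi)                           # a sure payoff of 1 costs sum(psi)
q = [R_f * s for s in psi]                   # risk-neutral probabilities
xi = [qw / pw for qw, pw in zip(q, p)]       # change of measure dQ/dP, state by state
zeta = [s / pw for s, pw in zip(psi, p)]     # state-price deflator

# Prices as discounted Q-expectations, and Q-expected gross returns.
prices_Q = [sum(qw * d for qw, d in zip(q, row)) / R_f for row in D]
expected_returns_Q = [sum(qw * d for qw, d in zip(q, row)) / price
                      for row, price in zip(D, P)]

print(float(R_f), [float(v) for v in q], [float(v) for v in expected_returns_Q])
```

States 1 and 3, where the deflator value differs most from its mean, are the ones whose probabilities change under $Q$; state 2 keeps its probability because $\zeta_2$ equals $E[\zeta]$.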
4.6.2 Pricing factors
We will say that a (one-dimensional) random variable $x$ is a pricing factor for the market if there exists some $\alpha, \eta \in \mathbb{R}$ so that
$$E[R_i] = \alpha + \beta[R_i, x]\, \eta, \quad i = 1, \dots, I, \tag{4.73}$$
where the factor-beta of asset $i$ is given by
$$\beta[R_i, x] = \frac{\mathrm{Cov}[R_i, x]}{\mathrm{Var}[x]}. \tag{4.74}$$
The constant $\eta$ is called the factor risk premium and $\alpha$ the zero-beta return. Due to the linearity of expectations and covariance, (4.73) will also hold for all portfolios of the $I$ assets. Note that if a risk-free asset is traded in the market, it will have a zero factor-beta and, consequently, $\alpha = R^f$.
The relation (4.73) does not directly involve prices. But since the expected gross return is $E[R_i] = E[D_i]/P_i$, we have $P_i = E[D_i]/E[R_i]$ and hence the equivalent relation
\[ P_i = \frac{E[D_i]}{\alpha + \beta[R_i, x]\,\eta}. \tag{4.75} \]
The price is equal to the expected dividend discounted by a risk-adjusted rate. You may find this relation unsatisfactory since the price implicitly enters the right-hand side through the return-beta $\beta[R_i, x]$. However, we can define a dividend-beta by $\beta[D_i, x] = \mathrm{Cov}[D_i, x]/\mathrm{Var}[x]$, and inserting $D_i = R_i P_i$ we see that $\beta[D_i, x] = P_i\,\beta[R_i, x]$. Equation (4.73) now implies that
\[ \frac{E[D_i]}{P_i} = \alpha + \frac{1}{P_i}\,\beta[D_i, x]\,\eta \]
so that
\[ P_i = \frac{E[D_i] - \beta[D_i, x]\,\eta}{\alpha}. \tag{4.76} \]
Think of the numerator as a certainty equivalent of the risky dividend. The current price is the certainty equivalent discounted by the zero-beta return, which is the risk-free return if this exists.
What is the link between pricing factors and state-price deflators? It follows from (4.10) that any state-price deflator $\zeta$ itself is a pricing factor. That equation does not require positivity of the state-price deflator, only the pricing condition. Therefore any random variable $x$ that satisfies $P_i = E[x D_i]$ for all assets works as a pricing factor. More generally, if $x$ is a random variable and $a, b$ are constants so that $P_i = E[(a + bx) D_i]$ for all assets $i$, then $x$ is a pricing factor. In particular, whenever we have a state-price deflator of the form $\zeta = a + bx$, we can use $x$ as a pricing factor.
Conversely, if we have a pricing factor $x$ for which the associated zero-beta return $\alpha$ is non-zero, we can find constants $a, b$ so that $\zeta = a + bx$ satisfies the pricing condition $P_i = E[\zeta D_i]$ for $i = 1, \dots, I$. In order to see this, let $\eta$ denote the factor risk premium associated with the pricing factor $x$ and define
\[ b = -\frac{\eta}{\alpha\,\mathrm{Var}[x]}, \qquad a = \frac{1}{\alpha} - b\,E[x]. \]
Then $\zeta = a + bx$ works since
\begin{align*}
E[\zeta R_i] &= a\,E[R_i] + b\,E[R_i x] \\
&= a\,E[R_i] + b\left(\mathrm{Cov}[R_i, x] + E[R_i]\,E[x]\right) \\
&= (a + b\,E[x])\,E[R_i] + b\,\mathrm{Cov}[R_i, x] \\
&= \frac{1}{\alpha}\left(E[R_i] - \frac{\mathrm{Cov}[R_i, x]}{\mathrm{Var}[x]}\,\eta\right) \\
&= \frac{1}{\alpha}\left(E[R_i] - \beta[R_i, x]\,\eta\right) \\
&= 1
\end{align*}
for any $i = 1, \dots, I$. Inserting $a$ and $b$, we get
\[ \zeta = a + bx = \frac{1}{\alpha}\left(1 - \frac{\eta}{\mathrm{Var}[x]}\,(x - E[x])\right). \]
Any pricing factor $x$ gives us a candidate $a + bx$ for a state-price deflator, but it will only be a true state-price deflator if it is strictly positive. The fact that we can find a pricing factor for a given market does not imply that the market is arbitrage-free.
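The correspondence between a linear state-price deflator and a pricing factor can be checked numerically. The sketch below starts from an assumed deflator $\zeta = a_0 + b_0 x$ in a made-up three-state market, derives $\alpha$ and $\eta$, confirms (4.73) and (4.76) for every asset, and then recovers $a$ and $b$ from the formulas above. All numbers and the helper names `mean` and `cov` are illustrative assumptions.

```python
# Illustrative three-state market; every number below is an assumption.
p = [0.3, 0.4, 0.3]                 # state probabilities
x = [1.0, 2.0, 3.0]                 # candidate pricing factor
a0, b0 = 1.2, -0.1                  # chosen so that zeta = a0 + b0*x > 0
zeta = [a0 + b0 * xw for xw in x]   # state-price deflator

def mean(y):
    """Expectation under the state probabilities p."""
    return sum(pw * yw for pw, yw in zip(p, y))

def cov(y, w):
    """Covariance under the state probabilities p."""
    my, mw = mean(y), mean(w)
    return sum(pw * (yw - my) * (ww - mw) for pw, yw, ww in zip(p, y, w))

dividends = [[1.0, 1.0, 1.0], [0.5, 1.0, 2.0], [2.0, 1.0, 0.5]]
prices = [mean([z * d for z, d in zip(zeta, D)]) for D in dividends]
returns = [[d / P for d in D] for D, P in zip(dividends, prices)]

alpha = 1.0 / mean(zeta)            # zero-beta return
eta = -b0 * alpha * cov(x, x)       # factor risk premium

for D, P, R in zip(dividends, prices, returns):
    beta_R = cov(R, x) / cov(x, x)
    beta_D = cov(D, x) / cov(x, x)
    print(mean(R), alpha + beta_R * eta)        # both sides of (4.73)
    print(P, (mean(D) - beta_D * eta) / alpha)  # both sides of (4.76)

# Recover the deflator coefficients from (alpha, eta).
b = -eta / (alpha * cov(x, x))
a = 1.0 / alpha - b * mean(x)
print(a, b)
```

Because the assumed deflator is strictly positive in every state, this particular market is also arbitrage-free; choosing $b_0$ large enough in absolute value would keep the factor relation intact while breaking positivity.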
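Can the pricing factor be the return on some portfolio? No problem! Suppose $x$ is a pricing factor. Look for a portfolio $\theta$ which will generate the dividend as close as possible to $x$ in the sense that it minimizes $\mathrm{Var}[D^\theta - x]$. Since
\[ \mathrm{Var}[D^\theta - x] = \mathrm{Var}[\theta^\top D - x] = \mathrm{Var}[\theta^\top D] + \mathrm{Var}[x] - 2\,\mathrm{Cov}[\theta^\top D, x] = \theta^\top \mathrm{Var}[D]\,\theta + \mathrm{Var}[x] - 2\,\theta^\top \mathrm{Cov}[D, x], \]
the minimum is obtained for
\[ \theta = (\mathrm{Var}[D])^{-1}\,\mathrm{Cov}[D, x]. \]
This portfolio is called the factor-mimicking portfolio. Using (3.6) and (3.7), the gross return on this portfolio is
\[ R^x = \frac{\theta^\top \mathrm{diag}(P)\,R}{\theta^\top \mathrm{diag}(P)\,\mathbf{1}} = \frac{\mathrm{Cov}[D, x]^\top (\mathrm{Var}[D])^{-1}\,\mathrm{diag}(P)\,R}{\mathrm{Cov}[D, x]^\top (\mathrm{Var}[D])^{-1}\,\mathrm{diag}(P)\,\mathbf{1}} = \frac{\mathrm{Cov}[R, x]^\top (\mathrm{Var}[R])^{-1}\,R}{\mathrm{Cov}[R, x]^\top (\mathrm{Var}[R])^{-1}\,\mathbf{1}}. \tag{4.77} \]
The vector of covariances of the returns on the basic assets and the return on the factor-mimicking portfolio is
\[ \mathrm{Cov}[R, R^x] = \frac{\mathrm{Cov}[R, x]}{\mathrm{Cov}[R, x]^\top (\mathrm{Var}[R])^{-1}\,\mathbf{1}} \]
and therefore the beta of asset $i$ with respect to $R^x$ is
\[ \beta[R_i, R^x] = \frac{\mathrm{Cov}[R_i, R^x]}{\mathrm{Var}[R^x]} = \frac{\mathrm{Cov}[R_i, x]}{\mathrm{Var}[R^x]\,\mathrm{Cov}[R, x]^\top (\mathrm{Var}[R])^{-1}\,\mathbf{1}} = \beta[R_i, x]\,\frac{\mathrm{Var}[x]}{\mathrm{Var}[R^x]\,\mathrm{Cov}[R, x]^\top (\mathrm{Var}[R])^{-1}\,\mathbf{1}}. \]
Consequently, if $x$ is a pricing factor with zero-beta return $\alpha$ and factor risk premium $\eta$, then the corresponding factor-mimicking return $R^x$ is a pricing factor with zero-beta return $\alpha$ and factor risk premium
\[ \hat{\eta} = \eta\,\frac{\mathrm{Var}[R^x]\,\mathrm{Cov}[R, x]^\top (\mathrm{Var}[R])^{-1}\,\mathbf{1}}{\mathrm{Var}[x]}. \]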
In that sense it is not restrictive to look for pricing factors only in the set of returns.
Note that when the factor $x$ itself is a return, then it must satisfy
\[ E[x] = \alpha + \beta[x, x]\,\eta = \alpha + \eta \quad\Rightarrow\quad \eta = E[x] - \alpha \]
so that
\[ E[R_i] = \alpha + \beta[R_i, x]\,(E[x] - \alpha). \tag{4.78} \]
Now it is clear that the standard CAPM simply says that the return on the market portfolio is a pricing factor.
We will discuss factor models in detail in Chapter 9. There we will also allow for multi-dimensional pricing factors.
4.6.3 Mean-variance efficient returns
A portfolio is said to be mean-variance efficient if there is no other portfolio with the same expected return and a lower return variance. The return on a mean-variance efficient portfolio is called a mean-variance efficient return. The mean-variance frontier is the curve in a $(\sigma[R], E[R])$-plane traced out by all the mean-variance efficient returns.
The analysis of mean-variance efficient portfolios was introduced by Markowitz (1952, 1959) as a tool for investors in making portfolio decisions. Nevertheless, mean-variance efficient portfolios are also relevant for asset pricing purposes due to the following theorem:
Theorem 4.6 A return is a pricing factor if and only if it is a mean-variance efficient return different from the minimum-variance return.
Combining this with results from the previous subsection, we can conclude that (almost) any mean-variance efficient return $R^{mv}$ gives rise to a (candidate) state-price deflator of the form $\zeta = a + b R^{mv}$. And the standard CAPM can be reformulated as "the return on the market portfolio is mean-variance efficient."
We will not provide a proof of the theorem here but return to the issue in Chapter 9.
4.7 Concluding remarks
This chapter has introduced state-price deflators as a way of representing the general pricing mechanism of a financial market. Important properties of state-price deflators were discussed. Examples have illustrated the valuation of assets with a given state-price deflator. But what determines the state-price deflator? Intuitively, state prices reflect the value market participants attach to an extra payment in a given state at a given point in time. This must be related to their marginal utility of consumption. To follow this idea, we must consider the optimal consumption choice of individuals. This is the topic of the next two chapters.
4.8 Exercises
EXERCISE 4.1 Imagine a oneperiod economy where the stateprice deator is lognormally
distributed with E[ln] =
and Var[ln] =
2
?
(b) Find the stateprice deator
?
(c) Find a portfolio with dividends 4, 3, and 1 in states 1, 2, and 3, respectively. What is the
price of the portfolio?
EXERCISE 4.3 Give a proof of Theorem 4.3 both for the one-period framework and the continuous-time framework.
EXERCISE 4.4 Show Lemma 4.1.
EXERCISE 4.5 Assume a one-period framework with no redundant assets. Show that
can
be rewritten as in (4.60).
EXERCISE 4.6 Consider a one-period economy where the assets are correctly priced by a state-price deflator $M$. A nutty professor believes that the assets are priced according to a model in which $Y$ is a state-price deflator, where $Y$ is a random variable with $E[Y] = E[M]$. Refer to this model as the Y-model.
(a) Show that the Y-model prices a risk-free asset correctly.
(b) Argue that the expected return on an arbitrary asset $i$ according to the Y-model is given by
\[ E^Y[R_i] \equiv \frac{1}{E[M]} - \frac{1}{E[M]}\,\mathrm{Cov}[Y, R_i]. \]
(c) Show that
\[ \frac{\left| E[R_i] - E^Y[R_i] \right|}{\sigma[R_i]} \le \frac{\sigma[Y - M]}{E[M]} \]
so that the mispricing of the Y-model (in terms of expected returns) is limited.
(d) What can you say about the returns for which the left-hand side in the above inequality will be largest?
(e) Under which condition on $Y$ will the Y-model price all assets correctly?
EXERCISE 4.7 Consider a one-period economy where two basic financial assets are traded without portfolio constraints or transaction costs. There are three equally likely end-of-period states of the economy and the prices and state-contingent dividends of the two assets are given in the following table:

             state-contingent dividend
             state 1   state 2   state 3     price
  Asset 1       1         1         0         0.5
  Asset 2       2         2         2         1.8

The economy is known to be arbitrage-free.
(a) Characterize the set of state-contingent dividends that can be attained by combining the two assets.
(b) Characterize the set of state-price vectors $\psi = (\psi_1, \psi_2, \psi_3)$ consistent with the prices and dividends of the two basic assets.
(c) Find the stateprice vector
t+1
t
_
N(
,
2
)
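for each $t = 0, 1, \dots, T-1$. An asset pays a dividend process $D = (D_t)$ satisfying
\[ \frac{D_{t+1}}{D_t} = a \left( \frac{\zeta_{t+1}}{\zeta_t} \right)^b + \varepsilon_{t+1}, \qquad t = 0, 1, \dots, T-1, \]
where $a$ and $b$ are constants, and $\varepsilon_1, \dots, \varepsilon_T$ are independent with $E_t[\varepsilon_{t+1}] = 0$ and $E_t[\varepsilon_{t+1}\zeta_{t+1}] = 0$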
for all t. What is the price of the asset at any time t (in terms of a, b,
,
2
)?
EXERCISE 4.10 Consider a continuous-time economy in which the state-price deflator follows a geometric Brownian motion:
\[ d\zeta_t = -\zeta_t \left[ r^f\,dt + \lambda\,dz_t \right], \]
where $r^f$ and $\lambda$ are constant.
(a) What is the price $B_t^s$ of a zero-coupon bond maturing at time $s$ with a face value of 1?
Define the continuously compounded yield $y_t^s$ for maturity $s$ via the equation $B_t^s = e^{-y_t^s (s-t)}$.
(b) Compute $y_t^s$. What can you say about the yield curve $s \mapsto y_t^s$?
EXERCISE 4.11 Show that
t
= V
0
/V
t
, where V
and
> 0
and, of course, we have that $p_1 + \dots + p_S = 1$. We take the state probabilities as exogenously given and known to the individuals.
Individuals care about their consumption. It seems reasonable to assume that when an individual
state 1 2 3
state prob. p
\[ \pi_c(z) = \sum_{\omega \in \Omega:\ c_\omega = z} p_\omega, \]
i.e. the sum of the probabilities of those states in which the consumption level equals $z$.
As an example consider an economy with three possible states and four possible state-contingent consumption plans as illustrated in Table 5.1. These four consumption plans may be the product of four different portfolio choices. The set of possible end-of-period consumption levels is $Z = \{1, 2, 3, 4, 5\}$. Each consumption plan generates a probability distribution on the set $Z$. The probability distributions corresponding to these consumption plans are as shown in Table 5.2. We see that although the consumption plans $c^{(3)}$ and $c^{(4)}$ are different they generate identical probability distributions. By assumption individuals will be indifferent between these two consumption plans.
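The mapping from a state-contingent consumption plan to its probability distribution on $Z$ can be sketched in a few lines of Python. The state probabilities and the four plans below are assumptions chosen so that the resulting distributions match Table 5.2; they are not copied from Table 5.1.

```python
# Assumed state probabilities and state-contingent plans (hypothetical values
# picked to reproduce the distributions reported in Table 5.2).
state_probs = [0.2, 0.3, 0.5]
plans = {
    "c1": [3, 2, 4],
    "c2": [3, 1, 5],
    "c3": [4, 4, 1],
    "c4": [1, 1, 4],
}
Z = [1, 2, 3, 4, 5]  # possible end-of-period consumption levels

def distribution(plan, probs, levels):
    """pi_c(z): total probability of the states in which consumption equals z."""
    pi = {z: 0.0 for z in levels}
    for consumption, prob in zip(plan, probs):
        pi[consumption] += prob
    return pi

dists = {name: distribution(plan, state_probs, Z) for name, plan in plans.items()}
for name, pi in dists.items():
    print(name, [round(pi[z], 10) for z in Z])
```

Plans `c3` and `c4` differ state by state but produce the same distribution, so an individual who only cares about the distribution of consumption is indifferent between them.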
Given these assumptions the individual will effectively choose between probability distributions on the set of possible consumption levels $Z$. We assume for simplicity that $Z$ is a finite set, but the results can be generalized to the case of infinite $Z$ at the cost of further mathematical complexity. We denote by $\mathcal{P}(Z)$ the set of all probability distributions on $Z$ that are generated by consumption plans in $C$. A probability distribution $\pi$ on the finite set $Z$ is simply a function $\pi : Z \to [0, 1]$ with the properties that $\sum_{z \in Z} \pi(z) = 1$ and $\pi(A \cup B) = \pi(A) + \pi(B)$ whenever $A \cap B = \emptyset$.
We assume that the preferences of the individual can be represented by a preference relation $\succeq$
  cons. level z                     1      2      3      4      5
  cons. plan 1, $\pi_{c^{(1)}}$     0     0.3    0.2    0.5     0
  cons. plan 2, $\pi_{c^{(2)}}$    0.3     0     0.2     0     0.5
  cons. plan 3, $\pi_{c^{(3)}}$    0.5     0      0     0.5     0
  cons. plan 4, $\pi_{c^{(4)}}$    0.5     0      0     0.5     0

Table 5.2: The probability distributions corresponding to the state-contingent consumption plans shown in Table 5.1.
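on $\mathcal{P}(Z)$, which is a binary relation satisfying the following two conditions:
(i) if $\pi_1 \succeq \pi_2$ and $\pi_2 \succeq \pi_3$, then $\pi_1 \succeq \pi_3$ [transitivity]
(ii) $\forall \pi_1, \pi_2 \in \mathcal{P}(Z)$: either $\pi_1 \succeq \pi_2$ or $\pi_2 \succeq \pi_1$ [completeness]
Here, $\pi_1 \succeq \pi_2$ is to be read as "$\pi_1$ is preferred to $\pi_2$". We write $\pi_1 \not\succeq \pi_2$ if $\pi_1$ is not preferred to $\pi_2$. If both $\pi_1 \succeq \pi_2$ and $\pi_2 \succeq \pi_1$, we write $\pi_1 \sim \pi_2$ and say that the individual is indifferent between $\pi_1$ and $\pi_2$. If $\pi_1 \succeq \pi_2$, but $\pi_2 \not\succeq \pi_1$, we say that $\pi_1$ is strictly preferred to $\pi_2$ and write $\pi_1 \succ \pi_2$.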
Note that if $\pi_1, \pi_2 \in \mathcal{P}(Z)$ and $\alpha \in [0, 1]$, then $\alpha\pi_1 + (1-\alpha)\pi_2 \in \mathcal{P}(Z)$. The mixed distribution $\alpha\pi_1 + (1-\alpha)\pi_2$ assigns the probability $(\alpha\pi_1 + (1-\alpha)\pi_2)(z) = \alpha\pi_1(z) + (1-\alpha)\pi_2(z)$ to the consumption level $z$. We can think of the mixed distribution $\alpha\pi_1 + (1-\alpha)\pi_2$ as the outcome of a two-stage "gamble". The first stage is to flip a coin which with probability $\alpha$ shows heads and with probability $1-\alpha$ shows tails. If heads comes out, the second stage is the "consumption gamble" corresponding to the probability distribution $\pi_1$. If tails is the outcome of the first stage, the second stage is the consumption gamble corresponding to $\pi_2$. When we assume that preferences are represented by a preference relation on the set $\mathcal{P}(Z)$ of probability distributions, we have implicitly assumed that the individual evaluates the two-stage gamble (or any multi-stage gamble) by the combined probability distribution, i.e. the ultimate consequences of the gamble. This is sometimes referred to as consequentialism.
Let $z$ be some element of $Z$, i.e. some possible consumption level. By $1_z$ we will denote the probability distribution that assigns a probability of one to $z$ and a zero probability to all other elements in $Z$. Since we have assumed that the set $Z$ of possible consumption levels only has a finite number of elements, it must have a maximum element, say $z^u$, and a minimum element, say $z^l$. Since the elements represent consumption levels, it is certainly natural that individuals prefer higher elements to lower ones. We will therefore assume that the probability distribution $1_{z^u}$ is preferred to any other probability distribution. Conversely, any probability distribution is preferred to the probability distribution $1_{z^l}$. We assume that $1_{z^u}$ is strictly preferred to $1_{z^l}$ so that the individual is not indifferent between all probability distributions. For any $\pi \in \mathcal{P}(Z)$ we thus have that
\[ 1_{z^u} \succ \pi \succ 1_{z^l} \quad\text{or}\quad 1_{z^u} \sim \pi \succ 1_{z^l} \quad\text{or}\quad 1_{z^u} \succ \pi \sim 1_{z^l}. \]
5.3 Utility indices
A utility index for a given preference relation $\succeq$ is a function $U : \mathcal{P}(Z) \to \mathbb{R}$ that to each probability distribution over consumption levels attaches a real-valued number such that
\[ \pi_1 \succeq \pi_2 \quad\Leftrightarrow\quad U(\pi_1) \ge U(\pi_2). \]
Note that a utility index is only unique up to a strictly increasing transformation. If $U$ is a utility index and $f : \mathbb{R} \to \mathbb{R}$ is any strictly increasing function, then the composite function $V = f \circ U$, defined by $V(\pi) = f(U(\pi))$, is also a utility index for the same preference relation.
We will show below that a utility index exists under the following two axiomatic assumptions on the preference relation $\succeq$:
Axiom 5.1 (Monotonicity) Suppose that $\pi_1, \pi_2 \in \mathcal{P}(Z)$ with $\pi_1 \succ \pi_2$ and let $a, b \in [0, 1]$. The preference relation $\succeq$ has the property that
\[ a > b \quad\Leftrightarrow\quad a\pi_1 + (1-a)\pi_2 \succ b\pi_1 + (1-b)\pi_2. \]
This is certainly a very natural assumption on preferences. If you consider a weighted average of two probability distributions, you will prefer a high weight on the better of the two distributions.
Axiom 5.2 (Archimedean) The preference relation $\succeq$ has the property that for any three probability distributions $\pi_1, \pi_2, \pi_3 \in \mathcal{P}(Z)$ with $\pi_1 \succ \pi_2 \succ \pi_3$, numbers $a, b \in (0, 1)$ exist such that
\[ a\pi_1 + (1-a)\pi_3 \succ \pi_2 \succ b\pi_1 + (1-b)\pi_3. \]
The axiom basically says that no matter how good a probability distribution $\pi_1$ is, for any $\pi_2 \succ \pi_3$ we can find some mixed distribution of $\pi_1$ and $\pi_3$ to which $\pi_2$ is preferred. We just have to put a sufficiently low weight on $\pi_1$ in the mixed distribution. Similarly, no matter how bad a probability distribution $\pi_3$ is, for any $\pi_1 \succ \pi_2$ we can find some mixed distribution of $\pi_1$ and $\pi_3$ that is preferred to $\pi_2$. We just have to put a sufficiently low weight on $\pi_3$ in the mixed distribution.
We shall say that a preference relation has the continuity property if for any three probability distributions $\pi_1, \pi_2, \pi_3 \in \mathcal{P}(Z)$ with $\pi_1 \succ \pi_2 \succ \pi_3$, a unique number $\alpha \in (0, 1)$ exists such that
\[ \pi_2 \sim \alpha\pi_1 + (1-\alpha)\pi_3. \]
We can easily extend this to the case where either $\pi_1 \sim \pi_2$ or $\pi_2 \sim \pi_3$. For $\pi_1 \sim \pi_2 \succ \pi_3$, $\pi_2 \sim 1\pi_1 + (1-1)\pi_3$ corresponding to $\alpha = 1$. For $\pi_1 \succ \pi_2 \sim \pi_3$, $\pi_2 \sim 0\pi_1 + (1-0)\pi_3$ corresponding to $\alpha = 0$. In words the continuity property means that for any three probability distributions there is a unique combination of the best and the worst distribution so that the individual is indifferent between the third "middle" distribution and this combination of the other two. This appears to be closely related to the Archimedean Axiom and, in fact, the next lemma shows that the Monotonicity Axiom and the Archimedean Axiom imply continuity of preferences.
Lemma 5.1 Let $\succeq$ be a preference relation satisfying the Monotonicity Axiom and the Archimedean Axiom. Then it has the continuity property.
Proof: Given $\pi_1 \succ \pi_2 \succ \pi_3$. Define the number $\alpha$ by
\[ \alpha = \sup\{ k \in [0, 1] \mid \pi_2 \succ k\pi_1 + (1-k)\pi_3 \}. \]
By the Monotonicity Axiom we have that $\pi_2 \succ k\pi_1 + (1-k)\pi_3$ for all $k < \alpha$ and that $k\pi_1 + (1-k)\pi_3 \succeq \pi_2$ for all $k > \alpha$. We want to show that $\pi_2 \sim \alpha\pi_1 + (1-\alpha)\pi_3$. Note that by the Archimedean Axiom, there is some $k > 0$ such that $\pi_2 \succ k\pi_1 + (1-k)\pi_3$ and some $k < 1$ such that $k\pi_1 + (1-k)\pi_3 \succ \pi_2$. Consequently, $\alpha$ is in the open interval $(0, 1)$.
Suppose that $\pi_2 \succ \alpha\pi_1 + (1-\alpha)\pi_3$. Then according to the Archimedean Axiom we can find a number $b \in (0, 1)$ such that $\pi_2 \succ b\pi_1 + (1-b)\{\alpha\pi_1 + (1-\alpha)\pi_3\}$. The mixed distribution on the right-hand side has a total weight of $k = b + (1-b)\alpha = \alpha + (1-\alpha)b > \alpha$ on $\pi_1$. Hence we have found some $k > \alpha$ for which $\pi_2 \succ k\pi_1 + (1-k)\pi_3$. This contradicts the definition of $\alpha$. Consequently, we must have that $\pi_2 \not\succ \alpha\pi_1 + (1-\alpha)\pi_3$.
Now suppose that $\alpha\pi_1 + (1-\alpha)\pi_3 \succ \pi_2$. Then we know from the Archimedean Axiom that a number $a \in (0, 1)$ exists such that $a\{\alpha\pi_1 + (1-\alpha)\pi_3\} + (1-a)\pi_3 \succ \pi_2$. The mixed distribution on the left-hand side has a total weight of $a\alpha < \alpha$ on $\pi_1$. Hence we have found some $k < \alpha$ for which $k\pi_1 + (1-k)\pi_3 \succ \pi_2$. This contradicts the definition of $\alpha$. We can therefore also conclude that $\alpha\pi_1 + (1-\alpha)\pi_3 \not\succ \pi_2$. In sum, we have $\pi_2 \sim \alpha\pi_1 + (1-\alpha)\pi_3$. $\Box$
The next result states that a preference relation which satisfies the Monotonicity Axiom and has the continuity property can always be represented by a utility index. In particular this is true when $\succeq$ satisfies the Monotonicity Axiom and the Archimedean Axiom.
Theorem 5.1 Let $\succeq$ be a preference relation which satisfies the Monotonicity Axiom and has the continuity property. Then it can be represented by a utility index $U$, i.e. a function $U : \mathcal{P}(Z) \to \mathbb{R}$ with the property that
\[ \pi_1 \succeq \pi_2 \quad\Leftrightarrow\quad U(\pi_1) \ge U(\pi_2). \]
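Proof: Recall that we have assumed a best probability distribution $1_{z^u}$ and a worst probability distribution $1_{z^l}$ in the sense that
\[ 1_{z^u} \succ \pi \succ 1_{z^l} \quad\text{or}\quad 1_{z^u} \sim \pi \succ 1_{z^l} \quad\text{or}\quad 1_{z^u} \succ \pi \sim 1_{z^l} \]
for any $\pi \in \mathcal{P}(Z)$. For any $\pi \in \mathcal{P}(Z)$ we know from the continuity property that a unique number $\alpha_\pi \in [0, 1]$ exists such that
\[ \pi \sim \alpha_\pi 1_{z^u} + (1 - \alpha_\pi) 1_{z^l}. \]
If $1_{z^u} \sim \pi \succ 1_{z^l}$, $\alpha_\pi = 1$. If $1_{z^u} \succ \pi \sim 1_{z^l}$, $\alpha_\pi = 0$. If $1_{z^u} \succ \pi \succ 1_{z^l}$, $\alpha_\pi \in (0, 1)$.
We define the function $U : \mathcal{P}(Z) \to \mathbb{R}$ by $U(\pi) = \alpha_\pi$.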
\[ \pi_1 \succeq \pi_2 \quad\Leftrightarrow\quad \sum_{z \in Z} \pi_1(z) u(z) \ge \sum_{z \in Z} \pi_2(z) u(z). \tag{5.1} \]
Here $\sum_{z \in Z} \pi(z) u(z)$ is the expected utility of end-of-period consumption given the consumption probability distribution $\pi$. The function $u$ is called a von Neumann-Morgenstern utility function or simply a utility function. Note that $u$ is defined on the set $Z$ of consumption levels, which in general has a simpler structure than the set of probability distributions on $Z$. Given a utility function $u$, we can obviously define a utility index by $U(\pi) = \sum_{z \in Z} \pi(z) u(z)$.
5.4.1 Conditions for expected utility
When can we use an expected utility representation of a preference relation? The next lemma is a first step.
Lemma 5.2 A preference relation $\succeq$ has an expected utility representation if and only if it can be represented by a linear utility index $U$ in the sense that
\[ U(a\pi_1 + (1-a)\pi_2) = aU(\pi_1) + (1-a)U(\pi_2) \]
for any $\pi_1, \pi_2 \in \mathcal{P}(Z)$ and any $a \in [0, 1]$.
Proof: Suppose that $\succeq$ has an expected utility representation with utility function $u$. Define $U : \mathcal{P}(Z) \to \mathbb{R}$ by $U(\pi) = \sum_{z \in Z} \pi(z) u(z)$. Then clearly $U$ is a utility index representing $\succeq$ and $U$ is linear since
\begin{align*}
U(a\pi_1 + (1-a)\pi_2) &= \sum_{z \in Z} \left( a\pi_1(z) + (1-a)\pi_2(z) \right) u(z) \\
&= a \sum_{z \in Z} \pi_1(z) u(z) + (1-a) \sum_{z \in Z} \pi_2(z) u(z) \\
&= aU(\pi_1) + (1-a)U(\pi_2).
\end{align*}
Conversely, suppose that $U$ is a linear utility index representing $\succeq$. Define a function $u : Z \to \mathbb{R}$ by $u(z) = U(1_z)$. For any $\pi \in \mathcal{P}(Z)$ we have
\[ \pi \sim \sum_{z \in Z} \pi(z) 1_z. \]
Therefore,
\[ U(\pi) = U\left( \sum_{z \in Z} \pi(z) 1_z \right) = \sum_{z \in Z} \pi(z) U(1_z) = \sum_{z \in Z} \pi(z) u(z). \]
  z            1      2      3      4
  $\pi_1$      0     0.2    0.6    0.2
  $\pi_2$      0     0.4    0.2    0.4
  $\pi_3$      1      0      0      0
  $\pi_4$     0.5    0.1    0.3    0.1
  $\pi_5$     0.5    0.2    0.1    0.2

Table 5.3: The probability distributions used in the illustration of the Substitution Axiom.
Since $U$ is a utility index, we have $\pi_1 \succeq \pi_2 \Leftrightarrow U(\pi_1) \ge U(\pi_2)$, which the computation above shows is equivalent to $\sum_{z \in Z} \pi_1(z) u(z) \ge \sum_{z \in Z} \pi_2(z) u(z)$. Consequently, $u$ gives an expected utility representation of $\succeq$. $\Box$
The question then is under what assumptions the preference relation $\succeq$ can be represented by a linear utility index. As shown by von Neumann and Morgenstern (1944) we need an additional axiom, the so-called Substitution Axiom.
Axiom 5.3 (Substitution) For all $\pi_1, \pi_2, \pi_3 \in \mathcal{P}(Z)$ and all $a \in (0, 1]$, we have
\[ \pi_1 \succ \pi_2 \quad\Leftrightarrow\quad a\pi_1 + (1-a)\pi_3 \succ a\pi_2 + (1-a)\pi_3 \]
and
\[ \pi_1 \sim \pi_2 \quad\Leftrightarrow\quad a\pi_1 + (1-a)\pi_3 \sim a\pi_2 + (1-a)\pi_3. \]
The Substitution Axiom is sometimes called the Independence Axiom or the Axiom of the Irrelevance of the Common Alternative. Basically, it says that when the individual is to compare two probability distributions, she need only consider the parts of the two distributions which are different from each other. As an example, suppose the possible consumption levels are $Z = \{1, 2, 3, 4\}$ and consider the probability distributions on $Z$ given in Table 5.3. Suppose you want to compare the distributions $\pi_4$ and $\pi_5$. They only differ in the probabilities they associate with consumption levels 2, 3, and 4 so it should only be necessary to focus on these parts. More formally observe that
\[ \pi_4 \sim 0.5\pi_1 + 0.5\pi_3 \quad\text{and}\quad \pi_5 \sim 0.5\pi_2 + 0.5\pi_3. \]
$\pi_1$ is the conditional distribution of $\pi_4$ given that the consumption level is different from 1 and $\pi_2$ is the conditional distribution of $\pi_5$ given that the consumption level is different from 1. The Substitution Axiom then says that
\[ \pi_4 \succ \pi_5 \quad\Leftrightarrow\quad \pi_1 \succ \pi_2. \]
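The decomposition used in this example is easy to verify numerically. The sketch below rebuilds $\pi_4$ and $\pi_5$ from Table 5.3 as 50/50 mixtures and, for one assumed expected-utility maximizer with $u(z) = \ln z$ (an illustrative choice, not part of the text), confirms that the ranking of $\pi_4$ against $\pi_5$ coincides with the ranking of $\pi_1$ against $\pi_2$.

```python
import math

Z = [1, 2, 3, 4]
# Probability distributions from Table 5.3, listed in the order of Z.
pi1 = [0.0, 0.2, 0.6, 0.2]
pi2 = [0.0, 0.4, 0.2, 0.4]
pi3 = [1.0, 0.0, 0.0, 0.0]
pi4 = [0.5, 0.1, 0.3, 0.1]
pi5 = [0.5, 0.2, 0.1, 0.2]

def mix(a, first, second):
    """Mixed distribution a*first + (1 - a)*second."""
    return [a * f + (1 - a) * s for f, s in zip(first, second)]

def expected_utility(pi, u):
    return sum(prob * u(z) for prob, z in zip(pi, Z))

u = math.log  # assumed utility function, for illustration only

mix4 = mix(0.5, pi1, pi3)   # should reproduce pi4
mix5 = mix(0.5, pi2, pi3)   # should reproduce pi5

prefers_4_over_5 = expected_utility(pi4, u) > expected_utility(pi5, u)
prefers_1_over_2 = expected_utility(pi1, u) > expected_utility(pi2, u)
print(mix4, mix5, prefers_4_over_5, prefers_1_over_2)
```

Any other utility function would do equally well here: under expected utility the common component $\pi_3$ contributes the same amount to both sides and drops out of the comparison.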
The next lemma shows that the Substitution Axiom is more restrictive than the Monotonicity Axiom.
Lemma 5.3 If a preference relation $\succeq$ satisfies the Substitution Axiom, it will also satisfy the Monotonicity Axiom.
Proof: Given $\pi_1, \pi_2 \in \mathcal{P}(Z)$ with $\pi_1 \succ \pi_2$ and numbers $a, b \in [0, 1]$. We have to show that
\[ a > b \quad\Leftrightarrow\quad a\pi_1 + (1-a)\pi_2 \succ b\pi_1 + (1-b)\pi_2. \]
Note that if $a = 0$, we cannot have $a > b$, and if $a\pi_1 + (1-a)\pi_2 \succ b\pi_1 + (1-b)\pi_2$ we cannot have $a = 0$. We can therefore safely assume that $a > 0$.
First assume that $a > b$. Observe that it follows from the Substitution Axiom that
\[ a\pi_1 + (1-a)\pi_2 \succ a\pi_2 + (1-a)\pi_2 \]
and hence that $a\pi_1 + (1-a)\pi_2 \succ \pi_2$. Also from the Substitution Axiom we have that for any $\pi_3 \succ \pi_2$,
\[ \pi_3 \sim \left(1 - \frac{b}{a}\right)\pi_3 + \frac{b}{a}\pi_3 \succ \left(1 - \frac{b}{a}\right)\pi_2 + \frac{b}{a}\pi_3. \]
Due to our observation above, we can use this with $\pi_3 = a\pi_1 + (1-a)\pi_2$. Then we get
\[ a\pi_1 + (1-a)\pi_2 \succ \frac{b}{a}\{a\pi_1 + (1-a)\pi_2\} + \left(1 - \frac{b}{a}\right)\pi_2 \sim b\pi_1 + (1-b)\pi_2, \]
as was to be shown.
Conversely, assuming that
\[ a\pi_1 + (1-a)\pi_2 \succ b\pi_1 + (1-b)\pi_2, \]
we must argue that $a > b$. The above relation cannot be true if $a = b$ since the two combined distributions are then identical. If $b$ was greater than $a$, we could follow the steps above with $a$ and $b$ swapped and end up concluding that $b\pi_1 + (1-b)\pi_2 \succ a\pi_1 + (1-a)\pi_2$, which would contradict our assumption. Hence, we can have neither $a = b$ nor $a < b$ but must have $a > b$. $\Box$
Next we state the main result:
Theorem 5.2 Assume that $Z$ is finite and that $\succeq$ is a preference relation on $\mathcal{P}(Z)$. Then $\succeq$ can be represented by a linear utility index if and only if $\succeq$ satisfies the Archimedean Axiom and the Substitution Axiom.
Proof: First suppose the preference relation $\succeq$ satisfies the Archimedean Axiom and the Substitution Axiom. Define a utility index $U : \mathcal{P}(Z) \to \mathbb{R}$ exactly as in the proof of Theorem 5.1, i.e. $U(\pi) = \alpha_\pi$, where
\[ \pi \sim \alpha_\pi 1_{z^u} + (1 - \alpha_\pi) 1_{z^l}. \]
We want to show that, as a consequence of the Substitution Axiom, $U$ is indeed linear. For that purpose, pick any two probability distributions $\pi_1, \pi_2 \in \mathcal{P}(Z)$ and any number $a \in [0, 1]$. We want to show that $U(a\pi_1 + (1-a)\pi_2) = aU(\pi_1) + (1-a)U(\pi_2)$. We can do that by showing that
\[ a\pi_1 + (1-a)\pi_2 \sim \left( aU(\pi_1) + (1-a)U(\pi_2) \right) 1_{z^u} + \left( 1 - \{ aU(\pi_1) + (1-a)U(\pi_2) \} \right) 1_{z^l}. \]
This follows from the Substitution Axiom:
\begin{align*}
a\pi_1 + (1-a)\pi_2 &\sim a\left\{ U(\pi_1) 1_{z^u} + (1 - U(\pi_1)) 1_{z^l} \right\} + (1-a)\left\{ U(\pi_2) 1_{z^u} + (1 - U(\pi_2)) 1_{z^l} \right\} \\
&\sim \left( aU(\pi_1) + (1-a)U(\pi_2) \right) 1_{z^u} + \left( 1 - \{ aU(\pi_1) + (1-a)U(\pi_2) \} \right) 1_{z^l}.
\end{align*}
Now let us show the converse, i.e. if $\succeq$ can be represented by a linear utility index $U$, then it must satisfy the Archimedean Axiom and the Substitution Axiom. In order to show the Archimedean Axiom, we pick $\pi_1 \succ \pi_2 \succ \pi_3$, which means that $U(\pi_1) > U(\pi_2) > U(\pi_3)$, and must find numbers $a, b \in (0, 1)$ such that
\[ a\pi_1 + (1-a)\pi_3 \succ \pi_2 \succ b\pi_1 + (1-b)\pi_3, \]
i.e. that
\[ U(a\pi_1 + (1-a)\pi_3) > U(\pi_2) > U(b\pi_1 + (1-b)\pi_3). \]
Define the number $a$ by
\[ a = 1 - \frac{1}{2}\,\frac{U(\pi_1) - U(\pi_2)}{U(\pi_1) - U(\pi_3)}. \]
Then $a \in (0, 1)$ and by linearity of $U$ we get
\begin{align*}
U(a\pi_1 + (1-a)\pi_3) &= aU(\pi_1) + (1-a)U(\pi_3) \\
&= U(\pi_1) + (1-a)\left( U(\pi_3) - U(\pi_1) \right) \\
&= U(\pi_1) - \frac{1}{2}\left( U(\pi_1) - U(\pi_2) \right) \\
&= \frac{1}{2}\left( U(\pi_1) + U(\pi_2) \right) \\
&> U(\pi_2).
\end{align*}
Similarly for $b$.
In order to show the Substitution Axiom, we take $\pi_1, \pi_2, \pi_3 \in \mathcal{P}(Z)$ and any number $a \in (0, 1]$. We must show that $\pi_1 \succ \pi_2$ if and only if $a\pi_1 + (1-a)\pi_3 \succ a\pi_2 + (1-a)\pi_3$, i.e.
\[ U(\pi_1) > U(\pi_2) \quad\Leftrightarrow\quad U(a\pi_1 + (1-a)\pi_3) > U(a\pi_2 + (1-a)\pi_3). \]
This follows immediately by linearity of $U$:
\begin{align*}
U(a\pi_1 + (1-a)\pi_3) &= aU(\pi_1) + (1-a)U(\pi_3) \\
&> aU(\pi_2) + (1-a)U(\pi_3) \\
&= U(a\pi_2 + (1-a)\pi_3)
\end{align*}
with the inequality holding if and only if $U(\pi_1) > U(\pi_2)$. Similarly, we can show that $\pi_1 \sim \pi_2$ if and only if $a\pi_1 + (1-a)\pi_3 \sim a\pi_2 + (1-a)\pi_3$. $\Box$
The next theorem shows which utility functions represent the same preference relation. The proof is left for the reader as Exercise 5.1.
Theorem 5.3 A utility function for a given preference relation is only determined up to a strictly increasing affine transformation, i.e. if $u$ is a utility function for $\succeq$, then $v$ will be so if and only if there exist constants $a > 0$ and $b$ such that $v(z) = au(z) + b$ for all $z \in Z$.
If one utility function is an affine function of another, we will say that they are equivalent. Note that an easy consequence of this theorem is that it does not really matter whether the utility is positive or negative. At first, you might find negative utility strange but we can always add a sufficiently large positive constant without affecting the ranking of different consumption plans.
Suppose $U$ is a utility index with an associated utility function $u$. If $f$ is any strictly increasing transformation, then $V = f \circ U$ is also a utility index for the same preferences, but $f \circ u$ is only the utility function for $V$ if $f$ is affine.
The expected utility associated with a probability distribution $\pi$ on $Z$ is $\sum_{z \in Z} \pi(z) u(z)$. Recall that the probability distributions we consider correspond to consumption plans. Given a consumption plan, i.e. a random variable $c$, the associated probability distribution is defined by the probabilities
\[ \pi(z) = \mathbb{P}\left( \{ \omega \in \Omega \mid c(\omega) = z \} \right) = \sum_{\omega \in \Omega:\ c(\omega) = z} p_\omega. \]
The expected utility associated with the consumption plan $c$ is therefore
\[ E[u(c)] = \sum_{\omega \in \Omega} p_\omega u(c(\omega)) = \sum_{z \in Z} \sum_{\omega \in \Omega:\ c(\omega) = z} p_\omega u(z) = \sum_{z \in Z} \pi(z) u(z). \]
Of course, if $c$ is a risk-free consumption plan in the sense that a $z$ exists such that $c(\omega) = z$ for all $\omega$, then the expected utility is $E[u(c)] = u(z)$. With a slight abuse of notation we will just write this as $u(c)$.
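The two ways of computing expected utility, state by state or via the induced distribution on $Z$, can be compared directly. In this sketch the state probabilities, the consumption plan and the square-root utility function are assumptions made purely for illustration.

```python
import math

# Assumed inputs (illustrative only).
state_probs = [0.2, 0.3, 0.5]   # p_omega
plan = [4, 4, 1]                # c(omega), consumption in each state
u = math.sqrt                   # assumed utility function

# Expected utility computed state by state: sum over omega of p_omega * u(c(omega)).
eu_states = sum(p * u(c) for p, c in zip(state_probs, plan))

# Induced distribution pi(z): add up the probabilities of states with c(omega) = z.
pi = {}
for p, c in zip(state_probs, plan):
    pi[c] = pi.get(c, 0.0) + p

# Expected utility computed from the distribution: sum over z of pi(z) * u(z).
eu_distribution = sum(prob * u(z) for z, prob in pi.items())

print(eu_states, eu_distribution)
```

Grouping states by consumption level only reorders the terms of the sum, which is why the two numbers agree.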
5.4.2 Some technical issues
Infinite $Z$. What if $Z$ is infinite, e.g. $Z = \mathbb{R}_+ \equiv [0, \infty)$? It can be shown that in this case a preference relation has an expected utility representation if the Archimedean Axiom, the Substitution Axiom, an additional axiom ("the sure thing principle"), and "some technical conditions" are satisfied. Fishburn (1970) gives the details.
Expected utility in this case: $E[u(c)] = \int_Z u(z)\pi(z)\,dz$, where $\pi$ is a probability density function derived from the consumption plan $c$.
Boundedness of expected utility. Suppose $u$ is unbounded from above and $\mathbb{R}_+ \subseteq Z$. Then there exists $(z_n)_{n=1}^\infty \subseteq Z$ with $z_n \to \infty$ and $u(z_n) \ge 2^n$. Expected utility of consumption plan $\pi_1$ with $\pi_1(z_n) = 1/2^n$:
\[ \sum_{n=1}^\infty u(z_n)\pi_1(z_n) \ge \sum_{n=1}^\infty 2^n \frac{1}{2^n} = \infty. \]
If $\pi_2, \pi_3$ are such that $\pi_1 \succ \pi_2 \succ \pi_3$, then the expected utility of $\pi_2$ and $\pi_3$ must be finite. But for no $b \in (0, 1)$ do we have
\[ \pi_2 \succ b\pi_1 + (1-b)\pi_3 \quad [\text{expected utility} = \infty]. \]
• no problem if $Z$ is finite
• no problem if $\mathbb{R}_+ \subseteq Z$, $u$ is concave, and consumption plans have finite expectations:
$u$ concave $\Rightarrow$ $u$ is differentiable in some point $b$ and
\[ u(z) \le u(b) + u'(b)(z - b), \qquad \forall z \in Z. \]
  z            0       1       5
  $\pi_1$      0       1       0
  $\pi_2$     0.01    0.89    0.1
  $\pi_3$     0.9      0      0.1
  $\pi_4$     0.89    0.11     0

Table 5.4: The probability distributions used in the illustration of the Allais Paradox.
If the consumption plan $c$ has finite expectations, then
E[u(c)] E[u(b) +u
~ 0.11
_
1
11
($0) +
10
11
($5)
_
+ 0.89 ($0) 0.9($0) + 0.1($5)
. .
0.
The certainty equivalent of the random consumption plan $c$ is defined as the $c^* \in Z$ such that
\[ u(c^*) = E[u(c)], \]
i.e. the individual is just as satisfied getting the consumption level $c^*$
so that
\[ E[u(c)] = u(c^*) = u(E[c] - \lambda(c)). \]
The risk premium is the consumption the individual is willing to give up in order to eliminate the uncertainty.
The degree of risk aversion is associated with u
(c)
u
(c)
. (5.2)
The Relative Risk Aversion is given by
\[ \mathrm{RRA}(c) = -\frac{c\,u''(c)}{u'(c)} = c\,\mathrm{ARA}(c). \tag{5.3} \]
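We can link the Arrow-Pratt measures to the risk premium in the following way. Let $\bar{c} \in Z$ denote some fixed consumption level and let $\varepsilon$ be a fair gamble. The resulting consumption plan is then $c = \bar{c} + \varepsilon$. Denote the corresponding risk premium by $\lambda(\bar{c}, \varepsilon)$ so that
\[ E[u(\bar{c} + \varepsilon)] = u(c^*) = u\left( \bar{c} - \lambda(\bar{c}, \varepsilon) \right). \tag{5.4} \]
We can approximate the left-hand side of (5.4) by
\[ E[u(\bar{c} + \varepsilon)] \approx E\left[ u(\bar{c}) + \varepsilon u'(\bar{c}) + \frac{1}{2}\varepsilon^2 u''(\bar{c}) \right] = u(\bar{c}) + \frac{1}{2}\mathrm{Var}[\varepsilon]\,u''(\bar{c}), \]
using $E[\varepsilon] = 0$ and $\mathrm{Var}[\varepsilon] = E[\varepsilon^2] - E[\varepsilon]^2 = E[\varepsilon^2]$, and we can approximate the right-hand side of (5.4) by
\[ u\left( \bar{c} - \lambda(\bar{c}, \varepsilon) \right) \approx u(\bar{c}) - \lambda(\bar{c}, \varepsilon)\,u'(\bar{c}). \]
Hence we can write the risk premium as
\[ \lambda(\bar{c}, \varepsilon) \approx -\frac{1}{2}\mathrm{Var}[\varepsilon]\,\frac{u''(\bar{c})}{u'(\bar{c})} = \frac{1}{2}\mathrm{Var}[\varepsilon]\,\mathrm{ARA}(\bar{c}). \]
Of course, the approximation is more accurate for "small" gambles. Thus the risk premium for a small fair gamble around $\bar{c}$ is roughly proportional to the absolute risk aversion at $\bar{c}$. We see that the absolute risk aversion $\mathrm{ARA}(\bar{c})$ is constant if and only if $\lambda(\bar{c}, \varepsilon)$ is independent of $\bar{c}$.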
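To see how close the second-order approximation is, the following sketch computes the exact risk premium of a small fair gamble and compares it with $\frac{1}{2}\mathrm{Var}[\varepsilon]\,\mathrm{ARA}(\bar{c})$. Logarithmic utility, the consumption level of 100 and the gamble of plus or minus 10 are assumptions chosen for illustration; for $u(c) = \ln c$ the absolute risk aversion is $1/c$.

```python
import math

def risk_premium(u, u_inverse, c_bar, outcomes, probs):
    """Exact risk premium: c_bar minus the certainty equivalent of c_bar + eps."""
    expected_utility = sum(p * u(c_bar + e) for e, p in zip(outcomes, probs))
    certainty_equivalent = u_inverse(expected_utility)
    return c_bar - certainty_equivalent

# Assumed setup: log utility and a fair +/-10 gamble around c_bar = 100.
c_bar = 100.0
outcomes = [-10.0, 10.0]
probs = [0.5, 0.5]

variance = sum(p * e ** 2 for e, p in zip(outcomes, probs))  # E[eps] = 0
exact = risk_premium(math.log, math.exp, c_bar, outcomes, probs)
approx = 0.5 * variance * (1.0 / c_bar)   # ARA(c) = 1/c for u(c) = ln c

print(exact, approx)
```

Scaling the gamble up makes the gap between `exact` and `approx` grow, in line with the remark that the approximation is meant for small gambles.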
Loosely speaking, the absolute risk aversion $\mathrm{ARA}(c)$ measures the aversion to a fair gamble of a given dollar amount around $c$, such as a gamble where there is an equal probability of winning or losing 1000 dollars. Since we expect that a wealthy investor will be less averse to that gamble than a poor investor, the absolute risk aversion is expected to be a decreasing function of wealth. Note that
\[ \mathrm{ARA}'(c) = -\frac{u'''(c)u'(c) - u''(c)^2}{u'(c)^2} = \left( \frac{u''(c)}{u'(c)} \right)^2 - \frac{u'''(c)}{u'(c)}, \]
so that $\mathrm{ARA}'(c) < 0 \Rightarrow u'''(c) > 0$, that is, a positive third-order derivative of $u$ is necessary for the utility function $u$ to exhibit decreasing absolute risk aversion.
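Now consider a "multiplicative" fair gamble around $\bar{c}$ in the sense that the resulting consumption plan is $c = \bar{c}(1 + \varepsilon) = \bar{c} + \bar{c}\varepsilon$, where $E[\varepsilon] = 0$. The risk premium is then
\[ \lambda(\bar{c}, \bar{c}\varepsilon) \approx \frac{1}{2}\mathrm{Var}[\bar{c}\varepsilon]\,\mathrm{ARA}(\bar{c}) = \frac{1}{2}\bar{c}^2\,\mathrm{Var}[\varepsilon]\,\mathrm{ARA}(\bar{c}) = \frac{1}{2}\bar{c}\,\mathrm{Var}[\varepsilon]\,\mathrm{RRA}(\bar{c}) \]
implying that
\[ \frac{\lambda(\bar{c}, \bar{c}\varepsilon)}{\bar{c}} \approx \frac{1}{2}\mathrm{Var}[\varepsilon]\,\mathrm{RRA}(\bar{c}). \tag{5.5} \]
The fraction of consumption you require to engage in the multiplicative risk is thus (roughly) proportional to the relative risk aversion at $\bar{c}$. Note that utility functions with constant or decreasing (or even modestly increasing) relative risk aversion will display decreasing absolute risk aversion.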
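Relation (5.5) can be illustrated with a power utility function, for which the relative risk aversion is a constant. The sketch below assumes $u(c) = c^{1-\gamma}/(1-\gamma)$ with $\gamma = 3$ and a fair multiplicative gamble of plus or minus 10 percent; both choices, and the helper names, are illustrative assumptions.

```python
gamma = 3.0  # assumed constant relative risk aversion

def u(c):
    """Power utility c^(1-gamma)/(1-gamma), assumed for illustration."""
    return c ** (1.0 - gamma) / (1.0 - gamma)

def u_inverse(y):
    """Inverse of the power utility function above."""
    return ((1.0 - gamma) * y) ** (1.0 / (1.0 - gamma))

shocks = [-0.1, 0.1]   # multiplicative fair gamble eps, E[eps] = 0
probs = [0.5, 0.5]

def relative_risk_premium(c_bar):
    """Exact risk premium as a fraction of c_bar for the plan c_bar * (1 + eps)."""
    expected_utility = sum(p * u(c_bar * (1.0 + e)) for e, p in zip(shocks, probs))
    return (c_bar - u_inverse(expected_utility)) / c_bar

variance = sum(p * e ** 2 for e, p in zip(shocks, probs))
approx = 0.5 * variance * gamma    # right-hand side of (5.5)

for c_bar in (10.0, 1000.0):
    print(c_bar, relative_risk_premium(c_bar), approx)
```

With constant relative risk aversion the exact relative premium comes out the same at both consumption levels, and it sits close to the approximation in (5.5).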
Some authors use terms like risk tolerance and risk cautiousness. The absolute risk tolerance at $c$ is simply the reciprocal of the absolute risk aversion, i.e.
\[ \mathrm{ART}(c) = \frac{1}{\mathrm{ARA}(c)} = -\frac{u'(c)}{u''(c)}. \]
Similarly, the relative risk tolerance is the reciprocal of the relative risk aversion. The risk cautiousness at $c$ is defined as the rate of change in the absolute risk tolerance, i.e. $\mathrm{ART}'(c)$.
5.5.3 Comparison of risk aversion between individuals
An individual with utility function $u$ is said to be more risk-averse than an individual with utility function $v$ if for any consumption plan $c$ and any fixed $\bar{c} \in Z$ with $E[u(c)] \ge u(\bar{c})$, we have $E[v(c)] \ge v(\bar{c})$. So the $v$-individual will accept all gambles that the $u$-individual will accept and possibly some more. Pratt (1964) has shown the following theorem:
Theorem 5.5 Suppose $u$ and $v$ are twice continuously differentiable and strictly increasing. Then the following conditions are equivalent:
(a) $u$ is more risk-averse than $v$,
(b) $\mathrm{ARA}_u(c) \ge \mathrm{ARA}_v(c)$ for all $c \in Z$,
(c) a strictly increasing and concave function $f$ exists such that $u = f \circ v$.
Proof: First let us show (a) $\Rightarrow$ (b): Suppose $u$ is more risk-averse than $v$, but that $\mathrm{ARA}_u(\bar{c}) < \mathrm{ARA}_v(\bar{c})$ for some $\bar{c} \in Z$. Since $\mathrm{ARA}_u$ and $\mathrm{ARA}_v$ are continuous, we must then have that $\mathrm{ARA}_u(c) < \mathrm{ARA}_v(c)$ for all $c$ in an interval around $\bar{c}$. Then we can surely find a small gamble around $\bar{c}$, which the $u$-individual will accept, but the $v$-individual will reject. This contradicts the assumption in (a).
Next, we show (b) $\Rightarrow$ (c): Since $v$ is strictly increasing, it has an inverse $v^{-1}$ and we can define a function $f$ by $f(x) = u\left( v^{-1}(x) \right)$. Then clearly $f(v(c)) = u(c)$ so that $u = f \circ v$. The first-order derivative of $f$ is
\[ f'(x) = \frac{u'\left( v^{-1}(x) \right)}{v'\left( v^{-1}(x) \right)}, \]
114 Chapter 5. Modeling the preferences of individuals
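which is positive since $u$ and $v$ are strictly increasing. Hence, $f$ is strictly increasing. The second-order derivative is
$$f''(x) = \frac{u''\left(v^{-1}(x)\right) - v''\left(v^{-1}(x)\right) u'\left(v^{-1}(x)\right)/v'\left(v^{-1}(x)\right)}{v'\left(v^{-1}(x)\right)^2} = \frac{u'\left(v^{-1}(x)\right)}{v'\left(v^{-1}(x)\right)^2}\left(\mathrm{ARA}_v\left(v^{-1}(x)\right) - \mathrm{ARA}_u\left(v^{-1}(x)\right)\right).$$
From (b), it follows that $f''(x) \le 0$, so $f$ is concave. [...]
[...] For the CRRA utility function (5.6), $u(c) = c^{1-\gamma}/(1-\gamma)$, we have $u'(c) = c^{-\gamma}$ and $u''(c) = -\gamma c^{-\gamma-1}$, so the absolute and relative risk aversions are given by
$$\mathrm{ARA}(c) = -\frac{u''(c)}{u'(c)} = \frac{\gamma}{c}, \qquad \mathrm{RRA}(c) = c\,\mathrm{ARA}(c) = \gamma.$$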
The relative risk aversion is constant across consumption levels $c$, hence the name CRRA (Constant Relative Risk Aversion) utility. Note that $u'(0+) \equiv \lim_{c \to 0} u'(c) = \infty$ and $u'(\infty) \equiv \lim_{c \to \infty} u'(c) = 0$.
Some authors assume a utility function of the form $u(c) = c^{1-\gamma}$, which only makes sense for $\gamma \in (0, 1)$. However, empirical studies indicate that most investors have a relative risk aversion above 1, cf. the discussion below. The absolute risk tolerance is linear in $c$:
$$\mathrm{ART}(c) = \frac{1}{\mathrm{ARA}(c)} = \frac{c}{\gamma}.$$
5.6 Utility functions in models and in reality 115
Figure 5.1: Some CRRA utility functions (curves for RRA = 0.5, 1, 2, and 5).
Except for a constant, the utility function
$$u(c) = \frac{c^{1-\gamma} - 1}{1-\gamma}$$
is identical to the utility function specified in (5.6). The two utility functions are therefore equivalent in the sense that they generate the same optimal choices. The advantage of the latter definition is that this function has a well-defined limit as $\gamma \to 1$. From l'Hôpital's rule we have that
$$\lim_{\gamma \to 1} \frac{c^{1-\gamma} - 1}{1-\gamma} = \lim_{\gamma \to 1} \frac{-c^{1-\gamma} \ln c}{-1} = \ln c,$$
which is the important special case of logarithmic utility. When we consider CRRA utility, we will assume the simpler version (5.6), but we will use the fact that we can obtain the optimal strategies of a log-utility investor as the limit of the optimal strategies of the general CRRA investor as $\gamma \to 1$.
Some CRRA utility functions are illustrated in Figure 5.1.
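The convergence of the shifted power utility to log utility can be seen numerically. This is a small sketch; the function name `shifted_crra` and the consumption level `c = 4.0` are assumptions made for the illustration.

```python
import math

def shifted_crra(c, gamma):
    """(c^(1-gamma) - 1) / (1 - gamma), with the log limit used at gamma = 1."""
    if abs(gamma - 1.0) < 1e-12:
        return math.log(c)
    return (c ** (1.0 - gamma) - 1.0) / (1.0 - gamma)

c = 4.0  # illustrative consumption level
for gamma in (0.5, 0.9, 0.99, 0.999, 1.0, 1.001, 1.01, 1.1, 2.0):
    print(f"gamma={gamma:<6} u(c)={shifted_crra(c, gamma):.6f}  ln(c)={math.log(c):.6f}")
```

Values of `gamma` close to 1 on either side should give utilities close to `ln(c)`, whereas the unshifted version $c^{1-\gamma}/(1-\gamma)$ diverges as $\gamma \to 1$.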
HARA utility. (Also known as extended power utility.) The absolute risk aversion for CRRA utility is hyperbolic in $c$. More generally a utility function is said to be a HARA (Hyperbolic Absolute Risk Aversion) utility function if
$$\mathrm{ARA}(c) = -\frac{u''(c)}{u'(c)} = \frac{1}{\alpha c + \beta}$$
for some constants $\alpha$, $\beta$ such that $\alpha c + \beta > 0$ for all relevant $c$. HARA utility functions are sometimes referred to as affine (or linear) risk tolerance utility functions since the absolute risk tolerance is
$$\mathrm{ART}(c) = \frac{1}{\mathrm{ARA}(c)} = \alpha c + \beta.$$
The risk cautiousness is $\mathrm{ART}'(c) = \alpha$.
What do the HARA utility functions look like? First, let us take the case $\alpha = 0$, which implies that the absolute risk aversion is constant (so-called CARA utility) and $\beta$ must be positive. Then
$$\frac{d(\ln u'(c))}{dc} = \frac{u''(c)}{u'(c)} = -\frac{1}{\beta}$$
implies that
$$\ln u'(c) = -\frac{c}{\beta} + k_1 \quad \Rightarrow \quad u'(c) = e^{k_1} e^{-c/\beta}$$
for some constant $k_1$. Hence,
$$u(c) = -\beta e^{k_1} e^{-c/\beta} + k_2$$
for some other constant $k_2$. Applying the fact that increasing affine transformations do not change decisions, the basic representative of this class of utility functions is the negative exponential utility function
$$u(c) = -e^{-ac}, \quad c \in \mathbb{R}, \tag{5.7}$$
where the parameter $a = 1/\beta$ is the absolute risk aversion. Constant absolute risk aversion is certainly not very reasonable. Nevertheless, the negative exponential utility function is sometimes used for computational purposes in connection with normally distributed returns, e.g. in one-period models.
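The computational convenience with normally distributed outcomes comes from a standard result that is not derived in the text: if $c \sim N(\mu, \sigma^2)$, then $\mathrm{E}[-e^{-ac}] = -e^{-a\mu + a^2\sigma^2/2}$, so the certainty equivalent is $\mu - \tfrac{1}{2}a\sigma^2$. The sketch below checks this by simple numerical integration; the function names and parameter values are assumptions chosen for illustration.

```python
import math

def expected_cara_utility(a, mu, sigma, n=4001, width=10.0):
    """E[-exp(-a*c)] for c ~ N(mu, sigma^2), using the trapezoidal rule."""
    lo, hi = mu - width * sigma, mu + width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        c = lo + i * h
        density = math.exp(-0.5 * ((c - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0   # trapezoid end-point weights
        total += weight * (-math.exp(-a * c)) * density * h
    return total

def certainty_equivalent(a, mu, sigma):
    """Invert u(c) = -exp(-a*c) at the expected utility."""
    return -math.log(-expected_cara_utility(a, mu, sigma)) / a

a, mu, sigma = 2.0, 1.0, 0.3   # illustrative values
print("numerical CE  :", certainty_equivalent(a, mu, sigma))
print("mu - a*s^2 / 2:", mu - 0.5 * a * sigma ** 2)
```

The two printed numbers should agree closely, and the risk premium $\tfrac{1}{2}a\sigma^2$ does not depend on $\mu$, which is the constant absolute risk aversion property.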
Next, consider the case $\alpha \neq 0$. Applying the same procedure as above we find
$$\frac{d(\ln u'(c))}{dc} = \frac{u''(c)}{u'(c)} = -\frac{1}{\alpha c + \beta} \quad \Rightarrow \quad \ln u'(c) = -\frac{1}{\alpha} \ln(\alpha c + \beta) + k_1$$
so that
$$u'(c) = e^{k_1} \exp\left\{-\frac{1}{\alpha} \ln(\alpha c + \beta)\right\} = e^{k_1} (\alpha c + \beta)^{-1/\alpha}. \tag{5.8}$$
For $\alpha = 1$ this implies that
$$u(c) = e^{k_1} \ln(c + \beta) + k_2.$$
The basic representative of such utility functions is the extended log utility function
$$u(c) = \ln(c - \bar c), \quad c > \bar c, \tag{5.9}$$
where we have replaced $\beta$ by $-\bar c$. For $\alpha \neq 1$, Equation (5.8) implies that
$$u(c) = \frac{1}{\alpha} e^{k_1} \frac{1}{1 - \frac{1}{\alpha}} (\alpha c + \beta)^{1 - 1/\alpha} + k_2.$$
For $\alpha < 0$, the basic representative is
$$u(c) = -(\bar c - c)^{1-\gamma}, \quad c < \bar c, \tag{5.10}$$
where $\gamma = 1/\alpha < 0$. We can think of $\bar c$ as a satiation level and call this subclass satiation HARA utility functions. The absolute risk aversion is
$$\mathrm{ARA}(c) = \frac{-\gamma}{\bar c - c},$$
which is increasing in $c$, conflicting with intuition and empirical studies. Some older financial models used the quadratic utility function, which is the special case with $\gamma = -1$ so that $u(c) = -(\bar c - c)^2$. An equivalent utility function is $u(c) = c - ac^2$.
For $\alpha > 0$ (and $\alpha \neq 1$), the basic representative is
$$u(c) = \frac{(c - \bar c)^{1-\gamma}}{1-\gamma}, \quad c > \bar c, \tag{5.11}$$
where $\gamma = 1/\alpha > 0$. The limit as $\gamma \to 1$ of the equivalent utility function $\frac{(c - \bar c)^{1-\gamma} - 1}{1-\gamma}$ is equal to the extended log utility function $u(c) = \ln(c - \bar c)$. We can think of $\bar c$ as a subsistence level of wealth or consumption (which makes sense only if $\bar c \ge 0$) and refer to this subclass as subsistence HARA utility functions. The absolute and relative risk aversions are
$$\mathrm{ARA}(c) = \frac{\gamma}{c - \bar c}, \qquad \mathrm{RRA}(c) = \frac{\gamma c}{c - \bar c} = \frac{\gamma}{1 - (\bar c / c)},$$
which are both decreasing in $c$. The relative risk aversion approaches $\infty$ for $c \to \bar c$ and decreases to the constant $\gamma$ for $c \to \infty$. Clearly, for $\bar c = 0$, we are back to the CRRA utility functions so that these also belong to the HARA family.
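The behaviour of risk aversion under subsistence HARA utility can be checked with finite differences. The sketch below is illustrative only: it assumes a power utility with subsistence level `c_bar` and curvature `gamma`, and the helper `numerical_ara` and all parameter values are assumptions for this example.

```python
def subsistence_hara(c, gamma, c_bar):
    """u(c) = (c - c_bar)^(1-gamma) / (1-gamma), for c > c_bar and gamma != 1."""
    return (c - c_bar) ** (1.0 - gamma) / (1.0 - gamma)

def numerical_ara(u, c, h=1e-3):
    """Absolute risk aversion -u''(c)/u'(c) from central finite differences."""
    first = (u(c + h) - u(c - h)) / (2.0 * h)
    second = (u(c + h) - 2.0 * u(c) + u(c - h)) / (h * h)
    return -second / first

gamma, c_bar = 2.0, 1.0   # illustrative parameters
u = lambda c: subsistence_hara(c, gamma, c_bar)

for c in (1.5, 2.0, 5.0, 50.0):
    ara = numerical_ara(u, c)
    print(f"c={c:5.1f}  ARA={ara:.4f} (formula {gamma / (c - c_bar):.4f})  "
          f"RRA={c * ara:.4f} (formula {gamma * c / (c - c_bar):.4f})")
```

Both measures should fall as `c` rises, with the relative risk aversion approaching `gamma` from above for large `c`.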
Mean-variance preferences. For some problems it is convenient to assume that the expected utility associated with an uncertain consumption plan only depends on the expected value and the variance of the consumption plan. This is certainly true if the consumption plan is a normally distributed random variable since its probability distribution is fully characterized by the mean and variance. However, it is generally not appropriate to use a normal distribution for consumption (or wealth or asset returns).
For a quadratic utility function, $u(c) = c - ac^2$, the expected utility is
$$\mathrm{E}[u(c)] = \mathrm{E}\left[c - ac^2\right] = \mathrm{E}[c] - a\,\mathrm{E}\left[c^2\right] = \mathrm{E}[c] - a\left(\operatorname{Var}[c] + \mathrm{E}[c]^2\right),$$
which is indeed a function of the expected value and the variance of the consumption plan. Alas, the quadratic utility function is inappropriate for several reasons. Most importantly, it exhibits increasing absolute risk aversion.
For a general utility function the expected utility of a consumption plan will depend on all moments. This can be seen by the Taylor expansion of $u(c)$ around the expected consumption, $\mathrm{E}[c]$:
$$u(c) = u(\mathrm{E}[c]) + u'(\mathrm{E}[c])(c - \mathrm{E}[c]) + \frac{1}{2} u''(\mathrm{E}[c])(c - \mathrm{E}[c])^2 + \sum_{n=3}^{\infty} \frac{1}{n!} u^{(n)}(\mathrm{E}[c])(c - \mathrm{E}[c])^n,$$
where $u^{(n)}$ is the $n$'th order derivative of $u$. Taking expectations, we get
$$\mathrm{E}[u(c)] = u(\mathrm{E}[c]) + \frac{1}{2} u''(\mathrm{E}[c]) \operatorname{Var}[c] + \sum_{n=3}^{\infty} \frac{1}{n!} u^{(n)}(\mathrm{E}[c])\,\mathrm{E}\left[(c - \mathrm{E}[c])^n\right].$$
Here $\mathrm{E}\left[(c - \mathrm{E}[c])^n\right]$ is the central moment of order $n$. The variance is the central moment of order 2. Obviously, a greedy investor (which just means that $u$ is increasing) will prefer higher expected consumption to lower for fixed central moments of order 2 and higher. Moreover, a risk-averse investor (so that $u'' < 0$) will prefer lower variance of consumption to higher for fixed expected consumption and fixed central moments of order 3 and higher. But when the central moments of order 3 and higher are not the same for all alternatives, we cannot just evaluate them on the basis of their expectation and variance. With quadratic utility, the derivatives of $u$ of order 3 and higher are zero so there it works. In general, mean-variance preferences can only serve as an approximation of the true utility function.
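The role of higher moments can be illustrated with two consumption plans that share the same mean and variance but differ in skewness. The two distributions below are made up for this sketch, as are the function names; quadratic utility with `a = 0.1` ranks the plans equally, whereas log utility does not.

```python
import math

def expected_utility(u, outcomes, probs):
    """Expected utility of a discrete consumption plan."""
    return sum(p * u(c) for c, p in zip(outcomes, probs))

def mean_and_variance(outcomes, probs):
    mean = sum(p * c for c, p in zip(outcomes, probs))
    var = sum(p * (c - mean) ** 2 for c, p in zip(outcomes, probs))
    return mean, var

# Two illustrative plans, both with mean 2 and variance 1.
plan_a = ([1.0, 3.0], [0.5, 0.5])   # symmetric
plan_b = ([1.5, 4.0], [0.8, 0.2])   # right-skewed

a = 0.1                              # quadratic utility is increasing for c < 1/(2a) = 5
quadratic = lambda c: c - a * c ** 2

for name, (outcomes, probs) in (("A", plan_a), ("B", plan_b)):
    mean, var = mean_and_variance(outcomes, probs)
    eu_quad = expected_utility(quadratic, outcomes, probs)
    eu_log = expected_utility(math.log, outcomes, probs)
    closed_form = mean - a * (var + mean ** 2)   # E[c] - a (Var[c] + E[c]^2)
    print(f"plan {name}: mean={mean:.3f} var={var:.3f} "
          f"E[quad]={eu_quad:.6f} (closed form {closed_form:.6f}) E[log]={eu_log:.6f}")
```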
5.6.2 What do we know about individuals' risk aversion?
From our discussion of risk aversion and various utility functions we expect that individuals are
risk averse and exhibit decreasing absolute risk aversion. But can this be supported by empirical
evidence? Do individuals have constant relative risk aversion? And what is a reasonable level of
risk aversion for individuals?
You can get an idea of the risk attitudes of an individual by observing how they choose between risky alternatives. Some researchers have studied this by setting up laboratory experiments in which they present some risky alternatives to a group of individuals and simply see what they prefer. Some of these experiments suggest that expected utility theory is frequently violated, see e.g. Grether and Plott (1979). However, laboratory experiments are problematic for several reasons. You cannot be sure that individuals will make the same choice in what they know is an experiment as they would in real life. It is also hard to formulate alternatives that resemble the rather complex real-life decisions. It seems more fruitful to study actual data on how individuals have acted when confronted with real-life decision problems under uncertainty. A number of studies do that.
Friend and Blume (1975) analyze data on household asset holdings. They conclude that the data is consistent with individuals having roughly constant relative risk aversion and that "the coefficients of relative risk aversion are on average well in excess of one and probably in excess of two" (quote from page 900 in their paper). Pindyck (1988) finds support for a relative risk aversion between 3 and 4 in a structural model of the reaction of stock prices to fundamental variables.
Other studies are based on insurance data. Using U.S. data on so-called property/liability insurance, Szpiro (1986) finds support for CRRA utility with a relative risk aversion coefficient between 1.2 and 1.8. Cicchetti and Dubin (1994) work with data from the U.S. on whether individuals purchased insurance against the risk of trouble with their home telephone line. They conclude that the data is consistent with expected utility theory and that a subsistence HARA utility function performs better than log utility or negative exponential utility.
Ogaki and Zhang (2001) study data on individual food consumption from Pakistan and India
and conclude that relative risk aversion is decreasing for poor individuals, which is consistent with
a subsistence HARA utility function.
It is an empirical fact that even though consumption and wealth have increased tremendously over the years, the magnitude of real rates of return has not changed dramatically. As indicated by (5.5), relative risk premia are approximately proportional to the relative risk aversion. As we shall see in later chapters, basic asset pricing theory implies that relative risk premia on financial assets (in terms of expected real return in excess of the real risk-free return) will be proportional to the average relative risk aversion in the economy. If the average relative risk aversion were significantly decreasing (increasing) in the level of consumption or wealth, we should have seen decreasing (increasing) real returns on risky assets in the past. The data seems to be consistent with individuals having on average close to CRRA utility.
To get a feeling of what a given risk aversion really means, suppose you are confronted with two consumption plans. One plan is a sure consumption of $\bar c$, the other plan gives you $(1-\alpha)\bar c$ with probability 0.5 and $(1+\alpha)\bar c$ with probability 0.5. If you have a CRRA utility function $u(c) = c^{1-\gamma}/(1-\gamma)$, the certainty equivalent $c^*$ of the risky plan is determined by
$$\frac{1}{1-\gamma} (c^*)^{1-\gamma} = \frac{1}{2} \frac{1}{1-\gamma} \left((1-\alpha)\bar c\right)^{1-\gamma} + \frac{1}{2} \frac{1}{1-\gamma} \left((1+\alpha)\bar c\right)^{1-\gamma},$$
which implies that
$$c^* = \left(\frac{1}{2}\right)^{1/(1-\gamma)} \left[(1-\alpha)^{1-\gamma} + (1+\alpha)^{1-\gamma}\right]^{1/(1-\gamma)} \bar c.$$
The risk premium $\lambda(\bar c, \alpha)$ is
$$\lambda(\bar c, \alpha) = \bar c - c^* = \left(1 - \left(\frac{1}{2}\right)^{1/(1-\gamma)} \left[(1-\alpha)^{1-\gamma} + (1+\alpha)^{1-\gamma}\right]^{1/(1-\gamma)}\right) \bar c.$$
Both the certainty equivalent and the risk premium are thus proportional to the consumption level $\bar c$. The relative risk premium $\lambda(\bar c, \alpha)/\bar c$ is simply one minus the relative certainty equivalent $c^*/\bar c$. These equations assume $\gamma \neq 1$. In Exercise 5.5 you are asked to find the certainty equivalent and risk premium for log-utility corresponding to $\gamma = 1$.

γ = RRA    α = 1%    α = 10%    α = 50%
0.5        0.00%     0.25%       6.70%
1          0.01%     0.50%      13.40%
2          0.01%     1.00%      25.00%
5          0.02%     2.43%      40.72%
10         0.05%     4.42%      46.00%
20         0.10%     6.76%      48.14%
50         0.24%     8.72%      49.29%
100        0.43%     9.37%      49.65%
Table 5.5: Relative risk premia for a fair gamble of the fraction α of your consumption.

Table 5.5 shows the relative risk premium for various values of the relative risk aversion coefficient $\gamma$ and various values of $\alpha$, the "size" of the risk. For example, an individual with $\gamma = 5$ is willing to sacrifice 2.43% of the safe consumption in order to avoid a fair gamble of 10% of that consumption level. Of course, even extremely risk-averse individuals will not sacrifice more than they can lose, but in some cases it is pretty close. Looking at these numbers, it is hard to believe in $\gamma$-values outside, say, $[1, 10]$. In Exercise 5.6 you are asked to compare the exact relative risk premia shown in the table with the approximate risk premia given by (5.5).
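The entries of Table 5.5 can be recomputed from the certainty-equivalent formula above. The following sketch covers the case of a risk aversion different from 1 only (the log-utility row is the subject of Exercise 5.5); the function name `relative_risk_premium` is an assumption of this example.

```python
def relative_risk_premium(gamma, alpha):
    """Relative risk premium 1 - c*/c_bar for the 50/50 gamble (1 -/+ alpha) * c_bar.

    Uses the CRRA certainty equivalent, which requires gamma != 1.
    """
    if gamma == 1.0:
        raise ValueError("gamma = 1 (log utility) is left to Exercise 5.5")
    e = 1.0 - gamma
    relative_ce = (0.5 * ((1.0 - alpha) ** e + (1.0 + alpha) ** e)) ** (1.0 / e)
    return 1.0 - relative_ce

alphas = (0.01, 0.10, 0.50)
print("gamma   " + "  ".join(f"alpha={a:.0%}" for a in alphas))
for gamma in (0.5, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0):
    row = "  ".join(f"{relative_risk_premium(gamma, a):9.2%}" for a in alphas)
    print(f"{gamma:<7g} {row}")
```

Because the premium scales with $\bar c$, the consumption level itself never enters the calculation.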
5.6.3 Two-good utility functions and the elasticity of substitution
Consider an atemporal utility function $f(c, z)$ of the consumption of two different goods at the same time. An indifference curve in the $(c, z)$-space is characterized by $f(c, z) = k$ for some constant $k$. Changes in $c$ and $z$ along an indifference curve are linked by
$$\frac{\partial f}{\partial c}\,dc + \frac{\partial f}{\partial z}\,dz = 0$$
so that the slope of the indifference curve (also known as the marginal rate of substitution) is
$$\frac{dz}{dc} = -\frac{\partial f / \partial c}{\partial f / \partial z}.$$
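Unless the indifference curve is linear, its slope will change along the curve. Indifference curves are generally assumed to be convex. The elasticity of substitution tells you by which percentage you need to change $z/c$ in order to obtain a one percent change in the slope of the indifference curve. It is a measure of the curvature or convexity of the indifference curve. If the indifference curve is very curved, you only have to move a little along the curve before its slope has changed by one percent. Hence, the elasticity of substitution is low. If the indifference curve is almost linear, you have to move far away to change the slope by one percent. In that case the elasticity of substitution is very high. Formally, the elasticity of substitution is defined as
$$\psi = -\frac{d\left(\frac{z}{c}\right) \Big/ \frac{z}{c}}{d\left(\frac{\partial f}{\partial z} \Big/ \frac{\partial f}{\partial c}\right) \Big/ \left(\frac{\partial f}{\partial z} \Big/ \frac{\partial f}{\partial c}\right)} = -\frac{\frac{\partial f}{\partial z} \Big/ \frac{\partial f}{\partial c}}{z/c}\,\frac{d(z/c)}{d\left(\frac{\partial f}{\partial z} \Big/ \frac{\partial f}{\partial c}\right)}, \tag{5.12}$$
which is equivalent to
$$\psi = -\frac{d \ln(z/c)}{d \ln\left(\frac{\partial f}{\partial z} \Big/ \frac{\partial f}{\partial c}\right)}.$$
Assume now that
$$f(c, z) = \left(a c^{\alpha} + b z^{\alpha}\right)^{1/\alpha}, \tag{5.13}$$
where $\alpha < 1$ and $\alpha \neq 0$. Then
$$\frac{\partial f}{\partial c} = a c^{\alpha - 1} \left(a c^{\alpha} + b z^{\alpha}\right)^{\frac{1}{\alpha} - 1}, \qquad \frac{\partial f}{\partial z} = b z^{\alpha - 1} \left(a c^{\alpha} + b z^{\alpha}\right)^{\frac{1}{\alpha} - 1},$$
and thus
$$\frac{\partial f / \partial z}{\partial f / \partial c} = \frac{b}{a} \left(\frac{z}{c}\right)^{\alpha - 1}.$$
Computing the derivative with respect to $z/c$, we get
$$\frac{d\left(\frac{\partial f}{\partial z} \Big/ \frac{\partial f}{\partial c}\right)}{d\left(\frac{z}{c}\right)} = \frac{b}{a} (\alpha - 1) \left(\frac{z}{c}\right)^{\alpha - 2}$$
and thus
$$\psi = -\frac{\frac{b}{a} \left(\frac{z}{c}\right)^{\alpha - 1}}{\frac{z}{c}} \cdot \frac{1}{\frac{b}{a} (\alpha - 1) \left(\frac{z}{c}\right)^{\alpha - 2}} = -\frac{1}{\alpha - 1} = \frac{1}{1 - \alpha},$$
which is independent of $(c, z)$. Therefore the utility function (5.13) is referred to as CES (Constant Elasticity of Substitution) utility.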
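The constant elasticity can also be confirmed numerically without using the analytical derivatives. The sketch below estimates $-d\ln(z/c)/d\ln(f_z/f_c)$ by finite differences; as a simplifying assumption it perturbs $z$ while holding $c$ fixed, which is harmless for homothetic functions such as CES, where the marginal rate of substitution depends on $z/c$ only. All function names and parameter values are illustrative.

```python
import math

def ces(c, z, a, b, alpha):
    """CES utility (a*c^alpha + b*z^alpha)^(1/alpha), alpha < 1 and alpha != 0."""
    return (a * c ** alpha + b * z ** alpha) ** (1.0 / alpha)

def mrs(f, c, z, h=1e-6):
    """Ratio (df/dz) / (df/dc) from central finite differences."""
    f_c = (f(c + h, z) - f(c - h, z)) / (2.0 * h)
    f_z = (f(c, z + h) - f(c, z - h)) / (2.0 * h)
    return f_z / f_c

def elasticity_of_substitution(f, c, z, rel_step=1e-3):
    """Estimate -d ln(z/c) / d ln(f_z/f_c) by moving z with c held fixed."""
    z_up, z_dn = z * (1.0 + rel_step), z * (1.0 - rel_step)
    d_log_ratio = math.log(z_up / c) - math.log(z_dn / c)
    d_log_mrs = math.log(mrs(f, c, z_up)) - math.log(mrs(f, c, z_dn))
    return -d_log_ratio / d_log_mrs

for alpha in (0.5, -1.0, -4.0):
    f = lambda c, z, alpha=alpha: ces(c, z, 0.4, 0.6, alpha)
    for c, z in ((1.0, 1.0), (2.0, 3.0), (10.0, 0.5)):
        est = elasticity_of_substitution(f, c, z)
        print(f"alpha={alpha:5.1f} (c,z)=({c},{z})  psi={est:.5f}  1/(1-alpha)={1/(1-alpha):.5f}")
```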
For the Cobb-Douglas utility function
$$f(c, z) = c^{a} z^{1-a}, \quad 0 < a < 1, \tag{5.14}$$
the elasticity of substitution equals 1. In fact, the Cobb-Douglas utility function (5.14) can be seen as the limit of the utility function (5.13) assuming $b = 1 - a$ as $\alpha \to 0$.
5.7 Preferences for multi-date consumption plans
Above we implicitly considered preferences for consumption at one given future point in time. We need to generalize the ideas and results to settings with consumption at several dates. In one-period models individuals can consume both at time 0 (beginning-of-period) and at time 1
(end-of-period). In multi-period models individuals can consume either at each date in the discrete-time set $\mathcal{T} = \{0, 1, 2, \dots, T\}$ or at each date in the continuous-time set $\mathcal{T} = [0, T]$. In any case a consumption plan is a stochastic process $c = (c_t)_{t \in \mathcal{T}}$ where each $c_t$ is a random variable representing the state-dependent level of consumption at time $t$.
Consider the discrete-time case and, for each $t$, let $Z_t \subseteq \mathbb{R}$ denote the set of all possible consumption levels at date $t$ and define $Z = Z_0 \times Z_1 \times \dots \times Z_T \subseteq \mathbb{R}^{T+1}$. Then any consumption plan $c$ can again be represented by a probability distribution on the set $Z$. For finite $Z$, we can again apply Theorem 5.1 so that under the relevant axioms, we can represent preferences by a utility index $\mathcal{U}$, which to each consumption plan $(c_t)_{t \in \mathcal{T}} = (c_0, c_1, \dots, c_T)$ attaches a real number $\mathcal{U}(c_0, c_1, \dots, c_T)$ with higher numbers to the more preferred consumption plans. If we further impose the Substitution Axiom, Theorem 5.2 ensures an expected utility representation, i.e. the existence of a utility function $U : Z \to \mathbb{R}$ so that consumption plans are ranked according to their expected utility, i.e.
$$\mathcal{U}(c_0, c_1, \dots, c_T) = \mathrm{E}\left[U(c_0, c_1, \dots, c_T)\right] \equiv \sum_{\omega \in \Omega} p_{\omega}\, U\left(c_0, c_1(\omega), \dots, c_T(\omega)\right).$$
We can call $U$ a multi-date utility function since it depends on the consumption levels at all dates. Again this result can be extended to the case of an infinite $Z$, e.g. $Z = \mathbb{R}_{+}^{T+1}$, but also to continuous-time settings where $U$ will then be a function of the entire consumption process $c = (c_t)_{t \in [0, T]}$.
5.7.1 Additively time-separable expected utility
Often time-additivity is assumed so that the utility the individual gets from consumption in one period does not directly depend on what she consumed in earlier periods or what she plans to consume in later periods. For the discrete-time case, this means that
$$U(c_0, c_1, \dots, c_T) = \sum_{t=0}^{T} u_t(c_t)$$
where each $u_t$ is a valid "single-date" utility function. Still, when the individual has to choose her current consumption rate, she will take her prospects for future consumption into account. The continuous-time analogue is
$$U\left((c_t)_{t \in [0, T]}\right) = \int_0^T u_t(c_t)\,dt.$$
In addition it is typically assumed that $u_t(c_t) = e^{-\delta t} u(c_t)$ for all $t$. This is to say that the direct utility the individual gets from a given consumption level is basically the same for all dates, but the individual prefers to consume any given number of goods sooner rather than later. This is modeled by the subjective time preference rate $\delta$, which we assume to be constant over time and independent of the consumption level. More impatient individuals have higher $\delta$'s. In sum, the lifetime utility is typically assumed to be given by
$$U(c_0, c_1, \dots, c_T) = \sum_{t=0}^{T} e^{-\delta t} u(c_t)$$
in discrete-time models and
$$U\left((c_t)_{t \in [0, T]}\right) = \int_0^T e^{-\delta t} u(c_t)\,dt$$
in continuous-time models. In both cases, $u$ is a "single-date" utility function such as those discussed in Section 5.6.
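A minimal sketch of the discrete-time version may help fix ideas. It evaluates the discounted sum for deterministic consumption plans only (so the expectation is trivial), with log utility as the single-date utility; the plans, the discount rate, and the function name `lifetime_utility` are assumptions made for this illustration.

```python
import math

def lifetime_utility(consumption, delta, u):
    """Time-additive utility: sum over t of exp(-delta*t) * u(c_t), for t = 0, 1, ..., T."""
    return sum(math.exp(-delta * t) * u(c) for t, c in enumerate(consumption))

front_loaded = [3.0, 2.0, 1.0]   # illustrative plans with the same total consumption
back_loaded = [1.0, 2.0, 3.0]

for delta in (0.0, 0.05, 0.25):
    u_front = lifetime_utility(front_loaded, delta, math.log)
    u_back = lifetime_utility(back_loaded, delta, math.log)
    print(f"delta={delta:.2f}  front-loaded={u_front:.4f}  back-loaded={u_back:.4f}")
```

With a time preference rate of zero the two plans are ranked equally in this example; the more impatient the individual, the stronger the preference for the front-loaded plan.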
Time-additivity is mostly assumed for tractability. However, it is important to realize that the time-additive specification does not follow from the basic axioms of choice under uncertainty, but is in fact a strong assumption, which most economists agree is not very realistic. One problem is that time-additive preferences induce a close link between the reluctance to substitute consumption across different states of the economy (which is measured by risk aversion) and the willingness to substitute consumption over time (which can be measured by the so-called elasticity of intertemporal substitution). Solving intertemporal utility maximization problems of individuals with time-additive CRRA utility, it turns out that an individual with a high relative risk aversion will also choose a very smooth consumption process, i.e. she will have a low elasticity of intertemporal substitution. There is nothing in the basic theory of choice that links the risk aversion and the elasticity of intertemporal substitution together. For one thing, risk aversion makes sense even in an atemporal (i.e. one-date) setting where intertemporal substitution is meaningless and, conversely, intertemporal substitution makes sense in a multi-period setting without uncertainty in which risk aversion is meaningless. The close link between the two concepts in the multi-period model with uncertainty is an unfortunate consequence of the assumption of time-additive expected utility.
According to Browning (1991), non-additive preferences were already discussed in the 1890 book "Principles of Economics" by Alfred Marshall. See Browning's paper for further references to the critique of intertemporally separable preferences. Let us consider some alternatives that are more general and still tractable.
5.7.2 Habit formation and state-dependent utility
The key idea of habit formation is to let the utility associated with the choice of consumption at a given date depend on past choices of consumption. In a discrete-time setting the utility index of a given consumption process $c$ is now given as $\mathrm{E}\left[\sum_{t=0}^{T} e^{-\delta t} u(c_t, h_t)\right]$, where $h_t$ is a measure of the standard of living or the habit level of consumption, e.g. a weighted average of past consumption rates such as
$$h_t = h_0 e^{-\beta t} + \alpha \sum_{s=1}^{t-1} e^{-\beta (t-s)} c_s,$$
where $h_0$, $\alpha$, and $\beta$ are non-negative constants. It is assumed that $u$ is decreasing in $h$ so that high past consumption generates a desire for high current consumption, i.e. preferences display intertemporal complementarity. For reasons of mathematical tractability, $u(c, h)$ is often assumed to be of the power-linear form
$$u(c, h) = \frac{1}{1-\gamma} (c - h)^{1-\gamma}, \quad \gamma > 0, \ c \ge h.$$
This is closely related to the subsistence HARA utility, but with habit formation the "subsistence level" $h$ is endogenously determined by past consumption. The corresponding absolute and relative risk aversions are
$$\mathrm{ARA}(c, h) \equiv -\frac{u_{cc}(c, h)}{u_c(c, h)} = \frac{\gamma}{c - h}, \qquad \mathrm{RRA}(c, h) \equiv -c\,\frac{u_{cc}(c, h)}{u_c(c, h)} = \frac{\gamma c}{c - h},$$
where $u_c$ and $u_{cc}$ are the first- and second-order derivatives of $u$ with respect to $c$. In particular, the relative risk aversion is decreasing in $c$. Note that the habit formation preferences are still consistent with expected utility.
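To see how a habit level builds up and feeds into risk aversion, the habit recursion can be evaluated directly for a given consumption path. The sketch below is illustrative: the consumption path, the parameter values, and the function names `habit_levels` and `relative_risk_aversion` are all assumptions of this example, and the path is deterministic for simplicity.

```python
import math

def habit_levels(consumption, h0, alpha, beta):
    """h_t = h0*exp(-beta*t) + alpha * sum_{s=1}^{t-1} exp(-beta*(t-s)) * c_s, t = 0..T."""
    levels = []
    for t in range(len(consumption)):
        past = sum(math.exp(-beta * (t - s)) * consumption[s] for s in range(1, t))
        levels.append(h0 * math.exp(-beta * t) + alpha * past)
    return levels

def relative_risk_aversion(c, h, gamma):
    """RRA for u(c, h) = (c - h)^(1-gamma) / (1-gamma), requires c > h."""
    return gamma * c / (c - h)

consumption = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]   # illustrative path with a jump at t = 3
gamma = 2.0
habits = habit_levels(consumption, h0=0.5, alpha=0.3, beta=0.5)

for t, (c, h) in enumerate(zip(consumption, habits)):
    print(f"t={t}  c={c:.2f}  h={h:.4f}  RRA={relative_risk_aversion(c, h, gamma):.4f}")
```

After consumption jumps at `t = 3` the habit catches up only gradually, so the relative risk aversion first drops and then rises again as the higher standard of living becomes the norm.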
A related line of extension of the basic preferences is to allow the preferences of an individual to depend on some external factors, i.e. factors that are not fully determined by choices made by the individual. One example that has received some attention is where the utility which some individual attaches to her consumption plan depends on the consumption plans of other individuals or maybe the aggregate consumption in the economy. This is often referred to as "keeping up with the Joneses." If you see your neighbors consume at high rates, you want to consume at a high rate too. Utility is state-dependent. Models of this type are sometimes said to have an external habit, whereas the habit formation discussed above is then referred to as internal habit. If we denote the external factor by $X_t$, a time-additive lifetime expected utility representation is $\mathrm{E}\left[\sum_{t=0}^{T} e^{-\delta t} u(c_t, X_t)\right]$, and a tractable version is $u(c, X) = \frac{1}{1-\gamma} (c - X)^{1-\gamma}$, very similar to the subsistence HARA or the specific habit formation utility given above. In this case, however, the "subsistence" level is determined by external factors. Another tractable specification is $u(c, X) = \frac{1}{1-\gamma} (c/X)^{1-\gamma}$.
5.7.3 Recursive utility
Another preference specification gaining popularity is the so-called recursive preferences or Epstein-Zin preferences, suggested and discussed by, e.g., Kreps and Porteus (1978), Epstein and Zin (1989, 1991), and Weil (1989). The original motivation of this representation of preferences is that it allows individuals to have preferences for the timing of resolution of uncertainty, which is not consistent with the standard multi-date expected utility theory and violates the set of behavioral axioms.
In a discrete-time framework Epstein and Zin (1989, 1991) assumed that lifetime utility from time $t$ on is captured by a utility index $U_t$ (in this literature sometimes called the "felicity") satisfying the recursive relation
$$U_t = f(c_t, z_t),$$
where $z_t = \mathrm{CE}_t(U_{t+1})$ is the certainty equivalent of $U_{t+1}$ given information available at time $t$ and $f$ is an aggregator of the form
$$f(c, z) = \left(a c^{\alpha} + b z^{\alpha}\right)^{1/\alpha}.$$
The aggregator is identical to the two-good CES utility specification (5.13) and, since $z_t$ here refers to future consumption or utility, $\psi = 1/(1-\alpha)$ is called the elasticity of intertemporal substitution. An investor's willingness to substitute risk between states is modeled through $z_t$ as the certainty equivalent of a constant relative risk aversion utility function. Recall that the certainty equivalent for an atemporal utility function $u$ is defined as
$$c^* = u^{-1}\left(\mathrm{E}[u(x)]\right).$$
In particular for CRRA utility $u(x) = x^{1-\gamma}/(1-\gamma)$ we obtain
$$c^* = \left(\mathrm{E}\left[x^{1-\gamma}\right]\right)^{\frac{1}{1-\gamma}},$$
where $\gamma > 0$ is the relative risk aversion.
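To sum up, Epstein-Zin preferences are specified recursively as
$$U_t = \left(a c_t^{\alpha} + b \left(\mathrm{E}_t\left[U_{t+1}^{1-\gamma}\right]\right)^{\frac{\alpha}{1-\gamma}}\right)^{1/\alpha}. \tag{5.15}$$
Using the fact that $\alpha = 1 - \frac{1}{\psi}$, we can rewrite $U_t$ as
$$U_t = \left(a c_t^{1-\frac{1}{\psi}} + b \left(\mathrm{E}_t\left[U_{t+1}^{1-\gamma}\right]\right)^{\frac{1-\frac{1}{\psi}}{1-\gamma}}\right)^{\frac{1}{1-\frac{1}{\psi}}}.$$
Introducing $\theta = (1-\gamma)/(1-\frac{1}{\psi})$, we have
$$U_t = \left(a c_t^{\frac{1-\gamma}{\theta}} + b \left(\mathrm{E}_t\left[U_{t+1}^{1-\gamma}\right]\right)^{\frac{1}{\theta}}\right)^{\frac{\theta}{1-\gamma}}.$$
The constant $a$ is unimportant. Bansal (2007) and other authors put $a = 1 - b$. However, the value of $a$ does not affect optimal decisions and therefore no interpretation can be given to $a$. In order to see this, first note that we can rewrite (5.15) as
$$U_t = a^{1/\alpha} \left(c_t^{\alpha} + b a^{-1} \left(\mathrm{E}_t\left[U_{t+1}^{1-\gamma}\right]\right)^{\frac{\alpha}{1-\gamma}}\right)^{1/\alpha} = a^{1/\alpha} \left(c_t^{\alpha} + b \left(\mathrm{E}_t\left[\left(a^{-1/\alpha} U_{t+1}\right)^{1-\gamma}\right]\right)^{\frac{\alpha}{1-\gamma}}\right)^{1/\alpha},$$
which implies that
$$a^{-1/\alpha} U_t = \left(c_t^{\alpha} + b \left(\mathrm{E}_t\left[\left(a^{-1/\alpha} U_{t+1}\right)^{1-\gamma}\right]\right)^{\frac{\alpha}{1-\gamma}}\right)^{1/\alpha}.$$
Hence, it is clear that the utility index $\tilde U$ defined for any $t$ by $\tilde U_t = a^{-1/\alpha} U_t$ is equivalent to the utility index $U$, since it is just a scaling. Without loss of generality we could put $a = 1$.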
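Time-additive power utility as a special case. The special case $\psi = 1/\gamma$ corresponds to time-additive expected CRRA utility. In order to see this, first note that with $\psi = 1/\gamma$, we have $\theta = 1$ and thus
$$U_t = \left(a c_t^{1-\gamma} + b\,\mathrm{E}_t\left[U_{t+1}^{1-\gamma}\right]\right)^{\frac{1}{1-\gamma}}$$
or
$$U_t^{1-\gamma} = a c_t^{1-\gamma} + b\,\mathrm{E}_t\left[U_{t+1}^{1-\gamma}\right].$$
If we start unwinding the recursions, we get
$$U_t^{1-\gamma} = a c_t^{1-\gamma} + b\,\mathrm{E}_t\left[a c_{t+1}^{1-\gamma} + b\,\mathrm{E}_{t+1}\left[U_{t+2}^{1-\gamma}\right]\right] = a\,\mathrm{E}_t\left[c_t^{1-\gamma} + b c_{t+1}^{1-\gamma}\right] + b^2\,\mathrm{E}_t\left[U_{t+2}^{1-\gamma}\right] = \dots = a \sum_{s=0}^{\infty} \mathrm{E}_t\left[b^s c_{t+s}^{1-\gamma}\right].$$
Any strictly increasing function of a utility index will represent the same preferences. In particular,
$$V_t = \frac{1}{a(1-\gamma)} U_t^{1-\gamma} = \sum_{s=0}^{\infty} \mathrm{E}_t\left[b^s \frac{1}{1-\gamma} c_{t+s}^{1-\gamma}\right]$$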
represents the same preferences and is clearly equivalent to time-additive expected utility. Note that $b$ plays the role of the subjective discount factor.
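The reduction to time-additive power utility can be verified on a small example. The sketch below evaluates the recursion (5.15) by backward induction on a recombining binomial tree for consumption. Several things here are assumptions made for the illustration and are not taken from the text: the tree itself, the finite horizon `T`, the terminal condition $U_T = a^{1/\alpha} c_T$ (no continuation utility at the last date), and all function names and parameter values.

```python
def epstein_zin_u0(c0, up, down, T, a, b, gamma, psi, p_up=0.5):
    """Time-0 utility index from (5.15) on a binomial consumption tree (requires psi != 1)."""
    alpha = 1.0 - 1.0 / psi
    # Assumed terminal condition: only current consumption matters at date T.
    U = [a ** (1.0 / alpha) * c0 * up ** j * down ** (T - j) for j in range(T + 1)]
    for t in range(T - 1, -1, -1):
        nxt, U = U, []
        for j in range(t + 1):                      # j = number of up-moves so far
            c = c0 * up ** j * down ** (t - j)
            expect = p_up * nxt[j + 1] ** (1.0 - gamma) + (1.0 - p_up) * nxt[j] ** (1.0 - gamma)
            ce = expect ** (1.0 / (1.0 - gamma))    # CRRA certainty equivalent of U_{t+1}
            U.append((a * c ** alpha + b * ce ** alpha) ** (1.0 / alpha))
    return U[0]

def time_additive_v0(c0, up, down, T, b, gamma, p_up=0.5):
    """Sum over s of b^s * E[c_s^(1-gamma)] / (1-gamma) on the same tree."""
    growth = p_up * up ** (1.0 - gamma) + (1.0 - p_up) * down ** (1.0 - gamma)
    return sum(b ** s * c0 ** (1.0 - gamma) * growth ** s for s in range(T + 1)) / (1.0 - gamma)

gamma, a, b = 2.0, 0.7, 0.95        # illustrative parameters
u0 = epstein_zin_u0(1.0, 1.1, 0.95, 5, a, b, gamma, psi=1.0 / gamma)
v_from_recursion = u0 ** (1.0 - gamma) / (a * (1.0 - gamma))
v_time_additive = time_additive_v0(1.0, 1.1, 0.95, 5, b, gamma)
print("V_0 from the recursion with psi = 1/gamma:", v_from_recursion)
print("V_0 from time-additive power utility     :", v_time_additive)
```

With the elasticity of intertemporal substitution set to the reciprocal of the risk aversion, the two numbers should coincide up to rounding. Choosing a different `psi` in `epstein_zin_u0` breaks the equality, which is the separation of risk aversion from intertemporal substitution that the recursive specification allows.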
The Epstein-Zin preferences are characterized by three parameters: the relative risk aversion $\gamma$, the elasticity of intertemporal substitution $\psi$, and the subjective discount factor $b$. Relative to the standard time-additive power utility, the Epstein-Zin specification allows the relative risk aversion (attitudes towards atemporal risks) to be disentangled from the elasticity of intertemporal substitution (attitudes towards shifts in consumption over time). Moreover, Epstein and Zin (1989) show that when $\gamma > 1/\psi$, the individual will prefer early resolution of uncertainty. If $\gamma < 1/\psi$, late resolution of uncertainty is preferred. For the standard utility case $\gamma = 1/\psi$, the individual is indifferent about the timing of the resolution of uncertainty. Note that in the relevant case of $\gamma > 1$, the auxiliary parameter $\theta$ will be negative if and only if $\psi > 1$. Empirical studies disagree about reasonable values of $\psi$. Some studies find $\psi$ smaller than one (e.g. Campbell 1999), other studies find $\psi$ greater than one (e.g. Vissing-Jørgensen and Attanasio 2003).
The continuous-time equivalent of recursive utility is called stochastic differential utility and studied by, e.g., Duffie and Epstein (1992b). Note that generally recursive preferences are not consistent with expected utility since $U_t$ depends non-linearly on the probabilities of future consumption levels.
5.7.4 Two-good, multi-period utility
For studying some problems it is useful or even necessary to distinguish between different consumption goods. Until now we have implicitly assumed a single consumption good which is perishable in the sense that it cannot be stored. However, individuals spend large amounts on durable goods such as houses and cars. These goods provide utility to the individual beyond the period of purchase and can potentially be resold at a later date so that they also act as an investment. Another important good is leisure. Individuals have preferences both for consumption of physical goods and for leisure. A tractable two-good utility function is the Cobb-Douglas function:
$$u(c_1, c_2) = \frac{1}{1-\gamma} \left(c_1^{\psi} c_2^{1-\psi}\right)^{1-\gamma},$$
where $\psi \in [0, 1]$ determines the relative weighting of the two goods.
5.8 Exercises
EXERCISE 5.1 Give a proof of Theorem 5.3.
EXERCISE 5.2 (Adapted from Problem 3.3 in Kreps (1990).) Consider the following two probability distributions of consumption. $\pi_1$ gives 5, 15, and 30 (dollars) with probabilities 1/3, 5/9, and 1/9, respectively. $\pi_2$ gives 10 and 20 with probabilities 2/3 and 1/3, respectively.
(a) Show that we can think of $\pi_1$ as a two-step gamble, where the first gamble is identical to $\pi_2$. If the outcome of the first gamble is 10, then the second gamble gives you an additional 5 (total 15) with probability 1/2 and an additional $-5$ (total 5) also with probability 1/2. If the outcome of the first gamble is 20, then the second gamble gives you an additional 10 (total 30) with probability 1/3 and an additional $-5$ (total 15) with probability 2/3.
(b) Observe that the second gamble has mean zero and that $\pi_1$ is equal to $\pi_2$ plus mean-zero noise. Conclude that any risk-averse expected utility maximizer will prefer $\pi_2$ to $\pi_1$.
EXERCISE 5.3 (Adapted from Chapter 3 in Kreps (1990).) Imagine a greedy, risk-averse, expected utility maximizing consumer whose end-of-period income level is subject to some uncertainty. The income will be $Y$ with probability $p$ and $Y - \Delta$ with probability $1 - p$, where we can think of $\Delta$ as some loss the consumer might incur due to an accident. An insurance company is willing to insure against this loss by paying $\Delta$ to the consumer if she sustains the loss. In return, the company wants an upfront premium of $\delta$. The consumer may choose partial coverage in the sense that if she pays a premium of $a\delta$, she will receive $a\Delta$ if she sustains the loss. Let $u$ denote the von Neumann-Morgenstern utility function of the consumer. Assume for simplicity that the premium is paid at the end of the period.
(a) Show that the first order condition for the choice of $a$ is
$$p\,\delta\,u'(Y - a\delta) = (1 - p)(\Delta - \delta)\,u'\left(Y - (1 - a)\Delta - a\delta\right).$$
(b) Show that if the insurance is actuarially fair in the sense that the expected payout $(1 - p)\Delta$ equals the premium $\delta$, then the consumer will purchase full insurance, i.e. $a = 1$ is optimal.
(c) Show that if the insurance is actuarially unfair, meaning $(1 - p)\Delta < \delta$, then the consumer will purchase partial insurance, i.e. the optimal $a$ is less than 1.
EXERCISE 5.4 Consider a one-period choice problem with four equally likely states of the world at the end of the period. The consumer maximizes expected utility of end-of-period wealth. The current wealth must be invested in a single financial asset today. The consumer has three assets to choose from. All three assets have a current price equal to the current wealth of the consumer. The assets have the following end-of-period values:
state 1 2 3 4
probability 0.25 0.25 0.25 0.25
asset 1 100 100 100 100
asset 2 81 100 100 144
asset 3 36 100 100 225
(a) What asset would a risk-neutral individual choose?
(b) What asset would a power utility investor, $u(W) = \frac{1}{1-\gamma} W^{1-\gamma}$, choose if $\gamma = 0.5$? If $\gamma = 2$? If $\gamma = 5$?
Now assume a power utility with $\gamma = 0.5$.
(c) Suppose the individual could obtain a perfect signal about the future state before she makes her asset choice. There are thus four possible signals, which we can represent by $s_1 = \{1\}$, $s_2 = \{2\}$, $s_3 = \{3\}$, and $s_4 = \{4\}$. What is the optimal asset choice for each signal? What is her expected utility before she receives the signal, assuming that the signals have equal probability?
(d) Now suppose that the individual can receive a less-than-perfect signal telling her whether the state is in $s_1 = \{1, 4\}$ or in $s_2 = \{2, 3\}$. The two possible signals are equally likely. What is the expected utility of the investor before she receives the signal?
EXERCISE 5.5 Consider an individual with log utility, $u(c) = \ln c$. What is her certainty equivalent and risk premium for the consumption plan which with probability 0.5 gives her $(1 - \alpha)\bar c$ and with probability 0.5 gives her $(1 + \alpha)\bar c$? Confirm that your results are consistent with the numbers for $\gamma = 1$ shown in Table 5.5.
EXERCISE 5.6 Use Equation (5.5) to compute approximate relative risk premia for the
consumption gamble underlying Table 5.5 and compare with the exact numbers given in the table.
CHAPTER 6
Individual optimality
6.1 Introduction
Chapter 4 discussed how the general pricing mechanism of a financial market can be represented by a state-price deflator. Given a state-price deflator we can price all state-contingent dividends. Conversely, given the market prices of state-contingent dividends we can extract one or several state-price deflators. Market prices and hence the state-price deflator(s) are determined by the supply and demand of the individuals in the economy. Therefore, we have to study the portfolio decisions of individuals. This is the topic of the present chapter. In the next chapter we will then discuss market equilibrium.
Section 6.2 studies the maximization problem of an individual under various preference specifications in the one-period setting. Sections 6.3 and 6.4 extend the analysis to the discrete-time and the continuous-time framework, respectively. The main result of these sections is that the (marginal utility of) optimal consumption of any individual induces a valid state-price deflator, which gives us a link between individual optimality and asset prices. This is the cornerstone of the consumption-based asset pricing models studied in Chapter 8. Section 6.5 introduces the dynamic programming approach to the solution of multi-period utility maximization problems. In particular, we derive the so-called envelope condition that links marginal utility of consumption to marginal utility of optimal investments. In this way the state-price deflator is related to the optimally invested wealth of the individual plus some state variables capturing other information affecting the decisions of the individual. This will be useful for the factor pricing models studied in Chapter 9.
6.2 The one-period framework
In the one-period framework the individual consumes at time 0 (the beginning of the period) and
at time 1 (the end of the period). We denote time 0 consumption by $c_0$ and the state-dependent
time 1 consumption by the random variable $c$. The individual has some initial wealth $e_0 \geq 0$ at
time 0 and may receive a non-negative state-dependent endowment (income) at time 1 represented
by the random variable $e$. The individual picks a portfolio $\theta$ at time 0 with a time 0 price of
$P^\theta = \theta \cdot P = \sum_{i=1}^I \theta_i P_i$, assuming the Law of One Price, and a time 1 random dividend of
$D^\theta = \theta \cdot D = \sum_{i=1}^I \theta_i D_i$. The budget constraints are therefore
\[ c_\omega \leq e_\omega + D^\theta_\omega = e_\omega + \sum_{i=1}^I D_{i\omega} \theta_i, \quad \text{for all } \omega \in \Omega, \tag{6.1} \]
\[ c_0 \leq e_0 - P^\theta = e_0 - \sum_{i=1}^I \theta_i P_i. \tag{6.2} \]
The individual can choose the consumption plan and the portfolio. Since we will always assume
that individuals prefer more consumption to less, it is clear that the budget constraints will hold
as equalities. Therefore we can think of the individual choosing only the portfolio and then the
consumption plan follows from the budget constraints above.
Consumption has to be non-negative both at time 0 and in all states at time 1 so we should add
such constraints when looking for the optimal strategy. However, we assume throughout that the
individual has infinite marginal utility at zero consumption so that the non-negativity constraints
are automatically satisfied and can be ignored. We assume that the preferences are concave so that
first-order conditions provide the optimal choice. When solving the problem we will also assume
that the individual acts as a price taker so that prices are unaffected by her portfolio choice. We
assume that prices admit no arbitrage. If there were an arbitrage, it would be possible to obtain
infinite consumption. We do not impose any constraints on the portfolios the individual may
choose among.
The following subsections characterize the optimal consumption plan for various preference
specifications.
6.2.1 Time-additive expected utility
With time-additive expected utility there is a single-date utility function $u : \mathbb{R}_+ \to \mathbb{R}$ such that
the objective of the individual is
\[ \max_\theta \; u(c_0) + E\left[ e^{-\delta} u(c) \right], \tag{6.3} \]
where $\delta$ is a subjective time preference rate. Substituting in the budget constraints, we get
\[ \max_\theta \; u\left( e_0 - \sum_{i=1}^I \theta_i P_i \right) + E\left[ e^{-\delta} u\left( e + \sum_{i=1}^I \theta_i D_i \right) \right]. \]
The first-order condition with respect to $\theta_i$ is
\[ -P_i u'\left( e_0 - \sum_{i=1}^I \theta_i P_i \right) + E\left[ e^{-\delta} D_i u'\left( e + \sum_{i=1}^I \theta_i D_i \right) \right] = 0 \]
or
\[ P_i u'(c_0) = E\left[ e^{-\delta} D_i u'(c) \right], \tag{6.4} \]
where $c_0$ and $c$ denote the optimal consumption plan, i.e. the consumption plan generated by the
optimal portfolio $\theta$. We can rewrite the above equation as
\[ P_i = E\left[ e^{-\delta} \frac{u'(c)}{u'(c_0)} D_i \right]. \tag{6.5} \]
This equation links prices to the optimal consumption plan of an individual investor.
Comparing (6.5) and the pricing condition in the definition of a state-price deflator it is clear
that
\[ \zeta = e^{-\delta} \frac{u'(c)}{u'(c_0)} \tag{6.6} \]
defines a state-price deflator. It is positive since marginal utilities are positive. This state-price
deflator is the marginal rate of substitution of the individual capturing the willingness of the
individual to substitute a bit of time 0 consumption for some time 1 consumption.
The optimality condition (6.5) can also be justified by a variational argument, which goes as
follows. Assume that $(c_0, c)$ denotes the optimal consumption plan for the individual. Then no
deviation from this plan can give the individual a higher utility. One deviation is obtained by
investing in $\varepsilon > 0$ additional units of asset $i$ at the beginning of the period. This leaves an initial
consumption of $c_0 - \varepsilon P_i$. On the other hand, the end-of-period consumption in state $\omega$ becomes
$c_\omega + \varepsilon D_{i\omega}$. We know that
\[ u(c_0 - \varepsilon P_i) + e^{-\delta} E[u(c + \varepsilon D_i)] \leq u(c_0) + e^{-\delta} E[u(c)]. \]
Subtracting the right-hand side from the left-hand side and dividing by $\varepsilon$ yields
\[ \frac{u(c_0 - \varepsilon P_i) - u(c_0)}{\varepsilon} + e^{-\delta} E\left[ \frac{u(c + \varepsilon D_i) - u(c)}{\varepsilon} \right] \leq 0. \]
Letting $\varepsilon$ go to zero, the fractions on the left-hand side approach derivatives$^1$ and we obtain
\[ -P_i u'(c_0) + e^{-\delta} E[u'(c) D_i] \leq 0, \]
which implies that
\[ P_i \geq E\left[ e^{-\delta} \frac{u'(c)}{u'(c_0)} D_i \right]. \]
On the other hand, if we consider selling $\varepsilon > 0$ units of asset $i$ at the beginning of the period, the
same reasoning can be used to show that
\[ P_i \leq E\left[ e^{-\delta} \frac{u'(c)}{u'(c_0)} D_i \right]. \]
Hence, the relation must hold as an equality, just as in (6.5).
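Example 6.1 For the case of CRRA utility, $u(c) = \frac{1}{1-\gamma} c^{1-\gamma}$, we have $u'(c) = c^{-\gamma}$. Therefore the
optimal consumption plan satisfies
\[ P_i = E\left[ e^{-\delta} \left( \frac{c}{c_0} \right)^{-\gamma} D_i \right], \tag{6.7} \]
and the state-price deflator derived from the individual's problem is
\[ \zeta = e^{-\delta} \left( \frac{c}{c_0} \right)^{-\gamma}. \tag{6.8} \]
□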
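The pricing relation for CRRA utility and the variational argument behind it can be checked numerically. The Python sketch below is only an illustration: the three states, their probabilities, the consumption plan, the dividend vector, the parameter values and all function names are made-up assumptions, not taken from the text. It prices an asset with the CRRA state-price deflator $e^{-\delta}(c/c_0)^{-\gamma}$ and then confirms that buying or selling a small amount of the asset at that price lowers total utility.

```python
import math

def crra_utility(c, gamma):
    """CRRA utility u(c) = c^(1-gamma)/(1-gamma); log utility if gamma == 1."""
    if gamma == 1.0:
        return math.log(c)
    return c ** (1.0 - gamma) / (1.0 - gamma)

def crra_deflator(c0, c, gamma, delta):
    """State-price deflator zeta_w = exp(-delta) * (c_w / c0)^(-gamma), state by state."""
    return [math.exp(-delta) * (cw / c0) ** (-gamma) for cw in c]

def price(dividend, deflator, probs):
    """Price P = E[zeta * D] on a finite state space."""
    return sum(p * z * d for p, z, d in zip(probs, deflator, dividend))

def total_utility(c0, c, gamma, delta, probs):
    """Time-additive expected utility u(c0) + E[exp(-delta) u(c)]."""
    expected = sum(p * crra_utility(cw, gamma) for p, cw in zip(probs, c))
    return crra_utility(c0, gamma) + math.exp(-delta) * expected

# Hypothetical inputs (assumptions for illustration only).
probs = [0.25, 0.5, 0.25]        # state probabilities
c0, c = 1.0, [0.9, 1.05, 1.2]    # consumption plan taken to be optimal
gamma, delta = 2.0, 0.02         # risk aversion and time preference rate
dividend = [0.5, 1.0, 2.0]       # state-contingent dividend of some asset i

zeta = crra_deflator(c0, c, gamma, delta)
P = price(dividend, zeta, probs)

def deviated_utility(eps):
    """Utility after buying eps extra units of the asset at price P (eps < 0: selling)."""
    c0_new = c0 - eps * P
    c_new = [cw + eps * d for cw, d in zip(c, dividend)]
    return total_utility(c0_new, c_new, gamma, delta, probs)

print("price:", P)
print("utility change, buy :", deviated_utility(0.01) - deviated_utility(0.0))
print("utility change, sell:", deviated_utility(-0.01) - deviated_utility(0.0))
```

Both utility changes come out negative because the utility function is strictly concave and the price is set so that the first-order condition holds exactly at the given plan.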
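$^1$ By definition, the derivative of a function $f(x)$ is $f'(x) = \lim_{\varepsilon \to 0} \frac{f(x+\varepsilon) - f(x)}{\varepsilon}$. Hence, for a constant $k$,
$\lim_{\varepsilon \to 0} \frac{f(x+k\varepsilon) - f(x)}{\varepsilon} = k \lim_{\varepsilon \to 0} \frac{f(x+k\varepsilon) - f(x)}{k\varepsilon} = k \lim_{k\varepsilon \to 0} \frac{f(x+k\varepsilon) - f(x)}{k\varepsilon} = k f'(x)$.
For the first utility quotient, $\lim_{\varepsilon \to 0} \frac{u(c_0 - \varepsilon P_i) - u(c_0)}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{-P_i u'(c_0 - \varepsilon P_i)}{1} = -P_i u'(c_0)$, and analogously for the other utility
quotient.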
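6.2.2 Non-additive expected utility
Next consider non-additive expected utility where the objective is to maximize $E[U(c_0, c)]$ for some
$U : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}$. Again, substituting in the budget constraint, the problem is
\[ \max_\theta \; E\left[ U\left( e_0 - \sum_{i=1}^I \theta_i P_i, \; e + \sum_{i=1}^I \theta_i D_i \right) \right]. \tag{6.9} \]
The first-order condition with respect to $\theta_i$ is
\[ -P_i E\left[ \frac{\partial U}{\partial c_0}\left( e_0 - \sum_{i=1}^I \theta_i P_i, \; e + \sum_{i=1}^I \theta_i D_i \right) \right] + E\left[ D_i \frac{\partial U}{\partial c}\left( e_0 - \sum_{i=1}^I \theta_i P_i, \; e + \sum_{i=1}^I \theta_i D_i \right) \right] = 0 \]
which implies that
\[ P_i = E\left[ \frac{\frac{\partial U}{\partial c}(c_0, c)}{E\left[ \frac{\partial U}{\partial c_0}(c_0, c) \right]} D_i \right], \tag{6.10} \]
so that the corresponding state-price deflator is
\[ \zeta = \frac{\frac{\partial U}{\partial c}(c_0, c)}{E\left[ \frac{\partial U}{\partial c_0}(c_0, c) \right]}. \tag{6.11} \]
This could be supported by a variational argument as in the case of time-additive expected utility.
Again note that these equations hold for the optimal consumption plan.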
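Example 6.2 Consider the very simple habit-style utility function
\[ U(c_0, c) = \frac{1}{1-\gamma} c_0^{1-\gamma} + \frac{1}{1-\gamma} e^{-\delta} (c - \beta c_0)^{1-\gamma}. \]
The subsistence level for time 1 consumption is some fraction $\beta$ of time 0 consumption. In this
case the relevant marginal utilities are
\[ \frac{\partial U}{\partial c_0}(c_0, c) = c_0^{-\gamma} - \beta e^{-\delta} (c - \beta c_0)^{-\gamma}, \]
\[ \frac{\partial U}{\partial c}(c_0, c) = e^{-\delta} (c - \beta c_0)^{-\gamma}. \]
The first-order condition therefore implies that
\[ P_i = E\left[ \frac{e^{-\delta} (c - \beta c_0)^{-\gamma}}{E\left[ c_0^{-\gamma} - \beta e^{-\delta} (c - \beta c_0)^{-\gamma} \right]} D_i \right], \tag{6.12} \]
and the associated state-price deflator is
\[ \zeta = \frac{e^{-\delta} (c - \beta c_0)^{-\gamma}}{E\left[ c_0^{-\gamma} - \beta e^{-\delta} (c - \beta c_0)^{-\gamma} \right]}. \tag{6.13} \]
This simple example indicates that (internal) habit formation leads to pricing expressions that are
considerably more complicated than time-additive utility. □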
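6.2.3 A general utility index
A general utility index $U$ is tractable for a finite state space where we can write the objective as
\[ \max_\theta \; U(c_0, c_1, \dots, c_S), \]
where $c_\omega = e_\omega + \sum_{i=1}^I D_{i\omega} \theta_i$ so the first-order condition implies that
\[ P_i = \sum_{\omega=1}^S \frac{\frac{\partial U}{\partial c_\omega}(c_0, c_1, \dots, c_S)}{\frac{\partial U}{\partial c_0}(c_0, c_1, \dots, c_S)} D_{i\omega}. \tag{6.14} \]
This defines a state-price vector $\psi$ by
\[ \psi_\omega = \frac{\frac{\partial U}{\partial c_\omega}(c_0, c_1, \dots, c_S)}{\frac{\partial U}{\partial c_0}(c_0, c_1, \dots, c_S)}. \]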
6.2.4 A two-step procedure in a complete market
In a complete market we can separate the consumption and the portfolio decision as follows:
1. find the optimal consumption plan given the budget constraints,
2. find the portfolio financing the optimal consumption plan; such a portfolio will exist when
the market is complete.
Let us show this in the case of a finite state space. Suppose the market is complete and let $\psi$
denote the unique state-price vector. The individual can obtain any dividend vector $D$ at the cost
of $\psi \cdot D$. Hence the individual can first solve the problem
\[ \max_{c_0, c, D} \; U(c_0, c) \tag{6.15} \]
\[ \text{s.t.} \quad c \leq e + D, \tag{6.16} \]
\[ c_0 \leq e_0 - \psi \cdot D, \tag{6.17} \]
\[ c_0, c \geq 0, \]
for the optimal consumption plan. Here (6.16) is a vector inequality, which means that the
inequality should hold element by element, i.e.
\[ c_\omega \leq e_\omega + D_\omega, \quad \omega = 1, \dots, S. \]
In fact we can eliminate the dividends from the problem. Multiplying (6.16) by $\psi$, we get
\[ \psi \cdot c \leq \psi \cdot e + \psi \cdot D. \]
Adding this to (6.17), we see that any feasible consumption plan $(c_0, c)$ must satisfy
\[ c_0 + \psi \cdot c \leq e_0 + \psi \cdot e. \tag{6.18} \]
This is natural, since the left-hand side is the present value of the consumption plan and the
right-hand side is the present value of the endowment, which is well-defined since market completeness
ensures that some portfolio will provide a dividend identical to the endowment. Conversely, suppose
a consumption plan $(c_0, c)$ satisfies (6.18). Then it will also satisfy the conditions (6.16) and (6.17)
with $D = c - e$. Thus we can find the utility maximizing consumption plan by solving
\[ \max_{c_0, c} \; U(c_0, c) \tag{6.19} \]
\[ \text{s.t.} \quad c_0 + \psi \cdot c \leq e_0 + \psi \cdot e, \]
\[ c_0, c \geq 0, \]
and we will still assume that the non-negativity constraint will be automatically satisfied. The
Lagrangian for the problem is therefore
\[ \mathcal{L} = U(c_0, c) + \nu \left( e_0 + \psi \cdot e - c_0 - \psi \cdot c \right), \]
where $\nu$ is the Lagrange multiplier. The first-order conditions are
\[ \frac{\partial U}{\partial c_0}(c_0, c) = \nu, \qquad \frac{\partial U}{\partial c_\omega}(c_0, c) = \nu \psi_\omega. \]
In particular, the optimal consumption plan satisfies
\[ \frac{\frac{\partial U}{\partial c_\omega}(c_0, c)}{\frac{\partial U}{\partial c_0}(c_0, c)} = \psi_\omega. \]
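Given the chosen consumption plan $c$ and the future income $e$, we can back out the portfolio by
solving for a portfolio $\theta$ whose dividend $D^\theta$ equals $c - e$. With time-additive expected utility,
$U(c_0, c) = u(c_0) + E[e^{-\delta} u(c)]$, the first-order conditions give
\[ e^{-\delta} \frac{u'(c)}{u'(c_0)} = \zeta, \]
where $\zeta_\omega = \psi_\omega / p_\omega$ is the state-price deflator associated with $\psi$ and $p_\omega$ is the probability of state $\omega$,
as found earlier.
If the market is incomplete, the individual cannot implement any consumption plan but only
those that can be financed by portfolios of traded assets. Therefore, we cannot apply the above
technique.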
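The two-step procedure can be illustrated with a small numerical sketch. Everything specific below is an assumption made for the example: a two-state, two-asset complete market, the prices and dividends, time-additive CRRA utility, and the helper names. Under CRRA utility the first-order conditions can be solved in closed form, $c_\omega = c_0 \left( e^{\delta} \psi_\omega / p_\omega \right)^{-1/\gamma}$, and the present-value budget constraint then pins down $c_0$. Step two backs out the financing portfolio from a linear system.

```python
import math

# Hypothetical two-state, two-asset complete market (all numbers are assumptions).
probs = [0.5, 0.5]             # state probabilities p_w
dividends = [[1.0, 1.0],       # asset 1: same payoff in both states
             [0.5, 2.0]]       # asset 2: risky payoff
prices = [0.95, 1.10]          # time 0 prices P_i
e0, e = 1.0, [0.2, 0.4]        # initial wealth and time 1 endowment
gamma, delta = 2.0, 0.03       # CRRA coefficient and time preference rate

def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]

# Unique state-price vector: P_i = sum_w psi_w D_iw for each asset i.
psi = solve_2x2(dividends, prices)

# Step 1: optimal consumption from the first-order conditions and the budget constraint.
growth = [(math.exp(delta) * s / p) ** (-1.0 / gamma) for s, p in zip(psi, probs)]
wealth = e0 + sum(s * ew for s, ew in zip(psi, e))          # e0 + psi . e
c0 = wealth / (1.0 + sum(s * g for s, g in zip(psi, growth)))
c = [c0 * g for g in growth]

# Step 2: portfolio theta with sum_i theta_i D_iw = c_w - e_w in each state w.
payoff_by_state = [[dividends[0][0], dividends[1][0]],
                   [dividends[0][1], dividends[1][1]]]
theta = solve_2x2(payoff_by_state, [cw - ew for cw, ew in zip(c, e)])

print("state prices:", psi)
print("consumption plan:", c0, c)
print("financing portfolio:", theta)
```

The portfolio found in step two costs exactly $e_0 - c_0$, so the time 0 budget constraint holds as an equality, and the marginal rate of substitution of the computed plan reproduces $\psi_\omega / p_\omega$ state by state.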
6.2.5 Optimal portfolios and mean-variance analysis
We have not explicitly solved for the optimal portfolio. Although it is certainly relevant to study
the portfolio decisions of individuals in more detail, it is not necessary for asset pricing
purposes. The most popular one-period model for portfolio choice is the mean-variance model
introduced by Markowitz (1952, 1959). If the individual has mean-variance preferences, cf. Section 5.6,
her optimal portfolio will be a mean-variance efficient portfolio corresponding to a point on the
upward-sloping branch of the mean-variance frontier. Mean-variance analysis does not by itself
say anything about exactly which portfolio a given individual should choose but if we assume a
given mean-variance utility function, the optimal portfolio can be derived. Note, however, that
the conditions justifying mean-variance portfolio choice are highly unrealistic: either returns must
be normally distributed or individuals must have mean-variance preferences. Nevertheless, the
mean-variance frontier remains an important concept in both portfolio choice and asset pricing
(recall Theorem 4.6). Below we will use the traditional Lagrangian approach to characterize the
mean-variance efficient portfolios. In Section 9.4 we will offer an alternative "orthogonal"
characterization and use that to show the link between mean-variance returns, pricing factors, and
state-price deflators.
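We assume as before that $I$ assets are traded and let $R = (R_1, \dots, R_I)^\top$ denote the vector of gross
returns on these assets, with expectation $\mu = E[R]$ and variance-covariance matrix $\Sigma = \mathrm{Var}[R]$. A
portfolio is represented by a vector $\pi = (\pi_1, \dots, \pi_I)^\top$ with $\pi^\top 1 = 1$, where $\pi_i$
is the fraction of total portfolio value invested in asset $i$ and $1$ denotes a vector of ones. The gross
return on a portfolio $\pi$ is $R^\pi = \pi^\top R = \sum_{i=1}^I \pi_i R_i$, cf. (3.7). The expectation and the variance of
the return on a portfolio $\pi$ are
\[ E[R^\pi] = E\left[ \sum_{i=1}^I \pi_i R_i \right] = \sum_{i=1}^I \pi_i E[R_i] = \sum_{i=1}^I \pi_i \mu_i = \pi^\top \mu, \]
\[ \mathrm{Var}[R^\pi] = \mathrm{Var}\left[ \sum_{i=1}^I \pi_i R_i \right] = \sum_{i=1}^I \sum_{j=1}^I \pi_i \pi_j \mathrm{Cov}[R_i, R_j] = \pi^\top \Sigma \pi. \]
A portfolio $\pi$ is then mean-variance efficient if there is an $m \in \mathbb{R}$ so that $\pi$ solves
\[ \min_\pi \; \pi^\top \Sigma \pi \quad \text{s.t.} \quad \pi^\top \mu = m, \quad \pi^\top 1 = 1, \tag{6.21} \]
i.e. $\pi$ has the lowest return variance among all portfolios with expected return $m$.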
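Risky assets only
Assume that $\Sigma$ is positive definite, i.e. that $\pi^\top \Sigma \pi > 0$ for all $\pi \neq 0$, so that every portfolio is
risky. Then $\Sigma$ is non-singular and $\Sigma^{-1}$ is also positive definite. The Lagrangian associated with the
constrained minimization problem (6.21) is
\[ \mathcal{L} = \pi^\top \Sigma \pi + \alpha \left( m - \pi^\top \mu \right) + \beta \left( 1 - \pi^\top 1 \right), \]
where $\alpha$ and $\beta$ are Lagrange multipliers. The first-order condition with respect to $\pi$ is
\[ \frac{\partial \mathcal{L}}{\partial \pi} = 2 \Sigma \pi - \alpha \mu - \beta 1 = 0, \]
which implies that
\[ \pi = \frac{1}{2} \alpha \Sigma^{-1} \mu + \frac{1}{2} \beta \Sigma^{-1} 1. \tag{6.22} \]
The first-order conditions with respect to the multipliers simply give the two constraints to the
minimization problem. Substituting the expression (6.22) for $\pi$ into the two constraints, we obtain
the equations
\[ \alpha \mu^\top \Sigma^{-1} \mu + \beta 1^\top \Sigma^{-1} \mu = 2m, \]
\[ \alpha \mu^\top \Sigma^{-1} 1 + \beta 1^\top \Sigma^{-1} 1 = 2. \]
Defining the numbers $A$, $B$, $C$, $D$ by
\[ A = \mu^\top \Sigma^{-1} \mu, \quad B = \mu^\top \Sigma^{-1} 1 = 1^\top \Sigma^{-1} \mu, \quad C = 1^\top \Sigma^{-1} 1, \quad D = AC - B^2, \]
we can write the solution to the two equations in $\alpha$ and $\beta$ as
\[ \alpha = 2 \frac{Cm - B}{D}, \qquad \beta = 2 \frac{A - Bm}{D}. \]
Substituting this into (6.22) we obtain
\[ \pi = \pi(m) \equiv \frac{Cm - B}{D} \Sigma^{-1} \mu + \frac{A - Bm}{D} \Sigma^{-1} 1. \tag{6.23} \]
This is the mean-variance efficient portfolio with expected gross return $m$. Some tedious
calculations show that the variance of the return on this portfolio is equal to
\[ \sigma^2(m) \equiv \pi(m)^\top \Sigma \pi(m) = \frac{Cm^2 - 2Bm + A}{D}. \tag{6.24} \]
This is to be verified in Exercise 6.4. Equation (6.24) shows that the combinations of variance and
mean form a parabola in a (mean, variance)-diagram.
Traditionally the portfolios are depicted in a (standard deviation, mean)-diagram. The above
relation can also be written as
\[ \frac{\sigma^2(m)}{1/C} - \frac{(m - B/C)^2}{D/C^2} = 1, \]
from which it follows that the optimal combinations of standard deviation and mean form a
hyperbola in the (standard deviation, mean)-diagram. This hyperbola is called the mean-variance
frontier of risky assets. The mean-variance efficient portfolios are sometimes called frontier
portfolios.
Before we proceed let us clarify a point in the derivation above. We need to divide by $D$ so $D$
has to be non-zero. In fact, $D > 0$. To see this, first note that since $\Sigma$ and therefore $\Sigma^{-1}$ are
positive definite, we have $A > 0$ and $C > 0$. Moreover,
\[ AD = A(AC - B^2) = (B\mu - A1)^\top \Sigma^{-1} (B\mu - A1) > 0, \]
again using that $\Sigma^{-1}$ is positive definite. Since $A > 0$, we must have $D > 0$.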
The (global) minimum-variance portfolio is the portfolio that has the minimum variance
among all portfolios. We can find this directly by solving the constrained minimization problem
\[ \min_\pi \; \pi^\top \Sigma \pi \quad \text{s.t.} \quad \pi^\top 1 = 1 \tag{6.25} \]
where there is no constraint on the expected portfolio return. Alternatively, we can minimize
the variance $\sigma^2(m)$ in (6.24) over all $m$. Taking the latter route, we find that the minimum
variance is obtained when the mean return is $m_{\min} = B/C$ and the minimum variance is given by
$\sigma^2_{\min} = \sigma^2(m_{\min}) = 1/C$. From (6.23) we get that the minimum-variance portfolio is
\[ \pi_{\min} = \frac{1}{C} \Sigma^{-1} 1 = \frac{1}{1^\top \Sigma^{-1} 1} \Sigma^{-1} 1. \tag{6.26} \]
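It will be useful to consider the problem
\[ \max_\pi \; \frac{\pi^\top \mu - \alpha}{\left( \pi^\top \Sigma \pi \right)^{1/2}} \quad \text{s.t.} \quad \pi^\top 1 = 1, \tag{6.27} \]
where $\alpha$ is some constant. The denominator in the objective is clearly the standard deviation of
the return of the portfolio, while the numerator is the expected return in excess of $\alpha$. This ratio is
the slope of a straight line in the (standard deviation, mean)-diagram that goes through the point
$\left( (\pi^\top \Sigma \pi)^{1/2}, \pi^\top \mu \right)$ corresponding to the portfolio $\pi$ and intersects the mean-axis at $\alpha$. Applying
the constraint, the objective function can be rewritten as
\[ f(\pi) = \frac{\pi^\top (\mu - \alpha 1)}{\left( \pi^\top \Sigma \pi \right)^{1/2}} = \pi^\top (\mu - \alpha 1) \left( \pi^\top \Sigma \pi \right)^{-1/2}. \]
The derivative is
\[ \frac{\partial f}{\partial \pi} = (\mu - \alpha 1) \left( \pi^\top \Sigma \pi \right)^{-1/2} - \left( \pi^\top \Sigma \pi \right)^{-3/2} \pi^\top (\mu - \alpha 1) \, \Sigma \pi \]
and $\frac{\partial f}{\partial \pi} = 0$ implies that
\[ \frac{\pi^\top (\mu - \alpha 1)}{\pi^\top \Sigma \pi} \, \pi = \Sigma^{-1} (\mu - \alpha 1), \tag{6.28} \]
which we want to solve for $\pi$. Note that the equation has a vector on each side. If two vectors are
identical, they will also be identical after a division by the sum of the elements of the vector. The
sum of the elements of the vector on the left-hand side of (6.28) is
\[ 1^\top \left( \frac{\pi^\top (\mu - \alpha 1)}{\pi^\top \Sigma \pi} \, \pi \right) = \frac{\pi^\top (\mu - \alpha 1)}{\pi^\top \Sigma \pi}, \]
where the last equality is due to the constraint. The sum of the elements of the vector on the
right-hand side of (6.28) is simply $1^\top \Sigma^{-1} (\mu - \alpha 1)$. Dividing each side of (6.28) with the sum of
the elements we obtain the portfolio
\[ \pi = \frac{\Sigma^{-1} (\mu - \alpha 1)}{1^\top \Sigma^{-1} (\mu - \alpha 1)}. \tag{6.29} \]
In particular, by letting $\alpha = 0$, we can see that the portfolio
\[ \pi_{\text{slope}} = \frac{1}{1^\top \Sigma^{-1} \mu} \Sigma^{-1} \mu = \frac{1}{B} \Sigma^{-1} \mu \tag{6.30} \]
is the portfolio that maximizes the slope of a straight line between the origin and a point on
the mean-variance frontier in the (standard deviation, mean)-diagram. Let us call $\pi_{\text{slope}}$ the
maximum-slope portfolio. This portfolio has mean $A/B$ and variance $A/B^2$.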
From (6.23) we see that any mean-variance efficient portfolio can be written as a linear
combination of the maximum-slope portfolio and the minimum-variance portfolio:
\[ \pi(m) = \frac{(Cm - B)B}{D} \pi_{\text{slope}} + \frac{(A - Bm)C}{D} \pi_{\min}. \tag{6.31} \]
Note that the two multipliers of the portfolios sum to one. This is a two-fund separation result.
Any mean-variance efficient portfolio is a combination of two special portfolios or funds, namely
the maximum-slope portfolio and the minimum-variance portfolio. These two portfolios are said
to generate the mean-variance frontier of risky assets. In fact, it can be shown that any other two
frontier portfolios generate the entire frontier.
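The frontier formulas above lend themselves to a quick numerical check. The following Python sketch is a minimal illustration under stated assumptions: the expected gross returns `mu`, the covariance matrix `Sigma`, the target mean and all helper names are invented for the example. It uses a small hand-written Gaussian elimination instead of a linear algebra library so that it runs with the standard library alone. It computes $A$, $B$, $C$, $D$, the frontier portfolio for a target mean $m$, the minimum-variance and maximum-slope portfolios, and the two-fund decomposition.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical inputs: expected gross returns and covariance matrix of three risky assets.
mu = [1.05, 1.08, 1.12]
Sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
ones = [1.0, 1.0, 1.0]

Sinv_mu = solve(Sigma, mu)       # Sigma^{-1} mu
Sinv_one = solve(Sigma, ones)    # Sigma^{-1} 1
A = dot(mu, Sinv_mu)
B = dot(mu, Sinv_one)
C = dot(ones, Sinv_one)
D = A * C - B * B

def variance(pi):
    """Return variance pi' Sigma pi of a portfolio."""
    return dot(pi, [dot(row, pi) for row in Sigma])

def frontier_portfolio(m):
    """Minimum-variance portfolio among those with expected gross return m."""
    a = (C * m - B) / D
    b = (A - B * m) / D
    return [a * x + b * y for x, y in zip(Sinv_mu, Sinv_one)]

pi_min = [y / C for y in Sinv_one]     # global minimum-variance portfolio
pi_slope = [x / B for x in Sinv_mu]    # maximum-slope portfolio

m = 1.10
pi = frontier_portfolio(m)
w = (C * m - B) * B / D                # weight on the maximum-slope fund
two_fund = [w * s + (1.0 - w) * q for s, q in zip(pi_slope, pi_min)]

print("frontier portfolio:", pi)
print("variance:", variance(pi), "formula:", (C * m * m - 2 * B * m + A) / D)
print("two-fund combination:", two_fund)
```

The frontier portfolio and the two-fund combination agree up to rounding error, and the variance of the frontier portfolio matches the closed-form parabola in $m$.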
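The following result is both interesting and useful. Let $R^\pi$ denote the return on a mean-variance
efficient portfolio other than the minimum-variance portfolio. Then there exists a unique
mean-variance efficient return $R^{z(\pi)}$ such that $\mathrm{Cov}[R^\pi, R^{z(\pi)}] = 0$. The
return $R^{z(\pi)}$ is called the zero-beta return for $R^\pi$. One can show that
\[ E[R^{z(\pi)}] = \frac{A - B \, E[R^\pi]}{B - C \, E[R^\pi]} \]
and that the tangent to the mean-variance frontier at the point corresponding to $R^\pi$ will intersect
the expected return axis exactly in $E[R^{z(\pi)}]$. These results are to be shown in Exercise 6.5.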
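Allowing for a risk-free asset
Now let us allow for a risk-free asset with a gross return of $R^f$. The risk-free asset corresponds
to the point $(0, R^f)$ in the (standard deviation, mean)-diagram. Either the risk-free asset is one
of the $I$ basic assets or it can be constructed as a portfolio of the basic assets. Without loss of
generality we can assume that the risk-free asset is the $I$th basic asset. The remaining $M \equiv I - 1$
basic assets are risky. Let $\tilde{R} = (R_1, \dots, R_M)^\top$ denote the gross return vector of the risky assets
with expectation $\tilde{\mu} = E[\tilde{R}]$ and variance $\tilde{\Sigma} = \mathrm{Var}[\tilde{R}]$, which is assumed to be positive definite. A
portfolio of all $I$ assets can be represented by an $M$-dimensional vector $\tilde{\pi}$ of weights in the risky
assets, so that $\pi_f = 1 - \tilde{\pi}^\top 1$ is the fraction invested in
the risk-free asset. A portfolio involving only the risky assets is an $M$-dimensional vector $\tilde{\pi}$ with
$\tilde{\pi}^\top 1 = 1$.
Fix for a moment a portfolio $\tilde{\pi}$ of risky assets only. The gross return on this portfolio is
$R^{\tilde{\pi}} = \tilde{\pi}^\top \tilde{R}$. Suppose a fraction $\pi_f$ of wealth is invested in the risk-free asset and the remaining
fraction $1 - \pi_f$ in this risky portfolio. The gross return on the combined portfolio is
$R = \pi_f R^f + (1 - \pi_f) R^{\tilde{\pi}}$ with mean and variance
\[ E[R] = \pi_f R^f + (1 - \pi_f) E\left[ R^{\tilde{\pi}} \right], \qquad \mathrm{Var}[R] = (1 - \pi_f)^2 \, \mathrm{Var}\left[ R^{\tilde{\pi}} \right]. \]
If $\pi_f \leq 1$, the standard deviation of the return is $\sigma[R] = (1 - \pi_f) \, \sigma[R^{\tilde{\pi}}]$ and we obtain
\[ E[R] = \pi_f R^f + \frac{E\left[ R^{\tilde{\pi}} \right]}{\sigma[R^{\tilde{\pi}}]} \, \sigma[R] \]
so varying $\pi_f$ the set of points $\{ (\sigma[R], E[R]) \mid \pi_f \leq 1 \}$ will form an upward-sloping straight line
from $(0, R^f)$ through $\left( \sigma[R^{\tilde{\pi}}], E[R^{\tilde{\pi}}] \right)$. For $\pi_f > 1$, the standard deviation of the combined
portfolio is $\sigma[R] = -(1 - \pi_f) \, \sigma[R^{\tilde{\pi}}]$ and we get
\[ E[R] = \pi_f R^f - \frac{E\left[ R^{\tilde{\pi}} \right]}{\sigma[R^{\tilde{\pi}}]} \, \sigma[R], \]
which defines a downward-sloping straight line from $(0, R^f)$ and to the right.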
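Minimizing variance for a given expected return we will move as far to the "north-west" or to the
"south-west" as possible in the (standard deviation, mean)-diagram. Therefore the mean-variance
efficient portfolios will correspond to points on a line which is tangent to the mean-variance frontier
of risky assets and goes through the point $(0, R^f)$. There are two such lines, an upward-sloping
and a downward-sloping line. The point where the upward-sloping line is tangent to the frontier
of risky assets corresponds to a portfolio which we refer to as the tangency portfolio. This is
a portfolio of risky assets only. It is the portfolio that maximizes the Sharpe ratio over all risky
portfolios. The Sharpe ratio of a portfolio is the ratio $(E[R^{\tilde{\pi}}] - R^f) / \sigma[R^{\tilde{\pi}}]$ between the excess
expected return of a portfolio and the standard deviation of the return. To determine the tangency
portfolio we must solve the problem
\[ \max_{\tilde{\pi}} \; \frac{\tilde{\pi}^\top \tilde{\mu} - R^f}{\left( \tilde{\pi}^\top \tilde{\Sigma} \tilde{\pi} \right)^{1/2}} \quad \text{s.t.} \quad \tilde{\pi}^\top 1 = 1. \tag{6.32} \]
Except for the tildes above the symbols, this problem is identical to the problem (6.27) with $\alpha = R^f$
which we have already solved. We can therefore conclude that the tangency portfolio is given by
\[ \pi_{\tan} = \frac{\tilde{\Sigma}^{-1} \left( \tilde{\mu} - R^f 1 \right)}{1^\top \tilde{\Sigma}^{-1} \left( \tilde{\mu} - R^f 1 \right)}. \tag{6.33} \]
The gross return on the tangency portfolio is
\[ R^{\tan} = \pi_{\tan}^\top \tilde{R} \]
with expectation and standard deviation given by
\[ \mu_{\tan} = \tilde{\mu}^\top \pi_{\tan} = \frac{\tilde{\mu}^\top \tilde{\Sigma}^{-1} \left( \tilde{\mu} - R^f 1 \right)}{1^\top \tilde{\Sigma}^{-1} \left( \tilde{\mu} - R^f 1 \right)}, \tag{6.34} \]
\[ \sigma_{\tan} = \left( \pi_{\tan}^\top \tilde{\Sigma} \pi_{\tan} \right)^{1/2} = \frac{\left( (\tilde{\mu} - R^f 1)^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1) \right)^{1/2}}{1^\top \tilde{\Sigma}^{-1} \left( \tilde{\mu} - R^f 1 \right)}. \tag{6.35} \]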
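The maximum Sharpe ratio, i.e. the slope of the line, is thus
\[ \frac{\mu_{\tan} - R^f}{\sigma_{\tan}} = \frac{\dfrac{\tilde{\mu}^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1)}{1^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1)} - R^f}{\dfrac{\left( (\tilde{\mu} - R^f 1)^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1) \right)^{1/2}}{1^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1)}} \]
\[ = \frac{\tilde{\mu}^\top \tilde{\Sigma}^{-1} \left( \tilde{\mu} - R^f 1 \right) - R^f \left[ 1^\top \tilde{\Sigma}^{-1} \left( \tilde{\mu} - R^f 1 \right) \right]}{\left( (\tilde{\mu} - R^f 1)^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1) \right)^{1/2}} \]
\[ = \frac{(\tilde{\mu} - R^f 1)^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1)}{\left( (\tilde{\mu} - R^f 1)^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1) \right)^{1/2}} \]
\[ = \left( (\tilde{\mu} - R^f 1)^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1) \right)^{1/2}. \]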
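As a sanity check on the tangency portfolio and the maximum Sharpe ratio, here is a minimal Python sketch. The two risky assets, their expected gross returns `mu`, covariance matrix `Sigma`, the risk-free gross return `Rf` and the helper names are assumptions chosen for illustration. With only two risky assets the matrix inverse can be written out by hand, so no library beyond `math` is needed.

```python
import math

# Hypothetical inputs: two risky assets and a risk-free gross return.
mu = [1.06, 1.10]
Sigma = [[0.04, 0.006],
         [0.006, 0.09]]
Rf = 1.02

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

excess = [mi - Rf for mi in mu]                    # mu - Rf * 1
Sinv_excess = mat_vec(inverse_2x2(Sigma), excess)  # Sigma^{-1} (mu - Rf * 1)
scale = sum(Sinv_excess)                           # 1' Sigma^{-1} (mu - Rf * 1)
pi_tan = [x / scale for x in Sinv_excess]          # tangency portfolio weights

def sharpe(pi):
    """Sharpe ratio (E[R] - Rf) / sigma[R] of a portfolio of the risky assets."""
    mean = dot(pi, mu)
    std = math.sqrt(dot(pi, mat_vec(Sigma, pi)))
    return (mean - Rf) / std

# Closed-form maximum Sharpe ratio: square root of the quadratic form in excess returns.
max_sharpe = math.sqrt(dot(excess, Sinv_excess))

print("tangency portfolio:", pi_tan)
print("Sharpe ratio of tangency portfolio:", sharpe(pi_tan))
print("closed-form maximum Sharpe ratio:", max_sharpe)
```

In this example $1^\top \tilde{\Sigma}^{-1} (\tilde{\mu} - R^f 1)$ is positive, so the tangency portfolio lies on the upward-sloping line. No other fully invested combination of the two risky assets attains a higher Sharpe ratio.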
The straight line from the point $(0, R^f)$ and to and through $(\sigma_{\tan}, \mu_{\tan})$ constitutes the
upward-sloping part of the mean-variance frontier of all assets. Similarly, there is a downward-sloping part
which starts out at $(0, R^f)$ and has a slope which equals minus the slope of the upward-sloping
frontier. Again we have two-fund separation since all investors will combine just two funds, where
one fund is simply the risk-free asset and the other is the tangency portfolio of only risky assets.
A return $R$ is mean-variance efficient if and only if
\[ R = \alpha R^f + (1 - \alpha) R^{\tan} \]
for some $\alpha$. If $\alpha \leq 1$, you will get a point on the upward-sloping part of the frontier. If $\alpha \geq 1$, you
will get a point on the downward-sloping part. Of course, when a risk-free return is traded, it will
be the minimum-variance return. The relation between the mean $m$ and the standard deviation $\sigma$
of the portfolios on the efficient frontier will be
\[ \sigma = |m - R^f| \, \frac{\sigma[R^{\tan}]}{E[R^{\tan}] - R^f}. \tag{6.36} \]
6.3 The discrete-time framework
In the discrete-time framework each individual has to choose a consumption process $c = (c_t)_{t \in \mathcal{T}}$,
where $\mathcal{T} = \{0, 1, \dots, T\}$ and $c_t$ denotes the random, i.e. state-dependent, consumption at time $t$.
The individual also has to choose a trading strategy $\theta = (\theta_t)_{t=0,1,\dots,T-1}$ with $\theta_t$ representing the
portfolio held from time $t$ until time $t + 1$. Again, $\theta_t$ may depend on the information available to
the individual at time $t$ so $\theta$ is an adapted stochastic process. The individual has an endowment
or income process $e = (e_t)_{t \in \mathcal{T}}$, where $e_0$ is the initial endowment (wealth) and $e_t$ is the possibly
state-dependent income received at time $t$.
6.3.1 Time-additive expected utility
Let us first focus on individuals with time-additive expected utility. Standing at time 0 the problem
of an individual investor can therefore be written as
\[ \max_\theta \; u(c_0) + \sum_{t=1}^T e^{-\delta t} E[u(c_t)] \]
\[ \text{s.t.} \quad c_0 \leq e_0 - \theta_0 \cdot P_0, \]
\[ c_t \leq e_t + D_t^\theta, \quad t = 1, \dots, T, \]
\[ c_0, c_1, \dots, c_T \geq 0. \]
Applying (3.8), we can also write the constraint on time $t$ consumption as
\[ c_t \leq e_t + \theta_{t-1} \cdot (P_t + D_t) - \theta_t \cdot P_t. \]
As in the one-period case we will assume that the non-negativity constraint on consumption is
automatically satisfied and that the budget constraints hold as equalities. Therefore the problem
can be reformulated as
\[ \max_\theta \; u(e_0 - \theta_0 \cdot P_0) + \sum_{t=1}^T e^{-\delta t} E\left[ u\left( e_t + \theta_{t-1} \cdot (P_t + D_t) - \theta_t \cdot P_t \right) \right]. \tag{6.37} \]
The only terms involving the initially chosen portfolio $\theta_0 = (\theta_{10}, \theta_{20}, \dots, \theta_{I0})^\top$ will be
\[ u(e_0 - \theta_0 \cdot P_0) + e^{-\delta} E\left[ u\left( e_1 + \theta_0 \cdot (P_1 + D_1) - \theta_1 \cdot P_1 \right) \right]. \]
The first-order condition with respect to $\theta_{i0}$ implies that
\[ P_{i0} = E\left[ e^{-\delta} \frac{u'(c_1)}{u'(c_0)} (P_{i1} + D_{i1}) \right]. \]
This can also be verified by a variational argument as in the one-period analysis. More generally,
the first-order condition with respect to $\theta_{it}$ implies that
\[ P_{it} = E_t\left[ e^{-\delta} \frac{u'(c_{t+1})}{u'(c_t)} (P_{i,t+1} + D_{i,t+1}) \right]. \tag{6.38} \]
Note that $c_t$ and $c_{t+1}$ in these expressions are the optimal consumption rates of the individual. Not
surprisingly, this condition is equivalent to the conclusion in the one-period framework. In
particular, we can define a state-price deflator $\zeta = (\zeta_t)_{t \in \mathcal{T}}$ from the individual's optimal consumption
process by $\zeta_0 = 1$ and
\[ \frac{\zeta_{t+1}}{\zeta_t} = e^{-\delta} \frac{u'(c_{t+1})}{u'(c_t)}, \]
which means that
\[ \zeta_t = \frac{\zeta_t}{\zeta_{t-1}} \frac{\zeta_{t-1}}{\zeta_{t-2}} \cdots \frac{\zeta_1}{\zeta_0} = e^{-\delta} \frac{u'(c_t)}{u'(c_{t-1})} \, e^{-\delta} \frac{u'(c_{t-1})}{u'(c_{t-2})} \cdots e^{-\delta} \frac{u'(c_1)}{u'(c_0)} = e^{-\delta t} \frac{u'(c_t)}{u'(c_0)}. \tag{6.39} \]
This is the individual's marginal rate of substitution between time 0 consumption and time $t$
consumption.
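The telescoping product behind the multi-period state-price deflator is easy to confirm numerically. The sketch below assumes CRRA marginal utility $u'(c) = c^{-\gamma}$ and an arbitrary consumption path along a single scenario; both, together with the parameter values and names, are illustrative assumptions. It builds the deflator period by period from the one-period ratios and compares it with the closed form $e^{-\delta t} u'(c_t)/u'(c_0)$.

```python
import math

gamma, delta = 3.0, 0.02                       # assumed risk aversion and time preference rate
consumption = [1.00, 1.03, 0.98, 1.07, 1.10]   # hypothetical path c_0, ..., c_4 in one scenario

def marginal_utility(c):
    """CRRA marginal utility u'(c) = c^(-gamma)."""
    return c ** (-gamma)

# Build zeta_t recursively from zeta_0 = 1 and the one-period ratios zeta_{t+1} / zeta_t.
zeta = [1.0]
for t in range(1, len(consumption)):
    one_period = (math.exp(-delta) * marginal_utility(consumption[t])
                  / marginal_utility(consumption[t - 1]))
    zeta.append(zeta[-1] * one_period)

# Closed form: zeta_t = exp(-delta * t) * u'(c_t) / u'(c_0).
closed_form = [math.exp(-delta * t) * marginal_utility(c) / marginal_utility(consumption[0])
               for t, c in enumerate(consumption)]

for t, (recursive, direct) in enumerate(zip(zeta, closed_form)):
    print(t, recursive, direct)
```

The two columns agree up to floating-point rounding, since all intermediate marginal utilities cancel in the product.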
6.3.2 Habit formation utility
Next let us consider a non-additive specification of preferences and for concreteness we study a
specification with habit formation. The objective of the individual is
\[ \max_{\theta = (\theta_t)_{t=0,1,\dots,T-1}} E\left[ \sum_{t=0}^T e^{-\delta t} u(c_t, h_t) \right], \]
where $h_t$ is the habit level at time $t$. First assume that the habit level at time $t$ is some fraction
of the consumption level at time $t - 1$,
\[ h_t = \beta c_{t-1}, \]
and let $h_0 = 0$. Apply the variational argument given earlier. Let $c_0, c_1, \dots, c_T$ denote the optimal
consumption process and let $h_0, h_1, \dots, h_T$ denote the resulting process for the habit level. What
happens if the individual purchases $\varepsilon > 0$ units extra of asset $i$ at time $t$ and sells those units again
at time $t + 1$? Consumption at time $t$ and $t + 1$ will change to $c_t - \varepsilon P_{it}$ and $c_{t+1} + \varepsilon (D_{i,t+1} + P_{i,t+1})$,
respectively. The perturbation will also affect the habit level. With the assumed habit formation,
only the habit level at time $t + 1$ and time $t + 2$ will be changed. The new habit levels will be
$h_{t+1} - \beta \varepsilon P_{it}$ at time $t + 1$ and $h_{t+2} + \beta \varepsilon (D_{i,t+1} + P_{i,t+1})$ at time $t + 2$. Therefore the change in
total utility from time $t$ and onwards will be
\[ u(c_t - \varepsilon P_{it}, h_t) - u(c_t, h_t) + e^{-\delta} E_t\left[ u\left( c_{t+1} + \varepsilon (D_{i,t+1} + P_{i,t+1}), h_{t+1} - \beta \varepsilon P_{it} \right) - u(c_{t+1}, h_{t+1}) \right] \]
\[ + e^{-2\delta} E_t\left[ u\left( c_{t+2}, h_{t+2} + \beta \varepsilon (D_{i,t+1} + P_{i,t+1}) \right) - u(c_{t+2}, h_{t+2}) \right] \leq 0. \]
Dividing by $\varepsilon$ and letting $\varepsilon \to 0$, we obtain
\[ -P_{it} u_c(c_t, h_t) + e^{-\delta} E_t\left[ u_c(c_{t+1}, h_{t+1}) (D_{i,t+1} + P_{i,t+1}) - \beta P_{it} u_h(c_{t+1}, h_{t+1}) \right] \]
\[ + \beta e^{-2\delta} E_t\left[ u_h(c_{t+2}, h_{t+2}) (D_{i,t+1} + P_{i,t+1}) \right] \leq 0. \]
Here a subscript on $u$ indicates the partial derivative of $u$ with respect to that variable. Again the
opposite inequality can be reached by a similar argument. Replacing the inequality sign with an
equality sign and rearranging, we arrive at
\[ P_{it} \left( u_c(c_t, h_t) + \beta e^{-\delta} E_t[u_h(c_{t+1}, h_{t+1})] \right) \]
\[ = e^{-\delta} E_t\left[ \left( u_c(c_{t+1}, h_{t+1}) + \beta e^{-\delta} u_h(c_{t+2}, h_{t+2}) \right) (D_{i,t+1} + P_{i,t+1}) \right] \]
\[ = e^{-\delta} E_t\left[ \left( u_c(c_{t+1}, h_{t+1}) + \beta e^{-\delta} E_{t+1}[u_h(c_{t+2}, h_{t+2})] \right) (D_{i,t+1} + P_{i,t+1}) \right], \]
where the last equality is due to the Law of Iterated Expectations. Consequently,
\[ P_{it} = E_t\left[ e^{-\delta} \frac{u_c(c_{t+1}, h_{t+1}) + \beta e^{-\delta} E_{t+1}[u_h(c_{t+2}, h_{t+2})]}{u_c(c_t, h_t) + \beta e^{-\delta} E_t[u_h(c_{t+1}, h_{t+1})]} (D_{i,t+1} + P_{i,t+1}) \right], \tag{6.40} \]
and a state-price deflator can be defined by
\[ \zeta_t = e^{-\delta t} \frac{u_c(c_t, h_t) + \beta e^{-\delta} E_t[u_h(c_{t+1}, h_{t+1})]}{u_c(c_0, h_0) + \beta e^{-\delta} E[u_h(c_1, h_1)]}. \tag{6.41} \]
Note that if $\beta = 0$ we are back to the case of time-additive utility.
It is probably more realistic that the habit level at time $t$ depends on all the previous consumption
rates but such that consumption in recent periods is more important for the habit level than
consumption in the far past. This can be captured by a specification like
\[ h_t = \sum_{s=0}^{t-1} \beta^{t-s} c_s, \]
where $\beta$ is a constant between 0 and 1. A change in the consumption at time $t$ will now affect
the habit levels at all future dates $t + 1, t + 2, \dots, T$. Following the same line of argumentation as
above, it can be shown that this problem will generate the state-price deflator
\[ \zeta_t = e^{-\delta t} \frac{u_c(c_t, h_t) + \sum_{s=1}^{T-t} \beta^s e^{-\delta s} E_t[u_h(c_{t+s}, h_{t+s})]}{u_c(c_0, h_0) + \sum_{s=1}^{T} \beta^s e^{-\delta s} E[u_h(c_s, h_s)]}, \tag{6.42} \]
which is difficult to work with.
6.4 The continuous-time framework
In a continuous-time setting an individual consumes according to a non-negative continuous-time
process $c = (c_t)$. Suppose that her preferences are described by time-additive expected utility so
that the objective is to maximize $E\left[ \int_0^T e^{-\delta t} u(c_t) \, dt \right]$.
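We will again go through a variational argument giving a link between the optimal consumption
process and asset prices. For simplicity assume that assets pay no intermediate dividends. Suppose
$c = (c_t)$ is the optimal consumption process for some agent and consider the following deviation
from this strategy: at time 0 increase the investment in asset $i$ by $\varepsilon$ units. The extra cost of
$\varepsilon P_{i0}$ implies a reduced consumption now. Let us suppose that the individual finances this extra
investment by cutting down the consumption rate in the time interval $[0, \Delta t]$ for some small positive
$\Delta t$ by $\varepsilon P_{i0} / \Delta t$. The extra $\varepsilon$ units of asset $i$ are resold at time $t < T$, yielding a revenue of $\varepsilon P_{it}$. This
finances an increase in the consumption rate over $[t, t + \Delta t]$ by $\varepsilon P_{it} / \Delta t$. The consumption rates
outside the intervals $[0, \Delta t]$ and $[t, t + \Delta t]$ will be unaffected. Given the optimality of $c = (c_t)$, we
must have that
\[ E\left[ \int_0^{\Delta t} e^{-\delta s} \left( u\left( c_s - \frac{\varepsilon P_{i0}}{\Delta t} \right) - u(c_s) \right) ds + \int_t^{t + \Delta t} e^{-\delta s} \left( u\left( c_s + \frac{\varepsilon P_{it}}{\Delta t} \right) - u(c_s) \right) ds \right] \leq 0. \]
Dividing by $\varepsilon$ and letting $\varepsilon \to 0$, we obtain
\[ E\left[ -\frac{P_{i0}}{\Delta t} \int_0^{\Delta t} e^{-\delta s} u'(c_s) \, ds + \frac{P_{it}}{\Delta t} \int_t^{t + \Delta t} e^{-\delta s} u'(c_s) \, ds \right] \leq 0. \]
Letting $\Delta t \to 0$, we arrive at
\[ E\left[ -P_{i0} u'(c_0) + P_{it} e^{-\delta t} u'(c_t) \right] \leq 0, \]
or, equivalently,
\[ P_{i0} u'(c_0) \geq E\left[ e^{-\delta t} P_{it} u'(c_t) \right]. \]
The reverse inequality can be shown similarly so that we have that $P_{i0} u'(c_0) = E[e^{-\delta t} P_{it} u'(c_t)]$ or
more generally
\[ P_{it} = E_t\left[ e^{-\delta (t' - t)} \frac{u'(c_{t'})}{u'(c_t)} P_{it'} \right], \quad t \leq t' \leq T. \]
With intermediate dividends this relation is slightly more complicated:
\[ P_{it} = E_t\left[ \int_t^{t'} \delta_{is} P_{is} e^{-\delta (s - t)} \frac{u'(c_s)}{u'(c_t)} \, ds + e^{-\delta (t' - t)} \frac{u'(c_{t'})}{u'(c_t)} P_{it'} \right], \tag{6.43} \]
where $\delta_{is} P_{is}$ is the dividend rate of asset $i$ at time $s$ (not to be confused with the time preference
rate $\delta$). We see that
\[ \zeta_t = e^{-\delta t} \frac{u'(c_t)}{u'(c_0)} \tag{6.44} \]
defines a state-price deflator, exactly as in the discrete-time case.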
If the market is complete, we can easily reach (6.44) solving step one of the two-step procedure
suggested in Section 6.2.4. The problem is
\[ \max_{c = (c_t)} E\left[ \int_0^T e^{-\delta t} u(c_t) \, dt \right] \]
\[ \text{s.t.} \quad E\left[ \int_0^T \zeta_t c_t \, dt \right] \leq e_0 + E\left[ \int_0^T \zeta_t e_t \, dt \right], \]
where $\zeta = (\zeta_t)$ is the unique state-price deflator. The left-hand side of the constraint is the present
value of the consumption process, the right-hand side is the sum of the initial wealth $e_0$ and the
present value of the income process. The Lagrangian for this problem is
\[ \mathcal{L} = E\left[ \int_0^T e^{-\delta t} u(c_t) \, dt \right] + \nu \left( e_0 + E\left[ \int_0^T \zeta_t e_t \, dt \right] - E\left[ \int_0^T \zeta_t c_t \, dt \right] \right) \]
\[ = \nu \left( e_0 + E\left[ \int_0^T \zeta_t e_t \, dt \right] \right) + E\left[ \int_0^T \left( e^{-\delta t} u(c_t) - \nu \zeta_t c_t \right) dt \right], \]
where $\nu$ is the Lagrange multiplier.
If we for each $t$ and each state maximize the integrand in the last integral above, we will surely
maximize the Lagrangian. The first-order condition is $e^{-\delta t} u'(c_t) = \nu \zeta_t$ and since $\zeta_0 = 1$, we must
have (6.44).
Exercise 6.10 considers an individual with habit formation in a continuous-time setting.
6.5 Dynamic programming
Above we have linked the optimal consumption plan of an individual to asset prices. In Chapter 8
we will see how this leads to consumption-based asset pricing models. While this link is quite
intuitive and theoretically elegant, empirical tests and practical applications of the model suffer
from the fact that available data on individual or aggregate consumption are of poor quality. For
that reason it is tempting to link asset prices to other variables for which better data are available.
One way to provide such a link is to explain optimal consumption in terms of other variables. If
$c_t$ is a function of some variable, say $x_t$, then the equations in this chapter provide a link between
asset prices and $x$. To figure out what explains the consumption choice of an individual we have
to dig deeper into the utility maximization problem.
We will consider both a discrete-time and a continuous-time framework. In both cases we will for
simplicity assume time-additive expected utility. A central element of the analysis is the indirect
utility function of the individual, which is defined as the maximum expected utility of current and
future consumption. In the discrete-time case the indirect utility at time $t$ is defined as
\[ J_t = \sup_{(c_s, \theta_s)_{s=t}^T} E_t\left[ \sum_{s=t}^T e^{-\delta (s - t)} u(c_s) \right]. \tag{6.45} \]
There is no portfolio chosen at the final date so it is understood that $\theta_T = 0$. Consumption at the
final date is equal to the time $T$ value of the portfolio purchased at $T - 1$. In continuous time the
corresponding definition is
\[ J_t = \sup_{(c_s, \theta_s)_{s \in [t, T]}} E_t\left[ \int_t^T e^{-\delta (s - t)} u(c_s) \, ds \right]. \tag{6.46} \]
For tractability it is necessary to assume that the indirect utility is a function of a limited number
of variables. Surely the indirect utility of a finitely-lived individual will depend on the length of
the remaining life and therefore on calendar time $t$. The indirect utility at a given time $t$ will
also depend on the wealth $W_t$ of the individual at that point in time. Other variables containing
information about current or future investment opportunities or current or future income may have
to be added. Suppose that this extra information can be captured by a single variable $x_t$. In that
case the indirect utility is of the form
\[ J_t = J(W_t, x_t, t) \]
for some function $J$.
We will show below that both in discrete and in continuous time the optimal consumption
strategy will satisfy the so-called envelope condition:
$$u'(c_t) = J_W(W_t, x_t, t), \tag{6.47}$$
where $J_W$ is the partial derivative of $J$ with respect to $W$. This is an intuitive optimality condition.
The left-hand side is the marginal utility of an extra unit of consumption at time $t$. The right-hand
side is the marginal utility from investing an extra unit at time $t$ in the optimal way. In an optimum
these marginal utilities have to be equal. If that were not the case, the allocation of wealth between
consumption and investment should be reconsidered. For example, if $u'(c_t) > J_W(W_t, x_t, t)$, the
consumption $c_t$ should be increased and the amount invested should be decreased. Using the
envelope condition, the state-price deflator derived from the individual's optimization problem can
be rewritten as
$$\zeta_t = e^{-\delta t}\,\frac{u'(c_t)}{u'(c_0)} = e^{-\delta t}\,\frac{J_W(W_t, x_t, t)}{J_W(W_0, x_0, 0)}, \tag{6.48}$$
which links state prices to the optimally invested wealth of the individual and the variable $x_t$. This
will be useful in constructing factor models in Chapter 9.
We will derive (6.47) using the dynamic programming technique. Along the way we will also find
interesting conclusions on the optimal trading strategy of the individual. We do not make specific
assumptions on utility functions or the dynamics of asset prices but stick to a general setting.
More details and a lot of specific models are discussed in Munk (2008). The basic references
for the discrete-time models are Samuelson (1969), Hakansson (1970), Fama (1970, 1976), and
Ingersoll (1987, Ch. 11). The basic references for the continuous-time models are Merton (1969,
1971, 1973b).
6.5.1 The discrete-time framework
Assume that a risk-free and $d$ risky assets are traded. Let $\theta_t$ denote the $d$-vector of units invested
in the risky assets at time $t$ and let $\theta_{0t}$ denote the units of the risk-free asset. Assume for simplicity
that the assets do not pay intermediate dividends. Similar to (3.10) the change in the wealth of
the individual between time $t$ and time $t+1$ is
$$W_{t+1} - W_t = \sum_{i=0}^{d} \theta_{it}\,(P_{i,t+1} - P_{it}) + y_t - c_t, \tag{6.49}$$
where $y_t$ denotes the income received at time $t$. After receiving income and consuming at time $t$,
the funds invested will be $W_t + y_t - c_t$. Assuming this is non-negative, we can represent the portfolio
in terms of the fractions of this total investment invested in the different assets, i.e.
$$\pi_{it} = \frac{\theta_{it} P_{it}}{W_t + y_t - c_t}, \qquad i = 0, 1, \dots, d.$$
Define the portfolio weight vector of the risky assets by $\pi_t = (\pi_{1t}, \dots, \pi_{dt})^\top$. By construction the
fraction invested in the risk-free asset is then given by $\pi_{0t} = 1 - \sum_{i=1}^{d} \pi_{it} = 1 - \pi_t^\top \mathbf{1}$. The wealth
at the end of the period can then be restated as
$$W_{t+1} = (W_t + y_t - c_t)\, R^W_{t+1}, \tag{6.50}$$
where
$$R^W_{t+1} = 1 + r^f_t + \pi_t^\top\left(r_{t+1} - r^f_t \mathbf{1}\right) \tag{6.51}$$
is the gross rate of return on the portfolio, $r^f_t$ is the risk-free net rate of return, and $r_{t+1}$ is the
$d$-vector of the net rates of return of the risky assets over the period. Note that the only random
variable (seen from time $t$) on the right-hand side of the above expressions is the return vector
$r_{t+1}$.
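As a small illustration of (6.50) and (6.51), the one-period wealth update can be sketched in a few lines of Python. The function names and the numbers used in any call are hypothetical and chosen only for illustration; they are not part of the text.

```python
def portfolio_gross_return(rf, weights, risky_returns):
    """Gross portfolio return R^W_{t+1} = 1 + r^f_t + pi_t'(r_{t+1} - r^f_t 1), cf. (6.51).

    rf            -- risk-free net rate of return over the period
    weights       -- portfolio weights of the d risky assets (pi_t)
    risky_returns -- realized net rates of return of the d risky assets (r_{t+1})
    """
    excess = sum(w * (r - rf) for w, r in zip(weights, risky_returns))
    return 1.0 + rf + excess


def next_wealth(wealth, income, consumption, rf, weights, risky_returns):
    """End-of-period wealth W_{t+1} = (W_t + y_t - c_t) R^W_{t+1}, cf. (6.50)."""
    invested = wealth + income - consumption
    if invested < 0:
        raise ValueError("funds invested W_t + y_t - c_t must be non-negative")
    return invested * portfolio_gross_return(rf, weights, risky_returns)
```

The weight of the risk-free asset never appears explicitly: it is the residual $1 - \pi_t^\top \mathbf{1}$, which is why only the excess returns of the risky assets enter the sum.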
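In the definition of indirect utility in (6.45) the maximization is over both the current and all
future consumption rates and portfolios. This is clearly a quite complicated maximization problem.
We will now show that we can alternatively perform a sequence of simpler maximization problems.
This result is based on the following manipulations:
$$\begin{aligned}
J_t &= \sup_{(c_s,\pi_s)_{s=t}^{T}} \mathrm{E}_t\left[\sum_{s=t}^{T} e^{-\delta(s-t)} u(c_s)\right] \\
&= \sup_{(c_s,\pi_s)_{s=t}^{T}} \mathrm{E}_t\left[u(c_t) + \sum_{s=t+1}^{T} e^{-\delta(s-t)} u(c_s)\right] \\
&= \sup_{(c_s,\pi_s)_{s=t}^{T}} \mathrm{E}_t\left[u(c_t) + \mathrm{E}_{t+1}\left[\sum_{s=t+1}^{T} e^{-\delta(s-t)} u(c_s)\right]\right] \\
&= \sup_{(c_s,\pi_s)_{s=t}^{T}} \mathrm{E}_t\left[u(c_t) + e^{-\delta}\,\mathrm{E}_{t+1}\left[\sum_{s=t+1}^{T} e^{-\delta(s-(t+1))} u(c_s)\right]\right] \\
&= \sup_{c_t,\pi_t} \mathrm{E}_t\left[u(c_t) + e^{-\delta} \sup_{(c_s,\pi_s)_{s=t+1}^{T}} \mathrm{E}_{t+1}\left[\sum_{s=t+1}^{T} e^{-\delta(s-(t+1))} u(c_s)\right]\right].
\end{aligned}$$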
Here, the first equality is simply due to the definition of indirect utility, the second equality comes
from separating out the first term of the sum, the third equality is valid according to the law of
iterated expectations, the fourth equality comes from separating out the discount term $e^{-\delta}$, and the
final equality is due to the fact that only the inner expectation depends on future consumption rates
and portfolios. Noting that the inner supremum is by definition the indirect utility at time $t+1$,
we arrive at
$$J_t = \sup_{c_t,\pi_t} \mathrm{E}_t\left[u(c_t) + e^{-\delta} J_{t+1}\right] = \sup_{c_t,\pi_t} \left\{u(c_t) + e^{-\delta}\,\mathrm{E}_t[J_{t+1}]\right\}. \tag{6.52}$$
This equation is called the Bellman equation, and the indirect utility $J$ is said to have the
dynamic programming property. The decision to be taken at time $t$ is split up in two: (1) the
consumption and portfolio decision for the current period and (2) the consumption and portfolio
decisions for all future periods. We take the decision for the current period assuming that we will
make optimal decisions in all future periods. Note that this does not imply that the decision for
the current period is taken independently from future decisions. We take into account the effect
that our current decision has on the maximum expected utility we can get from all future periods.
The expectation $\mathrm{E}_t[J_{t+1}]$ will depend on our choice of $c_t$ and $\pi_t$. [2]

[2] Readers familiar with option pricing theory may note the similarity to the problem of determining the optimal
exercise strategy of a Bermudan/American option. However, for that problem the decision to be taken is much
simpler (exercise or not) than for the consumption/portfolio problem.

The dynamic programming property is the basis for a backward iterative solution procedure.
First, note that $J_T = u(c_T) = u(W_T)$, and $c_{T-1}$ and $\pi_{T-1}$ are chosen to maximize
$$u(c_{T-1}) + e^{-\delta}\,\mathrm{E}_{T-1}[u(W_T)],$$
where
$$W_T = (W_{T-1} + y_{T-1} - c_{T-1})\, R^W_T, \qquad R^W_T = 1 + r^f_{T-1} + \pi_{T-1}^\top\left(r_T - r^f_{T-1}\mathbf{1}\right).$$
This is done for each possible scenario at time $T-1$ and gives us $J_{T-1}$. Then $c_{T-2}$ and $\pi_{T-2}$ are
chosen to maximize
$$u(c_{T-2}) + e^{-\delta}\,\mathrm{E}_{T-2}[J_{T-1}],$$
and so on until time zero is reached. Since we have to perform a maximization for each scenario
of the world at every point in time, we have to make assumptions on the possible scenarios at
each point in time before we can implement the recursive procedure. The optimal decisions at any
time are expected to depend on the wealth level of the agent at that date, but also on the value of
other time-varying state variables that affect future returns on investment (e.g. the interest rate
level) and future income levels. To be practically implementable only a few state variables can be
incorporated. Also, these state variables must follow Markov processes so only the current values
of the variables are relevant for the maximization at a given point in time.
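The backward recursion can be made concrete with a deliberately tiny example. The sketch below is only an illustration under strong assumptions that are not in the text: wealth and consumption are integers, there is no income, the risk-free gross return is 1, the single risky asset has a gross return of 0 or 3 with equal probability, the portfolio choice is all-or-nothing ($\pi \in \{0, 1\}$), and the discount factor `beta` plays the role of $e^{-\delta}$. The function name `solve_backward` is hypothetical.

```python
import math


def solve_backward(T, w0, beta, u=math.sqrt):
    """Backward induction for J_t(W) = max_{c,pi} u(c) + beta * E_t[J_{t+1}(W_{t+1})].

    Returns (J, policy): J[t][w] is the indirect utility at time t with wealth w,
    and policy[t][w] = (c, pi) is a maximizing consumption/portfolio pair.
    """
    risky = [(0.5, 0), (0.5, 3)]                   # (probability, gross return), assumed
    max_w = [w0 * 3 ** t for t in range(T + 1)]    # largest wealth reachable at each date
    J = [dict() for _ in range(T + 1)]
    policy = [dict() for _ in range(T)]

    # Terminal condition: all remaining wealth is consumed, J_T(W) = u(W).
    for w in range(max_w[T] + 1):
        J[T][w] = u(w)

    # Step backwards from T-1 to 0, solving one small maximization per wealth level.
    for t in range(T - 1, -1, -1):
        for w in range(max_w[t] + 1):
            best_value, best_choice = -math.inf, None
            for c in range(w + 1):
                saved = w - c
                for pi in (0, 1):
                    if pi == 0:   # everything in the risk-free asset (gross return 1)
                        cont = J[t + 1][saved]
                    else:         # everything in the risky asset
                        cont = sum(p * J[t + 1][saved * R] for p, R in risky)
                    value = u(c) + beta * cont
                    if value > best_value:
                        best_value, best_choice = value, (c, pi)
            J[t][w], policy[t][w] = best_value, best_choice
    return J, policy
```

Wealth is the only state variable here because the assumed returns are independent over time. With a Markov state variable $x_t$, the dictionaries would be indexed by the pair $(W, x)$ instead, which is exactly why only a few state variables are practical.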
Suppose that the relevant information is captured by a one-dimensional Markov process $x = (x_t)$
so that the indirect utility at any time $t = 0, 1, \dots, T$ can be written as $J_t = J(W_t, x_t, t)$. Then
the dynamic programming equation (6.52) becomes
$$J(W_t, x_t, t) = \sup_{c_t,\pi_t} \left\{u(c_t) + e^{-\delta}\,\mathrm{E}_t\left[J(W_{t+1}, x_{t+1}, t+1)\right]\right\} \tag{6.53}$$
with terminal condition $J(W_T, x_T, T) = u(W_T)$. Doing the maximization we have to remember
that $W_{t+1}$ will be affected by the choice of $c_t$ and $\pi_t$, cf. Equation (6.50). In particular, we see
that
$$\frac{\partial W_{t+1}}{\partial c_t} = -R^W_{t+1}, \qquad \frac{\partial W_{t+1}}{\partial \pi_t} = (W_t + y_t - c_t)\left(r_{t+1} - r^f_t \mathbf{1}\right).$$
The first-order condition for the maximization with respect to $c_t$ is
$$u'(c_t) + e^{-\delta}\,\mathrm{E}_t\left[J_W(W_{t+1}, x_{t+1}, t+1)\,\frac{\partial W_{t+1}}{\partial c_t}\right] = 0, \tag{6.54}$$
which implies that
$$u'(c_t) = e^{-\delta}\,\mathrm{E}_t\left[J_W(W_{t+1}, x_{t+1}, t+1)\, R^W_{t+1}\right]. \tag{6.55}$$
The first-order condition for the maximization with respect to $\pi_t$ is
$$\mathrm{E}_t\left[J_W(W_{t+1}, x_{t+1}, t+1)\,\frac{\partial W_{t+1}}{\partial \pi_t}\right] = 0, \tag{6.56}$$
which implies that
$$\mathrm{E}_t\left[J_W(W_{t+1}, x_{t+1}, t+1)\left(r_{t+1} - r^f_t \mathbf{1}\right)\right] = 0. \tag{6.57}$$
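While we cannot generally solve for the optimal decisions, we can show that the envelope condition
(6.47) holds. First note that for the optimal choice $\hat{c}_t, \hat{\pi}_t$ we have that
$$J(W_t, x_t, t) = u(\hat{c}_t) + e^{-\delta}\,\mathrm{E}_t\left[J(\hat{W}_{t+1}, x_{t+1}, t+1)\right],$$
where $\hat{W}_{t+1}$ is next period's wealth using $\hat{c}_t, \hat{\pi}_t$. Taking derivatives with respect to $W_t$ in this
equation, and acknowledging that $\hat{c}_t$ and $\hat{\pi}_t$ will in general depend on $W_t$, we get
$$J_W(W_t, x_t, t) = u'(\hat{c}_t)\,\frac{\partial \hat{c}_t}{\partial W_t} + e^{-\delta}\,\mathrm{E}_t\left[J_W(\hat{W}_{t+1}, x_{t+1}, t+1)\,\frac{\partial \hat{W}_{t+1}}{\partial W_t}\right],$$
where
$$\frac{\partial \hat{W}_{t+1}}{\partial W_t} = R^W_{t+1}\left(1 - \frac{\partial \hat{c}_t}{\partial W_t}\right) + (W_t + y_t - \hat{c}_t)\left(\frac{\partial \hat{\pi}_t}{\partial W_t}\right)^{\!\top}\left(r_{t+1} - r^f_t \mathbf{1}\right).$$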
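Inserting this and rearranging terms, we get
$$\begin{aligned}
J_W(W_t, x_t, t) ={}& e^{-\delta}\,\mathrm{E}_t\left[J_W(\hat{W}_{t+1}, x_{t+1}, t+1)\, R^W_{t+1}\right] \\
&+ \left(u'(\hat{c}_t) - e^{-\delta}\,\mathrm{E}_t\left[J_W(\hat{W}_{t+1}, x_{t+1}, t+1)\, R^W_{t+1}\right]\right)\frac{\partial \hat{c}_t}{\partial W_t} \\
&+ (W_t + y_t - \hat{c}_t)\, e^{-\delta}\left(\frac{\partial \hat{\pi}_t}{\partial W_t}\right)^{\!\top} \mathrm{E}_t\left[J_W(\hat{W}_{t+1}, x_{t+1}, t+1)\left(r_{t+1} - r^f_t \mathbf{1}\right)\right].
\end{aligned}$$
On the right-hand side the last two terms are zero due to the first-order conditions (6.55) and (6.57),
and only the leading term remains, i.e.
$$J_W(W_t, x_t, t) = e^{-\delta}\,\mathrm{E}_t\left[J_W(\hat{W}_{t+1}, x_{t+1}, t+1)\, R^W_{t+1}\right].$$
Combining this with (6.55) we obtain the envelope condition (6.47).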
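The envelope condition can be checked numerically in a special case. The sketch below assumes logarithmic utility, one period to go, no income, and a two-point distribution for the single risky return; all parameter values and function names are hypothetical. With log utility the optimal portfolio weight does not depend on wealth, so the portfolio problem can be solved once (here by a crude grid search over $\pi \in [0, 1]$) and enters only through the constant `K`.

```python
import math

BETA = 0.95                            # stands in for e^{-delta}; assumed value
RF = 0.02                              # risk-free net return; assumed value
RISKY = [(0.5, -0.10), (0.5, 0.20)]    # (probability, net return); assumed values


def expected_log_return(pi):
    """E[ln R^W] for a fraction pi in the risky asset, cf. (6.51)."""
    return sum(p * math.log(1 + RF + pi * (r - RF)) for p, r in RISKY)


# For log utility the best pi is independent of wealth: solve it once on a grid.
K = max(expected_log_return(i / 1000) for i in range(1001))


def objective(c, w):
    """ln(c) + beta * E[ln((w - c) R^W)] evaluated at the best portfolio."""
    return math.log(c) + BETA * (math.log(w - c) + K)


def optimal_consumption(w, tol=1e-12):
    """Ternary search; valid because the objective is strictly concave in c."""
    lo, hi = 1e-9 * w, w * (1 - 1e-9)
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1, w) < objective(m2, w):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def indirect_utility(w):
    return objective(optimal_consumption(w), w)


w = 10.0
h = 1e-4
c_star = optimal_consumption(w)
J_W = (indirect_utility(w + h) - indirect_utility(w - h)) / (2 * h)  # central difference
marginal_utility = 1 / c_star                                        # u'(c) for u = ln
```

In this special case the closed form is $c^* = W/(1+\beta)$, so both sides of (6.47) should be close to $(1+\beta)/W$. The finite-difference derivative `J_W` and `marginal_utility` agree up to numerical error.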
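6.5.2 The continuous-time framework
As in the discrete-time setting above assume that an instantaneously risk-free asset with a continuously
compounded risk-free rate of $r^f_t$ and $d$ risky assets are traded. We assume for simplicity
that the assets pay no intermediate dividends and write their price dynamics as
$$dP_t = \operatorname{diag}(P_t)\left[\mu_t\,dt + \sigma_t\,dz_t\right], \tag{3.5}$$
where $z = (z_1, \dots, z_d)^\top$ is a $d$-dimensional standard Brownian motion, or, componentwise,
$$dP_{it} = P_{it}\left[\mu_{it}\,dt + \sum_{j=1}^{d} \sigma_{ijt}\,dz_{jt}\right], \qquad i = 1, \dots, d.$$
The instantaneous rate of return on asset $i$ is given by $dP_{it}/P_{it}$. The $d$-vector $\mu_t = (\mu_{1t}, \dots, \mu_{dt})^\top$
contains the expected rates of return, and the $d \times d$ matrix $\sigma_t \sigma_t^\top$
contains the variance and covariance rates of instantaneous rates of return. We assume that $\sigma_t$ is
non-singular and, hence, we can define the market price of risk associated with $z$ as
$$\lambda_t = \sigma_t^{-1}\left(\mu_t - r^f_t \mathbf{1}\right),$$
so that
$$\mu_t = r^f_t \mathbf{1} + \sigma_t \lambda_t,$$
i.e. $\mu_{it} = r^f_t + \sum_{j=1}^{d} \sigma_{ijt}\lambda_{jt}$. We can now rewrite the price dynamics as
$$dP_t = \operatorname{diag}(P_t)\left[\left(r^f_t \mathbf{1} + \sigma_t \lambda_t\right)dt + \sigma_t\,dz_t\right].$$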
We represent the trading strategy by the portfolio weight process $\pi = (\pi_t)$, where $\pi_t$ is the
$d$-vector of fractions of wealth invested in the $d$ risky assets at time $t$. Again the weight of the
risk-free asset is $\pi_{0t} = 1 - \pi_t^\top \mathbf{1} = 1 - \sum_{i=1}^{d} \pi_{it}$. Analogous to (3.14), the wealth dynamics can be
written as
$$dW_t = W_t\left[r^f_t + \pi_t^\top \sigma_t \lambda_t\right]dt + \left[y_t - c_t\right]dt + W_t\,\pi_t^\top \sigma_t\,dz_t. \tag{6.58}$$
For simplicity we assume in the following that the agent receives no labor income, i.e. $y_t \equiv 0$. We
also assume that a single variable $x_t$ captures the time $t$ information about investment opportunities
so that, in particular,
$$r^f_t = r^f(x_t), \qquad \mu_t = \mu(x_t, t), \qquad \sigma_t = \sigma(x_t, t),$$
where $r^f$, $\mu$, and $\sigma$ now (also) denote sufficiently well-behaved functions. The market price of risk
is also given by the state variable:
$$\lambda(x_t) = \sigma(x_t, t)^{-1}\left(\mu(x_t, t) - r^f(x_t)\mathbf{1}\right).$$
Note that we have assumed that the short-term interest rate $r^f_t$ and the market price of risk vector
$\lambda_t$ do not depend on calendar time directly. The fluctuations in $r^f_t$ and $\lambda_t$ over time are presumably
not due to the mere passage of time, but rather due to variations in some more fundamental
economic variables. In contrast, the expected rates of return and the price sensitivities of some
assets will depend directly on time, e.g. the volatility and the expected rate of return on a bond
will depend on the time-to-maturity of the bond and therefore on calendar time.
Now the wealth dynamics for a given portfolio and consumption strategy is
$$dW_t = W_t\left[r^f(x_t) + \pi_t^\top \sigma(x_t, t)\lambda(x_t)\right]dt - c_t\,dt + W_t\,\pi_t^\top \sigma(x_t, t)\,dz_t.$$
The state variable $x$ is assumed to follow a one-dimensional diffusion process
$$dx_t = m(x_t)\,dt + v(x_t)^\top dz_t + \hat{v}(x_t)\,d\hat{z}_t, \tag{6.59}$$
where $\hat{z} = (\hat{z}_t)$ is a one-dimensional standard Brownian motion independent of $z = (z_t)$. Hence, if
$\hat{v}(x_t) \neq 0$, there is an exogenous shock to the state variable that cannot be hedged by investments
in the financial market. In other words, the financial market is incomplete. Conversely, if $\hat{v}(x_t)$
is identically equal to zero, the financial market is complete. We shall consider examples of both
cases later. The $d$-vector $v(x_t)$ represents the sensitivity of the state variable with respect to the
exogenous shocks to market prices. Note that the $d$-vector $\sigma(x, t)v(x)$ is the vector of instantaneous
covariance rates between the returns on the risky assets and the state variable. Under the
assumptions made above the indirect utility at time $t$ is $J_t = J(W_t, x_t, t)$.
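How do we implement the dynamic programming principle in continuous time? First consider
a discrete-time approximation with time set $\{0, \Delta t, 2\Delta t, \dots, T = N\Delta t\}$. The Bellman equation
corresponding to this discrete-time utility maximization problem is
$$J(W, x, t) = \sup_{c_t \ge 0,\ \pi_t \in \mathbb{R}^d} \left\{u(c_t)\,\Delta t + e^{-\delta \Delta t}\,\mathrm{E}_t\left[J(W_{t+\Delta t}, x_{t+\Delta t}, t + \Delta t)\right]\right\}, \tag{6.60}$$
cf. (6.52). Here $c_t$ and $\pi_t$ are held fixed over the interval $[t, t + \Delta t)$. If we multiply by $e^{\delta \Delta t}$, subtract
$J(W, x, t)$, and then divide by $\Delta t$, we get
$$\frac{e^{\delta \Delta t} - 1}{\Delta t}\, J(W, x, t) = \sup_{c_t \ge 0,\ \pi_t \in \mathbb{R}^d} \left\{e^{\delta \Delta t} u(c_t) + \frac{1}{\Delta t}\,\mathrm{E}_t\left[J(W_{t+\Delta t}, x_{t+\Delta t}, t + \Delta t) - J(W, x, t)\right]\right\}. \tag{6.61}$$
When we let $\Delta t \to 0$, we have that (by l'Hôpital's rule)
$$\frac{e^{\delta \Delta t} - 1}{\Delta t} \to \delta,$$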
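and that (by definition of the drift of a process)
$$\frac{1}{\Delta t}\,\mathrm{E}_t\left[J(W_{t+\Delta t}, x_{t+\Delta t}, t + \Delta t) - J(W, x, t)\right] \tag{6.62}$$
will approach the drift of $J$ at time $t$, which according to Itô's Lemma is given by
$$\begin{aligned}
&\frac{\partial J}{\partial t}(W, x, t) + J_W(W, x, t)\left(W\left[r^f(x) + \pi_t^\top \sigma(x, t)\lambda(x)\right] - c_t\right) \\
&\quad + \frac{1}{2} J_{WW}(W, x, t)\, W^2\, \pi_t^\top \sigma(x, t)\sigma(x, t)^\top \pi_t + J_x(W, x, t)\, m(x) \\
&\quad + \frac{1}{2} J_{xx}(W, x, t)\left(v(x)^\top v(x) + \hat{v}(x)^2\right) + J_{Wx}(W, x, t)\, W\, \pi_t^\top \sigma(x, t) v(x).
\end{aligned}$$
The limit of (6.61) is therefore
$$\begin{aligned}
\delta J(W, x, t) = \sup_{c \ge 0,\ \pi \in \mathbb{R}^d} \Big\{ & u(c) + \frac{\partial J}{\partial t}(W, x, t) + J_W(W, x, t)\left(W\left[r^f(x) + \pi^\top \sigma(x, t)\lambda(x)\right] - c\right) \\
& + \frac{1}{2} J_{WW}(W, x, t)\, W^2\, \pi^\top \sigma(x, t)\sigma(x, t)^\top \pi + J_x(W, x, t)\, m(x) \\
& + \frac{1}{2} J_{xx}(W, x, t)\left(v(x)^\top v(x) + \hat{v}(x)^2\right) + J_{Wx}(W, x, t)\, W\, \pi^\top \sigma(x, t) v(x) \Big\}.
\end{aligned} \tag{6.63}$$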
This is called the Hamilton-Jacobi-Bellman (HJB) equation corresponding to the dynamic optimization
problem. Subscripts on $J$ denote partial derivatives; however, we will write the partial
derivative with respect to time as $\partial J/\partial t$ to distinguish it from the value $J_t$ of the indirect utility
process. The HJB equation involves the supremum over the feasible time $t$ consumption rates
and portfolios (not the supremum over the entire processes!) and is therefore a highly non-linear
second-order partial differential equation.
From the analysis above we will expect that the indirect utility function $J(W, x, t)$ solves the
HJB equation for all possible values of $W$ and $x$ and all $t \in [0, T)$ and that it satisfies the terminal
condition $J(W, x, T) = 0$. (We could allow for some utility of terminal consumption or wealth, e.g.
representing the utility of leaving money for your heirs. Then the terminal condition should be of
the form $J(W, x, T) = \bar{u}(W)$.) This can be supported formally by the so-called verification theorem.
The solution procedure is therefore as follows: (1) solve the maximization problem embedded in the
HJB equation, giving a candidate for the optimal strategies expressed in terms of the yet unknown
indirect utility function and its derivatives; (2) substitute the candidate for the optimal strategies
into the HJB equation, ignore the sup-operator, and solve the resulting partial differential equation
for $J(W, x, t)$. Such a solution will then also give the candidate optimal strategies in terms of $W$,
$x$, and $t$. [3]

[3] There is really also a third step, namely to check that the assumptions made along the way and the technical
conditions needed for the verification theorem to apply are all satisfied. The standard version of the verification
theorem is precisely stated and proved in, e.g., Theorem 11.2.2 in Øksendal (1998) or Theorem III.8.1 in Fleming
and Soner (1993). The technical conditions of the standard version are not always satisfied in concrete consumption-portfolio
problems, however, but at least for some concrete problems a version with an appropriate set of conditions
can be found; see, e.g., Korn and Kraft (2001) and Kraft (2004).

Let us find the first-order conditions of the maximization in (6.63). The first-order condition
with respect to $c$ gives us immediately the envelope condition (6.47), which we were really looking
for. Nevertheless, let us also look at the first-order condition with respect to $\pi$, i.e.
$$J_W(W, x, t)\, W \sigma(x, t)\lambda(x) + J_{WW}(W, x, t)\, W^2 \sigma(x, t)\sigma(x, t)^\top \pi + J_{Wx}(W, x, t)\, W \sigma(x, t) v(x) = 0$$
so that the optimal portfolio is
$$\pi_t = -\frac{J_W(W_t, x_t, t)}{W_t J_{WW}(W_t, x_t, t)}\left(\sigma(x_t, t)^\top\right)^{-1}\lambda(x_t) - \frac{J_{Wx}(W_t, x_t, t)}{W_t J_{WW}(W_t, x_t, t)}\left(\sigma(x_t, t)^\top\right)^{-1} v(x_t). \tag{6.64}$$
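The two first-order conditions of the HJB equation can be spelled out as a short worked step. This is only a sketch of the algebra under the assumptions already made in the text ($\sigma$ non-singular, $J_{WW} \neq 0$); arguments of $J$, $\sigma$, $\lambda$ and $v$ are suppressed for readability.

```latex
% First-order condition with respect to c: only u(c) and -J_W c depend on c.
\frac{\partial}{\partial c}\bigl\{ u(c) - J_W\, c \bigr\} = 0
\quad\Longrightarrow\quad u'(c) = J_W .

% First-order condition with respect to \pi: divide by W and isolate \pi.
J_W\, \sigma\lambda + J_{WW}\, W\, \sigma\sigma^\top \pi + J_{Wx}\, \sigma v = 0
\quad\Longrightarrow\quad
\sigma\sigma^\top \pi = -\frac{J_W}{W J_{WW}}\, \sigma\lambda - \frac{J_{Wx}}{W J_{WW}}\, \sigma v .

% Premultiply by (\sigma\sigma^\top)^{-1} = (\sigma^\top)^{-1}\sigma^{-1}.
\pi = -\frac{J_W}{W J_{WW}}\, (\sigma^\top)^{-1}\lambda - \frac{J_{Wx}}{W J_{WW}}\, (\sigma^\top)^{-1} v .
```

The first line is the envelope condition (6.47), and the last line is the optimal portfolio (6.64).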
As the horizon shrinks, the indirect utility function $J(W, x, t)$ approaches the terminal utility which
is independent of the state $x$. Consequently, the derivative $J_{Wx}(W, x, t)$ and hence the last term
of the portfolio will approach zero as $t \to T$. Short-sighted investors pick a portfolio given by the
first term on the right-hand side. We can interpret the second term as an intertemporal hedge
term since it shows how a long-term investor will deviate from the short-term investor. The last
term will also disappear for non-instantaneous investors in three special cases:
(1) there is no $x$: investment opportunities are constant; there is nothing to hedge.
(2) $J_{Wx}(W, x, t) \equiv 0$: The state variable does not affect the marginal utility of the investor.
This turns out to be the case for investors with logarithmic utility. Such an investor is not
interested in hedging changes in the state variable.
(3) $v(x) \equiv 0$: The state variable is uncorrelated with instantaneous returns on the traded assets.
In this case the investor is not able to hedge changes in the state variable.
In all other cases the state variable induces an additional term to the optimal portfolio relative to
the case of constant investment opportunities.
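In general, we have three-fund separation in the sense that all investors will combine the risk-free
asset, the tangency portfolio of risky assets given by the portfolio weights
$$\pi^{\tan}_t = \frac{1}{\mathbf{1}^\top \left(\sigma(x_t, t)^\top\right)^{-1}\lambda(x_t)}\left(\sigma(x_t, t)^\top\right)^{-1}\lambda(x_t),$$
and the hedge portfolio given by the weights
$$\pi^{\mathrm{hdg}}_t = \frac{1}{\mathbf{1}^\top \left(\sigma(x_t, t)^\top\right)^{-1} v(x_t)}\left(\sigma(x_t, t)^\top\right)^{-1} v(x_t).$$
Inserting the definition of $\lambda$ and writing $\Sigma(x_t, t) = \sigma(x_t, t)\sigma(x_t, t)^\top$, we can rewrite the expression for the tangency portfolio as
$$\pi^{\tan}_t = \frac{1}{\mathbf{1}^\top \Sigma(x_t, t)^{-1}\left(\mu(x_t, t) - r^f(x_t)\mathbf{1}\right)}\,\Sigma(x_t, t)^{-1}\left(\mu(x_t, t) - r^f(x_t)\mathbf{1}\right),$$
which is analogous to the tangency portfolio in the one-period mean-variance analysis, cf. (6.33).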
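The decomposition of (6.64) into a myopic term and a hedge term can be sketched numerically for $d = 2$ risky assets. In practice the derivatives $J_W$, $J_{WW}$ and $J_{Wx}$ come from solving the HJB equation; here they are simply passed in as numbers, and all function names and inputs are hypothetical.

```python
def solve_transpose_2x2(sigma, b):
    """Return (sigma^T)^{-1} b for a 2x2 matrix sigma given as nested lists."""
    (a11, a12), (a21, a22) = sigma
    det = a11 * a22 - a12 * a21           # det(sigma^T) = det(sigma)
    if det == 0:
        raise ValueError("sigma must be non-singular")
    # sigma^T = [[a11, a21], [a12, a22]], so its inverse is [[a22, -a21], [-a12, a11]] / det.
    return [(a22 * b[0] - a21 * b[1]) / det, (-a12 * b[0] + a11 * b[1]) / det]


def optimal_portfolio(J_W, J_WW, J_Wx, W, sigma, lam, v):
    """Myopic term, hedge term and their sum, following (6.64)."""
    myopic_scale = -J_W / (W * J_WW)
    hedge_scale = -J_Wx / (W * J_WW)
    myopic = [myopic_scale * a for a in solve_transpose_2x2(sigma, lam)]
    hedge = [hedge_scale * a for a in solve_transpose_2x2(sigma, v)]
    total = [m + h for m, h in zip(myopic, hedge)]
    return myopic, hedge, total


def normalise(weights):
    """Rescale weights to sum to one (tangency or hedge portfolio of risky assets)."""
    s = sum(weights)
    return [w / s for w in weights]


def matvec(m, x):
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]


def foc_residual(J_W, J_WW, J_Wx, W, sigma, lam, v, pi):
    """Left-hand side of the first-order condition for pi; zero at the optimum."""
    sigma_T = [[sigma[0][0], sigma[1][0]], [sigma[0][1], sigma[1][1]]]
    s_lam, s_v = matvec(sigma, lam), matvec(sigma, v)
    cov_pi = matvec(sigma, matvec(sigma_T, pi))       # sigma sigma^T pi
    return [J_W * W * s_lam[i] + J_WW * W * W * cov_pi[i] + J_Wx * W * s_v[i]
            for i in range(2)]
```

Normalising the myopic term gives the tangency weights, and normalising the hedge term gives the hedge portfolio. The investor-specific ratios $-J_W/(W J_{WW})$ and $-J_{Wx}/(W J_{WW})$ only decide how much of each fund is held, which is the content of three-fund separation.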
Substituting the candidate optimal values of $c$ and $\pi$ back into the HJB equation and gathering
terms, we get the second-order PDE
$$\begin{aligned}
\delta J(W, x, t) ={}& u\bigl(I_u(J_W(W, x, t))\bigr) - J_W(W, x, t)\, I_u(J_W(W, x, t)) + \frac{\partial J}{\partial t}(W, x, t) + r^f(x)\, W J_W(W, x, t) \\
& - \frac{1}{2}\frac{J_W(W, x, t)^2}{J_{WW}(W, x, t)}\,\|\lambda(x)\|^2 + J_x(W, x, t)\, m(x) + \frac{1}{2} J_{xx}(W, x, t)\left(\|v(x)\|^2 + \hat{v}(x)^2\right) \\
& - \frac{1}{2}\frac{J_{Wx}(W, x, t)^2}{J_{WW}(W, x, t)}\,\|v(x)\|^2 - \frac{J_W(W, x, t)\, J_{Wx}(W, x, t)}{J_{WW}(W, x, t)}\,\lambda(x)^\top v(x).
\end{aligned} \tag{6.65}$$
Although the PDE (6.65) looks very complicated, closed-form solutions can be found for a number
of interesting model specifications. See, e.g., Munk (2008).
If more than one, say $k$, variables are necessary to capture the information about investment
opportunities, the optimal portfolio will involve $k$ hedge portfolios beside the tangency portfolio
and the risk-free asset so that $(k + 2)$-fund separation holds.
Nielsen and Vassalou (2006) have shown that the only characteristics of investment opportunities
that will induce intertemporal hedging are the short-term risk-free interest rate $r^f_t$ and $\|\lambda_t\|$, which is
the maximum Sharpe ratio obtainable in the financial market. Since $r^f_t$ is the intercept and $\|\lambda_t\|$
the slope of the instantaneous mean-variance frontier, this result makes good sense. Long-term
investors are concerned about the variations in the investments that are good in the short run.
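6.6 Epstein-Zin recursive utility
Now consider the Epstein-Zin preferences introduced in Section 5.7.3, where the utility index $U_t$
is specified recursively as
$$U_t = \left[a\, c_t^{\frac{1-\gamma}{\theta}} + e^{-\delta}\left(\mathrm{E}_t\left[U_{t+1}^{1-\gamma}\right]\right)^{\frac{1}{\theta}}\right]^{\frac{\theta}{1-\gamma}}.$$
Here $\gamma$ is the relative risk aversion, $\delta$ is the subjective discount rate, and $\theta = (1-\gamma)/(1 - \frac{1}{\psi})$ where
$\psi$ is the elasticity of intertemporal substitution.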
At time $t$, the objective of the individual is to maximize $U_t$ over all consumption and portfolio
strategies $(c_t, c_{t+1}, c_{t+2}, \dots)$ and $(\pi_t, \pi_{t+1}, \pi_{t+2}, \dots)$. Let $J_t$ denote the maximum value of $U_t$,
typically referred to as the indirect utility or derived utility. The dynamic programming idea
implies that
$$J_t = \sup_{c_t,\pi_t} \left[a\, c_t^{\frac{1-\gamma}{\theta}} + e^{-\delta}\left(\mathrm{E}_t\left[J_{t+1}^{1-\gamma}\right]\right)^{\frac{1}{\theta}}\right]^{\frac{\theta}{1-\gamma}}. \tag{6.66}$$
The indirect utility clearly depends on wealth. The wealth dynamics is given by
$$W_{t+1} = (W_t - c_t)\,\pi_t^\top R_{t+1} = (W_t - c_t)\, R^{\pi_t}_{t+1}. \tag{6.67}$$
We conjecture that
$$J_t = h_t W_t.$$
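Substituting the conjecture and the wealth dynamics into (6.66), we obtain
$$h_t W_t = \sup_{c_t,\pi_t} \left[a\, c_t^{\frac{1-\gamma}{\theta}} + e^{-\delta}(W_t - c_t)^{\frac{1-\gamma}{\theta}}\left(\mathrm{E}_t\left[h_{t+1}^{1-\gamma}\left(R^{\pi_t}_{t+1}\right)^{1-\gamma}\right]\right)^{\frac{1}{\theta}}\right]^{\frac{\theta}{1-\gamma}}. \tag{6.68}$$
In particular, for the optimal values of $c_t$ and $\pi_t$, we have
$$\begin{aligned}
h_t W_t &= \left[a\, c_t^{\frac{1-\gamma}{\theta}} + e^{-\delta}(W_t - c_t)^{\frac{1-\gamma}{\theta}}\left(\mathrm{E}_t\left[h_{t+1}^{1-\gamma}\left(R^{\pi_t}_{t+1}\right)^{1-\gamma}\right]\right)^{\frac{1}{\theta}}\right]^{\frac{\theta}{1-\gamma}} \\
&= c_t\left[a + e^{-\delta}\left(\frac{W_t - c_t}{c_t}\right)^{\frac{1-\gamma}{\theta}}\left(\mathrm{E}_t\left[h_{t+1}^{1-\gamma}\left(R^{\pi_t}_{t+1}\right)^{1-\gamma}\right]\right)^{\frac{1}{\theta}}\right]^{\frac{\theta}{1-\gamma}}.
\end{aligned} \tag{6.69}$$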
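6.6.1 First-order condition wrt. $c_t$
In the maximization in (6.68), the first-order condition with respect to $c_t$ implies that
$$a\, c_t^{\frac{1-\gamma}{\theta} - 1} = e^{-\delta}(W_t - c_t)^{\frac{1-\gamma}{\theta} - 1}\left(\mathrm{E}_t\left[h_{t+1}^{1-\gamma}\left(R^{\pi_t}_{t+1}\right)^{1-\gamma}\right]\right)^{\frac{1}{\theta}},$$
or, equivalently,
$$a\left(\frac{W_t - c_t}{c_t}\right)^{1 - \frac{1-\gamma}{\theta}} = e^{-\delta}\left(\mathrm{E}_t\left[h_{t+1}^{1-\gamma}\left(R^{\pi_t}_{t+1}\right)^{1-\gamma}\right]\right)^{\frac{1}{\theta}}. \tag{6.70}$$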
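Substituting that into (6.69), we obtain
$$\begin{aligned}
h_t W_t &= c_t\left[a + a\left(\frac{W_t - c_t}{c_t}\right)^{\frac{1-\gamma}{\theta}}\left(\frac{W_t - c_t}{c_t}\right)^{1 - \frac{1-\gamma}{\theta}}\right]^{\frac{\theta}{1-\gamma}} \\
&= c_t\left[a\,\frac{W_t}{c_t}\right]^{\frac{\theta}{1-\gamma}} = a^{\frac{\theta}{1-\gamma}}\, c_t\left(\frac{W_t}{c_t}\right)^{\frac{\theta}{1-\gamma}},
\end{aligned}$$
giving
$$h_t = a^{\frac{\theta}{1-\gamma}}\left(\frac{c_t}{W_t}\right)^{1 - \frac{\theta}{1-\gamma}}. \tag{6.71}$$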
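Substituting the expression for $h_t$, applied at time $t+1$, into the first-order condition (6.70) gives
$$a\left(\frac{W_t - c_t}{c_t}\right)^{1 - \frac{1-\gamma}{\theta}} = e^{-\delta}\left(\mathrm{E}_t\left[a^{\theta}\left(\frac{c_{t+1}}{W_{t+1}}\right)^{1-\gamma-\theta}\left(R^{\pi_t}_{t+1}\right)^{1-\gamma}\right]\right)^{\frac{1}{\theta}}.$$
Eliminating the $a$'s and substituting in the budget constraint (6.67), we arrive at
$$\begin{aligned}
\left(\frac{W_t - c_t}{c_t}\right)^{1 - \frac{1-\gamma}{\theta}} &= e^{-\delta}\left(\mathrm{E}_t\left[\left(\frac{c_{t+1}}{(W_t - c_t) R^{\pi_t}_{t+1}}\right)^{1-\gamma-\theta}\left(R^{\pi_t}_{t+1}\right)^{1-\gamma}\right]\right)^{\frac{1}{\theta}} \\
&= e^{-\delta}\left(\mathrm{E}_t\left[\left(\frac{c_{t+1}}{W_t - c_t}\right)^{1-\gamma-\theta}\left(R^{\pi_t}_{t+1}\right)^{\theta}\right]\right)^{\frac{1}{\theta}} \\
&= e^{-\delta}\left(\frac{W_t - c_t}{c_t}\right)^{1 - \frac{1-\gamma}{\theta}}\left(\mathrm{E}_t\left[\left(\frac{c_{t+1}}{c_t}\right)^{1-\gamma-\theta}\left(R^{\pi_t}_{t+1}\right)^{\theta}\right]\right)^{\frac{1}{\theta}},
\end{aligned}$$
so that
$$1 = e^{-\delta}\left(\mathrm{E}_t\left[\left(\frac{c_{t+1}}{c_t}\right)^{1-\gamma-\theta}\left(R^{\pi_t}_{t+1}\right)^{\theta}\right]\right)^{\frac{1}{\theta}}$$
and thus
$$1 = e^{-\delta\theta}\,\mathrm{E}_t\left[\left(\frac{c_{t+1}}{c_t}\right)^{1-\gamma-\theta}\left(R^{\pi_t}_{t+1}\right)^{\theta}\right] = \mathrm{E}_t\left[e^{-\delta\theta}\left(\frac{c_{t+1}}{c_t}\right)^{1-\gamma-\theta}\left(R^{\pi_t}_{t+1}\right)^{\theta}\right].$$
Since $1 - \gamma - \theta = \theta\left(1 - \frac{1}{\psi}\right) - \theta = -\theta/\psi$, we have
$$1 = \mathrm{E}_t\left[e^{-\delta\theta}\left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi}\left(R^{\pi_t}_{t+1}\right)^{\theta}\right]. \tag{6.72}$$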
6.6.2 First-order condition wrt. $\pi_t$
The maximization with respect to the portfolio is subject to the condition that $\pi_t^\top \mathbf{1} = 1$,
i.e. $\pi_{1t} + \dots + \pi_{It} = 1$. Hence $\pi_{1t} = 1 - (\pi_{2t} + \dots + \pi_{It})$ and the portfolio return can be written as
$$R^{\pi_t}_{t+1} = \sum_{i=1}^{I} \pi_{it} R_{i,t+1} = \pi_{1t} R_{1,t+1} + \sum_{i=2}^{I} \pi_{it} R_{i,t+1} = R_{1,t+1} + \sum_{i=2}^{I} \pi_{it}\left(R_{i,t+1} - R_{1,t+1}\right).$$
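Substituting this into the right-hand side of (6.68), we can see that the first-order condition with
respect to $\pi_{it}$, $i = 2, \dots, I$, implies that
$$0 = \mathrm{E}_t\left[h_{t+1}^{1-\gamma}\left(R^{\pi_t}_{t+1}\right)^{-\gamma}\left(R_{i,t+1} - R_{1,t+1}\right)\right].$$
Using (6.71) and subsequently (6.67), we have
$$\begin{aligned}
0 &= \mathrm{E}_t\left[\left(\frac{c_{t+1}}{W_{t+1}}\right)^{1-\gamma-\theta}\left(R^{\pi_t}_{t+1}\right)^{-\gamma}\left(R_{i,t+1} - R_{1,t+1}\right)\right] \\
&= \mathrm{E}_t\left[\left(\frac{c_{t+1}}{W_t - c_t}\right)^{1-\gamma-\theta}\left(R^{\pi_t}_{t+1}\right)^{\theta-1}\left(R_{i,t+1} - R_{1,t+1}\right)\right]
\end{aligned}$$
and since $1 - \gamma - \theta = -\theta/\psi$ and we can multiply with anything that is time-$t$ measurable, we
conclude that
$$0 = \mathrm{E}_t\left[\left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi}\left(R^{\pi_t}_{t+1}\right)^{\theta-1}\left(R_{i,t+1} - R_{1,t+1}\right)\right].$$
Multiplying by $\pi_{it}$ and summing over $i$ gives
$$\begin{aligned}
0 &= \mathrm{E}_t\left[\left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi}\left(R^{\pi_t}_{t+1}\right)^{\theta-1}\sum_{i=2}^{I} \pi_{it}\left(R_{i,t+1} - R_{1,t+1}\right)\right] \\
&= \mathrm{E}_t\left[\left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi}\left(R^{\pi_t}_{t+1}\right)^{\theta-1}\left(R^{\pi_t}_{t+1} - R_{1,t+1}\right)\right],
\end{aligned}$$
which implies that
$$\mathrm{E}_t\left[\left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi}\left(R^{\pi_t}_{t+1}\right)^{\theta-1} R_{1,t+1}\right] = \mathrm{E}_t\left[\left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi}\left(R^{\pi_t}_{t+1}\right)^{\theta}\right] = e^{\delta\theta},$$
where the last equality follows from (6.72), and thus
$$\mathrm{E}_t\left[e^{-\delta\theta}\left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi}\left(R^{\pi_t}_{t+1}\right)^{\theta-1} R_{1,t+1}\right] = 1.$$
The above computation could be done for any of the $I$ assets so we conclude that
$$\mathrm{E}_t\left[e^{-\delta\theta}\left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi}\left(R^{\pi_t}_{t+1}\right)^{\theta-1} R_{i,t+1}\right] = 1, \qquad i = 1, \dots, I. \tag{6.73}$$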
6.6.3 The state-price deflator
We can now see that the optimal decisions of the individual give rise to a next-period state-price
deflator given by
$$\frac{\zeta_{t+1}}{\zeta_t} = e^{-\delta\theta}\left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi}\left(R^{\pi_t}_{t+1}\right)^{\theta-1}. \tag{6.74}$$
In particular if $\gamma = 1/\psi$ and thus $\theta = 1$, we are back to the traditional state-price deflator for
time-additive power utility, $\zeta_{t+1}/\zeta_t = e^{-\delta}(c_{t+1}/c_t)^{-\gamma}$.
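The reduction to the power-utility case can be checked with a few lines of Python. This is only an illustrative sketch of (6.74); the function names and the numerical inputs are hypothetical, and the case $\psi = 1$ (where $\theta$ is undefined) is deliberately excluded.

```python
import math


def ez_theta(gamma, psi):
    """theta = (1 - gamma) / (1 - 1/psi); undefined for psi = 1."""
    if psi == 1:
        raise ValueError("theta is undefined when psi equals 1")
    return (1 - gamma) / (1 - 1 / psi)


def ez_deflator_ratio(cons_growth, portfolio_gross_return, gamma, psi, delta):
    """zeta_{t+1}/zeta_t for Epstein-Zin utility, cf. (6.74).

    cons_growth            -- c_{t+1} / c_t
    portfolio_gross_return -- gross return on the optimally invested wealth
    """
    theta = ez_theta(gamma, psi)
    return (math.exp(-delta * theta)
            * cons_growth ** (-theta / psi)
            * portfolio_gross_return ** (theta - 1))


def power_deflator_ratio(cons_growth, gamma, delta):
    """zeta_{t+1}/zeta_t for time-additive power utility."""
    return math.exp(-delta) * cons_growth ** (-gamma)
```

With $\gamma = 4$ and $\psi = 1/4$ we get $\theta = 1$, so the portfolio return drops out and `ez_deflator_ratio` coincides with `power_deflator_ratio` for any value of the portfolio return.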
6.7 Concluding remarks
This chapter has characterized the optimal consumption and portfolio choice of an individual by
the first-order condition of her utility maximization problem. The characterization provides a link
between asset prices and the optimal consumption plan of any individual. In the next chapter we
will look at the market equilibrium.
6.8 Exercises
EXERCISE 6.1 Consider a one-period economy and an individual with a time-additive but
state-dependent expected utility so that the objective is
$$\max\ u(c_0, X_0) + e^{-\delta}\,\mathrm{E}[u(c, X)].$$
The decisions of the individual do not affect $X_0$ or $X$. For example, $X_0$ and $X$ could be the
aggregate consumption in the economy at time 0 and time 1, respectively, which are not significantly
affected by the consumption of a small individual. What is the link between prices and marginal
utility in this case? What if $u(c, X) = \frac{1}{1-\gamma}(c - X)^{1-\gamma}$? What if $u(c, X) = \frac{1}{1-\gamma}(c/X)^{1-\gamma}$?
EXERCISE 6.2 Consider a one-period economy where four basic financial assets are traded
without portfolio constraints or transaction costs. There are four possible end-of-period states of
the economy. The objective state probabilities and the prices and state-contingent dividends of
the assets are given in the following table:

              state 1   state 2   state 3   state 4
probability     1/6       1/4       1/4       1/3

              state-contingent dividend
              state 1   state 2   state 3   state 4   price
Asset 1          1         2         2         2       3/2
Asset 2          1         3         0         0       7/6
Asset 3          0         0         1         1       1/3
Asset 4          3         2         1         1       3/2

The economy is known to be arbitrage-free.
(a) Show that asset 4 is redundant and verify that the price of asset 4 is identical to the price of
the portfolio of the other assets that replicates asset 4.
(b) Is the market complete?
(c) Show that the vector $\psi = \left(\frac{1}{6}, \frac{1}{3}, \frac{1}{6}, \frac{1}{6}\right)$ is a valid state-price vector and that it is in the set of
dividends spanned by the basic assets. Characterize the set of all valid state-price vectors.
(d) Show that the vector $\zeta = \left(1, \frac{4}{3}, \frac{4}{7}, \frac{4}{7}\right)$ is a valid state-price deflator and that it is in the set of
dividends spanned by the basic assets. Show that any state-price deflator must be a vector
of the form $\left(1, \frac{4}{3}, y, 1 - \frac{3}{4}y\right)$, where $y \in \left(0, \frac{4}{3}\right)$.
(e) Show that it is possible to construct a risk-free asset from the four basic assets. What is the
risk-free interest rate?
In the following consider an individual maximizing $u(c_0) + \beta\,\mathrm{E}[u(c)]$, where $c_0$ denotes consumption
at the beginning of the period and $c$ denotes the state-dependent consumption at the end of
the period. Assume $u(c) = c^{1-\gamma}/(1-\gamma)$. For $\omega \in \{1, 2, 3, 4\}$, let $c_\omega$
3
4
_
c
3
c
0
_
_
1/
.
For the remainder of the problem it is assumed that the individual has identical income/endowment
in states 3 and 4.
(g) Explain why $c_3$ and $c_4$ must be identical, and hence that the optimal consumption plan must
have
$$c_3 = c_4 = c_0\left(\frac{4}{7\beta}\right)^{-1/\gamma}.$$
(h) Assuming $\gamma = 2$, $\beta = \frac{6}{7}$, and $c_0 = 1$, find the optimal state-dependent end-of-period
consumption, i.e. $c_1, c_2, c_3, c_4$.
(i) What is the present value of the optimal consumption plan?
(j) Assuming that the individual receives no end-of-period income in any state, find an optimal
portfolio for this individual.
EXERCISE 6.3 Consider a one-period economy with four possible, equally likely, states at the
end of the period. The agents in the economy consume at the beginning of the period (time 0)
and at the end of the period (time 1). The agents can choose between three different consumption
plans as shown in the following table:

                     consumption   state-contingent time 1 consumption
                     at time 0     state 1   state 2   state 3   state 4
Consumption plan 1       8            9        16         9         4
Consumption plan 2       8            9         9         9         9
Consumption plan 3       8            4        16        25         4

Denote the time 0 consumption by $c_0$, the uncertain consumption at time 1 by $c$, and the
consumption at time 1 in case state $\omega$ is realized by $c_\omega$.
(a) Consider an agent with logarithmic utility,
$$U(c_0, c_1, c_2, c_3, c_4) = \ln c_0 + \mathrm{E}[\ln c] = \ln c_0 + \sum_{\omega=1}^{4} p_\omega \ln c_\omega,$$
where $p_\omega$ is the probability that state $\omega$ is realized. Compute the utility for each of the
three possible consumption plans and determine the optimal consumption plan. Find the
associated state-price vector. Using this state-price vector, what is the price at the beginning
of the period of an asset that gives a payoff of 2 in states 1 and 4 and a payoff of 1 in states 2
and 3?
(b) Answer the same questions with the alternative time-additive square root utility,
$$U(c_0, c_1, c_2, c_3, c_4) = \sqrt{c_0} + \mathrm{E}[\sqrt{c}] = \sqrt{c_0} + \sum_{\omega=1}^{4} p_\omega \sqrt{c_\omega}.$$
(c) Answer the same questions with the alternative habit-style square root utility,
$$U(c_0, c_1, c_2, c_3, c_4) = \sqrt{c_0} + \mathrm{E}\left[\sqrt{c - 0.5 c_0}\right] = \sqrt{c_0} + \sum_{\omega=1}^{4} p_\omega \sqrt{c_\omega - 0.5 c_0}.$$
EXERCISE 6.4 Show Equation (6.24).
EXERCISE 6.5 Using the Lagrangian characterization of the meanvariance frontier, show that
for any meanvariance ecient return R
, R
z()
] = 0. Show that E[R
z()
] =
(ABE[R
])/(B C E[R
]).
EXERCISE 6.6 Let $R^{\min}$ denote the return on the minimum-variance portfolio. Let $R$ be any
other return, efficient or not. Show that $\operatorname{Cov}[R, R^{\min}] = \operatorname{Var}[R^{\min}]$.
EXERCISE 6.7 Let $R_1$ denote the return on a mean-variance efficient portfolio and let $R_2$
denote the return on another not necessarily efficient portfolio with $\mathrm{E}[R_2] = \mathrm{E}[R_1]$. Show that
$\operatorname{Cov}[R_1, R_2] = \operatorname{Var}[R_1]$ and conclude that $R_1$ and $R_2$ are positively correlated.
EXERCISE 6.8 Think of the mean-variance framework in a one-period economy. Show that if
there is a risk-free asset, then any two mean-variance efficient returns (different from the risk-free
return) are either perfectly positively correlated or perfectly negatively correlated. Is that also
true if there is no risk-free asset?
EXERCISE 6.9 In a one-period model where the returns of all the risky assets are normally
distributed, any greedy and risk-averse investor will place herself on the upward-sloping part of
the mean-variance frontier. But where? Consider an agent that maximizes expected utility of
end-of-period wealth with a negative exponential utility function $u(W) = -e^{-aW}$ for some constant $a$.
Suppose that $M$ risky assets (with normally distributed returns) and one risk-free asset are traded.
What is the optimal portfolio of the agent? Where is the optimal portfolio located on the
mean-variance frontier?
EXERCISE 6.10 Look at an individual with habit formation living in a continuous-time
complete market economy. The individual wants to maximize his expected utility
$$\mathrm{E}\left[\int_0^T e^{-\delta t} u(c_t, h_t)\,dt\right],$$
where the habit level $h_t$ is given by
$$h_t = h_0 e^{-\alpha t} + \beta\int_0^t e^{-\alpha(t-u)} c_u\,du.$$
We can write the budget constraint as
$$\mathrm{E}\left[\int_0^T \zeta_t c_t\,dt\right] \le W_0,$$
where $\zeta = (\zeta_t)$ is the state-price deflator and $W_0$ is the initial wealth of the agent (including the
present value of any future non-financial income).
(a) Show that $dh_t = (\beta c_t - \alpha h_t)\,dt$. What condition on $\alpha$ and $\beta$ will ensure that the habit level
declines, when current consumption equals the habit level?
(b) Show that the state-price deflator is linked to optimal consumption by the relation
$$\zeta_t = k e^{-\delta t}\left(u_c(c_t, h_t) + \beta\,\mathrm{E}_t\left[\int_t^T e^{-(\delta+\alpha)(s-t)} u_h(c_s, h_s)\,ds\right]\right) \tag{*}$$
for some appropriate constant $k$. Hint: First consider what effect consumption at time $t$ has
on future habit levels.
(c) How does (*) look when $u(c, h) = \frac{1}{1-\gamma}(c - h)^{1-\gamma}$?
CHAPTER 7
Market equilibrium
7.1 Introduction
The previous chapter characterized the optimal decisions of utility-maximizing individuals who take asset prices as given. This chapter will focus on the characterization of market equilibrium asset prices. We will work throughout in a one-period model and assume that the state space is finite, $\Omega = \{1, 2, \dots, S\}$. The results can be generalized to an infinite state space and a multi-period setting.
First, let us define an equilibrium. Consider a one-period economy with $I$ assets and $L$ greedy and risk-averse individuals. Each asset $i$ is characterized by its time 0 price $P_i$ and a random variable $D_i$ representing the time 1 dividend. Each individual is characterized by a (strictly increasing and concave) utility index $U_l$ and an endowment $(e_0^l, e^l)$. An equilibrium for the economy consists of a price vector $P$ and portfolios $\theta^l$, $l = 1, \dots, L$, satisfying the two conditions
(i) for each $l = 1, \dots, L$, $\theta^l$ is optimal for individual $l$ given $P$,
(ii) markets clear, i.e. $\sum_{l=1}^L \theta_i^l = 0$ for all $i = 1, \dots, I$.
Associated with an equilibrium $(P; \theta^1, \dots, \theta^L)$ is an equilibrium consumption allocation $(c_0^l, c^l)$, $l = 1, \dots, L$, defined by
\[ c_0^l = e_0^l - \theta^l \cdot P; \qquad c_\omega^l = e_\omega^l + D_\omega \cdot \theta^l, \quad \omega \in \Omega. \]
In the market clearing condition we have assumed that the traded assets are in a net supply of zero and, since the time 0 endowment is a certain number of units of the consumption good, no one owns any assets at time 0. This might seem restrictive but it does cover the case with initial asset holdings and positive net supply of assets. Just interpret $\theta^l$ as individual $l$'s net trade in the assets, i.e. the change in the portfolio relative to the initial portfolio, and interpret the time 1 endowment as the sum of some income from non-financial sources and the dividend from the initial portfolio.
We will assume throughout that individuals have homogeneous beliefs, i.e. that they agree that
probabilities of future events are measured by the probability measure P.
Outline of the chapter...
7.2 Pareto-optimality and representative individuals
Define the aggregate initial and future endowment in the economy as
\[ \bar e_0 = \sum_{l=1}^L e_0^l, \qquad \bar e_\omega = \sum_{l=1}^L e_\omega^l. \]
(If we allow for assets in a positive net supply, the dividends of those assets are to be included in the time 1 aggregate endowment.) A consumption allocation $(c_0^1, c^1), \dots, (c_0^L, c^L)$ is said to be feasible if
\[ \sum_{l=1}^L c_0^l \le \bar e_0; \qquad \sum_{l=1}^L c_\omega^l \le \bar e_\omega, \quad \omega \in \Omega. \]
Here, the left-hand sums define the aggregate consumption,
\[ C_0 = \sum_{l=1}^L c_0^l, \qquad C_\omega = \sum_{l=1}^L c_\omega^l. \]
A consumption allocation $(c_0^1, c^1), \dots, (c_0^L, c^L)$ is Pareto-optimal if it is feasible and there is no other feasible consumption plan $(\hat c_0^1, \hat c^1), \dots, (\hat c_0^L, \hat c^L)$ such that $U_l(\hat c_0^l, \hat c^l) \ge U_l(c_0^l, c^l)$ for all $l = 1, \dots, L$ with strict inequality for some $l$.
Pareto-optimality of consumption allocations is closely linked to the solution to the allocation problem of a hypothetical central planner. Let $\eta = (\eta_1, \dots, \eta_L)^\top$ be a vector of strictly positive weights, one for each individual, and define the function $U_\eta : \mathbb{R}_+ \times \mathbb{R}_+^S \to \mathbb{R}$ by
\[ U_\eta(\bar e_0, \bar e) = \sup\left\{ \sum_{l=1}^L \eta_l U_l(c_0^l, c^l) \;\middle|\; \sum_{l=1}^L c_0^l \le \bar e_0, \ \sum_{l=1}^L c_\omega^l \le \bar e_\omega, \ \omega \in \Omega; \ c_0^l, c_\omega^l \ge 0 \right\}. \]
Here $\sum_{l=1}^L \eta_l U_l(c_0^l, c^l)$ is a linear combination of the utilities of the individuals when individual $l$ follows the consumption plan $(c_0^l, c^l)$. This linear combination is maximized over all feasible allocations of the total endowment $(\bar e_0, \bar e)$. Given the total endowment and the weights on individuals, $U_\eta$ gives the best linear combination of utilities that can be obtained. As long as the individuals' utility indices are increasing and concave, $U_\eta$ is also increasing and concave. We can therefore think of $U_\eta$ as the utility index of a greedy and risk-averse individual, a central planner giving weights to the individual utility functions. The following theorem, sometimes called the Second Welfare Theorem, gives the link to Pareto-optimality:
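Theorem 7.1 Given any Pareto-optimal consumption allocation with aggregate consumption $(C_0, C)$, a strictly positive weighting vector $\eta = (\eta_1, \dots, \eta_L)^\top$ exists so that the same allocation maximizes $U_\eta(C_0, C)$. Conversely, for any strictly positive weighting vector $\eta$, the allocation maximizing $U_\eta(C_0, C)$ is Pareto-optimal.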
Consequently, we can find Pareto-optimal allocations by solving the central planner's problem for a given weighting vector $\eta$. The Lagrangian of the central planner's constrained maximization problem is
\[ \mathcal{L} = \sum_{l=1}^L \eta_l U_l(c_0^l, c^l) + \alpha_0 \left( C_0 - \sum_{l=1}^L c_0^l \right) + \sum_{\omega=1}^S \alpha_\omega \left( C_\omega - \sum_{l=1}^L c_\omega^l \right), \]
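where $\alpha_0, \alpha_1, \dots, \alpha_S$ are Lagrange multipliers. The first-order conditions are
\[ \frac{\partial \mathcal{L}}{\partial c_0^l} = 0 \ \Leftrightarrow\ \eta_l \frac{\partial U_l}{\partial c_0^l} = \alpha_0, \quad l = 1, \dots, L, \tag{7.1} \]
\[ \frac{\partial \mathcal{L}}{\partial c_\omega^l} = 0 \ \Leftrightarrow\ \eta_l \frac{\partial U_l}{\partial c_\omega^l} = \alpha_\omega, \quad \omega = 1, \dots, S, \ l = 1, \dots, L, \tag{7.2} \]
\[ \frac{\partial \mathcal{L}}{\partial \alpha_0} = 0 \ \Leftrightarrow\ \sum_{l=1}^L c_0^l = C_0, \tag{7.3} \]
\[ \frac{\partial \mathcal{L}}{\partial \alpha_\omega} = 0 \ \Leftrightarrow\ \sum_{l=1}^L c_\omega^l = C_\omega, \quad \omega = 1, \dots, S. \tag{7.4} \]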
Note that since we can scale all $\eta_l$'s by a positive constant without affecting the maximizing consumption allocation, we might as well assume $\alpha_0 = 1$. Given strictly increasing and concave utility functions with infinite marginal utility at zero, the first-order conditions are both necessary and sufficient for optimality. We can thus conclude that a feasible consumption allocation is Pareto-optimal if and only if we can find a weighting vector $\eta = (\eta_1, \dots, \eta_L)^\top$ so that (7.1) and (7.2) are satisfied. In particular, dividing (7.2) by (7.1) gives
\[ \frac{\partial U_l / \partial c_\omega^l}{\partial U_l / \partial c_0^l} = \frac{\alpha_\omega}{\alpha_0}, \quad \omega = 1, \dots, S, \ l = 1, \dots, L, \tag{7.5} \]
with the consequence that
\[ \frac{\partial U_k / \partial c_\omega^k}{\partial U_k / \partial c_0^k} = \frac{\partial U_l / \partial c_\omega^l}{\partial U_l / \partial c_0^l}, \quad \omega = 1, \dots, S, \tag{7.6} \]
for any individuals $k$ and $l$, i.e. the individuals align their marginal rates of substitution. This property is often referred to as efficient risk-sharing. The central planner will distribute aggregate consumption risk so that all individuals have the same marginal willingness to shift consumption across time and states.
Note that if individuals have time-additive expected utility, i.e.
\[ U_l(c_0, c) = u_l(c_0) + e^{-\delta_l} \mathrm{E}[u_l(c)] = u_l(c_0) + e^{-\delta_l} \sum_{\omega=1}^S p_\omega u_l(c_\omega), \]
we can replace (7.1) and (7.2) by
\[ \eta_l u_l'(c_0^l) = \alpha_0, \quad l = 1, \dots, L, \tag{7.7} \]
\[ \eta_l e^{-\delta_l} p_\omega u_l'(c_\omega^l) = \alpha_\omega, \quad \omega = 1, \dots, S, \ l = 1, \dots, L. \tag{7.8} \]
Dividing the second equation by the first, we see that a Pareto-optimal consumption allocation has the property that
\[ e^{-\delta_k} \frac{u_k'(c_\omega^k)}{u_k'(c_0^k)} = e^{-\delta_l} \frac{u_l'(c_\omega^l)}{u_l'(c_0^l)}, \quad \omega = 1, \dots, S, \tag{7.9} \]
for any two individuals $k$ and $l$.
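The conditions (7.7)-(7.9) can be explored numerically. The following is a minimal sketch, not part of the original text, assuming CRRA utilities $u_l'(c) = c^{-\gamma_l}$ with hypothetical weights, time preference rates, and risk aversions. By (7.8), in every state the quantity $\eta_l e^{-\delta_l} u_l'(c_\omega^l)$ must equal the common value $\alpha_\omega / p_\omega$, so the split of aggregate consumption depends on the state only through $C_\omega$. The helper `planner_split` is an illustrative name.

```python
import math

def planner_split(C, weights, gammas):
    """Split aggregate consumption C among individuals with CRRA utility.

    Solves weights[l] * c_l**(-gammas[l]) = a for all l together with
    sum(c_l) = C, by bisection on the common multiplier a.
    """
    def total(a):
        # Consumption implied by the first-order condition for multiplier a.
        return sum((a / w) ** (-1.0 / g) for w, g in zip(weights, gammas))

    lo, hi = 1e-12, 1e12          # bracket for the multiplier
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisect on a log scale
        if total(mid) > C:
            lo = mid              # implied consumption too high: raise a
        else:
            hi = mid
    a = math.sqrt(lo * hi)
    return [(a / w) ** (-1.0 / g) for w, g in zip(weights, gammas)]

# Hypothetical example: effective weights eta_l * exp(-delta_l) for two individuals.
weights = [1.0 * math.exp(-0.02), 2.0 * math.exp(-0.05)]
gammas = [2.0, 5.0]
levels = [1.0, 2.0, 3.0]          # aggregate consumption in three states
shares = [planner_split(C, weights, gammas) for C in levels]
for C, c in zip(levels, shares):
    print(C, [round(x, 4) for x in c])
```

In this sketch each individual's consumption rises with aggregate consumption, although the less risk-averse individual absorbs more of the variation.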
The following theorem shows that all Pareto-optimal consumption allocations have the property that the consumption levels of the individuals move together.
Theorem 7.2 For any Pareto-optimal consumption allocation, the consumption of any individual will be a strictly increasing function of aggregate consumption, i.e. for any $l$, $c_\omega^l = f_l(C_\omega)$ for some strictly increasing function $f_l$.
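Proof: Consider some Pareto-optimal consumption allocation $(c_0^l, c^l)$, $l = 1, \dots, L$. Assume for simplicity that preferences can be represented by time-additive expected utility. From Theorem 7.1 we know that we can find a weighting vector $\eta = (\eta_1, \dots, \eta_L)^\top$ such that (7.7) and (7.8) hold. In particular,
\[ \eta_k u_k'(c_0^k) = \eta_l u_l'(c_0^l), \qquad \eta_k e^{-\delta_k} u_k'(c_\omega^k) = \eta_l e^{-\delta_l} u_l'(c_\omega^l), \quad \omega = 1, \dots, S, \]
for any two individuals $k$ and $l$. Moreover, for any two states $\omega, \omega' \in \Omega$, we must have
\[ \frac{u_k'(c_\omega^k)}{u_k'(c_{\omega'}^k)} = \frac{u_l'(c_\omega^l)}{u_l'(c_{\omega'}^l)}. \tag{7.10} \]
Aggregate consumption in state $\omega$ is $C_\omega = \sum_{l=1}^L c_\omega^l$. Suppose that aggregate consumption is higher in state $\omega$ than in state $\omega'$, i.e. $C_\omega > C_{\omega'}$. Then at least one individual, say individual $l$, must consume more in state $\omega$ than in state $\omega'$, $c_\omega^l > c_{\omega'}^l$. Consequently, $u_l'(c_\omega^l) < u_l'(c_{\omega'}^l)$. But then (7.10) implies that $u_k'(c_\omega^k) < u_k'(c_{\omega'}^k)$ and thus $c_\omega^k > c_{\omega'}^k$ for every individual $k$. □
Suppose that aggregate time 1 consumption can take on $N$ different values $x_1, \dots, x_N$, and let $\Omega_n = \{\omega \in \Omega \mid C_\omega = x_n\}$ denote the set of states in which aggregate consumption equals $x_n$. With a Pareto-optimal allocation, the state price of any state $\omega \in \Omega_n$, i.e. any state with $C_\omega = x_n$, can be written in terms of the marginal rate of substitution of any individual $l$ as
\[ \psi_\omega = e^{-\delta_l} p_\omega \frac{u_l'(c_\omega^l)}{u_l'(c_0^l)} = e^{-\delta_l} p_\omega \frac{u_l'(f_l(C_\omega))}{u_l'(c_0^l)}. \]
Define
\[ \psi(n) = \sum_{\omega \in \Omega_n} e^{-\delta_l} p_\omega \frac{u_l'(f_l(C_\omega))}{u_l'(c_0^l)} = e^{-\delta_l} \frac{u_l'(f_l(x_n))}{u_l'(c_0^l)} \sum_{\omega \in \Omega_n} p_\omega = e^{-\delta_l} \frac{u_l'(f_l(x_n))}{u_l'(c_0^l)} \, \mathbb{P}(C = x_n), \]
which can be interpreted as the value of an asset with a dividend of one if aggregate consumption turns out to be $x_n$ and a zero dividend in other cases. The price of any marketed dividend $D$ can then be written as
\[ \begin{aligned} P &= \sum_{\omega=1}^S \psi_\omega D_\omega = \sum_{n=1}^N \sum_{\omega \in \Omega_n} \psi_\omega D_\omega = \sum_{n=1}^N \sum_{\omega \in \Omega_n} e^{-\delta_l} p_\omega \frac{u_l'(f_l(C_\omega))}{u_l'(c_0^l)} D_\omega \\ &= \sum_{n=1}^N e^{-\delta_l} \frac{u_l'(f_l(x_n))}{u_l'(c_0^l)} \sum_{\omega \in \Omega_n} p_\omega D_\omega = \sum_{n=1}^N \frac{\psi(n)}{\mathbb{P}(C = x_n)} \sum_{\omega \in \Omega_n} p_\omega D_\omega \\ &= \sum_{n=1}^N \psi(n) \sum_{\omega \in \Omega_n} \frac{p_\omega}{\mathbb{P}(C = x_n)} D_\omega = \sum_{n=1}^N \psi(n) \, \mathrm{E}[D \mid C = x_n], \end{aligned} \]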
where $\mathrm{E}[D \mid C = x_n]$ is the expected dividend conditional on aggregate consumption being $x_n$. The last equality is due to the fact that $p_\omega / \mathbb{P}(C = x_n)$ is the conditional probability of the state $\omega$ given that aggregate consumption is $x_n$. To price any marketed dividend we thus need only prices of Arrow-Debreu style assets on aggregate consumption and the expectation of the dividend conditional on the aggregate consumption.
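The two pricing routes above can be compared in a small numerical sketch. The example below is illustrative only: the probabilities, consumption levels, dividend, and the function `m` (an assumed marginal rate of substitution that depends on the state only through aggregate consumption) are hypothetical choices, not values from the text.

```python
import math
from collections import defaultdict

p = [0.2, 0.3, 0.1, 0.4]      # state probabilities
C = [1.0, 1.0, 2.0, 2.0]      # aggregate consumption in each state
D = [3.0, 1.0, 4.0, 2.0]      # dividend of the asset being priced

def m(x, delta=0.02, gamma=2.0, c0=1.5):
    # Assumed marginal rate of substitution as a function of aggregate consumption.
    return math.exp(-delta) * (x / c0) ** (-gamma)

# Route 1: state by state, with state prices psi_omega = p_omega * m(C_omega).
price_direct = sum(pw * m(cw) * dw for pw, cw, dw in zip(p, C, D))

# Route 2: Arrow-Debreu style assets on aggregate consumption.
prob = defaultdict(float)     # P(C = x_n)
pd_sum = defaultdict(float)   # sum of p_omega * D_omega over states with C = x_n
for pw, cw, dw in zip(p, C, D):
    prob[cw] += pw
    pd_sum[cw] += pw * dw

price_grouped = sum(
    m(x) * prob[x]            # psi(n): price of one unit paid when C = x_n
    * (pd_sum[x] / prob[x])   # E[D | C = x_n]
    for x in prob
)
print(price_direct, price_grouped)
```

Both routes give the same price because, within each set of states sharing a consumption level, only the conditional expectation of the dividend matters.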
The economy is said to have a representative individual if for each equilibrium in the economy with $L$ individuals there is a vector $\eta$ such that the equilibrium asset prices are the same in the economy with the single individual with utility $U_\eta$ and endowment $(\bar e_0, \bar e)$. Theorem 7.1 has the following immediate consequence:
Theorem 7.3 If the equilibrium consumption allocation is Pareto-optimal, the economy has a representative individual.
Since it is much easier to analyze models with only one individual, many asset pricing models do assume the existence of a representative individual. Of course, it is interesting to know under what conditions this assumption is satisfied, i.e. under what conditions the equilibrium consumption allocation in the underlying multi-individual economy is going to be Pareto-optimal. We will study that question below. Note that in the representative individual economy there can be no trade in the financial assets (who should be the other party in the trade?) and the consumption of the representative individual must equal the total endowment. If you want to model how financial assets are traded, a representative individual formulation is obviously not useful, but if you just want to study equilibrium asset prices it is often convenient.
7.3 Pareto-optimality in complete markets
From the analysis in the previous chapter we know that any individual's marginal rate of substitution is a valid state-price deflator. With time-additive expected utility the state-price deflator induced by individual $l$ is the random variable
\[ \zeta^l = e^{-\delta_l} \frac{u_l'(c^l)}{u_l'(c_0^l)}, \]
where $c_0^l$ is the optimal time 0 consumption and $c^l$ is the optimal state-dependent time 1 consumption. In general, these state prices may not be identical for different investors so that multiple state-price vectors and deflators can be constructed in this way. However, if the market is complete, we know that the state-price deflator is unique so we must have $\zeta^k = \zeta^l$ for any two individuals $k$ and $l$. This means that
\[ \zeta_\omega^k = \zeta_\omega^l \ \Rightarrow\ e^{-\delta_k} \frac{u_k'(c_\omega^k)}{u_k'(c_0^k)} = e^{-\delta_l} \frac{u_l'(c_\omega^l)}{u_l'(c_0^l)} \tag{7.11} \]
for any $\omega$, i.e. the marginal rate of substitution is the same for all individuals. From the discussion above, this will imply efficient risk-sharing and that the consumption allocation is Pareto-optimal. This result is often called the First Welfare Theorem:
Theorem 7.4 If the financial market is complete, then every equilibrium consumption allocation is Pareto-optimal.
Here is a formal proof:
Proof: Recall from (6.19) that in a complete market the utility maximization problem of individual $l$ can be written as
\[ \begin{aligned} \max_{c_0, c} \ & U_l(c_0, c) \\ \text{s.t.} \ & c_0 + \psi \cdot c \le e_0^l + \psi \cdot e^l, \\ & c_0, c \ge 0, \end{aligned} \]
where $\psi$ is the unique state-price vector. Let $(c_0^l, c^l)$, $l = 1, \dots, L$, be an equilibrium consumption allocation, but suppose we can find another consumption allocation $(\hat c_0^l, \hat c^l)$, $l = 1, \dots, L$, which gives all individuals at least the same utility and some individuals strictly higher utility than $(c_0^l, c^l)$, $l = 1, \dots, L$. Since we assume strictly increasing utility and $(c_0^l, c^l)$ maximizes individual $l$'s utility subject to the constraint $c_0^l + \psi \cdot c^l \le e_0^l + \psi \cdot e^l$, the inequality
\[ \hat c_0^l + \psi \cdot \hat c^l \ge e_0^l + \psi \cdot e^l \]
must hold for all individuals with strict inequality for at least one individual. Summing up over all individuals we get
\[ \sum_{l=1}^L \left( \hat c_0^l + \psi \cdot \hat c^l \right) > \sum_{l=1}^L \left( e_0^l + \psi \cdot e^l \right) = \bar e_0 + \psi \cdot \bar e. \]
Hence, the consumption allocation $(\hat c_0^l, \hat c^l)$, $l = 1, \dots, L$, is not feasible. □
A complete market equilibrium provides efficient risk-sharing. From (7.11) it follows that, for any two states $\omega$ and $\omega'$, we have
\[ \frac{u_k'(c_\omega^k)}{u_k'(c_{\omega'}^k)} = \frac{u_l'(c_\omega^l)}{u_l'(c_{\omega'}^l)}. \tag{7.12} \]
Suppose that (7.12) did not hold. Suppose we could find two individuals $k$ and $l$ and two states $\omega$ and $\omega'$ such that
\[ \frac{u_k'(c_\omega^k)}{u_k'(c_{\omega'}^k)} > \frac{u_l'(c_\omega^l)}{u_l'(c_{\omega'}^l)}. \tag{7.13} \]
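Then the two agents could engage in a trade that makes both better off. What trade? Since the market is complete, Arrow-Debreu assets for all states are traded, in particular for states $\omega$ and $\omega'$.
Consider the following trade: Individual $k$ buys $\varepsilon_\omega$ Arrow-Debreu assets for state $\omega$ from individual $l$ at a unit price of $\varphi_\omega$, and individual $l$ buys $\varepsilon_{\omega'}$ Arrow-Debreu assets for state $\omega'$ from individual $k$ at a unit price of $\varphi_{\omega'}$. The quantities are set so that the net payment is zero, i.e.
\[ \varepsilon_\omega \varphi_\omega - \varepsilon_{\omega'} \varphi_{\omega'} = 0 \ \Leftrightarrow\ \varepsilon_{\omega'} = \varepsilon_\omega \frac{\varphi_\omega}{\varphi_{\omega'}}. \]
The deal will change the consumption of the two individuals in states $\omega$ and $\omega'$ but not in other states, nor at time 0. The total change in the expected utility of individual $k$ will be
\[ p_\omega \left( u_k(c_\omega^k + \varepsilon_\omega) - u_k(c_\omega^k) \right) + p_{\omega'} \left( u_k\!\left(c_{\omega'}^k - \varepsilon_\omega \frac{\varphi_\omega}{\varphi_{\omega'}}\right) - u_k(c_{\omega'}^k) \right). \]
Dividing by $\varepsilon_\omega$ and letting $\varepsilon_\omega \to 0$, we get
\[ p_\omega u_k'(c_\omega^k) - p_{\omega'} \frac{\varphi_\omega}{\varphi_{\omega'}} u_k'(c_{\omega'}^k), \]
which is strictly positive whenever
\[ \frac{u_k'(c_\omega^k)}{u_k'(c_{\omega'}^k)} > \frac{p_{\omega'} \varphi_\omega}{p_\omega \varphi_{\omega'}}. \tag{7.14} \]
On the other hand, the total change in the expected utility of individual $l$ will be
\[ p_\omega \left( u_l(c_\omega^l - \varepsilon_\omega) - u_l(c_\omega^l) \right) + p_{\omega'} \left( u_l\!\left(c_{\omega'}^l + \varepsilon_\omega \frac{\varphi_\omega}{\varphi_{\omega'}}\right) - u_l(c_{\omega'}^l) \right). \]
Dividing by $\varepsilon_\omega$ and letting $\varepsilon_\omega \to 0$, we get
\[ -p_\omega u_l'(c_\omega^l) + p_{\omega'} \frac{\varphi_\omega}{\varphi_{\omega'}} u_l'(c_{\omega'}^l), \]
which is strictly positive whenever
\[ \frac{u_l'(c_\omega^l)}{u_l'(c_{\omega'}^l)} < \frac{p_{\omega'} \varphi_\omega}{p_\omega \varphi_{\omega'}}. \tag{7.15} \]
If the inequality (7.13) holds, the two individuals can surely find prices $\varphi_\omega$ and $\varphi_{\omega'}$ so that both (7.14) and (7.15) are satisfied, i.e. both increase their expected utility.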
If the market is incomplete, the individuals might not be able to implement this trade. In other words, we cannot be sure that (7.12) holds for states for which Arrow-Debreu assets are not traded, i.e. uninsurable states.
Combining theorems stated above, we have the following conclusion:
Theorem 7.5 If the financial market is complete, the economy has a representative individual.
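If we want to study asset pricing in a complete market, we might as well assume that the economy has a single individual. But what is the appropriate weighting vector $\eta$ for the complete market? We must ensure that the first-order conditions of the central planner are satisfied when we plug in the optimal consumption plans of the individuals. Recall that in a complete market the utility maximization problem of individual $l$ can be formulated as in (6.19). The Lagrangian for this problem is
\[ \mathcal{L}_l = U_l(c_0^l, c^l) + \kappa_l \left( \sum_{\omega=1}^S \psi_\omega \left( e_\omega^l - c_\omega^l \right) + e_0^l - c_0^l \right). \]
The first-order conditions, again necessary and sufficient, are
\[ \frac{\partial \mathcal{L}_l}{\partial c_0^l} = 0 \ \Leftrightarrow\ \frac{\partial U_l}{\partial c_0^l} = \kappa_l, \tag{7.16} \]
\[ \frac{\partial \mathcal{L}_l}{\partial c_\omega^l} = 0 \ \Leftrightarrow\ \frac{\partial U_l}{\partial c_\omega^l} = \kappa_l \psi_\omega, \quad \omega = 1, \dots, S, \tag{7.17} \]
\[ \frac{\partial \mathcal{L}_l}{\partial \kappa_l} = 0 \ \Leftrightarrow\ c_0^l + \sum_{\omega=1}^S \psi_\omega c_\omega^l = e_0^l + \sum_{\omega=1}^S \psi_\omega e_\omega^l. \tag{7.18} \]
If we set $\alpha_0 = 1$, $\alpha_\omega = \psi_\omega$ for each state $\omega$, and $\eta_l = 1/\kappa_l$ for each $l = 1, \dots, L$, the Equations (7.1)-(7.4) will indeed be satisfied: (7.1) and (7.2) follow from (7.16) and (7.17), while (7.3) and (7.4) hold because markets clear in equilibrium.
Note that the weight $\eta_l$ associated with individual $l$ is the inverse of the shadow price of his budget constraint, and $\eta_l$ will therefore depend on the initial endowment of individual $l$. Redistributing the aggregate initial endowment across individuals will thus change the relative values of the weights $\eta_l$ and, hence, the utility function of the representative individual and equilibrium asset prices.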
Let us briefly consider the special case where all individuals have time-additive expected utilities. Then the representative individual will also have time-additive expected utility, i.e.
\[ U_\eta(c_0, c) = u_{\eta,0}(c_0) + \mathrm{E}[u_{\eta,1}(c)] = u_{\eta,0}(c_0) + \sum_{\omega=1}^S p_\omega u_{\eta,1}(c_\omega), \]
where
\[ u_{\eta,0}(c_0) = \sup\left\{ \sum_{l=1}^L \eta_l u_l(c_0^l) \;\middle|\; c_0^1, \dots, c_0^L > 0 \text{ with } c_0^1 + \dots + c_0^L \le c_0 \right\}, \tag{7.19} \]
\[ u_{\eta,1}(c) = \sup\left\{ \sum_{l=1}^L \eta_l e^{-\delta_l} u_l(c^l) \;\middle|\; c^1, \dots, c^L > 0 \text{ with } c^1 + \dots + c^L \le c \right\}. \tag{7.20} \]
Often a functional form for $u_{\eta,0}$ and $u_{\eta,1}$ is assumed directly with the property that $u_{\eta,1}(c) = e^{-\delta} u_{\eta,0}(c) = e^{-\delta} u(c)$, where $\delta$ can be interpreted as the average time preference rate in the economy. Then the unique state-price deflator follows by evaluating the derivative of $u_\eta$ at the aggregate endowment,
\[ \zeta_\omega = \frac{u_{\eta,1}'(\bar e_\omega)}{u_{\eta,0}'(\bar e_0)} = e^{-\delta} \frac{u'(\bar e_\omega)}{u'(\bar e_0)}. \]
This will be very useful in order to link asset prices and interest rates to aggregate consumption.
7.4 Pareto-optimality in some incomplete markets
In the previous section we saw that complete market equilibria are Pareto-optimal. However, as discussed earlier, real-life financial markets are probably not complete. Equilibrium consumption allocations in incomplete markets will generally not be Pareto-optimal since the individuals cannot necessarily implement the trades needed to align their marginal rates of substitution. On the other hand, individuals do not have to be able to implement any possible consumption plan so we do not need markets to be complete in the strict sense. If every Pareto-optimal consumption allocation can be obtained by trading in the available assets, the market is said to be effectively complete. If the market is effectively complete, we can use the representative individual approach to asset pricing. In this section we will discuss some examples of effectively complete markets.
For any Pareto-optimal consumption allocation we know from Theorem 7.2 that the consumption of any individual is an increasing function of aggregate consumption. Individual consumption is measurable with respect to aggregate consumption. As in the discussion below Theorem 7.2, suppose that the possible values of aggregate consumption are $x_1, \dots, x_K$ and let $\Omega_k = \{\omega \in \Omega \mid C_\omega = x_k\}$ be the set of states in which aggregate consumption equals $x_k$. If it is possible for each $k$ to form a portfolio that provides a payment of 1 if the state is in $\Omega_k$ and a zero payment otherwise (an Arrow-Debreu style asset for aggregate consumption), then the market is effectively complete. (See also Exercise 7.3.) The individuals are indifferent between states within a given subset $\Omega_k$. Risk beyond aggregate consumption risk does not carry any premium. Assuming strictly increasing utility functions, aggregate time 1 consumption will equal aggregate time 1 endowment. If we think of the aggregate endowment as the total value of the market, aggregate consumption will equal the value of the market portfolio and we can partition the state space according to the value or return of the market portfolio. Risk beyond market risk is diversified away and does not carry any premium.
If there is a full set of Arrow-Debreu style assets for aggregate consumption, markets will be effectively complete for all strictly increasing and concave utility functions. The next theorem, which is due to Rubinstein (1974), shows that markets are effectively complete under weaker assumptions on the available assets if individuals have utility functions of the HARA class with identical risk cautiousness. Before stating the precise result, let us review a few facts about the HARA utility functions which were defined in Section 5.6. A utility function $u$ is of the HARA class if the absolute risk tolerance is affine, i.e.
\[ \mathrm{ART}(c) \equiv -\frac{u'(c)}{u''(c)} = \alpha c + \beta. \]
The risk cautiousness is $\mathrm{ART}'(c) = \alpha$. Up to a positive multiplicative constant, marginal utility is therefore either
\[ u'(c) = e^{-c/\beta}, \]
for the case $\alpha = 0$ (corresponding to negative exponential utility), or
\[ u'(c) = (\alpha c + \beta)^{-1/\alpha}, \]
for the case $\alpha \ne 0$ (which encompasses extended log-utility, satiation HARA utility, and subsistence HARA utility).
Theorem 7.6 Suppose that
(i) all individuals have time-additive HARA utility functions with identical risk cautiousness;
(ii) a risk-free asset is traded;
(iii) the time 1 endowments of all individuals are spanned by traded assets, i.e. $e^l \in \mathcal{M}$, $l = 1, \dots, L$.
Then the equilibrium is Pareto-optimal and the economy has a representative individual. The optimal consumption for any individual is a strictly increasing affine function of aggregate consumption.
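Proof: Suppose first that the market is complete. Then we know that the equilibrium consumption allocation will be Pareto-optimal and we can find a weighting vector $\eta = (\eta_1, \dots, \eta_L)^\top$ so that
\[ \eta_k e^{-\delta_k} u_k'(c_\omega^k) = \eta_l e^{-\delta_l} u_l'(c_\omega^l) \tag{7.21} \]
for any two individuals $k$ and $l$ and any $\omega$. Assume that the common risk cautiousness $\alpha$ is different from zero so that
\[ u_l'(c) = (\alpha c + \beta_l)^{-1/\alpha}, \quad l = 1, \dots, L. \]
(The proof for the case $\alpha = 0$ is similar.) Substituting this into the previous equation, we obtain
\[ \eta_k e^{-\delta_k} \left( \alpha c_\omega^k + \beta_k \right)^{-1/\alpha} = \eta_l e^{-\delta_l} \left( \alpha c_\omega^l + \beta_l \right)^{-1/\alpha}, \]
which implies that
\[ \left( \eta_k e^{-\delta_k} \right)^{-\alpha} \left( \alpha c_\omega^k + \beta_k \right) \left( \eta_l e^{-\delta_l} \right)^{\alpha} = \alpha c_\omega^l + \beta_l. \]
Summing up over $l = 1, \dots, L$, we get
\[ \left( \eta_k e^{-\delta_k} \right)^{-\alpha} \left( \alpha c_\omega^k + \beta_k \right) \sum_{l=1}^L \left( \eta_l e^{-\delta_l} \right)^{\alpha} = \alpha \sum_{l=1}^L c_\omega^l + \sum_{l=1}^L \beta_l = \alpha C_\omega + \sum_{l=1}^L \beta_l, \]
where $C_\omega$ is aggregate consumption in state $\omega$. Solving for $c_\omega^k$, we find that
\[ c_\omega^k = \frac{\left( \alpha C_\omega + \sum_{l=1}^L \beta_l \right) \left( \eta_k e^{-\delta_k} \right)^{\alpha} - \beta_k \sum_{l=1}^L \left( \eta_l e^{-\delta_l} \right)^{\alpha}}{\alpha \sum_{l=1}^L \left( \eta_l e^{-\delta_l} \right)^{\alpha}} \equiv A_k C_\omega + B_k, \]
which is strictly increasing and affine in $C_\omega$.
The same consumption allocation can be obtained in a market where a risk-free asset exists and the time 1 endowments of all individuals are spanned by traded assets: the constant $B_k$ can be delivered by the risk-free asset, and $A_k C_\omega$ is a fraction of aggregate consumption, which equals the spanned aggregate endowment. □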
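The affine sharing rule in the proof can be checked numerically. The sketch below is illustrative and uses hypothetical parameter values. It computes $A_k = (\eta_k e^{-\delta_k})^\alpha / \sum_l (\eta_l e^{-\delta_l})^\alpha$ and $B_k = (A_k \sum_l \beta_l - \beta_k)/\alpha$, which is an algebraically equivalent way of writing the coefficients, and then verifies that the allocation exhausts aggregate consumption and satisfies (7.21).

```python
import math

def sharing_rule(eta, delta, beta, alpha):
    """Return (A, B) with c_k = A[k] * C + B[k] for HARA utilities with
    marginal utility (alpha*c + beta_k)**(-1/alpha), alpha != 0."""
    w = [(e * math.exp(-d)) ** alpha for e, d in zip(eta, delta)]
    A = [wk / sum(w) for wk in w]
    B = [(Ak * sum(beta) - bk) / alpha for Ak, bk in zip(A, beta)]
    return A, B

# Hypothetical economy with three individuals and common risk cautiousness 0.5.
eta = [1.0, 2.0, 0.5]          # planner weights
delta = [0.01, 0.03, 0.02]     # time preference rates
beta = [-0.2, 0.1, 0.0]        # individual HARA constants
alpha = 0.5
A, B = sharing_rule(eta, delta, beta, alpha)

C = 4.0                        # aggregate consumption in some state
c = [Ak * C + Bk for Ak, Bk in zip(A, B)]

# Weighted, discounted marginal utilities; (7.21) says they coincide.
foc = [e * math.exp(-d) * (alpha * ck + bk) ** (-1.0 / alpha)
       for e, d, ck, bk in zip(eta, delta, c, beta)]
print(c, foc)
```

Since the slopes $A_k$ are positive and sum to one while the intercepts $B_k$ sum to zero, the individual consumption levels always add up to aggregate consumption.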
Under the additional assumption that the time preference rates of individuals are identical, $\delta_l = \delta$ for all $l = 1, \dots, L$, we can show a bit more:
Theorem 7.7 If the assumptions of Theorem 7.6 are satisfied and individuals have identical time preference rates, then the relative weights which the representative individual associates to individuals, and therefore equilibrium asset prices, will be independent of the initial distribution of aggregate endowment across individuals.
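Proof: In order to verify this, we will compute the utility function of the representative individual. Since we are assuming time-additive utility, we will derive $u_{\eta,0}$ and $u_{\eta,1}$ defined in (7.19) and (7.20) for any given weighting vector $\eta$. We will focus on the case where the common risk cautiousness $\alpha$ is non-zero and different from 1 so that
\[ u_l(c) = \frac{1}{\alpha - 1} (\alpha c + \beta_l)^{1 - 1/\alpha}, \qquad u_l'(c) = (\alpha c + \beta_l)^{-1/\alpha}, \quad l = 1, \dots, L. \]
(In Exercise 7.2 you are asked to do the same for the cases of extended log-utility ($\alpha = 1$) and negative exponential utility ($\alpha = 0$).) The first-order condition for the maximization in the definition of $u_{\eta,0}$ implies that
\[ \eta_l \left( \alpha c_0^l + \beta_l \right)^{-1/\alpha} = \nu, \tag{7.22} \]
where $\nu$ is the Lagrange multiplier. Rearranging, we get
\[ \alpha c_0^l + \beta_l = \nu^{-\alpha} \eta_l^{\alpha}. \]
Summing up over $l$ gives
\[ \alpha c_0 + \sum_{l=1}^L \beta_l = \nu^{-\alpha} \sum_{l=1}^L \eta_l^{\alpha}, \]
so that the Lagrange multiplier is
\[ \nu = \left( \alpha c_0 + \sum_{l=1}^L \beta_l \right)^{-1/\alpha} \left( \sum_{l=1}^L \eta_l^{\alpha} \right)^{1/\alpha}. \]
Substituting this back into (7.22), we see that the solution to the maximization problem is such that
\[ \eta_l \left( \alpha c_0^l + \beta_l \right)^{-1/\alpha} = \left( \alpha c_0 + \sum_{l=1}^L \beta_l \right)^{-1/\alpha} \left( \sum_{l=1}^L \eta_l^{\alpha} \right)^{1/\alpha}, \]
which implies that
\[ \begin{aligned} u_{\eta,0}(c_0) &= \sum_{l=1}^L \eta_l \frac{1}{\alpha - 1} \left( \alpha c_0^l + \beta_l \right)^{1 - 1/\alpha} \\ &= \frac{1}{\alpha - 1} \sum_{l=1}^L \left( \alpha c_0^l + \beta_l \right) \eta_l \left( \alpha c_0^l + \beta_l \right)^{-1/\alpha} \\ &= \frac{1}{\alpha - 1} \left( \alpha c_0 + \sum_{l=1}^L \beta_l \right)^{-1/\alpha} \left( \sum_{l=1}^L \eta_l^{\alpha} \right)^{1/\alpha} \sum_{l=1}^L \left( \alpha c_0^l + \beta_l \right) \\ &= \frac{1}{\alpha - 1} \left( \sum_{l=1}^L \eta_l^{\alpha} \right)^{1/\alpha} \left( \alpha c_0 + \sum_{l=1}^L \beta_l \right)^{1 - 1/\alpha}. \end{aligned} \tag{7.23} \]
Almost identical computations lead to
\[ u_{\eta,1}(c) = e^{-\delta} \frac{1}{\alpha - 1} \left( \sum_{l=1}^L \eta_l^{\alpha} \right)^{1/\alpha} \left( \alpha c + \sum_{l=1}^L \beta_l \right)^{1 - 1/\alpha}. \tag{7.24} \]
The utility function of the representative individual is therefore of the same class as the individual utility functions,
\[ U_\eta(c_0, c) = u_{\eta,0}(c_0) + \mathrm{E}[u_{\eta,1}(c)] = u_\eta(c_0) + e^{-\delta} \mathrm{E}[u_\eta(c)], \]
where
\[ u_\eta(c) = \frac{1}{\alpha - 1} \left( \sum_{l=1}^L \eta_l^{\alpha} \right)^{1/\alpha} \left( \alpha c + \sum_{l=1}^L \beta_l \right)^{1 - 1/\alpha}. \]
More importantly, the state-price deflator is
\[ \zeta = e^{-\delta} \frac{u_\eta'(c)}{u_\eta'(c_0)} = e^{-\delta} \frac{\left( \sum_{l=1}^L \eta_l^{\alpha} \right)^{1/\alpha} \left( \alpha c + \sum_{l=1}^L \beta_l \right)^{-1/\alpha}}{\left( \sum_{l=1}^L \eta_l^{\alpha} \right)^{1/\alpha} \left( \alpha c_0 + \sum_{l=1}^L \beta_l \right)^{-1/\alpha}} = e^{-\delta} \frac{\left( \alpha c + \sum_{l=1}^L \beta_l \right)^{-1/\alpha}}{\left( \alpha c_0 + \sum_{l=1}^L \beta_l \right)^{-1/\alpha}}, \tag{7.25} \]
which is independent of the weighting vector $\eta$. The state-price deflator and, hence, the asset prices are independent of the distribution of endowment across individuals. □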
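As a sanity check on the closed form for the representative individual's utility, the following sketch solves the planner's problem for two individuals by brute force and compares it with the closed-form expression. It assumes the normalization $u_l(c) = (\alpha c + \beta_l)^{1-1/\alpha}/(\alpha - 1)$, so that $u_l'(c) = (\alpha c + \beta_l)^{-1/\alpha}$. All parameter values and function names are hypothetical, and the ternary search is simply one convenient way to maximize a concave function of one variable.

```python
def u(c, beta_l, alpha):
    # HARA utility normalized so that u'(c) = (alpha*c + beta_l)**(-1/alpha).
    return (alpha * c + beta_l) ** (1.0 - 1.0 / alpha) / (alpha - 1.0)

def planner_value(c0, eta, beta, alpha, iters=200):
    """Maximize eta[0]*u(x) + eta[1]*u(c0 - x) over x in [0, c0] by ternary search."""
    def f(x):
        return eta[0] * u(x, beta[0], alpha) + eta[1] * u(c0 - x, beta[1], alpha)
    lo, hi = 0.0, c0
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

def closed_form(c0, eta, beta, alpha):
    # Closed-form planner utility under the normalization above.
    k = sum(e ** alpha for e in eta) ** (1.0 / alpha)
    return k * (alpha * c0 + sum(beta)) ** (1.0 - 1.0 / alpha) / (alpha - 1.0)

def mrs(c1, c0, eta, beta, alpha, h=1e-3):
    """Ratio of numerical derivatives of the planner's value function."""
    def d(x):
        return (planner_value(x + h, eta, beta, alpha)
                - planner_value(x - h, eta, beta, alpha)) / (2.0 * h)
    return d(c1) / d(c0)

alpha, beta = 0.5, [0.2, 0.4]
eta_a, eta_b = [1.0, 3.0], [2.0, 0.7]   # two different weighting vectors

value_numeric = planner_value(2.0, eta_a, beta, alpha)
value_closed = closed_form(2.0, eta_a, beta, alpha)
mrs_a = mrs(2.5, 2.0, eta_a, beta, alpha)
mrs_b = mrs(2.5, 2.0, eta_b, beta, alpha)
print(value_numeric, value_closed, mrs_a, mrs_b)
```

The two marginal rates of substitution agree up to numerical error, which illustrates that the state-price deflator (the ratio multiplied by $e^{-\delta}$) does not depend on the weights.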
7.5 Exercises
EXERCISE 7.1 Suppose $\eta_1$ and $\eta_2$ are strictly positive numbers and that $u_1(c) = u_2(c) = c^{1-\gamma}/(1-\gamma)$ for any non-negative real number $c$. Define the function $u_\eta : \mathbb{R}_+ \to \mathbb{R}$ by
\[ u_\eta(x) = \sup\left\{ \eta_1 u_1(y_1) + \eta_2 u_2(y_2) \mid y_1 + y_2 \le x; \ y_1, y_2 \ge 0 \right\}. \]
Show that $u_\eta(x) = k x^{1-\gamma}/(1-\gamma)$ for some constant $k$. What is the implication for the utility of representative individuals?
EXERCISE 7.2 Show Theorem 7.7 for the case of extended log-utility and the case of negative exponential utility.
EXERCISE 7.3 Suppose that aggregate time 1 consumption can only take on the values $1, 2, \dots, K$ for some finite integer $K$. Assume that European call options on aggregate consumption are traded for any exercise price $0, 1, 2, \dots, K$. Consider a portfolio with one unit of the option with exercise price $k - 1$, one unit of the option with exercise price $k + 1$, and minus two units of the option with exercise price $k$. What is the payoff of this portfolio? Discuss the consequences of your findings for the (effective) completeness of the market. Could you do just as well with put options?
EXERCISE 7.4 Assume a discrete-time economy with $L$ agents. Each agent $l$ maximizes time-additive expected utility $\mathrm{E}\left[ \sum_{t=0}^T \beta_l^t u_l(c_{lt}) \right]$ where $u_l$ is strictly increasing and concave. Show that
\[ \frac{\zeta_{t+1}}{\zeta_t} = \frac{\sum_{l=1}^L \beta_l u_l'(c_{l,t+1})}{\sum_{l=1}^L u_l'(c_{lt})} \]
is a valid one-period state-price deflator, i.e. that it is strictly positive and satisfies $\mathrm{E}_t[(\zeta_{t+1}/\zeta_t) R_{t+1}] = 1$ for any gross return $R_{t+1}$ over the period $[t, t+1]$.
EXERCISE 7.5 George and John live in a continuous-time economy in which the relevant uncertainty is generated by a one-dimensional standard Brownian motion $z = (z_t)_{t \in [0,T]}$. Both have time-additive utility of their consumption process: George maximizes
\[ U_G(c) = \mathrm{E}\left[ \int_0^T e^{-0.02 t} \ln c_t \, dt \right], \]
while John maximizes
\[ U_J(c) = \mathrm{E}\left[ \int_0^T e^{-0.02 t} \left( -\frac{1}{c_t} \right) dt \right]. \]
George's optimal consumption process $c_G = (c_{Gt})$ has a constant expected growth rate of 4% and a constant volatility of 5%, i.e.
\[ dc_{Gt} = c_{Gt} \left[ 0.04 \, dt + 0.05 \, dz_t \right]. \]
Two assets are traded in the economy. One asset is an instantaneously risk-free bank account with continuously compounded rate of return $r_t$. The other asset is a risky asset with price process $P = (P_t)$ satisfying
\[ dP_t = P_t \left[ \mu_{Pt} \, dt + 0.4 \, dz_t \right] \]
for some drift $\mu_{Pt}$. The market is complete.
(a) What are the relative risk aversions of George and John, respectively?
(b) Using the fact that a state-price deflator can be derived from George's consumption process, determine the risk-free rate $r_t$ and the market price of risk $\lambda_t$. What can you conclude about the price processes of the two assets?
(c) Find the drift and volatility of John's optimal consumption process, $c_J = (c_{Jt})$.
(d) Suppose George and John are the only two individuals in the economy. What can you say about the dynamics of aggregate consumption? Can the representative agent have constant relative risk aversion?
EXERCISE 7.6 Consider a discrete-time economy with $L$ individuals with identical preferences so that agent $l = 1, \dots, L$ at time 0 wants to maximize
\[ \mathrm{E}\left[ \sum_{t=0}^T \beta^t \frac{1}{1-\gamma} c_{l,t}^{1-\gamma} \right] \]
where $c_{l,t}$ denotes the consumption rate of individual $l$ at time $t$. Let $c_{l,t}^*$ be the optimal consumption rate of individual $l$ at time $t$.
(a) Argue why
\[ \frac{\zeta_{t+1}}{\zeta_t} = \frac{\beta}{L} \sum_{l=1}^L \left( \frac{c_{l,t+1}^*}{c_{l,t}^*} \right)^{-\gamma} \tag{*} \]
is a state-price deflator between time $t$ and time $t + 1$.
(b) If the market is complete, explain why the next-period state-price deflator in (*) can be written as
\[ \frac{\zeta_{t+1}}{\zeta_t} = \beta \left( \frac{\sum_{l=1}^L c_{l,t+1}^*}{\sum_{l=1}^L c_{l,t}^*} \right)^{-\gamma}. \]
EXERCISE 7.7 (Use spreadsheet or similar computational tool.) Consider a one-period economy with 5 possible states and 5 assets traded. The state-contingent dividends and prices of the assets and the state probabilities are as follows:
              state 1   state 2   state 3   state 4   state 5   price
Asset 1          1         1         1         1         1       0.9
Asset 2          0         2         4         6         8       1.7
Asset 3          4         0         2         4         2       2.3
Asset 4         10         0         0         2         2       4.3
Asset 5          4         4         0         4         4       2.8
probability    0.25      0.25      0.2       0.2       0.1
(a) Verify that the market is complete and find the unique state-price deflator.
Consider an individual investor, Alex, with access to the above financial market with the given prices. Suppose Alex has time-additive expected utility with a time preference rate of $\delta = 0.03$ and a constant relative risk aversion of $\gamma = 2$. Suppose his optimal consumption at time 0 is 5 and that he will receive an income of 5 at time 1 no matter which state is realized.
(b) What is Alex's optimal time 1 consumption? What does it cost him to finance that consumption? What is the optimal portfolio for Alex?
Suppose now that there is only one other individual, Bob, in the economy. Bob also has time-additive expected utility with a time preference rate of 0.03 but a relative risk aversion of 5. Bob's optimal time 0 consumption is also 5.
(c) What is Bob's optimal time 1 consumption?
(d) What is the aggregate time 0 consumption and the state-dependent aggregate time 1 consumption?
(e) What is Bob's time 0 endowment and state-dependent time 1 endowment? What is Bob's optimal portfolio?
(f) Verify that the markets clear.
EXERCISE 7.8 Bruce lives in a continuous-time complete market economy. He has time-additive logarithmic utility, $u_B(c) = \ln c$, with a time preference rate of $\delta_B = 0.02$, and his optimal consumption process $c_B = (c_{Bt})$ has dynamics
\[ dc_{Bt} = c_{Bt} \left[ 0.03 \, dt + 0.1 \, dz_t \right], \]
where $z = (z_t)$ is a standard Brownian motion.
(a) Characterize the state-price deflator induced by Bruce's optimal consumption process. Identify the continuously compounded short-term risk-free interest rate and the instantaneous Sharpe ratio of any risky asset.
Patti lives in the same economy as Bruce. She has time-additive expected utility with a HARA utility function $u_P(c) = \frac{1}{1-\gamma} (c - \bar c)^{1-\gamma}$ and a time preference rate identical to Bruce's, i.e. $\delta_P = 0.02$.
(b) Explain why Patti's optimal consumption strategy $c_P = (c_{Pt})$ must satisfy
\[ c_{Pt} = \bar c + \left( \frac{c_{Bt}}{c_{B0}} \right)^{1/\gamma} (c_{P0} - \bar c). \]
Find the dynamics of Patti's optimal consumption process.
CHAPTER 8
Consumption-based asset pricing
8.1 Introduction
Previous chapters have shown how state-price deflators determine asset prices and how the optimal consumption choices of individuals determine state-price deflators. In this chapter we combine these observations and link asset prices to consumption. Models linking expected returns to the covariance between return and (aggregate) consumption are typically called Consumption-based Capital Asset Pricing Models (CCAPM). Such models date back to Rubinstein (1976) and Breeden (1979).
The outline of the chapter is as follows. Section 8.2 develops the CCAPM in the one-period framework. Section 8.3 derives a general link between asset prices and consumption in a multi-period setting, while Section 8.4 focuses on a simple and tractable specification. Section 8.5 shows that this simple specification is unable to match important empirical features of aggregate consumption and return, leaving a number of apparent asset pricing puzzles. Section 8.6 discusses some problems with such empirical studies. Some extensions of the simple model are presented in Sections 8.7 and 8.9. These extensions help resolve some of the apparent puzzles.
8.2 The one-period CCAPM
For simplicity let us first investigate the link between asset prices and consumption in the one-period framework.
As discussed several times, the marginal rate of substitution of any individual defines a state-price deflator. If we assume time-additive expected utility with time preference rate $\delta$ and utility function $u$, this state-price deflator is
\[ \zeta = e^{-\delta} \frac{u'(c)}{u'(c_0)}, \]
where $c_0$ is optimal time 0 consumption and $c$ is the state-dependent optimal time 1 consumption.
If the economy can be modeled by a representative individual, the equation holds for aggregate consumption.
Assuming that a risk-free asset is traded, we know from Section 4.2.1 that the gross risk-free return is
\[ R^f = \frac{1}{\mathrm{E}[\zeta]} = e^{\delta} \frac{1}{\mathrm{E}[u'(c)/u'(c_0)]} \tag{8.1} \]
and the expected gross return on any asset $i$ is
\[ \begin{aligned} \mathrm{E}[R_i] &= \frac{1}{\mathrm{E}[\zeta]} - \frac{\mathrm{Cov}[R_i, \zeta]}{\mathrm{E}[\zeta]} \\ &= R^f - \frac{\mathrm{Cov}[u'(c)/u'(c_0), R_i]}{\mathrm{E}[u'(c)/u'(c_0)]} \\ &= R^f - \frac{\sigma[u'(c)/u'(c_0)]}{\mathrm{E}[u'(c)/u'(c_0)]} \, \sigma[R_i] \, \rho[R_i, u'(c)/u'(c_0)], \end{aligned} \tag{8.2} \]
where $\sigma[x]$ denotes the standard deviation of the random variable $x$, while $\rho[x, y]$ is the correlation between the random variables $x$ and $y$. An asset with a return which is positively correlated with the marginal utility of consumption (and hence negatively correlated with the level of consumption) is attractive, has a high equilibrium price, and thus a low expected return.
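The relations (8.1) and (8.2) can be verified in a small finite-state example. The sketch below is illustrative only and assumes CRRA utility, $u'(c) = c^{-\gamma}$, with hypothetical probabilities, consumption levels, and a dividend that pays most when consumption is low.

```python
import math

p = [0.3, 0.4, 0.3]                 # state probabilities
c0, c = 1.0, [0.97, 1.02, 1.06]     # time 0 and state-dependent time 1 consumption
delta, gamma = 0.02, 2.0            # time preference rate and relative risk aversion
D = [1.2, 1.0, 0.8]                 # dividend that is high when consumption is low

def E(x):
    return sum(pw * xw for pw, xw in zip(p, x))

def cov(x, y):
    return E([a * b for a, b in zip(x, y)]) - E(x) * E(y)

# State-price deflator zeta = exp(-delta) * u'(c) / u'(c0) with u'(c) = c**(-gamma).
zeta = [math.exp(-delta) * (cw / c0) ** (-gamma) for cw in c]

price = E([z * d for z, d in zip(zeta, D)])   # P = E[zeta * D]
R = [d / price for d in D]                    # gross return in each state

Rf = 1.0 / E(zeta)                            # gross risk-free return, as in (8.1)
lhs = E(R)                                    # expected gross return
rhs = Rf - cov(R, zeta) / E(zeta)             # first line of (8.2)
print(Rf, lhs, rhs)
```

Because this asset pays off when marginal utility is high, its return covaries positively with the deflator and its expected return lies below the risk-free return.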
In order to obtain a relation between expected returns and consumption itself (rather than
marginal utility of consumption) we need to make further assumptions or approximations. Below
we develop three versions.
8.2.1 The simple one-period CCAPM: version 1
A first-order Taylor approximation of $u'(c)$ around $c_0$ gives that
\[ \frac{u'(c)}{u'(c_0)} \approx \frac{u'(c_0) + u''(c_0)(c - c_0)}{u'(c_0)} = 1 - \gamma(c_0) g, \]
where $\gamma(c_0) = -c_0 u''(c_0)/u'(c_0)$ is the relative risk aversion of the individual evaluated at the time 0 consumption level, and $g = c/c_0 - 1$ is the (state-dependent) relative growth rate of consumption over the period. If we further assume that $\mathrm{E}[u'(c)/u'(c_0)] \approx 1$, we get that
\[ \mathrm{E}[R_i] - R^f \approx \gamma(c_0) \, \mathrm{Cov}[g, R_i] = \gamma(c_0) \, \rho[g, R_i] \, \sigma[g] \, \sigma[R_i], \tag{8.3} \]
where $\rho[g, R_i]$ is the correlation between consumption growth and the return and $\sigma[g]$ and $\sigma[R_i]$ are the standard deviations of consumption growth and return, respectively. The above equation links expected excess returns to covariance with consumption growth. We can rewrite the equation as
\[ \mathrm{E}[R_i] \approx R^f + \beta[R_i, g] \, \eta, \tag{8.4} \]
where $\beta[R_i, g] = \mathrm{Cov}[g, R_i] / \mathrm{Var}[g]$ and $\eta = \gamma(c_0) \, \mathrm{Var}[g]$. If we ignore the approximate character of the equation, it shows that the growth rate of the individual's optimal consumption is a pricing factor.
If the economy has a representative individual we can use the above relations with aggregate consumption instead of individual consumption. The factor risk premium $\eta$ should then reflect the relative risk aversion of the representative individual.
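To get a feel for the quality of the approximation (8.3), the sketch below compares the exact excess return on a claim to time 1 consumption with $\gamma(c_0) \, \mathrm{Cov}[g, R_i]$. It is only an illustration under assumed CRRA utility and hypothetical numbers; with larger consumption risk or a larger time preference rate the gap would widen.

```python
import math

p = [0.3, 0.4, 0.3]                 # state probabilities
c0, c = 1.0, [0.97, 1.02, 1.06]     # time 0 and state-dependent time 1 consumption
delta, gamma = 0.02, 2.0            # time preference rate and relative risk aversion

def E(x):
    return sum(pw * xw for pw, xw in zip(p, x))

def cov(x, y):
    return E([a * b for a, b in zip(x, y)]) - E(x) * E(y)

# Exact state-price deflator for CRRA utility.
zeta = [math.exp(-delta) * (cw / c0) ** (-gamma) for cw in c]

price = E([z * cw for z, cw in zip(zeta, c)])   # claim paying time 1 consumption
R = [cw / price for cw in c]                    # its gross return
g = [cw / c0 - 1.0 for cw in c]                 # consumption growth rate

exact = E(R) - 1.0 / E(zeta)                    # exact excess expected return
approx = gamma * cov(g, R)                      # right-hand side of (8.3)
beta_ig = cov(g, R) / cov(g, g)                 # consumption beta in (8.4)
risk_premium = gamma * cov(g, g)                # factor risk premium in (8.4)
print(exact, approx, beta_ig * risk_premium)
```

By construction the product of the consumption beta and the factor risk premium reproduces the right-hand side of (8.3), so (8.4) is simply a restatement of the same approximation.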
What if the economy does not have a representative individual? Eq. (8.3) will still hold for any individual $l$, i.e.
\[ \mathrm{E}[R_i] - R^f \approx \gamma_l(c_0^l) \, \mathrm{Cov}[c^l/c_0^l, R_i] = \mathrm{ARA}_l(c_0^l) \, \mathrm{Cov}[c^l, R_i], \]
where $\gamma_l$ and $\mathrm{ARA}_l$ are the relative and absolute risk aversion of individual $l$, so that
\[ \left( \mathrm{E}[R_i] - R^f \right) \frac{1}{\mathrm{ARA}_l(c_0^l)} \approx \mathrm{Cov}[c^l, R_i], \quad l = 1, 2, \dots, L. \]
Summing up over all individuals, we get
\[ \left( \mathrm{E}[R_i] - R^f \right) \sum_{l=1}^L \frac{1}{\mathrm{ARA}_l(c_0^l)} \approx \sum_{l=1}^L \mathrm{Cov}[c^l, R_i] = \mathrm{Cov}\left[ \sum_{l=1}^L c^l, R_i \right] = \mathrm{Cov}[C, R_i], \]
where $C$ is aggregate time 1 consumption. If we let $C_0$ denote aggregate time 0 consumption, we obtain
\[ \mathrm{E}[R_i] - R^f \approx \left( \sum_{l=1}^L \frac{1}{\mathrm{ARA}_l(c_0^l)} \right)^{-1} C_0 \, \mathrm{Cov}[C/C_0, R_i]. \tag{8.5} \]
Again ignoring the approximation, this equation shows that aggregate consumption growth is a pricing factor even if there is no representative individual. We just need to replace the relative risk aversion of the representative individual by some complex average of individual risk aversions. The absolute risk tolerance of individual $l$ is exactly $1/\mathrm{ARA}_l(c_0^l)$ so the first term on the right-hand side of the above equation can be interpreted as the inverse of the aggregate absolute risk tolerance in the economy. The higher the aggregate absolute risk tolerance, the lower the equilibrium risk premium on risky assets.
Of course, we prefer exact asset pricing results to approximate ones. In a later section, we will show that in continuous-time models the analogues to the above relations will hold as exact equalities under appropriate assumptions.
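8.2.2 The simple one-period CCAPM: version 2
Suppose that
(i) the individual has constant relative risk aversion, $u(c) = c^{1-\gamma}/(1-\gamma)$,
(ii) consumption growth is lognormally distributed,
\[
\ln(1+g) \equiv \ln\left(\frac{c}{c_0}\right) \sim N(\bar g, \sigma_g^2).
\]
Then
\[
\frac{u'(c)}{u'(c_0)} = \left(\frac{c}{c_0}\right)^{-\gamma} = \exp\left\{-\gamma \ln\left(\frac{c}{c_0}\right)\right\}.
\]
For a random variable $x \sim N(m, s^2)$, it can be shown (see Appendix B) that
\[
\mathrm{E}[e^{-kx}] = e^{-km + \frac{1}{2}k^2 s^2}
\]
for any constant $k$. In particular,
\[
\mathrm{E}[e^{-\gamma x}] = e^{-\gamma m + \frac{1}{2}\gamma^2 s^2}
\]
and
\[
\begin{aligned}
\mathrm{Var}[e^{-\gamma x}] &= \mathrm{E}[(e^{-\gamma x})^2] - \left(\mathrm{E}[e^{-\gamma x}]\right)^2 = \mathrm{E}[e^{-2\gamma x}] - \left(e^{-\gamma m + \frac{1}{2}\gamma^2 s^2}\right)^2 \\
&= e^{-2\gamma m + 2\gamma^2 s^2} - \left(e^{-\gamma m + \frac{1}{2}\gamma^2 s^2}\right)^2 = \left(e^{-\gamma m + \frac{1}{2}\gamma^2 s^2}\right)^2\left(e^{\gamma^2 s^2} - 1\right).
\end{aligned}
\]
In our case, we get
\[
\mathrm{E}\left[\frac{u'(c)}{u'(c_0)}\right] = \mathrm{E}\left[\exp\left\{-\gamma \ln\left(\frac{c}{c_0}\right)\right\}\right] = \exp\left\{-\gamma \bar g + \frac{1}{2}\gamma^2\sigma_g^2\right\} \tag{8.6}
\]
and
\[
\frac{\sigma[u'(c)/u'(c_0)]}{\mathrm{E}[u'(c)/u'(c_0)]} = \sqrt{e^{\gamma^2\sigma_g^2} - 1} \approx \gamma\sigma_g, \tag{8.7}
\]
where the approximation is based on $e^x \approx 1 + x$ for $x \approx 0$. The gross risk-free rate of return is then given by
\[
R^f = e^{\delta}\left(\mathrm{E}\left[(c/c_0)^{-\gamma}\right]\right)^{-1} = \exp\left\{\delta + \gamma \bar g - \frac{1}{2}\gamma^2\sigma_g^2\right\}
\]
so that the continuously compounded risk-free rate of return becomes
\[
\ln R^f = \delta + \gamma \bar g - \frac{1}{2}\gamma^2\sigma_g^2. \tag{8.8}
\]
From (8.2) and (8.7) we conclude that in the simple model the excess expected rate of return on a risky asset is
\[
\mathrm{E}[R_i] - R^f \approx -\gamma\sigma_g\,\rho\left[R_i, (c/c_0)^{-\gamma}\right]\sigma[R_i]. \tag{8.9}
\]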
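A first-order Taylor approximation of the function $f(x) = x^{-\gamma}$ around 1 gives $f(x) \approx f(1) + f'(1)(x - 1) = 1 - \gamma(x - 1)$. With $x = c/c_0$ we get
\[
\left(\frac{c}{c_0}\right)^{-\gamma} \approx 1 - \gamma\left(\frac{c}{c_0} - 1\right)
\]
and, consequently,
\[
\begin{aligned}
\rho\left[R_i, \left(\frac{c}{c_0}\right)^{-\gamma}\right] &\approx \rho\left[R_i, 1 - \gamma\left(\frac{c}{c_0} - 1\right)\right] = \frac{\mathrm{Cov}\left[R_i, 1 - \gamma\left(\frac{c}{c_0} - 1\right)\right]}{\sigma[R_i]\,\sigma\left[1 - \gamma\left(\frac{c}{c_0} - 1\right)\right]} \\
&= \frac{-\gamma\,\mathrm{Cov}[R_i, c/c_0]}{\gamma\,\sigma[R_i]\,\sigma[c/c_0]} = -\rho[R_i, c/c_0].
\end{aligned}
\]
Therefore,
\[
\mathrm{E}[R_i] - R^f \approx \gamma\sigma_g\,\rho[R_i, c/c_0]\,\sigma[R_i], \tag{8.10}
\]
as in (8.3).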
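To get a feel for how tight the approximation in (8.7) is, and to evaluate (8.6) and (8.8), one can plug in numbers. This is a minimal sketch under the lognormal CRRA assumptions of version 2; the function names and parameter values are hypothetical.

```python
import math


def mrs_moments(gamma, g_bar, sigma_g):
    """Return E[u'(c)/u'(c0)] from (8.6) and the exact and approximate
    ratio sigma[.]/E[.] from (8.7)."""
    mean = math.exp(-gamma * g_bar + 0.5 * gamma ** 2 * sigma_g ** 2)
    ratio_exact = math.sqrt(math.exp(gamma ** 2 * sigma_g ** 2) - 1.0)
    ratio_approx = gamma * sigma_g
    return mean, ratio_exact, ratio_approx


def log_riskfree_rate(delta, gamma, g_bar, sigma_g):
    """Continuously compounded risk-free rate from (8.8)."""
    return delta + gamma * g_bar - 0.5 * gamma ** 2 * sigma_g ** 2


# Hypothetical inputs: delta = 2%, gamma = 2, mean log growth 2%, volatility 2%.
mean, exact, approx = mrs_moments(2.0, 0.02, 0.02)
rf = log_riskfree_rate(0.02, 2.0, 0.02, 0.02)
print(mean, exact, approx, rf)
```

For small values of $\gamma\sigma_g$ the exact and approximate ratios are nearly identical; the gap widens as $\gamma$ grows, which matters in Section 8.5.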
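8.2.3 The simple one-period CCAPM: version 3
Assume that the individual has constant relative risk aversion and that the gross asset return $R_i$ and the consumption growth $c/c_0$ are jointly lognormally distributed (a stronger condition than in version 2). Then the state-price deflator $\zeta = e^{-\delta}(c/c_0)^{-\gamma}$ and the gross return $R_i$ are also jointly lognormally distributed. Since $R^f = 1/\mathrm{E}[\zeta]$ and $\ln\zeta$ is normally distributed, the continuously compounded risk-free rate $r^f = \ln R^f$ is
\[
r^f = -\mathrm{E}[\ln\zeta] - \frac{1}{2}\mathrm{Var}[\ln\zeta] = \delta + \gamma\,\mathrm{E}\left[\ln\left(\frac{c}{c_0}\right)\right] - \frac{1}{2}\gamma^2\,\mathrm{Var}\left[\ln\left(\frac{c}{c_0}\right)\right] = \delta + \gamma \bar g - \frac{1}{2}\gamma^2\sigma_g^2,
\]
as in version 2 above. The expectation of the continuously compounded rate of return $r_i = \ln R_i$ of asset $i$ satisfies
\[
\begin{aligned}
\mathrm{E}[r_i] - r^f &= -\mathrm{Cov}[\ln\zeta, r_i] - \frac{1}{2}\mathrm{Var}[r_i] \\
&= \gamma\,\mathrm{Cov}\left[r_i, \ln\left(\frac{c}{c_0}\right)\right] - \frac{1}{2}\mathrm{Var}[r_i] \\
&= \gamma\sigma_g\,\rho[r_i, \ln(c/c_0)]\,\sigma[r_i] - \frac{1}{2}\mathrm{Var}[r_i].
\end{aligned}
\tag{8.11}
\]
This relation is exact but holds only under the more restrictive assumption of joint lognormality of consumption and returns.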
8.2.4 The consumption-mimicking portfolio
In both versions 1 and 2 above, expected excess returns are determined by the beta of the asset with the growth rate of consumption via the equation $\mathrm{E}[R_i] \approx R^f + \beta[R_i, g]\,\eta$, cf. (8.4). We will now show that we can replace the consumption growth by the return $R^c$ on a portfolio $\pi^c$ mimicking the consumption growth in the sense that $\pi^c$ is the portfolio minimizing the variance of the difference between the portfolio return and the consumption growth. A portfolio characterized by the portfolio weight vector $\pi$ has a return of $R^\pi = \pi^\top R$, where $R$ is the vector of returns on the individual assets, so that
\[
\mathrm{Var}[R^\pi - g] = \pi^\top\mathrm{Var}[R]\,\pi + \mathrm{Var}[g] - 2\pi^\top\mathrm{Cov}[R, g].
\]
Here $\mathrm{Var}[R]$ is the variance-covariance matrix of the returns and $\mathrm{Cov}[R, g]$ is the vector of the covariances of each asset's return with consumption growth. The first-order condition determining the consumption growth mimicking portfolio $\pi = \pi^c$ is
\[
2\,\mathrm{Var}[R]\,\pi^c - 2\,\mathrm{Cov}[R, g] = 0 \quad\Leftrightarrow\quad \pi^c = (\mathrm{Var}[R])^{-1}\,\mathrm{Cov}[R, g], \tag{8.12}
\]
so that the return on that portfolio is
\[
R^c = (\pi^c)^\top R = (\mathrm{Cov}[R, g])^\top(\mathrm{Var}[R])^{-1} R.
\]
The variance of the return is
\[
\mathrm{Var}[R^c] = (\mathrm{Cov}[R, g])^\top(\mathrm{Var}[R])^{-1}\,\mathrm{Cov}[R, g],
\]
which implies that
\[
\mathrm{Cov}[R^c, g] = (\mathrm{Cov}[R, g])^\top(\mathrm{Var}[R])^{-1}\,\mathrm{Cov}[R, g] = \mathrm{Var}[R^c]. \tag{8.13}
\]
Also observe that
\[
\mathrm{Cov}[R, R^c] = \mathrm{Cov}[R, g]. \tag{8.14}
\]
The equation (8.4) holds for all assets and portfolios. In particular, it must hold for the consumption-mimicking portfolio so that
\[
\mathrm{E}[R^c] - R^f \approx \beta[R^c, g]\,\eta = \frac{\mathrm{Cov}[R^c, g]}{\mathrm{Var}[g]}\,\eta = \frac{\mathrm{Var}[R^c]}{\mathrm{Var}[g]}\,\eta,
\]
where the last equality follows from (8.13). Solving for $\eta$ in the above equation, we find
\[
\eta \approx \frac{\mathrm{Var}[g]}{\mathrm{Var}[R^c]}\left(\mathrm{E}[R^c] - R^f\right).
\]
Hence the expected excess return on any risky asset $i$ can be rewritten as
\[
\mathrm{E}[R_i] - R^f \approx \frac{\mathrm{Cov}[R_i, g]}{\mathrm{Var}[g]}\,\frac{\mathrm{Var}[g]}{\mathrm{Var}[R^c]}\left(\mathrm{E}[R^c] - R^f\right) = \frac{\mathrm{Cov}[R_i, g]}{\mathrm{Var}[R^c]}\left(\mathrm{E}[R^c] - R^f\right).
\]
Applying (8.14), we conclude that
\[
\mathrm{E}[R_i] - R^f \approx \beta[R_i, R^c]\left(\mathrm{E}[R^c] - R^f\right), \tag{8.15}
\]
where $\beta[R_i, R^c] = \mathrm{Cov}[R_i, R^c]/\mathrm{Var}[R^c]$. Equation (8.15) has exactly the form of the classic CAPM but with the return on a consumption-mimicking portfolio instead of the return on the market portfolio.

The consumption-mimicking portfolio has the property that its return has the highest absolute correlation with consumption growth. In order to see this, note that
\[
\left(\mathrm{Corr}[R^\pi, g]\right)^2 = \frac{\left(\mathrm{Cov}[R^\pi, g]\right)^2}{\mathrm{Var}[R^\pi]\,\mathrm{Var}[g]} = \frac{\left(\pi^\top\mathrm{Cov}[R, g]\right)^2}{\pi^\top\mathrm{Var}[R]\,\pi\;\mathrm{Var}[g]}.
\]
The first-order condition for the maximization of the squared correlation implies that
\[
2\,\pi^\top\mathrm{Cov}[R, g]\;\mathrm{Cov}[R, g]\left(\pi^\top\mathrm{Var}[R]\,\pi\;\mathrm{Var}[g]\right) - \left(\pi^\top\mathrm{Cov}[R, g]\right)^2\,2\,\mathrm{Var}[R]\,\pi\;\mathrm{Var}[g] = 0.
\]
Simplification leads to
\[
\mathrm{Cov}[R, g]\;\pi^\top\mathrm{Var}[R]\,\pi = \pi^\top\mathrm{Cov}[R, g]\;\mathrm{Var}[R]\,\pi.
\]
We know from (8.12) that the consumption-mimicking portfolio satisfies $\mathrm{Var}[R]\,\pi = \mathrm{Cov}[R, g]$ and substituting that into the above equation, we see that the consumption-mimicking portfolio also satisfies the first-order condition for the correlation maximization problem.
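The properties (8.12) to (8.14) are easy to verify on a small example. The sketch below solves the first-order condition for two risky assets using only the standard library; the covariance inputs are hypothetical, and, as in (8.12), the resulting weights are not normalized to sum to one.

```python
def mimicking_portfolio(var_r, cov_rg):
    """Solve Var[R] pi = Cov[R, g] for two assets, cf. (8.12)."""
    (a, b), (c, d) = var_r
    det = a * d - b * c
    if det == 0:
        raise ValueError("variance-covariance matrix is singular")
    x, y = cov_rg
    # Explicit inverse of a 2x2 matrix applied to the vector (x, y).
    return [(d * x - b * y) / det, (a * y - c * x) / det]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


# Hypothetical inputs: return variance-covariance matrix and Cov[R, g].
var_r = [[0.04, 0.01], [0.01, 0.09]]
cov_rg = [0.0008, 0.0006]

pi_c = mimicking_portfolio(var_r, cov_rg)
var_rc = dot(pi_c, mat_vec(var_r, pi_c))   # Var[R^c]
cov_rc_g = dot(pi_c, cov_rg)               # Cov[R^c, g], equals Var[R^c] by (8.13)
cov_r_rc = mat_vec(var_r, pi_c)            # Cov[R, R^c], equals Cov[R, g] by (8.14)
print(pi_c, var_rc, cov_rc_g, cov_r_rc)
```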
8.3 General multi-period link between consumption and asset returns
Now take the analysis to multi-period settings. Assuming time-additive expected utility, we can define a state-price deflator from the optimal consumption plan of any individual as
\[
\zeta_t = e^{-\delta t}\,\frac{u'(c_t)}{u'(c_0)}. \tag{8.16}
\]
This is true both in the discrete-time and in the continuous-time setting. If a representative individual exists, the equation holds for aggregate consumption.

Not surprisingly, the multi-period discrete-time setting leads to equations very similar to those derived in the one-period framework in the previous section. Since
\[
\frac{\zeta_{t+1}}{\zeta_t} = e^{-\delta}\,\frac{u'(c_{t+1})}{u'(c_t)},
\]
we get from (4.25) and (4.26) that the risk-free gross return is
\[
R^f_t = e^{\delta}\left(\mathrm{E}_t\left[\frac{u'(c_{t+1})}{u'(c_t)}\right]\right)^{-1} \tag{8.17}
\]
and that the excess expected gross return on a risky asset is
\[
\begin{aligned}
\mathrm{E}_t[R_{i,t+1}] - R^f_t &= -\frac{\mathrm{Cov}_t[u'(c_{t+1})/u'(c_t), R_{i,t+1}]}{\mathrm{E}_t[u'(c_{t+1})/u'(c_t)]} \\
&= -\rho_t\left[R_{i,t+1}, \frac{u'(c_{t+1})}{u'(c_t)}\right]\sigma_t[R_{i,t+1}]\left(\frac{\sigma_t[u'(c_{t+1})/u'(c_t)]}{\mathrm{E}_t[u'(c_{t+1})/u'(c_t)]}\right).
\end{aligned}
\tag{8.18}
\]
These equations are the multi-period equivalents of Equations (8.1) and (8.2) for the one-period model. As in the one-period case, we can obtain an approximate relation between expected returns and relative consumption growth, $g_{t+1} = c_{t+1}/c_t - 1$, over the next period,
\[
\mathrm{E}_t[R_{i,t+1}] - R^f_t \approx \gamma(c_t)\,\mathrm{Cov}_t[g_{t+1}, R_{i,t+1}], \tag{8.19}
\]
and also an approximate consumption-beta equation
\[
\mathrm{E}_t[R_{i,t+1}] - R^f_t \approx \beta_t[R_{i,t+1}, R^c_{t+1}]\left(\mathrm{E}_t[R^c_{t+1}] - R^f_t\right), \tag{8.20}
\]
where $\beta_t[R_{i,t+1}, R^c_{t+1}] = \mathrm{Cov}_t[R_{i,t+1}, R^c_{t+1}]/\mathrm{Var}_t[R^c_{t+1}]$ and $R^c_{t+1}$ is the gross return on a portfolio perfectly correlated with consumption growth over this period.
Now let us turn to the continuous-time setting. Suppose that the dynamics of consumption can be written as
\[
dc_t = c_t\left[\mu_{ct}\,dt + \sigma_{ct}^\top\,dz_t\right], \tag{8.21}
\]
where $\mu_{ct}$ is the expected relative growth rate of consumption and $\sigma_{ct}$ is the vector of sensitivities of consumption growth to the exogenous shocks to the economy. In particular, the variance of relative consumption growth is given by $\|\sigma_{ct}\|^2$. Given the dynamics of consumption and the relation (8.16), we can obtain the dynamics of $\zeta_t$ by an application of Itô's Lemma on the function $g(t, c) = e^{-\delta t}u'(c)/u'(c_0)$. The relevant derivatives are
\[
\frac{\partial g}{\partial t}(t, c) = -\delta e^{-\delta t}\frac{u'(c)}{u'(c_0)}, \qquad
\frac{\partial g}{\partial c}(t, c) = e^{-\delta t}\frac{u''(c)}{u'(c_0)}, \qquad
\frac{\partial^2 g}{\partial c^2}(t, c) = e^{-\delta t}\frac{u'''(c)}{u'(c_0)},
\]
implying that
\[
\begin{aligned}
\frac{\partial g}{\partial t}(t, c_t) &= -\delta e^{-\delta t}\frac{u'(c_t)}{u'(c_0)} = -\delta\zeta_t, \\
\frac{\partial g}{\partial c}(t, c_t) &= e^{-\delta t}\frac{u''(c_t)}{u'(c_0)} = \frac{u''(c_t)}{u'(c_t)}\,\zeta_t = -\gamma(c_t)\,c_t^{-1}\zeta_t, \\
\frac{\partial^2 g}{\partial c^2}(t, c_t) &= e^{-\delta t}\frac{u'''(c_t)}{u'(c_0)} = \frac{u'''(c_t)}{u'(c_t)}\,\zeta_t = \eta(c_t)\,c_t^{-2}\zeta_t,
\end{aligned}
\]
where $\gamma(c_t) \equiv -c_t u''(c_t)/u'(c_t)$ is the relative risk aversion of the individual, and where $\eta(c_t) \equiv c_t^2 u'''(c_t)/u'(c_t)$ is positive under the very plausible assumption that the absolute risk aversion of the individual is decreasing in the level of consumption. Consequently, the dynamics of the state-price deflator can be expressed as
\[
d\zeta_t = -\zeta_t\left[\left(\delta + \gamma(c_t)\mu_{ct} - \frac{1}{2}\eta(c_t)\|\sigma_{ct}\|^2\right)dt + \gamma(c_t)\,\sigma_{ct}^\top\,dz_t\right]. \tag{8.22}
\]
Comparing the above equation with (4.41), we can draw the conclusions summarized in the following theorem:
Theorem 8.1 In a continuous-time economy where the optimal consumption process of an individual with time-additive expected utility satisfies (8.21), the continuously compounded risk-free short-term interest rate satisfies
\[
r^f_t = \delta + \gamma(c_t)\mu_{ct} - \frac{1}{2}\eta(c_t)\|\sigma_{ct}\|^2 \tag{8.23}
\]
and
\[
\lambda_t = \gamma(c_t)\,\sigma_{ct} \tag{8.24}
\]
defines a market price of risk process. Here $\gamma(c_t) = -c_t u''(c_t)/u'(c_t)$ and $\eta(c_t) = c_t^2 u'''(c_t)/u'(c_t)$.
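As an illustration of Theorem 8.1, consider the special case of CRRA utility, $u(c) = c^{1-\gamma}/(1-\gamma)$, which is an assumption made here only for the example. Then $\gamma(c_t) = \gamma$ and $\eta(c_t) = \gamma(1+\gamma)$, so (8.23) and (8.24) reduce to closed forms. The function names and inputs below are hypothetical.

```python
def crra_short_rate(delta, gamma, mu_c, sigma_c):
    """(8.23) with gamma(c) = gamma and eta(c) = gamma * (1 + gamma).

    sigma_c is the vector of sensitivities of consumption growth to shocks.
    """
    var_c = sum(s * s for s in sigma_c)   # ||sigma_c||^2
    return delta + gamma * mu_c - 0.5 * gamma * (1.0 + gamma) * var_c


def crra_market_price_of_risk(gamma, sigma_c):
    """(8.24): lambda = gamma * sigma_c, component by component."""
    return [gamma * s for s in sigma_c]


# Hypothetical inputs: delta = 2%, gamma = 2, expected growth 2%,
# consumption exposed to the first of two shocks only.
r_f = crra_short_rate(0.02, 2.0, 0.02, [0.02, 0.0])
lam = crra_market_price_of_risk(2.0, [0.02, 0.0])
print(r_f, lam)
```

Raising the growth rate increases the short rate, whereas raising the consumption volatility lowers it, in line with the signs in (8.23).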
Equation (8.23) gives the interest rate at which the market for short-term borrowing and lending will clear. The equation relates the equilibrium short-term interest rate to the time preference rate $\delta$ and the expected growth rate $\mu_{ct}$ and the variance rate $\|\sigma_{ct}\|^2$ of aggregate consumption growth over the next instant. We can observe the following relations:
• There is a positive relation between the time preference rate and the equilibrium interest rate. The intuition behind this is that when the individuals of the economy are impatient and have a high demand for current consumption, the equilibrium interest rate must be high in order to encourage the individuals to save now and postpone consumption.
• The multiplier of $\mu_{ct}$ in (8.23) is the relative risk aversion of the representative individual, which is positive. Hence, there is a positive relation between the expected growth in aggregate consumption and the equilibrium interest rate. This can be explained as follows: we expect higher future consumption and hence lower future marginal utility, so postponed payments due to saving have lower value. Consequently, a higher return on saving is needed to maintain market clearing.
• If $u'''$ is positive, there will be a negative relation between the variance of aggregate consumption and the equilibrium interest rate. If the representative individual has decreasing absolute risk aversion, which is certainly a reasonable assumption, $u'''$ has to be positive. The intuition is that the greater the uncertainty about future consumption, the more the individuals will appreciate the sure payments from the risk-free asset and hence the lower the return necessary to clear the market for borrowing and lending.
If we substitute (8.24) into (4.39), we see that the excess expected rate of return on asset $i$ over the instant following time $t$ can be written as
\[
\mu_{it} + \delta_{it} - r^f_t = \gamma(c_t)\,\sigma_{it}^\top\sigma_{ct} = \gamma(c_t)\,\rho_{ict}\,\|\sigma_{it}\|\,\|\sigma_{ct}\|. \tag{8.25}
\]
Here $\sigma_{it}^\top\sigma_{ct}$ and $\rho_{ict}$ are the covariance and correlation between the rate of return on asset $i$ and the consumption growth rate, respectively, and $\|\sigma_{it}\|$ and $\|\sigma_{ct}\|$ are standard deviations (volatilities) of the rate of return on asset $i$ and the consumption growth rate, respectively. Equation (8.25) is the continuous-time version of (8.19). Note that the continuous-time relation is exact. Again, if we can find a trading strategy mimicking the consumption process (same volatility, perfect correlation), we get the consumption-beta relation
\[
\mu_{it} + \delta_{it} - r^f_t = \beta^c_{it}\left(\mu^c_t + \delta^c_t - r^f_t\right),
\]
where $\beta^c_{it} = \sigma_{it}^\top\sigma_{ct}/\|\sigma_{ct}\|^2$ and $\mu^c_t + \delta^c_t$ denotes the expected rate of return on the mimicking strategy.

If the market is effectively complete, the above equations are valid for aggregate consumption if we apply the utility function and time preference rate of the representative individual. The representative individual version of Equation (8.25) says that risky assets are priced so that the expected excess return on an asset is given by the product of the relative risk aversion of the representative individual and the covariance between the asset return and the growth rate of aggregate consumption. This is the key result in the Consumption-based CAPM (or just CCAPM) developed by Breeden (1979).
As already indicated in the one-period framework, we can obtain a relation between expected returns and aggregate consumption also if the market is incomplete and no representative individual exists. Let us stick to the continuous-time setting. Let $c_l = (c_{lt})$ denote the optimal consumption process of individual number $l$ in the economy and assume that
\[
dc_{lt} = c_{lt}\left[\mu_{clt}\,dt + \sigma_{clt}^\top\,dz_t\right].
\]
If there are $L$ individuals in the economy, aggregate consumption is $C_t = \sum_{l=1}^{L} c_{lt}$ and we have that
\[
dC_t = \sum_{l=1}^{L} dc_{lt} = \left(\sum_{l=1}^{L} c_{lt}\mu_{clt}\right)dt + \left(\sum_{l=1}^{L} c_{lt}\sigma_{clt}\right)^{\!\top} dz_t \equiv C_t\left[\mu_{Ct}\,dt + \sigma_{Ct}^\top\,dz_t\right],
\]
where $\mu_{Ct} = \left(\sum_{l=1}^{L} c_{lt}\mu_{clt}\right)/C_t$ and $\sigma_{Ct} = \left(\sum_{l=1}^{L} c_{lt}\sigma_{clt}\right)/C_t$. We know from (8.25) that
\[
\mu_{it} + \delta_{it} - r^f_t = A_l(c_{lt})\,c_{lt}\,\sigma_{it}^\top\sigma_{clt}, \qquad l = 1, \dots, L,
\]
where $A_l(c_{lt}) \equiv -u_l''(c_{lt})/u_l'(c_{lt})$ is the absolute risk aversion of individual $l$. Consequently,
\[
\left(\mu_{it} + \delta_{it} - r^f_t\right)\frac{1}{A_l(c_{lt})} = c_{lt}\,\sigma_{it}^\top\sigma_{clt}, \qquad l = 1, \dots, L,
\]
and summing up over $l$, we get
\[
\left(\mu_{it} + \delta_{it} - r^f_t\right)\sum_{l=1}^{L}\left(\frac{1}{A_l(c_{lt})}\right) = \sigma_{it}^\top\left(\sum_{l=1}^{L} c_{lt}\sigma_{clt}\right) = \sigma_{it}^\top\left(C_t\sigma_{Ct}\right).
\]
Therefore, we have the following relation between excess expected returns and aggregate consumption:
\[
\mu_{it} + \delta_{it} - r^f_t = \frac{C_t}{\sum_{l=1}^{L}\left(\frac{1}{A_l(c_{lt})}\right)}\,\sigma_{it}^\top\sigma_{Ct}. \tag{8.26}
\]
Relative to the complete markets version (8.25), the only difference is that the relative risk aversion of the representative individual is replaced by some complicated average of the risk aversions of the individuals. Note that if all individuals have CRRA utility with identical relative risk aversions, then $A_l(c_{lt}) = \gamma/c_{lt}$ and the multiplier $C_t/\sum_{l=1}^{L}\left(\frac{1}{A_l(c_{lt})}\right)$ in the above equation reduces to $\gamma$.

The Consumption-based CAPM is a very general asset pricing result. Basically any asset pricing model can be seen as a special case of the Consumption-based CAPM. On the other hand, the general Consumption-based CAPM is not very useful for applications without further assumptions. Therefore we turn now to more specific consumption-based models.
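The multiplier in (8.26) is aggregate consumption divided by aggregate absolute risk tolerance. The sketch below computes it for a hypothetical cross-section of individuals and confirms the reduction to $\gamma$ when everyone has CRRA utility with the same $\gamma$; all names and numbers are assumptions for illustration.

```python
def aggregate_risk_aversion(consumptions, abs_risk_aversions):
    """Multiplier C_t / sum_l (1 / A_l(c_lt)) from (8.26)."""
    total_consumption = sum(consumptions)
    total_tolerance = sum(1.0 / a for a in abs_risk_aversions)
    return total_consumption / total_tolerance


# Identical CRRA individuals: A_l = gamma / c_l, so the multiplier is gamma.
gamma = 3.0
c = [1.0, 2.5, 4.0]
identical = aggregate_risk_aversion(c, [gamma / x for x in c])

# Heterogeneous CRRA individuals with gamma_l = 2 and 4 and equal consumption.
c_het = [1.0, 1.0]
heterogeneous = aggregate_risk_aversion(c_het, [2.0 / c_het[0], 4.0 / c_het[1]])
print(identical, heterogeneous)
```

In the heterogeneous case the multiplier is a consumption-weighted harmonic mean of the individual relative risk aversions, so it lies between the smallest and the largest of them.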
8.4 The simple multi-period CCAPM
A large part of the asset pricing literature makes (not always explicitly stated, unfortunately) the following additional assumptions:
1. the economy has a representative individual with CRRA time-additive utility, i.e. $u(C) = \frac{1}{1-\gamma}C^{1-\gamma}$,
2. future aggregate consumption is lognormally distributed.
In the discrete-time version of the model, we can proceed as in version 2 of the one-period model. The first assumption leads to a marginal rate of substitution given by
\[
\frac{u'(C_{t+1})}{u'(C_t)} = \left(\frac{C_{t+1}}{C_t}\right)^{-\gamma} = \exp\left\{-\gamma\ln\left(\frac{C_{t+1}}{C_t}\right)\right\}.
\]
By the second assumption, $\ln(C_{t+1}/C_t) \sim N(\bar g, \sigma_C^2)$, and hence
\[
\mathrm{E}_t\left[\frac{u'(C_{t+1})}{u'(C_t)}\right] = \mathrm{E}_t\left[\exp\left\{-\gamma\ln\left(\frac{C_{t+1}}{C_t}\right)\right\}\right] = \exp\left\{-\gamma\bar g + \frac{1}{2}\gamma^2\sigma_C^2\right\} \tag{8.27}
\]
and
\[
\frac{\sigma_t\left(u'(C_{t+1})/u'(C_t)\right)}{\mathrm{E}_t\left[u'(C_{t+1})/u'(C_t)\right]} = \sqrt{e^{\gamma^2\sigma_C^2} - 1} \approx \gamma\sigma_C. \tag{8.28}
\]
According to (8.17), the gross risk-free return over the period from $t$ to $t+1$ is then given by
\[
R^f_t = e^{\delta}\left(\mathrm{E}_t\left[(C_{t+1}/C_t)^{-\gamma}\right]\right)^{-1} = \exp\left\{\delta + \gamma\bar g - \frac{1}{2}\gamma^2\sigma_C^2\right\}
\]
so that the continuously compounded risk-free rate of return becomes
\[
r^f_t \equiv \ln R^f_t = \delta + \gamma\bar g - \frac{1}{2}\gamma^2\sigma_C^2. \tag{8.29}
\]
From (8.18) and (8.28) we conclude that in the simple model the excess expected gross return on a risky asset is
\[
\begin{aligned}
\mathrm{E}_t[R_{i,t+1}] - R^f_t &\approx -\gamma\sigma_C\,\rho_t\left[R_{i,t+1}, \left(\frac{C_{t+1}}{C_t}\right)^{-\gamma}\right]\sigma_t[R_{i,t+1}] \\
&\approx \gamma\sigma_C\,\rho_t\left[R_{i,t+1}, \frac{C_{t+1}}{C_t}\right]\sigma_t[R_{i,t+1}],
\end{aligned}
\tag{8.30}
\]
where the last expression follows from a first-order Taylor approximation as in Section 8.2.2. If we further assume that the future asset price and the future consumption level are simultaneously lognormally distributed, we are back in version 3 of the one-period model so that we get
\[
\mathrm{E}_t[\ln R_{i,t+1}] - \ln R^f_t + \frac{1}{2}\mathrm{Var}_t[\ln R_{i,t+1}] = \gamma\sigma_C\,\rho_t\left[\ln R_{i,t+1}, \ln\left(\frac{C_{t+1}}{C_t}\right)\right]\sigma_t[\ln R_{i,t+1}] \tag{8.31}
\]
or, equivalently,
\[
\ln\left(\mathrm{E}_t[R_{i,t+1}]\right) - \ln R^f_t = \gamma\sigma_C\,\rho_t\left[\ln R_{i,t+1}, \ln\left(\frac{C_{t+1}}{C_t}\right)\right]\sigma_t[\ln R_{i,t+1}].
\]
In the formulas above the mean $\bar g$ and variance $\sigma_C^2$ of consumption growth can vary over time, but in the stationary case where both are constant we see that the risk-free interest rate must also be constant. Furthermore, a risky asset with a constant correlation with consumption growth will have a constant Sharpe ratio $\left(\mathrm{E}_t[R_{i,t+1}] - R^f_t\right)/\sigma_t[R_{i,t+1}]$. If the standard deviation of the return is constant, the expected excess rate of return will also be constant.
In the continuous-time version of the stationary simple consumption-based model, the second assumption means that $\mu_{Ct}$ and $\sigma_{Ct}$ in the consumption process (8.21) are constant, i.e. consumption follows a geometric Brownian motion. It follows from (8.23) and (8.25) that the model with these assumptions generates a constant continuously compounded short-term risk-free interest rate of
\[
r^f = \delta + \gamma\mu_C - \frac{1}{2}\gamma(1+\gamma)\|\sigma_C\|^2 \tag{8.32}
\]
and a constant Sharpe ratio for asset $i$ given by
\[
\frac{\mu_{it} + \delta_{it} - r^f}{\|\sigma_{it}\|} = \gamma\,\rho_{iC}\,\|\sigma_C\| \tag{8.33}
\]
if the asset has a constant correlation with consumption.
8.5 Theory meets data: asset pricing puzzles
The simple consumption-based model has been exposed to numerous empirical tests. Almost all of these tests focus on the question whether the relation (8.33), or the similar discrete-time relation (8.30), holds for a broad-based stock index for reasonable values of the relative risk aversion coefficient. Let us use the following stylized figures that are roughly representative for the U.S. economy over the second half of the 20th century:
• the average annual real excess rate of return on the stock index (relative to the yield on a short-term government bond) is about 8.0%;
• the empirical standard deviation of the annual real rate of return on the stock index is about 20.0%;
• the empirical standard deviation of annual relative changes in aggregate consumption is about 2.0%;
• the empirical correlation between the real return on the stock index and changes in aggregate consumption is about 0.2.
Inserting these estimates into (8.33), it follows that the relative risk aversion coefficient $\gamma$ must be 100. This is certainly an unrealistically high risk aversion for a typical individual, cf. the discussion in Section 5.6.2. In fact, this computation, which is standard in the literature, exaggerates the problem somewhat. The estimates of the return and consumption moments are based on discrete observations, not continuous observations, so we should use the discrete-time version of the model. If we drop the approximation in (8.28), the discrete-time simple model says that
\[
\mathrm{E}_t[R_{i,t+1}] - R^f_t \approx \sqrt{e^{\gamma^2\sigma_C^2} - 1}\;\rho_t\left[R_{i,t+1}, \frac{C_{t+1}}{C_t}\right]\sigma_t[R_{i,t+1}]
\]
and plugging in the estimates, we need
\[
\gamma \approx \frac{1}{\sigma_C}\sqrt{\ln\left(1 + \left(\frac{\mathrm{E}_t[R_{i,t+1}] - R^f_t}{\rho_t\left[R_{i,t+1}, C_{t+1}/C_t\right]\sigma_t[R_{i,t+1}]}\right)^{2}\right)} = \frac{1}{0.02}\sqrt{\ln(5)} \approx 63.4.
\]
But 63.4 is still an unreasonably high relative risk aversion.

For a reasonable level of risk aversion (probably in the area 2–5), the expected excess rate of return predicted by the theory is much smaller than the historical average. For example, with $\gamma = 5$, the expected excess rate of return on the market portfolio should be $5 \times 0.2 \times 0.02 \times 0.2 = 0.004$ or 0.4% according to (8.33). The simple consumption-based asset pricing model cannot explain the high average stock returns observed in the data. This so-called equity premium puzzle was first pointed out by Mehra and Prescott (1985).
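The three numbers in the discussion above (100, 63.4, and 0.4%) follow directly from the stylized figures. A minimal sketch that reproduces them; the function names are hypothetical.

```python
import math


def implied_gamma_continuous(excess, sigma_r, sigma_c, rho):
    """Solve (8.33) for gamma: excess / sigma_r = gamma * rho * sigma_c."""
    return excess / (rho * sigma_r * sigma_c)


def implied_gamma_discrete(excess, sigma_r, sigma_c, rho):
    """Solve the discrete-time relation without the approximation in (8.28)."""
    ratio = excess / (rho * sigma_r)
    return math.sqrt(math.log(1.0 + ratio ** 2)) / sigma_c


def predicted_excess_return(gamma, sigma_r, sigma_c, rho):
    """Expected excess return implied by (8.33) for a given gamma."""
    return gamma * rho * sigma_c * sigma_r


# Stylized figures from the text: 8% premium, 20% return volatility,
# 2% consumption volatility, correlation 0.2.
gamma_cont = implied_gamma_continuous(0.08, 0.20, 0.02, 0.2)
gamma_disc = implied_gamma_discrete(0.08, 0.20, 0.02, 0.2)
premium_at_5 = predicted_excess_return(5.0, 0.20, 0.02, 0.2)
print(gamma_cont, gamma_disc, premium_at_5)
```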
[Table: columns Country, Period, $\mu_m$, $\sigma_m$, $\sigma_C$, $\rho_{mC}$; table body not recoverable.]
[...] is then computed as the value you need to apply for the relative risk aversion in order for (8.31) to hold given the estimates of the other parameters. The [...]

[...] $20\%/\sqrt{50} \approx 2.8\%$ so that a 95% confidence interval is roughly [2.5%, 13.5%]. With a mean return of 2.5% instead of 8.0%, the required value of the relative risk aversion drops to 31.25, which is certainly still very high but nevertheless considerably smaller than the original value of 100.
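The standard-error arithmetic quoted above can be reproduced as follows. The sketch assumes 50 independent annual observations and a normal approximation with critical value 1.96; both are assumptions made here for illustration, as is the function name.

```python
import math


def mean_confidence_interval(mean, std, n_obs, z=1.96):
    """Standard error of a sample mean and the symmetric interval mean +/- z * se."""
    se = std / math.sqrt(n_obs)
    return se, (mean - z * se, mean + z * se)


# 8% average excess return, 20% standard deviation, 50 annual observations.
se, (lower, upper) = mean_confidence_interval(0.08, 0.20, 50)

# Risk aversion required by (8.33) at a 2.5% mean excess return.
gamma_at_lower = 0.025 / (0.2 * 0.20 * 0.02)
print(se, lower, upper, gamma_at_lower)
```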
Secondly, the consumption data used in the tests is of poor quality, as pointed out by e.g. Breeden, Gibbons, and Litzenberger (1989). The available data on aggregate consumption measures the expenses for purchases of consumption goods over a given period (usually a quarter or a month). This is problematic for several reasons. Many, especially very expensive, consumption goods are durable goods offering consumption services beyond the period of purchase. The model addresses the rate of consumption at a given point in time rather than the sum of consumption rates over some time interval; see Grossman, Melino, and Shiller (1987). The consumption data is reported infrequently relative to financial data. Moreover, the aggregate consumption numbers are undoubtedly subject to various sampling and measurement errors; see Wilcox (1992). These problems motivate the development of asset pricing models that do not depend on consumption at all. This is for example the case for the so-called factor models discussed in Chapter 9.

Thirdly, data for stock returns also has to be selected and applied with caution. Most tests of the asset pricing models are based on data from the U.S. or other economies that have experienced relatively high growth at least throughout the last 50 years. Investors in these countries were probably not so sure 50 years ago that the economy would avoid major financial and political crises and outperform other countries. Brown, Goetzmann, and Ross (1995) point out that due to this survivorship bias the realized stock returns overstate the ex-ante expected rate of return significantly, by maybe as much as 2–4 percentage points. Note, however, that in many crises in which stocks do badly, bonds and deposits also tend to provide low returns, so it is not clear how big the effect on the expected excess stock return is.
Fourthly, the pricing relations of the model involve the ex-ante expectations of individuals, while the tests of the model are based on a single realized sequence of market prices and consumption. Estimating and testing a model involving ex-ante expectations and other moments requires stationarity in data in the sense that it must be assumed that each of the annual observations is drawn from the same probability distribution. For example, the average of the observed annual stock market returns is only a reasonable estimate of the ex-ante expected annual stock market return if the stock market return in each of the 50 years is drawn from the same probability distribution. Some important changes in the investment environment over the past years invalidate the stationarity assumption. For example, Mehra and Prescott (2003) note two significant changes in the U.S. tax system in the period between 1960 and 2000, a period included in most tests of the consumption-based model:
• The marginal tax rate for stock dividends has dropped from 43% to 17%.
• Stock returns in most pension savings accounts are now tax-exempt, which was not so in the 1960s. Bond returns in savings accounts have been tax-exempt throughout the period.
Both changes have led to increased demands for stocks with stock price increases as a result. These changes in the tax rules were hardly predicted by investors and, hence, they can partly explain why the model has problems explaining the high realized stock returns. Similarly, it can be argued that the reductions in direct and indirect transaction costs and the liberalizations of international financial markets experienced over the last decades have increased the demands for stocks and driven up stock returns above what could be expected ex-ante. The high transaction costs and restrictions on particularly international investments in the past may have made it impossible or at least very expensive for investors to obtain the optimal diversification of their investments so that even unsystematic risks may have been priced with higher required returns as a consequence.¹

¹ It is technically complicated to include transaction costs and trading restrictions in asset pricing models, but a study of He and Modest (1995) indicates that such imperfections can at least in part explain the equity premium puzzle.
One can also argue theoretically that the returns of a given stock cannot be stationary. Here is the argument: in each period there is a probability that the issuing firm defaults and the stock ceases to exist. Then it no longer makes sense to talk about the return probability distribution of that stock. More generally, the probability distribution of the return in one period may very well depend on the returns in previous periods.

Fifthly, as emphasized by Bossaerts (2002), standard tests assume that ex-ante expectations of individuals are correct in the sense that they are confirmed by realizations. The general asset pricing theory does allow individuals to have systematically over-optimistic or over-pessimistic expectations. The usual tests implicitly assume that market data can be seen as realizations of the ex-ante expectations of individuals. Bossaerts (2002) describes studies which indicate that this assumption is not necessarily valid.
8.7 CCAPM with alternative preferences
An interesting alternative to the simple consumption-based model is to allow the utility of a given consumption level at a given point in time $t$ to depend on some benchmark $X_t$, i.e. the preferences are modeled by $\mathrm{E}\left[\int_0^T e^{-\delta t}u(c_t, X_t)\,dt\right]$ in continuous time or $\mathrm{E}\left[\sum_{t=0}^{T} e^{-\delta t}u(c_t, X_t)\right]$ in discrete time. This incorporates the case of (internal) habit formation where $X_t$ is determined as a weighted average of the previous consumption rates of the individual, and the case of state-dependent (or external habit) preferences where $X_t$ is a variable not affected by the consumption decisions of the individual. If we assume that a high value of the benchmark means that the individual will be more eager to increase consumption, as is the case with (internal) habit formation, we should have $u_{cX}(c, X) > 0$.

Typically, models apply one of two tractable specifications of the utility function. The first specification is
\[
u(c_t, X_t) = \frac{1}{1-\gamma}(c_t - X_t)^{1-\gamma}, \qquad \gamma > 0, \tag{8.34}
\]
which is defined for $c_t > X_t$. Marginal utility is
\[
u_c(c_t, X_t) = (c_t - X_t)^{-\gamma},
\]
and the relative risk aversion is
\[
-\frac{c_t\,u_{cc}(c_t, X_t)}{u_c(c_t, X_t)} = \gamma\,\frac{c_t}{c_t - X_t}, \tag{8.35}
\]
which is no longer constant and is greater than $\gamma$. This will allow us to match historical consumption and stock market data with a lower value of the parameter $\gamma$, but it is more reasonable to study whether we can match the data with a fairly low average value of the relative risk aversion given by (8.35). The second tractable specification is
\[
u(c_t, X_t) = \frac{1}{1-\gamma}\left(\frac{c_t}{X_t}\right)^{1-\gamma}, \qquad \gamma > 0, \tag{8.36}
\]
which is defined for $c_t, X_t > 0$. Since
\[
u_c(c, X) = c^{-\gamma}X^{\gamma-1}, \qquad u_{cX}(c, X) = (\gamma - 1)\,c^{-\gamma}X^{\gamma-2},
\]
we need $\gamma > 1$ to ensure that marginal utility is increasing in $X$. Despite the generalization of the utility function relative to the standard model, the relative risk aversion is still constant:
\[
-\frac{c\,u_{cc}(c, X)}{u_c(c, X)} = \gamma.
\]
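The behavior of the relative risk aversion (8.35) under the difference specification (8.34) can be illustrated numerically. The sketch below evaluates the closed form and cross-checks it with a central finite difference of marginal utility; the function names, step size, and inputs are hypothetical.

```python
def habit_marginal_utility(c, x, gamma):
    """u_c(c, X) = (c - X)^(-gamma) for the specification (8.34)."""
    return (c - x) ** (-gamma)


def habit_rra(c, x, gamma):
    """Closed-form relative risk aversion (8.35): gamma * c / (c - X)."""
    return gamma * c / (c - x)


def rra_finite_difference(c, x, gamma, h=1e-6):
    """Approximate -c * u_cc / u_c with a central difference for u_cc."""
    up = habit_marginal_utility(c + h, x, gamma)
    down = habit_marginal_utility(c - h, x, gamma)
    u_cc = (up - down) / (2.0 * h)
    return -c * u_cc / habit_marginal_utility(c, x, gamma)


# With gamma = 2 and benchmark X = 0.8: consumption close to the benchmark
# (a "bad state") versus consumption well above it (a "good state").
rra_bad = habit_rra(1.0, 0.8, 2.0)
rra_good = habit_rra(2.0, 0.8, 2.0)
rra_check = rra_finite_difference(1.0, 0.8, 2.0)
print(rra_bad, rra_good, rra_check)
```

The relative risk aversion exceeds $\gamma$ everywhere and grows without bound as consumption approaches the benchmark from above.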
8.7.1 Habit formation
In the case with (internal) habit formation the individual will value a given level of consumption at a given date more highly if she is used to low consumption than if she is used to high consumption. Such a representation of preferences is still consistent with the von Neumann and Morgenstern (1944) axioms of expected utility but violates the time-additivity of utility typically assumed. A rational individual with an internal habit will take into account the effects of her current decisions on the future habit levels. We saw in Chapter 6 that this will considerably complicate the formulas for optimal consumption and for the associated state-price deflator.

Individuals with habit formation in preferences will, other things equal, invest more in the risk-free asset in order to ensure that future consumption will not come very close to (or even below) the future habit level. This extra demand for the risk-free asset will lower the equilibrium interest rate and, hence, help resolve the risk-free rate puzzle. Moreover, we see from (8.35) that the risk aversion is higher in bad states where current consumption is close to the habit level than in good states of high current consumption relative to the benchmark. This will be reflected by the risk premia of risky assets and has the potential to explain the observed cyclical behavior of expected returns.

As we have seen in Chapter 6, the state-price deflators that can be derived from individuals with internal habit formation are considerably more complex than with time-additive preferences or external habit formation. A general and rather abstract continuous-time analysis for the case of an internal habit defined by a weighted average of earlier consumption rates is given by Detemple and Zapatero (1991).

Only very few concrete asset pricing models with internal habit have been developed, with the continuous-time model of Constantinides (1990) as the most frequently cited. The model of Constantinides assumes a representative individual who can invest in a risk-free asset and a single risky asset. A priori, the risk-free rate and the expected rate of return and the volatility of the risky asset are assumed to be constant. The utility of the individual is given by (8.34), where the habit level is a weighted average of consumption at all previous dates. Constantinides solves for the optimal consumption and investment strategies of the individual and shows that the optimal consumption rate varies much less over time in the model with habit formation than with the usual time-additive specification of utility. A calibration of the model to historical data shows that the model is consistent with a large equity premium and a low risk aversion but, on the other hand, the consumption process of the model has an unrealistically high autocorrelation and the variance of long-term consumption growth is quite high. By construction, the model cannot explain variations in interest rates and expected returns on stocks.
8.7.2 State-dependent utility: general results
With an external habit/benchmark, e.g. given by the aggregate consumption level, the state-price deflator is
\[
\zeta_t = e^{-\delta t}\,\frac{u_c(c_t, X_t)}{u_c(c_0, X_0)},
\]
where the only difference to the model without habit is that the marginal utilities depend on the habit level. An external habit formalizes the idea that the marginal utility of consumption of one individual is increasing in the consumption level of other individuals (sometimes referred to as the "keeping up with the Joneses" effect). In this case, there is no effect of the consumption choice of the individual on the future benchmark levels, but of course the individual will include her knowledge of the dynamics of the benchmark when making consumption decisions.

In a discrete-time setting the one-period deflator is
\[
\frac{\zeta_{t+1}}{\zeta_t} = e^{-\delta}\,\frac{u_c(c_{t+1}, X_{t+1})}{u_c(c_t, X_t)}.
\]
Using the Taylor approximation
\[
u_c(c_{t+1}, X_{t+1}) \approx u_c(c_t, X_t) + u_{cc}(c_t, X_t)\,\Delta c_{t+1} + u_{cX}(c_t, X_t)\,\Delta X_{t+1},
\]
the approximate relation
\[
\mathrm{E}_t[R_{i,t+1}] - R^f_t \approx \left(-\frac{c_t\,u_{cc}(c_t, X_t)}{u_c(c_t, X_t)}\right)\mathrm{Cov}_t\left[R_{i,t+1}, \frac{c_{t+1}}{c_t}\right] - \frac{u_{cX}(c_t, X_t)}{u_c(c_t, X_t)}\,\mathrm{Cov}_t[R_{i,t+1}, X_{t+1}] \tag{8.37}
\]
can be derived. The covariance of return with the benchmark variable adds a term to the excess expected return of a risky asset. Moreover, the relative risk aversion in the first term will now generally vary with the benchmark variable.

In a continuous-time setting where the dynamics of consumption is again given by (8.21) and the dynamics of the benchmark process $X = (X_t)$ is of the form
\[
dX_t = \mu_{Xt}\,dt + \sigma_{Xt}^\top\,dz_t, \tag{8.38}
\]
an application of Itô's Lemma will give the dynamics of the state-price deflator. As before, the risk-free rate and the market price of risk can be identified from the drift and the sensitivity, respectively, of the state-price deflator. The following theorem states the conclusion. Exercise 8.3 asks for the proof.
Theorem 8.2 In a continuous-time economy where the optimal consumption process of an individual
with state-dependent expected utility satisfies (8.21) and the dynamics of the benchmark is
given by (8.38), the continuously compounded risk-free short-term interest rate satisfies
\[
\begin{aligned}
r^f_t = \delta &+ \left(\frac{-c_t u_{cc}(c_t, X_t)}{u_c(c_t, X_t)}\right)\mu_{ct} - \frac{1}{2}\, c_t^2\, \frac{u_{ccc}(c_t, X_t)}{u_c(c_t, X_t)}\, \|\sigma_{ct}\|^2 - \frac{u_{cX}(c_t, X_t)}{u_c(c_t, X_t)}\, \mu_{Xt} \\
&- \frac{1}{2}\, \frac{u_{cXX}(c_t, X_t)}{u_c(c_t, X_t)}\, \|\sigma_{Xt}\|^2 - c_t\, \frac{u_{ccX}(c_t, X_t)}{u_c(c_t, X_t)}\, \sigma_{ct}^\top \sigma_{Xt}
\end{aligned}
\tag{8.39}
\]
and
\[ \lambda_t = \left(\frac{-c_t u_{cc}(c_t, X_t)}{u_c(c_t, X_t)}\right)\sigma_{ct} - \frac{u_{cX}(c_t, X_t)}{u_c(c_t, X_t)}\, \sigma_{Xt} \tag{8.40} \]
defines a market price of risk process. In particular, the excess expected rate of return on asset $i$ is
\[ \mu_{it} + \delta_{it} - r^f_t = \left(\frac{-c_t u_{cc}(c_t, X_t)}{u_c(c_t, X_t)}\right)\sigma_{it}^\top \sigma_{ct} - \frac{u_{cX}(c_t, X_t)}{u_c(c_t, X_t)}\, \sigma_{it}^\top \sigma_{Xt}. \tag{8.41} \]
Assuming $u_{cX}(c, X) > 0$, we see from (8.40) that, if $\sigma_{Xt}$ goes in the same direction as $\sigma_{ct}$,
state-dependent utility will tend to lower the market price of risk, working against a resolution of
the equity premium puzzle. On the other hand, there is much more room for time variation in both
the risk-free rate and the market price of risk, which can help explain the predictability puzzle.
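The effect of the benchmark term in (8.40) can be illustrated with a small numerical sketch. It assumes a one-dimensional shock and the habit-style utility $u(c, X) = (c - X)^{1-\gamma}/(1-\gamma)$ (the form used in Section 8.7.3), for which $-c\,u_{cc}/u_c = \gamma c/(c - X)$ and $u_{cX}/u_c = \gamma/(c - X) > 0$. The function name and all numbers are hypothetical and chosen only for illustration.

```python
def market_price_of_risk(c, X, gamma, sigma_c, sigma_X):
    """Scalar version of (8.40) for u(c, X) = (c - X)**(1 - gamma) / (1 - gamma).

    sigma_c is the relative volatility of consumption and sigma_X the
    absolute volatility of the benchmark, both with respect to the same shock.
    """
    if c <= X:
        raise ValueError("this utility specification requires c > X")
    rra = gamma * c / (c - X)      # -c * u_cc / u_c
    cross = gamma / (c - X)        # u_cX / u_c, positive for this utility
    return rra * sigma_c - cross * sigma_X


# Benchmark insensitive to the shock versus benchmark moving with consumption.
lam_fixed = market_price_of_risk(c=1.0, X=0.8, gamma=2.0, sigma_c=0.02, sigma_X=0.0)
lam_comoving = market_price_of_risk(c=1.0, X=0.8, gamma=2.0, sigma_c=0.02, sigma_X=0.01)
print(lam_fixed, lam_comoving)
```

With these illustrative inputs the market price of risk falls when the benchmark is exposed to the same shock as consumption, which is the mechanism described in the paragraph above.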
Again, the above results hold for aggregate consumption if a representative individual with
preferences of the given form exists. If $\sigma_{Xt} = 0$, we can link asset prices to aggregate consumption
without assuming the existence of a representative individual, as in the case of standard preferences.
We obtain
\[ \mu_{it} + \delta_{it} - r^f_t = \frac{C_t}{\sum_{l=1}^{L} \left(\frac{1}{A_l(c_{lt}, X_t)}\right)}\, \sigma_{it}^\top \sigma_{Ct}, \tag{8.42} \]
where $C_t$ is aggregate consumption and $A_l(c_{lt}, X_t) = -u^l_{cc}(c_{lt}, X_t)/u^l_c(c_{lt}, X_t)$ is the (now state-dependent)
absolute risk aversion of individual $l$.
8.7.3 The Campbell and Cochrane model
Campbell and Cochrane (1999) suggest a discrete-time model of an economy with identical individuals
with utility functions like (8.34), where $X_t$ is the benchmark or external habit level. The
next-period deflator is then
\[ m_{t+1} \equiv \frac{\zeta_{t+1}}{\zeta_t} = e^{-\delta}\, \frac{(c_{t+1} - X_{t+1})^{-\gamma}}{(c_t - X_t)^{-\gamma}}. \]
The lognormal distributional assumption for aggregate consumption made in the simple model
seems to be empirically reasonable so the Campbell and Cochrane model keeps that assumption.
Since it is not obvious what distributional assumption on $X_t$ will make the model computationally
tractable, Campbell and Cochrane define the surplus consumption ratio $S_t = (c_t - X_t)/c_t$,
in terms of which the next-period deflator can be rewritten as
\[ m_{t+1} = e^{-\delta} \left(\frac{c_{t+1}}{c_t}\right)^{-\gamma} \left(\frac{S_{t+1}}{S_t}\right)^{-\gamma} = \exp\left\{ -\delta - \gamma \ln\left(\frac{c_{t+1}}{c_t}\right) - \gamma \ln\left(\frac{S_{t+1}}{S_t}\right) \right\}. \]
It is assumed that both consumption growth and changes in the surplus consumption ratio are
conditionally lognormally distributed with
\[ \ln\left(\frac{c_{t+1}}{c_t}\right) = g + \nu_{t+1}, \]
\[ \ln\left(\frac{S_{t+1}}{S_t}\right) = (1 - \phi)\left(\ln \bar{S} - \ln S_t\right) + \Lambda(S_t)\, \nu_{t+1}, \]
where $\nu_{t+1} \sim N(0, \sigma^2)$ and the function $\Lambda$ is specified below with the purpose of obtaining some
desired properties. With lognormality of $c_{t+1}$ and $S_{t+1}$, $c_{t+1} - X_{t+1} = c_{t+1} S_{t+1}$ is lognormal, and therefore the
next-period deflator will also be lognormal. Note that $\ln S_t$ will fluctuate around $\ln \bar{S}$, which we may
think of as representing business cycles, with low values of $\ln S_t$ corresponding to bad times with
relatively low consumption. Also note that shocks to consumption and to the surplus consumption ratio
are perfectly correlated, since both are driven by $\nu_{t+1}$.
The distributional assumption on the surplus consumption ratio ensures that it stays positive
so that $c_t > X_t$, as required by the utility specification. On the other hand, if you want the
benchmark $X_t$ to stay positive, the surplus consumption ratio must be smaller than 1, which is not
ensured by the lognormal distribution. According to Campbell, Lo, and MacKinlay (1997, p. 331),
it can be shown that
\[ \ln X_{t+1} \approx \ln(1 - \bar{S}) + \frac{g}{1 - \phi} + (1 - \phi) \sum_{j=0}^{\infty} \phi^j \ln c_{t-j} \]
so that the benchmark is related to a weighted average of past consumption levels.
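With the above assumptions, the next-period deflator $m_{t+1}$ is
\[ m_{t+1} = \exp\left\{ -\delta - \gamma\left[ g - (1 - \phi) \ln\frac{S_t}{\bar{S}} \right] - \gamma\left(1 + \Lambda(S_t)\right) \nu_{t+1} \right\}, \]
which is conditionally lognormally distributed with
\[ \frac{\sigma_t[m_{t+1}]}{E_t[m_{t+1}]} = \sqrt{\exp\left\{\gamma^2 \sigma^2 (1 + \Lambda(S_t))^2\right\} - 1} \approx \gamma \sigma \left(1 + \Lambda(S_t)\right). \]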
(Again the approximation is not that accurate for the parameter values necessary to match consumption
and return data.) It follows from (4.26) that the expected excess gross return on a risky
asset is
\[ E_t[R_{i,t+1}] - R^f_t \approx -\gamma \sigma \left(1 + \Lambda(S_t)\right) \rho_t[m_{t+1}, R_{i,t+1}]\, \sigma_t[R_{i,t+1}]. \tag{8.43} \]
The variation in risk aversion through $S_t$ increases the risk premium relative to the standard model,
cf. (8.30). In order to obtain the countercyclical variation in risk premia observed in data, $\Lambda$ has
to be a decreasing function of $S$.
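The continuously compounded short-term risk-free rate is
\[ \ln R^f_t = \ln\left(\frac{1}{E_t[m_{t+1}]}\right) = \delta + \gamma g - \frac{1}{2}\gamma^2 \sigma^2 - \gamma(1 - \phi) \ln\frac{S_t}{\bar{S}} - \frac{1}{2}\gamma^2 \sigma^2 \Lambda(S_t)\left(\Lambda(S_t) + 2\right). \]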
In comparison with the expression (8.29) for the risk-free rate in the simple model, the last two
terms on the right-hand side are new. Note that $S_t$ has opposite effects on the two new terms.
A low value of $S_t$ means that the marginal utility of consumption is high so that individuals will
try to borrow money for current consumption. This added demand for short-term borrowing will
drive up the equilibrium interest rate. On the other hand, a low $S_t$ will also increase the risk
aversion and, hence, precautionary savings with a lower equilibrium rate as a result. Campbell
and Cochrane fix $\Lambda(\cdot)$ and $\bar{S}$ at
\[ \Lambda(S_t) = \frac{1}{\bar{S}} \sqrt{1 - 2 \ln(S_t/\bar{S})} - 1, \qquad \bar{S} = \sigma \sqrt{\gamma/(1 - \phi)} \]
so that the equilibrium interest rate is given by the constant
\[ \ln R^f = \delta + \gamma g - \frac{1}{2}\gamma^2 \left(\frac{\sigma}{\bar{S}}\right)^2. \]
Note that $\Lambda(\cdot)$ is decreasing.$^2$
$^2$ The specified function $\Lambda(S_t)$ is only well-defined for $\ln(S_t/\bar{S}) \le \frac{1}{2}$, which is not guaranteed. We can rule out
any problems with the square root by putting $\Lambda(S_t) = 0$ whenever $S_t \ge S_{\max} \equiv e^{1/2} \bar{S}$. While $S_t$ may indeed exceed
$S_{\max}$, this will only happen very rarely.
The authors calibrate the model to historical data for consumption growth, interest rates, and
the average Sharpe ratio, which for example requires that $\gamma = 2$ and $\bar{S} = 0.057$. The calibrated
model is consistent with the observed countercyclical variation in expected returns, standard
deviations of returns, and the Sharpe ratio. With the given parameter values the model can
therefore explain the predictability in those variables. The calibrated model yields empirically
reasonable levels of the expected return and standard deviation of stock returns, but note that,
although the calibrated value of the utility parameter $\gamma$ is small, the relative risk aversion $\gamma/S_t$
is still high. With $S_t = \bar{S} = 0.057$, the risk aversion is approximately 35, and the risk aversion
is much higher in bad states where $S_t$ is low. The model can therefore not explain the equity
premium puzzle. Also observe that the value of $\bar{S}$ implies an average habit level only 5.7% lower
than current consumption, which seems to be an extreme degree of habit formation. Finally, note
that the dynamics of the business cycle variable $S_t$ is exogenously chosen to obtain the desired
properties of the model. It would be interesting to understand how such a process could arise
endogenously.
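The way the sensitivity function offsets the two new terms in the risk-free rate can be checked numerically. The sketch below is only illustrative: the values of $\delta$, $g$, $\sigma$ and $\phi$ are assumptions made for the example (only $\gamma = 2$ and $\bar{S} = 0.057$ are quoted in the text), and the function names are hypothetical.

```python
import math


def s_bar(sigma, gamma, phi):
    """Steady-state surplus consumption ratio, sigma * sqrt(gamma / (1 - phi))."""
    return sigma * math.sqrt(gamma / (1.0 - phi))


def sensitivity(S, S_bar):
    """Lambda(S) = sqrt(1 - 2 ln(S / S_bar)) / S_bar - 1, set to 0 at or above S_max."""
    if S >= math.exp(0.5) * S_bar:
        return 0.0
    return math.sqrt(1.0 - 2.0 * math.log(S / S_bar)) / S_bar - 1.0


def log_riskfree(S, delta, gamma, g, sigma, phi):
    """ln R^f_t = delta + gamma*g - gamma*(1-phi)*ln(S/S_bar) - 0.5*(gamma*sigma*(1+Lambda))**2."""
    Sb = s_bar(sigma, gamma, phi)
    lam = sensitivity(S, Sb)
    return (delta + gamma * g - gamma * (1.0 - phi) * math.log(S / Sb)
            - 0.5 * gamma ** 2 * sigma ** 2 * (1.0 + lam) ** 2)


# Assumed, purely illustrative parameter values (gamma = 2 as in the text).
delta, gamma, g, sigma, phi = 0.10, 2.0, 0.019, 0.015, 0.87
Sb = s_bar(sigma, gamma, phi)
rates = [log_riskfree(x * Sb, delta, gamma, g, sigma, phi) for x in (0.3, 0.6, 1.0, 1.4)]
print(Sb, rates)

# Relative risk aversion gamma / S_t at the calibrated steady state quoted in the text.
rra_at_steady_state = 2.0 / 0.057
print(rra_at_steady_state)
```

Under these assumptions the computed rate is the same for every value of $S_t$ below $S_{\max}$, and the implied risk aversion at $S_t = \bar{S} = 0.057$ is about 35.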
8.7.4 The Chan and Kogan model
In many models for individual consumption and portfolio choice the individual is assumed to
have constant relative risk aversion both because this seems quite reasonable and because this
simplies the analysis. If all individuals in an economy have constant relative risk aversion, you
might think that a representative individual would also have constant relative risk aversion, but
as pointed out by Chan and Kogan (2002) this is only true if they all have the same level of risk
aversion. Individuals with a low (constant) relative risk aversion will, other things equal, invest a
larger fraction of their wealth in risky assets, e.g. stocks, than individuals with high (constant)
relative risk aversion. Increasing stock prices will imply that the wealth of the relatively risk-tolerant
individuals will grow more than the wealth of comparably more risk-averse individuals.
Consequently, the aggregate risk aversion in the economy (corresponding to the risk aversion of a
representative individual) will fall. Conversely, the aggregate risk aversion will increase when stock
markets drop. This simple observation supports the assumption of Campbell and Cochrane (1999)
discussed above that the risk aversion of the representative individual varies countercyclically,
which helps in resolving asset pricing puzzles.
Let us take a closer look at the model of Chan and Kogan (2002). It is a continuous-time exchange
economy in which the aggregate endowment/consumption $Y = (Y_t)$ follows a geometric
Brownian motion
\[ dY_t = Y_t\left[\mu\,dt + \sigma\,dz_t\right], \]
where $\mu > \sigma^2/2$, $\sigma > 0$, and $z = (z_t)$ is a one-dimensional standard Brownian motion. Two
assets are traded: a risky asset which is in unit net supply and pays a continuous dividend equal
to the aggregate endowment, and an instantaneously risk-free asset (a bank account) generating a
continuously compounded short-term interest rate of $r^f_t$. The economy is populated with infinitely
lived individuals maximizing time-additive state-dependent utility given by (8.36), i.e.
\[ E\left[\int_0^\infty e^{-\delta t} u(c_t, X_t; \gamma)\,dt\right], \qquad u(c, X; \gamma) = \frac{1}{1 - \gamma}\left(\frac{c}{X}\right)^{1-\gamma}. \]
Here $X$ is an external benchmark, e.g. an index of the standard of living in the economy. Let
$x_t = \ln X_t$ and $y_t = \ln Y_t$. The dynamics of the benchmark is modeled through
\[ x_t = e^{-\kappa t} x_0 + \kappa \int_0^t e^{-\kappa(t - s)} y_s\,ds \]
so that
\[ dx_t = \kappa (y_t - x_t)\,dt. \]
The log-benchmark is a weighted average of past log-consumption. The relative log-consumption
variable $\omega_t = y_t - x_t$ will be representative of the state of the economy. A high [low] value of $\omega_t$
represents a good [bad] state in terms of aggregate consumption relative to the benchmark. Note
that
\[ d\omega_t = dy_t - dx_t = \kappa(\bar{\omega} - \omega_t)\,dt + \sigma\,dz_t, \]
where $\bar{\omega} = (\mu - \sigma^2/2)/\kappa$.
Individuals are assumed to differ with respect to their relative risk aversion $\gamma$ but to have
identical subjective time preference parameters $\delta$. Since the market is complete, an equilibrium
in the economy will be Pareto-optimal and identical to the solution of the problem of a central
planner or representative individual. Let $f(\gamma)$ denote the weight of the individuals with relative
risk aversion $\gamma$ in this problem, where we normalize so that $\int_1^\infty f(\gamma)\,d\gamma = 1$. The problem of the
central planner is to solve
\[
\sup_{\{c_t(Y_t, X_t; \gamma);\ \gamma > 1,\ t \ge 0\}} E\left[\int_0^\infty e^{-\delta t}\left(\int_1^\infty f(\gamma)\, \frac{1}{1 - \gamma}\left(\frac{c_t(Y_t, X_t; \gamma)}{X_t}\right)^{1-\gamma} d\gamma\right) dt\right]
\]
\[ \text{s.t.} \quad \int_1^\infty c_t(Y_t, X_t; \gamma)\,d\gamma \le Y_t, \quad t \ge 0. \]
In an exchange economy no intertemporal transfer of resources is possible at the aggregate level,
so the optimal value of the central planner's objective function will be $E\left[\int_0^\infty e^{-\delta t} U(Y_t, X_t)\,dt\right]$,
where
\[
U(Y_t, X_t) = \sup_{\{c_t(Y_t, X_t; \gamma);\ \gamma > 1\}} \left\{ \int_1^\infty f(\gamma)\, \frac{1}{1 - \gamma}\left(\frac{c_t(Y_t, X_t; \gamma)}{X_t}\right)^{1-\gamma} d\gamma \;\middle|\; \int_1^\infty c_t(Y_t, X_t; \gamma)\,d\gamma \le Y_t \right\}.
\tag{8.44}
\]
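The solution to this problem is characterized in the following theorem, which is the key to the
asset pricing results in this model.
Theorem 8.3 The optimal consumption allocation in the problem (8.44) is
\[ c_t(Y_t, X_t; \gamma) = \alpha_t(\omega_t; \gamma)\, Y_t, \qquad \alpha_t(\omega_t; \gamma) = f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega_t) - \omega_t}, \tag{8.45} \]
where the function $h$ is implicitly defined by the identity
\[ \int_1^\infty f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega_t) - \omega_t}\,d\gamma = 1. \tag{8.46} \]
The utility function of the representative individual is
\[ U(Y_t, X_t) = \int_1^\infty \frac{1}{1 - \gamma}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1 - \gamma}{\gamma} h(\omega_t)}\,d\gamma. \tag{8.47} \]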
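Proof: If we divide the constraint in (8.44) through by $X_t$ and introduce the aggregate consumption
share $\alpha_t(Y_t, X_t; \gamma) = c_t(Y_t, X_t; \gamma)/Y_t$, the Lagrangian for the optimization problem is
\[
\begin{aligned}
L_t &= \int_1^\infty f(\gamma)\, \frac{1}{1 - \gamma}\left(\alpha_t(Y_t, X_t; \gamma)\, \frac{Y_t}{X_t}\right)^{1-\gamma} d\gamma + H_t\left(\frac{Y_t}{X_t} - \int_1^\infty \alpha_t(Y_t, X_t; \gamma)\, \frac{Y_t}{X_t}\,d\gamma\right) \\
&= \frac{Y_t}{X_t} \int_1^\infty \left\{ f(\gamma)\, \frac{1}{1 - \gamma}\, \alpha_t(Y_t, X_t; \gamma)^{1-\gamma} \left(\frac{Y_t}{X_t}\right)^{-\gamma} - H_t\, \alpha_t(Y_t, X_t; \gamma) \right\} d\gamma + H_t\, \frac{Y_t}{X_t},
\end{aligned}
\]
where $H_t$ is the Lagrange multiplier. The first-order condition for $\alpha_t$ implies that
\[ \alpha_t(Y_t, X_t; \gamma) = H_t^{-1/\gamma}\, f(\gamma)^{1/\gamma} \left(\frac{Y_t}{X_t}\right)^{-1} = f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h_t - \omega_t}, \]
where $h_t = \ln H_t$. The consumption allocated to all individuals must add up to the aggregate
consumption, which implies the condition (8.46). From the condition, it is clear that $h_t$ and
therefore $\alpha_t$ depend on $\omega_t$ but not separately on $Y_t$ and $X_t$.
The maximum of the objective function is
\[
\begin{aligned}
U(Y_t, X_t) &= \int_1^\infty f(\gamma)\, \frac{1}{1 - \gamma}\left(\alpha_t(Y_t, X_t; \gamma)\, \frac{Y_t}{X_t}\right)^{1-\gamma} d\gamma \\
&= \int_1^\infty f(\gamma)\, \frac{1}{1 - \gamma}\, f(\gamma)^{(1-\gamma)/\gamma}\, e^{-\frac{1-\gamma}{\gamma} h(\omega_t) - (1 - \gamma)\omega_t}\, e^{(1 - \gamma)\omega_t}\,d\gamma \\
&= \int_1^\infty \frac{1}{1 - \gamma}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1 - \gamma}{\gamma} h(\omega_t)}\,d\gamma,
\end{aligned}
\]
which ends the proof. □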
The optimal allocation of consumption to a given individual is a fraction of aggregate endowment,
a fraction depending on the state of the economy and the relative risk aversion of the individual.
The next lemma summarizes some properties of the function h which will be useful in the
following discussion.
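Lemma 8.1 The function $h$ defined in (8.46) is decreasing and convex with
\[ h'(\omega) = -\left(\int_1^\infty \frac{1}{\gamma}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\,d\gamma\right)^{-1} < -1. \tag{8.48} \]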
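Proof: Differentiating with respect to $\omega_t$ in (8.46), we get
\[ \int_1^\infty f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega_t) - \omega_t} \left(-\frac{1}{\gamma}\, h'(\omega_t) - 1\right) d\gamma = 0, \]
which implies that
\[ -h'(\omega_t) \int_1^\infty \frac{1}{\gamma}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega_t) - \omega_t}\,d\gamma = \int_1^\infty f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega_t) - \omega_t}\,d\gamma = 1, \]
from which the expression for $h'(\omega_t)$ follows. Since we are integrating over $\gamma \ge 1$,
\[ \int_1^\infty \frac{1}{\gamma}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega_t) - \omega_t}\,d\gamma < \int_1^\infty f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega_t) - \omega_t}\,d\gamma = 1, \]
which gives the upper bound on $h'(\omega_t)$.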
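Convexity means $h''(\omega_t) \ge 0$, and by differentiating (8.48) we see that this is true if and only if
\[ -h'(\omega) \int_1^\infty \frac{1}{\gamma^2}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\,d\gamma \ge \int_1^\infty \frac{1}{\gamma}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\,d\gamma, \]
i.e. if and only if
\[ \int_1^\infty \frac{1}{\gamma^2}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\,d\gamma \ge \left(\int_1^\infty \frac{1}{\gamma}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\,d\gamma\right)^2. \]
In order to show this we apply the Cauchy-Schwarz inequality for integrals,
\[ \left(\int_a^b F(x) G(x)\,dx\right)^2 \le \left(\int_a^b F(x)^2\,dx\right)\left(\int_a^b G(x)^2\,dx\right), \]
with $x = \gamma$, $a = 1$, $b = \infty$, and
\[ F(x) = \frac{1}{\gamma}\left(f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\right)^{1/2}, \qquad G(x) = \left(f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\right)^{1/2}. \]
By the Cauchy-Schwarz inequality and (8.46), we get exactly
\[
\begin{aligned}
\left(\int_1^\infty \frac{1}{\gamma}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\,d\gamma\right)^2 &\le \left(\int_1^\infty \frac{1}{\gamma^2}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\,d\gamma\right)\left(\int_1^\infty f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\,d\gamma\right) \\
&= \int_1^\infty \frac{1}{\gamma^2}\, f(\gamma)^{1/\gamma}\, e^{-\frac{1}{\gamma} h(\omega) - \omega}\,d\gamma
\end{aligned}
\]
as was to be shown. □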
Using (8.46) and (8.48), straightforward differentiation of the utility function (8.47) gives that
\[ U_Y(Y_t, X_t) = X_t^{-1}\, e^{h(\omega_t)}, \tag{8.49} \]
\[ U_{YY}(Y_t, X_t) = h'(\omega_t)\, Y_t^{-1}\, U_Y(Y_t, X_t). \tag{8.50} \]
The relative risk aversion of the representative individual is therefore
\[ -\frac{Y_t\, U_{YY}(Y_t, X_t)}{U_Y(Y_t, X_t)} = -h'(\omega_t). \tag{8.51} \]
It follows from Lemma 8.1 that this relative risk aversion is greater than 1 and decreasing in the
state variable $\omega_t$. The intuition is that the individuals with low relative risk aversion will, other
things equal, invest more in the risky asset (and their wealth will therefore fluctuate more) than
individuals with high relative risk aversion. In good states a larger fraction of the aggregate
resources will be held by individuals with low risk aversion than in bad states. The relative risk
aversion of the representative individual, which is some sort of wealth-weighted average of the
individual risk aversions, will therefore be lower in good states than in bad states. Aggregate
relative risk aversion is countercyclical.
We can obtain the equilibrium risk-free rate and the market price of risk from Theorem 8.2. In
the present model the dynamics of $X_t = e^{x_t}$ will be
\[ dX_t = \kappa X_t (y_t - x_t)\,dt = \kappa X_t\, \omega_t\,dt, \tag{8.52} \]
which is locally insensitive to shocks, corresponding to $\sigma_{Xt} = 0$. This will simplify the formulas
for the risk-free rate and the Sharpe ratio. Moreover,
\[ U_{YX}(Y_t, X_t) = -X_t^{-1}\left(1 + h'(\omega_t)\right) U_Y(Y_t, X_t), \tag{8.53} \]
\[ Y_t^2\, U_{YYY}(Y_t, X_t) = \left(h''(\omega_t) + h'(\omega_t)^2 - h'(\omega_t)\right) U_Y(Y_t, X_t). \tag{8.54} \]
We arrive at the following conclusion:
Theorem 8.4 In the Chan and Kogan model the risk-free short-term interest rate is
\[ r^f_t = \delta - \kappa\, h'(\omega_t)(\bar{\omega} - \omega_t) + \kappa\, \omega_t - \frac{1}{2}\sigma^2 \left(h''(\omega_t) + h'(\omega_t)^2\right) \tag{8.55} \]
and the Sharpe ratio of a risky asset is
\[ \frac{\mu_t + \delta_t - r^f_t}{\sigma_t} = -\sigma\, h'(\omega_t), \tag{8.56} \]
which is decreasing in the state variable $\omega_t$.
In Exercise 8.4 you are asked to provide the details. In Exercise 8.5 you are asked to show how (8.56)
follows from (8.42).
The Sharpe ratio in the model varies countercyclically. In good states of the economy the
average relative risk aversion is relatively low and the risk premium necessary for markets to clear is
therefore also low. Conversely in bad states where the average relative risk aversion is high.
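The countercyclical aggregate risk aversion can be made concrete with a discretized sketch. As an assumption made only for illustration, the continuum of types is replaced by three risk-aversion levels with equal weights, so the integrals in (8.46) and (8.48) become sums; the function names, the type grid, and the bisection bounds are all hypothetical choices.

```python
import math

# Assumed discrete stand-in for the continuum of types in the text.
GAMMAS = [2.0, 5.0, 10.0]
WEIGHTS = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]


def solve_h(omega, gammas=GAMMAS, weights=WEIGHTS, lo=-50.0, hi=50.0, tol=1e-12):
    """Solve sum_k f_k**(1/g_k) * exp(-h/g_k - omega) = 1 for h by bisection."""
    def excess(h):
        return sum(f ** (1.0 / g) * math.exp(-h / g - omega)
                   for g, f in zip(gammas, weights)) - 1.0

    # excess(h) is strictly decreasing in h, so a sign change brackets the root.
    if excess(lo) < 0.0 or excess(hi) > 0.0:
        raise ValueError("root not bracketed; widen [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def consumption_shares(omega, gammas=GAMMAS, weights=WEIGHTS):
    """Discrete analogue of the shares alpha_t(omega; gamma) in (8.45)."""
    h = solve_h(omega, gammas, weights)
    return [f ** (1.0 / g) * math.exp(-h / g - omega) for g, f in zip(gammas, weights)]


def aggregate_rra(omega, gammas=GAMMAS, weights=WEIGHTS):
    """-h'(omega) = 1 / sum_k alpha_k / gamma_k, the discrete analogue of (8.48)."""
    alphas = consumption_shares(omega, gammas, weights)
    return 1.0 / sum(a / g for a, g in zip(alphas, gammas))


for omega in (-1.0, 0.0, 1.0):
    print(omega, consumption_shares(omega), aggregate_rra(omega))
```

In this sketch the share of the least risk-averse type grows with $\omega$, so the aggregate relative risk aversion falls in good states, in line with Lemma 8.1.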
8.7.5 Epstein-Zin utility
Recall from Section 6.6 that in a discrete-time model with Epstein-Zin utility the next-period
state-price deflator is given by
\[ M_{t+1} \equiv \frac{\zeta_{t+1}}{\zeta_t} = e^{-\delta\theta} \left(\frac{c_{t+1}}{c_t}\right)^{-\theta/\psi} \left(R^w_{t+1}\right)^{\theta - 1}, \tag{8.57} \]
where $\delta$ is the subjective time preference rate, $\psi$ is the elasticity of intertemporal substitution,
and $\theta = (1 - \gamma)/(1 - \frac{1}{\psi})$, with $\gamma$ denoting the relative risk aversion. Let $g_{t+1} = \ln(c_{t+1}/c_t)$
denote the log growth rate of consumption and let
$r^w_{t+1} = \ln R^w_{t+1}$ denote the log return on the wealth of the individual, which for a representative
individual can be interpreted as the log return on a claim to aggregate consumption (in equilibrium:
aggregate dividends including labor income = aggregate consumption). We can then write the log
of the state-price deflator, $m_{t+1} = \ln M_{t+1}$, as
\[ m_{t+1} = -\delta\theta - \frac{\theta}{\psi}\, g_{t+1} + (\theta - 1)\, r^w_{t+1}. \tag{8.58} \]
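Applying the state-price deflator to the wealth portfolio implies
\[ E_t\left[\frac{\zeta_{t+1}}{\zeta_t}\, e^{r^w_{t+1}}\right] = E_t\left[e^{m_{t+1} + r^w_{t+1}}\right] = 1 \quad\Rightarrow\quad E_t\left[e^{-\delta\theta - \frac{\theta}{\psi} g_{t+1} + \theta r^w_{t+1}}\right] = 1. \tag{8.59} \]
Assume that log consumption growth $g_{t+1}$ and the log return on the wealth portfolio $r^w_{t+1}$ are
jointly normally distributed with $g_{t+1} \sim N(\mu_c, \sigma_c^2)$, $r^w_{t+1} \sim N(\mu_w, \sigma_w^2)$ and $\mathrm{Cov}_t[g_{t+1}, r^w_{t+1}] = \sigma_{cw}$.
By the use of Theorem B.2 in Appendix B we find that
\[ 1 = E_t\left[e^{-\delta\theta - \frac{\theta}{\psi} g_{t+1} + \theta r^w_{t+1}}\right] = \exp\left\{ -\delta\theta - \frac{\theta}{\psi}\mu_c + \theta\mu_w + \frac{1}{2}\frac{\theta^2}{\psi^2}\sigma_c^2 + \frac{1}{2}\theta^2\sigma_w^2 - \frac{\theta^2}{\psi}\sigma_{cw} \right\}. \]
This implies that
\[ \mu_w = \delta + \frac{1}{\psi}\mu_c - \frac{1}{2}\frac{\theta}{\psi^2}\sigma_c^2 - \frac{1}{2}\theta\sigma_w^2 + \frac{\theta}{\psi}\sigma_{cw}, \tag{8.60} \]
which we shall make use of below.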
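Applying the state-price deflator to the risk-free asset implies that the one-period risk-free log
rate of return is
\[
\begin{aligned}
r^f_t &= -\ln E_t\left[e^{m_{t+1}}\right] = -\ln E_t\left[e^{-\delta\theta - \frac{\theta}{\psi} g_{t+1} + (\theta - 1) r^w_{t+1}}\right] \\
&= \delta\theta + \frac{\theta}{\psi}\mu_c + (1 - \theta)\mu_w - \frac{1}{2}\frac{\theta^2}{\psi^2}\sigma_c^2 - \frac{1}{2}(1 - \theta)^2\sigma_w^2 - \frac{\theta}{\psi}(1 - \theta)\sigma_{cw} \\
&= \delta + \frac{1}{\psi}\mu_c - \frac{1}{2}\frac{\theta}{\psi^2}\sigma_c^2 - \frac{1}{2}(1 - \theta)\sigma_w^2,
\end{aligned}
\tag{8.61}
\]
where the last equality is due to (8.60). For the special case of power utility ($\gamma = 1/\psi$, $\theta = 1$), we
are of course back to $r^f_t = \delta + \gamma\mu_c - \frac{1}{2}\gamma^2\sigma_c^2$ as in (8.29).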
Obviously, the extension from power utility to Epstein-Zin utility has some consequences for
the risk-free rate. Not surprisingly, it is generally (the inverse of) the elasticity of intertemporal
substitution that determines how expected consumption growth $\mu_c$ affects the equilibrium risk-free
short rate. If investors expect a higher growth rate of consumption, they would want to borrow
more in order to increase consumption now instead of in the next period, where the prospective high
consumption would lead to low marginal utility. To ensure market clearing, the short rate would
have to increase. If investors have a high elasticity of intertemporal substitution, a small increase
in the short rate is sufficient. The last two terms in (8.61) can be seen as an adjustment for risk,
which now includes both consumption risk and wealth risk. If we take the stock market volatility
as a rough approximation to the volatility of the wealth portfolio, typical estimates suggest that
$\sigma_w^2$ is considerably higher than $\sigma_c^2$. Clearly the precise values of the preference parameters are
therefore very important for the magnitude of the risk adjustment. In fact, if $\theta$ is (much) larger
than 1, $-\frac{1}{2}\frac{\theta}{\psi^2}\sigma_c^2 - \frac{1}{2}(1 - \theta)\sigma_w^2$ can be positive.
In the power utility model with a high risk aversion, the average risk-free rate typically gets
too high. With Epstein-Zin utility, we can have a high risk aversion without having a very low
substitution parameter so Epstein-Zin utility can potentially help in explaining the risk-free rate
puzzle.
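The log return $r^i_{t+1}$ on an arbitrary risky claim must satisfy
\[ E_t\left[e^{m_{t+1} + r^i_{t+1}}\right] = 1 \quad\Rightarrow\quad E_t\left[e^{-\delta\theta - \frac{\theta}{\psi} g_{t+1} + (\theta - 1) r^w_{t+1} + r^i_{t+1}}\right] = 1. \tag{8.62} \]
Assuming that $g_{t+1}$, $r^w_{t+1}$, and $r^i_{t+1}$ are jointly normally distributed, we get
\[
\begin{aligned}
E_t[r^i_{t+1}] + \frac{1}{2}\mathrm{Var}_t[r^i_{t+1}] &= \delta\theta + \frac{\theta}{\psi}\mu_c + (1 - \theta)\mu_w - \frac{1}{2}\frac{\theta^2}{\psi^2}\sigma_c^2 - \frac{1}{2}(1 - \theta)^2\sigma_w^2 - \frac{\theta}{\psi}(1 - \theta)\sigma_{cw} \\
&\quad + \frac{\theta}{\psi}\mathrm{Cov}_t[r^i_{t+1}, g_{t+1}] + (1 - \theta)\mathrm{Cov}_t[r^i_{t+1}, r^w_{t+1}] \\
&= \delta + \frac{1}{\psi}\mu_c - \frac{1}{2}\frac{\theta}{\psi^2}\sigma_c^2 - \frac{1}{2}(1 - \theta)\sigma_w^2 \\
&\quad + \frac{\theta}{\psi}\mathrm{Cov}_t[r^i_{t+1}, g_{t+1}] + (1 - \theta)\mathrm{Cov}_t[r^i_{t+1}, r^w_{t+1}],
\end{aligned}
\]
again using (8.60). From (8.61) it is now clear that
\[
\begin{aligned}
E_t[r^i_{t+1}] - r^f_t + \frac{1}{2}\mathrm{Var}_t[r^i_{t+1}] &= \frac{\theta}{\psi}\mathrm{Cov}_t[r^i_{t+1}, g_{t+1}] + (1 - \theta)\mathrm{Cov}_t[r^i_{t+1}, r^w_{t+1}] \\
&= \sigma_t[r^i_{t+1}]\left(\frac{\theta}{\psi}\sigma_c\rho_{ic} + (1 - \theta)\sigma_w\rho_{iw}\right) \\
&= \sigma_t[r^i_{t+1}]\left(\frac{1 - \gamma}{\psi - 1}\sigma_c\rho_{ic} + \frac{\gamma\psi - 1}{\psi - 1}\sigma_w\rho_{iw}\right).
\end{aligned}
\tag{8.63}
\]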
For the special case of power utility ($\gamma = 1/\psi$, $\theta = 1$), the right-hand side reduces to $\gamma\,\mathrm{Cov}_t[r^i_{t+1}, g_{t+1}]$
in agreement with (8.31). For $\theta = 0$ (which requires a risk aversion of $\gamma = 1$, i.e. log utility), the risk
premium equals $\mathrm{Cov}_t[r^i_{t+1}, r^w_{t+1}]$ as in the classic CAPM. In general, the risk premium is a combination
of the asset's covariances with consumption and with wealth. As mentioned above, $\sigma_w$
may be considerably higher than $\sigma_c$. Let us assume that $\gamma > 1$. Then if $\psi > 1$, we have $\theta < 0$
and thus a high weight on $\sigma_w$. The right-hand side of (8.63) can therefore potentially be higher
with Epstein-Zin utility than with power utility. As an example, suppose in line with the numbers
used in the beginning of Section 8.5 that $\sigma_t[r^i_{t+1}] = 0.2$, $\sigma_c = 0.02$, $\rho_{ic} = 0.2$, and fix
$\gamma = 5$, $\sigma_w = 0.2$, and $\rho_{iw} = 0.2$. Then the risk premium according to the power utility model is
$5 \times 0.2 \times 0.02 \times 0.2 = 0.004 = 0.4\%$. With an elasticity of intertemporal substitution of $\psi = 1.5$ so
that $\theta = -12$, the risk premium becomes
\[ 0.2 \times \left(\frac{-12}{1.5} \times 0.02 \times 0.2 + 13 \times 0.2 \times 0.2\right) = 0.0976 = 9.76\% \]
which is not far from historical estimates. (However, with these parameters and $\delta = \mu_c = 0.02$,
the risk-free rate becomes $-22.56\%$ so not all problems are solved.) The risk premium is highly
sensitive to the value of $\psi$. If we instead use $\psi = 0.5$, we get $\theta = 4$ and a risk premium of $-1.76\%$,
which is not what we are looking for.
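The arithmetic of this example can be reproduced with a few lines of code. The sketch below simply evaluates the right-hand side of (8.63) and the last line of (8.61); the function names are hypothetical, and the inputs are the numbers of the example with $\gamma = 5$ (the value consistent with $\theta = -12$ at $\psi = 1.5$).

```python
def theta(gamma, psi):
    """theta = (1 - gamma) / (1 - 1/psi)."""
    return (1.0 - gamma) / (1.0 - 1.0 / psi)


def ez_risk_premium(gamma, psi, sigma_i, sigma_c, rho_ic, sigma_w, rho_iw):
    """Right-hand side of (8.63): E[r_i] - r_f + Var[r_i]/2."""
    th = theta(gamma, psi)
    return sigma_i * ((th / psi) * sigma_c * rho_ic + (1.0 - th) * sigma_w * rho_iw)


def ez_riskfree(delta, mu_c, gamma, psi, sigma_c, sigma_w):
    """Last line of (8.61)."""
    th = theta(gamma, psi)
    return (delta + mu_c / psi - 0.5 * (th / psi ** 2) * sigma_c ** 2
            - 0.5 * (1.0 - th) * sigma_w ** 2)


inputs = dict(sigma_i=0.2, sigma_c=0.02, rho_ic=0.2, sigma_w=0.2, rho_iw=0.2)
premium_power = ez_risk_premium(5.0, 1.0 / 5.0, **inputs)   # power utility: psi = 1/gamma
premium_high_eis = ez_risk_premium(5.0, 1.5, **inputs)
premium_low_eis = ez_risk_premium(5.0, 0.5, **inputs)
rf_high_eis = ez_riskfree(0.02, 0.02, 5.0, 1.5, 0.02, 0.2)
print(premium_power, premium_high_eis, premium_low_eis, rf_high_eis)
```

Setting $\psi = 1/\gamma$ in the same function recovers the power utility premium, which is a convenient sanity check on the implementation.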
8.7.6 Durable goods
See Lustig and van Nieuwerburgh (2005), Piazzesi, Schneider, and Tuzel (2006), Yogo (2006). See
Exercise 8.10.
8.8 Long-run risks and Epstein-Zin utility
Bansal and Yaron (2004) show that a model with a representative individual having Epstein-Zin
utility and with a long-run risk component in aggregate consumption is able to explain some
of the apparently puzzling features of consumption and return data. Both the assumptions on
the preferences and the consumption dynamics are different from the simple consumption-based
model, which is a special case. Bansal and Yaron emphasize that it is important to make both
these extensions.
Recall from Section 8.7.5 that the log of the next-period state-price deflator is
\[ m_{t+1} = -\delta\theta - \frac{\theta}{\psi}\, g_{t+1} + (\theta - 1)\, r^w_{t+1}, \tag{8.58} \]
where $g_{t+1} = \ln(c_{t+1}/c_t)$ is log consumption growth and $r^w_{t+1}$ is the log return on the wealth
portfolio.
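8.8.1 A model with long-run risks
We now take the following specific model for consumption dynamics as suggested by Bansal and
Yaron (2004):
\[
\begin{aligned}
g_{t+1} &= \mu + x_t + \sigma_t\, \eta_{t+1}, \\
x_{t+1} &= \rho\, x_t + \varphi_e\, \sigma_t\, e_{t+1}, \\
\sigma^2_{t+1} &= \sigma^2 + \nu(\sigma_t^2 - \sigma^2) + \sigma_v\, v_{t+1}.
\end{aligned}
\tag{8.64}
\]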
The shocks $\eta_{t+1}$, $e_{t+1}$, $v_{t+1}$ are assumed independent and $N(0, 1)$-distributed for all $t$. The
independence assumption simplifies notation and some of the computations in the following, but the
shocks might very well be correlated and that may change the asset pricing conclusions of the
model.
The process $(x_t)$ drives variations in conditional expectations of consumption and dividend
growth, while the process $(\sigma_t^2)$ captures time-varying economic uncertainty. Note that the dynamics
of these processes can be rewritten as
\[
\begin{aligned}
x_{t+1} - x_t &= (1 - \rho)(0 - x_t) + \varphi_e\, \sigma_t\, e_{t+1}, \\
\sigma^2_{t+1} - \sigma^2_t &= (1 - \nu)(\sigma^2 - \sigma_t^2) + \sigma_v\, v_{t+1},
\end{aligned}
\]
from which it can be seen that the $x_t$ variable is fluctuating around 0 and $\sigma_t^2$ is fluctuating around
$\sigma^2$. (Unfortunately, with the proposed dynamics, $\sigma_t^2$ is not guaranteed to stay positive.) We will
assume $\rho < 1$ and $\nu < 1$. In the simple consumption-based model, the expected consumption
growth rate is assumed constant so that the $x_t$ variable is absent (or constant and equal to 0) and
$\sigma_t^2$ is constant and equal to $\sigma^2$. Bansal and Yaron (2004) report an empirical estimate of $\rho$ equal to
0.979, which means that the $x_t$ variable is varying very slowly over time. Formulated differently,
the speed of mean reversion, $1 - \rho$, is very close to zero. Variations in $x$ only matter for very
long periods, hence the name long-run risk. Empirical evidence for a long-run risk component
in aggregate consumption is also provided by, e.g., Hansen, Heaton, and Li (2008). Bansal and
Yaron also present empirical support for variations in the volatility of consumption growth, which
is captured by the $\sigma_t$ of the above model. They also find a high persistence in that process with
an estimate of $\nu = 0.987$.
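To get a feel for how slowly these state variables move, the dynamics (8.64) can be simulated. The sketch below is illustrative only: the persistence parameters 0.979 and 0.987 are the estimates quoted above, whereas the values of $\mu$, $\sigma$, $\varphi_e$ and $\sigma_v$ are assumed magnitudes chosen for the example, and the small floor on the variance is an added assumption that is not part of (8.64).

```python
import math
import random


def simulate_long_run_risk(n, mu, rho, phi_e, sigma2_bar, nu, sigma_v, seed=0):
    """Simulate (8.64); returns lists of g_{t+1}, x_t and sigma_t^2 of length n."""
    rng = random.Random(seed)
    x, s2 = 0.0, sigma2_bar
    g_path, x_path, s2_path = [], [], []
    for _ in range(n):
        # (8.64) does not keep sigma_t^2 positive; the floor is an added assumption.
        s2 = max(s2, 1e-12)
        sig = math.sqrt(s2)
        eta, e, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x_path.append(x)
        s2_path.append(s2)
        g_path.append(mu + x + sig * eta)
        x = rho * x + phi_e * sig * e
        s2 = sigma2_bar + nu * (s2 - sigma2_bar) + sigma_v * v
    return g_path, x_path, s2_path


def half_life(persistence):
    """Number of periods until a shock to an AR(1) variable has decayed by half."""
    return math.log(0.5) / math.log(persistence)


g, x, s2 = simulate_long_run_risk(
    n=1200, mu=0.0015, rho=0.979, phi_e=0.044,
    sigma2_bar=0.0078 ** 2, nu=0.987, sigma_v=0.23e-5,
)
print(half_life(0.979), half_life(0.987))
print(sum(g) / len(g))
```

With the quoted persistence estimates, a shock to $x_t$ takes roughly 33 periods to decay by half and a shock to $\sigma_t^2$ roughly 53 periods.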
Turning to the market portfolio, assume that the log-growth rate of its dividends is given by
\[ g_{d,t+1} = \mu_d + \varphi\, x_t + \varphi_d\, \sigma_t \left(\xi\, \eta_{t+1} + \sqrt{1 - \xi^2}\, u_{t+1}\right), \tag{8.65} \]
where $u_{t+1}$ is another $N(0, 1)$ shock independent of the other shocks in (8.64), and where $\varphi, \varphi_d > 0$
and $\xi \in [-1, 1]$ are additional constants. Then
\[ \mathrm{Var}_t[g_{d,t+1}] = \varphi_d^2\, \sigma_t^2, \qquad \mathrm{Corr}_t[g_{t+1}, g_{d,t+1}] = \xi. \]
In the discussion of the simple consumption-based model earlier in this chapter, we have implicitly
assumed that the market portfolio was a claim to aggregate consumption so that $g_{d,t+1} = g_{t+1}$ in
all periods. In the above model this requires that $\mu_d = \mu$, $\varphi = \varphi_d = 1$, and $\xi = 1$. In reality the
dividends of the stock market portfolio fluctuate much more than aggregate consumption ($\varphi_d > 1$)
and are not perfectly correlated with aggregate consumption ($\xi < 1$). Bansal and Yaron (2004)
assume the above dividend growth model with $\xi = 0$, but we will allow for correlation in the
following. Following Abel (1999), Bansal and Yaron (2004) use $\varphi = 3$ so that expected dividend
growth is considerably more sensitive to the persistent factor than expected consumption growth,
which can be explained by the leverage of the companies issuing the stocks.
The state prices and returns in the model will be driven by $x_t$ and $\sigma_t^2$. As we have seen earlier,
for example in Section 8.7.5, we can obtain closed-form expressions for the risk-free rate and
expected excess returns when the log state-price deflator and the log returns are jointly normally
distributed. The driving processes $(x_t)$, $(\sigma_t^2)$ are Gaussian processes so if the log state-price deflator
and the log returns were linearly related to these variables, they would also be Gaussian. Such
a linear relationship is not exact, however, but we will establish and use an approximate linear
relationship, which will then allow us to obtain closed-form expressions for the risk-free return,
the expected excess return on the market portfolio, etc. The approximation is described in the
following subsection.
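8.8.2 The Campbell-Shiller linearization
Campbell and Shiller (1988) introduce a linear, approximate relation between log returns, log
dividends, and log prices. Write the gross return on an asset as
\[
R_{t+1} = \frac{P_{t+1} + D_{t+1}}{P_t} = \frac{\frac{P_{t+1}}{D_t} + \frac{D_{t+1}}{D_t}}{\frac{P_t}{D_t}} = \frac{\frac{P_{t+1}}{D_{t+1}}\frac{D_{t+1}}{D_t} + \frac{D_{t+1}}{D_t}}{\frac{P_t}{D_t}} = \frac{e^{z_{t+1} + \Delta d_{t+1}} + e^{\Delta d_{t+1}}}{e^{z_t}} = \frac{e^{\Delta d_{t+1}}\left(e^{z_{t+1}} + 1\right)}{e^{z_t}},
\]
where $d_t = \ln D_t$, $p_t = \ln P_t$, $z_t = p_t - d_t = \ln\frac{P_t}{D_t}$ and $\Delta d_{t+1} = d_{t+1} - d_t = \ln\frac{D_{t+1}}{D_t}$. Then
\[ \ln R_{t+1} = \ln\left(e^{\Delta d_{t+1}}\left(1 + e^{z_{t+1}}\right)\right) - z_t = -z_t + \Delta d_{t+1} + \ln\left(1 + e^{z_{t+1}}\right) \]
and a first-order Taylor approximation of $\ln(1 + e^{z_{t+1}})$ around $z_{t+1} = \bar{z}$ gives
\[ r_{t+1} = \ln R_{t+1} \approx -z_t + \Delta d_{t+1} + \ln\left(1 + e^{\bar{z}}\right) + \frac{e^{\bar{z}}}{1 + e^{\bar{z}}}\left(z_{t+1} - \bar{z}\right). \]
Letting $\kappa_1 = \frac{e^{\bar{z}}}{1 + e^{\bar{z}}}$ and $\kappa_0 = \ln(1 + e^{\bar{z}}) - \kappa_1 \bar{z}$, we obtain
\[ r_{t+1} \approx \kappa_0 + \kappa_1 z_{t+1} - z_t + \Delta d_{t+1}. \tag{8.66} \]
For $\bar{z}$, it is natural to use the average value of the log price-dividend ratio of the asset.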
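The quality of the linearization (8.66) is easy to inspect numerically. The following sketch compares the exact log return with its approximation; the expansion point, corresponding to a price-dividend ratio of 25, and the other inputs are assumed purely for illustration, and the function names are hypothetical.

```python
import math


def campbell_shiller_constants(z_bar):
    """kappa_1 = e^z / (1 + e^z) and kappa_0 = ln(1 + e^z) - kappa_1 * z at z = z_bar."""
    kappa1 = math.exp(z_bar) / (1.0 + math.exp(z_bar))
    kappa0 = math.log(1.0 + math.exp(z_bar)) - kappa1 * z_bar
    return kappa0, kappa1


def exact_log_return(z_t, z_next, dd_next):
    """ln R_{t+1} = -z_t + Delta d_{t+1} + ln(1 + e^{z_{t+1}})."""
    return -z_t + dd_next + math.log(1.0 + math.exp(z_next))


def approx_log_return(z_t, z_next, dd_next, z_bar):
    """Right-hand side of (8.66)."""
    kappa0, kappa1 = campbell_shiller_constants(z_bar)
    return kappa0 + kappa1 * z_next - z_t + dd_next


z_bar = math.log(25.0)          # assumed average log price-dividend ratio
kappa0, kappa1 = campbell_shiller_constants(z_bar)
exact = exact_log_return(z_bar, z_bar + 0.1, 0.02)
approx = approx_log_return(z_bar, z_bar + 0.1, 0.02, z_bar)
print(kappa0, kappa1, exact, approx, exact - approx)
```

Because $\ln(1 + e^z)$ is convex, the linear approximation never exceeds the exact log return, and the error is of second order in the distance of $z_{t+1}$ from $\bar{z}$.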
For an asset paying a dividend identical to the consumption stream, we can replace $\Delta d_{t+1}$ by
the log-growth rate of consumption, $g_{t+1} = \ln(C_{t+1}/C_t)$. In this case $z_t$ is to be interpreted as the
log price-consumption ratio, and we have
\[ r^w_{t+1} \approx \kappa_0 + \kappa_1 z_{t+1} - z_t + g_{t+1}. \tag{8.67} \]
Substituting this approximation into (8.59), we obtain
\[ E_t\left[\exp\left\{\theta\left[-\delta + \left(1 - \tfrac{1}{\psi}\right) g_{t+1} + \kappa_0 + \kappa_1 z_{t+1} - z_t\right]\right\}\right] = 1. \tag{8.68} \]
For the market portfolio paying aggregate dividends, the analogous approximation of the log
return is
\[ r^m_{t+1} \approx \kappa^m_0 + \kappa^m_1 z^m_{t+1} - z^m_t + g_{d,t+1}, \tag{8.69} \]
where $z^m_t$ is the log price-dividend ratio and $g_{d,t+1}$ is the log-growth rate of aggregate dividends.
Substituting (8.67) and (8.69) into (8.62), we obtain
\[ E_t\left[\exp\left\{-\delta\theta + \left[\theta\left(1 - \tfrac{1}{\psi}\right) - 1\right] g_{t+1} + (\theta - 1)\left(\kappa_0 + \kappa_1 z_{t+1} - z_t\right) + \kappa^m_0 + \kappa^m_1 z^m_{t+1} - z^m_t + g_{d,t+1}\right\}\right] = 1, \tag{8.70} \]
which we shall apply below.
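8.8.3 The state-price deflator
Substituting the specific model equations into (8.68), we get
\[ E_t\left[\exp\left\{\theta\left[-\delta + \left(1 - \tfrac{1}{\psi}\right)(\mu + x_t + \sigma_t \eta_{t+1}) + \kappa_0 + \kappa_1 z_{t+1} - z_t\right]\right\}\right] = 1, \tag{8.71} \]
which has to be satisfied (approximately, at least) for all possible values of $x_t$ and $\sigma_t$ at any point
in time $t$. Conjecture a solution for the log price-consumption ratio of the form
\[ z_t = A_0 + A_1 x_t + A_2 \sigma_t^2 \tag{8.72} \]
for all $t$. Substituting that and the equations for $x_{t+1}$ and $\sigma^2_{t+1}$ from (8.64) into the previous
equation and then collecting terms yield
\[
\begin{aligned}
1 &= \exp\left\{\theta\left(q + \left[1 - \tfrac{1}{\psi} - A_1(1 - \kappa_1\rho)\right] x_t + A_2(\kappa_1\nu - 1)\sigma_t^2\right)\right\} E_t\left[e^{\theta\left(1 - \frac{1}{\psi}\right)\sigma_t\eta_{t+1} + \theta\kappa_1\left[A_1\varphi_e\sigma_t e_{t+1} + A_2\sigma_v v_{t+1}\right]}\right] \\
&= \exp\left\{\theta\left(q + \left[1 - \tfrac{1}{\psi} - A_1(1 - \kappa_1\rho)\right] x_t - A_2(1 - \kappa_1\nu)\sigma_t^2\right)\right\} \exp\left\{\frac{1}{2}\theta^2\left[\left(\left(1 - \tfrac{1}{\psi}\right)^2 + \kappa_1^2 A_1^2 \varphi_e^2\right)\sigma_t^2 + \kappa_1^2 A_2^2 \sigma_v^2\right]\right\} \\
&= \exp\left\{\theta\left(q + \frac{1}{2}\theta\kappa_1^2 A_2^2 \sigma_v^2 + \left[1 - \tfrac{1}{\psi} - A_1(1 - \kappa_1\rho)\right] x_t + \left[\frac{1}{2}\theta\left(\left(1 - \tfrac{1}{\psi}\right)^2 + \kappa_1^2 A_1^2 \varphi_e^2\right) - A_2(1 - \kappa_1\nu)\right]\sigma_t^2\right)\right\},
\end{aligned}
\]
where $q = -\delta + \left(1 - \frac{1}{\psi}\right)\mu + \kappa_0 + A_0(\kappa_1 - 1) + \kappa_1 A_2 \sigma^2 (1 - \nu)$. Since this is to be satisfied for all
values of $x_t$ and $\sigma_t^2$, we need the coefficients of $x_t$ and $\sigma_t^2$ to be zero, i.e.
\[ A_1 = \frac{1 - \frac{1}{\psi}}{1 - \kappa_1\rho}, \tag{8.73} \]
\[ A_2 = \frac{\theta}{2(1 - \kappa_1\nu)}\left[\left(1 - \tfrac{1}{\psi}\right)^2 + \kappa_1^2 A_1^2 \varphi_e^2\right] = \frac{\theta\left(1 - \frac{1}{\psi}\right)^2}{2(1 - \kappa_1\nu)}\left[1 + \left(\frac{\kappa_1\varphi_e}{1 - \kappa_1\rho}\right)^2\right]. \tag{8.74} \]
The constant $A_0$, which is included in $q$, is then determined from the condition $1 = \exp\left\{\theta\left(q + \frac{1}{2}\theta\kappa_1^2 A_2^2 \sigma_v^2\right)\right\}$,
i.e. $q + \frac{1}{2}\theta\kappa_1^2 A_2^2 \sigma_v^2 = 0$.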
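The coefficients in (8.73) and (8.74) are straightforward to evaluate. The sketch below does so and verifies numerically that the coefficient of $\sigma_t^2$ in the exponent vanishes at the solution. The persistence values 0.979 and 0.987 are the estimates quoted in Section 8.8.1; the values of $\gamma$, $\psi$, $\kappa_1$ and $\varphi_e$ are assumptions made only for this illustration, and the function names are hypothetical.

```python
def lrr_coefficients(gamma, psi, kappa1, rho, nu, phi_e):
    """Return (theta, A_1, A_2) from (8.73) and (8.74)."""
    theta = (1.0 - gamma) / (1.0 - 1.0 / psi)
    a1 = (1.0 - 1.0 / psi) / (1.0 - kappa1 * rho)
    a2 = (theta * (1.0 - 1.0 / psi) ** 2 / (2.0 * (1.0 - kappa1 * nu))
          * (1.0 + (kappa1 * phi_e / (1.0 - kappa1 * rho)) ** 2))
    return theta, a1, a2


def sigma2_coefficient(theta, a1, a2, psi, kappa1, nu, phi_e):
    """Coefficient of sigma_t^2 in the exponent; it should be zero at the solution."""
    return (0.5 * theta * ((1.0 - 1.0 / psi) ** 2 + kappa1 ** 2 * a1 ** 2 * phi_e ** 2)
            - a2 * (1.0 - kappa1 * nu))


# Assumed illustrative values for gamma, psi, kappa1 and phi_e.
params = dict(psi=1.5, kappa1=0.997, nu=0.987, phi_e=0.044)
theta_, a1, a2 = lrr_coefficients(gamma=10.0, rho=0.979, **params)
residual = sigma2_coefficient(theta_, a1, a2, **params)
print(theta_, a1, a2, residual)
```

With $\psi > 1$ the price-consumption ratio loads positively on expected growth ($A_1 > 0$), and with $\theta < 0$ it loads negatively on consumption variance ($A_2 < 0$).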
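Combining (8.58) and (8.67), the log of the state-price deflator can be written as
\[
\begin{aligned}
m_{t+1} &= -\delta + (\theta - 1) q - \frac{\mu}{\psi} - \frac{1}{\psi} x_t + (\theta - 1) A_2 (\kappa_1\nu - 1)\sigma_t^2 \\
&\quad + \left(\theta - \frac{\theta}{\psi} - 1\right)\sigma_t\eta_{t+1} + (\theta - 1)\kappa_1 A_1 \varphi_e \sigma_t e_{t+1} + (\theta - 1)\kappa_1 A_2 \sigma_v v_{t+1}.
\end{aligned}
\tag{8.75}
\]
Note that $\theta - \frac{\theta}{\psi} - 1 = \theta\left(1 - \frac{1}{\psi}\right) - 1 = 1 - \gamma - 1 = -\gamma$. Define
\[ \lambda_\eta = \gamma, \qquad \lambda_e = (1 - \theta)\kappa_1 A_1 \varphi_e, \qquad \lambda_v = (1 - \theta)\kappa_1 A_2. \tag{8.76} \]
Then, from the distributional assumptions on the shocks, we conclude that we can rewrite $m_{t+1}$ as
\[ m_{t+1} = E_t[m_{t+1}] - \lambda_\eta \sigma_t \eta_{t+1} - \lambda_e \sigma_t e_{t+1} - \lambda_v \sigma_v v_{t+1}, \tag{8.77} \]
and that $m_{t+1}$ is conditionally normally distributed (so that the state-price deflator $M_{t+1} = e^{m_{t+1}}$ is conditionally lognormal) with
\[ E_t[m_{t+1}] = -\delta + (\theta - 1) q - \frac{\mu}{\psi} - \frac{1}{\psi} x_t + (\theta - 1) A_2 (\kappa_1\nu - 1)\sigma_t^2, \tag{8.78} \]
\[ \mathrm{Var}_t[m_{t+1}] = \left(\lambda_\eta^2 + \lambda_e^2\right)\sigma_t^2 + \lambda_v^2 \sigma_v^2. \tag{8.79} \]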
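8.8.4 The risk-free rate
The risk-free log interest rate becomes
\[
\begin{aligned}
r^f_t &= -\ln E_t\left[e^{m_{t+1}}\right] = -E_t[m_{t+1}] - \frac{1}{2}\mathrm{Var}_t[m_{t+1}] \\
&= \delta + (1 - \theta) q + \frac{\mu + x_t}{\psi} - \frac{1}{2}\lambda_v^2 \sigma_v^2 - \left[(1 - \theta) A_2 (1 - \kappa_1\nu) + \frac{1}{2}\gamma^2 + \frac{1}{2}\lambda_e^2\right]\sigma_t^2.
\end{aligned}
\tag{8.80}
\]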
Let us first look at some special cases. For the standard dynamics of consumption growth, we
have $x_t \equiv 0$, $\sigma_t^2 \equiv \sigma^2$ and $\sigma_v^2 = 0$ so that $q = 0$. Moreover, $\varphi_e = 0$ and thus $\lambda_e = 0$, and
$A_2(1 - \kappa_1\nu) = \theta\left(1 - \frac{1}{\psi}\right)^2/2$. Hence, we get the constant risk-free rate
\[ r^f_t = \delta + \frac{\mu}{\psi} - \frac{1}{2}\sigma^2\left[\theta(1 - \theta)\left(1 - \tfrac{1}{\psi}\right)^2 + \gamma^2\right] = \delta + \frac{\mu}{\psi} - \frac{1}{2}\left(1 - \theta + \frac{\theta}{\psi^2}\right)\sigma^2. \tag{8.81} \]
This is consistent with (8.61) because in this version of the model $z_t$ in (8.72) is constant and,
hence, $r^w_{t+1}$ equals a constant plus log consumption growth. Consequently, the variance of the
wealth portfolio equals the variance of consumption growth, $\sigma_w^2 = \sigma^2$, and if you apply that
in (8.61), you get the above expression for the risk-free rate.
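The agreement between the two forms of the variance term can be verified directly. The short derivation below is a sketch using only the definition $\theta = (1 - \gamma)/(1 - \frac{1}{\psi})$, which gives $\theta\left(1 - \frac{1}{\psi}\right) = 1 - \gamma$ and $(1 - \theta)\left(1 - \frac{1}{\psi}\right) = \gamma - \frac{1}{\psi}$:

```latex
\begin{align*}
\theta(1-\theta)\Bigl(1-\tfrac{1}{\psi}\Bigr)^{2} + \gamma^{2}
  &= (1-\gamma)\Bigl(\gamma-\tfrac{1}{\psi}\Bigr) + \gamma^{2}
   = \gamma + \frac{\gamma-1}{\psi}, \\
1 - \theta + \frac{\theta}{\psi^{2}}
  &= 1 - \theta\Bigl(1-\tfrac{1}{\psi}\Bigr)\Bigl(1+\tfrac{1}{\psi}\Bigr)
   = 1 + (\gamma-1)\Bigl(1+\tfrac{1}{\psi}\Bigr)
   = \gamma + \frac{\gamma-1}{\psi}.
\end{align*}
```

Both expressions therefore equal $\gamma + (\gamma - 1)/\psi$, which reduces to $\gamma^2$ under power utility ($\psi = 1/\gamma$).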
If we use the long-run risk model for consumption dynamics but specialize to power utility, the
risk-free rate will be $r^f_t = \delta + \gamma(\mu + x_t) - \frac{1}{2}\gamma^2\sigma_t^2$. The level of the risk-free rate will be similar
to the case with the simple model for consumption dynamics, but the fluctuations in expected
consumption growth and consumption volatility generate fluctuations in the risk-free rate.
For the general expression for the risk-free rate (8.80), the level is still decreasing in the elasticity
of intertemporal substitution. According to Bansal and Yaron (2004), the unconditional variance
of the risk-free rate is
\[ \mathrm{Var}[r^f_t] \approx \left(\frac{1}{\psi}\right)^2 \mathrm{Var}[x_t], \]
which is again decreasing in the elasticity of intertemporal substitution. In the long-run risk model
with power utility briefly discussed above, we would similarly have $\mathrm{Var}[r^f_t] \approx \gamma^2\, \mathrm{Var}[x_t]$. With
power utility, a high risk aversion is needed to match the equity premium, but that may easily
lead to an unrealistically high variance of the risk-free rate. With Epstein-Zin utility and a fairly high
elasticity of intertemporal substitution, we can keep a relatively low and relatively stable risk-free
rate.
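8.8.5 The premium on risky assets
From Section 4.2.2, we have that for any normally distributed log-return $r_{i,t+1}$,
\[
\begin{aligned}
E_t[r_{i,t+1}] - r^f_t &= -\mathrm{Cov}_t[m_{t+1}, r_{i,t+1}] - \frac{1}{2}\mathrm{Var}_t[r_{i,t+1}] \\
&= \lambda_\eta \sigma_t\, \mathrm{Cov}_t[\eta_{t+1}, r_{i,t+1}] + \lambda_e \sigma_t\, \mathrm{Cov}_t[e_{t+1}, r_{i,t+1}] \\
&\quad + \lambda_v \sigma_v\, \mathrm{Cov}_t[v_{t+1}, r_{i,t+1}] - \frac{1}{2}\mathrm{Var}_t[r_{i,t+1}].
\end{aligned}
\tag{8.82}
\]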
If we substitute the modeling assumptions (8.64) and (8.65) together with (8.72) into (8.70), we find that
1 = E
t
_
exp
_
q + (
1
)x
t
+ ( 1)A
2
(
1
1)
2
t
+
m
1
z
m
t+1
z
m
t
_
exp
_
(
d
)
t
t+1
+ ( 1)
1
A
2
v
v
t+1
+
d
_
1
2
t
u
t+1
+ ( 1)
1
A
1
t
e
t+1
_
_
,
(8.83)
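where $\tilde q = -\delta + (\theta-1)q - \mu/\psi + \kappa_0^m + \mu_d$. Now conjecture that the log price-dividend ratio is of the form
$$ z_t^m = B_0 + B_1 x_t + B_2\sigma_t^2, $$
which implies that
$$ \begin{aligned} \kappa_1^m z_{t+1}^m - z_t^m &= (\kappa_1^m - 1)B_0 + \kappa_1^m B_2\sigma^2(1-\nu) + B_1(\kappa_1^m\rho - 1)x_t + B_2(\kappa_1^m\nu - 1)\sigma_t^2 \\ &\quad + \kappa_1^m B_1\phi_e\sigma_t e_{t+1} + \kappa_1^m B_2\sigma_v v_{t+1}. \end{aligned} \tag{8.84} $$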
If we substitute that into (8.83), we obtain
1 = E
t
_
exp
_
q + (
m
1
1)B
0
+
m
1
B
2
2
(1 ) +
1
2
[( 1)
1
A
2
+
m
1
B
2
]
2
2
v
+
_
1
+B
1
(
m
1
1)
_
x
t
_
exp
__
A
2
( 1)(
1
1) +B
2
(
m
1
1) +
1
2
d
+
1
2
2
d
+
1
2
(
m
e
e
)
2
_
2
t
_
_
,
(8.85)
where $\beta_e^m = \kappa_1^m B_1\phi_e$ and $\lambda_e = (1-\theta)\kappa_1 A_1\phi_e$. Since this must be satisfied for any values of $x_t$ and $\sigma_t^2$, we need the coefficients of those terms to be zero and, consequently, the constant part of the exponent to be zero as well, i.e.
B
1
=
1
1
1,m
, B
2
=
(1 )A
2
(1
1
) +
1
2
d
+
1
2
2
d
+
1
2
(
m
e
e
)
2
1
1,m
,
and $B_0$ is such that $\tilde q + (\kappa_1^m - 1)B_0 + \kappa_1^m B_2\sigma^2(1-\nu) + \frac{1}{2}\left[(\theta-1)\kappa_1 A_2 + \kappa_1^m B_2\right]^2\sigma_v^2 = 0$. This confirms the conjectured form of $z_t^m$.
If we use (8.84) in the Campbell-Shiller approximation (8.69) and focus on the shock terms, we obtain
r
m
t+1
= E
t
[r
m
t+1
] +
m
e
t
e
t+1
+
m
v
v
v
t+1
+
m
t+1
+
d
_
1
2
t
u
t+1
, (8.86)
where
m
v
=
m
1
B
2
and
m
=
d
. Substituting this into (8.82) we nd
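$$ \mathrm{E}_t[r_{t+1}^m] - r_t^f = \gamma\beta_\eta^m\sigma_t^2 + \beta_e^m\lambda_e\sigma_t^2 + \beta_v^m\lambda_v\sigma_v^2 - \frac{1}{2}\mathrm{Var}_t[r_{t+1}^m], \tag{8.87} $$
and
$$ \mathrm{Var}_t[r_{t+1}^m] = \left( (\beta_e^m)^2 + \phi_d^2 \right)\sigma_t^2 + (\beta_v^m)^2\sigma_v^2. \tag{8.88} $$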
Let us look at some special cases. If we keep the Epstein-Zin preferences but specialize to the consumption dynamics of the simple model ($\sigma_t^2 \equiv \sigma^2$, $x_t \equiv 0$), we get
E
t
[r
m
t+1
] r
f
t
=
d
1
2
Var
t
[r
m
t+1
], Var
t
[r
m
t+1
] =
2
d
2
(8.89)
and
d
= = 1 if we assume the dividends are identical to aggregate consumption. This is
consistent with (8.63) because in this special version of the model r
w
t+1
equals a constant plus
g
t+1
= +
t+1
and r
m
t+1
= E
t
[r
m
t+1
] +
d
t+1
+
d
_
1
2
u
t+1
so that
Cov
t
[r
m
t+1
, g
t+1
] = Cov
t
[r
m
t+1
, r
w
t+1
] =
d
2
.
Since
2
t
1
2
Var
t
[r
m
t+1
].
Therefore, under a power utility assumption, introducing a long-run risk element in consumption dynamics does not change the level of excess expected returns on the market portfolio. We need the combination of Epstein-Zin utility and long-run risks in order to affect excess expected returns.
In the general expression (8.87) for the market risk premium, the first term (the standard term) is a premium for short-run risk, the second term is a premium for long-run risk, while the third term is a premium for consumption volatility risk. What can we say about the sign of the two latter premia? First consider $\lambda_e = (1-\theta)\kappa_1 A_1\phi_e$. We have $0 < \kappa_1 < 1$ and $\phi_e > 0$. If we assume that $1 > 1/\psi$, we will have $A_1 > 0$. With $\gamma > 1/\psi$, we have $\theta < 1$, and thus all in all we get $\lambda_e > 0$. Next consider $\beta_e^m = \kappa_1^m B_1\phi_e$. Again $\kappa_1^m$ and $\phi_e$ must be positive. If the parameter $\phi$ is greater than $1/\psi$, $B_1$ will be positive, and hence $\beta_e^m$ is positive. Under such parameter values, the long-run risk premium $\beta_e^m\lambda_e\sigma_t^2$ is going to be positive. Also note that $\beta_e^m$ and thus the long-run risk premium are increasing in $\rho$ and $\phi$, i.e. the risk premium increases with the persistence of the long-run growth component and the sensitivity of dividends to that long-run component. What about the volatility risk premium? Again, if $\gamma > 1/\psi$ so that $\theta < 1$, we will have $B_2 > 0$ and hence $\beta_v^m = \kappa_1^m B_2 > 0$. Also note that $A_2 > 0$ so, for $\gamma > 1/\psi$, we have $\lambda_v = (1-\theta)\kappa_1 A_2 > 0$. Under these parameter conditions, the volatility risk premium $\beta_v^m\lambda_v\sigma_v^2$ will be positive.
From the preceding discussion it is clear that the long-run risk and volatility risk features of the model will add to the market risk premium if $\gamma > 1 > 1/\psi$. In particular, the elasticity of intertemporal substitution $\psi$ has to exceed 1. As discussed in Section 5.7.3, there is some debate about empirically realistic values of $\psi$, where some authors find $\psi$ greater than one and others find $\psi$ smaller than one.
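The sign argument for the long-run risk premium can be illustrated numerically. The sketch below assumes the standard Bansal-Yaron coefficient $A_1 = (1-1/\psi)/(1-\kappa_1\rho)$ for the wealth-consumption ratio and $B_1 = (\phi - 1/\psi)/(1-\kappa_1^m\rho)$ for the price-dividend ratio; the function name and all parameter values (including the linearization constants) are hypothetical.

```python
def long_run_risk_terms(gamma, psi, rho, phi, phi_e, kappa1, kappa1_m):
    """Return (lambda_e, beta_e) for given preference and dynamics parameters."""
    theta = (1.0 - gamma) / (1.0 - 1.0 / psi)
    a1 = (1.0 - 1.0 / psi) / (1.0 - kappa1 * rho)    # assumed form of A_1
    b1 = (phi - 1.0 / psi) / (1.0 - kappa1_m * rho)  # assumed form of B_1
    lambda_e = (1.0 - theta) * kappa1 * a1 * phi_e   # price of long-run risk
    beta_e = kappa1_m * b1 * phi_e                   # market loading on long-run shock
    return lambda_e, beta_e


# Hypothetical monthly values with gamma > 1 > 1/psi and phi > 1/psi.
gamma, psi, rho, phi, phi_e = 10.0, 1.5, 0.979, 3.0, 0.044
kappa1 = kappa1_m = 0.997
lambda_e, beta_e = long_run_risk_terms(gamma, psi, rho, phi, phi_e, kappa1, kappa1_m)

# Since (1 - theta)(1 - 1/psi) = gamma - 1/psi, lambda_e has a compact closed form.
lambda_e_closed = (gamma - 1.0 / psi) * kappa1 * phi_e / (1.0 - kappa1 * rho)
print(lambda_e, beta_e, lambda_e * beta_e)
```

Under these assumptions both factors are positive, so the long-run risk term adds to the premium, and the closed form shows that the sign of $\lambda_e$ is governed by $\gamma - 1/\psi$.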
The variations in consumption volatility induce variations in the excess expected return, the
return volatility, and the Sharpe ratio of the market portfolio.
8.8.6 Evaluation of the model
Bansal and Yaron (2004) calibrate their model to return and consumption data from the United States over the period 1929-1998. Table 8.2 contains statistics for the return data and for the calibrated model using an elasticity of intertemporal substitution equal to 1.5. With a modest level of risk aversion, the model comes very close to matching the data.
As acknowledged by Bansal and Yaron, it is econometrically very difficult to detect a small but highly persistent component in a time series, but at least their study shows that if consumption growth has such a component, it may have significant effects on asset returns. Combined with the plausible Epstein-Zin preferences, long-run risks in consumption may explain (a substantial part of) the equity premium puzzle.

                      U.S. data 1929-1998         Calibrated model
Variable              Estimate    Std. error      gamma = 7.5    gamma = 10
E[r^m - r^f]            6.33         2.15             4.01           6.84
E[r^f]                  0.86         0.42             1.44           0.93
sigma[r^m]             19.42         3.07            17.81          18.65
sigma[r^f]              0.97         0.28             0.44           0.57

Table 8.2: Asset pricing in the Bansal and Yaron model. The information is taken from Table IV in Bansal and Yaron (2004). All returns are in percentage terms. The elasticity of intertemporal substitution is $\psi = 1.5$.
Bansal (2007) gives a useful survey of the rapidly growing literature on long-run risks and asset pricing with a lot of additional references for the interested reader. Various extensions of the long-run risk model and its applications are mentioned in that survey. For example, including a long-run risk component in consumption appears to be important for explaining the cross-section of returns. Bansal, Dittmar, and Lundblad (2005) show that the exposure of an asset's dividends to long-run consumption risks is an important explanatory variable in accounting for differences in mean returns across portfolios.
8.9 Consumption-based asset pricing with incomplete markets
8.9.1 Evidence of incomplete markets
Some empirical asset pricing studies indicate that markets are incomplete so that a representative individual may not exist. The marginal rate of substitution of each individual defines a state-price deflator. It follows from Theorem 4.3 that a weighted average of state-price deflators is also a state-price deflator. Brav, Constantinides, and Geczy (2002) assume that all individuals have time-additive CRRA utility with the same time preference rate $\delta$ and the same relative risk aversion $\gamma$ so that the state-price deflator for individual $l$ is $e^{-\delta t}(c_{lt}/c_{l0})^{-\gamma}$. An equally-weighted average over $L$ individuals gives the state-price deflator
$$ \zeta_t = e^{-\delta t}\,\frac{1}{L}\sum_{l=1}^{L}\left(\frac{c_{lt}}{c_{l0}}\right)^{-\gamma}. $$
Using data on the consumption of individual households, the authors find that this state-price deflator is consistent with the historical excess returns on the U.S. stock market for a risk aversion parameter as low as 3. If the market were complete, consumption growth $c_{lt}/c_{l0}$ would be the same for all individuals under these assumptions, and
$$ \zeta_t = e^{-\delta t}\left(\frac{\sum_{l=1}^{L} c_{lt}}{\sum_{l=1}^{L} c_{l0}}\right)^{-\gamma} \tag{8.90} $$
would be a valid state-price deflator. Summing up over all individuals in this formula, we get the state-price deflator for a representative individual, who will also have relative risk aversion equal to $\gamma$, but this is inconsistent with data except for unreasonably high values of $\gamma$. This study therefore indicates that financial markets are incomplete and do not allow individuals to align their marginal rates of substitution.
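The difference between averaging individual state-price deflators and using aggregate consumption can be made concrete with a small computation. The following is a minimal sketch with hypothetical function names and made-up consumption numbers; it is not the estimation procedure of the cited study.

```python
import math


def average_individual_deflator(c0, ct, gamma, delta, t):
    """Equally-weighted average of e^{-delta t} (c_lt / c_l0)^{-gamma} over individuals."""
    terms = [(c_t / c_0) ** (-gamma) for c_0, c_t in zip(c0, ct)]
    return math.exp(-delta * t) * sum(terms) / len(terms)


def aggregate_deflator(c0, ct, gamma, delta, t):
    """Deflator based on aggregate consumption growth, as in the complete-market case."""
    return math.exp(-delta * t) * (sum(ct) / sum(c0)) ** (-gamma)


gamma, delta, t = 3.0, 0.0, 1.0

# Complete-market benchmark: everybody has the same consumption growth.
same = (average_individual_deflator([1.0, 2.0], [1.1, 2.2], gamma, delta, t),
        aggregate_deflator([1.0, 2.0], [1.1, 2.2], gamma, delta, t))

# Heterogeneous growth: one household falls to 0.9, another rises to 1.2.
hetero = (average_individual_deflator([1.0, 1.0], [0.9, 1.2], gamma, delta, t),
          aggregate_deflator([1.0, 1.0], [0.9, 1.2], gamma, delta, t))
print(same, hetero)
```

With identical growth rates the two deflators coincide. With dispersed growth rates and equal initial consumption, the averaged deflator is larger because $x^{-\gamma}$ is convex, which is why cross-sectional dispersion can substitute for a high risk aversion.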
For various reasons, a large, but declining, fraction of individuals do not invest in stock markets at all or only to a very limited extent. Brav, Constantinides, and Geczy (2002) show that if in Equation (8.90) you only sum up over individuals holding financial assets with a value higher than some threshold, this state-price deflator is consistent with historical data for a relative risk aversion which is relatively high, but much lower than the required risk aversion using aggregate consumption. The higher the threshold, the lower the required risk aversion. This result reflects that only the individuals active in the financial markets contribute to the setting of prices. Other empirical studies report similar findings, and Basak and Cuoco (1998) set up a formal asset pricing model that explicitly distinguishes between individuals owning stocks and individuals not owning stocks.
8.9.2 Labor income risk
The labor income of individuals is not fully insurable, neither through investments in financial assets nor through existing insurance contracts. A number of papers investigate how non-hedgeable income shocks may affect the pricing of financial assets. If unexpected changes in labor income are temporary, individuals may self-insure by building up a buffer of savings in order to even out the consumption effects of the income shocks over the entire life. The effects on equilibrium asset prices will be insignificant, cf. Telmer (1993). If income shocks are to help resolve the equity premium puzzle, the shocks have to affect income beyond the current period. In addition, the magnitude of the income shocks has to be negatively related to the level of stock prices. Both the persistency of income shocks and the counter-cyclical variation in income seem reasonable in light of the risk of layoffs and find empirical support, cf. Storesletten, Telmer, and Yaron (2004). Individuals facing such an income uncertainty will demand higher risk premia on stocks than in a model without labor income because stocks do badly exactly when individuals face the highest risk of an unexpected decline in income.
In a model where all individuals have time-additive CRRA utility, Constantinides and Duffie (1996) show by construction that if the income processes of different individuals are sufficiently different in a certain sense then their model with any given risk aversion can generate basically any pattern in aggregate consumption and financial prices, including the puzzling historical pattern. However, Cochrane (2001, Ch. 21) argues that with a realistic degree of cross-sectional variation in individual labor income, a relatively high value of the risk aversion parameter is still needed to match historical data. Nevertheless, it seems to be important to incorporate the labor income uncertainty of individuals and in particular the difference between the labor income processes of different individuals in the development of better asset pricing models.
Constantinides, Donaldson, and Mehra (2002) emphasize that individuals choose consumption and investment from a life-cycle perspective and face different opportunities and risks at different ages. They divide individuals into young, middle-aged, and old individuals. Old individuals only consume the savings they have built up earlier, do not receive further income and do not invest in financial assets. Young individuals typically have a low financial wealth and a low current labor income but a high human capital (the present value of their future labor income). Many young individuals would prefer borrowing significant amounts both in order to smooth out consumption over life and to be able to invest in the stock market to generate additional consumption opportunities and obtain the optimal risk/return trade-off. The empirically observed low correlation between labor income and stock returns makes stock investments even more attractive for young individuals. Unfortunately it is difficult to borrow significant amounts just because you expect to get high future income and, therefore, the young individuals can only invest very little, if anything, in the stock market. Middle-aged individuals face a different situation. Their future labor income is limited and relatively certain. Their future consumption opportunities depend primarily on the return on their investments. For middle-aged individuals the correlation between stock returns and future consumption is thus quite high so stocks are not as attractive for them. Nevertheless, due to the borrowing constraints on the young individuals, the stocks have to be owned by the middle-aged individuals and, hence, the equilibrium expected rate of return has to be quite high. The authors set up a relatively simple model formalizing these thoughts, and the model is able to explain an equity premium significantly higher than in the standard model, but still below the historically observed premium.
8.10 Concluding remarks
This chapter has developed consumption-based models of asset pricing. Under very weak assumptions, expected excess returns on risky assets will be closely related to the covariance of the asset return with aggregate consumption, giving a conditional consumption-based CAPM. The simple version of the model is unable to match a number of features of consumption and return data, which leads to several asset pricing puzzles. However, a number of relatively recent theoretical and empirical studies identify extensions of the simple model that are able to eliminate (or at least reduce the magnitude of) several of these puzzles. These extensions include state-dependent preferences, heterogeneous risk aversion, and labor income risk. The consumption-based asset pricing framework is alive and kicking. A potential problem of the tests and practical implementation of such models, however, is the need for good data on individual or aggregate consumption. The next chapter discusses asset pricing models not relying on consumption data.
8.11 Exercises
EXERCISE 8.1 Carl Smart is currently (at time t = 0) considering a couple of investment projects that will provide him with a dividend in one year from now (time t = 1) and a dividend in two years from now (time t = 2). He figures that the size of the dividends will depend on the growth rate of aggregate consumption over these two years. The first project Carl considers provides a dividend of
$$ D_t = 60t + 5(C_t - \mathrm{E}[C_t]) $$
at t = 1 and at t = 2. The second project provides a dividend of
$$ D_t = 60t - 5(C_t - \mathrm{E}[C_t]) $$
at t = 1 and at t = 2. Here $\mathrm{E}[\,\cdot\,]$ is the expectation computed at time 0 and $C_t$ denotes aggregate consumption at time t. The current level of aggregate consumption is $C_0 = 1000$.
As a valuable input to his investment decision Carl wants to compute the present value of the future dividends of each of the two projects.
First, Carl computes the present values of the two projects using a risk-ignoring approach, i.e. by discounting the expected dividends using the riskless returns observed in the bond markets. Carl observes that a one-year zero-coupon bond with a face value of 1000 currently trades at a price of 960 and a two-year zero-coupon bond with a face value of 1000 trades at a price of 929.02.
(a) What are the expected dividends of project 1 and project 2?
(b) What are the present values of project 1 and project 2 using the risk-ignoring approach?
Suddenly, Carl remembers that he once took a great course on advanced asset pricing and that the present value of an uncertain dividend of $D_1$ at time 1 and an uncertain dividend of $D_2$ at time 2 should be computed as
$$ P = \mathrm{E}\left[ \frac{\zeta_1}{\zeta_0}D_1 + \frac{\zeta_2}{\zeta_0}D_2 \right], $$
where the $\zeta_t$'s define a state-price deflator. After some reflection and analysis, Carl decides to value the projects using a conditional Consumption-based CAPM so that the state-price deflator between time $t$ and $t+1$ is of the form
$$ \frac{\zeta_{t+1}}{\zeta_t} = a_t + b_t\frac{C_{t+1}}{C_t}, \qquad t = 0, 1, \dots. $$
Carl thinks it's fair to assume that aggregate consumption will grow by either 1% (low) or 3% (high) in each of the next two years. Over the first year he believes the two growth rates are equally likely. If the growth rate of aggregate consumption is high in the first year, he believes that there is a 30% chance that it will also be high in the second year and, thus, a 70% chance of low growth in the second year. On the other hand, if the growth rate of aggregate consumption is low in the first year, he estimates that there will be a 70% chance of high growth and a 30% chance of low growth in the second year.
In order to apply the Consumption-based CAPM for valuing his investment projects, Carl has to identify the coefficients $a_t$ and $b_t$ for $t = 0, 1$. The values of $a_0$ and $b_0$ can be identified from the prices of two traded assets that only provide dividends at time 1. In addition to the one-year zero-coupon bond mentioned above, a one-year European call option on aggregate consumption is traded. The option has a strike price of $K = 1020$ so that the payoff of the option in one year is $\max(C_1 - 1020, 0)$, where $C_1$ is the aggregate consumption level at t = 1. The option trades at a price of 4.7.
(c) Using the information on the two traded assets, write up two equations that can be used for determining $a_0$ and $b_0$. Verify that the equations are solved for $a_0 = 3$ and $b_0 = -2$.
The values of $a_1$ and $b_1$ may depend on the consumption growth rate of the first year, i.e. Carl has to find $a_1^h, b_1^h$ that define the second-year state-price deflator if the first-year growth rate was high and $a_1^l, b_1^l$ that define the second-year state-price deflator if the first-year growth rate was low. Using the observed market prices of other assets, Carl concludes that
$$ a_1^h = 2.5, \quad b_1^h = -1.5, \quad a_1^l = 3.5198, \quad b_1^l = -2.5. $$
(d) Verify that the economy is path-independent in the sense that the current price of an asset that will pay you 1 at t = 2 if the growth rate is high in the first year and low in the second year will be identical (at least to five decimal places) to the current price of an asset that will pay you 1 at t = 2 if the growth rate is low in the first year and high in the second year.
(e) Illustrate the possible dividends of the two projects in a two-period binomial tree.
(f) What are the correctly computed present values of the two projects?
(g) Carl notes that the expected dividends of the two projects are exactly the same but the present value of project 2 is higher than the present value of project 1. Although Carl is pretty smart, he cannot really figure out why this is so. Can you explain it to him?
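The numbers given in the exercise can be cross-checked with a few lines of code before attempting parts (e) to (g). The sketch below takes the slope coefficients to be negative ($b_0 = -2$, $b_1^h = -1.5$, $b_1^l = -2.5$), so that the deflator is low when consumption growth is high; the helper names are hypothetical. It reproduces the two bond prices and the option price per unit of face value and compares the two path-dependent state prices of part (d). It does not value the two projects.

```python
C0 = 1000.0
GROWTH = {"h": 1.03, "l": 1.01}           # gross consumption growth in each state
P_FIRST = {"h": 0.5, "l": 0.5}            # first-year probabilities
A0, B0 = 3.0, -2.0                        # first-year deflator coefficients
SECOND = {                                # (a_1, b_1, second-year probabilities)
    "h": (2.5, -1.5, {"h": 0.3, "l": 0.7}),
    "l": (3.5198, -2.5, {"h": 0.7, "l": 0.3}),
}


def deflator_ratio(a, b, growth):
    """One-period state-price deflator ratio a + b * C_{t+1}/C_t."""
    return a + b * growth


def state_price(s1, s2):
    """Time-0 price of 1 unit paid at t = 2 on the path (s1, s2)."""
    a1, b1, p_second = SECOND[s1]
    first = P_FIRST[s1] * deflator_ratio(A0, B0, GROWTH[s1])
    second = p_second[s2] * deflator_ratio(a1, b1, GROWTH[s2])
    return first * second


bond1 = sum(P_FIRST[s] * deflator_ratio(A0, B0, GROWTH[s]) for s in "hl")
call = sum(P_FIRST[s] * deflator_ratio(A0, B0, GROWTH[s])
           * max(C0 * GROWTH[s] - 1020.0, 0.0) for s in "hl")
bond2 = sum(state_price(s1, s2) for s1 in "hl" for s2 in "hl")
price_hl, price_lh = state_price("h", "l"), state_price("l", "h")
print(bond1, call, bond2, price_hl, price_lh)
```

Multiplying `bond1` and `bond2` by the face value of 1000 should give back the quoted bond prices, which is a useful sanity check on the signs of the coefficients.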
EXERCISE 8.2 In the simple consumption-based asset pricing model, the growth rate of aggregate consumption is assumed to have a constant expectation and standard deviation (volatility). For example, in the continuous-time version aggregate consumption is assumed to follow a geometric Brownian motion. Consider the following alternative process for aggregate consumption:
dc
t
= c
t
[dt +c
1
t
dz
t
],
where , , and are positive constants, and z = (z
t
) is a standard Brownian motion. As in the
simple model, assume that a representative individual exists and that this individual has time-additive expected utility exhibiting constant relative risk aversion given by the parameter $\gamma > 0$ and a constant time preference rate $\delta > 0$.
(a) State an equation linking the expected excess return on an arbitrary risky asset to the level of aggregate consumption and the parameters of the aggregate consumption process. How does the expected excess return vary with the consumption level?
(b) State an equation linking the short-term continuously compounded risk-free interest rate $r_t^f$ to the level of aggregate consumption and the parameters of the aggregate consumption process. How does the interest rate vary with the consumption level?
(c) Use Itô's Lemma to find the dynamics of the interest rate, $dr_t^f$. Can you write the drift and the volatility of the interest rate as functions of the interest rate level only?
EXERCISE 8.3 Give a proof of Theorem 8.2.
EXERCISE 8.4 Consider the Chan and Kogan model. Show the expressions in (8.49), (8.50),
(8.52), (8.53), and (8.54). Show Theorem 8.4.
EXERCISE 8.5 In the Chan and Kogan model, show how (8.56) follows from (8.42).
EXERCISE 8.6 Consider a continuous-time economy with complete markets and a representative individual having an "external habit" or "keeping up with the Joneses" utility function so that, at any time $t$, the individual maximizes $\mathrm{E}_t\left[\int_t^T e^{-\delta(s-t)}u(c_s, X_s)\,ds\right]$, where $u(c, X) = \frac{1}{1-\gamma}(c - X)^{1-\gamma}$ for $c > X \geq 0$.
Define $Y_t = -\ln\left(1 - \frac{X_t}{c_t}\right)$.
(a) Argue that $Y_t$ is positive. Would you call a situation where $Y_t$ is high a good state or a bad state? Explain!
(b) Argue that the unique state-price deflator is given by
$$ \zeta_t = e^{-\delta t}\,\frac{c_t^{-\gamma}e^{\gamma Y_t}}{c_0^{-\gamma}e^{\gamma Y_0}}. $$
First, write the dynamics of consumption and the variable $Y_t$ in the general way:
$$ dc_t = c_t\left[\mu_{ct}\,dt + \sigma_{ct}^\top dz_t\right], \qquad dY_t = \mu_{Yt}\,dt + \sigma_{Yt}^\top dz_t, $$
where $z = (z_t)$ is a multi-dimensional standard Brownian motion.
(c) Find the dynamics of the state-price deflator and identify the continuously compounded short-term risk-free interest rate $r_t^f$ and the market price of risk $\lambda_t$.
An asset i pays an uncertain terminal dividend but no intermediate dividends. The price dynamics
is of the form
dP
it
= P
it
[
it
dt +
it
dz
t
] .
(d) Explain why
it
r
f
t
=
ict
ct
+
iY t
Y t
,
where
ict
= (
it
ct
)/
ct

2
and
iY t
= (
it
Y t
)/
Y t

2
. Express
ct
and
Y t
in terms of
previously introduced parameters and variables.
Next, consider the specic model:
dc
t
= c
t
[
c
dt +
c
dz
1t
],
dY
t
= [
Y Y
t
] dt +
Y
_
Y
t
_
dz
1t
+
_
1
2
dz
2t
_
,
where (z
1
, z
2
)
it
dt +
it
_
dz
1t
+
_
1
2
dz
2t
__
,
where
it
> 0 and (1, 1).
(f ) What is the Sharpe ratio of asset i in the specic model? Can the specic model gener
ate countercyclical variation in Sharpe ratios (if necessary, provide parameter conditions
ensuring this)?
EXERCISE 8.7 Consider the set-up of Exercise 6.10 with $u(c, h) = \frac{1}{1-\gamma}(c - h)^{1-\gamma}$.
(a) Can optimal consumption follow a geometric Brownian motion under these assumptions?
(b) Assume that the excess consumption rate $\hat c_t = c_t - h_t$ follows a geometric Brownian motion. Show that the state-price deflator will be of the form $\zeta_t = f(t)e^{-\delta t}\hat c_t^{-\gamma}$ for some deterministic function $f(t)$ (and find that function). Find an expression for the equilibrium interest rate and the market price of risk. Compare with the simple model with no habit, CRRA utility, and consumption following a geometric Brownian motion.
EXERCISE 8.8 (This problem is based on the working paper Menzly, Santos, and Veronesi
(2002).) Consider an economy with a representative agent with lifetime utility given by
U(C) = E
__
0
e
t
ln(C
t
X
t
) dt
_
,
where X
t
is an external habit level and is a subjective discount rate. As in Campbell and
Cochrane (1999) dene the surplus ratio as S
t
= (C
t
X
t
)/C
t
. Dene Y
t
= 1/S
t
. Aggregate
consumption C
t
is assumed to follow a geometric Brownian motion
dC
t
= C
t
[
C
dt +
C
dz
t
] ,
where
C
and
C
are constants and z is a onedimensional standard Brownian motion. The
dynamics of the habit level is modeled through
dY
t
= k[
Y Y
t
] dt (Y
t
)
C
dz
t
,
where k,
Y , , and are constants.
(a) Show that E
t
[Y
] =
Y + (Y
t
Y )e
k(t)
.
(b) For any given dividend process D = (D
t
) in this economy, argue that the price is given by
P
D
t
= (C
t
X
t
) E
t
__
t
e
(t)
D
d
_
.
Let s
D
= D
/C
denote the dividends share of aggregate consumption. Show that the price
P
D
t
satises
P
D
t
C
t
=
1
Y
t
E
t
__
t
e
(t)
s
D
d
_
.
(c) Show that the price P
C
t
of a claim to the aggregate consumption stream D
= C
is given
by
P
C
t
C
t
=
1
+k
_
1 +
k
S
t
_
.
(d) Find the dynamics of the state-price deflator. Find and interpret expressions for the risk-free interest rate and the market price of risk in this economy.
EXERCISE 8.9 (This problem is based on the paper Longstaff and Piazzesi (2004).) Consider a continuous-time model of an economy with a representative agent and a single non-durable good. The objective of the agent at any time t is to maximize the expected time-additive CRRA utility,
E
t
__
t
e
(st)
C
1
s
1
ds
_
,
where > 0 and C
s
denotes the consumption rate at time s. The agent can invest in a bank
account, i.e. borrow and lend at a short-term interest rate of $r_t$. The bank account is in zero net supply. A single stock with a net supply of one share is available for trade. The agent is initially endowed with this share. The stock pays a continuous dividend at the rate $D_t$. The agent receives an exogenously given labor income at the rate $I_t$.
(a) Explain why the equilibrium consumption rate must equal the sum of the dividend rate and the labor income rate, i.e. $C_t = I_t + D_t$.
Let F
t
denote the dividendconsumption ratio, i.e. F
t
= D
t
/C
t
. Assume that F
t
= expX
t
,
where X = (X
t
) is the diusion process
dX
t
= ( X
t
) dt
_
X
t
dz
1t
.
Here , , and are positive constants and z
1
= (z
1t
) is a standard Brownian motion.
(b) Explain why F
t
is always between zero and one.
Assume that the aggregate consumption process is given by the dynamics
dC
t
= C
t
_
dt +
_
X
t
dz
1t
+
_
X
t
_
1
2
dz
2t
_
,
where , , and are constants and z
2
= (z
2t
) is another standard Brownian motion, independent
of z
1
.
(c) What is the equilibrium short-term interest rate in this economy?
(d) Use Itô's Lemma to derive the dynamics of $F_t$ and of $D_t$.
(e) Show that the volatility of the dividend rate is greater than the volatility of the consumption
rate if > 0.
Let P
t
denote the price of the stock, i.e. the present value of all the future dividends.
(f ) Show that the stock price can be written as
P
t
= C
t
E
t
__
t
e
(st)
C
1
s
F
s
ds
_
.
It can be shown that P
t
can be written as a function of t, C
t
, and F
t
:
P
t
= C
t
_
t
e
(st)
A(t, s)F
B(t,s)
t
ds.
Here A(t, s) and B(t, s) are some deterministic functions of time that we will leave unspecied.
(g) Use Itos Lemma to show that
dP
t
= P
t
_
. . . dt + ( +H
t
)
_
X
t
dz
1t
+
_
1
2
_
X
t
dz
2t
_
,
where the drift term is left out (you do not have to compute the drift term!) and where
H
t
=
_
t
e
(st)
A(t, s)B(t, s)F
B(t,s)
t
ds
_
t
e
(st)
A(t, s)F
B(t,s)
t
ds
.
(h) Show that the expected excess rate of return on the stock at time t can be written as
t
= X
t
_
2
+H
t
_
and as
t
=
2
Ct
+H
t
Ct
Ft
,
where
Ct
and
Ft
denote the percentage volatility of C
t
and F
t
, respectively.
(i) What would the expected excess rate of return on the stock be if the dividendconsumption
ratio was deterministic? Explain why the model with stochastic dividendconsumption ratio
has the potential to resolve (at least partially) the equity premium puzzle.
EXERCISE 8.10 Consider a discretetime representative individual economy with preferences
E[
t=0
t
u(C
t
, D
t
)], where C
t
is the consumption of perishable goods and D
t
is the consumption
of services from durable goods. Assume a CobbDouglas type utility function,
u(C, D
) =
1
1
_
C
(D
)
1
_
1
where > 0 and [0, 1].
(a) Write up the oneperiod stateprice deator discount factor M
t+1
t+1
t
=
u
C
(C
t+1
,D
t+1
)
u
C
(C
t
,D
t
)
in this economy.
Assume that the services from the the durable goods are given by D
t
= (K
t1
+D
t
), where K
t1
is the stock of the durable entering period t and D
t
is the additional purchases of the durable good
in period t. We can interpret as a depreciation rate or as the intensity of usage of the durable.
We must have K
t
= (1 )(K
t1
+D
t
).
(b) Argue that if one additional unit of the durable is purchased in period t, then the additional
services from the durable in period t +j will be (1 )
j
.
(c) What is the marginal (lifetime expected) utility, MU
D
t
, from purchasing an extra unit of the
durable in period t?
(d) If p
t
is the unit price of the durable good, argue that MU
D
t
= p
t
C
(1)1
t
(D
t
)
(1)(1)
in equilibrium.
Now we allow for a stochastic intensity of usage, i.e. = (
t
) is a stochastic process. Dene
X
t
= (K
t1
+D
t
)/K
t1
, x
t
= lnX
t
, c
t
= lnC
t
, and m
t
= lnM
t
.
(e) Show that m
t+1
= ln +((1) 1)c
t+1
+(1)(1) [(ln
t+1
) +x
t+1
+ ln(1
t
)].
Assume now that
c
t+1
= g +
t+1
,
t+1
N(0,
2
),
ln
t+1
= h +ln
t
+
t+1
,
t+1
N(0,
2
),
x
t+1
= (1 w ) ln
t
ln(1
t
) h +a
t+1
,
and that the two exogenous shocks
t+1
and
t+1
have correlation .
(f ) Derive the continuously compounded equilibrium short interest rate r
f
t
= lnR
f
t
. You should
nd that the interest rate is constant if w = 0.
EXERCISE 8.11 Consider a continuous-time economy with a representative agent with time-additive subsistence HARA utility, i.e. the objective of the agent is to maximize
$$ \mathrm{E}\left[\int_0^T e^{-\delta t}\,\frac{1}{1-\gamma}(c_t - \bar c)^{1-\gamma}\,dt\right], $$
where $\bar c \geq 0$ is the subsistence consumption level. Assume that aggregate consumption $c = (c_t)$ evolves as
$$ dc_t = \mu c_t\,dt + \sigma\sqrt{c_t(c_t - \bar c)}\,dz_t, $$
where $z = (z_t)$ is a (one-dimensional) standard Brownian motion. Find the equilibrium short-term interest rate $r_t^f$ and the market price of risk $\lambda_t$, expressed in terms of $c_t$ and the parameters introduced above. How do $r_t^f$ and $\lambda_t$ depend on the consumption level? Are $r_t^f$ and $\lambda_t$ higher or lower or unchanged relative to the standard case in which $\bar c = 0$?
CHAPTER 9
Factor models
9.1 Introduction
The lack of reliable consumption data discussed in Section 8.6 complicates tests and applications of the consumption-based models. As mentioned above, most tests that have been carried out find it problematic to match the (simple) consumption-based model and historical return and consumption data (of poor quality). This motivates a search for models linking returns to other factors than consumption.
The classical CAPM is the Mother of all factor models. It links expected excess returns (on stocks) to the return on the market portfolio (of stocks). It was originally developed in a one-period framework but can be generalized to multi-period settings. The model is based on rather unrealistic assumptions and the empirical success of the CAPM is modest.
Many, many papers have tried to identify factor models that perform better, mostly by adding extra factors. However, this should only be done with extreme care. In a given data set of historical returns it is always possible to find a return that works as a pricing factor, as already indicated in Chapter 4. In fact, any ex-post mean-variance efficient return will work. On the other hand, there is generally no reason to believe that the same return will work as a pricing factor in the future. Factors should be justified by economic arguments or even a theoretical asset pricing model.
It is worth emphasizing that the general theoretical results of the consumption-based asset pricing framework are not challenged by factor models. The problem with the consumption-based models is the implementation. Factor models do not invalidate the consumption-based asset pricing framework but are special cases that may be easier to apply and test. Therefore factors should generally help explain typical individuals' marginal utilities of consumption.
Section 9.2 reviews the classical one-period CAPM and how it fits into the modern consumption-based asset pricing framework. Section 9.3 defines and studies pricing factors in the one-period setting. In particular, pricing factors are linked to state-price deflators. The relation between mean-variance efficient returns and pricing factors is the topic of Section 9.4. Multi-period pricing factors are introduced in Section 9.5 with a discussion of the distinction between conditional and unconditional pricing factors. Section 9.6 offers a brief introduction to empirical studies of factor models. Finally, Section 9.7 discusses how pricing factors can be derived theoretically. It also derives an intertemporal version of the CAPM.
9.2 The classical one-period CAPM
The classical CAPM developed by Sharpe (1964), Lintner (1965), and Mossin (1966) says that the return on the market portfolio is a pricing factor so that
$$ \mathrm{E}[R_i] = \alpha + \beta[R_i, R_M]\,\bigl(\mathrm{E}[R_M] - \alpha\bigr), \quad i = 1, 2, \dots, I, \tag{9.1} $$
for some zero-beta return $\alpha$, which is identical to the risk-free rate if such exists. Here the market-beta is defined as $\beta[R_i, R_M] = \mathrm{Cov}[R_i, R_M]/\mathrm{Var}[R_M]$.
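As a small illustration of (9.1), the following sketch computes a market-beta and the CAPM-implied expected return in a three-state economy. The state probabilities, the returns, and the value of the zero-beta return are all made-up numbers chosen for illustration, not taken from the text.

```python
# Hypothetical three-state example; all numbers are illustrative assumptions.
probs = [0.3, 0.4, 0.3]
r_market = [0.90, 1.05, 1.25]   # gross market return in each state
r_asset = [0.80, 1.06, 1.40]    # gross return on some asset i in each state


def mean(x, p):
    """Expectation of a state-contingent variable."""
    return sum(pi * xi for pi, xi in zip(p, x))


def cov(x, y, p):
    """Covariance of two state-contingent variables."""
    mx, my = mean(x, p), mean(y, p)
    return sum(pi * (xi - mx) * (yi - my) for pi, xi, yi in zip(p, x, y))


def market_beta(r_i, r_m, p):
    """beta[R_i, R_M] = Cov[R_i, R_M] / Var[R_M]."""
    return cov(r_i, r_m, p) / cov(r_m, r_m, p)


def capm_expected_return(beta, alpha, mean_market):
    """Right-hand side of (9.1)."""
    return alpha + beta * (mean_market - alpha)


alpha = 1.02  # assumed zero-beta (risk-free) gross return
beta_i = market_beta(r_asset, r_market, probs)
print(beta_i, capm_expected_return(beta_i, alpha, mean(r_market, probs)))
```

By construction the market itself has a beta of one, and an asset with a beta of zero earns the zero-beta return.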
The classical CAPM is usually derived from mean-variance analysis. If all individuals have quadratic utility or returns are normally distributed, any individual will optimally pick a mean-variance efficient portfolio. If $R_l$ denotes the return on the portfolio chosen by individual $l$ and $w_l$ denotes individual $l$'s share of total wealth, the return on the market portfolio will be $R_M = \sum_{l=1}^{L} w_l R_l$ and the market portfolio will be mean-variance efficient. As was already stated in Theorem 4.6 (demonstrated later in this chapter), the return $R^{mv}$ on any mean-variance efficient portfolio is such that an $\alpha \in \mathbb{R}$ exists so that $\mathrm{E}[R_i] = \alpha + \beta[R_i, R^{mv}]\,(\mathrm{E}[R^{mv}] - \alpha)$. In particular, this is true for the market portfolio when it is efficient.
To see how the classical CAPM fits into the consumption-based asset pricing framework, consider a model in which all individuals have time-additive expected utility and are endowed with some time 0 wealth but receive no time 1 income from non-financial sources. Let us consider an arbitrary individual with initial wealth endowment $e_0$. If the individual consumes $c_0$ at time 0, she will invest $e_0 - c_0$ in the financial assets. Representing the investment by the portfolio weight vector $\pi$, the gross return on the portfolio will be $R^\pi = \pi^\top R = \sum_{i=1}^{I} \pi_i R_i$. The time 1 consumption will equal the total dividend of the portfolio, which is the gross return multiplied by the initial investment, i.e.
$$ c = R^\pi (e_0 - c_0). $$
We can substitute this into the marginal rate of substitution of the individual so that the associated state-price deflator becomes
$$ \zeta = e^{-\delta}\,\frac{u'(c)}{u'(c_0)} = e^{-\delta}\,\frac{u'\bigl(R^\pi (e_0 - c_0)\bigr)}{u'(c_0)}. \tag{9.2} $$
If the economy has a representative individual, she has to own all the assets, i.e. she has to invest in the market portfolio. We can then replace $R^\pi$ by $R_M$, the gross return on the market portfolio. We can obtain the classical CAPM from this relation if we either assume that the utility function is quadratic or that the return on the market portfolio is normally distributed.
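Quadratic utility. The quadratic utility function $u(c) = -(\bar{c} - c)^2$ is a special case of the satiation HARA utility functions. Marginal utility is $u'(c) = 2(\bar{c} - c)$, so the state-price deflator (9.2) becomes
$$ \zeta = e^{-\delta}\,\frac{\bar{c} - R^\pi (e_0 - c_0)}{\bar{c} - c_0} = e^{-\delta}\,\frac{\bar{c}}{\bar{c} - c_0} - e^{-\delta}\,\frac{e_0 - c_0}{\bar{c} - c_0}\,R^\pi, \tag{9.3} $$
which is affine in the portfolio return. It now follows from the discussion in Section 4.6.2 (also see later section in the present chapter) that the portfolio return is a pricing factor so that
$$ \mathrm{E}[R_i] = \alpha + \beta[R_i, R^\pi]\,\bigl(\mathrm{E}[R^\pi] - \alpha\bigr). $$
Again, if this applies to a representative individual, we can replace $R^\pi$ by the market return $R_M$.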
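Normally distributed returns. The case of normally distributed returns rests on Stein's Lemma: if $x$ and $y$ are jointly normally distributed random variables and $g$ is a differentiable function with $\mathrm{E}[|g'(y)|]$ finite, then $\mathrm{Cov}[x, g(y)] = \mathrm{E}[g'(y)]\,\mathrm{Cov}[x, y]$.
Proof: Write $\mu_y = \mathrm{E}[y]$, $\sigma_y^2 = \mathrm{Var}[y]$, and $\beta = \mathrm{Cov}[x, y]/\sigma_y^2$. The residual $x - \beta y$ is uncorrelated with $y$ and, by joint normality, independent of $y$, so that $\mathrm{Cov}[x, g(y)] = \beta\,\mathrm{Cov}[y, g(y)]$. Moreover,
$$ \mathrm{Cov}[y, g(y)] = \mathrm{E}[(y - \mu_y) g(y)] = \int_{-\infty}^{\infty} (y - \mu_y)\, g(y) f(y)\, dy, $$
where
$$ f(y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\left\{ -\frac{1}{2\sigma_y^2} (y - \mu_y)^2 \right\} $$
is the probability density function of $y$. Noting that $f'(y) = -f(y)(y - \mu_y)/\sigma_y^2$, integration by parts gives
$$ \begin{aligned} \int_{-\infty}^{\infty} (y - \mu_y)\, g(y) f(y)\, dy &= -\sigma_y^2 \int_{-\infty}^{\infty} g(y) f'(y)\, dy \\ &= \sigma_y^2 \int_{-\infty}^{\infty} g'(y) f(y)\, dy - \sigma_y^2 \Bigl[ g(y) f(y) \Bigr]_{y=-\infty}^{\infty} \\ &= \sigma_y^2\, \mathrm{E}[g'(y)] \end{aligned} $$
provided that $g(y)$ does not approach plus or minus infinity faster than $f(y)$ approaches zero as $y \to \pm\infty$. Hence,
$$ \mathrm{Cov}[x, g(y)] = \beta\,\mathrm{Cov}[y, g(y)] = \beta \sigma_y^2\, \mathrm{E}[g'(y)] = \mathrm{E}[g'(y)]\,\mathrm{Cov}[x, y] $$
as claimed. □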
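The integration-by-parts identity $\mathrm{Cov}[y, g(y)] = \sigma_y^2\,\mathrm{E}[g'(y)]$ used in the proof can be checked numerically. The sketch below is only an illustration under assumed inputs: the choice $g(y) = y^3$, the parameter values, and the trapezoidal integration grid are all hypothetical.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def expectation(h, mu, sigma, n=20001, width=10.0):
    """E[h(y)] by the trapezoidal rule on [mu - width*sigma, mu + width*sigma]."""
    lo = mu - width * sigma
    step = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        y = lo + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * h(y) * normal_pdf(y, mu, sigma)
    return total * step


mu, sigma = 0.5, 1.2          # assumed mean and standard deviation of y


def g(y):
    return y ** 3             # illustrative choice of g


def g_prime(y):
    return 3.0 * y ** 2


lhs = expectation(lambda y: (y - mu) * g(y), mu, sigma)   # Cov[y, g(y)]
rhs = sigma ** 2 * expectation(g_prime, mu, sigma)        # sigma_y^2 * E[g'(y)]
print(lhs, rhs)
```

For this polynomial $g$ both sides can also be computed in closed form as $3\sigma_y^2(\mu_y^2 + \sigma_y^2)$, which gives an independent check on the numerical integration.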
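For any state-price deflator of the form $\zeta = g(x)$ where $x$ and $R_i$ are jointly normally distributed, we thus have
$$ \begin{aligned} 1 = \mathrm{E}[g(x) R_i] &= \mathrm{E}[g(x)]\,\mathrm{E}[R_i] + \mathrm{Cov}[g(x), R_i] \\ &= \mathrm{E}[g(x)]\,\mathrm{E}[R_i] + \mathrm{E}[g'(x)]\,\mathrm{Cov}[x, R_i] \\ &= \mathrm{E}\bigl[\bigl(\mathrm{E}[g(x)] - \mathrm{E}[g'(x)]\,\mathrm{E}[x] + \mathrm{E}[g'(x)]\,x\bigr) R_i\bigr] \\ &= \mathrm{E}[(a + bx) R_i], \end{aligned} $$
for some constants $a$ and $b$. Therefore, we can safely assume that $g(x)$ is affine in $x$.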
In (9.2) we have
$$ \zeta = e^{-\delta}\,\frac{u'\bigl(R^\pi (e_0 - c_0)\bigr)}{u'(c_0)} = g(R^\pi), $$
and if the individual asset returns are jointly normally distributed, the return on any portfolio and the return on any individual asset will also be jointly normally distributed. According to Stein's Lemma we can then safely assume that $\zeta$ is affine in $R^\pi$ so that $R^\pi$ is a pricing factor. Note, however, that to apply Stein's Lemma, we have to check that $\mathrm{E}[|g'(R^\pi)|]$ is finite. In our case,
$$ g'(R^\pi) = e^{-\delta} (e_0 - c_0)\,\frac{u''\bigl(R^\pi (e_0 - c_0)\bigr)}{u'(c_0)}. $$
With log-utility, $u''(c) = -1/c^2$, and since $\mathrm{E}[1/(R^\pi)^2]$ is infinite (or undefined if you like) when $R^\pi$ is normally distributed, the condition fails. With normally distributed returns we really need the utility function to be defined on the entire real line, which is not the case for the most reasonable utility functions. For negative exponential utility, there is no such problem.
The assumptions leading to the classical CAPM are clearly problematic. Preferences are poorly represented by quadratic utility functions or other mean-variance utility functions. Returns are not normally distributed. A more fundamental problem is the static nature of the one-period CAPM. Later in this chapter we will discuss how the CAPM can be extended to a dynamic setting. It turns out that we can derive an intertemporal CAPM under much more appropriate assumptions about utility functions and return distributions. We will need CRRA utility and lognormally distributed returns. In addition, we will need the return distribution to be stationary, i.e. the same for all future periods of the same length.
9.3 Pricing factors in a one-period framework
In Section 4.6.2 we defined a one-dimensional pricing factor in a one-period framework and discussed the relation between pricing factors and state-price deflators. Below we generalize this to the case of multi-dimensional factors and give a more rigorous treatment.
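9.3.1 Definition and basic properties
We will say that a K-dimensional random variable $x = (x_1, \dots, x_K)^\top$ is a pricing factor if there exists some $\alpha \in \mathbb{R}$ and some $\eta \in \mathbb{R}^K$ so that
$$ \mathrm{E}[R_i] = \alpha + \beta[R_i, x]^\top \eta, \quad i = 1, \dots, I, \tag{9.4} $$
where the factor-beta of asset $i$ is the K-dimensional vector $\beta[R_i, x]$ given by
$$ \beta[R_i, x] = (\mathrm{Var}[x])^{-1}\,\mathrm{Cov}[x, R_i]. \tag{9.5} $$
Here $\mathrm{Var}[x]$ is the $K \times K$ variance-covariance matrix of $x$ and $\mathrm{Cov}[x, R_i]$ is the K-vector with elements $\mathrm{Cov}[x_k, R_i]$. Saying that $x$ is a pricing factor we implicitly require that $\mathrm{Var}[x]$ is non-singular. The vector $\eta$ is called a factor risk premium and $\alpha$ is called the zero-beta return.
We can write (9.4) more compactly as
$$ \mathrm{E}[R] = \alpha \mathbf{1} + \beta[R, x]\,\eta, $$
where $R = (R_1, \dots, R_I)^\top$ is the return vector, $\mathbf{1}$ is the I-dimensional vector of ones, and $\beta[R, x]$ is the $I \times K$ matrix with $\beta[R_i, x]^\top$ as its $i$'th row. In terms of net rates of return $r_i = R_i - 1$, the relation reads
$$ \mathrm{E}[r_i] = (\alpha - 1) + \beta[R_i, x]^\top \eta, \quad i = 1, \dots, I. \tag{9.6} $$
If a risk-free asset exists, $\alpha - 1 = R^f - 1 = r^f$, the risk-free net rate of return.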
The relation (9.4) does not directly involve prices. But since the expected gross return is $\mathrm{E}[R_i] = \mathrm{E}[D_i]/P_i$, we have $P_i = \mathrm{E}[D_i]/\mathrm{E}[R_i]$ and hence the equivalent relation
$$ P_i = \frac{\mathrm{E}[D_i]}{\alpha + \beta[R_i, x]^\top \eta}. \tag{9.7} $$
The price is equal to the expected dividend discounted by a risk-adjusted rate. You may find this relation unsatisfactory since the price implicitly enters the right-hand side through the return-beta $\beta[R_i, x]$. However, we can define a dividend-beta by
$$ \beta[D_i, x] = (\mathrm{Var}[x])^{-1}\,\mathrm{Cov}[x, D_i] $$
and inserting $D_i = R_i P_i$ we see that $\beta[D_i, x] = P_i\,\beta[R_i, x]$. Equation (9.4) now implies that
$$ \frac{\mathrm{E}[D_i]}{P_i} = \alpha + \frac{1}{P_i}\,\beta[D_i, x]^\top \eta $$
so that
$$ P_i = \frac{\mathrm{E}[D_i] - \beta[D_i, x]^\top \eta}{\alpha}. \tag{9.8} $$
Think of the numerator as a certainty equivalent of the risky dividend. The current price is the certainty equivalent discounted by the zero-beta return, which is the risk-free return if this exists.
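The equivalence of the two pricing formulas (9.7) and (9.8) can be illustrated with a one-factor, three-state example. The sketch below prices a dividend with the certainty-equivalent formula and then confirms that the risk-adjusted discounting formula returns the same price. The factor values, dividends, zero-beta return, and factor risk premium are hypothetical numbers chosen for illustration.

```python
# Hypothetical one-factor example (K = 1); all numbers are illustrative assumptions.
probs = [0.25, 0.50, 0.25]
x = [-1.0, 0.0, 1.0]            # pricing factor in each state
dividend = [8.0, 10.0, 13.0]    # dividend of the asset in each state
alpha, eta = 1.03, 0.04         # assumed zero-beta return and factor risk premium


def mean(v):
    return sum(p * s for p, s in zip(probs, v))


def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum(p * (a - mu) * (b - mv) for p, a, b in zip(probs, u, v))


def factor_beta(v):
    """(Var[x])^{-1} Cov[x, v] in the one-dimensional case."""
    return cov(x, v) / cov(x, x)


# (9.8): certainty equivalent discounted by the zero-beta return.
price = (mean(dividend) - factor_beta(dividend) * eta) / alpha

# (9.7): expected dividend discounted by the risk-adjusted rate,
# where the return-beta is computed from the returns implied by the price.
returns = [d / price for d in dividend]
price_check = mean(dividend) / (alpha + factor_beta(returns) * eta)

print(price, price_check)
```

Formula (9.8) is the convenient one in practice because the dividend-beta does not depend on the unknown price, whereas (9.7) requires the price to compute the return-beta.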
The following result shows that pricing factors are not unique.
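Theorem 9.1 If the K-dimensional random variable $x$ is a pricing factor, then any $\hat{x}$ of the form $\hat{x} = a + Ax$ where $a \in \mathbb{R}^K$ and $A$ is a non-singular $K \times K$ matrix is also a pricing factor.
Proof: According to (A.1), (A.2), and (1.2), we have
$$ \mathrm{Cov}[\hat{x}, R_i] = \mathrm{Cov}[a + Ax, R_i] = A\,\mathrm{Cov}[x, R_i], $$
$$ (\mathrm{Var}[\hat{x}])^{-1} = \bigl(\mathrm{Var}[a + Ax]\bigr)^{-1} = \bigl(\mathrm{Var}[Ax]\bigr)^{-1} = \bigl(A\,\mathrm{Var}[x]\,A^\top\bigr)^{-1} = \bigl(A^\top\bigr)^{-1} (\mathrm{Var}[x])^{-1} A^{-1}, $$
and thus
$$ \beta[R_i, \hat{x}] = (\mathrm{Var}[\hat{x}])^{-1}\,\mathrm{Cov}[\hat{x}, R_i] = \bigl(A^\top\bigr)^{-1} (\mathrm{Var}[x])^{-1} A^{-1} A\,\mathrm{Cov}[x, R_i] = \bigl(A^\top\bigr)^{-1} \beta[R_i, x]. $$
If we define $\hat{\eta} = A\eta$, we obtain
$$ \beta[R_i, \hat{x}]^\top \hat{\eta} = \Bigl( \bigl(A^\top\bigr)^{-1} \beta[R_i, x] \Bigr)^\top A\eta = \beta[R_i, x]^\top \eta $$
and, hence,
$$ \mathrm{E}[R_i] = \alpha + \beta[R_i, \hat{x}]^\top \hat{\eta}, \quad i = 1, \dots, I, \tag{9.9} $$
which confirms that $\hat{x}$ is a pricing factor. □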
Let us look at some important consequences of this theorem.
In general the $k$'th element of the factor beta $\beta[R_i, x]$ is not equal to $\mathrm{Cov}[x_k, R_i]/\mathrm{Var}[x_k]$. This will be the case, however, if the elements in the pricing factor are mutually uncorrelated, i.e. $\mathrm{Cov}[x_j, x_k] = 0$ for $j \neq k$. In fact, we can orthogonalize the pricing factor so that this will be satisfied. Given any pricing factor $x$, we can find a non-singular $K \times K$ matrix $V$ so that $V V^\top = \mathrm{Var}[x]$. Defining $\hat{x} = V^{-1} x$, we know from above that $\hat{x}$ is also a pricing factor and the variance-covariance matrix is
$$ \mathrm{Var}[\hat{x}] = V^{-1}\,\mathrm{Var}[x]\,\bigl(V^{-1}\bigr)^\top = V^{-1} V V^\top \bigl(V^\top\bigr)^{-1} = I, $$
i.e. the $K \times K$ identity matrix. It is therefore no restriction to look only for uncorrelated pricing factors.
We can also obtain a pricing factor with mean zero. If $x$ is any pricing factor, just define $\hat{x} = x - \mathrm{E}[x]$. Clearly, $\hat{x}$ has mean zero and, due to the previous theorem, it is also a pricing factor. It is therefore no restriction to look only for zero-mean pricing factors.
Finally, note that we can replace the constant vector $a$ in the above theorem with a K-dimensional random variable $\varepsilon$ with the property that $\mathrm{Cov}[R_i, \varepsilon] = 0$ for all $i$. In particular we have that if $x$ is a pricing factor and $\varepsilon$ is such a random variable, then $x + \varepsilon$ is also a pricing factor.
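The orthogonalization step can be made concrete for a two-dimensional factor. The sketch below uses the Cholesky factor of the variance-covariance matrix as one possible choice of $V$ (the text only requires some non-singular $V$ with $V V^\top = \mathrm{Var}[x]$) and checks that the transformed factor has the identity matrix as its variance-covariance matrix. The four-state factor realizations are made up for illustration.

```python
import math

# Hypothetical four-state realizations of a two-dimensional factor x = (x1, x2).
probs = [0.2, 0.3, 0.3, 0.2]
x1 = [-1.0, 0.5, 0.0, 1.5]
x2 = [0.3, -0.2, 1.0, 0.8]


def mean(v):
    return sum(p * s for p, s in zip(probs, v))


def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum(p * (a - mu) * (b - mv) for p, a, b in zip(probs, u, v))


# Lower-triangular V with V V^T = Var[x] (Cholesky factor, 2x2 case).
s11, s12, s22 = cov(x1, x1), cov(x1, x2), cov(x2, x2)
v11 = math.sqrt(s11)
v21 = s12 / v11
v22 = math.sqrt(s22 - v21 ** 2)

# x_hat = V^{-1} x, computed state by state via forward substitution.
x1_hat = [a / v11 for a in x1]
x2_hat = [(b - v21 * a_hat) / v22 for b, a_hat in zip(x2, x1_hat)]

print(cov(x1_hat, x1_hat), cov(x1_hat, x2_hat), cov(x2_hat, x2_hat))
```

Subtracting the means of `x1_hat` and `x2_hat` afterwards would additionally give a zero-mean factor without changing any covariance.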
9.3.2 Returns as pricing factors
Suppose now that the pricing factor is a vector of returns on portfolios of the $I$ assets. Then (9.4) holds with each $x_k$ replacing $R_i$. We have $\mathrm{Cov}[x, x] = \mathrm{Var}[x]$ and hence $\beta[x, x] = I$, the $K \times K$ identity matrix. Consequently,
$$ \mathrm{E}[x] = \alpha \mathbf{1} + \eta \quad \Rightarrow \quad \eta = \mathrm{E}[x] - \alpha \mathbf{1}, $$
where $\mathbf{1}$ is a K-dimensional vector of ones. In this case we can therefore rewrite (9.4) as
$$ \mathrm{E}[R_i] = \alpha + \beta[R_i, x]^\top \bigl(\mathrm{E}[x] - \alpha \mathbf{1}\bigr), \quad i = 1, \dots, I. \tag{9.10} $$
It is now clear that the classical CAPM has the return on the market portfolio as the single pricing factor. More generally, we will demonstrate in Section 9.4.3 that a return works as a single pricing factor if and only if it is the return on a mean-variance efficient portfolio (different from the minimum-variance portfolio).
For the case of a one-dimensional pricing factor $x$, Section 4.6.2 explained that the return $R^x$ on the factor-mimicking portfolio also works as a pricing factor. We can generalize this to a multi-dimensional pricing factor in the following way. Given a pricing factor $x = (x_1, \dots, x_K)^\top$, orthogonalize to obtain $\hat{x} = (\hat{x}_1, \dots, \hat{x}_K)^\top$. For each $\hat{x}_k$ construct a factor-mimicking portfolio with corresponding return $R^{\hat{x}_k}$. Then the return vector $R^{\hat{x}} = \bigl(R^{\hat{x}_1}, \dots, R^{\hat{x}_K}\bigr)^\top$ will work as a pricing factor and we have an equation like
$$ \mathrm{E}[R_i] = \alpha + \beta[R_i, R^{\hat{x}}]^\top \bigl(\mathrm{E}[R^{\hat{x}}] - \alpha \mathbf{1}\bigr), \quad i = 1, \dots, I. \tag{9.11} $$
It is therefore no restriction to assume that pricing factors are returns.
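9.3.3 From a state-price deflator to a pricing factor
From the definition of a covariance we have that $\mathrm{Cov}[R_i, \zeta] = \mathrm{E}[R_i \zeta] - \mathrm{E}[\zeta]\,\mathrm{E}[R_i]$. From (4.3), we now get that
$$ \mathrm{E}[R_i] = \frac{1}{\mathrm{E}[\zeta]} - \frac{\mathrm{Cov}[R_i, \zeta]}{\mathrm{E}[\zeta]}. \tag{9.12} $$
With $\beta[R_i, \zeta] = \mathrm{Cov}[R_i, \zeta]/\mathrm{Var}[\zeta]$ and $\eta = -\mathrm{Var}[\zeta]/\mathrm{E}[\zeta]$, we can rewrite the above equation as
$$ \mathrm{E}[R_i] = \frac{1}{\mathrm{E}[\zeta]} + \beta[R_i, \zeta]\,\eta, \tag{9.13} $$
which shows that the state-price deflator is a pricing factor. Although the proof is simple, the result is important enough to deserve its own theorem:
Theorem 9.2 Any state-price deflator is a pricing factor.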
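Theorem 9.2 can be verified numerically in a finite-state economy. In the sketch below a state-price deflator is built from assumed state prices and probabilities, assets are priced by $P_i = \mathrm{E}[\zeta D_i]$, and the expected return of each asset is compared with $1/\mathrm{E}[\zeta] + \beta[R_i, \zeta]\,\eta$. The state prices, probabilities, dividends, and asset names are hypothetical.

```python
# Hypothetical three-state economy; all numbers are illustrative assumptions.
probs = [0.3, 0.4, 0.3]
state_prices = [0.35, 0.36, 0.25]
zeta = [q / p for q, p in zip(state_prices, probs)]   # state-price deflator

dividends = {
    "bond": [1.0, 1.0, 1.0],
    "stock": [0.5, 1.0, 2.0],
    "hedge": [3.0, 1.0, 0.2],
}


def mean(v):
    return sum(p * s for p, s in zip(probs, v))


def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum(p * (a - mu) * (b - mv) for p, a, b in zip(probs, u, v))


alpha = 1.0 / mean(zeta)                 # zero-beta return, 1 / E[zeta]
eta = -cov(zeta, zeta) / mean(zeta)      # factor risk premium, -Var[zeta] / E[zeta]

gaps = {}
for name, d in dividends.items():
    price = sum(q * s for q, s in zip(state_prices, d))    # P = E[zeta * D]
    ret = [s / price for s in d]                            # gross return D / P
    beta = cov(ret, zeta) / cov(zeta, zeta)
    gaps[name] = mean(ret) - (alpha + beta * eta)           # should be ~0 by (9.13)

print(gaps)
```

The bond has a zero beta with respect to $\zeta$, so its return equals the zero-beta return $1/\mathrm{E}[\zeta]$, which is the risk-free gross return in this example.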
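In the above argument we did not use positivity of the state-price deflator, only the pricing equation (4.1) or, rather, the return version (4.3). Any random variable $x$ that satisfies $P_i = \mathrm{E}[x D_i]$ for all assets works as a pricing factor. In particular, this is true for the random variable $\zeta^*$ defined in (4.46) whether it is positive or not. We therefore have that
$$ \mathrm{E}[R_i] = \alpha^* + \beta[R_i, \zeta^*]\,\eta^*, \quad i = 1, \dots, I, \tag{9.14} $$
where $\alpha^* = 1/\mathrm{E}[\zeta^*]$ and $\eta^* = -\mathrm{Var}[\zeta^*]/\mathrm{E}[\zeta^*]$. We can also use the return $R^* = \zeta^*/\mathrm{E}[(\zeta^*)^2]$ as a pricing factor, since it is a multiple of $\zeta^*$:
$$ \mathrm{E}[R_i] = \alpha^* + \beta[R_i, R^*]\,\hat{\eta}^*, \quad i = 1, \dots, I, \tag{9.15} $$
where $\alpha^* = 1/\mathrm{E}[\zeta^*]$ as before and $\hat{\eta}^* = -\mathrm{Var}[\zeta^*] / \bigl( \mathrm{E}[\zeta^*]\,\mathrm{E}[(\zeta^*)^2] \bigr)$.
More generally, we have the following result: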
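Theorem 9.3 If the K-dimensional random variable $x$ satisfies
(i) $\mathrm{Var}[x]$ is non-singular;
(ii) $a \in \mathbb{R}$ and $b \in \mathbb{R}^K$ exist so that $\zeta = a + b^\top x$ satisfies $P_i = \mathrm{E}[\zeta D_i]$ for $i = 1, \dots, I$,
then $x$ is a pricing factor.
Proof: Substituting $\zeta = a + b^\top x$ into (9.12), we get
$$ \begin{aligned} \mathrm{E}[R_i] &= \frac{1}{a + b^\top \mathrm{E}[x]} - \frac{b^\top \mathrm{Cov}[R_i, x]}{a + b^\top \mathrm{E}[x]} \\ &= \frac{1}{a + b^\top \mathrm{E}[x]} - \frac{(\mathrm{Var}[x]\,b)^\top (\mathrm{Var}[x])^{-1}\,\mathrm{Cov}[R_i, x]}{a + b^\top \mathrm{E}[x]} \\ &= \alpha + \beta[R_i, x]^\top \eta, \end{aligned} $$
where $\alpha = 1/(a + b^\top \mathrm{E}[x])$ and $\eta = -\alpha\,\mathrm{Var}[x]\,b$. □
Conversely, any pricing factor generates a candidate state-price deflator:
Theorem 9.4 If the K-dimensional random variable $x$ is a pricing factor, then we can find $a \in \mathbb{R}$ and $b \in \mathbb{R}^K$ so that $\zeta = a + b^\top x$ satisfies $P_i = \mathrm{E}[\zeta D_i]$ for $i = 1, \dots, I$.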
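Proof: Let $\eta$ denote the factor risk premium associated with the pricing factor $x$. Define
$$ b = -\frac{1}{\alpha} (\mathrm{Var}[x])^{-1} \eta, \qquad a = \frac{1}{\alpha} - b^\top \mathrm{E}[x]. $$
Then $\zeta = a + b^\top x$ works since
$$ \begin{aligned} \mathrm{E}[\zeta R_i] &= a\,\mathrm{E}[R_i] + b^\top \mathrm{E}[R_i x] \\ &= a\,\mathrm{E}[R_i] + b^\top \bigl(\mathrm{Cov}[R_i, x] + \mathrm{E}[R_i]\,\mathrm{E}[x]\bigr) \\ &= \bigl(a + b^\top \mathrm{E}[x]\bigr)\,\mathrm{E}[R_i] + \mathrm{Cov}[R_i, x]^\top b \\ &= \frac{1}{\alpha} \Bigl( \mathrm{E}[R_i] - \mathrm{Cov}[R_i, x]^\top (\mathrm{Var}[x])^{-1} \eta \Bigr) \\ &= \frac{1}{\alpha} \bigl( \mathrm{E}[R_i] - \beta[R_i, x]^\top \eta \bigr) \\ &= 1 \end{aligned} $$
for any $i = 1, \dots, I$. □
Inserting $a$ and $b$ from the proof, we get
$$ \zeta = a + b^\top x = \frac{1}{\alpha} \Bigl( 1 - \eta^\top (\mathrm{Var}[x])^{-1} (x - \mathrm{E}[x]) \Bigr). $$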
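Any pricing factor $x$ gives us a candidate $a + b^\top x$ for a state-price deflator (not necessarily positive, however).
Assume now that the return on each asset can be decomposed as
$$ R_i = \mathrm{E}[R_i] + \sum_{k=1}^{K} \beta_{ik} x_k + \varepsilon_i, $$
where $\mathrm{E}[x_k] = 0$, $\mathrm{E}[\varepsilon_i] = 0$, $\mathrm{Cov}[\varepsilon_i, x_k] = 0$, and $\mathrm{Cov}[\varepsilon_i, \varepsilon_j] = 0$ for all $i$, $j \neq i$, and $k$. Due to the constraints on means and covariances, we have $\beta[R_i, x] = (\mathrm{Var}[x])^{-1}\,\mathrm{Cov}[R_i, x]$ as before, where $\beta[R_i, x] = (\beta_{i1}, \dots, \beta_{iK})^\top$. Note that one can always make a decomposition as in the equation above. Just think of regressing returns on the vector $x$. The real content of the assumption lies in the restriction that the residuals are uncorrelated, i.e. $\mathrm{Cov}[\varepsilon_i, \varepsilon_j] = 0$ whenever $i \neq j$. The vector $x$ is the source of all the common variations in returns across assets.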
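Suppose you have invested a given wealth in a portfolio. We can represent a zero net investment deviation from this portfolio by a vector $w = (w_1, \dots, w_I)^\top$ satisfying $w^\top \mathbf{1} = 0$, where $w_i$ is the fraction of wealth additionally invested in asset $i$. In other words, we increase the investment in some assets and decrease the investment in other assets. The additional portfolio return is
$$ R^w = w^\top R = \sum_{i=1}^{I} w_i R_i = \sum_{i=1}^{I} w_i\,\mathrm{E}[R_i] + \Bigl( \sum_{i=1}^{I} w_i \beta_{i1} \Bigr) x_1 + \dots + \Bigl( \sum_{i=1}^{I} w_i \beta_{iK} \Bigr) x_K + \sum_{i=1}^{I} w_i \varepsilon_i. $$
Suppose we can find $w_1, \dots, w_I$ so that
(i) $\sum_{i=1}^{I} w_i \beta_{ik} = 0$ for $k = 1, \dots, K$,
(ii) $\sum_{i=1}^{I} w_i \varepsilon_i = 0$,
then
$$ R^w = \sum_{i=1}^{I} w_i\,\mathrm{E}[R_i], $$
i.e. the added return is risk-free. To rule out arbitrage, a risk-free zero net investment should give a zero return so we can conclude that
$$ R^w = \sum_{i=1}^{I} w_i\,\mathrm{E}[R_i] = 0. $$
In linear algebra terms, we have thus seen that if a vector $w$ is orthogonal to $\mathbf{1}$ and to each of the vectors $(\beta_{1k}, \dots, \beta_{Ik})^\top$, $k = 1, \dots, K$, then it is also orthogonal to the vector of expected returns $\mathrm{E}[R]$. It follows that $\mathrm{E}[R]$ must be spanned by $\mathbf{1}$ and the vectors $(\beta_{1k}, \dots, \beta_{Ik})^\top$, $k = 1, \dots, K$, i.e. that constants $\alpha, \eta_1, \dots, \eta_K$ exist so that
$$ \mathrm{E}[R_i] = \alpha + \beta_{i1} \eta_1 + \dots + \beta_{iK} \eta_K = \alpha + \beta[R_i, x]^\top \eta, \quad i = 1, 2, \dots, I, $$
i.e. $x$ is a pricing factor.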
With at least $K$ sufficiently different assets, we can satisfy condition (i) above. We just need the $I \times K$ matrix $\beta[R, x]$ to have rank $K$. What about condition (ii)? The usual argument given is that if we pick $w_i$ to be of the order $1/I$ and $I$ is a very large number, then $\sum_{i=1}^{I} w_i \varepsilon_i$ will be close to zero and we can ignore it. But close to zero does not mean equal to zero and if the residual portfolio return is non-zero, the portfolio is not risk-free and the argument breaks down. Even a very small dividend or return in a particular state can have a large influence on the current price and, hence, the expected return. With finitely many assets, we can only safely ignore the residual portfolio return if the residual returns of all the individual assets are zero, i.e. $\varepsilon_i = 0$ for all $i = 1, \dots, I$.
Theorem 9.5 If individual asset returns are of the form
$$ R_i = \mathrm{E}[R_i] + \beta[R_i, x]^\top x, \quad i = 1, 2, \dots, I, $$
and the $I \times K$ matrix $\beta[R, x]$ has rank $K$, then $x$ is a pricing factor, i.e. $\alpha \in \mathbb{R}$ and $\eta \in \mathbb{R}^K$ exist so that
$$ \mathrm{E}[R_i] = \alpha + \beta[R_i, x]^\top \eta, \quad i = 1, 2, \dots, I. $$
It is, however, fairly restrictive to assume that all the variation in the returns on a large number of assets can be captured by a low number of factors.
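9.4 Mean-variance efficient returns and pricing factors
We have introduced the mean-variance frontier earlier in Sections 4.6.3 and 6.2.5. Here we provide an alternative characterization of the mean-variance efficient returns and study the link between mean-variance efficiency and asset pricing theory.
As before let $R = (R_1, \dots, R_I)^\top$ denote the vector of gross returns on the $I$ assets, with $\mu = \mathrm{E}[R]$ and $\Sigma = \mathrm{Var}[R]$. A portfolio $\pi$ is mean-variance efficient if, for some $m$, it solves
$$ \min_{\pi}\ \pi^\top \Sigma \pi \quad \text{s.t.} \quad \pi^\top \mu = m, \quad \pi^\top \mathbf{1} = 1, \tag{6.21} $$
i.e. $\pi$ has the lowest return variance among all portfolios with expected return $m$.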
9.4.1 Orthogonal characterization
Following Hansen and Richard (1987) we will show that a return $R$ is mean-variance efficient if and only if it can be written in the form $R = R^* + w R^{e*}$ for some number $w$. Here $R^*$ is the return on the portfolio corresponding to the dividend $\zeta^*$. The associated portfolio weight vector is
$$ \pi^* = \frac{(\mathrm{E}[RR^\top])^{-1} \mathbf{1}}{\mathbf{1}^\top (\mathrm{E}[RR^\top])^{-1} \mathbf{1}} \tag{4.53} $$
and its gross return is
$$ R^* = (\pi^*)^\top R = \frac{\mathbf{1}^\top (\mathrm{E}[RR^\top])^{-1} R}{\mathbf{1}^\top (\mathrm{E}[RR^\top])^{-1} \mathbf{1}}. \tag{4.54} $$
$R^{e*}$ is a particular excess return to be defined shortly. This characterization of the mean-variance portfolios turns out to be preferable for discussing the link between mean-variance analysis and asset pricing models.
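How does $R^*$ fit into the mean-variance framework? The following lemma shows that it is a mean-variance efficient return on the downward-sloping part of the mean-variance frontier. Furthermore, it is the return with minimum second moment. (Recall that the second moment of a random variable $x$ is $\mathrm{E}[x^2]$.)
Lemma 9.2 $R^*$ is the return with minimum second moment, and it is a mean-variance efficient return located on the downward-sloping part of the mean-variance frontier.
Proof: The second moment of the return on a portfolio $\pi$ is $\mathrm{E}[(R^\pi)^2] = \pi^\top \mathrm{E}[RR^\top] \pi$. Consider the minimization problem
$$ \min_{\pi}\ \pi^\top \mathrm{E}[RR^\top] \pi \quad \text{s.t.} \quad \pi^\top \mathbf{1} = 1. $$
The Lagrangian is $\mathcal{L} = \pi^\top \mathrm{E}[RR^\top] \pi + \nu (1 - \pi^\top \mathbf{1})$, where $\nu$ is the Lagrange multiplier, with first-order condition
$$ 2\,\mathrm{E}[RR^\top] \pi - \nu \mathbf{1} = 0 \quad \Rightarrow \quad \pi = \frac{\nu}{2} (\mathrm{E}[RR^\top])^{-1} \mathbf{1}. $$
Imposing the constraint $\pi^\top \mathbf{1} = 1$, we get $\nu/2 = 1/\bigl(\mathbf{1}^\top (\mathrm{E}[RR^\top])^{-1} \mathbf{1}\bigr)$, so the solution is the portfolio $\pi^*$ in (4.53).
Since $\pi^*$ is the portfolio minimizing the second moment of gross returns among all portfolios, it is also the portfolio minimizing the second moment of gross returns among the portfolios with the same mean return as $\pi^*$, i.e. the portfolios $\pi$ with $\mathrm{E}[R^\pi] = \mathrm{E}[R^*]$. For these portfolios
$$ \mathrm{Var}[R^\pi] = \mathrm{E}[(R^\pi)^2] - (\mathrm{E}[R^\pi])^2 = \mathrm{E}[(R^\pi)^2] - (\mathrm{E}[R^*])^2. $$
It then follows that $\pi^*$ is the portfolio minimizing the variance of return among all the portfolios having the same mean return as $\pi^*$. Hence, $\pi^*$ is a mean-variance efficient portfolio. In a (standard deviation, mean)-diagram, returns with the same second moment $K = \mathrm{E}[(R^\pi)^2]$ yield points on a circle with radius $\sqrt{K}$ centered in (0,0), since $(\sigma[R^\pi])^2 + (\mathrm{E}[R^\pi])^2 = \mathrm{E}[(R^\pi)^2]$. The return $R^*$ corresponds to the point where the smallest such circle touches the mean-variance frontier, which is on the downward-sloping part of the frontier. □
The constant-mimicking return
Consider the portfolio whose dividend comes as close as possible to the constant 1 in the sense of minimizing the expected squared deviation. For a portfolio of $\theta = (\theta_1, \dots, \theta_I)^\top$ units of the assets, with dividend $\theta^\top D$, we have
$$ \mathrm{E}\bigl[(\theta^\top D - 1)^2\bigr] = \mathrm{E}[\theta^\top DD^\top \theta + 1 - 2\theta^\top D] = \theta^\top \mathrm{E}[DD^\top] \theta + 1 - 2\theta^\top \mathrm{E}[D]. $$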
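Minimizing with respect to $\theta$, we get $\theta_{cm} = (\mathrm{E}[DD^\top])^{-1}\,\mathrm{E}[D]$ where the subscript cm is short for constant-mimicking. Let us transform this to a vector $\pi_{cm}$ of portfolio weights by using (3.6). Applying (3.2) and (4.52), we get
$$ \begin{aligned} \mathrm{diag}(P)\,\theta_{cm} &= \mathrm{diag}(P)\,(\mathrm{E}[DD^\top])^{-1}\,\mathrm{E}[D] \\ &= \mathrm{diag}(P)\,[\mathrm{diag}(P)]^{-1} (\mathrm{E}[RR^\top])^{-1} [\mathrm{diag}(P)]^{-1}\,\mathrm{E}[D] \\ &= (\mathrm{E}[RR^\top])^{-1}\,\mathrm{E}[R] \end{aligned} $$
so that
$$ \pi_{cm} = \frac{1}{\mathbf{1}^\top (\mathrm{E}[RR^\top])^{-1}\,\mathrm{E}[R]}\,(\mathrm{E}[RR^\top])^{-1}\,\mathrm{E}[R] = \frac{\mathrm{E}[(R^*)^2]}{\mathrm{E}[R^*]}\,(\mathrm{E}[RR^\top])^{-1}\,\mathrm{E}[R], \tag{9.16} $$
where the last equality comes from (4.58). The constant-mimicking return is thus
$$ R^{cm} = (\pi_{cm})^\top R = \frac{\mathrm{E}[(R^*)^2]}{\mathrm{E}[R^*]}\,\mathrm{E}[R]^\top (\mathrm{E}[RR^\top])^{-1} R. \tag{9.17} $$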
Excess returns and $R^{e*}$
An excess return is simply the difference between two returns. Since any return corresponds to a dividend for a unit initial payment, an excess return can be seen as a dividend for a zero initial payment. Of course, in absence of arbitrage, a non-zero excess return will turn out positive in some states and negative in other states.
Typically, excess returns on different portfolios relative to the same reference return are considered. For a reference return $\bar{R} = \bar{\pi}^\top R$, the set of excess returns relative to $\bar{R}$ is
$$ \bigl\{ \pi^\top R - \bar{R} \;\big|\; \pi^\top \mathbf{1} = 1 \bigr\}, $$
where as before $R$ is the I-dimensional vector of returns on the basis assets, and $\pi$ denotes a portfolio weight vector of these $I$ assets.
It is useful to observe that the set of excess returns is the same for all reference returns. An excess return relative to a reference return $\bar{R}$ is given by
$$ \pi^\top R - \bar{R} = (\pi - \bar{\pi})^\top R $$
for some portfolio $\pi$. The same excess return is obtained relative to any other reference return $\hat{R} = \hat{\pi}^\top R$ using the portfolio $\pi + \hat{\pi} - \bar{\pi}$. (Check for yourself!) Hence, we can simply talk about the set of excess returns, $\mathcal{R}^e$, without specifying any reference return, and we can write it as
$$ \mathcal{R}^e = \bigl\{ (\pi^e)^\top R \;\big|\; (\pi^e)^\top \mathbf{1} = 0 \bigr\}. $$
The set of excess returns is a linear subspace of the set of all random variables (in the case with $S$ possible outcomes this is equivalent to $\mathbb{R}^S$) in the sense that
1. if $w$ is a number and $R^e$ is an excess return, then $wR^e$ is an excess return;
Check: $R^e = (\pi^e)^\top R$ with $(\pi^e)^\top \mathbf{1} = 0$ implies that $wR^e = (w\pi^e)^\top R$ with $(w\pi^e)^\top \mathbf{1} = w(\pi^e)^\top \mathbf{1} = 0$.
2. if $R^{e1}$ and $R^{e2}$ are two excess returns, then $R^{e1} + R^{e2}$ is also an excess return.
Check: $R^{ei} = (\pi^{ei})^\top R$ with $(\pi^{ei})^\top \mathbf{1} = 0$ implies that $R^{e1} + R^{e2} = (\pi^{e1} + \pi^{e2})^\top R$, where $(\pi^{e1} + \pi^{e2})^\top \mathbf{1} = 0$.
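Define
$$ R^{e*} = \frac{\mathrm{E}[R^*]}{\mathrm{E}[(R^*)^2]} \bigl( R^{cm} - R^* \bigr) = \mathrm{E}[R]^\top (\mathrm{E}[RR^\top])^{-1} R - \frac{\mathrm{E}[R^*]}{\mathrm{E}[(R^*)^2]}\,R^*. \tag{9.18} $$
Of course, $R^{cm} - R^*$ is an excess return and so is any multiple of it, in particular $R^{e*}$. The following lemma collects some useful properties.
Lemma 9.3 (a) For any excess return $R^e$ we have
$$ \mathrm{E}[R^* R^e] = 0. \tag{9.19} $$
In particular,
$$ \mathrm{E}[R^{e*} R^*] = 0. \tag{9.20} $$
(b) For any excess return $R^e$ we have
$$ \mathrm{E}[R^e] = \mathrm{E}[R^{e*} R^e]. \tag{9.21} $$
(c) $\mathrm{E}[R^{e*}] = \mathrm{E}[(R^{e*})^2]$ and $\mathrm{Var}[R^{e*}] = \mathrm{E}[R^{e*}]\,(1 - \mathrm{E}[R^{e*}])$.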
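Proof: (a) For any return $R_i$, we have $\mathrm{E}[R^* R_i] = \mathrm{E}[(R^*)^2]$ (cf. Lemma 4.1), and hence for any excess return $R^e = R_i - R_j$, we have
$$ \mathrm{E}[R^* R^e] = \mathrm{E}[R^* (R_i - R_j)] = \mathrm{E}[R^* R_i] - \mathrm{E}[R^* R_j] = \mathrm{E}[(R^*)^2] - \mathrm{E}[(R^*)^2] = 0. $$
In particular, this is true for the excess return $R^{e*}$.
(b) Write the excess return $R^e$ as the difference between the return on some portfolio $\pi$ and $R^*$, i.e. $R^e = \pi^\top R - R^*$. Then
$$ \mathrm{E}[R^{e*} R^e] = \mathrm{E}[R^{e*} (\pi^\top R - R^*)] = \pi^\top \mathrm{E}[R^{e*} R] - \mathrm{E}[R^{e*} R^*] = \pi^\top \mathrm{E}[R^{e*} R]. \tag{9.22} $$
Using (9.18), we get
$$ \mathrm{E}[R^{e*} R] = \mathrm{E}\Bigl[ \bigl( \mathrm{E}[R]^\top (\mathrm{E}[RR^\top])^{-1} R \bigr) R \Bigr] - \frac{\mathrm{E}[R^*]}{\mathrm{E}[(R^*)^2]}\,\mathrm{E}[R^* R]. \tag{9.23} $$
Let us first consider the last part. From Lemma 4.1, we have $\mathrm{E}[R^* R_i] = \mathrm{E}[(R^*)^2]$ for each $i$ so that $\mathrm{E}[R^* R] = \mathrm{E}[(R^*)^2]\,\mathbf{1}$. Now look at the first part of (9.23). It can be shown in general that, for any vector $x$, $\mathrm{E}[(x^\top R) R] = \mathrm{E}[RR^\top]\,x$. With $x = (\mathrm{E}[RR^\top])^{-1}\,\mathrm{E}[R]$ we obtain
$$ \mathrm{E}\Bigl[ \bigl( \mathrm{E}[R]^\top (\mathrm{E}[RR^\top])^{-1} R \bigr) R \Bigr] = \mathrm{E}[RR^\top]\,(\mathrm{E}[RR^\top])^{-1}\,\mathrm{E}[R] = \mathrm{E}[R]. $$
We can now rewrite (9.23):
$$ \mathrm{E}[R^{e*} R] = \mathrm{E}[R] - \mathrm{E}[R^*]\,\mathbf{1}. $$
Going back to (9.22), we have
$$ \mathrm{E}[R^{e*} R^e] = \pi^\top \mathrm{E}[R^{e*} R] = \pi^\top \bigl( \mathrm{E}[R] - \mathrm{E}[R^*]\,\mathbf{1} \bigr) = \pi^\top \mathrm{E}[R] - \mathrm{E}[R^*] = \mathrm{E}[R^e], $$
as had to be shown.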
(c) The first part follows immediately from (b). The second part comes from
$$ \mathrm{Var}[R^{e*}] = \mathrm{E}[(R^{e*})^2] - (\mathrm{E}[R^{e*}])^2 = \mathrm{E}[R^{e*}] - (\mathrm{E}[R^{e*}])^2 = \mathrm{E}[R^{e*}]\,(1 - \mathrm{E}[R^{e*}]) $$
where we have used the first part. □
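The properties in Lemma 9.3 are purely algebraic consequences of the definitions, so they can be checked in a small example. The sketch below builds $R^*$ and $R^{e*}$ from the second-moment matrix of two assets in a three-state economy and evaluates (9.19), (9.21), and part (c). The probabilities and returns are hypothetical, and the explicit 2x2 inverse is used only to keep the example free of external libraries.

```python
# Hypothetical two-asset, three-state example; all numbers are illustrative.
probs = [0.3, 0.4, 0.3]
R1 = [0.90, 1.05, 1.25]
R2 = [1.30, 1.00, 0.85]


def mean(v):
    return sum(p * s for p, s in zip(probs, v))


def times(u, v):
    """State-by-state product of two random variables."""
    return [a * b for a, b in zip(u, v)]


# Second-moment matrix M = E[R R^T] and its inverse (2x2 case).
m11, m12, m22 = mean(times(R1, R1)), mean(times(R1, R2)), mean(times(R2, R2))
det = m11 * m22 - m12 * m12
inv = [[m22 / det, -m12 / det], [-m12 / det, m11 / det]]

# pi* is proportional to M^{-1} 1 and normalized to sum to one, cf. (4.53).
u = [inv[0][0] + inv[0][1], inv[1][0] + inv[1][1]]
pi_star = [u[0] / (u[0] + u[1]), u[1] / (u[0] + u[1])]
R_star = [pi_star[0] * a + pi_star[1] * b for a, b in zip(R1, R2)]

# R^{e*} = E[R]^T M^{-1} R - (E[R*] / E[(R*)^2]) R*, cf. (9.18).
mu = [mean(R1), mean(R2)]
v = [inv[0][0] * mu[0] + inv[0][1] * mu[1], inv[1][0] * mu[0] + inv[1][1] * mu[1]]
scale = mean(R_star) / mean(times(R_star, R_star))
R_e_star = [v[0] * a + v[1] * b - scale * c for a, b, c in zip(R1, R2, R_star)]

R_e = [a - b for a, b in zip(R1, R2)]   # an arbitrary excess return

print(mean(times(R_star, R_e)))                           # (9.19): close to zero
print(mean(R_e), mean(times(R_e_star, R_e)))              # (9.21): the two agree
print(mean(R_e_star), mean(times(R_e_star, R_e_star)))    # part (c): the two agree
```

With only two assets every return lies on the frontier, so this example illustrates the lemma but not the residual term of the decomposition that follows.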
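A characterization of the mean-variance frontier in terms of $R^*$ and $R^{e*}$
First we show that any return can be decomposed using $R^*$, $R^{e*}$, and some "residual" excess return.
Theorem 9.6 For any return $R_i$, we can find a number $w_i$ and an excess return $\varepsilon_i \in \mathcal{R}^e$ so that
$$ R_i = R^* + w_i R^{e*} + \varepsilon_i \tag{9.24} $$
and
$$ \mathrm{E}[\varepsilon_i] = \mathrm{E}[R^* \varepsilon_i] = \mathrm{E}[R^{e*} \varepsilon_i] = 0. \tag{9.25} $$
Proof: Define $w_i = (\mathrm{E}[R_i] - \mathrm{E}[R^*])/\mathrm{E}[R^{e*}]$ and $\varepsilon_i = R_i - R^* - w_i R^{e*}$. Then (9.24) and $\mathrm{E}[\varepsilon_i] = 0$ hold by construction. $\varepsilon_i$ is the difference between the two excess returns $R_i - R^*$ and $w_i R^{e*}$ and therefore itself an excess return. Now $\mathrm{E}[R^* \varepsilon_i] = 0$ follows from (9.19) and from (9.21) we have $\mathrm{E}[\varepsilon_i R^{e*}] = \mathrm{E}[\varepsilon_i]$, which we know is zero. □
Due to the relations $\mathrm{E}[R^* R^{e*}] = \mathrm{E}[R^* \varepsilon_i] = \mathrm{E}[R^{e*} \varepsilon_i] = 0$, the decomposition is said to be orthogonal.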
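Note that the same $w_i$ applies for all returns with the same expected value. The return variance is
$$ \begin{aligned} \mathrm{Var}[R_i] &= \mathrm{Var}[R^* + w_i R^{e*}] + \mathrm{Var}[\varepsilon_i] + 2\,\mathrm{Cov}[R^* + w_i R^{e*}, \varepsilon_i] \\ &= \mathrm{Var}[R^* + w_i R^{e*}] + \mathrm{Var}[\varepsilon_i] + 2 \bigl\{ \mathrm{E}[(R^* + w_i R^{e*}) \varepsilon_i] - \mathrm{E}[R^* + w_i R^{e*}]\,\mathrm{E}[\varepsilon_i] \bigr\} \\ &= \mathrm{Var}[R^* + w_i R^{e*}] + \mathrm{Var}[\varepsilon_i]. \end{aligned} $$
Clearly the minimum variance for a given mean, i.e. a given $w_i$, is obtained for $\varepsilon_i = 0$. We therefore have the following result: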
Theorem 9.7 A return $R_i$ is mean-variance efficient if and only if it can be written as
$$ R_i = R^* + w_i R^{e*} $$
for some number $w_i$.
Varying $w_i$ from $-\infty$ to $+\infty$, $R_i = R^* + w_i R^{e*}$ runs through the entire mean-variance frontier in the direction of higher and higher expected returns (since $\mathrm{E}[R^{e*}] = \mathrm{E}[(R^{e*})^2] > 0$).
Note that from (9.18) the constant-mimicking return can be written as
$$ R^{cm} = R^* + \frac{\mathrm{E}[(R^*)^2]}{\mathrm{E}[R^*]}\,R^{e*} \tag{9.26} $$
so the constant-mimicking return is mean-variance efficient.
Allowing for a risk-free asset
Now let us assume that a risk-free asset with return $R^f$ exists (or can be constructed as a portfolio of the basic assets). Then from Lemma 4.1 we have that $R^f\,\mathrm{E}[R^*] = \mathrm{E}[R^* R^f] = \mathrm{E}[(R^*)^2]$, and hence
$$ R^f = \frac{\mathrm{E}[(R^*)^2]}{\mathrm{E}[R^*]} = \frac{1}{\mathbf{1}^\top (\mathrm{E}[RR^\top])^{-1}\,\mathrm{E}[R]}. \tag{9.27} $$
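In addition we have
$$ \mathrm{E}[R]^\top (\mathrm{E}[RR^\top])^{-1} R = 1. \tag{9.28} $$
Let us just show this for the case with two assets, a risk-free and a risky so that $R = (\tilde{R}, R^f)^\top$, where $\tilde{R}$ is the return on the risky asset. Then
$$ \begin{aligned} \mathrm{E}[R]^\top (\mathrm{E}[RR^\top])^{-1} R &= \bigl( \mathrm{E}[\tilde{R}],\ R^f \bigr) \begin{pmatrix} \mathrm{E}[\tilde{R}^2] & R^f\,\mathrm{E}[\tilde{R}] \\ R^f\,\mathrm{E}[\tilde{R}] & (R^f)^2 \end{pmatrix}^{-1} \begin{pmatrix} \tilde{R} \\ R^f \end{pmatrix} \\ &= \frac{1}{(R^f)^2 \bigl( \mathrm{E}[\tilde{R}^2] - (\mathrm{E}[\tilde{R}])^2 \bigr)} \bigl( \mathrm{E}[\tilde{R}],\ R^f \bigr) \begin{pmatrix} (R^f)^2 & -R^f\,\mathrm{E}[\tilde{R}] \\ -R^f\,\mathrm{E}[\tilde{R}] & \mathrm{E}[\tilde{R}^2] \end{pmatrix} \begin{pmatrix} \tilde{R} \\ R^f \end{pmatrix} \\ &= \frac{1}{(R^f)^2 \bigl( \mathrm{E}[\tilde{R}^2] - (\mathrm{E}[\tilde{R}])^2 \bigr)} \bigl( \mathrm{E}[\tilde{R}],\ R^f \bigr) \begin{pmatrix} (R^f)^2 \bigl( \tilde{R} - \mathrm{E}[\tilde{R}] \bigr) \\ R^f \bigl( \mathrm{E}[\tilde{R}^2] - \mathrm{E}[\tilde{R}]\,\tilde{R} \bigr) \end{pmatrix} \\ &= 1. \end{aligned} $$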
Substituting (9.27) and (9.28) into the general definition of $R^{e*}$ in (9.18), we get
$$ R^{e*} = 1 - \frac{1}{R^f}\,R^*. \tag{9.29} $$
Consequently,
$$ R^f = R^* + R^f R^{e*} \tag{9.30} $$
so the $w_i$ corresponding to the risk-free return is $R^f$ itself. With $R^f > 1$, we see that $R^* + R^{e*}$ corresponds to a point on the frontier below the risk-free rate. (Again we use the fact that $\mathrm{E}[R^{e*}] > 0$.)
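The minimum-variance return
With the decomposition in Theorem 9.7, it is easy to find the minimum-variance return. The variance of any mean-variance efficient return is
$$ \begin{aligned} \mathrm{Var}[R^* + w R^{e*}] &= \mathrm{E}\bigl[ (R^* + w R^{e*})^2 \bigr] - \bigl( \mathrm{E}[R^* + w R^{e*}] \bigr)^2 \\ &= \mathrm{E}\bigl[ (R^*)^2 \bigr] + w^2\,\mathrm{E}\bigl[ (R^{e*})^2 \bigr] + 2w\,\mathrm{E}[R^* R^{e*}] \\ &\quad - (\mathrm{E}[R^*])^2 - w^2 (\mathrm{E}[R^{e*}])^2 - 2w\,\mathrm{E}[R^*]\,\mathrm{E}[R^{e*}] \\ &= \mathrm{E}\bigl[ (R^*)^2 \bigr] + w^2\,\mathrm{E}[R^{e*}]\,(1 - \mathrm{E}[R^{e*}]) - (\mathrm{E}[R^*])^2 - 2w\,\mathrm{E}[R^*]\,\mathrm{E}[R^{e*}], \end{aligned} $$
where the simplifications leading to the last expression are due to Lemma 9.3. The first-order condition with respect to $w$ implies that
$$ w = \frac{\mathrm{E}[R^*]}{1 - \mathrm{E}[R^{e*}]}. $$
The minimum-variance return is thus
$$ R^{min} = R^* + \frac{\mathrm{E}[R^*]}{1 - \mathrm{E}[R^{e*}]}\,R^{e*} = R^* + \frac{\mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]}{\mathrm{Var}[R^{e*}]}\,R^{e*}. \tag{9.31} $$
When a risk-free asset exists, this simplifies to $R^{min} = R^f$, because $\mathrm{E}[R^*]/(1 - \mathrm{E}[R^{e*}]) = R^f$ using (9.29).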
9.4.2 Link between mean-variance efficient returns and state-price deflators
Theorem 9.8 Let $R$ denote a gross return. Then there exists $a, b \in \mathbb{R}$ so that $\zeta = a + bR$ satisfies $P_i = \mathrm{E}[\zeta D_i]$ for $i = 1, \dots, I$ if and only if $R$ is a mean-variance efficient return different from the constant-mimicking return.
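Proof: According to Theorem 9.6 we can decompose the return $R$ as
$$ R = R^* + w R^{e*} + \varepsilon $$
for some $w \in \mathbb{R}$ and some excess return $\varepsilon$ with $\mathrm{E}[\varepsilon] = \mathrm{E}[R^* \varepsilon] = \mathrm{E}[R^{e*} \varepsilon] = 0$. $R$ is mean-variance efficient if and only if $\varepsilon = 0$. We need to show that, for suitable $a, b \in \mathbb{R}$,
$$ \zeta = a + bR = a + b\,(R^* + w R^{e*} + \varepsilon) $$
will satisfy $P_i = \mathrm{E}[\zeta D_i]$ for all $i$ if and only if $\varepsilon = 0$ and $w \neq \mathrm{E}[(R^*)^2]/\mathrm{E}[R^*]$.
Recall that $P_i = \mathrm{E}[\zeta D_i]$ for all $i$ implies that $\mathrm{E}[\zeta R_i] = 1$ for all returns $R_i$ and $\mathrm{E}[\zeta R^e] = 0$ for all excess returns $R^e$. In particular,
$$ \begin{aligned} 1 &= \mathrm{E}[\zeta R^*] = \mathrm{E}\bigl[ (a + b\,(R^* + w R^{e*} + \varepsilon))\,R^* \bigr] = a\,\mathrm{E}[R^*] + b\,\mathrm{E}\bigl[ (R^*)^2 \bigr], \\ 0 &= \mathrm{E}[\zeta R^{e*}] = \mathrm{E}\bigl[ (a + b\,(R^* + w R^{e*} + \varepsilon))\,R^{e*} \bigr] = a\,\mathrm{E}[R^{e*}] + bw\,\mathrm{E}\bigl[ (R^{e*})^2 \bigr] = (a + bw)\,\mathrm{E}[R^{e*}]. \end{aligned} $$
Solving for $a$ and $b$, we get
$$ a = \frac{w}{w\,\mathrm{E}[R^*] - \mathrm{E}[(R^*)^2]}, \qquad b = -\frac{1}{w\,\mathrm{E}[R^*] - \mathrm{E}[(R^*)^2]} $$
so that
$$ \zeta = \frac{w - (R^* + w R^{e*} + \varepsilon)}{w\,\mathrm{E}[R^*] - \mathrm{E}[(R^*)^2]}. $$
To avoid division by zero we have to assume that $w\,\mathrm{E}[R^*] \neq \mathrm{E}[(R^*)^2]$, which according to (9.26) means that the efficient part of $R$ is different from the constant-mimicking return. Now consider any return $R_i$ and decompose it as $R_i = R^* + w_i R^{e*} + \varepsilon_i$. Then
$$ \begin{aligned} \mathrm{E}[\zeta R_i] &= \frac{1}{w\,\mathrm{E}[R^*] - \mathrm{E}[(R^*)^2]}\,\mathrm{E}\bigl[ \bigl( w - (R^* + w R^{e*} + \varepsilon) \bigr) (R^* + w_i R^{e*} + \varepsilon_i) \bigr] \\ &= \frac{1}{w\,\mathrm{E}[R^*] - \mathrm{E}[(R^*)^2]} \Bigl( w\,\mathrm{E}[R^*] - \mathrm{E}\bigl[ (R^*)^2 \bigr] - \mathrm{E}[\varepsilon \varepsilon_i] \Bigr), \end{aligned} $$
where we have applied various results from earlier. We can now see that we will have $\mathrm{E}[\zeta R_i] = 1$ for all returns $R_i$ if and only if $\mathrm{E}[\varepsilon \varepsilon_i] = 0$ for all excess returns $\varepsilon_i$. In particular $\mathrm{E}[\varepsilon^2] = 0$, which together with $\mathrm{E}[\varepsilon] = 0$ implies that $\varepsilon = 0$. □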
9.4.3 Link between mean-variance efficient returns and pricing factors
Theorem 9.9 A return $R^{mv}$ is a pricing factor, i.e. an $\alpha \in \mathbb{R}$ exists so that
$$ \mathrm{E}[R_i] = \alpha + \beta[R_i, R^{mv}]\,\bigl( \mathrm{E}[R^{mv}] - \alpha \bigr), \quad i = 1, \dots, I, \tag{9.32} $$
if and only if $R^{mv}$ is a mean-variance efficient return different from the minimum-variance return.
The constant $\alpha$ must then be equal to the zero-beta return corresponding to $R^{mv}$, which is equal to the risk-free return if a risk-free asset exists.
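Proof: ("If" part.) First, let us show that if $R^{mv}$ is mean-variance efficient and different from the minimum-variance return, it will work as a pricing factor. This result is originally due to Roll (1977). For some $w$, we have $R^{mv} = R^* + w R^{e*}$. Consider a general return $R_i$ and decompose as $R_i = R^* + w_i R^{e*} + \varepsilon_i$ as in Theorem 9.6. Then
$$ \mathrm{E}[R_i] = \mathrm{E}[R^*] + w_i\,\mathrm{E}[R^{e*}] $$
and
$$ \begin{aligned} \mathrm{Cov}[R_i, R^{mv}] &= \mathrm{Var}[R^*] + w w_i\,\mathrm{Var}[R^{e*}] + (w + w_i)\,\mathrm{Cov}[R^*, R^{e*}] \\ &= \mathrm{Var}[R^*] + w w_i\,\mathrm{Var}[R^{e*}] - (w + w_i)\,\mathrm{E}[R^*]\,\mathrm{E}[R^{e*}] \end{aligned} $$
which implies that
$$ \mathrm{Cov}[R_i, R^{mv}] - \mathrm{Var}[R^*] + w\,\mathrm{E}[R^*]\,\mathrm{E}[R^{e*}] = w_i \bigl( w\,\mathrm{Var}[R^{e*}] - \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}] \bigr). $$
If $w \neq \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]/\mathrm{Var}[R^{e*}]$, which according to (9.31) means that $R^{mv}$ is different from the minimum-variance return, then we can solve the above equation for $w_i$ with the solution
$$ w_i = \frac{\mathrm{Cov}[R_i, R^{mv}] - \mathrm{Var}[R^*] + w\,\mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]}{w\,\mathrm{Var}[R^{e*}] - \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]}. $$
Hence
$$ \begin{aligned} \mathrm{E}[R_i] &= \mathrm{E}[R^*] + w_i\,\mathrm{E}[R^{e*}] \\ &= \mathrm{E}[R^*] + \frac{\mathrm{Cov}[R_i, R^{mv}] - \mathrm{Var}[R^*] + w\,\mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]}{w\,\mathrm{Var}[R^{e*}] - \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]}\,\mathrm{E}[R^{e*}] \\ &= \alpha + \frac{\mathrm{Cov}[R_i, R^{mv}]}{w\,\mathrm{Var}[R^{e*}] - \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]}\,\mathrm{E}[R^{e*}] \end{aligned} $$
where we have defined the constant $\alpha$ as
$$ \alpha = \mathrm{E}[R^*] + \frac{w\,\mathrm{E}[R^*]\,\mathrm{E}[R^{e*}] - \mathrm{Var}[R^*]}{w\,\mathrm{Var}[R^{e*}] - \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]}\,\mathrm{E}[R^{e*}]. $$
This applies to any return $R_i$ and in particular to $R^{mv}$ itself, which implies that
$$ \mathrm{E}[R^{mv}] = \alpha + \frac{\mathrm{Var}[R^{mv}]}{w\,\mathrm{Var}[R^{e*}] - \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]}\,\mathrm{E}[R^{e*}] $$
and hence
$$ \frac{\mathrm{E}[R^{e*}]}{w\,\mathrm{Var}[R^{e*}] - \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]} = \frac{\mathrm{E}[R^{mv}] - \alpha}{\mathrm{Var}[R^{mv}]}. $$
Substituting this back in, we get
$$ \mathrm{E}[R_i] = \alpha + \frac{\mathrm{Cov}[R_i, R^{mv}]}{\mathrm{Var}[R^{mv}]}\,\bigl( \mathrm{E}[R^{mv}] - \alpha \bigr) = \alpha + \beta[R_i, R^{mv}]\,\bigl( \mathrm{E}[R^{mv}] - \alpha \bigr), $$
which verifies that $R^{mv}$ is a valid pricing factor.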
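The "if" part can be illustrated in the simplest possible setting. With only two assets that have different expected returns, each target mean is attained by exactly one portfolio, so every portfolio return is trivially mean-variance efficient. The sketch below picks one such portfolio, solves (9.32) for $\alpha$ using the first asset only, and then checks that the same $\alpha$ prices the second asset. The probabilities, returns, and portfolio weight are hypothetical, and the chosen portfolio is assumed not to be the minimum-variance one (equivalently, the first asset's beta differs from one).

```python
# Hypothetical two-asset, three-state example; all numbers are illustrative.
probs = [0.3, 0.4, 0.3]
R1 = [0.90, 1.05, 1.25]
R2 = [1.30, 1.00, 0.85]


def mean(v):
    return sum(p * s for p, s in zip(probs, v))


def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum(p * (a - mu) * (b - mv) for p, a, b in zip(probs, u, v))


weight = 0.6   # assumed weight on asset 1 in the candidate factor portfolio
R_mv = [weight * a + (1.0 - weight) * b for a, b in zip(R1, R2)]


def beta(r):
    """beta[R_i, R_mv] = Cov[R_i, R_mv] / Var[R_mv]."""
    return cov(r, R_mv) / cov(R_mv, R_mv)


# Solve (9.32) for alpha from asset 1 alone; this requires beta(R1) != 1.
alpha = (mean(R1) - beta(R1) * mean(R_mv)) / (1.0 - beta(R1))


def predicted_mean(r):
    """Right-hand side of (9.32)."""
    return alpha + beta(r) * (mean(R_mv) - alpha)


print(mean(R2), predicted_mean(R2))   # the two numbers agree
```

With three or more assets the check is no longer automatic: an arbitrary portfolio will generally fail it, and only frontier portfolios pass.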
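("Only if" part.) Next, let us show that if a return works as a pricing factor, it must be mean-variance efficient and different from the minimum-variance return. This proof is due to Hansen and Richard (1987). Assume that $R^{mv}$ is a pricing factor. Decomposing as in Theorem 9.6
$$ R^{mv} = R^* + w R^{e*} + \varepsilon, $$
we need to show that $\varepsilon = 0$ (so that $R^{mv}$ is mean-variance efficient) and that $w \neq \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]/\mathrm{Var}[R^{e*}]$ (so that $R^{mv}$ is different from the minimum-variance return).
Define a new return $R$ as the "efficient part" of $R^{mv}$, i.e.
$$ R = R^* + w R^{e*}. $$
Since
$$ \mathrm{Cov}[\varepsilon, R^{mv}] = \mathrm{E}[\varepsilon R^{mv}] - \mathrm{E}[\varepsilon]\,\mathrm{E}[R^{mv}] = \mathrm{E}[\varepsilon R^{mv}] = \mathrm{E}[\varepsilon^2], $$
we get
$$ \mathrm{Cov}[R, R^{mv}] = \mathrm{Cov}[R^{mv} - \varepsilon, R^{mv}] = \mathrm{Cov}[R^{mv}, R^{mv}] - \mathrm{Cov}[\varepsilon, R^{mv}] = \mathrm{Var}[R^{mv}] - \mathrm{E}[\varepsilon^2]. $$
On the other hand, $\mathrm{E}[R] = \mathrm{E}[R^{mv}]$ so applying (9.32) for $R_i = R$, we obtain
$$ \mathrm{E}[R^{mv}] = \mathrm{E}[R] = \alpha + \frac{\mathrm{Cov}[R, R^{mv}]}{\mathrm{Var}[R^{mv}]}\,\bigl( \mathrm{E}[R^{mv}] - \alpha \bigr) $$
and hence $\mathrm{Cov}[R, R^{mv}] = \mathrm{Var}[R^{mv}]$. We conclude that $\mathrm{E}[\varepsilon^2] = 0$, which together with $\mathrm{E}[\varepsilon] = 0$ implies $\varepsilon = 0$.
Suppose that $w = \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}]/\mathrm{Var}[R^{e*}]$ and define a new gross return
$$ R = R^{mv} + \frac{1}{\mathrm{E}[R^{e*}]}\,R^{e*}. $$
Clearly, $\mathrm{E}[R] = \mathrm{E}[R^{mv}] + 1$, and furthermore
$$ \mathrm{Cov}[R, R^{mv}] = \mathrm{Var}[R^{mv}] + \frac{1}{\mathrm{E}[R^{e*}]}\,\mathrm{Cov}[R^{e*}, R^{mv}] = \mathrm{Var}[R^{mv}] $$
since
$$ \begin{aligned} \mathrm{Cov}[R^{e*}, R^{mv}] &= \mathrm{Cov}[R^{e*}, R^* + w R^{e*}] = \mathrm{Cov}[R^{e*}, R^*] + w\,\mathrm{Var}[R^{e*}] \\ &= \mathrm{Cov}[R^{e*}, R^*] + \mathrm{E}[R^*]\,\mathrm{E}[R^{e*}] = \mathrm{E}[R^{e*} R^*] = 0. \end{aligned} $$
Applying (9.32) to the return $R$, we get
$$ \mathrm{E}[R] = \alpha + \frac{\mathrm{Cov}[R, R^{mv}]}{\mathrm{Var}[R^{mv}]}\,\bigl( \mathrm{E}[R^{mv}] - \alpha \bigr) = \alpha + \bigl( \mathrm{E}[R^{mv}] - \alpha \bigr) = \mathrm{E}[R^{mv}], $$
which contradicts our earlier conclusion that $\mathrm{E}[R] = \mathrm{E}[R^{mv}] + 1$. Hence our assumption about $w$ cannot hold. □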
One implication of this theorem is that we can always find some returns that work as a pricing factor, namely the mean-variance efficient returns. Another implication is that the conclusion of the classical CAPM can be restated as "the market portfolio is mean-variance efficient".
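9.5 Pricing factors in a multi-period framework
In the one-period framework we defined a pricing factor to be a K-dimensional random variable $x$ such that there exists some $\alpha \in \mathbb{R}$ and some $\eta \in \mathbb{R}^K$ so that
$$ \mathrm{E}[R_i] = \alpha + \beta[R_i, x]^\top \eta, \quad i = 1, \dots, I, $$
where $\beta[R_i, x] = (\mathrm{Var}[x])^{-1}\,\mathrm{Cov}[x, R_i]$. We saw that any state-price deflator works as a pricing factor and, more generally, if $\zeta = a + b^\top x$ is a state-price deflator for some constants $a$ and $b$, then $x$ is a pricing factor.
In a discrete-time multi-period framework we will say that a K-dimensional adapted stochastic process $x = (x_t)$ is a conditional pricing factor if there exist adapted stochastic processes $\alpha = (\alpha_t)$ and $\eta = (\eta_t)$ so that
$$ \mathrm{E}_t[R_{i,t+1}] = \alpha_t + \beta_t[R_{i,t+1}, x_{t+1}]^\top \eta_t, \quad i = 1, \dots, I, \tag{9.33} $$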
for any $t = 0, 1, 2, \dots, T - 1$. Here, the conditional factor beta is defined as
$$ \beta_t[R_{i,t+1}, x_{t+1}] = (\mathrm{Var}_t[x_{t+1}])^{-1}\,\mathrm{Cov}_t[x_{t+1}, R_{i,t+1}]. \tag{9.34} $$
If a conditionally risk-free asset exists, then $\alpha_t = R^f_t$ implying that
$$ \mathrm{E}_t[R_{i,t+1}] = R^f_t + \beta_t[R_{i,t+1}, x_{t+1}]^\top \eta_t, \quad i = 1, \dots, I. \tag{9.35} $$
Suppose $x$ is a conditional pricing factor, and let $a = (a_t)$ be an adapted one-dimensional process and $A = (A_t)$ be an adapted process whose values $A_t$ are non-singular $K \times K$ matrices. Then $\hat{x}$ defined by
$$ \hat{x}_{t+1} = a_t + A_t x_{t+1} \tag{9.36} $$
will also be a conditional pricing factor.
If $\zeta = (\zeta_t)$ is a state-price deflator process, the one-period analysis implies that the ratios $\zeta_{t+1}/\zeta_t$ define a conditional pricing factor. Since $\zeta_{t+1} = 0 + \zeta_t\,(\zeta_{t+1}/\zeta_t)$ is a transformation of the form (9.36), we see that any state-price deflator is a conditional pricing factor. As in the one-period case, we will have that if $\zeta_{t+1} = a_t + b_t^\top x_{t+1}$ is a state-price deflator for some adapted process $x$, then $x$ is a conditional pricing factor. And for any conditional pricing factor $x$, we can find adapted processes $a = (a_t)$ and $b = (b_t)$ so that $\zeta_{t+1} = a_t + b_t^\top x_{t+1}$ defines a candidate state-price deflator (not necessarily positive, however).
We will say that a $K$-dimensional adapted stochastic process $x = (x_t)$ is an unconditional pricing factor if there exist constants $\alpha$ and $\eta$ so that
\[ \mathrm{E}[R_{i,t+1}] = \alpha + \beta[R_{i,t+1}, x_{t+1}]^\top \eta, \quad i = 1, \dots, I, \tag{9.37} \]
for any $t = 0, 1, 2, \dots, T-1$. Here, the unconditional factor beta is defined as
\[ \beta[R_{i,t+1}, x_{t+1}] = (\mathrm{Var}[x_{t+1}])^{-1}\,\mathrm{Cov}[x_{t+1}, R_{i,t+1}]. \tag{9.38} \]
This is true if the state-price deflator can be written as $\zeta_{t+1}/\zeta_t = a + b^\top x_{t+1}$ for constants $a$ and $b$. Hence, an unconditional pricing factor is also a conditional pricing factor. The converse is not true.
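As a rough illustration of (9.38), the unconditional beta of a one-dimensional factor can be estimated by replacing the variance and covariance with sample moments. The sketch below assumes a scalar factor; the name `sample_beta` and the data are hypothetical and only serve the illustration.

```python
def sample_beta(returns, factor):
    """Sample analogue of beta[R, x] = Cov[x, R] / Var[x] for a scalar factor."""
    n = len(returns)
    mean_r = sum(returns) / n
    mean_x = sum(factor) / n
    # Unbiased sample covariance and variance (the n - 1 cancels in the ratio).
    cov = sum((x - mean_x) * (r - mean_r) for x, r in zip(factor, returns)) / (n - 1)
    var = sum((x - mean_x) ** 2 for x in factor) / (n - 1)
    return cov / var


# Hypothetical data: gross returns that load on the factor with slope 2.
factor = [0.01, -0.02, 0.03, 0.00, 0.015]
returns = [1.0 + 2.0 * x for x in factor]
beta = sample_beta(returns, factor)
```

With returns that are an exact linear function of the factor, the estimate recovers the slope; with real data it is of course only an estimate.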
Testing factor models on actual data really requires an unconditional model since we need to replace expected returns by average returns, etc. To go from a conditional pricing factor model to an unconditional one, we need to link the variation in the coefficients over time to some observable variables. Suppose for example that $\zeta_{t+1}/\zeta_t = a_t + b_t x_{t+1}$ is a state-price deflator so that $x = (x_t)$ is a conditional pricing factor, assumed to be one-dimensional for notational simplicity. If we can write
\[ a_t = A_0 + A_1 y_t, \qquad b_t = B_0 + B_1 y_t \]
for some observable adapted process $y = (y_t)$, then
\[ \frac{\zeta_{t+1}}{\zeta_t} = A_0 + A_1 y_t + B_0 x_{t+1} + B_1 y_t x_{t+1}, \]
which defines a 3-dimensional unconditional pricing factor given by the vector $(y_t, x_{t+1}, y_t x_{t+1})^\top$.
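The step from time-varying coefficients to a scaled unconditional factor can be checked numerically. The following is a minimal sketch with made-up coefficient values; the function names are hypothetical and not taken from the text.

```python
def scaled_factor(y_t, x_next):
    """Return the 3-dimensional factor (y_t, x_{t+1}, y_t * x_{t+1})."""
    return (y_t, x_next, y_t * x_next)


def deflator_ratio_conditional(A0, A1, B0, B1, y_t, x_next):
    """zeta_{t+1}/zeta_t = a_t + b_t * x_{t+1} with coefficients linear in y_t."""
    a_t = A0 + A1 * y_t
    b_t = B0 + B1 * y_t
    return a_t + b_t * x_next


def deflator_ratio_unconditional(A0, A1, B0, B1, y_t, x_next):
    """Same ratio written with constant coefficients on the scaled factor."""
    f1, f2, f3 = scaled_factor(y_t, x_next)
    return A0 + A1 * f1 + B0 * f2 + B1 * f3


# Illustrative (assumed) numbers only.
args = (0.98, -0.10, -1.50, 0.40, 0.30, 0.05)
m_cond = deflator_ratio_conditional(*args)
m_uncond = deflator_ratio_unconditional(*args)
```

Both functions return the same number for any inputs, which is exactly the algebraic identity used in the text.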
Now let us turn to the continuous-time setting. By a $K$-dimensional conditional pricing factor in a continuous-time model we mean an adapted $K$-dimensional process $x = (x_t)$ with the property that there exist some one-dimensional adapted process $\alpha = (\alpha_t)$ and some $K$-dimensional adapted process $\eta = (\eta_t)$ such that for any asset $i$ (or trading strategy), the expected rate of return per time period satisfies
\[ \mu_{it} + \delta_{it} = \alpha_t + (\beta^{ix}_t)^\top \eta_t, \tag{9.39} \]
where again $\beta^{ix}_t$ is the factor-beta of asset $i$ at time $t$. To understand the factor-beta, write the price dynamics of risky assets in the usual form
\[ dP_{it} = P_{it}\left[\mu_{it}\,dt + \sigma_{it}^\top dz_t\right] \]
and the dynamics of $x$ as
\[ dx_t = \mu_{xt}\,dt + \sigma_{xt}\,dz_t, \]
where $\mu_x$ is an adapted process valued in $\mathbb{R}^K$ and $\sigma_x$ is an adapted process with values being $K \times d$ matrices. Then the factor-beta is defined as
\[ \beta^{ix}_t = \left(\sigma_{xt}\sigma_{xt}^\top\right)^{-1} \sigma_{xt}\sigma_{it}. \tag{9.40} \]
If a bank account is traded, it then follows that $\alpha_t = r^f_t$. In the following we will assume that this is the case.
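Factors are closely linked to market prices of risk and hence to state-price deflators. If $x = (x_t)$ is a factor in an expected return-beta relation, then we can define a market price of risk as (note that it is $d$-dimensional)
\[ \lambda_t = \sigma_{xt}^\top \left(\sigma_{xt}\sigma_{xt}^\top\right)^{-1} \eta_t, \]
since we then have
\[ \sigma_{it}^\top \lambda_t = \sigma_{it}^\top \sigma_{xt}^\top \left(\sigma_{xt}\sigma_{xt}^\top\right)^{-1} \eta_t = \left(\beta^{ix}_t\right)^\top \eta_t = \mu_{it} + \delta_{it} - r^f_t. \]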
Conversely, let $\lambda = (\lambda_t)$ be any market price of risk and let $\zeta = (\zeta_t)$ be the associated state-price deflator so that
\[ d\zeta_t = -\zeta_t\left[r^f_t\,dt + \lambda_t^\top dz_t\right]. \]
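Then we can use $\zeta$ as a one-dimensional factor in an expected return-beta relation. Since this corresponds to a factor sensitivity vector $-\zeta_t\lambda_t$ replacing the matrix $\sigma_{xt}^\top$, the relevant beta is
\[ \beta^{i\zeta}_t = \frac{-\zeta_t\,\sigma_{it}^\top\lambda_t}{\zeta_t^2\,\lambda_t^\top\lambda_t} = -\frac{1}{\zeta_t}\,\frac{\sigma_{it}^\top\lambda_t}{\lambda_t^\top\lambda_t}. \]
We can use $\eta_t = -\zeta_t\,\lambda_t^\top\lambda_t$, since then
\[ \beta^{i\zeta}_t \eta_t = -\frac{1}{\zeta_t}\,\frac{\sigma_{it}^\top\lambda_t}{\lambda_t^\top\lambda_t}\left(-\zeta_t\,\lambda_t^\top\lambda_t\right) = \sigma_{it}^\top\lambda_t = \mu_{it} + \delta_{it} - r^f_t \]
for any asset $i$. We can even use $a_t + b_t\zeta_t$ as a factor for any sufficiently well-behaved adapted processes $a = (a_t)$ and $b = (b_t)$.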
If we use
t
(
t
)
t
=
t

t

2
. From (4.51), we see
that 
t

2
is exactly the excess expected rate of return of the growth-optimal strategy, which we
can also write as
t
r
f
t
. Hence, we can write the excess expected rate of return on any asset (or
trading strategy) as
it
+
it
r
f
t
=
i
t
_
t
[
t
r
f
t
]
_
=
it
t
(
t
)
t
[
t
r
f
t
]
i
t
[
t
r
f
t
]. (9.41)
Whether we want to use a discrete-time or a continuous-time model, the key question is what factors to include in order to get prices or returns that are consistent with the data. Due to the link between state-price deflators and (marginal utility of) consumption, we should look for factors among variables that may affect (marginal utility of) consumption.
9.6 Empirical factors
A large part of the literature on factor models is based on empirical studies which, for a given data set, identify a number of priced factors so that most of the differences between the returns on different financial assets (typically only various portfolios of stocks) can be explained by their different factor betas. The best known studies of this kind were carried out by Fama and French, who find support for a model with three factors:
1. the return on a broad stock market index;
2. the return on a portfolio of stocks in small companies (according to the market value of all stocks issued by the firm) minus the return on a portfolio of stocks in large companies;
3. the return on a portfolio of stocks issued by firms with a high book-to-market value (the ratio of the book value of the assets of the firm to the market value of all the assets) minus the return on a portfolio of stocks in firms with a low book-to-market value.
According to Fama and French (1996) such a model gives a good fit of U.S. stock market data over the period 1963-1993. However, the empirical analysis does not explain why this three-factor model does well and what the underlying pricing mechanisms might be. Also note that while a given factor model does well in a given market over a given period, it may perform very badly in other markets and/or other periods (and risk premia may be different for other data sets). Some recent studies indicate that the second of the three factors (the size effect) seems to have disappeared. Another critique is that over the last 30 years empirical researchers have tried so many factors that it is hardly surprising that they have found some statistically significant factors. In fact, we know that in any given sample of historical returns it is possible to find a portfolio so that a factor model with the return on this portfolio as the only factor will perfectly explain all returns in the sample! Also note that the Fama-French model is a partial pricing model since the factors themselves are derived from prices of financial assets. For these reasons the purely empirically based models do not contribute much to the understanding of the pricing mechanisms of financial markets. However, some interesting recent studies explore whether the Fama-French factors can be seen as proxies for macroeconomic variables that can logically be linked to asset prices.
Linking Fama-French factors to risk and variations in investment opportunities: Berk (1995),
Liew and Vassalou (2000), Lettau and Ludvigson (2001), Vassalou (2003), Petkova (2006).
Data snooping/biases explanations of the success of FF: Lo and MacKinlay (1990), Kothari,
Shanken, and Sloan (1995).
Problems in measurement of beta: Berk, Green, and Naik (1999), Gomes, Kogan, and Zhang
(2003).
9.7 Theoretical factors
Factor models can be obtained through the general consumption-based asset pricing model by relating optimal consumption to various factors. As discussed in Section 6.5, the optimal consumption plan of an individual with time-additive expected utility must satisfy the so-called envelope condition
\[ u'(c_t) = J_W(W_t, x_t, t). \tag{6.47} \]
Here $J$ is the indirect utility function of the individual, i.e. the maximum obtainable expected utility of future consumption. $W_t$ is the financial wealth of the investor at time $t$. $x_t$ is the time $t$ value of a variable that captures the variations in investment opportunities (captured by the risk-free interest rate, expected returns and volatilities on risky assets, and correlations between risky assets) and investor-specific variables (e.g. labor income). For notational simplicity, $x$ is assumed to be one-dimensional, but this could be generalized.
In a continuous-time framework write the dynamics of wealth compactly as
\[ dW_t = W_t\left[\mu_{Wt}\,dt + \sigma_{Wt}^\top dz_t\right] \]
and assume that the state variable $x$ follows a diffusion process
\[ dx_t = \mu_{xt}\,dt + \sigma_{xt}^\top dz_t, \qquad \mu_{xt} = \mu_x(x_t, t), \quad \sigma_{xt} = \sigma_x(x_t, t). \]
From (8.16) it follows that the state-price deflator derived from this individual can be written as
\[ \zeta_t = e^{-\delta t}\,\frac{J_W(W_t, x_t, t)}{J_W(W_0, x_0, 0)}. \]
An application of Itô's Lemma yields a new expression for the dynamics of $\zeta$, which again can be compared with (4.41). It follows from this comparison that
\[ \lambda_t = \left(\frac{-W_t J_{WW}(W_t, x_t, t)}{J_W(W_t, x_t, t)}\right)\sigma_{Wt} + \left(\frac{-J_{Wx}(W_t, x_t, t)}{J_W(W_t, x_t, t)}\right)\sigma_{xt} \]
is a market price of risk. Consequently, the expected excess rate of return on asset $i$ can be written as
\[ \mu_{it} + \delta_{it} - r^f_t = \left(\frac{-W_t J_{WW}(W_t, x_t, t)}{J_W(W_t, x_t, t)}\right)\sigma_{it}^\top\sigma_{Wt} + \left(\frac{-J_{Wx}(W_t, x_t, t)}{J_W(W_t, x_t, t)}\right)\sigma_{it}^\top\sigma_{xt}, \tag{9.42} \]
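which can be rewritten as
\[ \mu_{it} + \delta_{it} - r^f_t = \beta_{iWt}\,\eta_{Wt} + \beta_{ixt}\,\eta_{xt}, \tag{9.43} \]
where
\[ \beta_{iWt} = \frac{\sigma_{it}^\top\sigma_{Wt}}{\|\sigma_{Wt}\|^2}, \qquad \beta_{ixt} = \frac{\sigma_{it}^\top\sigma_{xt}}{\|\sigma_{xt}\|^2}, \]
\[ \eta_{Wt} = \|\sigma_{Wt}\|^2\left(\frac{-W_t J_{WW}(W_t, x_t, t)}{J_W(W_t, x_t, t)}\right), \qquad \eta_{xt} = \|\sigma_{xt}\|^2\left(\frac{-J_{Wx}(W_t, x_t, t)}{J_W(W_t, x_t, t)}\right). \]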
We now have a continuous-time version of (9.35) with the wealth of the individual and the state variable as the factors. If it takes $m$ state variables to describe the variations in investment opportunities, labor income, etc., we get an $(m+1)$-factor model.
If the individual is taken to be a representative individual, her wealth will be identical to the aggregate value of all assets in the economy, including all traded financial assets and non-traded assets such as human capital. This is like the market portfolio in the traditional static CAPM. The first term on the right-hand side of (9.42) is then the product of the relative risk aversion of the representative individual (derived from her indirect utility) and the covariance between the rate of return on asset $i$ and the rate of return on the market portfolio. In the special case where the indirect utility is a function of wealth and time only, the last term on the right-hand side will be zero, and we get the well-known relation
\[ \mu_{it} + \delta_{it} - r^f_t = \beta_{iWt}\left(\mu_{Wt} + \delta_{Wt} - r^f_t\right), \]
where $\beta_{iWt}$ is the market-beta of asset $i$. This is a continuous-time version of the traditional static CAPM. It is only true under the strong assumption that individuals do not care about variations in investment opportunities, income, etc. In general we have to add factors describing the future investment opportunities, future labor income, etc. This extension of the CAPM is called the Intertemporal CAPM and was first derived by Merton (1973b).
Only a few empirical studies of factor models refer to Merton's Intertemporal CAPM when motivating the choice of factors. Brennan, Wang, and Xia (2004) set up a simple model with the short-term real interest rate and the slope of the capital market line as the factors since these variables capture the investment opportunities. In an empirical test, this model performs as well as the Fama-French model, which is encouraging for the development of theoretically well-founded and empirically viable factor models.
9.8 Exercises
EXERCISE 9.1 Consider a discrete-time economy with a one-dimensional conditional pricing factor $x = (x_t)$ so that, for some adapted processes $\alpha = (\alpha_t)$ and $\eta = (\eta_t)$,
\[ \mathrm{E}_t[R_{i,t+1}] = \alpha_t + \beta_t[R_{i,t+1}, x_{t+1}]\,\eta_t, \]
for all assets $i$ and all $t = 0, 1, \dots, T-1$.
(a) Show that
\[ \frac{\zeta_{t+1}}{\zeta_t} = \frac{1}{\alpha_t}\left(1 - \frac{x_{t+1} - \mathrm{E}_t[x_{t+1}]}{\mathrm{Var}_t[x_{t+1}]}\,\eta_t\right) \]
satisfies the pricing condition for a state-price deflator, i.e.
\[ \mathrm{E}_t\left[\frac{\zeta_{t+1}}{\zeta_t} R_{i,t+1}\right] = 1 \]
for all assets $i$.
(b) Show that $\zeta_{t+1}/\zeta_t$ is only a true one-period state-price deflator for the period between time $t$ and time $t+1$ if you impose some condition on parameters and/or distributions, and provide that condition. If this condition is not satisfied, what can you conclude about the asset prices in this economy?
(c) Now suppose that the factor is the return on some particular portfolio, i.e. $x_{t+1} = R_{t+1}$. Answer question (b) again.
(d) Consider the good (?) old (!) CAPM where $x_{t+1} = R_{M,t+1}$, the return on the market portfolio. Suppose that $\mathrm{E}_t[R_{M,t+1}] = 1.05$, $\sigma_t[R_{M,t+1}] = 0.2$, and $R^f_t = 1.02$. What do you need to assume about the distribution of the market return to ensure that the model is free of arbitrage? Is an assumption like this satisfied in typical derivations of the CAPM?
EXERCISE 9.2 Using the orthogonal characterization of the meanvariance frontier, show that
for any meanvariance ecient return R
, R
z()
] = 0. Show that
E[R
z()
] = E[R
] E[R
e
]
E[(R
)
2
] E[R
] E[R
]
E[R
] E[R
] E[R
] E[R
e
]
.
EXERCISE 9.3 Consider a discrete-time economy in which asset prices are described by an unconditional linear factor model
\[ \frac{\zeta_{t+1}}{\zeta_t} = a + b^\top x_{t+1}, \quad t = 0, 1, \dots, T-1, \]
where the conditional mean and second moments of the factor are constant, i.e. $\mathrm{E}_t[x_{t+1}] = \mu$ and $\mathrm{E}_t[x_{t+1}x_{t+1}^\top] = \Sigma$ for all $t$.
You want to value an uncertain stream of dividends $D = (D_t)$. You are told that dividends evolve as
\[ \frac{D_{t+1}}{D_t} = m + \psi^\top x_{t+1} + \varepsilon_{t+1}, \quad t = 0, 1, \dots, T-1, \]
where $\mathrm{E}_t[\varepsilon_{t+1}] = 0$ and $\mathrm{E}_t[\varepsilon_{t+1}x_{t+1}] = 0$ for all $t$.
The first three questions will lead you through the valuation based directly on the pricing condition of the state-price deflator.
(a) Show that, for any $t = 0, 1, \dots, T-1$,
\[ \mathrm{E}_t\left[\frac{D_{t+1}}{D_t}\,\frac{\zeta_{t+1}}{\zeta_t}\right] = ma + (mb + a\psi)^\top\mu + \psi^\top\Sigma b \equiv A. \]
(b) Show that
\[ \mathrm{E}_t\left[\frac{D_{t+s}}{D_t}\,\frac{\zeta_{t+s}}{\zeta_t}\right] = A^s. \]
(c) Show that the value at time $t$ of the future dividends is
\[ P_t = D_t\,\frac{A}{1-A}\left(1 - A^{T-t}\right). \]
Next, consider an alternative valuation technique. From (4.31) we know that the dividends can be valued by the formula
\[ P_t = \sum_{s=1}^{T-t} \frac{\mathrm{E}_t[D_{t+s}] - \beta_t\left[D_{t+s}, \frac{\zeta_{t+s}}{\zeta_t}\right]\eta_{t,t+s}}{\left(1 + \hat{y}^{t+s}_t\right)^s}, \tag{*} \]
where
\[ \beta_t\left[D_{t+s}, \frac{\zeta_{t+s}}{\zeta_t}\right] = \mathrm{Cov}_t\left[D_{t+s}, \frac{\zeta_{t+s}}{\zeta_t}\right] \Big/ \mathrm{Var}_t\left[\frac{\zeta_{t+s}}{\zeta_t}\right], \qquad \eta_{t,t+s} = -\mathrm{Var}_t\left[\frac{\zeta_{t+s}}{\zeta_t}\right] \Big/ \mathrm{E}_t\left[\frac{\zeta_{t+s}}{\zeta_t}\right]. \]
In the next questions you have to compute the ingredients to this valuation formula.
(d) What is the one-period risk-free rate of return $r^f_t$? What is the time $t$ annualized yield $\hat{y}^{t+s}_t$ on a zero-coupon bond maturing at time $t+s$? (If $B^{t+s}_t$ denotes the price of the bond, the annualized gross yield is defined by the equation $B^{t+s}_t = (1 + \hat{y}^{t+s}_t)^{-s}$.)
(e) Show that $\mathrm{E}_t[D_{t+s}] = D_t\,(m + \psi^\top\mu)^s$.
(f) Compute $\mathrm{Cov}_t\left[D_{t+s}, \zeta_{t+s}/\zeta_t\right]$.
(g) Compute $\beta_t\left[D_{t+s}, \zeta_{t+s}/\zeta_t\right]$.
(h) Compute $\eta_{t,t+s}$.
(i) Verify that the time $t$ value of the future dividends satisfies (*).
EXERCISE 9.4 In a continuous-time framework an individual with time-additive expected power utility induces the state-price deflator
\[ \zeta_t = e^{-\delta t}\left(\frac{c_t}{c_0}\right)^{-\gamma}, \]
where $\gamma$ is the constant relative risk aversion, $\delta$ is the subjective time preference rate, and $c = (c_t)_{t\in[0,T]}$ is the optimal consumption process of the individual. If the dynamics of the optimal consumption process is of the form
\[ dc_t = c_t\left[\mu_{ct}\,dt + \sigma_{ct}^\top dz_t\right], \]
then the dynamics of the state-price deflator is
\[ d\zeta_t = -\zeta_t\left[\left(\delta + \gamma\mu_{ct} - \tfrac{1}{2}\gamma(1+\gamma)\|\sigma_{ct}\|^2\right)dt + \gamma\,\sigma_{ct}^\top dz_t\right]. \]
(a) State the market price of risk $\lambda_t$ in terms of the preference parameters and the expected growth rate and sensitivity of the consumption process.
In many concrete models of the individual consumption and portfolio decisions, the optimal consumption process will be of the form
\[ c_t = W_t\,e^{f(X_t,t)}, \]
where $W_t$ is the wealth of the individual at time $t$, $X_t$ is the time $t$ value of some state variable, and $f$ is some smooth function. Here $X$ can potentially be multi-dimensional.
(b) Give some examples of variables other than wealth that may affect the optimal consumption of an individual and which may therefore play the role of $X_t$.
Suppose the state variable $X$ is one-dimensional and write the dynamics of the wealth of the individual and the state variable as
\[ dW_t = W_t\left[\left(\mu_{Wt} - e^{f(X_t,t)}\right)dt + \sigma_{Wt}^\top dz_t\right], \]
\[ dX_t = \mu_{Xt}\,dt + \sigma_{Xt}^\top dz_t. \]
(c) Characterize the market price of risk in terms of the preference parameters and the drift and sensitivity terms of $W_t$ and $X_t$.
Hint: Apply Itô's Lemma to $c_t = W_t\,e^{f(X_t,t)}$ to express the required parts of the consumption process in terms of $W$ and $X$.
(d) Show that the instantaneous excess expected rate of return on risky asset $i$ can be written as
\[ \mu_{it} + \delta_{it} - r^f_t = \beta_{iW,t}\,\eta_{Wt} + \beta_{iX,t}\,\eta_{Xt}, \]
where $\beta_{iW,t}$ and $\beta_{iX,t}$ are the instantaneous betas of the asset with respect to wealth and the state variable, respectively. Relate $\eta_{Wt}$ and $\eta_{Xt}$ to preference parameters and the drift and sensitivity terms of $W$ and $X$.
CHAPTER 10
The economics of the term structure of interest rates
10.1 Introduction
The previous two chapters focused on the implications of asset pricing models for the level of stock market excess returns and the cross-section of stock returns. This chapter focuses on the consequences of asset pricing theory for the pricing of bonds and the term structure of interest rates implied by bond prices.
A bond is nothing but a standardized and transferable loan agreement between two parties. The issuer of the bond is borrowing money from the holder of the bond and promises to pay back the loan according to a predefined payment scheme. The presence of the bond market allows individuals to trade consumption opportunities at different points in time among each other. An individual who has a clear preference for current capital to finance investments or current consumption can borrow by issuing a bond to an individual who has a clear preference for future consumption opportunities. The price of a bond of a given maturity is, of course, set to align the demand and supply of that bond, and will consequently depend on the attractiveness of the real investment opportunities and on the individuals' preferences for consumption over the maturity of the bond. The term structure of interest rates will reflect these dependencies.
After a short introduction to notation and bond market terminology in Section 10.2, we derive in Sections 10.3 and 10.4 relations between equilibrium interest rates and aggregate consumption and production in settings with a representative individual. In Section 10.5 we give some examples of equilibrium term structure models that are derived from the basic relations between interest rates, consumption, and production. The famous Vasicek model and Cox-Ingersoll-Ross model are presented.
Since individuals are concerned with the number of units of goods they consume and not the dollar value of these goods, the relations found in those sections apply to real interest rates. However, most traded bonds are nominal, i.e. they promise the delivery of certain dollar amounts, not the delivery of a certain number of consumption goods. The real value of a nominal bond depends on the evolution of the price of the consumption good. In Section 10.6 we explore the relations between real rates, nominal rates, and inflation. We consider both the case where money has no real effects on the economy and the case where money does affect the real economy.
The development of arbitrage-free dynamic models of the term structure was initiated in the 1970s. Until then, the discussions among economists about the shape of the term structure were based on some relatively loose hypotheses. The most well-known of these is the expectation hypothesis, which postulates a close relation between current interest rates or bond yields and expected future interest rates or bond returns. Many economists still seem to rely on the validity of this hypothesis, and a lot of manpower has been spent on testing the hypothesis empirically. In Section 10.7, we review several versions of the expectation hypothesis and discuss the consistency of these versions. We argue that none of these versions will hold for any reasonable dynamic term structure model. Some alternative traditional hypotheses are briefly reviewed in Section 10.8.
10.2 Basic interest rate concepts and relations
As in earlier chapters we will denote by $B^T_t$ the price at time $t$ of a zero-coupon bond paying a dividend of one at time $T$ and no other dividends. If many zero-coupon bonds with different maturities are traded, we can form the function $T \mapsto B^T_t$, which we refer to as the discount function prevailing at time $t$. Of course, we must have $B^t_t = 1$, and we expect the discount function to be decreasing since all individuals will presumably prefer getting the dividend sooner rather than later.
Next, consider a coupon bond with payment dates $t_1, t_2, \dots, t_n$, where we assume without loss of generality that $t_1 < t_2 < \dots < t_n$. The payment at date $t_i$ is denoted by $Y_i$. Such a coupon bond can be seen as a portfolio of zero-coupon bonds, namely a portfolio of $Y_1$ zero-coupon bonds maturing at $t_1$, $Y_2$ zero-coupon bonds maturing at $t_2$, etc. If all these zero-coupon bonds are traded in the market, the unique no-arbitrage price of the coupon bond at any time $t$ is
\[ B_t = \sum_{t_i > t} Y_i\,B^{t_i}_t, \tag{10.1} \]
where the sum is over all future payment dates of the coupon bond. If not all the relevant zero-coupon bonds are traded, we cannot justify the relation (10.1) as a result of the no-arbitrage principle. Still, it is a valuable relation. Suppose that an investor has determined (from private or macroeconomic information) a discount function showing the value she attributes to payments at different future points in time. Then she can value all sure cash flows in a consistent way by substituting that discount function into (10.1).
The information incorporated in prices of the many different bonds is usually better understood when transforming the bond prices into interest rates. Interest rates are always quoted on an annual basis, i.e. as some percentage per year. However, to apply and assess the magnitude of an interest rate, we also need to know the compounding frequency of that rate. More frequent compounding of a given interest rate per year results in higher effective interest rates. Furthermore, we need to know at which time the interest rate is set or observed and for which period of time the interest rate applies. Spot rates apply to a period beginning at the time the rate is set, whereas forward rates apply to a future period of time. The precise definitions follow below.
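The valuation in (10.1) can be sketched in a few lines. The example below assumes a flat continuously compounded rate of 3% purely for illustration; `coupon_bond_price` and `flat_discount` are hypothetical names, and any other discount function could be plugged in.

```python
import math


def coupon_bond_price(t, payments, discount):
    """Price in (10.1): sum of Y_i * B_t^{t_i} over payment dates t_i > t.

    payments: list of (t_i, Y_i) pairs
    discount: function (t, T) -> B_t^T
    """
    return sum(y_i * discount(t, t_i) for t_i, y_i in payments if t_i > t)


def flat_discount(t, T, rate=0.03):
    """Assumed discount function with a flat continuously compounded rate."""
    return math.exp(-rate * (T - t))


# A three-year bullet bond with face value 100 and an annual coupon of 5.
payments = [(1.0, 5.0), (2.0, 5.0), (3.0, 105.0)]
price_now = coupon_bond_price(0.0, payments, flat_discount)
price_after_first_coupon = coupon_bond_price(1.0, payments, flat_discount)
```

Note the strict inequality: a payment made exactly at time $t$ is not part of the time $t$ price.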
Annual compounding
Given the price $B^T_t$ at time $t$ of a zero-coupon bond maturing at time $T$, the relevant discount rate between time $t$ and time $T$ is the yield on the zero-coupon bond, the so-called zero-coupon rate or spot rate for date $T$. That $\hat{y}^T_t$ is the annually compounded zero-coupon rate means that
\[ B^T_t = \left(1 + \hat{y}^T_t\right)^{-(T-t)} \quad\Leftrightarrow\quad \hat{y}^T_t = \left(B^T_t\right)^{-1/(T-t)} - 1. \tag{10.2} \]
The zero-coupon rates as a function of maturity are called the zero-coupon yield curve or simply the yield curve. It is one way to express the term structure of interest rates.
While a zero-coupon or spot rate reflects the price on a loan between today and a given future date, a forward rate reflects the price on a loan between two future dates. The annually compounded relevant forward rate at time $t$ for the period between time $T$ and time $S$ is denoted by $\hat{f}^{T,S}_t$. Here, we have $t \le T < S$. This is the rate which is appropriate at time $t$ for discounting between time $T$ and $S$. We can think of discounting from time $S$ back to time $t$ by first discounting from time $S$ to time $T$ and then discounting from time $T$ to time $t$. We must therefore have that
\[ \left(1 + \hat{y}^S_t\right)^{-(S-t)} = \left(1 + \hat{y}^T_t\right)^{-(T-t)}\left(1 + \hat{f}^{T,S}_t\right)^{-(S-T)}, \tag{10.3} \]
from which we find that
\[ \hat{f}^{T,S}_t = \frac{\left(1 + \hat{y}^T_t\right)^{-(T-t)/(S-T)}}{\left(1 + \hat{y}^S_t\right)^{-(S-t)/(S-T)}} - 1. \]
We can also link forward rates to bond prices:
\[ B^S_t = B^T_t\left(1 + \hat{f}^{T,S}_t\right)^{-(S-T)} \quad\Leftrightarrow\quad \hat{f}^{T,S}_t = \left(\frac{B^T_t}{B^S_t}\right)^{1/(S-T)} - 1. \tag{10.4} \]
Note that since $B^t_t = 1$, we have
\[ \hat{f}^{t,S}_t = \left(\frac{B^t_t}{B^S_t}\right)^{1/(S-t)} - 1 = \left(B^S_t\right)^{-1/(S-t)} - 1 = \hat{y}^S_t, \]
i.e. the forward rate for a period starting today equals the zero-coupon rate or spot rate for the same period.
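The relations (10.2)-(10.4) translate directly into code. The following sketch uses two assumed bond prices (built from spot rates of 4% and 5%) and checks that the forward rate makes the two-step discounting in (10.3) consistent; the function names are illustrative only.

```python
def spot_rate_annual(bond_price, t, T):
    """Annually compounded zero-coupon rate from (10.2)."""
    return bond_price ** (-1.0 / (T - t)) - 1.0


def forward_rate_annual(bond_price_T, bond_price_S, T, S):
    """Annually compounded forward rate for [T, S] from (10.4)."""
    return (bond_price_T / bond_price_S) ** (1.0 / (S - T)) - 1.0


# Assumed zero-coupon bond prices at t = 0 for maturities T = 1 and S = 2.
t, T, S = 0.0, 1.0, 2.0
B_T = 1.04 ** -1
B_S = 1.05 ** -2

y_T = spot_rate_annual(B_T, t, T)
y_S = spot_rate_annual(B_S, t, S)
f_TS = forward_rate_annual(B_T, B_S, T, S)

# Both sides of (10.3): discounting from S to t directly or via T.
lhs = (1 + y_S) ** -(S - t)
rhs = (1 + y_T) ** -(T - t) * (1 + f_TS) ** -(S - T)
```

Here the forward rate for the second year comes out slightly above 6%, reflecting that the two-year spot rate exceeds the one-year spot rate.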
Compounding over other discrete periods LIBOR rates
In practice, many interest rates are quoted using semi-annual, quarterly, or monthly compounding. An interest rate of $R$ per year compounded $m$ times a year corresponds to a discount factor of $(1 + R/m)^{-m}$ over a year. The annually compounded interest rate that corresponds to an interest rate of $R$ compounded $m$ times a year is $(1 + R/m)^m - 1$. This is sometimes called the effective interest rate corresponding to the nominal interest rate $R$. Interest rates are set for loans with various maturities and currencies at the international money markets, the most commonly used being the LIBOR rates that are fixed in London. Traditionally, these rates are quoted using a compounding period equal to the maturity of the interest rate. If, for example, the three-month interest rate is $l^{t+0.25}_t$ per year, it means that
\[ B^{t+0.25}_t = \frac{1}{1 + 0.25\,l^{t+0.25}_t} \quad\Leftrightarrow\quad l^{t+0.25}_t = \frac{1}{0.25}\left(\frac{1}{B^{t+0.25}_t} - 1\right). \]
More generally, the relations are
\[ B^T_t = \frac{1}{1 + l^T_t\,(T-t)} \quad\Leftrightarrow\quad l^T_t = \frac{1}{T-t}\left(\frac{1}{B^T_t} - 1\right). \tag{10.5} \]
Similarly, discretely compounded forward rates can be computed as
\[ L^{T,S}_t = \frac{1}{S-T}\left(\frac{B^T_t}{B^S_t} - 1\right). \tag{10.6} \]
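A short sketch of (10.5) and (10.6) follows. The quoted rates of 4% (three months) and 5% (six months) are assumed numbers chosen only to illustrate the conversions; the function names are hypothetical.

```python
def simple_spot_rate(bond_price, t, T):
    """LIBOR-style spot rate from (10.5): compounding period equals maturity."""
    return (1.0 / bond_price - 1.0) / (T - t)


def simple_forward_rate(bond_price_T, bond_price_S, T, S):
    """Discretely compounded forward rate for [T, S] from (10.6)."""
    return (bond_price_T / bond_price_S - 1.0) / (S - T)


# Assumed quotes: 4% for three months and 5% for six months.
B_3m = 1.0 / (1.0 + 0.25 * 0.04)
B_6m = 1.0 / (1.0 + 0.50 * 0.05)

l_3m = simple_spot_rate(B_3m, 0.0, 0.25)
L_3m_6m = simple_forward_rate(B_3m, B_6m, 0.25, 0.50)
```

By construction, investing for three months at `l_3m` and rolling over at the forward rate `L_3m_6m` gives the same gross return as investing for six months at the six-month rate.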
Continuous compounding
Increasing the compounding frequency $m$, the effective annual return on one dollar invested at the interest rate $R$ per year increases to $e^R$, due to the mathematical result saying that
\[ \lim_{m\to\infty}\left(1 + \frac{R}{m}\right)^m = e^R. \]
A continuously compounded interest rate $R$ is equivalent to an annually compounded interest rate of $e^R - 1$ (which is bigger than $R$). Similarly, the zero-coupon bond price $B^T_t$ is related to the continuously compounded zero-coupon rate $y^T_t$ by
\[ B^T_t = e^{-y^T_t (T-t)} \quad\Leftrightarrow\quad y^T_t = -\frac{1}{T-t}\ln B^T_t. \tag{10.7} \]
The function $T \mapsto y^T_t$ is also a zero-coupon yield curve that contains exactly the same information as the discount function $T \mapsto B^T_t$ and also the same information as the annually compounded yield curve $T \mapsto \hat{y}^T_t$. The relation is $y^T_t = \ln(1 + \hat{y}^T_t)$.
If $f^{T,S}_t$ denotes the continuously compounded forward rate prevailing at time $t$ for the period between $T$ and $S$, we must have that $B^S_t = B^T_t\,e^{-f^{T,S}_t (S-T)}$, in analogy with (10.4). Consequently,
\[ f^{T,S}_t = -\frac{\ln B^S_t - \ln B^T_t}{S-T} \tag{10.8} \]
and hence
\[ f^{T,S}_t = \frac{y^S_t\,(S-t) - y^T_t\,(T-t)}{S-T}. \tag{10.9} \]
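The continuously compounded relations (10.7)-(10.9) can be checked against each other numerically. The bond prices below are assumed values, and the function names are illustrative.

```python
import math


def spot_rate_cc(bond_price, t, T):
    """Continuously compounded zero-coupon rate from (10.7)."""
    return -math.log(bond_price) / (T - t)


def forward_rate_cc(bond_price_T, bond_price_S, T, S):
    """Continuously compounded forward rate for [T, S] from (10.8)."""
    return -(math.log(bond_price_S) - math.log(bond_price_T)) / (S - T)


# Assumed zero-coupon bond prices at t = 0 for maturities T = 2 and S = 5.
t, T, S = 0.0, 2.0, 5.0
B_T, B_S = 0.94, 0.82

y_T = spot_rate_cc(B_T, t, T)
y_S = spot_rate_cc(B_S, t, S)
f_TS = forward_rate_cc(B_T, B_S, T, S)

# (10.9): the same forward rate expressed through the two spot rates.
f_from_yields = (y_S * (S - t) - y_T * (T - t)) / (S - T)

# Link to annual compounding: y = ln(1 + y_hat).
y_hat_T = B_T ** (-1.0 / (T - t)) - 1.0
```

The two expressions for the forward rate agree up to rounding, and the continuously compounded spot rate equals the log of one plus the annually compounded rate.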
Analytical studies of the term structure of interest rates often focus on forward rates for future periods of infinitesimal length. The forward rate for an infinitesimal period starting at time $T$ is simply referred to as the forward rate for time $T$ and is defined as $f^T_t = \lim_{S\to T} f^{T,S}_t$. The function $T \mapsto f^T_t$ is called the term structure of forward rates. Assuming differentiability of the discount function, we get
\[ f^T_t = -\frac{\partial \ln B^T_t}{\partial T} = -\frac{\partial B^T_t/\partial T}{B^T_t} \quad\Leftrightarrow\quad B^T_t = e^{-\int_t^T f^u_t\,du}. \tag{10.10} \]
Applying (10.9), the relation between the infinitesimal forward rate and the spot rates can be written as
\[ f^T_t = \frac{\partial\left[y^T_t\,(T-t)\right]}{\partial T} = y^T_t + \frac{\partial y^T_t}{\partial T}\,(T-t) \tag{10.11} \]
under the assumption of a differentiable term structure of spot rates $T \mapsto y^T_t$. The forward rate reflects the slope of the zero-coupon yield curve. In particular, the forward rate $f^T_t$ and the zero-coupon rate $y^T_t$ will coincide if and only if the zero-coupon yield curve has a horizontal tangent at $T$. Conversely,
\[ y^T_t = \frac{1}{T-t}\int_t^T f^u_t\,du, \tag{10.12} \]
i.e. the zero-coupon rate is an average of the forward rates.
It is important to realize that discount factors, spot rates, and forward rates (with any compounding frequency) are perfectly equivalent ways of expressing the same information. If a complete yield curve of, say, quarterly compounded spot rates is given, we can compute the discount function and spot rates and forward rates for any given period and with any given compounding frequency. If a complete term structure of forward rates is known, we can compute discount functions and spot rates, etc. Academics frequently apply continuous compounding since the mathematics involved in many relevant computations is more elegant when exponentials are used.
There are even more ways of representing the term structure of interest rates. Since most bonds are bullet bonds, many traders and analysts are used to thinking in terms of yields of bullet bonds rather than in terms of discount factors or zero-coupon rates. The par yield for a given maturity is the coupon rate that causes a bullet bond of the given maturity to have a price equal to its face value. Again we have to fix the coupon period of the bond. U.S. treasury bonds typically have semi-annual coupons, which are therefore often used when computing par yields. Given a discount function $T \mapsto B^T_t$, the $n$-year par yield is the value of $c$ satisfying
\[ \sum_{i=1}^{2n}\left(\frac{c}{2}\right)B^{t+0.5i}_t + B^{t+n}_t = 1 \quad\Leftrightarrow\quad c = \frac{2\left(1 - B^{t+n}_t\right)}{\sum_{i=1}^{2n} B^{t+0.5i}_t}. \]
It reflects the current market interest rate for an $n$-year bullet bond. The par yield is closely related to the so-called swap rate, which is a key concept in the swap markets.
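The par yield formula is easy to evaluate for a given discount function. As a sanity check, the sketch below assumes a flat semi-annually compounded rate of 6%, in which case the par yield with semi-annual coupons should come out as 6% as well; `par_yield` and `flat_semiannual_discount` are hypothetical names.

```python
def par_yield(discount, t, n):
    """n-year par yield with semi-annual coupons, given discount(T) = B_t^T."""
    annuity = sum(discount(t + 0.5 * i) for i in range(1, 2 * n + 1))
    return 2.0 * (1.0 - discount(t + n)) / annuity


def flat_semiannual_discount(T, t=0.0, rate=0.06):
    """Assumed discount function: flat rate compounded twice a year."""
    return (1.0 + rate / 2.0) ** (-2.0 * (T - t))


c_5y = par_yield(flat_semiannual_discount, 0.0, 5)
```

With a non-flat discount function the par yield is a weighted average of the rates embedded in the discount factors up to the bond's maturity.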
10.3 Real interest rates and aggregate consumption
In order to study the link between interest rates and aggregate consumption, we assume the existence of a representative individual maximizing the expected time-additive utility $\mathrm{E}\left[\int_0^T e^{-\delta t}u(c_t)\,dt\right]$. As discussed in Chapter 7, a representative individual will exist in a complete market. The parameter $\delta$ is the subjective time preference rate with higher $\delta$ representing a more impatient individual. $c_t$ is the consumption rate of the individual, which is then also the aggregate consumption level in the economy. In terms of the utility and time preference of the representative individual the state-price deflator is therefore characterized by
\[ \zeta_t = e^{-\delta t}\,\frac{u'(c_t)}{u'(c_0)}. \]
Let us take a continuous-time framework and assume that $c = (c_t)$ follows a stochastic process of the form
\[ dc_t = c_t\left[\mu_{ct}\,dt + \sigma_{ct}^\top dz_t\right], \tag{10.13} \]
where $z = (z_t)$ is a (possibly multi-dimensional) standard Brownian motion. Then Theorem 8.1 states that the equilibrium continuously compounded short-term interest rate is given by
\[ r_t = \delta + \gamma(c_t)\,\mu_{ct} - \tfrac{1}{2}\eta(c_t)\,\|\sigma_{ct}\|^2, \tag{10.14} \]
and that
\[ \lambda_t = \gamma(c_t)\,\sigma_{ct} \tag{10.15} \]
defines a market price of risk process. Here $\gamma(c_t) \equiv -c_t u''(c_t)/u'(c_t)$ is the relative risk aversion and $\eta(c_t) \equiv c_t^2 u'''(c_t)/u'(c_t)$, which is positive under the very plausible assumption of decreasing absolute risk aversion. For notational simplicity we leave out the $f$ superscript on the short-term interest rate in this chapter. The short rate in (10.14) was already interpreted in Section 8.3.
In the special case of constant relative risk aversion, $u(c) = c^{1-\gamma}/(1-\gamma)$, Equation (10.14) simplifies to
\[ r_t = \delta + \gamma\mu_{ct} - \tfrac{1}{2}\gamma(1+\gamma)\,\|\sigma_{ct}\|^2. \tag{10.16} \]
In particular, we see that if the drift and variance rates of aggregate consumption are constant, i.e. aggregate consumption follows a geometric Brownian motion, then the short-term interest rate will be constant over time. In that case the time $t$ price of the zero-coupon bond maturing at time $s$ is
\[ B^s_t = \mathrm{E}_t\left[\frac{\zeta_s}{\zeta_t}\right] = \mathrm{E}_t\left[\exp\left\{-r(s-t) - \tfrac{1}{2}\|\lambda\|^2(s-t) - \lambda^\top(z_s - z_t)\right\}\right] = e^{-r(s-t)} \]
and the corresponding continuously compounded yield is $y^s_t = r$. Consequently, the yield curve will be flat and constant over time. This is clearly an unrealistic case. To obtain interesting models we must either allow for variations in the expectation and the variance of aggregate consumption growth or allow for non-constant relative risk aversion (or both).
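To get a feel for the magnitudes in (10.16), the short rate can be evaluated for some parameter values. The numbers below are assumptions chosen only for illustration, and the sketch takes the Brownian motion to be one-dimensional so that $\|\sigma_{ct}\|$ is just a scalar volatility.

```python
def crra_short_rate(delta, gamma, mu_c, sigma_c):
    """Short rate (10.16) under power utility and scalar consumption volatility."""
    return delta + gamma * mu_c - 0.5 * gamma * (1.0 + gamma) * sigma_c ** 2


def crra_market_price_of_risk(gamma, sigma_c):
    """Market price of risk (10.15) with constant relative risk aversion gamma."""
    return gamma * sigma_c


# Assumed parameters: 2% time preference, risk aversion 2,
# 2% expected consumption growth, 2% consumption volatility.
r = crra_short_rate(delta=0.02, gamma=2.0, mu_c=0.02, sigma_c=0.02)
lam = crra_market_price_of_risk(gamma=2.0, sigma_c=0.02)
```

With these inputs the growth term adds four percentage points to the time preference rate, while the precautionary savings term subtracts only 0.12 percentage points.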
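What can we say about the relation between the equilibrium yield curve and the expectations and uncertainty about future aggregate consumption? Given the consumption dynamics in (10.13), we have
\[ \frac{c_T}{c_t} = \exp\left\{\int_t^T\left(\mu_{cs} - \tfrac{1}{2}\|\sigma_{cs}\|^2\right)ds + \int_t^T \sigma_{cs}^\top dz_s\right\}. \]
Assuming that the consumption sensitivity $\sigma_{cs}$ is constant and that the drift rate is such that $\int_t^T \mu_{cs}\,ds$ is normally distributed, we see that $c_T/c_t$ is lognormally distributed. Assuming time-additive power utility, the state-price deflator $\zeta_T/\zeta_t = e^{-\delta(T-t)}(c_T/c_t)^{-\gamma}$ is then also lognormally distributed, and the price of the zero-coupon bond maturing at time $T$ is
\[ B^T_t = \mathrm{E}_t\left[\frac{\zeta_T}{\zeta_t}\right] = e^{-\delta(T-t)}\,\mathrm{E}_t\left[e^{-\gamma\ln(c_T/c_t)}\right] = \exp\left\{-\delta(T-t) - \gamma\,\mathrm{E}_t\left[\ln\left(\frac{c_T}{c_t}\right)\right] + \tfrac{1}{2}\gamma^2\,\mathrm{Var}_t\left[\ln\left(\frac{c_T}{c_t}\right)\right]\right\}, \]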
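where we have applied Theorem B.2 in Appendix B. Combining the above expression with (10.7), the continuously compounded zero-coupon rate or yield $y^T_t$ for maturity $T$ is
\[ y^T_t = \delta + \gamma\,\frac{\mathrm{E}_t[\ln(c_T/c_t)]}{T-t} - \tfrac{1}{2}\gamma^2\,\frac{\mathrm{Var}_t[\ln(c_T/c_t)]}{T-t}. \tag{10.17} \]
Since
\[ \ln\mathrm{E}_t\left[\frac{c_T}{c_t}\right] = \mathrm{E}_t\left[\ln\left(\frac{c_T}{c_t}\right)\right] + \tfrac{1}{2}\,\mathrm{Var}_t\left[\ln\left(\frac{c_T}{c_t}\right)\right], \]
the zero-coupon rate can be rewritten as
\[ y^T_t = \delta + \gamma\,\frac{\ln\mathrm{E}_t[c_T/c_t]}{T-t} - \tfrac{1}{2}\gamma(1+\gamma)\,\frac{\mathrm{Var}_t[\ln(c_T/c_t)]}{T-t}, \tag{10.18} \]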
which is very similar to the shortrate equation (10.16). The yield is increasing in the subjective rate
of time preference. The equilibrium yield for the period [t, T] is positively related to the expected
10.3 Real interest rates and aggregate consumption 247
growth rate of aggregate consumption over the period and negatively related to the uncertainty
about the growth rate of consumption over the period. The intuition for these results is the same
as for the short-term interest rate discussed above. We see that the shape of the equilibrium time $t$
yield curve $T \mapsto y_t^T$ is determined by how expectations and variances of consumption growth rates
depend on the length of the forecast period. For example, if the economy is expected to enter a
short period of high growth rates, real short-term interest rates tend to be high and the yield curve
downward-sloping.
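As a small numerical illustration, the sketch below evaluates the yield both from (10.17) and from (10.18) and checks that the two agree. It assumes, purely for illustration, a constant expected consumption growth rate `mu_c` and a constant volatility `sigma_c`, so that the mean and variance of log consumption growth are linear in the horizon; the function names and parameter values are hypothetical and not taken from the text.

```python
def yield_from_log_moments(delta, gamma, mean_log, var_log, tau):
    """Yield as in (10.17), from E_t[ln(c_T/c_t)] and Var_t[ln(c_T/c_t)]."""
    return delta + gamma * mean_log / tau - 0.5 * gamma**2 * var_log / tau


def yield_from_expected_growth(delta, gamma, mean_log, var_log, tau):
    """Yield as in (10.18), using ln E_t[c_T/c_t] = mean_log + var_log / 2."""
    log_expected_growth = mean_log + 0.5 * var_log
    return (delta
            + gamma * log_expected_growth / tau
            - 0.5 * gamma * (1.0 + gamma) * var_log / tau)


# Hypothetical parameter values, chosen only for illustration.
delta, gamma = 0.02, 2.0      # time preference rate, relative risk aversion
mu_c, sigma_c = 0.02, 0.03    # assumed constant drift and volatility

for tau in (1.0, 5.0, 10.0):  # time to maturity T - t
    mean_log = (mu_c - 0.5 * sigma_c**2) * tau
    var_log = sigma_c**2 * tau
    y17 = yield_from_log_moments(delta, gamma, mean_log, var_log, tau)
    y18 = yield_from_expected_growth(delta, gamma, mean_log, var_log, tau)
    print(f"tau={tau:5.1f}  (10.17): {y17:.6f}  (10.18): {y18:.6f}")
```

With constant moments per unit of time the yield curve is flat; a non-flat curve requires expectations or variances that depend on the forecast horizon, as discussed above.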
Equation (10.18) is based on a lognormal future consumption and power utility. We will discuss
such a setting in more detail in Section 10.5.1. It appears impossible to obtain an exact relation of
the same structure as (10.18) for a more general model, where future consumption is not necessarily
lognormal and preferences are different from power utility. However, following Breeden (1986), we
can derive an approximate relation of a similar form. The equilibrium time $t$ price of a zero-coupon
bond paying one consumption unit at time $T \geq t$ is given by
$$
B_t^T = E_t\left[ \frac{\zeta_T}{\zeta_t} \right] = e^{-\delta(T-t)}\, \frac{E_t[u'(c_T)]}{u'(c_t)}, \tag{10.19}
$$
where $c_T$ is the uncertain future aggregate consumption level. We can write the left-hand side of
the equation above in terms of the yield $y_t^T$ of the bond as
$$
B_t^T = e^{-y_t^T (T-t)} \approx 1 - y_t^T (T-t),
$$
using a first-order Taylor expansion. Turning to the right-hand side of the equation, we will use a
second-order Taylor expansion of $u'(c_T)$ around $c_t$:
$$
u'(c_T) \approx u'(c_t) + u''(c_t)(c_T - c_t) + \frac{1}{2} u'''(c_t)(c_T - c_t)^2.
$$
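This approximation is reasonable when $c_T$ stays relatively close to $c_t$, which is the case for fairly
low and smooth consumption growth and fairly short time horizons. Applying the approximation,
the right-hand side of (10.19) becomes
$$
\begin{aligned}
e^{-\delta(T-t)}\, \frac{E_t[u'(c_T)]}{u'(c_t)}
&\approx e^{-\delta(T-t)} \left( 1 + \frac{u''(c_t)}{u'(c_t)}\, E_t[c_T - c_t] + \frac{1}{2}\, \frac{u'''(c_t)}{u'(c_t)}\, E_t\left[ (c_T - c_t)^2 \right] \right) \\
&\approx 1 - \delta(T-t) + e^{-\delta(T-t)}\, \frac{c_t u''(c_t)}{u'(c_t)}\, E_t\left[ \frac{c_T}{c_t} - 1 \right]
+ \frac{1}{2}\, e^{-\delta(T-t)}\, c_t^2\, \frac{u'''(c_t)}{u'(c_t)}\, E_t\left[ \left( \frac{c_T}{c_t} - 1 \right)^2 \right],
\end{aligned}
$$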
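where we have used the approximation $e^{-\delta(T-t)} \approx 1 - \delta(T-t)$. Substituting the approximations
of both sides into (10.19) and rearranging, we find the following approximate expression for the
zero-coupon yield:
$$
y_t^T \approx \delta + e^{-\delta(T-t)} \left( \frac{-c_t u''(c_t)}{u'(c_t)} \right) \frac{E_t[c_T/c_t - 1]}{T-t}
- \frac{1}{2}\, e^{-\delta(T-t)}\, c_t^2\, \frac{u'''(c_t)}{u'(c_t)}\, \frac{E_t\left[ \left( \frac{c_T}{c_t} - 1 \right)^2 \right]}{T-t}. \tag{10.20}
$$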
We can replace $E_t\left[ (c_T/c_t - 1)^2 \right]$ by $\mathrm{Var}_t[c_T/c_t] + (E_t[c_T/c_t - 1])^2$ and consider the effect of a
shift in variance for a fixed expected consumption growth. Again assuming $u' > 0$, $u'' < 0$, and
$u''' > 0$, we see that the yield of a given maturity is positively related to the expected growth rate
of consumption up to the maturity date and negatively related to the variance of the consumption
growth rate up to maturity.
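To get a feel for the quality of the approximation (10.20), the following sketch compares it with the exact yield (10.18) in a special case where both can be computed: power utility and lognormal consumption growth with constant drift `mu_c` and volatility `sigma_c`. These are illustrative assumptions, as are the function names and parameter values. For power utility with relative risk aversion $\gamma$ we have $-c\,u''(c)/u'(c) = \gamma$ and $c^2 u'''(c)/u'(c) = \gamma(1+\gamma)$.

```python
import math


def exact_yield(delta, gamma, mu_c, sigma_c):
    """Yield from (10.18) when mu_c and sigma_c are constant (flat curve)."""
    return delta + gamma * mu_c - 0.5 * gamma * (1.0 + gamma) * sigma_c**2


def approx_yield(delta, gamma, mu_c, sigma_c, tau):
    """Yield from the Taylor approximation (10.20) under power utility."""
    m = (mu_c - 0.5 * sigma_c**2) * tau   # mean of ln(c_T/c_t)
    s2 = sigma_c**2 * tau                 # variance of ln(c_T/c_t)
    first = math.exp(m + 0.5 * s2)        # E_t[c_T/c_t]
    second = math.exp(2.0 * m + 2.0 * s2) # E_t[(c_T/c_t)^2]
    mean_growth = first - 1.0             # E_t[c_T/c_t - 1]
    sq_growth = second - 2.0 * first + 1.0  # E_t[(c_T/c_t - 1)^2]
    disc = math.exp(-delta * tau)
    return (delta
            + disc * gamma * mean_growth / tau
            - 0.5 * disc * gamma * (1.0 + gamma) * sq_growth / tau)


# Hypothetical parameter values, chosen only for illustration.
delta, gamma, mu_c, sigma_c = 0.02, 2.0, 0.02, 0.02
y_exact = exact_yield(delta, gamma, mu_c, sigma_c)
for tau in (0.25, 1.0, 5.0):
    y_approx = approx_yield(delta, gamma, mu_c, sigma_c, tau)
    print(f"tau={tau:4.2f}  exact={y_exact:.5f}  approx={y_approx:.5f}")
```

In this example the approximation error shrinks as the horizon gets shorter, in line with the remark that the expansion is reasonable for low, smooth consumption growth and fairly short horizons.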
10.4 Real interest rates and aggregate production
In order to study the relation between interest rates and production, we will look at a slightly
simplified version of the general equilibrium model of Cox, Ingersoll, and Ross (1985a).
Consider an economy with a single physical good that can be used either for consumption or
investment. All values are expressed in units of this good. The instantaneous rate of return on an
investment in the production of the good is
$$
\frac{d\eta_t}{\eta_t} = g(x_t)\, dt + \xi(x_t)\, dz_{1t}, \tag{10.21}
$$
where $z_1$ is a standard one-dimensional Brownian motion and $g$ and $\xi$ are well-behaved real-valued
functions (given by Mother Nature) of some state variable $x_t$. We assume that $\xi(x)$ is nonnegative
for all values of $x$. The above dynamics means that $\eta_0$ goods invested in the production process
at time 0 will grow to $\eta_t$ goods at time $t$ if the output of the production process is continuously
reinvested in this period. We can interpret $g$ as the expected real growth rate of production in the
economy and the volatility $\xi$ (assumed positive for all $x$) as a measure of the uncertainty about
the growth rate of production in the economy. The production process has constant returns to
scale in the sense that the distribution of the rate of return is independent of the scale of the
investment. There is free entry to the production process. We can think of individuals investing
in production directly by forming their own firm or indirectly by investing in stocks of production
firms. For simplicity we take the first interpretation. All producers, individuals and firms, act
competitively so that firms have zero profits and just pass production returns on to their owners.
All individuals and firms act as price takers.
We assume that the state variable is a one-dimensional diffusion with dynamics
$$
dx_t = m(x_t)\, dt + v_1(x_t)\, dz_{1t} + v_2(x_t)\, dz_{2t}, \tag{10.22}
$$
where $z_2$ is another standard one-dimensional Brownian motion independent of $z_1$, and $m$, $v_1$, and
$v_2$ are well-behaved real-valued functions. The instantaneous variance rate of the state variable
is $v_1(x)^2 + v_2(x)^2$, the covariance rate of the state variable and the real growth rate is $\xi(x) v_1(x)$
so that the correlation between the state and the growth rate is $v_1(x)/\sqrt{v_1(x)^2 + v_2(x)^2}$. Unless
$v_2 \equiv 0$, the state variable is imperfectly correlated with the real production returns. If $v_1$ is
positive [negative], then the state variable is positively [negatively] correlated with the growth rate
of production in the economy. Since the state determines the expected returns and the variance of
returns on real investments, we may think of $x_t$ as a productivity or technology variable.
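The variance, covariance, and correlation rates just described follow directly from the loadings on the two independent Brownian motions. The sketch below computes them for given numbers; the function name and the values are hypothetical, and `xi` stands for the volatility of the production return in (10.21), assumed strictly positive here.

```python
import math


def state_growth_moments(xi, v1, v2):
    """Instantaneous variance of x, covariance with the production return,
    and their correlation, for loadings v1 (on z_1) and v2 (on z_2)."""
    var_state = v1**2 + v2**2          # variance rate of the state variable
    cov = xi * v1                      # only z_1 is shared with production
    corr = v1 / math.sqrt(var_state)   # equals cov / (xi * sqrt(var_state))
    return var_state, cov, corr


# Hypothetical values, chosen only for illustration.
var_state, cov, corr = state_growth_moments(xi=0.20, v1=0.03, v2=0.04)
print(var_state, cov, corr)
```

Setting `v2 = 0` gives a correlation of plus or minus one, which is the perfectly correlated case excluded by the condition that $v_2$ is not identically zero.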
In addition to the investment in the production process, we assume that the individuals have
access to a financial asset with a price $P_t$ with dynamics of the form
$$
\frac{dP_t}{P_t} = \mu_t\, dt + \sigma_{1t}\, dz_{1t} + \sigma_{2t}\, dz_{2t}. \tag{10.23}
$$
As a part of the equilibrium we will determine the relation between the expected return $\mu_t$ and
the sensitivity coefficients $\sigma_{1t}$ and $\sigma_{2t}$. Finally, the individuals can borrow and lend funds at an
instantaneously risk-free interest rate $r_t$, which is also determined in equilibrium. The market is
therefore complete. Other financial assets affected by $z_1$ and $z_2$ may be traded, but they will be
redundant. We will get the same equilibrium relation between expected returns and sensitivity
coefficients for these other assets as for the one modeled explicitly. For simplicity we stick to the
case with a single financial asset.
If an individual at each time $t$ consumes at a rate of $c_t \geq 0$, invests a fraction $\alpha_t$ of his wealth
in the production process, invests a fraction $\pi_t$ of wealth in the financial asset, and invests the
remaining fraction $1 - \alpha_t - \pi_t$ of wealth in the risk-free asset, his wealth $W_t$ will evolve as
$$
\begin{aligned}
dW_t ={}& \left[ r_t W_t + W_t \alpha_t\, (g(x_t) - r_t) + W_t \pi_t\, (\mu_t - r_t) - c_t \right] dt \\
&+ W_t \alpha_t\, \xi(x_t)\, dz_{1t} + W_t \pi_t \sigma_{1t}\, dz_{1t} + W_t \pi_t \sigma_{2t}\, dz_{2t}.
\end{aligned}
\tag{10.24}
$$
Since a negative real investment is physically impossible, we should restrict $\alpha_t$ to the nonnegative
numbers. However, we will assume that this constraint is not binding.
Let us look at an individual maximizing expected utility of future consumption. The indirect
utility function is defined as
$$
J(W, x, t) = \sup_{(\alpha_s, \pi_s, c_s)_{s \in [t,T]}} E_t\left[ \int_t^T e^{-\delta(s-t)}\, u(c_s)\, ds \right],
$$
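i.e. the maximal expected utility the individual can obtain given his current wealth and the current
value of the state variable. The dynamic programming technique of Section 6.5.2 leads to the
Hamilton-Jacobi-Bellman equation
$$
\begin{aligned}
\delta J = \sup_{\alpha, \pi, c} \Big\{ & u(c) + \frac{\partial J}{\partial t} + J_W \left( rW + \alpha W (g - r) + \pi W (\mu - r) - c \right) + J_x m \\
& + \frac{1}{2} J_{WW} W^2 \left( [\alpha \xi + \pi \sigma_1]^2 + \pi^2 \sigma_2^2 \right) + \frac{1}{2} J_{xx} (v_1^2 + v_2^2) \\
& + J_{Wx} W \left[ v_1 (\alpha \xi + \pi \sigma_1) + v_2 \pi \sigma_2 \right] \Big\}.
\end{aligned}
$$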
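The first-order conditions for $\alpha$ and $\pi$ imply that
$$
\alpha = \frac{-J_W}{W J_{WW}} \left( (g - r)\, \frac{\sigma_1^2 + \sigma_2^2}{\xi^2 \sigma_2^2} - (\mu - r)\, \frac{\sigma_1}{\xi \sigma_2^2} \right) + \frac{-J_{Wx}}{W J_{WW}}\, \frac{\sigma_2 v_1 - \sigma_1 v_2}{\xi \sigma_2}, \tag{10.25}
$$
$$
\pi = \frac{-J_W}{W J_{WW}} \left( -\frac{\sigma_1}{\xi \sigma_2^2}\, (g - r) + \frac{1}{\sigma_2^2}\, (\mu - r) \right) + \frac{-J_{Wx}}{W J_{WW}}\, \frac{v_2}{\sigma_2}. \tag{10.26}
$$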
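The closed-form solutions (10.25) and (10.26) can be checked numerically. The first-order conditions for the two portfolio fractions form a linear 2x2 system; the sketch below solves that system by Cramer's rule and compares the result with the closed-form expressions. The names are hypothetical: `A` stands for $-J_W/(W J_{WW})$, `H` for $-J_{Wx}/(W J_{WW})$, `xi` for the production volatility, and `s1`, `s2` for the financial asset's sensitivities to $z_1$ and $z_2$. The system is written under the assumption that the wealth-state covariance includes both the $v_1$ and the $v_2$ contribution.

```python
def fractions_closed_form(A, H, g, mu, r, xi, s1, s2, v1, v2):
    """Optimal fractions in production (alpha) and the financial asset (pi),
    written as in (10.25) and (10.26)."""
    alpha = (A * ((g - r) * (s1**2 + s2**2) / (xi**2 * s2**2)
                  - (mu - r) * s1 / (xi * s2**2))
             + H * (s2 * v1 - s1 * v2) / (xi * s2))
    pi = (A * (-(g - r) * s1 / (xi * s2**2) + (mu - r) / s2**2)
          + H * v2 / s2)
    return alpha, pi


def fractions_from_focs(A, H, g, mu, r, xi, s1, s2, v1, v2):
    """Solve the first-order conditions directly (Cramer's rule):
         xi^2 * alpha + xi*s1 * pi          = A*(g - r)  + H*v1*xi
         xi*s1 * alpha + (s1^2 + s2^2) * pi = A*(mu - r) + H*(v1*s1 + v2*s2)
    """
    a11, a12 = xi**2, xi * s1
    a21, a22 = xi * s1, s1**2 + s2**2
    b1 = A * (g - r) + H * v1 * xi
    b2 = A * (mu - r) + H * (v1 * s1 + v2 * s2)
    det = a11 * a22 - a12 * a21        # equals xi^2 * s2^2
    alpha = (b1 * a22 - a12 * b2) / det
    pi = (a11 * b2 - a21 * b1) / det
    return alpha, pi


# Hypothetical inputs, chosen only for illustration.
args = dict(A=0.5, H=-0.2, g=0.06, mu=0.07, r=0.03,
            xi=0.20, s1=0.10, s2=0.15, v1=0.02, v2=0.03)
print(fractions_closed_form(**args))
print(fractions_from_focs(**args))
```

The two printed pairs should agree up to floating-point rounding; the determinant `xi^2 * s2^2` shows why both `xi` and `s2` must be nonzero for the solution to exist.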
In equilibrium, prices and interest rates are such that (a) all individuals act optimally and (b)
all markets clear. In particular, summing up the positions of all individuals in the financial asset
we should get zero, and the total amount borrowed by individuals on a short-term basis should
equal the total amount lent by individuals. Since the available production apparatus is to be held
by some investors, summing the optimal $\alpha$'s over investors we should get 1. Since we have assumed
a complete market, we can construct a representative individual, i.e. an individual with a given
utility function so that the equilibrium interest rates and price processes are the same in the
single-individual economy as in the larger multi-individual economy. (Alternatively, we may think of the
case where all individuals in the economy are identical so that they will have the same indirect
utility function and always make the same consumption and investment choice.)
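In an equilibrium, we have $\pi = 0$ for the representative individual, so that (10.26) implies
$$
\mu = r + \frac{\sigma_1}{\xi}\, (g - r) - \frac{\left( \frac{-J_{Wx}}{W J_{WW}} \right)}{\left( \frac{-J_W}{W J_{WW}} \right)}\, \sigma_2 v_2. \tag{10.27}
$$
Substituting this into the expression (10.25) for $\alpha$ and using the fact that $\alpha = 1$ in equilibrium, we get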
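that
$$
\begin{aligned}
1 &= \left( \frac{-J_W}{W J_{WW}} \right) \left( (g - r)\, \frac{\sigma_1^2 + \sigma_2^2}{\xi^2 \sigma_2^2} - \frac{\sigma_1^2}{\xi^2 \sigma_2^2}\, (g - r) + \frac{\left( \frac{-J_{Wx}}{W J_{WW}} \right)}{\left( \frac{-J_W}{W J_{WW}} \right)}\, \frac{\sigma_1 v_2}{\xi \sigma_2} \right) + \left( \frac{-J_{Wx}}{W J_{WW}} \right) \frac{\sigma_2 v_1 - \sigma_1 v_2}{\xi \sigma_2} \\
&= \left( \frac{-J_W}{W J_{WW}} \right) \frac{g - r}{\xi^2} + \left( \frac{-J_{Wx}}{W J_{WW}} \right) \frac{v_1}{\xi}.
\end{aligned}
$$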
Consequently, the equilibrium short-term interest rate can be written as
$$
r = g - \left( \frac{-W J_{WW}}{J_W} \right) \xi^2 + \frac{J_{Wx}}{J_W}\, \xi v_1. \tag{10.28}
$$
This equation ties the equilibrium real short-term interest rate to the production side of the
economy. Let us address each of the three right-hand side terms:
• The equilibrium real interest rate $r$ is positively related to the expected real growth rate
$g$ of the economy. The intuition is that for higher expected growth rates, the productive
investments are more attractive relative to the risk-free investment, so to maintain market
clearing the interest rate has to be higher as well.
• The term $-W J_{WW}/J_W$ is the relative risk aversion of the representative individual's indirect
utility. This is assumed to be positive. Hence, we see that the equilibrium real interest rate
$r$ is negatively related to the uncertainty about the growth rate of the economy, represented
by the instantaneous variance $\xi^2$. For a higher uncertainty, the safe return of a risk-free
investment is relatively more attractive, so to establish market clearing the interest rate has
to decrease.
• The last term in (10.28) is due to the presence of the state variable. The covariance rate
of the state variable and the real growth rate of the economy is equal to $\xi v_1$. Suppose that
high values of the state variable represent good states of the economy, where the wealth of
the individual is high. Then the marginal utility $J_W$ will be decreasing in $x$, i.e. $J_{Wx} < 0$.
If instantaneous changes in the state variable and the growth rate of the economy are
positively correlated, we see from (10.25) that the hedge demand of the productive investment
is decreasing, and hence the demand for depositing money at the short rate increasing, in
the magnitude of the correlation (both $J_{Wx}$ and $J_{WW}$ are negative). To maintain market
clearing, the interest rate must be decreasing in the magnitude of the correlation as reflected
by (10.28).
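The three effects listed above can be read off a direct evaluation of (10.28). The sketch below is a minimal illustration with hypothetical names and numbers: `rra` denotes the relative risk aversion $-W J_{WW}/J_W$ of the indirect utility, `xi` the volatility of production returns, `jwx_over_jw` the ratio $J_{Wx}/J_W$, and `v1` the loading of the state variable on $z_1$.

```python
def equilibrium_short_rate(g, rra, xi, jwx_over_jw, v1):
    """Equilibrium real short rate as in (10.28):
    expected growth, minus a risk term, plus a state-variable hedge term."""
    return g - rra * xi**2 + jwx_over_jw * xi * v1


# Hypothetical base case, chosen only for illustration.
base = dict(g=0.05, rra=2.0, xi=0.10, jwx_over_jw=-0.5, v1=0.02)
r_base = equilibrium_short_rate(**base)

# Comparative statics: change one input at a time.
r_high_growth = equilibrium_short_rate(**{**base, "g": 0.06})
r_high_risk = equilibrium_short_rate(**{**base, "xi": 0.15})
r_high_corr = equilibrium_short_rate(**{**base, "v1": 0.04})

print(f"base case:           {r_base:.4f}")
print(f"higher growth:       {r_high_growth:.4f}")
print(f"higher uncertainty:  {r_high_risk:.4f}")
print(f"higher covariance:   {r_high_corr:.4f}")
```

With a negative `jwx_over_jw`, as in the good-state interpretation in the last bullet, a larger `v1` lowers the rate, while higher expected growth raises it and higher production uncertainty lowers it.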
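We see from (10.27) that the market prices of risk are given by
$$
\lambda_1 = \frac{g - r}{\xi}, \qquad
\lambda_2 = -\frac{\left( \frac{-J_{Wx}}{W J_{WW}} \right)}{\left( \frac{-J_W}{W J_{WW}} \right)}\, v_2 = -\frac{J_{Wx}}{J_W}\, v_2. \tag{10.29}
$$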
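Applying the relation
$$
g - r = \left( \frac{-W J_{WW}}{J_W} \right) \xi^2 - \frac{J_{Wx}}{J_W}\, \xi v_1,
$$
we can rewrite $\lambda_1$ as
$$
\lambda_1 = \left( \frac{-W J_{WW}}{J_W} \right) \xi - \frac{J_{Wx}}{J_W}\, v_1. \tag{10.30}
$$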
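As a consistency check on (10.27) to (10.30), the sketch below computes both market prices of risk and verifies that the implied expected return on the financial asset, the short rate plus the sum of each sensitivity times the corresponding market price of risk, coincides with the right-hand side of (10.27). All names and numbers are hypothetical; `s1` and `s2` denote the asset's sensitivities to $z_1$ and $z_2$.

```python
def market_prices_of_risk(rra, xi, jwx_over_jw, v1, v2):
    """Market prices of risk for z_1 and z_2 as in (10.30) and (10.29)."""
    lam1 = rra * xi - jwx_over_jw * v1
    lam2 = -jwx_over_jw * v2
    return lam1, lam2


# Hypothetical inputs, chosen only for illustration.
g, rra, xi, jwx_over_jw, v1, v2 = 0.05, 2.0, 0.10, -0.5, 0.02, 0.03
s1, s2 = 0.15, 0.10

r = g - rra * xi**2 + jwx_over_jw * xi * v1             # short rate, (10.28)
lam1, lam2 = market_prices_of_risk(rra, xi, jwx_over_jw, v1, v2)

mu_from_lambdas = r + s1 * lam1 + s2 * lam2             # r + sum of sigma_i * lambda_i
mu_from_10_27 = r + (s1 / xi) * (g - r) - jwx_over_jw * s2 * v2

print(lam1, (g - r) / xi)          # two expressions for lambda_1
print(mu_from_lambdas, mu_from_10_27)
```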
10.5 Equilibrium interest rate models
10.5.1 The Vasicek model
A classic but still widely used model of interest rate dynamics and the pricing of bonds and interest
rate derivatives is the model proposed by Vasicek (1977). The basic assumptions of the model are
that the continuously compounded short-term interest rate $r_t$ has dynamics
$$
dr_t = \kappa\, (\bar{r} - r_t)\, dt + \sigma_r\, dz_t, \tag{10.31}
$$
where $\kappa$, $\bar{r}$, and $\sigma_r$ are positive constants, and that the market price of risk associated with the
shock $z$ is a constant $\lambda$.
Before we study the consequences of these assumptions, let us see how they can be supported
by a consumption-based equilibrium model. Following Goldstein and Zapatero (1996), assume that
aggregate consumption evolves as
$$
dc_t = c_t \left[ \mu_{ct}\, dt + \sigma_c\, dz_t \right],
$$
where $z$ is a one-dimensional standard Brownian motion, $\sigma_c$ is a constant, and the expected
consumption growth rate $\mu_{ct}$ follows the process
$$
d\mu_{ct} = \kappa\, (\bar{\mu}_c - \mu_{ct})\, dt + \theta\, dz_t.
$$
The representative individual is assumed to have a constant relative risk aversion of $\gamma$. It follows
from (10.16) that the equilibrium real short-term interest rate is
$$
r_t = \delta + \gamma \mu_{ct} - \frac{1}{2} \gamma (1 + \gamma)\, \sigma_c^2
$$
with dynamics $dr_t = \gamma\, d\mu_{ct}$, which gives (10.31) with $\sigma_r = \gamma \theta$ and
$\bar{r} = \gamma \bar{\mu}_c + \delta - \frac{1}{2} \gamma (1 + \gamma)\, \sigma_c^2$. The
market price of risk is $\lambda = \gamma \sigma_c$, a constant.
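The mapping from the consumption-side parameters to the Vasicek parameters can be written out in a few lines. The sketch below is only an illustration: the function names are hypothetical, `theta` is the name used here for the volatility of the expected consumption growth rate, `mu_c_bar` for its long-run level, and the numerical values are made up.

```python
def short_rate(delta, gamma, mu_c, sigma_c):
    """Equilibrium short rate for CRRA utility and expected growth mu_c."""
    return delta + gamma * mu_c - 0.5 * gamma * (1.0 + gamma) * sigma_c**2


def vasicek_from_consumption(delta, gamma, kappa, mu_c_bar, theta, sigma_c):
    """Vasicek parameters implied by the consumption-based model."""
    return {
        "kappa": kappa,                                  # speed of mean reversion
        "r_bar": short_rate(delta, gamma, mu_c_bar, sigma_c),  # long-run level
        "sigma_r": gamma * theta,                        # short-rate volatility
        "lambda": gamma * sigma_c,                       # market price of risk
    }


# Hypothetical parameter values, chosen only for illustration.
params = vasicek_from_consumption(delta=0.02, gamma=2.0, kappa=0.3,
                                  mu_c_bar=0.02, theta=0.005, sigma_c=0.03)
print(params)
```

The long-run level `r_bar` is simply the short rate evaluated at the long-run expected growth rate, which is why the same helper function is reused for both.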
The process (10.31) is a so-called Ornstein-Uhlenbeck process. An Ornstein-Uhlenbeck process
exhibits mean reversion in the sense that the drift is positive when $r_t < \bar{r}$ and negative when
$r_t > \bar{r}$. The process is therefore always pulled towards a long-term level of $\bar{r}$. However, the
random shock to the process through the term $\sigma_r\, dz_t$ may cause the process to move further away
from $\bar{r}$. The parameter $\kappa$ controls the size of the expected adjustment towards the long-term level
and is often referred to as the mean reversion parameter or the speed of adjustment.
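Mean reversion is easy to see in a simulation. The sketch below uses a plain Euler discretization of (10.31), which is an approximation chosen here for simplicity rather than an exact sampling scheme; the function name, step size, and parameter values are hypothetical. Setting the volatility to zero isolates the pull towards the long-term level.

```python
import random


def simulate_vasicek(r0, kappa, r_bar, sigma_r, horizon, n_steps, seed=None):
    """Euler scheme for dr = kappa * (r_bar - r) dt + sigma_r dz.
    Returns the simulated path as a list of n_steps + 1 values."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    path = [r0]
    for _ in range(n_steps):
        r = path[-1]
        dz = rng.gauss(0.0, dt ** 0.5)   # Brownian increment over one step
        path.append(r + kappa * (r_bar - r) * dt + sigma_r * dz)
    return path


# Hypothetical parameter values, chosen only for illustration.
noiseless = simulate_vasicek(r0=0.01, kappa=0.5, r_bar=0.05, sigma_r=0.0,
                             horizon=20.0, n_steps=2000)
noisy = simulate_vasicek(r0=0.01, kappa=0.5, r_bar=0.05, sigma_r=0.01,
                         horizon=20.0, n_steps=2000, seed=1)

print(f"without shocks, final rate: {noiseless[-1]:.6f}")
print(f"with shocks, final rate:    {noisy[-1]:.6f}")
```

Without shocks the path rises monotonically from 1% towards the long-term level of 5%; with shocks it fluctuates around that level, and a larger `kappa` shortens the time it takes to get close.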
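To determine the distribution of the future value of the short-term interest rate, define a new
process $y_t$ as some function of $r_t$ such that $y = (y_t)_{t \geq 0}$ is a generalized Brownian motion. It turns
out that this is satisfied for $y_t = g(r_t, t)$, where $g(r, t) = e^{\kappa t} r$. From Itô's Lemma we get
$$
\begin{aligned}
dy_t &= \left[ \frac{\partial g}{\partial t}(r_t, t) + \frac{\partial g}{\partial r}(r_t, t)\, \kappa (\bar{r} - r_t) + \frac{1}{2} \frac{\partial^2 g}{\partial r^2}(r_t, t)\, \sigma_r^2 \right] dt + \frac{\partial g}{\partial r}(r_t, t)\, \sigma_r\, dz_t \\
&= \left[ \kappa e^{\kappa t} r_t + \kappa e^{\kappa t} (\bar{r} - r_t) \right] dt + e^{\kappa t} \sigma_r\, dz_t \\
&= \kappa \bar{r}\, e^{\kappa t}\, dt + \sigma_r e^{\kappa t}\, dz_t.
\end{aligned}
$$
This implies that
$$
y_{t'} = y_t + \kappa \bar{r} \int_t^{t'} e^{\kappa u}\, du + \int_t^{t'} \sigma_r e^{\kappa u}\, dz_u.
$$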