Advanced Portfolio Theory (Lecture Notes) : October 2004

Advanced Portfolio Theory
(Lecture Notes)
October 2004
Prof. Dr. Thorsten Hens
Thorsten.Hens@nhh.no
i
TABLE OF CONTENTS
Table of Contents i
List of Symbols v
CHAPTER 1: INTRODUCTION
1 Introduction 1
2 Portfolio Theory as A Subfield of Finance 1
3 The Structure of Modern Financial Markets 1
3.1 Principal-Agent Problems 2
3.2 Time Scale of Investment Decisions 2
4 The Anticipation Principle 2
5 Reflexivity and the Game Structure of Financial Markets 4
6 Rational Investment Process versus Simple Heuristics 6
7 Building Theories 7
CHAPTER 2: FOUNDATIONS FROM PORTFOLIO THEORY

1 Introduction 9
2 The Mean-Variance Analysis 9
2.1 Diversification 10
2.2 The Efficient Frontier 11
2.3 Optimal Portfolio of Risky Assets with a Risk-less Security 11
2.4 Two Fund Separation Theorem 12
2.5 The Structure of the Tangent Portfolio 13
2.6 The Asset Allocation Puzzle 14
3 Market Equilibrium 14
3.1 Capital Asset Pricing Model 15
3.2 Application: Market Neutral Strategies 16
4 Some Open Issues in the Mean-Variance Framework 17
CHAPTER 3: FOUNDATIONS FROM ASSET PRICING

1 Introduction 19
2 The General Equilibrium Model: Basics 19
2.1 Complete and Incomplete Markets 21
3 The Principle of No Arbitrage 24
4 Geometric Intuition for the FTAP 25
5 First Welfare Theorem 27
6 Market Selection Hypothesis with Rational Expectations 28
7 Stock Prices as Discounted Expected Payoffs 29
8 Consequences of No Arbitrage 29
8.1 The Law of One Price 29
ii
8.2 Linear Pricing 30

8.3 Derivatives 30
9 Equivalent Formulations of the No Arbitrage Principle 32
10 Limits to Arbitrage 34
10.1 The Case of 3Com and Palm 34
10.2 The Case of Closed End Funds 35
10.3 The LTCM case 35
10.4 No Arbitrage with Short-Sales Constraints 36
11 Identifying the Likelihood Ratio Process 36
11.1 The Likelihood Ratio Process with CAPM 37
11.2 The Likelihood Ratio Process with APT 38
11.3 The Representative Agent 38
12 The Rationality Benchmark 41
12.1 Empirical Evidence 43
13 Summary 50
CHAPTER 4: RATIONAL CHOICE: EXPECTED UTILITY AND BAYESIAN UPDATING

1 Rational Choice 52
1.1 Preferences 52
1.2 Utility Functions 52
1.3 Concept of Revealed Preferences 53
1.4 Concept of Experienced Preferences 53
1.5 The Concept of Preferences in Finance 53
2 Expected Utility Theory 54
2.1 The Representation Theorem 55
2.2 The Allais Paradox 55
2.3 The Probability Triangle 57
2.4 Ellsberg Paradox 58
2.5 Ambiguity 59
3 Stochastic Dominance 59
3.1 First Order Stochastic Dominance (FSD) 59
3.2 Second-Order Stochastic Dominance (SSD) 60
3.3 State-Dominance and Stochastic Dominance 60
4 A Second Look at Mean-Variance 61
4.1 Mean-Variance Principle as a Special Case of Expected Utility 61
5 Measures of Risk Aversion 63
6 Rational Probabilities 64
7 Rational Time Preferences 66
7.1 Time Preference Reversal 66
iii
7.2 Hyperbolic Discounting 66

7.3 Discounting and Risk Aversion 66
CHAPTER 5: CHOOSING A PORTFOLIO ON A RANDOM WALK: DIVERSIFICATION

1 Introduction 70
2 Returns 70
3 Portfolio Choice 71
3.1 Dynamic Portfolio Choice with CRRA: The “No Time Diversification” Theorem 73
3.2 Rebalancing, Fix Mix and Volatility Pumping 73
4 Asset Pricing 75
5 The Equity Premium Puzzle 77
6 Summary 78
CHAPTER 6: BEHAVIORAL PORTFOLIO THEORY

1 Introduction 80
2 Descriptive versus Prescriptive Theories 80
3 Search, Framing and Processing of Information 81
3.1 Selection of Information 81
3.2 Framing 82
3.3 Processing of Information 83
4 Prospect Theory 85
5 Probability Weighting and FSD 88
CHAPTER 7: CHOOSING A PORTFOLIO ON A MEAN REVERTING PROCESS: TIME DIVERSIFICATION

1 Introduction 90
2 The Mean Reverting Process 90
3 Optimal Portfolio Choice With Mean Reversion 91
4 The Case considered in Samuelson (1991) 92
5 Asset Pricing and Mean reversion 96
6 Conclusion 98
CHAPTER 8: BEHAVIORAL HEDGE FUNDS

1 What are Hedge Funds? 101
2 Hedge Funds Strategies 101
2.1 Long/Short strategies 101
2.2 Arbitrage 101
2.3 Event driven strategies 102
2.4 Directional strategies 102
3 Value at Risk 104
4 Investment Strategies Based on Behavioral Finance 105
4.1 Underreaction 105
iv
4.2 Momentum and Reversal 107

4.3 Strategies based on co-moving assets 107
4.4 Strategies exploiting probability weighting 108
5 Conclusions 109
CHAPTER 9: CHOOSING A PORTFOLIO ON A GARCH PROCESS: RISK MANAGEMENT

1 Introduction 112
2 Evidence 112
3 Coherent Risk Measures 113
4 Crash Measures 115
5 Asset Pricing Models 116
6 Summary 117
CHAPTER 10: EVOLUTIONARY FINANCE: SURVIVAL OF THE FITTEST ON WALL STREET

1 Introduction 119
2 The Ecology of the Market 119
3 Survival of the Fittest on Wall Street 120
4 The Evolutionary Portfolio Model 121
5 A Simulation Analysis 125
6 The Main Results 129
7 Extensions 132
v
LIST OF SYMBOLS
t = 0,1,...,T time periods
s = 1,...,S states of the world
i = 1,...,I investors
w si (exogenous) wealth of investor i in state s
c si consumption of investor i in state s
U i (c oi ,c 1i ,...,c Si ) utility of investor i
w i = w 0i period zero wealth
k = 1,...,K assets where k = 0 is the consumption good
a sk payoff of asset k in state s (resale value and dividends)
⎡a11 a1K ⎤
A = ⎢ 1 K ⎥ payoff matrix (S-rows and K columns)
⎣a S aS ⎦
θki units of asset k bought by investor i
q k price of asset k
Dk dividend of asset k
a sk = qtk+1 + Dtk+1(resale value and dividends)
a sk
Rsk = (gross) return of asset k in state s
qk
(R s0 := Rf for the risk free asset)
⎡R 1 R K ⎤
R = ⎢ 11 1K ⎥ matrix of (gross) returns
⎣RS RS ⎦
rsk = ln(Rsk ) logarithm of gross return
rsk = ln(Rks ) logarithm of gross return
q k θki
λ = i wealth share of investor i invested in asset k
i
k
w
π s state price, i.e. present value of one unit of wealth in state s
* π
π s = S s martingale probability of state s
∑π z
z=1
E expectation with respect to martingale measure

*
π
ps physical probability of occurence of state s

E p expectation with respect to physical measure
probs alternative notation for probabilty of state s
vi
S
E u i = u i (c oi ) + γ i ∑ ps u i (c si ) expected utility where γ i is time preference
s =1
α risk aversion parameter, for example U i (c ti ) = (c ti )1−α

i
i
*
πs
ls = ps
likelihood ratio of state s
S
µ ( x ) = E p ( x ) = ∑ ps x s expectation w.r.t. physical measure
s =1
S
COV ( x , y ) = ∑ ps ( x s −µ ( x ))( y s − µ ( y )) Covariance
s =1
S
σ 2 ( x ) = VAR ( x ) = ∑ ps ( x s −µ ( x ))2 Variance
s =1
COV ( x , y )
ρ (x , y ) = Correlation between x and y
σ ( x )σ ( y )
V i ( µ (c ),σ (c )) mean-variance utility
SD stochastic dominance; first (FSD) and second (SSD) order
N ( µ ,σ ) normal distribution; Φ(0,1) standardized normal distribution
u ``( x )
ARA( x ) = - absolute risk aversion
u `( x )
u ``( x )x
RRA( x ) = - relative risk aversion
u `( x )
RT ( x ) = 1/ ARA( x ) risk tolerance
i.i.d. identically and independently distributed
εt sequence of random disturbances
RW random walk, e .g . : x t +1 = x t + εt where E (εt ) = 0,E (ε 2t ) = σ 2 ,E (εt ,εt +1 ) = 0
AR autoregressive process : x t +1 = ax t + εt ; mean-reversion : ¦ a ¦< 1
ωt ∈ Ω state of the world in period t.
ω t = (ω0 ,ω1,...,ωt ) path up to period t.
Chapter 1:
Introduction
Introduction 1
1 INTRODUCTION
These lecture notes provide a rough guide through the slides of my course Introduction to Financial Economics,
that I teach at the universities of Zurich, Bergen, Paris and Marseille. They should be read while looking also at
the slides. The point of these notes is to provide the general picture and main intuitions. 11 exercises (with
solutions) are set up to penetrate the material. Very brilliant students will be able to fill the details on their own.
On demand I did provide them on the black board while teaching. Also, at the end of each chapter, I give the
references to the literature in which more details can be found. The focus of my course is on recent
developments in portfolio theory and asset pricing. Besides foundations from traditional finance, the course
covers also behavioral and evolutionary finance. I do not deal with corporate finance.
2 PORTFOLIO THEORY AS A SUBFIELD OF FINANCE

The research field Finance contains at least the following various topics.
• Public Finance
• International Finance
• Corporate Finance
• Derivatives
• Risk Management
• Portfolio Theory
• Asset Pricing
Portfolio theory is a descriptive and normative theory. On the one hand it studies how people should
combine assets (normative view). The leading idea can be summarized as the maxim “Do not put all eggs in
one basket”, which is known as the principle of diversification. On the other hand portfolio theory describes
how investors do combine assets (descriptive view). This second aspect has important implications for asset
pricing, because if one knows how investors form their portfolios then one can determine how assets are
priced. This is because prices are determined by the demand and supply generated by the investors`
portfolios decisions.
3 THE STRUCTURE OF MODERN FINANCIAL MARKETS

Modern financial markets are populated by various investors with different wealth, objectives and heterogeneous
beliefs. There are private investors with pension, housing and insurance concerns, firms implementing investment
and risk management strategies, investment advisors providing financial services, investment funds managing
pension or private capital and the government financing the public deficit. The investment decisions are
ultimately implemented by brokers, traders and market makers.
Introduction 2
3.1 Principal-Agent Problems

Many financial markets are dominated by large investors. On the Swiss equity market, for example, more than
75% of the wealth is managed by institutional asset managers providing services to private investors, insurance
funds and pension funds. Since the asset managers` investment abilities and effort are not observable by their
clients, the contract between the principle and the agent must be based on measurable variables such as relative
performance. Though, such contracts may generate herding behavior particularly among institutional investors.
In the words of Lakonishok et al (1992, page 25): “Institutions are herding animals. We watch the same
indicators and listen to the same prognostications. Like lemmings we tend to move in the same direction at the
same time. And that naturally exacerbates price movements.” Additionally, asymmetric information may create
“window dressing” effects, i.e. agents change their behavior upon the beginning of the reporting period.
3.2 Time Scale of Investment Decisions

Investors differ in their time horizon, information process and reaction time. Day traders for example make
several investment decisions per day requiring fast information processing. Their reaction time is limited to some
seconds. Other investors have longer investment horizon (e.g. one or more years). Their investment decisions do
not have to be made “just in time”. On the contrary, popular investment advice for investors with a longer
investment horizon is: “Buy stocks and take a good long sleep”. Investors following this advice are expected to
have a different perception to stocks as Benartzi and Thaler (1995) claim with the following example: ”Compare
two investors, Nick who calculates the gains and losses in his portfolio every day, and Dick who only looks at his
portfolio once per decade. Since, on a daily basis, stocks go down in value almost as often as they go up, Nick's
loss aversion will make stocks appear very unattractive to him. In contrast, loss aversion will not have much
effect on Dick's perception of stocks since at ten year horizons stocks offer only a small risk of losing money”.
At the other end of the time scale are the day traders. In the words of a day trader, interviewed by Wall Street
Journal (1988), the situation is like this: “Ninety percent of what we do is based on perception. It doesn‘t matter
if that perception is right or wrong or real. It only matters that other people in the market believe it. I may know
it‘s crazy, I may think it‘s wrong. But I lose my shirt by ignoring it. This business turns on decisions made in
seconds. If you wait a minute to reflect on things, you‘re lost. I can‘t afford to be five steps ahead of everybody
else in the market. That‘s suicide“.
Thus, intraday price movements reflect how the average investor perceives incoming events. In the very long run
price movements are determined by trends in fundamental data like earnings, dividend growth and cash flows.
How the short run aspects get washed out in the long run, i.e. how aggregation of fluctuations over time can be
modelled is rather unclear.
4 THE ANTICIPATION PRINCIPLE

As the following chart documents, stock prices are driven by the arrival of new information. In a sense the stock
market is a huge prediction machine that always tries to adjust its level to the events anticipated for the future.
However, by the very nature of the information process, news are those events that have not been anticipated
Introduction 3
before. Thus the changes in stock prices must be random. This wisdom was already known in the sixties. Cootner
(1964) expresses it quite clearly by writing: “The only price changes that would occur are those that result from
new information. Since there is no reason to expect information to be non-random in appearance, the period-to-
period price changes of a stock should be random movements, statistically independent of one another.” Hence
rationally anticipated prices must be random!”1
To see how this view of the stock market is sometimes abused, consider the following example. On September
30th 2002 Alan Greenspan decreased interest rates from 1.75 to 1.25 basis points which should be good news
for the stock market since the alternative investment in bonds is now less attractive. However, the DJIA dropped
from 7701 to 7591. A standard explanation is that the decrease in interest rates was already anticipated by the
market. In fact, from September 1990 to September 2002 Alan Greenspan changed interest rates on 46 days. In
27 cases on that day the DJIA moved in the wrong direction! Note that referring to anticipated events in this way
is a tautology, nicknamed “Neptun” by Popper. A tautology arises if one gives a reason to explain a fact and one
argues that the reason is true because the fact is observed. To illustrate this concept Popper refers to a fisher
man in ancient Greece. When he was asked why the see is so rough today he answered: “Because Neptun is
angry today.” And when asked how he did know that Neptun is angry today he answered: “Don’t you see how
rough the see is today?” In finance the concept of expectations is often used as such a Neptun. If some event
1
Note that Cootner (1964) assumes information to be random. A recent study of Damianova, Hens and Sommer (2004)
shows that indeed the news as perceived by business specialist are white noise (i.i.d. and Gaussian).
Introduction 4
drives the stock prices in the “wrong” direction then one argues that it was already anticipated in the prices!
That is to say the concept of expectations used this way cannot be falsified! A rigorous test of the anticipation
principle can never be done this way! Rather one has to analyze the statistical properties of the price process
relative to the news process.
5 REFLEXIVITY AND THE GAME STRUCTURE OF FINANCIAL MARKETS

The concept of reflexivity is described by Soros (1998) in the following way:
“Financial markets attempt to predict a future that is contingent on the decisions people make in the present.
Instead of just passively reflecting reality, financial markets are actively creating the reality that they, in turn,
reflect. There is a two way connection between present decisions and the future events, which I call reflexivity”
and “Reflexivity is absent from natural science, where the connection between scientists‘explanations and the
phenomenon that they are trying to explain runs only one way”. Moreover, Soros writes: “Each market
participant is faced with the task of putting a present value on a future course of events, but that course is
contingent on the present values that all market participants taken together attribute to it”.
One appearance of reflexivity is already described by Keynes (1936, page 156) in his famous beauty contest:
“Professional investment may be likened to those newspaper competitions in which the competitors have to pick
out the six prettiest faces from a hundred photographs, the prize being awarded to the competitor whose choice
most nearly corresponds to the average preferences of the competitors as a whole; so that each competitor has
to pick, not those faces which he himself finds prettiest, but those which he thinks likeliest to catch the fancy of
the other competitors, all of whom are looking at the problem from the same point of view”.
From the perspective of modern economics, Keynes beauty contest is a coordination game, i.e. a game where the
participants get high (low) payoff if they choose the same (a different) action.
Though, if the stock market were a coordination game why do we observe so many stock market fluctuations?
The answer is maybe in the behavior of noise traders. Sometimes investors may have to sell stocks for reasons
exogenous to the market. Rational coordination may brake down because this introduces uncertainty to the
coordination game.
To see the importance of noise trading consider a game played as a sequence of stage games with the following
structure (see Gerber, Hens and Vogt (2003)). There are 5 participants betting 1 point each on “up” or “down”.
A fair dice randomly distributes 6 points on “up” or “down”. If the participant’s prediction is correct, he receives
20 ECU otherwise nothing. The movement of the stock goes up (down) if the majority of points including those
from the noise traders is bet on up (down). The stage game is repeated in two rounds with 100 periods each.
The following chart displays a typical outcome:
Introduction 5
Spiel 1.2.: Kurs vs. Würfel (Sum m en)
20
15
10
5
Kurs (kum.)
Würfel (kum)
0
-5
-10
-15
Per io d en
Even though the noise is unsystematic stock prices show a clear pattern. They gain momentum and have long
periods of up and of down movements and suddenly they revert their direction. Hence stock prices show
momentum and reversal – the overall volatility (measured in terms of variance) is clearly higher than that of the
exogenous noise process. Why this pattern emerges can be seen from the behavior of the participants. Most of
the time they are nicely coordinated but every now and then the coordination is broken by an extreme move of
the exogenous noise. When asked who is responsible for the sudden reverses in stock price movements the
participants answer “The exogenous noise!” But when asked: “So why don’t you stick to the previous
coordination?” They respond because I cannot know that the others did understand that our coordination was
broken by the noise!”
Spiel 1.2.: Einzelne Teilnehm erInnen/ kurs Perioden 1-50
1.2
0.8
Kurs
TeilnehmerIn 1
TeilnehmerIn 2
0.6
TeilnehmerIn 3
TeilnehmerIn 4
TeilnehmerIn 5
0.4
0.2
Per io d e
Certainly the coordination game structure is an important aspect of stock markets. Yet this majority game aspect
has to be complemented with a minority game aspect. This seemingly contradiction is resolved when one makes
a distinction between the values investors hold in their depots (the depot values) and the values they did realize
for consumption. With respect to depot values stock markets are a coordination game. The more I bet on the
idea that prices go up the higher the value in my depot. However when it comes to realizing these values then
the stock market is a minority game, i.e. a game that cannot be won by a majority. In a stock market bubble for
example depot values increase to record levels but when investors try to realize these gains then markets crash.
The tension between the coordination game aspect with respect to depot values and the minority game aspect
with respect to realized values creates the fluctuations on financial markets.
A simple experiment that has both aspects is the so called guessing game that we describe in the next section in
Introduction 6
order to analyze whether strategies based on the idea of common knowledge of rationality or rather simple
heuristics are more successful on financial markets.
6 RATIONAL INVESTMENT PROCESS VERSUS SIMPLE HEURISTICS

The Guessing Game, also called “Pick a Number Game” (cf. Nagel (1995)) has the following rules: each
participant picks a number between 0 and 100. The winner is the participant who is closest to 2/3 of the average
number. Clearly the unique Nash equilibrium is that all participants choose the number 0. Only in this case no
participant has an incentive to change his choice. However it is a robust observation that the typical winning
number is in the range of 17-22. Hence the winners are those who have the best guess about the average
rationality of the participants and then try to outsmart the average by choosing a number below their guess. A
typical distribution of numbers chosen in the guessing game is:
16
14
F r e que nc y
12
10
0
0 13.6 27.2 40.8 54.4 Mor e
Bin
Even though these ideas seem rather simplistic they are valuable to understand hedge fund disasters like the one
of Long Term Capital Management (LTCM). LTCM (run by Meriwether, Merton and Scholes) gathered billions of
dollars from investors to be bet on the idea that markets will eventually converge to where rational theory
predicts them to be. To give an example, consider their pair trading strategy on the stocks of Royal Dutch
Petroleum (RDP) and Shell Transport and Trading (STT). The LTCM managers discovered that the share price of
RDP at the London stock exchange and that of STT at the New York stock exchange do not reflect the parity in
earnings and dividends stated in the splitting contract between these two units of the Royal Dutch/Shell holding.
According to this splitting contract, earnings and dividends are paid in relation 3 (RDP) to 2 (STT), i.e. the
dividends of Royal Dutch are 1.5 times higher than the dividends paid by Shell. Though, for a long time, the
market prices of these shares did not follow this parity as the figure bellow shows and since LTCM did not have
enough liquidity when the spread between the two shares widened they made huge losses on this trade.
Introduction 7
Actually the LTCM managers did not believe Keynes who wrote: “Markets can remain irrational longer than you
can remain solvent!”
7 BUILDING THEORIES
Albert Einstein is known to have said that “there is nothing more practical than a good theory.” But what is a
good theory? First of all, a good theory is based on observable assumptions so to avoid the Neptun fallacy.
Moreover, a good theory should have testable implications – otherwise it is a religion which cannot be falsified.
This falsification aspect cannot be stressed enough. Steve Ross, the founder of the econometric Arbitrage Pricing
Theory (APT), for example, claims that “every financial disaster begins with a theory!” By saying this he means
that those who start trading based on a theory are less likely to react to disturbing facts because they are
typically in love with their ideas. Finally, a good theory is a broad generalization of reality that captures the
essential features of it. Note that a theory does not become better if it becomes more complicated. Since there is
(and according to the rational point of view there has to be!) so much noise in our observations it is really hard
to build a good theory of financial markets. A good advice is to start with empirically robust deviations from
random behavior of asset prices and then to try to explain it with testable hypotheses. Of course this presumes
that you do understand the “Null Hypothesis”, i.e. what a rational market looks like. Therefore a big part of this
course will deal with traditional finance that explains the rational point of view. One way of reducing the noise is
to also use laboratory experiments because then one at least has control about the exogenous noise. However,
the step back into the reality of financial markets may be huge.
REFERENCES
Benartzi and Thaler (1995): Myopic Loss Aversion, Journal of Political Economy.
Cootner (1964): The Random Character of Stock Prices, MIT-press.
Damianova, Hens and Sommer (2004) News as Perceived by Business Specialists, mimeo University of Zurich.
Gerber Hens and Vogt (2003): Rational Investor Sentiment, NCCR-research paper.
Keynes (1936): General Theory, edition from 1964.
Lakonishok et al (1992): The Structure and Performance of the Money Management Industry, Brookings Papers
on Economic Activity, Washington, D.C.: Brookings Institutions: 331-339.
Nagel (1995): Unraveling the Guessing Game. An Experimental Study, American Economic Review, 85, 1313-
1326.
Soros (1998): The Crisis of Global Capitalism, Little, Brown and Company.
Wall Street Journal (1988): Making Book on the Buck, Sept. 23, 1988, p. 17.
Chapter 2:
Foundations from
Portfolio Theory
Foundations from Portfolio Theory 9
1 INTRODUCTION
The mean-variance analysis goes back to H. Markowitz (1952). In his work “Portfolio Theory Selection” he
recommends the use of expected return-variance of return rule,
“…both as a hypothesis to explain well-established investment behavior and as a maxim to guide one‘s
own action”.
Later, Jagannathan and Wang (1996) recognize the mean-variance analysis and the Capital Asset Pricing Model
as
“…the major contributions of academic research in the post-war era.”

Campbell and Viciera (2002) write on page 7:
„Most MBA courses, for example, still teach mean-variance analysis as if it were a universally accepted
framework for portfolio choice“
2 THE MEAN-VARIANCE ANALYSIS

qtk + Dtk
There are k = 1,2,..., K assets. The gross return of the assets is Rtk = k
, where qtk is the market price
qt −1
(or payoff) of asset k in time t, and Dtk are the dividends of asset k in time t. µk = E (Rtk ) is the expected
return2 and σ k2 = Var (Rtk ) is the variance of the gross returns. All assets can be represented in a two-
dimensional diagram with expected return µ as a reward measure and standard deviation σ = Var (Rtk ) as the
risk measure on the axes.
µ
. .
k
µk .
.
.
. . .
Rf .
σk σ
Every asset k can be characterized by the mean and standard deviation of its returns. The risk free-asset for
example has an expected return of R f with a zero standard deviation. An investor who puts all of his money
into risky asset k expects to achieve a return of µ k with a standard deviation σ k .
2
Expected returns are usually calculated using historical return values or Asset Pricing Models.
2.1 Diversification
Combining two risky assets k and j gives an expected portfolio return of µ λ = λµ k + (1 − λ ) µ j , where λ is the
portion of investor’s wealth invested in asset k. The portfolio variance is
σ = λ σ + (1 − λ ) σ + 2λ (1 − λ )Covk , j .
2
k
2 2
k
2 2
j
The advantage of combining risky assets depends on the covariance term. The smaller the covariance the smaller
is the portfolio risk, the higher is the diversification potential of mixing these risky asset. There is no
diversification potential of mixing risky assets with the riskless security, since the covariance of the returns is
equal to zero.
To see how portfolio risk changes with covariance it is convenient to standardize the covariance with the
standard deviation of assets returns. The result is the correlation coefficient between returns of assets k and j:
Covk , j
ρk, j = . The portfolio variance reaches its minimum, when the risky assets are perfectly negatively
σ kσ j
correlated: ρ k , j = −1 . In this case, the portfolio may even achieve an expected return, which is higher than the
risk free rate without bearing additional risk. The portfolio consisting of risky assets does not contain risk
because whenever the return of asset k increases, the return on stock j decreases, so if one invests positive
amounts in both assets, the variability in portfolio returns cancel out (on average).
µ
. .
k .
ρ k, j
= −1
ρ k, j = 1 . .
. .
Rf j .
σ
Investors can build portfolios from risky and risk-free assets but also portfolios from other portfolios etc. The set
of possible µ σ -combinations offered by portfolios of risky assets that yield minimum variance for a given rate
of return is called minimum-variance opportunity set or portfolio bound (see the figure bellow).
µ
σ (µ )
. .
.
MVP .
.
. .
rf .
The investor’s problem when choosing a portfolio is to pick a portfolio with the highest expected returns for a
given level of risk. Alternatively, he can minimize the portfolio variance for different levels of expected return.
Formally, this is equivalent to the following optimization problem:
σ ( µ ) := min ∑∑ λk Covk , j λ j
λ k j
s.t. ∑ λk µk = µ (1.1)
k
and ∑ k
λk = 1
2.2 The Efficient Frontier

The solution of problem (1.1) gives the mean-variance opportunity set or the portfolio bound. In order to identify
the efficient portfolios in this set, one has to focus on that part of the mean-variance efficient set that is not
dominated by lower risk and higher return. This is the upper part of the portfolio bound, since every portfolio on
it has a higher expected return but the same standard deviation as portfolios on the lower part (see the figure
bellow).
Thus, all the portfolios on the efficient frontier have the highest attainable rate of return given a particular level
of standard deviation. The efficient portfolios are candidates for the investor’s optimal portfolio.
µ
. .
.
.
.
. .
rf .
Rf
σ
2.3 Optimal Portfolio of Risky Assets with a Risk-less Security

The best combination of risky assets for riskaverse µ -σ - investors lies on the efficient frontier. Every point on it
is associated with the highest possible return a given certain risk level.
If an investor desires to combine a risky asset (or a portfolio of risky assets) with a risk-less security, he must
choose a point on the straight line3 connecting both assets.
The best portfolio combination is found when this line achieves its highest possible slope4. This line is known as
the Capital Market Line. Its tangent on the efficient frontier gives the best portfolio of risky assets, which is
called Tangent Portfolio.
2.4 Two Fund Separation Theorem

The optimal asset allocation consisting of risky assets and a riskless security depends on investor’s preferences,
αi
which are given by the utility function U ( µλ , σ λ ) = µλ −
2 i
σ λ2,i where α i > 0 is a risk aversion parameter of
2
investor i. The higher this parameter, the higher is the slope of the utility function5. The higher the risk aversion,
the higher is the required expected return for a unit risk (required risk premium).
Different investors have different risk-return preferences. Investors with higher (lower) level of risk aversion
choose portfolios with a low (high) level of expected return and variance, i.e. their portfolios move down (up) the
efficient frontier.
If there is a risk-free security, the Separation Theorem of James Tobin states that agents should diversify between
the risk-free asset (e.g. money) and a single optimal portfolio of risky assets. Since the Tangency Portfolio gives
the optimal mix of risky assets, a combination with the risk-free assets means that every investor has to make an
investment decision on the Capital Market Line. Different attitudes toward risk result in different combinations of
the risk-free asset and the tangent portfolio. More conservative investors for example will choose to put a higher
fraction of their wealth in the risk-free asset; respectively, more aggressive investors may decide to borrow
capital on the money market (go short in risk-free assets) and invest it in the tangent portfolio.
3
Since Cov( Rk , R f ) = 0 , the portfolio variance σ λ is a linear function of the portfolio weights.
4 µλ − R f
The slope of this line is known as the Sharpe Ratio. It is equal to .
σλ
5
The risk aversion concept is often discussed in the expected utility context, where the (absolute) risk aversion is measured
by the curvature of a utility function. We will come back to this point once the expected utility framework has been
introduced.
Thus, the asset allocation decision of investor i is described by the vector of weights:
λ i = (λ0i , (1 − λ0i )λ T ), i = 1,...I 6, where λ i ∈ RK +1 , λ0i ∈ R, and λT ∈ RK .
i
µ
Aggressive Investor
i´ T Best Mix
Moderate Investor
Rf Conservative Investor
σ
„The striking conclusion of his {Markowitz} analysis is that all investors who care only about mean and
standard deviation will hold the same portfolio of risky assets.“7
2.5 The Structure of the Tangent Portfolio

αi
According to the Two-Fund Separation an investor with utility U ( µλ , σ λ2 ) = µλi − σ λ2,i has to decide how to
2
split his wealth between the optimal portfolio of risky assets with a certain variance-covariance structure
(Tangent Portfolio) and the risk-less asset. The structure of the Tangent Portfolio can be found either by
maximizing the Sharpe Ratio subject to a budget constraint8 or by solving the simplest µ -σ maximization
problem. Note that the composition of the Tangent portfolio does not depend on the form of utility function.
αi
max U ( µλ , σ λ2 ) = µλi − σ λ2,i
K +1
λ∈R 2
K
(1.2)
s.t. ∑λ
k=0
k =1
In this problem, λ0 denotes the fraction of wealth invested in the risk-less asset. λ0 can be eliminated from the
K
optimization problem by substituting the budget constraint λ0 = 1 − ∑ λk into the utility function. Using the
k =1
definition of µλ and σ λ2 we get:
6
Note: there is no index i on the Tangent Portfolio λ T since this portfolio is the same for every investor.
7
Campbell and Viceira (2002), p.3.
8
This approach is cumbersome since the portfolio weight λ appears both in the numerator and in the denominator.
αi
max ( µ − R f )λ − λ ' COV λ (1.3)
λ 2
where now λ is the vector of risky assets in the Tangent Portfolio, µ is the vector of risky assets returns, and
Cov is their variance-covariance matrix. The first order condition of the problem is:
1
COV λ1 = (µ − R f )
αi
If there are no constraints on λ1 , then the solution is:
1
λ1 = COV −1 (µ − R f ) (1.4)
αi
With short-sales constraints, λ ≥ 0 , for example, one can apply standard algorithms for linear equation systems
to solve the problem.
Say, the solution to the first order condition is λ opt , then the Tangency Portfolio can be found by a
λkopt
renormalization: λkT = . Note that the risk aversion parameter α i cancels after the renormalization,
∑ λj
j
opt
which is the Two Fund Separation property.
Using more sophisticated functions than (1.2) will not change the result obtained in (1.4).
2.6 The Asset Allocation Puzzle

Canner, Mankiw und Weil (1997) studied investors’ portfolios managed by different banks and found that the
ratio of bonds to stocks differs substantially among investors with different risk attitude. This is a puzzle in the
sense that professional bank advisors seem not to follow the Two-Fund Separation Theorem as an asset
allocation rule, i.e. the portfolio structure of risky assets is not constant and varies with differences in client’s risk
aversion. For example, the ratio of the portfolio weight of S&P500 to bonds changes from 0.25 to 1.5 in
conservative, moderate and aggressive portfolios.
3 MARKET EQUILIBRIUM
If individual portfolios satisfy the Two-Fund-Separation then by setting demand equal to supply the sum of the
individual portfolios must be proportional to the vector of market capitalizations9 λkM .
⎡ ⎤
∑λ i
i
k = ⎢ ∑ (1 − λ0i ) ⎥ λkT = λkM Hence, the normalized Tangency Portfolio will be identical to the Market
⎣ i ⎦
9
The market capitalization of a company for example is the market value of total issued shares.
Portfolio.10
µ
CML
λM
Rf
j
Curve j is obtained by combination of some asset j with the market portfolio.
To determine a risk measure in a financial market with µ -σ -investors we need an asset pricing model.
3.1 Capital Asset Pricing Model

To understand the link between the individual optimization behavior and the market, compare the slopes of the
Capital Market Line and the j-curve. By the tangency property of λM they must be equal (see the figure above).
The slope of the Capital Market Line can be calculated dividing:
d µ (λ R f + (1 − λ ) RM ) dσ (λ R f + (1 − λ ) RM )
λ =0 = µ f − µ M by λ =0 = −σ M .
dλ dλ
The slope of the j-curve is
d µ (λ R j + (1 − λ ) RM ) dσ (λ R j + (1 − λ ) RM ) Cov( R j , RM ) − σ M2
λ =0 = µ j − µ M divided by λ =0 =
dλ dλ σM
From the slope’s equality at point λM follows:
( µ j − µ M )σ M µ M − µ1
=
Cov( x j , xM ) − σ 2
M σM
or equivalently: µ j − µ f = β jM ( µ M − µ f )
Cov( R j , RM )
where β jM =
σ M2
The result is the Capital Asset Pricing Model (CAPM).
10
Note that this equality is barely supported by empirical evidence, i.e. the Tangent Portfolio does not include all assets.
The reason for this mismatch could for example be that not every investor optimizes over risk and return as suggested by
Markowitz.
µ
SML
λM
µM µk − µ f = β kM ( µ M − µ f )
Rf
1 β
The difference to the mean-variance analysis is the risk measure. In the CAPM the asset’s risk is captured by the
factor β instead by the standard deviation of asset’s returns. It measures the sensitivity of asset j returns to
changes in the returns of the market portfolio. This is the so called systematic risk.
3.2 Application: Market Neutral Strategies

The Capital Asset Pricing Model has many applications for investment managers and corporate finance. Even
professionals dealing with alternative investments consider it for building portfolios. One example is a form of
Market Neutral Strategy followed by some hedge funds. This strategy aims a zero exposure to market risk. To
exclude the impact of market movements, it takes simultaneous long and short positions on risky assets. These
assets have the same beta (as measure for market risk) but different market prices. Under the assumption that
market prices will eventually return to their fundamental value defined by the CAPM, hedge fund managers take
long positions in underpriced assets and short positions in overpriced assets. In terms of expected returns, the
long (short) positions are in assets with higher (lower) expected returns than in the CAPM.11
µ SML
.
.
long
. .
. . .
.
. .
.
. short .
.
r Market neutral
11
When prices revert and increase (decrease) in order to reach their fundamental value, the expected returns are
decreasing (increasing).
4 SOME OPEN ISSUES IN THE MEAN-VARIANCE FRAMEWORK

The assumption of mean-variance preferences postulates that investors’ preferences depend only on the
mean and variance of payoffs. Though, experimental research provides a different description of
investors’ behavior under uncertainty. It suggests that, investor’s preferences are not always defined over
absolute payoffs. Individual decisions are often motivated by relative changes, i.e. gains and losses
compared to a certain reference point. Additionally, mean-variance preferences show some
inconsistencies with the axioms of rational choice.
The assumption of homogeneous beliefs requires that all investors have the same expectations about the
return distribution. In this case, all investors will face the same investment opportunity set and the
minimum-variance set will be identical for all investors. Though, if there is a disagreement among
investors’ beliefs, the composition of the tangency portfolio will not be uniquely determined. Then, it is
not obvious whether CAMP will hold. The relevance of the model is further questioned in the presence of
background risks that generate a mismatch between incentives and diversification behavior (e.g. CEOs
hedging firm’s stock options; employees holding shares of the company they currently work for).
The mean-variance framework suggests two portfolios: the tangent portfolio and the market portfolio.
These are quite different portfolios since typically the market portfolio is inefficient and the tangent
portfolio is underdiversified.
The mean variance analysis is not a dynamical concept. The optimal portfolio over a long period of time
is not just the sequence of optimal portfolios over short periods of time. Summing up one-period
decisions over a long period of time includes the risk of error accumulation.
The standard deviation is not always an appropriate risk measure, because of left sided fat tails in return
distributions.
REFERENCES:
Textbooks:
Huang and Litzenberger (1988): Foundations for Financial Economics, North Holland,
Chapters 1-4
Campbell and Viceira (2002): Strategic Asset Allocation, Oxford University Press, Chapters 1-2
Research Papers:
Later, Jagannathan and Wang (1996): The Conditional CAPM and the Cross-Sections of Expected Returns,
Journal of Finance (51), 3-53
Campbell and Viciera (2002): Strategic Asset Allocation, OUP
Canner, Mankiw and Weil (1997): An asset Allocation Puzzle, American Economic Review (87), pp.181-191
Markowitz, H. (1952): Portfolio Theory Selection, Journal of Finance (7), 77-91

Chapter 3:
Foundations from Asset Pricing

Foundations from Asset Pricing 19
1 INTRODUCTION
In this chapter, we will extend the two-period equilibrium model asking what the equilibrium prices in a multi-
period setting are if agents are not restricted to be mean-variance optimizers as in the Capital Asset Pricing
Model. An equilibrium model in a sequence economy consists of a description of time and uncertainty, a
description of the real side of the economy (goods, agents, preferences, and technology), trading arrangements,
and a description of agents’ behavior. An equilibrium is described by the conditions under which the decisions of
all agents in an economy are mutually consistent.
2 THE GENERAL EQUILIBRIUM MODEL: BASICS

To find the equilibrium in a system that exists over more than two periods, it is necessary to define first the
uncertainty associated with time and information. Similar to Lucas (1978), our model is defined over discrete
time that goes to infinity, i.e. t = 0,1, 2,... . The information structure is given by a finite set of realized states
ωt ∈ Ωt in each t. The uncertainty with respect to information decreases with the time since at every t only one
state is realized. The path of state realizations over time is denoted by the vector ω t = (ω 0 , ω1 ,..., ω t ) . The time-
uncertainty can be described graphically by an event tree consisting of an initial date ( t = 0 ) and Ωt states of
nature at the next date.
t=0 t=1 t=2
The probability measure determining the occurrence of the states is denoted by P. Note that P is defined over the
set of paths ω t . We call P the physical measure since it is exogenously given by nature. We use P to model
the exogenous dividends process. If these realizations are independent over time, P can be calculated as a
product of the probabilities associated with the realizations building the vector ω t = (ω 0 , ω1 ,..., ω t ) . For
example, the probability to get two time “head” by throwing a fair coin is equal the probability to get “head”
once (equal to 0.5) multiplied with the probability to get “head” in the second run (equal to 0.5).
In our model the payoffs are determined by the dividend payments and capital gains in every period. Let
i = 1,..., I denote the investors. There are k = 1,..., K long-lived assets in unit supply that enable wealth
transfers over different periods of time. k = 0 is the consumption good. This good is perishable, i.e. it cannot be
used to transfer wealth over time. All assets pay off in terms of good k=0. This clear distinction between means
to transfer wealth over time and means to consume is taken from Lucas (1978).
In a competitive equilibrium with perfect foresight every investor decides about his portfolio strategy
according to his consumption preferences12 over time θ ti=0,1,... = arg max U i (θ i ,0 ) under the budget constraint
θti ∈R K +1
K K
θti ,0 + ∑ qtkθti ,k = ∑ ( Dtk + qtk )θti−,k1 , where Dtk are the total dividend payments of asset k and the prices
k =1 k =1
qtk=0,1,... are exogenously given, all investors agree on the prices given the realized state.13 The market clearing
I
condition ∑θ
i=1
t
i,k
= 1, k = 1,..., K equalizes demand and supply.
Note that θti ,k (ω t ) ∈ R is the number of assets k that agent i has in period t given the path ω t .
θti (ω t ) ∈ R k +1 is the portfolio of assets that agent i has in period t given the path ω t , and θti ,k t = 0,1, 2,... is
the portfolio strategy along the set of paths. Note also that we did normalized the price of the consumption
good to be one and we used the Walras law to exclude the market clearing condition for the consumption good.
I
We start the economy with some initial endowment of assets θ −i 1 such that ∑θ
i=1
i
-1 = 1 . Assets start paying
K K
dividends in t = 0 , i.e. the budget constraint at the beginning is θ 0i ,0 + ∑ q0kθ 0i ,k = ∑ (q0k + D0k )θ −i ,1k . We can
k =1 k =1
think of t = 0 as the starting point of our analysis, i.e. θ −i 1 can be interpreted as the allocation of assets that we
inherit from a previous period (“the past”). Hence, in a sense the economy can be thought of as restarted at
t = 0.
Instead of using the number of assets k hold in the portfolio of investor i in time t, the investors’ demand can be
λti ,k wti
expressed in terms of asset allocation or percentage of the budget value. This is θ ti ,k = . Equalizing
qtk
I
λti ,k wti I
demand with supply, i.e. ∑
i =1 qtk
= 1 , gives qtk = ∑ λti ,k wti . In other words, the price of asset k is the wealth
i =1
average of the traders’ asset allocation for asset k. This pricing rule follows from the simple equilibrium condition
that demand is equal supply. No other assumptions are necessary to derive this result!
Since our analysis is focused on the development of the strategies wealth over time, it is useful to rewrite the
model in terms of λti ,k (i.e. strategy as a percentage of wealth) instead of θti ,k (i.e. strategy in terms of asset
units). Thus,
12
Note that investors’ preferences are defined over his consumption and not over the depot value. The utility function
representing investors’ time preferences and risk attitude determines the consumption, which is smoothed over the realized
states.
13
Investors may disagree on the probability distribution of the states but they agree on the prices conditionally on the
states.
Definition:
A competitive equilibrium with perfect foresight is a list of portfolio strategies λt=0,1,...

i
,i = 1,..., I and
a sequence of prices q kt=0,1,... , k = 1,..., K such that for all i = 1,..., I and for all t = 0,1,...
⎡ Dtk + qtk i , k ⎤ i
λti=0,1,... = arg max U i ( λ i ,0 wi ) s.t. wti = ⎢ ∑
K
k
λt −1 ⎥ wt −1 and λt ( wt ) ∈ ∆ k +1
λt ∈∆ K +1 ⎣ k =1 q t −1 ⎦
I
and markets clear: ∑λ
i=1
t
i,k
wti = qtk , k = 1,..., K t =1,2,...
In other words, in a competitive equilibrium all investors choose an asset allocation λti=0,1,... that maximizes their
utility over time under the restriction of a budget constraint with a stochastic compound interest rate.14 The
compatibility of these decision problems today and in all later periods and events is assured by the assumption of
perfect foresight. This equilibrium is therefore also called equilibrium in plans and price expectations.
2.1 Complete and Incomplete Markets

The model can considerably be simplified if markets are complete, i.e. if there are sufficient many assets to hedge
all risks. With complete markets – as we will show below – the sequence of budget constraints can be reduced
to a single budget constraint.
Definition:
A financial market (D, q) is said to be complete if any consumption stream θt0 (ω t ) can be attained
with some initial wealth w0 , i.e. it is possible to find some trading strategy θ t such that for all periods
t = 1, 2,...
K K
θt0 (ω t ) + ∑ qtk (ω t )θtk (ω t ) = ∑ ⎡⎣ Dtk (ω t ) + qtk (ω t ) ⎤⎦ θtk−1 (ω t −1 )
k =1 k =1
A financial market is said to be incomplete if there are some consumption streams that cannot be
achieved whatever the initial wealth is. This may arise if there are more states than insurance possibilities.
The necessary and sufficient condition for a financial market to be complete is:
k =1,..., K
rank ⎡⎣ Dtk (ω t −1ωt ) + qtk (ω t −1ωt ) ⎤⎦ = Ωt (ω t ) for all ω t , t = 1, 2,...
ωt ∈Ωt
At (ω t −1ωt )
Hence, if K < maxωt=1,2,... Ωt (ωt ) for some ω t , then markets are incomplete.
K
Dtk + qtk i ,k
14
The budget constraint is defined over the wealth in period t and t-1, where the product ∑
k =1 qtk−1
λt −1 is the
compound interest rate .
For example, in a symmetric tree model, where the set of possible states is equal to S, a necessary condition for
market completeness is that the number of assets is not smaller than the number of possible states, i.e. K ≥ S .
To illustrate the concept of market completeness let us consider the following example:
[2,1]
A
[2+3,1+2]
3 [1,1]
3
[1+1,-1+0] [1,0]
B 1
0
Asset payoffs (dividend payments and prices) are given in brackets over every possible state. The number below
indicates the target consumption θ 0 over the time. There are two assets, i.e. k = 1,2. An investor has to decide
how many units of asset 1 and 2 to buy at t = 0 and t = 1 in order to achieve the target consumption path.
Starting at the end of the period, an investor has to solve the following problem for node A:
2θ11 ( A) + θ12 ( A) = 4
θ11 ( A) + θ12 ( A) = 3
In other words, the sum of the payoffs of asset 1 and 2 multiplied with the number of assets hold in the two
possible states in node A must be equal to the target consumption in these states. The solution of the equation
system is: θ11 ( A) = 1, θ12 ( A) = 2 , i.e. in t = 1 given that node A is realized an investor has to hold 1 units of
asset 1 and 2 units of asset 2. This portfolio costs are: 3 *1 + 2 * 2 = 7
Applying the same calculation procedure for node B, we get:
1θ11 ( B) + 0 θ12 ( B) = 1
Thus, θ11 ( B) = 1, θ12 ( B) = 0 , i.e. in t = 1 given that node B is realized an investor has to hold 1 unit of asset 1
and no asset 2. The portfolio costs are: 1*1 + (−1) *0 = 1
Applying the above procedure for t = 0 , we get:
(2+3)θ 01 + (1 + 2)θ 02 = 3 + 7
(1+1)θ 01 + (−1 + 0)θ 02 = 0 + 1
which gives θ 01 = 13 /11,θ 02 =15/11 .
The same argument can be done for any other target consumption. Hence, in this example markets are
complete. An alternative way to define market completeness is by saying that every new asset is redundant to
the already existing assets.
Definition:
An asset is redundant if its payoffs are a linear combination of the existing assets. In that case it is
possible to find prices for the assets such that the introduction of the asset does not change the set of
attainable consumption streams. Respectively, any target consumption stream can be also achieved with
other assets.
Hence, an asset is redundant if it has payoffs D K +1 which are a linear combination of the existing assets
K
k = 1, 2,..., K : D K +1 = ∑ α k D k . Choosing prices according to the following linear rule:
k =1
K
q K +1 = ∑ α k q k in every event ω t we have:
k =1
{
rank ⎡⎣ Dtk (ω t −1ωt ) + qtk (ω t −1ωt ) ⎤⎦ ωk =t ∈Ω
1,..., K
t
}
⎡⎣ DtK +1 (ω t −1ωt ) + qtK +1 (ω t −1ωt ) ⎤⎦
k =1,..., K
= rank ⎡⎣ Dtk (ω t −1ωt ) + qtk (ω t −1ωt ) ⎤⎦
ωt ∈Ωt
i.e. the rank of the payoff matrix that includes the payoffs of the redundant assets does not change, since the
additional column in the payoff matrix is a linear combination of other columns.
One possibility to represent the concept of market completeness is to map the consumption stream from the
state-preference15 model (e.g. Lucas model) into the mean-variance framework from the previous chapter. If the
introduction of additional assets enlarges the efficient frontier, then the original economy could not have had
complete markets. Why? If assets are redundant in the state-preference model, their payoffs must be perfectly
correlated with the payoffs of other assets16. Thus, their inclusion in the portfolio of risky assets would not
change the efficient frontier shape. Consider for example hedge funds. With investments in various derivative
products they can reduce the variance of their payoffs without cutting the expected returns. This reduces the
correlation with other risky assets with linear payoffs and increases the diversification potential of hedge fund
investing. As a consequence, the efficient frontier including hedge funds shifts to the left promising investors to
receive the same expected returns with lower standard deviation. If hedge funds change the mean-variance
opportunity set (they are not perfectly positively correlated with other assets), they cannot be redundant in the
state-preference context. Thus, the economy without these assets must have been incomplete.
To summarize, redundant asset in the state-preference model are also redundant in the mean-variance
framework and a market is complete if any other asset is redundant. If assets are redundant, they do not create
additional insurance possibilities, the efficient frontier in the mean-variance framework and the rank of the
15
The state preference model is based on the idea that agents build their preferences over scenarios, or with respect to the
occurrence of different states. In contrast, in the mean-variance framework, agents consider only the mean-variance profile
of assets returns over the whole investment period. The occurrence of particular scenario gains an importance only if it
influences the mean and the variance of the final payoff.
16 Cov( A, α A) αVar ( A)
ρ ( A, α A) = = =1
σ ( A)σ (α A) αVar ( A)
payoff matrix in the state-preference model do not change. If there are non-redundant assets, i.e. assets that
change the rank of the payoff matrix and the efficient frontier, their inclusion creates additional insurance
possibilities, thus the market cannot have been complete.
3 THE PRINCIPLE OF NO ARBITRAGE

An arbitrage opportunity is a risk-less profit. In equilibrium, there cannot be any arbitrage opportunities because
this would conflict with the assumption that agents find an optimal portfolio. Adding an arbitrage opportunity
can improve any portfolio.
Definition:
An arbitrage is a self-financing trading strategy, i.e. there is some strategy θ t=0,1,... with θ −1 = 0 such
that for all t = 0,1,2,...
K K
θt0 + ∑ qtkθtk = ∑ ( Dtk + qtk )θtk−1
k =1 k =1
and the resulting consumption is positive: θt0 > 0 ,
i.e. θt0 (ω t ) ≥ 0 for all ω t and all t = 0,1,2,.... and θ 0 ≠ 0
The non-existence of arbitrage opportunities is fundamental for asset pricing. In the absence of arbitrage prices
can be represented as conditional expected values. Since these prices must be consistent with the optimization
calculus of agents with different utility functions and different risk-aversions, we use the so called risk-neutral
measures π t=1,2,.. to build expected values. The existence of this measure is guaranteed by the absence of
arbitrage. This result is known as the Fundamental Theorem of Asset Pricing (FTAP).
FTAP:
There is no arbitrage opportunity if and only if there is a state price process π t=1,2,.. 0 such that
1
qtk−1 (ω t −1 ) = ∑
π t −1 (ω ) ωt ∈Ωt
t −1
π t (ω t )( Dtk (ω t ) +qtk (ω t )) (1.5)
where ω t = ω t −1ωt .
In other words, we can price assets by discounting their payoffs with respect to the state prices, which depend
on agent’s preferences only indirectly through the dependence on asset prices. Note that state prices are not
equivalent to the so called physical (or objective) probability measure p. The state prices are a theoretical
construct that help to find the fair17 price of payoffs. In particular, π t=1,2,.. is the price of an elementary security
paying 1 only in state ωt of the path ω t .
17
In the insurance context, „fair“ means that the insurance premium must be equal to the expected damages.
Proof: (Risk-neutral measure implies no arbitrage)
Consider any self-financing strategy, i.e. suppose that for all t=0,1,2, and for all ω t the budget constructions are satisfied:
K K
θ t0 (ω t ) + ∑ qtk (ω t )θ tk (ω t ) = ∑ ( Dtk (ω t ) + qtk (ω t ))θtk−1 (ω t −1 )
k =1 k =1
Multiplying both sides with π t (ω t ) and adding across ω t gives:
⎛ K
⎞ ⎛ K ⎞
∑
ω
π (ω ) ⎜ θ
t
⎝
t
t
0
(ω t ) + ∑ qtk (ω t )θtk (ω t ) ⎟ = ∑ π t (ω t ) ⎜ ∑ ( Dtk (ω t ) + qtk (ω t ))θtk−1 (ω t −1 ) ⎟
k =1 ⎠ ωt ⎝ k =1 ⎠
t
Using (1.5) we get:
⎛ K
⎞ K
∑
ω
π (ω ) ⎜ θ
t
⎝
t
t
0
(ω t ) + ∑ qtk (ω t )θtk (ω t ) ⎟ = ∑ qtk−1 (ω t −1 )θtk−1 (ω t −1 )
k =1 ⎠ k =1
t
Adding this along all paths ω t =(ω0,ω1, ...,ωt ) gives:
∑∑
ω
t
π (ω )θ t
t
t
0
(ω t ) = ∑ q0kθ −k1
k =1
t
Now, if θt0 (ω t ) > 0 and π t (ω t ) 0 this would require θ −1 > 0 , saying that a positive payoff incurs positive costs, ruling out
arbitrage opportunities.
QED
Note that in this proof we did reduce the sequence of budget constraints to a single budget constraint in
terms of present values of all future expenditures and incomes. This single budget constraint always
needs to hold. If markets are incomplete then in addition one needs to make sure that the process of
gaps between an agent’s expenditure and his income can be achieved with the set of available assets.
4 GEOMETRIC INTUITION FOR THE FTAP

To give the geometric intuition of the Fundamental Theorem of Asset Pricing, we consider a simple case: the
one-period model. It simplifies the future significantly: it consists of a finite number of states s = {1,..., S } at the
second date.18 Thus, the optimization problem of a consumer is:
max U i (c0i , c1i ,..., cSi )

θ
c = w0i − ∑ qkθ k ,
i
0
k
K
csi = ∑ Dskθ k , s = 1,..., S .
k =1
where c0i is the consumption in t=0 (this is a short-hand notation to θ 0i in the multi-period model), wi0 is the
18
In the two-period model we use the simple notation s=1,2,…,S for the states ω1 ∈Ω1 .
∑(D + q k )θ 0 , where θ 0 is the portfolio inherited from the past.

K
k k k
wealth an agent i starts with: 0
k =1
In an equilibrium, arbitrage opportunities are excluded, i.e there is no vector of strategies θ ∈ R K such that
⎡ q´⎤
⎢ D ⎥ θ > 0 . In other words, there is no such vector of strategies θ ∈ R that does not require
K
⎣ ⎦
capital, − q 'θ ≥ 0 , but provides a non-negative payoff, Ds θ ≥ 0 , and a strictly positive payoff in minimum one
of the states including s=0. By the Fundamental Theorem of Asset Pricing, the no arbitrage condition is
S
equivalent with the existence of a state prices π ∈ R++
S
, such that qk = ∑ Dsk π s , k = 1,..., K . We already
s=1
showed the proof that prices calculated as expected payoff values with respect to the risk-neutral probabilities
are arbitrage-free. Now, we are going to show graphically that the no arbitrage condition requires the existence
of positive state prices, i.e. that the price of asset k is a positive linear function of its payoffs.
Consider two assets paying dividends in s = 1,..., S . The payoffs in these states can be represented by the
vector Ds ∈ R 2 . For simplicity we consider the case of two states. Thus, the dividend payoff of the both assets in
s = 1, 2 can be represented by the vectors D1 and D2 . To find the set of non-negative payoffs given a particular
state, we first determine the bound, where the asset payoff, Ds θ , is equal to 0. This is a line orthogonal to the
payoff vector19. Plotting these orthogonal lines for the vectors D1 and D2 , we determine the set of non-
negative payoffs in both states as crossing area of two half planes as shown in the figure below.
arbitrage
D2
D1
q
To determine the set of arbitrage opportunities, we have to find a strategy requiring investments, i.e. −q 'θ ≥ 0
or q 'θ ≤ 0 with a positive payoff in at least one of the states. To find the set of arbitrage portfolios we then plot
the price vector q so that conditions q 'θ ≤ 0 and Ds θ are satisfied. This is possible if and only if q does not
19
The scalar product of two vectors is the angle build by these vectors. The scalar product is positive (negative) if the angle
is smaller (greater) than 90 degrees. The scalar product of orthogonal vectors is equal to 0.
belong to the cone of D1 and D2 , i.e. if there exists 1 > π > 0 such that q = π D1 + (1 − π ) D 2 .
5 FIRST WELFARE THEOREM

An allocation of consumption streams {θti=,00,1,... (ωt )}
I
is attainable if each component is in the consumption set
i =1
of the agent and it does not use more consumption than is available from the dividend process:
I K
∑θ
i =1
t
i ,0
(ω t ) = ∑ Dtk (ω t ) for every ω t , t = 0,1, 2,...
k =1
In a financial market the allocation of consumption streams (θti=,00,... (ω t ) ) is Pareto-efficient if and only if it is
I
i =1
( )
I
attainable and there does not exist an alternative attainable allocation of consumption streams θˆti=,00,... (ω t ) ,
i =1
such that no consumer is worse off and some consumer is better off:
( )
Ui θˆti =,00,... (ωt ) ≥ Ui (θti =,00,... (ωt ) ) for all i and > for some i.
Note that market efficiency does not rule out that some agents consume much more than others. From the
perspective of fairness, this might not be optimal.
Proof: (The First Welfare Theorem):
( )
I
Suppose θˆti =,00,... (ωt ) is an attainable allocation that is Pareto-better than the financial market allocation:
i =1
( )
i.e. Ui θˆti =,00,... (ωt ) ≥ Ui (θti =,00,... (ωt ) ) for all i and > for some i.
Question: Why did the agents not choose θˆti =,00,... (ωt ) ? ( )
Since markets are complete the alternative allocation must have been too expensive:
∑∑ π t (ωt )θˆti ,0 (ωt ) > ∑∑ π t (ωt )θti ,0 (ωt )
t ωt t ωt
Adding across consumers gives:

∑∑∑ π t (ωt )θˆti ,0 (ωt ) > ∑∑∑ π t (ωt )θti ,0 (ωt )
i t ωt i t ωt
⇔ ∑∑ π t (ω ) ∑θˆ (ω ) > ∑∑ π t (ωt ) ∑θti ,0 (ωt )

t
t
i ,0 t
t ωt i t ωt i
K K
∑ Dtk (ωt ) ∑ Dtk (ωt )
k =1 k =1
Which is a contradiction!
QED
The above mentioned result only uses that the agents’ utility functions are increasing in θti ,0 . If the functions are
moreover smooth, i.e. continuously differentiable, and if boundary solutions are ruled out, Pareto efficiency can
be defined using the consumers’ marginal rates of substitution.
Then, if an allocation is Pareto-efficient the marginal rates of substitution needs to coincide across consumers:
∂θ 0 U i (c i ) ∂θ 0 U j ( c j )
MRS i
s,z = s
= s
= MRS sj, z
∂θ 0 U (c )
i i
∂θ 0 U ( c )
j i
z z
The graphical representation of this efficiency concept in the Edgeworth Box is:
csj i
czi
efficient allocation
czj
j
csi
6 MARKET SELECTION HYPOTHESIS WITH RATIONAL EXPECTATIONS

We can use the Pareto-efficiency property to formulate a Market Selection Hypothesis that determines which
investor survives best in the dynamics of the market in terms of (relative) wealth over time. If every investor has
some expected utility function with possible different time preferences and different risk attitude and if payoffs
are stochastic, then investor i will dominate investor j if his beliefs on the occurring of the states are more
accurate. Note that investor’s dominance is not defined over his strategy but on his ability to make good
estimates.
∞
Let investors maximize their expected utilities: Eui = ∑ (δ i ) ∑
t
p
t
i
(ω )u i (θt i (ω t )) they may differ with respect
t =0 ω
to the personal discount factor δ , the risk preferences u i (.) and personal beliefs p i (ω ) for the occurrence of a
i
particular event ω . Consider, the marginal rate of substitution of two expected utility investors between two
states s and z. Pareto-efficiency requires:
psi ∂u i ( wsi ) psj ∂u j ( wsj )

MRS i
s,z = i i i = j j j = MRS sj, z
pz ∂u ( wz ) pz ∂u ( wz )
If investors differ in their beliefs (expectations) for state s, e.g. psi > psj , there must also be some states such
that pzi < pzj , then w is > wsj and w iz < wzj to ensure the equality above.20 Thus, the better the agent’s beliefs,
the more those agents get in the more likely states. Note that the only one requirement for our fitness criteria to
hold is decreasing marginal utilities as in the expected utility framework.
The result that investors get more wealth in those states they assign a higher probability goes back to Sandroni
(2000) and Blume and Easley (2002). They formulate a theorem for competitive equilibrium with perfect
foresight that selects for consumers with correct beliefs. It says that in dynamically complete markets where the
dividend process is i.i.d. two expected utility investors i and j with different beliefs such that P i = P (investor i
20
This result follows from the decreasing marginal utility. The more wealth an investor receives the lower is his marginal
utility.
wti wti
has correct beliefs) and P ≠ P , then P almost surely lim j = 1 , where j is the relative wealth of investors
j
t →∞ w wt
t
i and j. In other words, excluding impossible events in the long run, investor i is wealthier than investor j if he
makes better predictions. If markets are complete the degree of risk aversion is not essential for capital
accumulation.
7 STOCK PRICES AS DISCOUNTED EXPECTED PAYOFFS

Suppose the first asset is short-lived and risk free (e.g. a one period loan-saving contract).
1
t −1 ∑
Then applying (1.5) gives: qt1−1 (ω t −1 ) = π t (ω t )( Dt1 (ω t ) + qt1 (ω t )) . Since the security is risk-less,
π t −1 (ω ) ωt ∈Ω 1 0
its payoff Dt1 (ω t ) is equal to 1 in all states. Under the assumption that the asset lives only one period, there is
no price for it after this period, i.e. qt1 (ω t ) is equal to 0.
This is equivalent to:
1 1
t −1 ∑
qt1−1 (ω t −1 ) = π t (ω t ) ≡
π t −1 (ω ) ωt ∈Ω 1 + rf (ω t −1 )
1
Using this and the condition qtk−1 (ω t −1 ) = ∑ π (ω t )( Dtk (ω t ) +qtk (ω t )) , we get:
π t −1 (ω t −1 ) ωt ∈Ω t
1
qtk−1 (ω t −1 ) =
1 + rf t −1 (ω t −1 ∑π
)ω
*
t (ω t )( Dtk (ω t ) +qtk (ω t ))
t ∈Ω
π t (ω t )
where π (ω ) =
* t
> 0 is indeed a (risk-neutral) probability measure based on the information of
t
∑ π t (ω t ).
ωt ∈Ω
period t-1. Hence, asset prices can be presented as discounted expected payoffs, conditional on the information
available at the time of valuation. This is a sequence of events (or a path) realized from the beginning until t − 1 .
8 CONSEQUENCES OF NO ARBITRAGE
8.1 The Law of One Price

The Law of One Price says that if from period t onward two assets have identical dividend processes, then in
period t-1 they must have the same price.
To see this, suppose that qt1-1 < qt2-1 and D1τ =t,t+1,... =Dτ1 =t,t+1,... . Buying θ t-1
1
units of the cheaper asset and selling
the same amount of the expensive asset gives: (qt2-1 - qt1-1 )θt1-1 > 0 in t-1 and in all other periods the portfolio is
K K
hedged, i.e. ∑q
k =1
k
t (ω t )θtk (ω t ) = ∑ ( Dtk (ω t ) + qtk (ω t ))θ tk−1 (ω t −1 )
k =1
since θtk (ω t ) = 0 for k ≠ 1,2 and θtk (ω t ) = θtk−1 (ω t −1 ) for k=1,2 .
8.2 Linear Pricing

The second consequence of the no arbitrage principle is the concept of linear pricing saying that if in period t-1
one buys and holds a portfolio θˆ then in the same period the price of the portfolio must be a linear
t-1
combination of the prices of its components:

K
qθt-1 =θˆt-1qt −1 = ∑ θˆtk−1qtk−1
ˆ
k=1
To see this, suppose that qθt-1 > θˆt-1qt −1 . Then, buying θˆtk−1 units of asset k and selling the portfolio θˆt −1 gives:
ˆ
qθt-1 − θˆt-1qt −1 > 0 in t-1 and otherwise the position is hedged:

ˆ
K K
qθt θˆt + ∑ qtk (ω t )θˆtk (ω t ) = ( Dtθ (ω t ) + qtθ (ω t ))θtθ−1 (ω t −1 ) + ∑ ( Dtk (ω t ) + qtk (ω t ))θˆtk−1 (ω t −1 ) because
ˆ ˆ ˆ ˆ
k =1 k =1
K
Dtθ (ω t ) = ∑ Dtk (ω t ) and θˆtk (ω t ) = θˆtk−1 (ω t −1 ) .
ˆ
k =1
8.3 Derivatives
The Fundamental Theorem of Asset Pricing is essential for the valuation of redundant assets such as derivatives.
In general, there are two possible ways to determine the value of a derivative. The first approach is based on
determining the value of a hedge portfolio. This is a portfolio of assets that delivers the same payoff as the
derivative. The second approach uses the risk-neutral probabilities in order to determine the current value of the
derivative’s payoff.
Consider an example of the one-period binomial model. In this simplified setting, we are looking for the current
price of a call option on a stock S. Assume that S = 100 and there are two possible prices in the next period:
Su = 200 if u = 2 and Sd = 50 if d = 0.5 . The riskless interest rate is 10%. The value of an option with
strike price K is given by max( Su − K , 0) if u and max( Sd − K , 0) id d is realized.
To determine the value of the call, we replicate its payoff using the payoffs of the underling stock and the bond.
If arbitrage is excluded, the value of the call is equal to the value of the hedge portfolio, which is the sum of the
values of its constituents. The idea is that a portfolio that has the same cash flow as the option must have the
same price as the call that we look for.
Calculating the call values for each of the states (“up” and “down”), we get that the call value given state “up”
is max( Su − K , 0) = (200 − 100) = 100 respective max( Su − K , 0) = max(50 − 100, 0) = 0 . The hedge
portfolio requires then to borrow 1/3 and to buy 2/3 assets in order to replicate the call’s payoff in each of the
states:
" up " : 2 / 3* 200 − 1/ 3*110 = 100

" down ": 2 / 3* 200 − 1/ 3*110 = 0
In general, we need to solve:
Cu = max( Su − K , 0) = nSu + mBR

Cd = max( Sd − K , 0) = nSd + mBR
where n is the number of stocks and m is the number of bonds needed to replicate the call payoff.
(Cu − Cd ) SuCd − SdCu

n= is also called delta of the option, m = .
Su − Sd BR( Su − Sd )
In the binomial model, we need two equations to match (“up” and “down”) with two securities (stock and
bond). In the trinomial model (if there is a state “middle”), we will need a third security in order to replicate the
call payoff etc.
The second approach to value derivatives is based on the FTAP result that in the absence of arbitrage we do not
consider the “objective” probabilities associated with “up” and “down” movements, which are already
considered in the equilibrium prices, instead we can value all securities “as if” we are in a risk-neutral world
with no premium for risk. In this case, we can consider the probability of an “up” (“down”) movement as being
equal to the risk-neutral probability π * ( 1 − π * ). Thus, the expected value of the stock with respect to these risk
neutral probabilities is: S0 = π * Su + (1 − π * ) Sd . In a risk-less world, this must be the same as investing S today
and receiving Sr after one period.
Then, π * Su + (1 − π * ) Sd = SR or π *u + (1 − π * )d = R .
The risk-neutral probabilities are then defined over the size of the up and down movements of the stock price
R−d
and the risk-free rate: π * = , 0 ≤ π * ≤ 1 . Using the risk-neutral measure we can calculate the current
u−d
π * Su + (1 − π * ) Sd π *C + (1 − π * )C
value of the stock: S = and the call: C = u d .
R R
To value an option in multi-period setting, we use the binomial lattice model:
C uu = m a x (0 , u 2 S − K )
Cu
C Cud = max ( 0, udS − K )
Cd
Cdd = max ( 0, d 2 S − K )
Note that the risk-neutral probability is a stationary measure, i.e. it remains the same at every node. To see this,
suppose that at some node of the binomial lattice the stock price is Z. Then, its expected value after 1 period is:
Eπ * ( Z ) = π * Zu + (1 − π * ) Zd . In a risk-less world this value must be equal to ZR:
Eπ * ( Z ) = π * Zu + (1 − π * ) Zd = ZR . Since Z cancel, we get that the risk-neutral probability is constant over the
time and depends only on the size and the frequency of “up” and “down” movements. Consider an example of
a call option over the periods t=0,1,2. The value of this option in t=1 depends on the realized state (“up” or
“down”), i.e.:
1
Cu = ⎡π * ⋅Cuu + (1 − π *) ⋅ Cud ⎦⎤
R⎣
1
Cd = ⎡⎣π * ⋅Cud + (1 − π *) ⋅ Cdd ⎤⎦
R
1 ⎡ 2
π ⋅ Cuu + 2 ⋅ π * ⋅ (1 − π *) ⋅ Cud + (1 − π *) ⋅ Cdd ⎤ respectively
2
The value of the call at t=0 is: C = 2 ⎣
R ⎦
⎡π *2 ⋅Max ⎡ 0, u 2 S − K ⎤ + 2 ⋅ π * ⋅ (1 − π *) ⋅ Max [ 0, udS − K ]⎤

1 ⎢ ⎣ ⎦ ⎥
C= 2
R ⎢ + (1 − π *) ⋅ Max ⎡0, d 2 S − K ⎤
2
⎥
⎣ ⎣ ⎦ ⎦
We can continue this argument for more and more periods to obtain the hypergeometric distribution, which in
limit gives the normal distribution (see Varian, 1989).
9 EQUIVALENT FORMULATIONS OF THE NO ARBITRAGE PRINCIPLE

According to the Fundamental Theorem of Asset Pricing, if a price process is arbitrage-free, there exists no
strategy θt , t = 0,1, 2,... that generates risk-free returns on q.
This is equivalent to the existence of a market expectation or a risk-neutral probability such that
1
qtk = Eπ *t ( Dtk+1 + qtk+1 ) t = 0,1, 2,... .
1 + rt
Applying forward iteration to the expression above in order to get a result for qtk which is not dependent on
future realizations, we get the Dividend Discount Model (DDM)21:
⎛ ∞ ⎛ 1 ⎞τ −t k ⎞
qtk = Eπ *t ⎜ ∑ ⎜ Dτ ⎟ , t = 0,1, 2,...
⎜ τ =t +1 ⎝ 1 + r ⎟⎠ ⎟
⎝ ⎠
Thus, if a market is rational, price movements will depend only on movements of the risk-free interest rate and
the dividend payments. Assuming that the dividend process follows a random walk, we can conclude that
perfectly anticipated prices must be random, i.e. Eπ *t (qtk+1 − (1 + r )qtk ) = − Eπ *t ( Dtk+1 )
21
To simplify expressions we have assumed a constant interest rate.
Expressing the no-arbitrage condition in terms of excess returns (returns exceeding the risk-less return) we get:
Eπ *t ( Rtk+1 − R ft ) = 0
In other word, the net present value of a strategy with respect to the risk-neutral probability must be equal to 0.
Positive net present values are possible only if one uses a probability measure different from π * . Though, in this
case, the probability measure used does not include all possible risks.
In terms of expected gains, the no arbitrage principle requires that they are martingales, i.e.
T
Eπ *t ( gt +1 − gt ) = 0 , where gt = ∑ ⎡1+1r ( Dτk+1 + qτk+1 )− qτk ⎤ θ k
. Hence the cumulated expected gains are:
⎢
τ =t ⎣ t ⎦⎥ τ
∞
Eπ *t ( ∑ gτ ) = 0 , which is consistent with the idea that “nobody can beat the market”, since nobody can beat
τ = t +1
a martingale. Though, every time somebody beats the market. To see this, consider the simple equilibrium
condition that demand is equal supply:
I K K
∑∑ q kθki = ∑ q kθ k
i=1 k=1 k=1
MV k
i
W
If the market value increases by x% and investor i’s wealth increases by less than x%, then somebody else must
have beaten the market. However, the number of investors who beat the market can be smaller than the number
of investors who lose to the market, so that the average number of investors does not beat the market. From a
dynamic perspective, the question is if there are investors who beat the market persistently.
Sometimes it is not useful to work with abstract probability measures, such as π * . In this case, it is helpful to
π*
change the measure using the likelihood ratio process: =
, which is also called pricing kernel, ideal
P
security, or stochastic discount factor. The expected return of a strategy with respect to the “objective”
probability P is then22: EPt ( Rtk ) = R f ,t − cov Pt ( Rtk , t ) , where the covariance of strategy returns to the likelihood
ratio represents the strategy risk.
To summarize, the absence of arbitrage is equivalent to the conclusion that gains are martingales, prices are
random and do not generate excess returns with respect to the risk-neutral probability π * . If we define expected
returns in terms of the physical probabilities, the risk can be measured by the covariance of returns to the
likelihood ratio process. In this sense, the state prices contain information on the market risk that prevents
arbitrage. And hence the risk neutral measure should better be called the risk-adjusted measure.
To see that prices calculated as expected values under the physical probability are typically not fair, consider the
following example. Consider a security S with a current value of 5. Say the physical probability of an
π s*
22
Note that R f = Eπ * ( R k ) = ∑ π s* Rs k = ∑ ps ( ) Rs k = ∑ ps ls Rs k = E p (l R k ) .
s s ps s
“up”/”down” movements are equal to 90% respective 10%. The price in the next period can be either equal to
9, given that state “up” is realized, or 4, given that state “down” is realized. The expected value of the security
payoff under the physical probability is equal to 8.5. Compared to the current price of 5, this can be considered
as an attractive investment opportunity. Though, it is not risk-less. There is a 10% probability that the current
price of 5 can decrease to 4. We can explicitly consider the risk of a “down” movement by calculating the risk-
neural probabilities π * . These are the probabilities that equalize the current price with its expected value. In the
example above, the risk-neutral probability changes considerably, i.e. π * in the “up” state is now 20%,
respectively 80% in the “down” state. If there is no risk, i.e. there is no state that delivers a lower payoff than
the current price the risk-neutral probability must be negative.
Are the risk-neutral probabilities relevant for personal investment decisions? The risk-neutral probabilities
determine the equilibrium value of future payoffs. Since these prices go into the budget constraint, they affect
the optimization problem of the investors. Though, investors are also utility maximizers. If investor’s expected
utility is defined over a probability measure that differs from the portfolio of his endowments, then the portfolio
that maximizes this utility will differ from the portfolio resulting by solving the budget constraint under fair prices
and the investor will decide to trade. In other words, the existence of no arbitrage does not imply that investors
should follow a passive investment strategy (buy and hold an index for example).
10 LIMITS TO ARBITRAGE
There exist strictly positive state prices if and only if security prices exclude unlimited arbitrage. Though, in the
presence of limits of arbitrage like short-sales constraints, the arbitrage is limited and even the law of one price
may fail in equilibrium. Let us consider first some examples.
10.1 The Case of 3Com and Palm

On March 2nd 2000 3Com made an IPO of one of its most profitable units. They decided to sell 5% of its Palm
stocks and retain 95%. At the IPO day, the Palm stock price opened at 38$, achieved its high at 165$ and
closed at 95.06$. This price movement was puzzling because the price of the mother-company 3Com closed that
day on 81.81$. If we would calculate the value of Palm shares per 3Com share, which is 142.59$23, and subtract
it from the end price of 3Com, we get 81.81$ − 142.58$ = −60.77$ . If we additionally consider the available
cash per 3Com share, we would come to a “stub” value for 3Com shares of -70.77$! Clearly, this result is a
contradiction of the law of one price since the portfolio value (the value of Palm shares, the rest of 3Com shares
and the cash amount), which is negative, differs from the sum of its constituents, which are positive.
Though, the relative valuation of Palm shares did not open an arbitrage strategy, since it was not possible to
short Palm shares. Also it was not easy to buy sufficiently many 3Com stocks and then to break 3com apart to
sell the embedded Palm stocks. The mismatch persisted for a long time (see figure below).
23 0.95*95.06
# outstanding 3Com shares
10.2 The Case of Closed End Funds

The case of closed end funds24 is more puzzling since the portfolio ingredients are not only known but also
tradable. Though, on average, the prices of fund shares are still not equal to the sum of the prices of its
components as the figure below shows.
The reason for this mismatch is the fact that no investor can unbundle the closed-end funds and trade their
components on market prices. Additionally, buying a share of an undervalued closed-end fund and selling the
corresponding portfolio until maturity does not work because closed end funds typically do not pay out the
dividends of their assets before maturity.
As in the 3Com-Palm case the violation of the law of one price does not constitute an arbitrage strategy because
the discount/premium of the closed end funds can deepen until maturity.
10.3 The LTCM case

The prominent LTCM case is an excellent example of the risks associated with seemingly arbitrage strategies. The
LTCM managers discovered that the share price of Royal Dutch Petrolium at the London exchange and the share
price of Shell Transport and Trading at the New York exchange do not reflect the parity in earnings and dividends
stated in the splitting contract between these two units of the Royal Dutch/Shell holding. According to this
splitting contract, earnings and dividends are paid in relation 3 (Royal Dutch) to 2 (Shell), i.e. the dividends of
24
A closed end fund is a mutual fund with a fixed asset composition.
Royal Dutch are 1.5 times higher than the dividends paid by Shell. Though, the market prices of these shares did
not follow this parity for long time as the figure bellow shows.
This example is most puzzling because a deviation of prices from the 3:2 parity invites investors to either buy or
sell a portfolio with shares in the proportion 3:2 and then to hold this portfolio forever: Doing this one can cash
in a gain today while all future obligations in terms of dividends are hedged. There is however the risk that the
company decides to change the parity.
10.4 No Arbitrage with Short-Sales Constraints

To illustrate how limits to arbitrage enlarge the set of arbitrage-free asset prices, consider the case of non-
negative payoffs and short-sales constraints i.e. a sk ≥ 0 and λki ≥ 0 . The short-sales restriction may apply to one
or more securities. Then, the Fundamental Theorem of Asset Pricing reduces to:
Theorem: (FTAM with Short-Sales Constraints)

There is no long-only portfolio θ ≥ 0 that qθ ≤ 0 and Aθ > 0 is equivalent to q>>0 .
Proof:
Suppose q>>0 and θ ≥ 0 . For strategy θ with Aθ > 0 must be true that qθ > 0 . In other words,
every long-only portfolio must cost something.
Suppose q k ≤ 0 , then for some k θ = (0,...,1,..., 0) is an arbitrage, i.e. Aθ > 0 and qθ ≤ 0 .

k
Hence, ALL positive prices are arbitrage-free because sales restrictions deter rational managers to exploit
eventual arbitrage opportunities. Consequently, the no-arbitrage condition does not tell us anything and we
need to look at specific assumptions to determine asset prices.
11 IDENTIFYING THE LIKELIHOOD RATIO PROCESS

To get some structure for the Likelihood Ratio Process, we can use the CAPM where is supposed to be in the
asset span of the risk-free rate and the market portfolio; APT in the case that there are more than one risk factor
determining expected returns; Macro Finance based on the preferences of a representative agent.
11.1 The Likelihood Ratio Process with CAPM

As we already have seen above, in the absence of arbitrage, there is always a risk-return decomposition of the
risk-free rate possible such that R f = E p ( R k ) + COV p (l , R k ) . Additionally, we proved that in the CAPM the
SML needs to hold. Now, we want to rephrase this result by showing that in the CAPM the likelihood ratio
process has to be collinear to the market portfolio.
Consider the general case of incomplete markets and decompose into = D + D⊥

, D ∈< D > , and
D⊥
∈< D > ⊥ .25 Suppose that D = a1 + bM + γ , where γ ∈< D > , γ ∉< 1, M > 26 and
E (γ 1) = E (γ ) = E (γ M ) = 0 .
We will show that for the SML to hold, the condition γ = 0 must be satisfied.
Recall that Cov(γ , M ) = E (γ M ) − E (γ ) E ( M ) = 0 . Hence, under the above assumption Cov(γ , M ) = 0
The price of the market portfolio is:
1 1
f [
q( M ) = E ( M ) + Cov(lD , M ) ] = f ⎡⎣ E ( M ) + bσ 2 ( M ) ⎤⎦ , which allows us to solve for b:
R R
R f q( M ) − E (M )
b=
σ 2 (M )
For every z ∈< D > we can also write:
1 1
f [
q( z ) = E ( z ) + Cov(lD , z ) ] = f ⎡⎣ E ( z ) + bCov( w0i , z ) + Cov(γ , z ) ⎦⎤ . Using the expression above for b,
R R
1 ⎡ Cov( M , z ) ⎤
we get: q( z ) =
Rf ⎢ E ( z ) + ( γ q( M ) − E ( M ) ) σ 2 ( M ) + Cov(γ , z ) ⎥ .
⎣ ⎦
On the other hand, SML implies that:
1 Cov( M , z ) ⎛ 1 ⎞
q( z ) − E ( z) = ⎜ q( M ) − f E ( M ) ⎟
R f
σ (M ) ⎝
2
R ⎠
This requires that Cov(γ , z ) = 0 ∀z ∈ 〈 D〉 or σ 2 (γ ) = 0 , which is equivalent to saying that γ is risk-less, i.e.
γ = λ1 for some λ ∈ R . This is a contradiction to the assumption that γ is not in the asset span of the risk-
less asset and the endowment: γ ∉< 1, M > .
25
This decomposition is not necessary in the case of complete markets, where l = lD .
26
1 denotes the risk-free pay off.
11.2 The Likelihood Ratio Process with APT

In the CAPM, the beta measures the sensitivity of the security’s returns to the market return. The model relies on
restrictive assumptions about agents’ preferences or security returns. The Arbitrage Pricing Theory (APT) is based
on a pricing relation similar to that of the CAPM but with several factors replacing the market return.
Let f1 ,..., f J be contingent claims called factors. It is assumed that all factors have zero expectation and are
linearly independent, i.e. Cov( f j , f l ) = 0 for j ≠ l . The likelihood ratio process can be identified as
J
= ∑α j f j
where f j ∈< D > . Following the same approach as in the CAPM, we get:
j =1
∑ Cov p ( f j , R k )
E p (R ) − R k f
= j
Var ( f ) j (E p ( f j ) − R f ) . This gives a richer economic regression. However this
regression is not founded by an economic model.
11.3 The Representative Agent

A fundamental issue in finance is the question under which conditions can prices, which are market aggregates
be generated by aggregate endowments (consumption) and some aggregate utility function. If this were
possible, one could identify the likelihood ratio process with the marginal rates of substitution of some aggregate
utility function. Moreover, we are interested in the question if it is possible to find an aggregate utility function
that has the same properties as the individual utility functions. Finally, we ask if it is possible to use the
aggregate decision problem to determine asset prices “out of sample”, i.e. after some change of e.g. the
dividend payoffs.
To answer the first question, we use the “Anything Goes” theorem saying that for every arbitrage free prices
there exists an economy with a representative consumer maximizing some expected utility function such that the
prices are the equilibrium prices of the economy populated only with this representative agent. To see this in the
case of a one-period model, recall that the no arbitrage condition requires the existence of some risk-neutral
S
probability π * 0 such that q = π * D . Choosing the utility function U R (c0 , c1 ,..., cS ) = c0 + ∑ π s*cs for the
s =1
27
representative agent, at prices q he will consume the aggregate endowments. Thus, if there is an artificial
agent with beliefs π * 0 over all possible states, he will also generate the prices q k .
The second question we are interested in is related to the inherent differences in the utility functions of
individuals and the representative agent. Individuals may have different beliefs or differ in their risk aversion
27
To see why this argument holds consider the following maximization problem:
max c0 + ∑ π s cs s.t. c0 + ∑ qkθ k = w0 and cs = ∑ Ds θ .
* k k
θ s k k
The first order condition gives q = ∑ π D , which is equivalent to the no arbitrage condition required to hold in
k * k
s s
s
equilibrium.
meanwhile the representative agent is assumed to be risk neutral with beliefs equal to π * 0 . Thus, the
R
question is under which conditions the utility function of the representative agent U has the same properties
i R
as the individuals U . Whereas the existence of U is conditioned on the existence of arbitrage-free prices, the
requirement that U R is indeed representative for the utility functions of individual agents is conditioned on the
Pareto-efficiency of the equilibrium allocation in the economy. In other words, if equilibrium allocations are
Pareto-efficient, then there exists some aggregate utility function of the same type as the individual utility
function able to generate the observed asset prices from a single agent decision problem. To see why this
argument must hold, note that the Pareto-efficiency condition is equivalent to maximizing some welfare function
being the weighted sum of individual utilities. Under scarce resources there is a trade-off between utility levels
that can be achieved. The social planer has then to choose the weights appropriately under the restriction that
the allocation achieved is attainable with the given resources. In a nutshell, to find U R as an appropriate
aggregation of many U i , we can represent U R as a weighted sum, whereby the weights determine which
Pareto-efficient allocation is obtained. If we elaborate on this, we have to remind that Pareto-efficiency requires
that all agents in the economy have equal MRS over consumption today and consumption in a period ahead, and
* *
all agents agree on the state prices, i.e. ∇1U (c ) = ... = ∇1U (c ) =: π . In particular to obtain the observed
1 1 I I
* *
∂ 0U 1 (c1 ) ∂ 0U I (c I )
equilibrium allocation from the welfare maximization problem we define U R (W ) = sup ⎧⎨∑ γ iU i (ci ) ¦ ∑ ci = W ⎫⎬
I I
c1 ,...,c I ⎩ i =1 i =1 ⎭
* *
where γ i = 1 . The first order condition of this problem gives γ 1∇U 1 (c ) = ... = γ I ∇U I (c ) =: λ and
1 I
*
∂ 0U i (c )
i
∇U R (W ) = λ . Hence,
*
∇1U i (c i )
∇1UR (W ) = *
and ∂ 0U R (W ) = 1 .
∂ 0U i (c ) i
⎛ − q* ⎞
In terms of security prices, we can rewrite the problem as: Max U (c ) s.t. c − W ≤ ⎜ ⎟ θ and the first
R R R
θ
⎝ D ⎠
* *
∇1U R (c R ) ∇1U i (ci )
order condition gives: q = *
*
D= *
D = π D . Hence asset prices can also be generated from
∂ 0U (c )
R R
∂ 0U (c )
i i
s single representative agent decision problem.
To see why the representative agent is of the same type as the agents in the economy, suppose that all agents in
S
the economy have common beliefs and time preferences. In this case, U i (c i ) = u i (c0i ) + β ∑ ps u i (csi ) for all
s =1
⎧ I
⎫ 1
I
i = 1,.., I . To show that the agent with U R (W ) = sup ⎨∑ γ iU i (c i ) ¦ ∑ c i = W ⎬ where γ i = *
is
c ,...,c ⎩ i =1
1 I
⎭
i =1
∂ 0U (c )
i i
really representative, rewrite his expected utility using U i (c i ) as

⎧ I ⎛ S
⎞ I ⎫
U R (W ) = sup ⎨∑ γ i ⎜ u i (c0i ) + β ∑ ps u i (csi ) ⎟ ¦ ∑ c i = W ⎬ . Using that
c1 ,...,c I ⎩ i =1 ⎝ s =1 ⎠ i =1 ⎭
I I I I
u R (W0 ) = max
i ∑ γ iui (c0i ) s.t. ∑ c0i = W0
c0
i =1 i=1
and u R (Ws ) = max
i
cs
∑ γ iui (csi ) s.t. ∑ csi = Ws
i =1 i=1
we get
S
U R (W ) = u R (W0R ) + β ∑ ps u R (WsR ) , which is of the same form as the utility of agent i under the condition
s =1
that the agent consumes the aggregate endowment in each state.
Note however that both arguments for a representative agent that we have given so far are tautological. We
need the equilibrium prices and allocation in order to define the utility of the representative agent. This gives the
necessary weights for the individual utilities. On the other hand, using these weights we derive the equilibrium
prices, i.e. we recover what we already needed to start the argument. In this sense, there is no additional
information gained through the analysis. For example, Lettau and Ludvigson (2001) show with an argument
along these lines that between 1952 and 1998 80% of the excess stock returns can be explained with
movements of aggregated households’ wealth as a proxy for log consumption-wealth ratio. Note that this
argument is an “in sample” statement which is equivalent to saying that security markets are 80% Pareto-
efficient.
From any investor’s perspective, the relevant question is under which conditions one can make conditional
estimates for future equilibrium. Thus, the next question requires defining the conditions under which we can use
a representative agent making “out of sample predictions”.
Obviously, the first condition under which this is possible is related to the heterogeneity of the agents in the
economy. If they have identical utilities and identical endowments, their aggregation is merely a scaling
( )
procedure. If agents’ utilities are not identical but quasi-linear, i.e. U i (c0i , c1i ,..., cSi ) = c0i + u i c1i ,..., cSi , it is
also possible to aggregate them in order to make estimates. In this case the weights in the welfare function
1
*
are independent of the equilibrium allocations. This would however imply that the agents` asset
∂ 0U (c )
i i
allocations do not change with their income. The following conditions that we describe are all based on the
expected utility hypothesis that we will elaborate on in the next chapter. The third assumption then additionally
to the expected utility hypothesis requires agents to have common beliefs on the occurrence of the states. If
K K
moreover if there is no aggregate risk, i.e. ∑ Dsk = ∑ Dzk ∀s, z , then the risk-neutral measure is equal to the
k =1 k =1
physical measure. In this case, the efficient allocation lies on the diagonal of the Edgeworth box ( cs = cz ) and
ps π s*
MRS s , z = = . Finally, if agents have common beliefs, markets are complete and agents have (identical)
pz π z*
constant relative risk aversion (CRRA) and collinear endowments, we can also use the representative agent for
“out of sample” predictions. Last but not least this is also possible if agents have quadratic utility functions and
they share the same beliefs, provided markets are complete28.
An example for the utility of the representative agent if there is no aggregate risk and agents share common
S
beliefs is: U R (W0 ,W1 ,..., WS ) = u R (W0 ) + β R ∑ ps u R (Ws ) for any concave u R . Moreover, if beliefs and time
s =1
preferences are common and agents have quasi-linear quadratic preferences, i.e.
⎛ i γi i 2⎞ S
U (c , c ,..., c ) = c + β ∑ ps ⎜ cs − (cs ) ⎟ , then we get:
i i
0
i
1
i
S
i
0
s =1 ⎝ 2 ⎠
⎛ R γR R 2⎞ S I
1
U (c , c ,..., c ) = c + β ∑ ps ⎜ cs −
R R R R
(cs ) ⎟ where γ R = ∑ i .
R
i =1 γ
0 1 S 0
s =1 ⎝ 2 ⎠
Finally, if agents have logarithmic utilities with common time preferences and their endowments are collinear,
i.e. if
S
U i (c 0i ,c 1i ,...,c Si ) = ln(c 0i ) + β ∑ psi ln(c si ) and wi = δ iW ,
s =1
S I
then U R (c 0R ,c 1R ,...,c SR ) = ln(c 0R ) + β ∑ psR ln(c sR ) where psR = ∑ δ i psi .
s =1 i =1
Having established that asset prices can be derived in this strong form from a single utility function, we can
identify the likelihood ratio process from the representative agents` marginal rates of substitution. Consider once
more the case of the one-period model:
1 S
U R (c0 , c1 ,..., cS ) = u R (c0 ) +
γR
∑p u
s =1
R
s
R
( cs )
K
λk
c0 = λ0 w R , cs = ∑ Dsk w R , s = 1,..., S .
k =1 qk
K
s.t . ∑λ
k =0
k =1
S
ps u ′(cs )
Suppressing the index R, the first order condition is: ∑ γ u′(c ) D
s=1
k
s = q k . Hence, the likelihood ratio process is
0
then:
u ′(cs ) u′( ws )
ls = = , s = 1,..., S , i.e. fluctuations in aggregate consumption determine asset prices.
γ u′(c0 ) γ u ′( w0 )
12 THE RATIONALITY BENCHMARK

To finish the thoughts on the foundations of asset pricing we show how under specific assumptions the concept
28
For more results along this line see Hens and Pilgrim (2003).
of perfect foresight equilibrium can be used to derive a very simple and compelling asset pricing implication.
Theorem:
Suppose
• relative dividends are i.i.d. (i.e. Ω t (ω t ) = S and ps (ω t ) = ps for all ω t ),
∞
• agents are expected utility maximizers, i.e. U i (c i ) = EPi ∑ ( 1i )t u i (cti ) ,
γ
t =1
K
• all u i (.) are logarithmic or there is no aggregate risk ( ∑ Dtk is deterministic ),
k =1
• all agents have rational expectations, i.e. P i = P, i = 1,..., I ,
⎛ ⎞
⎜ Dk ⎟
then λti ,k = (1 − λ0i ) EP ⎜ ⎟ is the unique equilibrium with perfect foresight. In particular then, the
⎜∑ Dj ⎟
⎝ j ⎠
relative prices must be also constant over time and must equal to the expected relative dividends.
To prove this theorem, consider first the one-period optimization problem defined over the consumption:
1 S
U i (c0i , c1i ,..., cSi ) = u i (c0i ) +
γ i ∑ p u (c ) s.t.
s =1
s
i i
s
c =λ w
i
0
i
0
i
K
λki
csi = ∑ Dsk k
wi , s = 1,..., S .
k =1 q
K
∑λ
k =0
i
k =1
S
ps u i \ (csi ) k
The first order condition is the Euler equation: ∑
s=1 γ i i\
u ( c i
)
Ds = q k , k = 1,..., K , which is not only necessary
0
but also sufficient since the objective function is concave.
Dsk
Now, suppose that λki = (1 − λ0i )∑ ps , i = 1,..., I . Using this to rewrite the consumption constraints
s ∑ Dsjj
(1 − λ0i ) wi
c = λ w and c =
i i i i
, we get
∑ (1 − λ0j )w j
0 0 1
⎡ ⎤ K
(1 − λ0i ) wi
qk = ∑ λki wi =∑ (1 − λ0i ) wi ⎢ ∑ s ps
Dsk
⎥ and c i
= ∑ Dsk , s = 1,..., S
∑ (1 − λ0j )w j
j s
i i ⎢⎣ ∑ Ds
j
⎥⎦ k =1
j
u i \ (c1i ) I
If there is no aggregate risk, then i i \ i = ∑ (1 − λ0i ) wi , where RHS is the price and LHS does not depend
γ u (c0 ) i =1
LHS RHS
on the realized state. Does any λsi exist that satisfies the equation above? For λ0i → 0 LHS is smaller than RHS
and for λ0i → 1 the LHS becomes larger than RHS. Thus, if the utility function is continuous, there exists some
S
ps c0i k
λ0i solving the first order condition. For example, in the case of a logarithmic utility function ∑ Ds = q k .
s=1 γ cs
i i
⎡ ⎤ K
(1 − λ0i ) wi
With qk = ∑ λki wi =∑ (1 − λ0i ) wi ⎢ ∑ s ps ∑
Dsk
⎥ and c i
= Dsk , s = 1,..., S we get
i i ⎢⎣ ∑ Ds
j
j
⎥⎦
s
k =1 ∑ (1 − λ0j )w j
j
γi
λ0i wi = γ i (1 − λ0i ) wi , which is solved for λ0i = .
LHS
1+ γ i
RHS
Now consider the case of multiple periods: Since the utility functions are time and state separable and since the
physical probability measure does not change over time, we can consider any node and its predecessor. If assets
are short-lived then there are no capital gains but only dividend payments. The first order condition is then:
S
ps u i \ (csi ( t )) k
∑
s=1 γ u (cs − ( t − 1))
i i\ i
Ds = qsk− ( t − 1) where s - is the predecessor of s in t. Hence the previous argument works
analogically.
Finally, in the case of long-lived assets we need also to consider capital gains. Thus, the first order condition is
S
ps u i \ (csi ( t ))
∑
s=1 γ u ( cs − ( t − 1))
i i\ i
( Dsk + qsk ) = qsk− ( t − 1) . Again, inserting the claimed solution for the asset prices shows that at
those prices the claimed portfolio solves the FOCs.
QED
12.1 Empirical Evidence

In this section we want to see how good the three asset pricing models that we developed so far are doing on
actual data. The models we want to test are the SML from the CAPM, a macro-finance regression based on the
idea of a representative agent, and the equilibrium implications of perfect foresight equilibrium that we derived
under the specific assumptions mentioned in the previous theorem. Before we are ready to do so, we first
describe the macro-finance regression we want to consider: Recall the no-arbitrage condition in the macro-
finance model:
1
t −1 ∑ t
qtk−1 (ω t −1 ) = π * (ω t )( Dtk (ω t ) +qtk (ω t )) .
R t −1 (ω ) ωt ∈Ω
f
Where from the FOC of the representative agent we get:

pt (ω t )u′(c(ω t ))
π (ω ) = R
* t f
t −1 (ω t −1
) , which then gives:
t
γ u′(c(ω t −1 ))
pt (ω t )u′(c(ω t )) k t
qtk−1 (ω t −1 ) = ∑
ωt ∈Ω γ u′(c(ω t −1 ))
( Dt (ω ) +qtk (ω t )) .
Thus, to test the macro-finance model, we run a regression on past price levels qtk−1 against
u′(c(ω t ))
( Dtk + qtk ) .
γ u′(c(ω ))
t −1
If we want to test the model for cross-sectional data, we have to run a regression on Rt −1 f − EP ( Rt k ) against
CovP (lt , Rt k ) since the risk-return decomposition of the macro-model predicts that
Rt −1 f = EP ( Rt k ) + CovP (lt , Rt k ) must hold.
The difference between time series and cross-sectional analysis is important if we study aggregates or the market
portfolio. The SML cannot tell us anything about the excess return of the market.29 Thus, if we want to analyze
the excess return of the market we have to apply time series models. On the other hand, if we are interested in
the excess returns of individual assets, we can determine specific risk factors and apply APT.
Now, we can apply the procedure discussed above to test the power of the macro-finance model for the
American stock market from 1998 to 2001. We have price data, stock market capitalization, and dividend
payments on stocks included in DJIA index, as well as GDP data and interest rates. First, we run a cross-section
regression on the SML calculating individual excess returns and using beta based on the market capitalization of
c1-α
the firms. Second, we run a time-series macro-finance regression. If we assume that u (c) = , then we get
1- α
α
1⎛c ⎞
that q k
t −1 has to regressed against ⎜ t −1 ⎟ ( Dtk + qtk ) , where ct is the aggregate consumption in the economy
γ ⎝ ct ⎠
which in the Lucas model is equal to total dividends.
Before we start with the regressions, we first look at the dividends data. The first observation we make is that
the aggregate dividends vary over time and grow exponentially.
29
The SML ( µ k − R f = β k ( µ M − R f ) ) is tautological for the market portfolio since β M = 1 .
Moreover, aggregate dividends move with the nominal GDP.
Aggregate Div and GDP on DJIA
3.5
3
2.5
2
1.5
1
0.5
0
1980 1985 1990 1995 2000 2005
dividends GDP
Also, in contrast to one of the assumptions in the rational benchmark theorem, the relative dividends between
sectors are not constant.
Relative Dividenen nach Branchen
100%
90%
Antel der einzelnen Firmen an den Dividenden
80%
70% Information technology

Financials
60% Utilities
Non-cyclical services
Cyclical services
50%
Non-cyclical consumer goods
Cyclical consumer goods
40% General industries
Basic industries
30% Resources
20%
10%
0%
04.01.1973
04.01.1974
04.01.1975
04.01.1976
04.01.1977
04.01.1978
04.01.1979
04.01.1980
04.01.1981
04.01.1982
04.01.1983
04.01.1984
04.01.1985
04.01.1986
04.01.1987
04.01.1988
04.01.1989
04.01.1990
04.01.1991
04.01.1992
04.01.1993
04.01.1994
04.01.1995
04.01.1996
04.01.1997
04.01.1998
04.01.1999
04.01.2000
04.01.2001
04.01.2002
However, the fluctuations in aggregate dividends seem more severe than those in relative dividends. Next, we
test the relationship between relative dividends and relative market capitalization as suggested by the strategy
λ * determined in the rationality benchmark. The first impression looking at the data confirms the notion that
there is a relationship between relative market values and relative dividends as the examples bellow show.
Procter & Gam ble

Me rck
0.12 0.08
0.07
0.1
0.06
0.08
0.05
Rel MV Rel MV
0.06 0.04
Rel Div Rel Div
0.03
0.04
0.02
0.02
0.01
0 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
Now, we want to see if the model based on relative dividends provides a better description of the data than the
macro-finance model. To perform the latter we have chosen the utility parameters. The MSE is minimized in
sample for α = 10.12 and γ = 1.1 . Note that this is quite a high risk aversion – a phenomenon known as the
equality premium puzzle30. Doing this for McDonalds, for example, gives:
Mc Donalds R^2=2%
60000
50000
40000
30000
20000
10000
0
0 50000 100000 150000 200000 250000 300000 350000 400000
To compare, the regression on the relative market value of McDonalds against its relative dividends provides a
much better fit over the time as the figure bellow shows:
Rel Div Rel MV MC Donalds 81-01
0.04
0.035
0.03
0.025
Rel MV 81-01
0.02
0.015
0.01
0.005
0
0 0.002 0.004 0.006 0.008 0.01 0.012 0.014
Rel Div 81-01
The same result is obtained for any other firm (see slides!). If we sum over all stocks in DJIA, we get total values
for market capitalization and dividends. The relationship is presented in the next figure.
30
The equity premium puzzle is the observation that the FOC for a CRRA representative agent can only be satisfied on
market data if we assign an unreasonably high risk aversion. Or in other words, in order to induce that the representative
agent holds bonds and stocks he must be extremely risk avers because otherwise stocks are a too attractive investment.
Total MV and Total DIV
2500000 35000000
30000000
2000000
25000000
1500000
20000000
Total Market Value
Total Dividends
15000000
1000000
10000000
500000
5000000
0 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
Again market values and dividend in one period move nicely with each other. So why is the macro-finance
regression much worse? The simple answer is that this regression relates market values and dividends of two
consecutive time periods and not just of one time period. Indeed, looking at changes in the total dividends and
changes in the market capitalization the correlation is also much worse:
Delta Market Values and Dividends Delta Market Values and Dividends
400000 5000000 400000 5000000
4000000 4000000
300000 300000
3000000 3000000
200000 200000
2000000 2000000
100000 100000
Delta MV Delta MV
1000000 1000000
Delta DIV Delta DIV
0 0
1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001
0 0
-100000 -100000
-1000000 -1000000
-200000 -200000
-2000000 -2000000
-300000 -3000000 -300000 -3000000
To test if the explanatory power of the traditional finance model improves from cross sectional perspective, we
run the SML-regressions testing both models for each year. The following chart is from the year 1981. Charts for
other years give a similar picture (see slides). As one can easily see in the figure bellow, the SML persistently
explains less of the variance in the observed variables than the model based on λ * , which we will later develop
also from our evolutionary theory.
REL DIV vs REL MV 2001 R^2 = 0.58
0.25
0.2
0.15
Rel MV
0.1
0.05
0
0 0.05 0.1 0.15 0.2 0.25
Rel Div
SML 1981 R^2=0.03
0.4
0.2
0
0.3 0.5 0.7 0.9 1.1 1.3 1.5
excess return
-0.2
-0.4
-0.6
-0.8
-1
beta
This finding is actually true for all years, as the following summary statistic shows:
R^2 EVOL vs CAPM 1981-2001
1.2
0.8
EVOL
0.6
CAPM
0.4
0.2
0
1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001
Though, is the predictive power of λ * also higher? To see this, we want to base our predictions on the notion
that prices are governed by a mean reversion process that force them to come back to the cross section
regression, the SML for the traditional finance model and the regression line of relative MV against relative Div
evolutionary model. Or in the words of Litterman (2003), quoted before:”…we view the financial markets as
having a center of gravity that is defined by the equilibrium of demand and supply”. The question then is which
regression line is more suitable for such a center of gravity
The differences between models are in the definition of this benchmark (or fundamental value). In the
evolutionary model, it is calculated as the difference between current and predicted market values. In the CAPM,
the benchmark is based on the difference in current and predicted returns. Comparing the R 2 of both models,
we can conclude that predictions based on market values determined by relative dividends (as in the rational
benchmark and later on in the evolutionary model) have better fit (higher R 2 ) than predictions based on beta as
a risk measure.
R^2 Evol prediction vs R^2 CAMP prediction
0.6
0.5
0.4 R^2 EVOL
0.3
0.2 R^2 CAPM
0.1
0
UNITED
M CDO NALD'
INT'L
PHILIP
EXXO N
AM ERICAN
CO CA -
G ENERALM
BO EING
Indeed, over 20 years, the evolutionary model provides an outperformance of 50% given that the estimates of
relative dividends are correct.
16
14
12
10
CAPM
Star (rebal.)
8
0
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
00
01
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
ez
D
An “out of sample” analysis confirms the better predicting power of the evolutionary model. Over 4 years, the
evolutionary investment strategy achieves 13% outperformance compared to the DJIA index. All this is well
known among traditional finance people. Fama calls the fact that simple value strategies outperform the SML the
“value puzzle”. Since facts like those given here, can however not be disregarded one may as well say that the
really puzzling thing here is that traditional finance still holds on to its badly specified models.
1.5
1.4
1.3
1.2
1.1
1
one share per firm (buyandhold) div monthly rebal
0.9
01.01.98
01.03.98
01.05.98
01.07.98
01.09.98
01.11.98
01.01.99
01.03.99
01.05.99
01.07.99
01.09.99
01.11.99
01.01.00
01.03.00
01.05.00
01.07.00
01.09.00
01.11.00
01.01.01
01.03.01
01.05.01
01.07.01
01.09.01
01.11.01
As further analysis show, the results are not very sensitive to the frequency of rebalancing, i.e. an investor who
rebalances his portfolio daily achieves 1.4% more return than an investor performing monthly rebalancing.
13 SUMMARY
The traditional model in finance is based on complete rationality, i.e. all agents maximize their intertemporal
utility having perfect foresight of prices. In general, the model does not have any specific structure other than the
principle of no arbitrage. Under specific assumptions, such as rational expectations, which are justified by market
selection, absence of aggregate risk or the assumption that agents have logarithmic utilities, the model has nice
structure, which fits well with long-run averages. Though, there are several fluctuations around the long-run
average. The rational model attributes them to exogenous shocks and does not aim to explain them.
REFERENCES:
Books:
Eichberger Harper (1998): Financial Economics, Oxford University Press
Thorsten Hens and Beate Pilgrim (2002): General Equilibrium Foundations of Finance: Structure of Incomplete
Markets Models. Kluwer Academic Publishers, Boston/Dordrecht/London, 2002
LeRoy & Werner (2002):Principles of Financial Economics, Cambridge University Press
Research papers:
Lucas (1978): Asset Prices in an Exchange Economy, Econometrica
Varian (1989):The Arbitrage Principle in Economics, Journal of Economic Perspectives.
Lettau Ludvigson (2001):Consumption, Aggregate Wealth and Expected Stock Returns, The Journal of Finance,
Vol. LVI No 3, pp. 815-849.
Sandroni (2000): (2000) Do Markets Favor Agents Able to Make Accurate Predictions? Econometrica (lead
article) 68-6, 1303-1341.
Chapter 4:
Rational Choice:
Expected Utility and
Bayesian Updating
Rational Choice: Expected Utility and Bayesian Updating 52
1 RATIONAL CHOICE
In much of microeconomic theory, individuals are assumed to be rational. Rationality is a normative concept
stating how people should make decisions. Though, the reality is often different. Large body of experimental
work shows that the concept of rationality does not describe real decisions properly. To deepen the
understanding of peoples’ behavior, we need first to establish a normative basis for decision making which will
then serve as a benchmark to compare with real individuals’ behavior. Also for giving advice we will need to
know how one should take decisions.
When formulating a normative decision theory it is important to not that there is a difference between a “good”
decision and a lucky decision. In the words of Howard (1988), “…a good decision is an action we take that is
logically consistent with alternatives we perceive, the information we have, and the preferences we feel.” (p.
682). In contrast, a lucky decision has a good outcome. The connection between choices and feelings is
established through the utility function.
1.1 Preferences
The objectives of the decision maker can be summarized in a preference relation, which is denoted by . It
allows the comparison of pairs of alternative outcomes x, y ∈ X .
The first hypothesis of rationality is embodied in two basic assumptions about the preference relation:
completeness and transitivity. The assumption that a preference relation is complete says that the individual
has a well defined preference between two possible alternatives, i.e. for all x, y ∈ X , we have that either x≽y
or y≽x (or both). The strength of this assumption should not be underestimated. In many circumstances it is hard
to evaluate alternatives that are far from common experience. The transitivity assumption says for all
x, y, z ∈ X , if x≽y and y≽z, then x≽z, i.e. the decision maker does not face choices in which his preferences
appear to cycle: for example, feeling that coffee is at least as good as cappuccino and cappuccino is at least as
good as tee then also preferring tee over coffee.
1.2 Utility Functions

A utility function u ( x) assigns a numerical value to each element in X , ranking the elements of X in
accordance to the individual preferences. If preferences are summarized by the preference relation ≽ then for all
x, y ∈ X we have x≽y if and only if u ( x) ≥ u ( y ) .
To insure that a utility function exists, we need that the preference relation on X is also continuous, i.e the
preferences cannot exhibit “jumps”. For example, a preference relation on a set of lotteries is continuous if for
all lotteries p, q, and r, if p ≽ q ≽ r then there exists α ∈ (0,1) such that α p + (1− α )r ∼ q . In words,
continuity means that small changes in probabilities do not change the nature of ordering between lotteries.
Note that the utility function representing preferences is not unique. It is only the ranking of utilities that matters.
This is an ordinal property. The numerical values associated with the alternatives and the magnitudes of any
differences are cardinal properties.
1.3 Concept of Revealed Preferences

According to the concept of revealed preference, the outside observer does not know the preferences of an
agent. He can only try to infer them from his choices. An agent is not rational if his choices cannot be
“rationalized” by some preferences. For example, if an individual chooses alternative x when facing a choice
between x, y and z, he won’t behave “rational” by choosing y when he is faced with a decision between x, y,
and z`. In other words, if x is revealed to be at least as good as y, then y should not be revealed preferred to x.
In consumer demand theory, the idea behind the axiom of revealed preferences can be expressed as follows:
when facing prices and wealth ( p ,w ) the consumer chooses consumption bundle x = B ( p ,w ) even though
bundle y = B ( p ',w ') was also affordable, we can interpret the choice as “revealing” preference for x over y. If
the consumer has consistent demand, he would choose x over y whenever both are affordable. Thus, consistency
in demand requires that at least one of the consumption bundles is not affordable given a new price-wealth
situation as budget constraint B(p’,w’). An example for inconsistent behavior, where both consumption bundles
are affordable is given in the figure bellow.
B(p‘,w‘)
B(p,w)
1.4 Concept of Experienced Preferences

The concept of experienced preferences is based on the notion that the outside observer knows the preferences
of the agent and is able to tell when the agent will regret his choice. Behavioral economists like Bruno Frey claim
to have developed methods to elicit preferences through questionnaires and statistical methods of comparing
answers conditional on agents socioeconomic characteristics (age, income, sex, employment,…). As an exiting
reference on these issues see Frey and Stutzer (2002) “Happiness and Economics”.
1.5 The Concept of Preferences in Finance

Finance is mainly based on the concept of revealed preference. But it is applied by eliciting preferences in simple
but abstract settings which are then transferred to the real problem at hand. For example, some private bankers
let their clients rank lotteries in order to determine the asset allocation, which is best for the agent given the risk
aversion disclosed in their answers.
Since estimating agents’ preferences is necessarily associated with errors, it is advisable to impose minimal
assumptions on them. Under the condition that preferences can be defined over monetary equivalents, we try to
derive more complex rationality requirements from a simple axiom on agent’s preferences, Axiom 0, which
assumes that more money is better than less money. Thus, a behavior concerning risk and intertemporal asset
allocation will not be called irrational if it does not contradict this axiom. However, as we will see below, in a
well developed financial market the axiom may be used to derive far reaching conclusions.
Consider the following example. The outcomes of a symmetric dice follow two schemes A and B. Scheme A pays
600$, 700$, 800$, 900$, 1000$ and 500$ and Schemes B pays 500$, 600$, 700$, 800$, 900$, 1000$ for
each of the outcomes 1,2,…,6. Although the probability of getting each payoff is equally in both schemes, some
people prefer scheme A to B because in 5 out of 6 cases the payoff from scheme A is higher than from scheme B.
This choice is a violation of Axiom 0 because the same money should get the same preference ranking.
The second example of violation of the simple axiom that more money is better than less is a combination of two
choices between lotteries. The first one requires a choice between a sure gain of 2400 (A) and a gamble (B) with
25% chance of winning 10000 and 75% of winning nothing at all. The second choice requires a choice between
a sure loss of 7500 (C) and a gamble (D) with 75% of losing 10000 and 25% of loosing nothing. Clearly, the
expected payoff of choosing (B) is higher than choosing (A). Though, people typically choose (A) with the
motivation that a sure gain is better than a gamble even if the expected payoff is higher. While people prefer
sure positive payoff, they prefer to gamble when they face a choice associated with a sure loss, i.e. people who
previously choose (A) prefer (D). Now look at the combined payoff of these decisions: (A) and (D) is equivalent to
a lottery of loosing 7600 with probability 75% and winning 2400 with probability 25%; (B) and (C) is equivalent
to a lottery of loosing 7500 with probability 75% and winning 2500 with probability 25%. Clearly, (B) and (C) is
the better combination in any case since its payoff is higher than the combination (A) and (D) chosen by most of
the people.
2 EXPECTED UTILITY THEORY

Define X as the set of possible choices. For example X can be a set of lotteries L over a set of consequences
C = {c1 ,...ci ,..., cn } . p in L describes then the probability of the occurrence of the consequence c ∈ C . A lottery
can also be represented from a combination of states. In this case, the probability of a consequence ci is defined
S
as pi = ∑ probs , i.e. the sum of state probabilities associated with this consequence. Consider the following
s =1
cs =ci
example: there are three possible states with probabilities 0.3, 0.4 and 0.3. Let 5, 2, and 2 be the payoffs in
each of the states. In the lottery approach, there will be two consequences: 5 and 2 occurring with probability
0.3 (equal to the probability the state-preference approach) and 0.7 (equal to the sum of probabilities of the
states with 2 as payoff, i.e. 0.4+0.3).
We already assumed that the decision maker has preference relation on the set of lotteries, i.e. a complete and
transitive relation allowing comparison of any pair of lotteries. The existence of an utility function requires that
the preference relation is continuous.
We need one additional assumption on the decision maker’s preferences in order to represent his preferences by
a utility function with the expected utility form. This is the independence axiom. The preference relation ≽
satisfies it if for all lotteries p, q, and r and 0 ≤ α ≤ 1 we have p ≽ q if and only if

α p + (1 − α )r ≽ α q + (1 − α )r . In other words, if we mix each of two lotteries with the same third one, then
the preference ordering of the two resulting mixtures is independent of the particular third lottery used.
2.1 The Representation Theorem
Let ≽ be a preference order that is complete, transitive and continuous then ≽ can be represented by an
expected utility function, i.e. p q ⇔ Eu ( p) > Eu (q) if and only if ≽ satisfies the independence axiom.
One of the advantages of the expected utility representation is that it provides a valuable guide for action.
People often find it hard to think systematically about risky alternatives. But if the individual believes that his
choices should satisfy the axioms on which the theorem is based (in particular, the independence axiom), then
the theorem can be used as a guide in the decision process. It facilitates decisions because it separates beliefs
from risk attitudes.
2.2 The Allais Paradox

As a descriptive theory, however, the expected utility theorem is not without difficulties. The Allais paradox
constitutes the most famous challenge to the expected utility theorem. It is a thought experiment. There is an urn
containing balls numbered 0,1,...,99. The decision maker has to make two decisions. The first consists of a
choice between a lottery of gaining 50 independent of the number of the ball (lottery A) and a lottery paying 0 if
the number of the ball is 0, 250 if the number of the ball is between 1 and 10, and 50 if the number of the ball
is above 11 (Lottery B).
The second decision consist of a choice between a lottery paying 50 if the number of the ball is below 10 and
nothing otherwise (lottery A’) and a lottery paying 250 if the number of the ball is between 1 und 10 and
nothing otherwise.
0 1-10 11-99
A 50 50 50
B 0 250 50
A` 50 50 0
0 250 0
B`
It is typical for individuals to express the preferences A≽B and B`≽A`. The first choice means that one prefers the
certainty of receiving 50 over a lottery offering a 1/10 probability of getting five times more but bringing it with a
very small risk of getting nothing. The second choice means that a 1/10 probability of getting 250 is preferred to
getting only 50 with the slightly better chance of 11/100.
Though, these choices are not consistent with expected utility because they violate the independence axiom.
According to the independence axiom, if one prefers lottery A over lottery B in the first stage, his preferences
should not reverse in the second stage since the choice between A and B is equivalent to the choice between A’
and B’ once the third alternative (the outcome 11-99) is eliminated since it is the same in both lotteries.
The violation of the independence axiom can result in a violation of Axiom 0. If individuals prefer A over B in the
first stage, one can sell A to them and buy B from them, respectively sell B’ and buy A’ in the second stage
where B’ is preferred over A’. If individuals` preferences are strict they are willing to pay a positive amount for
these deals. However in any case that my occur the total payoff of such a deal is 0, i.e. the counterparty cashing
in the margin is hedged.
The contradiction to the independence axiom can also be represented using the lottery approach. Represent the
lottery outcomes and the associated probabilities as follows:
$0 $50 $250
A 0% 100% 0%
B 1% 89% 10%
A` 89% 11% 0
90% 0% 10%
B`
Note that the lottery payoffs can be written as:
A = 0.11*(0.00,1.00,0.00) + 0.89*(0,1,0)
B = 0.11*(0.09,0.00,0.90) + 0.89*(0,1,0)
A´= 0.11*(0.00,1.00,0.00) + 0.89*(1,0,0)
B´= 0.11*(0.09,0.00,0.90) + 0.89*(1,0,0)
A≽B and B`≽A` is then a contradiction to the independence axiom since subtracting the common lotteries the
lottery A (B) does not differ from lottery A’ (B’).
2.3 The Probability Triangle

A nice way of illustrating the expected utility hypothesis can be done in a triangle in which each corner
corresponds to one of three possible outcomes. The distance to the corner measures the probability of not
getting the outcome. On the diagonal from the upper left to the lower-right corner for example we can represent
all probability distributions with zero probability for the outcome assigned to the lower left corner.
1
Pc
=
1-
pb p
a -p
b
=
0
0 pa 1
n
In the probability triangle, the indifference curves of Eu ( p ) := ∑ p u (c ) are parallel straight lines
i=1
i i
const − u (cc ) u (cc ) − u (ca )

since EU (q ) = const is equivalent to pb = + pa .
u (cc ) − u (cb ) u (cc ) − u (cb )
Representing the Allais paradox in the probability triangle we get the following figure:
$250
Indifference curves
B´ B
0.1
A‘ A
$0 0.01 0.11 $50
The choice A≽B (B`≽A`) requires the indifference curve of the decision maker to be steeper than the straight line
connecting A and B (B` and A`). This however means that indifference curves cannot be parallel straight lines
since the slopes of the straight lines connecting A and B and A` ad B` are identical.
To look into the market consequences of such choices consider the state preference model. We can define three
states, i.e. s1={0}, s2={1-10}, s3={11-99}. Suppose there are three Arrow securities so that agents can transfer
πs
money between those states back and forth at given prices . Violations of the independence axiom then
πz
result in violations of Axiom 0 stating that agents prefer more money to less, since all lotteries can be attained by
some combination of the Arrow securities and the portfolio short one unit A and B` and long one unit A` and B
has a zero payoff and should not be valued positive by the agent.
2.4 Ellsberg Paradox

An even more likely violation of the expected utility axioms than in the Allais paradox are the results observed in
an experiment conducted by Daniel Ellsberg in 1961.
Subjects are presented with an urn containing 90 balls. 30 are red and the rest is either yellow or black. The
proportion of yellow to black balls is unknown. Subjects must bet on the color that will be drawn. The payoff
matrix is as follows:
red yellow black

50$ 0 0
A
0 0 50$
B
50$ 50$ 0
A`
0 50$ 50$
B`
It is very common to individuals to prefer lottery A to B since A gives 1/3 chance of winning 50$, whereas the
decision maker does not know how likely the payoff is in the latter, the proportion of black balls can be very
small. The second choice decision maker face is between A’ and B’. Most of the subjects in the experiment
choose B’, because it gives 2/3 chance of winning 50$ whereas A’ can only be certain of 1/3 chance of winning
50$ and the proportion of yellow balls can be very small. The choice of A over B and B’ over A’ violates the
independence axiom because ignoring the outcome yellow the two lotteries A and B and the two lotteries A` and
B` are identical. The explanation for the preference reversal is that people always prefer definite information to
indefinite: the urn may have more yellow balls than black, in this case lottery A’ is more attractive than lottery B’,
but people tend to prefer “the devil they know”, i.e. lottery B’ where they get 50$ with 50% probability.
As the Allais paradox, the Ellsberg paradox opens an arbitrage opportunity as well. If A is preferred to B, one can
sell A to investors and buy B. If B’ is preferred to A’, one can sell B’ and buy A’. The strategy is hedged and if
preferences are strict so that the agent would be willing to pay some money for this strategy one can exploit his
preference. Though, from a market perspective, we cannot expect that the market unbundles the yellow and
black balls since their proportion in the urn is ambiguous. In this case, the market may be incomplete. Most likely
there will only be the two assets (0,1,1) and (1,0,0) available for trade. Hence, the typical choice in experiments
is not irrational, because it cannot be refuted by a market strategy violating the axiom that more money is better
than less.
2.5 Ambiguity
The Ellsberg paradox casts doubt on the basic premise of subjective expected utility theory that subjective
probabilities are equivalent to objective probabilities. Knight (1920) was the first who made a distinction
between risk (lotteries with objective probabilities) and uncertainty (lotteries with subjective probabilities). In
particular, when having to form subjective probabilities people will not make point estimates but consider a
whole set of probabilities to be possible. Moreover, they do not assign likelihoods to the members of this set and
then compute compounded probabilities to get point estimates. They rather look at the worst-case scenario and
maximize against that, so that their utility function can be expressed as: max min E p u ( x) .
x p∈∆
3 STOCHASTIC DOMINANCE
Instead of considering payoffs and specific utility functions, we can ask under what general conditions it is
possible to assert that one state or payoff is preferred over another. We answer this question by comparing
probability distributions while assuming standard properties of the individuals’ preferences, i.e. increasing
marginal utility and risk-aversion. In general, two individuals with different preferences have different rankings
on prospects. However, in some cases it is possible to get an ordering that holds for all individuals regardless of
their preferences. This is possible when a choice between prospects can be made using their distribution
functions and one distribution stochastically dominates the other.
3.1 First Order Stochastic Dominance (FSD)

Consider two random variables, x, z, with the distribution functions A( y ) = P( x ≤ y ) and B( y ) = P ( z ≤ y ) . A
first-order stochastic dominates B if and only if for all y A( y ) < B( y ) .
B
A
0
FSD
Intuitively, no matter what level of y we look at, B always has a greater probability mass in the lower tail than
does A. For example, a lower-tail cumulative probability occurs at higher levels of y for A than for B.
Note that although A FSD B implies that the mean of y under A, ∫ ydA( y) , is greater than the mean of z under
B, a ranking of the means does not imply that the one FSD the other, rather, the entire distribution matters. The
relevance of FSD for portfolio choice is obtained from:
The distribution A FSD B if for every monotone (increasing) function u : R → R we have Eu ( A) > Eu ( B) ..
3.2 Second-Order Stochastic Dominance (SSD)

Only under quite stringent conditions will one distribution be FSD over another. A more-powerful result, involving
the concept of second-order stochastic dominance, is reached if we additionally make use of the risk-aversion
property of investors’ preferences.
y y
A SSD B if and only if for all y ∫ xdA( x ) < ∫ zdB( z ) . Geometrically, A SSD B if, up to every point y, the area
−∞ −∞
under A is smaller than the corresponding area under B.
1
+
B -
+ A
0
SSD but not FSD
Concerning portfolio choice we get: A SSD B if and only if all monotone (increasing) and concave functions u
Eu ( A) > Eu ( B) .
Note that the area lying to the left of the cumulative distribution represents the expected value of y, i.e. the
expected utility under the particular distribution. Note also that it is not necessary to limit the definition of SSD to
the subset of distributions with the same mean. However, if B has a higher mean than A then A cannot
dominate B in terms of SSD. Note that µ ( A) = Eu ( A) for u ( x) = x . Hence, if µ (A)<µ ( B) then for
u ( x) = x Eu ( A) < Eu ( B) and the criteria for SSD is not met. If we choose u ( y ) = −( y − µ ) 2 , which is
concave, and A SSD B then − E

A
{( y − µ ) } > − E {( y − µ ) } or −σ
2
B
2 2
A > −σ B2 . So if A and B have the same
mean and A SSD B, then A has a smaller variance.
3.3 State-Dominance and Stochastic Dominance

Risky alternatives can be formulated in terms of probability distribution over outcomes (see above) or in the case
that the random outcomes are generated by the occurrence of particular state, we can define a state-dependant
expected utility representation. The assumption made in axiom 0 that more money is better than less, is the state
dominance, i.e. ci > d i implies. U i (c i ) > U i (d i ) . Note that state dominance implies FSD and the reverse is
not true.
Consider the following example. There are 3 states occurring with equal probability and two random variables A
and B with payoffs as listed below.
1/3 1/3 1/3 1
S=1 S=2 S=3 2/3

A
A 0 1 2 1/3 B
B 1 2 1 0 1 2
B FSD A although B does not have a higher payoff in every state.
4 A SECOND LOOK AT MEAN-VARIANCE

In this part we focus on the question whether the mean-variance decision principle satisfies the axioms of the
expected utility theory.
4.1 Mean-Variance Principle as a Special Case of Expected Utility

The mean-variance principle can be considered as a special case of the expected utility function if we restrict the
agents’ preferences to be represented by a quadratic utility function:
α α α α
Eu ( x) = ∑ ps ( xs − xs2 ) = µ ( x) − µ ( x 2 ) = µ ( x)(1 − µ ( x)) − σ 2 ( x) = V ( µ , σ )
s 2 2 2 2
Also under the assumption that returns are distributed normally we get a mean-variance utility function. In this
case, agents’ expected utilities are defined only on the first two moments of the distribution since higher
moments are irrelevant for the normal distribution.
x−µ ^
Eu ( x) = ∫ u ( x) dN ( x; µ , σ ) = ∫ u ( ) d N ( x) = V ( µ , σ )
σ
However, both of these results have considerable limitations. If agents have quadratic utility function, they
become more risk averse with increasing wealth. We will see below that the so called absolute risk aversion is
increasing. This feature does not seem very plausible.
The assumption that asset returns are normally distributed is also questionable since data show the existence of
so called “fat tails”, i.e. compared to the shape of a normal distribution, we observe too many extreme
observations on the left and right side of the distribution.
Last but not least, under certain circumstances, the mean-variance approach is not consistent with Axiom 0. The
most prominent example is the mean-variance paradox. To see how this works, consider a lottery with a positive
payoff y > 0 in one of the states and zero payoff otherwise. The probability of the positive payoff is p>0. Now,
assume that y → ∞ while p → 0 so that py=const. Then, the variance of the final payoff will also tend to
infinity, σ → ∞ , and since the expected payoff of the lottery remains constant any mean- variance maximizer
will eventually prefer the zero outcome to the lottery:
0
0 σ
Clearly, this is a contradiction to Axiom 0 that agents prefer more money to less.
Finally, we analyze if mean-variance preferences are always consistent with the expected utility concept by
plotting the indifference curves of a mean-variance investor in a probability triangle defined over the payoffs 2, 4
and 6. As the figure shows, the indifference curves are not linear as required by the expected utility theory. Thus,
mean-variance preferences are not necessarily compatible with expected utility concept.
1
0.9
93
4.0 .2927
1.85 .0976
−1.561
−1.3171
−1.804
2.82
488
15
1.6098 37
0.87805 2
2.34
4
1.12
1.3659
2
−2
0.8
0.39024
0.63415
.78
0.14634
−2
−0.58537
05
.29
−0.097561
−0.82
−2
27
0.7
53
−1.0
66
−2
927
2.5854
.0
3.0732 171
48
7 32
0.6
1
3.3
3.56
49
−0
−1 −1
3.80
.56
.34
.31
2.8293
0.5
71 1
2.0976
14
2.3415
6
1.8537
1.12
−0 −0.8
1.36
0.4 0. .58 292 −1.0732

0.8
0.
53
0.6
14 −0
1.60
39
63 .09 7 7
78
59
34
02
4 75
05
61
15
4
98
0.3 −0.341
46
2.58
3.3171
0.14634
54
0.2
3.561
0.39024
3.07
0.6341
1.1 0.8780 5
5
1.36 22
32
2.
0.1 09 1.8
76 53 1.60 59 1.122
7 98
1.3659
1.6098
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
As discussed above, the expected utility framework match mean-variance preferences only in the case of
quadratic preferences or normally distributed returns. Though, there are examples showing that neither the
expected utility framework can cover the set of mean variance preferences nor the set of mean-variance
preferences can wrap all preferences within the expected utility framework. For example, the mean-variance
paradox is a case, where mean-variance preferences imply a contradiction to the expected utility concept by
violating axiom 0. Another example is the probability triangle with mean-variance preferences which are not
linear, hence incompatible with the expected utility concept. On the other hand, if preferences are polynomial,
defined over higher moments or logarithmical, they are well-matched by the expected utility but not by mean-
variance principle.
Thus we view the mean-variance principle as an alternative to the expected utility hypothesis. If one considers
the expected utility as the formalization of rational choice then the mean-variance principle must be seen as a
behavioral rule that is often used in practice.
5 MEASURES OF RISK AVERSION

In this section, we briefly summarize the definition of risk aversion and give some examples.
A general definition that does not presume an expected utility formulation is based on the idea of comparing
preferences over a lottery and a certain payoff. A decision maker exhibits risk aversion if a degenerated lottery
that pays a certain payoff is at least as good as the lottery itself. In the context of expected utility theory, risk
aversion is equivalent to the concavity of u (.) . Strict concavity means that marginal utility of money is
decreasing. Hence, at any level of wealth, the utility gain from an extra unit of money is less than the utility loss
of having a unit money less.
A decision-maker is risk-neutral (risk-loving) if the utility function of money is linear (convex).

u( x) u( x) u( x)
u( x2 ) u( x2 ) u( x2 )
u[ µ( F )]
U ( F ) = u[ µ ( F )] U ( F) U (F)
u[ µ ( F )]
u( x1 ) u( x1 ) u( x1 )
x x x
0 x1 c( F ) µ ( F ) x2 0 x1 c( F ) µ( F ) x2 0 x1 µ (F) c( F ) x2
Risk-neutral [q(F)=0] Risk-averse[q(F)>0] Risk-lover [q(F)<0]
The certainty equivalent c ( F ) of a gamble is the amount of money for which the individual is indifferent
between the gamble and a certain amount, i.e. u ( c ( F ) ) = ∫ u ( x ) dF ( x ) =: U ( F ) .

C
The risk premium q ( F ) is the difference between the expected payoff of the lottery µ ( F ) = ∫ xdF ( x ) and
the certainty equivalent, i.e. q ( F ) = µ ( F ) − c ( F ) .
Risk neutrality is equivalent to the linearity of u (.) , i.e. u ''( x) = 0 for all x. Then, we can measure the degree of
risk aversion by the curvature of u (.) . One possible measure of the curvature is u ''( x) , though, it is not
invariant to positive linear transformations of the utility function. To make it invariant, the simplest modification
is to use u ''( x) / u '( x) . If we change the sign to have a positive number for an increasing and concave u (.) , we
u ''( x)
get the Arrow-Pratt measure of absolute risk aversion: ARA( x) = − . It determines the asset allocation in
u '( x)
1
terms of units as income changes. The risk tolerance is then the inverse ARA or RT ( x) = .
ARA( x)
The relative risk aversion is obtained by simply multiplying the absolute risk aversion with the income:
u ''( x)
RRA( x) = − x . It determines the asset allocation in terms of shares when income changes.
u '( x)
Consider some examples:
• u ( x) = −e −α x is associated with constant ARA equal to α or an increasing RRA
• u ( x) = x − γ1 x 2 means increasing ARA
• u ( x) = xα is related to a decreasing ARA or a constant RRA
The demand (in units) for risky assets increases with income if and only if ARA(x) decreases with income. If an
investor has constant ARA(x), he should hold the same amount of risky assets by increasing income.
The wealth share of risky assets increases with income if and only if RRA(x) decreases with income. If an investor
has constant RRA(x), he should hold the same shares of risky assets as his income increases.
A broad class of utility functions that exhibit some nice properties in terms of asset allocation (allocation on risky
and riskless assets) is the HARA (Hyperbolic Absolute Risk Aversion)-family. The name comes from the hyperbolic
A
representation of ARA, i.e. ARA = , where A and B are constants and the domain is the set for which ARA is
B
1
1−
(α i + β x) β
nonnegative. The HARA utility function class can be represented by u i ( x) = . For β = 0 , we get
1
(1 − β )
1
β
the exponential function u i ( x) = −α i exp(− xi ) , and for β = 1 we get the logarithmic function
α
u i ( x) = ln(α i + x) . If all agents in the economy have the same β , they will have the same asset allocation in
terms of risky and riskless asset, i.e. the Two-Fund Separation will hold31.
Note that all u i (.) in the HARA class (with exception of linear utility) are strict increasing and strict concave.
Thus, agents with HARA utilities like consumption (or its monetary equivalent) and they are also risk averse.
6 RATIONAL PROBABILITIES
Under uncertainty, agents’ decisions are determined by preferences and assessed probabilities. Rational agents
assess probabilities following some basic rules:
• The probability of joint events cannot be larger than the probability of a single event, i.e.
P ( A and B) ≤ P( A)
P( A and B)
• The probability of a certain event A given event B is P ( A | B ) =
P( B)
• The joint probability of independent events is P ( A and B) = P( A) * P( B)
31
For a proof assuming Pareto-efficiency see Magill and Qunizii (1996) Chapter 3.
p(ci ) p ( A | ci )
• Probabilities updating follows the Bayes rule: p (ci | A) =
∑ i p(ci ) p( A | ci )
To illustrate that not even the first postulate is always easy to follow, consider the following story: “Linda is 31,
single, outspoken and very bright. She majored in philosophy. As a student she was deeply concerned with
issues surrounding equality and discrimination.” Now ask yourself: Is it more likely that Linda is a bank clerk or a
bank clerk and active in the feminist movement? Many people answer the second alternative is more likely!
To illustrate that it is not always straightforward to satisfy the logic of conditional probabilities consider the
following statement that many people would think is reasonable: “Lazy students wear old jeans. This student
wears an old jeans. He must be lazy.” Of course this statement is wrong because it is ill conditioning
Finally, to illustrate the pitfalls in updating probabilities consider the “Monty Hall Dilemma” as an example. The
candidate of a quiz show can choose between 3 doors. There is a prize behind one of the door. After the
candidate makes a choice, the quiz master opens one of the other doors and the candidate has the option to
switch to the other closed door in order to find the prize. Should a rational candidate switch or stick with his
previous pick? Mathematically, the chances to get the prize by choosing the other door are higher. The first pick
has one chance in three of being correct but do the odds rise when one door is shown to be a looser?
There is a 1/3 chance to hit the prize door, and a 2/3 chance that one misses the prize. If one decides to stay
with his previous pick, the probability to get the prize is 1/3. However, one missed (and this with the probability
of 2/3) then the prize is behind one of the remaining two doors. Furthermore, of these two, the quiz master will
open the empty one, leaving the prize door closed. To summarize, if one decide to stay with the first pick the
chance of winning is 1/3 whereas if he decides to switch the chance of winning increases to 2/3 as the following
calculation shows.
More formally, let Dt denotes “prize is behind door t” and Ot the event that “quiz master opens door t”. At
the beginning of the game, the probability that the prize is behind some door P ( Dt ) = 1/ 3 . Now, suppose that
the candidate chooses door 1. We calculate the probability that the quiz master opens door 2 wherever the price
is. The conditional probability that the quiz master opens door 2 given the price is behind door 1 and the
candidate picked it already is P (O2 | D1 ) = 1/ 2 (there are only closed doors), if the price is behind door 3 and
candidate picked door one, then the quiz master will open door 2 for sure P (O2 | D3 ) = 1 , he will never open the
door with the prize behind it: P (O2 | D2 ) = 0 . Thus, the probability that the quiz master opens door 2 is:
P (O2 ) = P(O2 ¦ D1 ) + P (O2 ¦ D2 ) + P (O2 ¦ D3 ) = (1/ 2 + 0 + 1)1/ 3 = 1/ 2 .
Applying the Bayes rule in order to determine the conditional probability that the prize is behind the door 1 and
quiz master opens door 2 is:
P(O2 ¦ D1 ) P( D1 ) 1/ 6
P ( D1 ¦ O2 ) = = = 1/ 3
P (O2 ) 1/ 2
Thus, the chances that the first pick is winning given that the quiz master open the second door is the same as at
the beginning of the game. Thus, the candidate should use the information behind the quiz master’s action and
make up his mind choosing door 3.
7 RATIONAL TIME PREFERENCES
7.1 Time Preference Reversal

The preference of consumption now is not independent on the consumption order. For example, a CEO
comparing the profits over the next 3 years may have the following preferences (30,70,100)≽(50,50,100). The
idea is that steadily increasing profits are perceived to be better than high, but stable profits. In terms of losses,
CEOs preferences reverse, (50,50,20) ≽ (30,70,20), low but stable profits are better than high but volatile
profits. The drop from 70 to 20 seems to be too harsh.
These preferences can be rational if looked at in isolation. The decision is however not rational if an
improvement through a financial market is possible. In this case, the CEO can lend 20 from the first to the
second period, which makes (50,50,100) dominating (30,70,100). Hence, preferences (50,50,20) ≽ (30,70,20)
contradicts Axiom 0.
7.2 Hyperbolic Discounting

Laibson (1997) summarized the idea of hyperbolic discounting as follows: „From today’s point of view one may
find an action to be taken tomorrow profitable but when it comes to tomorrow this is no longer so.“ Since
capital markets discount exponentially, hyperbolic discounting leading to a preference reversal can be exploited
in perfect capital markets.
T
Consider the utility function U t (Ct , Ct +1 ,..., CT ) = Ct + ß ∑ Cτ δ τ −t , where 1 ≥ β ≥ 0 and 1 ≥ δ ≥ 0 .
τ = t +1
For example, taking an action in t will lead in a consumption in t of -5 and consumption in t+1 of 6. From point
5
of view of time t-1 the action is preferred if β (−5δ + 6δ 2 ) > 0 or if δ > . From point of view of time t, the
6
5 15 5
action is preferred if −5 + β 6δ > 0 or if βδ > . Hence, for > δ > we get a preference reversal.
6 β6 6
The example shows clearly that irrational decisions can be represented as the solution of a maximization
problem. Though, this does not make them rational in the sense that they may violate Axiom 0.
7.3 Discounting and Risk Aversion

Note that with the standard expected utility representation, U (c) = u (c0 ) + β ∑ ps u (cs ) , risk and time
s
preferences are closely linked. In the case of CRRA, u (cs ) = (cs )1−α , time preferences are defined over the
α
∂ u (c ) ⎛c ⎞
marginal utility of consumption over two point of time: 1 1 = β ⎜ 1 ⎟ where β is the discount factor of
∂ 0u (c0 ) ⎝ c0 ⎠
future consumption. Risk preferences are defined over the marginal utility of consumption in two states s and z:
α
∂ s u (cs ) ⎛ cs ⎞
= ⎜ ⎟ . Epstein and Zin (1989) found an expected utility representation separating both aspects: the
∂ z u (c z ) ⎝ cz ⎠
elasticity of intertemporal substitution, defined as ψ , and the relative risk aversion α :
θ
(1−α )
⎡ 1−α θ ⎤
1 (1−α )
(1 − α )
U t (C ) = ⎢(1 − δ )Ct θ
+ δ ( EtU t +1 ) ⎥ where θ ≡
⎣ ⎦ (1 − ψ1 )
For some class of utility functions, this representation shows an interesting connection between time and risk
preferences. For example, if the utility function is logarithmic time and risk preferences coincide. For CRRA they
are reciprocal to each other.
1
α=
ψ
Coefficient of relative
risk aversion α
Log utility
1
CRRA utility
1
Elasticity of intertemporal substitution Ψ
REFERENCES:
Textbooks:
Eisenführ und Weber (1999): Rationales Entscheiden, Springer.
Frey and Stutzer (2002) “Happiness and Economics”, Oxford University Press.
Huang and Litzenberger (1988) Chapter 1.
Magill and Qunizii (1996): Theory of Incomplete Markets; MIT-Press
Research Papers:
Arrow (1971): “Axiomatic Theories of Choice, Cardinal Utility and Subjective Probability: a Review”; Opening
lecture for the Workshop in Economic Theory organized by the International Economic Association in Bergen.
Epstein and Zin (1989):“Substitution, Risk Aversion, and the Temporal Behavior of Consumption and Asset
Returns“, Econometrica (57), pp.937-69
Howard, R.A.: “Decision Analysis: Practice and Promise“, Management Science (34), June 1988, p.679-695
Keeney (1982): “Decision Analysis”, Operations Research (30), pp. 803-838.
Laibson, D. (1997): “Golden Eggs and Hyperbolic Discounting”, Quarterly Journal of Economics, 112, 443-447.
Chapter 5:
Choosing a Portfolio on a Random

Walk: Diversification
Choosing a Portfolio on a Random Walk: Diversification 70
1 INTRODUCTION
One of the main claims of classical finance is that assets have random returns. Evoking the anticipation principle,
as mentioned above, Cootner (1964), for example, has claimed that “the day-to-day changes of asset prices
must statistically independent of each other”. By the Central Limit Theorem32 returns over longer periods would
then be lognormal distributed. In this section we want to take two points of view: First we ask what would be
the best portfolio strategy of a rational investor facing a random walk. In particular we analyze whether – as
many people claim - every rational investor should follow a buy and hold strategy when facing a random walk?
Second, we ask – as it is also claimed many times – whether a random walk can be the outcome of an economy
with rationally acting agents. Summarizing, we are asking whether neoclassical finance, i.e. evoking the utility
maximization principle, is indeed compatible with classical finance.
In this chapter we will challenge this commonly held view and show that expected utility maximizers with CRRA
should follow a rebalancing portfolio strategy rather than simply buy and hold. In contrast to a passive buy and
hold strategy, where the units of assets are fixed, a rebalancing strategy fixes the percentage of wealth invested
in each asset. Moreover, we show that assuming the economy can be represented by one agent with a CRRA-
utility function would imply that fluctuations of asset prices are much too small as compared to those found in
the DJIA 81-03 data. Finally, for reasonable degrees of risk aversion this neoclassical model is not able to explain
the equity premium, i.e. the fact that on average stocks have 6% higher returns than the risk free rate.
2 RETURNS
Let us first consider the assumption of normally distributed returns in more detail: First, it is obvious that returns
on stocks can never be normally distributed because stock returns are bounded below whereas a normal
distribution has unbounded support. The correct modeling of returns is to assume that they are log-normally
distributed, i.e. that the log of gross returns is normally distributed. Indeed, as returns over longer horizons are
the product over short horizon returns, the central limit theorem can be evoked to the sum of the logs of short
run returns.
⎛ qk ⎞
Hence the correct modeling of returns should be to assume that the quantities log ⎜ t +k1 ⎟ are normally
⎜ q ⎟
⎝ t ⎠
distributed. To get an intuition for the adequacy of this property, we for example take the logarithm of weekly
gross returns of 3Com and General Motors shares, plot a histogram and compare it with the histogram of a
normal distribution with the same mean and standard deviation.33
32
The Central Limit Theorem shows that the sum of i.i.d. variables is normally distributed.
33
To run the Chi-quadrate test we need a minimum amount of observations in each bin. This is why you see “clusters” on
both sides of the histogram. Since the test is performed to the similarly transformed normal distribution there is no bias in
the test results.
3M General Motors
200 180
150
150
120
100 90
60
50
30
0 0
The hypothesis of normal distribution can be rejected for both shares although the deviations in the GM
histogram to the normal distribution appear small. The 3M Returns are too often close to its mean and also too
often far away from the mean. That is to say the histogram has fat tails and is too narrow in the middle range.
Running statistical tests for all 30 companies included in the DJIA, we see that 29 out of 30 do not have
lognormal distributed returns. Hence the assumption of a random walk cannot be supported. Yet we may ask
whether at least the conclusions drawn under this assumption make sense.
Properties of lognormal returns
If a random variable is log-normally distributed, i.e. if log( X ) ∼ N ( µ ,σ ) then
1 σ2
log(EX ) = E log( X ) + Var (log( X )) = µ + . In other words, the higher the variance, the higher are the
2 2
expected returns. In contrast to the normal distribution, we cannot increase the variance without adjusting the
mean because the support of the lognormal distribution is bounded below by zero. Increasing the variance is
equivalent to shifting more weight on the right side, which increases the expected value of returns!
0 X
The disadvantage of log-returns is the property that the sum of lognormal returns is not itself lognormal. This is a
problem because the log-normal property of individual assets cannot be extended to the portfolio, i.e. if each
asset return is lognormal, the portfolio return as a weighted average of lognormals is not necessary lognormal as
well. This problem can be avoided by considering short time intervals. In the limit of continuous time the problem
is gone.
3 PORTFOLIO CHOICE
In this section, we will show that with log-normally distributed returns a CRRA-expected utility maximizer should
choose a fix-mix strategy, i.e. he should fix his portfolio weights and then rebalance the units hold in his
portfolio. As a side product we see that the mean-variance principle can be modified to display the agent’s
optimal intertemporal asset allocation.
To this end, consider the following expected utility maximization problem, which describes the “myopic”
optimization from one period to the next:
wt1+−1α
Max Et
λt ∈R K +1 1−α
s.t.
wt +1 = Rλt wt
K
∑λ
k =0
k ,t =1
K
where Rλt = ∑ λk ,t Rk ,t +1 and R0,t +1 = R f ,t .
k =0
First, we transform the problem by taking the logarithm to:
Wt1+−1α W 1− α
max Et 1− α
⇔ max log ⎡⎢ Et 1t−+α1 ⎤⎥
K +1
λt ∈R λt ⎣ ⎦
This is equivalent to:
max
K +1
log EtWt1+−1α − log(1 − α )
λt ∈R
If next period wealth is log normal we can apply the property that:
1
log EtWt +1 = Et log Wt +1 + (Vart log Wt +1 )
2
to get:
(1 − a )2
max (1 − α ) Et logWt +1 + VAR(logWt +1 )
K +1
λt ∈R 2
Since next period wealth is proportional to this period wealth, from this expression we already see that the asset
allocation is independent of wealth. Using the notation for a portfolio’s mean and variance of log-returns we get:
(1 − a) 2 2
max (1 − α ) Et rλt + σ rλt
λt 2
Dividing by 1 − α and eliminating it from the maximization problem, yields:
(1 − a ) 2
max Et rλt + σ rλt
K +1
λt ∈R 2
where σ r2λ is the conditional variance of the log portfolio return.

t
Using the budget constraint λo, t = ∑ k =1 λk , t we get:

K
K
1 K a K K
maxK
λt ∈R
∑ λk ,t ( Et rk ,t +1 − rf ,t ) +
k =1
∑ k ,t rk ,t+1 2 ∑∑
2 k =1
λ σ 2
−
k =1 j =1
λk , t λ j , t cov( rk , t +1 , rj , t +1 )
⎛ ⎛ 1⎞ ⎞
1 ⎜ ⎜ ⎟ 1 2⎟
The solution of the problem is: λ opt
= i cov ⎜ Et ( rt +1 ) − rf , t ⎜ . ⎟ + σ t ⎟ where rt +1 is the vector of asset log
−1
t
α ⎜
t
⎜ 1⎟ 2 ⎟
⎝ ⎝ ⎠ ⎠
returns and σ t2 is the vector of variances of the k assets.
1 2
Note that except for the term σ t this is the same solution as in the two period mean-variance model from the
2
second chapter. This term tilts the asset allocation to those assets with higher volatility. The section 2.2. will
provide an intuition for this.
3.1 Dynamic Portfolio Choice with CRRA: The “No Time Diversification” Theorem
A fundamental observation in finance is that in an efficient market an expected utility maximizer with CRRA does
not change his asset allocation over time. To see this we need to extend the previous analysis for the case that
the investor cares about his wealth n periods from now, Wt + n . If all wealth is invested, the budget constraint is:
Wt+n = Rt + n Rt + n −1 Rt + n − 2 ...RtWt or in logs: wt+n = rt + n + rt + n −1 + rt + n − 2 ... + rt + wt . Note that we continue to use
gross returns.
Suppose that the log-normal returns are i.i.d. over time. This implies that risky returns are not serially
correlated, Et + n ( wt+n ) = nEt ( wt+1 ) , and the variance of risky assets is proportional to the holding period,
VARt + n ( wt+n ) = nVARt ( wt+1 ) . With i.i.d. returns all means and variances are scaled up by the same factor n. In
other words, both the short-term and long-term investor face the same mean-variance choice scaled up or down
by the factor n. Thus, if investors have CRRA utility functions, the asset allocation does not depend on wealth
and since expectations and covariances are constant over time, both the short-term and long-term investors
choose the same portfolio, i.e. the long-term investor acts myopically. This result is known as the “no time
diversification” theorem which goes back to Samuelson (1969) and Merton (1969).
We will see later that in the case of a log utility, portfolio choice will be myopic even if asset returns are not i.i.d.
In any case, a CRRA-expected utility maximizer will not choose a buy and hold strategy but fix his asset mix and
then rebalance.
3.2 Rebalancing, Fix Mix and Volatility Pumping

The “No Time Diversification Theorem” of Samuelson and Merton shows that on i.i.d. process with log-normal
returns an expected utility maximizer with CRRA chooses an asset allocation that is invariant over time. Hence
when the price of an asset goes up, the investor sells part of his holdings of that asset and when the price goes
down he will purchase more of this asset in order to keep the share of wealth invested in this asset λ i ,k fixed
over time. This strategy is also called “fix mix” and the corresponding behavior is called “rebalancing”. It is
illustrated in the figure bellow.
sell buy
9
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
The next example, “volatility pumping”, illustrates that in a sense a rebalancing behavior is more successful the
more volatile the assets are.
There are two assets, cash with zero interest and a stock that in each period has a 50:50 chance of either
doubling its value or of reducing it by one-half. The stochastic process is i.i.d. as for example a repeated coin
tossing. An investment left in the stock will have a value that fluctuates a lot but has no overall growth rate. Also
the investment in cash has no growth rate. However if we rebalance, say a (½, ½) asset allocation, our wealth
1 1 1 1 1
grows at a rate of about 6%: µ ( g ) = ln( + 1) + ln( + ) ≈ 0.059 , where µ ( g ) is the expected growth
2 2 2 2 4
rate of wealth.
Since we will use these ideas later quite frequently, we take the opportunity of this simple example to make the
reasoning a bit more precise: Let ωt ∈ {H , T } be the state of the world in period t and ω t = (ω0 , ω1 ,..., ωt ) be
the path up to period t. Then, the evolution of wealth is given by the recursion:
w(ω t +1 ) = ⎡⎣ A(ω t +1 )λ + (1 − λ ) ⎤⎦ w(ω t ) , where A(ω t +1 ) is 2 in the case of H and ½ in the case of T. Since ω t
is i.i.d. A(ω t +1 ) = A(ωt ) . By the Law of Large Numbers we get for the expected evolution of log-returns:
⎡1 ⎤
E ln( w(ω t +1 )) = p( H ) ln [ 2λ + (1 − λ ) ] + p(T ) ln ⎢ λ + (1 − λ ) ⎥ + ln w(ω0 ) .
⎣2 ⎦
1 1 ⎡1 ⎤
Hence, the expected growth rate is: µ (g(λ ))= ln [ 2λ + (1 − λ ) ] + ln ⎢ λ + (1 − λ ) ⎥ .
2 2 ⎣2 ⎦
As claimed above, the expected growth rate of putting all money into one of the two assets is zero:
1 1 ⎡1⎤ 1 1
µ (g(1))= ln [ 2] + ln ⎢ ⎥ = µ (g(0))= ln [1] + ln [1] = 0
2 2 ⎣2⎦ 2 2
⎛1 1⎞
However, rebalancing the portfolio ( λ ,1-λ ) = ⎜ , ⎟ gives:
⎝2 2⎠
⎛1⎞ 1 ⎡1 ⎤ 1 ⎡1 1⎤
µ (g ⎜ ⎟) = ln ⎢ + 1⎥ + ln ⎢ + ⎥ ≈ 0.059 .
2 2
⎝ ⎠ 2 ⎣2 2 4
⎦ ⎣ ⎦
The variance of the expected growth rates is:

2
1 1 ⎡ ⎡1 ⎤ ⎤
σ (g(λ ))= ⎡⎣ln [ 2λ + (1 − λ )] − µ (g(λ )) ⎤⎦ + ⎢ln ⎢ λ + (1 − λ ) ⎥ − µ (g(λ )) ⎥ .
2 2
2 2 ⎣ ⎣2 ⎦ ⎦
The growth rates of different portfolio strategies can be represented in a mean-variance diagram.
EXP-g and VAR-g
0.08
0.06
EX P-g
0.04
0.02
0
0 0.1 0.2 0.3 0.4 0.5 0.6
VAR-g
The highest growth is achieved by the strategy investing half of the wealth in stocks and the rest in cash34.
Depending on his risk aversion, an investor may however go for a smaller growth rate in order to reduce the
variance of the growth rate.
4 ASSET PRICING
In the previous section we have seen that if log-returns are i.i.d. and normally distributed, then all CRRA-
expected-utility-maximizers choose a fix-mix strategy with the same fund of risky assets. In this section we want
to ask two questions: First, can this view be made consistent, i.e. is it possible to generate i.i.d. log-normally
distributed returns with CRRA-expected-utility-maximizers? Second, does this give a reasonable asset pricing
model, i.e. a model that generates these returns out of reasonable assumptions?
First of all, there are good reasons to prefer utility functions in which relative risk aversion35 does not depend on
wealth:
„The long-run behavior of the economy suggests that relative risk aversion cannot depend strongly on wealth.
Per capita consumption and wealth have increased greatly over the past two centuries. Since financial risks are
34
Be sure that you can prove this fact!
35
Remind that the coefficient of relative risk aversion determines the fraction of wealth that an investor is ready to pay in
order to avoid a gamble of a given size relative to his wealth.
multiplicative, this means that the absolute scale of financial risks has also increased while the relative scale is
unchanged. Interest rates and risk premium do not show any evidence of long-term trends in response to this
long-term growth; this implies that investors are willing to pay almost the same relative costs to avoid given
relative risks as they did when they were much poorer, which is possible only if relative risk aversion is almost
independent of wealth.“ (Campbell and Viceira, 2002), p. 24.
Hence, to explain the rather limited range of financial variables under economic growth, we assume that agents
have utility functions with a constant relative risk aversion (CRRA).
To answer the question if the portfolio choice discussed above is compatible with an asset pricing model,
λti ,k
consider an equilibrium, where the demand of investor i in period t is: θ ti,k = k
wti . Normalizing the asset
q t
supply to 1, we get the market clearing condition ∑θ i

t
i,k
= 1 and as above the price of asset k is
q kt = ∑ λti,k wti , i.e. a weighted average of the strategies. Assuming two-fund separation36 for a common and
i
1 1
stationary strategy λ , the price of asset k is q kt = λ k ∑ wti = λ k ∑w i
. Comparing the prices of asset
i α i
α rep
i
t
q kt λtk
k and another asset j we see that relative prices are constant: = .
q tj λtj
However, in reality, relative market values seem to deviate from this result as the figure bellow shows.
Relative Market Values
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003
Extending the analysis to an aggregate level, we see that aggregate wealth is determined by aggregate
dividends ∑ wti = ∑ Dtk . Thus, prices will fluctuate with aggregate dividends, i.e. if aggregate dividends are
i k
i.i.d. and Gaussian, the prices will be also i.i.d. and log-normal.
However, empirically, the fluctuations of dividends are too small to explain the fluctuations of returns. In other
words, there is excess volatility.
36
Note that as shown above CRRA would imply two-fund separation.
Total DIV vs MV 81-03
4500000 60000000
4000000
50000000
3500000
3000000 40000000
2500000
MV
30000000
DIV
2000000
1500000 20000000
1000000
10000000
500000
0 0
1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003
5 THE EQUITY PREMIUM PUZZLE

A common argument to convince people to invest in stocks is to show by how much one`s wealth would have
appreciated in the long run if it were invested in stocks as compared to bonds.
Investing one single dollar in 1920 on DJIA would have given you 2586.52 dollars while it would only have given
you 16.56 dollars if it were invested in bonds. Or to phrase this fact differently, the average excess return of
stock over bonds is about 6% p.a. in the US. In other countries this so called equity premium is similar: UK:
4.6%, Japan 3.3%, Germany 6.6% France 6.3%.
Of course, stocks have also been more volatile but the question remains whether these high excess returns can
really be explained by the volatility of stock returns. Or, to be more precise, we ask whether the standard asset
pricing model with a single representative consumer maximizing a CRRA-utility could generate such high risk
premium. To get some intuition why this model fails in this respect, or phrased less drastically why the equity
premium is a puzzle, consider the following simple question: Which X would make you indifferent to the
following lottery? Doubling your income with 50% probability and loosing X% of your income with 50%
probability. The typical answer: X = 23%
However an expected utility maximizing investor with CRRA who on realistic data chooses the risk free asset and
stocks would choose X = 4 % . Hence explaining the equity premium with this model would amount to choose
unreasonable high degrees of risk aversion, i.e. such a representative consumer would not represent the average
risk aversion in the society.
Yet a different way of stating the equity premium puzzle is to argue that for reasonable degrees of risk aversion,
the observed volatility in consumption is not high enough to generate the observed Sharpe ratios on stocks. This
argument refers to the intertemporal utility maximization principle, as expressed in the first-order condition that
we derived above:
⎛ ∂1s u ′i ⎞
Rf = Ε p [l Rm ] , where l s = Rf β ⎜ i ⎟
and β is the time preference parameter.
⎝ ∂ 0u ′ ⎠
SDF
Using the definition of the covariance and the fact that the correlation coefficient is bounded above by 1 yields
the famous Hansen-Jagannathan volatility bound:
Ε p [ rm ] − rf
≤ σ ( l ) ≈ ασ ( ∆ ln ( cs ) )
σ ( rm ) (1 + rf )
Hence the excess return of the market divided by its standard deviation and the gross interest rate is bounded
above by the standard deviation of the likelihood ratio process. An approximation of the latter is the product of
the risk aversion coefficient and the volatility of changes in the growth rate of consumption, which is however
too small for the observed size of the left-hand side of the inequality.
6 SUMMARY
In this chapter we have seen that a Random Walk has log-normal i.i.d. returns, which delivers nice results using
the Mean-Variance framework. Under these conditions however the best portfolio strategy is not necessarily buy-
and-hold but fix-mix –at least for the standard class of utility functions. Also, using the properties of log-normal
returns, we see that the more volatile the markets the higher are the expected returns!
Empirically, the assumption of log-normal i.i.d. returns is questionable. Relative market values do fluctuate and
the fluctuations in aggregate dividends are too small to explain the fluctuations in returns (excess volatility). Also
the observed equity premium cannot be reconciled with this standard model of neoclassical finance.
REFERENCES:
Textbooks:
Campbell and Viceira (2002): Strategic Asset Allocation, chapter 2.
Cootner, P. (1964). The Random Character of Stock Market Prices. The M.I.T. Press, Cambridge.
Luenberger (1998): Investment Science, chapter 15.
Research Papers:
Campbell and Viceira (2001): „Stock Market Mean Revision and the Optimal Equity Allocation of a Long-Lived
Investor“, European Finance Review.
Hansen LP, Jagannathan R. 1991. “Implications of Security Market Data for Models of Dynamic Economies”.
Journal of Political Economy 99: 225-262.
Samuelson (1969):“Lifetime Portfolio Selection by Dynamic Stochastic Programming“, Review of Economics and
Statistics (51), pp.239-246.
Chapter 6:
Behavioral Portfolio Theory

Behavioral Portfolio Theory 80
“No rational argument will have a rational effect on a man who does not want to adopt a rational attitude.”
(Karl Popper)
1 INTRODUCTION
As concluded in previous chapters, neither the assumptions nor the conclusions of classical finance match the
reality. For example, the postulate from the efficient markets hypothesis that returns must follow a random walk
is not confirmed by the data. Additionally, even if there is some structure in the dynamics of economic variables
typically represented by the Euler equation, it does not represent the real economy very well. A natural way to
start developing a theory that better matches the reality is to observe agents` behavior in the economy either
empirically (e.g. observing individual portfolios) or experimentally. The assumption of complete rationality
underlying the efficient market hypothesis is wishful thinking that may be quite off from reality.
The field of behavioral finance provides a new set of explanations of observed empirical regularities documented
as puzzles in the classical finance. It also provides a new set of predictions. The goal of this chapter is to present
some of the central ideas of behavioral finance in order to develop a better understanding for the conclusions in
the following chapters.
2 DESCRIPTIVE VERSUS PRESCRIPTIVE THEORIES

Individuals’ behavior has been studied by many authors. Bubbles for example became representative for
investors’ behavior being inconsistent with prevalent normative theories. Regarding the South Sea Bubble in
1720, Sir Isaac Newton said: “I can calculate the movements of heavenly bodies but not the madness of
people”. For Keynes (1936), people’s behavior is often driven by animal spirits: “Most… decisions to do
something positive can only be taken as a result of animal spirits-of a spontaneous urge to action rather than
inaction, and not as the outcome of a weighted average of quantitative benefits multiplied by quantitative
probabilities.”
Already Veblen (1900) suggested that bubbles can be better understand if one understands the psychology of
decision making: “Market values being a psychological outcome, it follows that pecuniary capital, an aggregate
of market values, may vary in magnitude with a freedom which gives the whole an air of caprice – such as
psychological phenomena, particularly the psychological phenomena of crowds, frequently present, and such as
becomes strikingly noticeable in times of panic or of speculative inflation”.
Kahneman and Tvesky are more specific by introducing the idea of heuristics: “In making predictions and
judgments under uncertainty, people do not appear to follow the calculus of chance or the statistical theory of
prediction. Instead, they rely on a limited number of heuristics which sometimes yield reasonable judgments and
sometimes lead to severe and systematic errors”.

Though, it is ambiguous which of the phenomena are robust. In the words of Shleifer (2000): “There is a lot of
psychology that might be relevant for the formation of investor sentiment, and no obvious way of deciding which
psychological biases are the most important”.
Additionally, Raiffa (1994) raises the question of the usefulness of this research. One possibility is to use the
ideas and build better theories either drawing direct inferences or by modifying normative theories to include
cognitive concerns. Though, Howard (1988) expresses concerns regarding this idea: “Some decision theorists
have questioned the normative concepts. They desire to weaken the norms until the normative behavior agrees
with the descriptive behavior of human beings. A moment of reflection shows that if we have a theory that is
both normative and descriptive, we do not need a theory at all. If a process is natural like breathing, why would
you even be tempted to have a normative theory?”. Another possibility is to use the research results and provide
better training for decision makers.
In the following we will provide a systematic discussion about the patterns of decision making as a basis for
deriving implications.
3 SEARCH, FRAMING AND PROCESSING OF INFORMATION
3.1 Selection of Information

Information processing requires in the first step a selection of relevant issues. In complex, uncertain
environments, decision makers under time pressure often use rules of thumb or heuristics. However, applying
them may result in poor decision outcomes.
For example, using the recognition heuristic following the rule that if one of two objects is recognized and the
other is not, then one should infer that the recognized object has the higher value, may lead to false conclusions.
To illustrate the phenomena consider the results of the following example. In an experiment, people have been
asked which US city has more inhabitants: San Diego or San Antonio. Participants chose the city they have heart
most about – San Diego – although it had less inhabitants. As noted above applying heuristics does not
necessarily lead to bad decision outcomes. Gigerenzer and Todd (1999) show for example that non-experts can
achieve higher investment returns than experts by applying a simple heuristic: buy the shares of companies
whose name you recognize.
Many decisions under uncertainty in particular under asymmetric information, where everybody beliefs that
everybody has better information, induces imitative behavior. This behavior is known as herding. As (Banerjee,
1992) has pointed out herding may also result under complete rationality. Let us presume that you and a lot of
other people have to find your way to a new destination, and you come to a crossway where you can only either
go left or right. Everyone has a private imperfect signal (e.g. "judgment" or "opinion"). For simplicity, let
everyone have a private signal "left" ("right") with probability 1/2 if the true best choice is to go left (right). So,
the signal helps but it is not perfect. Everyone's signal is equally good. Assume further that you are the third
person to choose, and you first saw a man and then a woman go left. It is optimal for you to go "left" even if
your private signal/intuition says "right". Why? You know that the man must have had "left" signal, because he
went left. The woman saw the man go "left." She would have figured out that the first individual's signal was
"left". If her private signal was "left", she would have surely walked left, too. If her signal was "right", she
would have been aware of one right and one left signal, say in this case she tosses a fair coin. She might have
walked the one or the other way. Having seen both the man and the woman walk "left," you know that the
man had a "left" signal and the woman had a better than even chance of having had a "left" signal. Even if
your private information is a "right" signal, you should choose "left" if you are acting rationally. The
consequence is that million rational individuals may walk "left" just because the first two individuals walked
"left", even if the true best choice was "right" called. This situation is called informational cascades. Note that
in an informational cascade everybody acts rationally. Though, even if all participants as a collective have
information in favor of the correct action, each participant may take the wrong action.37
3.2 Framing
People’s behavior may depend on the way their decision problems are presented or framed. Tversky and
Kahneman (1986) show that alternative descriptions of a decision problem often give rise to different
preferences contrary to the principle of invariance underling a rational choice. The intuition behind this normative
concept is that variations of the form that do not affect the actual outcomes should not be relevant for the
choice. Consider the following example. Assume that there are 60’000 SFr. at risk. One group of respondents
receives the following problem. They can rescue 20’000 SFr. (alternative A) or rescue the total amount with
probability 1/2 and loose the total amount with probability 2/3 (alternative B). A second group of respondents
receives the same problem but with a different formulation. Now, alternative A is the sure loss of 20’000 SFr.
and alternative B is the uncertain outcome to loose nothing with probability 1/3 and loose everything with
probability 2/3. The majority choice of the first group is risk-averse: the prospect of certainly saving 20’000 is
more attractive than a risky prospect with the same expected value. In contrast, the majority choice in the
second group is risk taking: the certain loss of 20’000 SFr. is less acceptable than the two-of-three chance to
loose everything. The preferences of both groups illustrate that choices involving gains are often based on risk-
aversion while choices involving losses are often risk-taking.
Benartzi and Thaler (1998) observe that employees planning their pension savings tend to split the funds equally
(1/n) over the alternatives offered by their pension fund. As a consequence, asset classes offering more choices
receive significantly more funds.
37
For a survey on rational herding see Andrea Devenow, Ivo Welch (1996) Rational Herding in Financial Economics.
European Economic Review 40:3-5, 603-615.
Another example for the impact of framing on investment decisions is the practice of selling shares that have
sharply declined in value on the end of reporting period, i.e. window dressing.38 By selling the losers portfolio
managers want to dress their window, i.e. to show in their brochures that they have the winners in their
portfolio.
3.3 Processing of Information

There are several biases in the processing of information.
• The availability bias can be best illustrated considering the following answers of the following questions:
What is a more likely death in the US – being killed by a falling airplane parts or by a shark? (Plous,
1993). Most people judge the probability of dying by an event, which receives more publicity and it is
easier to recall as higher against the statistical evidence.
• The gambler’s fallacy is a phenomenon where people inappropriate predict reversals in a random
sequence. For example, in a roulette after a run of order at most 7 people think it is time for the other
color to come again although the outcomes are i.i.d. Asking students to estimate the probability of runs
of different order (1 to 6) by throwing a fair coin 100 times, we observe that people underestimate the
frequency of long runs of the random walk process.
people random walk
0.6
0.5
0.4
0.3
0.2
0.1
0
1 2 3 4 5 6 7 8
• Violation of rational choice theory is commonly observed in simple repeated binary choice problems. In
such tasks people often tend to ‘match’ probabilities, i.e. they allocate their responses to the two
38
Josef Lakonishok, A. Shleifer, R. Thaler, and R. Vishny (1991)
options in proportion to their relative payoff probabilities. Suppose that a monetary payoff of fixed size is
given with tossing a coin with probability 0.49 for “head” and with probability of 0.51 for “tail”.
Probability matching refers to behavior in which “head” is chosen on about 49% of trials and “tail” on
51%. On any coin toss, the expected payoff for choosing “tail” is higher than the expected payoff for
choosing “head”. Thus, the optimal choice is always to always select “tail”. But the rational behavior
would not match the randomness of the choice problem. From investors perspective, the decision
whether to invest in an asset whose price increase with “head” and decrease with “tail” is not to invest
if the coin is fair and there are transaction costs, respectively to follow a “buy and hold” strategy if the
coin is worn as described above.
• When people try to determine the probability that a data set A was generated by a model/process B, or
that an object A belongs to a class B, they often use the representativeness heuristic. This means that
they evaluate the probability by the degree to which A reflects the essential characteristics of B. In many
circumstances, representativeness is a helpful heuristic, but it can generate many severe biases. Consider
the following example. Assume that there is a fund manager known to beat the market in 2 of 3 years.
Given the sequences BLBBB, LBLBBB and LBBBBB (B: beat; L: loose), people typically choose the
sequence LBLBBB as it is representative for the probability 2/3. Though, the sequence BLBBB is more
likely than the sequence LBLBBB because the latter includes it. The probability for the sequence LBLBBB
is the conditional probability for the sequence BLBBB given the additional realization L in one of the
years. , the correct answer is LBBBBB39.
• Even though in one person decision problems additional constraints can only make one worse-off,
people often impose themselves rules to overcome self-control deficits. Examples include getting your
stomach removed to prevent yourself from eating too much. Perhaps an appropriate model for this
phenomenon should involve two agents the one who has the tendency to overeat and the other hinder
the overeater to do so.
• Also closed-end-funds represent a puzzle. They typically start with a discount of 10%, which is a natural
result of underwriting and start-up costs. The discount rate fluctuates widely and upon termination share
prices rise and discount shrinks. Several explanations have been suggested. One of them is based on the
notion that people may not be able to stop spending capital gains, much like being not able to stop
overeating. Since closed-end funds retain the profits and do not provide regular payoffs which could be
n
39
If outcomes are independent then the cumulated probability for each sequence is P ( x1 ,..., xn ) = ∏ p( x ) where x
i =1
i i
is a realization in a sequence and p ( xi ) is its probability. The probability of the first sequence is
1 4 2 4
⎡1⎤ ⎡ 2 ⎤ ⎡1⎤ ⎡ 2 ⎤
P1 ( x1 ,..., xn ) = ⎢ ⎥ ⎢ ⎥ , the second sequence has probability P 2 ( x1 ,..., xn ) = ⎢ ⎥ ⎢ ⎥ and the third
⎣3⎦ ⎣ 3⎦ ⎣3⎦ ⎣ 3 ⎦
1 5
⎡1⎤ ⎡ 2 ⎤
P ( x1 ,..., xn ) = ⎢ ⎥ ⎢ ⎥ .
3
⎣3⎦ ⎣ 3 ⎦
spent for consumption, one would nee to sell shares i.e. to start “dipping into capital” if one wants to
finance consumption. This however is risky since people may then dip into capital too much. Hence
closed end funds sell at discount, which makes their price lower than the sum of the prices of their
components.40
• There is strong empirical evidence that investors do not diversify sufficiently, instead they hold primarily
shares of domestic companies. This phenomenon called home bias is associated with the notion that
people prefer familiar situations. They feel that they are in a better position than others to evaluate
particular decision problem.
• The Ellsberg paradox provided evidence that people dislike ambiguity or situations where they are not
sure what the probability distribution of a gamble is. If people feel more comfortable in situations of risk
than in situations of uncertainty then this could explain why they prefer investing in domestic companies.
Moreover, if investors are concerned that their model of stock returns is misspecified, they would charge
a substantially higher equity premium as compensation for the perceived ambiguity in the probability
distribution.
4 PROSPECT THEORY
Large body of experimental work has shown that people systematically violate the assumptions underlying
expected utility theory choosing among risky gambles. The Prospect Theory of Kahneman and Tversky is one of
the theories trying to match the experimental evidence. Their theory has no aspiration as a normative theory: it
tries to capture people’s attitude to risky gambles.
The theory consists of four building blocks. First, utility is defined over gains and losses rather than over final
wealth.41 Second, the value function has a kink at the origin, indicating a greater sensitivity to losses than to
gains, a feature known as loss aversion. Third, the shape of the value function is concave in the domain of gains
and convex in the domain of losses. In other words, people are risk averse over gains and risk-seeking over
losses, i.e. they gamble to avoid losses. The final piece of prospect theory is the nonlinear probability
transformation: small probabilities are over-weighted.
40
As Martin Zweig, a famous closed end fund manager showed, the discount can be eliminated by paying out dividends.
41
This idea was first proposed by Markowitz!
P ro b ab ility W e ig h ten in g W (P )
1.2
0.8
0.6
P
0.4
0.2
0 0 .2 0 .4 0 .6 0 .8 1 1 .2
W (P)
To illustrate the first result, consider a lottery paying 200 with probability 50% and -100 with probability 50%.
1 1
Evaluating the lottery using the expected utility approach requires comparing U (−100) + U (200) with
2 2
U (0) . Typically people would not play this lottery because they are loss averse, i.e. losses are weighted with the
factor 2.25 more than gains. If lotteries are not evaluated according to gains and losses but according to final
1 1
wealth, then the lottery is evaluated by comparing U ( w − 100) + U ( w + 100) with U ( w) , where w > 0 is
2 2
the initial wealth. Every expected utility maximizer, with CRRA for example, u ( x) = x a where a > 0 would play
the lottery.
One application of the loss aversion is the disposition effect, i.e. investors are reluctant to sell assets trading at a
loss relative to the price at which they were purchased. Odean (1998) finds that individual investors are more
likely to sell stocks which have gone up in value rather than stocks that have lost.
One further aspect of prospect theory that is similar to thinking in terms of gains and losses is narrow framing or
the tendency of agents facing new gambles to evaluate them separately, in isolation from other risks they
already have. In other words, agents derive utility directly from the outcome of the gamble instead of evaluating
its contribution to total wealth and miss the chance of diversification. Put differently, agents maximize utility
locally in an optimal manner, but by doing so they may come to a bad global outcome. For example, a lottery A
paying 60 by “head” and -59 by “tail” and a lottery B paying -59 by “head” and 60 by “tails” would be
rejected if evaluated separately. Though, playing both lotteries simultaneously is an arbitrage opportunity
delivers a sure profit of 1 (an arbitrage opportunity). Narrow framing my also arise from mental accounting, i.e.
from separating assets into categories like bonds, stocks and alternative investments. If one diversifies within
those categories but not across them then suboptimal allocations will result.
The third pillar of prospect theory implies that agents change their risk attitude when facing gains and losses. In
particular, they prefer to take additional risks and gamble in order to achieve sufficient gains that neutralize the
losses.
The last building block of prospect theory is the proposition that individuals transform probabilities to decision
weights. This attitude can explain the Allais Paradox presented in previous chapter. Applying the appropriate
transformation of the underlying probabilities we get: A≽B if and only if
w(1)u(50) > w(0.01)u(0) + w(0.89)u(50) + w(0.1)u(250) or
( w(1) - w(0.89))u(50) > w(0.01)u(0) + w(0.1)u(250)
Applying the same procedure for the lotteries A’ and B’ we get B`≽ A`, if and only if
w(0.9)u (0) + w(0.1)u (250) > w(0.89)u (0) + w(0.11)u (50) or
( w(0.9) - w(0.89))u (0) + w(o.1)u (250) > w(0.11)u (50)
Thus this choice while being a contradiction to the expected utility approach is consistent with prospect theory.
5 PROBABILITY WEIGHTING AND FSD

Let there be three states, s = 1, 2,3 with payoffs (100, 100, 90.25). The true probabilities are
p = (0.05, 0.05, 0.9) . Let the perceived probabilities be w = (0.1, 0.1, 0.85) . If the vNM utility function is
u ( x) = x , for example, then the prospect-theory-investor would prefer the lottery to a certain payoff of 100,
since 0.1 100 + 0.1 100 + 0.85 90.25 = 2 + 8.075 > 10 = 1 100 . However, this preference violates First
Order Stochastic Dominance.42 To cure the problem of FSD, the cumulative prospect theory of Tversky and
Kahneman (1992) suggest defining the weighting function, say T ( x) , on the cumulative probability density of
gains and losses rather than on the probability density. The prospect utility function is obtained in general as
PTv( x) = ∫ v( x)dT ( x) . Since T ( x) is an increasing monotone transformation the First Stochastic Dominance
property is preserved. In other words, if lottery A first order stochastically dominates lottery B then T(A) first order
stochastically dominates T(B).
REFERENCES:
Textbooks:
Campbell and Viceira (2002): “Strategic Asset Allocation”
Gigerenzer and Todd (1999): „Simple Heuristics That Make us Smart“, Oxford University Press
Plous, S. (1993): “The Psychology of Judgment and Decision Making”, McGraw-Hill
Simon, H.A. (1957): “Models of Man: Social and Rational”, Wiley
Shleifer (2000): “Inefficient Markets”
Veblen (1900): “Theory of the Leisure Class”
Research Papers:
DeGeorgi , Hens and Levy (2003): “Prospect Theory and the CAPM”, NCCR-working paper.
John Maynard Keynes (1936): “The General Theory of Employment, Interest and Money”, Harcourt
De Bondt (1998): “Anatomy of the individual investor”, European Economic Review.
Howard, R.A. (1988): “Decision Analysis: Practice and Promise”, Management Science (34), p. 679-695
Kahneman and Tversky (1973): “On the Psychology of Prediction”, Psychological Review, 80
Kahneman and Tversky (1979): “Prospect Theory: an analysis of decision under risk”, Econometrica 47, 263-291
Tverky and Kahneman (1992): “Advances in Prospect Theory: Cumulative Representation of Uncertainty”,
Journal of Risk and Uncertainty, 5, pp. 297-323
42
Since the preference (100,100,90.25) >(100,100,100) is a violation of the state dominance.
Chapter 7:
Choosing a Portfolio on a
Mean Reverting Process:
Time Diversification
Choosing a Portfolio on a Mean Reverting Process: Time Diversification 90
1 INTRODUCTION
As discussed in previous chapters, in efficient markets, mean-variance analysis is sufficient to solve investment
decision problems. However, there is ample evidence that markets are not as efficient as one would wish.
Concerning the issue of mean-reversion we for example observe that the longer the investment horizon the more
favorable is the ratio of return to variance43. The ratio of variance over an n-period horizon to n times the
variance of a one period horizon has been suggested as a test for mean-reversion. While this ratio should
fluctuate around one in a random walk, for mean-reverting processes it is significantly less than one. As one of
our exercises shows this is the case for many stocks in the DJIA. One may say that markets have a “memory”,
i.e. prices in the future depend on past realizations, contradicting the efficient market hypothesis as formulated
by Fama (1970). In this chapter we discuss the question how to invest on mean-reverting processes. We find that
for reasonable degrees of risk aversion a CRRA-expected utility maximizer will hold more risky assets when those
are mean-reverting but he will also have to “time the market”, i.e. to adjust his asset allocation to the ups and
downs of the market. We conclude this chapter by asking once more whether the model can be closed, i.e.
whether there are reasonable assumptions on the investors behavior that are able to generate a mean-reverting
asset price process. This leads us to behavioral finance asset pricing models that are know under the term
“investor sentiment models”. Finally, we take a glance at investor sentiment indices as constructed by
practitioners.
2 THE MEAN REVERTING PROCESS

A mean reverting process is a stochastic process that can be described as an autoregressive process of order one,
AR(1), i.e. as yt +1 = µ + ρ yt + ε t where ε t satisfies E(ε t ) = 0 , E(ε t2 ) = σ 2 and. For ρ = 1 , we get a unit
root process or a random walk. If ρ < 1 the process is mean reverting. If ρ > 1 the process is mean-averting.
As an example for an AR(1) process, consider a simple Markov process, where realizations at time t depend only
on the realizations in the previous period t-1 as summarized in the figure bellow.
t
+ -
t-1
+ b 1-b
- c 1-c
The variables b and c denote the probabilities to reach state “+” in period t given that in the previous period the
state was “+” respectively “– “. If the process is mean reverting (averting), the probability to be in the same
state one period ahead is smaller (greater) than the probability for a “switch”. In contrast, these probabilities are
the same for a random walk, since the process has no memory. In a random walk with drift the probabilities
43
Recall that in a random walk both the return and the variance increase at the same rate with the investment horizon.
differ from ½ but are identical for both cases “+” respectively “– “.
3 OPTIMAL PORTFOLIO CHOICE WITH MEAN REVERSION

The results in the previous chapter show that betting on time diversification, i.e. holding more risky assets the
longer the investment horizon, does not make any sense if markets are efficient and assets returns are i.i.d. In
this chapter, we show that a rational investor with reasonable risk aversion will hold more stocks (diversify over
time) when returns are mean reverting and that she will have to time the market, i.e. she will have to adjust her
share of risky assets to the market conditions.
This point is made in the book of Campbell and Viceira (2002). Unfortunately these authors do use the term
“buy&hold” differently to the way we used it so far. For them a buy&hold strategy is a strategy in which the
asset allocation is hold fixed over time. We called this strategy a fix-mix or rebalancing strategy because it
cannot be followed by buying and then holding assets untouched. When percentages of wealth are fixed the
units of assets have to be rebalanced! In a second dimension Campbell and Viceira (2002) distinguish between
myopic, tactical and strategic decisions. Investors following a myopic buy-and-hold strategy plan over one period
of time and do not try to implement any timing strategies while building their asset allocations. The tactical asset
allocation allows investors to react to market movements while planning ahead for one period only. Investors
implementing a strategic asset allocation plan over multiple periods and their asset allocation changes over time
as a response to market movements.
The resulting portfolio rules for Epstein-Zin utility functions and a general mean-reverting process can be
represented in the following figure:
If an investor forms a tactical asset allocation then on average she will hold as much stocks as a myopic buy and
hold maximizer. But after a bull market she will reduce her share and after a bear market she will increase her
share of risky assets. I.e. the optimal share of risky assets is an increasing function of up-movements of the
market, the line has a positive slope. At the point where the line crosses the myopic buy-and-hold line, the
excess stock return equals the unconditional mean. The optimal strategy of a long-term investor is the strategic
asset allocation. The line is slightly steeper than the tactical asset allocation line and it is shifted upward, so that
the intercept is positive. The higher slope can be explained with the notion that long-term investors anticipate
that in the next period they will have the chance to change their decisions in order to exploit market
opportunities, so that their aversion against the risky asset decreases and they are ready to put more weight on
the stock holdings in their portfolios.
This behavior of long-term investors has been discussed by Siegel (1994) as well. His advice that long-term
investors should hold more equities is only based on the reduced risk of stock returns at long horizons due to
mean reverting stock returns, ignoring the potential for timing: “Stocks have what economists call mean-
reverting returns, meaning that over long periods of time, high returns seem to be followed by periods of low
returns and vice versa. On the other hand, over time, real returns on fixed income assets become relatively less
certain. For horizons of 20 years or more, bonds are riskier than stocks.” Based on this fact Siegel recommends
holding more stocks the longer the investment horizon. The problem with this recommendation is that if stock
returns are mean reverting then they are predictable. Thus, investors can do even better by timing the market
and the buy-and-hold strategy is not optimal anymore.
Campbell and Viceira (2002) illustrate the proof of this claim by using an AR(1) process with log-normal returns
and Epstein-Zin utility. Samuelson (1991) gives the same intuition in a much simpler model based on the two-
state Markov model introduced above and CRRA expected utility functions. For the sake of simplicity, we will
only present this case here.
4 THE CASE CONSIDERED IN SAMUELSON (1991)

Consider as before a simple 2-by-2 Markov process (markets have a memory over one period only), where the
total return can be high (“+”) or low (“–“). There are two assets: safe cash and a risky equity. In the case of a
random walk, regardless of the last period realization, the probability of observing high or low returns is the
1
same, equal to a where 0 < a < 1 . If we assume that stocks rise on average in the long run, then a > > 1− a .
2
If the process is mean reverting, then b = Prob {+ | +} < 1/ 2 < Prob{+ | −} = c . If we want to recognize
additionally that stocks have an upward trend then the average of b and c must exceed ½. To simplify matters,
Samuelson has chosen the parameters without a trend in the following way:
t
+ -
t-1
+ b 1-b
- c 1-c
t t t
+ - + - + -
t-1 t-1 t-1
+ 1/3 2/3 + 1/2 1/2 + 2/3 1/3

- 2/3 1/3 - 1/2 1/2 - 1/3 2/3
Hence all processes have the same unconditional expectation44. The return matrix that Samuelson has chosen is:
⎡1 3⎤
R=⎢ ⎥ . For the case of a logarithmic utility (or “Bernoulli” utility) we solve the problem
⎣1 0 ⎦
Max p ln(1 + 2λ ) + (1 − p) ln(1 − λ ) where λ is the percentage wealth invested in the risky asset.
λ
3 p −1
The solution of the problem is: λ * = .
2
Myopic Investor
From the general solution given just before, in the case of a random walk, where p=1/2 the optimal fraction of
risky assets is ¼. Conditioning upon knowing the “+” or “–“ that has occurred, the one-period Bernoulli
investor who is permitted to time will choose to hold only cash in the next period, λ+* = 0 , if the previous
realization was good, or put half of its wealth in risky assets, λ−* = 1/ 2 , if the realization was bad. Thus, for a
myopic Bernoulli utility maximizer the long run average of the timing asset allocation coincides with the asset
allocation on a random walk that has the same long run probabilities as the Markov process. For a Bernoulli
agent the optimal allocation does not change even if the problem is solved for T periods. To see this, let
ωt ∈ {+, −} be the realization in period t and ω t = (ω0 ,...,ωt ) be the history up to that period. Then the
evolution of wealth can be written as wt +1 (ω t+1 ) = [1 − λ t (ω t )+R(ωt +1 )λ t (ω t )] wt (ω t ) where λ t (ω t ) is the
decision taken at t and R(ωt +1 ) is the return at t+1. Evaluating the wealth at the end of the planning period we
see that the T-period panning problem decomposes into T separate one-period problems:
Maxt
prob(+ ¦ ω t ) ln(1 − 2λ t (ω t ))+prob(− ¦ ω t ) ln(1 − λ t (ω t ))
λ (ω )
t
We get the same optimal solution as before. Thus, a strategic long-term Bernoulli utility maximizer chooses the
same asset allocation as the myopic Bernoulli utility maximizer as claimed above.
We can summarize the alternative portfolio rules for a Bernoulli utility maximizer as follows:
44
That is to say in the limit distribution of all three Markov matrices both states are equally likely. You may check this fact
by iterating the matrices!
1
If we choose a more realistic utility function from the CRRA class as for example U ( w) = − the optimization
w
problem becomes Max − p (1 + 2λ ) −1 − (1 − p )(1 − λ ) −1 and the solution is:
λ
⎧ 1 ⎫
⎪ p = 2/3 ⎪
⎪ 4 ⎪
⎪⎪ 4 + 72 p (1 − p) ⎪⎪
λ* = ⎨ 0 < p < 2/3 ⎬
⎪ 4(3 p − 2) ⎪
⎪ 4 − 72 p(1 − p ) ⎪
⎪ 2/3 < p <1 ⎪
⎩⎪ 4(3 p − 2) ⎭⎪
Thus, in the case of a random walk where p=1/2, the optimal allocation on risky assets is λ * = 0.1213... If the
investor believes in mean reversion (p=1/3), he will put all his wealth in the riskless asset, λ+* = 0 , after a good
realization and only part of it, λ−* = 1/ 4 , after a bad realization. Thus, for a more realistic utility maximizer, the
long run average of the timing asset allocation is greater than the asset allocation on a random walk that has
1 * 1 * 1 1 1
the same long run probabilities as the Markov process, i.e. λ− + λ+ > λ * because 0 + ⋅ > 0.1213...
2 2 2 2 4
The consequences for the long-term strategic asset allocation can be seen by considering the following two
alternative three-period cases. Let the accumulated probabilities be as illustrated in the figure bellow.
1/9
+
-
+ 2/9
4/9
+ +
-
-
2/9
Further, let λ-2 (λ+2 ) be the optimal one-period choice in the second period after a -/+return. The first period
choice after a “+” return is determined as before:
Max -1/9 (1+2λ )-1 (1+2λ+2 )-1 - 4/9 (1-λ )-1 (1+2λ−2 )-1 - 2/9 (1+2λ )-1 (1-λ+2 )-1 - 2/9 (1-λ )-1 (1-λ+2 )-1
λ
= 25/27 Max -9/25 (1+2λ )-1 - 16/25 (1-λ )-1

λ
Hence, taking into account the second period optimization the odds for a good outcome have changed from 1 to
2 in the myopic case to 16 to 25 in the strategic case. Consequently, the investor invests more in the risky asset,
λ+1 = 0.0198193 > 0 .
If we consider a strategic decision after a previous bear market then the probabilities change to:
2/9
+
- + - 4/9
+ 2/9
-
-
1/9
we get:
Max -2/9 (1+2λ )-1 (1+2λ+2 )-1 - 4/9 (1+2λ )-1 (1-λ+2 )-1 - 2/9 (1-λ )-1 (1+2λ−2 )-1 (1-λ+1 )-1 - 1/9 (1-λ )-1 (1-λ+2 )-1
λ
= 26/27 Max -18/26 (1+2λ )-1 - 8/26 (1-λ )-1

λ
Hence, taking into account the second period optimization the odds have changed to from 1 to 2 in the myopic
case to 18 to 26. Consequently, the investor invests more in the risky asset λ+1 = 0.272078... > 0.25 .
Thus, a strategic long term utility maximizer with a more realistic utility function chooses an asset allocation that
has more risky assets than the myopic utility maximizer with the same utility function.
The alternative portfolio rules are summarized in the figure bellow.
To summarize, the investor with the more risk averse utility U(W) = -1/W has a smaller allocation of risky assets
than the Bernoulli case and the longer the time horizon the greater the proportion of risky assets. Similarly, a less
risk averse utility U (W ) = W has a greater allocation of risky assets than the Bernoulli case and the longer
the time horizon the smaller the proportion of risky assets (see the slides). Finally, for the more risk averse utility
U(W) = -1/W we observe time diversification, i.e. the allocation of risky assets increases with the investment
horizon. The following figure summarizes the results of Samuelson (1991):
CRRA Utility
One Period Two Periods
Random Walk Mean Reversion Random Walk Mean Reversion
(No Timing) (Timing/TAA*) (No Timing) (Timing/SAA)
λ+2 = 0
λ =0
1
λ−2 = 0.5
λRW
1
= 0.25 +
Bernoulli Utility λRW
2
= 0.25 = λRW
1
λ−1 = 0.5
“fix-mix” Average:
u ( w) = ln( w) λ = 0.5λ+2 + 0.5λ 2 = 0.25 = λRW “fix-mix” - strategy
−
strategy
λ = 0.5λ+2 + 0.5λ 2 = 0.25 = λRW
−
NO TIME DIVERSIFICATION
λ+2 = 0.0198193... > 0 = λ+1

= 0.1213... λ = 0
1
λRW
1 + λ−2 = 0.272078... > 0.25 = λ−1
1 λ−1 = 0.25 λ 2
RW = 0.1213 = λ 1
RW
u ( w) = − Average:
w “fix-mix” 1 1 “fix-mix” - strategy
λ = λ1 + λ 1 = 0.125 > λRW
1 1 1
strategy λ = λ 2 + λ 2 = 0.145... > λRW
1
2 +
2 −
2 +
2 −
TIME DIVERSIFICATION
5 ASSET PRICING AND MEAN REVERSION

There are good reasons to believe that on aggregate dividends are not mean-reverting (see, for example
Campbell and Shiller (1988)). If however the dividend process is random (i.i.d. or a geometric Brownian motion),
then the rational portfolio choice of an expected utility maximizer with CRRA will not be able to generate a mean
reverting return process, because as we have shown above such an investor will choose a fix-mix strategy. To
build models where portfolio choice is consistent with mean reversion, we must look elsewhere or in the words
of Shiller (2000) “In sum, stock prices clearly have a life of their own; they are not simply responding to earnings
or dividends. Nor does it appear that they are determined only by information about future earnings or dividends.
In seeking explanations of stock price movements, we must look elsewhere.”
Investors Sentiment Hypothesis
If the input is a random walk then in a totally rational model the output should follow a random walk as well. To
derive mean reversion we therefore need to break with total rationality in the sense of expected utility
maximization with CRRA. In technical terms, in order to get mean reversion out of a representative agent model,
we need to introduce some reason for mean-reversion in the condition for intertemporal utility optimization (the
Euler equation). So far changes of believes and changes of risk attitudes have been proposed.
For example, Barberis, Shleifer and Vishny (1998) accept all traditional asset pricing assumptions including the
hypothesis that the representative investor applies the Bayesian rule for updating. In their model of investor
sentiment, they assume that earnings follow a random walk but the investor believes the market switches
between two regimes: a ``momentum'' and a ``mean-reversion'' state. If the investor does Bayesian updating
every period then he switches between two moods called overreaction and underreaction. Since there is only one
representative consumer, his marginal rate of substitution determines relative prices. Changes in the aggregated
dividends determine changes in relative prices so that the dividend yield should remains constant. There is no
additional uncertainty in the maximization problem besides of the investor’s expectations. Relative prices change
endogenously as a function of investor’s beliefs to be in a momentum or mean reversion regime determined by a
Markov process.
Barberis, Huang, and Santos (2001) follow a different approach. They assume that not the beliefs determining
agent’s marginal rate of substitution but his risk preferences change over time as initially predicted by Thaler and
Johnson (1990). The intuition is simple: after bad realizations of the dividend process the representative investor
faces losses that motivate her to take less risk while after good realizations she has a cushion of gains that she
will use to take more risks. This so called „house money“ effect should not be confused with the feature of
prospect theory that investors are risk averse when facing gains and risk seeking when facing losses. The former
concerns the change of risk aversion over time while the latter concerns the evaluation of gains and losses of
lotteries at any point in time.
Investor Sentiment Index
One example of how to transfer the theories discussed above into practice are indices measuring the investors’
sentiment. The sentiment index provided by Credit Suisse First Boston for example is based on past price
realizations of the Swiss Market Index (SMI). Compared to a Random Walk with the same mean and standard
deviation as the SMI, at the first glance, the Sentiment Index seems to provide a good fit as the next figure
shows. The problem with this index is the usual causality issue: Do the prices determine the index or the index
influences the prices.
The Sentiment Index of Merrill Lynch is based on the notion that there are two main factors driving market’s
mood: overconfidence and regret. The intuition is simple: the more profit investors make the likely it is that they
become overconfident. On the contrary, the longer are the sequences associated with losses, the more likely it is
that investors would regret their investment choice and start selling their positions. The relative number of runs
associated with capital gains and losses determines which of these effects will dominate and determine future
market movement. One problem with this approach of measuring investors’ sentiment is that the length of the
runs is not determined endogenously by the model.
6 CONCLUSION
The main body of empirical research suggests that most stocks and stock indices follow a mean-reversion
process. An expected utility maximizer with CRRA would then time the market and hold a higher fraction of risky
assets in his portfolio in order to diversify over time. However, his portfolio decision behavior cannot generate
mean reverting prices in an asset pricing model. In contrast, models based on Behavioral Finance can do this
trick.
REFERENCES:
Textbooks:
Chan (2002): Time Series Applications to Finance, Wiley
Campbell and Viceira (2002): Strategic Asset Allocation
Shiller (2000): Irrational Exuberance
Siegel (1994): Stocks for the Long Run, McGraw-Hill
Research Papers:
Barberis, Shleifer and Vishny (1998): A Model of Investor Sentiment, Journal of Financial Economics,
(http://gsbwww.uchicago.edu/fac/nicholas.barberis/research/)
Barberis, Huang, and Santos (2001): Prospect Theory and Asset Pricing, Quarterly Journal of Economics,
(http://gsbwww.uchicago.edu/fac/nicholas.barberis/research/)
Campbell and Shiller (1988): The Dividend Price Ration and Expectations of Future Dividends and Discount
Factors, Review of Economic Studies 58, 495-514.
Samuelson (1991): Long-run Risk Tolerance When Equity Returns Are Mean Reverting: Pseudoparadoxes and
Vindication of “Businessman’s Risk”, in W.C. Brainard, W.D. Nordhaus, and H.W. Wattsleds, “Money,
Macroeconomics, and Economic Policy”, Essays in Honor of James Tobin, MIT Press, pp. 181-222
Thaler and Johnson (1990): “Gambling with the House Money and Trying to Break Even: The Effects of Prior
Outcomes on Risky Choice" Management Science, XXXVI, 643-660.
Chapter 8:
Behavioral Hedge Funds

Behavioral Hedge Funds 101
1 WHAT ARE HEDGE FUNDS?

Despite strong media attention, hedge funds still do not have a precise legal definition. Though, they have a
series of common characteristics that help distinguish them from other investment funds.
Hedge fund managers follow active investment strategies. They attempt to take advantage of a cheaper
access to markets, better analysis of investment opportunities, and regulatory unrestricted conditions of
trade allowing the implementation of flexible investment policies. Using the legal form of a limited
partnership offshore investment company, hedge funds avoid regulation and minimize their tax bills.
On the other hand, hedge funds have limited liquidity. The dates when investors can enter a hedge fund
are strictly specified. Additionally, there is a minimum time an investor is required to keep his money
invested in the hedge funds (lockup period). It is often required that investors give advance notice of
their wish to cash in.
While traditional funds charge usually only a management fee, hedge funds require both a management
fee and an incentive fee. Additionally, many funds include a “high watermark”, i.e. minimum rate of
return over the whole investment period above which the hedge funds is legitimated to change incentive
fees.
To align hedge fund manager’s interests with those of his investors, managers invest a significant
personal stake in their funds as partners.
Since manager’s skills and available investment opportunities are not scalable, size is not necessarily a
factor of success. Some hedge fund strategies have limited ability to absorb large sum of funds and
managers prefer to close them to new subscribers as soon as the assets under management achieve a
certain level.
2 HEDGE FUNDS STRATEGIES
2.1 Long/Short strategies

Long/short strategies involve the combined purchase of undervalued assets and the sale of overvalued securities.
The “fair” value is usually determined using the Arbitrage Pricing Theory (APT). The strategy is successful if the
purchased stock appreciates and/or the sold share looses value. This is the reason why long/short strategies
perform well in bear markets as well as bull markets. If both assets have different betas, the overall portfolio will
have some degree of market exposure. Though, a hedge fund may be neutral to a specific market factor, e.g.
sector index, series of interest rates etc.
2.2 Arbitrage
In the world of hedge funds, “arbitrage” has a different meaning than a risk-free trade that does not require any
cash flow and results in an immediate profit and no future losses as defined in the academic literature. In the
context of hedge funds, an arbitrage is a strategy that profits from differences in prices of correlated financial
instruments usually traded on different markets. The simultaneous purchase and sale of these instruments is
usually not risk-free. The strategy it attempts is to exploit discrepancies in the relative pricing of closely related
securities under the assumption that they will disappear over time.
For example, managers of convertible arbitrage strategies attempt to buy undervalued instruments that are
convertible into equity and then hedge the market risks by going short in the firm`s stock. The fair value is based
on the optional characteristics in the convertible bond and the manager’s assumptions on the future volatility of
the stock. The risk is that the volatility may turn out as lower than expected. One of the main drivers of recent
returns in convertible arbitrage is generated by IPOs.
Another example for an arbitrage strategy is the fixed income arbitrage. This strategy seeks to exploit pricing
anomalies within and across global fixed income markets and their derivatives. Usually, managers take offsetting
long/short positions in similar fixed income securities that are in some way interrelated but the relationship is
temporally distorted by market events, investor preferences, exogenous shocks etc.
2.3 Event driven strategies

Event-driven strategies focus on identifying securities that can benefit from the occurrence of extraordinary
events such as restructurings, takeovers, mergers, liquidations, bankruptcies etc. The strategy profits from the
reaction of security prices, which is typically influenced more by the dynamics of the particular event than by the
general appreciation or depreciation of the firms equity or debt.
Distressed securities strategies for example invest in the debt or equity of companies experiencing financial or
operational difficulties. The strategy exploits the fact that many investors are restricted from trading securities
that are below investment grade. In this sense, the strategy is a very good example of a regulatory arbitrage:
most investors must sell securities of troubled companies because policy restrictions and regulatory constrains do
not allow them to hold securities with very low credit ratings. The result is a pricing discount that reflects in
addition the uncertainty about the outcome of the event.
2.4 Directional strategies

Rather than hedging risks, direction strategies rely on the direction of market movement in order to achieve
profits.
Global macro funds for example do not hedge at all. They profit by exploiting extreme price/value changes by
taking large directional bets reflecting their forecasts on the future market movement. Leverage and derivative
products are used to hold large market exposure and boost returns.
Emerging markets hedge funds focus on equity or fixed income investments in so called emerging markets. The
risks with this strategy are associated with poor accounting, lack of proper legal systems, unsophisticated local
investors. The opportunities are due to undetected, undervalued securities.
Sector hedge funds focus on long and short investments in particular sector, e.g. life sciences (pharmaceuticals,
biotechnology), real estate, and energy.
Dedicated short hedge funds seek to profit from the decline in stocks by taking short positions.
Hedge funds are interesting particularly in a downside market because they provide stable positive returns.
300
250
200
150
100
50
Swiss Market Ind.(r), USD
CSFB Tremont Hedge, USD
0
31.01.1994
31.05.1994
30.09.1994
31.01.1995
31.05.1995
30.09.1995
31.01.1996
31.05.1996
30.09.1996
31.01.1997
31.05.1997
30.09.1997
31.01.1998
31.05.1998
30.09.1998
31.01.1999
31.05.1999
30.09.1999
31.01.2000
31.05.2000
30.09.2000
31.01.2001
31.05.2001
30.09.2001
31.01.2002
31.05.2002
30.09.2002
Another argument for investing in hedge funds is their diversification potential. Every Mean-Variance investor
would invest in hedge funds because of the following figure.
Higher moments of the return distribution of hedge funds strategies can be considered additionally as measures
for investment risk.
3 VALUE AT RISK
Risk is primarily associated with losses. Measures based on the standard deviation of returns are inappropriate
when returns are not normal for at least two reasons. First, standard deviation is a symmetric measure; it
penalizes negative but also positive deviations from the mean. Moreover, deviations do not have a value
expression. With the Value at Risk (VaR) method initially introduced by practitioners, the measure of risk is an
amount qα such that the net worth of the position at some future date T is smaller than qα with probability α .
The number qα is the α -quantile of the return distribution. It is called the “ α % VaR”. The figure bellow
illustrate the VaR concept graphically.
The biggest trouble with the VaR as a risk measure is that it neglects the impact of extreme negative events with
low probability. Additionally, the measure leads to erroneous conclusions when applied to a portfolio of risks.
We will come back to these issues in a later chapter.
4 INVESTMENT STRATEGIES BASED ON BEHAVIORAL FINANCE

Investment strategies based on Behavioral Finance research are primarily motivated by the evidence that
individuals do not update their expectations using the Bayesian rule. On the whole, one could expect that market
returns do not follow the predictions of traditional assets pricing models.
4.1 Underreaction
According to experimental evidence, individuals do not update their beliefs rationally using Bayesian rule.
Moreover, they systematically ignore part of the new evidence. The following example illustrates this. Consider
that there are 100 urns with 1000 balls in each of them. 45 of the urns have 700 black and 300 red balls and
the rest of the urns have 300 black and 700 red balls. The question on the probability that a randomly selected
urn has more black balls is usually answered correctly (45%). In the next question, individuals are provided with
additional information that from 12 balls drawn from the randomly selected urn 8 balls are black and 4 are red.
To answer the question the same question as before including the new information, individuals have to apply
Bayesian rule.
p ( s ) p (*/ s )
p ( s / *) = where p( s ) = 45% and p(r ) = 55%
p ( s ) p (*/ s ) + p (r ) p (*/ r )
⎛12 ⎞
After some calculations p(*/ s ) = ⎜ ⎟ ( 0.7 ) ( 0.3) and transformations
8 4
⎝8⎠
⎛ 12 ⎞
⎜ ⎟ ( 0.3) ( 0.7 )
8 4
4
1 55 ⎝ 8 ⎠ 11 ⎛ 0.3 ⎞
p ( s / *) = we get p(r ) p (*/ r ) = = ⎜ ⎟ = 0.027 and
p(r ) p (*/ r ) 45 ⎛ 12 ⎞ 9 ⎝ 0.7 ⎠
1+ ⎜ ⎟ ( 0.7 ) ( 0.3)
8 4
p( s ) p (*/ s ) ⎝8⎠
1
p ( s / *) ≈ ≈ 96.04% .
1.027
Though, the typical answers are significantly lower (45% and 67%) indicating that on average people seam to
underreact to new information.
The implications for investment strategies are based on studies of market reaction upon earnings
announcements. Positive (negative) earnings surprises are associated with positive (negative) subsequent returns
as illustrated in the figure bellow. In the short run returns are predictable which is known as the post-earnings-
surprise announcement drift.
The next figure shows the size of the cumulative returns after a quarter for growth and value stocks as a function
of the size of the surprise.
Though, from a cross-sectional perspective, the reaction to earnings surprises is much stronger for growth stocks
than for value stocks, i.e. the curve is still increasing but its slope is higher compared to the response function of
value stocks. Additionally, the impact of earnings surprises on subsequent returns is limited to about -20% by
negative surprises respective to about 13% by positive surprises and there is no return overshooting
(overreaction).
Investment strategies based on the “post earnings announcement drift” are successful implemented in practice.
Fuller and Thaler Asset Management for example invest in stocks with positive standard unexpected earnings
(SUE)45 and sell stocks with negative SUE.
45
The difference between actual earnings and the forecast is scaled by the historical standard deviation of the forecast
errors. The forecast is a univariate first-order autoregressive model in seasonal differences.
1000%
800%
600%
400%
200%
0%
-200%
94
97
00
03
D 2
Se 2
93
D 5
Se 5
96
D 8
Se 8
99
D 1
Se 1
02
4
-9
-9
-9
-0
-9
-9
-9
-0
-0
n-
n-
n-
n-
p-
p-
p-
p-
ec
ec
ec
ec
ar
ar
ar
ar
ar
Ju
Ju
Ju
Ju
M
M
Small/Mid-Cap Growth (Net-of-Fees) Russell 2500 Growth
4.2 Momentum and Reversal

While most of the evidence indicates an underreaction to earnings announcements generating momentum in
stock prices, other evidence search for the sources of long-run mean reversion as first documented by DeBondt
and Thaler (1985). They showed that firms with prior extreme negative stock price performance (“losers”) seem
to outperfom those with prior extreme positive performance (“winners”) as if a part of the prior stock price
movement constitutes a deviation from fundamental value. There is still no consensus explanation for these
results. Though, what is clear is that even if overreaction exists, it appears to be to complicated to be
characterized as a simple function of recent earnings changes. In the words of Bernard (1985), a predictable
reversal of prior extreme price movements is consistent with a wide variety of market inefficiencies, including
random deviations of prices from fundamental values, and need not to be cause by any systematic
misinterpretation of earnings information.
Thus, the question if stock prices follow a momentum or a reversion is empirical. The largest part of the empirical
evidence indicates that momentum strategies (buying winners and selling the losers) are most successful over
medium formation and holding periods (3-12 months) and reversion strategies deliver highest returns over long
periods (3-5 years).
Reversal strategies can be defined not only over price movements but also over ratios including accounting
measures such as the price-to-earnings (P/E) ratios. Since these rations are also used to characterize socks as
“value” or “growth”, we can observe momentum and reversion in the returns of value stocks.
4.3 Strategies based on co-moving assets

To predict future asset price movements, one can use event studies indicating the amount of information
reflected in current prices, asset pricing models defining a fundamental value as the equilibrium in the long run
or just empirical evidence on a systematic price behavior without being concerned with any particular reasons.
Another possibility to derive profitable investment strategies is to go to the basics of asset pricing and bet that at
least in the long run the Law of One Price must hold. In the case of Royal Dutch and Shell Transportation for
example the market needed almost four years to recognize this.
4.4 Strategies exploiting probability weighting

Kahneman and Tversky (1979, 1992) documented that people give too much weight to small probabilities. To
illustrate this preference consider the following example. In an experiment, people are asked to decide between
two lotteries. They can either bet on a 2 to 5 shot and receive 40% profit or bet on a 20 to 1 shot and receive
⎛5 2 9⎞
200% profit in the case of a win. Although the expected return of the first lottery ⎜ ⋅ 1.4 + = ⎟ is higher,
⎝7 7 7⎠
⎛ 1 20 22 ⎞
the people prefer the latter ⎜ ⋅ 2 + = ⎟.
⎝ 21 21 21 ⎠
Hodges, Tompkins, and W. T. Ziemba (2003) found that this result can be also observed by horse racetracks.
Applying the idea to index options on S&P500 and FTSE 100 futures they found that the favorite deep in-the-
money calls have positive expected value just like the favorites at the racetrack and there are only small profits
on the deepest in-the-money puts. The losses are more and more as the puts gets more out-of-the-money just
like in the racetrack.
To exploit these regularities, the authors build an investment strategy that sells overpriced puts, hedge them
shorting futures and use the revenues of the put sale to buy calls. Its performance is presented in the figure
bellow.
5 CONCLUSIONS
Investing is much more complicated as suggested by traditional finance. It is absolutely possible to achieve a
better performance than the market since there are systematical deviations from the so called “fundamental
value”. Some of these deviations are due to psychological biases extensively studied by experimental studies.
A good portfolio management has to stand up to the continuously changing process of existing anomalies.
Moreover, it must be based on a solid empirical analysis in order to recognize profitable market anomalies on
time.
REFERENCES:
Bernard (1985): Stock Price Reaction to Earnings Announcements, in Thaler: Advances in Behavioral Finance,
Sage.
Hodges and Ziemba (2003): The favorite-longshot bias in S&P500 futures options: the return to bets and the
cost of insurance, working paper, University of British Columbia
DeBondt and Thaler (1985): Does the Stock Market Overreact?, Journal of Finance 40 (3): 793-806
Froot and Deborah (1998): How are Stock Prices Affected by the Location of Trade, Journal of Financial
Economics (53).
Lhabitant (2002): Hedge Funds, Wiley.
Skinner and Sloan (1999): Earnings Surprises, Growth Expectations, and Stock Returns, University of Michigan
Working Paper
Chapter 9:
Choosing a Portfolio on a GARCH

Process: Risk Management
Choosing a Portfolio on a GARCH Process: Risk Management 112
1 INTRODUCTION
There is much ambiguity about risk nowadays. For micro-economists, risk usually has neutral meaning. In other
words, risk is associated with the uncertain (positive or negative) outcomes of decisions. For financial
economists, risk is related to something negative such as the possibility of losses. It is not surprising that the
question of how to measure risk is at least as unclear as the definition of risk. In this chapter, we will present
various risk measures and discuss their adequacy using axioms required for meaningful comparison between
risky outcomes to be made. Additionally, we will discuss the question if it is possible to generate processes
endogenously in order to explain phenomena such as the stochastic volatility.
2 EVIDENCE
Bachelier (1900) tried to convince his fellow students that stock prices are random by presenting them a set of
artificially generated truly random prices and prices from the DJIA. Hardly anyone could name the DJIA-prices.
Looking however at the returns of these prices the true returns are easily identified because they are much more
irregular than those from a random walk. True returns show volatility that is stochastic, clustered and
asymmetric. Moreover extreme returns associated with crashes are more frequent. These facts raise the issue
how risk should be measured. Investors using the variance of asset returns as the risk measure face various
difficulties interpreting the results of their analysis. To begin with we show that the volatility of real returns (e.g.
DJIA) is not constant, as required by a White noise process, but stochastic.
S t o c h a s t i c V o l a t il i t y
D JIA -re tu rn s R W - r e t u rn s
Second, volatility is asymmetric, i.e. on average, the volatility of asset returns increases in recessions and
decreases in boom phases. Intuitively, this result can be explained with the fact that investors become nervous in
down markets and place extreme trades. Hence, sharp decreases of asset prices are followed by sharp increases
in the volatility of returns. The effect is not country specific46.
46
Note however the exception that is given by Italy in this respect.
MSCI STOCKS INDEX
Sample Mean Mean Volatility Volatility Corr Corr
Period up down Up Down Corr up Down
World 1970.1‐2003.10 21.00% ‐15.00% 14.35 17.86 0.89 0.86 0.9
USA 1970.1‐2003.10 24.00% ‐17.00% 15.65 17.61 1 1 1
GER 1970.1‐2003.10 28.00% ‐23.00% 19.63 22.62 0.51 0.35 0.6
UK 1970.1‐2003.10 29.00% ‐21.00% 22.3 23.3 0.61 0.47 0.66
CH 1970.1‐2003.10 25.52% ‐19.90% 17.9 21.15 0.64 0.5 0.7
CAN 1970.1‐2003.10 25.73% ‐19.11% 17.41 19.66 0.71 0.65 0.72
IT 1970.1‐2003.10 35.89% ‐29.41% 25.45 23.84 0.35 0.2 0.41
FR 1970.1‐2003.10 32.94% ‐26.00% 21.71 23.45 0.54 0.4 0.6
AUS 1970.1‐2003.10 28.96% ‐23.53% 20.42 23.74 0.53 0.33 0.62
JAP 1970.1‐2003.10 27.45% ‐22.05% 19.42 20.4 0.36 0.24 0.37
Even if one uses the implied volatility47, it also increases (decreases) with decreasing (increasing) returns as
shown in the figure bellow.
The third problem using the variance (volatility) as risk measure is that return distributions have “fat tails”, i.e.
there are too many extreme observations compared to the case of normally distributed returns. Examples of
extreme events are bubbles (New Economy bubble, Sought See bubble etc).
Time series with stochastic volatility can be parameterized using so called Generalized Autoregressive
Conditional Heteroscedastic (GARCH) models.
3 COHERENT RISK MEASURES

So far we introduced two risk measures: the variance and Value at Risk, VaR. Both measures have weaknesses.
The variance does not distinguish between positive and negative deviations from the mean. On the other hand,
using value at risk, we will accept positions as safe when in say, less than 1% of the cases, we get in trouble.
The value at risk has some dangers. When used we will accept positions that with high probability are good but
that with a very small probability lead to bankruptcy. Inside a financial institution, the use of value at risk is more
problematic. Additionally, adding up the value at risk of different units or traders according to the portfolio
theory may lead to erroneous conclusions with respect to the risk exposure of the firm. This idea is illustrated in
the figure bellow.
47
Note that the implied volatility is usually lower than the realized volatility.
A coherent risk measure satisfies a prescribed set of axioms required for meaningful comparison between risky
cash flows.
• The sub-additivity describes a diversification effect: ρ ( X + Y ) ≤ ρ ( X ) + ρ (Y )
• The homogeneity expresses a scaling property: ρ (tX ) = t ρ ( X )
• The monotonicity connects state dominance with risk reduction: ρ ( X ) ≥ ρ (Y ), if X ≤ Y
• The risk-free condition captures the reduction of risk by investing in the risk-free payoff r :
ρ ( X + rn) = ρ ( X ) − n
The only one measure that satisfies these axioms is the Conditional Value at Risk defined as
CVaR= E(x x ≤ VaR(α )) (Theorem by Artzner, Delbaen, Eber, and Heath (1999)). The CVaR is thus the
conditional expectation of the random variable x below the VaR-level. The main advantages of this measure
compared to VaR are that it takes into account the size of losses and does not distort the risk exposure at
portfolio level.
Computing the tangential portfolios with CVaR or with Variance as risk measure we see that on our data the
CVaR-portfolio is much better diversified than the portfolio based on variance.
3M
3M
DJIA 81-03 CVaR-Portfolio 95% Target 1.002 ALCOA DJIA 81-03 Mean-Variance Portfolio Rf=1.002 ALCOA
ALTRIA GP.
ALTRIA GP.
AMERICAN EXPRESS AMERICAN EXPRESS
AT & T AT & T
BOEING BOEING
CATERPILLAR CATERPILLAR
CITIGROUP CITIGROUP
COCA COLA COCA COLA
DU PONT E I DE NEMOURS DU PONT E I DE NEMOURS
EASTMAN KODAK EASTMAN KODAK
EXXON MOBIL EXXON MOBIL
GENERAL ELECTRIC GENERAL ELECTRIC
GENERAL MOTORS GENERAL MOTORS
HEWLETT - PACKARD HEWLETT - PACKARD
HOME DEPOT HOME DEPOT
HONEYWELL INTL. HONEYWELL INTL.
INTEL INTEL
INTL.BUS.MACH. INTL.BUS.MACH.
INTL.PAPER INTL.PAPER
JOHNSON & JOHNSON JOHNSON & JOHNSON
JP MORGAN CHASE &.CO. JP MORGAN CHASE &.CO.
MCDONALDS MCDONALDS
MERCK &.CO. MERCK &.CO.
MICROSOFT MICROSOFT
PROCTER & GAMBLE PROCTER & GAMBLE
SBC COMMUNICATIONS SBC COMMUNICATIONS
UNITED TECHNOLOGIES UNITED TECHNOLOGIES
WAL MART STORES WAL MART STORES
WALT DISNEY WALT DISNEY
Supposing normally distributed returns, these two portfolios would need to coincide because in that case the
CVaR is a linear function of mean and variance. Hence, maximizing for example µλ − γ CVaRλ and noting that
with normally distributed returns CVaRλ = αµλ + βσ λ2 we are actually maximizing a mean-variance utility.
However, the global minimum risk portfolio can be different to those based on CVaR directly.
Using CVaR (or Expected Shortfall) as a risk measure places optimal portfolios differently on the risk-return
frontier, e.g. the global MVP is not efficient anymore.
4 CRASH MEASURES
In addition to measuring the probability of realizing a certain loss investors may be interested in the probability
for market crashes. The difference between the long term (30-years) Treasury bond yield and the earnings-to-
price ratio of the firms is a crash indicator used by the FED. The next figure shows if this indicator predicts S&P
500 crashes.
5 ASSET PRICING MODELS

Asset pricing models based on the representative consumer paradigm are not able to generate stochastic
volatility, bubbles and crashes48. Studying the interaction of agents (or strategies) on financial markets is
therefore proposed. Indeed evolutionary finance models of the first generation (Arthur et al (1997), LeBaron et al
(1999), Brock, Hommes (1999), Lux (1998)) are able to generate GARCH effects. The following chart is taken
from Lux (2002) in which there are three types of agents, fundamentalists, trend followers and outsiders. The
fundamentalists buy (sell) if the current price of a risky asset is below (above) the fundamental value. The trend
followers buy (sell) if the prices have gone up (down). The outsiders get attracted to the market when prices
have gone up severely. They enter the market as trend followers. The relative proportion of these trader types is
determined by the success of the strategies. We will analyze these types of models in more detail the next
chapter. The following figure showing stochastic volatility is taken from Lux and Schornstein (2004):
48
Unless on is willing to postulate that the representative consumer has beliefs or risk aversion that is changing
stochastically over time.
6 SUMMARY
Empirical studies show that the volatility of stock returns is not constant but stochastic. Moreover, asset returns
have “fat tails”. For these among other reasons, the variance cannot be used as an appropriate risk measure.
The Value at Risk as an alternative measure has also disadvantages particular relevant in portfolio context. In
this case, the Conditional Value at Risk is the better risk measure because it considers the size of losses.
From investor’s point of view, crash measures provide additional information on the investment risk. Crashes are
important because they increase the volatility. Models with heterogeneous agents (strategies) are able to explain
stochastic volatility.
REFERENCES:
Arthur, Holland, LeBaron, Palmer, Taylor (1997) “Asset Pricing under Endogenous Expectations in an Artificial
Stock Market” in The Economy as an Evolving Complex System II, Arthur, Durlauf and Lane (eds), pp. 15-44,
Addison Wesley.
Artzner, Delbaen, Eber and Heath: Coherent Measures of Risk, Math. Finance 9 (1999), no. 3, 203-228
(http://www.math.ethz.ch/~delbaen/)
Arzner, Delbaen,Eber and Heath (1997):Thinking Coherently, Risk Vol. 10/ No 11, November 1997
Bachelier (1900): Théorie de la Spéculation", Annales de l'Ecole normale superiure (trans. Random Character of
Stock Market Prices).
Brock and Hommes (1998): Heterogeneous Beliefs and Routes to Chaos in a Simple Asset Pricing Model, Journal
of Economic Dynamics and Control ,Vol. 22, pp. 1235 – 274.
LeBaron, Arthur and Palmer (1999): Time Series Properties of an Artificial Stock Market, Journal of Economic
Dynamics and Control, Vol. 23, pp. 1487 – 516.
Lux (1998): The Socio-Economic Dynamics of Speculative Markets: Interacting Agents, Chaos, and the Fat Tails
of Return Distributions, Journal of Economic Behavior and Organization, Vol. 33, pp. 143 – 65.
Lux and Schornstein (2004): Genetic Learning as an Explanation of Stylized Facts of Foreign Exchange Markets,
forthcoming special issue on Evolutionary Finance of Journal of Mathematical Economics, Hens and Schenk-
Hoppe (eds).
Chapter 10:
Evolutionary Portfolio Theory:

Survival of the Fittest on Wall
Street
Evolutionary Portfolio Theory: Survival of the Fittest on Wall Street 119
“Whenever a theory appears to you as the only possible one, take this as a sign that you have neither
understood the theory nor the problem which it was intended to solve.”
(Karl Popper)
1 INTRODUCTION
In this final chapter of the lecture we provide leading edge research that tries to give a synthesis of the
traditional and the behavioral finance point of view. We suggest a model of portfolio selection and asset pricing
that is based on the idea of heterogeneity of strategies. The strategies considered may result from completely
rational investors maximizing some inter-temporal expected utility, from simple heuristics, from behavioral
finance or from principal agent models describing incentive problems in institutions. Actually we do not consider
how strategies are generated but we ask how they perform once they are in the market. The first observation is
that there is nothing like “the best strategy” because the performance of any strategy will depend on all
strategies that are in the market. Rationality therefore is to be seen as conditional on the market ecology. Our
theory will be descriptive and normative as well, answering which set of strategies one would expect to be in a
market and how to find the best response to any such market. Whereas traditional finance – based on
optimization and equilibrium – has borrowed a lot from classical mechanics, behavioral finance has borrowed
from psychology. We believe that it is time in finance to borrow from biology, in particular from evolutionary
reasoning. The principles of natural selection and mutation as formulated by Charles Darwin, will be two fruitful
analogies that lead us surprisingly far in this chapter.
2 THE ECOLOGY OF THE MARKET

Before developing a theory of evolution of strategies it is most useful to carefully observe which strategies can be
found in a market. In biology Charles Darwin`s discovery of the principles of evolution would not have been
possible without a careful observation and classification of the plants and animals in the world. This major step
was provided to Darwin by Carl von Linee. We recall from the first chapter of this course that a financial market
is first of all characterized by an enormous heterogeneity of strategies that individual investors and institutions
apply. There is nothing like the single representative agent maximizing some artificial utility function. A
classification of strategies that will turn out to be useful is the following one:
• Staying Outside
o This is the stock market strategy that is still followed by the majority of people
o When severe upwards price movements happen then some people may give up this strategy
• News Trading Strategies
o This is the strategy the traditional theory would recommend:

o Try to guess the future returns of assets by gathering and processing all relevant news
• Value Strategies
o Form your portfolio according to value ratios like:
o Price/Earnings, Cash-Flow/Price, Book / Market, Dividend-Yield etc
• Strategies based on Technical Analysis
o Try to guess the momentum of prices by looking at moving averages of various length
o Try to guess the reversal of prices by looking at standard deviations from moving averages
• Experimentation
o Create new strategies by “backtesting”: Check what would have run nicely on realized data
o Be aware of “data snooping”!49
3 SURVIVAL OF THE FITTEST ON WALL STREET

The first point Charles Darwin made was to argue that for the evolution of the ecology we need not model the
interaction of the individuals but the interaction of the strategies played by individuals, i.e. the interaction of the
species.
From an evolutionary point of view the fate of a single individual animal counts nothing as compared to the
relative size of the population of its species. Hence, we suggest stratifying the financial markets not in terms of
individuals but in terms of strategies as listed above. For the market it is totally irrelevant who is investing
according to, say, P/E-ratios. The only thing that matters is how much money is invested according to such a
criterion. In biology strategies fight for resources like food. In finance strategies fight for market capital. In an
evolutionary model there are two forces at play: The selection force reducing the set of strategies and the
mutation force enriching it. You see the selection force in financial markets when you realize that every loss
some strategy made by buying at high prices and selling at low prices must have generated an equally sized gain
for a set of counter-parties. The mutation force is clearly seen if you look back a bit in history and observe that
previously popular strategies like trying to corner a market are no longer so frequent while new strategies like
those followed by hedge funds have emerged. Ultimately, what the evolutionary finance model tries to explain is
how the ecology of the market evolves over time, i.e. how the distribution of wealth across strategies changes
over time. The following chart shows the evolution of wealth for the hedge fund strategies. This sector of the
market is remarkable for our theory because this is the first sector in which data have been collected in the way
that suits our theory: Wealth stratified by strategies: Figure 1 shows for every year how much wealth was
49
On every given set of returns one can find strategies that would have done extremely well ex post. When backtesting a
strategy one has to make sure that one only uses the information that was known at the time. Moreover, when optimizing
parameters of a model on given data one should keep track of the many alternatives from which one has chosen the best
set of parameters. See White (2000) for precise statements on this line.
invested according to which HF-strategy: Global Macro, Long/Short-equity, Managed Futures ,… We see that
overall wealth invested in HFs has grown but some strategies like Long/Short have profited more than average
while others like Global Macro lost funds. Since the HF sector only accounts for 4-6% of wealth overall, this give
however only an example of what one would need to do in order to found the evolutionary approach empirically.
This empirical work is currently in progress. In this lecture we will outline the theoretical foundation for it. Note
that the evolutionary model is based on observables like strategies. We claim that for the vast majority of capital
the strategy according to which it is managed is in principle observable. This is because most capital is managed
by delegation and in this process the principal (the investor) wants the agents (the wealth manager) to commit
to some strategy in order to simplify monitoring and verifiability problems. Indeed many banks compete for
investors` money by advertising strategies they want to commit to. Moreover, the evolutionary approach takes a
“flow of funds” perspective. It claims that understanding according to which principles wealth flows into
strategies is the key to understand where asset prices are going. Recently, some practitioners have also proposed
this view (see for example the newsletter of UBS-Warburg (2002)). UBS-Warburg claims, for example that the
daily level of the SOX-semiconductor index can be predicted by a cumulative flow index capturing the flow into
this sector. We are more skeptical that the flows approach is useful for daily data. Within one day or even within
minutes prices can change drastically on the occurrence of strong news without any high volume of trade. For
the short run day to day changes we buy the news story given by the anticipation principle while in the longer
run – on monthly data for example, we got good predictions from the flows approach. In the next section we will
make these vague ideas precise.
T ot al A s se t s H i s t or y
D e c e mb e r 1 9 9 3 - J u n e 2 0 0 2 ( U S D i n M i o . )
14 0 '0 0 0
Con ver t bi el Ar bit r age
12 0 '0 0 0
Dedicat ed Shor t Bias
M an aged Fut ur es
10 0 '0 0 0
Emer gin g M ar ket s
8 0 '0 0 0
Fixed I n come Ar bit r age
Even t Dr vi en
6 0 '0 0 0
Equit y M ar ket Neut r al
4 0 '0 0 0
GlobalM acr o
2 0 '0 0 0 Lon g/ Shor t Equit y
M ult -i St r at egy
0
19 9 3 19 9 4 19 9 5 19 9 6 19 9 7 19 9 8 19 9 9 2000 2001 2 0 0 2 Q2
4 THE EVOLUTIONARY PORTFOLIO MODEL

We base our evolutionary model on the Lucas (1978) asset pricing model. This has two advantages. First, the
Lucas model is a model that makes sense from an economic point of view. The time uncertainty structure is
simple enough to penetrate; budget equations and numeraire good properties are satisfied. Moreover, displaying
our new ideas in this traditional model will help to asses the differences of the two. Recall from the traditional
finance model that the Lucas (1978) model is defined over discrete time that goes to infinity, i.e. t = 0,1,2,... .
The information structure is given by a finite set of realized states ωt ∈ Ωt in each t. The uncertainty with
respect to information decreases with the time since at every t only one state is realized. The path of state
realizations over time is denoted by the vector ωt = (ω 0 , ω1 ,..., ωt ) . The time-uncertainty can be described
graphically by an event tree consisting of an initial date ( t = 0 ) and Ωt states of nature at the next date.
t=0 t=1 t=2
The probability measure determining the occurrence of the states is denoted by P. Note that P is defined over the
set of paths ω t . We called P the physical measure since it is exogenously given by nature. We used P to
model the exogenous dividends process. In the model the payoffs are determined by the dividend payments and
capital gains in every period. Let i = 1,..., I denote the strategies. There are k = 1,..., K long-lived assets in unit
supply that enable wealth transfers over different periods of time. k = 0 is the consumption good. This good is
perishable, i.e. it cannot be used to transfer wealth over time. All assets pay off in terms of good k=0. This clear
distinction between means to transfer wealth over time and means to consume is taken from Lucas (1978). Hens
and Schenk-Hoppe (2004) show that this assumption is essential when taking the evolutionary perspective: If the
consumption good were non-perishable and hence could also be used to transfer wealth over time then every
strategy that tries to save using the consumption good will be driven out of the market by the strategy that does
not use the consumption good to transfer wealth over time but otherwise uses the same investment strategy. As
in the traditional model, we use the consumption good as the numeraire of the system. Note that one of the
long-lived assets could be a bond paying a risk free dividend. Bonds may however be risky in terms of their resale
values, i.e. in terms of the prices qtk+1 (ω t +1 ) .
The evolutionary portfolio model we propose studies the evolution of wealth over time as a random dynamical
system (RDS). A dynamical system is an iterative mapping on a compact set that describes how a particle (in our
model the relative wealth among strategies) moves from one position to the next. In a random dynamical system
this movement is not deterministic but subject to random shocks. An example of a random dynamical system is
the famous Brownian motion describing the stochastic movement of a particle in a fluid. Note that in contrast to
traditional finance we assume that in order to “predict” the next position of the particle one is not allowed to
know the realizations of future data of the system. I.e. we do not allow for perfect foresight of prices! Yet it may
happen that under certain nice conditions the limit position of the RDS could also be described “as if” agents
were maximizing expected utility under perfect foresight. In particular some of our results give an evolutionary
justification of the “rational benchmark-theorem” according to which investing in proportion to expected relative
dividends is the unique perfect foresight equilibrium of some traditional economy with logarithmic expected
utility functions.
As in the traditional model, we start from the fundamental equation of wealth dynamics:
⎧⎪ K ⎡ D k (ω t +1 ) + qtk+1 (ω t +1 ) ⎤ i ,k t ⎫⎪ i t
wti+1 (ω t +1 ) = ⎨ ∑ ⎢ t +1 ⎥λt (ω ) ⎬ wt (ω )
⎪⎩ k =1 ⎣ qtk (ω t ) ⎦ ⎪⎭
From one period to the next the wealth of any strategy i is multiplied with the gross return it has generated by its
portfolio strategy λti ,k (ω t ) executed in the previous period50. Returns come from two sources, dividends and
capital gains. In every period asset prices are determined by the equilibrium between demand and supply within
that period. Since Dtk+1 (ω t +1 ) was meant to denote the total dividends of asset k, we have normalized the
I i ,k
supply to one, and as before equilibrium prices are given by: qtk (ω t ) = ∑ λ (ω t )wti (ω t ) . Note that wealth,
i =1 t
dividends and prices may all be subject to some growth rate like the rate at which nominal GDP is growing.
However, for analyzing what is the best way of splitting your wealth among the long-lived assets we can restrict
attention to relative wealth, relative dividends and relative prices getting rid of the absolute growth rates.
The fundamental equation of wealth evolution written in relative terms is given by:
⎪⎧ K ⎡ d k (ω ) + qˆ k (ω t +1 ) ⎤ ⎪⎫
rti+1 (ω t +1 ) = λt0+1 (ω t +1 ) ⎨∑ ⎢ t +1 t +1 k t t +1 ⎥λ
i ,k
(ω t ) ⎬ rti (ω t ) ,
qˆt (ω )
t
⎪⎩ k =1 ⎣ ⎦ ⎪⎭
Pt k (ω t ) Dtk+1 (ω t +1 ) wti (ω t )
where qˆtk (ω t ) = , d k
(ω ) = , r i
(ω t
) = .
∑ wti (ω t ) t +1 t +1 ∑ Dtj+1 (ωt +1 ) t
i j
∑ wti (ω t ) i
∑D t
k
(ω t )
Here we did use the identity ∑ w (ω ) =
i
i
t
t k
λt0 (ω t )
which determines aggregate wealth by the ratio of
inflows (aggregate dividends) to outflows of wealth (the consumption rate). Details of the computation can be
found on one of the ppt-slides of this chapter. In deriving the fundamental equation of wealth evolution written
in relative terms we did however make one important assumption: All strategies have the same consumption
rate λt0 (ω t ) . The justification of this assumption is that we are searching for the best allocation of wealth
among the long-lived assets. It is clear that among two strategies with otherwise equal allocation of wealth
50
Note that up to now we did not make any assumption on how the portfolio strategy λti ,k (ω t ) executed at ω t is
determined!
among long-lived assets the one with a smaller consumption rate will eventually dominate. Written I relative
terms the asset pricing equation becomes really nice now. The relative asset prices is simply the convex
I i ,k
combination of the strategies in the market: qˆ (ω ) = ∑ λ (ω t ) rti (ω t ) . In terms of evolutionary game theory
k
t
t
i =1 t
this means that strategies are “playing the field” i.e. one strategy has an impact on any other strategy only via
the average of the strategies. Careful reading of the wealth flow equation reveals that the flow of wealth is not
already described by a dynamical system. We would like to have a mapping from the relative wealth in one
period rt (ω t ) = ( rt1 (ω t ),..., rt I (ω t )) to the relative wealth in the next
period rt +1 (ω t +1 ) = ( rt1+1 (ω t +1 ),..., rt I+1 (ω t +1 )) . But rti+1 (ω t +1 ) also enters on the right hand side because capital
gains do depend on the strategies played. Fortunately the dependence of capital gains of strategies wealth is
linear so that we can solve the wealth flow equation in the RDS-form. In the resulting equation, the inverse
matrix captures the capital gains. Note that the KxI matrix Λ t +1 (ω t +1 ) is the matrix of portfolio strategies in
which portfolio strategy λti+1 (ω t +1 ) is one column.
Note that this equation is a first order stochastic difference equation describing a mapping from the simplex ∆
into itself.
Before we start analyzing this RDS let us summarize the assumptions made so far. We used Lucas (1978)
distinction between means to transfer wealth over time and means to consume, we assumed that all strategies
have the same consumption rate and by writing the Lucas model as a RDS we assumed that strategies are not
allowed to use information that is not available at the time when executed. Every of these assumptions seems
well justified to us. Note that in contrast to many other economic models generating dynamics we did not make
any simplifying assumptions like linear demand functions, usually justified by first order Taylor approximations.
One has to be very careful when making these seemingly innocuous assumptions. When iterating a dynamical
system terms of higher order may accumulate so that the real dynamics of the system looks quite different to the
dynamics of the system based on the simplifying assumptions.
5 A SIMULATION ANALYSIS
To get some first insight into the behavior of the RDS developed above it is useful to do a simulation analysis.
Since the model has been formulated as a dynamical system it can be iterated period by period. For the sake of
simplicity, in this section we assume that all strategies are constant over time fix-mix strategies: λti (ω t ) = λ i .
We have seen in an earlier chapter that on an i.i.d. return process those strategies are the optimal portfolio rules
of expected utility maximizers with CRRA. Differences in the asset allocation could be attributed to differences in
beliefs and risk aversions. The following figure displays the exogenous dividends process that we use in our
simulations:
16
15
14
log of dividends
13
12
11
10
1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001
years
These are the total dividends Dtk (ω t ) , paid out from 1981 to 2001 by the stocks in the DJIA. First one observes
that there is a growth rate in all assets` dividends. On a logarithmic scale the dividends Dtk (ω t ) are increasing
with linear rate over time. We take the dividends from 1981 to 2001 to generate an i.i.d. process for the
exogenous dividends of our model. To do so, we identify each year with one state ωt ∈ Ω t = {1,2,...,21} and
then randomly choose one of the states every period. “Randomly” meaning that in every period we select one of
the states from the same probability distribution with equally likely states. Think of this process as being
generated by repeatedly rolling a fair dice with 21 sides. Even though the exogenous random process is i.i.d. the
return process may however not be i.i.d. because in the returns the prices also enter and those are generated
endogenously. The next figure shows a typical run of a simulation with two strategies, a strategy generated from
mean-variance analysis (blue line) and the naïve diversification rule of fixing equal weights in the portfolio (red
line). Even though initially the wealth of the mean-variance rule accounts for 90% of the market wealth after a
few iterations the situation has reversed and the 1/n rule has 90% of the market wealth.
This wealth dynamics is reflected in asset prices:
Asset prices initially reflect the mean-variance rule but rapidly converge to the 1/n rule. A careful reader may
wonder how we did compute the mean-variance rule. After all this is a return based rule and returns are
endogenous. In our simulation we have even given the mean-variance rule perfect foresight and rational
expectations. I.e. we allowed it to know in advance which prices will ultimately result in a competition with the
naïve diversification rule51. Note that the mean-variance principle is a mapping from expected returns to asset
allocations. Hence, one can also flip the mean-variance principle upside down and ask which return expectations
one would need in order to generate a portfolio rule that is doing fine in competition with 1/n and the expected
relative dividends rule that is known to be the unique competitive equilibrium with perfect foresight.
51
We will later present analytical results showing that indeed 1/n will always win against mean-variance.
MV return expectations to obtain...
1
lambda star portfolio
1/n portfolio
true returns at 1/n
0.8
0.6
0.4
0.2
-0.2
-0.4
.. -0.6
The figure shows that these expectations are far off from the observed returns when mean-variance competes
with 1/n. That is to say, adding some degree of sophistication to the mean-variance rule by adding learning rules
will also not help in competition with 1/n. Summarizing, the first simulation result shows that seemingly rational
portfolio rules like mean-variance can do quite poorly against seemingly irrational rules like 1/n – a result that
was first pointed at by DeLong, Shleifer, Summers and Waldman (1990).
The next simulations show that the fight for market capital, i.e. the endogenous changes in the wealth
distribution can generate endogenous uncertainty that exceeds the uncertainty given by the exogenous dividends
process. As an effect, asset prices shows “excess volatility” which was first pointed at by Shiller (1981). Consider
a situation with two strategies, the 1/n strategy and a mean-risk strategy in which the risk measure now is the
CVaR, explained above.
Wealth
0.7
0.6
0.5
Mean-Variance
0.4 Mean-CVaR
Expected Dividends
Growth Optimal
Equal Weights
0.3
Prospect Theory
CAPM
0.2
0.1
0
1 45 89 133 177 221 265 309 353 397 441 485 529 573 617 661 705 749 793 837 881 925 969
Period
In this setting the wealth evolution never settles. As the 1/n rule gains wealth it turns prices to its disadvantage
so that the CVaR can grow. But the same happens to the CVaR rule. As it grows it turns the prices to the
advantage of the 1/n rule. As an effect asset prices show volatility in excess of the volatility of the exogenous
dividends: The blue line above shows the prices of some asset the red line below shows the fluctuations in its
dividends.
0.3
0.25
0.2
0.15
0.1
0.05
0
1 16 31 46 61 76 91 106 121 136 151 166 181 196 211 226 241 256 271 286 301 316 331 346 361 376 391 406 421 436 451 466 481 496
We conclude our simulation analysis by including the expected relative dividends portfolio
λ *,k = E P d k ,k = 1,2,...,K in the market selection process. As a result the process always converges to the
situation in which all the market wealth is concentrated at the strategy λ*. The next figure shows a typical run of
the simulation. The final figure of this section shows the average run with a 1 standard-deviation band for each
strategy. Note that these bands do not widen but get tighter as time goes on, indicating that the process
converges. In these figures we did display the expected dividends rule λ* in competition with two mean-variance
rules (one based on 1/n prices and one based on λ* prices), the growth optimal rule maximizing the expected
logarithm of returns based on1/n prices, the equal weights or illusionary diversification portfolio 1/n, a portfolio
based on prospect theory. In the course we made available the simulation program with which the competition
of any set of simple strategies can be studied. We recommend running simulations on your own in order to get
some intuition for the market selection process. Our conjecture from these simulations is: Starting from any initial
distribution of wealth, on P-almost52 all paths the market selection process converges to λ*, if the dividend
process d is i.i.d..
52
That is to say on all paths except for those that are highly unlikely, i.e. those that have measure zero according to the
probability measure P. For example if P- is i.i.d, every infinite sequence in which some state is not visited infinitely often has
measure zero.
SAMPLE RUN
0.6
0.5
0.4
Mean-Variance (1)
Expected Dividends
Growth Optimal (1)
0.3 Equal Weights
Prospect Theory
Mean-Variance (star)
CAPM
0.2
0.1
0
1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 81 85 89 93 97 101
Period
Mean +/- standard deviation
0.7
0.6
1 stern
0.5 2 illu
Wealth share
3 (m-s)(1)
0.4
4 (m-s)(stern)
0.3 5 gop (1)
6 cpt
0.2 7 (m-s)e (1)
8 (m-s)e (stern)
0.1
0
0 20 40 60 80 100
Period
6 THE MAIN RESULTS

In this section we explain the main analytical results that we were able to derive so far. These results are derived
from three types of structural assumptions: Assumptions on the maturity of assets (short-lived or long-lived),
assumptions on the dividend process (i.i.d or stationary Markovian) and assumptions on the set of strategies
(simple or stationary adapted). Under these assumptions we derive two properties of the expected relative
dividends portfolio λ*: The single survivor property and the evolutionary stability property. Here is a precise
definition:
Definition: [Single-Survivor-Property]
The strategy i is called the single survivor of the market selection process if starting from any initial distribution of
wealth r ∈ ∆ I the process converges P-almost surely to r∞ = ei ∈ ∆ I , the i-th corner of the simplex.
So far we have been able to show the single survivor hypothesis of λ* only for the case of short-lived assets. In
this case there are no capital gains and the wealth evolution reduces to:
⎧K ⎡ ⎤ ⎫
⎪ ⎢ d t +1(ωt +1 ) t ⎪ i
k
rt +1(ω ) = λ0 ⎨∑ ⎢
i t +1 ⎥ λ (ω )⎬ rt (ωt ) . The first result we get is:
i ,k
⎪ k =1 ⎢ ∑ t
t ⎥ t
λ k
(ω t
)r i
(ω ) ⎪
⎩ ⎣ i
t
⎥⎦ ⎭
Theorem: [Evstigneev, Hens, Schenk-Hoppé (2003)]
Suppose assets are short-lived and non-redundant, relative dividends d are i.i.d and consider λ simple (constant
*
in time). Then λ k = (1− λ0 )E pd (kω ) , k=1,..,K is the single survivor of the market selection process.
The assumption of simple strategies can be dispensed with if some regularity conditions on the process are
made. One needs to make sure that P-almost surely the relative dividends process is distinct from the simply
strategy of buying and holding the index (the CAPM-strategy, as we call it). In this case one can consider any
stationary and adapted portfolio rules. That is strategies that do not depend on time in an autonomous way and
that only depend on the paths ω t generated by the time-uncertainty process – an assumption that is well know
from the literature on hedging derivatives. It does not allow to use information that is unknown at the time the
strategy is executed.
Theorem: [Amir, Evstigneev, Hens, Schenk-Hoppé (2004)]
Suppose assets are short-lived and non-redundant, relative dividends d are i.i.d and consider any λ stationary
*
and adapted to ω. Then λ k = (1− λ0 )E pd (kω ) , k=1,..,K is the single survivor of the market selection process,
provided it is sufficiently distinct from the CAPM-strategy.
The case of short-lived assets is however not really realistic. One may say that investing on assets is mainly done
because of potential capital gains (speculation) rather than because of dividends. Indeed the dividend yield on
stocks (about 3-4% on long term averages) is much smaller than return generated from capital gains (about 6-
7% on long-term averages). In the case of long-lived assets we cannot show the single survivor property of λ*.
However, we can show that λ* is the unique portfolio strategy that cannot be invaded by other strategies. We
call this property evolutionary stability. Before we do so we need however define a point of rest of the dynamical
system that can be perturbed by mutations.
Definition: [Fix Point]
A wealth distribution r ∈ ∆ I is a fixed point of the random dynamical system F ( r, ω ) if

F (r ,ω ) = r for all ω ∈ Ω .
Since the RDS F describes a market selection process, every monomorphic population is a fix point. That is to say,
if all strategies are identical then the evolution of wealth does not change over time. A slightly deeper result is
that in the case of non-redundant assets the converse of this statement is also true: Every fix point of F is a
monomorphic population. The intuition of the proof is as follows. Suppose two strategies differ then, since assets
are not redundant, there is a continuation path along which the wealth distribution changes, contradicting the
notion of a fix point.
Proposition: [Fix Points are Monomorphic Populations]
Every monomorphic population is a fix point of F. Moreover, if there are no-redundant assets then every fix point
of F is a monomorphic population.
Now we are in the position to define a stability notion of fix points that concerns the thought experiment of
throwing in with little initial wealth a new strategy into a market that is a fix point. Since fix points are
monomorphic we can thus restrict attention to markets in which only one strategy is the incumbent.
Definition: [Evolutionary Stability]
A trading strategy λi is called evolutionary stable if for all λj, starting from r0 = (1 − ε , ε ) ∈ ∆ i , j , for all 0<ε < 1,
the market selection process converges P-almost surely to r∞ = (1,0) ∈ ∆ i , j .
When an incumbent strategy λi is evolutionary stable only with respect to mutant strategies λj that are local
mutations of λi , i.e. in the strategy simplex the two strategies are close to each other, then we call the strategy λi
locally evolutionary stable. The main result on evolutionary stability is:
Theorem: [Evstigneev, Hens, Schenk-Hoppe (2003)]
Suppose relative dividends d follow an i.i.d. process and consider any λ stationary and adapted. Then λ*is
evolutionary stable. No other strategy is locally evolutionary stable.
The key to understand this result is given by the exponential growth rate of a strategy λi in a market determined
⎛ K
k λ
i ,k
⎞
by strategy λ : g (λ , λ ) = E p ln ⎜ 1− λ0 + λ0 ∑d (ω ) n ,k
j i n
⎟⎟ . If some strategy has a higher exponential growth rate
⎜ λ
⎝ k =1 ⎠
than some other strategy then as the process unfolds, P-almost surely this strategy will have higher wealth. The
unique property of the strategy λ* is that it has, as shown in one of the ppt-slides, the highest growth rate
against itself. The following matrix, in which columns (rows) correspond to incumbents (mutants) summarizes the
evolutionary stability result: If λ* is determining market prices then no other strategy can invade this market. If
some other strategy is the incumbent then there is always a potential invading strategy, which might not be λ*
itself (see the table below). The theorem can be generalized to relative dividends that are stationary Markov. The
expected relative dividends strategy is also the unique evolutionary stable strategy in this case: Let
⎡ p11 p1S ⎤
⎢ ⎥
Ωt = {1,...,S } ,t = 1,2,... and let p = ⎢ pss′ ⎥ be the Markov transition matrix then
⎢ p1S pS ⎥⎦
S′
⎣
∞
λ * = λ0 ∑ (1−λ0 )n p nd is the expected discounted dividends strategy
n =1
⎡ λ11 λ1K ⎤ ⎡d 11 d 1K ⎤
⎢ ⎥ ⎢ ⎥
where λ * = ⎢ λsk ⎥ and d = ⎢ d sk ⎥.
⎢ λS1 λSK ⎥⎦ ⎢d S1 K ⎥
dS ⎦
⎣ ⎣
Hence, a rational market in which prices are determined by expected relative dividends is more robust than any
irrational market. Occasionally, a big push of irrationality can drive prices away from the rational valuation. The
resulting markets are however more fragile and even a sequence of local mutations can bring the market back to
the rational valuation. This is why in the data presented in chapter 3 we see that market capitalizations reflect
relative dividends. Hence we have given an evolutionary justification of this robust finding.
From our main result we can derive two corollaries that give a nice re-interpretation of mean-variance analysis
and the CAPM. First, it is easily seen that every under-diversified strategy like the mean-variance portfolio is,
cannot be evolutionary stable. Leaving out some asset is fatal for survival. Any other strategy that devotes some
arbitrary small fraction of wealth to the left out assets will surely have a positive growth rate against the under-
diversified strategy. The CAPM strategy on the other hand amounts to buy and hold the market portfolio. In our
notation this means that λtk ,CAPM = qˆtk , k = 1,2,..., K . That is to say the CAPM strategy is a passive imitation
strategy that will always mimic the average strategy of the market and if there is one such strategy that
dominates the market, the CAPM-strategy will automatically follow it. As an effect the relative market wealth of
the CAPM strategy stays constant over time.
7 EXTENSIONS
The main restriction of the results mentioned above is that strategies are assumed to be stationary and adapted
to the time-uncertainty structure. Imagine for example a simple trend chasing strategy that buys (sells) assets
when prices have increased (decreased). Since the endogenous price process may not be stationary, this strategy
is not necessarily stationary and adapted. Similarly, imagine strategies are build by switching between basic
strategies which may be the stationary adapted strategies in the theorems from above. It is reasonable to
assume that switching basic strategies depends on their relative returns, which being endogenous may not be
stationary and adapted. Finally, assuming that strategies are stationary and adapted implies a market
microstructure that at first sight looks a bit unrealistic. This assumption amounts to modeling the market as a
batch auction in which every strategy submits a demand function in terms of its asset allocation shares
I i ,k
λ (ω ) and each auctioneer chooses prices so that his market clears: q (ω ) = ∑ λ (ω t )wti (ω t ) .The implicit
t
i t k
t
t
i =1 t
assumption here is that the shares in the asset allocation are neither allowed to depend on the prices of that
λti ,k (ω t )wti (ω t )
market nor on the prices of any other market. Also the units bought θ ti ,k (ω t ) = can not be
qti ,k (ω t )
based on limit orders, for example. However, one should keep in mind that in principle the strategies can be
revised in every period. I.e. allowing for changes in strategies if the resulting demand of assets is satisfactory will
help to overcome the limitations of this simple market microstructure, which again brings us back to non-
stationary strategies. In this section we will extend the model to allow for non-stationary strategies in two
directions: Trend Chasing and Imitation.
Consider the following aggressive momentum or trend chasing strategy:
Aggressive Momentum Strategy:
Partition assets in: IN and OUT assets.

Asset Allocation Rule: = 1/n for all IN assets and = 0 for all OUT assets.
Partitioning Rule:
If the price of asset k has increased twice in the previous three periods then k joins IN.
If the price of asset k has decreased three times in the last three periods then k joins OUT.
The following figure presents the wealth dynamics of the aggressive momentum rule in interaction with λ*
The expected relative dividends rule is no longer evolutionary stable. For very many periods (from period 350 to
500, for example) it has conquered almost the entire market and yet in period 520 or so a little outburst of
activity by the momentum strategy happens. However, any such outburst is not sustainable and ultimately the
process converges back to λ*. Note that in contract to our interpretation of the evolutionary stability result in the
previous section here the shifts away from the rational market are generated endogenously.
A second way of modelling non-stationary strategies is to consider strategies as determining the flow of wealth
between a set of base strategies. To get some intuition for this point of view, consider the Hedge-Fund-strategies
as mentioned above as the base strategies. Now the evolution of wealth between those strategies can be
attributed to two sources: Internally generated wealth from earning returns on the strategy and externally
generated wealth from switches of wealth between strategies. The next two figures show the decomposition of
relative wealth flows in these two components:
Assets History Through Internal Growth (relative)

December 1993 - June 2002
0.600
0.500
Convertible Arbitrage
0.400
Dedicated Short Bias
Managed Futures
Emerging Markets
Fixed Income Arbitrage
0.300
Event Driven
Equity Market Neutral
Global Macro
0.200 Long/Short Equity
Multi-Strategy
0.100
0.000
1993 1994 1995 1996 1997 1998 1999 2000 2001 2002Q2
Assets Growth History Through In/Outflows (relative)

December 1993 - June 2002
0.8
0.6
Convertible Arbitrage
0.4
Dedicated Short Bias
Managed Futures
Emerging Markets
Fixed Income Arbitrage
0.2
Event Driven
Equity Market Neutral
Global Macro
0 Long/Short Equity
Multi-Strategy
-0.2
-0.4
1993 1994 1995 1996 1997 1998 1999 2000 2001 2002Q2
The key modelling assumption is then to link the external flow of wealth to the internally generated returns.
Recent empirical papers on Hedge Funds have found surprisingly robust external flow functions, say
H i : [0,1] → [0,1] for the various strategies. The following figure from Agarwal, Daniel and Naik (2003)
illustrates such functions:
Basically the flows follow the returns very procyclical: Higher returns trigger higher flows. However there are
some non-linearities. For some strategies the flow functions are convex for others they are concave, as
Getmansky (2003) has figured out:
A first theoretical paper on this view is Alos-Ferrer and Ania (2004). These authors consider the case of short-
lived assets and use simple strategies as the base strategies between investors allocate their wealth
proportionally to the realized returns of the last period. Alos-Ferrer and Ania (2004) show that the wealth
evolution follows a stationary Markov process the limit distribution of which puts most wealth on the simple
EP D k
strategy of relative expected dividends λˆ k = , k = 1,2,.., K . On the DJIA data the differences between
∑ EP D j j
the relative expected dividends and the expected relative dividends portfolio are not large, as the next figure
shows:
0.2
lambda^hat lambda^star
0.18
0.16
0.14
0.12
0.1
0.08
0.06
0.04
0.02
0
TECHNOLOGIES
PROCTER & GAMBLE
PHILIP MORRIS CO
MINING/MFG
MERCK & CO., INC.
MCDONALD'S COR
J.P. MORGAN
INTERNATIONAL
INT'L BUSINESS
PACKARDCO.
GENERALMOTORS
GENERAL ELECTRIC
EXXON MOBIL CORP.
EASTMAN KODAK
DU PONT DE
COCA - COLA
CATERPILLAR INC.
BOEING COMPANY
AT&T CORP.
AMERICAN EXPRESS
ALCOA INC.
MINNESOTA
NEMOURS
COMPANY
HEWLETT -
MACHS
UNITED
PAPER
CORP
(THE)
CO.
CO.
CO.
CO.
Summarizing these results we see that evolutionary portfolio theory confirms what traditional finance has told us:
Value strategies work! The twist of the evolutionary perspective is however that value only works in the long run.
Moreover since the evolutionary model is describing the off-equilibrium dynamics of the financial markets it is
also able to make medium run predictions and does not need to refer to the shock metaphor traditional finance
is using to excuse itself from not being able to model the off-equilibrium behaviour. The usefulness of the
evolutionary approach will depend on how good one can assess the ecology of the market. We have started
fruitful collaborations on this with banks in Zürich.
REFERENCES:
Agarwal, V., N. Daniel, and N. Naik (2003): “Flows and Performance in the Hedge Fund Industry”. Mimeo.
Alos-Ferrer and Ania (2004): “The Asset Market Game”, forthcoming in Special issue on Evolutionary Finance of
Journal of Mathematical Economics, Hens and Schenk-Hoppe (eds).
Amir, R., Evstigneev I., Hens, Th., Schenk-Hoppé, K-R., (2004). “Market Selection and Survival of Investment
Strategies”, NCCR-working paper no. 6, forthcoming in Special issue on Evolutionary Finance of Journal of
Mathematical Economics.
De Long J., Shleifer A., Summers L. and R. Waldman (1990):”Noise Trader Risk in Financial Markets”, Journal of
Political Economy, Vol 98, pp. 703-738.
Getmansky, M. (2003): “The Life Cycle of Hedge Funds: Fund Flows Size and Performance”, Mimeo MIT.
Evstigneev, I. V., Hens, Th. and Schenk-Hoppé, K. R., Evolutionary Stable Investment in Stock Markets, NCCR
working paper no. 84, June 2003.
Hens, Th., and Schenk-Hoppé, K.R., Markets Do Not Select For a Liquidity Preference as Behavior Towards Risk,
NCCR working paper no. 21, December 2002.
Lucas, R. (1978): “Asset Prices in an Exchange Economy," Econometrica, 46, 1429-1445.
Shiller, R.J. (1981): “Do Stock Prices Move Too Much to be Justified by Subsequent Changes in Dividends?”,
American Economic Review (71), 421-436.
UBS-Warburg (2002): “Watching Flows”, Global equity research December 2002, Johansen and Ineichen.
White, H. (2000): “A Reality Check for Data Snooping”, Econometrica Vol. 68, No. 5, pp. 1097-1126.
Exercises
Exercises
EXERCISE 1: MEAN VARIANCE AND CAPM
There are two risky assets k = 1,2 and one riskfree asset paying 2% return p.a. Investors can buy the riskfree
asset but not to sell it. The expected return of risky assets are µ1 = 5% and µ2 = 7.5% . The variance-covariance
⎛ 2 −1⎞
matrix (in %) is: COV = ⎜ ⎟.
⎝ −1 4 ⎠
1. Calculate the minimum-variance and the tangent portfolio
2. Assume a mean-variance investor facing the same variance-covariance matrix chooses a portfolio
λ = (0.2,0.5,0.3) . What are the returns he implicitly expect?
3. Assume that the market portfolio is given by λM = (0.4,0.6) . Calculate the Beta factors of the risky
assets. Assume further that the excess return of the market portfolio is 3%. What is the CAPM expected
excess return of the risky assets?
EXERCISE 2: DATA ANALYSIS

1. Calculate the mean and standard deviation of the monthly net returns of asset classes and annualize
them
2. Make histograms
3. Calculate the covariance matrix
4. Calculate the efficient frontier with and without the Tremont index
5. Calculate the tangent portfolio with and without short sales constraints
6. Analyze the sensitivity of the tangent portfolio and its expected return
7. Show the SML based on the tangent portfolio without short sales
EXERCISE 3: THE GENERAL EQUILIBRIUM MODEL

Consider two-period financial economy without consumption in the first period. There are s=1,…,S states in the
second period, k=1,…,K assets and i=1,…,I consumers.
1. Define the financial equilibrium in this economy.
2. Assume that there are 3 states and 2 consumers such that

U 1 ( x 1, x 2 , x 3 ) = ln x 1 + ln x 2 , ω1 = ( 0,1,2)
U 2 ( x 1 , x 2 , x 3 ) = ln x 2 + ln x 3 , ω 2 = ( 2,1,0 )
⎛ 1 0⎞
Show that for the assets matrix A = ⎜⎜ 0 1 ⎟⎟
⎜ 0 1⎟
⎝ ⎠
x = ( 2, 1/2, 3/2) , x = ( 0, 3/2, 1/2) and θ = ( 2, −1/2) , θ = ( −2, 1/2) is equilibrium.

*1 *2 *1 *2
3. Consider a third asset with the payoff A 3 = ( 0,0,1) . Is it possible to duplicate this asset from the assets
in the previous questions?
4. Calculate an arbitrage-free price for the third asset? Is it unique?
5. Determine the equilibrium in the economy including the third asset.
EXERCISE 4: MARKET COMPLETENESS

Consider a two-period financial economy. There are 3 possible states and 2 assets with a payoff
⎛1 0⎞
A = ⎜⎜ 2 1 ⎟⎟ .
⎜1 0 ⎟⎠
⎝
1. Is the market complete?
2. Assume that there is a third asset: a riskless bond. Is the market complete in this case?
EXERCISE 5: ASSET PRICING

1. Calculate the linear price rule using the risk-neutral probabilities
2. Consider a Binomial Model with a riskfree asset paying return R and one risky share which price
increases by the factor u and decreases by the factor d in each period where u > R > d > 0 . Use the
binomial tree over three periods t = 0,1,2 to show how do assets values change over time.
3. Assume that there is a third asset: a hedge fund paying the highest payoff achieved over the time. Show
this payoff of the hedge fund in each of the states in every period.
4. Calculate the value of the hedge fund in t = 0 using the Law of One Price
5. Assume that the share can take three instead of two different values in each period u > m > d > 0 . Is
there a portfolio of assets that is able to duplicate the payoff of the hedge fund?
EXERCISE 6: THE REPRESENTATIVE AGENT
Consider a one period economy (t=0,1) with two possible states in the second period (s=1,2). Assume that
consumption only takes place in t=1. There are two agents i=1,2 having the logarithmic expected utilities
u1(c1,c2)= 0.75 ln (c1) + 0.25 ln (c2) and u2(c1,c2)= 0.25 ln (c1) + 0.75 ln (c2), respectively. There are two
assets in unit supply: One riskless asset paying off 1 in both states and one risky asset paying off 2 in the first
state and 0.5 in the second state. The first (second) agent owns the first (second) asset. Asset are the only
sources of income.
1. Determine the competitive equilibrium.
2. Find a representative consumer with logarithmic expected utility function whose demand could also
generate the equilibrium prices found in a).
3. Suppose the payoff of the second asset increases to 3 in the first state. Compute the new asset prices
using the representative consumer as determined in b).
4. Compute the new equilibrium prices in the original economy with two agents

This exercise is based on Based on US-DATA 1981-2001 (see homepage). Using DJIA Prices, DJIA-MV, DJIA-DIV,
GDP, Interest Rates:
1. Do for every year the cross section regression of the SML: excess returns versus betas based on market
capitalization.
2. Do the time-series macro-finance regression:

1
qtk−1 against l t (Dtk + qtk ),where qtk and Dtk are the market capitalization and the total dividends of firm k,
Rtf−1
α
∂u (c t ) c1-α 1⎛c ⎞
and the likelihood ratio process is lt = R f
t −1 ,i.e. for u(c)= we regress qtk−1 against ⎜ t −1 ⎟ (Dtk + qtk )
γ ∂u (c t −1 ) 1-α γ ⎝ ct ⎠
EXERCISE 8: EXPECTED UTILITY (1)

Consider the time-uncertainty structure that is generated by tossing two fair coins. Suppose there are two assets
associated with the two coins. Asset 1 (2) pays off +1 if the first (second) coin comes up H and asset 1 (2) pays
off -1 if the first (second) coin comes up T.
1. Draw the time uncertainty structure.

2. Display the payoffs of the strategies (1,0), (0,1), (0.5,0.5), (λ,1-λ), for any 0≤λ≤1, in a matrix with
states as rows and strategies as columns.
3. Show that every risk averse expected utility maximizer prefers the portfolio (0.5,0.5) to any other
portfolio.

Consider the following four assets A,B,C, D with returns of 2 %, 4 %, and 6 % occuring with the probabilities
listed below:
2% 4% 6%
A 0.2 0 0.8
B 0 1 0
C 0.8 0 0.2
D 0.75 0.25 0
1. Show that the ranking B>A>C>D is not compatible with expected utility.
2. Represent the investments A, B, C, D in a probability triangle
3. Represent the investments A, B, C, D in a mean-standard-deviation-diagram
4. Show the indifference curves of an investor with the preferences B>A>C>D
5. Is this preference order compatible with the expected utility hypothesis
6. Order the investments A, B, C, D using the stochastic dominance concept
7. Find the men-variance optimal efficient portfolios and the optimal portfolio for investor with utility
8. How does the portfolio structure change with increasing income?

Consider an investor with utility V ( µ ,σ ) = µ ⋅ (10 − σ ) for σ ≤ 10
1. Draw the indifference curves in a mean-standard-deviation diagram
2. Consider a mean-variance investor with constant absolute risk aversion if for a given standard deviation
the slope of her indifference curve doe not change with the mean. Does the investor above have a
constant absolute risk aversion?
3. Show that this investor would prefer a sure payment of 10 units to a lottery paying either10 or 20 with
the same probability
4. Is this preference consistent with the first-order stochastic dominance?
5. How would a risk-loving investor with a monotonic utility function decide between the sure payment and
the lottery?

1. Get the weekly returns for stocks listed on DJIA from the homepage.
2. Test for Normality of the ln-returns.
3. Test aggregate dividends for normality of ln-returns.
4. For market totals, compare the variances of the ln-returns of dividends and market values.
5. Test for autocorrelation.
EXERCISE 12: TIME DIVERSIFICATION

Your initial wealth is $1000 which you can invest over two periods in a risk free asset and a risky asset. There
are two states drawn with 50% probability in each period that determine the gross return of the risky asset:
Suppose: R(1)=1.2 and R(2)=0.9 per period. Consider two expected utility maximizers, say A and B, with von
Neumann-Morgenstern utilities
−1 ⎧ −2.25 ⋅ (Wt −Wt −1 ) if Wt ≤ Wt −1

u A (Wt ) = and u B (Wt ) = ⎨ .
Wt ⎩ (Wt −Wt −1 ) if Wt ≥ Wt −1
1. Compute the optimal share of risky assets 0 ≤ λ ≤1 for both utility functions when the investors are
myopic, i.e. plan ahead for one period only. Do these shares depend on the realization of the states?
2. Compute the optimal share of risky assets λ fixed at the outset for both utility functions when the
investor plan ahead for two periods but do not want to rebalance after the first period.
3. Compute the optimal policy rule for shares of risky assets λ over time for both utility functions when the
investors plan ahead for two periods but do allow for rebalancing after the first period.
EXERCISE 13: BEHAVIORAL PORTFOLIO THEORY

Daniel goes on holiday. He has two credit cards and two purses. The probability to loose one of the purses is
25%. Assume that the probability to loose one of the purses is independent from the probability of loosing the
other one. The monetary loss of a credit card is -1 (the monetary loss of none of the cards is 0, respectively the
monetary loss of both cards is -2).
Daniel has two strategies: diversification (each card is in a different purse) and concentration (both cards are in
the same purse).
1. Which strategy would Daniel choose if he maximizes his expected utility?
2. Which strategy would Daniel choose if his decision is based on the prospect theory (assume that Daniel
does not change the probability weights)?
3. Which strategy would Harry Markowitz choose as a mean-variance decision maker?
EXERCISE 14: TESTING FOR MEAN REVERSION

For the Data of the stock in the DJIA given on the web-page,
VAR (rt n )
1. Compute the Variance Ratios: VAR(n) =
nVAR(rt )
2. Which stocks are mean-reverting, mean-averting or random walks?
3. Also compute the Variance Ratio for the DJIA-index.
EXERCISE 15: INTERTEMPORAL ASSET ALLOCATION

Consider an investment problem over two periods. The initial wealth is W=1. There are two possible states of
the world in each period being determined by a fair coin, tossed in each period (i.i.d.). There are two assets: a
riskless asset with gross return of 1 and a risky asset that has a gross return of 3 if the coin tosses up Heads and
of 0 if the coin tosses up Tails. The investor has an intertemporal expected utility without discounting and his per
period utility is quadratic: u(w)=(w-1/3w2), for b>0 but sufficiently small to ensure monotonicity of u.
1. Determine the optimal asset allocation over time.
2. Is the optimal asset allocation independent of time?
EXERCISE 16: HEDGE FUNDS

Take the Hedge Fund and Index Data from the course web page and
1. Compute the Mean-Variance Efficient Frontier without and with Hedge Funds included.
2. Test the Indices and the HF-styles for Normality and Mean-Reversion.
3. Compute the 95% and the 99% VaR of the HF-styles.

4. Which risks may be hidden in Hedge Funds; i.e. which risks may not be covered by the Data on Hedge
Fund indices?
Supplement:
Statistical Properties of
Time-Series
Appendix Stochastic Processes - 147 - 147
“It is a test of true theories not only to account for but to predict phenomena.”
William Whewell
1 Introduction
The very existence of financial economics as a discipline is based on uncertainty and its impact on the behavior of
investors and ultimately on asset prices. Estimating and testing theoretical models are intimately related to this
type of uncertainty on which those models are based. Previous chapters discussed the uncertainty factor from a
theoretical point of view. This supplement assists obtaining basic statistical background necessary to get an
intuition on the main empirical approaches used in testing financial models.
2 Some Definitions
Why do financial economists formulate models using returns instead of prices? First, to the average investor,
investments in financial markets can be considered as a having constant returns-to-scale – the size of investment
does not have any impact on prices changes. Thus, in perfectly competitive financial markets, the return is a
complete and, in particular, scale-free measure of investments. Second, returns have more attractive properties
than prices such as the stationary property of returns but not of prices.
Pt −1
• net return on the period t − 1 to t : Rt = −1
Pt
1/ k
⎡ k −1 ⎤ 1 ⎡ k −1 ⎤
• annualized return of returns over k periods: Rt (k ) = ⎢∏ (1 + Rt − j )⎥ − 1 or Rt (k ) ≈ ⎢ ∑ Rt − j ⎥
⎣ j =1 ⎦ k ⎣ j =0 ⎦
Pt
• log returns: rt ≡ log(1 + Rt ) = log( ) = pt − pt −1 .
Pt −1
A random variable X is a real-valued function whose domain is the outcomes of an experiment. Intuitively, a
random variable assigns a number to an outcome of an experiment. Random variables come in two favors. A
discrete random variable is one that can take at most a countable number of possible values. A continuous
random variable can take on any value over any interval or all of the real numbers, i.e. the set of possible values
is uncountable. Since the value of a random variable is determined by the outcome, we may assign probabilities
to the possible values of the random variable.
A probability function is a function that assigns to each value of a random variable the probability that the
value will be obtained. For a discrete random variable, we define the probability mass
function: p (a ) = P ( X = a ) .
In the case of a continuous random variable, we define the probability density function (pdf) f ( x ) with the
property that for any set B of real numbers: P ( X ∈ B ) = ∫ f ( x )dx .
B
The random variables are said to be independent, if for all a and b, P ( X ≤ a ,Y ≤ b ) = P ( X ≤ a )P ( X ≤ b ) .
A stochastic process is a collection of random variables, i.e. for each t ∈T X (t ) is a random variable. If the
index t is interpreted as time, then X (t ) is the state of the process at time t. The index set T can be countable
set and we have a discrete-time stochastic process, or non-countable continuous set and we have a continuous-
time stochastic process. In most cases, a stochastic variable has both a expected value term (drift term) and a
random term (volatility term). We can see the stochastic process forecasting for a random variable X, as a
forecasted value (E[X]) plus a forecasting error, where error follow some probability distribution. So:
X(t) = E[X(t)] + error(t).
3 The Normal Distribution

The random variable X is normally distributed with parameters mean µ and variance σ 2 if the density of X is
1
e −( x − µ ) / 2σ 2
2
given by: f ( x ) = . Although the distribution is completely defined by the first two moments,
2πσ
the third (skewness) and the forth (kurtosis) moments are often used to test for normality.
The skewness or the normalized third moment of a random variable X is defined by:
E ( X − µ )3
S (X ) = =0
σ3
The kurtosis, or the normalized forth moment of X is defined by:
E ( X − µ )4
K (X ) = =3
σ4
The normal distribution, as all other symmetric distributions, has skewness equal to zero. Distributions skewed to
the left (right) have negative (positive) skewness.
The normal distribution has kurtosis equal to 3, in contrast, fat-tailed distributions with extra probability mass in
the tails have higher kurtosis.
4 The Lognormal Distribution

The concept of normally distributed returns suffers from at least two drawbacks. First, the normal distribution
support the entire real line, in contrast, the largest loss an investor can realize is the entire investment and no
more. The lower bound of -1 is clearly violated by the normality. Second, if single-periods returns are normal
then returns over several periods cannot be normally distributed since they are the product of the single-periods
returns.
An alternative way to solve the last problem is to take the logarithmic returns and assume that they are normally
distributed rit ∼ N ( µi ,σ i2 ) (simple gross returns are then lognormal distributed).
Under the lognormal model, the mean and variance of simple returns are given by:
σ2
µi +
− 1 and Var (Rit ) = e 2µi +σ (e σ i − 1) , so that the fist problem is solved as well53.
2 2
E (Rit ) = e 2
5 The Martingale Model

A Martingale is a stochastic process satisfying the following condition:
E (Pt +1 | Pt ,Pt −1 ,...) = Pt or equivalently E (Pt +1 - Pt | Pt ,Pt −1 ,...) = 0
If the price process is a Martingale, then the best estimate for the price in the next period is simply equal to this
period’s price conditioned on the history of the game. Alternatively, if the expected price changes at any stage
are zero conditioned on the history, the game is fair – it is not in favor to any player.
The Martingale property of asset prices was long considered to be a necessary condition for the efficiency of
capital markets, in which information contained in past prices is instantly and fully reflected in asset’s current
price. On an efficient market, the conditional expectation of future price changes, conditioned on the price
53
(1 + Rit ) ≥ 0 since (1 + Rit ) = exp(rit )
history, must be zero. Furthermore, the more efficient the market, the more random is the sequence of price
changes. Thus, on efficient markets price changes must be completely random and unpredictable.
Though, the Martingale hypothesis places restrictions on expected returns but does not account for risk. Positive
expected prices changes do not contradict the fair game concept since they can be viewed as the reward
necessary to attract investors holding the asset and bear the associated risks. Once asset prices are properly
adjusted for risk, the martingale property holds. The transformation isolates the impact of time and state
preferences of investors on asset prices. The, the predictability of these preferences does not represent a violation
of the market efficiency.
6. The Random Walk

The simplest version of a Random Walk is given by:
Pt = µ + Pt −1 + εt and εt ∼ IID (0,σ 2 )
As the Martingale, the Random Walk is also a fair game since the increments are independent.54 In fact, a
Random Walk is a Martingale with constant variance of the innovations.
Using the IID increments assumption (also called White Noise), we can calculate the conditional mean and
variance, conditional on some initial value P0 at time 0:
E (Pt | P0 ) = P0 + mt
Var (Pt | P0 ) = σ 2t
Thus, the Random Walk is a non-stationary55 since both its mean and variance are linear in time.
If we additionally assume that the increments are normally distributed, εt ∼ IID N (0,σ 2 ) then the process is
equivalent to an arithmetic Brownian motion. Wiener process or a Brownian Motion is derived from the
simple Random Walk, replacing the time sequence by time series when time intervals become smaller and
approach zero.
A second version of a Random Walk includes independent but not identically distributed increments. This version
is useful in particular by the explanation of time variation of volatility in financial assets returns.
The weakest form of a Random Walk is obtained by additionally relaxing independent assumption to include
processes with dependant but uncorrelated increments. For example, processes where
Cov (εt ,εt −k ) = 0 for all k ≠ 0 but Cov (ε 2t ,ε 2t −k ) ≠ 0 for some k ≠ 0 .
7. MARKET EFFICIENCY AND THE LAW OF ITERATED EXPECTATIONS
54
Independence requires that not only the increments but also any non-linear function of them are also uncorrelated,
which is a much stronger requirement fort he fairness of the game than in the case of a Martingale.
55
Distribution is stationary if at least the mean and variance are constant over time.
In an information efficient market, price changes cannot be forecasted if they are anticipated properly. If prices
adjust as rapidly as information become available then we would expect to see randomness in successive
transactions rather than great continuity, i.e. small movements in the same direction are very unlikely. On the
other hand, if prices are determined by discounted cash flows, one would expect that returns would become
deterministic. In fact, asset returns can be random even if security prices are determined by discounting future
cash flows. The key argument is the Law of Iterated Expectations.
To illustrate it, we define the information sets I t and J t where I t ⊂ Jt . The Law of Iterated expectations says
that if one has limited information, the best forecast he can make is the forecast of the forecast one would make
if he has superior information. Formally: E ( X | I t ) = E [E ( X | J t ) | I t ] . In other words, if one has limited
information, he cannot predict the forecast error he would make if he would have superior information. Applied
to asset prices, the Law of Iterated Expectations implies that realized changes in asset prices as rational
expectations of some fundamental value V * (as discounted cash flow) are unforecastable given some
information set I t . Or if Pt = E (V * I I t ) = E tVt resp. Pt +1 = E (V * I I t +1 ) = E t +1V , the expectation in the price over
the next period is: E t (Pt +1 − Pt ) = E t (E t +1 I V * ) − E tV * = 0 . Thus, future prices changes are not forecastable as in
a Random Walk; future directions cannot be predicted on the base of past realizations.
8. MARKOV PROCESSES
A Markov process is a stochastic process where all the values are drawn from a discrete set. In a first-order
Markov process only the most recent draw affects the distribution of the next one. Given the present, the future
is conditionally independent on the past.
Markov processes are usually defines by specifying the transition probabilities from one state to the state
next period. That is,
Prob( X t +1 is in A I X t , X t −1,...) = Prob(X t +1 is in A I X t )
For example, X t +1 = a + bX t + εt is a Markov process and for a = 0 and b = 1 it is also a Martingale.56
9. MEAN REVERSION
Mean reversion in contrast to a random walk describes the phenomenon that a variable appears to be pulled
back to some long-run average level over time. Changes in the variable (growth rates, returns) must be
negatively serially correlated at some frequency for the correction to occur.
The Mean Reversion Process is a log-normal diffusion process, but with the variance growing not proportionally
to the time interval. The variance grows in the beginning and after sometime stabilizes on certain value. The
stabilization of the variance is due the spring like effect of the mean-reversion. The pictures below illustrate this,
56
E ( X t +1 I X t , X t −1, ,...) = E (a + bX t + εt I X t , X t −1, ,...) = a + bE (X t I X t , X t −1, ,...) + E t (εt ) = X t for a = 0, b = 1, and εt ∼ N (0,σ 2 )
for the "low" and “high” price cases:

Advanced Portfolio Theory (Lecture Notes) : October 2004

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Advanced Portfolio Theory (Lecture Notes) : October 2004

Uploaded by

Copyright:

Available Formats

Advanced Portfolio Theory

Prof. Dr. Thorsten Hens

CHAPTER 2: FOUNDATIONS FROM PORTFOLIO THEORY

CHAPTER 3: FOUNDATIONS FROM ASSET PRICING

8.2 Linear Pricing 30

CHAPTER 4: RATIONAL CHOICE: EXPECTED UTILITY AND BAYESIAN UPDATING

7.2 Hyperbolic Discounting 66

CHAPTER 5: CHOOSING A PORTFOLIO ON A RANDOM WALK: DIVERSIFICATION

CHAPTER 6: BEHAVIORAL PORTFOLIO THEORY

CHAPTER 7: CHOOSING A PORTFOLIO ON A MEAN REVERTING PROCESS: TIME DIVERSIFICATION

CHAPTER 8: BEHAVIORAL HEDGE FUNDS

4.2 Momentum and Reversal 107

CHAPTER 9: CHOOSING A PORTFOLIO ON A GARCH PROCESS: RISK MANAGEMENT

CHAPTER 10: EVOLUTIONARY FINANCE: SURVIVAL OF THE FITTEST ON WALL STREET

E expectation with respect to martingale measure

ps physical probability of occurence of state s

α risk aversion parameter, for example U i (c ti ) = (c ti )1−α

2 PORTFOLIO THEORY AS A SUBFIELD OF FINANCE

3 THE STRUCTURE OF MODERN FINANCIAL MARKETS

3.1 Principal-Agent Problems

3.2 Time Scale of Investment Decisions

4 THE ANTICIPATION PRINCIPLE

5 REFLEXIVITY AND THE GAME STRUCTURE OF FINANCIAL MARKETS

Spiel 1.2.: Kurs vs. Würfel (Sum m en)

6 RATIONAL INVESTMENT PROCESS VERSUS SIMPLE HEURISTICS

Cootner (1964): The Random Character of Stock Prices, MIT-press.

Keynes (1936): General Theory, edition from 1964.

“…the major contributions of academic research in the post-war era.”

2 THE MEAN-VARIANCE ANALYSIS

into risky asset k expects to achieve a return of µ k with a standard deviation σ k .

2.2 The Efficient Frontier

2.3 Optimal Portfolio of Risky Assets with a Risk-less Security

2.4 Two Fund Separation Theorem

λ i = (λ0i , (1 − λ0i )λ T ), i = 1,...I 6, where λ i ∈ RK +1 , λ0i ∈ R, and λT ∈ RK .

2.5 The Structure of the Tangent Portfolio

definition of µλ and σ λ2 we get:

If there are no constraints on λ1 , then the solution is:

which is the Two Fund Separation property.

2.6 The Asset Allocation Puzzle

Curve j is obtained by combination of some asset j with the market portfolio.

3.1 Capital Asset Pricing Model

The slope of the Capital Market Line can be calculated dividing:

From the slope’s equality at point λM follows:

The result is the Capital Asset Pricing Model (CAPM).

3.2 Application: Market Neutral Strategies

4 SOME OPEN ISSUES IN THE MEAN-VARIANCE FRAMEWORK

Campbell and Viciera (2002): Strategic Asset Allocation, OUP

Markowitz, H. (1952): Portfolio Theory Selection, Journal of Finance (7), 77-91

Foundations from Asset Pricing

2 THE GENERAL EQUILIBRIUM MODEL: BASICS

t=0 t=1 t=2

A competitive equilibrium with perfect foresight is a list of portfolio strategies λt=0,1,...

2.1 Complete and Incomplete Markets

Applying the same calculation procedure for node B, we get:

Applying the above procedure for t = 0 , we get:

which gives θ 01 = 13 /11,θ 02 =15/11 .

3 THE PRINCIPLE OF NO ARBITRAGE

and the resulting consumption is positive: θt0 > 0 ,

i.e. θt0 (ω t ) ≥ 0 for all ω t and all t = 0,1,2,.... and θ 0 ≠ 0

paying 1 only in state ωt of the path ω t .

Proof: (Risk-neutral measure implies no arbitrage)

Multiplying both sides with π t (ω t ) and adding across ω t gives:

Using (1.5) we get:

⎡π 2 ⋅Max ⎡ 0, u 2 S − K ⎤ + 2 ⋅ π ⋅ (1 − π *) ⋅ Max [ 0, udS − K ]⎤