Selected Finance Notes

FINANCE NOTES
Mike Cliff
Current Draft: June 30, 1998
Contents
1 Introduction
2 Asset Pricing
2.1 Introduction
2.2 Portfolio Theory
2.2.1 Single Period Optimization Problem
2.2.2 Key Results
2.2.3 Multiperiod Portfolio Choice
2.3 Equilibrium Asset Pricing Theory
2.3.1 Utility Functions
2.3.2 CAPM Theory
2.3.3 ICAPM Theory
2.3.4 CCAPM Theory
2.3.5 The CIR Model
2.4 Arbitrage Asset Pricing
2.4.1 State Contingent Claims
2.4.2 Arbitrage Pricing Theory
2.5 Pricing Kernel Approach
2.5.1 Basics
2.5.2 Different Expectations
2.5.3 Asset Pricing with m
2.5.4 The Agent’s Problem
2.5.5 The Main Results
2.5.6 Hansen-Jagannathan Bounds
2.6 Conditioning Information
2.7 Market Efficiency
2.8 Empirical Asset Pricing
2.8.1 Properties of Asset Returns
2.8.2 General Procedures
2.8.3 CAPM Tests
2.8.4 ICAPM/CCAPM Tests
2.8.5 APT Tests
2.8.6 Present Value Relations
3 Fixed Income
3.1 Introduction
3.2 Term Structure Basics
3.3 Inflation and Returns
3.4 Forward Rates
3.5 Bond Pricing
3.6 Affine Models
3.6.1 Vasicek
3.6.2 The CIR Model
3.6.3 Duffie-Kan Class
3.6.4 Other Single Factor Models
3.6.5 Alternatives
3.7 Multi-Factor Models
3.8 Empirical Tests
3.8.1 Brown & Dybvig (1986)
3.8.2 Brown & Schaefer (1994)
3.8.3 Chan, Karolyi, Longstaff & Sanders (1992)
3.8.4 Gibbons & Ramaswamy (1993)
3.8.5 Pearson & Sun (1994)
3.8.6 Longstaff & Schwartz (1992)
4 Derivatives
4.1 Introduction
4.2 Binomial Models
4.2.1 Alternative Derivations
4.2.2 Trinomial Models
4.3 Black Scholes Model
4.3.1 Black Scholes Derivations
4.3.2 Implied Volatilities
4.3.3 Hedging
4.4 Advanced Topics
4.4.1 American Options
4.4.2 Exotic Options
4.4.3 Other Advanced Topics
4.5 Interest Rate Derivatives
4.5.1 Stochastic Interest Rate Models
4.5.2 Stochastic Term Structure Models
5 Corporate Finance
5.1 Introduction
5.2 Information Asymmetry/Signaling
5.3 Agency Theory
5.4 Capital Structure
5.5 Dividends
5.5.1 Factors Influencing Dividend Policy
5.5.2 Key Dividends Papers
5.6 Corporate Control
5.7 Mergers and Acquisitions
5.7.1 Tender Offers
5.7.2 Competition Among Bidders
5.7.3 Managerial Power
5.7.4 Key Papers
5.8 Financial Distress
5.8.1 Factors Affecting Reorganizations
5.8.2 Private Resolution
5.8.3 Formal Resolution
5.8.4 Key Papers
5.9 Equity Issuance
5.9.1 Flotation Methods
5.9.2 Direct Flotation Costs
5.9.3 Indirect Flotation Costs
5.9.4 Valuation Effects
5.9.5 SEO Timing
5.9.6 Key Papers
5.10 Initial Public Offerings
5.10.1 IPO Anomalies
5.10.2 Key Papers
5.11 Executive Compensation
5.12 Risk Management
5.13 Internal/External Markets and Banking
5.14 Convertible Debt
5.15 Imperfections and Demand
5.16 Financial Innovation
6 Market Microstructure
6.1 Introduction
6.2 The Value of Information
6.3 Single Period REE
6.4 Batch Models
6.4.1 Strategic Uninformed Traders
6.5 Sequential Trade Models
6.5.1 Specialists and Dealers
6.5.2 Other Topics
6.6 Special Topics
6.6.1 Bubbles
6.6.2 Speculation
6.6.3 Noise
6.6.4 Cascades
7 International Finance
7.1 Introduction
7.2 Spot Currency Pricing
7.3 Forward Currency Pricing
7.4 Integration
7.5 International Asset Pricing
7.6 Other Topics
8 Appendix: Math Results
8.1 Basics
8.1.1 Norms
8.1.2 Moments
8.1.3 Distributions
8.1.4 Convergence
8.1.5 Some Famous Inequalities
8.1.6 Stein’s Lemma
8.1.7 Bayes Law
8.1.8 Law of Iterated Expectations
8.1.9 Stochastic Dominance
8.2 Econometrics
8.2.1 Projection Theorem
8.2.2 Cramer-Rao Bound and the Var-Cov Matrix
8.2.3 Testing: Wald, LM, LR
8.3 Continuous-Time Math
8.3.1 Stochastic Processes
8.3.2 Martingales
8.3.3 Itô’s Lemma
8.3.4 Cameron-Martin-Girsanov Theorem
8.3.5 Special Processes
8.3.6 Special Lemma
References
Chapter 1
Introduction
These notes are an effort to integrate the body of knowledge encountered in
the Finance PhD classes at the University of North Carolina. The original
draft of this document was developed to prepare for the area comprehensive
exams. As such, the presentation is in a condensed format. The theoretical
models are derived in a skeleton form, with the focus on the setup and key
steps rather than line-by-line explanations. Similarly, empirical work is
summarized in terms of purpose, methodology (where important), findings, and
fit with the literature. Throughout the manuscript special attention is given
to tying together the ideas in different models and areas. Providing this
structure should make the material easier to remember and more meaningful
to interpret.
The organization of the paper is as follows. Chapter 2 covers asset pricing,
both theoretical and empirical. A separate chapter on fixed income securities
follows. Chapter 4 covers derivative securities, from both a binomial
perspective and a continuous time framework. Chapter 5 covers the main
topics in corporate finance, again both theoretically and empirically. Next
is a chapter on market microstructure and information economics, which is
largely theoretical. A chapter on international finance concludes the main
body of the document. The final chapter covers important mathematical,
statistical, and econometric issues.
An effort is made to preserve notational consistency, but inevitably there
will be deviations. Bold is used for vectors x and matrices X. Time subscripts
are dropped unless needed for clarity. Generally t is the current time, T is
the end, and τ is the time between two dates. Random variables are given a
tilde only as needed. Expectations are with respect to the true probabilities
P unless otherwise denoted. The risk-neutral measure is represented by Q.

R denotes a gross return, whereas r is a net return. When working with
the pricing kernel models it is useful to have a notation for element-wise
operations. I use to denote element-by-element multiplication and to
represent such a division.
This document draws heavily from a variety of sources, including Bhat-
tacharya and Constantinides (1989), Cochrane (1998), Campbell, Lo, and
MacKinlay (1997), Huang and Litzenberger (1988), Ingersoll (1987), Jarrow,
Maksimovic, and Ziemba (1995), as well as lecture notes from Dong-Hyun
Ahn, Jennifer Conrad, and Dick Rendleman.
June 30, 1998

Mike Cliff
mcliff@unc.edu
Chapter 2
Asset Pricing
2.1 Introduction
There are three primary approaches to pricing assets. The equilibrium
approach begins with agents' preferences (e.g., over expected returns or
consumption). Agents maximize expected utility subject to budget constraints
and market clearing conditions. Equilibrium models price all assets
simultaneously and in equilibrium there is no arbitrage. The arbitrage approach
takes a different point of view. It takes as given the prices of basis assets,
which can be combined to generate other payoffs. The absence of arbitrage
implies unique prices for these synthetic assets when markets are (locally)
complete. If markets are incomplete, there may be a range of admissible
prices. Unfortunately, it is generally not possible to recover
a supporting equilibrium from the arbitrage approach. Somewhat paradoxically,
the arbitrage approach may in fact admit arbitrage opportunities in
the sense that selecting different basis assets may give different prices. The
final approach focuses on the pricing kernel. This approach shares many of
the features of the first two approaches and provides a unifying framework.
Under this paradigm, all assets can be priced by the relation p = E[mx].
Asset pricing models differ in the specification of the pricing kernel m.
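As a minimal numerical sketch of the relation p = E[mx], the Python fragment below prices two payoffs in a two-state economy. The state probabilities, kernel values, and payoffs are made-up illustrative numbers, not taken from any model in these notes. The only relations used are p = E[mx] and its implication that a sure payoff of one costs E[m], so the gross riskless return is 1/E[m].

```python
# Hypothetical two-state economy; every number here is an illustrative assumption.
probs = [0.5, 0.5]   # true (P-measure) probabilities of the two states
m = [1.1, 0.8]       # realisation of the pricing kernel m in each state


def price(payoff):
    """Price a state-contingent payoff x via p = E[m x]."""
    return sum(p * mk * x for p, mk, x in zip(probs, m, payoff))


bond_price = price([1.0, 1.0])       # a sure payoff of 1 costs E[m]
risk_free_gross = 1.0 / bond_price   # gross riskless return R_f = 1 / E[m]
stock_price = price([0.9, 1.3])      # risky payoff: low when m is high, high when m is low

print(bond_price, risk_free_gross, stock_price)
```

Changing only the list `m` while holding the payoffs fixed is the sense in which asset pricing models differ in the specification of the kernel.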
One question that arises immediately in asset pricing is the decision to
work in discrete or continuous time. The discrete time models were developed
first, and they have the benefit of a more intuitive feel. Continuous time
models have a number of advantages. With a single state variable returns
are perfectly instantaneously correlated which simplifies the analysis. More
generally, moments higher than the second vanish in continuous time.

Conditional asset pricing models have become popular in response to
the failure of unconditional models. A conditional model can capture time-
varying expected returns and/or risk premiums.
This chapter develops each theoretical approach, discussing the underlying
assumptions and the resulting implications. Derivations are provided
for each, and an effort is made to show the connections among the models.
The chapter concludes with a summary of the major empirical results and
methodologies. We begin the chapter by reviewing portfolio theory.
2.2 Portfolio Theory

Portfolio theory is concerned with the investors’ decision to consume or save
and the portfolio selection decision. The theory develops many of the re-
sults that appear in the CAPM framework. These results follow from mean-
variance mathematics, not from any economic model. Early works were due
to Markowitz (1959), who moved the thinking from maximizing E[R] to con-
sideration of both mean and variance. The principle of diversification comes
from this work.
Under certain conditions, we can consider only the mean and variance of
asset returns. One sufficient condition is quadratic utility (see Section 2.3.1).
The other sufficient condition is multivariate normality of asset returns.
Although neither of these assumptions is likely to hold, the resulting analysis
provides an intuitively appealing framework.
2.2.1 Single Period Optimization Problem

In terms of notation, consider a vector of asset weights w, returns r, expected
returns µ, and a variance-covariance matrix Σ. Investors minimize variance,
subject to achieving a particular return and the portfolio weights summing
to one.
$$ L = \tfrac{1}{2} w'\Sigma w + \lambda(\mu_p - w'\mu) + \gamma(1 - w'\iota) \tag{2.1} $$
with FOCs
$$ \Sigma w = \lambda\mu + \gamma\iota, \qquad \mu_p = w'\mu, \qquad 1 = w'\iota. $$
Solve for w to get
$$ w = \lambda\Sigma^{-1}\mu + \gamma\Sigma^{-1}\iota. \tag{2.2} $$
Frontier portfolios are linear combinations of two portfolios. Premultiply by
$\iota'$ and $\mu'$, then define $A = \mu'\Sigma^{-1}\iota$, $B = \mu'\Sigma^{-1}\mu$,
$C = \iota'\Sigma^{-1}\iota$, $D = BC - A^2$.
Combine the expression for w and the FOCs to get $\lambda = (C\mu_p - A)/D$ and
$\gamma = (B - A\mu_p)/D$. This gives
$$ w_p = g + h\mu_p \tag{2.3} $$
where $g = [B\Sigma^{-1}\iota - A\Sigma^{-1}\mu]/D$ and $h = [C\Sigma^{-1}\mu - A\Sigma^{-1}\iota]/D$. Note
that $\iota'g = 1$, $\iota'h = 0$, $\mu'g = 0$, $\mu'h = 1$.
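The construction of g and h can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical three-asset example (the expected returns and covariance matrix are invented for illustration) and a small hand-written linear solver in place of a matrix library.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Hypothetical inputs: expected returns mu and covariance matrix Sigma.
mu = [0.06, 0.10, 0.14]
sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
iota = [1.0, 1.0, 1.0]

s_inv_mu = solve(sigma, mu)      # Sigma^{-1} mu
s_inv_iota = solve(sigma, iota)  # Sigma^{-1} iota

A = dot(mu, s_inv_iota)
B = dot(mu, s_inv_mu)
C = dot(iota, s_inv_iota)
D = B * C - A ** 2

g = [(B * si - A * sm) / D for si, sm in zip(s_inv_iota, s_inv_mu)]
h = [(C * sm - A * si) / D for si, sm in zip(s_inv_iota, s_inv_mu)]


def frontier_weights(mu_p):
    """Frontier portfolio for target mean mu_p, equation (2.3)."""
    return [gi + hi * mu_p for gi, hi in zip(g, h)]


w = frontier_weights(0.10)
print(sum(w), dot(mu, w))  # weights sum to one and hit the target mean
```

The four identities for g and h are exactly what make `frontier_weights` satisfy both constraints for any target mean.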
2.2.2 Key Results

From here, we can establish a number of results [see Markowitz (1959) and
Roll (1977)].
• The efficient frontier is a hyperbola in µ-σ space.
• Global minimum variance portfolio o is the point $(\sqrt{1/C},\, A/C)$.
• o is positively correlated with all other minimum variance portfolios
and its covariance with these portfolios is its variance, 1/C.
• A frontier portfolio p is efficient if $\mu_p \ge A/C$.
• For all frontier portfolios except the minimum variance portfolio, there
exists a unique orthogonal frontier portfolio z with $w_z'\Sigma w_p = 0$.
• All portfolios on the efficient frontier are positively correlated. More
generally, $\rho_{p,j} = SR_j/SR_p$ where p is on the efficient frontier. (??)
• $\mu_g = 0$, $\mu_{g+h} = 1$.
• The portfolios g and g + h span the entire frontier.
• Any n ≥ 2 frontier portfolios can span the entire frontier.
• If $w_i$ is efficient then $w_q = A'w_i$ is efficient (A diagonal, trace(A) = 1,
and $A_{ii} \ge 0\ \forall\, i$).
• The covariance between the returns of a frontier portfolio p and any
other portfolio n (not necessarily on the frontier) is $\lambda\mu_n + \gamma$.
• $\mu_n = \mu_z + \beta(\mu_p - \mu_z)$, where $\sigma_{p,z} = 0$.
• Geometry of tangency lines.
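The first two results can be seen from one short worked step, using only the FOCs and the definitions of A, B, C, and D from the optimization problem. Premultiplying the first FOC by $w_p'$ and substituting the expressions for λ and γ gives the frontier variance as a function of the target mean:

```latex
\begin{align*}
\sigma_p^2 &= w_p'\Sigma w_p = \lambda\mu_p + \gamma
            = \frac{C\mu_p^2 - 2A\mu_p + B}{D} \\
           &= \frac{C}{D}\left(\mu_p - \frac{A}{C}\right)^2 + \frac{1}{C}.
\end{align*}
```

The variance is smallest at $\mu_p = A/C$, where it equals $1/C$, and the relation between $\sigma_p$ and $\mu_p$ is the equation of a hyperbola.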
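A beta representation is easy to derive from the FOCs.
$$ w_p'\Sigma w_p = \sigma_p^2 = \lambda\mu_p + \gamma $$
$$ w_i'\Sigma w_p = \sigma_{ip} = \lambda\mu_i + \gamma $$
$$ w_z'\Sigma w_p = \sigma_{zp} = \lambda\mu_z + \gamma = 0 $$
where z is the portfolio orthogonal to p (or $r_f$ in the SL model) and i is any
portfolio. Since the third equation equals zero, subtract it from the first two
equations.
$$ \sigma_p^2 = (\lambda\mu_p + \gamma) - (\lambda\mu_z + \gamma) = \lambda(\mu_p - \mu_z) $$
$$ \sigma_{ip} = (\lambda\mu_i + \gamma) - (\lambda\mu_z + \gamma) = \lambda(\mu_i - \mu_z) $$
Solving for λ and rearranging gives the desired result
$$ \mu_i = \mu_z + \beta_{ip}(\mu_p - \mu_z), \qquad \beta_{ip} = \sigma_{ip}/\sigma_p^2. \tag{2.4} $$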
Note that the beta is measured relative to portfolio p which is currently
unspecified. The important point of Roll’s critique is that this representation
is a mathematical result from the set up of the minimization problem. It does
not have any economic content unless we specify p as a particular portfolio.
His critique also says that the CAPM is not testable because the market
portfolio includes all assets, which we cannot measure.
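To illustrate that (2.4) is purely a mathematical property of frontier portfolios, the sketch below picks an arbitrary frontier portfolio p, finds its orthogonal portfolio z from the condition λµz + γ = 0, and confirms that every asset's expected return is recovered from its beta with respect to p. The three-asset inputs and the target mean of 0.12 are hypothetical, and the solver is a small stand-in for a matrix library.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def matvec(mat, v):
    return [dot(row, v) for row in mat]


# Hypothetical inputs.
mu = [0.06, 0.10, 0.14]
sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
iota = [1.0, 1.0, 1.0]

s_inv_mu, s_inv_iota = solve(sigma, mu), solve(sigma, iota)
A, B, C = dot(mu, s_inv_iota), dot(mu, s_inv_mu), dot(iota, s_inv_iota)
D = B * C - A ** 2


def frontier_weights(mu_target):
    lam = (C * mu_target - A) / D
    gam = (B - A * mu_target) / D
    return [lam * sm + gam * si for sm, si in zip(s_inv_mu, s_inv_iota)]


mu_p = 0.12                               # arbitrary frontier portfolio (not the GMV)
w_p = frontier_weights(mu_p)
mu_z = (A * mu_p - B) / (C * mu_p - A)    # from lambda * mu_z + gamma = 0
w_z = frontier_weights(mu_z)

cov_with_p = matvec(sigma, w_p)           # sigma_ip for every asset i
var_p = dot(w_p, cov_with_p)
cov_zp = dot(w_z, cov_with_p)             # should be zero

betas = [c / var_p for c in cov_with_p]
implied_mu = [mu_z + b * (mu_p - mu_z) for b in betas]
print(cov_zp, implied_mu)
```

Nothing in the code refers to a market portfolio or to investor behavior, which is the point of Roll's critique: the relation holds for any frontier p.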
2.2.3 Multiperiod Portfolio Choice

In moving to a multiperiod setting the agent now considers future expected
consumption. Time subscripts are for indexing only. Other subscripts denote
partial derivatives.
$$ \max_{\{C,w\}} \sum_{t=1}^{T} E_t[U(C_t)] + E_t[B(W_T)] $$
Define the indirect utility function as
$$ J(W_t) \equiv \max_{\{C,w\}} \sum_{s=t}^{T} E_t[U(C_s)] + E_t[B(W_T)] $$
with $J(W_T) = B(W_T)$. At $T-1$, indirect utility is
$$ J(W_{T-1}) = \max\; U(C_{T-1}) + E_{T-1}[J(W_T)] $$
where $W_T = [W_{T-1} - C_{T-1}]\,[\sum_i w_i(R_i - R_f) + R_f]$. The first order conditions
are
$$ U_C - E_{T-1}[B_W R^*] = 0 \quad\text{and}\quad E_{T-1}[B_W(R_i - R_f)] = 0. $$
This generalizes to
$$ J(W_\tau) = \max\; U(C_\tau) + E_\tau[J(W_{\tau+1})] $$
with FOCs
$$ U_C = E_\tau[J_W R^*] \quad\text{and}\quad E_\tau[J_W(R_i - R_f)] = 0. $$
With log utility optimal consumption depends only on current wealth

and not on the investment opportunity set. Consumption is a specific pro-
portion of wealth and investors choose portfolios as in a single period setting
by equating the marginal utilities across assets. With power utility, opti-
mal consumption does depend on the investment opportunity set although
investment decisions are independent of consumption. With more general
HARA utility both consumption and portfolio choice depend on wealth.
2.3 Equilibrium Asset Pricing Theory

The equilibrium approach begins with agents’ preferences and maximizes
expected utility subject to budget constraints and market clearing condi-
tions. This approach has the advantage of internal consistency (no arbitrage
opportunities) and providing comparative statics. Models may be general
equilibrium or partial (e.g., take the riskless rate as given). A disadvantage
of these models is that they require taking a stand on preferences, and this
often involves a tradeoff between realism and tractability. The standard CAPM
is set in a single-period discrete world, whereas the ICAPM and CCAPM are
multi-period models in continuous time.
2.3.1 Utility Functions

Utility functions are the foundation of equilibrium asset pricing models.
Specifying a utility function determines the features of the agents'
preferences, which in turn affect how assets are priced in the economy. Here we
will discuss several important classes of utility functions in a nested
framework. Many of the commonly used functions are special cases of more general
specifications. This section also briefly discusses aggregation, representative
agents, and the implications for asset pricing.
One desirable feature is time-separability of a utility function. This means
that an agent's consumption today does not affect his consumption
preferences in the future (no hangovers):
$$ u(c_0, \ldots, c_T) = E_t\left[\sum_{\tau=0}^{T} \beta^{\tau} u_{\tau}(c_{\tau})\right] $$
This is a strong assumption, but it greatly simplifies much of the analysis.
Durability is one source of nonseparability. Models that relax the separability
assumption include habit persistence, “keeping up with the Joneses,” and the
Epstein-Zin class of recursive preferences models.
Risk-averse agents have utility functions that are concave in wealth (or
consumption). In this case, u[E(c)] ≥ E[u(c)] (by Jensen’s Inequality). It is
expected utility we care about. Concave utility functions mean the agent is
better off with a certain outcome than with a risky outcome that has the
same expected value.
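A small numerical sketch of Jensen's inequality for a concave utility function follows. It assumes log utility and a hypothetical 50/50 gamble over consumption of 50 or 150. The certainty equivalent, the sure amount that gives the same expected utility as the gamble, is introduced here only as an illustration.

```python
import math

# Hypothetical gamble: consumption of 50 or 150 with equal probability.
outcomes = [50.0, 150.0]
probs = [0.5, 0.5]

expected_c = sum(p * c for p, c in zip(probs, outcomes))

u_of_mean = math.log(expected_c)                                  # u[E(c)]
mean_of_u = sum(p * math.log(c) for p, c in zip(probs, outcomes))  # E[u(c)]

# Sure consumption level giving the same expected utility as the gamble.
certainty_equivalent = math.exp(mean_of_u)

print(u_of_mean, mean_of_u, certainty_equivalent)
```

The gap between `expected_c` and `certainty_equivalent` is the amount this agent would give up to avoid the risk.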
There are several measures of risk aversion. $R_A = -u''/u'$ is the Arrow-
Pratt measure of absolute risk aversion, which applies to small risks. The
larger is $R_A$, the larger the risk premium required to induce the agent to invest
in risky assets. CARA means the agent keeps a fixed dollar amount invested
in the risky asset as wealth changes. Models based on CARA preferences do
not have income effects. IARA implies the risky asset is an inferior good:
those with more wealth take less risk. This doesn’t make sense if one thinks
about a subsistence level of wealth.
To measure relative risk aversion, we use $R_R(C) = R_A(C)\,C$. This measure
describes proportional changes in the risky asset investment for changes
in wealth. The wealth elasticity of demand is unity for CRRA utility
functions and greater than one for DRRA functions. With CRRA, agents invest
a constant proportion of their wealth in the risky assets, whereas with DRRA
the fraction of wealth invested in the risky asset increases with initial wealth.
Table 2.1: Common Utility Functions: HARA
$$ u(W) = \frac{1-\gamma}{\gamma}\left(\frac{\alpha W}{1-\gamma} + b\right)^{\gamma} $$
Case            u =          γ      b    Features
Risk-neutral                 1
Quadratic                    2           IARA, M-V
Negative Exp.   −e^{−αW}     −∞     1    CARA = α
Power           W^γ/γ        < 1    0    CRRA = 1 − γ
Log             log(W)       0      0    CRRA = 1
The HARA (hyperbolic absolute risk aversion) family nests many commonly
used classes of utility functions. Table 2.1 summarizes the features of
common utility functions.
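The risk aversion entries in Table 2.1 can be checked numerically. The sketch below approximates the Arrow-Pratt measures with central finite differences. The parameter values (α = 2 for negative exponential utility, γ = −1 for power utility), the wealth levels, and the step size are arbitrary illustrative choices.

```python
import math


def absolute_risk_aversion(u, w, eps=1e-3):
    """Arrow-Pratt R_A = -u''/u', approximated by central finite differences."""
    u1 = (u(w + eps) - u(w - eps)) / (2 * eps)
    u2 = (u(w + eps) - 2 * u(w) + u(w - eps)) / eps ** 2
    return -u2 / u1


def relative_risk_aversion(u, w, eps=1e-3):
    """R_R(w) = R_A(w) * w."""
    return w * absolute_risk_aversion(u, w, eps)


alpha = 2.0    # hypothetical CARA coefficient
gamma = -1.0   # hypothetical power-utility exponent (must be < 1)


def neg_exp(w):
    return -math.exp(-alpha * w)


def power(w):
    return w ** gamma / gamma


# Negative exponential: absolute risk aversion equals alpha at every wealth level.
cara_low = absolute_risk_aversion(neg_exp, 1.0)
cara_high = absolute_risk_aversion(neg_exp, 3.0)

# Power and log utility: relative risk aversion is 1 - gamma and 1, respectively.
crra_power = relative_risk_aversion(power, 2.0)
crra_log = relative_risk_aversion(math.log, 2.0)

print(cara_low, cara_high, crra_power, crra_log)
```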
With a riskless asset, quadratic or HARA utility implies two-fund separation.
If there is no riskless asset, quadratic or CRRA utility provides this
result. With the exception of quadratic utility (which has its own undesirable
properties), restrictions on utility functions alone do not imply mean-variance
preferences, and therefore do not imply the CAPM.
Equilibrium models rely on the ability to aggregate over individuals in
the economy. A complete or effectively complete market guarantees the
existence of a representative agent. The representative agent’s utility function
is completely determined by individual agents’ preferences and wealths and
is independent of available assets only when all investors have HARA utility.
The risk aversion of the representative agent is the harmonic mean of
individual risk aversions, and will be less than or equal to the wealth-weighted
average. It is easier to establish the existence of a representative agent than
it is to aggregate demands. In many cases, however, we are interested in the
less difficult task of aggregating demand only at the equilibrium price.
2.3.2 CAPM Theory

Assumptions
• homogeneous expectations (distinguishes from portfolio theory)
• quadratic utility or multivariate normality of returns
• rational, risk-averse investors
• perfect capital markets
• unrestricted short selling (Black)
• borrow and lend at riskless rate (SL)
Derivation of Sharpe-Lintner Model
$$ L = \tfrac{1}{2} w'\Sigma w + \lambda[\mu_p - \mu_f - w'(\mu - \mu_f\iota)] $$
FOCs:
$$ \Sigma w = \lambda(\mu - \mu_f\iota) $$
$$ \mu_p - \mu_f = w'(\mu - \mu_f\iota) $$
Solving for λ,
$$ \lambda = w'\Sigma w\,[w'(\mu - \mu_f\iota)]^{-1} $$
so
$$ \mu - \mu_f\iota = \Sigma w/\lambda = \frac{\Sigma w}{w'\Sigma w}(\mu_p - \mu_f) = \beta(\mu_p - \mu_f). $$
Investors will only hold a combination of the riskfree asset and a tangency
portfolio. With homogeneous expectations the portfolio p must be the value-
weighted market portfolio M.
$$ \mu - \mu_f\iota = \beta(\mu_M - \mu_f) $$
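The Sharpe-Lintner relation can be verified on a small example. The sketch below assumes hypothetical expected returns, covariances, and a riskless rate of 3%. It builds the tangency portfolio by normalizing $\Sigma^{-1}(\mu - \mu_f\iota)$ so the risky weights sum to one (the normalization is a choice made here for convenience), and checks that every asset's excess return equals its beta times the portfolio's excess return.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def matvec(mat, v):
    return [dot(row, v) for row in mat]


# Hypothetical inputs.
mu = [0.06, 0.10, 0.14]
sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
mu_f = 0.03

excess = [m - mu_f for m in mu]

# The first FOC gives w proportional to Sigma^{-1}(mu - mu_f iota).
raw = solve(sigma, excess)
w_tan = [x / sum(raw) for x in raw]   # tangency portfolio, risky weights sum to one

mu_p = dot(w_tan, mu)
cov_with_p = matvec(sigma, w_tan)
var_p = dot(w_tan, cov_with_p)

beta = [c / var_p for c in cov_with_p]           # beta = Sigma w / (w' Sigma w)
implied_excess = [b * (mu_p - mu_f) for b in beta]

print(w_tan, implied_excess)
```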
Derivation of Black Model

Black’s (1972) CAPM adds one assumption to give the portfolio math results
economic content. With investor homogeneity, all investors will hold efficient
portfolios. Since the value-weighted market portfolio is a linear combination
of these efficient portfolios, it too is efficient. We can then rewrite (2.4) as
$$ \mu_i = \mu_z + \beta_i(\mu_M - \mu_z). $$
Alternatively, we can maximize expected return for a given portfolio variance.
$$ L = w'\mu + \mu_z(1 - \iota'w) + \lambda(\sigma^2 - w'\Sigma w) $$
gives FOCs:
$$ \sigma^2 = w'\Sigma w, \qquad 1 = \iota'w, \qquad \mu = \mu_z\iota + 2\lambda\Sigma w. $$
So
$$ w'\mu = \mu_z + 2\lambda\sigma^2. $$
For the market portfolio $2\lambda = (\mu_M - \mu_z)/\sigma_M^2$. For a generic asset,
$$ \mu_i = \mu_z + (\mu_M - \mu_z)\,\sigma_{iM}/\sigma_M^2 = \mu_z + \beta_i(\mu_M - \mu_z). $$
Interpretation
The assets that covary negatively with the market tend to pay off when the
market is doing poorly. These assets are valuable to investors in smoothing
their wealth. Since they are valuable, investors will pay a high price and
accept a low return. Thus, assets with low or negative betas will have low
(or possibly negative) expected returns. Higher risk aversion increases the
risk-return tradeoff. This is measured by the Sharpe ratio
$(E[r_M] - r_f)/\sigma_M$, the slope of the CML.
2.3.3 ICAPM Theory

The intertemporal capital asset pricing model and consumption capital asset
pricing model extend the standard CAPM intuition to a multi-period set-
ting. The ICAPM replaces dependence on quadratic utility/normal returns
with the assumption of a GBM process which implies normally distributed
returns. In the continuous time setting, higher moments do not matter, im-
proving tractability of the model. An advantage over the CAPM is utility
can be state-dependent, although the time-separability assumption remains.
With constant risk tolerance utility functions and constant investment op-
portunities, optimal portfolio choices are also constant. When the investment
opportunity set changes, so will portfolio allocations.
Merton’s (1973) ICAPM begins with the specification of asset price paths.
Demands are determined by investors maximizing current and expected future
utility, subject to their budget constraints. Preferences are instantaneously
state-independent and depend only on immediate consumption. The indirect
utility function, which is the maximized utility of future wealth, is
state-dependent. A collection of state variables serves as sufficient statistics
for summarizing the investment opportunities. Investors hedge against adverse
changes in the investment opportunity set, with the end goal being a hedge
against changes in consumption.
Assumptions
• limited liability
• perfect markets
• no restrictions on trading volume/short selling
• always in equilibrium
• borrow/lend at same rate
• continuous-time trading
• state variable has continuous sample path
• first 2 return moments exist, higher moments unimportant
• returns have a compact distribution
• time-separable preferences
• r̃i = αi dt + σi dzi
Under certain conditions, we have two-fund separation and the CAPM:

1. log utility (this means $J_{Wx} = 0$, investors do not want to hedge)
2. $\sigma_{ix} = 0\ \forall\, i$ (no hedge is possible)
The following derivation is for a single state variable x. The more general
case of a vector of state variables is similar.
Underlying Processes
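$$ \begin{aligned}
dW &= -C\,dt + [W - C\,dt]\,w'r, & dx &= \mu\,dt + s\,\tilde{\varepsilon}_x\sqrt{dt}, \\
E_t[dW] &= [W w'\alpha - C]\,dt, & E(dx) &= \mu\,dt, \\
\operatorname{var}(dW) &= W^2 w'\Sigma w\,dt, & \operatorname{var}(dx) &= s^2\,dt, \\
\operatorname{cov}(dx, r_i) &= \rho_{ix}\,s\,\sigma_i\,dt = \sigma_{ix}\,dt.
\end{aligned} $$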
Optimization Problem
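$$ J(W, x, t) = \max E_t\left[\int_t^{t+dt} U(C, s)\,ds + J(W + dW, x + dx, t + dt)\right] $$
$$ \begin{aligned}
J(W + dW, x + dx, t + dt) ={}& J(W, x, t) + J_t\,dt + J_W\,dW + J_x\,dx \\
&+ \tfrac{1}{2}J_{WW}(dW)^2 + \tfrac{1}{2}J_{xx}(dx)^2 + \tfrac{1}{2}J_{tt}(dt)^2 \\
&+ J_{Wx}\,dW\,dx + J_{tx}\,dt\,dx + J_{tW}\,dt\,dW + \phi
\end{aligned} \tag{2.5} $$
where φ contains higher-order terms.
$$ \begin{aligned}
E[J(\cdot,\cdot,\cdot)] ={}& J + J_t\,dt + J_W E[dW] + J_x E[dx] \\
&+ \tfrac{1}{2}J_{WW}\operatorname{var}(dW) + \tfrac{1}{2}J_{xx}\operatorname{var}(dx) + J_{Wx}\operatorname{cov}(dW, dx)
\end{aligned} \tag{2.6} $$
$$ \begin{aligned}
0 = \max_{\{C,w\}} \Big[ & U(C, t) + J_t + J_W(-C + W w'\alpha) + \frac{W^2}{2}J_{WW}\,w'\Sigma w \\
&+ J_x\mu + \tfrac{1}{2}J_{xx}s^2 + J_{Wx}W w'\sigma_{ix} \Big]\,dt
\end{aligned} \tag{2.7} $$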
FOCs (with portfolio constraint $\sum_{i=0}^{N} w_i\alpha_i = r_f + \sum_{i=1}^{N} w_i(\alpha_i - r_f)$):
$$ U_C = J_W \quad \text{(envelope condition)} $$
$$ W J_W(\alpha - r_f\iota) + W^2 J_{WW}\Sigma w + W J_{Wx}\sigma_{ix} = 0 $$
Now solve for optimal portfolio weights

$$ w^* = \frac{-J_W}{W J_{WW}}\Sigma^{-1}(\alpha - r_f\iota) + \frac{-J_{Wx}}{W J_{WW}}\Sigma^{-1}\sigma_{ix} \tag{2.8} $$
Define

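$$ D \equiv \frac{-J_W}{W J_{WW}}\,\iota'\Sigma^{-1}(\alpha - r_f\iota), \qquad H \equiv \frac{-J_{Wx}}{W J_{WW}}\,\iota'\Sigma^{-1}\sigma_{ix}, $$
$$ t = \frac{\Sigma^{-1}(\alpha - r_f\iota)}{\iota'\Sigma^{-1}(\alpha - r_f\iota)}, \qquad h = \frac{\Sigma^{-1}\sigma_{ix}}{\iota'\Sigma^{-1}\sigma_{ix}}. $$
Therefore $w^* = Dt + Hh$. Further, $\iota't = \iota'h = 1$ so t and h are portfolios.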
This gives three-fund separation, with the third fund being the riskless asset.
h is the “hedge portfolio,” and has the highest correlation with the state
variable x. This set up generalizes with a vector of state variables, in which
case we have dim(x) + 2-fund separation.
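The three-fund decomposition can be sketched numerically. In the fragment below the drifts, covariance matrix, covariances with the state variable, and the derivatives of the indirect utility function are all hypothetical numbers chosen for illustration; in the model the derivatives of J come from solving the investor's problem. The code forms t and h, checks that each is a portfolio, and confirms that Dt + Hh reproduces (2.8).

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Hypothetical inputs: drifts alpha, covariance matrix Sigma, riskless rate,
# and covariances sigma_ix of each asset with the state variable x.
alpha = [0.06, 0.10, 0.14]
sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
r_f = 0.03
sigma_ix = [0.002, -0.004, 0.010]

# Hypothetical wealth and derivatives of the indirect utility function J.
W, J_W, J_WW, J_Wx = 1.0, 1.0, -3.0, 0.5

excess = [a - r_f for a in alpha]
raw_t = solve(sigma, excess)      # Sigma^{-1}(alpha - r_f iota)
raw_h = solve(sigma, sigma_ix)    # Sigma^{-1} sigma_ix

t = [x / sum(raw_t) for x in raw_t]   # tangency-type fund
h = [x / sum(raw_h) for x in raw_h]   # hedge portfolio

D = -J_W / (W * J_WW) * sum(raw_t)
H = -J_Wx / (W * J_WW) * sum(raw_h)

w_star = [D * ti + H * hi for ti, hi in zip(t, h)]

# Direct evaluation of equation (2.8) for comparison.
w_direct = [-J_W / (W * J_WW) * a - J_Wx / (W * J_WW) * b
            for a, b in zip(raw_t, raw_h)]

riskless_weight = 1.0 - sum(w_star)   # the third fund
print(w_star, riskless_weight)
```

With these numbers the weight left for the riskless asset is 1 − D − H, which is the sense in which the riskless asset acts as the third fund.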
Equilibrium conditions:
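Define $a_k = -J_W/J_{WW}$ and $b_k = -J_{Wx}/J_{WW}$ where k indexes the investor.
Rewrite the second FOC as:
$$ J_W(\alpha - r_f\iota) + J_{WW}W\Sigma w^* + J_{Wx}\sigma_{ix} = 0 $$
$$ a_k(\alpha - r_f\iota) = W_k\Sigma w_k - b_k\sigma_{ix} $$
Sum over all investors and divide by $\sum_k a_k$:
$$ (\alpha - r_f\iota) = A\Sigma\mu - B\sigma_{ix} \quad\text{or}\quad (\alpha_i - r_f) = A\sigma_{im} - B\sigma_{ix} $$
where $A = \sum_k W_k / \sum_k a_k$, $B = \sum_k b_k / \sum_k a_k$, and $\mu = \sum_k w_k W_k / \sum_k W_k$
(average investment in each asset across investors). Now multiply by $\mu'$ and
$h'$ to get
$$ \alpha_m - r = A\sigma_m^2 - B\sigma_{mx}, \qquad \alpha_h - r = A\sigma_{hm} - B\sigma_{hx}. $$
Solving for A and B and substituting,
$$ \begin{aligned}
\alpha_i - r &= \frac{\sigma_{im}\sigma_{hx} - \sigma_{ix}\sigma_{mh}}{\sigma_m^2\sigma_{hx} - \sigma_{mx}\sigma_{mh}}(\alpha_m - r) + \frac{\sigma_{ix}\sigma_m^2 - \sigma_{im}\sigma_{mx}}{\sigma_m^2\sigma_{hx} - \sigma_{mx}\sigma_{mh}}(\alpha_h - r) \\
&= \beta_{im}(x)(\alpha_m - r) + \beta_{ih}(x)(\alpha_h - r)
\end{aligned} $$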

The βs have the interpretation of regression coefficients in an IV regression,
where x serves as an instrument for h. Note that
$$ \sigma_{ih} = \Sigma h = \frac{\Sigma\Sigma^{-1}\sigma_{ix}}{\iota'\Sigma^{-1}\sigma_{ix}} = \frac{\sigma_{ix}}{\iota'\Sigma^{-1}\sigma_{ix}}. $$
Therefore, $\sigma_{ix} = k\sigma_{ih}$. This trick generalizes to cov(j, x) = k cov(j, h) where
$k = \iota'\Sigma^{-1}\sigma_{ix}$. Terms depending on x can be factored from the betas so
$\beta_{im}(x) = \beta_{im}$ and $\beta_{ih}(x) = \beta_{ih}$.
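The factoring step can be written out explicitly. Applying cov(j, x) = k cov(j, h) to assets i, m, and h gives $\sigma_{ix} = k\sigma_{ih}$, $\sigma_{mx} = k\sigma_{mh}$, and $\sigma_{hx} = k\sigma_h^2$. Substituting these into the two coefficients, the common factor k cancels:

```latex
\begin{align*}
\beta_{im}(x) &= \frac{k\,\sigma_{im}\sigma_h^2 - k\,\sigma_{ih}\sigma_{mh}}
                      {k\,\sigma_m^2\sigma_h^2 - k\,\sigma_{mh}^2}
               = \frac{\sigma_{im}\sigma_h^2 - \sigma_{ih}\sigma_{mh}}
                      {\sigma_m^2\sigma_h^2 - \sigma_{mh}^2} = \beta_{im}, \\
\beta_{ih}(x) &= \frac{k\,\sigma_{ih}\sigma_m^2 - k\,\sigma_{im}\sigma_{mh}}
                      {k\,\sigma_m^2\sigma_h^2 - k\,\sigma_{mh}^2}
               = \frac{\sigma_{ih}\sigma_m^2 - \sigma_{im}\sigma_{mh}}
                      {\sigma_m^2\sigma_h^2 - \sigma_{mh}^2} = \beta_{ih}.
\end{align*}
```

The resulting expressions are the usual coefficients from a multiple regression of asset i's return on the returns of m and h.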
2.3.4 CCAPM Theory

The CCAPM, due to Breeden (1979), is very much like the ICAPM with
consumption growth as the single state variable. In the ICAPM investors
hedge against changes in the state variables because these represent changes
in the investment opportunity set, and therefore, changes in consumption.
The CCAPM goes directly to hedging against changes in consumption. The
model is also similar to the static CAPM, where end of period wealth mat-
tered. Since the CAPM is one period, end of period wealth is the same
as consumption. A key assumption in the CCAPM is additively separable
preferences, which gives state independence of direct utility.
To make more clear the link between the ICAPM and the CCAPM, note
that in the ICAPM agents set the marginal utility of wealth equal to the
marginal utility of consumption along the optimal consumption path. This
is the envelope condition, UC = JW . If markets are complete, then perfect
hedges for the state variables can be formed and all individuals will have per-
fectly (instantaneously) correlated consumption policies. This is an analogue
to all individuals holding the market portfolio in the static CAPM.
In many ways, the CCAPM is the most fundamental of the equilibrium
models. It is illogical to choose the CAPM or ICAPM because you think
the consumption-based model is wrong. The only reason for choosing an
alternative model is because the consumption data to test the model may be
unsatisfactory.
CCAPM Derivation
The combination of portfolios h and t which the investor chooses minimizes
the variance in consumption, not wealth.
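The CCAPM can be derived as a simple modification to the previous
derivation of the ICAPM. Since $U_C = J_W$ at the optimum, $J_{WW} = U_{CC}C^*_W$
and $J_{Wx} = U_{CC}C^*_x$. Substituting into (2.8),
\[
w^* = \frac{-U_C}{W U_{CC} C^*_W}\,\Sigma^{-1}(\boldsymbol{\alpha} - r_f\boldsymbol{\iota})
    + \frac{-U_{CC} C^*_x}{W U_{CC} C^*_W}\,\Sigma^{-1}\boldsymbol{\sigma}_{ix}
\]
or
\[
\frac{-U_C}{U_{CC}}(\boldsymbol{\alpha} - r_f\boldsymbol{\iota}) = W C^*_W \Sigma w^* + C^*_x \boldsymbol{\sigma}_{ix}.
\]
The covariance between the return on asset i and consumption growth is
\[
\mathrm{cov}\left(r, \frac{dC}{C}\right)
= E[(\alpha\,dt + \Sigma\,dz)(C_t\,dt + C_W\,dW + C_x\,dx + \phi)]
= \frac{W C_W}{C}\Sigma w + \frac{C_x}{C}\boldsymbol{\sigma}_{ix}
= \sigma^k_{iC}/C.
\]
Noting that this is different for each agent k and letting $T^k = -CU_C/U_{CC}$,
\[
T^k(\alpha_i - r_f) = \sigma^k_{iC}.
\]
Summing over all investors we get
\[
(\alpha_i - r_f) = T^{-1}\sigma_{iC}.
\]
Defining a reference portfolio C,
\[
\sigma_C^2 = w_C'\boldsymbol{\sigma}_{iC} = T(\alpha_C - r_f).
\]
Solving for T and substituting,
\[
(\alpha_i - r_f) = \frac{\sigma_{iC}}{\sigma_{pC}}(\alpha_p - r_f) = \beta_{iC}(\alpha_C - r_f).
\]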
Note that if the consumption portfolio is not itself a traded asset then the
portfolio with the maximum correlation with consumption can be used. The
same basic intuition applies, but this results in the same kind of instrumental
variable flavor as in the previous presentation of the ICAPM. If consumption
is available, it serves as the single variable driving the returns process. When
it is not available we include additional state variables to use as instruments.
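To make the consumption beta concrete, here is a minimal Python sketch that estimates the ratio of covariances with consumption growth from short return series and scales the reference portfolio's premium by it. The data, the reference-portfolio premium, and the function names are all hypothetical assumptions for illustration; sample covariances stand in for the instantaneous covariances in the derivation.

```python
def cov(x, y):
    """Sample covariance of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def consumption_beta(asset, ref_portfolio, cons_growth):
    """beta_iC = cov(r_i, dC/C) / cov(r_p, dC/C) for a reference portfolio p."""
    return cov(asset, cons_growth) / cov(ref_portfolio, cons_growth)


# Hypothetical data: consumption growth, a consumption-tracking portfolio,
# and one test asset.
cons_growth = [0.010, 0.004, -0.002, 0.008, 0.006]
ref_port = [0.030, 0.010, -0.020, 0.025, 0.015]
asset_i = [0.050, 0.012, -0.045, 0.040, 0.020]

beta_ic = consumption_beta(asset_i, ref_port, cons_growth)

premium_ref = 0.02                      # assumed alpha_C - r_f
premium_i = beta_ic * premium_ref       # implied alpha_i - r_f
```

By construction the reference portfolio has a consumption beta of one, and doubling an asset's returns doubles its beta.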
2.3.5 The CIR Model

Cox, Ingersoll, and Ross (1985) derive a general equilibrium model with
endogenous production and stochastic technology shocks. Distribution of
production depends on the state variables Y, which are changing randomly.
This model fills a void in the literature in that it endogenously determines
the equilibrium price path, given the specification of technology. Recall the
ICAPM begins with a specification of the price path then determines the
equilibrium demand.
Assumptions
• single physical good
• n production activities follow (2.9)
• k state variables follow (2.10)
• contingent claims for the single good, whose value follows (2.11)
• competitive markets
• endogenously determined instantaneous borrowing/lending rate r
• fixed number of identical individuals who maximize $E\int_t^{t'} U[C(s), Y(s), s]\,ds$
• continuous investing and trading with no transactions costs
• there exists a unique J and v̂
• (technical) v ∈ V is the class of admissible controls
• (technical) J, a∗ and C ∗ are sufficiently differentiable.
Underlying Processes
n Production Activities
\[ d\eta(t) = I_\eta\,\alpha(Y,t)\,dt + I_\eta\,G(Y,t)\,dw(t) \tag{2.9} \]
k State Variables
\[ dY(t) = \mu(Y,t)\,dt + S(Y,t)\,dw(t) \tag{2.10} \]
Value of Contingent Claim i
\[ dF^i = (F^i\beta_i - \delta_i)\,dt + F^i h_i\,dw(t) \tag{2.11} \]
Derivation
Budget constraint
\[
dW = \left[\sum_{i=1}^{n} a_i W(\alpha_i - r) + \sum_{i=1}^{k} b_i W(\beta_i - r) + rW - C\right]dt
   + \sum_{i=1}^{n} a_i W\left(\sum_{j=1}^{n+k} g_{ij}\,dw_j\right)
   + \sum_{i=1}^{k} b_i W\left(\sum_{j=1}^{n+k} h_{ij}\,dw_j\right) \tag{2.12}
\]
or
\[
dW = W\mu(W)\,dt + W\sum_{j=1}^{n+k} q_j\,dw_j
\]
Let $K(v(t), W(t), Y(t), t) \equiv E_{W,Y,t}\left[\int_t^{t'} U(v(s), Y(s), s)\,ds\right]$ and define $L^v(t)$
as the differential operator
\[
L^v(t)K = \mu(W)W K_W + \sum_{i=1}^{k}\mu_i K_{Y_i} + \frac{1}{2}W^2 K_{WW}\sum_{i=1}^{n+k} q_i^2
        + \sum_{i=1}^{k}\sum_{j=1}^{n+k} W K_{WY_i}\, q_j s_{ij}
        + \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k}\sum_{m=1}^{n+k} K_{Y_iY_j}\, s_{im}s_{jm} \tag{2.13}
\]
Let the indirect utility function J(W, Y, t) be the solution to
\[
\max_{v\in V}\,[L^v(t)J + U(v, Y, t)] + J_t = 0.
\]
J has many of the same properties as U , such as being increasing and strictly
concave in W .
Defining $\Psi = L^vJ + U$, we get the following necessary and sufficient
conditions:
• $\Psi_C = U_C - J_W \le 0$
• $C\Psi_C = 0$
• $\Psi_a = [\alpha - r]WJ_W + [GG'a + GH'b]W^2J_{WW} + GS'WJ_{WY} \le 0$
• $a'\Psi_a = 0$
• $\Psi_b = [\beta - r]WJ_W + [HG'a + HH'b]W^2J_{WW} + HS'WJ_{WY} = 0$
Solving for $\hat{C}$, $\hat{a}$, $\hat{b}$, we obtain a PDE for J. The equilibrium satisfies these
conditions and markets clear: $\sum a_i = 1$ and $b_i = 0\ \forall\, i$.
Characterizations
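The expected rate of return on wealth is $a^{*\prime}\alpha$. r is the negative of the expected
rate of change in the marginal utility (MU) of wealth, or $a^{*\prime}\alpha$ plus the covariance
between the rate of return on wealth and the rate of change in the MU of wealth.
\[
r = -E\left[\frac{J_{WW}}{J_W}\right]
  = E\left[\frac{dW}{W}\right] + \mathrm{cov}\left(\frac{dW}{W},\, -\frac{J_{WW}}{J_W}\right)
\]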
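The expected rate of return on the ith contingent claim is
\[
(\beta_i - r)F^i = [\phi_W \;\; \phi_Y'\,]\,[F^i_W \;\; F^{i\prime}_Y\,]'
\]
where
\[
\phi_W = \left(-\frac{J_{WW}}{J_W}\right)\mathrm{var}(W)
       + \sum_{i=1}^{k}\left(-\frac{J_{WY_i}}{J_W}\right)\mathrm{cov}(W, Y_i)
       = (a^{*\prime}\alpha - r)W
\]
and
\[
\phi_{Y_i} = \left(-\frac{J_{WW}}{J_W}\right)\mathrm{cov}(W, Y_i)
           + \sum_{j=1}^{k}\left(-\frac{J_{WY_j}}{J_W}\right)\mathrm{cov}(Y_i, Y_j)
\]
Alternatively, we can write
\[
\beta_i = r - \mathrm{cov}(F^i, J_W)/(F^i J_W)
\]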
The expected return on a contingent claim is the riskfree rate plus a linear
combination of the first partials of the asset price with respect to W and Y.
The weights are the φ coefficients, which are much like factor risk premiums
in the APT or hedge portfolios in the ICAPM. The φs do not depend on the
contingent claim itself and are the same for all claims.
If U is not state-dependent, we get a CCAPM-type result, with
$\phi_W = -\frac{u''}{u'}\mathrm{cov}(C^*, W)$ and $\phi_Y = -\frac{u''}{u'}\mathrm{cov}(C^*, Y)$, giving
$(\beta_i - r)F^i = -\frac{u''}{u'}\mathrm{cov}(C^*, F^i)$.
The expected excess return on an asset is proportional to its covariance with
optimal consumption. We can then express relative rates of return in a way
that does not depend (explicitly) on preferences.
Fundamental Valuation Equation
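\[
\frac{1}{2}\mathrm{var}(W)F_{WW} + \sum_i \mathrm{cov}(W, Y_i)F_{WY_i}
+ \frac{1}{2}\sum_i\sum_j \mathrm{cov}(Y_i, Y_j)F_{Y_iY_j}
\]
\[
+ \sum_i F_{Y_i}\left[\mu_i - \left(-\frac{J_{WW}}{J_W}\right)\mathrm{cov}(W, Y_i)
  - \sum_j\left(-\frac{J_{WY_j}}{J_W}\right)\mathrm{cov}(Y_i, Y_j)\right]
\]
\[
+ [rW - C^*]F_W + F_t - rF + \delta(W, Y, t) = 0 \tag{2.14}
\]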
where r and C ∗ are functions of W, Y, and t. This PDE holds for any contin-
gent claim, with boundary conditions and δ depending on the terms of the
claim. The PDE can price assets with payoffs (i) contingent on crossing a
barrier, (ii) contingent on not crossing a barrier, and/or (iii) flow payoffs.
We can focus on the system of equations:
\[ dW(t) = [a^{*\prime}\alpha W - C^*]\,dt + a^{*\prime}GW\,dw(t) \]
\[ dY(t) = \mu(Y,t)\,dt + S(Y,t)\,dw(t) \]
or a second system with a different drift term reflecting a change of measure:
\[ dW(t) = [a^{*\prime}\alpha W - C^* - \phi_W]\,dt + a^{*\prime}GW\,dw(t) \]
\[ dY(t) = [\mu(Y,t) - \phi_Y']\,dt + S(Y,t)\,dw(t) \]
The expression $J_W(W(s), Y(s), s)\,/\,J_W(W(t), Y(t), t)$ is the conditional pricing kernel.
2.4 Arbitrage Asset Pricing

Arbitrage pricing takes a set of basis assets as given and uses them to price
other assets.
2.4.1 State Contingent Claims

State contingent claims, or Arrow-Debreu securities, are the building blocks
for all assets. These securities pay one dollar in a specified state and zero
otherwise. Ross (1977b) shows the absence of arbitrage implies the existence
of state contingent prices and, therefore, of a linear pricing operator. This is
really just a spanning result. We can write $p(x) = \sum_s \phi(s)x(s)$. This says the
price of security x is the sum over all states of the price of a dollar in each
state φ(s) scaled by the size of the payoff in each state x(s). Harrison and
Kreps (1979) extend this to show that this operator can be represented as
an expectation with respect to a martingale measure.
Let D denote an (n × n) matrix of asset payoffs with typical element $d_{ij}$,
where i denotes the state and j the security. This matrix is a collection of
vectors $d_j$ of asset payoffs. α is an n-vector of weights, b an n-vector of
payoffs. φ is the price vector for the n Arrow-Debreu securities and p the
prices of the complex securities. We have the following pricing relations
\[ D'\phi = p \quad\text{and}\quad D\alpha = b \]
with $\iota'\phi = \frac{1}{1+r_f} = p_f$ and $(1 + r_f)\iota'\phi = \iota'\pi = 1$. $\pi = f(\theta, \lambda)$ is the vector of
risk-neutral probabilities, a function of the true probabilities θ and risk aversion λ.
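The pricing relations above can be illustrated with a small Python sketch. It takes a vector of state prices as given, prices three securities via p = D'φ, and backs out the riskless rate and risk-neutral probabilities. The three-state economy, the payoff matrix, and the state prices are hypothetical numbers chosen for illustration.

```python
# Hypothetical three-state, three-security economy.
phi = [0.30, 0.40, 0.25]            # Arrow-Debreu (state) prices phi(s)

# Payoff matrix D: rows are states i, columns are securities j.
# Column 0 is a riskless bond, columns 1 and 2 are risky assets.
D = [
    [1.0, 0.5, 2.0],
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 0.0],
]


def price_securities(D, phi):
    """Return p = D' phi, i.e. p_j = sum over states i of d_ij * phi_i."""
    n_states, n_secs = len(D), len(D[0])
    return [sum(D[i][j] * phi[i] for i in range(n_states))
            for j in range(n_secs)]


p = price_securities(D, phi)        # prices of the complex securities
p_f = sum(phi)                      # iota' phi: price of a sure dollar
r_f = 1.0 / p_f - 1.0               # implied riskless rate
pi_rn = [(1.0 + r_f) * f for f in phi]   # risk-neutral probabilities

# Risk-neutral valuation: discounted expected payoff under pi_rn.
p_check = [sum(pi_rn[i] * D[i][j] for i in range(3)) / (1.0 + r_f)
           for j in range(3)]
```

The bond price equals the sum of the state prices, and the risk-neutral probabilities sum to one, as the relation (1 + r_f)ι'φ = 1 requires.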
2.4.2 Arbitrage Pricing Theory

The APT, originally developed by Ross (1976), has generated a tremendous
literature of theoretical extensions and a wide range of empirical tests. The
intuition is simple. Assume returns follow a factor-model, meaning returns
depend on the realization of factors and (quasi-) orthogonal shocks.[1] The
factors are not diversifiable, whereas the orthogonal shocks are in some sense.
The theory is silent on what the factors are, or even the number of factors.
A key idea is the factor-mimicking portfolio.
There are really three different cases of the APT, depending on the as-
sumptions about the structure of the Ω matrix of “idiosyncratic” covariances.
If we have an exact or noiseless factor model, then Ω is the zero matrix and
an exact arbitrage argument will hold. Alternatively, we could have a strict
factor model in which the matrix is diagonal so there is no correlation across
assets. Large diversified portfolios cause the idiosyncratic variance to go to
zero. We appeal to an asymptotic arbitrage argument in which there is no
arbitrage on average, although specific securities may be mispriced. Finally,
we could allow for a more general correlation structure where Ω may con-
tain non-zero off-diagonal elements. This approximate factor model allows
for idiosyncratic correlations (e.g., industries) and requires restrictions on
the covariances of returns such that the idiosyncratic part is diversified away
while the factors remain. The controversy over the structure of Ω has major
implications for the testability of the model.
The APT has a flavor very similar to the ICAPM, although it arises
from a different viewpoint. In the end, both models specify expected returns
as a function of a linear combination of their covariances with variables
(factors and state variables, respectively). This link arises because it is implied
by the absence of arbitrage. The additional assumptions in the equilibrium
model serve to determine the risk premium associated with each state
variable.[2]
The model has been extended in a number of other ways, including dynamic,
conditional, nonlinear, and international versions. Tests of the model have
also followed several paths, broadly categorized as cross-sectional or time
series. In general, tests reject the model but find it provides more favorable
performance than models like the CAPM.

[1] By quasi-orthogonal shocks I mean that some correlation among the
residuals is allowed.
[2] Actually, models such as the CAPM are partial equilibrium models and take
the riskless rate and market price of risk as given. Richer models such as CIR
introduce production uncertainty and are able to more completely characterize
the economy.
APT Derivation
This derivation is based on the strict factor version. The exact APT deriva-
tion will also work under this approach. Modifications for the approximate
APT are mentioned at the end. It is very important to understand that the
APT starts with a characterization of realized returns r, and uses statistical
properties to say something about expected returns µ.
\[ r_t = \mu_t + \nu_t = \mu_t + Bf_t + u_t \tag{2.15} \]
\[ E[r_t] = \mu_t, \qquad f_t \sim N(0, I), \qquad u_t \sim N(0, \Omega) \]
where Ω is diagonal. $f_t$ is a factor vector and B a loading matrix, which
together give the unexpected factor-related return. Return covariances are
\[ E[\nu_t\nu_t'] = B\,E[f_tf_t']\,B' + \Omega = BB' + \Omega = \Psi. \]
As an aside, define Φ such that ΦΦ′ = I, giving B = DΦ, a rotation. Therefore,
Ψ = DD′ + Ω, illustrating the rotational indeterminacy.
Next form a portfolio with weights w. The portfolio variance is
\[ \sigma_p^2 = w'\Psi w = w'BB'w + w'\Omega w \approx w'BB'w. \]
The strategy is to choose w such that w′BB′w = 0 without making an
investment, ι′w = 0. To find a w think of this as a regression of µ on [ι B].
This is
\[ \mu = \lambda_0\iota + B\lambda + w. \tag{2.16} \]
The normal equations from the regression give ι′w = 0 and B′w = 0, which
implies w′BB′w = 0 as desired.
To find w′r, insert (2.16) into (2.15) to get
\[ r_p = w'(\lambda_0\iota + B\lambda + w) + w'Bf + w'u. \]
Taking expectations and using the orthogonality conditions, $\mu_p = w'w$. This
validates (2.16), which can be written as
\[ \mu_t \approx \lambda_0\iota + B\lambda. \tag{2.17} \]
If a factor is negatively correlated with the IMRS the model implies a positive
risk premium.
Using $w^N$ in (2.16), where N indexes the number of assets, a sequence of
arbitrage portfolios satisfies the Ross pricing bound if $w^{N\prime}w^N$ does not go to
infinity with N. The approximate factor model is derived by requiring that
as N → ∞ the smallest eigenvalue of B′B → ∞ while the largest eigenvalue
of Ω → 0. That is, the factors are pervasive while the idiosyncratic part is
diversifiable.
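The regression step in the derivation can be sketched in a few lines of Python. The example below assumes a single factor, so the regression of µ on [ι B] reduces to simple OLS on a constant and one loading vector; the expected returns and loadings are hypothetical. The residual vector w is then a zero-investment, zero-factor-exposure portfolio whose expected return equals w′w.

```python
def ols_one_factor(mu, b):
    """Regress mu on a constant and one loading vector b.

    Returns (lambda_0, lambda, w) where w is the residual vector.
    """
    n = len(mu)
    mean_b, mean_mu = sum(b) / n, sum(mu) / n
    s_bb = sum((x - mean_b) ** 2 for x in b)
    s_bm = sum((x - mean_b) * (y - mean_mu) for x, y in zip(b, mu))
    lam = s_bm / s_bb
    lam0 = mean_mu - lam * mean_b
    w = [y - lam0 - lam * x for x, y in zip(b, mu)]
    return lam0, lam, w


# Hypothetical expected returns and factor loadings for five assets.
mu = [0.08, 0.11, 0.13, 0.07, 0.15]
b = [0.6, 1.0, 1.3, 0.5, 1.5]

lam0, lam, w = ols_one_factor(mu, b)

net_investment = sum(w)                                  # iota' w
factor_exposure = sum(wi * bi for wi, bi in zip(w, b))   # B' w
expected_return = sum(wi * mi for wi, mi in zip(w, mu))  # w' mu
pricing_error_norm = sum(wi * wi for wi in w)            # w' w
```

With more than one factor the same orthogonality conditions hold, but the normal equations would need a general linear solver rather than the closed-form slope used here.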
2.5 Pricing Kernel Approach

The pricing kernel approach is in many ways a hybrid of the equilibrium and
arbitrage approaches. The focus is to specify the pricing kernel[3] m which
makes the Euler equation hold:
\[ p_t = E_t[m_{t+\tau}x_{t+\tau}] \tag{2.18} \]
This seemingly simple expression is complex enough to cover pricing for any
asset. The expression can be modified to handle returns, excess returns,
stocks, bonds, options, etc. The meaning of the payoff x and the price
change, but the same intuition applies.
The expected return on an asset is negatively related to its covariance with
the stochastic discount factor. Assets whose returns vary positively with the
sdf pay off when the marginal utility is high. That is, they provide wealth
in the states when it is most valuable to investors. Consequently, investors
are willing to pay high prices and accept low returns for these assets.
There are basically two ways of doing business. One is to take the IMRS as
given and interpret (2.18) as the Euler equation arising from the consumer’s
optimization problem. The goal would then be to explain asset returns. The
other view is to take the returns as given and explore the implications for m.

[3] This object lives by many names, including the stochastic discount factor
(sdf), intertemporal marginal rate of substitution (IMRS), or benchmark pricing
variable. It is incorrectly referred to as the Radon-Nikodym derivative,
Arrow-Debreu price, or state-contingent claims price (unless the riskless rate
is zero). While on naming conventions, the risk-neutral probability measure is
also referred to as the equivalent martingale measure (EMM).
The characteristics of m depend upon the structure of the economy. If
the law of one price (LOP) is satisfied, there will exist (at least one) m such
that (2.18) holds. In the absence of arbitrage (NA), m is strictly positive. If
markets are complete then m is unique.
2.5.1 Basics
This presentation is for a discrete time, multiperiod model. Define the
consumption set $c \in B(e_i, p) \subset R \times X$. The budget constraints are
c(0) = e(0) − θ′p and c(T, ω) = e(T, ω) + θ′d(ω). Combining these two equations,
D̂θ = ĉ − ê. The attainable set D̂θ = ĉ ignores the initial endowment. I
will abuse notation and consistency by letting Q and π* refer to the EMM.
The latter is more appropriate for discrete settings. Also, dividend (payoff)
vectors and matrices are indicated by d and D.
Definition 1 The market is complete iff every consumption process is
attainable (M = X), or iff rank(D) = k.
Definition 2 An arbitrage strategy has non-negative, non-zero consumption
with e(0) = 0; D̂θ ≥+ 0.
Definition 3 An Equivalent Martingale Measure Q (or π*) satisfies
p = D′π*/R_f.
Q exists iff there is no arbitrage, or iff an equilibrium exists. If markets
are complete then Q is unique.
Definition 4 A price functional Φ : R × M → R (Π : M → R) satisfies
Φ(c) = c(0) + Π(c(T)) = c(0) + θ′p for any θ such that c(T) = θ′d.
This implies B(e, p) can be expressed as Φ(nc) = 0 where nc(t) ≡ c(t) −
e(t) ∈ M.
Π is unique even in an incomplete market and exists if there is an
equilibrium. A price system is viable iff there is no arbitrage, iff Q exists,
or iff Φ (or Π) exists.
Definition 5 Ψ : X → R is an extension of Π if for all x ∈ M, Ψ(x) = Π(x).
A sequence of scaled prices is a Q-martingale.
2.5.2 Different Expectations

Denote the price of asset x, a package of state-contingent claims, as p(x).
Then
\[
p(x) = \sum_s \phi(s)x(s) = \sum_s \pi(s)\frac{\phi(s)}{\pi(s)}x(s) = E^P[mx]
\]
where π(s) is the (true) probability of state s. It follows then that
m(s) = φ(s)/π(s). To move to risk-neutral probabilities π*, define
\[
\pi^*(s) \equiv R_f\,m(s)\pi(s) = R_f\,\phi(s),
\]
where $1/R_f = \sum_s \phi(s) = E[m]$. Then
\[
p(x) = \sum_s \phi(s)x(s) = \sum_s \frac{\pi^*(s)}{R_f}x(s) = \frac{E^Q[x]}{R_f}.
\]
These results imply
\[
p(x) = E^Q[x]/R_f = E^P[mx].
\]
Stated differently
\[
\frac{\pi^*(s)}{\pi(s)} = R_f\,m(s).
\]
The risk neutral probabilities give greater weight to states with high marginal
utility, the “bad” states. In discrete time, the “change of measure” is, state by
state,
\[
\frac{\pi^*}{\pi} = R_f\,m = \frac{Q}{P}
\]
In continuous time the analogous expression is
\[
\frac{dQ}{dP} = \lim_{n\to\infty}\frac{f_n^Q(x_1, \ldots, x_n)}{f_n^P(x_1, \ldots, x_n)}
\]
where $f_n()$ represents the joint likelihood under the respective measure. This
expression is the Radon-Nikodym derivative, and is the limit of the likelihood
ratios. This random variable satisfies
\[
E^Q(x_T) = E^P\left[\frac{dQ}{dP}\,x_T\right].
\]
2.5.3 Asset Pricing with m

This analysis is useful in pricing assets. For a collection of assets in an
economy, the price is the risk-neutral expectation of the future value,
discounted back to the present at the riskless rate
\[ p = D'\pi^*/R_f. \]
If the market is complete, Q is unique (π* is identifiable in a discrete setting)
and we can invert the payoff matrix to solve for the probabilities
\[ \pi^* = R_f\,(D')^{-1}p. \]
If the market is not complete it is often possible to get a range of admissible
EMMs. Further restrictions may result from imposing the NA condition that
the pricing kernel be positive.
Recall that dividing by the riskless rate will give the Arrow-Debreu prices
$\phi = \pi^*/R_f = (D')^{-1}p$. Furthermore, the pricing kernel is
\[ m = p_f\,\pi^*/\pi = (D')^{-1}p\,/\,\pi, \]
with the division by π taken state by state.
Once the EMM or pricing kernel is known, it can be used to price any
other asset.
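The inversion described above can be sketched as follows. The example assumes a complete three-state market with hypothetical payoffs, prices, and true probabilities; it solves D'φ = p with a small Gauss-Jordan routine (written out so the block has no dependencies), recovers π* and m, and then prices a new claim both ways.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Hypothetical complete market: rows of D are states, columns are securities.
D = [
    [1.0, 0.5, 2.0],
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 0.0],
]
p = [0.95, 1.05, 1.00]              # observed security prices (assumed)
pi_true = [0.25, 0.50, 0.25]        # true state probabilities (assumed)

D_t = [list(col) for col in zip(*D)]           # D'
phi = solve(D_t, p)                            # Arrow-Debreu prices
R_f = 1.0 / sum(phi)                           # gross riskless return
pi_star = [R_f * f for f in phi]               # risk-neutral probabilities
m = [f / q for f, q in zip(phi, pi_true)]      # pricing kernel, state by state

# Price a new claim: a call on security 1 with strike 1.0.
call = [max(D[s][1] - 1.0, 0.0) for s in range(3)]
price_via_m = sum(q * ms * x for q, ms, x in zip(pi_true, m, call))
price_via_q = sum(qs * x for qs, x in zip(pi_star, call)) / R_f
```

Both valuations give the same number because they are two representations of the same state prices. In an incomplete market D' is not invertible and this direct approach would not apply.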
2.5.4 The Agent’s Problem

There is a relationship between the pricing kernel and equilibrium approaches.
The agent will
\[
\max_{\{c,\,c(s)\}}\; u(c) + \sum_s \beta\pi(s)u[c(s)]
\quad\text{s.t.}\quad c + \sum_s \phi(s)c(s) = y + \sum_s \phi(s)y(s).
\]
FOCs are
\[
u'(c) = \lambda, \qquad \beta\pi(s)u'[c(s)] = \lambda\phi(s)
\]
Solving,
\[
\phi(s) = \beta\pi(s)\frac{u'[c(s)]}{u'(c)}
\]
or
\[
m(s) = \frac{\phi(s)}{\pi(s)} = \beta\frac{u'[c(s)]}{u'(c)}.
\]
Thus $m(s_1)/m(s_2) = u'[c(s_1)]/u'[c(s_2)]$, so m gives the marginal rate of
substitution between date and state contingent claims. In equilibrium, marginal
utility growth should be the same for all consumers
\[
\beta_i\frac{u'(c_{i,t+1})}{u'(c_{i,t})} = \beta_j\frac{u'(c_{j,t+1})}{u'(c_{j,t})}.
\]
Hence m is referred to as the IMRS. Taking the expectation of either m or
IMRS gives the price of a riskless bond.
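As an illustration of the first-order conditions, the sketch below assumes power (CRRA) utility, so that u'(c) = c^(-gamma) and m(s) = beta * (c(s)/c)^(-gamma). The utility specification, the parameter values, and the consumption outcomes are assumptions made for this example only; the notes themselves leave u general.

```python
def crra_kernel(beta, gamma, c_now, c_states):
    """m(s) = beta * u'(c(s)) / u'(c) with u'(c) = c ** (-gamma)."""
    return [beta * (cs / c_now) ** (-gamma) for cs in c_states]


# Hypothetical inputs.
pi = [0.25, 0.50, 0.25]             # true state probabilities
c_now = 1.0                         # consumption today
c_states = [0.95, 1.02, 1.08]       # consumption in each future state
beta, gamma = 0.98, 2.0             # time preference, risk aversion

m = crra_kernel(beta, gamma, c_now, c_states)
phi = [p * ms for p, ms in zip(pi, m)]      # state prices phi(s) = pi(s) m(s)
bond_price = sum(phi)                       # E[m], price of a riskless bond

# Marginal rate of substitution between states 1 and 3.
mrs_13 = m[0] / m[2]
```

The kernel is highest in the low-consumption state, which is the sense in which state prices overweight "bad" states relative to their probabilities.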
2.5.5 The Main Results

Using the definition of covariance and (2.18)
\[ 1 = E[mR] = E[m]E[R] + \mathrm{cov}(m, R) \tag{2.19} \]
\[ E[R] = \frac{1}{E[m]} - \frac{\mathrm{cov}(m, R)}{E[m]} \tag{2.20} \]
It follows immediately that if there is a riskless asset $R_f = 1/E[m]$, or
$p_f = E[m]$. Without a riskless asset, we can view 1/E[m] as a “shadow” riskfree
rate, or a zero beta return. Note that the expectations have been under the
true probability measure P.
Using the above results,
\[
E[R_i] = R_f + \frac{\mathrm{cov}(m, R_i)}{\mathrm{var}(m)}\left(-\frac{\mathrm{var}(m)}{E[m]}\right)
       = R_f + \beta_{i,m}\lambda_m
\]
which is a beta pricing model.
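The beta representation can be verified numerically in a discrete-state setting. The sketch below uses a hypothetical three-state pricing kernel and payoff vector, prices the payoff with p = E[mx], and checks that the resulting expected gross return equals R_f + beta * lambda_m. All numbers and helper names are illustrative assumptions.

```python
def expect(pi, z):
    """Expectation of z under probabilities pi."""
    return sum(p * v for p, v in zip(pi, z))


def covar(pi, a, b):
    """Covariance of a and b under probabilities pi."""
    ab = [x * y for x, y in zip(a, b)]
    return expect(pi, ab) - expect(pi, a) * expect(pi, b)


# Hypothetical states, kernel, and payoff.
pi = [0.25, 0.50, 0.25]
m = [1.2, 0.8, 1.0]
x = [0.5, 1.0, 2.0]

price = expect(pi, [ms * xs for ms, xs in zip(m, x)])   # p = E[mx]
R = [xs / price for xs in x]                            # gross returns

R_f = 1.0 / expect(pi, m)                               # 1 / E[m]
beta_im = covar(pi, m, R) / covar(pi, m, m)             # cov(m,R)/var(m)
lambda_m = -covar(pi, m, m) / expect(pi, m)             # -var(m)/E[m]

expected_return = expect(pi, R)
beta_model_return = R_f + beta_im * lambda_m
```

Because lambda_m is negative, an asset that covaries positively with m earns less than the riskless rate, consistent with the discussion of the stochastic discount factor earlier in the section.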
Relation between m, β models, and MV frontier

• p = E[mx] ⇒ β: m, x*, or R* can serve as reference variables. If
m = b′f, then f, proj(f|X), or proj(f|R) can be used.
• p = E[mx] ⇒ mean-variance frontier which includes R*
• β ⇒ p = E[mx]: m = b′f
• MV frontier ⇒ p = E[mx]: m = a + bR_mv
• MV frontier ⇒ β model with R_mv as a reference variable.

Since mean-variance efficiency implies a single beta representation, some
single beta representation can always be found. The asset pricing model says
that a particular portfolio (e.g., the market) will be mean-variance efficient.
In other words, the content of a model comes from m = f(·), not p = E[mx].
Also, given any multi-factor or multi-beta representation, we can always
find a single beta representation. The relationship between the ICAPM and
CCAPM is an example of this.

Table 2.2: Common Pricing Kernels
Model            m_{t+1}
CAPM             a + b R_{W,t+1}
ICAPM            a + \sum_{k=1}^{K} b_k f_{k,t+1}
CCAPM            \beta u'(c_{t+1}) / u'(c_t)
APT              b'f
Black-Scholes    \exp[-(r + \frac{1}{2}\sigma^2)\tau + \sigma dZ]
m as a Portfolio
The portfolio that maximizes squared correlation with m is a minimum vari-
ance portfolio. m∗ , the projection, also prices assets and can replace m.
p = E[mx] = E[(m∗ + ε)x] = E[m∗ x]
2.5.6 Hansen-Jagannathan Bounds

The Hansen and Jagannathan (1991) bounds are an important addition to
asset pricing. Instead of a binary reject/fail to reject result, the HJ bounds
offer some insights as to why the model may be rejected. The bounds are most
useful for testing models like the consumption model where m is explicitly
specified. They are useless for evaluating factor models that do not
specify the factors since there are always some factor-mimicking portfolios
that will work ex post.
Working with excess returns, $E[mr^e] = 0$, so
$E[m]E[r_i^e] = -\mathrm{cov}(m, r_i^e) = -\rho_{m,r_i}\sigma_m\sigma_{r_i}$. Since |ρ| ≤ 1,
\[
\frac{\sigma_m}{E[m]} \ge \frac{|E[r_i^e]|}{\sigma_{r_i}} \tag{2.21}
\]
This holds for any asset i, including r*, the return with the maximum Sharpe
ratio. To be clear, the maximal Sharpe ratio measures the excess return on the
tangency portfolio r* relative to its standard deviation (assuming a one-factor
world). Both the excess return on the tangent portfolio and the SR depend on $R_f$.
Rewriting as $\sigma_m = E[m]\,SR$, the H-J bound is a function of E[m]. As
we change E[m], we get a new $R_f$, a new tangency portfolio, and a new
Sharpe ratio. Plotting $\sigma_m$ as a function of E[m] gives us the locus of points
comprising the H-J bound. Note that if we know $R_f$, the bound is just
a point. These results are based on the law of one price (LOP), and do not
use the no arbitrage (NA) restriction that m > 0.
By imposing the NA restriction we can sharpen the bound given in (2.21).
The NA bound is very similar to the LOP bound for moderate values of
E[m], but as E[m] becomes more extreme (higher SR), the NA bound is
much stricter (higher). For payoffs x and Lagrange multipliers λ and δ,
\[ m^+ = [\lambda + \delta'x]^+ \]
subject to $E[m^+] = E[m]$ and $E[m^+x] = p$. This nonlinear problem can
generally be solved numerically. $m^+$ has the interpretation of a call option
with zero strike price on a portfolio of payoffs [1 x′]′.
The H-J bound analysis has been extended in several ways. Snow (1991)
generalizes the model to include any moment of m. In this setting the bounds
are more sensitive to outliers. Other extensions include incorporating trans-
actions costs, utilizing cross-moments, and analyzing pricing errors as a way
to detect specification errors. One example is adding different sets of assets
and seeing how much the bound shifts up.
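The locus of the LOP bound described above can be traced with a short Python sketch. It is restricted to a single risky gross return R, for which 1 = E[mR] and |ρ| ≤ 1 imply sigma_m >= |1 - E[m]E[R]| / sigma_R; this equals E[m] times the Sharpe ratio measured against R_f = 1/E[m]. The return moments and the grid of E[m] values are hypothetical, and the multi-asset and NA versions of the bound are not implemented here.

```python
def hj_bound(mean_m, mean_R, sd_R):
    """Single-asset LOP bound on sigma_m for a given E[m]."""
    return abs(1.0 - mean_m * mean_R) / sd_R


# Hypothetical annual gross return moments for one asset.
mean_R, sd_R = 1.08, 0.16

# Trace the bound over a grid of candidate values for E[m].
grid = [0.90 + 0.01 * k for k in range(11)]       # 0.90, 0.91, ..., 1.00
frontier = [(em, hj_bound(em, mean_R, sd_R)) for em in grid]

# Equivalent Sharpe-ratio form at E[m] = 0.98.
em = 0.98
sharpe = (mean_R - 1.0 / em) / sd_R
bound_from_sharpe = em * abs(sharpe)
```

The bound touches zero where 1/E[m] equals the asset's mean return and rises on either side, which gives the familiar cup shape when plotted against E[m].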
2.6 Conditioning Information

The difference between a conditional and unconditional model is the infor-
mation set used. If payoffs and discount factors (and therefore, prices) are
iid, then conditional and unconditional models are the same. Define
\[ \text{UMV iff } E[R_{p^*}^2] \le E[R_p^2] \;\; \forall\, R_p \text{ s.t. } E[R_{p^*}] = E[R_p] \]
\[ \text{CMV iff } E_t[R_{p^*}^2] \le E_t[R_p^2] \;\; \forall\, R_p \text{ s.t. } E_t[R_{p^*}] = E_t[R_p] \]

By iterated expectations, this gives UMV ⊆ CMV. If a portfolio is UMV it
must be CMV, but the converse need not be true. We can also consider the
set of minimum variance portfolios conditional on Z, CMVZ . Then CMV
includes CMVZ , which in turn includes UMV. A conditional factor pricing
model does not imply an unconditional model. An unconditional model does
imply a conditional model.
From here we can say that it is possible to reject that a portfolio is UMV
or CMVZ , but we can not reject CMV since the information set for CMV
is unobservable. This is similar to the issue raised by Roll (1977); rejecting
UMV does not imply rejection of CMV. Cochrane (1998) refers to this as the
Hansen and Richard (1987) critique. The use of scaled factors (i.e., scaled
by instruments in the proper information set) is a partial solution.
If the test is based on 1 = E[mR] for some particular m, then it is possible
to test without the complete information set. Recall m∗ can replace m in
(2.18), so m∗ is also CMV and is a function of the unobserved information
set.
The use of conditional models allows for time-varying expected returns.
This time variation can arise due to changes in the risk premium or because
of conditional covariances (β changes through time). The ARCH-GARCH
family of models is often used to capture the time series behavior of condi-
tional moments.
2.7 Market Efficiency

Examining the link between the theoretical asset pricing models and empir-
ical tests requires a position on market efficiency. The general idea behind
market efficiency is that prices reflect available information. Of course a more
precise definition of available information and the implications of reflecting
this information are necessary.
The early view of market efficiency was the random walk. In this model
the series of innovations is independent. Empirical evidence during this
period found that prices are consistent with a random walk. The apparent
implications of this model are that prices are not driven by supply/demand and
there is no point in fundamental analysis. In fact, the random walk does not
have these implications since slowly adjusting prices would allow profitable
trading strategies. A problem with the random walk is that it simultaneously
requires rational investors to eliminate profitable trading opportunities, but
also assumes investors irrationally pay for security analysis.
The martingale model was proposed as an alternative to the random walk
by Samuelson in the mid-1960s. A process $x_t$ is a martingale with
respect to an information set $\Phi_t$ if
\[ E[x_{t+1} \mid \Phi_t] = x_t. \]
A fair game has the property that $E[y_{t+1} \mid \Phi_t] = 0$. Returns are a fair game if
prices and dividends follow a martingale. Finding a variable that can predict
returns means either that returns are not a martingale or that the variable
is not in the information set. More recent versions of market efficiency also
assume rational expectations.
The martingale will hold when investors have common, constant rate
of time preferences, homogeneous beliefs, and are risk-neutral. Note that
risk neutrality implies a martingale, but does not imply a random walk.
The reason is that a martingale allows dependence of higher moments on
the information set, whereas the random walk does not. Allowing for risk
aversion does not go very far in reconciling the martingale model with the
data.
There are several reasons not to base market efficiency on the martingale
model. In a setting such as the ICAPM, conditional expected returns depend
on dividends. Since dividends are autocorrelated the conditional expected
returns are partially forecastable in violation of the martingale model. Time
variation in the risk premium may also lead to failure of the martingale model.
Finally, most empirical tests have a joint hypothesis problem. Rejecting a
model may mean either the model is wrong or the market is inefficient.
2.8 Empirical Asset Pricing

2.8.1 Properties of Asset Returns
Normality offers nice features in modeling asset prices, however departures
from normality have been extensively documented. Relative to the normal
distribution, asset returns exhibit skewness and kurtosis. Matters are com-
plicated further by serial correlation in returns.
Table 2.3: Patterns In Returns
Factor                 Relation   Comment
Size                   –          Banz (1981)
B/M                    +
E/P                    +          Basu (1977)
CF/P                   +
1/P                    +
T-bills                –
Dividend Yield         +
Term structure slope
Expected Inflation     –
Credit quality         +          also related to volatility
January                +
Monday                 –
Contrarian                        ?,
Momentum                          Jegadeesh and Titman (1993)
Cross-sectional Patterns
There is evidence that lagged variables are useful in predicting stock and
bond returns. Many of the results documented in the U.S. are also present
in other countries. Table 2.3 provides an overview of these patterns.
Interpretation of these patterns is difficult since many of these variables
are highly correlated, and much of the relation each has with returns comes
in January. At longer time horizons some of the effects, such as size and
E/P, tend to reverse themselves. A common criticism is that these variables
may be correlated with the true β when estimates of β are noisy. Chan &
Chen () show that average size and estimated beta in size-sorted portfolios
are almost perfectly negatively correlated.
Another issue that arises in interpretation of the cross-sectional regulari-
ties is whether they are all capturing the same underlying phenomenon. This
is especially likely considering price is in many of the variables.
Attempts to disentangle the effects are inconclusive. Some researchers
claim size subsumes E/P, while others claim the opposite. Fama and French
(1992) claim that size and B/M together subsume E/P (and beta). Given
the way these tests are designed, the B/M variable may actually be a proxy
for the true beta. A stock that recently declined in price will have a high
B/M. This stock is also likely to be more levered than before its decline, so
it is now riskier and should have a higher beta. However the beta estimate
is generally based on returns several years prior, so the recent downturn is
likely to be washed out. In the end, the estimated beta may be too low, and
the high B/M may capture the added risk of the stock. Alternatively, the
B/M results may be due to survivorship biases in the COMPUSTAT tapes.
There are several calendar related patterns in returns. Most famous is
the January effect, where returns are much larger in January than in other
months. Possible explanations include tax-based trading, window dressing
by institutions, and liquidity trading. The January effect is most pronounced
for small firms.
The weekend effect describes the large negative returns from Friday close
to Monday close. It is not clear that all the abnormal return is due to the
weekend period, but Monday returns alone do not seem to account for the
entire effect. International evidence is mixed with respect to weekly patterns,
but many of the Asian markets have a Tuesday effect, which corresponds to
Monday trading in the U.S. There is some evidence that most of the returns
each month occur during the first two weeks. This may be due to portfolio
rebalancing caused by month-end salaries. Finally, there is a holiday effect,
where one third of the annual returns occur on the trading days preceding
the eight holidays on which the market is closed.[4]

[4] This is misleading since positive and negative returns cancel out.

In a clever paper Berk (1995) addresses the fact that price is directly
related to size. The basic logic is very simple: risky firms will be discounted
at a higher rate, therefore current market values will be smaller. This will
give the appearance that small firms have higher returns, even though firm
size (future cashflows) and risk may be unrelated. Consider a set of firms
with log future cash flows c, log price p, and log return r = c − p. Further
assume size and risk are independent. Now regress returns on beginning of
period size
\[ r = \alpha_1 + \beta_1 p + \varepsilon_1. \]
The sign of $\beta_1$ depends on the covariance between r and p
\[ \mathrm{cov}(r, p) = \mathrm{cov}(c - r, r) = \mathrm{cov}(c, r) - \mathrm{var}(r) = -\mathrm{var}(r) < 0. \]
Thus we should expect a negative relation between firm size and returns.
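Now consider a regression of actual returns on expected (model) returns r̂
\[ r = \alpha_2 + \beta_2\hat{r} + \varepsilon_2. \]
Take the pricing errors $\varepsilon_2$ and regress them on current size
\[ \varepsilon_2 = \alpha_3 + \beta_3 p + \varepsilon_3. \]
The sign of this regression coefficient depends on the covariance
\[
\mathrm{cov}(p, \varepsilon_2) = \mathrm{cov}(c - r, \varepsilon_2)
= \mathrm{cov}(c, r - \alpha_2 - \beta_2\hat{r}) - \mathrm{cov}(\alpha_2 + \beta_2\hat{r} + \varepsilon_2, \varepsilon_2)
= -\mathrm{var}(\varepsilon_2) < 0.
\]
This shows that size is negatively related to pricing errors. How much of the
variation in actual returns is explained by size? Decompose the $R^2$ from the
first regression
\[
R^2 = \frac{\mathrm{var}(\beta_1 p)}{\mathrm{var}(r)}
    = \beta_1^2\frac{\mathrm{var}(p)}{\mathrm{var}(r)}
    = \left[\frac{\mathrm{cov}(r, p)}{\mathrm{var}(p)}\right]^2\frac{\mathrm{var}(p)}{\mathrm{var}(r)}
    = \frac{\mathrm{var}(r)}{\mathrm{var}(p)}
    = \frac{\mathrm{var}(r)}{\mathrm{var}(c - r)}
    = \frac{\mathrm{var}(r)}{\mathrm{var}(r) + \mathrm{var}(c)}
\]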
The larger the variation in cashflows the lower is the R2 . The basic con-
clusion of the article is that market value will end up capturing unmea-
sured/unmodeled risks.
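Berk's argument is easy to check by simulation. The Python sketch below draws log cash flows c and log returns r independently, forms log price p = c - r, and regresses r on p. The normal distributions, the standard deviations, the sample size, and the seed are all assumptions made for illustration; the scale is not calibrated to data.

```python
import random


def mean(x):
    return sum(x) / len(x)


def cov(x, y):
    """Sample covariance of two equal-length series."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


random.seed(0)
n = 20000
sd_r, sd_c = 1.0, 2.0                 # assumed cross-sectional dispersions

r = [random.gauss(0.0, sd_r) for _ in range(n)]   # log returns (risk)
c = [random.gauss(0.0, sd_c) for _ in range(n)]   # log cash flows (size)
p = [ci - ri for ci, ri in zip(c, r)]             # log price: p = c - r

beta1 = cov(r, p) / cov(p, p)                         # slope of r on p
r_squared = cov(r, p) ** 2 / (cov(r, r) * cov(p, p))  # R^2 of that regression

# Population values implied by the derivation above.
theory = sd_r ** 2 / (sd_r ** 2 + sd_c ** 2)
```

With these settings the estimated slope should be close to -theory and the R-squared close to theory, even though cash flows and returns were generated independently.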
Time Series Patterns

Asset returns contain patterns in autocorrelations summarized in Table 2.4.
Using CRSP stock returns from 1962–1994, portfolio autocorrelations range
from 1.3% to 43.1%. Autocorrelations increase with shorter time horizons
and are higher in equally-weighted portfolios than value-weighted portfolios.
Both of these effects are likely due to higher autocorrelation in smaller stocks,
which may be due to non-synchronous trading. There is weak evidence of
negative autocorrelations in multi-year returns. In most cases the economic
significance of the autocorrelations may be small, as is the proportion of the
total variance explained. Individual stocks, especially smaller ones, tend to
have negative autocorrelation.
Table 2.4: Correlation Patterns
Horizon Individual Portfolio

Daily – +
Weekly – +
Monthly – +
Annual
Multi-year –
Variance Ratios
The random walk hypothesis implies the variance of asset returns scales with
time; a T -period return should have a variance T times as large as a one-
period return. A similar statistic can be derived using variance differences.
Finite sample properties can be significantly improved by using overlapping
observations and making appropriate degrees of freedom adjustments.
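A minimal sketch of a variance ratio computed from overlapping observations is given below. The function name is hypothetical, and the degrees-of-freedom correction used is one common choice (in the spirit of Lo and MacKinlay), assumed here rather than taken from these notes. Inputs are one-period log returns, so a q-period return is a sum of q consecutive entries.

```python
import numpy as np

def variance_ratio(returns, q):
    """VR(q): variance of overlapping q-period returns, per period,
    divided by the variance of one-period returns."""
    r = np.asarray(returns, dtype=float)
    T = r.size
    mu = r.mean()
    var1 = np.sum((r - mu) ** 2) / (T - 1)
    # Overlapping q-period returns as rolling sums of q consecutive returns.
    csum = np.concatenate(([0.0], np.cumsum(r)))
    rq = csum[q:] - csum[:-q]                      # T - q + 1 overlapping sums
    # Assumed degrees-of-freedom adjustment; the factor q puts the
    # q-period variance on a per-period basis.
    m = q * (T - q + 1) * (1.0 - q / T)
    varq = np.sum((rq - q * mu) ** 2) / m
    return varq / var1
```

Under a random walk the statistic should be close to one at every horizon; with first-order autocorrelation ρ in one-period returns, VR(2) is approximately 1 + ρ.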
Positive autocorrelations suggest variance ratios greater than one. For the
equally-weighted portfolios, this seems to be the case, with VR(2) ≈ 1.2, and
increasing with longer horizons. VR(16) ranges from 1.5 to 1.9, depending on
the time period (this effect is getting smaller as time goes on). These results
disappear in value-weighted portfolios. Looking at size-sorted portfolios, the
variance ratios are largest for the small-stock portfolios and are close to one
for the stocks in the largest decile. For individual securities the variance
ratios are close to one in general, and less than one for the longer horizons.
This is because there is some negative autocorrelation in individual security
returns due to the bid-ask spread.
The combination of negative autocorrelation in individual securities and
positive autocorrelation in portfolios gives rise to positive cross-autocorrelations.
This phenomenon can be summarized as a stronger correlation between
current small-stock returns and lagged large-stock returns than between current
large-stock returns and lagged small-stock returns. More directly, large stocks
tend to lead smaller stocks. This can help explain the apparent profitability
of contrarian strategies.
Long-Horizon Returns
Shiller () and Summers () present models where stock prices have fads or bub-
bles, causing large slowly decaying swings from fundamental values. Shorter
horizon portfolio returns have little autocorrelation, while returns at longer
horizons have strong negative autocorrelation. Empirical evidence supports
these models, although the tests are based on small sample sizes and lack
power. Other empirical results indicate that the variance grows more slowly
than the time horizon, also consistent with the model. A general problem
is that irrational bubbles in stock prices are not distinguishable from
rational time-varying expected returns. Long-horizon returns are also pre-
dictable with other variables such as D/P and E/P. These variables can
explain roughly a quarter of the variation in two to four year returns, much
more than is possible for shorter horizons.
? propose their contrarian viewpoint, where buying losers and selling
winners (measured over 3 to 5 year periods) produces excess returns. Others
have argued that the excess returns are due to differences in risk, although
a rebuttal paper from DeBondt and Thaler disagrees. It is possible that the
contrarian results are due to a size effect or some type of distressed-firm
effect.
2.8.2 General Procedures

Multivariate tests can eliminate the errors-in-variables problem and increase
the precision of parameter estimates. This type of test still does not say why
the model is rejected. Consider a multi-beta model of the form
$$ E_t[R_{i,t+1}] = \lambda_{0,t} + \sum_{j=1}^{K} \beta_{i,j,t} \lambda_{j,t}. $$
To test this using a multivariate regression
$$ R_{i,t+1} = \alpha_i + \sum_{j=1}^{K} \beta_{i,j} R_{j,t+1} + \varepsilon_{i,t+1} \qquad (2.22) $$
the intercept restriction is αi = λ0 (1 − Σj βi,j). This is equivalent to
mean-variance intersection, meaning that the minimum variance boundaries of all
the asset returns and minimum variance portfolios intersect at a single point.
In other words, a combination of mimicking portfolios lies on the
mean-variance frontier.
The multivariate regression in the restricted form uses TN observations
to estimate N + 1 parameters. The unrestricted model has 2N parameters
to estimate. Tests with longer time series have more power, while those with
more assets have a larger size. The restrictions can be tested with the Wald
(W), likelihood ratio (LR), or Lagrange multiplier (LM) statistics. These are
all asymptotically χ² but may differ in finite samples.
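As an illustration, the sketch below computes a Wald statistic for the simplest version of the intercept restriction: all αi equal to zero when returns and factor portfolios are measured in excess of the riskless rate. This special case, the function name, and the use of maximum likelihood (divide by T) covariance estimates are assumptions made for the example; it does not implement the general nonlinear restriction αi = λ0 (1 − Σj βi,j).

```python
import numpy as np

def wald_zero_intercept(excess_returns, factors):
    """Wald statistic for H0: all intercepts are zero in
    R_it = a_i + sum_j b_ij F_jt + e_it (everything in excess returns).
    Returns (statistic, degrees of freedom); asymptotically chi-square(N) under H0."""
    R = np.asarray(excess_returns, dtype=float)    # T x N test assets
    F = np.asarray(factors, dtype=float)           # T x K factor portfolios
    if F.ndim == 1:
        F = F[:, None]
    T, N = R.shape
    X = np.column_stack([np.ones(T), F])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)   # (K + 1) x N, row 0 holds intercepts
    alpha = coef[0]
    resid = R - X @ coef
    sigma = resid.T @ resid / T                    # residual covariance (ML estimate)
    mu = F.mean(axis=0)
    omega = np.atleast_2d(np.cov(F, rowvar=False, bias=True))  # factor covariance
    scale = 1.0 + mu @ np.linalg.solve(omega, mu)  # adjusts for estimated betas
    stat = T * (alpha @ np.linalg.solve(sigma, alpha)) / scale
    return stat, N
```

The statistic would be compared with a χ² critical value with N degrees of freedom. As noted above, the finite-sample distribution can differ noticeably from the asymptotic one when N is large relative to T.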
2.8.3 CAPM Tests

The only testable implications of the CAPM are that the market is mean-
variance efficient, and for the SL model that the intercept is zero. Roll
(1977) indicates that this is inherently impossible to do since the market is
unobservable. “Rejecting” the model may simply mean that the proxy is not
mean variance efficient. Conversely, “failing to reject” may mean that the
proxy is mean variance efficient. In either case, we have not said anything
about the mean-variance efficiency of the market. Further, there are always
some portfolios which are mean-variance efficient. There is also the issue
with conditioning information. The CAPM can hold conditionally but fail
unconditionally. Without knowing what conditioning information to use, the
models are difficult to test.
Stambaugh (1982) examines the sensitivity to excluded assets in the mar-
ket proxy, finding inferences are similar regardless of the specific composition
of the proxy. Kandel and Stambaugh (1987) and Shanken (1987) estimate
the upper bound on the correlation between the proxy and the true market
needed to overturn rejection of the model. As long as the correlation is at
least 0.70, inferences would not change. Roll and Ross (1994) counter by
saying that if the true market portfolio is efficient, cross-sectional relations
between expected return and beta are very sensitive to the proxy choice.
As in any statistical test, there is a tradeoff between size and power.
Adding assets tends to increase the size of a test in finite samples. A longer
time series can considerably increase the power of a test. GMM tests have
become popular since they do not rely on normality, homoskedasticity, or
uncorrelatedness.
The early evidence was generally supportive of the CAPM, in that the
evidence seemed consistent with mean-variance efficiency of the “market”
portfolio. Representative studies include Fama and MacBeth (1973), Black,
Jensen, and Scholes (1972), and Blume and Friend (1973). In the mid-1970’s
the “anomalies” literature developed [see Fama (1991) for a review].
Common criticisms of these “anomalies” are sample selection and data
snooping biases. Kothari, Shanken, and Sloan (1995) claim that sample
selection biases drive the results of Fama and French (1992), although Fama
and French (1996b) dispute this claim.
Fama-MacBeth (1973)
FM introduce what has become a classic methodology for empirical
asset pricing tests. They test the Black and SL CAPMs using monthly
portfolio returns and the equally-weighted NYSE as the market. Their tests
examine (i) the linearity of the risk-return tradeoff, (ii) if variables other
than β matter, (iii) if the risk premium is positive, and (iv) if the return on
the zero-beta portfolio is equal to the riskless rate.
The procedure is as follows. First, portfolios are formed using estimated
β of individual securities over a four year period. Since measurement error
will systematically affect these portfolios, the betas are reestimated over a
five year period and averaged across assets to get portfolio β. The β for
each portfolio is recalculated each month over the next four years to cover
delistings. Returns for each of the 20 portfolios are regressed on the port-
folio betas. This is repeated each month, and the estimated coefficients are
averaged over time.
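The final two steps (monthly cross-sectional regressions, then time-series averages of the coefficients) can be sketched as follows. This is a simplified illustration, not FM's full procedure: the portfolio betas are taken as given and constant, whereas FM re-estimate them on rolling windows, and the function name is hypothetical.

```python
import numpy as np

def fama_macbeth(returns, betas):
    """Second-pass Fama-MacBeth estimates.
    returns: T x N portfolio returns; betas: N first-pass betas (treated as given).
    Returns (mean coefficients, standard errors, t-statistics) for [intercept, slope]."""
    R = np.asarray(returns, dtype=float)
    b = np.asarray(betas, dtype=float)
    X = np.column_stack([np.ones(b.size), b])
    # One cross-sectional OLS per month: R_t = g0_t + g1_t * beta + u_t.
    gammas = np.linalg.lstsq(X, R.T, rcond=None)[0].T    # T x 2 monthly coefficients
    mean = gammas.mean(axis=0)
    # Standard errors come from the time-series variation of the monthly estimates.
    se = gammas.std(axis=0, ddof=1) / np.sqrt(R.shape[0])
    return mean, se, mean / se
```

The average slope estimates the risk premium and the average intercept estimates the zero-beta rate. These standard errors ignore the measurement error in β.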
The results are generally supportive of the Black model but the estimated
riskless rate is higher than the market rate. Additional regressions including
β̂ 2 and the asset-specific risk indicate that the risk-return relation is linear
and there is no reward for bearing unsystematic risk.
Extensions by Litzenberger and Ramaswamy (1979) and Shanken (1992)
explicitly adjust standard errors for the EIV bias rather than form portfolios.
Shanken (1992) shows that the standard errors in Fama and MacBeth (1973)
do not properly reflect measurement error in β, overstating the precision of
the risk premium estimates.
Black, Jensen & Scholes (1972)

Fama-French (1992)
The controversial Fama and French (1992) paper has generated a significant
debate in the literature. The general goal of the paper is to assess the relative
importance of beta, size, B/M, leverage, and E/P in determining the cross-
section of expected returns. These variables had been previously documented
as important in the “anomalies” literature. Their general findings are that
beta is not systematically related to returns, while size and B/M subsume
the other factors.
The methodology employed is basically an extension of the Fama and
MacBeth (1973) procedure. The new steps involve the combination of ac-
counting and market data. All accounting data for the fiscal year ending
t − 1 is combined with returns measured from July of year t to June of t + 1.
Stock price data used to construct accounting ratios is from the beginning of
year t, while the size measure is from June of year t. This procedure ensures
all explanatory variables are known prior to the return.
In order to preserve the firm-specific accounting information, portfolios
are not used in the same way as in FM. Instead, portfolios are used to
calculate betas, which are then assigned to all firms in that portfolio. The
portfolios are formed by first forming size deciles, then forming beta deciles
within each size decile. In both sorts, breakpoints are set based on only the
NYSE firms. With these 100 portfolios, each portfolio beta is calculated as
the sum of the coefficients on the current and prior month CRSP value-weighted
returns. The beta for a particular stock can change over time as the stock
moves into different portfolios.
This two-way sorting procedure produces variation in beta that is unre-
lated to size. Univariate statistics show that average returns are related to
size, but unrelated to beta. This evidence is confirmed by the FM regressions.
Gibbons (1982)
Gibbons (1982) introduces a multivariate test of the CAPM and rejects
CAPM soundly using LR. He uses the CRSP equally-weighted index as the
market, estimates β over a 5 year period, and forms 40 portfolios. This
multivariate methodology avoids the EIV problem, provides more precise risk
premium estimates, and has more power than previous tests. The nonlinear
restriction on the intercept is linearized with a Taylor-series expansion.
Stambaugh (1982)
Stambaugh (1982) shows inferences are not sensitive to proxy choice, but
are sensitive to the asset choice. He argues that W lacks power, LR has
the wrong size, and LM is closest to its asymptotic distribution. Using a
portfolio of stocks, bonds, and preferred, he fails to reject linearity (Black
CAPM), but rejects SL. Using fewer assets he rejects both models.
Shanken (1985)
Shanken (1985) provides the asymptotic results for the multivariate tests in
Gibbons (1982). He shows that LM < LR < Q* (= W). These statistics are
all transformations of one another. Shanken uses Q_C^A, which includes
considerations for sample size and degrees of freedom adjustments. Recalculating
Gibbons’ LR statistic, Shanken shows p = 0.75, so the rejection inference is
overturned.
The cross-section regression test (CSRT) used in this paper does not
require specifying HA. The procedure estimates beta in a first stage, then
uses the betas in cross-sectional regressions. The CAPM is rejected using the
equally-weighted CRSP index.
MacKinlay (1987)
MacKinlay (1987) discusses the power of multivariate SL CAPM tests. He finds
that tests against an unspecified alternative have low power. The type of
deviation from the model is important in determining power. These tests have
reasonable power against cross-sectional random deviations. However, these
tests have low power against omitted factors. He rejects in some subperiods
but fails to reject overall.
2.8.4 ICAPM/CCAPM Tests

Tests of a multi-beta model are similar to CAPM tests in that they are
really tests of the mean-variance efficiency of a particular combination of
portfolios. There is mixed evidence about the importance of durable goods.
Habit persistence models perform better in goodness-of-fit tests, but still do
not explain the first moment of the equity premium puzzle.
Hansen Singleton
Reject model. See QM notes for more details.
Mehra Prescott (1985)

The equity premium puzzle arises because extreme risk aversion parameters
are needed to make the low volatility of aggregate consumption growth in the
U.S. consistent with the returns on both equity and T-bills. Some of these
results may arise partially because of poorly measured consumption data, but
efforts to correct for this still lead to rejections of the model. One possible
(partial) explanation for the equity premium puzzle is incomplete markets,
which may result in the overestimation of risk aversion. One experiment
using log utility (CRRA = 1) results in an estimate based on aggregate
consumption of CRRA = 3. Weil () presents the same puzzle from the
perspective of the riskless asset.
2.8.5 APT Tests

The testable implications of the APT given in (2.17) are
1. λi ≠ 0 for any i
2. λ0 (= rf ) ≥ 0 (debated)
3. linearity
Again, the test really amounts to seeing if a particular combination of
portfolios is mean-variance efficient.
To make the intertemporal APT testable, certain restrictions need to be
imposed. One alternative is to assume that (i) the observed set of assets has
a factor structure, (ii) the noise terms of the observed assets are uncorrelated
with the noise terms on the unobserved assets, and (iii) the factors span the
state variables. Alternatively, we can assume logarithmic utility in which
case the intertemporal APT reduces to the APT. These requirements are
very similar to the ICAPM.
As mentioned in Section 2.4.2, the APT has features which make testing
difficult. In fact, one view is that APT is not testable [e.g., Shanken (1982),
Reisman (1992)], whereas others [Ingersoll (1984), others ??] claim it is. The
primary reason for this disagreement is the approximate nature of the model.
Are deviations from the exact model due to the approximation or are they
genuine deviations from the model itself? The test then becomes a joint test
of the model and the additional assumptions needed to impose the exact pric-
ing relation. The APT and ICAPM are not empirically distinguishable. The
“pervasive factors” in the APT world can coincide with the “state variables”
in the ICAPM world.
The test of the model requires estimation of both the factor loadings (B)
and the factor prices (F). The two primary testing approaches differ in the
order these variables are estimated. Cross-sectional tests estimate (B) in the
time series, then use these estimates for a number of firms to estimate (F) in
the cross-section. The time series tests perform the estimation in the reverse
order.
Fama and MacBeth (1973) provide the basic approach for the cross-
sectional test [see Section 2.8.3 for details]. Some of these tests estimate
the factors statistically while others use economic specifications. Chen, Roll,
and Ross (1986) specify five economic variables as factors: industrial produc-
tion, unexpected inflation, changes in expected inflation, credit quality, and
a term premium. They find that the specification is good in the sense that
many of these factors are priced and additional factors such as the market
return, consumption growth, and changes in oil prices are not priced. Chan,
Chen, and Hsieh (1984) perform a study similar to CRR, but are also able to
explain the size anomaly. However, Shanken and Weinstein (1990) reply that
these two studies are sensitive to the portfolio formation used. Specifically,
forming size-based portfolios at the end of the estimation period causes mis-
estimation of the βs to show up systematically in the size portfolios, biasing
the subsequent risk premium estimate.
The time series test method was originally proposed by Black, Jensen,
and Scholes (1972). Factor prices are estimated in the first pass, and their
sensitivity in the second pass. The null hypothesis is that the intercept is
zero (or α = (1 − Bi)λ0 in the absence of a riskless asset).
In summary, the tests of the APT generally reject the model, but the
APT seems to perform better than alternatives such as CAPM. The APT
has been used in applications which offer indirect evidence of its success as
well. In fund performance tests, the model indicates fund managers have
negative Jensen’s alphas, which is similar to the result from the CAPM models
(the magnitudes differ though). In calculating the cost of capital, CAPM
and APT yield similar results. In event studies the APT does not seem to
offer much gain over a single factor model.
2.8.6 Present Value Relations

The history of volatility and returns tests shows a flip-flop of results. The
early variance bounds tests rejected the present value models, whereas the
returns tests failed to reject. More recently, volatility bounds tests provide
mixed evidence, but the returns tests now reject the model.
Volatility tests
Denote the “perfect foresight” price

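$$ p^*_t = \sum_{\tau=1}^{\infty} \beta^{\tau} d_{t+\tau} $$
Then pt = E[p*t] or
$$ p^*_t = E[p^*_t] + \varepsilon_t = p_t + \varepsilon_t $$
$$ \operatorname{var}(p^*_t) = \operatorname{var}(p_t) + \operatorname{var}(\varepsilon_t) \ge \operatorname{var}(p_t) \qquad (2.23) $$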
This says actual prices should be less volatile than the “model” price from
the dividend series. In fact, we find the opposite. Actual prices are more
volatile than would be expected from dividends.
There are several problems with the above test. First, the price series
is nonstationary so it needs to be modified. Second, the infinite sum is a
problem in a finite sample. This can be overcome by including a terminal
value in the distant future. Third, the observed dividend series is not a series
of independent observations, but rather a single realization. This creates a
small sample problem in implementing the test. Fourth, there is no way to
capture time-varying expected returns in this framework. Finally, different
specifications of the investors’ information sets lead to different critical val-
ues, making interpretation difficult. In summary, there are several necessary
adjustments to the variance bounds test. Even after making these adjust-
ments, there is no way to hold size constant so there is no way to meaningfully
compare the power of this test to alternatives.
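The terminal-value adjustment mentioned above can be sketched as a backward recursion. Since p*_t = β(p*_{t+1} + d_{t+1}) follows directly from the definition of the perfect foresight price, the infinite sum can be replaced by a finite recursion started from an assumed terminal value. The function name and the choice of terminal value are hypothetical inputs for illustration only.

```python
import numpy as np

def perfect_foresight_price(dividends, discount, terminal_price):
    """Perfect-foresight price series from a finite dividend sample.
    dividends[t] is the dividend paid at date t; the recursion
    p*_t = discount * (p*_{t+1} + d_{t+1}) starts from p*_T = terminal_price."""
    d = np.asarray(dividends, dtype=float)
    T = d.size - 1
    p_star = np.empty(T + 1)
    p_star[T] = terminal_price          # assumed terminal value
    for t in range(T - 1, -1, -1):
        p_star[t] = discount * (p_star[t + 1] + d[t + 1])
    return p_star
```

The variance bound (2.23) would then compare the sample variance of this series with that of actual prices, after the series have been transformed to deal with nonstationarity.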
Shiller (1981) uses the perfect foresight price decomposition to derive
variance bounds. He finds the actual price is five to thirteen times more
volatile than the perfect foresight price. His analysis indicates that the price
change volatility is highest when information about dividends is revealed
smoothly. Large, occasional information releases result in prices with lower
variance but higher kurtosis.
Returns tests
Tests of long horizon returns have found that there is significant negative
autocorrelation over the three to five year horizon, indicating a tendency for
mean reversion.
Orthogonality tests
A model-free version is not subject to the nuisance parameter problem which
plagues the variance bounds test. Both the model-free and the model-based
orthogonality tests are better-behaved econometrically than the returns tests.
Chapter 3
Fixed Income
3.1 Introduction
The pricing of bonds differs from pricing other assets such as equity primarily
because bonds are nonlinear. A bond has:
1. fixed, known maturity
2. fixed, known terminal (face) value
3. fixed, known periodic cash flows
4. more thinly traded (at least “older” issues)
Term structure models can be viewed as time series models of the stochastic
discount factor.
Duration, Convexity
3.2 Term Structure Basics

3.3 Inflation and Returns
3.4 Forward Rates
Forward rates had been viewed simply as forecasts of expected future spot
rates (PEH). Fama (??) shows that the forward rates also contain expecta-
tions of the premium above one month T-bills.
• Holding period return is the change in log price on a particular bond
from one period to the next.
• The forward rate is the difference in the log prices of bonds of different
maturities at the same point in time.
• Premium is the holding period return less the one month spot rate.
Fama (1984)
Fama uses a regression approach to separate the information about expected
future spot rates from information about the expected premium.
1. premium = f (forward - spot)
2. ∆ spot = f (forward - spot)
Results are that forward rates can predict premiums which vary through
time and the expected future spot rate up to five months out. Froot has a
response to Fama’s finding, suggesting that Fama ignores systematic expec-
tations errors.
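The variables in these regressions can be built directly from a panel of log discount bond prices using the definitions above. The sketch below assumes a hypothetical array layout (rows are dates, column n holds the log price of a bond with n periods to maturity, and column 0 is zero because a maturing bond pays one dollar). The function names are illustrative and are not from Fama's paper.

```python
import numpy as np

def fama_variables(log_prices):
    """Spot rates, forward rates, holding period returns, and premiums
    from log_prices[t, n] = log price at date t of an n-period discount bond."""
    p = np.asarray(log_prices, dtype=float)
    spot = -p[:, 1]                      # one-period spot rate at each date
    forward = p[:, :-1] - p[:, 1:]       # forward[t, n]: log price of n-bond minus (n+1)-bond
    hpr = p[1:, :-1] - p[:-1, 1:]        # hpr[t, n-1]: buy n-bond at t, hold to t+1
    premium = hpr - spot[:-1, None]      # holding period return less the spot rate
    return spot, forward, hpr, premium

def ols_slope(y, x):
    """Slope from a univariate OLS regression of y on x with an intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))
```

Each regression then takes forward minus spot, measured at date t, as the regressor, with either the subsequent premium or the subsequent change in the spot rate as the dependent variable.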
Fama–Bliss (1987)
Find that forward rate forecasts of near-term changes in interest rates are
poor, but forecast power increases at longer time horizons. Interpret this as
evidence of a slow mean-reverting process. Also find evidence of time-varying
expected premiums, and that the ordering of risks and rewards changes with
the business cycle.
Stambaugh (1988)
An affine yield model implies a latent variable structure for bond returns.
Fewer state variables than forecasting variables puts testable restrictions on
forecasting equations for bond returns. Reject CIR with non-matched
maturities (avoids measurement error). Addresses source of errors, their
consequences, and how the choice of instruments affects the outcome of the tests.
3.5 Bond Pricing

Like any asset, bonds can be priced using the pricing kernel approach presented
in Section 2.5. Begin with the fundamental pricing equation
1 = Et [Mt+1 Rn,t+1 ].
The uppercase M is used to distinguish it from logs and the n subscript
indicates the time to maturity. The return can obviously be expressed as
the relative price change Rn,t+1 = Pn−1,t+1/Pn,t. Substituting this into the
pricing equation gives
Pn,t = Et [Mt+1 Pn−1,t+1 ].
Recursive substitution and the fact that the bond is worth a dollar at matu-
rity gives another representation
Pn,t = Et [Mt+1 . . . Mt+n ].
In this light a bond pricing model is really a time series model of the stochastic
discount factor.
Fixed income models are broadly categorized as either stochastic interest
rate models or stochastic term structure models. Stochastic interest rate
models begin by specifying a process dr for the short rate. The problem with
this approach is that the model price of the bond may not equal the market
price. The short rate process also implies prices for bonds of other maturities
and these may be mispriced as well. The stochastic term structure models
use the observed market prices and estimates of the volatility structure to
infer the stochastic process of the short rate. This information is then used
to get a distribution for the bond price.
3.6 Affine Models

Affine yield models represent a class of relatively simple models in which all
relevant variables are conditionally log-normal and log yields are linear in
state variables. Affine forward rates imply affine yields. Taking logs of the
pricing relation gives
$$ p_{n,t} = E_t[m_{t+1} + p_{n-1,t+1}] + \tfrac{1}{2} \operatorname{var}(m_{t+1} + p_{n-1,t+1}). $$
A model with k state variables implies that the term structure can be
summarized by the levels of k bond yields at each point in time and the
constant coefficients relating the bond yields. In this sense affine yield models
are linear; they are non-linear in the evolutionary process of the k basis yields
and the relation between the cross-sectional coefficients and the underlying
parameters of the model.
Table 3.1: Single Factor Stochastic Interest Rate Models
dr = (α + βr)dt + σ r^γ dZ

Model         α    β    γ     Specification
Merton (ABM)       0    0     α dt + σ dZ
Vasicek                 0     (α + βr)dt + σ dZ
CIR SR                  1/2   (α + βr)dt + σ √r dZ
Courtadon               1     (α + βr)dt + σ r dZ
Dothan        0    0    1     σ r dZ
GBM           0         1     βr dt + σ r dZ
CIR VR        0    0    3/2   σ r^(3/2) dZ
CEV           0               βr dt + σ r^γ dZ
Duffie-Kan              1/2   (α1 + β1 r)dt + (α2 + β2 r)^γ dZ
Assumptions
• distribution of the SDF is conditionally lognormal;
• bond prices are jointly lognormal with the SDF;
• (additional strong assumptions): homoskedastic mt+1 (Vasicek)
Properties
• Log prices (and yields) are affine in state variables.
• Analytic solution of pricing equations (outside affine yield generally
requires numerical solutions e.g., Black, Derman, and Toy).
• Trivial rejection of model without addition of an error term.
• Limits the way in which interest rate volatility can change with the
level of interest rates.
• Implies risk premia on long bonds always have the same sign (single-
factor).
• Applies to real bonds only ?
• The model can be renormalized so that the yields themselves are the
state variables (e.g., a two-factor model would use two yields).
3.6.1 Vasicek
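$$ dr = \kappa(\theta - r)\,dt + \sigma\,dB $$
$$ y_{1t} = x_t - \beta^2 \sigma^2 / 2 \quad \text{and} \quad -p_{nt} = A_n + B_n x_t $$
To get this model begin by writing the sdf as a forecast and an innovation
$$ -m_{t+1} = x_t + \varepsilon_{t+1}. $$
The sign is a convention. Assume that xt+1 follows an AR(1) process and,
for simplicity, that εt+1 is proportional to the innovation ξt+1 in xt+1, where ξt+1 has variance σ²
$$ x_{t+1} - \mu = \phi(x_t - \mu) + \xi_{t+1} \quad \text{and} \quad \varepsilon_{t+1} = \beta \xi_{t+1}. $$
Now consider the log price of a one period bond
$$ p_{1,t} = E_t[m_{t+1}] + \tfrac{1}{2} \operatorname{var}(m_{t+1}) = -x_t + \tfrac{1}{2} \beta^2 \sigma^2 = -y_{1,t}. $$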
• Allows interest rates to be negative (OK for real, not nominal).
• Can handle rising, inverted, and humped yield curves, but not inverted
humped curves.
• Price of interest rate risk is a constant that does not depend on the
level of the short rate.
• Interest rate changes have constant variance.
• Limiting forward rate can not be both finite and time-varying.
• Log forward rate curve tends to slope downwards unless β is sufficiently
small.
• Random walk is a special case.
• B measures the sensitivity of the n-period bond return to the one-
period interest rate (and the state variable). This sensitivity increases
in maturity, and is always less than the maturity.
• Average short rate is µ − β²σ²/2.
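The coefficients An and Bn can be generated recursively. Substituting −p(n−1),t+1 = A(n−1) + B(n−1) xt+1 into the log pricing relation of Section 3.6 and matching terms gives Bn = 1 + φ B(n−1) and An = A(n−1) + (1 − φ)µ B(n−1) − (β + B(n−1))² σ²/2, with A0 = B0 = 0. This recursion is derived here rather than quoted from the notes, and it assumes σ² is the variance of ξ. The function names in the sketch are hypothetical.

```python
import numpy as np

def vasicek_loadings(n_max, mu, phi, beta, sigma):
    """Coefficients in -p_{n,t} = A_n + B_n x_t for n = 0..n_max
    in the discrete-time homoskedastic single-factor model."""
    A = np.zeros(n_max + 1)
    B = np.zeros(n_max + 1)
    for n in range(1, n_max + 1):
        B[n] = 1.0 + phi * B[n - 1]
        A[n] = (A[n - 1] + (1.0 - phi) * mu * B[n - 1]
                - 0.5 * (beta + B[n - 1]) ** 2 * sigma ** 2)
    return A, B

def yields(x, A, B):
    """Log yields y_{n,t} = -p_{n,t} / n for maturities 1..n_max given state x."""
    n = np.arange(1, A.size)
    return (A[1:] + B[1:] * x) / n
```

For n = 1 the recursion reproduces y1t = xt − β²σ²/2, and Bn = (1 − φ^n)/(1 − φ) is increasing in maturity and below n whenever 0 < φ < 1, matching the properties listed above.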
3.6.2 The CIR Model

$$ dr = \kappa(\theta - r)\,dt + \sigma \sqrt{r}\,dB $$
The basic CIR model is a general equilibrium, continuous time model of the
real returns on the asset in an economy [see section 2.3.5]. The general model
is specialized to the term structure in ?. The asset is used to smooth con-
sumption, so its value depends on its hedging effectiveness, or its covariance
with consumption. The model is derived in an option pricing framework
by constructing a riskless synthetic portfolio, which must earn the riskless
rate in equilibrium. The hedge portfolio is constructed of bonds of differing
maturities; it is assumed that the market price of risk is the same for bonds
of all maturities. A recursive approach must be used to solve the model.
Although the model claims to endogenously derive the interest rate process,
it is a direct consequence of the specification of the state variable.
Assumptions
• identical individuals with time-additive log utility (Dunn and Singleton
relax this assumption but do not have much success)
• xt+i and mt+i are normal conditional on xt for i = 1, but non-normal
for i > 1.
• y1t = −p1t = xt(1 − β²σ²/2); y1t is proportional to the state variable
and its conditional variance is proportional to its level.
• restricts interest rates to be positive
Predictions
• Variance proportional to the state variable.
• All bond returns are perfectly correlated (general prediction of all
single-factor models).
• Prices are a deterministic function of the parameters, the short rate,
and maturity; an error term must be specified to keep the model
testable.
• The long rate converges to a constant.
• Stable parameters (λ, κ, θ, σ).
• Forward rate fnt = −Bn² xt σ²/2
• time variation in term premia ?
3.6.3 Duffie-Kan Class

The Duffie-Kan model is the most general affine model possible. It nests all
the common models as special cases.
$$ dr = \kappa(\theta - r)\,dt + \sqrt{\alpha + \beta r}\,dZ $$
3.6.4 Other Single Factor Models

HJM
Ho-Lee
BDT
3.6.5 Alternatives
• Non-linear models (γ = 3/2)
• Non-parametric models
• Markov switching models
• GARCH
• Higher-order ARMA processes
• Several state variables
3.7 Multi-Factor Models

Longstaff and Schwartz (2–factor)
$$ -m_{t+1} = x_{1t} + x_{2t} + x_{1t}^{1/2} \varepsilon_{t+1} $$
$$ p_{1t} = -x_{1t} - x_{2t} + x_{1t} \beta^2 \sigma_1^2 / 2 $$

• second factor (instantaneous variance of changes in short rate) avoids
implication that all bond returns are perfectly correlated
• variance of innovation to log SDF is proportional to the level of x1t and
is conditionally correlated with x1t but not with x2t .
• One-period yield is no longer proportional to x1t and the short rate
alone is no longer sufficient to describe the state of the economy.
• The model is a generalization of the square-root model
• it can also generate inverted humped yield curves.
• Whenever the SDF can be expressed as the sum of two independent
processes, the resulting term structure is the sum of the term structures
that would exist under each of these processes.
3.8 Empirical Tests

3.8.1 Brown & Dybvig (1986)
Table 3.2: Summary of Empirical Results

Paper        Data(a)  Methods(b)  Notes                          Results
BD (1986)    P,N      C,ML        iid errors                     r̂ > r, σ not constant
BS (1994)    P,R      C,ML                                       Unstable est., don’t support
                                                                 mean reversion, σ > 0 binds
CKLS (1992)  Y,N      TS,GMM      assume normality               reject γ < 1, unconstr. γ = 1.5,
                                                                 mean reversion not important
GR (1993)    Y,R      TS,GMM      forecast R from N,             fail to reject CIR, plausible
                                  use non-central χ²             estimates, fit short bonds better
PS (1994)    P,N      TS,ML       second factor for inflation,   unstable/unrealistic estimates,
                                  non-central χ²                 reject original and two factor CIR
LS (1992)    Y,N      C,GMM       second factor for volatility   reject single factor model,
                                  estimated with GARCH           2–factor holds for short and int bonds

(a) Price or Yield; Nominal or Real.
(b) Cross-section or Time series, Econometric Method.

• Nominal, prices, cross-sectional, ML
• Assume pricing errors are iid - a strong assumption given the differences
in trading frequency across maturities; an alternative is to assume vari-
ance increases with maturity and is correlated across maturities.
• Estimated r systematically overstates implied short rates (recall Fama
MacBeth; Merton’s model of heterogeneous information sets).
• Find estimated variance is erratic, although similar in magnitude to
CIR weekly time series estimates.
• Find annual average of implied standard deviation (the average of σ̂√r̂) appears to be
an unbiased predictor of time series estimate of the standard deviation
of changes in the short rate.
• Bills appear to be better described by the model than bonds.
• Discount issues’ prices are underestimated, premiums are overestimated.
• Evidence that the errors are not iid.
3.8.2 Brown & Schaefer (1994)

• Real, prices, cross-sectional, ML
• CIR model is generally able to replicate observed yield curve shapes
• Pricing errors are generally within the bid–ask spread
• Parameter estimates are unstable, especially κ + λ
• Positivity constraint on σ² binds in many cases
• Cross-sectional estimates of variance are not unbiased estimates of the
time series estimates.
• evidence on mean reversion is generally not supportive
3.8.3 Chan, Karolyi, Longstaff & Sanders (1992)

CKLS present a generalized model that nests eight popular interest rate
processes.
$$ dr = (\alpha + \beta r)\,dt + \sigma r^{\gamma}\,dZ $$
• Nominal, yields, time series, GMM
• The γ term seems to be the most important; models with γ < 1 are
all rejected, and those with γ = 1.5 fare the best. The unrestricted
estimate of γ is 1.5, and is significantly different than unity.
• The mean reversion process, which adds considerable complexity to the
model, does not appear to be of major importance.
• The results are troubling for single-factor affine yield models. First, without
mean reversion, the term structure may increase initially, but will then be
downward sloping. Second, with γ > 0.5, the models become intractable
and must be solved numerically.
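The nesting in the CKLS specification can be illustrated with a simple Euler discretization of dr = (α + βr)dt + σr^γ dZ. This is only a sketch for intuition, not the GMM estimation used in the paper. The function name, the parameter values, and the choice to floor the rate at zero inside the diffusion term are all assumptions made here for illustration.

```python
import math
import random

# Restrictions on gamma that recover three of the nested models
# (illustrative subset; alpha and beta are left unrestricted here).
NESTED_GAMMA = {
    "Vasicek": 0.0,
    "CIR square root": 0.5,
    "Brennan-Schwartz": 1.0,
}


def simulate_ckls(r0, alpha, beta, sigma, gamma, dt, n_steps, seed=0):
    """Euler scheme for dr = (alpha + beta*r)dt + sigma*r^gamma dZ.

    Returns the simulated path as a list of length n_steps + 1.
    """
    rng = random.Random(seed)
    path = [r0]
    r = r0
    for _ in range(n_steps):
        shock = rng.gauss(0.0, 1.0) * math.sqrt(dt)
        # Assumption: floor the rate at zero inside the diffusion term so
        # that r**gamma is defined for fractional gamma.
        level = max(r, 0.0)
        r = r + (alpha + beta * r) * dt + sigma * level ** gamma * shock
        path.append(r)
    return path


if __name__ == "__main__":
    for name, g in NESTED_GAMMA.items():
        p = simulate_ckls(0.05, 0.02, -0.2, 0.05, g, 1.0 / 52, 520)
        print(name, round(p[-1], 4))
```

With β < 0 the drift pulls the rate toward −α/β, which is the mean reversion feature the bullets refer to.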
3.8.4 Gibbons & Ramaswamy (1993)

• Forecast real returns on nominal bonds in a time series setting (assume
inflation is independent of the real SDF ?)
• GMM in a time series
• Fail to reject CIR, obtain plausible parameter estimates
• Reject with off-the-run bonds (measurement error and a small sample).
• Model fits short end of term structure better than longer maturities.
• Find some evidence of autocorrelation in returns
3.8.5 Pearson & Sun (1994)

• Nominal, prices, time series, ML
• Generalize square-root model to allow the variance of the state variable
to be linear in the level of the state variable.
• Also include a second factor — expected inflation.
• Reject original and two-factor CIR model.
• Unrealistic parameter estimates:
• Unstable parameter estimates (across datasets).
• Within sample prediction has no power and is little better than a naive
prediction of current values.
3.8.6 Longstaff & Schwartz (1992)

• Second factor for volatility estimated using GARCH
• Test cross-sectional restrictions with GMM
• Find model holds for both short- and intermediate-term maturities
• Reject single-factor model
Chapter 4
Derivatives
4.1 Introduction
Virtually all derivatives pricing is based on some sort of arbitrage argu-
ment. This chapter outlines derivative pricing in terms of both discrete-
and continuous-time models. Several derivations of each model are given
to show the links between them. More advanced topics are covered rather
superficially.
4.2 Binomial Models

Binomial option pricing is a special case of Arrow-Debreu pricing presented
in Section 2.4.1. Standardize the price of an asset to have a price of $1, value
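in the “up state” of u, and value in the “down state” of d. Recall $X\phi = p$
and $X'\alpha = b$ so
$$\begin{pmatrix} u & d \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} = \begin{pmatrix} 1 \\ p_f \end{pmatrix}.$$
Solving these equations,
$$\phi_1 = \frac{R_f - d}{R_f(u - d)} \quad\text{and}\quad \phi_2 = \frac{u - R_f}{R_f(u - d)}$$
$$\pi_1 = \frac{R_f - d}{u - d} \quad\text{and}\quad \pi_2 = \frac{u - R_f}{u - d}.$$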
The binomial model is based on a replication argument. Consider positions
in a stock and bond such that the portfolio replicates the payoffs on an
option in the next period. That is, we want to find holdings in the stock and
bond, $\Delta$ and $B$, so the value of the position is $C^u$ in the up-state and $C^d$ in
the down-state
$$Su\Delta + R_f B = C^u \quad\text{and}\quad Sd\Delta + R_f B = C^d.$$
Solving these equations gives
$$\Delta = \frac{C^u - C^d}{S(u - d)} \quad\text{and}\quad B = \frac{uC^d - dC^u}{R_f(u - d)} = \frac{C^u - Su\Delta}{R_f}.$$
The stock holding $\Delta$ has the interpretation of the partial derivative of the
call price with respect to the stock price. The current price of the option is
$$C = \Delta S + B = [\pi C^u + (1 - \pi)C^d]/R_f.$$
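The equivalence of the replication price and the risk-neutral price can be checked numerically for a single period. The sketch below is illustrative only; the function name and the numbers in the usage example are assumptions, not part of the notes.

```python
def one_period_replication(S, u, d, Rf, Cu, Cd):
    """Price a one-period claim by replication and by risk-neutral weights.

    S: current stock price; u, d: gross up/down returns; Rf: gross riskless
    return; Cu, Cd: option payoffs in the up and down states.
    """
    delta = (Cu - Cd) / (S * (u - d))        # stock holding
    bond = (u * Cd - d * Cu) / (Rf * (u - d))  # bond holding (negative = borrow)
    pi = (Rf - d) / (u - d)                  # risk-neutral up probability
    price_replication = delta * S + bond
    price_risk_neutral = (pi * Cu + (1 - pi) * Cd) / Rf
    return delta, bond, price_replication, price_risk_neutral


if __name__ == "__main__":
    # Hypothetical inputs: a call struck at 100 on a stock at 100.
    delta, bond, c_rep, c_rn = one_period_replication(
        S=100.0, u=1.1, d=0.9, Rf=1.02, Cu=10.0, Cd=0.0
    )
    print(delta, bond, c_rep, c_rn)
```

Both prices agree because the two expressions for C are algebraically identical once $\Delta$ and $B$ are substituted.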
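To implement this approach we need to calculate $u$ and $d$. The formal
specification is
$$u = \exp\left(\frac{\mu_a \tau}{n} + \frac{\sigma_a\sqrt{\tau}}{\sqrt{n}}\sqrt{\frac{1 - \theta}{\theta}}\right) \quad\text{and}\quad d = \exp\left(\frac{\mu_a \tau}{n} - \frac{\sigma_a\sqrt{\tau}}{\sqrt{n}}\sqrt{\frac{\theta}{1 - \theta}}\right)$$
but a “shortcut” specification is
$$u = \exp\left(\frac{\sigma_a\sqrt{\tau}}{\sqrt{n}}\right) \quad\text{and}\quad d = \exp\left(-\frac{\sigma_a\sqrt{\tau}}{\sqrt{n}}\right).$$
The subscript $a$ indicates annual figures, and continuous compounding should
be used. The life of the option is $\tau$ and there are $n$ periods in the binomial
tree. The corresponding riskless rate is $R_f = \exp(r_a \tau / n)$.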
Solving for the price of the option uses a recursive algorithm. At the
expiration of the option the value is given by $C = (S_T - K)^+ = \max(S_T - K, 0)$. Using these
values, the option price at $T - 1$ can be calculated. Stepping backwards
through the tree gives the initial option price.
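The backward recursion can be sketched as follows, using the shortcut specification for u and d. The function name and the inputs in the usage example are assumptions for illustration; this is a minimal version for a European call with no dividends.

```python
import math


def binomial_european_call(S, K, r_a, sigma_a, tau, n):
    """European call on an n-step tree with u = exp(sigma*sqrt(tau/n)), d = 1/u."""
    dt = tau / n
    u = math.exp(sigma_a * math.sqrt(dt))
    d = 1.0 / u
    Rf = math.exp(r_a * dt)          # gross riskless return per step
    pi = (Rf - d) / (u - d)          # risk-neutral up probability

    # Terminal payoffs; j counts the number of up moves.
    values = [max(S * u ** j * d ** (n - j) - K, 0.0) for j in range(n + 1)]

    # Step backwards: each node is the discounted risk-neutral expectation
    # of its two successors.
    for step in range(n, 0, -1):
        values = [
            (pi * values[j + 1] + (1 - pi) * values[j]) / Rf
            for j in range(step)
        ]
    return values[0]


if __name__ == "__main__":
    # Hypothetical inputs: at-the-money one-year call.
    print(binomial_european_call(100.0, 100.0, 0.05, 0.2, 1.0, 200))
```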
To get the price of a European put option, put-call parity can be used.
This is an arbitrage argument that requires
$$S + P = C + Ke^{-r\tau}.$$
Table 4.1: Early Exercise of American Options
         Call             Put
d = 0    Never            In the money
d > 0    Before ex-date   After ex-date
Volatility does not enter the equation directly since it affects the put and call
in the same way.
If the option is American it is necessary to check for early exercise at each
node in the tree. To do so, simply use $C = \max(C_h, C_x)$, where $h$ indicates
the hold value as calculated above and $x$ is the early exercise value. Early
exercise is never optimal for a call on a stock that does not pay dividends.
For the put to be exercised early it must be sufficiently in the money. If the
stock does pay dividends, calls may be exercised just before the ex-date and
puts just after the ex-date.
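The early exercise check amounts to one extra comparison at each node of the backward recursion. The sketch below prices a put on a non-dividend-paying stock; the function name and inputs are assumptions for illustration, and dividends are deliberately left out.

```python
import math


def binomial_put(S, K, r_a, sigma_a, tau, n, american=True):
    """Put on an n-step tree; with american=True each node takes max(hold, exercise)."""
    dt = tau / n
    u = math.exp(sigma_a * math.sqrt(dt))
    d = 1.0 / u
    Rf = math.exp(r_a * dt)
    pi = (Rf - d) / (u - d)

    # Terminal payoffs; j counts the number of up moves.
    values = [max(K - S * u ** j * d ** (n - j), 0.0) for j in range(n + 1)]

    for step in range(n - 1, -1, -1):      # step = time index of the node
        new_values = []
        for j in range(step + 1):
            hold = (pi * values[j + 1] + (1 - pi) * values[j]) / Rf
            if american:
                exercise = K - S * u ** j * d ** (step - j)
                hold = max(hold, exercise)
            new_values.append(hold)
        values = new_values
    return values[0]


if __name__ == "__main__":
    # Hypothetical inputs.
    print(binomial_put(100.0, 100.0, 0.05, 0.2, 1.0, 200, american=True))
    print(binomial_put(100.0, 100.0, 0.05, 0.2, 1.0, 200, american=False))
```

The difference between the two printed values is the early exercise premium for these inputs.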
The number of steps in the tree affects the answer for the option price. The
model value converges to the true value as the number of nodes gets large,
but at a computational expense. The model price generally changes very
little after about a hundred steps. There is an “odd-even” effect where the
calculated value oscillates between over- and under-valued as the number of
nodes is incremented. To remove this error, you can use a weighted average
of prices calculated at $n - 1$, $n$, and $n + 1$ nodes.
4.2.1 Alternative Derivations

CAPM-based derivation
The standard CAPM result is
$$E[r_i] = r_f + \beta_i (E[r_m] - r_f)$$
and $\beta_i = \sigma_{im}/\sigma_m^2 = \rho_{im}\sigma_i/\sigma_m$. Let $\lambda_i = \rho_{im}[E[r_m] - r_f]/\sigma_m$, the correlation-
adjusted market risk premium. Rewriting the CAPM relation,
$$E[r_i] = r_f + \lambda_i \sigma_i$$
or
$$E[R_i] = R_f + \lambda_i \sigma_i.$$
Now assume asset $i$ is an option written on a stock whose returns follow a
binomial process. $P^u$ and $P^d$ are the end-of-period state prices, with $\theta$ the
true probability of the up-state. The current price is given by $P$. Then
$$E[R_i] = \frac{\theta P^u + (1 - \theta)P^d}{P}$$
and
$$\sigma_i = \frac{P^u - P^d}{P}\sqrt{\theta(1 - \theta)}.$$
Substituting these expressions into the modified CAPM expression and rearranging
yields
$$P = \frac{P^u \pi + P^d (1 - \pi)}{R_f}$$
where the risk-neutral probability $\pi = \theta - \lambda_i\sqrt{\theta(1 - \theta)}$ is a function of the
true probabilities and the correlation-adjusted market price of risk. To avoid
arbitrage, all assets must be priced with the same risk-neutral probabilities.
Every dollar investment in the stock should be priced according to
$$1 = \frac{u\pi + d(1 - \pi)}{R_f}.$$
Rearranging and solving for $\pi$ gives
$$\pi = \frac{R_f - d}{u - d}.$$
Note that when $\lambda_i = 0$, $\pi = \theta$. This happens when investors are actually
risk-neutral or if the security is uncorrelated with the market. With $\lambda_i > 0$
the risk-neutral probabilities overstate the true probabilities in unfavorable
states and understate the truth in favorable states.
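A quick numerical check of the two expressions for the risk-neutral probability: applying the formulas to the stock itself (end-of-period values u and d per dollar invested), the value of λ implied by the true probability θ gives back (R_f − d)/(u − d). The function name and inputs below are assumptions for illustration.

```python
import math


def capm_risk_neutral_prob(u, d, Rf, theta):
    """Return (pi, lam): pi = theta - lam*sqrt(theta*(1-theta)) for the stock.

    lam is backed out from E[R] = Rf + lam*sigma, using the binomial mean
    and standard deviation of the gross stock return.
    """
    spread = math.sqrt(theta * (1.0 - theta))
    expected_gross = theta * u + (1.0 - theta) * d
    sigma = (u - d) * spread
    lam = (expected_gross - Rf) / sigma
    pi = theta - lam * spread
    return pi, lam


if __name__ == "__main__":
    # Hypothetical inputs: 60% true probability of the up-state.
    pi, lam = capm_risk_neutral_prob(u=1.2, d=0.9, Rf=1.05, theta=0.6)
    print(pi, lam, (1.05 - 0.9) / (1.2 - 0.9))
```

Because λ is positive here, π comes out below θ: the up-state is underweighted relative to its true probability.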
Relation to Black Scholes

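The indices $u$ and $d$ denote the up- and down-states, while all subscripts
denote partial derivatives. Begin with the single period binomial option
pricing equation
$$C = \frac{V^u \pi + V^d (1 - \pi)}{R_f}$$
where
$$\pi = \frac{R_f - d}{u - d} = \frac{e^{r\tau} - e^{-\sigma\sqrt{\tau}}}{e^{\sigma\sqrt{\tau}} - e^{-\sigma\sqrt{\tau}}}.$$
Assume a 50% probability of the up-state to get $u = e^{\sigma\sqrt{\tau}}$ and $d = e^{-\sigma\sqrt{\tau}}$.
Re-expressing $V^u$ and $V^d$
$$V^u = C(e^{\sigma\sqrt{\tau}}S, t + \tau) \quad\text{and}\quad V^d = C(e^{-\sigma\sqrt{\tau}}S, t + \tau).$$
Substitute into the binomial equation
$$C(S, t) = \frac{(e^{r\tau} - e^{-\sigma\sqrt{\tau}})\,C(e^{\sigma\sqrt{\tau}}S, t + \tau) + (e^{\sigma\sqrt{\tau}} - e^{r\tau})\,C(e^{-\sigma\sqrt{\tau}}S, t + \tau)}{e^{r\tau}[e^{\sigma\sqrt{\tau}} - e^{-\sigma\sqrt{\tau}}]}.$$
Next, perform several Taylor series expansions.
$$\Delta S^u = (e^{\sigma\sqrt{\tau}} - 1)S, \qquad \Delta S^d = (e^{-\sigma\sqrt{\tau}} - 1)S, \qquad \Delta t = [(t + \tau) - t] = \tau$$
$$e^{\sigma\sqrt{\tau}} = 1 + \sigma\sqrt{\tau} + \tfrac{1}{2}\sigma^2\tau, \qquad e^{-\sigma\sqrt{\tau}} = 1 - \sigma\sqrt{\tau} + \tfrac{1}{2}\sigma^2\tau, \qquad e^{r\tau} = 1 + r\tau.$$
$$C(e^{\sigma\sqrt{\tau}}S, t + \tau) = C + (e^{\sigma\sqrt{\tau}} - 1)SC_S + \tfrac{1}{2}(e^{\sigma\sqrt{\tau}} - 1)^2 S^2 C_{SS} + \tau C_t$$
and similarly for the down state. Substituting all this into the expanded
binomial formula, cancelling like terms, and dropping terms involving
higher orders of $\tau$ gives the Black-Scholes PDE
$$C_t = rC - rSC_S - \tfrac{1}{2}\sigma^2 S^2 C_{SS}.$$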
4.2.2 Trinomial Models

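Multinomial models are based on matching risk-neutral moments. For example,
the trinomial model requires three probabilities, $p_u$, $p_m$, and $p_d$. If the
stock price process is
$$\frac{dS}{S} = (r - \tfrac{1}{2}\sigma^2)dt + \sigma dW = \alpha k + \sigma dW$$
where $k$ is the length of the time step and $h$ is the size of an up or down move,
then $E[dS/S] = \alpha k$ and the second moment is $E[(dS/S)^2] = \alpha^2 k^2 + \sigma^2 k$
(the variance $\sigma^2 k$ plus the squared mean). Three equations are used
to solve for the three unknown probabilities
$$p_u h + p_m 0 + p_d (-h) = \alpha k$$
$$p_u h^2 + p_m 0^2 + p_d h^2 = \alpha^2 k^2 + \sigma^2 k$$
$$p_u + p_m + p_d = 1.$$
The resulting answers are
$$p_u = \frac{1}{2}\left(\sigma^2\frac{k}{h^2} + \alpha^2\frac{k^2}{h^2} + \alpha\frac{k}{h}\right)$$
$$p_m = 1 - \sigma^2\frac{k}{h^2} - \alpha^2\frac{k^2}{h^2}$$
$$p_d = \frac{1}{2}\left(\sigma^2\frac{k}{h^2} + \alpha^2\frac{k^2}{h^2} - \alpha\frac{k}{h}\right)$$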
4.3 Black Scholes Model

The famous Black and Scholes (1973) option pricing model and its extensions
by Merton (1973) have revolutionized derivative pricing.
4.3.1 Black Scholes Derivations

Derivation I: Replication
Assume the stock price follows GBM
$$dS = \mu S dt + \sigma S dW$$
and there is a riskless asset $B = e^{rt}$. The option price depends on the stock
price and time, $C(S, t)$. Using Ito’s Lemma
$$dC = C_t dt + C_S dS + \tfrac{1}{2}C_{SS}(dS)^2.$$
Making the substitutions gives
$$dC = (C_t + \mu S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS})dt + \sigma S C_S dW = \mu_C C dt + \sigma_C C dW.$$
Form an arbitrage portfolio with investments $w_S + w_C + w_B = 0$. The return
on this investment is
$$\frac{d\Pi}{\Pi} = w_S\frac{dS}{S} + w_C\frac{dC}{C} + w_B r dt$$
$$= w_S[\mu dt + \sigma dW - r dt] + w_C[\mu_C dt + \sigma_C dW - r dt]$$
$$= [w_S(\mu - r) + w_C(\mu_C - r)]dt + [w_S\sigma + w_C\sigma_C]dW$$
Choose $w_S$ and $w_C$ such that there is no risk, $w_S\sigma + w_C\sigma_C = 0$. With no
risk, $w_S(\mu - r) + w_C(\mu_C - r) = 0$ to avoid arbitrage so
$$\frac{\mu - r}{\sigma} = \frac{\mu_C - r}{\sigma_C} = \lambda,$$
the market price of risk. Making the substitutions
$$\frac{\mu - r}{\sigma} = \frac{(C_t + \mu S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS})/C - r}{\sigma S C_S / C}.$$
Simplifying gives the PDE
$$C_t = rC - rSC_S - \tfrac{1}{2}\sigma^2 S^2 C_{SS}.$$
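For completeness, the simplification step can be written out as follows. Cross-multiply the market price of risk equality, multiply through by $C/\sigma$, and note that the $\mu S C_S$ terms cancel, so the drift $\mu$ drops out of the PDE.

```latex
\begin{align*}
(\mu - r)\,\frac{\sigma S C_S}{C}
  &= \sigma\left[\frac{C_t + \mu S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS}}{C} - r\right] \\
(\mu - r)\, S C_S
  &= C_t + \mu S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS} - rC \\
-\,r S C_S
  &= C_t + \tfrac{1}{2}\sigma^2 S^2 C_{SS} - rC \\
C_t &= rC - rSC_S - \tfrac{1}{2}\sigma^2 S^2 C_{SS}
\end{align*}
```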
Derivation II: Using CAPM

An alternative derivation uses the CAPM. The beta of an option is a function
of the stock beta and the elasticity of the option price with respect to the
stock price
$$\beta_C = \frac{S}{C} C_S \beta_S.$$
The expected return on the stock and option are
$$E\left[\frac{dS}{S}\right] = (r + \alpha\beta_S)dt = \mu dt \quad\text{and}\quad E\left[\frac{dC}{C}\right] = (r + \alpha\beta_C)dt = \mu_C dt.$$
Making the substitution,
$$E[dC] = (rC + \alpha S C_S \beta_S)dt.$$
By Ito’s Lemma
$$dC = (C_t + \mu S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS})dt + \sigma S C_S dW = \hat{\mu}_C C dt + \sigma_C C dW.$$
Taking expectations and setting the two expressions equal gives the Black
Scholes PDE
$$C_t = rC - rSC_S - \tfrac{1}{2}\sigma^2 S^2 C_{SS}$$
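The step of equating the two expectations can be spelled out. Using $\mu = r + \alpha\beta_S$ in the Ito drift and matching it with the CAPM expression for $E[dC]$, the risk premium terms $\alpha\beta_S S C_S$ cancel:

```latex
\begin{align*}
C_t + (r + \alpha\beta_S) S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS}
  &= rC + \alpha S C_S \beta_S \\
C_t + r S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS} &= rC \\
C_t &= rC - rSC_S - \tfrac{1}{2}\sigma^2 S^2 C_{SS}
\end{align*}
```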
Solving the PDE

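The following method makes use of the Feynman-Kac (Cox-Ross) solution.
The boundary condition is $C(S_T, T) = (S_T - K)^+$.
$$C = E^Q[e^{-r\tau}(S_T - K)^+] = e^{-r\tau}E^Q[(S_T - K)^+]$$
$$= e^{-r\tau}E^Q[S_T | S_T \ge K]\,\mathrm{Prob}[S_T \ge K] - Ke^{-r\tau}\,\mathrm{Prob}[S_T \ge K].$$
Next, get the conditional distribution
$$\ln S_T | \ln S_t \sim N(\ln S_t + (r - \sigma^2/2)\tau, \sigma^2\tau) = N(m, v^2).$$
The density is$^1$
$$f(S_T | S_t) = f(\ln S_T | \ln S_t) \cdot \frac{\partial \ln S_T}{\partial S_T}$$
$$= \frac{1}{\sqrt{2\pi\sigma^2\tau}}\exp\left[-\frac{(\ln S_T - \ln S_t - (r - \sigma^2/2)\tau)^2}{2\sigma^2\tau}\right]\frac{1}{S_T}$$
$$= \frac{1}{vS_T\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln S_T - m}{v}\right)^2\right]$$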
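$^1$ To derive this realize that under $Q$, $dS = Sr\,dt + S\sigma\,dZ$. Let $x = \ln S$ so
$$dx = \frac{\partial x}{\partial S}dS + \frac{1}{2}\frac{\partial^2 x}{\partial S^2}(dS)^2 = \frac{dS}{S} - \frac{1}{2S^2}(dS)^2 = (r - \tfrac{1}{2}\sigma^2)dt + \sigma dZ.$$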
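Next calculate the terms involving $S_T$ and $K$
$$\mathrm{Prob}[S_T \ge K] = \mathrm{Prob}[\ln S_T \ge \ln K] = 1 - \mathrm{Prob}[\ln S_T \le \ln K]$$
$$= 1 - \Phi\left(\frac{\ln K - m}{v}\right) = \Phi\left(\frac{m - \ln K}{v}\right)$$
$$= \Phi\left(\frac{\ln(S_t/K) + (r - \sigma^2/2)\tau}{\sigma\sqrt{\tau}}\right) = N(d_2).$$
Using the same idea and a change of variable $y = \ln S_T$ so $e^y = S_T$ and
$dS_T = e^y dy$
$$E^Q[S_T | S_T \ge K]\,\mathrm{Prob}[S_T \ge K]$$
$$= \int_K^\infty S_T \frac{1}{v\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln S_T - m}{v}\right)^2\right]\frac{1}{S_T}\,dS_T$$
$$= \int_{\ln K}^\infty \frac{1}{v\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln S_T - m}{v}\right)^2\right]\exp(\ln S_T)\,d\ln S_T$$
$$= \int_{\ln K}^\infty \frac{1}{v\sqrt{2\pi}}\exp\left[-\frac{1}{2v^2}\left(\ln S_T - (m + v^2)\right)^2 + m + v^2/2\right]d\ln S_T$$
$$= \exp(m + v^2/2)\int_{\ln K}^\infty \frac{1}{v\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln S_T - (m + v^2)}{v}\right)^2\right]d\ln S_T$$
$$= \exp(m + v^2/2)\left[1 - \Phi\left(\frac{\ln K - (m + v^2)}{v}\right)\right] = \exp(\cdot)\,\Phi\left(\frac{m + v^2 - \ln K}{v}\right)$$
$$= \exp\left(\ln S_t + (r - \sigma^2/2)\tau + \sigma^2\tau/2\right)\Phi\left(\frac{\ln(S_t/K) + (r - \sigma^2/2)\tau + \sigma^2\tau}{\sigma\sqrt{\tau}}\right)$$
$$= Se^{r\tau}N(d_1)$$
Combining these results gives the Black Scholes model
$$C(S, t) = SN(d_1) - Ke^{-r\tau}N(d_2)$$
where
$$d_1 = \frac{\ln(S/K) + (r + \sigma^2/2)\tau}{\sigma\sqrt{\tau}} \quad\text{and}\quad d_2 = d_1 - \sigma\sqrt{\tau}.$$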
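The closed-form result translates directly into code. The sketch below uses only the standard library, with the normal CDF built from the error function; the function names and inputs are assumptions for illustration, and the put is obtained from put-call parity.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(S, K, r, sigma, tau):
    """C = S N(d1) - K exp(-r tau) N(d2) for a non-dividend-paying stock."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def black_scholes_put(S, K, r, sigma, tau):
    """European put from put-call parity: P = C + K exp(-r tau) - S."""
    return black_scholes_call(S, K, r, sigma, tau) + K * math.exp(-r * tau) - S


if __name__ == "__main__":
    # Hypothetical inputs: at-the-money one-year options.
    print(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0))
    print(black_scholes_put(100.0, 100.0, 0.05, 0.2, 1.0))
```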
4.3.2 Implied Volatilities

The volatility parameter is the most difficult to obtain and perhaps the most
important. An alternative to using the model to give an option price is to
invert the model to give an implied volatility, taking option prices as inputs.
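Since the call price is increasing in volatility, the inversion can be done with a simple bisection search. The sketch below is one possible approach, not a method prescribed by the notes; the function names, search bounds, and tolerance are assumptions.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(S, K, r, sigma, tau):
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def implied_volatility(price, S, K, r, tau, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma; relies on the call price being increasing in sigma.

    Assumes the observed price lies between the model prices at lo and hi.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_scholes_call(S, K, r, mid, tau) > price:
            hi = mid          # model price too high: volatility is lower
        else:
            lo = mid          # model price too low: volatility is higher
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical round trip: price at sigma = 0.2, then invert.
    market_price = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
    print(implied_volatility(market_price, 100.0, 100.0, 0.05, 1.0))
```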
4.3.3 Hedging
Hedging involves forming portfolios to reduce or minimize various types of
risk. The most common hedge is a delta-neutral position. This investment
has an expected price change of zero when the stock price changes — the
loss from a drop in the stock is offset by a gain on an option. This is a local
hedge, since the delta changes when the stock price changes. A gamma-
neutral hedge preserves the delta-hedge. Other hedges include rho for the
interest rate and vega for volatility. Again, these are partial hedges and
assume everything else is constant.
To determine the appropriate hedge, find the options with the maximum
and minimum pricing error per unit of stock-equivalent risk, $(\text{model} - \text{market})/\Delta$. Buy
and sell these options in amounts proportional to the inverse of the delta to
balance the stock-equivalent risk. For a gamma hedge, combine two delta-neutral
portfolios such that the gammas balance.
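As an illustration of delta- and gamma-neutral positions, the sketch below hedges one long call using a second call (to cancel gamma) and the stock (to cancel the remaining delta). It uses the standard Black-Scholes delta $N(d_1)$ and gamma $n(d_1)/(S\sigma\sqrt{\tau})$; the function names and the specific options are assumptions, and this is not the pricing-error strategy described above.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def call_delta_gamma(S, K, r, sigma, tau):
    """Black-Scholes delta and gamma of a European call."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    delta = norm_cdf(d1)
    gamma = norm_pdf(d1) / (S * sigma * math.sqrt(tau))
    return delta, gamma


def delta_gamma_hedge(target, hedge_option):
    """Units of hedge option and stock per unit of the target option held long.

    target and hedge_option are (delta, gamma) pairs. The stock has delta 1
    and gamma 0, so it is used last to clean up the remaining delta.
    """
    n_option = -target[1] / hedge_option[1]
    n_stock = -(target[0] + n_option * hedge_option[0])
    return n_option, n_stock


if __name__ == "__main__":
    # Hypothetical position: long one 100-strike call, hedged with a 110-strike call.
    target = call_delta_gamma(100.0, 100.0, 0.05, 0.2, 0.5)
    hedge = call_delta_gamma(100.0, 110.0, 0.05, 0.2, 0.5)
    print(delta_gamma_hedge(target, hedge))
```

The hedge is local: both Greeks change as the stock price and time move, so the position must be rebalanced.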
4.4 Advanced Topics

4.4.1 American Options
Boundary Conditions
Define $S_t^*$ as the exercise boundary. Conditions for early exercise require
$$\lim_{S_t \to S_t^*} C(S_t) = S_t^* - K$$
$$\lim_{S_t \to S_t^*} \frac{\partial C(S_t)}{\partial S_t} = 1.$$
With dividends the stock process is
$$\frac{dS}{S} = (r - \delta)dt + \sigma d\tilde{W}$$
so
$$dC = [C_t + (r - \delta)SC_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS}]dt + \sigma_C d\tilde{W} = rC\,dt + \sigma_C d\tilde{W}.$$
The resulting PDE is
$$C_t + (r - \delta)SC_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS} - rC = 0$$
with boundary conditions
$$C_T(S_T) = (S_T - K)^+ \quad\text{and}\quad C_0(S_0) = \sup_{\tau \in [t,T]} E^Q[e^{-r(\tau - t)}(S_\tau - K)^+].$$
At the boundary you are indifferent to exercising since exercising generates
$$dS + (\delta S - rK)dt$$
while continuing generates
$$dC = (C_t + \tfrac{1}{2}\sigma^2 S^2 C_{SS})dt + C_S dS$$
$$= [rC - (r - \delta)S]dt + dS$$
$$= [r(S - K) - (r - \delta)S]dt + dS$$
$$= (\delta S - rK)dt + dS.$$
To exercise you borrow $rK$ and receive $\delta S$, so $rK = \delta S$ and $S_T^* = rK/\delta$.
Integration
Broadie & DeTemple and Barone-Adesi & Whaley. Let Ct and ct denote
American and European call option values. We can write
$$C_t(S_t) = c_t(S_t) + \varepsilon_t$$
so
$$C_0(S_0) = E^Q[e^{-rt}(S_T - K)^+] + E^Q\left[\int_0^T \varepsilon\, e^{-rt}\, df(\tau)\right].$$
CHECK
L-U Bound
Broadie & DeTemple find upper and lower bounds on American options by
using capped calls. A capped call value can be found for a given early exercise
path.
BBS/Richardson Extrapolation
The binomial Black-Scholes method (BBS) is essentially a binomial tree with
the analytic BS formula attached at the last node. This avoids some of the
problems from discretization in a tree, but preserves the ability to price
American options. Richardson extrapolation involves calculating the price
with $N$ nodes and again with $2N$ nodes. The option price is then calculated
as twice the second minus the first value (e.g., $p = 2p_{2N} - p_N$), which cancels
the leading error term when the error is proportional to $1/N$. This avoids
the odd-even effect and allows use of a small $N$.
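A minimal sketch of BBS with Richardson extrapolation for a put on a non-dividend-paying stock follows. The tree stops one step before expiration, where the Black-Scholes put value over the remaining step replaces the usual discounted payoff. Function names and inputs are assumptions, and the extrapolation uses the two-point form $2p_{2N} - p_N$, which assumes the error shrinks in proportion to $1/N$.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_put(S, K, r, sigma, tau):
    """Black-Scholes European put."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return K * math.exp(-r * tau) * norm_cdf(-d2) - S * norm_cdf(-d1)


def bbs_put(S, K, r, sigma, tau, n, american=True):
    """Binomial tree with the BS formula attached one step before expiration."""
    dt = tau / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    Rf = math.exp(r * dt)
    pi = (Rf - d) / (u - d)

    # Nodes at time n-1: continuation value is the analytic BS put over dt.
    values = []
    for j in range(n):
        s = S * u ** j * d ** (n - 1 - j)
        v = bs_put(s, K, r, sigma, dt)
        values.append(max(v, K - s) if american else v)

    # Ordinary backward recursion for the earlier steps.
    for step in range(n - 2, -1, -1):
        new_values = []
        for j in range(step + 1):
            s = S * u ** j * d ** (step - j)
            hold = (pi * values[j + 1] + (1 - pi) * values[j]) / Rf
            new_values.append(max(hold, K - s) if american else hold)
        values = new_values
    return values[0]


def bbsr_put(S, K, r, sigma, tau, n, american=True):
    """Two-point Richardson extrapolation: 2 * p(2n) - p(n)."""
    p_n = bbs_put(S, K, r, sigma, tau, n, american)
    p_2n = bbs_put(S, K, r, sigma, tau, 2 * n, american)
    return 2.0 * p_2n - p_n


if __name__ == "__main__":
    # Hypothetical inputs.
    print(bbsr_put(100.0, 100.0, 0.05, 0.2, 1.0, 50, american=True))
    print(bbsr_put(100.0, 100.0, 0.05, 0.2, 1.0, 50, american=False))
    print(bs_put(100.0, 100.0, 0.05, 0.2, 1.0))
```

Comparing the European BBSR value with the analytic put gives a rough sense of the discretization error for a given $N$.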
4.4.2 Exotic Options

Barrier options utilize the reflection principle. Put-call symmetry says $C(S, K, r, \delta) = P(K, S, \delta, r)$.
To price a down-and-out call, let $H$ denote the barrier, $x_t = \ln(S_t/S_0)$, $y_t = \inf_{t \in [0,T]} x_t$, $Y_t = \sup_{t \in [0,T]} x_t$, and $y = \ln(H/S_0)$. Then
$$C_{doc} = e^{-rt}E^Q\left[(S_T - K)^+\,\mathrm{Prob}[y_T \ge y]\right]$$
$$= e^{-rt}E^Q\left[(S_0 e^{x_T} - K)\,\mathrm{Prob}[y_T \ge y,\ x_T > \ln(K/S_0)]\right]$$
Lookback option payoffs are
Standard: $C = (S_T - M_{T_0}^T)^+$, $P = (M_{T_0}^T - S_T)^+$
Extreme: $C = (M_{T_0}^T - K)^+$, $P = (K - M_{T_0}^T)^+$
Asian options can be of the form
$C = (\bar{S} - K)^+$, $P = (K - \bar{S})^+$
$C = (S - \bar{K})^+$, $P = (\bar{K} - S)^+$
Table 4.2: Common Interest Rate Models
Stochastic Interest Rate    Stochastic Term Structure
Rendleman & Bartter         Ho & Lee
Courtadon                   HJM
Vasicek                     Black, Derman & Toy
CIR
4.4.3 Other Advanced Topics

Stochastic Volatility and Jumps
Monte Carlo, QMC, etc.
Parametric Pricing
4.5 Interest Rate Derivatives

An underlying assumption of the preceding option models is that the asset
follows a lognormal or binomial process. For fixed income securities this
assumption is not valid. The price of these securities must end up back at
par when they mature. Surprisingly, the fact that we know the terminal price
makes option pricing more difficult. With the view that this is an additional
constraint on the system it is more understandable why interest rate options
have this added complexity. Many of the interest rate models are discussed
more fully in Chapter 3.
There are three basic steps in interest rate option pricing. First, random
interest rates are modeled. Next, the distribution of the interest rates is
used to infer the distribution of prices for the underlying debt instrument.
Finally, the distribution of the underlying asset is used to price the option.
Interest rate option models can be broadly categorized as stochastic interest
rate models and stochastic term structure models. Refer to Section 3.5 for
a discussion. The stochastic interest rate approach is subject to error since
the option model is based on bond prices that are potentially wrong. In the
stochastic term structure models, market data is used to get a distribution
for the bond price, which, in turn, is used to price the option.
4.5.1 Stochastic Interest Rate Models

The Rendleman & Bartter model uses a binomial process and assumes inter-
est rate changes are a constant percentage. Courtadon models the interest
rate process in continuous time. Like the RB model, interest rates are lognor-
mal. However, Courtadon adds a mean-reversion feature which overcomes
the problem in RB that the interest rate can become infinitely large. The
Vasicek model includes mean-reversion, but uses a normal process, allowing
negative interest rates. The CIR model modifies Vasicek by using a square
root process which produces variance proportional to the level of the interest
rate. Refer to Table 3.1 for a summary of model specifications.
4.5.2 Stochastic Term Structure Models

The presentation of the following models is based on their discrete time
analogs.
Ho & Lee
The Ho-Lee model generates parallel shifts in the yield curve. In a binomial
setting it produces a recombining tree. It is based on
$$R_{t,j} = \frac{D^{(t)}[\pi + (1 - \pi)\delta^t]}{D^{(t+1)}\delta^{t-j}}$$
where $\delta = \exp[-2\phi(\tau/n)^{1.5}]$, $D^{(t)}$ is the current price of the bond maturing
at time $t$, and $\phi$ is the standard deviation of the log yield of one-year discount
bonds.
BDT
The Black, Derman, & Toy model features a fixed ratio of adjacent prices
at each point in time, αt . The rate can be expressed as rt,j = αj rt,0 . In the
Ho-Lee model this ratio is fixed for all t.
Heath, Jarrow, & Morton

The Heath, Jarrow & Morton model is the most general term structure model.
Although the HJM model is set in continuous time, a discrete time analog is
available. Market bond prices and volatilities are used to determine the
interest rate process (tree). In terms of notation, let $D^m_{t,k}$ denote the price
of a bond maturing at $m$ observed at time $t$ in state $k$. Since the tree does
not recombine, there are $2^t$ nodes at time $t$. The states are indexed with the
lowest (all downs) state as 0 and the highest state as $2^t - 1$. By convention,
up states are when the bond price increases (and interest rate decreases).
The risk neutral and true probabilities of an up-state are $\pi$ and $\theta$. There are
two equations expressing current price and volatility as functions of the next
period prices.
$$D^m_{t,k} = \frac{\pi D^m_{t+1,2k+1} + (1 - \pi)D^m_{t+1,2k}}{1 + r_{t,k}}$$
$$\sigma_{t+1} = \frac{\ln\left(\frac{1}{D^m_{t+1,2k}}\right) - \ln\left(\frac{1}{D^m_{t+1,2k+1}}\right)}{2(m - t - 1)}$$
The second equation can be more conveniently expressed as
$$D^m_{t+1,2k+1} = \exp[\sigma_{t+1} \cdot 2(m - t - 1)]\,D^m_{t+1,2k}$$
The values $D^m_{t,2k}$ and $\sigma_{t+1}$ are generally estimated (or given) and the prices
at $t + 1$ are determined by solving the equations simultaneously at each node.
Unlike the standard binomial pricing model, this model is solved by stepping
forward through the tree.
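One forward step of this procedure can be sketched by solving the two equations for the pair of next-period prices at a single node. Substituting the volatility equation into the pricing equation gives the down-state price, and the up-state price follows from the ratio. The function name and inputs are assumptions for illustration.

```python
import math


def hjm_next_prices(D_now, r_now, sigma_next, pi, m, t):
    """Solve the price and volatility equations for (D_up, D_down) at t + 1.

    D_now: current price of the bond maturing at m, observed at t in state k.
    r_now: one-period rate at the current node.
    sigma_next: volatility for period t + 1.
    pi: risk-neutral probability of the up-state.
    """
    ratio = math.exp(sigma_next * 2.0 * (m - t - 1))   # D_up / D_down
    D_down = D_now * (1.0 + r_now) / (pi * ratio + (1.0 - pi))
    D_up = ratio * D_down
    return D_up, D_down


if __name__ == "__main__":
    # Hypothetical node: three-period bond observed at t = 0.
    print(hjm_next_prices(D_now=0.90, r_now=0.05, sigma_next=0.01, pi=0.5, m=3, t=0))
```

Repeating this at every node, moving forward in time, fills in the non-recombining tree.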
Chapter 5
Corporate Finance
5.1 Introduction
Corporate finance covers a range of issues related to the choice of capital
structure, distribution of cashflows, and issuance of securities. Asymmetric
information problems are common and are the subject of much of the work
in corporate finance. Also important are the agency costs that arise from
the conflicts of interest between the decision makers and other parties. A
common example of an agency cost arises between the manager and the outside
owners of the firm. The compensation contract offered to managers is one
way of dealing with this agency cost.
Many of the earlier works make relatively strong assumptions. The last
few sections attempt to understand the implications of relaxing these as-
sumptions. When demand for assets is not perfectly elastic there will be
price effects caused by changes in quantity. Similarly, imperfections may
give rise to financial innovation.
5.2 Information Asymmetry/Signaling

An information asymmetry occurs when one group of agents has better in-
formation than other groups. Adverse selection arises when an agent making
a decision is better informed than the person with whom he is contracting.
This is different than moral hazard, where the agent with superior informa-
tion can influence the outcomes by his action. A signal is an action that
an agent takes to provide credible information. The signal must impose a
greater cost on the “low quality” agents than on the “high quality” agents
to prevent mimicking.
A common element of signaling papers in finance is that the source of the
informational asymmetry generally comes from managers’ superior forecasts
of future cash flows. Investors are typically homogeneous with respect to
taxes and restrictions on trading, otherwise clienteles would arise. Models
also usually prevent the manager from trading personally.
The outcomes depend on the nature of the informational asymmetry. If
the asymmetry is over the assets in place, but not the new project, overpricing
and project scaling are the most efficient signals. If there is asymmetric
information about the project’s value, and good firms have more valuable
projects than bad firms, signals which burn money after the issuance are
dominant.
The overinvestment (when only project value is asymmetric information)
can be eliminated through money burning, but the underinvestment problem
(when there is differential information about the assets in place) is not com-
pletely solved. The distinction that the burned money comes from project
cash flows is important. Equity financed money burning is an inefficient
signal.
Akerlof (1970)
In his famous “lemons” paper, Akerlof (1970) shows how quality uncertainty
can affect the size and average quality in the automobile market. In extreme
cases, markets can fail completely. A case is made for the role of certain
institutions in improving the efficiency of the market.
The sellers of the cars know more about the quality of the cars than the
buyers. Demand depends on price and average quality, $Q_D = D(p, \mu)$. The
average quality will also depend on price, $\mu = \mu(p)$. Supply depends on price
as well, $Q_S = S(p)$. In equilibrium, $S(p) = D(p, \mu(p))$. A low average quality
will cause the owners of good cars not to sell, further lowering the average
quality.
Several applications of the basic model are discussed. In the insurance
market, healthy individuals will tend to opt out of the market, leaving the
insurer with a disproportionately large share of the unhealthy. Costs of
dishonesty must include both the direct costs as well as the indirect costs of
driving business out of the market. There are several institutions that can
mitigate these types of problems. Risk transferring guarantees can allow the
owners of good cars to get their fair value. Brand-names and chains can also
reduce quality uncertainty, as do licensing practices.
Spence
Spence (1973) develops the signaling model in the context of the job market.
This is really just one example of an investment under uncertainty problem.
Here the potential employer can not observe the quality of the applicants, but
can use education to make rational inferences about the applicant’s quality.
Education separates the types of applicants because it is costly to obtain, and
more so for the low types. The high types will obtain just enough education
to make it unattractive for a low type to mimic.
The employers offer wage schedules that are a function of the educational
signal and other (non-signal) indices. Individuals choose education levels to
maximize wages net of signaling costs.
For the signaling equilibrium to work the costs of signaling must be neg-
atively correlated with productive capacity. Otherwise, lower quality types
will overinvest in the signal to mimic higher quality types. The use of indices
results in forming probability distributions conditional on both the signal
and the indices. This segments the population by indices, and these subsets
need not have the same equilibrium.
Spence (1974) is a more general description of the signaling environment.
There must be an information asymmetry where the seller knows more about
the good than the buyer. The seller signals and the buyer responds. The
signal is based on the anticipated response of the buyer.
Myers & Majluf (1984)

Myers and Majluf (1984) is the classic paper on financing under asymmetric
information. A firm has a positive NPV project that requires external financ-
ing. If the manager believes the stock is underpriced, pursuing the project
requires issuing underpriced stock, diluting the value to existing sharehold-
ers. Investors will then believe that when a firm does issue, it is likely that
the stock is overpriced. Consequently, announcements of new issues generate
a share price decline.
In the basic model, a firm has existing assets in place, $a$, and a valuable
investment opportunity, $b > 0$. The project is all-or-nothing and requires
the issuance of equity to make the initial investment $I$. The firm currently
has slack $S$, which is fixed and publicly known, and would need to raise
additional equity $E = I - S$. There are three dates in the model. At time
$t - 1$ the market has the same information as management; valuations are
given by $\bar{A}$ and $\bar{B}$. At time $t$ the manager learns $a$ and $b$, while the market
knows only the distribution of $\tilde{A}$ and $\tilde{B}$. Additional assumptions are perfect
markets, costly signaling, and passive existing shareholders.
Managers act in the interest of old shareholders by maximizing $V_0^{old} = V(a, b, E)$.
The market value of the shares will generally be different from
the manager’s valuation since the market does not know $a$ or $b$. Denote the
market value $P'$ if stock is issued and $P$ otherwise. The managers will issue
new stock when
$$\frac{E}{P' + E}(S + a) \le \frac{P'}{P' + E}(E + b).$$
In words this says the old shareholders must get more of the new value than
the new shareholders get of the original value. The firm is more likely to
issue when $b$ is high or $a$ is low. Rearranging, the indifference equation is
$$b = (E/P')(S + a) - E.$$
Above this line the firm will issue and invest, below it will do nothing. The
issue price $P'$ is given by
$$P' = S + \bar{A}(M') + \bar{B}(M')$$
where the last terms represent expected values given issuance.

Unless the firm is certain to issue, $P' < P$. The decision not to issue
is good news about the value of the existing assets. The ex ante loss from
passing up good projects is $L = F(M)\bar{B}(M)$. With $S > I$, $L = 0$. If the
firm could be split, then the problem goes away. The solution of $P'$ requires
a simple numerical algorithm. Start by setting $P' = S + \bar{A} + \bar{B}$. Then
determine $M$ and $M'$ and calculate $P' = S + \bar{A}(M') + \bar{B}(M')$. Repeat this
procedure until convergence.
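The iteration just described can be sketched for a discrete set of states. Each state is a (probability, a, b) triple; M' is the set of states in which the firm issues under the rule b ≥ (E/P')(S + a) − E, and M is its complement. The function name, the two-state numbers, and the convergence tolerance are assumptions for illustration only.

```python
def solve_issue_price(states, S, E, max_iter=100, tol=1e-12):
    """Fixed-point iteration for the issue price P'.

    states: list of (prob, a, b) triples.
    Returns (P_prime, issuing_states, ex_ante_loss); P_prime is None if the
    firm never issues.
    """
    total = sum(p for p, _, _ in states)
    # Start from the unconditional value S + A_bar + B_bar.
    P = S + sum(p * (a + b) for p, a, b in states) / total
    issuing = []
    for _ in range(max_iter):
        issuing = [(p, a, b) for p, a, b in states if b >= (E / P) * (S + a) - E]
        if not issuing:
            return None, [], sum(p * b for p, _, b in states) / total
        mass = sum(p for p, _, _ in issuing)
        P_new = S + sum(p * (a + b) for p, a, b in issuing) / mass
        converged = abs(P_new - P) < tol
        P = P_new
        if converged:
            break
    # Ex ante loss: probability of not issuing times the expected project
    # value given no issue, i.e. F(M) * B_bar(M).
    passing = [s for s in states if s not in issuing]
    loss = sum(p * b for p, _, b in passing) / total
    return P, issuing, loss


if __name__ == "__main__":
    # Hypothetical two-state example with no slack and E = I = 100.
    states = [(0.5, 150.0, 20.0), (0.5, 50.0, 10.0)]
    print(solve_issue_price(states, S=0.0, E=100.0))
```

In this example the high-asset-value firm prefers to pass up its project rather than issue underpriced shares, so only the low-value state issues and the issue price reflects that.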
The above analysis can be extended to include debt financing. If the
firm can issue riskless debt the problem disappears — the firm always issues
riskless debt and takes the project. If the firm can only issue risky debt the
problem is reduced, but not eliminated. Thus, the general rule is to issue
securities less subject to mispricing first. The firm will issue and invest only
when b ≥ ∆D (or ∆E). We should have |∆D| < |∆E| and with the same
signs. In this case the firm will never issue equity. Any time it decides to
issue it will use debt. This extreme condition can be tempered by introducing
costs of debt such as bankruptcy or agency costs. Note that if the information
asymmetry is about the variance of value rather than the mean, then equity
will dominate debt.
The model makes a number of predictions. It says it is generally better
to issue safe securities, a pecking order result. Firms with insufficient slack
may forgo good investment opportunities — the underinvestment problem.
Firms can build up slack by retaining earnings or issuing securities when
information asymmetries are small to avoid some of these problems. Firms
should avoid issuing risky securities to pay dividends. Stock price will fall
when managers have superior information and they issue securities. A merger
between a firm with little slack and one with a lot of slack is likely to increase
value, but negotiating such a merger is likely to be difficult.
The basic Myers and Majluf (1984) framework has been extended in a
number of ways, including dividend policy, scale of investment, project tim-
ing, and public offerings (overpricing and underpricing).
Cooney & Kalay (1993)

Cooney and Kalay (1993) extend the classic Myers and Majluf (1984) paper to allow for the
possibility of negative NPV projects. With this additional realism the stock price
reaction to equity issuance is not necessarily negative. Note that it is the
existence, not acceptance, of negative NPV projects that drives this result.
In the new model, low values of a can cause overinvestment. Firms will
accept negative NPV projects in order to sell overvalued existing assets.
Also, firms with riskier new projects may experience stock price increases
on issuance announcement. The revised model has lower issue prices and
probability of issuance. When there is a limited supply of zero NPV projects
(e.g., transactions costs and taxes for financial investments) there may be
positive announcement effects.
5.3 Agency Theory

An agency relationship is a contract under which a principal engages an agent
to perform some task on his behalf which involves delegating some decision-
making authority to the agent. The agent may not always have natural
incentives to act in the best interest of the principal. The principal can address
this problem by establishing the appropriate incentives and/or monitoring
the agent. Incentive alignment is rarely free; agency costs are defined as the
sum of monitoring costs, bonding costs, and the residual loss.
Jensen & Meckling (1976)

In a widely cited paper, Jensen and Meckling (1976) develop a theory of the
ownership structure of the firm using elements of property rights, agency,
and financial theory. Property rights specify how costs and rewards will be
allocated among the participants in an organization. The firm is defined as a
legal fiction which serves as a “nexus for contracting.” The firm has divisible
claims on assets and cashflows, but does not have intentions, behaviors, or
motivations. The paper focuses on the positive aspects of agency theory —
the interaction of the various parties assuming they act optimally. Most of
the previous literature was normative in nature.
The presence of inside and outside equity owners introduces an agency
cost of outside equity. This arises because the outsider funds a portion of the
insider’s perquisite consumption, so the insider will consume “too much.” As
long as the market anticipates this all these costs will be passed back to the
insider.
The manager consumes perquisites $F$. When he is the sole proprietor
he chooses to consume $F^*$ and the firm is worth $V^*$. Every dollar of perqs
he forgoes increases the value of the firm by a dollar. He chooses the point
with the highest utility given his budget set. The manager sells a share
$(1 - \alpha)$ to an outsider. Now the manager pays only $\alpha$ dollars for every dollar in
benefits. If the outsider pays $V^*$, holding $F^*$ constant, the budget slope
changes to $-\alpha$. If the manager can change his consumption, he will increase
consumption which lowers the value of the firm. With rational expectations
the market will foresee this and will pay only $V'$. At this point the manager
is over-consuming perqs; by decreasing to $F'$ he increases his utility and the
value increases to $V'$. The entire decrease in value $V^* - V'$ is borne by
the insider. This is a gross cost, since it does not include the benefit from
increased consumption. The net cost is given by the change in the utility
levels.
Introducing monitoring allows an improvement. The insider receives all
the benefits from monitoring (i.e., he bears all the net costs). It does not
matter who actually makes the payment for these costs since they all fall
back to the insider in the end. The outcome is suboptimal or inefficient
only relative to a world with no agency costs. Given that these costs exist,
and since the insider bears these costs, the insider will minimize these costs.
The size of these agency costs will depend on managers’ tastes, degree of
managerial discretion, monitoring and bonding costs, difficulty in measuring
performance, and the costs of devising, implementing and enforcing incentive
contracts.
The scale of the firm can also be analyzed in this framework. When the
insider lacks sufficient resources and needs external financing, agency costs
reduce the value of the firm at a given level of fringe benefits consumption.
The insider will stop increasing the value of the firm when the gross increment
in value is offset by the incremental loss in the consumption of additional
fringe benefits. The end result is that the insiders are worse off than before.
The reason they cannot do as well as before is that they cannot credibly commit
to not consuming additional benefits.
Debt financing creates a risk-shifting incentive since equity holders enjoy
the benefits of positive outcomes without a matching liability for negative
outcomes, much like an option. This is an overinvestment problem — the
firm takes bad projects because the equity holders can expropriate wealth
from the bondholders. Again, monitoring and bonding are possible (partial)
solutions, but are likely to be difficult to implement.
In a multiperiod setting, “being good” will reduce agency costs due to
a reputation effect. Yet the problem will not be solved since each agent
has an end to their game and will always eventually face the temptation to
shirk. Inside debt may help reduce the problems as well since the manager
will not be tempted to expropriate wealth from his own bonds. In some
sense the manager’s salary may serve this purpose. He may take measures to
preserve his salary, including pursuing safe investments. Incentive compen-
sation, such as options, may be effective in this case. Convertible securities
may incent managers to avoid risk-shifting. Security analysts may also help
reduce agency costs. In situations where it is easy for the insider to consume
perqs, less outside equity should be used.
Jensen (1986)
Jensen (1986) discusses how free cashflows (FCF) can cause agency costs by
allowing managers discretion to make bad investments. Reducing FCF can
minimize a manager’s ability to waste resources and it also subjects the firm
to more frequent monitoring since it has to access the capital markets more
often.
The agency costs of debt have been cited as a reason to use less debt.
Jensen points out that debt can also help reduce agency costs by reducing
FCF. Debt can be viewed as a substitute for dividends in this sense. Ad-
ditional debt will also serve to increase efficiency as bankruptcy becomes
more likely. There is evidence supporting these claims. Leverage-increasing
transactions are associated with increases in equity value. LBO targets tend
to have high FCF and low growth opportunities. Also, strip, or mezzanine,
financing limits the conflicts of interest between classes of security holders.
The FCF hypothesis also applies to takeovers. Firms with high FCF and
unused borrowing power are likely to undertake bad mergers. Takeovers, es-
pecially hostile ones, can generate the crisis needed to make changes. Within
declining industries, mergers are likely to be value-enhancing since they re-
move resources from a relatively unproductive sector. Acquirers tend to
have performed well, generating excess cash to pursue the acquisition. Tar-
gets tend to either have poor managers and poor performance or good per-
formance and significant FCF. Cash or debt financed takeovers generally
provide larger benefits than transactions financed with stock.
Fama (1980)
Fama (1980) explains how the separation of ownership and control in a large
corporation is an efficient organizational form. The basic idea is that manage-
ment is a special type of labor which coordinates inputs and makes decisions.
Management rents its human capital to the firm. Risk bearers provide capi-
tal ex ante in exchange for uncertain future payments. The capital markets
and managerial labor markets provide discipline to the manager. Monitoring
occurs within and among management, up and down the chain of command.
The board monitors top management; it can include top management but
should also include outsiders.
A distinction between ownership of the firm and ownership of capital
is made. Since the firm is a collection of contracts, no one really owns it.
Rather, security holders own claims on the cashflows. With this view, control
rights over a firm’s decisions do not necessarily lie with the security holders.
In order to hold the manager accountable there must be some mechanism
for ex post settling up. The general necessary conditions are uncertainty
about managerial talents or tastes, labor markets that efficiently use past
information in determining wages, and a wage revision process that is strong
enough to resolve incentive problems. When the manager is the sole pro-
prietor he can not avoid ex post settling up with himself. The optimal pay
incentives are effort based. This does not expose the manager to risks beyond
his control, for which he would demand compensation. The problem is
that effort is difficult to measure. When performance measures are noisy,
less weight should be put on recent results.
Lehn & Poulsen (1989)

Lehn and Poulsen (1989) analyze FCF in privatizing transactions to identify
the sources of value. Unlike other corporate control transactions, synergies
are not a potential source of value. The four sources under consideration are
tax effects, wealth redistribution, asymmetric information, and agency costs.
The results are largely consistent with the FCF hypothesis. These transac-
tions are more likely in firms with high CF/EQ or low sales growth. Premi-
ums paid are also positively related to CF/EQ. The results are strongest in
the hostile takeover wave in the mid-80’s and among firms with low manage-
ment ownership.
The analysis consists of two parts. First, firms that went private are con-
trasted to a control sample that did not to understand the factors important
in the decision. This is done by comparing means of the groups and also in a
logit regression. The variables of interest are CF/EQ, Tax/EQ, sales growth,
and footsteps, a dummy for competing bids or rumors. The results indicate
that privatized firms are larger, have more cash, slightly lower recent growth,
and are more likely to have other bids. These effects tend to become stronger
in the second half of the sample. The second part of the paper attempts to
explain the cross-sectional variation in premiums in these transactions by
regressing the premium on CF/EQ, Tax/EQ, and sales growth. The results
are generally supportive of the FCF hypothesis, especially in the second half
of the sample and among low management ownership firms.
5.4 Capital Structure

The capital structure decision balances the costs and benefits of the vari-
ous financing choices. These can be categorized as taxes, bankruptcy costs,
and agency costs. Tax considerations include both the advantages of debt
at the corporate level as well as the disadvantage at the individual level.

Bankruptcy costs associated with debt are subdivided into direct and indi-
rect components. Agency costs arise from the conflicts of interest between
different investor classes and also with management. Although debt creates
several agency costs, it actually reduces agency costs under the FCF hypoth-
esis. Agency costs can lead to underinvestment or overinvestment.
Miller (1977)
This paper is a study of the way taxes affect capital market equilibrium.
Pre-“Debt and Taxes” the view was that optimal capital structure involved
balancing the corporate tax advantage of debt against the costs of financial
distress (loss of tax shields, overinvestment, underinvestment, monitoring
costs, etc.). Miller’s “horse and rabbit stew” refers to the corporate tax
advantages of debt dominating the costs associated with bankruptcy. Miller
adds personal tax considerations of investors to the mix. Taxes are important
in the capital structure decision because they affect aggregate supply and
demand for corporate securities.
Using a bond market equilibrium analysis, Miller argues that the higher
costs of borrowing negate the entire benefit of tax shields so the capital
structure choice is irrelevant for individual firms, although there will be an
optimal amount of aggregate debt. With progressive corporate taxes and/or
if the differential information-related costs of debt versus equity are convex
in the amount of debt, then capital structure may in fact matter.
The classic M&M Proposition I is modified to include personal taxes
\[
V_L = V_U + \left[ 1 - \frac{(1-\tau_C)(1-\tau_S)}{(1-\tau_B)} \right] B.
\]
Proposition I (with taxes) says that firms can increase value by issuing debt.
But if this is the case then the market is not in equilibrium. Assuming for
simplicity that there are no capital gains taxes, all bonds are riskless, and
there are no transactions costs, Miller’s equilibrium is given by the curves S
and D in Figure 5.1. The flat part of the demand curve represents the demand
for taxable bonds by tax-exempt investors. To get taxable investors to hold
bonds, the rate must be high enough to offset the taxes. The equilibrium
is where τC = τB . In the more general case, with capital gains taxes, the
equilibrium condition is
(1 − τC )(1 − τS ) = (1 − τB ).
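[Figure 5.1: Bond Market Equilibrium. Vertical axis: rate R; horizontal axis: quantity of bonds Q. Curves shown: demand D = r0/(1 − τB); supply S = r0/(1 − τC), S1 = r0/(1 − τC′), and S2 = r0/(1 − τC′) − d. The equilibrium is marked at (Q∗, r∗), with r0 on the rate axis.]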
The area between the supply and demand curves below the equilibrium is the
“bondholder surplus.” This arises because rates are driven up to the point
where the marginal investor’s tax rate is equal to the corporate rate, but all
investors can get the same rate in the market.
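As a rough numerical check on the modified Proposition I, the sketch below evaluates the gain from leverage VL − VU for given tax rates. The function name and the example rates are illustrative assumptions, not figures from Miller (1977).

```python
def gain_from_leverage(tau_c, tau_s, tau_b, debt):
    """Gain from leverage V_L - V_U under Proposition I with personal taxes.

    tau_c: corporate tax rate, tau_s: personal tax rate on equity income,
    tau_b: personal tax rate on bond income, debt: market value of debt B.
    """
    if tau_b >= 1.0:
        raise ValueError("tau_b must be strictly below 1")
    return (1.0 - (1.0 - tau_c) * (1.0 - tau_s) / (1.0 - tau_b)) * debt


# Hypothetical rates, chosen only for illustration.
gain_no_personal_tax = gain_from_leverage(0.35, 0.0, 0.0, 100.0)    # reduces to tau_c * B
gain_at_equilibrium = gain_from_leverage(0.35, 0.0, 0.35, 100.0)    # tau_c = tau_b, no capital gains tax
```

With no personal taxes the expression collapses to τC B, the standard M&M result. When (1 − τC)(1 − τS) = (1 − τB) the bracket is zero, which is the sense in which leverage is irrelevant for the individual firm in Miller’s equilibrium.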
A crucial assumption in Miller is the inability to perform tax arbitrage:
selling assets taxed at a high rate to buy those taxed at a lower rate. Cliente-
les may arise because of differences in tax treatment of various organizational
forms and differences in transaction costs [Shin and Stulz (1996)].
DeAngelo & Masulis (1980)
DeAngelo and Masulis (1980) generalize Miller (1977a) to include more realistic taxes, bankruptcy,
and agency costs. In this “modified balancing theory” the full burden of
bankruptcy or lending costs is not necessarily borne by the debtors. Some of
these costs are shifted to bond buyers in the form of lower risk-adjusted inter-
est rates. Miller’s irrelevance result is shown to be extremely fragile. When
either non-debt tax shields or bankruptcy/agency costs create an increasing
marginal cost, individual firms do have an optimal capital structure.
The single period model allows different tax rates for each investor, so
long as the ordinary income tax rate is higher than the capital gains rate.
(A multiperiod model with tax carryforwards, etc. would be qualitatively similar.)
All firms face the same marginal tax rates. The setup results in three tax
brackets: those who prefer debt, those who prefer equity, and those who are
indifferent.
For the marginal investor µ
\[
\frac{(1-\tau_B^\mu)\,\pi(s)}{P_B(s)} = \frac{(1-\tau_B^\mu)}{\bar{P}_B} = \frac{(1-\tau_E^\mu)}{\bar{P}_E} = \frac{(1-\tau_E^\mu)\,\pi(s)}{P_E(s)} \quad \forall\, s.
\]
Non-debt tax shields such as depreciation are given by ∆, Γ is the dollar
amount of tax credits, and θ represents the maximum fraction of the tax
liability that can be shielded by tax credits. There are four outcomes that
result from different states.
Debt    Equity                                   State
X(s)    0                                        [0, s1]
B       X(s) − B                                 [s1, s2]
B       X(s) − B − τC [X(s) − B − ∆](1 − θ)      [s2, s3]
B       X(s) − B − τC [X(s) − B − ∆] + Γ         [s3, s̄]
For all states up to s3 the firm loses some of its tax shields, even though it
may not be in bankruptcy. The value of the firm is given by $\int_S [B(s) + E(s)]\,ds$.
In Miller’s world, ∆ = Γ = 0 so s1 = s2 = s3 . Taking the partial wrt B,
P̄B = P̄C (1 − τC ). The interpretation of the flat section of the supply curve is
that all tax shields are fully utilized in all states of nature. The curve begins
to slope to compensate the firm for some of these tax shields going unutilized
in some states.
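The four-row payoff table can be sketched as a single piecewise function of pre-tax earnings X(s). The state boundaries used here are inferred from the table rather than stated in these notes: s1 is taken to be where X = B, s2 where X = B + ∆, and s3 where the cap θ on tax credits stops binding. The function name and the numbers are hypothetical.

```python
def payoffs(x, b, delta, gamma, theta, tau_c):
    """Return (debt, equity) payoffs for pre-tax earnings x.

    b: promised debt payment B, delta: non-debt tax shields,
    gamma: dollar tax credits, theta: maximum fraction of the tax
    liability that credits can shield, tau_c: corporate tax rate.
    """
    if x < b:
        # Default: bondholders receive all earnings, equity gets nothing.
        return x, 0.0
    taxable = x - b - delta
    if taxable <= 0:
        # Interest and non-debt shields exceed earnings, so no tax is due.
        return b, x - b
    tax = tau_c * taxable
    # Assumed rule: credits used are the smaller of gamma and theta * tax.
    credit = min(gamma, theta * tax)
    return b, x - b - tax + credit


# Hypothetical parameters: B = 50, delta = 20, gamma = 3, theta = 0.5, tau_c = 0.4.
example = [payoffs(x, 50.0, 20.0, 3.0, 0.5, 0.4) for x in (30.0, 60.0, 80.0, 120.0)]
```

Each of the four earnings levels falls in a different row of the table. In the third region the cap binds, so equity equals X − B − τC[X − B − ∆](1 − θ); in the fourth the full credit Γ is used.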
In the new equilibrium, the net tax advantages of debt are equated with
the expected default costs
(1 − τB ) − (1 − τC )(1 − τS ) = E[default and agency costs].
Firms with low earnings may lose some of the value of their tax shields.
The incremental value of interest tax shields decreases as firms increase lever-
age, implying a negative slope for the supply curve of taxable corporate
bonds. This is depicted as S1 in Figure 5.1.
Adding leverage-related deadweight costs d will cause the tax advantage
of corporate borrowing to become more significant. At the margin, the dead-
weight cost per dollar of borrowing, d∗ is the same for all firms. The new sup-
ply curve S2 has a more negative slope because of the deadweight costs. This
reduces the level of aggregate borrowing and the equilibrium risk-adjusted
rate of return. Leverage-related deadweight costs increase the marginal tax
advantage of borrowing because they decrease the supply of bonds, eliminat-
ing some of the “bondholder surplus.”
The existence of an optimal capital structure in this setting is essentially

an empirical issue. Do deadweight costs and underutilization of tax shields
have significant impacts on the rate of return to bondholders? There is evi-
dence that deadweight costs and possible underutilization of tax shields are
sufficiently significant to affect bond pricing. Evidence implies that leverage-
related costs reduce the supply of corporate bonds and lower the cost of
borrowing, generating a positive net tax advantage of corporate debt.
The theory also implies that firms that reach d∗ faster than others will
have less leverage. In other words, firms that are more likely to encounter
financial distress at a given debt ratio are less likely to borrow. Supportive
evidence shows that there is a significant negative relation between observed
leverage measures and historical failure rates. The probability of financial
distress is also positively related to the variability of operating earnings. In
sum, the evidence is consistent with the generalized balancing theory.
Myers (1977)
In Myers (1977) the firm is viewed as a collection of assets in place and growth
opportunities. Risky debt reduces the value of the real options, an agency
cost. This cost arises either from a suboptimal underinvestment strategy or
from the costs of avoiding underinvestment. This underinvestment results
even when managers are acting in shareholders’ best interest. The level of
borrowing is inversely related to the relative size of the growth opportunities
and is determined by the tradeoff between these costs and the tax benefits of
debt. The shareholders absorb the costs of avoiding underinvestment, which
include:
• Rewrite/renegotiate debt contract
• Shorten maturity prior to “exercise date”
• Mediation
• Dividend restrictions
• Reputation effects
• Monitoring
The basic analysis considers the value of a firm facing an investment

opportunity requiring an investment I and paying V (s). A firm with risky
debt P will take the project if V(s) ≥ I + P.
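A small sketch of the underinvestment region may help. It compares the investment rule of an all-equity firm, V(s) ≥ I, with that of a firm owing P, V(s) ≥ I + P. The state values and the function name are hypothetical, and the lost value is summed state by state without probability weights.

```python
def invests(v, outlay, promised_debt=0.0):
    """Shareholders exercise the growth option only if it covers the outlay plus the debt payment."""
    return v >= outlay + promised_debt


# Hypothetical project values V(s) in four states.
states = [80.0, 110.0, 130.0, 160.0]
I, P = 100.0, 40.0

all_equity = [v for v in states if invests(v, I)]      # positive-NPV states
levered = [v for v in states if invests(v, I, P)]      # states where equity still invests
forgone = [v for v in all_equity if v not in levered]  # positive-NPV projects passed up
lost_value = sum(v - I for v in forgone)
```

In the states with V(s) between I and I + P the project has positive NPV but is rejected, because the payoff would go to the bondholders first. This is the agency cost of risky debt described above.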
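The analysis can be extended to a multiperiod setting
\[
V_t = V_{E,t} + V_{D,t}.
\]
The firm will invest as long as the incremental benefit
\[
\frac{dV_E}{dI} = \frac{dV}{dI} - \frac{dV_D}{dI} > 1.
\]
If the value of debt depends on the volatility of the firm value, so that
V_D = f(V, σ^2), then the transfer of value from equity to debt is
\[
\frac{dV}{dI} - \frac{dV_E}{dI} = \frac{dV}{dI}\,\frac{\partial f}{\partial V} + \frac{\partial f}{\partial \sigma^2}\,\frac{\partial \sigma^2}{\partial I} > 0.
\]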
In conclusion, Myers’ work indicates that assets in place can support more
debt than growth opportunities can, capital intensive businesses with high
operating leverage can support more debt, and more profitable firms should
have more debt. This logic is similar to Shleifer and Vishny (1992) who say
more liquid assets can support more debt.
Masulis (1980)
Masulis (1980) examines the valuation effects of capital structure changes
on security value. The sample of intrafirm exchange offers and recapitaliza-
tions abstracts from asset changes that accompany many other changes in
capital structure. The types of transactions considered include issuing debt
for equity (E → D), preferred for equity (E → P ), and debt for preferred
(P → D).
There are three primary sources of valuation effects. Tax-related stories
predict changes in equity value to be positively related to increases in debt.
Bankruptcy and reorganization expenses should cause a negative relation
between equity value and leverage increases. Wealth redistribution from
agency costs is a zero sum game, so gains to one group of security holders are
at the expense of another group. Two other theories that are not considered
are signaling and the offer premium hypothesis.
The methodology employed uses comparison period returns. This ap-
proach essentially calculates the abnormal return for a security as the de-
viation from the mean return over a comparison period. These abnormal
returns are averaged across all securities to get a portfolio abnormal return.
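The comparison period approach can be sketched in a few lines. The function names and the return series are hypothetical; the only logic taken from the description above is that the benchmark is each security’s own mean return over the comparison period, and that abnormal returns are then averaged across securities.

```python
from statistics import mean


def abnormal_returns(event_returns, comparison_returns):
    """Event-window returns minus the security's mean comparison-period return."""
    benchmark = mean(comparison_returns)
    return [r - benchmark for r in event_returns]


def portfolio_abnormal_return(securities):
    """Average abnormal return across securities for each event day.

    securities: list of (event_returns, comparison_returns) pairs, where all
    event windows have the same length.
    """
    per_security = [abnormal_returns(ev, comp) for ev, comp in securities]
    return [mean(day) for day in zip(*per_security)]


# Hypothetical two-security sample with a two-day event window.
sample = [
    ([0.03, 0.01], [0.01, 0.00, 0.02]),
    ([0.05, -0.01], [0.02, 0.02, 0.02]),
]
portfolio_ar = portfolio_abnormal_return(sample)
```

No market model is estimated, so the approach does not require a beta for each security; the cost is that market-wide movements during the event window are not removed.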
The results are largely consistent with the tax and wealth redistribution
effects, but provide little evidence about the bankruptcy costs. Leverage in-
creasing transactions tend to increase equity value, while leverage decreasing
transactions tend to decrease shareholder value.
Table 5.1: Predictions in Masulis
Source        E→D    E→P    P→D
Tax            +      0      +
Bankruptcy     –      0      –
WR: E          +      +     –/0
WR: P          –      –      +
WR: D          –     –/0     –
Table 5.2: Predictions in Titman & Wessels
Attribute              Pred.   Significant Results
Collateral Value         +
Non-debt tax shield      –
Growth                   –
Uniqueness               –     Yes
Industrial               +
Size                     +     Small = ST
Volatility               –
Profitability            –     MV measure
Titman & Wessels (1988)

Titman and Wessels (1988) expand the range of capital structure theories
tested and attempt to overcome some econometric problems. The paper uses
a factor analytic technique, similar to some APT tests, to relate unobservable
attributes to capital structure measures using observable data. The process
involves estimating a measurement model and a structural model simultane-
ously. Although theoretically appealing, implementation requires imposing
a number of restrictions on the loading matrix in the measurement model.
The authors use six D/E ratios as dependent variables obtained from
all combinations of long-term, short-term, and convertible debt to book and
market equity, {LT, ST, Conv}/{BE, M E}. The explanatory attributes are
summarized in Table 5.2.
The results indicate that uniqueness is important. The authors believe
this supports the costs of financial distress, but the proxies may also be re-
lated to non-debt tax shields and collateral value. The size effect for small
firms is taken as evidence that transaction costs may be important. The
analysis is unable to explain the cross-sectional variation in convertible debt.
The lack of evidence in many cases may be due to problems with the mea-
surement model.
Rajan & Zingales (1995)

The purpose of the Rajan and Zingales (1995) paper is to see if factors
determined to be important in determining capital structure in the U.S. are
also important in other countries. This research is valuable because many
of the theories that explain capital structure were developed in response to
empirical observations. The paper studies the G-7 countries: U.S., U.K.,
Canada, Japan, Germany, France, and Italy.
There are a few limitations to the analysis. First, there is a bias towards
large, listed companies. Second, there are variations in industry concentra-
tions across countries. Third, there are differences in financial statements and
reporting across countries. Finally, bank- versus market-oriented economies
may produce systematic differences.
The primary analysis in the paper relates four factors to leverage, which is
measured in both book and market terms. The factors are tangibility, M/B,
size, and profitability. A long list of control variables is also included.
The authors also look at the distribution of wealth transfers out of the firm
and find that these payments are generally made through the most tax-
advantaged route.
The general results indicate that U.K. and German firms tend to have
lower leverage than firms in the U.S. The factors generally have the hypothe-
sized relation with leverage. Tangibility is positively related, M/B negatively
related. Size is positively related except for in Germany where it is nega-
tively related. Profitability is negatively related, except in Germany and
France [this is opposite the predictions of Ross (1977a) and Myers (1977)].
Graham (1996)
Graham (1996) is the first paper to take a careful look at the role of marginal
taxes in the capital structure decision. Economic theory indicates marginal
rates are what matter, but previous studies have used statutory rates as a
matter of convenience.
The approach for estimating marginal rates is to calculate the present
value of current and future taxes on a $1 increase in income based on simu-
lations. The main analysis regresses (D1 − D0 )/D0 on the marginal tax rate,
relative cost of debt, probability of bankruptcy, non-debt tax shields, and a
list of control variables.
The results find that the marginal tax rate is important in explaining
capital structure. The difference between statutory and marginal tax rates
is also important, providing evidence that firms still use the statutory rate in the capital
structure decision. Firms with volatile tax rates tend to use more debt as
expected with a progressive tax schedule. The relative cost of debt has the
wrong sign, but there may be a multicollinearity problem.
5.5 Dividends
Developing a model of dividend policy consistent with firms maximizing prof-
its and individuals maximizing utility has been a challenge. MM moved the
thinking away from the view that more dividends were better. Dividend
irrelevance in perfect markets is based on the idea of replicating any desired
payoff by buying/selling shares. Transactions costs remove the ability of
individuals to make homemade dividends. There may be clienteles that prefer
dividends. There are also behavioral arguments, market timing stories, and
institutional constraints (“prudent man” rules).
Stylized Facts:
• Corporations pay out a significant portion of earnings as dividends.
• Dividends have been the predominant form of payout.
• Individuals in high tax brackets receive substantial dividends.
• Corporations smooth dividends.
• Market reactions are positively correlated with dividend changes.
Black (1976) presents arguments for and against dividends as the “divi-
dend puzzle.” A firm may choose to pay dividends to provide a return ex-
pected by investors, even though this may be irrational. With transactions
costs, dividends may be a better way to distribute wealth to shareholders
than selling a few shares. The dividends may be used to signal information,
such as higher expected future earnings. Finally, dividends could be used to
expropriate wealth from bondholders. Reasons not to pay dividends include
tax avoidance, investment in growth opportunities, and the pecking order

argument.
5.5.1 Factors Influencing Dividend Policy

Dividends and Taxes
Since capital gains taxes are typically lower than the tax on dividends, and
capital gains can be deferred, there is a general tax disadvantage to dividends.
This disadvantage may vary over investor types (low tax-rate individuals,
corporations, tax-exempt institutions). The price drop on the ex-date has been
well-documented to be less than the dividend amount. The average premium
increases with the dividend yield, consistent with the tax clientele hypothe-
sis. There is also evidence of abnormal volume around the ex-date, indicating
there is not a (perfect) tax clientele.
Signaling with Dividends

Signaling implications that have been tested empirically include (i) dividend
changes should be followed by subsequent earnings changes in the same direc-
tion, (ii) unanticipated changes in dividends should be followed by revisions
in the market’s expectation of future earnings, (iii) unanticipated dividend
changes should be accompanied by stock price changes in the same direction.
There is only weak evidence that dividend changes convey information
about future earnings. There is evidence that earnings forecast revisions are
positively related to both dividend changes and the market reaction (causal-
ity), consistent with the signaling hypothesis. There is fairly strong evidence
of a positive relation between market reaction and dividend changes.
Agency Costs and Dividends

Expropriation of bondholders may come in the form of dividend payments.
Under this hypothesis, equity increases in value with a payout, while debt
loses value. Under the alternative that dividends signal good news, both
debt and equity should increase in value. There is evidence that bond prices
drop significantly with dividend decreases, but do not change significantly
at an increase. This is consistent with the information content explanation.
Dividends also reduce the free-cash flow problem of Jensen (1986). In sum,
there is weak empirical support for the informational content of dividends,
and practically no support for dividends as a solution to agency problems.
5.5.2 Key Dividends Papers

Miller & Rock (1985)
Miller and Rock (1985) develop a model where firms use dividends to signal
their quality in a setting where there is an information asymmetry about
current earnings. The model has two periods. At time zero the firm invests
in a project whose profitability is unobservable by investors. The project
produces earnings at time one, which the firm uses to finance the dividend
and new investment. The project produces additional earnings at time two,
which are correlated with time one earnings. Good firms pay a level of
dividends sufficiently high to make it unattractive for bad firms to copy them.
Costs arise from the distortion in the investment decision. Dividends provide
information about earnings through the sources and uses of funds identity.
This model does not say why firms use dividends rather than repurchases.
• Outsiders can not observe the cash flows.
• All firms have identical investments with diminishing marginal rates of
return.
• External financing is done only with riskless debt.
• All dividends and capital gains are taxed at τ .
• Dividends and repurchases are perfect substitutes.
• Firms signal by distributing cash and altering their investments.
• Good firms are able to distribute more cash and still match investments
of bad firms.
• Bad firms can not afford to mimic the good firms because they would
have to forgo projects with relatively high marginal returns.
• The equilibrium has deadweight costs relative to the perfect informa-
tion case.
In this two period model a firm has a concave investment technology F (I)
and makes investments It at t = {0, 1} that generate random earnings X̃t+1 =
F (It ) + ε̃t+1 . The errors are unconditionally mean zero, but E[ε̃2 |ε1 ] = γε1 .
The sources and uses of funds identity requires
I1 + D1 = X̃1 + B1 ,
where B1 is additional financing and D1 the dividend.

At time 1 the value of the shares is
V1 = D1 − B1 + [F (I1 ) + γε1 ]/(1 + r). (5.1)
The firm maximizes value by choosing I1 , D1 and B1 subject to the sources/uses
constraint. Substituting for the net dividend, the FOC is F′(I1∗) = 1 + r.
The earnings announcement effect is
\[
V_1 - E_0[V_1] = \varepsilon_1 \left[ 1 + \frac{\gamma}{1+r} \right] = \left( X_1 - E_0[\tilde{X}_1] \right) \left[ 1 + \frac{\gamma}{1+r} \right]. \tag{5.2}
\]
The difference between actual and expected dividends is
(D1 − B1 ) − E0 [D1 − B1 ] = X1 − E0 [X̃1 ] = ε1
and the dividend announcement effect is the same as (5.2).
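To make the size of the announcement effect concrete, the sketch below evaluates (5.2) for a given surprise and splits it into its two pieces. The function name and the parameter values are hypothetical.

```python
def announcement_effect(surprise, gamma, r):
    """Price reaction to an earnings (or net dividend) surprise, as in (5.2).

    surprise: epsilon_1 = X_1 - E_0[X_1], gamma: persistence of earnings
    shocks, r: discount rate.
    """
    return surprise * (1.0 + gamma / (1.0 + r))


# Hypothetical values: a surprise of 2.0 with gamma = 0.5 and r = 0.25.
surprise, gamma, r = 2.0, 0.5, 0.25
total = announcement_effect(surprise, gamma, r)
direct = surprise                          # dollar-for-dollar component
persistence = surprise * gamma / (1.0 + r)  # discounted effect on next-period earnings
```

When γ = 0 the shock is purely transitory and the price moves one for one with the surprise; a larger γ raises the reaction through the discounted persistence term.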

The dividend announcement reveals information about current earnings,
which in turn are useful for predicting future earnings. There are two com-
ponents to the announcement effect. The first is a dollar for dollar reaction
to the dividend surprise. The second is the discounted future change aris-
ing from the persistence parameter. In this model, earnings announcements
shortly after the net dividend announcement should not contain any new
information. In practice such earnings announcements do appear to be in-
formative. This is because they contain information on outside financing,
which is not part of the gross dividend. Financing announcement effects are
similar to dividend announcements, but with the sign reversed.
With intermediate trading, optimal policies are inconsistent because a
firm could pay a higher dividend by forgoing investments and raise the stock
price. A solution to this problem is underinvestment. The informational
asymmetry is that at time 1 the market knows the initial investment and the
first dividend, while the directors also know the cashflow and investment.
That is, Ωm = {I0 , D1 } and Ωd = {I0 , D1 , I1 , X1 , B1 , ε1 }. As a result, the
directors and market have different valuations of the firm. The directors
value the firm according to (5.1). The market can only use its information
in the valuation
V1 = D1 − B1 + E1m [F (I1 ) + γ ε̃1 |Ωm ]/(1 + r).
The managers choose the net dividend and investment to maximize the
weighted average of the two valuations subject to the sources/uses constraint.
The weights are the fraction owned by selling stockholders and the fraction
retained. The public can use its information about the net dividend to infer
the earnings for which the dividend is optimal. Although there are an in-
finite number of informationally consistent valuation schedules, one Pareto
dominates the others. A firm with the lowest earnings will choose the same
net dividend and investment level as in the full information case, giving a
boundary condition. The solution to an ODE satisfying the maximization
problem has all net dividends at least as large as the optimal level.
Higher dividends serve as a signal of higher current earnings. The better
firms are able to pay out a higher dividend and forgo productive investments.
Since the investment technology is concave, forgoing projects has a higher
marginal cost for the lower quality firms. This separating equilibrium restores
consistency, but at the expense of underinvesting.
There is some empirical evidence supporting the validity of dividends as
signals. Examples include Vermaelen (1981) and Prabhala (1993), but ? do
not find supportive evidence. Since the Miller and Rock (1985) model is in
response to the observation that unexpected dividend changes are positively
related to stock price changes, one would expect to find some supportive
evidence.
Prabhala (1993)
Prabhala (1993) presents a framework where dividends serve as a signal of
the quality of investment opportunities. This comes in response to earlier
literature where Tobin’s q and dividend yield are claimed to explain stock
price reactions to dividend announcements arising from agency costs of free
cashflows and the existence of dividend clienteles. This same evidence is
consistent with a signaling model which subsumes the importance of the
other effects.
The motivation for the signaling interpretation is that the other interpre-
tations are inconsistent with rational expectations. Since q, dividend yield,
firm value, and stock price are useful in predicting dividends, they should be
used in making optimal forecasts. The alternative interpretations depend on
dividend changes being unanticipated.
This model can be viewed as an extension of Miller and Rock (1985),
where the information asymmetry now relates to the quality of growth op-
portunities θ. A larger net dividend gets a higher market price at t = 1,
but reduces investment and the cashflow at t = 2 which is distributed to the
remaining (1 − k) shareholders. Since signal costs decrease with firm type,
the better firms can afford to signal more than the lower quality firms.
The dividend yield effect has been interpreted as evidence supporting
the existence of clienteles. Evidence that high-yield firms experience larger
announcement effects is consistent with this argument. Prabhala reinterprets
this evidence in a signaling framework where dividends are more informative
about growth opportunities for high-yield firms. Also, these firms are less
likely to have strong growth prospects so dividend increases are less likely.
Prior studies show dividend increase announcement effects for low q firms
are larger than for the high q firms, consistent with the free cashflow hypoth-
esis. This is because a reduction in FCF is more valuable for firms which
tend to squander cash. The signaling interpretation reflects the market’s ex-
pectations: high q firms are more likely to have better growth prospects and
are more likely to increase dividends so dividend increases result in smaller
announcement effects.
The empirical methodology estimates a dividend forecast, then examines
whether the deviation from the forecast explains price changes. The explana-
tory variables used to forecast the dividend are long-term dividend yield, q,
firm value, stock price, the difference in long- and short-term yields, and stock
volatility. Announcement effects are then regressed on dividend surprises.
Results indicate a positive relation between dividend surprise and announce-
ment effect, and dividends are more informative signals for high-yield firms.
Tobin’s q has limited marginal benefit beyond the dividend surprise. There
is little evidence of the agency or clientele effects after controlling for the
signaling effect, although these former hypotheses are not explicitly rejected.
Vermaelen (1981)
Vermaelen (1981) examines the price behavior of securities when firms repur-
chase shares in a tender offer or on the open market. This allows testing the
importance of information/signaling, personal taxes, corporate taxes, and
bondholder expropriation.
Repurchases serve as a signal of firm value since managers’ ownership,
etc. creates an incentive to increase stock price by announcing a tender
offer. Repurchasing shares above their true value will dilute the value of
the managers’ holdings. But with positive information, the manager may be
willing to pursue a tender offer. The more valuable the information, the lower
the marginal cost of buying back large fractions, offering a higher premium,
and holding more shares in the firm. The price during the offer is given by
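P_A = \alpha P_T + (1 - \alpha)\bar{P}_E
where P_T is the tender price, \bar{P}_E is the expected price after the offer
expires, and α is the ratio of the fraction purchased to the fraction tendered.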
Vermaelen finds that repurchase announcements are followed by a perma-
nent increase in stock price. Signaling seems to be the predominant influence.
There is no evidence of wealth expropriation from bondholders or tendering
shareholders. Those that do not tender are worse off than those that do, but
they are better off than before. The results are also inconclusive with respect
to the leverage and personal tax hypotheses.
Open market transactions are associated with a negative CAR prior to
the announcement, followed by an abnormal return of roughly 2% around
the announcement. Tender offers exhibit a flat CAR prior to announcement,
but an abnormal return on the order of 15% around the announcement.
Following the announcement, the tender offers show a decline in the CAR, which
is consistent with the expiration of some of the offers. Looking specifically
at the expiration of offers, there is a negative abnormal return.
The abnormal return to shareholders, INFO, is regressed on a number
of signaling variables to test this hypothesis. INFO is defined as
\frac{I}{N_0 P_0} = (1 - F_P)\frac{P_E' - P_0}{P_0} + F_P\frac{P_T - P_0}{P_0},
the weighted average of the return to tendered and non-tendered shares.
The results are consistent with the signaling hypothesis. The size of the offer
premium, target fraction, managerial ownership, and subscription level are
all positively related to the value of information.
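The weighted average above can be illustrated with a short calculation. This is only a sketch: it assumes F_P is the fraction of shares purchased in the offer, P_T the tender price, P_0 the pre-announcement price, and the remaining price the post-expiration price. The function name and the numbers are hypothetical and are not taken from Vermaelen (1981).

```python
def info_return(p0, p_tender, p_expire, frac_purchased):
    """Weighted-average return to non-tendered and tendered shares.

    p0             : pre-announcement price (assumed meaning of P_0)
    p_tender       : tender price (assumed meaning of P_T)
    p_expire       : post-expiration price (assumed meaning of the P_E term)
    frac_purchased : fraction of shares purchased (assumed meaning of F_P)
    """
    if not 0.0 <= frac_purchased <= 1.0:
        raise ValueError("frac_purchased must lie in [0, 1]")
    ret_not_purchased = (p_expire - p0) / p0   # return on shares still held
    ret_purchased = (p_tender - p0) / p0       # return on shares bought back
    return (1.0 - frac_purchased) * ret_not_purchased + frac_purchased * ret_purchased


# Hypothetical inputs chosen only for illustration.
info = info_return(p0=20.0, p_tender=24.0, p_expire=22.0, frac_purchased=0.25)
print(round(info, 4))
```

With these inputs the return on repurchased shares is 20%, the return on remaining shares is 10%, and the weighted average is 12.5%.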
DeAngelo, DeAngelo & Skinner (1996)

DeAngelo, DeAngelo, and Skinner (1996) provide another test of dividends as signals. They identify stocks with a
history of growth followed by a decline in earnings and examine the dividend
policy before and after the decline. They find that dividends are not reliable
signals of future earnings. The results could be due to overoptimistic
managers who “oversignal,” to the relatively small cash commitment of a
dividend, which may undermine its credibility as a signal, or to signaling
based on imperfect information.
At the year 0 dividend decision, 68% of the firms increase dividends while
only 1% decrease dividends. There is no evidence of positive earnings sur-
prises among these firms over the next three years, and some evidence of
negative surprises. Dividend increases are associated with small abnormal
returns at the announcement, but over the course of the year the firms have
large negative abnormal returns. The dividend-increasing firms have a less
negative abnormal return than the decreasers, which suggests that managers
may be able to prop up the stock price with a dividend increase.
Eades, Hess & Kim (1994)
Eades, Hess, and Kim (1994) examine the time series of ex-dividend day
pricing and identify variation due to tax effects, strategic short-term trading
(dividend capturing), and business cycle effects. They find the variability in
pricing is positively correlated with dividend yield and dividend pricing is
countercyclical. Dividend capturing reduces ex-date returns and depends on
transactions costs, interest rates, and dividend yield.
The methodology forms ex-date portfolios on each calendar date. Stan-
dardized excess portfolio returns (SER) are the ex-date portfolio return (in-
cluding the dividend) less the average non-ex-date portfolio return, divided
by the estimated portfolio standard deviation. The portfolios are further sub-
divided into high-yield and low-yield portfolios. The SER of the low-yield
portfolio is always positive, has relatively low variation, and zero to negative
autocorrelation. The SER for the high-yield portfolio changes from positive
to negative, is more volatile, and exhibits high positive autocorrelation.
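The SER definition can be sketched in a few lines. This is a minimal illustration, assuming the portfolio standard deviation is estimated as the sample standard deviation of the non-ex-date returns; the source does not specify the estimator, and the function name and data are hypothetical.

```python
from statistics import mean, stdev


def standardized_excess_return(ex_date_return, non_ex_date_returns):
    """(ex-date return - average non-ex-date return) / estimated std. deviation.

    The standard deviation is estimated from the non-ex-date returns,
    which is an assumption made for this sketch.
    """
    if len(non_ex_date_returns) < 2:
        raise ValueError("need at least two non-ex-date returns")
    benchmark = mean(non_ex_date_returns)
    sigma = stdev(non_ex_date_returns)
    if sigma == 0.0:
        raise ValueError("non-ex-date returns have zero variance")
    return (ex_date_return - benchmark) / sigma


# Hypothetical daily portfolio returns (ex-date return includes the dividend).
non_ex = [0.00, 0.01, 0.02]
ser = standardized_excess_return(0.03, non_ex)
print(round(ser, 4))
```

A positive SER means the ex-date portfolio earned more than it typically does on other days, measured in units of its own volatility.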
The tax effect hypothesis is tested by including dummy variables for dif-
ferent tax regimes in an ARIMA model. There is little evidence of a tax
effect. The test of the dividend capturing hypothesis includes a dummy for
the introduction of negotiated commissions. This lowers transactions costs
and makes it easier for corporations to perform tax arbitrage. The dummy is
significantly negative, especially for the high-yield firms. This is consistent
with the dividend capturing hypothesis. The dividend capturing hypothesis
also predicts dividend capturing is negatively related to T-bill yields and
positively related to dividend yields. The evidence also supports these
predictions. Analysis of the business cycle effects indicates that low-yield firms
are valued countercyclically (procyclical ex-date returns). The high-yield
firms do not exhibit this pattern because the dividend capturing effects work
in an offsetting direction.
5.6 Corporate Control

Manne (1965)
The Manne (1965) paper is the first to introduce the idea of a market for
corporate control. For the market for corporate control to be effective there
must be a high positive correlation between managerial efficiency and share
price. Takeovers lead to competitive efficiency among managers and are
more efficient than bankruptcy. They allow increased mobility of capital
which provides more efficient allocation of resources. Corporate control may
be transferred through a proxy contest, direct share purchases, or mergers.
Proxy contests are the most expensive, most uncertain, and least used.
This method tends to be used when the dispute is over compensation rather than
managers’ policies. Proxy contests are more likely with dispersed share ownership.
The share price generally rises on the announcement.
Direct share purchases may be open market purchases, direct purchases
of blocks from large owners, or tender offers. With lower ownership concen-
tration other shareholders are more likely to participate in the premium and
outsiders are willing to pay less for control.
Mergers typically offer cost advantages over the other methods. In a
merger the managers’ interests are generally in line with the owners’. The
main exception is that managers do not have an incentive to buy managerial
services as cheaply as possible. When incumbent managers recommend a
merger there are likely to be side payments. Within an industry mergers
may be an alternative to bankruptcy. These mergers typically reduce the
information gap between the target and bidder.
Shleifer & Vishny (1986)
Shleifer and Vishny (1986) examine the role of large shareholders as monitors
and the ways in which they bring about improvements in corporate policy.
The basic idea is that someone needs to monitor the managers, but it is too
expensive for small owners to do so. Large shareholders are better able to
bear the monitoring costs and will do so when it is in their best interest.
In the model the large shareholder L has a probability I of getting a value
improvement Z above q from a probability distribution F(Z) for a cost C(I).
The large shareholder begins with a fraction α of the shares, so he needs an
additional 0.5 − α to attain control. If he invests C(I) he will bid q + π where
0.5 Z - (0.5 - \alpha)\pi - c_T \ge 0    (5.3)
and c_T represents the costs of making the bid. The small shareholders will
tender if
\pi - E[Z \mid Z \ge (1 - 2\alpha)\pi + 2 c_T] \ge 0.
Let π*(α) and I*(α) be the optimal amounts, and Z^c(α) be the cutoff value
at which L is indifferent about taking over.
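Rearranging (5.3) shows that L bids only when Z ≥ (1 − 2α)π + 2c_T, which is the event the small shareholders condition on. The sketch below checks both conditions for a discrete distribution of Z. The distribution, the function names, and the parameter values are hypothetical and serve only to illustrate the mechanics.

```python
def takeover_cutoff(alpha, premium, bid_cost):
    """Smallest Z satisfying (5.3): Z >= (1 - 2*alpha)*premium + 2*bid_cost."""
    return (1.0 - 2.0 * alpha) * premium + 2.0 * bid_cost


def small_shareholders_tender(alpha, premium, bid_cost, z_values, z_probs):
    """Check premium - E[Z | Z >= cutoff] >= 0 for a discrete distribution of Z."""
    cutoff = takeover_cutoff(alpha, premium, bid_cost)
    kept = [(z, p) for z, p in zip(z_values, z_probs) if z >= cutoff]
    mass = sum(p for _, p in kept)
    if mass == 0.0:
        return False  # L never bids at this premium, so there is nothing to accept
    conditional_mean = sum(z * p for z, p in kept) / mass
    return premium - conditional_mean >= 0.0


# Hypothetical three-point distribution for the improvement Z.
z_values = [3.0, 9.0, 15.0]
z_probs = [0.5, 0.3, 0.2]

# The same premium fails with a small stake but succeeds with a larger one.
print(small_shareholders_tender(0.2, 12.0, 1.0, z_values, z_probs))
print(small_shareholders_tender(0.4, 12.0, 1.0, z_values, z_probs))
```

In this example a larger α lowers the cutoff, so a bid signals a smaller expected improvement and the same premium becomes acceptable to small shareholders.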
There are a number of important results. First, the premium decreases
in L’s stake, dπ*/dα ≤ 0. Second, a larger initial stake permits takeovers for
smaller improvements, dZ^c/dα < 0. Third, with a larger stake L invests more
in monitoring, dI*/dα > 0. Next, the expected increase in firm profits rises
with α, given L has an improvement. Therefore, an increase in α decreases
the premium but increases the market value of the firm. Increasing cT will
increase the takeover premium but decrease the market value of the firm.
There is not an equilibrium where L attains more than the amount necessary
for control, say 50%. This is because the small shareholders will infer that L
is trying to profit at their expense.
“Jawboning” is an alternative to a takeover. Essentially, L uses his size as a
threat of takeover. The managers may then be willing to negotiate and make
some of the changes L seeks. This method can be incorporated into the above
analysis by including the condition that the left-hand side of (5.3) be greater
than αβZ, where β is the proportion of the potential value gain attainable
through negotiation. Jawboning will typically be used for making less valuable
improvements since the costs are typically lower. As before, the value of the
firm increases with α, but now the option to jawbone can actually make the
large shareholder worse off. This is because the required bid on takeovers rises.
Small shareholders can be worse off as well since takeovers are typically more
valuable to them than private negotiation.
Assembling a large block is a complicated problem. If L can accumulate
a position anonymously he can deprive small shareholders of their gains
from his larger holding. If L trades publicly, small shareholders will bid the
price up to reflect the potential value gain. This makes it expensive for L
to get his position. He will want to increase his position again to offset
these additional costs. But the small shareholders will see this and hold out
from selling the first time. Similarly, L will never fragment his stake because
doing so reduces the value of his remaining shares since there will be less
monitoring. Assembling a block is a one-way proposition. It is expensive to
do, so once done it should not be undone. Large blocks should be sold intact
to preserve the value of monitoring.
Dividends may provide the compensation to L necessary to get him to
assemble a block. Large shareholders are typically corporations who enjoy
tax benefits on dividend income. Dividends are a sort of bribe from the small
shareholders to the large to get them to serve as monitors.
Stulz (1988)
Stulz (1988) shows that management’s voting power is important in determining
capital structure. With V denoting firm value and α the manager’s ownership
share, ∂V/∂α > 0 for small α and ∂V/∂α < 0 for large α.
The intuition is that the premium offered in a takeover increases with α,
but the probability of an offer falls. When α is too high, it is beneficial
to make a takeover less costly to management with a golden parachute, for
example. There is no benefit in this model to the manager holding the
controlling interest. He will be able to block any takeover in this case. This
implies α* ∈ [0, 1/2). The conflict of interest in the model arises from the
fact that successful tender offers affect the wealth of outside shareholders and
managers differently.
These results are demonstrated in a single period model where the man-
ager owns α of an all equity firm. At the beginning of the period there is
homogeneous information and a bidder decides if he wants to get information
on the target. He pays I for information delivered at the end of the period.
The bidder will bid for half the shares at a price equal to the no-bid value plus a
premium on all the shares, y/2 + P. All the benefits of the value increase go
to the target. The probability of a successful offer depends on the likelihood
shareholders’ tax rates are low enough to accept the bid and the fraction
of outsiders needed to make the offer a success. The bidder chooses P to
maximize the difference between the gain and the premium times the
probability of making a successful bid. With α > 0, the bidder has to persuade
a fraction
z(\alpha) = \frac{1}{2(1 - \alpha)} > \frac{1}{2}
of the outsiders to tender. Increasing α decreases the
probability of a successful bid so the bidder’s expected value falls as well.
For the bidder the optimal premium increases with α.
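The fraction z(α) follows from simple arithmetic: the bidder needs half of all shares, outsiders hold 1 − α of them, so the required share of outsiders is 0.5/(1 − α). A minimal sketch, assuming the manager does not tender; the function name is hypothetical.

```python
def required_outsider_fraction(alpha):
    """z(alpha) = 1 / (2 * (1 - alpha)): share of outsiders who must tender
    for the bidder to obtain half of all shares when the manager holds alpha
    and does not tender (assumption of this sketch)."""
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 0.5)")
    return 1.0 / (2.0 * (1.0 - alpha))


# The required fraction rises from one half toward one as alpha approaches 0.5.
for alpha in (0.0, 0.1, 0.25, 0.4):
    print(alpha, round(required_outsider_fraction(alpha), 4))
```

As α approaches 1/2 the bidder needs nearly every outside share, which is why the probability of a successful bid falls with α.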
Allowing the manager to tender preserves the general results. More risk-averse
managers will hold fewer shares since they are risky. With DARA preferences,
α will increase with the manager’s wealth. Managers with greater
benefits from control will hold more shares to protect their interests. Managers
will also hold more shares when the sensitivity of offer success to changes in
ownership is large.
[Figure 5.2: Managerial Ownership and Firm Value. Firm value V is plotted against managerial ownership α, with curves labeled MSV, JM, and Stulz.]
Due to risk aversion and budget constraints, managers typically hold only
a small portion of the shares. Alternatives that increase their voting
power somewhat can increase firm value. Changing the debt ratio or repurchasing
shares will increase α. Convertible debt and delayed conversion can also help,
since conversion will decrease α. By changing the requirements for control,
a super-majority rule or differential voting rights effectively increase α. The
manager may also have voting power over shares he does not own. In ESOPs
and pensions the manager is often the trustee. A standstill agreement gives
the manager voting power over a large shareholder’s position but may also
effectively eliminate a bidder.
Morck, Shleifer & Vishny (1988)

Morck, Shleifer, and Vishny (1988) offer an empirical test of the effect of man-
agerial ownership on firm value. High managerial ownership may be good
because it aligns the incentives of the managers with the shareholders. How-
ever, too much managerial ownership may be bad because the manager may
become entrenched. The analysis studies the relation between Tobin’s q and
managerial ownership after controlling for intangible assets, tax shields, size,
and industry. More specific tests distinguish between insider and outsider
ownership and connections to founding families.

In the study management is defined as the board of directors. Tobin’s
q is regressed on measures of growth opportunities (R&D/A, Adv./A), tax
shields/capital structure (D/A), size (A), industry dummies, and dummies
for board ownership. The ownership dummies indicate ownership up to 5%,
from 5% to 25%, and above 25%. The results indicate that there is a positive
relation between ownership and firm value at very low and very high levels of
ownership, and a negative relation in between. The explanation is that the
incentive alignment effect is always present, but the entrenchment effect is not
important until ownership is sufficiently high. Also, managers become fully
entrenched at some point, while the incentive effect continues to increase.
The results are robust to different ownership breakpoints and measures of
firm value. Analysis of the board composition indicates that outsiders are
slightly better monitors but still become entrenched. Close connection to the
founding family increases value in new firms but decreases value in old firms.
The main results are depicted graphically in Figure 5.2 (not drawn to scale).
These results are consistent with a combination of the predictions in Jensen
and Meckling (1976) and Stulz (1988).
These results may be partly due to the fact that managers in high q firms
are more likely to have more stock. This is likely to be especially important
in the low ownership range and can induce a spurious correlation between
ownership and q.
Cotter, Shivdasani & Zenner (1996)

Cotter, Shivdasani, and Zenner (1996) examine the effect that outside directors
have on the wealth of target shareholders. This is a situation where the board
should be particularly important. Insiders on the board may have incentives
that are different from those of outsiders.
The results indicate that target shareholder gains are 20% larger when the
board is independent. The value comes at the expense of the bidder share-
holders. Outsiders are associated with higher initial bids and greater offer
revisions. With an independent board, defense mechanisms such as poison
pills enhance shareholder returns rather than entrench managers. Target
gains are negatively related to interlocking boards and positively related to
ownership of insiders.
The study examines the impact of board composition on the initial pre-
mium, premium revisions, and target shareholder gains. Board members are
classified as independent, insiders, or gray. Control variables include size,
poison pills, golden parachutes, managerial ownership, block ownership, and
performance.
5.7 Mergers and Acquisitions

There are many potential benefits to mergers and acquisitions. These takeovers
can remove inefficient management, achieve economies of scale, or generate
synergies. Offsetting these benefits are costs such as wealth redistributions
and reduced efficiency, which may arise due to the numerous conflicts of interest
and informational asymmetries.
Stylized Facts
• target SH earn large positive AR and negative AR on failure
• bidding SH earn zero to negative AR
• multiple bidder contests magnify AR
• bidder AR were lower in 1980’s than before
• joint MV increases on average
• success is highly uncertain and positively related to bid premium and
toehold
• defensive measures reduce probability of success
• target reaction to defensive measures and greenmail is negative
• large target mgt. share increases bid premium
• prob. of hostile takeover lower with high target D/E
• bid revisions are large jumps
• puzzlingly low toeholds
• mixed evidence about means of payment
5.7.1 Tender Offers

Tender offers can be either conditional or unconditional on attaining a critical
level of participation. A target shareholder is more likely to tender if he
thinks the post-takeover value is low and if he thinks he is pivotal. A
shareholder has an incentive not to tender if he thinks the post-takeover value
is high: he can let others tender so the takeover succeeds, giving him much of
the benefit.
Complete Information
With complete information about the future value, no shareholders will ten-
der for less than the future value; all of the potential benefits go to the target
shareholders and none to the bidder. The bidder may be able to make a profit
by diluting the value of minority shares after the takeover. The threat of this
dilution may induce the target SH to tender at a price less than the full
future value. If the bidder has a toehold he will also be able to profit even
without dilution. In practice the gains on the toehold are not likely to be im-
portant since toeholds are typically small. A bidder may be able to threaten
the target SH in other ways to get them to tender as well. One example
is to threaten to enter the target’s market and compete with them, thereby
reducing the value of the target.
Incomplete Information
A bidder may have a better idea about the future value than the target.
Under rational expectations, targets know that bidders will try to use their
superior information to under-bid. In equilibrium, the free-rider problem
remains and target shareholders will still refuse to tender. There are two
types of equilibrium: one where offers are uninformative, the other where the
offer provides information.
This problem was originally studied in Grossman & Hart (). The two-
tiered tender offer and dilution of holdout shares are potential solutions to
the problem. The difference is the type of signaling possible in a two-tiered
bid, which allows separation of the signals for undervaluation and private
synergies. This type of offer can eliminate the incentive to free-ride without
voluntary dilution. Another approach is to solve the free-rider problem by
allowing individuals to realize the effect their actions have on the outcome.
This is a rational, but not a competitive, outcome.
In Shleifer and Vishny (1986) there are incentives in the form of divi-
dends for large shareholders to monitor the managers. This increases the
value of the shares for all shareholders, including the small shareholders.
The intention of acquiring a large block also raises the share price, making
it more costly to acquire the block. The dividend incentive argument, which
presumes the large shareholders are corporations, is not well-supported em-
pirically (cite ??).
A low bid may signal that the expected improvement is small. Since
bidders with high potential improvements have a stronger incentive to bid
high, a low bid is a credible signal. The probability of an offer’s success
increases with the bid premium and size of the toehold and decreases with
the number of additional shares needed for control.
Defensive Actions
Defensive actions may reduce shareholder value if they block a potentially good
takeover, but they may also improve value by increasing the incentive to bid
high and encouraging other bids. Some actions, such as the poison pill, reduce
the incentive to bid high. In general, strategies that impose greater costs on
the bidder when the offer succeeds than when it fails reduce the incentive to
bid high. In summary, some defensive measures are in the shareholders’ best
interest, while others are used to create private benefits for the manager.
More subtly, a defensive measure may change the informational asymmetry.
Decreasing the importance of publicly known improvements decreases
the probability of success. Another defensive measure is to signal to the
target shareholders that their shares are undervalued, in which case they are
less willing to tender. Target management may do this by increasing leverage
and/or repurchasing shares. There may also be reputational effects to
consider.
Pivotal Shareholders
A pivotal shareholder is more likely to tender than a non-pivotal one. A
large blockholder is much more likely to be pivotal than a small investor.
The ability of a bidder to revise a bid becomes very important with pivotal
shareholders.
Means of Payment
The means of payment has important consequences for the information re-
vealed by the bidder. Offering equity may indicate that the bidder’s shares
are overvalued [Myers and Majluf (1984)]. An offer of cash may signal high
value. Cash offers create an adverse selection problem for the bidder. Offer-
ing equity can reduce the risk of overpayment by making the terms of the
offer contingent on the target’s value. The target shares in gains or losses so
it will tend to reject the transactions that are likely to be undesirable. Fi-
nally, there are tax advantages to using at least 50% equity financing. With
no private information on the part of the target, the target can increase the
auction price with equity. If the target has private information about the
synergy, the bidder could benefit by conditioning the target’s acceptance on
its value.
5.7.2 Competition Among Bidders
In the English auction model of bidding, bidders trade incremental bids until
the bidders with the lowest valuations drop out. The winning bid will be
just above the second highest valuation. A very important assumption is
that bids can be costlessly revised and resubmitted.
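The claim that the winning bid ends just above the second highest valuation can be checked with a small simulation. This is a sketch under simple assumptions: bids rise by a fixed increment, revisions are costless, and a bidder tops the standing bid whenever his valuation allows it. The function name and the valuations are hypothetical.

```python
def english_auction(valuations, increment=1.0, start=0.0):
    """Ascending auction with costless bid revisions.

    Returns (index of winner, winning bid). A bidder who is not the current
    leader raises the bid by `increment` if his valuation covers the new price.
    """
    if len(valuations) < 2:
        raise ValueError("need at least two bidders")
    price = start
    leader = None
    while True:
        challengers = [
            i for i in range(len(valuations))
            if i != leader and valuations[i] >= price + increment
        ]
        if not challengers:
            break  # nobody is willing to top the standing bid
        leader = challengers[0]
        price += increment
    return leader, price


# Hypothetical private valuations for three bidders.
winner, winning_bid = english_auction([50.0, 80.0, 65.0])
print(winner, winning_bid)
```

The bidder with the highest valuation wins, and the price stops within one increment of the second highest valuation, so the winner keeps most of the difference between the two valuations.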
If bids are costless to submit but there is an investigation cost, the bidder’s
strategy will change. Now he will want to submit a large initial bid to avoid a
costly bidding contest. The high initial pre-emptive bid does not deter other
bidders directly by requiring other bidders to improve on it, but rather it is a
signal that the initial bidder has a high valuation and reduces the probability
of additional bidders. All the bidders that decide to investigate will then
enter into the English auction. If bids are costly to submit, then the revised
bids will move in large steps.
Management may undertake activities to discriminate among bidders. In
general, exclusion of bidders is viewed as bad for target shareholders. There
are some reasons that these defensive measures may be good. For example,
target management may reject a bid if the target firm is worth more, or if
it is likely that other bids will come. Other measures may be repurchasing
shares to make a takeover more difficult, or removing the incentive for the
takeover by fixing existing problems. Removing bidders may increase ex ante
the frequency of bidding competition. It is optimal to pay greenmail only if
there is no white knight.
5.7.3 Managerial Power
Managers have potentially conflicting interests of maximizing shareholder
value and looking out for their own best interest. Increasing target debt
levels may be a way of reducing some of the agency costs that may be the
impetus for the takeover. High leverage may also allow the target to capture
a greater fraction of the bidder’s improvements. Shifts in debt levels can also
affect management’s voting power and gains from change in control. If the
supply of shares is upward sloping, bidders must offer larger premiums. This
causes the level of the bid to increase with the manager’s share ownership.
5.7.4 Key Papers

Roll (1986)
The hubris hypothesis of Roll (1986) suggests that takeovers occur because
managers overestimate their own abilities. Under this hypothesis the gains to
takeovers are small or non-existent. This explanation is not inconsistent with
strong-form efficiency, whereas other explanations require at least temporary
inefficiency. There is some evidence that is generally consistent with this
idea.
Since the current market price is a lower bound on bids, only bids with
relatively high valuations are observed. Thus takeover attempts are likely
to contain random overvaluation errors. Since these transactions are driven
by a relatively small number of people and depend heavily on individual
decisions, irrational behavior is more likely.
The theory predicts that the bidding firm will have a price decline on the
announcement followed by a further decline on winning or an increase on
losing the bid. The total gains to the takeover should be non-positive. Gains
to the target come at the expense of the bidder and transactions costs are a
deadweight loss. There should be more hubris among firms that have been
successful recently.
Lang, Stulz, & Walkling (1989)

Lang, Stulz, and Walkling (1989) examine the variation in tender offer abnor-
mal returns to understand the determinants of the bid premium. The results
indicate benefits are greatest for high q bidders and low q targets, consistent
with the Jensen (1986) FCF argument. The evidence that high q bidders
profit indicates that the hubris hypothesis is not a complete description of
the process.
In successful tender offers the typical bidder has a low q for several years
prior to the bid. The target’s q tends to have declined recently. The takeovers
creating the most value are high q firms acquiring low q firms. The most
value is lost when low q firms acquire high q firms. These results could also
be interpreted to mean that value is created when undervalued firms are
acquired and destroyed when overvalued firms are acquired.
The analysis regresses gains on dummy variables for the q of the bidder
and target. The q is measured as either the average over three years prior
to the bid or from the most recent year. The regressions do not control for
the growth opportunities, form of payment, or number of bidders. This is a
problem because q can measure not only management ability, but also the
growth potential of the firm. Separate regressions are performed for bidder
gains, target gains, and total gains. Each is further subdivided by whether
there are opposing offers.
Berger & Ofek (1995)

Berger and Ofek (1995) examine the effect of diversification on firm value.
Diversification programs were popular in the 1950s and ’60s, but more re-
cently firms have moved in the opposite direction. The authors compare the
sum of imputed stand-alone values for a firm’s segments to the market value
of the firm. The general conclusion is that diversification tends to reduce firm
value by roughly 15%. Unrelated diversifications destroy the most value.
There are many potential benefits and costs to consider in analyzing the
value of diversification. There are gains in operating efficiency, increased
debt capacity, reduced taxes, and efficiencies with internal capital markets.
Potential agency costs such as FCF, cross-subsidization, and incentive conflicts
between and among the divisions weigh against the benefits.
Imputed valuations are based on the median ratio of capital to {EBIT, A, S}
among single-segment firms. For each multi-segment firm the value from di-
versification is the log of the ratio of actual value to imputed values.2
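The excess value measure can be sketched as follows. This is an illustration only: each segment's imputed value is taken to be its accounting item (for example sales) times the median capital-to-item ratio of single-segment firms in its industry. The function names and the numbers are hypothetical and are not taken from Berger and Ofek (1995).

```python
import math


def imputed_value(segment_items, industry_multiples):
    """Sum of segment accounting items times the median capital-to-item
    ratio of single-segment firms in each segment's industry."""
    if len(segment_items) != len(industry_multiples):
        raise ValueError("one multiple is needed per segment")
    return sum(item * mult for item, mult in zip(segment_items, industry_multiples))


def excess_value(actual_value, segment_items, industry_multiples):
    """Log of the ratio of actual firm value to the sum of imputed segment values."""
    return math.log(actual_value / imputed_value(segment_items, industry_multiples))


# Hypothetical two-segment firm: segment sales and industry median multiples.
sales = [100.0, 50.0]
multiples = [1.2, 0.8]
ev = excess_value(136.0, sales, multiples)
print(round(ev, 4))
```

Here the imputed value is 160 and the actual value is 136, so the excess value is log(0.85), roughly −0.16, which would be read as a diversification discount.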
The results indicate that multi-segment firms tend to have lower ratios.
Regressions of excess value on multi-segment indicators indicate that diver-
sification reduces value by 15%, even after controlling for size, profitability,
and growth. Acquisitions of related segments tend to be less harmful than
diversifying acquisitions. The results are robust to imputed measure and
persist over time.
In examining the sources of value gain or loss the authors consider
overinvestment, cross-subsidization, and tax effects. Overinvestment, as measured
by capital expenditures to assets in low-q segments, is negatively related to
excess value, especially for diversified firms. The cross-subsidization effect is
captured by a dummy variable for negative cashflows in a segment. Again,
the effect is more negative for multi-segment firms. The evidence suggests
that the tax benefits are economically insignificant.
Footnote 2: This is biased towards a diversification discount since logs are not symmetric about 1.
Also, the allocation of overhead and reliance on segment-level reporting may create biases
in the imputed valuations.
Mitchell & Lehn (1990)
Mitchell and Lehn (1990) ask “Do Bad Bidders Become Good Targets?”
The answer seems to be yes. The idea is to see whether takeovers discipline
managers of firms that have demonstrated poor acquisition programs. This
suggests that at least part of the gains to targets may be in reduced agency
costs. The authors find that there is little change in value for acquisitions
in general. But there is a significant decrease in value for bidders who are
subsequently acquired. For all firms, the average gain for an acquisition that
is later divested is smaller. This effect is especially true for firms that later
become targets themselves. Finally, the probability that a firm becomes a
target is inversely related to the announcement effects of its acquisition.
In defining bad bidders it is important to distinguish between overpayment,
which cannot be fixed, and poor ongoing performance, which presumably
can be fixed. Also note that most targets of hostile takeovers did not
previously make an acquisition, so this is only a partial explanation.
The main analysis is based on an event study methodology of abnormal
bidder returns around the bid announcement. Average abnormal returns
for different classifications of the bidders are compared. On average, bid-
ders earn a negligible return, but non-targets actually earn a positive return.
Subsequent targets, especially those in hostile takeovers, earn negative re-
turns. Since the divestiture rate is higher for subsequent targets than for
non-targets, it appears that the bad bidders are bad because they have poor
ongoing performance. Logit regressions give evidence that firms that make
bad acquisitions are more likely to get takeover offers than firms that make
good acquisitions.
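The event study approach can be sketched as follows, assuming a market model benchmark (the summary above does not state which benchmark the authors use). The function names and all return series are hypothetical.

```python
def market_model(stock, market):
    """OLS estimates (alpha, beta) of stock = alpha + beta * market + error."""
    n = len(stock)
    mean_s = sum(stock) / n
    mean_m = sum(market) / n
    cov = sum((m - mean_m) * (s - mean_s) for s, m in zip(stock, market))
    var = sum((m - mean_m) ** 2 for m in market)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return alpha, beta

def abnormal_returns(stock, market, alpha, beta):
    """Event-window returns minus the market-model prediction."""
    return [s - (alpha + beta * m) for s, m in zip(stock, market)]

# Hypothetical estimation window: the stock moves 1.5x the market plus 0.001.
est_market = [0.01, -0.02, 0.015, 0.0, -0.005]
est_stock = [0.001 + 1.5 * m for m in est_market]
alpha, beta = market_model(est_stock, est_market)

# Hypothetical two-day announcement window for a bidder.
evt_market = [0.004, -0.002]
evt_stock = [-0.020, -0.010]
ars = abnormal_returns(evt_stock, evt_market, alpha, beta)
car = sum(ars)  # cumulative abnormal return, about -0.035 here
```

Averaging such cumulative abnormal returns within each bidder classification (for example, subsequent targets versus non-targets) gives the comparison described above.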
Mitchell & Mulherin (1996)
The Mitchell and Mulherin (1996) paper addresses the impact of industry
shocks on the high level of restructuring in the 1980s. The hypothesis is
that tender offers, mergers, and LBOs are among the lowest cost means of
responding to industry change. The study is motivated by the high concen-
tration of restructuring within industries. If these activities are driven by
industry effects, the announcement of one firm in an industry should provide
information about the prospects of the other firms in that industry. In this
sense it is not surprising that we see poor performance following a takeover.
These activities are not the cause of a problem, but rather a response to a
problem.
The study is based on the roughly 1,000 Value Line firms in 1981. These
firms are tracked throughout the rest of the decade and marked as to the
type of takeover target they were (if at all). The analysis indicates that
over the full period there is significant clustering of takeover activity at the
industry level. Furthermore, within industries there is also clustering over
time. Across all industries takeovers are spread fairly evenly over time. This
provides evidence that takeovers are responses to industry-specific shocks.
Further analysis indicates that this industry clustering was less common in
the 1970s. Regressions of takeover activity on variables measuring sales and
employment shock and growth indicate that it is industry change, not growth,
that drives the takeovers. The findings in this paper indicate the problem of
asset liquidity in Shleifer and Vishny (1992) may be important.
5.8 Financial Distress

When a firm faces financial distress a number of problems arise. Depending
on how severe the distress, the firm may be tempted to underinvest or engage
in risk-shifting. A firm in financial distress can attempt to reschedule its debt,
raise cash via the issuance of new securities, or sell some of its assets.
It is important to distinguish between financial and economic distress.
Bankruptcy proceedings are intended to do so by directing assets to their
highest-valued use. In some cases this means liquidating the assets and dissolving
the firm, whereas in other cases it means reorganizing the firm and its financ-
ing to preserve going-concern value. There is debate over whether markets
or courts are better at resolving distress.
There is evidence that competing firms experience a stock price drop
on the bankruptcy announcement, indicating that the announcement sig-
nals poor industry conditions [see Mitchell and Mulherin (1996)]. However,
firms in concentrated industries with low leverage have price increases. Zen-
der (1991) discusses optimal security design that implements efficient invest-
ment. Bankruptcy is the mechanism that creates a state-contingent transfer
of control.
The direct costs of bankruptcy are relatively small. Indirect costs may
be more significant, but are hard to measure. Liquidation costs are distinct
from costs of financial distress and arise in the process of selling assets.
5.8.1 Factors Affecting Reorganizations

Free Rider Problem
A debt restructuring requires unanimous approval of all holders of a class of
security. To get around this requirement, a firm can use an exchange offer,
where security holders have the right to participate in the exchange. Since
the restructuring is designed to increase the health of the firm, the old debt
increases in value. Therefore, some of the bondholders may hold out of the
exchange and capture this increase in value. Since all the bondholders (in a
given class) have the same incentives, the exchange is likely to fail.
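A small numerical sketch of the holdout incentive follows. The payoff values, the function names, and the assumption that the offer is conditional (so nobody is exchanged if it fails) are all hypothetical choices made for illustration.

```python
def payoffs(offer_succeeds, v_old=60.0, v_new=75.0, v_holdout=95.0):
    """Payoff to one small bondholder from tendering vs. holding out.

    Hypothetical values per bond: v_old if no restructuring occurs, v_new for
    the claim received in the exchange, v_holdout for an untendered old bond
    once the restructured firm is healthier. The offer is assumed to be
    conditional, so nobody is exchanged if it fails.
    """
    if offer_succeeds:
        return {"tender": v_new, "hold_out": v_holdout}
    return {"tender": v_old, "hold_out": v_old}

def holding_out_weakly_dominates():
    """True if holding out is never worse, and sometimes better, than tendering."""
    outcomes = [payoffs(True), payoffs(False)]
    never_worse = all(p["hold_out"] >= p["tender"] for p in outcomes)
    sometimes_better = any(p["hold_out"] > p["tender"] for p in outcomes)
    return never_worse and sometimes_better

print(holding_out_weakly_dominates())  # True under these assumed values
```

Because each small bondholder takes the outcome of the offer as given, holding out is a weakly dominant choice here, which is why the exchange tends to fail when everyone reasons the same way.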
This problem can be solved with different indenture provisions ex ante, or
with coercive participation. Examples of indenture provisions include granting
a trustee the right to accept offers on behalf of the bondholders, requiring
only a majority for approval, or including a “continuous” call provision. Co-
ercive methods include ex post modification of the covenants directly.
Information Asymmetries
Insiders and outsiders may disagree about the value of the firm due to dif-
ferential information. Further, they may have incentives to intentionally
misrepresent the value of their claims. The state of financial distress may be
misrepresented as well (e.g., discount bonds). Insiders of a firm with poor
prospects may hide the truth, whereas insiders of a firm with better prospects
may claim they are in distress hoping for a favorable debt renegotiation. Intermediate
payments such as coupons and deviations from the absolute priority rule (APR) can
reduce these problems.
Agency Costs
The various investor groups and managers have different incentives in the
bankruptcy process, leading to conflicts of interest. Some of these groups
may join together to form a coalition to increase their bargaining power.
Managerial Behavior
Fama (1980) posits that a competitive market for managerial talent is an
important mechanism to control the behavior of corporate managers. Man-
agerial behavior is likely to be influenced by financial distress for several
reasons, including direct financial effects, potential loss of future income,
loss of firm-specific human capital, loss of power/prestige, and reputation effects.
It is difficult to observe managerial ability, so it is hard to tell if financial
distress is due to poor management, the wrong incentives, or an adverse
environment. There is evidence of increased board turnover prior to finan-
cial distress, just when the board is most needed to monitor the managers.
Managers of distressed firms are many times more likely to experience
turnover than managers at healthy firms. There is also evidence that the
role of the board changes after restructuring.
5.8.2 Private Resolution

Private resolution of financial distress involves activities outside formal bank-
ruptcy proceedings. Common techniques include exchange offers, tender of-
fers, covenant modification, maturity extension, or rate adjustment.
Evidence on Restructurings
Asset and financial characteristics jointly affect the choice of restructuring
mechanism. Private workouts are more common for firms with (i) more in-
tangible assets, (ii) fewer classes of debt, and (iii) greater reliance on bank
financing. There is evidence that the market is capable of predicting whether
a workout will be successful, and that workouts are a more efficient form of
reorganization than Chapter 11 (see footnote 3). Evidence from the Japanese markets indicates
that firms with close ties to a main bank are able to invest more
and increase sales more following the onset of financial distress. The close
relationship with the main bank internalizes some of the free rider and asymmetric
information problems.
Footnote 3: This may be misleading since the firms that choose Ch. 11 may have done so optimally
given the characteristics of their bankruptcy.
Asset Sales
A firm may sell some of its assets to relieve its financial distress. Asset sales
may be different for distressed firms than for healthy firms. As discussed in
Section 5.15, Shleifer and Vishny (1992) suggest that the secondary market
for interfirm asset sales may be subject to adverse liquidity problems. The
purchaser may be exposed to unique risks in the transaction with the distressed
firm, or they may also be distressed if there are industry problems.
These factors combine to reduce the attractiveness of the asset sale. Evidence
indicates that asset sales among distressed firms are more common when the
firm has several divisions.
New Capital
If the firm still has good projects it may wish to acquire additional capital.
If the firm is in distress, it may have difficulty raising capital, as in Myers
(1977). Underinvestment arises because much of the benefit from the new
capital goes to the old debtholders. To solve this problem, new securities
should be senior and/or asset-backed.
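The underinvestment argument can be illustrated with a stylized example. The sketch below assumes a single certain state, no discounting, and old equity that is already worthless (assets below the face value of old debt); the function name and all numbers are hypothetical.

```python
def split_of_project_value(assets, old_debt, cost, payoff, new_claim_senior):
    """Net payoff to new financiers and gain to old debtholders.

    Single certain state, no discounting (simplifying assumptions). If the new
    claim is senior it has face value equal to `cost`; otherwise the new money
    ranks behind the old debt and receives the whole residual, which assumes
    old equity is worthless before the project.
    """
    total = assets + payoff
    old_before = min(old_debt, assets)
    if new_claim_senior:
        new_gets = min(cost, total)
        old_after = min(old_debt, total - new_gets)
    else:
        old_after = min(old_debt, total)
        new_gets = total - old_after
    return new_gets - cost, old_after - old_before

# Assets of 80, old debt of 100, and a project costing 20 that pays 30 (NPV = 10).
junior = split_of_project_value(80.0, 100.0, 20.0, 30.0, new_claim_senior=False)
senior = split_of_project_value(80.0, 100.0, 20.0, 30.0, new_claim_senior=True)
print(junior)  # (-10.0, 20.0): new money loses, so the project is passed up
print(senior)  # (0.0, 10.0): new money breaks even, so the project can be funded
```

In both cases the two gains sum to the project's NPV of 10; seniority only changes who receives it.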
5.8.3 Formal Resolution

Since a firm can generally choose private or formal bankruptcy proceedings,
the cost of bankruptcy will be the lesser of the two. The ability to choose
avenues will cause many of the features in the formal proceedings to appear
in the private resolutions as well.
Liquidation (Ch. 7)
Reorganization (Ch. 11)
• Automatic stay
– Stops principal and interest payments to unsecured creditors
– Secured creditors lose rights to collateral but may receive “adequate
protection” payments
– Effectively extends maturity of debt
– Executory contracts can be assumed or rejected
– Reduces blocking power of debtholders and leads to renegotiation
• Debtor-in-possession
– Current management and directors typically retain control
– Management can file a reorganization plan within 120 days; extensions
are common
– Incremental senior borrowing is allowed, which can strip seniority/collateral
from existing debt
• Negotiation
– All classes of creditors and the court must approve the agreement
– “Cramdown” forces dissenting creditors to accept the plan
– Management delays are a bargaining tool transferring wealth from
debt to equity
– Debtors have favorable bargaining power
– Power given to management/equity is viewed as compensation for
not exercising the option to delay or shift risk
Chapter 11 has ambiguous effects on efficiency, but provides the greatest
economic benefit when underinvestment is a problem.
Prepackaged Bankruptcy
A firm is allowed to simultaneously file for bankruptcy and give its plan of
reorganization. This allows the firm to get the efficiency of the private re-
structuring, yet retain some of the benefits of the formal proceeding (e.g., the
cramdown and certain tax benefits). Prepacks may also reduce the holdout
problem inherent in workouts.
5.8.4 Key Papers

Ross (1977)
Ross (1977a) develops a theory incorporating managerial incentives into the
capital structure decision. Insiders have private information and are compensated
by a known incentive schedule. Because the manager incurs a penalty
if the firm goes into bankruptcy, debt is costly for managers, and more so for
managers at lower-quality firms. This makes the amount of debt a valid signal of firm quality.
This signal then influences the market’s perception of the firm’s risk, al-
though it does not affect the actual risk. The M&M irrelevancy result holds
within a risk class, but there is an optimal capital structure for each firm
type.
There are several empirical predictions from this model. Cross-sectionally,
the cost of capital will be unrelated to the financing decision, although the
debt level is uniquely determined. Bankruptcy risk should be an increasing
function of firm type and debt level. Finally, value should increase with
leverage in the cross-section.
James (1995)
The paper by James (1995) attempts to understand the conditions under
which a bank will take equity in a distressed firm. Bank debt is generally
thought to be easier to renegotiate than public debt since coordination is
easier. Banks have limited incentives to make unilateral concessions, since
this will create a wealth transfer to junior claimants. Banks are more likely
to take equity when bankruptcy costs are high, such as when a firm has
significant growth opportunities.
James examines roughly 100 bank debt restructurings in the 1980s. In
some cases the firms attempted restructuring of public debt as well. The
restructurings involved either forgiving financial obligations or modifying the
terms of the debt.
There are five general findings. First, whenever the bank takes equity the
public debtholders (if any) also take equity. Public bondholders are much
more likely to act unilaterally than are banks. Second, banks tend to make
larger concessions when there is no public debt. Third, banks also tend to
take relatively large equity positions and hold them for several years. Fourth,
banks are more likely to take equity when the firm has a small proportion
of public debt, more valuable growth opportunities, greater cashflow constraints,
and poor prior operating performance. Finally, the firms in which banks
take equity tend to perform better subsequently than the ones in which they
do not.
Hotchkiss (1995)
The basic goal of Hotchkiss (1995) is to see if Chapter 11 bankruptcy proceed-
ings are effective in reviving troubled companies. Results indicate that a large
number of firms are not viable after the reorganization and that existing
management’s role in the process is associated with continued poor performance.
The latter point may mean either that the process favors management or
that these distressed firms have difficulty in attracting new managers.
The paper includes an analysis of post-bankruptcy operating performance
in terms of accounting profitability, deviation from cashflow projections,
and subsequent distress. Many of the firms increase in size shortly after
bankruptcy. The firms begin with average profitability in their industry five
years prior to bankruptcy. Closer to the filing, performance deteriorates.
Following confirmation of the plan performance improves somewhat, but a
number of firms continue to have trouble. The cashflow forecast errors are
significantly negative each year, beyond any industry effect. This result may
be due to incentives to make high forecasts; the managers who remain in
control tend to make overly optimistic forecasts. Finally, roughly a third of
the firms file for a second restructuring within a few years.
Logit regressions provide additional evidence about firm characteristics.
Large firms are more likely to emerge as public companies and are less likely
to report negative operating income. There is strong evidence that retaining
the pre-bankruptcy CEO is positively related to poor post-bankruptcy per-
formance. Finally, there is some evidence that firms filing in New York are
more likely to remain in distress.
Weiss (1990)
Weiss (1990) examines the direct costs of bankruptcy and
violations of the absolute priority rule (APR). He finds direct costs average about
3% of firm value (20% of equity value) the year prior to bankruptcy. The
absolute priority rule is frequently violated, especially in New York. There is
no evidence that these cases are resolved more quickly. Larger transactions
are more likely to violate strict priority since there are more opportunities to
extract concessions.
One view is that the violation of APR is to compensate equityholders
for not exercising their option to delay the proceedings or pursue actions
detrimental to the senior debtholders. Evidence suggests that equity mar-
kets anticipate the deviation from APR, and the junior debt incorporates a
premium for APR violations.
Betker (1995)
In order to understand the effectiveness of prepackaged bankruptcies, Betker
(1995) documents the costs and sources of economic gain associated with
this method. The time spent in bankruptcy is much shorter, 2.5 months in
a prepack versus 25 months in Chapter 11. The total time including prelim-
inary negotiations is similar to Chapter 11 and is long relative to workouts.
The direct costs are estimated to be about 3%, very similar to the results in
Weiss (1990) for Chapter 11 proceedings. Indirect costs in a prepack may be
lower, but it is not clear by how much. It is possible that the indirect costs
would be similar to a workout. Prepacks appear to offer some tax advantages
over workouts in the treatment of net operating losses (NOLs), but not of cancellation of debt (COD) income.
5.9 Equity Issuance

The security issuance decision involves many of the same issues as capital
structure. In addition, the issuance process creates other considerations. Sea-
soned Equity Offerings (SEOs) are similar to Initial Public Offerings (IPOs)
in many respects. The primary difference is that there is an existing market
price on which valuations can be based. This section discusses the
common elements and the specifics of SEOs. The following section addresses
the issues particular to IPOs.
Smith (1986) provides a review of the theory and evidence on security
issuance. There are several theories that attempt to explain the empirical
evidence. There may be an optimal capital structure in which case optimizing
firms should have non-negative valuation effects for capital structure changes.
The issuance could serve as a signal of decreased cashflows as in Miller and
Rock (1985). The degree of predictability will also influence the size of the
announcement reaction. Since debt principal repayment is predictable, debt
reissuances should also be more predictable and have smaller announcement
effects. A similar argument can be made with high dividend yield firms
such as utilities. As in Myers (1984) and Myers and Majluf (1984), when
information asymmetries are large the price impacts should also be larger.
Finally, changes in ownership concentration such as equity carve-outs may
affect value. Table 5.3 summarizes these predictions and the related evidence.
Stylized Facts
• Retained earnings are most common source of financing
• Debt is used more than equity, with net retirement of equity in the 1980s
• Increased use of leverage over time
• Equity is issued relatively more frequently during expansions
• Private placements are becoming more important
• Gradual switch from rights to firm commitment
• Strong preference for firm commitment for non-equity issues
• IPOs use firm commitment (60%) or best efforts (40%)
• DRIPs and ESOPs have replaced rights offerings
• Underwritten offers are more expensive (directly), but more common
Table 5.3: Theories of Security Issuance Reactions

Theory                      Prediction                Evidence
Optimal Capital Structure   AR > 0                    Opler and Titman (1995)
Info. Asymmetry             AR < 0,                   Yes: Mikkelson and Partch (1986),
Myers and Majluf (1984)     more so for               Eckbo and Masulis (1992),
                            securities with high      Opler and Titman (1995).
                            info. asymmetry           No: Helwege and Liang (1996),
                                                      Opler and Titman (1995)
Signaling                   Issue signals             Yes: Mikkelson and Partch (1986), Prabhala
Miller and Rock (1985)      lower earnings            No: ?
Ownership Concentration     Use underwriting with     Eckbo and Masulis (1992)
Eckbo and Masulis (1992)    disperse ownership
Predictability              Smaller reaction to       Prabhala (1993)
Prabhala (1993)             predictable issues        Mikkelson and Partch (1986)
5.9.1 Flotation Methods

Since the use of an underwriter has higher direct costs than a rights of-
fer, there must be some indirect benefits provided by the underwriter. The
flotation choice can be viewed as an attempt to signal firm quality. The un-
derwriter also acts as a monitor or certifying agent. The best firms will use
standby rights offers, medium quality firms will use uninsured rights, and the
worst firms will use firm commitment underwriting. The flotation choice can
also be viewed as an optimal risk sharing contract in a principal-agent prob-
lem, where the issuer is the principal and the investment bank is the agent.
The issuer wants to incent the banker to expend effort which is difficult to
measure or observe.
Firm Commitment
In a firm commitment the investment bank assumes the risk of the offer. It
essentially buys the offer from the issuer and is responsible for selling it. The
process begins with an SEC filing. Next, a preliminary prospectus stating
a range of offer prices and the maximum number of shares is issued. After
SEC approval, the final offer price is set and a final prospectus is issued. The
underwriter’s guarantee begins once the final offer price is set. Competition
among underwriters has led to the “bought deal,” where an investment bank
will buy an entire issuance outright. Firm commitment becomes more at-
tractive with less asymmetric information, more risk-averse issuers, less risk-
averse underwriters, less price uncertainty, or when the investment bank’s
effort is more observable.
Best Efforts
In a best efforts offer the underwriter acts as a marketing agent on behalf
of the issuer. The issuer bears the risk of the offering. The filing process is
similar to a firm commitment offer, except there is a minimum sales level be-
low which the offer will be withdrawn. After SEC approval, the underwriter
attempts to sell the issue during a selling period. Average initial returns are
higher with best efforts offerings.
Rights Offers
Current shareholders are given short-term warrants in proportion to their
shareholdings. Shareholders can either exercise the warrants or sell them.
The subscription price is typically 15-20% below the current market price.
Sometimes rights offers use standby underwriting to guarantee the proceeds
of unsubscribed shares. Rights offers in the U.S. are typically fully sub-
scribed.
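The arithmetic of a discounted rights offer can be sketched as below. The example assumes full subscription and no announcement effect or new information, so firm value simply rises by the cash raised; the function names and the offer terms are hypothetical.

```python
def ex_rights_price(old_shares, market_price, new_shares, subscription_price):
    """Theoretical share price after the rights offer is fully subscribed."""
    total_value = old_shares * market_price + new_shares * subscription_price
    return total_value / (old_shares + new_shares)

def value_of_right(market_price, ex_price):
    """Value of the right attached to one old share."""
    return market_price - ex_price

# Hypothetical offer: 1 new share per 4 old shares, priced 20% below a $50 market price.
ex_price = ex_rights_price(4_000_000, 50.0, 1_000_000, 40.0)
right = value_of_right(50.0, ex_price)
print(ex_price, right)  # 48.0 2.0

# A holder of 4 old shares (worth 200) is equally well off either way.
wealth_if_exercised = 5 * ex_price - 40.0     # exercise: pay 40 for a fifth share
wealth_if_sold = 4 * ex_price + 4 * right     # sell the four rights instead
```

Under these assumptions the discount in the subscription price does not transfer wealth: a shareholder who exercises or sells the rights ends with the same wealth as before the offer.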
Indirect Issuances
Convertibles, warrants, options, DRIPs, and ESOPs are examples of indirect
methods of equity issuance. Stein (1992) develops a theory for convertible
issuance as “back door” equity financing. DRIPs and ESOPs have replaced
rights offerings to some extent.
Shelf Registration
The issuer can pre-register for the issuance of a security over a two-year
period. This can reduce the direct costs of issuance but it increases the
information asymmetry problem since it is easier for managers to time their
offers.
Negotiated Bid
A firm can select its investment bank through either a negotiated or competitive
bid process. Negotiated bids are more common, especially among larger
issues, even though they are more expensive. The main users of competitive
bid offers are utilities, which are required to use them. Possible explanations for the
preference for negotiated bids include side payments to managers, increased accounting-based compensation
to managers, lower variability in costs, reduced agency costs, and protection
of proprietary information.
5.9.2 Direct Flotation Costs

A summary of direct flotation costs is shown in Panel A of Table 5.4. Direct
flotation costs are generally higher for equity issues than for other securi-
ties. They also tend to be higher for industrial companies than for utilities.
Table 5.4: Some Issuance Costs

Panel A: Direct Flotation Costs
Method              Industrial   Utility
Rights              1.8%         0.5%
Standby Rights      4.0%         2.4%
Firm Commitment     6.1%         4.2%

Panel B: Seasoned Issue Valuation Effects
Security            Industrial   Utility
Equity              –3.14%       –0.75%
Conv. Preferred     –1.44%       –1.38%
Preferred           –0.19%       0.08%
Conv. Debt          –2.07%
Debt                –0.26%       –0.13%
Convertible debt offers have higher flotation costs than similar sized non-
convertible offers, consistent with the hypothesis that issue costs are related
to security volatility. Not surprisingly, underwriter compensation is higher in
negotiated contracts than in competitively bid contracts. Underwriter com-
pensation has decreased since the introduction of shelf-registration, although
this may be due to selection bias issues.
Several firm characteristics are correlated with direct flotation costs [see
Smith (1986) and Eckbo and Masulis (1992)]. The models use direct flotation
costs as a percentage of issue proceeds as the dependent variable. A positive
intercept indicates there are fixed costs to the issuance. Measures of size
indicate that the costs are a decreasing, convex function of size, indicating
there are economies of scale. High shareholder concentration also lowers
issuance costs (this may be due to an increased reliance on subscription
precommitments). The direct costs are positively related to stock volatility.
Dummy variables indicate that rights offers have the lowest direct flotation
costs, and firm commitment offers the highest. These results are robust to
the time period used and across industrial and utility firms.
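One simple way to see why a fixed cost produces a decreasing, convex percentage cost is the stylized cost function below. The linear-plus-fixed form and the parameter values are assumptions made for illustration, not the specification estimated in the cited studies.

```python
def percentage_cost(proceeds, fixed_cost, variable_rate):
    """Direct flotation cost as a fraction of gross proceeds."""
    return fixed_cost / proceeds + variable_rate

# Hypothetical parameters: $0.5 million fixed cost plus 3% of proceeds.
sizes = [10.0, 20.0, 40.0, 80.0]  # issue proceeds in $ millions
costs = [percentage_cost(s, 0.5, 0.03) for s in sizes]

# Slope between neighboring issue sizes: negative, and shrinking toward zero.
slopes = [(costs[i + 1] - costs[i]) / (sizes[i + 1] - sizes[i])
          for i in range(len(sizes) - 1)]
```

The percentage cost falls from 8% for the smallest issue toward the 3% variable rate as the fixed cost is spread over larger proceeds.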
Issuers often grant an overallotment option, allowing the underwriter to
purchase additional shares if the offer is oversubscribed. This increases the
underwriter’s incentive to sell the issue, reducing the risk of failure.
5.9.3 Indirect Flotation Costs

Given the lower direct costs of rights offers, but the preference for firm com-
mitment offers, indirect expenses may be important. Managers may receive
personal benefits from underwriters, or there may be pressure from invest-
ment bankers who sit on the board. Also, sales to the public are more likely
to create a more disperse ownership structure, either reducing the monitor-
ing of managers as in Shleifer and Vishny (1986) or increasing liquidity as in
Merton (1987). Expected rights offer failure costs are small.
Other indirect costs may include the capital gains taxes and transactions
costs to the shareholders associated with a rights offer. There may also be
anti-dilution clauses and wealth transfers to convertible security holders.
5.9.4 Valuation Effects

Leverage-increasing transactions produce positive ARs, while leverage-decreasing
transactions have a negative effect. There is an average negative
price impact of SEO announcements of about 3%. This contrasts with no significant
price impacts for the announcement of straight debt, equity sold
through rights offers, or private placements. Common stock offer cancella-
tions are also associated with positive reactions. Indirect equity issuances
(e.g., convertible debt) are also associated with negative announcement re-
actions. Evidence on shelf registration indicates a more negative reaction,
which is consistent with the increased adverse selection problem. These val-
uation effects are summarized in Panel B of Table 5.4.
There are several possible explanations for these valuation effects. If
there is an optimal capital structure, then a change to restore the optimal
level should be met with a positive reaction. The nonpositive announcement
effects do not support this hypothesis, although the announcement may also
convey information about the firm’s situation. This signaling effect, as in
Ross (1977a), implies that leverage-decreasing events signal negative revi-
sions in management’s expectations, and should be accompanied by a neg-
ative price reaction. Under the Miller and Rock (1985) model, any security
issuance signals lower than anticipated operating cash flow and is bad news.
There is some evidence indicating firms tend to issue debt following earnings
declines, whereas equity issuances tend to come before an abnormal earnings
decline.
The adverse selection problem in Myers and Majluf (1984) can also explain
some of the valuation effects. In an extension to this framework, managers
who choose the size of the offer choose larger offers when their stock is
more overvalued. Other adverse selection problems arise because the under-
writer is able to distribute shares in the “good” offers to preferred clients.
Also, as in Rock (1986), with differentially informed investors the under-
priced issues will be oversubscribed, while the overpriced ones will go to the
uninformed investors. There are some (partial) solutions to these adverse
selection problems. The firm can try to change managerial incentives, use
private placements, maintain financial slack, use certifying institutions, use
equity carveouts, or issue convertible securities.
Much of the empirical evidence is consistent with the adverse selection
hypotheses. The announcements have nonpositive effects regardless of se-
curity type but larger or riskier issues have more negative reactions. Firm
commitment offers have the most negative reactions, followed by standbys,
then uninsured rights. These results are consistent with the model of Eckbo
and Masulis (1992). Other supportive evidence includes the more negative
reaction for industrial issues than for utilities, and more negative reactions
to shelf registration announcements.
5.9.5 SEO Timing
There is some evidence supporting the hypothesis that equity offers will be
more frequent in an expansion. The argument is that there are more prof-
itable investment opportunities in these times, and firms are less likely to
forego investment projects because of underpricing. Additional evidence in-
dicates that the announcement effect is less negative during expansions for
equity issues, while announcements of debt issuances are not affected. The
Myers (1984) pecking order hypothesis suggests firms will issue equity in eco-
nomic downturns because they are less likely to have excess cash and their
leverage is likely to have increased as market values of equity have fallen.
The relative regularity of debt issuances raises the possibility that the announcement
effect is small because the market anticipates the issuance. Jung,
Kim, and Stulz (1996) and Opler and Titman (1995) provide evidence that
the debt-equity choice is predictable.
5.9.6 Key Papers

Eckbo & Masulis (1992)
Eckbo and Masulis (1992) model the choice between a rights issue and an
underwritten offer as an extension to Myers and Majluf (1984). In the model,
shareholder takeup k is an important determinant of the flotation method.
Firms using uninsured rights offers may use subscription precommitments
to credibly signal a high takeup. The precommitments in rights offers and
underwriter certification in firm commitment offers serve to reduce the wealth
transfer between current shareholders and outsiders.
Firms with more dispersed ownership will tend to choose underwriting.
Firms with less discretion over their issuance, such as a utility, will tend to
use a rights offer. The model predicts that the announcement effect will be
most negative for firm commitment offers, followed by standby rights and
uninsured rights. This analysis can be applied to other flotation methods as
well.
To analyze the determinants of direct costs, they estimate a regression
with measures of size, percentage change in shares, ownership concentration,
return standard deviation, and dummies for offer type. The results indicate
there are significant fixed costs and economies of scale. A positive coefficient
on the change in shares variable indicates there are adverse selection costs.
High ownership concentration lowers direct issuance costs, perhaps through
precommitments. More volatile returns are associated with higher costs since
there is increased underwriting risk. After controlling for issue characteris-
tics, rights offers are still less expensive than standby or firm commitment
offers. Having documented the rights issue paradox, the authors present a
model to explain it.
In the model, firms will issue if the value of the project, b, exceeds the sum of the direct
cost, f, and the dilution cost from issuing undervalued securities, c; that is, b − (f + c) ≥ 0. The dilution cost
c(k, m) depends on the level of existing shareholder participation k and the flotation method m. Managers
select the flotation method m to maximize firm value. The market gets
information about k through precommitments, trading volumes and actual
subscription levels.
With full participation the dilution cost is zero. When k < 1 some un-
dervalued firms will find it too costly to issue. In this sense k is similar to
the inverse of slack. High-k firms select uninsured rights and use takeup to
substitute for the underwriter guarantee. Firms with $k \in (k_f, k_s)$ will choose
standby rights. The lowest k firms will not bother paying the additional
rights distribution costs and will just use firm commitment offers. If a firm is
overvalued then high-k firms may choose either uninsured rights or they may
“hide” with a firm commitment offer. If they are detected they will sell at
a lower price or cancel and forgo the project. Since the market understands
these strategies, the high-k firms will face the lowest adverse selection costs
and the low-k firms the highest costs.
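The sorting of undervalued issuers described above can be sketched in a few lines of Python. This is only an illustration: the function names, the numerical cutoffs, and the treatment of a firm exactly at a cutoff are assumptions, not taken from the paper.

```python
def will_issue(b, f, c):
    """Issue only if project value b covers direct cost f plus dilution cost c."""
    return b - (f + c) >= 0


def flotation_method(k, k_f, k_s):
    """Flotation choice of an undervalued firm with expected takeup k.

    k_f < k_s are hypothetical cutoffs; a firm exactly at a cutoff is
    assigned arbitrarily (the notes do not say which way it goes).
    """
    if not 0.0 <= k_f < k_s <= 1.0:
        raise ValueError("need 0 <= k_f < k_s <= 1")
    if k >= k_s:
        return "uninsured rights"   # high takeup substitutes for the guarantee
    if k > k_f:
        return "standby rights"
    return "firm commitment"        # not worth the rights distribution costs
```

For example, with assumed cutoffs k_f = 0.3 and k_s = 0.7, a firm expecting 90% takeup chooses uninsured rights and a firm expecting 10% takeup chooses a firm commitment offer.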
The authors test their model using an event study methodology. Con-
sistent with prior literature, the negative market reaction is strongest for
firm commitment offers and weakest for uninsured rights. After adjusting
for flotation costs, either type of rights issue has a negligible effect. Reac-
tions are less negative for utilities, consistent with smaller adverse selection.
Firm commitment offers are generally associated with stock price runups,
whereas this effect is smaller for standby rights and negligible for
uninsured rights.
Mikkelson & Partch (1986)

In a study similar to Masulis (1980), Mikkelson and Partch (1986) reexam-
ine the effect of announcements of capital structure changes on stock price
to better understand the determinants. They find a significant negative an-
nouncement effect for stock and convertible debt, and a less pronounced effect
for debt. Completed offerings have positive returns between announcement
and issuance, and a negative return at the issuance, indicating that firms
time their security issuance. Similarly, firms that refinance have more nega-
tive reactions than those who raise funds for capital expenditures. In general,
the results are consistent with the predictions of Myers and Majluf (1984)
and the notion of a pecking order.
The paper uses an event study methodology to measure excess returns.
The estimation window for α and β is the 140 days beginning 21 days after
issuance or cancellation. Throughout their sample the number of announce-
ments varies considerably across time. External financing is not a common
event for many firms, consistent with the pecking order hypothesis. Equity
offers tend to finance new assets. Among public offerings for cash, stock has
the most negative abnormal return at about –4% and straight debt the least
negative reaction. These results are consistent with the predictions in both
Myers and Majluf (1984) and Miller and Rock (1985), although the latter
paper does not distinguish between security types.
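The market-model event study can be sketched as follows. This is a minimal illustration on made-up, noise-free data; the day indices, the three-day event window, and the function names are assumptions and not details from the paper.

```python
from statistics import mean


def fit_market_model(stock, market):
    """OLS estimates (alpha, beta) of stock = alpha + beta * market + e."""
    m_bar, s_bar = mean(market), mean(stock)
    sxx = sum((m - m_bar) ** 2 for m in market)
    sxy = sum((m - m_bar) * (s - s_bar) for m, s in zip(market, stock))
    beta = sxy / sxx
    return s_bar - beta * m_bar, beta


def abnormal_returns(stock, market, alpha, beta):
    """Realized return minus the market-model prediction, day by day."""
    return [s - (alpha + beta * m) for s, m in zip(stock, market)]


# Hypothetical data: 200 trading days, announcement on day 10.
market = [0.01 * ((7 * t) % 11 - 5) / 5 for t in range(200)]
stock = [0.0002 + 1.2 * m for m in market]   # noise-free, so the fit is exact
stock[10] -= 0.04                            # assumed -4% announcement effect

# Estimation window: 140 days starting 21 days after an assumed day-30 issuance.
est = slice(51, 191)
alpha, beta = fit_market_model(stock[est], market[est])

# Event window: the announcement day and the day on either side of it.
event = slice(9, 12)
ar = abnormal_returns(stock[event], market[event], alpha, beta)
car = sum(ar)   # cumulative abnormal return over the event window
```

Estimating α and β after the event keeps the pre-announcement runup from contaminating the benchmark.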
Table 5.5: Price Reactions by Issuance Type
Issuance            Pre-AD   AD   AD-ID   ID
All Equity            +      –
All Debt              –      0
Completed Equity                    +     –
Cancelled Equity                    –     +
Completed Debt                      0     0
The abnormal returns surrounding these events provide evidence that
managers time security issuance. Prior to the announcement, equity offers
tend to have runups, while debt offers tend to have declines. At the an-
nouncement, the equity offers have the most negative returns and the debt
offers the least negative. For completed offers, equity again has a runup
between the announcement and issuance, while debt is essentially flat. At
the issuance there is another negative effect for equity offers and a neutral
reaction to debt issuances. In general, convertible debt and preferred stock
fall in between the debt and equity effects, although small sample sizes make
interpretation more tenuous. A more direct analysis of cancelled and com-
pleted offers confirms that the cancelled offers have declines between the
announcement and cancellation, while completed offers increase in price be-
tween announcement and issuance. Further, at the cancellation there is a
positive return versus a negative return at the issuance. Note that these
patterns are ex post, and it is not likely that there are any profitable trading
strategies. An effort to determine whether debt ratings make a difference
is inconclusive due to small sample sizes, but announcements of bank credit
lines are associated with positive abnormal returns.
Opler & Titman (1995)

If there is an optimal capital structure then firms experiencing an equity price
runup should issue debt to move back towards the optimum. Evidence that
firms issue equity after a runup seems to indicate the opposite. Opler and
Titman (1995) address this issue by seeing if deviations from an estimate
of the optimal debt ratio are useful in predicting whether the firm issues
debt. The general results indicate that firms do move towards a target debt
ratio. A puzzling finding is that the security choice of firms least subject to
information asymmetry is the most sensitive to recent returns.
There are several possible explanations of equity issuance after run-ups.
The optimal capital structure could simply change over time. If firms whose
growth opportunities improve have price runups, these firms should desire
relatively more equity financing. An agency theory explanation is that addi-
tional debt constrains a manager’s ability to grow and raises the probability
of default and firing. The observed behavior is also consistent with the Myers
and Majluf (1984) model where the firms with overvalued securities issue. A
behavioral explanation is that managers want to avoid dilution as a rational
response to an irrational market.
The analysis is performed in two stages. In the first stage debt ratios
are regressed on proxies for growth opportunities and size to get predicted
debt levels. Deviations from the predicted level and control variables are
then used to predict the probability of debt issuance in a second stage. The
second stage regressions are further stratified by size, dividend policy, and
utilities.
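The two-stage design can be sketched as below. This is an illustration only: it uses a single growth proxy, a plain logit fitted by Newton's method, and eight made-up firms, none of which comes from the paper.

```python
import math
from statistics import mean


def ols(x, y):
    """Intercept and slope of a one-regressor OLS fit."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def logit(x, y, steps=25):
    """Intercept and slope of a one-regressor logit, fitted by Newton's method."""
    a = b = 0.0
    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Score vector
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian)
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical firm-level data (all values made up for illustration).
mb = [0.8, 1.0, 1.2, 1.5, 1.8, 2.0, 2.5, 3.0]             # growth proxy (M/B)
debt = [0.45, 0.50, 0.35, 0.42, 0.30, 0.22, 0.28, 0.12]   # actual debt ratio
issued_debt = [0, 0, 1, 1, 1, 1, 0, 0]                    # 1 = issued debt

# Stage 1: predicted (target) debt ratio from the growth proxy.
c0, c1 = ols(mb, debt)
deviation = [d - (c0 + c1 * m) for d, m in zip(debt, mb)]

# Stage 2: does being below target raise the probability of issuing debt?
a, b = logit(deviation, issued_debt)
```

A negative second-stage slope b means that firms below their predicted debt ratio are more likely to issue debt, which is the sense in which firms move towards a target.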
Their findings do not fully support any of the proposed theories. Partial
support comes from several observations. Profitable firms issue debt or re-
purchase shares to offset the accumulation of retained earnings. The larger
issuances tend to involve equity, perhaps in response to the higher fixed costs.
Stock return and M/B are good predictors of equity issuance. The results
on convertible debt generally fall between debt and equity. Firms that issue
short-term debt are less profitable than equity issuers, whereas long-term
debt issuers are more profitable.
The results from the stratified regressions are less supportive of the theo-
ries. Utilities, firms that pay dividends, and firms followed by more analysts
are more sensitive to recent returns in their security choice. Small firms are
less sensitive to price runups. In the more active market for corporate control
in the mid-80s, there is no evidence that managers are less willing to issue
equity following a stock price decline.
Helwege & Liang (1996)

Helwege and Liang (1996) test the pecking order theory using a sample of
IPOs. The basic design is to identify a cohort of firms going public in 1983
and follow their financing choices through time. The general finding is that
there is little support for the pecking order. The probability of external
financing is unrelated to internal cash shortages and financing patterns
indicate an “overuse” of equity.
The study starts with 367 firms. Over the next decade a roughly equal
number go bankrupt, are acquired, and survive. The firms tend to have losses
early in their lives then show increases in profitability. Dividends are rarely
paid and there is a tendency to rely on internal funds over time. Firms seem
to choose private debt, then equity, then public debt. Large firms tend to
issue debt, whereas small firms or low growth firms use more private debt.
Coefficient estimates on default and asymmetric information variables are
mostly inconsistent with the pecking order. Riskier firms tend to issue more
equity.
Jung, Kim & Stulz (1996)

Jung, Kim, and Stulz (1996) perform a test comparing the issuance decision,
market reaction, and subsequent actions predicted by the pecking order [My-
ers (1984)], agency [a special case of Myers and Majluf (1984)], and issuance
timing [Loughran and Ritter (1995)] theories. The findings are consistent
with agency theory, partially support the pecking order, and do not support
issuance timing.
Under agency theory a firm’s managers will issue equity when its shares are overvalued
to maximize current shareholder value (assuming it cannot issue riskless
debt). This setup is a special case of Myers and Majluf (1984) where there
is no information asymmetry about the assets in place but the manager has
incentives to issue equity to take negative NPV projects that are privately
valuable. The agency cost of outside equity arises because of managerial
discretion. Issuing debt instead reduces the manager’s discretion, but gives
rise to the underinvestment problem of Myers (1977) since gains go to the
bondholders first. Thus, high growth firms will have less leverage to avoid
forgoing good projects. The pecking order says firms should issue debt
instead of equity whenever possible. The timing model predicts firms will
issue equity when overvalued, so subsequent returns should be lower.
The analysis is based on a sample of debt and equity issues between 1977
and 1984. Equity issuers tend to be smaller, riskier, growth oriented firms.
The security issue choice is estimated with a logistic regression. High M/B
firms, leading indicators, and recent returns are positively related to the
choice of equity. Firms with high taxes are less likely to issue equity. Firms
that are predicted to issue debt but instead issue equity appear to overinvest.
Overall, abnormal returns are negative for equity issues and insignificant
for debt issues. For equity issues with high prior excess returns the announcement
abnormal return is positive. The correlation between the firm type
predicted in the logit regression and abnormal returns is positive for equity
issues and negative for debt issues. This is evidence supportive of the agency
theory but not the pecking order theory.
The general results indicate that firms issuing equity tend to either have
valuable growth opportunities or lack valuable growth opportunities but have
excess debt capacity. These firms lacking valuable growth opportunities have
more negative stock price reactions to announcement of equity issuance.
Other evidence indicates that some firms issue equity to benefit the man-
agers rather than the shareholders.
5.10 Initial Public Offerings

This section discusses the features that distinguish IPOs from seasoned is-
suances. The primary difference is that valuing an IPO is more difficult than
valuing an existing public firm. Essentially all the problems with SEOs re-
main with additional complications. The potential benefits of going public
include (i) diversification, (ii) liquidity, and (iii) more capital to take good
projects. The costs include (i) information collection/disclosure, (ii) legal
and auditing, (iii) underwriting and one-time direct issuance costs, (iv) management
time and effort, and (v) dilution. Much of the general discussion in this
section comes from Ibbotson and Ritter (1995).
IPOs are a stage in the life cycle of a firm. Initially, firms will be self-financed
since the capital requirements are the smallest and the information
asymmetry problems the largest. The next step is often financing by friends,
relatives, and associates. Personal relationships serve to align the interests of
the manager and the investors. Next come non-affiliated sources of private
capital such as bank financing and venture capital. These owners typically require
a large amount of information disclosure and are often active investors.
After exhausting available private financing a firm will go public.
IPOs are characterized by information asymmetry problems. In partic-
ular, there are adverse selection problems since the owners self-select into
going public, and moral hazard problems since the manager/owners affect
the value of the firm. There are several mechanisms to deal with the informational
asymmetries. By holding a sizeable portion of the firm the manager
provides a signal to outsiders. Managers may agree to a “lock up” period,
during which they will not sell their shares. A manager could also take a
small fixed salary in exchange for a contingent compensation scheme. Firms
may hire certifying agents⁴ who have credibility arising from their desire to
protect their reputation capital. There is evidence supporting the role of
certifying agents [Booth and Chua (1996)].
Table 5.6: Theories of IPO Underpricing
Theory             Prediction                               Evidence
Winner’s curse     riskier issues, greater underpricing     some
Costly info. acq.  upward revisions more underpriced        yes
Cascades           underprice to guarantee success
I-banker power     no underpricing of IB IPOs               no
Lawsuits           underprice to avoid lawsuits             mixed
Signaling          underprice IPO for successful SEO        no
Regulatory         utilities less underpriced
Wealth redist.     bribe                                    limited
Stabilization      underwriter support generates return     no
Ownership disp.    underpricing creates diverse ownership
Market incompl.    compensation for risk-bearing            some
5.10.1 IPO Anomalies

There are three puzzling observations with respect to IPOs. The issues are
significantly underpriced from their secondary market values, yet over the
longer term IPOs tend to underperform. There are also cycles in the extent
of underpricing.
New Issue Underpricing

Initial returns are skewed, with a positive mean and a median near zero.
Smaller offerings are more underpriced, so equally-weighted returns overstate
the degree of underpricing. The underpricing effect is present internationally.
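The effect of the weighting scheme is easy to see on a toy sample. The three offerings below are hypothetical, and weighting by gross proceeds is an assumption about what the alternative to equal weighting would be.

```python
def initial_return(offer_price, first_close):
    """First-day return measured from the offer price."""
    return first_close / offer_price - 1.0


# Hypothetical offerings: (gross proceeds in $ millions, offer price, first-day close)
offers = [
    (5.0, 10.0, 13.0),    # small issue, 30% initial return
    (10.0, 12.0, 13.8),   # small issue, 15% initial return
    (200.0, 20.0, 21.0),  # large issue, 5% initial return
]

returns = [initial_return(p0, p1) for _, p0, p1 in offers]
equal_weighted = sum(returns) / len(returns)

total = sum(size for size, _, _ in offers)
proceeds_weighted = sum(size * r for (size, _, _), r in zip(offers, returns)) / total
```

Here the equally-weighted average is about 16.7% while the proceeds-weighted average is about 6%, because the two small, heavily underpriced issues dominate the simple average.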
Underpricing can be viewed as the solution to a moral hazard problem
between the issuer and the underwriter. The degree of underpricing will
increase with demand uncertainty.
Rock (1986) argues there will be a winner’s curse due to an adverse se-
lection problem arising from an information asymmetry between informed
and uninformed shareholders. In this model the market price and under-
writer offers are jointly determined in equilibrium and the selling mechanism
is exogenous. The banker sets an optimal offer and the market reacts. The
informed agents will only buy IPOs if they are good deals, in which case
they will be oversold and the uninformed will not be able to get their desired
amount. In the case where the IPO is overpriced, the informed will pass,
leaving the full amount to the uninformed. The uninformed will realize this
and require a discount on all issues in order to buy any of the IPOs. An
implication is that underpricing should be greater for riskier issues. Koh and
Walter (1989) provide direct evidence in support of the model.
Welch (1992) presents a model of information cascades where agents’
decisions are influenced by the actions of other agents. Firms will underprice
to get the first few investors to participate, starting a cascade. Booth and
Chua (1996) argue that shares are more valuable to investors when they
are liquid. Providing a more dispersed ownership structure will increase
the liquidity of the shares. Shares are underpriced to compensate a broad
investor base for costly information acquisition.
⁴ See James (1995) or Smith (1986).
Long-Run Underperformance
There is significant evidence that IPOs perform poorly after the initial large
returns. The magnitude of this underperformance is on the order of a –15%
CAR over the following three years. This type of underperformance is also
present in closed-end funds and REITs.
There is some evidence supporting each of the following theories of under-
performance. The divergence of opinion argument of Miller (1977b) is that
the buyers of IPOs are the most optimistic. With greater uncertainty, the
difference between the optimistic and the pessimistic is larger. As time goes
on, information will be revealed that will cause the difference of opinion to
converge, and therefore the price will drop. There is some survey evidence
supporting this theory. The impresario hypothesis suggests that investment
bankers underprice initially to create the appearance of excess demand. Un-
der the windows of opportunity hypothesis there is a sort of dynamic pecking
order theory where firms will issue equity when it is overvalued in general.
Cycles
Cycles in both volume and underpricing are well documented but hard to
explain as rational. One explanation is changing risk composition, meaning
more risky offerings are underpriced more, and there may be a clustering of
IPOs by similar firms. There is some evidence in this direction, but it is not
entirely convincing. A second explanation is “positive feedback” strategies,
where investors buy IPOs expecting positive autocorrelation. If enough investors
do this, the autocorrelation becomes a self-fulfilling prophecy (this is
basically the “greater fool” theory). This effect may be difficult to stop with
arbitrage since it is difficult to short-sell the IPO [Rajan and Servaes (????)].
5.10.2 Key Papers

Welch (1992)
In his “cascades” model Welch (1992) provides an explanation of IPO
underpricing. Investors are sequentially asked if they want part of the IPO.
Each investor gets a signal of the value, but the investors cannot observe
each other’s signals. Investors are able to observe the actions of those that
went before them. When investors use the decisions of others to update their
own beliefs, information cascades can occur in which individuals disregard
their own information and follow the masses. Issuers may underprice to ensure the first few
investors accept and ensure success of the offer. Cascades are not necessar-
ily bad for an issuer. He is at less of an informational disadvantage since
the individuals are unable to aggregate their information. The underwriter
seeks to distribute the offer widely to make investor communication more
difficult. With inside information, a high offer price increases the probability
of offer failure and more so for lower quality issuers, creating a separating
equilibrium.
The model uses an economy of rational, risk-neutral investors. The unknown
true price of the firm is $V$, and the issuer has a reservation price
$V^P \le V^L < V^H$. All participants have a prior $\tilde{V} \sim U[V^L, V^H]$. The price
can be expressed as $p = \theta V^H + (1 - \theta) V^L$ where $\theta \in [0, 1]$ indexes the firm
type. Each investor gets a signal $s_i \in \{H, L\}$ with $\mathrm{Prob}[s_i = H] = \theta$.
With perfect communication all successful offers are underpriced. Ob-
served ex post underpricing is strictly increasing in uncertainty. If communi-
cation can only go from early to late investors things change slightly. Issuer
proceeds are path-dependent, but in a large economy the perfect information
results obtain. When only decisions, and not information, are observable,
things change dramatically. Once an investor M with an H decides not to
invest, no subsequent investors will invest. Similarly, once M with an L in-
vests, all that follow him will also invest. This says that as the game goes on,
individual signals get less weight relative to the information from previous
decisions. As soon as one person goes against their information, anyone after
him would place even less weight on that information.
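The case where only decisions are observable can be simulated with a short sketch. It is a simplified reading of the model, under these assumptions: values are normalized so that the low value is 0 and the high value is 1, the prior on θ is uniform, a risk-neutral investor buys when the posterior mean is at least the price (ties go to buying), and an action reveals the private signal only while it still depends on that signal.

```python
from fractions import Fraction


def run_offer(price, signals):
    """Sequential IPO sales when only earlier decisions are observable.

    price: offer price on a scale where the low value is 0 and the high value is 1.
    signals: string of 'H'/'L' private signals, in the order investors arrive.
    Returns a list of booleans (True = invests).
    """
    inferred_h = inferred_n = 0   # signals revealed by pre-cascade decisions
    decisions = []
    for s in signals:
        # Posterior mean of theta (uniform prior) after adding the own signal
        with_h = Fraction(inferred_h + 2, inferred_n + 3)
        with_l = Fraction(inferred_h + 1, inferred_n + 3)
        if with_l >= price:          # invests even with L: upward cascade
            decisions.append(True)
        elif with_h < price:         # abstains even with H: downward cascade
            decisions.append(False)
        else:                        # decision still reveals the private signal
            decisions.append(s == "H")
            inferred_h += s == "H"
            inferred_n += 1
    return decisions
```

At a price of 1/2, the signal sequence HLLL sells to everyone (one early H is enough to start an upward cascade), while LLHH sells to no one: after two refusals, investors with H signals also abstain.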
With cascades and an infinite number of investors the probability of fail-
ure is zero for P ≤ 1/3 and one for P ≥ 2/3. All prices in between have an
uncertain outcome. An uninformed risk-neutral issuer will optimally choose
P = 1/3 and the offer always succeeds. Since everyone chooses this price and
the average price is 1/2, the expected IPO underpricing is 50%. Under these
conditions, the issuer has higher expected proceeds with cascades than with
path dependency or perfect communication.
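The 50% figure follows directly from the two numbers in the text. A quick check, assuming underpricing is measured as the expected initial return on the offer price and values are normalized to the unit interval:

```python
from fractions import Fraction

offer_price = Fraction(1, 3)      # highest price at which the offer always succeeds
expected_value = Fraction(1, 2)   # average value on the normalized [0, 1] scale

# Expected initial return relative to the offer price
underpricing = (expected_value - offer_price) / offer_price
```

The result is (1/2 − 1/3)/(1/3) = 1/2, that is, 50%.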
What if the issuer can modify the price based on past sales? Issuers with
sufficiently high risk aversion may prefer to start an immediate cascade to
path dependency with the option to change the price later.
The model has a number of implications. First, when distribution is less
fragmented (a local issue) the issuer will underprice more. For these issues
the offer price decreases with the issuer’s risk aversion and capital require-
ments. Issuers have an incentive to prevent communications to preserve the
cascade. Welch argues that the winner’s curse in Rock (1986) is not im-
portant, since success of the offer is a foregone conclusion by the time the
“marginal” investor is approached.
To add another element of realism, issuers are given inside information
about the firm type which is correlated with the outside signals. This makes
it relatively less expensive for a high quality firm to raise the price than for
a low quality firm, creating a separating equilibrium.
Loughran & Ritter (1995)

Loughran and Ritter (1995) attempt to understand the long run underper-
formance of new issues. They find that only a small part of the underperfor-
mance is explained by B/M effects. The degree of underperformance varies
through time.
The authors calculate three- and five-year returns for a large sample of
IPOs and SEOs. The issue date return is not included in the calculations. Re-
turns are also calculated for a sample of non-issuing matching firms. Wealth
indices of issuers’ returns relative to matching firms are calculated for the two
holding periods by cohort year. In almost all cases these indices are less than
one and deteriorate from the three year measure to the five year measure. For
SEOs the issuing firms had extremely high (72% on average) returns in the
year prior to issuance. This underperformance is not due to mean reversion
as in ?. Separating extreme winners into issuers and non-issuers, the issuers
underperform over the next five years while the non-issuers beat the market.
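A wealth index of this kind can be computed as in the sketch below. The definition used here, the ratio of average gross holding-period returns of issuers to that of matching firms, is an assumption about the exact construction, and the cohort numbers are made up.

```python
from math import prod


def holding_period_return(monthly_returns):
    """Buy-and-hold return from a sequence of monthly returns."""
    return prod(1.0 + r for r in monthly_returns) - 1.0


def wealth_relative(issuer_hprs, matched_hprs):
    """Average gross return of issuers relative to that of matching firms."""
    issuers = sum(issuer_hprs) / len(issuer_hprs)
    matches = sum(matched_hprs) / len(matched_hprs)
    return (1.0 + issuers) / (1.0 + matches)


# Hypothetical cohort: three issuers and their non-issuing matches.
issuer_hprs = [0.10, -0.20, 0.25]
matched_hprs = [0.30, 0.20, 0.40]
wr = wealth_relative(issuer_hprs, matched_hprs)
```

A value below one, as here (about 0.81), means the issuers underperformed their matches over the holding period.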
Since many researchers have documented a relation between the cross-
section of returns and B/M, the paper tests for this effect. Although size
and B/M are significant, a dummy variable for new issues is significantly
negative, especially in periods following heavy volume. Using a Fama-French
three factor model, intercept estimates for issuers are significantly lower than for
non-issuers. Also, the issuers have higher betas, which is inconsistent with
the lower returns from the previous analysis.
Koh & Walter (1989)

Koh and Walter (1989) provide a direct test of the Rock (1986) model of IPO
underpricing as a response to the winner’s curse. This test is unique in that it
uses data from Singapore where rationing of oversubscribed offers is done in
a special lottery. In this market all applicants for a given size of the issuance
have an equal chance of winning. The tests confirm the implications of the
Rock model that there is a winner’s curse and that uninformed investors earn
a return similar to the riskless rate.
The authors use simulations to generate the returns to different bidding
strategies. Assuming no rationing occurs, underpricing is large even after
transactions costs. Examination of the probability of an allocation indicates
that small investors are more likely to get an allocation. More importantly,
investors are nearly three times as likely to get an overpriced issue as an
underpriced one. Average returns incorporating costs and the probability
of allocation are approximately zero, consistent with the Rock model. Also
consistent are the correlations between proportions applied for or allocated to
and initial returns. The small investors apply for and get a larger proportion
of the issue when the issue is more fairly priced, whereas the larger investors
apply for and get more when the issue is underpriced. Both small and large
investors’ demands increase for underpriced issues, but the large investors
are much more responsive.
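The winner’s curse arithmetic can be illustrated with a toy calculation. The returns and allocation probabilities below are invented so that the effect is stark; they are not the Singapore data, and transactions costs are ignored.

```python
def allocation_weighted_return(issues):
    """Average initial return on the shares an applicant actually receives.

    issues: list of (initial_return, probability_of_allocation) pairs.
    """
    allocated = sum(prob for _, prob in issues)
    return sum(ret * prob for ret, prob in issues) / allocated


# Hypothetical issues: underpriced ones are heavily rationed, overpriced ones are not.
issues = [
    (0.40, 0.10),
    (0.20, 0.25),
    (-0.03, 1.00),
    (-0.06, 1.00),
]

unconditional = sum(ret for ret, _ in issues) / len(issues)
conditional = allocation_weighted_return(issues)
```

The average initial return across issues is 12.75%, but an investor who applies to every issue earns roughly zero on the shares received, because allocations are concentrated in the overpriced issues.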
Booth & Chua (1996)

Booth and Chua (1996) explain IPO underpricing as an attempt to generate
ownership dispersion and enhance liquidity in the secondary market giving
a flavor of Merton (1987). In the model, informed investors are more likely
to participate in secondary market trading. The underpricing is set to com-
pensate investors for information acquisition. They find that underpricing
is positively related to information costs. Investment banker prestige is neg-
atively related to underpricing in firm commitment offers and unrelated in
best efforts. Finally, the clustering of issues seems to lower information costs
and underpricing.
The model has a number of empirical predictions. Underpricing should
be negatively related to the probability of receiving an allocation. The costs
of achieving ownership dispersion and liquidity should be higher for best
efforts offers since they tend to be smaller. Best efforts offers have a higher
probability of failure so they should be more underpriced. Finally, best efforts
offers should benefit most from clustering.
Initial returns are regressed on firm size, offer price, IPO activity in the
market and industry, underwriter rank, and interactions of these variables
with dummies for offer type. The results indicate that more reputable underwriters
underprice less in firm commitment offers, and that clustering is
important, especially for best efforts offers.
5.11 Executive Compensation

Murphy (1985)
Murphy (1985) reexamines the relation between firm performance and exec-
utive compensation. This study focuses on individual executives over time
and includes important explanatory variables as well as indirect forms of
compensation which prior research has ignored. Murphy finds that executive
compensation is strongly positively related to firm performance.
The paper attempts to avoid errors in variables problems associated with
omitting factors such as entrepreneurial ability, managerial responsibility,
firm size and past performance. If these factors are constant over time,
then time series regressions for individual executives can correctly assess the
sensitivity of pay to performance. The components of compensation under
consideration include: salary, bonus, salary + bonus, deferred compensation,
stock options, and total. Compensation is purged of any direct relation to
the firm’s stock price, and compensation over time is re-expressed in 1983
dollars.
The analysis is conducted in two parts. First, time series regressions relate
annual compensation for each executive to measures of performance for each
firm-year and dummy variables that control for the executive’s position. The
measures of performance include combinations of the stock return and sales
growth. There is an intercept for each individual to capture any other important
variables which are constant over time. Second, cross-sectional
regressions use average compensation (over time) and average performance.
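The role of the individual intercepts can be sketched with a within (fixed-effects) estimator. The two executives and their numbers are hypothetical, constructed so that each has a different intercept but the same pay-performance slope of 0.2; the position dummies are left out for brevity.

```python
from statistics import mean


def within_slope(panel):
    """Pay-performance slope with a separate intercept for each executive.

    panel: dict mapping executive -> list of (performance, pay_change) pairs.
    Demeaning by executive removes anything that is constant over time.
    """
    sxy = sxx = 0.0
    for obs in panel.values():
        x_bar = mean(x for x, _ in obs)
        y_bar = mean(y for _, y in obs)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in obs)
        sxx += sum((x - x_bar) ** 2 for x, _ in obs)
    return sxy / sxx


# Hypothetical panel: pay change = intercept_i + 0.2 * stock return
panel = {
    "exec_a": [(-0.10, 0.03), (0.00, 0.05), (0.20, 0.09)],   # intercept 0.05
    "exec_b": [(0.10, 0.00), (0.30, 0.04), (0.50, 0.08)],    # intercept -0.02
}
slope = within_slope(panel)
```

The within estimator recovers the common slope of 0.2 even though the two executives have different average pay and average performance, which is what a pooled or cross-sectional regression would confound.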
The results indicate that executive compensation changes by about 20%
of the firm’s returns. The ranking of sensitivity is CEO, President, Chairman,
and Vice President. The bonus is most sensitive to performance and
the option compensation is negatively related to performance. The cross-
sectional regression without sales growth gives the wrong signs, evidence
of mis-specification. Adding the sales growth to the regression reduces the
position-specific sensitivity in the time series regression and reverses the signs
in the cross-sectional regression. Including performance interacted with po-
sition gives positive coefficients but the hierarchical ordering fails. In partic-
ular, the results indicate that vice presidents have the most sensitive com-
pensation. Using relative performance provides evidence that salaries are
positively related to raw returns and negatively related to relative returns,
while bonuses are unrelated to raw returns and positively related to relative
returns.
Jensen & Murphy (1990)

Jensen and Murphy (1990) examine pay for top executives to see if they are
compensated in a way that will reduce agency costs by aligning incentives.
They find that during the mid-70s to mid-80s, executive pay is not particu-
larly sensitive to performance, and most of the sensitivity comes from stock
ownership.
The methodology regresses different measures of compensation change
on changes in shareholder wealth to get a sensitivity estimate. They find
that a $1000 change in value causes only a $3.25 change in CEO wealth,
and $2.50 of this comes from stock ownership. The sensitivity is greater in
small firms, $8.05 versus $1.85. Bonuses are generally very stable and do
not seem to reflect changes in performance. Real CEO stock holdings and
the level of pay have fallen over time, suggesting that political pressures have
constrained the ability to offer pay for performance contracts. The variability
of CEO pay has also fallen so that it is no more variable now than general
labor, although executives are less likely to receive pay cuts and more likely
to receive large raises. Pay seems to be tied to accounting measures rather
than market or individual performance. Dismissals do not seem to be an
important incentive, mainly because they are rare. Sensitivity does not seem
to reflect the manager’s level of stockholdings. Sensitivity has decreased over
time.
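To put the sensitivity estimates in dollar terms, the following sketch applies the per-$1000 figures quoted above to a hypothetical change in firm value. The function name and the $100 million example are illustrative only.

```python
def ceo_wealth_change(delta_firm_value, sensitivity_per_1000=3.25):
    """Implied change in CEO wealth for a given change in shareholder wealth."""
    return delta_firm_value * sensitivity_per_1000 / 1000.0


# Hypothetical $100 million loss in shareholder wealth
loss_all = ceo_wealth_change(-100e6)            # full sample estimate
loss_small = ceo_wealth_change(-100e6, 8.05)    # small-firm estimate
stock_share = 2.50 / 3.25                       # part due to stock ownership
```

A $100 million loss to shareholders costs the CEO about $325,000 at the full-sample estimate ($805,000 at the small-firm estimate), and roughly 77% of the sensitivity comes from stock ownership.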
These results are inconsistent with formal agency models of optimal contracts.
Alternative explanations are that CEOs are unimportant inputs in
the production process, actions are easily monitored/evaluated, or there are
political or social pressures that “cap” compensation. Of these, only the
latter is reasonable. Perhaps managerial risk aversion requires even higher
compensation for subjecting managers to performance risk. This is hard to
rationalize as the sole reason since the amount of wealth at risk is a relatively
small portion of total CEO wealth. Highly sensitive contracts may not
be feasible since executives can not credibly commit to paying large amounts
in the event of poor performance. It may also be the case that there are
non-pecuniary benefits such as power and prestige that do provide the right
incentives. However, these factors may incent the manager to be a good
citizen rather than maximize shareholder value.
A weakness of the study is that firm value changes may not be good
measures of the CEO’s performance. For example, flat performance during
a recession may in fact be good. Tests indicate that relative performance is
not important, however. There is also an endogeneity problem.
Yermack (1995)
Yermack (1995) tests nine theories of why companies award executives stock
options. The main idea being tested is whether firms with high agency costs
increase pay for performance sensitivity with stock options. The primary
findings support few of the theories. There is evidence that regulated firms
are less likely to use options, while firms with noisy accounting earnings or
liquidity constraints will use options more.
The analysis uses two possible dependent variables, the option delta times
the fraction of ownership or the value of option compensation relative to
salary and bonus. The first is a “flow” measure while the latter is a “stock”
measure. Measures of option values are based on the Black-Scholes model
and include only new awards. A tobit regression incorporates individual firm
effects and accounts for the large number of variables with values of zero.
The predictions and results are in Table 5.7.
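The first dependent variable, the option delta times the fraction of ownership, can be sketched with the standard Black-Scholes call delta. The dividend-yield argument, the function names, and the example inputs are assumptions; the paper’s exact inputs are not given in these notes.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call_delta(S, K, r, sigma, T, q=0.0):
    """Black-Scholes delta of a European call with continuous dividend yield q."""
    d1 = (log(S / K) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return exp(-q * T) * norm_cdf(d1)


def option_incentive(options_granted, shares_outstanding, delta):
    """Option delta times the fraction of the firm's shares the award covers."""
    return delta * options_granted / shares_outstanding


# Hypothetical at-the-money ten-year award on 1% of the shares outstanding
delta = bs_call_delta(S=50.0, K=50.0, r=0.05, sigma=0.30, T=10.0)
incentive = option_incentive(100_000, 10_000_000, delta)
```

With these inputs the delta is about 0.84, so the award moves the executive’s wealth by a little under one cent per dollar change in equity value.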
Sloan (1993)
Sloan (1993) examines the incremental role of accounting figures in deter-
mining CEO compensation. The logic follows Fama (1980); accounting earn-
ings do not subject the risk-averse manager to uncontrollable market noise.
Table 5.7: Predictions and Results in Yermack
Theory                       Prediction   Finding
Incentive Alignment              –
Horizon Problems                 +
Growth Oppty’s                   +           –
Accounting Noise                 +           +
Agency Costs of Debt/FCF         –
Regulation                       –           –
Liquidity constraints            +           +
Tax loss CF                      +
Earnings Management              –
There will be a greater reliance on earnings when: (i) the firm’s stock returns
are highly correlated with market noise, (ii) earnings are highly correlated
with firm specific signals in returns, or (iii) earnings are less correlated with
market wide noise. Thus, accounting earnings are used as an instrumental
variable in a sense. Ideally, pay would be a function of actions, but these
are not easily observable. Instead, price can be used as a determinant of
compensation, but price is a noisy measure. The weights placed on price and
earnings reflect the tradeoff between incentive alignment and risk-sharing.
There are two important variables in the analysis, the ratio of variance in
market wide noise to variance of earnings noise and the correlation between
these sources of noise. Both variables are interacted with accounting perfor-
mance and stock performance. When the ratio of noise variances is large,
compensation should be based more on the accounting earnings and less on
the returns performance. When the sources of noise are positively correlated
the firm will base compensation less on accounting measures and more on
stock returns. The results indicate that the variance of noise in returns is
less than the variance of noise in earnings. The correlation between market
wide noise and earnings noise is close to zero.
Sloan finds support for the three hypotheses tested in this paper. First,
earnings measures shield executives from market noise. Second, CEO com-
pensation is more sensitive to earnings performance when the returns are
noisy relative to earnings. Finally, firms place more emphasis on earnings
when the correlation between noise in stock returns and noise in earnings is
closer to negative one.
Bizjak, Brickley & Coles (1993)

Bizjak, Brickley, and Coles (1993) explain why firms use multi-year compen-
sation contracts and show that it is not always optimal to tie compensation to
current performance when there are information asymmetries. This contrasts
with the rule of maximizing current stock price in a world of perfect markets
and homogeneous expectations [Fama & Miller (1972)]. The basic intuition
is that when compensation is based only on current performance there are
incentives to maximize the current stock price at the expense of long-run
performance, either by under- or overinvesting. Supportive evidence shows
that high growth firms use longer contracts. There is no relation between
either CEO starting age or tenure and growth opportunities. A surprising
result is that the sensitivities of salary/bonus and total compensation to stock
performance are lower in high growth firms.
In the model managers use the observable investment decisions to ma-
nipulate the market’s inference about the firm. The incentive to do so is
strongest when the manager is likely to leave the firm before the market fully
learns the firm’s type. The compensation plan is then structured to balance
the emphasis on current versus future stock price.
To test the theory empirically the authors use M/B and R&D as proxies
for informational asymmetries with control variables for size and regulated
industries. The main analysis uses the ratio of salary and bonus incentives to
total compensation incentives.[5] Additional regressions use these variables in
isolation. The results indicate that firms with high information asymmetries
pay a lower proportion of compensation in the form of salary and bonus.
Large firms, regulated firms, and high growth firms have total compensation
and salary/bonus that are less sensitive to changes in shareholder wealth.
[5] These are actually the change in each per $1000 change in shareholder wealth as in
Jensen and Murphy (1990).
Smith & Watts (1992)

Smith and Watts (1992) test a variety of theories regarding decisions about
financing, dividend, and compensation policies. The evidence suggests that
contracting theories are more important in explaining cross-sectional varia-
tion in these policies than either tax-based or signaling theories.
Table 5.8: Predictions and Results of Smith & Watts
Dependent Independent Variable
Variable A/V Reg. Size Ret.
E/V – – –
D/P + + ? (+)
Comp. – + (–) + +
Bonus ? (+) – +
Option – – +
Symbols shown are predictions. Actual results that are significantly different are in
parentheses.
The study considers four endogenous policy variables: E/V for financ-
ing, D/P for dividends, CEO salary for compensation, and frequency of op-
tion/bonus plans for incentive compensation. Independent variables include
book assets to value for the investment opportunity set, size, accounting re-
turn, and a dummy for regulated industries. The data are on the industry
level.
The results indicate that firms with more growth options have lower lever-
age, lower dividend yields, higher compensation, and more frequent usage of
stock option plans. Regulated firms have higher leverage, higher dividend
yields, lower compensation, and less frequent usage of stock option/bonus
plans. Finally, larger firms tend to have higher dividend yields and higher
levels of executive compensation. These results imply relations among the
policy variables as well. There should be a positive relation between lever-
age and dividend yield and between compensation and the use of incentive
plans. There should be negative relations between dividend yield and incen-
tive plans and also between leverage and either compensation or incentive
plans.
5.12 Risk Management

There are several ways a firm can manage risk, including diversification,
insurance (nonlinear), and hedging (linear). To measure the valuation effect
of risk management, researchers use either an event study or matched samples.
Stulz (1995)
Stulz (1995) attempts to reconcile the theories and practice of risk manage-
ment. Survey data indicate that firms typically hedge transactions and do
not engage in speculation or arbitrage. At the same time, managers indicate
that their views influence the extent of hedging, and many large firms view the
treasury as a profit center. Large firms tend to use derivatives more than
smaller firms.
Theories predict gains from risk management may come from several
sources. In an efficient market with diversification, these gains must arise
only from real resource gains such as reducing costs due to financial distress,
taxes, wages, or capital acquisition. Since increases in capital are a substitute
for risk management, firms with low leverage are generally not expected to
benefit much from hedging. In this sense, hedging allows firms to save capital.
Since managers dictate the risk management policy it is important to consider
their incentives to reduce or increase risk. The chances of bankruptcy also
affect risk management. The lowest risk firms can afford to take bets and
the highest risk firms are forced to take bets.
Since most of the arguments for risk management focus on left-tail out-
comes, methods such as variance reduction are not really appropriate. Value
at risk emphasizes the magnitude of the loss that occurs with a given proba-
bility, but it is not appropriate either. The path of firm value over time is
more important than the distribution at a point in time.
Froot, Scharfstein & Stein (1993)

Froot, Scharfstein, and Stein (1993) develop a theoretical framework describ-
ing optimal risk management strategies. The focus is on what and how much
hedging should be done as opposed to why or how to implement the program.
The optimal amount and type of hedging depends on the nature of a firm’s
investment and financing opportunities.
This paper takes the view that the motivation for risk management is to
reduce the variability in cashflows since it disrupts investment and financing
activities.[6] When cashflows are variable the amount of external financing
and/or investment will also be variable. Holding investment fixed requires
changing external financing. If the marginal cost of funds increases in the
amount of financing then the investment policy will still be altered. Thus,
actions the firm can take to reduce cashflow variability may increase firm
value. This is based on the assumption that firms are more efficient at
hedging than individuals.
[6] Other theories include managerial risk aversion, information asymmetries in the labor
market, taxes, financial distress costs/additional debt capacity, and underinvestment.
An implication of the model is that high R&D firms are more likely to
hedge. These firms may have greater difficulty raising external funds because
either the growth opportunities are not good collateral or since there may be
large information asymmetries. Also, the R&D growth options are not likely
to be correlated with hedgeable risks. This effect comes from the distinction
between collateral value sensitivities and marginal product sensitivities. Here
the marginal product is insensitive to hedgeable risk. Therefore, the firm
desires more hedging so it can still fully invest in the bad states. If the
marginal product were more sensitive there would be a natural hedge in the
sense that when the firm is in a bad state it wants to invest less anyway.
Several conclusions arise from the model.
• Optimal hedging does not always mean full hedging.
• Firms should hedge less when future investment and cashflows are
highly correlated and more when collateral and cashflows are corre-
lated.
• Hedging by multinationals is influenced by revenue and expense expo-
sures to exchange rates.
• Nonlinear hedging allows added precision.
• Futures and forwards are different intertemporally.
• Hedging practices of competitors matter to a firm.
May (1995)
May (1995) tests the theory that managerial risk preferences affect the risk
management decisions of the firm. The paper focuses on acquisitions, which
can be a substitute for other risk management practices. For managers,
diversification may be a positive NPV project, even though it may be bad for
shareholders. The main finding is that managers with more personal wealth
invested in the firm tend to diversify, despite evidence that diversification
typically reduces firm value [Berger and Ofek (1995)].
The CEO’s motives are proxied by his tenure, estimated fraction of wealth
in equity, specialization of human capital, and past performance. The rela-
tions between these variables and the diversification level sought, industry-
adjusted leverage, volatility, and idiosyncratic risk are considered.
Diversification level sought is measured as the covariance of returns between
bidder and target, firm-specific risk reduction, and implied change in volatility.
Table 5.9: Predictions and Results in Tufano
Hypothesis               Variable          Predicted   Actual
Distress                 Cash costs            +
                         Leverage              +          +
Disruption of Invest.    Exploration           +
                         Acquisitions          +
Cost of Ext. Fin.        Firm value            –
                         Reserves              –
Tax                      Tax loss CF           +
Risk Aversion            Mgr. stock            +          +
                         Mgr. options          –          –
                         Nonmgr. block         ?          –
Other Fin. Policies      Diversification       –
                         Cash                  –          –
There is strong evidence that the fraction of wealth in equity is impor-
tant. CEOs with specific expertise tend to buy related targets. Poor past
performers often make diversifying acquisitions. There is weak evidence that
seasoned experts also make diversifying acquisitions, perhaps because their
human capital becomes too firm-specific.
Tufano (1996)
By focusing on the gold industry Tufano (1996) is able to carefully examine
the determinants of risk management. Isolating the gold industry allows a
study where there is a common exposure to output price. The wide variety
of risk management policies and gold-related derivative instruments used
by the industry provides cross-sectional variation. Data collection efforts
are aided by the public disclosure of risk management activities. The gold
industry should use very little hedging since its assets are mostly tangible
and known, investors can hedge on their own relatively easily, and detailed
reporting minimizes informational asymmetries. Despite these reasons, 85%
of the firms do manage risk.
To perform the analysis, Tufano calculates a delta percentage (∆%), which
is the portfolio delta times the ratio of ounces hedged to expected production.
If ∆% = 0 there is no hedge and the firm is long its full production. At
∆% = 1 the firm has a full hedge. A delta percentage less than zero or greater
than one indicates a speculative long or short position, respectively. The
independent variables in the tobit regression are summarized in Table 5.9.
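The delta percentage described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the sign convention (a short hedge position carries a positive delta, so a fully sold-forward position has delta one), and the "partial hedge" label for values strictly between zero and one are ours, not Tufano's.

```python
def delta_percentage(portfolio_delta: float, ounces_hedged: float,
                     expected_production: float) -> float:
    """Portfolio delta times the ratio of ounces hedged to expected production.

    Assumption: portfolio_delta is signed so that a short (hedging)
    position is positive.
    """
    if expected_production <= 0:
        raise ValueError("expected_production must be positive")
    return portfolio_delta * ounces_hedged / expected_production


def classify_position(delta_pct: float) -> str:
    """Map a delta percentage to the interpretation given in the notes."""
    if delta_pct < 0:
        return "speculative long"
    if delta_pct == 0:
        return "no hedge"
    if delta_pct < 1:
        return "partial hedge"  # label is an assumption, not from the paper
    if delta_pct == 1:
        return "full hedge"
    return "speculative short"
```

For example, a firm that has sold forward half of its expected production with a portfolio delta of one has a delta percentage of 0.5.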
He finds support for management incentives, but little support for firm
incentives. When managers own more stock options firms manage risk less,
but when managers have more wealth invested the firms manage risk more.
Other results show that firms with low cash balances or CFOs with short
tenure manage risk more.
Geczy, Minton & Schrand (1996)

Geczy, Minton, and Schrand (1996) try to explain “Why Firms Use Currency
Derivatives.” They test the predictions of hedging theories by looking at a
subset of the Fortune 500 firms with ex ante foreign exchange exposure. The
study also considers how the magnitude of the exposure affects the benefits
from risk reduction and the associated expenses. The results indicate that
financing constraints provide incentives for hedging. There is evidence of un-
derinvestment, especially for firms with little financial flexibility. Firms may
choose to use foreign-denominated debt as a substitute for direct hedging.
The expenses associated with hedging are important. There is no support
for speculative positions.
Roughly 40% of the firms use currency swaps, forwards, or options. Usage
is more common among firms with more growth opportunities or greater
financial constraints, consistent with the model of Froot, Scharfstein, and
Stein (1993). Larger firms or firms using other derivative instruments are
more likely to use currency derivatives, indicating economies of skill and scale.
Firms tend to hedge foreign currency with forwards and foreign interest with
swaps.
The analysis is based on a logit regression predicting currency derivative
use. The categories of factors considered are managerial incentives, bond-
holders, equityholders, operating characteristics, substitutes, and costs. An
effort is made to account for the endogeneity problem related to a firm’s
choices of capital structure, executive compensation, and derivatives usage.
Consistent with the argument that more foreign exchange exposure in-
creases the benefits to hedging, the authors find that the likelihood of cur-
rency derivatives use is positively related to foreign sales, foreign-denominated
debt, and foreign pre-tax income. The positive relation between hedging and
R&D and the negative relation with the quick ratio support the claim that
firms with the highest external finance costs use currency derivatives.
5.13 Internal/External Markets and Banking

The distinction between internal and external capital markets becomes im-
portant when there are market frictions. Fama (1985) claims that banks
must provide some unique services since they are effectively taxed by reserve
requirements, but the organizational form still exists. Possible explanations
are an informational advantage, greater capacity to monitor, and a certifica-
tion/signaling role.
Rajan (1996)
Rajan (1996) presents a model incorporating the endogenous costs and bene-
fits of bank debt. An optimal borrowing structure reduces a bank’s ability to
appropriate rents from the borrower without drastically reducing its ability
to control. The main result is that an informed bank can prevent a manager
from continuing a negative NPV project, but it comes at a cost of reduced
managerial effort and value due to the bank’s bargaining power over positive
NPV projects. Arm’s length debt has neither the bargaining power nor the
monitoring capacity of bank debt, but demands a higher return ex ante to
compensate for the negative NPV projects.
In the model an owner-manager needs external financing to pursue a
project idea. After making the investment, the manager exerts costly effort
which affects the distribution of project returns. The bank has the ability to
force discontinuation if the project becomes negative NPV. Since the manager
is a residual claimant, he always wants to continue [Jensen and Meckling
(1976)]. Note that everyone is risk-neutral in the model.
The structure of the bank loan is important. If the bank requires repay-
ment when the true state is revealed, the bank has the power to hold up
the manager unless he has other financing options. This causes the owner to
lose some of the surplus from the project and he will no longer exert optimal
effort. Alternatively, the bank can require repayment only at completion of
the project. Now the bank loses its power to force discontinuation and has
to bribe the manager to stop negative NPV projects. Competition among
financiers has ambiguous effects. It reduces the bank’s ability to extract a
surplus in the good states, but also reduces its ability to force discontinuation
since the manager can borrow from uninformed sources.
Puri (1996)
The purpose of Puri (1996) is to determine whether banks suffered from a
conflict of interest when they were allowed to underwrite securities offerings.
The Glass-Steagall Act of 1933 prevented banks from underwriting based on
the premise that banks had an incentive to underwrite offerings of their own
troubled loans.
There is a tradeoff between the informational advantage banks have,
which should reduce the yield premium, and the conflict of interest, which
would raise the premium. The strategy of the paper is to look at yield pre-
miums of commercial banks versus investment banks. The null hypothesis
is that the yield premiums are the same for the two types of banks. The
sample includes several hundred offerings between 1927 and 1929, the pe-
riod between the McFadden Act, which made underwriting legal, and the
Depression. The main analysis is a regression of yield premium on control
variables and a dummy for commercial banks. The control variables include
credit quality, loan amount, syndicate size, firm age, and dummy variables for
exchange listing, securitization, and new issue.
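The regression just described can be written schematically as follows. This is a sketch only: the symbols are illustrative labels rather than Puri's notation, with BANK a dummy equal to one for commercial-bank underwritings and X the vector of control variables listed above.

```latex
\[
\text{Premium}_i = \alpha + \delta\,\text{BANK}_i + \gamma^{\prime} X_i + \varepsilon_i ,
\qquad H_0:\ \delta = 0 .
\]
```

In this notation a negative \delta would indicate that the informational advantage dominates, while a positive \delta would indicate that the conflict of interest dominates.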
The results suggest that commercial banks did not have a conflict of
interest. The yield on bank underwritten issues is lower than that on un-
derwritings by investment banks, especially for the informationally sensitive
offerings such as new issues, industrials, preferred, and lower-grade. This
indicates the informational effects dominate the conflict of interest and is
consistent with positive AR for bank loan announcements.
Shin & Stulz (1996)

A test of whether divisional structures influence investment policy is the
focus of Shin and Stulz (1996). A firm with multiple divisions has several
potential costs and benefits of diversification. On the one hand, internal
capital markets will provide cheaper access to capital if external markets are
imperfect. On the other hand, bureaucracy may hamper efficient investment.
The basic evidence is that the investment of small divisions depends heavily
on the cashflow of larger divisions, but the investment of larger divisions
does not depend much on the cashflows of other divisions. This suggests
that internal capital markets are important, but does not tell us if they are
good or bad.
There are three hypotheses under consideration. With bureaucratic rigid-
ity, additional management and inefficient policies and procedures may cause
firms to give divisions a “sticky” fraction of the total capital budget. One divi-
sion’s allocation will be inversely related to the cashflows of other divisions.
This inverse relation will be stronger with more divisions, and weaker when
investment is not expected to be sensitive to cashflows. Under the hy-
pothesis of efficient internal capital markets firms will shift funds (including
dividends) to the source of highest value. In this setting other divisions will
benefit when a large division has high cashflows and relatively poor invest-
ment opportunities. Finally, the free cashflow hypothesis says that firms
may still shift funds to the best use, but dividends will not be paid. The
prediction of this theory is that firms will invest more in non-core segments
if the core business has high cashflows and poor prospects.
To address these theories the authors examine the link between CF and
investment at the division level compared to the entire firm. The link between
a division’s investment and the cashflows of the other divisions is also consid-
ered. A distinction is made between small and large divisions. The ratio of
divisional capital expenditures to lagged divisional assets is regressed on the
lagged value of that ratio, divisional sales growth, divisional CF/Assets, and
CF/Assets of other segments. These regressions are performed separately for
small and large segments, with further subdivision on the number of segments.
The entire analysis is repeated for large firms.
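The specification described in this paragraph can be sketched as below. The symbols are illustrative, and the timing of the cashflow terms and the scaling of other segments' cashflow by their own lagged assets are assumptions, since the notes do not spell them out. Here j indexes the division, -j the firm's other segments, I is capital expenditures, A is assets, g is sales growth, and CF is cashflow.

```latex
\[
\frac{I_{j,t}}{A_{j,t-1}} = \alpha
  + \beta_1 \frac{I_{j,t-1}}{A_{j,t-2}}
  + \beta_2\, g_{j,t}
  + \beta_3 \frac{CF_{j,t}}{A_{j,t-1}}
  + \beta_4 \frac{CF_{-j,t}}{A_{-j,t-1}}
  + \varepsilon_{j,t}
\]
```

The coefficient of interest is \beta_4, the sensitivity of a division's investment to the cashflow of the other divisions, estimated separately for small and large segments.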
The results indicate that the investment of all divisions is positively re-
lated to each of the independent variables. For small divisions, the cashflows
of other divisions are fairly important, while this is not the case for larger
divisions. With more divisions the importance of other segments increases.
For firms where investment is not expected to be sensitive to cashflows
(e.g., low leverage or high q[7]), the sensitivity of a division’s investment to
other divisions’ cashflows is weaker. This sensitivity increases with the num-
ber of divisions. These results are consistent with the bureaucratic rigidity
hypothesis. There is little evidence supportive of the efficient internal capi-
tal markets or free cashflow hypotheses; firms with large divisions that have
poor prospects but large free cashflow do not seem to direct more funds to
small divisions in growing industries.
[7] Market to Book is actually used as a proxy for Tobin’s q.
Billett, Flannery & Garfinkel (1995)

Billett, Flannery, and Garfinkel (1995) attempt to determine whether the
quality of the lender has a valuation impact on the borrower. The lender’s
identity might matter if certain lenders have special monitoring abilities, or
if the lender’s preferences for certain risk classes signal the borrower’s type.
Announcement of issuance of public securities is generally met with a price
decline. Private securities are often associated with a positive price impact.
Therefore, public and private financing do not seem to be perfect substitutes.
Institutional features may affect this process. Banks have access to pri-
vate information in the form of deposit accounts. Government regulations
require banks to focus on the risk of individual loans rather than the entire
portfolio. Therefore, borrowing from a constrained bank may signal a less
risky borrower. The lender’s credit quality may also matter. Borrowers are
likely to prefer healthy banks to preserve long-term relationships and mini-
mize search/switching costs. Expertise in monitoring may produce economies
to specialization. A high rating will reduce the lender’s cost of capital. A
reputational equilibrium may develop where lenders are expected to deliver
securities of a certain type.
The analysis performs an event study on a sample of firms with loan
announcements in the 1980s. Univariate analysis shows that there is not
a difference between the abnormal returns when the lender is a bank ver-
sus a non-bank. However, borrowers experience a positive abnormal return
when borrowing from a bank with a high credit rating, versus a negative
abnormal return from lenders with lower ratings. Regression results indi-
cate that abnormal returns increase by 20 basis points for each change in
the lender’s credit rating after controlling for other factors such as firm size,
preannouncement run-ups, and other firm characteristics.
Fazzari, Hubbard & Petersen (1988)

Fazzari, Hubbard, and Petersen (1988) test whether financing constraints
affect investment. In perfect markets, financing alternatives are perfect sub-
stitutes and the investment and financing decisions are separate. Market
imperfections make external markets more expensive. Asymmetric informa-
tion is the primary friction, others include transactions costs, taxes, agency
problems, and financial distress.
The paper explores the empirical support for the q theory, the sales ac-
celerator model, and the neoclassical model of investment. Each of these
models predicts that factors other than cash flow drive investment. Under
the q theory, firms invest as long as the marginal q is greater than unity.
The neoclassical theory is based on the notion that the financial character-
istics of a firm do not affect the cost of capital. The sales accelerator model
says that sales growth drives investment.
The basic idea behind the empirical tests is to define three classes of firms
based on dividend payouts (retained earnings). These groups are proxies for
information asymmetry; high payouts mean the firm has the lowest costs
to external financing. Investment per dollar of capital is then regressed on
financial measures to see if there are differences across groups. The results
indicate that cashflow is important in determining investment, and more so
for the firms with low dividend payouts. This supports the pecking order
theory.
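A schematic version of the test is given below. It is a sketch that assumes the financial measures are a q proxy and cashflow, the two variables the discussion above emphasizes; the symbols and the group subscript g (low, medium, or high dividend payout) are illustrative labels.

```latex
\[
\left(\frac{I}{K}\right)_{i,t} = a_g + b_g\, Q_{i,t}
  + \gamma_g \left(\frac{CF}{K}\right)_{i,t} + u_{i,t}
\]
```

In this notation the reported result corresponds to a cashflow coefficient \gamma_g that is positive in every group and largest for the low-payout group.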
5.14 Convertible Debt

Convertible debt can be viewed as an indirect equity issuance — when a firm
calls its bonds it is like issuing equity. Under a signaling hypothesis, firms
tend to issue equity when their shares are overvalued.
Many researchers have argued that in perfect markets, convertible bonds
should be called as soon as possible to minimize the value of the liability.
Early empirical evidence suggests that corporations wait too long to call and
there are negative excess returns at the announcement of the call. Subse-
quent researchers proposed several reasons why a firm may choose to delay
the call. This could be due to managerial compensation schemes based on
EPS, the effect of reduced bondholder goodwill on future issuances, a prefer-
ence for voluntary conversion induced by dividend increases, and suboptimal
conversion strategies by the security holders.
Stein (1992)
Stein (1992) develops a theory explaining the use of convertible debt based
on the cost of financial distress and the importance of call provisions. Con-
vertibles allow a company to get equity into the capital structure “through
the back door,” while mitigating the adverse selection costs of a direct equity
issuance. Since a convertible issue is like a combination of debt and equity
the issuance signals better prospects than an equity issuance.
The model is an extension of Myers and Majluf (1984), where there are
good, medium, and bad firms that differ in the probability of a high cash flow.
The firm knows its type at time zero, while investors get this information at
time one. The cashflow is revealed at time two. A good firm is certain to
get the high cashflow X_H. Medium firms get X_H with probability p. Bad
firms may improve with probability (1 − z) and then get X_H with probability p, or
deteriorate and get nothing.
A basic version of the model gives firms the choice of equity, long term
debt and convertible debt. When costs of financial distress are sufficiently
high (C > I − X_L) there is a separating equilibrium. Good firms choose
debt since there are no distress costs and the firm does not have to sell
undervalued securities. Medium firms choose convertible debt to reflect the
tradeoff between distress costs and issuing undervalued securities. The bad
firms choose equity because the distress costs of other securities outweigh the
benefits.
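The separating equilibrium above amounts to a simple mapping from firm type to security, which the sketch below makes explicit. The function name is hypothetical, and reading C as the cost of financial distress, I as the required investment, and X_L as the low cashflow is our interpretation of the symbols in the condition. The notes do not describe what happens when the condition fails, so that case returns None rather than a guess.

```python
from typing import Optional


def stein_security_choice(firm_type: str, distress_cost: float,
                          investment: float, x_low: float) -> Optional[str]:
    """Security issued by each firm type in the separating equilibrium.

    The equilibrium is only stated for distress_cost > investment - x_low;
    outside that region this sketch returns None.
    """
    if distress_cost <= investment - x_low:
        return None  # condition C > I - X_L fails; not covered by the notes
    choices = {
        "good": "debt",
        "medium": "convertible debt",
        "bad": "equity",
    }
    return choices[firm_type]
```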
There are several forms of empirical support for the model. Firms often
state the desire to get equity into the capital structure as a reason for issuing
convertible securities. Convertible debt is often (and fairly quickly) converted
into equity. Convertible issuers tend to have high informational asymmetries
and costs of financial distress as indicated by high R&D/Sales, M/B, D/E,
and CF volatility. Finally, the stock price reaction to convertible issues is
typically half to a third the negative reaction of equity issuances.
Ofer & Natarajan (1989)

The paper by Ofer and Natarajan (1989) assesses whether the negative share
price reaction to a call announcement is due to signaling. There is a decline
in performance after the announcement as well as a continued negative CAR
over the next five years.
Under a signaling framework the announcement of the call will be met
with a negative return since investors perceive the call as signaling bad news.
For the signal to be effective the firm must perform poorly after the call.
The sample consists of over 100 voluntary calls during the 1970s. There
is a potential sample selection bias since the pre-announcement performance
tends to be abnormally high. After the call, what may be normal performance
will look poor in comparison. Other papers which correct for this problem
do not find evidence of poor post-announcement performance.
The authors use several measures of performance to avoid the causality
problem between the call decision and the performance measures. EBIT will
not be affected by the conversion. EBT is affected through the reduction in
interest. EPS is affected by both the interest and the increase in number of
shares. Finally, AEBT is EBT less the interest that would have been paid.
Three models of normal performance are used. The first assumes the per-
formance is stationary through time. The second and third models express
expected performance as a function of average market- and industry-wide
performance. In all cases the results indicate that these firms have unexpect-
edly poor performance. The call announcement is associated with a negative
abnormal return, then followed by negative CARs over the next five years.
These results are consistent with the information signaling hypothesis and
the predictions of Myers and Majluf (1984).
Dunn & Eades (1989)

Dunn and Eades (1989) attempt to explain the observation that firms wait
too long to call preferred stock by focusing on the assumption that investors
follow perfect-market strategies. If enough investors deviate from the perfect-
market strategy then it may be optimal for the firm to delay the call. If the
dividend yield on the callable security is lower than on the common stock
then managers can take advantage of the slow conversion by passive investors.
The optimal call policy for the firm is to force conversion by calling as soon
as the conversion value exceeds the call price, but before the issue enters the
voluntary conversion region (VCR). The VCR is the first ex-date where the
dividend on conversion is greater than the preferred dividend and conversion
premium.
The study uses convertible preferred stock to avoid complications related
to interest tax deductibility. Consistent with the passive investor theory:
• Many investors do not convert in the VCR
• Convertible preferreds sell below conversion values
• Firms are generally not able to increase shareholder wealth by calling
• Passive investors would typically realize incremental returns by con-
verting
The authors define the dividend ratio (DR) as the total conversion dividends
relative to the total preferred dividends in a year. The price ratio (PR) is
the average ratio of preferred price to conversion value of equity. The share
ratio (SR) is the fraction of preferred shares remaining at the end of the year
after conversion.
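The three ratios can be computed as in the sketch below. The function and argument names are hypothetical, and measuring SR against the number of shares originally issued is an assumption; the notes only say "fraction of preferred shares remaining."

```python
from statistics import mean


def dividend_ratio(conversion_dividends, preferred_dividends):
    """DR: total dividends available on conversion over total preferred
    dividends paid during the year."""
    return sum(conversion_dividends) / sum(preferred_dividends)


def price_ratio(preferred_prices, conversion_values):
    """PR: average of preferred price divided by conversion value."""
    return mean(p / v for p, v in zip(preferred_prices, conversion_values))


def share_ratio(shares_remaining, shares_issued):
    """SR: fraction of preferred shares still outstanding at year end.

    Assumption: the denominator is the number of shares originally issued.
    """
    return shares_remaining / shares_issued
```

With quarterly conversion dividends of 0.25 against preferred dividends of 0.50, DR is 0.5, so the preferred still holds the dividend advantage and the issue is outside the voluntary conversion region.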
When DR < 1 then PR > 1, indicating that the preferred sells at a
premium due to the conversion option and dividend advantage. When DR >
1 then PR = 1 since there is no conversion premium. The SR drops to around
80% prior to entering the VCR, drops to around 50% in the next year, then
declines to roughly 10% ten years after entering the VCR. Consistent with
the theory, callable survivors have higher φC/Call, lower SR, higher DR,
and lower PR than the called sample. The called sample also has a higher
proportion of issues in the VCR.
Regression results show that before entering the VCR, conversions in-
crease when preferred is selling below its conversion value. After entering
the conversion region, investors are increasingly motivated to convert as the
dividend advantage of common stock increases. Using institutional ownership
as a proxy for active investors, there is some weak evidence that institutional
investors reduce their holdings more than other investors.
Asquith & Mullins (1991)

Asquith and Mullins (1991) explain why companies do not call convertible
debt when the conversion value exceeds the call price, as predicted by many
theories. There are three primary criteria used to explain this behavior.
The first, and most obvious, is simply that the issues are still call-protected.
Second, the firms may want the conversion value to be somewhat higher
than the call price to provide protection from a price decline during the call
notice period. Finally, the most powerful explanation is that there may be
cashflow advantages to the firm from not calling when the after-tax interest
on the bonds is less than the dividends on the converted shares.
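
The cashflow criterion can be written as a one-line comparison. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and interest and dividends are taken as annual cash amounts on the same bond (or on the shares it converts into).

```python
def after_tax_interest(interest, tax_rate):
    """Cash cost to the firm of leaving the bond outstanding: I(1 - tau)."""
    return interest * (1.0 - tax_rate)


def firm_prefers_to_delay_call(dividends_if_converted, interest, tax_rate):
    """True when paying after-tax interest is cheaper than paying the
    dividends that would be owed on the shares issued upon conversion."""
    return dividends_if_converted > after_tax_interest(interest, tax_rate)


# Hypothetical numbers: $80 coupon, 35% corporate tax rate.
print(firm_prefers_to_delay_call(60.0, 80.0, 0.35))
print(firm_prefers_to_delay_call(40.0, 80.0, 0.35))
```

The first case has dividends above the after-tax coupon, so the firm saves cash by not forcing conversion; the second does not.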
An analysis of convertible bonds with conversion values in excess of par
indicates that 89% fall into one of the above categories. 21 of the remaining
22 are close to or subsequently meet the requirements for one of these groups.
Voluntary conversion is more likely with higher conversion value or higher
dividends relative to after-tax interest. An increase in conversion value
decreases the option value. Investors voluntarily convert when they receive
more cash in dividends, a time when firms have an incentive not to call.
This is supported by the data since less than 20% of the issues remain when
converted dividends exceed the interest. Although the investor’s problem is

the inverse of the firm’s, the decisions are not symmetric because of taxes.
Therefore there are bonds which a firm will not call and investors do not
convert.
Asquith (1995)
Asquith (1995) corrects prior studies by showing that, when measured prop-
erly, there is no call delay. Prior studies draw the conclusion that conversion
value in excess of call value indicates a delay from the optimal time to call.
A number of these bonds are still call-protected. Many of those that are
not protected have the after-tax yield below the dividend, providing a cash-
flow incentive not to call. Finally, delayed conversion bonds often have
relatively low premia or volatile cashflows, providing a price protection justi-
fication for the delay. These motivations are discussed in Asquith and Mullins
(1991). This paper adds an analysis of the delay between when a bond is
callable and when it is called.
The paper finds that those bonds that are called have fewer “live” days.
Bonds with relatively high conversion prices and those with D < I(1 − τ),
where D is the dividend on the converted shares, I the interest, and τ the
corporate tax rate, are called more quickly. A puzzle is that there are several
bonds with D > I(1 − τ)
that are called. The general conclusion is that most bonds are called as soon
as possible unless there are cashflow advantages to delaying. The median
call delay for all bonds is four months, but less than one month if a price
cushion is considered. Asquith argues that call premiums are not a useful
method of detecting whether bonds are called late. Overall, the average call
premium is 50%. The average call premium drops to 25% after considering
factors such as cashflow motivated delays, sudden stock price increases, and
large premiums while call protected.
5.15 Imperfections and Demand

In perfect markets demand curves should be flat but market imperfections
may cause downward sloping demand curves. Many important propositions
in finance are based on the assumption that investors can buy or sell stock
without changing the price. Observed price reactions indicate prices are
sensitive to volumes. Large block purchases generally result in price increases,
while sales cause prices to fall. With equity issuance there are negative price
reactions, potentially due to agency costs of free cashflow [Jensen (1986)],

asymmetric information [Myers and Majluf (1984)], and signaling [Miller and
Rock (1985)]. In takeovers bidder prices typically fall while targets receive a
premium. Convertible debt issues and call announcements are associated with
negative market reactions. It is not clear if these reactions are driven by
signaling, liquidity, or downward sloping demand curves.
Shleifer (1986)
Shleifer (1986) provides evidence that demand curves for stocks do slope
down. He uses inclusion in the S&P 500 as a sample since this event increases
demand for the stock without contaminating information effects. Earlier
studies had examined the price effects of large block trades but these events
may be based on information. A possible certification role of index member-
ship is refuted since the returns are unrelated to bond ratings. The liquidity
hypothesis is rejected by finding no difference in the returns of Fortune 500
firms and other firms.
There is no evidence that the market is able to predict inclusion in the
index. Before daily notification of the inclusion there is no abnormal return
on the event day. Since 1976 there has been a daily notification service of
changes in the index. In this period inclusion in the index is associated with
a positive abnormal return of about 2.8%. This return lasts for several weeks
and seems to be related to buying by index funds. Other evidence supports
the downward sloping demand curve hypothesis as well. The price reaction
to large block trades typically only lasts a few hours. Firms with multiple
classes of stock that issue more of one class generally experience a price drop
only for that class of stock [Loderer, Cooney, and VanDrunen (1991)]. A
downward sloping demand curve is also consistent with the January effect.
Shleifer & Vishny (1992)

Shleifer and Vishny (1992) relate the costs of asset sales to leverage in a
general equilibrium setting. When a firm is in financial distress, the most
ideal purchasers of the assets are likely to be in financial distress themselves.
This liquidity cost is recognized ex ante as a cost of leverage. The main
result is that more liquid assets are able to support more debt. This is
broadly consistent with Myers (1977).
The intuition behind the model is that assets are often specialized, making
them most valuable to firms within the industry. When industry shocks send
a firm into financial distress its competitors will also be affected. As a result,
there is an industry debt capacity and the leverage of one firm will depend on
the leverage of its peers. Firms outside the industry may have an interest in
the assets but are likely to pay less. Outsiders fear overpaying since they lack
the expertise to properly value the assets, they may lack the knowledge or
skills to fully utilize the assets, and they face agency costs in hiring experts
to help them.
There are several empirical implications of the model. Liquid assets
should be financed with more debt. Cyclical and growth oriented assets
are likely to have lower debt financing. Ceteris paribus, smaller firms should
be able to support more debt since they can more easily be purchased. Con-
glomerates should also be able to use more debt since the divisions can cross-
subsidize each other. High markets are likely to be liquid markets.
The takeover wave of the 1980s is consistent with this theory. Corpo-
rate cashflows were large as were the number of potential buyers. Antitrust
enforcement was relaxed, allowing more intra-industry acquisitions. This in-
creased liquidity and the rise of the junk bond market reinforced each other.
Merton (1987)
Merton (1987) is an asset pricing model which relaxes the assumption of ho-
mogeneous information. Although the model is cast as one with imperfect
information, it can be interpreted as a model of incomplete markets. In-
vestors are unable to fully diversify so they demand a premium for bearing
this undiversifiable unsystematic risk.
In this one period model risk-averse investors know about a subset of the
securities in n risky firms. There is also a riskless asset and another asset
that combines the riskless security with a forward contract. The market is
absent frictions from taxes, transactions costs, and restrictions on borrow-
ing. If all investors had complete information sets the model reduces to the
standard SL CAPM, otherwise the market portfolio is not mean-variance ef-
ficient. Information costs come in the form of gathering and processing data,
transmitting information, and most importantly, making investors aware of
the firm.
The return generating process is
\[ \tilde{R}_k = \bar{R}_k + b_k \tilde{Y} + \sigma_k \tilde{\varepsilon}_k . \]
An investor is informed about asset k if he knows $\{\bar{R}_k, b_k, \sigma_k\}$. All informed

investors have conditionally homogeneous beliefs. This structure is similar to
the single asset model of Grossman and Stiglitz (1980), but here there is no
gaming between the informed and uninformed because investors only invest
in securities about which they are informed. The shadow cost of not knowing
about an asset is the same for all uninformed investors and is equal to the
expected excess return on the asset. The equilibrium expected return is
\[ \bar{R}_k = R + b_k b \delta + \delta x_k \sigma_k^2 / q_k . \]
This equation shows the expected return decreases when the investor base
increases.
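
A small numerical sketch makes the comparative static concrete. The function below simply evaluates the expression for the equilibrium expected return; the function name and the parameter values are hypothetical, the notes do not define δ and b at this point so they are treated as given positive constants, and restricting q_k to the interval (0, 1] (a fraction of investors) is an assumption.

```python
def merton_expected_return(R, b_k, b, delta, x_k, sigma2_k, q_k):
    """Evaluate R + b_k*b*delta + delta*x_k*sigma2_k/q_k.

    q_k is read as the fraction of investors who know about firm k,
    so it is restricted to (0, 1] here (an assumption of this sketch).
    """
    if not 0.0 < q_k <= 1.0:
        raise ValueError("q_k must lie in (0, 1]")
    return R + b_k * b * delta + delta * x_k * sigma2_k / q_k


# Hypothetical parameters; only the investor base q_k differs.
narrow_base = merton_expected_return(1.05, 1.0, 1.0, 2.0, 0.01, 0.09, q_k=0.10)
broad_base = merton_expected_return(1.05, 1.0, 1.0, 2.0, 0.01, 0.09, q_k=0.50)
print(narrow_base > broad_base)
```

Only the last term depends on q_k, so widening the investor base lowers the premium for firm-specific risk and leaves the common-factor premium unchanged.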
The model makes several predictions. A large common-factor exposure
($b_k$), large size ($x_k$), or large variance ($\sigma_k^2$) creates high expected returns.
When the firm is well-known or has a large investor base ($q_k$) the expected
return is smaller. This may give rise to a size effect. These effects can give
rise to downward-sloping demand curves. Expansion of the firm’s investor
base and increases in investment will tend to coincide, giving a motivation for
an underwritten offer instead of a rights offer. Managers have an incentive to
expand the investor base, especially for relatively unknown firms and those
with large firm-specific variances. This can explain why firms advertise their
stock and invest in generating interest in the firm by the financial press. The
model is also consistent with IPO waves in general and concentration within
an industry.
Kadlec & McConnell (1994)

Kadlec and McConnell (1994) use exchange listing to test the predictions of
the Merton (1987) model of investor recognition and the Amihud & Mendel-
son (1986) model of liquidity factors. In the former model expected returns
decrease as the size of the investor base grows. In the latter, expected returns
decrease with a reduction in the relative bid-ask spread. If the expected re-
turn decreases then the market value should increase and abnormal returns
should be positive.
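
The link from a lower expected return to a positive abnormal return can be illustrated with a level perpetuity. This valuation model is an assumption made only for the example (it is not taken from the paper), and the function names and rates are hypothetical.

```python
def perpetuity_value(cashflow, required_return):
    """Value of a level perpetuity: cashflow / required return."""
    if required_return <= 0:
        raise ValueError("required_return must be positive")
    return cashflow / required_return


def revaluation_return(old_return, new_return):
    """One-time price change when the required return moves from
    old_return to new_return and the expected cashflow is unchanged."""
    return perpetuity_value(1.0, new_return) / perpetuity_value(1.0, old_return) - 1.0


# Hypothetical: the required return falls from 10% to 9.5% on listing.
jump = revaluation_return(0.10, 0.095)
print(round(jump, 4))
```

Under this assumption any reduction in the required return, whether from a wider investor base or a narrower spread, shows up as a one-time positive revaluation.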
During the 1980s, announcement of NYSE listing results in an abnormal
return of 5 to 6%. The listing is also associated with a 19% increase in
the number of shareholders, a 27% increase in institutional ownership, a 5%
reduction in absolute bid-ask spreads, and a 7% reduction in relative spreads.
The results are consistent with both models. The proxy for Merton’s
shadow cost of incomplete information is the inverse of the change in investor
base scaled by the level of firm-specific risk and market value. Controlling
for the change in bid-ask spread, an increase in investor base results in a
positive abnormal return. Controlling for change in investor base, a decrease
in the spread is associated with higher abnormal returns.
Loderer, Cooney & VanDrunen (1991)
Loderer, Cooney, and VanDrunen (1991) isolate and identify the potential
influence of price elasticity on demand using the price discount from SEOs
by regulated firms. Regulated firms are used because they are more likely to
have preferred stock and less likely to have information asymmetries. If the
stock issuance announcement contains negative information there should
be a negative reaction for preferred stock as well. The evidence supports
the incomplete markets theory of Merton (1987), but is inconclusive with
respect to theories of liquidity or heterogeneous beliefs.
To estimate the determinants of elasticity, INVELAS (see footnote 8) is regressed on
variance (–), size (–), investor base/liquidity (+), and proxies for information
effects (+). To capture information effects the authors consider ∆E[EPS],
∆EPS, ∆ROE, and the price change of nonconvertible preferred stock at the
announcement. The results are significant and consistent with predictions for
all variables except liquidity and information, which are insignificant. These
results are robust to a number of different specifications and proxy variables.
A potential caveat is the predictability of issuance by regulated firms may
make it difficult to detect information effects.
5.16 Financial Innovation

When there are market imperfections there may be structures of claims that
have special value. Just as the prior section dealt with imperfections and asset
demands, this section addresses the effects on the supply of securities. Topics
covered here include optimal financial contracts, the incentives to innovate,
and the existence of clienteles.
Footnote 8: This is the inverse of elasticity. The inverse introduces a nonlinearity
in the model that may result in mis-specification.
Zender (1991)
Zender (1991) develops a model of the optimal financing contract that incor-
porates both cashflow and control allocations. Most existing theories focus
only on cashflows. The optimal financial instruments completely resolve in-
centive problems induced by asymmetric information. In the setting of the
paper, standard debt and equity contracts are optimal. Bankruptcy broad-
ens the investment opportunity set and facilitates cooperation between the
parties.
In the model there are three agents: an entrepreneur, an active owner,
and a passive owner. The owners are risk-neutral and have limited capital.
At t = 0 contracts are designed and sold and the initial investment I0 is
made. At t = 1 information about CF3 is made public, the firm receives
CF1, and control at t = 2 is assigned. At t = 2 the firm has an
investment opportunity that the controlling owner observes but of which the
public knows only the distribution. The investment requires an outlay I that
is unobservable to outsiders. At t = 3 the investment payoff is realized and
the firm is liquidated.
There is disagreement among the agents about investment/dividend pol-
icy due to the passive investor’s inability to observe investment expenditures.
The agents realize up front that risk-shifting may occur and they mitigate
it by inducing a state-contingent control change when an observable signal
is realized. Cashflows to debt must be fixed in order to provide the equity-
holder incentives to make efficient investments. This can explain the use of
debt before tax shields.
Tufano (1989)
Tufano (1989) examines “innovative” investment banks and the benefits from
innovation. He finds that innovators generally do not charge monopoly prices
(underwriting spread). Instead, they charge lower long-run prices and gain
market share. One interpretation is that innovation can reduce costs of
trading, underwriting, and marketing.
To identify the importance of price as a source of first-mover advantage,
underwriting spreads are regressed on measures of competitiveness,
underwriter identity, and control variables for offering characteristics. A
dummy variable for the monopoly period is insignificant for all offers and
negative for imitated products. Permanent price effects as measured by a
pioneer dummy are reliably negative.

The long-run quantity effects appear to be an important source of first-
mover advantage. Pioneers capture market share nearly 2.5 times as large
as imitators. Temporary quantity effects due to periods of monopoly are
not important since the number of deals is small and imitators are quick to
follow.
Kim & Stulz (1988)
Kim and Stulz (1988) directly test the clientele hypothesis, which says that
firm value can be increased by seeking funding from groups with unique
demands. The evidence is consistent with this hypothesis.
The authors focus on Eurobonds from U.S. firms that also issue domestic
debt. Eurobonds are generally bearer bonds, allowing the holder to escape
taxes. There are some questions over the enforceability of the bond indenture,
so reputation replaces restrictive covenants. Foreign investors may desire
these securities because they offer diversification yet have smaller purchasing
power and political risks. This market is characterized by larger underwriting
spreads.
If the supply of Eurobonds is not perfectly elastic then excess demand
can create profitable financing opportunities since investors will accept lower
yields. The supply of these securities is somewhat constrained because of the
high issuance costs, low risk requirement, and reputational capital required.
The results indicate there are positive abnormal returns at the announce-
ment of Eurobond issues. A comparison sample of domestic debt issues
shows no significant announcement effect, as in Mikkelson and Partch (1986).
The positive abnormal returns occur mostly during the 1979–1982 period of
bought-deal underwriting when yield spreads were large. This type of ar-
rangement reduced the time it takes to issue Eurobonds. The positive abnor-
mal returns diminished in subsequent years when shelf registration increased
the attractiveness of domestic issues. Abnormal returns were indistinguish-
able from zero when tax laws ended the withholding tax for foreign investors
in domestic bonds. The clientele hypothesis is tested by regressing abnormal
returns on the size of the financing bargain. They find a slope coefficient
different from zero but not different from one, consistent with the clientele
hypothesis.
Jung, Kim & Stulz (1996)

The paper by Jung, Kim, and Stulz (1996) finds that some firms appear
to issue equity for the benefit of managers rather than shareholders. See
Section 5.9.6 for a more complete discussion of this paper.
McConnell & Schwartz (1992)

McConnell and Schwartz (1992) describe the process leading up to the devel-
opment of the Liquid Yield Option Note (LYON) by Merrill Lynch in 1985.
This is a zero-coupon, callable, convertible, putable instrument. This instru-
ment is designed to reduce the transactions costs associated with a strategy
of investing in options and the money market. These investors desire a risky
investment paying interest but preserving the principal, much like portfo-
lio insurance. The value of the security is relatively insensitive to the risk
of the company, reducing the cost of information asymmetries. There is a
self-selection by firms since only those with the most confidence in their
prospects will issue. When pricing the instrument it is important to consider
the interaction/covariance between the various components.
Chapter 6
Market Microstructure
6.1 Introduction
Information economics deals with incorporating information into asset prices.
Market microstructure is the study of the process and outcomes of exchanging
assets under explicit trading rules. The focus is often on the interaction
between the mechanics of the trading process and its outcomes, with specific
emphasis on how actual markets and intermediaries behave.
Randomness is an important part of any of these models. The source
of the randomness has implications for the characteristics of the model. In
all cases there is uncertainty about future outcomes or cashflows. Informed
agents have imperfect information about the future value of the asset. This
information may be the same for all informed agents, or they may each
have diverse signals. Some models include uninformed agents whose demands
depend on price. Noise trading is an additional source of randomness that
makes the net demand for the asset uncertain and prevents
fully revealing prices.
A major difference in the models is whether trades are processed in a
batch or sequentially. The latter allows dynamics in the price process and
facilitates analysis of the bid-ask spread. The literature is fragmented in
its view of the risk preferences of specialists.
There are several important idiosyncrasies in early papers that much of
the subsequent work tries to solve. The first is a paradox where agents
ignore their private information when prices are fully revealing. If this is
true, then how does the private information get into prices in the first place?
A second paradox arises when private information is costly and prices are
fully revealing. If so, then there is no incentive for collection of private
information, and this private information will then never become impounded
in the prices. Finally, there is the schizophrenia result where rational agents
in a competitive market act as price takers, ignoring the impact their trades
will have on the price. The first two problems are solved by introducing noise
trading. Allowing imperfect competition solves the last problem.
6.2 The Value of Information

Hirshleifer (1971)
Hirshleifer (1971) analyzes the private and social value of private information
in a context of uncertain personal productivity. A distinction is made
between foreknowledge, knowing something in advance of its occurrence, and
discovery, recognition of something (that may have already occurred) which
is not readily observable. Hirshleifer argues that there is no social value to
foreknowledge without production. This is because information is valuable
only if it can affect actions. Under the assumptions in the paper, agents
have the same endowments, preferences, and beliefs so there is no incentive
to trade. If the informed agent could speculate the information would be
privately valuable.
In a production economy, foreknowledge is both privately and socially
valuable. This is because production can be shifted to the optimal channels
based on this information. The informed agent can sell his information so
that the economy can fully use it in redirecting productive activities. This
has implications for the timing of information releases. Announcements at
regularly scheduled intervals allow risk-averse agents to insure, or to get out
of the way, before the news is announced. Random releases of news as it occurs
allow more efficient reallocation of production, but expose the agents to
distributive risk. The same general results obtain with discovery information.
Marshall (1974)
Marshall (1974) shows that information can be socially valuable even in a

pure exchange economy if agents have heterogeneous priors. With homoge-
neous beliefs, private information has no social value if it can be hedged and
it reduces value if it can not be hedged. In a production economy informa-

tion is socially valuable with sufficient hedges. Marshall says that there is an
overincentive to produce private information.
6.3 Single Period REE

Prices reflect traders’ information in a securities market. In a Rational Expec-
tations Equilibrium (REE) traders with heterogeneous information attempt
to infer the information of others from the prices, and then use this infor-
mation to revise their beliefs. There are several problems with the REE
concept. Unless noise is added, prices are typically fully revealing [Grossman
(1976)]. Fully revealing prices preclude speculative trading on the basis of
heterogeneous beliefs, giving the “no trade” result of Tirole (see footnote 1). If traders are
allowed to condition on trades as well as prices, then these data are sufficient
statistics for all information and there is no advantage to being informed. Fi-
nally, without restricting the distribution of information, there is no trading
mechanism that could implement an REE.
Milgrom & Stokey (1982)

Milgrom and Stokey (1982) is a base-case for the information content of
trades. The model imposes very restrictive assumptions such as complete
markets and concordant beliefs. Agents are risk-averse. Under these condi-
tions, a “no trade” result obtains. Once ex ante trading occurs to a Pareto
optimal level, no future trading will take place although prices may change.
This is because anyone willing to trade must have private information. Other
agents will realize this and will be unwilling to trade since they all interpret
information in the same way.
Footnote 1: The no trade result of Milgrom and Stokey (1982) will obtain with
homogeneous beliefs and a Pareto optimal allocation.
Grossman (1976)
The Grossman (1976) paper deals with the price system as an aggregator of
diverse information. If private signals are identically distributed, then the
price reveals the average of all agents’ information and private information is
redundant given the price. The REE is identical to a Walrasian equilibrium
in an artificial economy where agents share their information before trading.
With complete markets, equilibrium allocations are ex post Pareto efficient.
In this model, agents have CARA utility so there are no wealth effects, but
in an REE there are information effects. A price change affects the desirability
of an asset.
The model specifies that informed trader i knows
\[ y_i = p_1 + \varepsilon_i . \]
The resulting price is $p_0(y_1, \ldots, y_N)$. Prices reflect each agent’s private
information but do not depend on preferences. This results in a paradox:
individuals ignore their own information in favor of the aggregated information,
but if they do ignore their private information, how does it get into prices?
The result that prices perfectly aggregate information is not robust to the
addition of noise, but another paradox remains. If markets are “perfect” and
information collection is costly, then there is no incentive to collect informa-
tion. The agents in this model are “schizophrenic” in that their actions affect
price but they take price as given in determining their demand.
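
The redundancy of an individual signal once the average is known can be checked in a short sketch. It assumes a normal prior on the payoff and independent normal signal errors with a common precision, which is a simplifying assumption of this illustration; the function name and numbers are hypothetical.

```python
def posterior_mean(prior_mean, prior_precision, signals, noise_precision):
    """Posterior mean of the payoff after observing independent signals
    y_i = payoff + noise, each with the same noise precision.

    The signals enter only through their average, which is why a price
    that reveals the average makes each private signal redundant.
    """
    n = len(signals)
    average_signal = sum(signals) / n
    posterior_precision = prior_precision + n * noise_precision
    weighted = prior_precision * prior_mean + n * noise_precision * average_signal
    return weighted / posterior_precision


# Two different signal profiles with the same average (hypothetical values).
spread_out = posterior_mean(0.0, 1.0, [1.0, 3.0], 1.0)
identical = posterior_mean(0.0, 1.0, [2.0, 2.0], 1.0)
print(spread_out == identical)
```

An agent who already sees the average learns nothing further from his own signal, which is the source of the paradox described above.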
Grossman & Stiglitz (1980)

Grossman and Stiglitz (1980) say that informationally efficient markets can
not exist. If private information is costly, but has no value, then there is
no incentive to collect it. This paper differs from Grossman (1976) in that
it is a model of asymmetric information rather than diverse information. It
endogenously derives the allocation of pretrading information, whereas most
other papers take it as exogenous.
The model is based on perfect competition, one-shot trading, and a Wal-
rasian auctioneer. The return on the risky asset is
u = θ + ε.
An agent can pay c to observe θ. Informed agents receive the same signal and
all agents have negative exponential utility with risk aversion parameter a.
Table 6.1: Summary of Key Models

Paper                            MM    Inf.  Uninf.(a)  Noise(b)  Comments
Milgrom and Stokey (1982)              MAC   —          No
Grossman (1976)                        MAC   —          No        p_i = P + ε_i
Grossman and Stiglitz (1980)           MAC   MAC        Yes       r = θ + ε
Hellwig (1980)                         MAC   —          Yes       diverse info
Diamond and Verrecchia (1981)          MAC   —          Yes       diverse info, noise in endowments
Admati (1985)                          MAC   —          Yes       multiple assets
Kyle (1989)                            MAU   MAU        Yes       i_n = v + e_n
Kyle (1985)                      SNC   SNU   —          Yes       dynamic model
Admati and Pfleiderer (1988)     SNC   MNC   M          Yes
Admati and Pfleiderer (1989)     MN    MNU   M          Yes
Foster and Viswanathan (1990)    SC    SU    C          Yes
Slezak (1994)                          MAC   MAC        Yes       multi-period generalization of GS
Amihud and Mendelson (1980)      SNU   M     —          Yes       spread = cost; is MM competitive?
Glosten and Milgrom (1985)       NC    MN    —          Yes
Glosten (1989)                   SNU   MA    —          Yes       all traders have liq. and info.
Rock (1989)                      SA    MA    MN                   Inf. = Mkt. orders, Uninf. = limit

Codes in Table: S: Single, M: Multiple; A: Risk-averse, N: Risk-neutral; C: Competitive, U: Uncompetitive.
(a) Uninformed traders whose demands depend on price.
(b) Noise generally refers to liquidity traders, whose demands do not depend on price.

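The informed and uninformed have demands
\[ X_I(\cdot) = \frac{\theta - Rp}{a\sigma_\varepsilon^2} \quad\text{and}\quad X_U(\cdot) = \frac{E[\tilde{u} \mid \tilde{P}(\cdot) = p] - Rp}{a\,\mathrm{var}(\tilde{u} \mid \tilde{P} = p)} . \]
For markets to clear
\[ \lambda X_I + (1 - \lambda) X_U = x . \]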
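
The demand functions and the clearing condition can be sketched numerically. This is a partial illustration, not the full rational expectations solution: the uninformed conditional mean and variance are passed in as fixed numbers, whereas in equilibrium they depend on the price function itself. Reading λ as the fraction of informed agents, x as per capita supply, and R as the gross riskless return are assumptions of this sketch, and all names and values are hypothetical.

```python
def cara_demand(expected_payoff, price, gross_riskless, risk_aversion, payoff_variance):
    """CARA-normal demand: (E[u | info] - R p) / (a var[u | info])."""
    return (expected_payoff - gross_riskless * price) / (risk_aversion * payoff_variance)


def clearing_price(theta, uninformed_mean, uninformed_var, lam, a, sigma2_eps, R, x):
    """Solve lam * X_I + (1 - lam) * X_U = x for p, holding the uninformed
    conditional moments fixed (a simplification made for this sketch)."""
    weight_informed = lam / (a * sigma2_eps)
    weight_uninformed = (1.0 - lam) / (a * uninformed_var)
    numerator = weight_informed * theta + weight_uninformed * uninformed_mean - x
    return numerator / (R * (weight_informed + weight_uninformed))


# Hypothetical parameter values.
p = clearing_price(theta=1.2, uninformed_mean=1.0, uninformed_var=0.8,
                   lam=0.4, a=2.0, sigma2_eps=0.5, R=1.05, x=0.1)
x_i = cara_demand(1.2, p, 1.05, 2.0, 0.5)
x_u = cara_demand(1.0, p, 1.05, 2.0, 0.8)
print(abs(0.4 * x_i + 0.6 * x_u - 0.1) < 1e-12)
```

Because both demands are linear in p, the clearing price is a precision-weighted average of the two groups' expected payoffs, less a term for supply, discounted at R.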
In Grossman (1976) there is a paradox since agents ignore their own

information, yet prices perfectly aggregate this information. A solution is
to introduce noise in the form of liquidity traders. Now prices are not fully
revealing, private information still has value, and trading based on common
beliefs is possible. With dynamic trading the market maker can break even
on average, not on every trade [see Glosten (1989)].
If competition is imperfect, the equilibrium price reveals less information,
although the price is determined as if a nontrading auctioneer aggregated de-
mand curves. If a market maker replaces the auctioneer, one needs to ask
what services the market maker is providing. Many papers take the posi-
tion that the market maker is an information processor. The informational
component of the spread is proportional to the probability of trading with
an informed agent and also proportional to the informed trader’s expected
profit from holding the asset. Spreads will be larger for larger quantities.
Hellwig (1980)
Hellwig (1980) attempts to avoid the schizophrenic agents in Grossman (1976)
by enlarging the economy. The model takes the limit of the incorrect econ-
omy, rather than fixing the problem. In other words, this solution is essen-
tially “at the limit” rather than “in the limit,” leaving open the question of
how an economy becomes large in the first place. The paper is still important
in that it shows the schizophrenia problem may be small when the economy
is large.
The model is basically an extension of Grossman (1976), but with the
addition of noise in the supply of the risky asset. The amount of information
also grows with the size of the economy [Kyle (1989) holds it fixed]. In a finite-
agent economy when the noise is small, the price becomes fully revealing, as
in Grossman. Upon enlarging the economy, the prices do not fully reflect the
information of the informed agents. Individuals find their own information to
be incrementally informative to the price alone. The strength of an agent’s

reaction to his signal is inversely related to his risk aversion and the noisiness
of his signal.
Diamond & Verrecchia (1981)

The Diamond and Verrecchia (1981) model of a competitive market yields
prices that only partially aggregate diverse information and are therefore
not fully revealing. Prices deviate from the “efficient” level by a random
amount. Noise is explicitly modeled as random endowments in the risky
asset. If individual endowments are iid, per capita supply is constant in the
limit and the model approaches the Grossman (1976) fully revealing model.
If the variance of individual endowments grows with the population, the limit
is Hellwig (1980).
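
The two limits can be seen from the variance of per capita supply. The sketch below assumes independent endowments with a common variance, and the function name and numbers are hypothetical.

```python
def per_capita_supply_variance(n_agents, endowment_variance):
    """Variance of the average of n independent endowments, each with
    variance endowment_variance: Var(sum / n) = endowment_variance / n."""
    return endowment_variance / n_agents


for n in (10, 1_000, 100_000):
    # Fixed individual variance: supply noise vanishes as n grows.
    fixed = per_capita_supply_variance(n, 1.0)
    # Individual variance proportional to n: supply noise survives.
    growing = per_capita_supply_variance(n, 1.0 * n)
    print(n, fixed, growing)
```

With a fixed individual variance the noise in per capita supply disappears and prices become fully revealing; when the individual variance grows in proportion to the population the noise persists in the limit.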
Admati (1985)
Admati (1985) is a multisecurity version of Hellwig (1980). Investment deci-
sions are based on MV considerations, but each agent in effect uses a different
model since they condition on different information. These conditional models
do not naturally aggregate to imply similar unconditional models. Therefore,
the market is generally not MV efficient for any particular information
set, including all public information. Uncertainty about the supply of one
asset may prevent the prices of other assets from being fully revealing. This
may represent a solution to the Grossman and Stiglitz (1980) paradox.
The correlations among the assets can result in a number of strange re-
sults. Price may be decreasing in the profitability of an asset or increasing
in its supply. The predicted payoff of an asset may be decreasing in price.
Finally, the demand for an asset may be increasing in its price.
Kyle (1989)
The Kyle (1989) paper solves the schizophrenia problem by allowing imperfect
competition. The model uses noise traders, uninformed traders, and
multiple informed speculators in a static model. A Walrasian auctioneer accepts
limit orders. The informed speculators receive independent, normally
distributed noisy signals of the asset value. Traders have negative exponen-
tial (CARA) utility.
The value of the asset is given by $\tilde v$ with variance $\tau_v^{-1}$. Noise traders
have random demands $\tilde z$ with variance $\sigma_z^2$. There are $N$ informed agents
with information $\tilde i_n = \tilde v + \tilde e_n$ where $\mathrm{var}(\tilde e_n) = \tau_e^{-1}$. There is a symmetric
linear equilibrium with informed demands $X_n(p, i_n) = \mu_I + \beta i_n - \gamma_I p$ and
uninformed demands $X_m(p) = \mu_U - \gamma_U p$.
If all information could be combined, the precision of the forecast would
be $\tau_F = \mathrm{var}^{-1}(\tilde v \mid \tilde i_1, \ldots, \tilde i_N) = \tau_v + N\tau_e$. The precisions for the informed
and uninformed are $\tau_I = \mathrm{var}^{-1}(\tilde v \mid \tilde p, \tilde i_n) = \tau_v + \tau_e + \psi_I(N - 1)\tau_e$ and
$\tau_U = \mathrm{var}^{-1}(\tilde v \mid \tilde p) = \tau_v + \psi_U N\tau_e$. The terms $\psi_I$ and $\psi_U$ represent the fraction of
information that prices make available to each type of agent. When $\psi = 0$ prices are
uninformative and when $\psi = 1$ prices are fully revealing. Expressions for these
terms are
$$\psi_U = \frac{N\beta^2}{N\beta^2 + \sigma_z^2\tau_e} \quad\text{and}\quad \psi_I = \frac{(N - 1)\beta^2}{(N - 1)\beta^2 + \sigma_z^2\tau_e}.$$
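As a numerical illustration of the precision formulas above, the following Python sketch evaluates the two ψ terms and the implied precisions. In the model β is determined in equilibrium; here it is simply taken as an input. The function name and the example parameter values are illustrative assumptions, not part of the model's solution.

```python
def kyle89_precisions(N, beta, tau_v, tau_e, sigma_z2):
    """Evaluate psi_U, psi_I and the precisions tau_F, tau_I, tau_U.

    Assumption: beta (the informed traders' signal intensity) is treated
    as given rather than solved for in equilibrium.
    """
    noise = sigma_z2 * tau_e
    psi_U = N * beta**2 / (N * beta**2 + noise)
    psi_I = (N - 1) * beta**2 / ((N - 1) * beta**2 + noise)
    tau_F = tau_v + N * tau_e                        # all signals pooled
    tau_I = tau_v + tau_e + psi_I * (N - 1) * tau_e  # own signal plus price
    tau_U = tau_v + psi_U * N * tau_e                # price only
    return {"psi_U": psi_U, "psi_I": psi_I,
            "tau_F": tau_F, "tau_I": tau_I, "tau_U": tau_U}


# Hypothetical parameter values, chosen only for illustration.
example = kyle89_precisions(N=5, beta=1.0, tau_v=1.0, tau_e=1.0, sigma_z2=1.0)
```

With no noise trading (`sigma_z2 = 0`) both ψ terms equal one and both precisions reach the pooled value, which is the fully revealing case described in the text.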
The model yields prices that are less revealing than in the
perfect competition case. The uninformed break even on average and the
informed profit at the expense of the noise traders. An increase in the number
of uninformed, M, or a decrease in their risk aversion ρU increases the information
effect. As M → ∞, E[ṽ|p] = p, a martingale result. An increase in the
number of informed or a decrease in per capita noise trading increases ψI .
In the limiting economy as N → ∞, τe = τE /N where τE is fixed. The
uninformed do not trade (??). As the informed become risk-neutral, prices
become fully revealing in the competitive case, but only half as revealing in the
imperfect competition case. Endogenizing information acquisition overcomes
the schizophrenia problem.
This equilibrium is different from the competitive outcome since informed
traders now take into account the effect of their actions on the market price.
In this case, traders must know the pricing function, the number of other
traders, and all other agents’ demand schedules. By accounting for their
impact on price, traders no longer completely trade away their informational
advantage.
6.4 Batch Models

This section begins with the analysis of market orders in the Kyle (1985)
model. A market maker observes the net order flow and sets a single price
at which all orders are cleared. Without price-contingent orders, it is not
possible to explore the bid-ask spread or transaction prices. This framework
does allow analysis of the effect informed traders’ strategies have on prices.
Kyle was the first to develop a model of this nature. Price-contingent orders
are taken up in Kyle (1989). The strategic actions of uninformed agents
are covered in models such as Admati and Pfleiderer (1988), Admati and
Pfleiderer (1989), and Foster and Viswanathan (1990).
Kyle (1985)
The classic Kyle (1985) model uses a single risk-neutral informed trader, a
group of noise traders, and a single risk-neutral market maker. The model
is dynamic, allowing an analysis of trading strategies over time.
The model is presented first in a single period setting. The random future
asset value is ṽ, which only the informed trader can observe. The market
maker does not explicitly know v, but knows $\tilde v \sim N(p_0, \Sigma_0)$. The uninformed
traders provide noise in the aggregate order flow $(\tilde x + \tilde u)$, thereby preventing
the market maker from perfectly inferring v. These noise traders submit
orders for $\tilde u \sim N(0, \sigma_u^2)$.
The informed trader, who has rational expectations, knows the pricing
function and the distribution of noise trades. He chooses an order quantity
to maximize his expected profits
X(v) = argmax E[Π(X(·), P (·))|v].
The informed trader does not know the price at which his order will be filled.
The equilibrium² is the pair P(·) and X(·). The market maker sets price
equal to the expected value of v conditional on observing x + u,
$$P(x + u) = E[\tilde v \mid x + u].$$
The equilibrium is
$$X(\tilde v) = \beta(\tilde v - p_0), \qquad P(\tilde x + \tilde u) = p_0 + \lambda(\tilde x + \tilde u),$$
where $\beta = \sqrt{\sigma_u^2/\Sigma_0}$ and $\lambda = \frac{1}{2}\sqrt{\Sigma_0/\sigma_u^2}$. The market maker can use his
knowledge of X(·) to observe a random variable $\sim N(v, \sigma_u^2/\beta^2)$. Using Bayes’
rule, his posterior on v is $N(p_0 + \lambda(x + u), \Sigma_0/2)$.

² This setup is not game-theoretic, but can be made so by including additional market
makers with identical information, or by giving the market maker an objective function.
The equilibrium then is such that each player’s strategy is a best response given his
information at each stage in the game.

To derive the equilibrium, suppose that P and X are
linear functions with coefficients µ, λ, α, and β:
$$P(y) = \mu + \lambda y \quad\text{and}\quad X(v) = \alpha + \beta v.$$
The expected profit for the informed agent given his signal is
$$E[\tilde\Pi \mid \tilde v = v] = E\{[\tilde v - P(x + \tilde u)]x \mid \tilde v = v\} = (v - \mu - \lambda x)x.$$
Profit maximization gives the FOC $v - \mu - 2\lambda x = 0$, or $X(v) = \alpha + \beta v$ with
$\alpha = -\mu\beta$ and $\beta = 1/(2\lambda)$. The market efficiency condition is
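$$\mu + \lambda y = E[\tilde v \mid \alpha + \beta\tilde v + \tilde u = y].$$
Normality makes the regression linear. Applying the projection theorem gives
$$\lambda = \frac{\mathrm{cov}(v, y)}{\mathrm{var}(y)} = \frac{\beta\Sigma_0}{\beta^2\Sigma_0 + \sigma_u^2}$$
and
$$\mu - p_0 = -\lambda(\alpha + \beta p_0).$$
Solving, we get $\mu = p_0$ and $\alpha = -\beta p_0$.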
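The derivation can be checked numerically. The Python sketch below computes the closed-form coefficients and evaluates the three conditions used above (the informed trader's first-order condition, the projection formula for λ, and the intercept condition). The function names and parameter values are illustrative assumptions.

```python
import math


def kyle85_single_period(p0, Sigma0, sigma_u2):
    """Closed-form coefficients (mu, lam, alpha, beta) of the linear equilibrium."""
    beta = math.sqrt(sigma_u2 / Sigma0)
    lam = 0.5 * math.sqrt(Sigma0 / sigma_u2)
    mu = p0
    alpha = -beta * p0
    return mu, lam, alpha, beta


def equilibrium_gaps(p0, Sigma0, sigma_u2):
    """Residuals of the equilibrium conditions; each should be zero."""
    mu, lam, alpha, beta = kyle85_single_period(p0, Sigma0, sigma_u2)
    foc_gap = beta - 1.0 / (2.0 * lam)                                # beta = 1/(2 lam)
    proj_gap = lam - beta * Sigma0 / (beta**2 * Sigma0 + sigma_u2)    # projection theorem
    intercept_gap = (mu - p0) + lam * (alpha + beta * p0)             # mu - p0 = -lam(alpha + beta p0)
    return foc_gap, proj_gap, intercept_gap


# Hypothetical values: prior mean 10, prior variance 4, noise variance 1.
gaps = equilibrium_gaps(p0=10.0, Sigma0=4.0, sigma_u2=1.0)
```

For these values β = 0.5 and λ = 1, so the informed trader trades half a unit per dollar of mispricing and each unit of net order flow moves the price by one dollar.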

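Several characterizations can be made. The unconditional expected
profit to the informed is
$$E[\tilde\Pi] = E\{E[\tilde\Pi \mid v]\} = E[(v - p_0 - \lambda x)x] = E[\beta(1 - \lambda\beta)(v - p_0)^2] = \tfrac{1}{2}\beta\Sigma_0 = \tfrac{1}{2}(\Sigma_0\sigma_u^2)^{1/2}.$$
The variance of the value conditional on the price is
$$\mathrm{var}(\tilde v \mid p) = E[v - p_0 - \lambda(\alpha + \beta v + u)]^2 = E[(v - p_0)(1 - \lambda\beta) - \lambda u]^2 = (1 - \lambda\beta)^2\Sigma_0 + \lambda^2\sigma_u^2 = \Sigma_0/4 + \Sigma_0/4 = \tfrac{1}{2}\Sigma_0,$$
using $\alpha = -\beta p_0$, $\lambda\beta = 1/2$, and the independence of v and u.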
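These moments can also be checked by simulation. The Python sketch below draws values and noise trades, applies the linear equilibrium strategies, and averages the profits of each party. The function name, the seed, and the parameter values are illustrative assumptions; the sample averages should only approximate the closed-form values.

```python
import math
import random


def simulate_kyle85(p0=10.0, Sigma0=4.0, sigma_u2=1.0, n_draws=100_000, seed=0):
    """Monte Carlo averages for the single-period Kyle (1985) equilibrium."""
    rng = random.Random(seed)
    beta = math.sqrt(sigma_u2 / Sigma0)
    lam = 0.5 * math.sqrt(Sigma0 / sigma_u2)
    informed = noise = maker = sq_err = 0.0
    for _ in range(n_draws):
        v = rng.gauss(p0, math.sqrt(Sigma0))     # asset value
        u = rng.gauss(0.0, math.sqrt(sigma_u2))  # noise order
        x = beta * (v - p0)                      # informed order
        p = p0 + lam * (x + u)                   # market-clearing price
        informed += (v - p) * x
        noise += (v - p) * u
        maker += (p - v) * (x + u)               # market maker takes the other side
        sq_err += (v - p) ** 2
    n = float(n_draws)
    return {"informed_profit": informed / n, "noise_profit": noise / n,
            "maker_profit": maker / n, "var_v_given_p": sq_err / n}


result = simulate_kyle85()
```

With Σ0 = 4 and σu² = 1 the theoretical values are an informed profit of 1, a noise trader loss of 1, a market maker profit of 0, and a conditional variance of 2.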
Note that the noise traders have an expected loss, which can be justi-
fied with liquidity trading arguments. The noise traders’ loss is the informed
trader’s gain. The market maker expects to break even on average by balanc-
ing his loss to the informed with the gain from trading with the uninformed.
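In the discrete time sequential auction there is a unique linear equilibrium.
There are constants $\beta_n$, $\lambda_n$, $\alpha_n$, $\delta_n$, and $\Sigma_n$ such that
$$\Delta\tilde x_n = \beta_n(\tilde v - \tilde p_{n-1})\Delta t_n$$
$$\Delta\tilde p_n = \lambda_n(\Delta\tilde x_n + \Delta\tilde u_n)$$
$$\Sigma_n = \mathrm{var}(\tilde v \mid \Delta\tilde x_1 + \Delta\tilde u_1, \ldots, \Delta\tilde x_n + \Delta\tilde u_n)$$
$$E[\tilde\pi_n \mid p_1, \ldots, p_{n-1}, v] = \alpha_{n-1}(v - p_{n-1})^2 + \delta_{n-1}$$
Given $\Sigma_0$, the constants are a unique solution to a difference equation system
$$\alpha_{n-1} = \frac{1}{4\lambda_n(1 - \alpha_n\lambda_n)}$$
$$\delta_{n-1} = \delta_n + \alpha_n\lambda_n^2\sigma_u^2\Delta t_n$$
$$\beta_n\Delta t_n = \frac{1 - 2\alpha_n\lambda_n}{2\lambda_n(1 - \alpha_n\lambda_n)}$$
$$\lambda_n = \beta_n\Sigma_n/\sigma_u^2$$
$$\Sigma_n = (1 - \beta_n\lambda_n\Delta t_n)\Sigma_{n-1}$$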
The derivation of the above results follows three steps. First, solve for the
informed agent’s trading strategy as a function of the price function. Second,
find the price function that is consistent with market efficiency given optimal
trades. Finally, show the difference equation system implied by the first two
steps has a solution.
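One possible way to solve the difference equation system numerically is by backward iteration, sketched below in Python. Several ingredients are assumptions not stated in these notes: the terminal conditions α_N = δ_N = 0, equally spaced auctions with ∆t = 1/N, and the selection of the root with 0 < λ_n and α_n λ_n < 1. Combining the third and fourth equations gives (1 − λ_n² σ_u² ∆t_n/Σ_n)(1 − α_n λ_n) = 1/2, which is solved for λ_n by bisection. Because the system is homogeneous in Σ, a first pass with Σ_N = 1 gives the ratio Σ_0/Σ_N, and a second pass rescales to match the given Σ_0.

```python
import math


def solve_lambda(alpha_n, sigma_n, sigma_u2, dt):
    """Bisection for the root of (1 - k lam^2)(1 - alpha_n lam) = 1/2."""
    k = sigma_u2 * dt / sigma_n
    hi = 1.0 / math.sqrt(k)
    if alpha_n > 0:
        hi = min(hi, 1.0 / alpha_n)
    lo = 0.0
    for _ in range(200):  # the left side is decreasing on (lo, hi)
        mid = 0.5 * (lo + hi)
        if (1 - k * mid * mid) * (1 - alpha_n * mid) - 0.5 > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def backward_pass(sigma_N, n_auctions, sigma_u2, dt):
    """Iterate from auction N back to 1; returns (Sigma_0, alpha_0, delta_0, path)."""
    alpha, delta, sigma = 0.0, 0.0, sigma_N  # assumed terminal conditions
    path = []
    for n in range(n_auctions, 0, -1):
        lam = solve_lambda(alpha, sigma, sigma_u2, dt)
        beta = lam * sigma_u2 / sigma                       # lam_n = beta_n Sigma_n / sigma_u^2
        sigma_prev = sigma / (1 - beta * lam * dt)          # Sigma_n = (1 - beta lam dt) Sigma_{n-1}
        alpha_prev = 1 / (4 * lam * (1 - alpha * lam))
        delta_prev = delta + alpha * lam**2 * sigma_u2 * dt
        path.append({"n": n, "lambda": lam, "beta": beta, "Sigma": sigma})
        alpha, delta, sigma = alpha_prev, delta_prev, sigma_prev
    path.reverse()
    return sigma, alpha, delta, path


def solve_kyle_discrete(Sigma0, n_auctions, sigma_u2=1.0):
    """Rescale the terminal variance so that the implied Sigma_0 matches Sigma0."""
    dt = 1.0 / n_auctions
    implied, _, _, _ = backward_pass(1.0, n_auctions, sigma_u2, dt)
    return backward_pass(Sigma0 / implied, n_auctions, sigma_u2, dt)


Sigma0_check, alpha0, delta0, path = solve_kyle_discrete(Sigma0=4.0, n_auctions=4)
```

With a single auction the sketch reproduces the single-period values λ = ½√(Σ0/σu²) and Σ1 = Σ0/2, and α0Σ0 + δ0 gives the informed trader's unconditional expected profit.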
In a continuous time setting, cumulative noise trading u(t) follows a Brownian motion. Therefore,
the increments of the uninformed quantity are independent through time. Since this
independence will not be true for the informed trader, there is a linkage between
quantity and information that causes prices to (eventually) reflect all
information. The informed trader need not trade the same amount every period.
He changes his trade size to try to “hide” from the market maker.
The prices have a constant volatility as information is gradually incor-
porated into prices at a constant rate. Prices follow a martingale (and a
random walk), so they are efficient in the semi-strong sense. The informed
trader profits more by continuously trading than by using a mixed strategy
attempting to manipulate prices. You could not tell that there is an informed
trader by looking at prices alone. The continuous time setting makes it possi-
ble to spread information quickly without removing the incentives to acquire
information [Grossman and Stiglitz (1980)].
The speed with which the informed trader pushes prices to the true value
measures resiliency. This speed is the difference between his private infor-
mation and the current price, divided by the remaining trading time. The
depth of the market, constant over time, is proportional to the amount of
noise trading and is inversely proportional to the amount of private informa-
tion. The market is infinitely tight in continuous time.
There are many extensions to the model. You could let the market maker
know more about the distribution of orders than the market as a whole. This
drastically reduces the informed trader’s ability to make profits, and prices
reflect information much more quickly. Another extension allows multiple
informed traders. Foster and Viswanathan (1990) is an example of this, where
the normality assumption is relaxed to elliptical distributions. The result is
that competition among the informed traders forces prices to their full-information levels
almost immediately, eliminating the smoothing behavior. Their work also
shows that the Kyle results may be sensitive to the normality assumption.
Kyle (1989) uses a more complex trading mechanism (limit orders) in a single
period setting to overcome this problem.
6.4.1 Strategic Uninformed Traders

Strategic uninformed trading may allow these agents to reduce their trading
losses. This may create price effects by the uninformed traders, as they
attempt to “hide” from the informed traders. Admati and Pfleiderer (1988)
and Admati and Pfleiderer (1989) examine intraday timing decisions of the
uninformed. Foster and Viswanathan (1990) focus on interday effects as the
levels of public and private information vary across days.
Admati & Pfleiderer (1988)

There is empirical evidence of U-shaped patterns in intraday volume and
volatility. In Admati and Pfleiderer (1988) the risk-neutral³ informed traders
get their information one period before it becomes public knowledge. The
informed then just decide the optimal order size in each period. The unin-
formed discretionary traders can not split trades, but they do decide when to
trade. There are also nondiscretionary traders providing noise in the model.
The competitive informed traders do not consider the price consequences of
their actions. Uninformed traders end up clustering their trades. This clus-
tering can improve the liquidity of the market and reduce their losses to the
informed. The informed traders recognize the clumping of uninformed trades
and will also trade during these periods, intensifying the clustering.
The concentration of discretionary liquidity traders does not affect the
amount of information revealed by prices or the variance of price changes if
the number of informed traders is fixed. This is because there is an increase
in informed trading just enough to keep the informational content the same.
Endogenizing information acquisition intensifies the concentration of trading
and prices become more informative. The liquidity traders are better off with
no informed traders, but if there are any the cost of trading decreases with
the number of informed.
A critical assumption is the independence of trade between periods. Sub-
sequent prices will not reflect previous order flow. If the uninformed are
allowed to split their trades, an equilibrium may not exist, and if it does it
may not be unique. The results are also sensitive to the assumptions about
the risk preferences of the informed traders. If they are risk-averse, then it
may not be the case that periods with more informed traders result in better
prices for the uninformed. Thus, the clumping may not hold if traders are
risk-averse. If uninformed trade flows become more informative over time,
uninformed traders will be more likely to trade early.

³ The results do not change with risk-averse liquidity traders.

Admati & Pfleiderer (1989)

Admati and Pfleiderer (1989) examine patterns in mean returns in a framework
where market makers reduce the adverse selection problem by inducing
patterns in volume and price. By changing the bid or ask commission, the
market maker can change the expected number of liquidity sellers and buyers.
The market maker’s expected loss to the informed decreases with the
commission, but so do his expected profits on the discretionary liquidity
traders. The market maker processes trades in a manner combining some features
of batch and sequential trading. Traders do know the prices at which
they will transact, but prices are updated after every period in time, not
after every trade.
Equilibrium trading results in all discretionary buying occurring in a single
period, and similarly for selling. This is because the liquidity trading reduces
the adverse selection problem. The paper uses a market where traders can
only buy on even days and sell on odd days as an example.
Foster & Viswanathan (1990)

In Foster and Viswanathan (1990) an interday pattern in trading arises be-
cause the informational advantage of the informed decreases over time as the
uninformed infer information from the price and the market maker from the
order flow. The informed trader will be at the greatest advantage when the
market first opens, such as in the morning or on Monday.
The model is an extension of Kyle (1985). There is only one informed
trader and the uninformed act competitively. The ability of the uninformed
to choose when to trade creates the temporal pattern. ?? The sensitivity of
the price to the order flow increases with the amount of information released
by the informed and falls with the amount of liquidity trading. The informed
trades more when there are more liquidity traders to hide his trade. Conse-
quently, he has a higher profit when there is more liquidity trading, or when
he releases more private information. The release of private information will
be smooth throughout the day.
Slezak (1994)
Slezak (1994) develops a multiperiod generalization of Grossman and Stiglitz
(1980) that produces patterns in both the mean and variance of returns
without relying on irrationality, bubbles, or strategic liquidity trading. These
patterns arise because of the effect market closures have on the information
structure in the economy.
The model uses risk-averse agents. Market closures alter investor uncer-
tainty by changing the timing of resolution of uncertainty and by reducing
the informed agent’s comparative advantage at risk bearing. Closures affect
the variance of returns by altering the informativeness of the price. Post clo-
sure prices reflect a greater proportion of private news on the reopening day,
but less private news accumulated over the closure. Preclosure prices are rel-
atively less informative as well. Post closure liquidity costs are higher since
increased adverse selection causes the uninformed to provide less liquidity.
6.5 Sequential Trade Models

Sequential trade models allow for the analysis of bid-ask spreads and the
details of the price process. The main underlying idea is that an informed
trader will prefer to buy when the price is low and sell when it is high. The
market maker will lose money on him if there is a single price. By introducing
a bid-ask spread the market maker can offset the losses to the informed with
gains from the uninformed.
6.5.1 Specialists and Dealers

Amihud & Mendelson (1980)
In Amihud and Mendelson (1980) the (risk-averse ??) market maker maximizes
expected profits by changing the bid and ask. This can give rise to
an asymmetric bid-ask spread as the market maker manages his inventory. This
contrasts with Admati and Pfleiderer (1989) where an asymmetric spread
results from information effects.
In the model the market maker is a monopolist who sets bid and ask
prices ($p_b$ and $p_a$) to maximize expected profits. The quotes are good for a
single transaction. The arrival of buy and sell orders is Poisson with rates
$D(p_a)$ and $S(p_b)$ where $D' < 0$, $S' > 0$. The market maker dislikes extreme
inventory positions because they force him to take transactions under unfa-
vorable conditions. To stay in his desired inventory range the market maker
adjusts bid and ask prices to manage his inventory.
Glosten & Milgrom (1985)

Glosten and Milgrom (1985) model the market maker’s pricing decision in
an environment where he learns from previous trades. In a competitive market,
informed trades will reflect their information. A sell order will lower
the market maker’s expectation, while a buyer will raise his expectation.
The competitive market maker sets the bid and ask such that his expected
profit on any trade is zero. The bid reflects the expected value of an asset
conditional on a sell order arriving. Bayes rule updates the conditional prob-
abilities as trades occur. Since the distribution of trades differs depending on
the true state, the market maker will eventually learn the informed trader’s
information. Many of the results stem from the fact that the informed can
only trade a single unit at a point in time.
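The zero-profit quotes and the Bayesian updating can be illustrated with a common textbook simplification, sketched below in Python. The setup is an assumption and is not the general model of the paper: the asset value is either low or high, a fraction mu of traders is informed and always trades in the direction of the true value, and uninformed traders buy or sell with equal probability. All names are hypothetical.

```python
def gm_quotes(theta, mu, v_low=0.0, v_high=1.0):
    """Zero-profit bid and ask given the prior theta = P(V = v_high).

    Assumed setup: binary value, informed share mu, uninformed traders
    buy or sell with probability 1/2 each.
    """
    p_buy_high = mu + (1 - mu) * 0.5    # P(buy | V high)
    p_buy_low = (1 - mu) * 0.5          # P(buy | V low)
    p_buy = theta * p_buy_high + (1 - theta) * p_buy_low
    post_buy = theta * p_buy_high / p_buy           # Bayes rule after a buy
    p_sell_high = (1 - mu) * 0.5
    p_sell_low = mu + (1 - mu) * 0.5
    p_sell = theta * p_sell_high + (1 - theta) * p_sell_low
    post_sell = theta * p_sell_high / p_sell        # Bayes rule after a sell
    ask = v_low + post_buy * (v_high - v_low)       # E[V | buy]
    bid = v_low + post_sell * (v_high - v_low)      # E[V | sell]
    return bid, ask, post_sell, post_buy


def update_after_trades(theta, mu, trades):
    """Quotes posted before each trade and the belief held after it."""
    path = []
    for side in trades:
        bid, ask, post_sell, post_buy = gm_quotes(theta, mu)
        theta = post_buy if side == "buy" else post_sell
        path.append((bid, ask, theta))
    return path


# Hypothetical example: 20% informed traders and a run of buy orders.
history = update_after_trades(theta=0.5, mu=0.2, trades=["buy"] * 20)
```

In this sketch the spread is zero when there are no informed traders, and a long run of buys pushes the market maker's belief toward the high value while the spread narrows, in line with the learning result described above.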
Prices follow a martingale with respect to the specialist’s and public in-
formation; price changes will be serially uncorrelated. Spreads due to adverse
selection are different from spreads arising from transactions costs, risk aver-
sion, or monopoly power. These other sources of spreads will lead to negative
serial correlation. The spread can be expressed as $\Psi + 2c$, where $\Psi$ is the
adverse selection cost and $c$ is the cost of transacting. The covariance of
adjacent price changes is $-\frac{1}{2}\Psi c - c^2$. This is similar to Roll (1984), with the
addition of the adverse selection component. The variance of a price change
is $\theta^2 + (\Psi/2)^2 + c\Psi + 2c^2$, where $\theta^2$ is the variance of public information
arriving exogenously between trades.
The bid-ask spread reflects the informational asymmetry. With large
volume the spread will be small. As the market maker learns the insider’s
information the valuation of the informed trader and the market maker con-
verge. The spread will increase when the informed traders’ information is
better, insiders become relatively more numerous, or the elasticity of the
supply and demand of uninformed traders increases. If the adverse selection
problem is too large, then the market may collapse as in Akerlof (1970). If
the market closes for this reason the problem only gets worse. Once a market
closes it will stay closed until the information asymmetry is reduced.
Glosten (1989)
When investors trade on private information it can lead to suboptimal risk
sharing if the market maker reduces the liquidity of the market. Glosten
(1989) looks at whether the monopoly power of the specialist can preserve
market liquidity and avoid market failure.
In Glosten and Milgrom (1985) the market maker is competitive. He sets
the price to have a zero expected profit on every trade. When information
asymmetries are large the market may fail completely. Furthermore, market
closure generally makes the information asymmetry worse. By giving the
market maker monopoly power,⁴ he can set prices to average profits across
trades. He will lose on trades with the informed, but compensate with trades
to the uninformed. The result is increased liquidity. The market maker is
risk-neutral so there are no inventory costs associated with risk bearing. The
model also ignores any dynamic trading.
Rock (1989)
Rock (1989) examines the interaction of the specialist’s order book and prices.
Risk-neutral uninformed traders submit limit orders. A risk-averse market
maker competes with the orders in the book. The market maker has two
advantages. First, he knows the size of the trade. Second, he moves second
so he can get out of the way of big trades and fill them from the book,
creating an adverse selection problem. The book orders tend to only get the
unprofitable trades.
Limit orders provide liquidity to the market. These orders have an option
component to them. The order is an obligation to buy or sell at the specified
price. Since the order submitters are writing the option, they receive an
option premium in the form of reduced transactions costs. These investors
avoid the adverse selection component of the bid-ask spread by standing
ready to transact ahead of time.
The assumptions about risk preferences are important in this model. If
the specialist were risk-neutral he would not need to bother with the order
book. It is the risk neutrality of the limit order submitters that gives them
a comparative advantage at risk bearing. If the limit order submitters were
risk-averse they would only submit orders in response to inventory positions,
etc. The risk aversion of the specialist will cause him to take transactions at
prices that may differ from underlying value.
4
The market maker need not literally be a monopolist. He has superior information
about the trading process from his order book, but may face competition from limit orders
and other floor traders. Limit orders allow traders to provide liquidity to the market and
compete with the market maker. What is important is his ability to average profits across
trades.
6.5.2 Other Topics

Trading Volume
Volume of trade generally increases with the precision of private information.
Equilibrium beliefs are not always more homogeneous if information is more
precise.
Sale of Information
People with private information can profit from it by selling it or by trading
on it themselves. The more selling they do, the less valuable the information
is in their trading. The information seller can add noise to the information
(either the same noise for each purchaser or unique noises). Selling can also
be done indirectly, as in a mutual fund.
Regulation
The adverse selection component of trading costs is like a tax on noise traders
that subsidizes the acquisition of private information and its release through
the price system. Regulators can attempt to influence the liquidity of markets
and the informativeness of prices. Attempts to reduce noise trading on the
grounds that it destabilizes prices may not work. It is noise trading that
attracts informed traders to the market in the first place. Reducing noise
trading may actually reduce the informativeness of prices.
6.6 Special Topics

6.6.1 Bubbles
Bubbles deal with deviations from fundamental value. Shiller (1981) is one
of the classic papers in this area. Refer to Section 2.8.6 for more information.
Tirole () has a no trade result in a dynamic context where trade does not
occur because it would burst a bubble.
Blanchard & Watson (1982)

Blanchard and Watson (1982) argue that rational bubbles are possible even
in efficient markets. The market price can be expressed as the fundamental
value plus a bubble
$$p_t = p_t^* + c_t$$
where $E[c_t \mid \Omega_{t-1}] = (1 + r)c_{t-1}$.

A deterministic bubble is given by $c_t = c_0(1 + r)^t$. The bubble grows with
time so that it eventually dominates the fundamental value portion of the
price. Since the growth must continue forever for the price to be rational,
this type of bubble is implausible. A stochastic bubble is created by adding
a random shock to a deterministic bubble.
A stochastic crash takes a value of $c_t = \mu_t + c_{t-1}(1 + r)/\pi$ with probability
$\pi$ and $c_t = \mu_t$ otherwise. In this case $E[\mu_t \mid \Omega_{t-1}] = 0$. This produces a
situation where the bubble will persist with probability $\pi$ or crash. While the bubble lasts, the
average return is greater than r to compensate for the risk of a crash.
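The crashing bubble can be illustrated with a short Python sketch. The function names, the default of no shock (µt = 0), and the parameter values are illustrative assumptions. The first function verifies that the process satisfies the restriction E[ct | Ωt−1] = (1 + r)ct−1; the second simulates one path.

```python
import random


def expected_bubble(c_prev, r, pi):
    """E[c_t | c_{t-1}] for the crashing bubble with a mean-zero shock."""
    survive = c_prev * (1 + r) / pi  # value if the bubble persists
    return pi * survive + (1 - pi) * 0.0


def simulate_bubble(c0, r, pi, n_periods, shock_sd=0.0, seed=0):
    """One simulated path of c_t; shock_sd = 0 switches the shock mu_t off."""
    rng = random.Random(seed)
    path = [c0]
    for _ in range(n_periods):
        shock = rng.gauss(0.0, shock_sd) if shock_sd > 0 else 0.0
        if rng.random() < pi:
            path.append(shock + path[-1] * (1 + r) / pi)  # bubble persists
        else:
            path.append(shock)                            # bubble crashes
    return path


# Hypothetical values: r = 5% and a 90% chance of surviving each period.
growth_while_alive = (1 + 0.05) / 0.9 - 1
path = simulate_bubble(c0=1.0, r=0.05, pi=0.9, n_periods=50)
```

With π = 0.9 and r = 0.05 the bubble grows by about 16.7 percent per period while it survives, which exceeds r; setting π = 1 recovers the deterministic bubble.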
Arbitrage does not eliminate these bubbles. Since the bubble grows in
any of the above cases, as the time horizon becomes infinite the bubble will
be infinitely large. Since some assets, such as bonds, have finite lives the
bubble must be zero at their maturity. Therefore bubbles are ruled out for
these securities. The structure above also rules out negative bubbles since
they imply negative security prices with a positive probability.
Empirically detecting bubbles is challenging. To use the price process
to say something interesting about bubbles requires an understanding of the
fundamental value process, including the information sets available. Tests
for bubbles can be divided into variance bounds and patterns in innovations.
The variance bounds tests, such as Shiller (1981), put upper bounds on the
conditional or unconditional variances of prices relative to the variance of
dividends. The innovation patterns tests look for either runs in shocks or
extreme outliers.
6.6.2 Speculation
Hart & Kreps (1986)
Hart and Kreps (1986) show that, contrary to common belief, speculation can
destabilize prices. Speculators buy when the chances of price appreciation
are high, which is not necessarily when prices are actually low.
6.6.3 Noise
DeLong, Shleifer, Summers, & Waldman (1990)
DeLong, Shleifer, Summers, and Waldman (1990) develop an overlapping generations model with irrational noise traders.
The rational investors do not fully exploit the irrational investors. Their
short-run focus prevents them from completely wiping out the irrational in-
vestors.
“Noise trader risk” is the chance that marketwide irrational beliefs of
the noise traders may become even more irrational before reverting to their
mean. Essentially the noise trader beliefs are slowly mean reverting. If an
arbitrageur has a limited investment horizon there is a chance that the prices
will not return to their true value before he has to close out his position. In
fact, if the beliefs become more irrational the arbitrageur may face a loss.
There are several plausible predictions from the model. Prices are more
volatile with noise trading. If the noise traders’ opinions are stationary there
will be a mean-reverting component to stock returns. Assets may be underpriced
relative to fundamental value, consistent with the equity premium
puzzle.
6.6.4 Cascades
Bikhchandani, Hirshleifer, & Welch (1992)
Bikhchandani, Hirshleifer, and Welch (1992) generalize the idea of IPO cas-
cades in Welch (1992). For the details of cascades refer to the discussion
of the original paper in Section 5.10.1. An information cascade describes
a sequence of decisions where individuals ignore their own private information
in favor of information inferred from the observation of others’ decisions.
Cascades can be reversed by the release of new information.
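The mechanics of a cascade can be sketched in Python under simplifying assumptions that are not taken from these notes: signals are binary ("H" or "L") and equally precise, the prior is uninformative, and an individual who is indifferent follows his own signal (a tie-breaking rule chosen so that actions outside a cascade reveal signals). Under these assumptions a cascade starts once the signals inferred from earlier actions lead by two.

```python
def run_cascade(signals):
    """Sequential adopt/reject decisions with binary, equally precise signals.

    Returns the list of actions and the index of the first individual who
    ignores his own signal (None if no cascade starts).
    """
    d = 0                  # inferred H signals minus inferred L signals
    actions = []
    cascade_start = None
    for i, s in enumerate(signals):
        own = 1 if s == "H" else -1
        if abs(d) >= 2:
            # Public evidence outweighs any single private signal.
            if cascade_start is None:
                cascade_start = i
            action = "adopt" if d > 0 else "reject"
        else:
            total = d + own
            if total == 0:
                action = "adopt" if own > 0 else "reject"  # assumed tie-break
            else:
                action = "adopt" if total > 0 else "reject"
            d += own       # outside a cascade the action reveals the signal
        actions.append(action)
    return actions, cascade_start


# Hypothetical signal sequence: two early H signals start an adoption cascade.
actions, start = run_cascade(["H", "H", "L", "L", "L"])
```

In the example the third individual adopts despite a low signal, and so does everyone after him, because his action conveys no new information to later individuals.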
Chapter 7
International Finance
7.1 Introduction
What distinguishes international finance from traditional finance is the addi-
tion of foreign exchange rate assets, both spot and forward. There are several
measures of returns in international finance. The return from currency speculation
by buying forward and selling spot is $(f_t - s_{t+1})/s_t$. Mean returns are
generally close to zero. Depreciation is defined as $(s_{t+1} - s_t)/s_t$. The forward
premium is $(f_t - s_t)/s_t$.
There are several basic concepts that are important in international finance.
Covered Interest Rate Parity dictates that
$$\exp(r_t^i) = \exp(r_t^j)\frac{F_t^{ij}}{S_t^{ij}} \quad\text{or}\quad r_t^i - r_t^j = f_t^{ij} - s_t^{ij}$$
to prevent arbitrage. Uncovered Interest Rate Parity states that
$$\exp(r_t^i) = \exp(r_t^j)\frac{E[S_{t+1}^{ij}]}{S_t^{ij}} \quad\text{or}\quad r_t^i - r_t^j = E[s_{t+1}^{ij}] - s_t^{ij}.$$
In terms of notation, the ij superscript indicates the price of a unit of currency
j in terms of currency i. Purchasing Power Parity (PPP) is another
no arbitrage condition that says the prices of a good in different countries
must be the same after converting currencies.
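The parity condition and the return measures are easy to compute directly. The Python sketch below uses continuously compounded interest rates, as in the expressions above; the function names and the example numbers are illustrative assumptions.

```python
import math


def covered_forward(spot, r_i, r_j):
    """Forward rate F^{ij} implied by covered interest parity."""
    return spot * math.exp(r_i - r_j)


def return_measures(f, s_now, s_next):
    """Speculative return, depreciation, and forward premium (rates in levels)."""
    return {
        "speculation": (f - s_next) / s_now,
        "depreciation": (s_next - s_now) / s_now,
        "forward_premium": (f - s_now) / s_now,
    }


# Hypothetical example: spot 1.50, home rate 5%, foreign rate 3%.
F = covered_forward(1.50, 0.05, 0.03)
m = return_measures(F, 1.50, 1.52)
```

By construction the forward premium equals the speculative return plus the depreciation, and the log forward premium equals the interest differential.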
Floating rates began in 1973. The period shortly thereafter is known as
the “dirty float” period.
There are two puzzles in international finance. The first is the deviation
of the forward rate from the expected future spot rate. This captures the
difference between covered and uncovered interest parity. The second puzzle
is the home country bias — too little investment in foreign assets.
7.2 Spot Currency Pricing

Lucas (1982)
The Lucas (1982) model extends Lucas (1978) to international asset pric-
ing. In the model there are two infinitely-lived countries with identical
agents. There are two non-storable goods, no production, stochastic en-
dowment shocks, and monetary instability. The model is developed first in
a barter economy, then in a world with a single currency, and finally with
national currencies and flexible exchange rates. Country 0 produces good
X in amounts {ξt }. Similarly, country 1 produces good Y in amounts {ηt }.
Denote the price of Y , in units of X, in state s as pY (s). The prices of the
future streams {ξt } and {ηt } are given by qX (s) and qY (s), again in units of
X.
An agent with wealth θ chooses consumption of (X, Y) at prices $(1, p_Y(s))$
and shares $(\theta_X, \theta_Y)$ of $(\{\xi_t\}, \{\eta_t\})$ at prices $(q_X(s), q_Y(s))$. The agent’s objective is to
$$\max E\left[\sum_{t=0}^{\infty} \beta^t U(X_{it}, Y_{it})\right]$$
subject to a budget constraint and a cash-in-advance constraint. This means
that the value of current period endowments can not be used in trading for
assets or the other consumption good until next period. Each agent can be
viewed as a two member household. One member collects the endowment
and exchanges it for currency while the other uses existing currency to trade
assets and goods. The two members do not interact until the end of the
trading period.
With national currencies, the monetary shocks are given by
$$\Delta M_{t+1} = w_{0,t+1}M_t \quad\text{and}\quad \Delta N_{t+1} = w_{1,t+1}N_t.$$
Within each country the price of the home good in terms of home currency
is
$$p_X(s, M) = M/\xi \quad\text{and}\quad p_Y(s, N) = N/\eta.$$
Also note the price of Y in terms of X can be expressed as $p_Y(s) = \frac{\partial U/\partial Y}{\partial U/\partial X}$.
The exchange rate (currency 0 per unit of currency 1) is
$$e(s, M, N) = \frac{p_X(s, M)}{p_Y(s, N)}\,p_Y(s) = \frac{M/\xi}{N/\eta}\,p_Y(s) = \frac{\pi_Y}{\pi_X}\,p_Y(s)$$
where $\pi_i = 1/p_i(s, \cdot)$ gives the purchasing power for country i. In equilibrium,
$$V_i(t)\pi_i(t)\frac{\partial U(t)}{\partial i} = E\left[\beta\frac{\partial U(t + 1)}{\partial i}\pi_i(t + 1)[V_i(t + 1) + D_i(t + 1)]\right].$$
For a riskless asset
$$B_i(t) = E\left[\beta\frac{\partial U(t + 1)/\partial i}{\partial U(t)/\partial i}\frac{\pi(t + 1)}{\pi(t)}\right] = E[m].$$
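A worked special case may help fix ideas. The Python sketch below evaluates the exchange rate formula under assumptions that are not part of the notes: log utility U(X, Y) = a ln X + (1 − a) ln Y, and identical agents who each consume half of both endowments. The function name and the numbers are hypothetical.

```python
def lucas_exchange_rate(M, N, xi, eta, a=0.5):
    """Exchange rate e = [p_X(s, M) / p_Y(s, N)] * p_Y(s).

    Assumptions: U(X, Y) = a*ln(X) + (1 - a)*ln(Y), and each agent
    consumes half of each country's endowment.
    """
    X, Y = xi / 2.0, eta / 2.0
    p_Y = ((1 - a) / Y) / (a / X)  # marginal rate of substitution U_Y / U_X
    p_X_money = M / xi             # price of X in currency 0
    p_Y_money = N / eta            # price of Y in currency 1
    return p_X_money / p_Y_money * p_Y


# Hypothetical money supplies and endowments.
e = lucas_exchange_rate(M=200.0, N=100.0, xi=50.0, eta=80.0)
```

Under these assumptions the endowments cancel and the exchange rate reduces to (1 − a)M/(aN), so it depends only on relative money supplies and the preference weight.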
7.3 Forward Currency Pricing

Hansen & Hodrick (1983)
Hansen and Hodrick (1983) study the determinants of the risk premium in
foreign exchange rates. This premium arises when the forward rate is not
equal to the expected future spot rate, $F_t \neq E[S_{t+1}]$. The basic idea is to
test the orthogonality condition
$$E[Q_{m,t+k}(s^j_{t+k} - f^j_{t,k})] = 0$$
where $Q_{m,t+k}$ is the IMRS of money.

To make the above condition testable the authors propose three mod-
els. The first is a lognormal model which implies a constant risk premium.
The second uses a riskless nominal rate and assumes a constant conditional
covariance. The third is a latent variable model. The first two models are
rejected, while the third provides some evidence that the risk premium is
important. These tests are joint tests of the orthogonality condition and the
auxiliary restrictions in each of the three models.
Fama (1984)
Fama (1984a) uses the same basic framework as Fama (1984b), which looks at
Treasury bills. The idea here is to determine the information in the forward
premium about forecast errors and changes in the spot rate. This research
shows that excess returns are not only predictable ex ante, but also that the
variance of the predictable component exceeds the variance of the expected
rate change.
The analysis begins with a specification for the components of the forward
rate
$$f_t = E[s_{t+1}] + p_t$$
where the lower case letters indicate logs and pt is the premium. This can
be modified to represent the forward premium, which is then used to predict
the forecast error and spot rate innovation
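$$f_t - s_t = E[s_{t+1} - s_t] + p_t$$
$$f_t - s_{t+1} = \alpha_1 + \beta_1(f_t - s_t) + \varepsilon_{1,t+1}$$
$$s_{t+1} - s_t = \alpha_2 + \beta_2(f_t - s_t) + \varepsilon_{2,t+1}.$$

Since the left-hand sides sum to $f_t - s_t$, adding the last two equations implies that $\alpha_1 + \alpha_2 = 0$, $\beta_1 + \beta_2 = 1$, and
$\varepsilon_{1,t+1} + \varepsilon_{2,t+1} = 0$.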
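The adding-up restrictions hold exactly for OLS estimates in any sample, which the Python sketch below illustrates. It is a minimal illustration with hypothetical function names, not a reproduction of Fama's estimation; the inputs are lists of log forward and spot rates.

```python
def ols(y, x):
    """Simple OLS of y on a constant and x; returns (alpha, beta)."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    return alpha, beta


def fama_regressions(f, s):
    """Regress f_t - s_{t+1} and s_{t+1} - s_t on the forward premium f_t - s_t.

    f[t] is the log forward rate set at t for delivery at t + 1,
    s[t] is the log spot rate.
    """
    T = len(s) - 1
    premium = [f[t] - s[t] for t in range(T)]
    forecast_error = [f[t] - s[t + 1] for t in range(T)]
    spot_change = [s[t + 1] - s[t] for t in range(T)]
    a1, b1 = ols(forecast_error, premium)
    a2, b2 = ols(spot_change, premium)
    return (a1, b1), (a2, b2)
```

Because of this identity, the two regressions contain the same information, and a finding of β2 < 0 is equivalent to β1 > 1.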
Fama finds that both components vary through time, but most of the
variation in the forward rate is due to the premium. The null hypothesis is
β1 = 0 and β2 = 1. The estimated coefficients are β1 > 1 and β2 < 0. Thus,
the premium and expected future spot rate are negatively correlated.
Possible explanations for these findings can be categorized as either a
risk premium story or some type of forecast errors. The risk premium can
arise in either a CAPM or a dynamic general equilibrium setting if investors
have rational expectations. While a risk premium could account for non-zero
excess returns, it does not explain the high variability. Explanations based
on forecast errors may rely on either rational or irrational agents. Examples
of cases with rational investors include learning models and the peso problem.
Mark (1988)
Mark (1988) allows time-variation in beta or the risk premium in a single beta
CAPM to attempt to explain the forward premium puzzle. The conditional
beta comes from an ARCH model. Using GMM, he fails to reject the model,
indicating that there is evidence of time-varying beta. Additional tests reject
the hypothesis of a constant beta.
Froot & Frankel (1989)

Froot and Frankel (1989) use survey data to extend the analysis in Fama
(1984a). They focus on the regression
\[ s_t - s_{t-1} = \alpha + \beta (f_{t-1} - s_{t-1}) + \varepsilon_t \]
and decompose beta into $\beta = 1 - \beta_{re} - \beta_{rp}$. The term $\beta_{re}$ captures failure
of rational expectations while $\beta_{rp}$ represents the risk premium. The priors
are that $\beta_{rp}$ is large and $\beta_{re}$ small, but the authors find the opposite. The
risk premium does not appear to be an economically important source of
the forward premium. The authors fail to reject the hypothesis that all the
bias in the forward premium is due to expectation errors. Contrary to Fama
(1984a), Froot and Frankel find that the variance of expected depreciation
is large relative to the variance of the risk premium and the risk premium is
uncorrelated with the forward discount. This analysis does not incorporate
learning effects or the “peso problem.”
Backus, Gregory & Telmer (1993)

Backus, Gregory, and Telmer (1993) view the evidence on forward premiums
in the same light as the equity premium puzzle. They introduce habit
persistence to get around the high risk aversion implied by models with
representative agents and time-separable utility. The model is tested using GMM
estimation and simulations.
The statistical properties of forward and spot rates imply predictable
returns from speculation. These returns are highly variable and imply a
highly variable pricing kernel. Using GMM, the authors reject models with
power utility and a particular specification of habit persistence. Simulations
are used to place more structure on the theory. The evidence is partially
consistent with the revised theory.
Huang (1989)
Huang (1989) examines the risk-return characteristics of the term structure
of forward FX. The analysis is much like Hansen and Hodrick (1983), but in
a multiple maturity setting. The evidence is that there appear to be some
country-specific effects in the short (1 month) end of the term structure. In
particular, Huang rejects the model using one month forwards, contrary to
Hansen and Hodrick. With 3, 6, and 12 month forwards and with multiple
maturities he fails to reject. These results are important since virtually all
other papers in the literature (at least the ones mentioned here) use one
month forwards. If there are strange influences on this maturity then the
results in other papers may not be robust.
7.4 Integration
Bekaert and Harvey (1995) develop a conditional regime-switching model
where expected returns are a weighted average of returns in integrated and
segmented markets
\[ E[r_i] = \phi \lambda\, \mathrm{cov}(r_i, r_W) + (1 - \phi) \lambda\, \mathrm{var}(r_i) \]
where $\phi$ is the probability the market is integrated. This probability is
estimated with regime switching models assuming constant or time-varying
transition probabilities. The authors find evidence of a time-varying world
price of risk related to the business cycle (the Sharpe ratio is high in a trough).
There is also evidence of time-varying integration for a number of countries.
Evans & Lewis (1995)

Evans and Lewis (1995) study whether long swings in the dollar can affect
risk premium estimates. Using a regime-switching model they find that long
swings make the risk premium appear to contain a permanent disturbance and
can bias the Fama-style forward regressions. The authors are also unable
to reject the restriction that the actual forward premium equals the risk
premium plus the expected change in the exchange rate.
7.5 International Asset Pricing

Skip. Papers: Stulz (1981), Bansal, Hsieh, & Viswanathan (1993), Dumas &
Solnik (1995), Ferson & Harvey (1993).
7.6 Other Topics

Skip. Papers: Bekaert & Hodrick (1992), Engle & Hamilton (1990), Engle,
Ito, & Lin (1990).
Chapter 8
Appendix: Math Results
8.1 Basics
8.1.1 Norms
A norm measures the magnitude of a vector. The Euclidean norm is the
common measure.
\[ \|x\| \equiv \sqrt{x'x} \equiv \left[ \sum_i x_i^2 \right]^{1/2} \]
8.1.2 Moments
Moments describe the characteristics of a distribution. The $i$th moment is
$\mu'_i = E[x^i]$ and the $i$th central moment is $\mu_i = E\left[(x - E[x])^i\right]$. The first
moment is the mean and the second central moment the variance.
8.1.3 Distributions
Normal
If $x \sim N(\mu, \sigma^2)$ the density and characteristic functions are
\[ f(x) = (2\pi\sigma^2)^{-1/2} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right) \]
\[ \phi(t) = \exp\left( i\mu t - \frac{\sigma^2 t^2}{2} \right) \]
Lognormal
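If $x$ is normally distributed, then $z = e^x$ is lognormal (its log is normal).
\[ f(z) = \frac{1}{\sigma z \sqrt{2\pi}} \exp\left( -\frac{(\ln z - \mu)^2}{2\sigma^2} \right). \]
\[ \bar{z} = \exp\left( \mu + \sigma^2/2 \right), \qquad \mathrm{var}(z) = \exp(2\mu + \sigma^2)\left( \exp(\sigma^2) - 1 \right) \]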
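The lognormal mean and variance formulas can be checked against simulated draws. This is a rough sketch using only the standard library; the parameter values and the function names `lognormal_mean` and `lognormal_var` are illustrative assumptions.

```python
import math
import random

def lognormal_mean(mu, sigma):
    """Mean of z = exp(x) when x ~ N(mu, sigma^2)."""
    return math.exp(mu + sigma ** 2 / 2)

def lognormal_var(mu, sigma):
    """Variance of z = exp(x) when x ~ N(mu, sigma^2)."""
    return math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)

random.seed(1)
mu, sigma, n = 0.05, 0.2, 200_000  # hypothetical parameters
draws = [math.exp(random.gauss(mu, sigma)) for _ in range(n)]
sample_mean = sum(draws) / n
sample_var = sum((z - sample_mean) ** 2 for z in draws) / (n - 1)

print(sample_mean, lognormal_mean(mu, sigma))
print(sample_var, lognormal_var(mu, sigma))
```

The sample moments should agree with the closed forms up to simulation error. Note that the mean of $z$ exceeds $\exp(\mu)$ by the factor $\exp(\sigma^2/2)$.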

8.1.4 Convergence
Probability
A sequence of random variables $x_n$ converges in probability to a constant $c$
if
\[ \lim_{n \to \infty} \Pr[|x_n - c| < \delta] = 1 \quad \forall\ \delta > 0. \]
Distribution
A sequence of random variables $x_n$ with cdf $F_n$ converges in distribution to
a random variable $x$ with cdf $F$ if
\[ \lim_{n \to \infty} F_n(x) = F(x) \]
at every point $x$ at which $F$ is continuous.
Almost Sure
A sequence of random variables xn defined on a probability space (Ω, F, P )
converges almost surely to an rv x if
\[ \lim_{n \to \infty} x_n(\omega) = x(\omega) \]
for each $\omega \in \Omega$ except for $\omega \in E$ where $P(E) = 0$.
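Quadratic Mean
A sequence of random variables $x_n$ converges in quadratic mean to a random
variable $x$ if
\[ \lim_{n \to \infty} E[(x_n - x)^2] = 0. \]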
8.1.5 Some Famous Inequalities

Jensen’s Inequality
If G is concave in x, then
E[G(x)] ≤ G[E(x)].
This is where risk aversion comes from.
Chebychev’s Inequality
If mean $\mu$ and variance $\sigma^2$ exist, then for all $\varepsilon > 0$
\[ \Pr[|\tilde{x} - \mu| \ge \varepsilon] \le \sigma^2/\varepsilon^2 \]
Cauchy-Schwarz Inequality
\[ (E[xy])^2 \le E[x^2]\, E[y^2] \]
8.1.6 Stein’s Lemma

If $(x, y) \sim N(\cdot, \cdot)$, $g$ is everywhere differentiable, and $E[|g'(x)|] < \infty$, then
\[ \mathrm{cov}(g(x), y) = E[g'(x)]\, \mathrm{cov}(x, y). \]
This result is useful in working with the fundamental valuation equation
$1 = E[mR]$. It can linearize a model under normality.
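Stein's lemma can be illustrated numerically. The sketch below assumes standard normal $x$ and $y$ with correlation 0.5 and the hypothetical choice $g(x) = x^3$, so both sides should be close to $3 \times 0.5 = 1.5$; all names and values are illustrative.

```python
import math
import random

random.seed(2)
n, rho = 200_000, 0.5  # hypothetical sample size and correlation
xs, ys = [], []
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    z = random.gauss(0.0, 1.0)
    xs.append(x)
    ys.append(rho * x + math.sqrt(1 - rho ** 2) * z)  # corr(x, y) = rho

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    abar, bbar = sum(a) / len(a), sum(b) / len(b)
    return sum((u - abar) * (v - bbar) for u, v in zip(a, b)) / (len(a) - 1)

g = [x ** 3 for x in xs]                        # g(x) = x^3
g_prime_mean = sum(3 * x ** 2 for x in xs) / n  # E[g'(x)] = E[3 x^2]

lhs = cov(g, ys)                  # cov(g(x), y)
rhs = g_prime_mean * cov(xs, ys)  # E[g'(x)] cov(x, y)
print(lhs, rhs)
```

In asset pricing applications $g$ is typically marginal utility, so the covariance of a nonlinear pricing kernel with a return reduces to a multiple of the covariance of its argument with that return.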
8.1.7 Bayes Law

Bayes law is useful for updating probabilities.
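\[ \mathrm{Prob}(X_i | Y) = \frac{\mathrm{Prob}(X_i Y)}{\mathrm{Prob}(Y)} = \frac{\mathrm{Prob}(Y | X_i)\, \mathrm{Prob}(X_i)}{\sum_{j=1}^{N} \mathrm{Prob}(Y | X_j)\, \mathrm{Prob}(X_j)} \]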
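A small numerical illustration of the updating rule follows. The function name `posterior` and the two-regime numbers are hypothetical, chosen only to show the mechanics.

```python
def posterior(priors, likelihoods):
    """Posterior Prob(X_i | Y) from priors Prob(X_i) and likelihoods Prob(Y | X_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # Prob(X_i Y)
    total = sum(joint)  # Prob(Y), by the law of total probability
    return [j / total for j in joint]

# Hypothetical example: two regimes with prior probabilities 0.3 and 0.7,
# and a signal Y observed with probability 0.8 in regime 1 and 0.2 in regime 2.
post = posterior([0.3, 0.7], [0.8, 0.2])
print(post)
```

Observing the signal raises the probability of the first regime from 0.3 to 0.24/0.38, or roughly 0.63.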
8.1.8 Law of Iterated Expectations

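The law of iterated expectations is useful in conditioning down from a finer
to a coarser information set. If $E[|Y|] < \infty$ and $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \mathcal{F}$, then
\[ E[Y | \mathcal{F}_0] = E\big[ E[Y | \mathcal{F}_1] \,\big|\, \mathcal{F}_0 \big]. \]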
8.1.9 Stochastic Dominance

To compare two risky payoffs c̃1 and c̃2 , we can use the notion of stochastic
dominance. The idea is to choose the asset
Let $F(c) = \Pr[\tilde{c}_1 \le c]$, $G(c) = \Pr[\tilde{c}_2 \le c]$. First order SD: 1 dominates 2
in the first-order sense if $F(c) \le G(c)\ \forall\ c$. Second order SD: 1 dominates 2
in the second-order sense if $\int_{-\infty}^{c} F(r)\,dr \le \int_{-\infty}^{c} G(r)\,dr\ \forall\ c$.
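The two dominance conditions can be checked directly for payoffs with finitely many equally likely outcomes. The sketch below is one possible implementation under that assumption. It uses the identity that the integral of the CDF up to $c$ equals $E[\max(c - \tilde{c}, 0)]$; the function names and the example payoffs are hypothetical.

```python
def cdf(sample, c):
    """Empirical CDF of equally likely payoffs, evaluated at c."""
    return sum(1 for v in sample if v <= c) / len(sample)

def integrated_cdf(sample, c):
    """Integral of the CDF up to c, which equals E[max(c - payoff, 0)]."""
    return sum(max(c - v, 0.0) for v in sample) / len(sample)

def dominates(a, b, order):
    """True if payoffs a dominate payoffs b in the first- or second-order sense."""
    stat = cdf if order == 1 else integrated_cdf
    # Both statistics only change shape at observed payoffs, so it is
    # enough to compare them on the union of the two supports.
    grid = sorted(set(a) | set(b))
    return all(stat(a, c) <= stat(b, c) + 1e-12 for c in grid)

safe, risky = [2.0, 2.0], [1.0, 3.0]  # same mean; risky is a spread of safe
print(dominates(safe, risky, 1))  # False
print(dominates(safe, risky, 2))  # True
```

The example shows a mean-preserving spread: the sure payoff does not dominate in the first-order sense, but it does in the second-order sense.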
8.2 Econometrics
This is a very brief review of some of the highlights from econometrics that
are not immediately obvious.
8.2.1 Projection Theorem

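If
\[ E[y | x] = \alpha + \beta x \]
then
\[ \hat{\beta} = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)} \quad \text{and} \quad \hat{\alpha} = \bar{y} - \hat{\beta} \bar{x}. \]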
8.2.2 Cramer-Rao Bound and the Var-Cov Matrix

The Cramer-Rao Bound gives the minimum variance of an unbiased estimator.
Estimators that achieve the bound are most efficient in their class.
Under regularity conditions, the variance of an unbiased estimator $\hat{\theta}_n$ is
bounded by $\mathrm{var}(\hat{\theta}_n) \ge \mathrm{var}(G)^{-1} = -E[H]^{-1}$ where $G$ and $H$ are the gradient
and Hessian of the log-likelihood.
8.2.3 Testing: Wald, LM, LR

There are three basic tests of hypotheses: the Wald (W), likelihood ratio
(LR), and Lagrange multiplier (LM) tests. All three are asymptotically $\chi^2$, but
finite sample properties may differ. One test may be preferred over the others
depending on the ease of calculation under the null or alternative hypotheses.
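Consider a ML estimate with $g(y, \theta) = \ln[f(y, \theta)]$ the log-likelihood. Let
$G = g_\theta$ and $H = g_{\theta\theta'}$. Then $E[G] = 0$, $\mathrm{var}(G) = E[GG'] = -E[H] = I$ and
\[ \mathrm{var}(\hat{\theta}) \overset{a}{\to} I(\theta)^{-1}. \]
With $\hat{\theta}_R$ and $\hat{\theta}_U$ the restricted and unrestricted estimates, the statistics are
\[ W = (\hat{\theta} - \theta)' [\mathrm{var}(\hat{\theta} - \theta)]^{-1} (\hat{\theta} - \theta) \]
\[ LM = G(\hat{\theta}_R)' I(\hat{\theta}_R)^{-1} G(\hat{\theta}_R) \]
\[ LR = -2[g(\hat{\theta}_R) - g(\hat{\theta}_U)] \]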
[Figure omitted: plot with curves labeled g, G, and c, marking the LR, LM, and W statistics.]
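To make the three statistics concrete, here is a sketch for a Bernoulli likelihood, where each has a closed form. The sample (60 successes in 100 trials) and the null value 0.5 are hypothetical. The Wald statistic uses the information at the unrestricted estimate, and the LM statistic is the squared score at the restricted value divided by the information there.

```python
import math

def loglik(theta, k, n):
    """Bernoulli log-likelihood g for k successes in n trials."""
    return k * math.log(theta) + (n - k) * math.log(1 - theta)

def score(theta, k, n):
    """Gradient G of the log-likelihood."""
    return k / theta - (n - k) / (1 - theta)

def information(theta, n):
    """Fisher information I = -E[H]."""
    return n / (theta * (1 - theta))

n, k, theta0 = 100, 60, 0.5  # hypothetical sample and null value
theta_hat = k / n            # unrestricted ML estimate

wald = (theta_hat - theta0) ** 2 * information(theta_hat, n)
lm = score(theta0, k, n) ** 2 / information(theta0, n)
lr = -2 * (loglik(theta0, k, n) - loglik(theta_hat, k, n))
print(wald, lm, lr)
```

In this example the three values are close but not equal, which is the finite-sample difference the text refers to. Only the restricted value is needed for LM, only the unrestricted estimate for W, and both for LR.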
8.3 Continuous-Time Math

8.3.1 Stochastic Processes
8.3.2 Martingales
Prices follow a martingale when adjusted for dividends.
Random Walk
8.3.3 Itô’s Lemma

Consider the diffusion process of a variable X:
dX(t) = µ(X, t)dt + σ(X, t)dW (t)
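where $dW$ is the increment of a standard Wiener process with the properties
$E[dW] = 0$ and $E[dW^2] = dt$. Then the function $F(X, t)$ has the stochastic
differential equation
\[ dF(X, t) = \frac{\partial F}{\partial X}\, dX + \left[ \frac{\partial F}{\partial t} + \frac{1}{2} \sigma^2(X, t) \frac{\partial^2 F}{\partial X^2} \right] dt \]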
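As a worked illustration of the lemma, consider the assumed special case $\mu(X, t) = \mu X$ and $\sigma(X, t) = \sigma X$ with constant $\mu$ and $\sigma$, and the function $F = \ln X$. These choices are ours, made only to show the mechanics.

```latex
% Assumed example: dX = \mu X\,dt + \sigma X\,dW and F(X,t) = \ln X.
\begin{align*}
\frac{\partial F}{\partial X} &= \frac{1}{X}, \qquad
\frac{\partial F}{\partial t} = 0, \qquad
\frac{\partial^2 F}{\partial X^2} = -\frac{1}{X^2}, \\
d\ln X &= \frac{1}{X}\,dX
        + \left[ 0 + \frac{1}{2}\sigma^2 X^2 \left( -\frac{1}{X^2} \right) \right] dt \\
       &= \left( \mu - \tfrac{1}{2}\sigma^2 \right) dt + \sigma\,dW .
\end{align*}
```

The drift of the log is lower than $\mu$ by $\sigma^2/2$, which is the second-order term that ordinary calculus would miss.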
8.3.4 Cameron-Martin-Girsanov Theorem

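If $W_t$ is a $P$-Brownian motion and $\gamma_t$ is an $\mathcal{F}$-previsible process satisfying the
boundedness condition
\[ E^P\left[ \exp\left( \frac{1}{2} \int_0^T \gamma_t^2\, dt \right) \right] < \infty, \]
then there exists a measure $Q$ such that
1. $Q$ is equivalent to $P$
2. $\frac{dQ}{dP} = \exp\left( -\int_0^T \gamma_t\, dW_t - \frac{1}{2} \int_0^T \gamma_t^2\, dt \right)$
3. $\tilde{W}_t = W_t + \int_0^t \gamma_s\, ds$ is a $Q$-Brownian motion.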
There is a converse as well.
8.3.5 Special Processes

Arithmetic Brownian Motion
dX = µdt + σdW
X grows linearly with increasing uncertainty. Given the current value $X$, the
value $\tau$ periods ahead is normally distributed with mean $X + \mu\tau$ and
standard deviation $\sigma\sqrt{\tau}$.
Geometric Brownian Motion
dX = µXdt + σXdW
X grows exponentially at rate µ with volatility proportional to the level of
X. The distribution of X is lognormal which makes it useful in modeling
asset prices.
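The lognormal property can be seen by simulating the terminal value of a geometric Brownian motion from its exact solution $X_T = X_0 \exp[(\mu - \sigma^2/2)T + \sigma\sqrt{T} z]$ with $z$ standard normal. The sketch below is illustrative; the parameter values and the function name `gbm_terminal` are assumptions.

```python
import math
import random

def gbm_terminal(x0, mu, sigma, T, rng):
    """Draw X_T from the exact lognormal solution of dX = mu X dt + sigma X dW."""
    z = rng.gauss(0.0, 1.0)
    return x0 * math.exp((mu - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)

rng = random.Random(3)
x0, mu, sigma, T, n = 1.0, 0.10, 0.20, 1.0, 100_000  # hypothetical parameters
draws = [gbm_terminal(x0, mu, sigma, T, rng) for _ in range(n)]

mean_level = sum(draws) / n
mean_log = sum(math.log(x) for x in draws) / n
print(mean_level, x0 * math.exp(mu * T))                     # E[X_T] = X_0 exp(mu T)
print(mean_log, math.log(x0) + (mu - 0.5 * sigma ** 2) * T)  # E[ln X_T]
```

The level grows at rate $\mu$ in expectation while the log grows at $\mu - \sigma^2/2$, so the two sample means should match their respective targets up to simulation error.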
Mean-reverting Process
\[ dX = \kappa(\mu - X)\,dt + \sigma X^{\gamma}\,dW \]
If $\gamma = 1/2$ then X is distributed non-central $\chi^2$. It is often used to model
interest rates, inflation, and volatility; the CIR model is an example of the
square root process. If $\gamma = 0$, this is called an Ornstein–Uhlenbeck process.
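A simple way to get a feel for mean reversion is an Euler discretization of the process. The sketch below is illustrative only: the parameters are hypothetical, and the Euler scheme is our choice of approximation, not a method prescribed by the notes. It simulates the square-root case and compares the sample mean of $X_T$ with $\mu + (X_0 - \mu)e^{-\kappa T}$, the expected value implied by the linear drift.

```python
import math
import random

def simulate(x0, kappa, mu, sigma, gamma, T, steps, rng):
    """Euler scheme for dX = kappa (mu - X) dt + sigma X^gamma dW."""
    dt = T / steps
    x = x0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        # max(x, 0) guards the power against small negative Euler values
        x += kappa * (mu - x) * dt + sigma * max(x, 0.0) ** gamma * dw
    return x

rng = random.Random(4)
x0, kappa, mu, sigma, T = 0.02, 1.5, 0.06, 0.10, 2.0  # hypothetical parameters
paths = [simulate(x0, kappa, mu, sigma, 0.5, T, 100, rng) for _ in range(5_000)]
sample_mean = sum(paths) / len(paths)
target = mu + (x0 - mu) * math.exp(-kappa * T)  # E[X_T] under a linear drift
print(sample_mean, target)
```

Starting well below $\mu$, the process is pulled most of the way back to its long-run mean within two periods when $\kappa = 1.5$. Setting `gamma` to 0 in the same function gives the constant-volatility case.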
8.3.6 Special Lemma

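If
\[ \begin{pmatrix} x \\ y \end{pmatrix} \sim N(0, \Omega) \quad \text{with} \quad \Omega = \begin{pmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{pmatrix} \]
then
\[ E\left[ \left( A_x \exp\left(x - \tfrac{1}{2}\sigma_x^2\right) - A_y \exp\left(y - \tfrac{1}{2}\sigma_y^2\right) \right)^+ \right] = A_x N(d_1) - A_y N(d_2) \]
where
\[ d_1 = \frac{\ln(A_x / A_y) + \Sigma/2}{\sqrt{\Sigma}}, \qquad d_2 = d_1 - \sqrt{\Sigma}, \]
and $\Sigma = \mathrm{var}(x - y) = \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy}$.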
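The lemma can be checked by simulation. The sketch below uses $d_1 = [\ln(A_x/A_y) + \Sigma/2]/\sqrt{\Sigma}$ and $d_2 = d_1 - \sqrt{\Sigma}$, the form of the exchange-option result that the Monte Carlo average reproduces. The inputs and the function names are hypothetical.

```python
import math
import random

def norm_cdf(d):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))

def exchange_value(ax, ay, var_x, var_y, cov_xy):
    """Closed form A_x N(d1) - A_y N(d2) with Sigma = var(x - y)."""
    big_sigma = var_x + var_y - 2.0 * cov_xy
    d1 = (math.log(ax / ay) + 0.5 * big_sigma) / math.sqrt(big_sigma)
    d2 = d1 - math.sqrt(big_sigma)
    return ax * norm_cdf(d1) - ay * norm_cdf(d2)

rng = random.Random(5)
ax, ay = 1.0, 0.9            # hypothetical inputs
sx, sy, rho = 0.3, 0.2, 0.4  # standard deviations and correlation
n = 200_000
total = 0.0
for _ in range(n):
    u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x = sx * u
    y = sy * (rho * u + math.sqrt(1 - rho ** 2) * v)  # cov(x, y) = rho sx sy
    payoff = ax * math.exp(x - 0.5 * sx ** 2) - ay * math.exp(y - 0.5 * sy ** 2)
    total += max(payoff, 0.0)

mc = total / n
closed = exchange_value(ax, ay, sx ** 2, sy ** 2, rho * sx * sy)
print(mc, closed)
```

Setting $y$ to a constant (so $\sigma_y = \sigma_{xy} = 0$) reduces the closed form to the familiar call-option expression with $A_y$ as the discounted strike.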
Bibliography
Admati, Anat, 1985, A noisy rational expectations equilibrium for multi-

asset securities markets, Econometrica 53, 629–657.
, and Paul Pfleiderer, 1988, A theory of intraday patterns: Volume
and price variability, Review of Financial Studies 1, 3–40.
, 1989, Divide and conquer: A theory of intraday and day-of-the-week

mean effects, Review of Financial Studies 2, 189–223.
Akerlof, George A., 1970, The market for “lemons”: Quality uncertainty and
the market mechanism, Quarterly Journal of Economics 84, 488–500.
Amihud, Y., and H. Mendelson, 1980, Dealership markets: Market-making

with inventory, Journal of Financial Economics 8, 31–53.
Asquith, Paul, 1995, Convertible bonds are not called late, Journal of Fi-
nance 50, 1275–1289.
, and David Mullins, 1991, Convertible debt: Corporate call policy

and voluntary conversion, Journal of Finance 46, 1273–1289.
Backus, David, Allan Gregory, and Chris Telmer, 1993, Accounting for forward
rates in markets for foreign currency, Journal of Finance 48, 1887–1908.
Banz, R., 1981, The relation between the return and market value of common
stocks, Journal of Financial Economics 9, 3–18.
Basu, S., 1977, The investment performance of common stocks in relation to

their price to earnings ratios: A test of the efficient markets hypothesis,
Journal of Finance 32, 663–682.
Bekaert, Geert, and Campbell Harvey, 1995, Time-varying world market

integration, Journal of Finance 50, 403–444.
Berger, Phillip, and Eli Ofek, 1995, Diversification’s effect on firm value,
Journal of Financial Economics 37, 39–65.
Berk, Jonathan, 1995, A critique of size related anomalies, Review of Finan-

cial Studies 8, 275–286.
Betker, Brian, 1995, An empirical examination of pre-packaged bankruptcy,

Financial Management 24, 3–18.
Bhattacharya, Sudipto, and George Constantinides, 1989, Frontiers of Modern
Financial Theory, vol. I & II of Studies in Financial Economics
(Rowman & Littlefield: Totowa, NJ).
Bikhchandani, S., David Hirshleifer, and Ivo Welch, 1992, A theory of fads,
fashion, custom, and cultural change as informational cascades, Journal
of Political Economy 100, 992–1025.
Billett, Matthew, Mark Flannery, and Jon Garfinkel, 1995, The effect of
lender identity on a borrowing firm’s equity return, Journal of Finance
50, 699–718.
Bizjak, John, James Brickley, and Jeffrey Coles, 1993, Stock-based incen-
tive compensation and investment behavior, Journal of Accounting and
Economics 16, 349–372.
Black, Fischer, 1972, Capital market equilibrium with restricted borrowing,

Journal of Business 45, 444–455.
, 1976, The dividend puzzle, Journal of Portfolio Management 2, 5–8.
Black, Fischer, Michael Jensen, and Myron Scholes, 1972, The capital asset
pricing model: Some empirical tests, in Michael Jensen, ed.: Studies in
the Theory of Capital Markets (Praeger: New York, NY).
Black, Fischer, and Myron Scholes, 1973, The pricing of options and corpo-
rate liabilities, Journal of Political Economy 81, 637–659.
Blanchard, O., and Mark Watson, 1982, Bubbles, rational expectations, and
financial markets, in Crises in the Economic and Financial Structure
(Lexington Books: Lexington, MA).
Blume, M., and I. Friend, 1973, A new look at the capital asset pricing model,
Booth, James, and Lena Chua, 1996, Ownership dispersion, costly informa-
tion, and IPO underpricing, Journal of Financial Economics 41, 291–310.
Breeden, Douglas T., 1979, An intertemporal asset pricing model with

stochastic consumption and investment opportunities, Journal of Finan-
cial Economics 7, 265–96.
Brown, Roger, and Stephen Schaefer, 1994, The term structure of real interest
rates and the Cox, Ingersoll, and Ross model, Journal of Financial
Brown, Stephen, and Philip Dybvig, 1986, The empirical implications of the
Cox, Ingersoll, and Ross theory of the term structure of interest rates,
Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay, 1997, The
Econometrics of Financial Markets (Princeton University Press: Prince-
ton, NJ).
Chan, K.C., Nai-fu Chen, and David Hsieh, 1984, An exploratory investiga-
tion of the firm size effect, Journal of Financial Economics 14, 451–471.
Chan, K.C., G. Andrew Karolyi, Francis Longstaff, and Anthony Sanders,

1992, An empirical comparison of alternative models of the short-term
interest rate, Journal of Finance 47, 1209–1227.
Chen, Nai-fu, Richard Roll, and Stephen A. Ross, 1986, Economic forces and
the stock market, Journal of Business 59, 383–403.
Cochrane, John, 1998, Asset pricing, Unpublished Book.
Diamond, Douglas, and Robert Verrecchia, 1981, Information aggregation in

a noisy rational expectations economy, Journal of Financial Economics 9,
221–235.
Dunn, Kenneth, and Kenneth Eades, 1989, Voluntary conversion of convert-

ible securities and the optimal call strategy, Journal of Financial Eco-
nomics 23, 273–301.
Eades, Kenneth, Patrick Hess, and E. Han Kim, 1994, Time-series variation
in dividend pricing, Journal of Finance 49, 1617–1638.
Eckbo, Espen, and Ronald Masulis, 1992, Adverse selection and the rights
offer paradox, Journal of Financial Economics 32, 293–332.
Evans, Martin, and Karen Lewis, 1995, Do long-swings in the dollar affect
estimates of the risk premia?, Review of Financial Studies 8, 709–742.
Fama, Eugene, 1980, Agency problems and the theory of the firm, Journal
of Political Economy 88, 288–307.
, 1984a, Forward and spot exchange rates, Journal of Monetary Eco-

nomics 14, 319–338.
, 1984b, The information in the term structure, Journal of Financial

, 1991, Efficient capital markets: II, Journal of Finance 46, 1575–

1618.
, and Kenneth French, 1992, The cross-section of expected stock

returns, Journal of Finance 47, 427–465.
, 1996b, The CAPM is wanted, dead or alive, Journal of Finance.
Fama, Eugene F., and James MacBeth, 1973, Risk, return and equilibrium:
Empirical tests, Journal of Political Economy 81, 607–636.
Fazzari, Steven, Glenn Hubbard, and Bruce Petersen, 1988, Financing constraints
and corporate investment, Brookings Papers on Economic Activity
1, 141–195.
Foster, Douglas, and S. Viswanathan, 1990, A theory of interday variations

in volume, variance and trading costs in securities markets, Review of
Financial Studies 3, 593–624.
Froot, Kenneth, and Jeffrey Frankel, 1989, Forward discount bias: Is it an

exchange rate risk premium?, Quarterly Journal of Economics Feb., 139–
161.
Froot, Kenneth, David Scharfstein, and Jeremy Stein, 1993, Risk manage-
ment: Coordinating corporate investment and financing policies, Journal
of Finance 48, 1629–1658.
Geczy, Christopher, Bernadette Minton, and Catherine Schrand, 1996, Why

firms use currency derivatives, Working paper.
Gibbons, Michael, 1982, Multivariate tests of financial models: A new ap-

proach, Journal of Financial Economics 10, 3–27.
, and Krishna Ramaswamy, 1993, A test of the Cox, Ingersoll, and

Ross model of the term structure, Review of Financial Studies 6, 619–658.
Glosten, Larry, 1989, Insider trading, liquidity, and the role of the monopolist
specialist, Journal of Business 62, 211–235.
, and P. Milgrom, 1985, Bid, ask, and transaction prices in a special-

ist market with heterogeneously informed traders, Journal of Financial
Graham, John, 1996, Debt and the marginal tax rate, Journal of Financial
Grossman, Sanford, 1976, On the efficiency of competitive stock markets

where trades have diverse information, Journal of Finance 31, 573–585.
, and J. E. Stiglitz, 1980, On the impossibility of informationally

efficient markets, American Economic Review 70, 393–408.
Hansen, Lars Peter, and Robert J. Hodrick, 1983, Risk averse speculation
in the forward foreign exchange market: An econometric analysis of
linear models, in Exchange Rates and International Macroeconomics,
pp. 113–152 (University of Chicago Press: Chicago).
Hansen, Lars Peter, and Ravi Jagannathan, 1991, Implications of securities

market data for models of dynamic economies, Journal of Political Econ-
omy 99, 225–262.
Hansen, Lars Peter, and Scott F. Richard, 1987, The role of conditioning information
in deducing testable restrictions implied by dynamic asset pricing
models, Econometrica 55, 587–613.
Harrison, J., and David Kreps, 1979, Martingales and arbitrage in multi-
period securities markets, Journal of Economic Theory 20, 381–408.
Hart, Oliver, and David Kreps, 1986, Price destabilizing speculation, Journal
Hellwig, M.F., 1980, On the aggregation of information in competitive mar-

kets, Journal of Economic Theory 22, 477–498.
Helwege, Jean, and Nellie Liang, 1996, Is there a pecking order? Evidence
from a panel of IPO firms, Journal of Financial Economics 40, 429–458.
Hirshleifer, Jack, 1971, The private and social value of information and the
reward to inventive activity, American Economic Review 61, 561–574.
Hotchkiss, Edith Shwalb, 1995, Postbankruptcy resolution: Direct costs and

violation of priority claims, Journal of Finance 50, 3–21.
Huang, Chi-fu, and Robert H. Litzenberger, 1988, Foundations for Financial

Economics (Prentice-Hall: Englewood Cliffs, NJ).
Huang, Roger, 1989, An analysis of intertemporal pricing for forward foreign

exchange contracts, Journal of Finance 44, 183–194.
Ibbotson, Roger, and Jay Ritter, 1995, Initial public offerings, in North-Holland
Handbooks of Operations Research and Management Science: Finance,
pp. 993–1016 (North-Holland: Amsterdam).
Ingersoll, 1987, Theory of Financial Decision Making, Studies in Financial
Economics (Rowman & Littlefield: Savage, MD).
Ingersoll, Jonathan, 1984, Some results in the theory of arbitrage pricing,

James, Christopher, 1995, When do banks take equity in debt restructurings,

Review of Financial Studies 8.
Jarrow, Robert, V. Maksimovic, and W. T. Ziemba, 1995, Finance, vol. 9
of Handbooks in Operations Research and Management Science
(North-Holland: Amsterdam).
Jegadeesh, Narasimhan, and Sheridan Titman, 1993, Returns to buying win-
ners and selling losers: Implications for stock market efficiency, Journal of
Finance 48, 65–91.
Jensen, Michael, 1986, Agency costs of free cash flow, corporate finance, and
takeovers, American Economic Review 76, 323–329.
, and W.H. Meckling, 1976, Theory of the firm: Managerial behavior,
agency costs, and ownership structure, Journal of Financial Economics 3,
305–360.
Jensen, Michael, and Kevin Murphy, 1990, Performance pay and top man-
agement incentives, Journal of Political Economy 98, 225–264.
Jung, Kooyul, Yong-Cheol Kim, and René Stulz, 1996, Investment oppor-
tunities, managerial discretion, and the security issue decision, Journal of
Financial Economics 42, 159–185.
Kadlec, Greg, and John McConnell, 1994, The effect of market segmentation
and illiquidity on asset prices: Evidence from exchange listings, Journal of
Finance 49, 611–636.
Kandel, S., and Robert Stambaugh, 1987, On correlations and inferences
about mean-variance efficiency, Journal of Financial Economics 18, 61–
90.
Kim, Yong-Cheol, and René Stulz, 1988, The Eurobond market and corpo-
rate financial policy: A test of the clientele hypothesis, Journal of Finan-
Koh, and Walter, 1989, A direct test of Rock’s model of the pricing of un-
seasoned issues, Journal of Financial Economics 23, 251–272.
Kothari, S., J. Shanken, and R. Sloan, 1995, Another look at the cross-section
of expected returns, Journal of Finance 50, 185–224.
Kyle, Albert S., 1985, Continuous auctions and insider trading, Econometrica
53, 1315–1335.
, 1989, Informed speculation with imperfect competition, Review of

Economic Studies 56, 317–356.
Lang, Larry, René Stulz, and Ralph Walkling, 1989, Managerial performance,
Tobin’s q, and the gain from tender offers, Journal of Financial Economics
24, 137–154.
Lehn, Kenneth, and Annette Poulsen, 1989, Free cash flow and stockholder
gains in going private transactions, Journal of Finance 44, 771–787.
Litzenberger, Robert, and Krishna Ramaswamy, 1979, The effect of personal

taxes and dividends on capital asset prices: Theory and evidence, Journal
of Financial Economics 7, 163–196.
Loderer, Claudio, John Cooney, and Leonard VanDrunen, 1991, The price
elasticity of demand for common stock, Journal of Finance 46, 621–651.
Longstaff, Francis, and Eduardo Schwartz, 1992, Interest rate volatility and
the term structure: A two-factor general equilibrium model, Journal of
Finance 47, 1259–1282.
Loughran, and Ritter, 1995, The new issues puzzle, Journal of Finance 50,
23–51.
Lucas, Robert, 1978, Asset prices in an exchange economy, Econometrica 46,

1429–1445.
, 1982, Interest rates and currency prices in a two-country world,

Journal of Monetary Economics 10, 335–360.
MacKinlay, A. Craig, 1987, On multivariate tests of the CAPM, Journal of

Manne, Henry G., 1965, Mergers and the market for corporate control, Jour-
nal of Political Economy 73, 110–120.
Mark, Nelson, 1988, Time-varying betas and risk premia in the pricing of
forward foreign exchange contracts, Journal of Financial Economics 22,
335–354.
Markowitz, Harry, 1959, Portfolio Selection: Efficient Diversification of In-

vestments (Wiley: New York).
Marshall, J. M., 1974, Private incentives and information, American Economic
Review 64, 373–390.
Masulis, Ronald, 1980, The effects of capital structure change on security
prices, Journal of Financial Economics 8, 139–178.
May, Don, 1995, Do managerial motives influence firm risk reduction strate-
gies?, Journal of Finance 50, 1291–1308.
McConnell, John, and Eduardo Schwartz, 1992, The origin of LYONS: A
case study in financial innovation, Journal of Applied Corporate Finance
pp. 40–47.
Merton, Robert, 1987, A simple model of capital market equilibrium with
incomplete information, Journal of Finance 42, 483–510.
Merton, Robert C., 1973, An intertemporal capital asset pricing model,
Econometrica 41, 867–887.
Mikkelson, Wayne, and Megan Partch, 1986, Valuation effects of security offerings
and the issuance process, Journal of Financial Economics 15, 31–60.
Milgrom, P., and N. Stokey, 1982, Information, trade, and common knowl-
edge, Journal of Economic Theory 26, 17–27.
Miller, Merton, 1977a, Debt and taxes, Journal of Finance 32, 261–276.
, 1977b, Risk, uncertainty, and divergence of opinion, Journal of
Finance 32, 1151–1168.
, and Kevin Rock, 1985, Dividend policy under asymmetric informa-
tion, Journal of Finance 40, 1030–1051.
Mitchell, Mark, and Kenneth Lehn, 1990, Do bad bidders become good tar-
gets?, Journal of Political Economy 98, 372–398.
Mitchell, Mark L., and J. Harold Mulherin, 1996, Impact of industry shocks
on takeover and restructuring activity, Journal of Financial Economics 41,
193–229.
Morck, R.A., Andrei Shleifer, and Robert Vishny, 1988, Management ownership
and market valuation: An empirical analysis, Journal of Financial
Murphy, Kevin, 1985, Corporate performance and managerial remuneration:

An empirical analysis, Journal of Accounting and Economics 7, 11–42.
Myers, Stewart, 1977, Determinants of corporate borrowing, Journal of Fi-

nancial Economics 5, 147–175.
, 1984, The capital structure puzzle, Journal of Finance 39, 575–592.
, and N. Majluf, 1984, Corporate financing and investment decisions

when firms have information that investors do not have, Journal of Finan-
Ofer, Aharon, and Ashok Natarajan, 1989, Convertible call policies: An

empirical analysis of an information-signalling hypothesis, Journal of Fi-
nancial Economics 19, 91–108.
Opler, Tim, and Sheridan Titman, 1995, The debt-equity choice: An analysis
of issuing firms, Working Paper.
Pearson, Neil, and Tong Sheng Sun, 1994, Exploiting the conditional density
in estimating the term structure: An application to the Cox, Ingersoll, and
Ross model, Journal of Finance 49, 1279–1304.
Prabhala, N. R., 1993, On interpreting dividend announcement effects: Free

cash flow, clientele, or signalling?, Yale Working Paper.
Puri, Manju, 1996, Commercial banks in investment banking: Conflict of

interest or certification role?, Journal of Financial Economics 40, 373–
401.
Rajan, Raghuram, 1996, Insiders and outsiders: The choice between informed
and arm’s length debt, Journal of Finance 47, 1367–1400.
, and Henri Servaes, ????, The effect of market conditions on initial

public offerings, .
Rajan, Raghuram, and Luigi Zingales, 1995, What do we know about capital
structure? Some evidence from international data, Journal of Finance 50,
1421–1460.
Reisman, H., 1992, Reference variables, factor structure, and the approxi-
mate multibeta representation, Journal of Finance 47, 1303–1314.
Rock, Kevin, 1986, Why new issues are underpriced, Journal of Financial
, 1989, The specialist’s order book, Unpublished Working Paper.
Roll, Richard, 1977, A critique of the asset pricing theory’s tests, Journal of
, 1984, A simple measure of the effective bid/ask spread in an efficient
market, Journal of Finance 39, 1127–1139.
, 1986, The hubris hypothesis of corporate takeovers, Journal of Business
59, 197–216.
, and Stephen Ross, 1994, On the cross-sectional relation between
expected returns and betas, Journal of Finance 49, 101–122.
Ross, Stephen, 1976, The arbitrage theory of capital asset prices, Journal of
Economic Theory 13, 341–360.
, 1977a, The determination of financial structure: The incentive signalling
approach, Bell Journal of Economics 8, 23–40.
, 1977b, Return, risk, and arbitrage, in Risk and Return in
Finance, I (Ballinger: Cambridge, MA).
Shanken, Jay, 1982, The arbitrage pricing theory: Is it testable?, Journal of
Finance 37, 1129–1140.
, 1985, Multivariate tests of the zero-beta CAPM, Journal of Finan-
Shanken, J., 1987, Multivariate proxies and asset pricing relations: Living
with the Roll critique, Journal of Financial Economics 18, 91–110.
, 1992, On the estimation of beta-pricing models, Review of Financial
Studies 5, 1–34.
Shanken, Jay, and M. Weinstein, 1990, Macroeconomic variables and asset
pricing: Estimation and tests, Working Paper, University of Rochester.
Shiller, Robert J., 1981, Do stock prices move too much to be justified by
subsequent changes in dividends?, American Economic Review 71, 421–436.
Shin, Hyun-Han, and René Stulz, 1996, An analysis of divisional investment

policy, NBER Working Paper.
Shleifer, and Vishny, 1986, Large shareholders and corporate control, Journal
, 1992, Liquidation values and debt capacity: A market equilibrium
approach, Journal of Finance 47, 1343–1366.
Shleifer, Andrei, 1986, Do demand curves for stock slope down?, Jounrnal of
Finance 41, 579–590.
Slezak, Steve, 1994, A theory of the dynamics of security returns around
market closures, Journal of Finance 49, 1163–1211.
Sloan, Richard, 1993, Accounting earnings and top executive compensation,
Journal of Accounting and Economics 16, 55–100.
Smith, Clifford, 1986, Investment banking and the capital acquisition process,
, and Ross Watts, 1992, The investment opportunity set and corpo-
rate financing, dividend and compensation policies, Journal of Financial
Snow, Karl, 1991, Diagnosing asset pricing models using the distribution of
asset returns, Journal of Finance 46, 955–983.
Spence, Michael, 1973, Job market signaling, Quarterly Journal of Economics
pp. 355–374.
, 1974, Competitive and optimal responses to signals: An analysis of
efficiency distribution, Journal of Economic Theory 7, 296–332.
Stambaugh, Robert, 1982, On the exclusion of assets from tests of the two
parameter model, Journal of Financial Economics 10, 235–268.
Stein, Jeremy, 1992, Convertible bonds as backdoor equity financing, Journal
of Financial Economics 32, 3–21.
Stulz, René, 1988, Managerial control of voting rights: Financing policies
and the market for corporate control, Journal of Financial Economics 20,
25–54.
, 1995, Rethinking risk management, Working paper.
Titman, Sheridan, and Robert Wessels, 1988, The determinants of capital

structure choice, Journal of Finance 43, 1–19.
Tufano, Peter, 1989, Financial innovation and first mover advantages, Jour-
nal of Financial Economics 25, 213–240.
, 1996, Who manages risk? An empirical examination of risk management
practices in the gold mining industry, Journal of Finance 51,
1097–1137.
Vermaelen, Theo, 1981, Common stock repurchases and market signalling:

An empirical study, Journal of Financial Economics 9, 138–183.
Weiss, Lawrence, 1990, Bankruptcy resolution: Direct costs and violation of

priority of claims, Journal of Financial Economics 27, 285–314.
Welch, Ivo, 1992, Sequential sales, learning, and cascades, Journal of Finance
47, 695–732.
Yermack, David, 1995, Do corporations award stock options effectively?,

Zender, Jaime, 1991, Optimal financial instruments, Journal of Finance 46,

1645–1663.
