Lecture Notes
Universität Ulm
VERSION: February 6, 2004
THIS IS A PRELIMINARY VERSION
THERE WILL BE UPDATES DURING THE COURSE
Prof. Dr. Rüdiger Kiesel
Abteilung Finanzmathematik
Universität Ulm
email: kiesel@mathematik.uni-ulm.de
Short Description.
Times and Location: Lectures will be Monday 10-12; Tuesday 8-10 in He120.
First Lecture Tuesday, 14.10.2003
Content.
This course covers the fundamental principles and techniques of financial mathematics in discrete-
and continuous-time models. The focus will be on probabilistic techniques which will be discussed
in some detail. Specific topics are
• Classical Asset Pricing: Mean-Variance Analysis, CAPM, Arbitrage.
• Martingale-based stochastic market models: Fundamental Theorems of Asset Pricing.
• Contingent Claim Analysis: European, American and Exotic Options.
• Interest Rate Theory: Term Structure Models, Interest Rate Derivatives.
Prerequisites. Probability Theory, Calculus, Linear Algebra
Literature.
• N.H.Bingham & R.Kiesel, Risk Neutral Valuation, Springer 1998.
• H. Föllmer & A. Schied: Stochastic Finance: An Introduction in Discrete Time, De Gruyter
2002.
• J.Hull: Options, Futures & Other Derivatives, 4th edition, Prentice Hall, 1999.
• R.Jarrow & S.Turnbull, Derivative Securities, 2nd edition, 2000.
Office Hours. Tuesday 10-11, He 230
course webpage: www.mathematik.uni-ulm.de/finmath
email: kiesel@mathematik.uni-ulm.de.
Contents
1 Arbitrage Theory
1.1 Derivative Background
1.1.1 Derivative Instruments
1.1.2 Underlying securities
1.1.3 Markets
1.1.4 Types of Traders
1.1.5 Modelling Assumptions
1.2 Arbitrage
1.3 Arbitrage Relationships
1.3.1 Fundamental Determinants of Option Values
1.3.2 Arbitrage bounds
1.4 Single-Period Market Models
1.4.1 A fundamental example
1.4.2 A single-period model
1.4.3 A few financial-economic considerations
2 Financial Market Theory
2.1 Choice under Uncertainty
2.1.1 Preferences and the Expected Utility Theorem
2.1.2 Risk Aversion
2.1.3 Further measures of risk
2.2 Optimal Portfolios
2.2.1 The mean-variance approach
2.2.2 Capital asset pricing model
2.2.3 Portfolio optimisation and the absence of arbitrage
3 Discrete-time models
3.1 The model
3.2 Existence of Equivalent Martingale Measures
3.2.1 The No-Arbitrage Condition
3.2.2 Risk-Neutral Pricing
3.3 Complete Markets
3.4 The Cox-Ross-Rubinstein Model
3.4.1 Model Structure
3.4.2 Risk-Neutral Pricing
3.4.3 Hedging
3.5 Binomial Approximations
3.5.1 Model Structure
3.5.2 The Black-Scholes Option Pricing Formula
3.6 American Options
3.6.1 Stopping Times, Optional Stopping and Snell Envelopes
3.6.2 The Financial Model
3.6.3 American Options in the Cox-Ross-Rubinstein model
3.6.4 A Three-period Example
4 Continuous-time Financial Market Models
4.1 The Stock Price Process and its Stochastic Calculus
4.1.1 Continuous-time Stochastic Processes
4.1.2 Stochastic Analysis
4.1.3 Itô’s Lemma
4.1.4 Girsanov’s Theorem
4.2 Financial Market Models
4.2.1 The Financial Market Model
4.2.2 Equivalent Martingale Measures
4.2.3 Risk-neutral Pricing
4.2.4 The Black-Scholes Model
4.2.5 The Greeks
4.2.6 Barrier Options
5 Interest Rate Theory
5.1 The Bond Market
5.1.1 The Term Structure of Interest Rates
5.1.2 Mathematical Modelling
5.1.3 Bond Pricing, ..
5.2 Short-rate Models
5.2.1 The Term-structure Equation
5.2.2 Martingale Modelling
5.3 Heath-Jarrow-Morton Methodology
5.3.1 The Heath-Jarrow-Morton Model Class
5.3.2 Forward Risk-neutral Martingale Measures
5.4 Pricing and Hedging Contingent Claims
5.4.1 Gaussian HJM Framework
5.4.2 Swaps
5.4.3 Caps
A Basic Probability Background
A.1 Fundamentals
A.2 Convolution and Characteristic Functions
A.3 The Central Limit Theorem
B Facts from Probability and Measure Theory
B.1 Measure
B.2 Integral
B.3 Probability
B.4 Equivalent Measures and Radon-Nikodým Derivatives
B.5 Conditional expectation
B.6 Modes of Convergence
C Stochastic Processes in Discrete Time
C.1 Information and Filtrations
C.2 Discrete-Parameter Stochastic Processes
C.3 Definition and basic properties of martingales
C.4 Martingale Transforms
Chapter 1
Arbitrage Theory
1.1 Derivative Background
Definition 1.1.1. A derivative security, or contingent claim, is a financial contract whose value
at expiration date T (more briefly, expiry) is determined exactly by the price (or prices within a
prespecified time interval) of the underlying financial assets (or instruments) at time T (within
the time interval [0, T]).
This section provides the institutional background on derivative securities, the main groups
of underlying assets, the markets where derivative securities are traded and the financial agents
involved in these activities. As our focus is on (probabilistic) models and not institutional
considerations, we refer the reader to excellent sources describing the institutions, such as
Davis (1994), Edwards and Ma (1992) and Kolb (1991).
1.1.1 Derivative Instruments
Derivative securities can be grouped under three general headings: Options, Forwards and Futures,
and Swaps. In this text we will mainly deal with options, although our pricing techniques
may be readily applied to forwards, futures and swaps as well.
Options.
An option is a financial instrument giving one the right but not the obligation to make a specified
transaction at (or by) a specified date at a specified price. Call options give one the right to buy.
Put options give one the right to sell. European options give one the right to buy/sell on the
specified date, the expiry date, on which the option expires or matures. American options give
one the right to buy/sell at any time prior to or at expiry.
Over-the-counter (OTC) options were long ago negotiated by a broker between a buyer and a
seller. In 1973 (the year of the Black-Scholes formula, perhaps the central result of the subject),
the Chicago Board Options Exchange (CBOE) began trading in options on some stocks. Since
then, the growth of options has been explosive. Options are now traded on all the major world
exchanges, in enormous volumes. Risk magazine (12/97) estimated $35 trillion as the gross figure
for worldwide derivatives markets in 1996. By contrast, the Financial Times of 7 October 2002
(Special Report on Derivatives) gives the interest rate and currency derivatives volume as $83
trillion, an indication of the rate of growth in recent years! The simplest call and put options
are now so standard they are called vanilla options. Many kinds of options now exist, including
so-called exotic options. Types include: Asian options, which depend on the average price over
a period, lookback options, which depend on the maximum or minimum price over a period and
barrier options, which depend on some price level being attained or not.
Terminology. The asset to which the option refers is called the underlying asset or the
underlying. The price at which the transaction to buy/sell the underlying, on/by the expiry date
(if exercised), is made, is called the exercise price or strike price. We shall usually use K for the
strike price, time t = 0 for the initial time (when the contract between the buyer and the seller of
the option is struck), time t = T for the expiry or final time.
Consider, say, a European call option, with strike price K; write S(t) for the value (or price)
of the underlying at time t. If S(t) > K, the option is in the money, if S(t) = K, the option is
said to be at the money and if S(t) < K, the option is out of the money.
The payoff from the option is
S(T) − K if S(T) > K and 0 otherwise
(more briefly written as $(S(T) - K)^+$). Taking into account the initial payment of an investor one
obtains the profit diagram below.
[Figure 1.1: Profit diagram for a European call. Horizontal axis: S(T), with the strike K marked; vertical axis: profit.]
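The payoff and profit of the call can be sketched in a few lines of Python. This is only an illustration: the function names, the premium of 0.25 and the sample prices are assumptions chosen for the example, and interest on the premium is ignored.

```python
def call_payoff(s_T, strike):
    """Payoff (S(T) - K)^+ of a European call at expiry."""
    return max(s_T - strike, 0.0)


def call_profit(s_T, strike, premium):
    """Payoff minus the premium paid at t = 0 (interest on the premium is ignored)."""
    return call_payoff(s_T, strike) - premium


# Illustrative numbers (assumed): strike K = 1, premium 0.25.
for s_T in (0.5, 1.0, 1.25, 2.0):
    print(s_T, call_payoff(s_T, 1.0), call_profit(s_T, 1.0, 0.25))
```

The profit is flat at minus the premium for S(T) below K and then rises one-for-one with S(T), which is the shape the profit diagram describes.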
Forwards
A forward contract is an agreement to buy or sell an asset S at a certain future date T for a certain
price K. The agent who agrees to buy the underlying asset is said to have a long position, the
other agent assumes a short position. The settlement date is called the delivery date and the specified
price is referred to as the delivery price. The forward price f(t, T) is the delivery price which would
make the contract have zero value at time t. At the time the contract is set up, t = 0, the forward
price therefore equals the delivery price, hence f(0, T) = K. The forward prices f(t, T) need not
(and will not) necessarily be equal to the delivery price K during the lifetime of the contract.
The payoff from a long position in a forward contract on one unit of an asset with price S(T)
at the maturity of the contract is
S(T) − K.
Compared with a call option with the same maturity and strike price K we see that the investor
now faces a downside risk, too. He has the obligation to buy the asset for price K.
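The contrast between the two payoffs can be made concrete with a small Python sketch. The function names and the sample prices are assumptions for illustration only.

```python
def forward_payoff(s_T, delivery_price):
    """Payoff S(T) - K of a long forward position; it can be negative."""
    return s_T - delivery_price


def call_payoff(s_T, strike):
    """Payoff (S(T) - K)^+ of a European call; it is never negative."""
    return max(s_T - strike, 0.0)


# Illustrative numbers (assumed): K = 1 for both contracts.
for s_T in (0.5, 1.0, 1.5):
    print(s_T, forward_payoff(s_T, 1.0), call_payoff(s_T, 1.0))
```

For S(T) above K the two payoffs agree; for S(T) below K the forward loses K − S(T) while the call pays zero.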
Swaps
A swap is an agreement whereby two parties undertake to exchange, at known dates in the future,
various financial assets (or cash flows) according to a prearranged formula that depends on the
value of one or more underlying assets. Examples are currency swaps (exchange currencies) and
interest-rate swaps (exchange of fixed for floating set of interest payments).
1.1.2 Underlying securities
Stocks.
The basis of modern economic life (or of the capitalist system) is the limited liability company (UK:
& Co. Ltd, now plc, public limited company), the corporation (US: Inc.), ‘die Aktiengesellschaft’
(Germany: AG). Such companies are owned by their shareholders; the shares
• provide partial ownership of the company, pro rata with investment,
• have value, reflecting both the value of the company’s (real) assets and the earning power
of the company’s dividends.
With publicly quoted companies, shares are quoted and traded on the Stock Exchange. Stock is
the generic term for assets held in the form of shares.
Interest Rates.
The value of some financial assets depends solely on the level of interest rates (or yields), e.g.
Treasury (T) notes, T-bills, T-bonds, municipal and corporate bonds. These are fixed-income
securities by which national, state and local governments and large companies partially finance
their economic activity. Fixed-income securities require the payment of interest in the form of a
fixed amount of money at predetermined points in time, as well as repayment of the principal at
maturity of the security. Interest rates themselves are notional assets, which cannot be delivered.
Hedging exposure to interest rates is more complicated than hedging exposure to the price
movements of a certain stock. A whole term structure is necessary for a full description of the level of
interest rates, and for hedging purposes one must clarify the nature of the exposure carefully. We
will discuss the subject of modelling the term structure of interest rates in Chapter 5.
Currencies.
A currency is the denomination of the national units of payment (money) and as such is a financial
asset. The end of fixed exchange rates and the adoption of floating exchange rates resulted in a
sharp increase in exchange rate volatility. International trade, and economic activity involving it,
such as most manufacturing industry, involves dealing with more than one currency. A company
may wish to hedge adverse movements of foreign currencies and in doing so use derivative
instruments (see, for example, the discussion in Rennocks (1997) of the hedging problems British Steel
faced as a result of the sharp rise of the pound sterling in 96/97).
Indexes.
An index tracks the value of a (hypothetical) basket of stocks (FTSE100, S&P500, DAX), bonds
(REX), and so on. Again, these are not assets themselves. Derivative instruments on indexes
may be used for hedging if no derivative instruments on a particular asset (a stock, a bond, a
commodity) in question are available and if the correlation in movement between the index and
the asset is significant. Furthermore, institutional funds (such as pension funds, mutual funds
etc.), which manage large diversified stock portfolios, try to mimic particular stock indexes and
use derivatives on stock indexes as a portfolio management tool. On the other hand, a speculator
may wish to bet on a certain overall development in a market without exposing him/herself to a
particular asset.
A new kind of index was recently created by the Chicago Board of Trade (CBOT) with the Index of
Catastrophe Losses (CAT-Index). The growing number of huge natural disasters (such as
hurricane Andrew 1992, the Kobe earthquake 1995) has led the insurance industry to try to find
new ways of increasing its capacity to carry risks. The CBOT tried to capitalise on this problem
by launching a market in insurance derivatives. Currently investors are offered options on the
CAT-Index, thereby taking in effect the position of traditional reinsurance.
Derivatives are themselves assets (they are traded, have value, etc.) and so can be used as
underlying assets for new contingent claims: options on futures, options on baskets of options, etc.
These developments give rise to so-called exotic options, demanding a sophisticated mathematical
machinery to handle them.
1.1.3 Markets
Financial derivatives are basically traded in two ways: on organised exchanges and
over-the-counter (OTC). Organised exchanges are subject to regulatory rules, require a certain degree of
standardisation of the traded instruments (strike price, maturity dates, size of contract etc.) and
have a physical location at which trade takes place. Examples are the Chicago Board Options
Exchange (CBOE), which coincidentally opened in April 1973, the same year the seminal
contributions on option prices by Black and Scholes (1973) and Merton (1973) were published,
the London International Financial Futures Exchange (LIFFE) and the
Deutsche Terminbörse (DTB).
OTC trading takes place via computers and phones between various commercial and investment
banks (leading players include institutions such as Bankers Trust, Goldman Sachs (where Fischer
Black worked), Citibank, Chase Manhattan and Deutsche Bank).
Due to the growing sophistication of investors boosting demand for increasingly complicated,
made-to-measure products, the OTC market volume is currently (as of 1998) growing at a much
faster pace than trade on most exchanges.
1.1.4 Types of Traders
We can classify the traders of derivative securities into three different classes:
Hedgers.
Successful companies concentrate on economic activities in which they do best. They use the
market to insure themselves against adverse movements of prices, currencies, interest rates etc.
Hedging is an attempt to reduce exposure to risk a company already faces. Shorter Oxford English
Dictionary (OED): Hedge: ‘trans. To cover oneself against loss on (a bet etc.) by betting, etc.,
on the other side. Also fig. 1672.’
Speculators.
Speculators want to take a position in the market; they take the opposite position to hedgers.
Indeed, speculation is needed to make hedging possible, in that a hedger, wishing to lay off risk,
cannot do so unless someone is willing to take it on.
In speculation, available funds are invested opportunistically in the hope of making a profit: the
underlying itself is irrelevant to the investor (speculator), who is only interested in the potential
for possible profit that trade involving it may present. Hedging, by contrast, is typically engaged
in by companies who have to deal habitually in intrinsically risky assets such as foreign exchange
next year, commodities next year, etc. They may prefer to forgo the chance to make exceptional
windfall profits when future uncertainty works to their advantage by protecting themselves against
exceptional loss. This would serve to protect their economic base (trade in commodities, or
manufacture of products using these as raw materials), and also enable them to focus their effort
in their chosen area of trade or manufacture. For speculators, on the other hand, it is the market
(forex, commodities or whatever) itself which is their main forum of economic activity.
Arbitrageurs.
Arbitrageurs try to lock in riskless profit by simultaneously entering into transactions in two
or more markets. The very existence of arbitrageurs means that there can only be very small
arbitrage opportunities in the prices quoted in most financial markets. The underlying concept of
this book is the absence of arbitrage opportunities (cf. §1.2).
1.1.5 Modelling Assumptions
Contingent Claim Pricing.
The fundamental problem in the mathematics of financial derivatives is that of pricing. The
modern theory began in 1973 with the seminal Black-Scholes theory of option pricing, Black and
Scholes (1973), and Merton’s extensions of this theory, Merton (1973).
To expose the relevant features, we start by discussing contingent claim pricing in the simplest
(idealised) case and impose the following set of assumptions on the financial markets (we will
relax these assumptions subsequently):
No market frictions    No transaction costs, no bid/ask spread, no taxes,
                       no margin requirements, no restrictions on short sales
No default risk        Implying same interest for borrowing and lending
Competitive markets    Market participants act as price takers
Rational agents        Market participants prefer more to less
No arbitrage
Table 1.1: General assumptions
All real markets involve frictions; this assumption is made purely for simplicity. We develop
the theory of an ideal (frictionless) market so as to focus on the irreducible essentials of the
theory and as a first-order approximation to reality. Understanding frictionless markets is also a
necessary step to understand markets with frictions.
The risk of failure of a company (bankruptcy) is inescapably present in its economic activity:
death is part of life, for companies as for individuals. Those risks also appear at the national level:
quite apart from war, or economic collapse resulting from war, recent decades have seen default of
interest payments of international debt, or the threat of it. We ignore default risk for simplicity
while developing understanding of the principal aspects (for recent overviews on the subject we
refer the reader to Jameson (1995), Madan (1998)).
We assume financial agents to be price takers, not price makers. This implies that even large
amounts of trading in a security by one agent do not influence the security’s price. Hence agents
can buy or sell as much of any security as they wish without changing the security’s price.
To assume that market participants prefer more to less is a very weak assumption on the
preferences of market participants. Apart from this we will develop a preference-free theory.
The relaxation of these assumptions is subject to ongoing research and we will include
comments on this in the text.
We want to mention the special character of the no-arbitrage assumption. If we developed a
theoretical price of a financial derivative under our assumptions and this price did not coincide
with the price observed, we would take this as an arbitrage opportunity in our model and go on to
explore the consequences. This might lead to a relaxation of one of the other assumptions and a
restart of the procedure again with no-arbitrage assumed. The no-arbitrage assumption thus has
a special status that the others do not. It is the basis for the arbitrage pricing technique that we
shall develop, and we discuss it in more detail below.
1.2 Arbitrage
We now turn in detail to the concept of arbitrage, which lies at the centre of the relative pricing
theory. This approach works under very weak assumptions. We do not have to impose any
assumptions on the tastes (preferences) and beliefs of market participants. The economic agents
may be heterogeneous with respect to their preferences for consumption over time and with respect
to their expectations about future states of the world. All we assume is that they prefer more to
less, or more precisely, an increase in consumption without any costs will always be accepted.
The principle of arbitrage in its broadest sense is given by the following quotation from OED:
‘3 [Comm.]. The traffic in Bills of Exchange drawn on sundry places, and bought or sold in sight
of the daily quotations of rates in the several markets. Also, the similar traffic in Stocks. 1881.’
Used in this broad sense, the term covers financial activity of many kinds, including trade
in options, futures and foreign exchange. However, the term arbitrage is nowadays also used
in a narrower and more technical sense. Financial markets involve both riskless (bank account)
and risky (stocks, etc.) assets. To the investor, the only point of exposing oneself to risk is
the opportunity, or possibility, of realising a greater profit than the riskless procedure of putting
all one’s money in the bank (the mathematics of which, compound interest, does not require
a textbook treatment at this level). Generally speaking, the greater the risk, the greater the
return required to make investment an attractive enough prospect to attract funds. Thus, for
instance, a clearing bank lends to companies at higher rates than it pays to its account holders.
The companies’ trading activities involve risk; the bank tries to spread the risk over a range of
different loans, and makes its money on the difference between high/risky and low/riskless interest
rates.
The essence of the technical sense of arbitrage is that it should not be possible to guarantee a
profit without exposure to risk. Were it possible to do so, arbitrageurs (we use the French spelling,
as is customary) would do so, in unlimited quantity, using the market as a ‘money-pump’ to extract
arbitrarily large quantities of riskless profit. This would, for instance, make it impossible for the
market to be in equilibrium. We shall restrict ourselves to markets in equilibrium for simplicity,
so we must restrict ourselves to markets without arbitrage opportunities.
The above makes it clear that a market with arbitrage opportunities would be a disorderly
market, too disorderly to model. The remarkable thing is the converse. It turns out that the
minimal requirement of absence of arbitrage opportunities is enough to allow one to build a model
of a financial market which, while admittedly idealised, is realistic enough both to provide real
insight and to handle the mathematics necessary to price standard contingent claims. We shall
see that arbitrage arguments suffice to determine prices: the arbitrage pricing technique. For an
accessible treatment rather different to ours, see e.g. Allingham (1991).
To explain the fundamental arguments of the arbitrage pricing technique we use the following:
Example.
Consider an investor who acts in a market in which only three financial assets are traded: (riskless)
bonds B (bank account), stocks S and European Call options C with strike K = 1 on the stock.
The investor may invest today, time t = 0, in all three assets, leave his investment until time
t = T and get his returns back then (we assume the option expires at t = T, also). We assume
the current £ prices of the financial assets are given by
B(0) = 1, S(0) = 1, C(0) = 0.2
and that at t = T there can be only two states of the world: an up-state with £ prices
B(T, u) = 1.25, S(T, u) = 1.75, and therefore C(T, u) = 0.75,
and a down-state with £ prices
B(T, d) = 1.25, S(T, d) = 0.75, and therefore C(T, d) = 0.
Now our investor has a starting capital of £25, and divides it as in Table 1.2 below (we call such
a division a portfolio).

Financial asset    Number of units    Total amount in £
Bond               10                 10
Stock              10                 10
Call               25                 5
Table 1.2: Original portfolio

Depending on the state of the world at time t = T this portfolio will give
the £ return shown in Table 1.3.

State of the world    Bond    Stock    Call     Total
Up                    12.5    17.5     18.75    48.75
Down                  12.5    7.5      0        20
Table 1.3: Return of original portfolio

Can the investor do better? Let us consider the restructured
portfolio of Table 1.4. This portfolio requires only an investment of £24.6.

Financial asset    Number of units    Total amount in £
Bond               11.8               11.8
Stock              7                  7
Call               29                 5.8
Table 1.4: Restructured portfolio

We compute its return in the different possible future states (Table 1.5). We see that this portfolio
generates the same time t = T return while costing only £24.6 now, a saving of £0.4 against the
first portfolio. So the investor should use the second portfolio and have a free lunch today!
In the above example the investor was able to restructure his portfolio, reducing the current
(time t = 0) expenses without changing the return at the future date t = T in both possible states
of the world. So there is an arbitrage possibility in the above market situation, and the prices
quoted are not arbitrage (or market) prices. If we regard (as we shall do) the prices of the bond
and the stock (our underlying) as given, the option must be mispriced. We will develop in this
book models of financial markets (with different degrees of sophistication) which will allow us to
find methods to avoid (or to spot) such pricing errors. For the time being, let us have a closer
look at the differences between portfolio 1, consisting of 10 bonds, 10 stocks and 25 call options,
in short (10, 10, 25), and portfolio 2, of the form (11.8, 7, 29). The difference (from the point of
view of portfolio 1, say) is the following portfolio, D: (−1.8, 3, −4). Sell short three stocks (see
below), buy four options and put £1.8 in your bank account. The leftover is exactly the £0.4
of the example. But what is the effect of doing that? Let us consider the consequences in the
possible states of the world. From Table 1.6 below, we see in both cases that the effects of the
different positions of the portfolio offset each other. But clearly the portfolio generates an income
at t = 0 and is therefore itself an arbitrage opportunity.
If we only look at the position in bonds and stocks, we can say that this position covers us
against possible price movements of the option, i.e. having £1.8 in your bank account and being
three stocks short has the same time t = T effect as having four call options outstanding against
us. We say that the bond/stock position is a hedge against the position in options.
State of the world    Bond     Stock    Call     Total
Up                    14.75    12.25    21.75    48.75
Down                  14.75    5.25     0        20
Table 1.5: Return of the restructured portfolio

World is in state up              World is in state down
exercise option         3         option is worthless     0
buy 3 stocks at 1.75    −5.25     buy 3 stocks at 0.75    −2.25
sell bond               2.25      sell bond               2.25
Balance                 0         Balance                 0
Table 1.6: Difference portfolio

Let us emphasise that the above arguments were independent of the preferences and plans of
the investor. They were also independent of the interpretation of t = T: it could be a fixed time,
maybe a year from now, but it could refer to the happening of a certain event, e.g. a stock hitting
a certain level, exchange rates at a certain level, etc.
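The arithmetic of this example can be checked with a short Python sketch. It uses exact fractions to avoid rounding noise; the names (`value`, `replicate_call` and the price tuples) are assumptions introduced for illustration. The last step goes slightly beyond the example: it solves for the bond/stock position that reproduces the payoff of one call in both states and reports its cost, which is the call price consistent with the given bond and stock prices in this two-state market.

```python
from fractions import Fraction as F

# Prices of (bond, stock, call) today and in the two states at t = T.
PRICE_0 = (F(1), F(1), F(1, 5))
PRICE_UP = (F(5, 4), F(7, 4), F(3, 4))
PRICE_DOWN = (F(5, 4), F(3, 4), F(0))


def value(portfolio, prices):
    """Value of a portfolio given as (bonds, stocks, calls)."""
    return sum(n * p for n, p in zip(portfolio, prices))


portfolio_1 = (F(10), F(10), F(25))
portfolio_2 = (F(59, 5), F(7), F(29))  # 11.8 bonds, 7 stocks, 29 calls

for name, pf in (("portfolio 1", portfolio_1), ("portfolio 2", portfolio_2)):
    print(name, float(value(pf, PRICE_0)),
          float(value(pf, PRICE_UP)), float(value(pf, PRICE_DOWN)))


def replicate_call():
    """Bond and stock holdings that match the call payoff in both states."""
    bond_T = PRICE_UP[0]  # the bond pays the same in both states
    s_u, c_u = PRICE_UP[1], PRICE_UP[2]
    s_d, c_d = PRICE_DOWN[1], PRICE_DOWN[2]
    stocks = (c_u - c_d) / (s_u - s_d)
    bonds = (c_u - stocks * s_u) / bond_T
    return bonds, stocks


bonds, stocks = replicate_call()
replication_cost = bonds * PRICE_0[0] + stocks * PRICE_0[1]
print(float(bonds), float(stocks), float(replication_cost))
```

Under these assumptions the replicating position costs more than the quoted C(0) = 0.2, which is consistent with the arbitrage above consisting of buying calls and selling stock short.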
1.3 Arbitrage Relationships
We will in this section use arbitrage-based arguments (arbitrage pricing technique) to develop
general bounds on the value of options. Such bounds, deduced from the underlying assumption
that no arbitrage should be possible, allow one to test the plausibility of sophisticated financial
market models.
In our analysis here we use stocks as the underlying.
1.3.1 Fundamental Determinants of Option Values
We consider the determinants of the option value in Table 1.7 below. Since we restrict ourselves
to non-dividend-paying stocks we don't have to consider cash dividends as another natural determinant.
Current stock price S(t)
Strike price K
Stock volatility σ
Time to expiry T −t
Interest rates r
Table 1.7: Determinants aﬀecting option value
We now examine the eﬀects of the single determinants on the option prices (all other factors
remaining unchanged).
We saw that at expiry the only variables that mattered were the stock price $S(T)$ and strike
price $K$: remember the payoffs $C = (S(T) - K)^+$, $P = (S(T) - K)^-$ $(:= \max\{K - S(T), 0\})$.
Looking at the payoﬀs, we see that an increase in the stock price will increase (decrease) the value
of a call (put) option (recall all other factors remain unchanged). The opposite happens if the
strike price is increased: the price of a call (put) option will go down (up).
When we buy an option, we bet on a favourable outcome. The actual outcome is uncertain; its
uncertainty is represented by a probability density; favourable outcomes are governed by the tails
of the density (right or left tail for a call or a put option). An increase in volatility ﬂattens out the
density and thickens the tails, so increases the value of both call and put options. Of course, this
argument again relies on the fact that we don't suffer from the more severe unfavourable outcomes
that also become more likely as volatility increases: we have the right, but not the obligation, to exercise
the option.
A heuristic statement of the eﬀects of time to expiry or interest rates is not so easy to make.
In the simplest of models (no dividends, interest rates remain ﬁxed during the period under
consideration), one might argue that the longer the time to expiry the more can happen to the
price of a stock. Therefore a longer period increases the possibility of movements of the stock price
and hence the value of a call (put) should be higher the more time remains before expiry. But
only the owner of an American-type option can react immediately to favourable price movements,
whereas the owner of a European option has to wait until expiry, and only the stock price then
is relevant. Observe the contrast with volatility: an increase in volatility increases the likelihood
of favourable outcomes at expiry, whereas the stock price movements before expiry may cancel
each other out. A longer time until expiry might also increase the possibility of adverse effects
from which the stock price has to recover before expiry. We see that by using purely heuristic
arguments we are not able to make precise statements. One can, however, show by explicit
arbitrage arguments that an increase in time to expiry leads to an increase in the value of call
options as well as put options. (We should point out that in case of a dividend-paying stock the
statement is not true in general for European-type options.)
To qualify the eﬀects of the interest rate we have to consider two aspects. An increase in the
interest rate tends to increase the expected growth rate in an economy and hence the stock price
tends to increase. On the other hand, the present value of any future cash ﬂows decreases. These
two eﬀects both decrease the value of a put option, while the ﬁrst eﬀect increases the value of a
call option. However, it can be shown that the ﬁrst eﬀect always dominates the second eﬀect, so
the value of a call option will increase with increasing interest rates.
The above heuristic statements, in particular the last, will be verified again in appropriate
models of financial markets, see §4.5.4 and §6.2.3.
We summarise in Table 1.8 the effect of an increase of one of the parameters on the value of
options on non-dividend-paying stocks while keeping all others fixed:
Parameter (increase) Call Put
Stock price Positive Negative
Strike price Negative Positive
Volatility Positive Positive
Interest rates Positive Negative
Time to expiry Positive Positive
Table 1.8: Eﬀects of parameters
We would like to emphasise again that these results all assume that all other variables remain
ﬁxed, which of course is not true in practice. For example stock prices tend to fall (rise) when
interest rates rise (fall) and the observable eﬀect on option prices may well be diﬀerent from the
eﬀects deduced under our assumptions.
Cox and Rubinstein (1985), p. 37–39, discuss other possible determining factors of option
value, such as expected rate of growth of the stock price, additional properties of stock price
movements, investors' attitudes toward risk, characteristics of other assets and institutional environment
(tax rules, margin requirements, transaction costs, market structure). They show that
in many important circumstances the influence of these variables is marginal or even vanishing.
1.3.2 Arbitrage bounds
We now use the principle of no-arbitrage to obtain bounds for option prices. Such bounds, deduced
from the underlying assumption that no arbitrage should be possible, allow one to test
the plausibility of sophisticated financial market models. We focus on European options (puts
and calls) with identical underlying (say a stock $S$), strike $K$ and expiry date $T$. Furthermore we
assume the existence of a risk-free bank account (bond) with constant interest rate $r$ (continuously
compounded) during the time interval $[0, T]$. We start with a fundamental relationship:
Proposition 1.3.1. We have the following put-call parity between the prices of the underlying
asset $S$ and European call and put options on stocks that pay no dividends:
\[ S + P - C = K e^{-r(T-t)}. \tag{1.1} \]
Proof. Consider a portfolio consisting of one stock, one put and a short position in one call
(the holder of the portfolio has written the call); write $V(t)$ for the value of this portfolio. Then
\[ V(t) = S(t) + P(t) - C(t) \]
for all $t \in [0, T]$. At expiry we have
\[ V(T) = S(T) + (S(T) - K)^- - (S(T) - K)^+ = S(T) + K - S(T) = K. \]
This portfolio thus guarantees a payoff $K$ at time $T$. Using the principle of no-arbitrage, the
value of the portfolio must at any time $t$ correspond to the value of a sure payoff $K$ at $T$, that is
$V(t) = K e^{-r(T-t)}$.
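As a quick numerical sanity check of this argument (a minimal sketch that is not part of the original notes; all function names are hypothetical), the following Python snippet evaluates the terminal payoff of the portfolio "one stock, one put, short one call" for several terminal stock prices and rearranges (1.1) to obtain a put price from a given call price.

```python
import math


def call_payoff(s_T, strike):
    """Payoff (S(T) - K)^+ of a European call at expiry."""
    return max(s_T - strike, 0.0)


def put_payoff(s_T, strike):
    """Payoff max{K - S(T), 0} of a European put at expiry."""
    return max(strike - s_T, 0.0)


def parity_portfolio_payoff(s_T, strike):
    """Terminal value V(T) of: long one stock, long one put, short one call."""
    return s_T + put_payoff(s_T, strike) - call_payoff(s_T, strike)


def put_from_call(call_price, stock_price, strike, rate, time_to_expiry):
    """Rearranged put-call parity (1.1): P = C - S + K * exp(-r (T - t))."""
    return call_price - stock_price + strike * math.exp(-rate * time_to_expiry)


# Whatever the terminal stock price, the portfolio pays exactly the strike.
for s_T in [0.0, 7.5, 15.0, 20.0, 100.0]:
    print(s_T, parity_portfolio_payoff(s_T, 15.0))

# Illustrative numbers (S = 10, K = 15, r = 0, C = 1) borrowed from the
# one-period example of Section 1.4.1.
print(put_from_call(1.0, 10.0, 15.0, 0.0, 1.0))  # 6.0
```

With zero interest the put is worth 6, which agrees with pricing its payoff (0 in the up-state, 7.5 in the down-state) by replication in that example.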
Having established (1.1), we concentrate on European calls in the following.
Proposition 1.3.2. The following bounds hold for European call options:
\[ \max\left\{ S(t) - e^{-r(T-t)} K,\ 0 \right\} = \left( S(t) - e^{-r(T-t)} K \right)^+ \le C(t) \le S(t). \]
Proof. That C ≥ 0 is obvious, otherwise ‘buying’ the call would give a riskless proﬁt now and
no obligation later.
Similarly the upper bound C ≤ S must hold, since violation would mean that the right to
buy the stock has a higher value than owning the stock. This must be false, since a stock oﬀers
additional beneﬁts.
Now from put-call parity (1.1) and the fact that $P \ge 0$ (use the same argument as above), we
have
\[ S(t) - K e^{-r(T-t)} = C(t) - P(t) \le C(t), \]
which proves the last assertion.
It is immediately clear that an American call option can never be worth less than the corresponding
European call option, for the American option has the added feature of being able to be
exercised at any time until the maturity date. Hence (with the obvious notation): $C_A(t) \ge C_E(t)$.
The striking result we are going to show (due to R.C. Merton in 1973, (Merton 1990), Theorem
8.2) is:
Proposition 1.3.3. For a non-dividend-paying stock we have
\[ C_A(t) = C_E(t). \tag{1.2} \]
Proof. Exercising the American call at time $t < T$ generates the cash flow $S(t) - K$. From
Proposition 1.3.2 we know that the value of the call must be greater than or equal to $S(t) - K e^{-r(T-t)}$,
which is greater than $S(t) - K$ (for $r > 0$). Hence selling the call would have realised a higher cash flow and
the early exercise of the call was suboptimal.
Remark 1.3.1. Qualitatively, there are two reasons why an American call should not be exercised
early:
(i) Insurance. An investor who holds a call option instead of the underlying stock is 'insured'
against a fall in stock price below $K$, and if he exercises early, he loses this insurance.
(ii) Interest on the strike price. When the holder exercises the option, he buys the stock and pays
the strike price, K. Early exercise at t < T deprives the holder of the interest on K between
times t and T: the later he pays out K, the better.
We remark that an American put oﬀers additional value compared to a European put.
1.4 Single-Period Market Models
1.4.1 A fundamental example
We consider a one-period model, i.e. we allow trading only at $t = 0$ and $t = T = 1$ (say). Our aim
is to value at $t = 0$ a European derivative on a stock $S$ with maturity $T$.
First idea. Model $S_T$ as a random variable on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The derivative
is given by $H = f(S_T)$, i.e. it is a random variable (for a suitable function $f(\cdot)$). We could then
price the derivative using some discount factor $\beta$ by using the expected value of the discounted
future payoff:
\[ H_0 = \mathbb{E}(\beta H). \tag{1.3} \]
Problem. How should we pick the probability measure $\mathbb{P}$? According to their preferences
investors will have different opinions about the distribution of the price $S_T$.
Black-Scholes-Merton (Ross) approach. Use the no-arbitrage principle and construct a
hedging portfolio using only known (and already priced) securities to duplicate the payoff $H$. We
assume
1. Investors are non-satiable, i.e. they always prefer more to less.
2. Markets do not allow arbitrage, i.e. the possibility of risk-free profits.
From the no-arbitrage principle we see:
If it is possible to duplicate the payoff $H$ of a derivative using a portfolio $V$ of underlying (basic)
securities, i.e. $H(\omega) = V(\omega)$ for all $\omega$, the price of the portfolio at $t = 0$ must equal the price of the
derivative at $t = 0$.
Let us assume there are two tradeable assets
• a risk-free bond (bank account) with $B(0) = 1$ and $B(T) = 1$, that is the interest rate $r = 0$
and the discount factor $\beta(t) = 1$. (In this context we use $\beta(t) = 1/B(t)$ as the discount
factor.)
• a risky stock $S$ with $S(0) = 10$ and two possible values at $t = T$:
\[ S(T) = \begin{cases} 20 & \text{with probability } p, \\ 7.5 & \text{with probability } 1 - p. \end{cases} \]
We call this setting a $(B, S)$-market. The problem is to price a European call at $t = 0$ with
strike $K = 15$ and maturity $T$, i.e. the random payoff $H = (S(T) - K)^+$. We can evaluate the
call in every possible state at $t = T$ and see $H = 5$ (if $S(T) = 20$) with probability $p$ and $H = 0$
(if $S(T) = 7.5$) with probability $1 - p$. This is illustrated in Figure 1.2.
The key idea now is to try to find a portfolio combining bond and stock, which synthesizes the
cash flow of the option. If such a portfolio exists, holding this portfolio today would be equivalent
to holding the option: they would produce the same cash flow in the future. Therefore the price
of the option should be the same as the price of constructing the portfolio, otherwise investors
could just restructure their holdings in the assets and obtain a risk-free profit today.

[Figure 1.2: One-period example. Today: $S_0 = 10$, $B_0 = 1$, $H_0 = ?$. Up-state after one period: $S_1 = 20$, $B_1 = 1$, $H_1 = \max\{20 - 15, 0\} = 5$. Down-state after one period: $S_1 = 7.5$, $B_1 = 1$, $H_1 = \max\{7.5 - 15, 0\} = 0$.]
We briefly present the construction of the portfolio $\theta = (\theta_0, \theta_1)$, which in the current setting is
just a simple exercise in linear algebra. If we buy $\theta_1$ stocks and invest £$\theta_0$ in the bank account,
then today's value of the portfolio is
\[ V(0) = \theta_0 + \theta_1 S(0). \]
In state 1 the stock price is £20 and the value of the option £5, so
\[ \theta_0 + 20\,\theta_1 = 5. \]
In state 2 the stock price is £7.5 and the value of the option £0, so
\[ \theta_0 + 7.5\,\theta_1 = 0. \]
We solve this and get $\theta_0 = -3$ and $\theta_1 = 0.4$.
So the value of our portfolio at time 0 in £ is
\[ V(0) = -3 B(0) + 0.4\, S(0) = -3 + 0.4 \cdot 10 = 1. \]
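The replication step can be reproduced with exact rational arithmetic. The snippet below is only an illustrative sketch (the helper `replicate_two_state` is a hypothetical name, not something defined in these notes); it solves the two linear equations for the bond and stock holdings and prices the call as the cost of the replicating portfolio.

```python
from fractions import Fraction


def replicate_two_state(s_up, s_down, h_up, h_down, bond_payoff=Fraction(1)):
    """Solve theta0 * B(T) + theta1 * S(T) = H in both states.

    Returns (theta0, theta1): units of the bond and of the stock.
    """
    if s_up == s_down:
        raise ValueError("the stock must take two distinct values")
    theta1 = (h_up - h_down) / (s_up - s_down)
    theta0 = (h_up - theta1 * s_up) / bond_payoff
    return theta0, theta1


# Data of the example: S(0) = 10, S(T) in {20, 7.5}, strike K = 15, B = 1.
s0 = Fraction(10)
s_up, s_down = Fraction(20), Fraction(15, 2)
strike = Fraction(15)

h_up = max(s_up - strike, Fraction(0))      # call payoff in the up-state: 5
h_down = max(s_down - strike, Fraction(0))  # call payoff in the down-state: 0

theta0, theta1 = replicate_two_state(s_up, s_down, h_up, h_down)
price = theta0 * 1 + theta1 * s0            # V(0) = theta0 * B(0) + theta1 * S(0)

print(theta0, theta1, price)  # -3 2/5 1
```

Using `Fraction` rather than floats keeps the result exactly equal to the values $\theta_0 = -3$, $\theta_1 = 0.4$ and $V(0) = 1$ derived in the text.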
$V(0)$ is called the no-arbitrage price. Every other price allows a riskless profit, since if the option
is too cheap, buy it and finance yourself by selling short the above portfolio (i.e. sell the portfolio
without possessing it and promise to deliver it at time $T = 1$; this is risk-free because you own
the option). If on the other hand the option is too dear, write it (i.e. sell it in the market) and
cover yourself by setting up the above portfolio.
We see that the no-arbitrage price is independent of the individual preferences of the investor
(given by certain probability assumptions about the future, i.e. a probability measure $\mathbb{P}$). But
one can identify a special, so-called risk-neutral, probability measure $\mathbb{P}^*$, such that
\[ H_0 = \mathbb{E}^*(\beta H) = p^* \beta (S_1 - K) + (1 - p^*) \cdot 0 = 1. \]
In the above example we get from $1 = p^* \cdot 5 + (1 - p^*) \cdot 0$ that $p^* = 0.2$. This probability measure
$\mathbb{P}^*$ is equivalent to $\mathbb{P}$, and the discounted stock price process, i.e. $\beta_t S_t$, $t = 0, 1$, is a $\mathbb{P}^*$-martingale.
In the above example this corresponds to $S(0) = p^* S(T)^{\text{up}} + (1 - p^*) S(T)^{\text{down}}$, that
is $S(0) = \mathbb{E}^*(\beta S(T))$.
We will show that the above generalizes. Indeed, we will find that the no-arbitrage condition
is equivalent to the existence of an equivalent martingale measure (first fundamental theorem of
asset pricing) and that the property that we can price assets using the expectation operator is
equivalent to the uniqueness of the equivalent martingale measure.
Let us consider the construction of hedging strategies from a different perspective. Consider
a one-period $(B, S)$-market setting with discount factor $\beta = 1$. Assume we want to replicate a
derivative $H$ (that is a random variable on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$). For each hedging
strategy $\theta = (\theta_0, \theta_1)$ we have an initial value of the portfolio $V(0) = \theta_0 + \theta_1 S(0)$ and a time
$t = T$ value of $V(T) = \theta_0 + \theta_1 S(T)$. We can write $V(T) = V(0) + (V(T) - V(0))$ with
$G(T) = V(T) - V(0) = \theta_1 (S(T) - S(0))$ the gains from trading. So the costs $C(0)$ of setting up this
portfolio at time $t = 0$ are given by $C(0) = V(0)$, while maintaining (or achieving) a perfect hedge
at $t = T$ requires an additional capital of $C(T) = H - V(T)$. Thus we have two possibilities for
finding 'optimal' hedging strategies:
• Mean-variance hedging. Find $\theta_0$ (or alternatively $V(0)$) and $\theta_1$ such that
\[ \mathbb{E}\left[ (H - V(T))^2 \right] = \mathbb{E}\left[ \left( H - (V(0) + \theta_1 (S(T) - S(0))) \right)^2 \right] \to \min. \]
• Risk-minimal hedging. Minimize the cost from trading, i.e. an appropriate functional involving
the costs $C(t)$, $t = 0, T$.
In our example mean-variance hedging corresponds to the standard linear regression problem, and
so
\[ \theta_1 = \frac{\mathrm{Cov}(H, S(T) - S(0))}{\mathrm{Var}(S(T) - S(0))} \quad \text{and} \quad V_0 = \mathbb{E}(H) - \theta_1 \mathbb{E}(S(T) - S(0)). \]
We can also calculate the optimal value of the risk functional
\[ R_{\min} = \mathrm{Var}(H) - \theta_1^2\, \mathrm{Var}(S(T) - S(0)) = \mathrm{Var}(H)(1 - \rho^2), \]
where $\rho$ is the correlation coefficient of $H$ and $S(T)$. Therefore we can't expect a perfect hedge
in general. If however $|\rho| = 1$, i.e. $H$ is a linear function of $S(T)$, a perfect hedge is possible. We
call a market complete if a perfect hedge is possible for all contingent claims.
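To see the regression formulas at work, the following sketch (hypothetical function name; the probabilities 1/2 and 1/3 are assumptions chosen purely for illustration) applies them to the two-state call example. Since the call payoff is a linear function of $S(T)$ on two states, one expects $R_{\min} = 0$ and the same hedge as obtained by replication, whatever $\mathbb{P}$ is.

```python
from fractions import Fraction


def mean_variance_hedge(probs, payoffs, stock_T, s0):
    """Return (theta1, V0, R_min) from the regression formulas in the text."""

    def mean(values):
        return sum(p * v for p, v in zip(probs, values))

    gains = [s - s0 for s in stock_T]          # S(T) - S(0) in each state
    mean_h, mean_g = mean(payoffs), mean(gains)
    cov_hg = mean([(h - mean_h) * (g - mean_g) for h, g in zip(payoffs, gains)])
    var_g = mean([(g - mean_g) ** 2 for g in gains])
    var_h = mean([(h - mean_h) ** 2 for h in payoffs])

    theta1 = cov_hg / var_g
    v0 = mean_h - theta1 * mean_g
    r_min = var_h - theta1 ** 2 * var_g
    return theta1, v0, r_min


stock_T = [Fraction(20), Fraction(15, 2)]
payoffs = [Fraction(5), Fraction(0)]           # call with strike 15
s0 = Fraction(10)

# Two different (assumed) subjective measures give the same hedge.
hedge_half = mean_variance_hedge([Fraction(1, 2), Fraction(1, 2)], payoffs, stock_T, s0)
hedge_third = mean_variance_hedge([Fraction(1, 3), Fraction(2, 3)], payoffs, stock_T, s0)

print(hedge_half)
print(hedge_third)
```

Both calls return $\theta_1 = 2/5$, $V_0 = 1$ and $R_{\min} = 0$, matching the replicating portfolio of the fundamental example.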
1.4.2 A single-period model
We proceed to formalise and extend the above example and present in detail a simple model of
a ﬁnancial market. Despite its simplicity it already has all the key features needed in the sequel
(and the reader should not hesitate to come back here from more advanced chapters to see the
bare concepts again).
We introduce in passing a little of the terminology and notation of Chapter 4; see also Harrison
and Kreps (1979). We use some elementary vocabulary from probability theory, which is explained
in detail in Chapter 2.
We consider a single-period model, i.e. we have two time indices, say $t = 0$, which is the
current time (date), and $t = T$, which is the terminal date for all economic activities considered.
The financial market contains $d + 1$ traded financial assets, whose prices at time $t = 0$ are
denoted by the vector $S(0) \in \mathbb{R}^{d+1}$,
\[ S(0) = (S_0(0), S_1(0), \ldots, S_d(0))' \]
(where $'$ denotes the transpose of a vector or matrix). At time $T$, the owner of financial asset number
$i$ receives a random payment depending on the state of the world. We model this randomness
by introducing a finite probability space $(\Omega, \mathcal{F}, \mathbb{P})$, with a finite number $|\Omega| = N$ of points (each
corresponding to a certain state of the world) $\omega_1, \ldots, \omega_j, \ldots, \omega_N$, each with positive probability:
$\mathbb{P}(\{\omega\}) > 0$, which means that every state of the world is possible. $\mathcal{F}$ is the set of subsets of $\Omega$
(events that can happen in the world) on which $\mathbb{P}(\cdot)$ is defined (we can quantify how probable
these events are), here $\mathcal{F} = \mathcal{P}(\Omega)$, the set of all subsets of $\Omega$. (In more complicated models it is
not possible to define a probability measure on all subsets of the state space $\Omega$, see §2.1.) We can
now write the random payment arising from financial asset $i$ as
\[ S_i(T) = (S_i(T, \omega_1), \ldots, S_i(T, \omega_j), \ldots, S_i(T, \omega_N))'. \]
At time $t = 0$ the agents can buy and sell financial assets. The portfolio position of an individual
agent is given by a trading strategy $\varphi$, which is an $\mathbb{R}^{d+1}$ vector,
\[ \varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)'. \]
Here $\varphi_i$ denotes the quantity of the $i$th asset bought at time $t = 0$, which may be negative as well
as positive (recall we allow short positions).
The dynamics of our model using the trading strategy $\varphi$ are as follows: at time $t = 0$ we invest
the amount $S(0)'\varphi = \sum_{i=0}^{d} \varphi_i S_i(0)$ and at time $t = T$ we receive the random payment
$S(T, \omega)'\varphi = \sum_{i=0}^{d} \varphi_i S_i(T, \omega)$ depending on the realised state $\omega$ of the world. Using the $(d + 1) \times N$ matrix $\mathbb{S}$,
whose columns are the vectors $S(T, \omega)$, we can write the possible payments more compactly as $\mathbb{S}'\varphi$.
What does an arbitrage opportunity mean in our model? As arbitrage is 'making something out
of nothing', an arbitrage strategy is a vector $\varphi \in \mathbb{R}^{d+1}$ such that $S(0)'\varphi = 0$, i.e. our net investment
at time $t = 0$ is zero, and
\[ S(T, \omega)'\varphi \ge 0 \ \ \forall \omega \in \Omega \quad \text{and there exists an } \omega \in \Omega \text{ such that } S(T, \omega)'\varphi > 0. \]
We can equivalently formulate this as: $S(0)'\varphi < 0$, i.e. we borrow money for consumption at time
$t = 0$, and
\[ S(T, \omega)'\varphi \ge 0 \ \ \forall \omega \in \Omega, \]
i.e. we don't have to repay anything at $t = T$. Now this means we had a 'free lunch' at $t = 0$ at
the market's expense.
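The two formulations of an arbitrage strategy translate directly into a finite check. The sketch below is illustrative only: `is_arbitrage` is a hypothetical helper, and the "mispriced" market with a stock price of 80 is an invented example, not taken from the notes.

```python
def is_arbitrage(prices_today, payoffs_by_state, phi):
    """Check whether the strategy phi is an arbitrage in a finite-state model.

    prices_today:     list of S_i(0), i = 0..d
    payoffs_by_state: one list (S_0(T, w), ..., S_d(T, w)) per state w
    phi:              holdings (phi_0, ..., phi_d)
    """
    cost = sum(s * q for s, q in zip(prices_today, phi))
    payoffs = [sum(s * q for s, q in zip(state, phi)) for state in payoffs_by_state]
    if any(v < 0 for v in payoffs):
        return False  # a loss is possible in some state
    # Either money is taken out today, or the position is free today and
    # pays something strictly positive in at least one state.
    return cost < 0 or (cost == 0 and any(v > 0 for v in payoffs))


states = [[1, 180], [1, 90]]  # (bond, stock) payoffs in the up- and down-state

# Stock priced at 150: borrowing 150 to buy one stock can lose money.
print(is_arbitrage([1, 150], states, [-150, 1]))  # False

# Hypothetical mispricing, stock at 80: the same kind of trade is a free lunch.
print(is_arbitrage([1, 80], states, [-80, 1]))    # True
```

The function only tests a given strategy; deciding whether any arbitrage strategy exists at all is the subject of the characterisation that follows.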
We agreed that we should not have arbitrage opportunities in our model. The consequences
of this assumption are surprisingly far-reaching.
So assume that there are no arbitrage opportunities. If we analyse the structure of our model
above, we see that every statement can be formulated in terms of Euclidean geometry or linear
algebra. For instance, absence of arbitrage means that the space
\[ \Gamma = \left\{ \begin{pmatrix} x \\ y \end{pmatrix},\ x \in \mathbb{R},\ y \in \mathbb{R}^N : x = -S(0)'\varphi,\ y = \mathbb{S}'\varphi,\ \varphi \in \mathbb{R}^{d+1} \right\} \]
and the space
\[ \mathbb{R}^{N+1}_+ = \left\{ z \in \mathbb{R}^{N+1} : z_i \ge 0 \ \forall\, 0 \le i \le N,\ \exists\, i \text{ such that } z_i > 0 \right\} \]
have no common points. A statement like that naturally points to the use of a separation theorem
for convex subsets, the separating hyperplane theorem (see e.g. Rockafellar (1970) for an account
of such results, or Appendix A). Using such a theorem we come to the following characterisation
of no arbitrage.
Theorem 1.4.1. There is no arbitrage if and only if there exists a vector
\[ \psi \in \mathbb{R}^N, \quad \psi_i > 0 \ \ \forall\, 1 \le i \le N, \]
such that
\[ \mathbb{S}\psi = S(0). \tag{1.4} \]
Proof. The implication '⇐' follows straightforwardly: assume that
$S(T, \omega)'\varphi \ge 0$, $\omega \in \Omega$, for a vector $\varphi \in \mathbb{R}^{d+1}$. Then
\[ S(0)'\varphi = (\mathbb{S}\psi)'\varphi = \psi'\mathbb{S}'\varphi \ge 0, \]
since $\psi_i > 0$ for all $1 \le i \le N$. So no arbitrage opportunities exist.
To show the implication '⇒' we use a variant of the separating hyperplane theorem. Absence
of arbitrage means that $\Gamma$ and $\mathbb{R}^{N+1}_+$ have no common points. This means that $K \subset \mathbb{R}^{N+1}_+$ defined
by
\[ K = \left\{ z \in \mathbb{R}^{N+1}_+ : \sum_{i=0}^{N} z_i = 1 \right\} \]
and $\Gamma$ do not meet. But $K$ is a compact and convex set, and by the separating hyperplane theorem
(Appendix C), there is a vector $\lambda \in \mathbb{R}^{N+1}$ such that for all $z \in K$
\[ \lambda' z > 0 \]
but for all $(x, y)' \in \Gamma$
\[ \lambda_0 x + \lambda_1 y_1 + \ldots + \lambda_N y_N = 0. \]
Now choosing $z_i = 1$ successively we see that $\lambda_i > 0$, $i = 0, \ldots, N$, and hence by normalising we
get $\psi = \lambda / \lambda_0$ with $\psi_0 = 1$. Now set $x = -S(0)'\varphi$ and $y = \mathbb{S}'\varphi$ and the claim follows.
The vector $\psi$ is called a state-price vector. We can think of $\psi_j$ as the marginal cost of obtaining
an additional unit of account in state $\omega_j$. We can now reformulate the above statement to:
There is no arbitrage if and only if there exists a state-price vector.
Using a further normalisation, we can clarify the link to our probabilistic setting. Given a
state-price vector $\psi = (\psi_1, \ldots, \psi_N)$, we set $\psi_0 = \psi_1 + \ldots + \psi_N$ and for any state $\omega_j$ write
$q_j = \psi_j / \psi_0$. We can now view $(q_1, \ldots, q_N)$ as probabilities and define a new probability measure
on $\Omega$ by $\mathbb{Q}(\{\omega_j\}) = q_j$, $j = 1, \ldots, N$. Using this probability measure, we see that for each asset $i$
we have the relation
\[ \frac{S_i(0)}{\psi_0} = \sum_{j=1}^{N} q_j S_i(T, \omega_j) = \mathbb{E}_{\mathbb{Q}}(S_i(T)). \]
Hence the normalized price of the financial security $i$ is just its expected payoff under some specially
chosen 'risk-neutral' probabilities. Observe that we didn't make any use of the specific probability
measure $\mathbb{P}$ in our given probability space.
So far we have not specified anything about the denomination of prices. From a technical point
of view we could choose any asset $i$ as long as its price vector $(S_i(0), S_i(T, \omega_1), \ldots, S_i(T, \omega_N))'$
only contains positive entries, and express all other prices in units of this asset. We say that we
use this asset as numéraire. Let us emphasise again that arbitrage opportunities do not depend
on the chosen numéraire. It turns out that appropriate choice of the numéraire facilitates the
probability-theoretic analysis in complex settings, and we will discuss the choice of the numéraire
in detail later on.
For simplicity, let us assume that asset 0 is a riskless bond paying one unit in all states $\omega \in \Omega$
at time $T$. This means that $S_0(T, \omega) = 1$ in all states of the world $\omega \in \Omega$. By the above analysis
we must have
\[ \frac{S_0(0)}{\psi_0} = \sum_{j=1}^{N} q_j S_0(T, \omega_j) = \sum_{j=1}^{N} q_j \cdot 1 = 1, \]
and $\psi_0$ is the discount on riskless borrowing. Introducing an interest rate $r$, we must have
$S_0(0) = \psi_0 = (1 + r)^{-T}$. We can now express the price of asset $i$ at time $t = 0$ as
\[ S_i(0) = \sum_{j=1}^{N} q_j \frac{S_i(T, \omega_j)}{(1 + r)^T} = \mathbb{E}_{\mathbb{Q}}\left( \frac{S_i(T)}{(1 + r)^T} \right). \]
We rewrite this as
\[ \frac{S_i(0)}{(1 + r)^0} = \mathbb{E}_{\mathbb{Q}}\left( \frac{S_i(T)}{(1 + r)^T} \right). \]
In the language of probability theory we just have shown that the processes $S_i(t)/(1 + r)^t$, $t = 0, T$,
are $\mathbb{Q}$-martingales. (Martingales are the probabilists' way of describing fair games: see Chapter 3.)
It is important to notice that under the given probability measure $\mathbb{P}$ (which reflects an individual
agent's belief or the markets' belief) the processes $S_i(t)/(1 + r)^t$, $t = 0, T$, generally do not form
$\mathbb{P}$-martingales.
We use this to shed light on the relationship of the probability measures $\mathbb{P}$ and $\mathbb{Q}$. Since
$\mathbb{Q}(\{\omega\}) > 0$ for all $\omega \in \Omega$ the probability measures $\mathbb{P}$ and $\mathbb{Q}$ are equivalent and (see Chapters 2
and 3) because of the argument above we call $\mathbb{Q}$ an equivalent martingale measure. So we arrived
at yet another characterisation of arbitrage:
There is no arbitrage if and only if there exists an equivalent martingale measure.
We also see that risk-neutral pricing corresponds to using the expectation operator with respect
to an equivalent martingale measure. This concept lies at the heart of stochastic (mathematical)
finance and will be the golden thread (or roter Faden) throughout this book.
We now know how the given prices of our $(d + 1)$ financial assets should be related in order to
exclude arbitrage opportunities, but how should we price a newly introduced financial instrument?
We can represent this financial instrument by its random payments
$\delta(T) = (\delta(T, \omega_1), \ldots, \delta(T, \omega_j), \ldots, \delta(T, \omega_N))'$
(observe that $\delta(T)$ is a vector in $\mathbb{R}^N$) at time $t = T$ and ask for its price $\delta(0)$ at time $t = 0$. The
natural idea is to use an equivalent probability measure $\mathbb{Q}$ and set
\[ \delta(0) = \mathbb{E}_{\mathbb{Q}}\left( \delta(T) / (1 + r)^T \right) \]
(recall that all time $t = 0$ and time $t = T$ prices are related in this way). Unfortunately, as we
don't have a unique martingale measure in general, we cannot guarantee the uniqueness of the
$t = 0$ price. Put another way, we know every equivalent martingale measure leads to a reasonable
relative price for our newly created financial instrument, but which measure should one choose?
The easiest way out would be if there were only one equivalent martingale measure at our
disposal, and surprisingly enough the classical economic pricing theory puts us exactly in this
situation! Given a set of financial assets on a market the underlying question is whether we are
able to price any new financial asset which might be introduced in the market, or equivalently
whether we can replicate the cash flow of the new asset by means of a portfolio of our original
assets. If this is the case and we can replicate every new asset, the market is called complete.
In our financial market situation the question can be restated mathematically in terms of
Euclidean geometry: do the vectors $S_i(T)$ span the whole $\mathbb{R}^N$? This leads to:
Theorem 1.4.2. Suppose there are no arbitrage opportunities. Then the model is complete if and
only if the matrix equation
\[ \mathbb{S}'\varphi = \delta \]
has a solution $\varphi \in \mathbb{R}^{d+1}$ for any vector $\delta \in \mathbb{R}^N$.
Linear algebra immediately tells us that the above theorem means that the number of independent
vectors in $\mathbb{S}'$ must equal the number of states in $\Omega$. In an informal way we can say
that if the financial market model contains 2 ($N$) states of the world at time $T$ it allows for
1 ($N - 1$) sources of randomness (if there is only one state we know the outcome). Likewise we
can view the numéraire asset as risk-free and all other assets as risky. We can now restate the
above characterisation of completeness in an informal (but intuitive) way as:
A financial market model is complete if it contains at least as many independent risky assets
as sources of randomness.
The question of completeness can be expressed equivalently in probabilistic language (to be
introduced in Chapter 3), as a question of representability of the relevant random variables or
whether the $\sigma$-algebra they generate is the full $\sigma$-algebra.
If a financial market model is complete, traditional economic theory shows that there exists a
unique system of prices. If there exists only one system of prices, and every equivalent martingale
measure gives rise to a price system, we can only have a unique equivalent martingale measure.
(We will come back to this important question in Chapters 4 and 6.)
The (arbitrage-free) market is complete if and only if there exists a unique equivalent martingale
measure.
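Example (continued).
We formalise the above example of a binary single-period model. We have $d + 1 = 2$ assets and
$|\Omega| = 2$ states of the world $\Omega = \{\omega_1, \omega_2\}$. Keeping the interest rate $r = 0$ we obtain the following
vectors (and matrices):
\[ S(0) = \begin{pmatrix} S_0(0) \\ S_1(0) \end{pmatrix} = \begin{pmatrix} 1 \\ 150 \end{pmatrix}, \quad S_0(T) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad S_1(T) = \begin{pmatrix} 180 \\ 90 \end{pmatrix}, \quad \mathbb{S} = \begin{pmatrix} 1 & 1 \\ 180 & 90 \end{pmatrix}. \]
We try to solve (1.4) for state prices, i.e. we try to find a vector $\psi = (\psi_1, \psi_2)'$, $\psi_i > 0$, $i = 1, 2$,
such that
\[ \begin{pmatrix} 1 \\ 150 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 180 & 90 \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}. \]
Now this has a solution
\[ \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} 2/3 \\ 1/3 \end{pmatrix}, \]
hence showing that there are no arbitrage opportunities in our market model. Furthermore, since
$\psi_1 + \psi_2 = 1$ we see that we already have computed risk-neutral probabilities, and so we have found
an equivalent martingale measure $\mathbb{Q}$ with
\[ \mathbb{Q}(\omega_1) = \frac{2}{3}, \quad \mathbb{Q}(\omega_2) = \frac{1}{3}. \]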
We now want to find out if the market is complete (or equivalently if there is a unique equivalent
martingale measure). For that we introduce a new financial asset $\delta$ with random payments
$\delta(T) = (\delta(T, \omega_1), \delta(T, \omega_2))'$. For the market to be complete, each such $\delta(T)$ must be in the linear span of
$S_0(T)$ and $S_1(T)$. Now since $S_0(T)$ and $S_1(T)$ are linearly independent, their linear span is the
whole $\mathbb{R}^2$ $(= \mathbb{R}^{\Omega})$ and $\delta(T)$ is indeed in the linear span. Hence we can find a replicating portfolio
by solving
\[ \begin{pmatrix} \delta(T, \omega_1) \\ \delta(T, \omega_2) \end{pmatrix} = \begin{pmatrix} 1 & 180 \\ 1 & 90 \end{pmatrix} \begin{pmatrix} \varphi_0 \\ \varphi_1 \end{pmatrix}. \]
Let us consider the example of a European option above. There $\delta(T, \omega_1) = 30$, $\delta(T, \omega_2) = 0$ and
the above becomes
\[ \begin{pmatrix} 30 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 180 \\ 1 & 90 \end{pmatrix} \begin{pmatrix} \varphi_0 \\ \varphi_1 \end{pmatrix}, \]
with solution $\varphi_0 = -30$ and $\varphi_1 = \frac{1}{3}$, telling us to borrow 30 units and buy $\frac{1}{3}$ stocks, which is
the same replication recipe we used to set up our portfolio above. Of course an alternative way of showing market
completeness is to recognise that (1.4) above admits only one solution for risk-neutral probabilities,
showing the uniqueness of the martingale measure.
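Both linear systems of this example are small enough to solve by Cramer's rule. The following is a minimal sketch with exact fractions (the helper `solve_2x2` and the variable names are hypothetical): it recovers the state prices from (1.4), the replicating portfolio of the claim with payoff (30, 0), and checks that both routes give the same price.

```python
from fractions import Fraction as F


def solve_2x2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return x0, x1


# Payoff matrix: rows are assets (bond, stock), columns are states.
payoff_matrix = [[F(1), F(1)], [F(180), F(90)]]
prices_today = [F(1), F(150)]

# State prices: payoff_matrix * psi = prices_today, equation (1.4).
psi = solve_2x2(payoff_matrix, prices_today)

# Replication: transpose(payoff_matrix) * phi = delta(T).
transposed = [[payoff_matrix[0][0], payoff_matrix[1][0]],
              [payoff_matrix[0][1], payoff_matrix[1][1]]]
delta_T = [F(30), F(0)]
phi = solve_2x2(transposed, delta_T)

price_by_replication = phi[0] * prices_today[0] + phi[1] * prices_today[1]
price_by_state_prices = psi[0] * delta_T[0] + psi[1] * delta_T[1]

print(psi, phi, price_by_replication, price_by_state_prices)
```

The state prices come out as (2/3, 1/3), the portfolio as (−30, 1/3), and both pricing routes give 20.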
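Example. Change of numéraire.
We choose a situation similar to the above example, i.e. we have $d + 1 = 2$ assets and $|\Omega| = 2$
states of the world $\Omega = \{\omega_1, \omega_2\}$. But now we assume two risky assets (and no bond) as the
financial instruments at our disposal. The price vectors are given by
\[ S(0) = \begin{pmatrix} S_0(0) \\ S_1(0) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad S_0(T) = \begin{pmatrix} 3/4 \\ 5/4 \end{pmatrix}, \quad S_1(T) = \begin{pmatrix} 1/2 \\ 2 \end{pmatrix}, \quad \mathbb{S} = \begin{pmatrix} 3/4 & 5/4 \\ 1/2 & 2 \end{pmatrix}. \]
We solve (1.4) and get state prices
\[ \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} 6/7 \\ 2/7 \end{pmatrix}, \]
showing that there are no arbitrage opportunities in our market model. So we find an equivalent
martingale measure $\mathbb{Q}$ with
\[ \mathbb{Q}(\omega_1) = \frac{3}{4}, \quad \mathbb{Q}(\omega_2) = \frac{1}{4}. \]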
Since we don't have a risk-free asset in our model, this normalisation (this numéraire) is very artificial,
and we shall use one of the modelled assets as a numéraire, say $S_0$. Under this normalisation,
we have the following asset prices (in terms of $S_0(t, \omega)$!):
\[ \tilde{S}(0) = \begin{pmatrix} \tilde{S}_0(0) \\ \tilde{S}_1(0) \end{pmatrix} = \begin{pmatrix} S_0(0)/S_0(0) \\ S_1(0)/S_0(0) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \]
\[ \tilde{S}_0(T) = \begin{pmatrix} S_0(T, \omega_1)/S_0(T, \omega_1) \\ S_0(T, \omega_2)/S_0(T, \omega_2) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \]
\[ \tilde{S}_1(T) = \begin{pmatrix} S_1(T, \omega_1)/S_0(T, \omega_1) \\ S_1(T, \omega_2)/S_0(T, \omega_2) \end{pmatrix} = \begin{pmatrix} 2/3 \\ 8/5 \end{pmatrix}. \]
Since the model is arbitrage-free (recall that a change of numéraire doesn't affect the no-arbitrage
property), we are able to find risk-neutral probabilities $\tilde{q}_1 = \frac{9}{14}$ and $\tilde{q}_2 = \frac{5}{14}$.
We now compute the prices for a call option to exchange $S_0$ for $S_1$. Define
$Z(T) = \max\{S_1(T) - S_0(T), 0\}$ and the cash flow is given by
\[ \begin{pmatrix} Z(T, \omega_1) \\ Z(T, \omega_2) \end{pmatrix} = \begin{pmatrix} 0 \\ 3/4 \end{pmatrix}. \]
There are no difficulties in finding the hedge portfolio $\varphi_0 = -\frac{3}{7}$ and $\varphi_1 = \frac{9}{14}$ and pricing the
option as $Z_0 = \frac{3}{14}$.
We want to point out the following observation. Using $S_0$ as numéraire we naturally write
\[ \frac{Z(T, \omega)}{S_0(T, \omega)} = \max\left\{ \frac{S_1(T, \omega)}{S_0(T, \omega)} - 1,\ 0 \right\} \]
and see that this seemingly complicated option is (by use of the appropriate numéraire) equivalent
to
\[ \tilde{Z}(T, \omega) = \max\left\{ \tilde{S}_1(T, \omega) - 1,\ 0 \right\}, \]
a European call on the asset $\tilde{S}_1$.
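The numbers of this example can be checked by pricing the exchange option in units of the numéraire $S_0$. The sketch below is illustrative (the helper `two_state_probabilities` and all variable names are hypothetical); it recomputes the risk-neutral probabilities for the normalised prices and then the option price.

```python
from fractions import Fraction as F

# Payoffs at time T in the states (omega_1, omega_2); both assets cost 1 today.
s0_T = [F(3, 4), F(5, 4)]
s1_T = [F(1, 2), F(2)]
s0_0, s1_0 = F(1), F(1)


def two_state_probabilities(values_T, value_0):
    """Return (q1, q2) with q1 + q2 = 1 and q1 * v1 + q2 * v2 = value_0."""
    v1, v2 = values_T
    q1 = (value_0 - v2) / (v1 - v2)
    return q1, 1 - q1


# Prices of S_1 expressed in units of the numeraire S_0.
s1_tilde_T = [a / b for a, b in zip(s1_T, s0_T)]
q_tilde = two_state_probabilities(s1_tilde_T, s1_0 / s0_0)

# Exchange option Z(T) = max{S_1(T) - S_0(T), 0}, then normalised by S_0(T).
z_T = [max(a - b, F(0)) for a, b in zip(s1_T, s0_T)]
z_tilde_T = [z / b for z, b in zip(z_T, s0_T)]

# Price today: numeraire price times the expectation of the normalised payoff.
z_0 = s0_0 * sum(q * z for q, z in zip(q_tilde, z_tilde_T))

print(s1_tilde_T, q_tilde, z_T, z_0)
```

This reproduces $\tilde{S}_1(T) = (2/3, 8/5)'$, $\tilde{q} = (9/14, 5/14)$, the cash flow $(0, 3/4)'$ and the price $Z_0 = 3/14$.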
1.4.3 A few financial-economic considerations
The underlying principle for modelling economic behaviour of investors (or economic agents in
general) is the maximisation of expected utility, that is one assumes that agents have a utility
function $U(\cdot)$ and base economic decisions on expected utility considerations. For instance, assuming
a one-period model, an economic agent might have a utility function over current ($t = 0$)
and future ($t = T$) values of consumption
\[ U(c_0, c_T) = u(c_0) + \mathbb{E}(\beta u(c_T)), \tag{1.5} \]
where $c_t$ is consumption at time $t$.
$u(\cdot)$ is a standard utility function expressing
• nonsatiation: investors prefer more to less; $u$ is increasing;
• risk aversion: investors reject an actuarially fair gamble; $u$ is concave;
• and (maybe) decreasing absolute risk aversion and constant relative risk aversion.
Typical examples are power utility $u(x) = (x^\gamma - 1)/\gamma$, log utility $u(x) = \log(x)$ or quadratic utility
$u(x) = x^2 + dx$ (for which only the first two properties are true).
Assume such an investor is offered at $t = 0$ at a price $p$ a random payoff $X$ at $t = T$. How much
will she buy? Denote with $\xi$ the amount of the asset she chooses to buy and with $e_\tau$, $\tau = 0, T$,
her original consumption. Thus, her problem is
\[ \max_{\xi} \left[ u(c_0) + \mathbb{E}[\beta u(c_T)] \right] \]
such that
\[ c_0 = e_0 - p\xi \quad \text{and} \quad c_T = e_T + X\xi. \]
Substituting the constraints into the objective function and diﬀerentiating with respect to ξ, we
get the ﬁrstorder condition
pu
(c
0
) = IE [βu
(c
T
)X] ⇔ p = IE
_
β
u
(c
T
)
u
(c
0
)
X
_
.
The investor buys or sells more of the asset until this first-order condition is satisfied.
If we use the stochastic discount factor
\[
m = \beta \frac{u'(c_T)}{u'(c_0)},
\]
we obtain the central equation
\[
p = \mathbb{E}(mX). \tag{1.6}
\]
We can use (under regularity conditions) the random variable $m$ to perform a change of measure, i.e. define a probability measure $\mathbb{P}^*$ using
\[
\mathbb{P}^*(A) = \mathbb{E}^*(\mathbf{1}_A) = \mathbb{E}(m \mathbf{1}_A).
\]
We write (1.6) under the measure $\mathbb{P}^*$ and get
\[
p = \mathbb{E}^*(X).
\]
Returning to the initial pricing equation, we see that under $\mathbb{P}^*$ the investor has the utility function $u(x) = x$. Such an investor is called risk-neutral, and consequently one often calls the corresponding measure a risk-neutral measure. An excellent discussion of these issues (and further, much deeper results) is given in Cochrane (2001).
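The first-order condition and the pricing equation (1.6) can be illustrated numerically. The following Python sketch solves the investor's problem in a two-state model with log utility and then checks that $p = \mathbb{E}(mX)$ holds at the optimum. All numbers (probabilities, payoffs, price, endowments, $\beta$), the choice of log utility and the function names are assumptions made for this illustration; they are not taken from the notes.

```python
# Illustrative two-state example (all numbers are assumptions, not from the notes).
probs = [0.5, 0.5]        # physical probabilities of the two states
payoffs = [0.5, 1.6]      # payoff X of the asset at time T in each state
p = 0.9                   # price of the asset at t = 0
e0, eT = 1.0, 1.0         # endowments at t = 0 and t = T
beta = 0.95               # subjective discount factor


def u_prime(c):
    """Marginal utility for log utility u(c) = log(c)."""
    return 1.0 / c


def foc(xi):
    """Derivative of u(c_0) + E[beta * u(c_T)] with respect to xi."""
    c0 = e0 - p * xi
    expected = sum(q * u_prime(eT + x * xi) * x for q, x in zip(probs, payoffs))
    return -p * u_prime(c0) + beta * expected


def solve_demand(lo=-0.62, hi=1.11, iterations=200):
    """Bisection on the first-order condition, which is decreasing in xi.

    The bracket keeps consumption strictly positive in every state.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


xi_star = solve_demand()
c0_star = e0 - p * xi_star
# Stochastic discount factor m = beta * u'(c_T) / u'(c_0), state by state.
m = [beta * u_prime(eT + x * xi_star) / u_prime(c0_star) for x in payoffs]
price_from_sdf = sum(q * mk * x for q, mk, x in zip(probs, m, payoffs))
print(xi_star, price_from_sdf)
```

At the optimal $\xi$ the quantity `price_from_sdf` reproduces the quoted price `p` up to numerical precision, which is exactly the statement of (1.6).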
Chapter 2
Financial Market Theory
2.1 Choice under Uncertainty
In a complete ﬁnancial market model prices of derivative securities can be obtained by arbitrage
arguments. In incomplete market situations, these ﬁnancial instruments carry an intrinsic risk
which cannot be hedged away. Thus in order to price these instruments further assumptions on
the investors, especially regarding their preferences towards the risks involved, are needed.
2.1.1 Preferences and the Expected Utility Theorem
Let $A$ be some nonempty set. An element $x \in A$ will be interpreted as a possible choice of an economic agent. If presented with two choices $x, y \in A$ the agent might prefer one over the other. This can be formalized as follows.
Definition 2.1.1. A binary relation $\succeq$ defined on $A \times A$ is called a preference relation, if it is
• transitive: $x \succeq y$, $y \succeq z$ $\Rightarrow$ $x \succeq z$.
• complete: for all $x, y \in A$ either $x \succeq y$ or $y \succeq x$.
If $x \succeq y$ and $y \succeq x$ we write $x \sim y$ (indifference relation). $x$ is said to be strictly preferred to $y$, denoted by $x \succ y$, if $x \succeq y$ and $y \not\succeq x$.
Definition 2.1.2. A numerical representation of a preference order $\succeq$ is a function $U : A \to \mathbb{R}$ such that
\[
y \succeq x \Leftrightarrow U(y) \geq U(x). \tag{2.1}
\]
In order to characterize existence of a numerical representation we need
Definition 2.1.3. Let $\succeq$ be a preference relation on $A$. A subset $Z$ of $A$ is called order dense if for any pair $x, y \in A$ such that $x \succ y$ there exists some $z \in Z$ such that $x \succeq z \succeq y$.
Theorem 2.1.1. For the existence of a numerical representation of a preference relation $\succeq$ it is necessary and sufficient that $A$ contains a countable order dense subset $Z$. In particular, any preference relation admits a numerical representation if $A$ is countable.
Suppose that each possible choice of our economic agent corresponds to a probability distribution on a sample space $(\Omega, \mathcal{F})$. Thus the set $A$ can be identified with a subset $\mathcal{M}$ of the set $\mathcal{M}_1(\Omega, \mathcal{F})$ of all probability distributions on $(\Omega, \mathcal{F})$. In the context of the theory of choice the elements of $\mathcal{M}$ are sometimes called lotteries. We assume in the sequel that $\mathcal{M}$ is convex. The aim in the following is to characterize the preference orders $\succeq$ that allow a numerical representation of the form
\[
\mu \succeq \nu \Leftrightarrow \int_{\Omega} u(\omega)\,\mu(d\omega) \geq \int_{\Omega} u(\omega)\,\nu(d\omega). \tag{2.2}
\]
Definition 2.1.4. A numerical representation $U$ of a preference order $\succeq$ on $\mathcal{M}$ is called a von Neumann-Morgenstern representation if it is of form (2.2).
Any von Neumann-Morgenstern representation $U$ is affine on $\mathcal{M}$ in the sense that
\[
U(\alpha\mu + (1 - \alpha)\nu) = \alpha U(\mu) + (1 - \alpha) U(\nu)
\]
for all $\mu, \nu \in \mathcal{M}$ and $\alpha \in [0, 1]$.
Affinity implies the following two properties or axioms:
• Independence (substitution) (I):
Let $\mu, \nu, \lambda \in \mathcal{M}$ and $\alpha \in (0, 1]$; then
\[
\mu \succ \nu \Rightarrow \alpha\mu + (1 - \alpha)\lambda \succ \alpha\nu + (1 - \alpha)\lambda.
\]
• Archimedean (continuity) (A):
If $\mu \succ \lambda \succ \nu \in \mathcal{M}$ then there exist $\alpha, \beta \in (0, 1)$ such that
\[
\alpha\mu + (1 - \alpha)\nu \succ \lambda \succ \beta\mu + (1 - \beta)\nu.
\]
Theorem 2.1.2. Suppose the preference relation $\succeq$ on $\mathcal{M}$ satisfies the axioms (A) and (I). Then there exists an affine numerical representation $U$ of $\succeq$. Moreover, $U$ is unique up to positive affine transformations, i.e. any other affine numerical representation $\tilde{U}$ with these properties is of the form $\tilde{U} = aU + b$ for some $a > 0$ and $b \in \mathbb{R}$.
In case of a discrete (finite) probability distribution, existence of an affine numerical representation is equivalent to existence of a von Neumann-Morgenstern representation.
In the general case one needs to introduce a further axiom, the so-called sure thing principle:
For $\mu, \nu \in \mathcal{M}$ and $A \in \mathcal{F}$ such that $\mu(A) = 1$:
\[
\delta_x \succ \nu \text{ for all } x \in A \Rightarrow \mu \succ \nu
\]
and
\[
\nu \succ \delta_x \text{ for all } x \in A \Rightarrow \nu \succ \mu.
\]
From now on we will work within the framework of the expected utility representation.
2.1.2 Risk Aversion
We focus now on individual financial assets under the assumption that their payoff distributions at a fixed time are known. We can view these asset distributions as probability distributions on some interval $S \subseteq \mathbb{R}$. Thus we take $\mathcal{M}$ as a fixed set of Borel probability measures on $S$. We also assume that $\mathcal{M}$ is convex and contains all point masses $\delta_x$ for $x \in S$. Also, we assume that each $\mu \in \mathcal{M}$ has a well-defined expectation
\[
m(\mu) = \int x\,\mu(dx) \in \mathbb{R}.
\]
For assets with random payoff $\mu$ (resp. insurance contracts with random damage $\mu$), $m(\mu)$ is often called the fair price (resp. fair premium). However, actual prices resp. premia will typically be different due to risk premia, which can be explained within our conceptual framework. We assume in the sequel that preference relations have a von Neumann-Morgenstern representation.
Definition 2.1.5. A preference relation $\succ$ on $\mathcal{M}$ is called monotone if
\[
x > y \text{ implies } \delta_x \succ \delta_y.
\]
The preference relation is called risk averse if for $\mu \in \mathcal{M}$
\[
\delta_{m(\mu)} \succ \mu \text{ unless } \mu = \delta_{m(\mu)}. \tag{2.3}
\]
Remark 2.1.1. An economic agent is called risk-averse if his preference relation is risk averse. A risk-averse economic agent is unwilling to accept or indifferent to every actuarially fair gamble. An economic agent is strictly risk averse if he is unwilling to accept every actuarially fair gamble.
Proposition 2.1.1. A preference relation $\succ$ is
(i) monotone, iff $u$ is strictly increasing.
(ii) risk averse, iff $u$ is strictly concave.
Proof. (i) Monotonicity is equivalent to
\[
u(x) = \int u(s)\,\delta_x(ds) = U(\delta_x) > U(\delta_y) = u(y)
\]
for $x > y$.
(ii) For $\mu = \alpha\delta_x + (1 - \alpha)\delta_y$ we have
\[
m(\mu) = \int s\,(\alpha\delta_x + (1 - \alpha)\delta_y)(ds) = \alpha x + (1 - \alpha) y.
\]
So if $\succ$ is risk-averse, then
\[
\delta_{\alpha x + (1 - \alpha) y} \succ \alpha\delta_x + (1 - \alpha)\delta_y
\]
holds for all distinct $x, y \in S$ and $\alpha \in (0, 1)$. Hence
\[
u(\alpha x + (1 - \alpha) y) > \alpha u(x) + (1 - \alpha) u(y),
\]
so $u$ is strictly concave. Conversely, if $u$ is strictly concave, then Jensen's inequality implies risk aversion, since
\[
U(\delta_{m(\mu)}) = u\left( \int x\,\mu(dx) \right) \geq \int u(x)\,\mu(dx) = U(\mu)
\]
with equality iff $\mu = \delta_{m(\mu)}$.
Definition 2.1.6. A function $u : S \to \mathbb{R}$ is called a utility function if it is strictly concave, strictly increasing and continuous on $S$.
By the intermediate value theorem there is for any $\mu \in \mathcal{M}$ a unique number $c(\mu)$ such that
\[
u(c(\mu)) = U(\mu) = \int u\,d\mu. \tag{2.4}
\]
So $\delta_{c(\mu)} \sim \mu$, i.e. there is indifference between $\mu$ and the sure amount $c(\mu)$.
Definition 2.1.7. The certainty equivalent of $\mu$, denoted by $c(\mu)$, is the number defined in (2.4). It is the amount of money for which the individual is indifferent between the lottery $\mu$ and the certain amount $c(\mu)$. The number
\[
\rho(\mu) = m(\mu) - c(\mu) \tag{2.5}
\]
is called the risk premium.
The certainty equivalent can be viewed as the upper price an investor would pay for the asset distribution $\mu$. Thus the fair price must be reduced by the risk premium if one wants an agent to buy the asset.
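To make the certainty equivalent (2.4) and the risk premium (2.5) concrete, the following Python sketch computes both for a small discrete lottery. The lottery, the choice of log utility and the function names are assumptions made for illustration only. The certainty equivalent is found by bisection, which mirrors the intermediate value argument above and works for any continuous, strictly increasing $u$.

```python
import math

# Hypothetical lottery mu: three payoffs with their probabilities (assumed values).
outcomes = [50.0, 100.0, 150.0]
probs = [0.25, 0.5, 0.25]


def u(x):
    """Log utility, used here purely as an example of a utility function."""
    return math.log(x)


def expected_utility(u, outcomes, probs):
    """U(mu) = integral of u with respect to mu, for a discrete lottery."""
    return sum(p * u(x) for p, x in zip(probs, outcomes))


def certainty_equivalent(u, outcomes, probs, iterations=200):
    """Solve u(c) = U(mu) by bisection between the smallest and largest payoff."""
    target = expected_utility(u, outcomes, probs)
    lo, hi = min(outcomes), max(outcomes)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if u(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


mean = sum(p * x for p, x in zip(probs, outcomes))      # fair price m(mu)
ce = certainty_equivalent(u, outcomes, probs)           # c(mu)
risk_premium = mean - ce                                 # rho(mu)
print(mean, ce, risk_premium)
```

For a strictly concave $u$ the computed risk premium is strictly positive, in line with Proposition 2.1.1.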
Consider now an investor who has the choice to invest a fraction of his wealth in a risk-free asset and the remaining fraction of his wealth in a risky asset. We want to find conditions on the distribution of the risky asset and the preferences (utility function) of the investor in order to determine his willingness for a risky investment. Formally, we consider the following optimisation problem
\[
f(\lambda) = U(\mu_\lambda) = \int u\,d\mu_\lambda \to \max, \tag{2.6}
\]
where $\mu_\lambda$ is the distribution of
\[
X_\lambda = (1 - \lambda) X + \lambda c
\]
with $X$ an integrable random variable with nondegenerate distribution $\mu$ and $c \in S$ a certain amount.
Proposition 2.1.2. We have $\lambda^* = 1$ if $\mathbb{E}(X) \leq c$ and $\lambda^* > 0$ if $c \geq c(\mu)$.
Proof. By Jensen's inequality
\[
f(\lambda) \leq u(\mathbb{E}[X_\lambda]) = u((1 - \lambda)\mathbb{E}(X) + \lambda c)
\]
with equality iff $\lambda = 1$. It follows that $\lambda^* = 1$ if the right-hand side is increasing in $\lambda$, i.e. $\mathbb{E}(X) \leq c$.
Now, strict concavity of $u$ implies
\[
f(\lambda) \geq \mathbb{E}\left[ (1 - \lambda) u(X) + \lambda u(c) \right] = (1 - \lambda) u(c(\mu)) + \lambda u(c)
\]
with equality iff $\lambda \in \{0, 1\}$. The right-hand side is increasing in $\lambda$ if $c \geq c(\mu)$, and this implies $\lambda^* > 0$.
Remark 2.1.2. (i) (Demand for a risky asset.) The price of a risky asset must be below the expected discounted payoff in order to attract any risk-averse investor.
(ii) (Demand for insurance.) Risk aversion can create a demand for insurance even if the insurance premium lies above the fair price.
(iii) If $u \in C^1(\mathbb{R})$ then
\[
\lambda^* = 1 \Leftrightarrow \mathbb{E}(X) \leq c, \qquad
\lambda^* = 0 \Leftrightarrow c \leq \frac{\mathbb{E}(X u'(X))}{\mathbb{E}(u'(X))}.
\]
We assume now that $\mu$ has a finite variance $\mathrm{Var}(\mu)$. We consider the Taylor expansion of a sufficiently smooth utility function $u(x)$ at $x = c(\mu)$ around $m = m(\mu)$. We have
\[
u(c(\mu)) \approx u(m) + u'(m)(c(\mu) - m) = u(m) - u'(m)\rho(\mu).
\]
On the other hand,
\[
u(c(\mu)) = \int u(x)\,\mu(dx) = \int \left[ u(m) + u'(m)(x - m) + \frac{1}{2} u''(m)(x - m)^2 + r(x) \right] \mu(dx)
\approx u(m) + \frac{1}{2} u''(m) \mathrm{Var}(\mu)
\]
(where $r(x)$ denotes the remainder term in the Taylor expansion; the first-order term vanishes because $\int (x - m)\,\mu(dx) = 0$). So
\[
\rho(\mu) \approx -\frac{u''(m)}{2u'(m)} \mathrm{Var}(\mu) = \frac{1}{2}\alpha(m)\mathrm{Var}(\mu). \tag{2.7}
\]
Thus $\alpha(m(\mu))$ is the factor by which an economic agent with utility function $u$ weighs the risk (measured by $\mathrm{Var}(\mu)$) in order to determine the risk premium.
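The quality of the approximation (2.7) can be checked on a small example. The sketch below assumes exponential utility $u(x) = 1 - e^{-ax}$, for which $-u''/u' \equiv a$ and the certainty equivalent has the closed form $c(\mu) = -\frac{1}{a}\log \mathbb{E}[e^{-aX}]$. The lottery and the value of $a$ are hypothetical choices for this illustration.

```python
import math

a = 0.1                               # assumed constant absolute risk aversion
outcomes = [99.0, 100.0, 101.0]       # hypothetical small-risk lottery
probs = [0.25, 0.5, 0.25]

mean = sum(p * x for p, x in zip(probs, outcomes))
variance = sum(p * (x - mean) ** 2 for p, x in zip(probs, outcomes))

# Exact certainty equivalent for u(x) = 1 - exp(-a x):
# u(c) = E[u(X)]  <=>  c = -log(E[exp(-a X)]) / a
ce = -math.log(sum(p * math.exp(-a * x) for p, x in zip(probs, outcomes))) / a

exact_premium = mean - ce                 # rho(mu) from (2.5)
approx_premium = 0.5 * a * variance       # right-hand side of (2.7)
print(exact_premium, approx_premium)
```

For small risks the two numbers agree closely; the gap grows with the spread of the lottery because the remainder term $r(x)$ is no longer negligible.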
Definition 2.1.8. Suppose that $u$ is a twice continuously differentiable utility function on $S$. Then
\[
\alpha(x) = -\frac{u''(x)}{u'(x)} \tag{2.8}
\]
is called the Arrow-Pratt coefficient of absolute risk aversion of $u$ at level $x$.
Example 2.1.1. (i) Constant absolute risk aversion (CARA). Here $\alpha(x) \equiv \alpha$ for some constant $\alpha$. This implies a (normalized) utility function
\[
u(x) = 1 - e^{-\alpha x}.
\]
(ii) Hyperbolic absolute risk aversion (HARA). Here $\alpha(x) = (1 - \gamma)/x$ for some $\gamma \in [0, 1)$. This implies
\[
u(x) = \log x \text{ for } \gamma = 0, \qquad u(x) = x^{\gamma}/\gamma \text{ for } 0 < \gamma < 1.
\]
Remark 2.1.3. For $u$ and $\tilde{u}$ two utility functions on $S$ with $\alpha$ and $\tilde{\alpha}$ the corresponding Arrow-Pratt coefficients we have
\[
\alpha(x) \geq \tilde{\alpha}(x) \ \forall x \in S \Leftrightarrow \rho(\mu) \geq \tilde{\rho}(\mu) \ \forall \mu \in \mathcal{M}
\]
with $\rho, \tilde{\rho}$ the corresponding risk premia.
It is also useful to analyse risk aversion in terms of the proportion of wealth that is invested in a risky asset.
Definition 2.1.9. Suppose that $u$ is a twice continuously differentiable utility function on $S$. Then
\[
\alpha_R(x) = \alpha(x)\,x = -x\,\frac{u''(x)}{u'(x)} \tag{2.9}
\]
is called the Arrow-Pratt coefficient of relative risk aversion of $u$ at level $x$.
Remark 2.1.4. (i) An individual's utility function displays decreasing (constant, increasing) absolute risk aversion if $\alpha(x)$ is decreasing (constant, increasing).
(ii) An individual's utility function displays decreasing (constant, increasing) relative risk aversion if $\alpha_R(x)$ is decreasing (constant, increasing).
2.1.3 Further measures of risk
In this section we focus on the question whether one (risky asset) distribution is preferred to another, regardless of the choice of a particular utility function. Assume $S = \mathbb{R}$ and $\mathcal{M} = \mathcal{M}_1$ (expectation exists).
Definition 2.1.10. Let $\mu, \nu$ be elements of $\mathcal{M}$. We say that $\mu$ is uniformly preferred over $\nu$, notation $\mu \succeq_{\mathrm{uni}} \nu$, if for all utility functions $u$
\[
\int u\,d\mu \geq \int u\,d\nu. \tag{2.10}
\]
$\succeq_{\mathrm{uni}}$ is also called second order stochastic dominance.
Theorem 2.1.3. For any pair $\mu, \nu \in \mathcal{M}$ the following conditions are equivalent:
(a) $\mu \succeq_{\mathrm{uni}} \nu$.
(b) $\int f\,d\mu \geq \int f\,d\nu$ for all increasing concave functions $f$.
(c) For all $c \in \mathbb{R}$ we have
\[
\int (c - x)^+\,\mu(dx) \leq \int (c - x)^+\,\nu(dx). \tag{2.11}
\]
(d) If $F$ and $G$ denote the respective distribution functions of $\mu$ and $\nu$, then
\[
\int_{-\infty}^{c} F(x)\,dx \leq \int_{-\infty}^{c} G(x)\,dx \quad \forall c \in \mathbb{R}. \tag{2.12}
\]
Proof. Recall $u$ is a utility function iff it is strictly concave and strictly increasing.
(a) $\Rightarrow$ (b) ((b) $\Rightarrow$ (a) is clear). Choose any utility function $u_0$ with $\int u_0\,d\mu$ and $\int u_0\,d\nu$ finite. For instance,
\[
u_0(x) = \begin{cases} x - e^{x/2} + 1 & \text{if } x \leq 0, \\ \sqrt{x + 1} - 1 & \text{if } x \geq 0. \end{cases}
\]
Then for $f$ concave and increasing and for $\alpha \in [0, 1)$
\[
u_\alpha(x) = \alpha f(x) + (1 - \alpha) u_0(x)
\]
is a utility function. Hence
\[
\int f\,d\mu = \lim_{\alpha \uparrow 1} \int u_\alpha\,d\mu \geq \lim_{\alpha \uparrow 1} \int u_\alpha\,d\nu = \int f\,d\nu.
\]
(b) $\Leftrightarrow$ (c).
"$\Rightarrow$": Follows, because $f(x) = -(c - x)^+$ is concave and increasing.
"$\Leftarrow$": Let $f$ be an increasing concave function and $h = -f$. Then $h$ is convex and decreasing and its increasing right-hand derivative $h' := h'_+$ can be regarded as a distribution function of a nonnegative (Radon) measure $\gamma$ on $\mathbb{R}$. Thus
\[
h'(b) = h'(a) + \gamma((a, b]) \quad \text{for } a < b.
\]
For $x < b$ we find
\[
h(x) = h(b) - h'(b)(b - x) + \int_{(-\infty, b]} (z - x)^+\,\gamma(dz).
\]
Using (c), the fact that $h'(b) \leq 0$ and Fubini's theorem we obtain
\[
\begin{aligned}
\int_{(-\infty, b]} h\,d\mu
&= h(b)\,\mu((-\infty, b]) - h'(b) \int (b - x)^+\,\mu(dx) + \int_{(-\infty, b]} \int (z - x)^+\,\mu(dx)\,\gamma(dz) \\
&\leq h(b)\,\nu((-\infty, b]) - h'(b) \int (b - x)^+\,\nu(dx) + \int_{(-\infty, b]} \int (z - x)^+\,\nu(dx)\,\gamma(dz) \\
&= \int_{(-\infty, b]} h\,d\nu.
\end{aligned}
\]
Now letting $b \uparrow \infty$ yields
\[
\int f\,d\mu \geq \int f\,d\nu.
\]
(c) $\Leftrightarrow$ (d). By Fubini's theorem
\[
\int_{-\infty}^{c} F(y)\,dy = \int_{-\infty}^{c} \int_{(-\infty, y]} \mu(dz)\,dy
= \int \int \mathbf{1}(z \leq y \leq c)\,dy\,\mu(dz)
= \int (c - z)^+\,\mu(dz).
\]
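Remark 2.1.5. (i) Also: $\mu \succeq_{\mathrm{uni}} \nu \Leftrightarrow \int_0^{\lambda} F^{-1}(y)\,dy \geq \int_0^{\lambda} G^{-1}(y)\,dy$ for all $\lambda \in (0, 1]$, where $F^{-1}, G^{-1}$ are the inverses, or quantile functions, of the distributions.
(ii) Taking $f(x) = x$ in (b), we see $\mu \succeq_{\mathrm{uni}} \nu \Rightarrow m(\mu) \geq m(\nu)$.
(iii) For normal distributions, we have
\[
\mathcal{N}(m, \sigma^2) \succeq_{\mathrm{uni}} \mathcal{N}(\tilde{m}, \tilde{\sigma}^2) \quad \text{iff} \quad m \geq \tilde{m} \text{ and } \sigma^2 \leq \tilde{\sigma}^2.
\]
If $\mu, \nu \in \mathcal{M}$ are such that $m(\mu) = m(\nu)$ and $\mu \succeq_{\mathrm{uni}} \nu$ then $\mathrm{var}(\mu) \leq \mathrm{var}(\nu)$. Here
\[
\mathrm{var}(\mu) = \int (x - m(\mu))^2\,\mu(dx) = \int x^2\,\mu(dx) - m(\mu)^2
\]
is the variance of $\mu$ (use condition (b) with $f(x) = -x^2$; under $m(\mu) = m(\nu)$ condition (b) holds for all concave functions).
In the financial context, comparison of portfolios with known payoff distributions often uses a mean-variance approach with
\[
\mu \geq \nu \Leftrightarrow m(\mu) \geq m(\nu) \text{ and } \mathrm{var}(\mu) \leq \mathrm{var}(\nu).
\]
For normal distributions $\mu \geq \nu$ is equivalent to $\mu \succeq_{\mathrm{uni}} \nu$, but not in general.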
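Example 2.1.2. Let $\mu = U[-1, 1]$, so $m(\mu) = 0$ and
\[
\mathrm{var}(\mu) = \frac{1}{2} \int_{-1}^{1} x^2\,dx = \frac{1}{2} \left[ \frac{x^3}{3} \right]_{-1}^{1} = \frac{1}{3}.
\]
For $\nu = p\,\delta_{-1/2} + (1 - p)\,\delta_2$ with $p = \frac{4}{5}$ we have
\[
m(\nu) = \frac{4}{5} \left( -\frac{1}{2} \right) + \frac{1}{5} \cdot 2 = 0, \qquad
\mathrm{var}(\nu) = \frac{4}{5} \cdot \frac{1}{4} + \frac{1}{5} \cdot 4 = 1.
\]
Thus $\mathrm{var}(\nu) > \mathrm{var}(\mu)$, but
\[
\int \left( -\tfrac{1}{2} - x \right)^+ \mu(dx) = \frac{1}{2} \int_{-1}^{-1/2} \left( -\tfrac{1}{2} - x \right) dx
= \frac{1}{2} \left[ -\tfrac{1}{2} x - \tfrac{1}{2} x^2 \right]_{-1}^{-1/2}
= \frac{1}{2} \left( \frac{1}{4} - \frac{1}{8} - \frac{1}{2} + \frac{1}{2} \right) = \frac{1}{16}
\]
and
\[
\int \left( -\tfrac{1}{2} - x \right)^+ \nu(dx) = 0.
\]
So condition (c) of Theorem 2.1.3 fails for $c = -\frac{1}{2}$, and $\mu \succeq_{\mathrm{uni}} \nu$ does not hold.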
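The numbers in Example 2.1.2 can be reproduced with exact rational arithmetic. The following Python sketch evaluates the integrals $\int (c - x)^+ \mu(dx)$ and $\int (c - x)^+ \nu(dx)$ from condition (2.11) at $c = -\frac{1}{2}$; the function names are hypothetical, and the closed form used for the uniform distribution is stated in the comments.

```python
from fractions import Fraction as Fr


def put_uniform(c):
    """Integral of (c - x)^+ against mu = U[-1, 1].

    For -1 <= c <= 1 this is (1/2) * integral_{-1}^{c} (c - x) dx = (c + 1)^2 / 4;
    for c >= 1 it equals c - m(mu) = c, and for c <= -1 it is 0.
    """
    if c <= -1:
        return Fr(0)
    if c >= 1:
        return Fr(c)
    return (c + 1) ** 2 / 4


# nu = p * delta_{-1/2} + (1 - p) * delta_2 with p = 4/5, as (atom, probability) pairs.
atoms = [(Fr(-1, 2), Fr(4, 5)), (Fr(2), Fr(1, 5))]


def put_discrete(c):
    """Integral of (c - x)^+ against the discrete distribution nu."""
    return sum(p * max(c - x, 0) for x, p in atoms)


mean_nu = sum(p * x for x, p in atoms)
var_nu = sum(p * (x - mean_nu) ** 2 for x, p in atoms)
var_mu = Fr(1, 3)

c = Fr(-1, 2)
lhs, rhs = put_uniform(c), put_discrete(c)
print(mean_nu, var_nu, lhs, rhs)
```

Since `lhs > rhs` at this $c$, inequality (2.11) is violated even though $\mu$ has the same mean and a smaller variance than $\nu$.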
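A further important class of distributions is discussed in the following example.
Example 2.1.3. A real-valued random variable $Y$ on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is called log-normally distributed with parameters $\alpha \in \mathbb{R}$ and $\sigma \in \mathbb{R}_+$ if it can be written as
\[
Y = \exp(\alpha + \sigma X)
\]
where $X$ has a standard normal law $\mathcal{N}(0, 1)$. For log-normal distributions
\[
\mu \succeq_{\mathrm{uni}} \tilde{\mu} \Leftrightarrow \sigma^2 \leq \tilde{\sigma}^2 \text{ and } \alpha + \tfrac{1}{2}\sigma^2 \geq \tilde{\alpha} + \tfrac{1}{2}\tilde{\sigma}^2.
\]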
Definition 2.1.11. Let $\mu$ and $\nu$ be two arbitrary probability measures on $\mathbb{R}$. We say that $\mu$ stochastically dominates $\nu$, notation $\mu \succeq_{\mathrm{mon}} \nu$, if
\[
\int f\,d\mu \geq \int f\,d\nu
\]
for all bounded increasing functions $f \in C(\mathbb{R})$. Stochastic dominance is also called first-order stochastic dominance.
Theorem 2.1.4. For $\mu, \nu \in \mathcal{M}_1(\mathbb{R})$ the following conditions are equivalent:
(a) $\mu \succeq_{\mathrm{mon}} \nu$;
(b) for all $x$, $F(x) \leq G(x)$, where $F, G$ are respectively the distribution functions of $\mu, \nu$;
(c) there exists a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with random variables $X$ and $Y$ having respective distributions $\mu$ and $\nu$ such that $X \geq Y$ $\mathbb{P}$-a.s.
2.2 Optimal Portfolios
2.2.1 The mean-variance approach
Recall our one-period model with securities $S_0, S_1, \ldots, S_d$ and security prices $S_i(T)$ at the final time $t = T$. Here $S_0$ is the risk-free bond and $S_1, \ldots, S_d$ are random variables on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$. For the purpose of this section we disregard the risk-free asset and invest only in the risky assets. We consider their returns
\[
R_i(T) = \frac{S_i(T)}{S_i(0)}, \quad i = 1, \ldots, d,
\]
and assume we know (or have estimated) their means and covariance matrix
\[
\mathbb{E}(R_i(T)) = m_i, \quad i = 1, \ldots, d,
\]
and
\[
\mathrm{Cov}(R_i(T), R_j(T)) = \sigma_{ij}, \quad i, j = 1, \ldots, d.
\]
(Observe that $\Sigma = (\sigma_{ij})$ is positive semidefinite.)
We consider portfolio vectors $\varphi \in \mathbb{R}^d$ with $\varphi_i \geq 0$ (in order to avoid the possibility of negative final wealth).
Definition 2.2.1. An investor with initial wealth $x > 0$ is assumed to hold $\varphi_i \geq 0$ shares of security $i$, $i = 1, \ldots, d$, with
\[
\sum_{i=1}^{d} \varphi_i S_i(0) = x \quad \text{("budget equation")}.
\]
Then the portfolio vector $\pi = (\pi_1, \ldots, \pi_d)$ is defined as
\[
\pi_i = \frac{\varphi_i S_i(0)}{x}, \quad i = 1, \ldots, d,
\]
and
\[
R^{\pi} = \sum_{i=1}^{d} \pi_i R_i(T)
\]
is called the corresponding portfolio return.
Remark 2.2.1. (1) The components of the portfolio vector represent the fractions of total wealth invested in the corresponding securities. In particular, we have
\[
\sum_{i=1}^{d} \pi_i = \frac{1}{x} \sum_{i=1}^{d} \varphi_i S_i(0) = \frac{x}{x} = 1.
\]
(2) Let $V^{\pi}(T)$ denote the final wealth corresponding to an initial wealth of $x$ and a portfolio vector $\varphi$, i.e.
\[
V^{\pi}(T) = \sum_{i=1}^{d} \varphi_i S_i(T);
\]
then we find
\[
R^{\pi} = \sum_{i=1}^{d} \pi_i R_i(T) = \sum_{i=1}^{d} \frac{\varphi_i S_i(0)}{x} \frac{S_i(T)}{S_i(0)} = \frac{V^{\pi}(T)}{x}.
\]
(3) The mean and the variance of the portfolio return are given by
\[
\mathbb{E}(R^{\pi}) = \sum_{i=1}^{d} \pi_i m_i, \qquad \mathrm{Var}(R^{\pi}) = \sum_{i=1}^{d} \sum_{j=1}^{d} \pi_i \sigma_{ij} \pi_j.
\]
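The definitions above translate directly into a few lines of code. The following Python sketch computes the portfolio vector, the mean and the variance of the portfolio return for $d = 2$ risky assets. All input numbers and variable names are hypothetical and serve only to illustrate Definition 2.2.1 and Remark 2.2.1.

```python
# Hypothetical data for d = 2 risky assets (assumptions for illustration only).
initial_prices = [10.0, 20.0]          # S_i(0)
shares = [30.0, 35.0]                  # phi_i >= 0
means = [1.05, 1.10]                   # m_i = E(R_i(T)), gross returns
cov = [[0.04, 0.01],
       [0.01, 0.09]]                   # sigma_ij

# Budget equation: initial wealth x = sum_i phi_i * S_i(0).
wealth = sum(n * s for n, s in zip(shares, initial_prices))

# Portfolio vector pi_i = phi_i * S_i(0) / x; the components sum to one.
weights = [n * s / wealth for n, s in zip(shares, initial_prices)]

# Mean and variance of the portfolio return R^pi.
portfolio_mean = sum(w * m for w, m in zip(weights, means))
portfolio_var = sum(weights[i] * cov[i][j] * weights[j]
                    for i in range(len(weights))
                    for j in range(len(weights)))
print(weights, portfolio_mean, portfolio_var)
```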
We now need to consider criteria for selecting a portfolio. The basic (by now classical) idea
of Markowitz was to look for a balance between risk (i.e. portfolio variance) and return (i.e.
portfolio mean). He considered the problem of requiring a lower bound for the portfolio return
(minimum return) and then choosing from the corresponding set the portfolio vector with the
minimal variance. Alternatively, set an upper bound for the variance and determine the portfolio
vector with the highest possible mean return. We consider
Definition 2.2.2. A portfolio is a frontier portfolio if it has the minimum variance among portfolios that have the same expected rate of return. The set of all frontier portfolios is called the portfolio frontier.
We now discuss briefly the assumptions of the mean-variance approach.
(1) A preference for expected return and an aversion to variance is implied by monotonicity and strict concavity of a utility function. However, for arbitrary utility functions, expected utility cannot be defined over just the expected return and variance. For $\mu \in \mathcal{M}$ assume
\[
U(\mu) = \int u(x)\,\mu(dx) = \int \sum_{k=0}^{\infty} \frac{1}{k!} u^{(k)}(m)(x - m)^k\,\mu(dx)
= u(m) + \frac{1}{2} u''(m) \mathrm{Var}(\mu) + R_3(\mu)
\]
(i.e. convergence of the Taylor series and interchangeability of integral and sum). Thus, the remainder term needs (for the general case) to be considered as well.
(2) Assuming quadratic utility, i.e.
\[
u(x) = x - \frac{b}{2} x^2, \quad b > 0,
\]
we find
\[
U(\mu) = m - \frac{b}{2} \int x^2\,\mu(dx) = m - \frac{b}{2}\left(\mathrm{Var}(\mu) + m^2\right).
\]
Unfortunately, quadratic utility displays satiation (negative marginal utility for sufficiently large wealth) and increasing absolute risk aversion.
(3) For normal distributions $\mu_i$, $R^{\pi} \sim \mathcal{N}$, thus preferences can be expressed solely in terms of mean and variance.
Proposition 2.2.1. A portfolio $p$ is a frontier portfolio if and only if the portfolio weight vector $\pi_p$ is a solution to the optimisation problem
\[
\min_{\pi} \frac{1}{2} \pi' \Sigma \pi = \min_{\pi} \frac{1}{2} \sum_{i} \sum_{j} \pi_i \pi_j \sigma_{ij} \quad \left( = \tfrac{1}{2} \mathrm{Var}(R^{\pi}) \right) \tag{2.13}
\]
subject to
\[
\pi' m = \sum_{i} \pi_i m_i = \mathbb{E}(R^{\pi}) = m_p, \qquad
\pi' \mathbf{1} = \sum_{i} \pi_i = 1.
\]
Here $\mathbf{1}$ is a $d$-vector of ones, $m = (m_1, \ldots, m_d)'$ is the vector of expected returns of the assets and $m_p$ is a fixed rate of portfolio return.
We can solve (2.13) and write the solution as
\[
\pi_p = g + h\,m_p \tag{2.14}
\]
with
\[
g = \frac{1}{D}\left[ B(\Sigma^{-1}\mathbf{1}) - A(\Sigma^{-1} m) \right], \qquad
h = \frac{1}{D}\left[ C(\Sigma^{-1} m) - A(\Sigma^{-1}\mathbf{1}) \right],
\]
where $A = \mathbf{1}'\Sigma^{-1} m = m'\Sigma^{-1}\mathbf{1}$, $B = m'\Sigma^{-1} m$, $C = \mathbf{1}'\Sigma^{-1}\mathbf{1}$ and $D = BC - A^2 > 0$.
For $m_p = 0$ we find from (2.14) that the optimal portfolio is $g$. Also, for $m_p = 1$, (2.14) implies that the optimal portfolio is $g + h$. Now for any given expected return $m_q$ we find
\[
\pi_q = g + h\,m_q = (1 - m_q)\,g + m_q\,(g + h).
\]
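Formula (2.14) is easy to check numerically. The following Python sketch computes $g$ and $h$ for three hypothetical assets and verifies that the resulting weights satisfy both constraints of (2.13). The covariance matrix, the mean vector, the target return and the helper functions `solve` and `dot` are assumptions made for this illustration; a small Gaussian elimination routine is used so that the example needs no external libraries. Note that (2.14) ignores the non-negativity requirement $\varphi_i \geq 0$, so individual weights may be negative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


# Hypothetical inputs: covariance matrix Sigma and mean vector m of gross returns.
sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
m = [1.05, 1.10, 1.15]
ones = [1.0, 1.0, 1.0]

sinv_one = solve(sigma, ones)      # Sigma^{-1} 1
sinv_m = solve(sigma, m)           # Sigma^{-1} m
A = dot(ones, sinv_m)
B = dot(m, sinv_m)
C = dot(ones, sinv_one)
D = B * C - A * A

g = [(B * x - A * y) / D for x, y in zip(sinv_one, sinv_m)]
h = [(C * y - A * x) / D for x, y in zip(sinv_one, sinv_m)]


def frontier_weights(m_p):
    """Frontier portfolio pi_p = g + h * m_p from (2.14)."""
    return [gi + hi * m_p for gi, hi in zip(g, h)]


m_p = 1.12                                    # assumed target expected return
pi_p = frontier_weights(m_p)
var_p = dot(pi_p, [dot(row, pi_p) for row in sigma])
print(pi_p, var_p)
```

Because $\pi_p$ is affine in $m_p$, any frontier portfolio computed this way is a combination of the two portfolios $g$ and $g + h$.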
This generalizes to
Proposition 2.2.2 (Two-fund separation). The portfolio frontier can be generated by any two distinct frontier portfolios.
The covariance between the rates of return of any frontier portfolios $p$ and $q$ is
\[
\mathrm{Cov}(R_p, R_q) = \frac{C}{D}\left( m_p - \frac{A}{C} \right)\left( m_q - \frac{A}{C} \right) + \frac{1}{C}. \tag{2.15}
\]
Thus, for the variances
\[
\frac{\sigma^2(R_p)}{1/C} - \frac{(m_p - A/C)^2}{D/C^2} = 1,
\]
which is a hyperbola in the standard-deviation – expected-return ($\sigma$-$\mu$) space.
(i) The portfolio having the minimum variance of all feasible portfolios is called the minimum variance portfolio and denoted as mvp.
(ii) A frontier portfolio is efficient if it has a strictly higher expected return than the mvp.
(iii) Frontier portfolios that are neither mvp nor efficient are called inefficient.
(iv) The efficient frontier is the part of the curve lying above the point of global minimum of standard deviation.
We can use (2.15) to show that for any frontier portfolio $p$ (except the mvp) there exists a unique frontier portfolio $zc(p)$ which has zero covariance with $p$.
Proposition 2.2.3. If $\tilde{\pi}$ is any envelope portfolio, then for any other portfolio (envelope or not) $\pi$ we have the relation
\[
m_{\pi} = c + \beta_{\pi}(m_{\tilde{\pi}} - c) \tag{2.16}
\]
where
\[
\beta_{\pi} = \frac{\mathrm{Cov}(\pi, \tilde{\pi})}{\sigma^2_{\tilde{\pi}}}.
\]
Furthermore, $c$ is the expected return of a portfolio $\pi^*$ whose covariance with $\tilde{\pi}$ is zero.
Existence of a risk-free asset has the effect of making the efficient frontier a straight line extending from the risk-free rate to the point where the line is tangential to the original efficient frontier for the risky assets. This leads to the one-fund theorem: There is a single fund (portfolio) such that any efficient portfolio can be constructed as a combination of the fund and the risk-free asset.
The implication of Proposition 2.2.3 in the presence of a risk-free asset is that there exists a linear relationship between any security and portfolios on the efficient frontier involving so-called 'beta factors':
\[
m_i - r = \beta_{i,p}(m_p - r), \tag{2.17}
\]
where $\beta_{i,p}$ is a linear factor defined as
\[
\beta_{i,p} = \mathrm{Cov}(R_i, R_p)/\sigma_p^2.
\]
This extends to any portfolio $q$, i.e.
\[
m_q - r = \beta_{q,p}(m_p - r),
\]
with $\beta_{q,p} = \mathrm{Cov}(R_q, R_p)/\sigma_p^2$.
2.2.2 Capital asset pricing model
We now consider a so-called Equilibrium Model. The focus of attention is turned from the individual investor to the aggregate market for securities (and all investors) as a whole.
We need assumptions on the investors' behaviour and the market as a whole.
• All investors have the same one-period horizon.
• All investors can borrow or lend at the same risk-free rate.
• The markets for risky assets are perfect. Information is freely and instantly available to all investors and no investor believes that they can affect the price of a security by their own action.
• Investors have the same estimates of the expected returns, standard deviations and covariances over the one-period horizon.
• All investors measure in the same numéraire.
Under the assumptions of mean-variance theory we have in equilibrium
1. If investors have homogeneous expectations, then they are all faced by the same efficient frontier of risky securities.
2. If there is a risk-free asset the efficient frontier collapses for all investors to a straight line which passes through the risk-free rate of return on the $m$ axis and is tangential to the efficient frontier.
All investors face the same efficient frontier because they have the same views on the available securities. Thus:
1. All rational investors will hold a combination of the risk-free asset and the portfolio of assets where the straight line through the risk-free return touches the original efficient frontier.
2. Because investors share the same efficient frontier they all hold the same diversified portfolio. Because this portfolio is held in different quantities by all investors it must consist of all risky assets in proportion to their market capitalization. It is commonly called the 'market portfolio'.
3. Other strategies are non-optimal.
The line denoting the efficient frontier is called the capital market line and its equation is
\[
m_p - r = (m_M - r)\frac{\sigma_p}{\sigma_M} \tag{2.18}
\]
where $m_p$ is the expected return of any portfolio $p$ on the efficient frontier; $\sigma_p$ is the standard deviation of the return on portfolio $p$; $m_M$ is the expected return on the market portfolio; $\sigma_M$ is the standard deviation of the market portfolio and $r$ is the risk-free rate of return.
Thus the expected return on any efficient portfolio is a linear function of its standard deviation. The factor
\[
\frac{m_M - r}{\sigma_M}
\]
is called the market price of risk.
We can also develop an equation relating the expected return of any asset to the return of the market
\[
m_i - r = (m_M - r)\beta_i \tag{2.19}
\]
where $m_i$ is the expected return on security $i$; $\beta_i$ is the beta factor of security $i$, defined as $\mathrm{Cov}(R_i, R_M)/\mathrm{Var}(R_M)$; $m_M$ is the expected return on the market portfolio and $r$ is the risk-free rate of return. Equation (2.19) is called the security market line. It shows that the expected return of any security (and portfolio) can be expressed as a linear function of the security's covariance with the market as a whole.
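The security market line (2.19) can be verified on a toy example. The following Python sketch assumes two risky assets and identifies the market portfolio with the tangency portfolio, whose weights are proportional to $\Sigma^{-1}(m - r\mathbf{1})$ (a standard fact of mean-variance analysis, used here as an assumption since the notes do not derive it). All numbers are hypothetical, and $r$ is interpreted as a gross risk-free return so that it is comparable with the gross returns $R_i = S_i(T)/S_i(0)$.

```python
# Hypothetical inputs (assumptions for illustration only).
r = 1.02                               # gross risk-free return
m = [1.06, 1.12]                       # expected gross returns m_i
sigma = [[0.04, 0.01],
         [0.01, 0.09]]                 # covariance matrix of returns

# Tangency ("market") portfolio: weights proportional to Sigma^{-1} (m - r 1),
# using the closed-form inverse of a 2x2 matrix.
det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
excess = [mi - r for mi in m]
raw = [(sigma[1][1] * excess[0] - sigma[0][1] * excess[1]) / det,
       (-sigma[1][0] * excess[0] + sigma[0][0] * excess[1]) / det]
total = sum(raw)
w = [x / total for x in raw]           # market weights, summing to one

m_M = sum(wi * mi for wi, mi in zip(w, m))
cov_with_M = [sum(sigma[i][j] * w[j] for j in range(2)) for i in range(2)]
var_M = sum(w[i] * cov_with_M[i] for i in range(2))

# Beta factors and the expected returns implied by the security market line.
betas = [c / var_M for c in cov_with_M]
sml_returns = [r + b * (m_M - r) for b in betas]
print(betas, sml_returns)
```

The returns implied by the security market line coincide with the assumed means `m`, which is the content of (2.19) when the market portfolio is the tangency portfolio.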
2.2.3 Portfolio optimisation and the absence of arbitrage
Consider the standard one-period model with assets $(S_0, \ldots, S_d)$, $S_0$ the risk-free bond with interest rate $r > 0$. Denote by $\varphi = (\varphi_0, \ldots, \varphi_d)$ portfolio vectors specifying the number of shares of the assets in the portfolio.
We consider an investor with utility function $\tilde{u}$. A rational choice of the investor's portfolio will be based on expected utility
\[
\mathbb{E}(\tilde{u}(\varphi' S(T)))
\]
of the payoff $\varphi' S(T)$ at time $T$, where the portfolio $\varphi$ satisfies the budget constraint
\[
\varphi' S(0) \leq \omega
\]
with $\omega$ the initial wealth of the investor. We consider the discounted net gain
\[
\frac{\varphi' S(T)}{1 + r} - \varphi' S(0) = \varphi' Y
\]
with $Y = (Y_0, Y_1, \ldots, Y_d)$ and $Y_i = \frac{S_i(T)}{1 + r} - S_i(0)$. (Here $Y_0 = 0$!)
For any portfolio $\varphi$ with $\varphi' S(0) < \omega$, adding the risk-free investment $\omega - \varphi' S(0)$ would lead to the strictly better portfolio $(\varphi_0 + \omega - \varphi' S(0), \varphi_1, \ldots, \varphi_d)$. Thus we can focus on $\varphi$ with $\varphi' S(0) = \omega$. Then the payoff is an affine function of the discounted net gain
\[
\varphi' S(T) = (1 + r)(\varphi' Y + \omega).
\]
Since $Y_0 = 0$ we only need to focus on the risky assets. Define the following transformation of the original utility function $\tilde{u}$:
\[
u(y) = \tilde{u}((1 + r)(y + \omega)).
\]
So the optimization problem is equivalent to maximizing the expected utility
\[
\mathbb{E}(u(\varphi' Y)) \tag{2.20}
\]
among all $\varphi \in \mathbb{R}^d$ such that $\varphi' Y$ is contained in the domain $D$ of $u$.
Assumption A1. Either
(a) $D = \mathbb{R}$; then we admit all $\varphi \in \mathbb{R}^d$, but assume that $u$ is bounded from above (Example: $u(x) = 1 - e^{-\alpha x}$),
or
(b) $D = [a, \infty)$ for some $a < 0$. In this case we only consider portfolios with $\varphi' Y \geq a$ and assume that the expected utility generated by such portfolios is finite, i.e.
\[
\mathbb{E}[u(\varphi' Y)] < \infty \quad \text{for all } \varphi \in \mathbb{R}^d \text{ with } \varphi' Y \geq a.
\]
(Example: $u(x) = \frac{1}{\gamma}(x - c)^{\gamma}$.)
Denote by
\[
\mathcal{S}(D) = \{\varphi \in \mathbb{R}^d \mid \varphi' Y \in D\}
\]
the set of admissible portfolios for $D$. Our aim now is to find a $\varphi^* \in \mathcal{S}(D)$ which maximizes the expected utility $\mathbb{E}(u(\varphi' Y))$ among $\varphi \in \mathcal{S}(D)$.
Theorem 2.2.1. Let the above assumption A1 hold true. Then there exists a maximizer of the expected utility
\[
\mathbb{E}[u(\varphi' Y)], \quad \varphi \in \mathcal{S}(D),
\]
if and only if the market model is arbitrage-free. Moreover, there exists at most one maximizer if the market model is complete.
Proof. Uniqueness follows from the strict concavity of the function $\varphi \mapsto \mathbb{E}(u(\varphi' Y))$ for complete market models.
In case the model is incomplete we can find a complete submodel and apply the result to this submodel. So we may assume completeness. (Recall completeness is equivalent to $\eta' Y = 0 \Rightarrow \eta = 0$.) If the model admits arbitrage, we find a vector $\xi$ with $\xi' Y \geq 0$ $\mathbb{P}$-a.s. and $\mathbb{P}(\xi' Y > 0) > 0$ with no initial investment. So for $\varphi^*$ optimal
\[
\mathbb{E}(u((\varphi^*)' Y)) < \mathbb{E}(u((\varphi^* + \xi)' Y)),
\]
a contradiction.
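Assume now that the market is arbitrage-free. We consider the case $D = [a, \infty)$ for some $a \in (-\infty, 0)$. We show
(i) $\mathcal{S}(D)$ is compact;
(ii) $\varphi \mapsto \mathbb{E}[u(\varphi' Y)]$ is continuous.
Clearly, (i) and (ii) imply existence of a maximizer.
To show (i), assume that $(\varphi_n)$ is a diverging sequence in $\mathcal{S}(D)$. By choosing a subsequence if necessary, we may assume that $\eta_n = \varphi_n/|\varphi_n|$ converges to some unit vector $\eta \in \mathbb{R}^d$. Then
\[
\eta' Y = \lim_{n \to \infty} \frac{\varphi_n' Y}{|\varphi_n|} \geq \lim_{n \to \infty} \frac{a}{|\varphi_n|} = 0 \quad \mathbb{P}\text{-a.s.},
\]
and so, unless $\eta' Y = 0$ $\mathbb{P}$-a.s., $\tilde{\varphi} = (-S(0)'\eta, \eta)$ is an arbitrage opportunity. However, under completeness $\eta' Y = 0$ $\mathbb{P}$-a.s. implies $\eta = 0$, which contradicts $|\eta| = 1$.
(ii) To show continuity it suffices to construct an integrable random variable which dominates $u(\varphi' Y)$ for all $\varphi \in \mathcal{S}(D)$. Define $\eta \in \mathbb{R}^d$ by
\[
\eta_i = 0 \vee \max_{\varphi \in \mathcal{S}(D)} \varphi_i < \infty.
\]
Then $\eta' S(T) \geq \varphi' S(T)$ for $\varphi \in \mathcal{S}(D)$ and hence
\[
\varphi' Y = \frac{\varphi' S(T)}{1 + r} - \varphi' S(0) \leq \frac{\eta' S(T)}{1 + r} - \left( 0 \wedge \min_{\xi \in \mathcal{S}(D)} \xi' S(0) \right).
\]
Now $\eta' Y$ is bounded below by $-\eta' S(0)$ and there exists some $\alpha \in (0, 1]$ such that $\alpha\eta' S(0) < |a|$. Hence $\alpha\eta \in \mathcal{S}(D)$ and by our assumption $\mathbb{E}(u(\alpha\eta' Y)) < \infty$.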
Applying Lemma 2.2.1 first with $b := \alpha S(0)'\eta$ and then with $b := -\left(0 \wedge \min_{\varphi \in \mathcal{S}(D)} \varphi' S(0)\right)$ shows that
\[
\mathbb{E}\left[ u\left( \frac{\eta' S(T)}{1 + r} - 0 \wedge \min_{\varphi \in \mathcal{S}(D)} \varphi' S(0) \right) \right] < \infty.
\]
Lemma 2.2.1. If $D = [a, \infty)$, $b < |a|$, $0 < \alpha \leq 1$, and $X$ is a nonnegative random variable, then
\[
\mathbb{E}[u(\alpha X - b)] < \infty \Rightarrow \mathbb{E}[u(X)] < \infty.
\]
Proof. The concavity of $u$ implies that $u$ has a decreasing right-continuous derivative $u'$. Hence
\[
u(\alpha X - b) = u(-b) + \int_{-b}^{\alpha X - b} u'(x)\,dx \geq u(-b) + \int_{0}^{\alpha X} u'(y)\,dy
= u(-b) + \alpha \int_{0}^{X} u'(\alpha z)\,dz \geq u(-b) + \alpha(u(X) - u(0)).
\]
This shows that $u(X)$ can be dominated by a multiple of $u(\alpha X - b)$ plus some constant.
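We now turn to a characterization of the solution $\varphi^*$ of the utility maximization problem for continuously differentiable utility functions.
Proposition 2.2.4. Let $u$ be a twice continuously differentiable utility function on $D$ such that $\mathbb{E}(u(\varphi' Y))$ is finite for all $\varphi \in \mathcal{S}(D)$. Suppose that $\varphi^*$ is a solution of the utility maximization problem, and that one of the following conditions is satisfied. Either
• $u$ is defined on $D = \mathbb{R}$ and bounded from above, or
• $u$ is defined on $D = [a, \infty)$ and $\varphi^*$ is an interior point of $\mathcal{S}(D)$.
Then
\[
u'((\varphi^*)' Y)\,|Y| \in L^1(\mathbb{P})
\]
and the following first-order condition holds:
\[
\mathbb{E}[u'((\varphi^*)' Y)\,Y] = 0.
\]
Proof. For $\varphi \in \mathcal{S}(D)$ and $\varepsilon \in (0, 1]$ let $\varphi_\varepsilon = \varepsilon\varphi + (1 - \varepsilon)\varphi^*$ and define
\[
\Delta_\varepsilon = \frac{u(\varphi_\varepsilon' Y) - u((\varphi^*)' Y)}{\varepsilon}.
\]
The concavity of $u$ implies that $\Delta_\varepsilon \geq \Delta_\delta$ for $\varepsilon \leq \delta$, and so
\[
\Delta_\varepsilon \uparrow u'((\varphi^*)' Y)\,(\varphi - \varphi^*)' Y \quad \text{as } \varepsilon \downarrow 0. \tag{2.21}
\]
Since $\Delta_1 = u(\varphi' Y) - u((\varphi^*)' Y) \in L^1(\mathbb{P})$, monotone convergence and the optimality of $\varphi^*$ yield
\[
0 \geq \mathbb{E}(\Delta_\varepsilon) \uparrow \mathbb{E}\left[ u'((\varphi^*)' Y)\,(\varphi - \varphi^*)' Y \right] \quad \text{as } \varepsilon \downarrow 0. \tag{2.22}
\]
In particular, the expectation on the right-hand side is finite.
Both sets of assumptions imply that $\varphi^*$ is an interior point of $\mathcal{S}(D)$. Hence from (2.22) we find by letting $\eta = \varphi - \varphi^*$ that
\[
\mathbb{E}(u'((\varphi^*)' Y)\,\eta' Y) \leq 0
\]
for all $\eta$ in a small ball centered at the origin of $\mathbb{R}^d$. Replacing $\eta$ by $-\eta$ shows that the expectation must vanish; i.e. $\mathbb{E}(u'((\varphi^*)' Y)\,\eta' Y) = 0$ for all $\eta$ in a small ball around the origin, so $\mathbb{E}(u'((\varphi^*)' Y)\,Y) = 0$.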
We can now give a characterisation of an equivalent risk-neutral measure.
Corollary 2.2.1. Suppose that the market model is arbitrage-free and that the assumptions of Proposition 2.2.4 are satisfied for a utility function $u : D \to \mathbb{R}$. Let $\varphi^*$ be a maximizer of the expected utility. Then
\[
\frac{d\mathbb{P}^*}{d\mathbb{P}} = \frac{u'((\varphi^*)' Y)}{\mathbb{E}(u'((\varphi^*)' Y))} \tag{2.23}
\]
defines an equivalent risk-neutral measure.
Proof. Recall that $\mathbb{E}^*(Y) = 0$ is the criterion for risk-neutrality, which is satisfied by Proposition 2.2.4. Hence $\mathbb{P}^*$ is a risk-neutral measure if it is well-defined, i.e.
\[
u'((\varphi^*)' Y) \in L^1(\mathbb{P}).
\]
Let
\[
c = \sup\{u'(x) \mid x \in D \text{ and } |x| \leq |\varphi^*|\} \leq
\begin{cases} u'(a) & D = [a, \infty), \\ u'(-|\varphi^*|) & D = \mathbb{R}, \end{cases}
\]
which is finite by our assumption that $u$ is continuously differentiable on $D$. So
\[
0 \leq u'((\varphi^*)' Y) \leq c + u'((\varphi^*)' Y)\,|Y|\,\mathbf{1}_{\{|Y| \geq 1\}}
\]
and the right-hand side has finite expectation.
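The construction in Corollary 2.2.1 can be carried out explicitly in a small finite model. The following Python sketch assumes a single risky asset with three states, exponential utility $u(x) = 1 - e^{-\alpha x}$ (case (a) of Assumption A1) and hypothetical numbers for the probabilities and the discounted net gain $Y$. It solves the first-order condition of Proposition 2.2.4 by bisection, builds the density (2.23) and checks that $\mathbb{E}^*(Y) = 0$.

```python
import math

# Hypothetical three-state model with one risky asset (assumed values).
probs = [0.3, 0.4, 0.3]       # physical probabilities P({omega})
Y = [-1.0, 0.5, 2.0]          # discounted net gain; both signs occur, so no arbitrage
alpha = 1.0                   # parameter of u(x) = 1 - exp(-alpha * x)


def u_prime(x):
    """Derivative of the exponential utility u(x) = 1 - exp(-alpha * x)."""
    return alpha * math.exp(-alpha * x)


def foc(phi):
    """E[u'(phi * Y) * Y], the first-order condition of Proposition 2.2.4."""
    return sum(p * u_prime(phi * y) * y for p, y in zip(probs, Y))


def maximizer(lo=-10.0, hi=10.0, iterations=200):
    """Bisection; foc is strictly decreasing in phi since u is strictly concave."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


phi_star = maximizer()
marginal = [u_prime(phi_star * y) for y in Y]
norm = sum(p * w for p, w in zip(probs, marginal))      # E[u'(phi* Y)]
density = [w / norm for w in marginal]                   # dP*/dP from (2.23)
q = [p * d for p, d in zip(probs, density)]              # risk-neutral probabilities
expected_Y_star = sum(qk * y for qk, y in zip(q, Y))     # E*(Y)
print(phi_star, q, expected_Y_star)
```

The resulting probabilities `q` are strictly positive and sum to one, so $\mathbb{P}^*$ is equivalent to $\mathbb{P}$, and the discounted net gain has zero expectation under $\mathbb{P}^*$.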
Remark 2.2.2. We can now give a constructive proof of the first fundamental theorem of asset pricing. Suppose the model is arbitrage-free.
(i) If $Y$ is $\mathbb{P}$-a.s. bounded, then so is $u'((\varphi^*)' Y)$ and the measure $\mathbb{P}^*$ is an equivalent martingale measure with a bounded density.
(ii) If $Y$ is unbounded we may consider the bounded random vector
\[
\tilde{Y} = \frac{Y}{1 + |Y|}
\]
which also satisfies the no-arbitrage condition.
Let $\tilde{\varphi}^*$ be a maximiser of the expected utility $\mathbb{E}(u(\varphi' \tilde{Y}))$. Then an equivalent martingale measure is given by $\mathbb{P}^*$ defined via the bounded density
\[
\frac{d\mathbb{P}^*}{d\mathbb{P}} = c\,\frac{u'((\tilde{\varphi}^*)' \tilde{Y})}{1 + |Y|}
\]
where $c$ is an appropriate constant.
Chapter 3
Discrete-time models of financial markets
3.1 The model
We will study socalled ﬁnite markets – i.e. discretetime models of ﬁnancial markets in which all
relevant quantities take a ﬁnite number of values. Following the approach of Harrison and Pliska
(1981) and Taqqu and Willinger (1987), it suﬃces, to illustrate the ideas, to work with a ﬁnite
probability space (Ω, T, IP), with a ﬁnite number [Ω[ of points ω, each with positive probability:
IP(¦ω¦) > 0.
We specify a time horizon T, which is the terminal date for all economic activities considered.
(For a simple option pricing model the time horizon typically corresponds to the expiry date of
the option.)
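As before, we use a filtration $\mathbb{F} = \{\mathcal{F}_t\}_{t=0}^{T}$ consisting of $\sigma$-algebras $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \cdots \subset \mathcal{F}_T$: we take $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the trivial $\sigma$-field, and $\mathcal{F}_T = \mathcal{F} = \mathcal{P}(\Omega)$ (here $\mathcal{P}(\Omega)$ is the power set of $\Omega$, the class of all $2^{|\Omega|}$ subsets of $\Omega$: we need every possible subset, as they all, apart from the empty set, carry positive probability).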
The financial market contains $d + 1$ financial assets. The usual interpretation is to assume one risk-free asset (bond, bank account) labelled 0, and $d$ risky assets (stocks, say) labelled 1 to $d$. While the reader may keep this interpretation as a mental picture, we prefer not to use it directly. The prices of the assets at time $t$ are random variables, $S_0(t, \omega), S_1(t, \omega), \ldots, S_d(t, \omega)$ say, non-negative and $\mathcal{F}_t$-measurable (i.e. adapted: at time $t$, we know the prices $S_i(t)$). We write $S(t) = (S_0(t), S_1(t), \ldots, S_d(t))'$ for the vector of prices at time $t$. Hereafter we refer to the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, the set of trading dates, the price process $S$ and the information structure $\mathbb{F}$, which is typically generated by the price process $S$, together as a securities market model.
It will be essential to assume that the price process of at least one asset follows a strictly positive process.
Definition 3.1.1. A numéraire is a price process $(X(t))_{t=0}^{T}$ (a sequence of random variables), which is strictly positive for all $t \in \{0, 1, \ldots, T\}$.
For the standard approach the risk-free bank account process is used as numéraire. In some applications, however, it is more convenient to use a security other than the bank account, and we therefore just use $S_0$ without further specification as a numéraire. We furthermore take $S_0(0) = 1$ (that is, we reckon in units of the initial value of our numéraire), and define $\beta(t) := 1/S_0(t)$ as a discount factor.
A trading strategy (or dynamic portfolio) $\varphi$ is an $\mathbb{R}^{d+1}$-vector stochastic process $\varphi = (\varphi(t))_{t=1}^{T} = ((\varphi_0(t, \omega), \varphi_1(t, \omega), \ldots, \varphi_d(t, \omega))')_{t=1}^{T}$ which is predictable (or previsible): each $\varphi_i(t)$ is $\mathcal{F}_{t-1}$-measurable for $t \ge 1$. Here $\varphi_i(t)$ denotes the number of shares of asset $i$ held in the portfolio at time $t$, to be determined on the basis of information available before time $t$; i.e. the investor selects his time $t$ portfolio after observing the prices $S(t-1)$. However, the portfolio $\varphi(t)$ must be established before, and held until after, announcement of the prices $S(t)$. The components $\varphi_i(t)$ may assume negative as well as positive values, reflecting the fact that we allow short sales and assume that the assets are perfectly divisible.
Definition 3.1.2. The value of the portfolio at time $t$ is the scalar product
$$V_\varphi(t) = \varphi(t) \cdot S(t) := \sum_{i=0}^{d} \varphi_i(t) S_i(t), \quad (t = 1, 2, \ldots, T) \quad \text{and} \quad V_\varphi(0) = \varphi(1) \cdot S(0).$$
The process $V_\varphi(t, \omega)$ is called the wealth or value process of the trading strategy $\varphi$.
The initial wealth $V_\varphi(0)$ is called the initial investment or endowment of the investor.
Now $\varphi(t) \cdot S(t-1)$ reflects the market value of the portfolio just after it has been established at time $t-1$, whereas $\varphi(t) \cdot S(t)$ is the value just after time $t$ prices are observed, but before changes are made in the portfolio. Hence
$$\varphi(t) \cdot (S(t) - S(t-1)) = \varphi(t) \cdot \Delta S(t)$$
is the change in the market value due to changes in security prices which occur between time $t-1$ and $t$. This motivates:
Definition 3.1.3. The gains process $G_\varphi$ of a trading strategy $\varphi$ is given by
$$G_\varphi(t) := \sum_{\tau=1}^{t} \varphi(\tau) \cdot (S(\tau) - S(\tau-1)) = \sum_{\tau=1}^{t} \varphi(\tau) \cdot \Delta S(\tau), \quad (t = 1, 2, \ldots, T).$$
Observe the (for now formal) similarity of the gains process $G_\varphi$ from trading in $S$ following a trading strategy $\varphi$ to the martingale transform of $S$ by $\varphi$.
Define $\tilde{S}(t) = (1, \beta(t)S_1(t), \ldots, \beta(t)S_d(t))'$, the vector of discounted prices, and consider the discounted value process
$$\tilde{V}_\varphi(t) = \beta(t)(\varphi(t) \cdot S(t)) = \varphi(t) \cdot \tilde{S}(t), \quad (t = 1, 2, \ldots, T)$$
and the discounted gains process
$$\tilde{G}_\varphi(t) := \sum_{\tau=1}^{t} \varphi(\tau) \cdot (\tilde{S}(\tau) - \tilde{S}(\tau-1)) = \sum_{\tau=1}^{t} \varphi(\tau) \cdot \Delta \tilde{S}(\tau), \quad (t = 1, 2, \ldots, T).$$
Observe that the discounted gains process reflects the gains from trading with assets 1 to $d$ only, which in the case of the standard model (a bank account and $d$ stocks) are the risky assets.
We will only consider special classes of trading strategies.
Definition 3.1.4. The strategy $\varphi$ is self-financing, $\varphi \in \Phi$, if
$$\varphi(t) \cdot S(t) = \varphi(t+1) \cdot S(t) \quad (t = 1, 2, \ldots, T-1). \tag{3.1}$$
Interpretation.
When new prices $S(t)$ are quoted at time $t$, the investor adjusts his portfolio from $\varphi(t)$ to $\varphi(t+1)$, without bringing in or consuming any wealth. The following result (which is trivial in our current setting, but requires a little argument in continuous time) shows that renormalising security prices (i.e. changing the numéraire) has essentially no economic effects.
Proposition 3.1.1 (Numéraire Invariance). Let $X(t)$ be a numéraire. A trading strategy $\varphi$ is self-financing with respect to $S(t)$ if and only if $\varphi$ is self-financing with respect to $X(t)^{-1} S(t)$.
Proof. Since $X(t)$ is strictly positive for all $t = 0, 1, \ldots, T$ we have the following equivalence, which implies the claim:
$$\varphi(t) \cdot S(t) = \varphi(t+1) \cdot S(t) \quad (t = 1, 2, \ldots, T-1)$$
$$\Leftrightarrow$$
$$\varphi(t) \cdot X(t)^{-1} S(t) = \varphi(t+1) \cdot X(t)^{-1} S(t) \quad (t = 1, 2, \ldots, T-1).$$
Corollary 3.1.1. A trading strategy $\varphi$ is self-financing with respect to $S(t)$ if and only if $\varphi$ is self-financing with respect to $\tilde{S}(t)$.
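We now give a characterisation of self-financing strategies in terms of the discounted processes.
Proposition 3.1.2. A trading strategy $\varphi$ belongs to $\Phi$ if and only if
$$\tilde{V}_\varphi(t) = V_\varphi(0) + \tilde{G}_\varphi(t), \quad (t = 0, 1, \ldots, T). \tag{3.2}$$
Proof. Assume $\varphi \in \Phi$. Then using the defining relation (3.1), the numéraire invariance theorem and the fact that $S_0(0) = 1$,
$$\begin{aligned}
V_\varphi(0) + \tilde{G}_\varphi(t) &= \varphi(1) \cdot S(0) + \sum_{\tau=1}^{t} \varphi(\tau) \cdot (\tilde{S}(\tau) - \tilde{S}(\tau-1)) \\
&= \varphi(1) \cdot \tilde{S}(0) + \varphi(t) \cdot \tilde{S}(t) + \sum_{\tau=1}^{t-1} (\varphi(\tau) - \varphi(\tau+1)) \cdot \tilde{S}(\tau) - \varphi(1) \cdot \tilde{S}(0) \\
&= \varphi(t) \cdot \tilde{S}(t) = \tilde{V}_\varphi(t).
\end{aligned}$$
Here the sum in the second line vanishes term by term, by the discounted version of (3.1).
Assume now that (3.2) holds true. By the numéraire invariance theorem it is enough to show the discounted version of relation (3.1). Written out for $t = 2$, (3.2) reads
$$\varphi(2) \cdot \tilde{S}(2) = \varphi(1) \cdot \tilde{S}(0) + \varphi(1) \cdot (\tilde{S}(1) - \tilde{S}(0)) + \varphi(2) \cdot (\tilde{S}(2) - \tilde{S}(1)).$$
Subtracting $\varphi(2) \cdot \tilde{S}(2)$ on both sides gives $\varphi(2) \cdot \tilde{S}(1) = \varphi(1) \cdot \tilde{S}(1)$, which is (3.1) for $t = 1$. Proceeding similarly, or by induction, we can show $\varphi(t) \cdot \tilde{S}(t) = \varphi(t+1) \cdot \tilde{S}(t)$ for $t = 2, \ldots, T-1$ as required.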
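The relations (3.1) and (3.2) can be checked numerically along a single price path. The following is a minimal sketch only: the two-asset price path, the holdings and the helper names (`dot`, `value`, `disc_gains`, `is_self_financing`, `satisfies_3_2`) are illustrative assumptions and do not come from the notes. Exact rational arithmetic is used so that the equalities can be tested without rounding.

```python
from fractions import Fraction as F

# One fixed scenario omega; S[t] = (S_0(t), S_1(t)). Illustrative numbers only.
S = [(F(1), F(100)), (F(11, 10), F(120)), (F(121, 100), F(108))]
T = len(S) - 1

# phi[t] = (phi_0(t), phi_1(t)), chosen at time t-1 and held until time t.
# phi[0] is a placeholder: strategies are indexed t = 1, ..., T.
phi = [None, (F(50), F(1, 2)), (F(0), F(23, 24))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def value(t):
    """V_phi(t) = phi(t).S(t) for t >= 1 and V_phi(0) = phi(1).S(0)."""
    return dot(phi[1], S[0]) if t == 0 else dot(phi[t], S[t])

# Discounted prices S~(t) = beta(t) S(t) with beta(t) = 1 / S_0(t).
S_disc = [tuple(s / row[0] for s in row) for row in S]

def disc_gains(t):
    """G~_phi(t) = sum over tau = 1..t of phi(tau).(S~(tau) - S~(tau-1))."""
    return sum(
        dot(phi[tau], [a - b for a, b in zip(S_disc[tau], S_disc[tau - 1])])
        for tau in range(1, t + 1)
    )

def is_self_financing():
    """Condition (3.1): phi(t).S(t) == phi(t+1).S(t) for t = 1..T-1."""
    return all(dot(phi[t], S[t]) == dot(phi[t + 1], S[t]) for t in range(1, T))

def satisfies_3_2():
    """Condition (3.2): V~_phi(t) == V_phi(0) + G~_phi(t) for t = 0..T."""
    return all(value(t) / S[t][0] == value(0) + disc_gains(t) for t in range(T + 1))
```

For this particular path both `is_self_financing()` and `satisfies_3_2()` hold, in line with Proposition 3.1.2; changing a single holding in `phi[2]` breaks both checks at once.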
We are allowed to borrow (so $\varphi_0(t)$ may be negative) and sell short (so $\varphi_i(t)$ may be negative for $i = 1, \ldots, d$). So it is hardly surprising that if we decide what to do about the risky assets and fix an initial endowment, the numéraire will take care of itself, in the following sense.
Proposition 3.1.3. If $(\varphi_1(t), \ldots, \varphi_d(t))'$ is predictable and $V_0$ is $\mathcal{F}_0$-measurable, there is a unique predictable process $(\varphi_0(t))_{t=1}^{T}$ such that $\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)'$ is self-financing with initial value of the corresponding portfolio $V_\varphi(0) = V_0$.
Proof. If $\varphi$ is self-financing, then by Proposition 3.1.2,
$$\tilde{V}_\varphi(t) = V_0 + \tilde{G}_\varphi(t) = V_0 + \sum_{\tau=1}^{t} (\varphi_1(\tau)\Delta\tilde{S}_1(\tau) + \ldots + \varphi_d(\tau)\Delta\tilde{S}_d(\tau)).$$
On the other hand,
$$\tilde{V}_\varphi(t) = \varphi(t) \cdot \tilde{S}(t) = \varphi_0(t) + \varphi_1(t)\tilde{S}_1(t) + \ldots + \varphi_d(t)\tilde{S}_d(t).$$
Equate these:
$$\varphi_0(t) = V_0 + \sum_{\tau=1}^{t} (\varphi_1(\tau)\Delta\tilde{S}_1(\tau) + \ldots + \varphi_d(\tau)\Delta\tilde{S}_d(\tau)) - (\varphi_1(t)\tilde{S}_1(t) + \ldots + \varphi_d(t)\tilde{S}_d(t)),$$
which defines $\varphi_0(t)$ uniquely. The terms in $\tilde{S}_i(t)$ are
$$\varphi_i(t)\Delta\tilde{S}_i(t) - \varphi_i(t)\tilde{S}_i(t) = -\varphi_i(t)\tilde{S}_i(t-1),$$
which is $\mathcal{F}_{t-1}$-measurable. So
$$\varphi_0(t) = V_0 + \sum_{\tau=1}^{t-1} (\varphi_1(\tau)\Delta\tilde{S}_1(\tau) + \ldots + \varphi_d(\tau)\Delta\tilde{S}_d(\tau)) - (\varphi_1(t)\tilde{S}_1(t-1) + \ldots + \varphi_d(t)\tilde{S}_d(t-1)),$$
where, as $\varphi_1, \ldots, \varphi_d$ are predictable, all terms on the right-hand side are $\mathcal{F}_{t-1}$-measurable, so $\varphi_0$ is predictable.
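The last formula of the proof is constructive, and a small numerical sketch may help to see it at work. The fragment below assumes one risky asset, a single fixed price path and made-up numbers; the names `S1_disc`, `phi1`, `V0`, `phi0` and `holdings` are hypothetical. It builds the numéraire holding from information up to time $t-1$ only.

```python
from fractions import Fraction as F

# Discounted price of one risky asset along a fixed path (illustrative numbers).
S1_disc = [F(100), F(110), F(99), F(120)]
# phi1[t] = risky holding chosen at time t-1, for t = 1..3; index 0 is unused.
phi1 = [None, F(2), F(-1), F(3)]
V0 = F(50)  # initial endowment

def phi0(t):
    """Numeraire holding phi_0(t); uses prices and holdings up to time t-1 only."""
    past_gains = sum(
        phi1[tau] * (S1_disc[tau] - S1_disc[tau - 1]) for tau in range(1, t)
    )
    return V0 + past_gains - phi1[t] * S1_disc[t - 1]

# Full strategy (phi_0(t), phi_1(t)) for t = 1..3.
holdings = [None] + [(phi0(t), phi1[t]) for t in range(1, 4)]
```

Since the discounted numéraire price is 1, the discounted value of the portfolio at time $t$ is `holdings[t][0] + holdings[t][1] * S1_disc[t]`; with the numbers above the initial value equals `V0` and the discounted form of (3.1) holds at $t = 1, 2$.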
Remark 3.1.1. Proposition 3.1.3 has a further important consequence: for defining a gains process $\tilde{G}_\varphi$ only the components $(\varphi_1(t), \ldots, \varphi_d(t))'$ are needed. If we require them to be predictable, they correspond in a unique way (after fixing the initial endowment) to a self-financing trading strategy. Thus for the discounted world, predictable strategies and the final cash-flows generated by them are all that matters.
We now turn to the modelling of derivative instruments in our current framework. This is done
in the following fashion.
Definition 3.1.5. A contingent claim $X$ with maturity date $T$ is an arbitrary $\mathcal{F}_T = \mathcal{F}$-measurable random variable (which is bounded, by the finiteness of the probability space). We denote the class of all contingent claims by $L^0 = L^0(\Omega, \mathcal{F}, \mathbb{P})$.
The notation $L^0$ for contingent claims is motivated by their being simply random variables in our context (and by the functional-analytic spaces used later on).
A typical example of a contingent claim $X$ is an option on some underlying asset $S$; then (e.g. for the case of a European call option with maturity date $T$ and strike $K$) we have a functional relation $X = f(S)$ with some function $f$ (e.g. $X = (S(T) - K)^+$). The general definition allows for more complicated relationships, which are captured by the $\mathcal{F}_T$-measurability of $X$ (recall that $\mathcal{F}_T$ is typically generated by the process $S$).
3.2 Existence of Equivalent Martingale Measures
3.2.1 The No-Arbitrage Condition
The central principle in the single-period example was the absence of arbitrage opportunities, i.e. the absence of investment strategies for making profits without exposure to risk. As mentioned there, this principle is central for any market model, and we now define the mathematical counterpart of this economic principle in our current setting.
Definition 3.2.1. Let $\tilde{\Phi} \subset \Phi$ be a set of self-financing strategies. A strategy $\varphi \in \tilde{\Phi}$ is called an arbitrage opportunity or arbitrage strategy with respect to $\tilde{\Phi}$ if $\mathbb{P}\{V_\varphi(0) = 0\} = 1$, and the terminal wealth of $\varphi$ satisfies
$$\mathbb{P}\{V_\varphi(T) \ge 0\} = 1 \quad \text{and} \quad \mathbb{P}\{V_\varphi(T) > 0\} > 0.$$
So an arbitrage opportunity is a self-financing strategy with zero initial value, which produces a non-negative final value with probability one and has a positive probability of a positive final value. Observe that arbitrage opportunities are always defined with respect to a certain class of trading strategies.
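On a finite probability space in which every $\omega$ has positive probability, the three conditions of Definition 3.2.1 reduce to statements about finitely many numbers. A minimal sketch, assuming the strategy is already known to be self-financing and that its terminal values are listed state by state (the function name `is_arbitrage` is hypothetical):

```python
def is_arbitrage(initial_value, terminal_values):
    """Check Definition 3.2.1 for a self-financing strategy on a finite Omega.

    terminal_values[i] is V_phi(T, omega_i); every omega_i is assumed to carry
    positive probability, so 'with probability one' means 'in every state'.
    """
    return (
        initial_value == 0
        and all(v >= 0 for v in terminal_values)
        and any(v > 0 for v in terminal_values)
    )
```

For instance, `is_arbitrage(0, [0, 5, 0])` is true, whereas a strategy that loses money in some state, or that never makes a strictly positive profit, is not an arbitrage opportunity.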
Definition 3.2.2. We say that a security market $\mathcal{M}$ is arbitrage-free if there are no arbitrage opportunities in the class $\Phi$ of trading strategies.
We will allow ourselves to use 'no-arbitrage' in place of 'arbitrage-free' when convenient.
The fundamental insight in the single-period example was the equivalence of the no-arbitrage condition and the existence of risk-neutral probabilities. For the multi-period case we now use probabilistic machinery to establish the corresponding result.
Definition 3.2.3. A probability measure $\mathbb{P}^*$ on $(\Omega, \mathcal{F}_T)$ equivalent to $\mathbb{P}$ is called a martingale measure for $\tilde{S}$ if the process $\tilde{S}$ follows a $\mathbb{P}^*$-martingale with respect to the filtration $\mathbb{F}$. We denote by $\mathcal{P}(\tilde{S})$ the class of equivalent martingale measures.
Proposition 3.2.1. Let $\mathbb{P}^*$ be an equivalent martingale measure ($\mathbb{P}^* \in \mathcal{P}(\tilde{S})$) and $\varphi \in \Phi$ any self-financing strategy. Then the wealth process $\tilde{V}_\varphi(t)$ is a $\mathbb{P}^*$-martingale with respect to the filtration $\mathbb{F}$.
Proof. By the self-financing property of $\varphi$ (compare Proposition 3.1.2, (3.2)), we have
$$\tilde{V}_\varphi(t) = V_\varphi(0) + \tilde{G}_\varphi(t) \quad (t = 0, 1, \ldots, T).$$
So
$$\tilde{V}_\varphi(t+1) - \tilde{V}_\varphi(t) = \tilde{G}_\varphi(t+1) - \tilde{G}_\varphi(t) = \varphi(t+1) \cdot (\tilde{S}(t+1) - \tilde{S}(t)).$$
So for $\varphi \in \Phi$, $\tilde{V}_\varphi(t)$ is the martingale transform of the $\mathbb{P}^*$-martingale $\tilde{S}$ by $\varphi$ (see Theorem C.4.1) and hence a $\mathbb{P}^*$-martingale itself.
Observe that in our setting all processes are bounded, i.e. the martingale transform theorem is applicable without further restrictions. The next result is the key for the further development.
Proposition 3.2.2. If an equivalent martingale measure exists, that is, if $\mathcal{P}(\tilde{S}) \ne \emptyset$, then the market $\mathcal{M}$ is arbitrage-free.
Proof. Assume such a $\mathbb{P}^*$ exists. For any self-financing strategy $\varphi$, we have as before
$$\tilde{V}_\varphi(t) = V_\varphi(0) + \sum_{\tau=1}^{t} \varphi(\tau) \cdot \Delta\tilde{S}(\tau).$$
By Proposition 3.2.1, $\tilde{S}(t)$ being a (vector) $\mathbb{P}^*$-martingale implies that $\tilde{V}_\varphi(t)$ is a $\mathbb{P}^*$-martingale. So the initial and final $\mathbb{P}^*$-expectations are the same,
$$\mathbb{E}^*(\tilde{V}_\varphi(T)) = \mathbb{E}^*(\tilde{V}_\varphi(0)).$$
If the strategy is an arbitrage opportunity, its initial value (the right-hand side above) is zero. Therefore the left-hand side $\mathbb{E}^*(\tilde{V}_\varphi(T))$ is zero, but $\tilde{V}_\varphi(T) \ge 0$ (by definition). Also each $\mathbb{P}^*(\{\omega\}) > 0$ (by assumption, each $\mathbb{P}(\{\omega\}) > 0$, so by equivalence each $\mathbb{P}^*(\{\omega\}) > 0$). This and $\tilde{V}_\varphi(T) \ge 0$ force $\tilde{V}_\varphi(T) = 0$, which contradicts $\mathbb{P}\{V_\varphi(T) > 0\} > 0$. So no arbitrage is possible.
Proposition 3.2.3. If the market $\mathcal{M}$ is arbitrage-free, then the class $\mathcal{P}(\tilde{S})$ of equivalent martingale measures is non-empty.
For the proof (for which we follow Schachermayer (2000)) we need some auxiliary observations.
Recall the definition of arbitrage, i.e. Definition 3.2.1, in our finite-dimensional setting: a self-financing trading strategy $\varphi \in \Phi$ is an arbitrage opportunity if $V_\varphi(0) = 0$, $V_\varphi(T, \omega) \ge 0$ for all $\omega \in \Omega$, and there exists an $\omega \in \Omega$ with $V_\varphi(T, \omega) > 0$.
Now call $L^0 = L^0(\Omega, \mathcal{F}, \mathbb{P})$ the set of random variables on $(\Omega, \mathcal{F})$ and
$$L^0_{++}(\Omega, \mathcal{F}, \mathbb{P}) := \{X \in L^0 : X(\omega) \ge 0 \ \forall \omega \in \Omega \text{ and } \exists\, \omega \in \Omega \text{ such that } X(\omega) > 0\}.$$
(Observe that $L^0_{++}$ is a cone, closed under vector addition and multiplication by positive scalars.) Using $L^0_{++}$ we can write the no-arbitrage condition more compactly as
$$V_\varphi(0) = \tilde{V}_\varphi(0) = 0 \ \Rightarrow\ \tilde{V}_\varphi(T) \notin L^0_{++}(\Omega, \mathcal{F}, \mathbb{P})$$
for any self-financing strategy $\varphi$.
The next lemma formulates the no-arbitrage condition in terms of discounted gains processes. The important advantage in using this setting (rather than a setting in terms of value processes) is that we only have to assume predictability of a vector process $(\varphi_1, \ldots, \varphi_d)$. Recall Remark 3.1.1 and Proposition 3.1.3 here: we can choose a process $\varphi_0$ in such a way that the strategy $\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)$ has zero initial value and is self-financing.
Lemma 3.2.1. In an arbitrage-free market any predictable vector process $\varphi = (\varphi_1, \ldots, \varphi_d)$ satisfies
$$\tilde{G}_\varphi(T) \notin L^0_{++}(\Omega, \mathcal{F}, \mathbb{P}).$$
(Observe the slight abuse of notation: for the value of the discounted gains process the zeroth component of a trading strategy doesn't matter. Hence we use the operator $\tilde{G}$ for $d$-dimensional vectors as well.)
Proof. By Proposition 3.1.3 there exists a unique predictable process $(\varphi_0(t))$ such that $\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)$ has zero initial value and is self-financing. Assume $\tilde{G}_\varphi(T) \in L^0_{++}(\Omega, \mathcal{F}, \mathbb{P})$. Then using Proposition 3.1.2,
$$V_\varphi(T) = \beta(T)^{-1}\tilde{V}_\varphi(T) = \beta(T)^{-1}(V_\varphi(0) + \tilde{G}_\varphi(T)) = \beta(T)^{-1}\tilde{G}_\varphi(T) \ge 0,$$
and is positive somewhere (i.e. with positive probability) by definition of $L^0_{++}$. Hence $\varphi$ is an arbitrage opportunity with respect to $\Phi$. This contradicts the assumption that the market is arbitrage-free.
We now define the space of contingent claims, i.e. random variables on $(\Omega, \mathcal{F})$, which an economic agent may replicate with zero initial investment by pursuing some predictable trading strategy $\varphi$.
Definition 3.2.4. We call the subspace $K$ of $L^0(\Omega, \mathcal{F}, \mathbb{P})$ defined by
$$K = \{X \in L^0(\Omega, \mathcal{F}, \mathbb{P}) : X = \tilde{G}_\varphi(T),\ \varphi \text{ predictable}\}$$
the set of contingent claims attainable at price 0.
We can now restate Lemma 3.2.1 in terms of spaces: a market is arbitrage-free if and only if
$$K \cap L^0_{++}(\Omega, \mathcal{F}, \mathbb{P}) = \emptyset. \tag{3.3}$$
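Proof of Proposition 3.2.3. Since our market model is finite we can use results from Euclidean geometry (in particular, we can identify $L^0$ with $\mathbb{R}^{|\Omega|}$). By assumption we have (3.3), i.e. $K$ and $L^0_{++}$ do not intersect. So $K$ does not meet the subset
$$D := \Big\{X \in L^0_{++} : \sum_{\omega \in \Omega} X(\omega) = 1\Big\}.$$
Now $D$ is a compact convex set. By the separating hyperplane theorem, there is a vector $\lambda = (\lambda(\omega) : \omega \in \Omega)$ such that for all $X \in D$
$$\lambda \cdot X := \sum_{\omega \in \Omega} \lambda(\omega) X(\omega) > 0, \tag{3.4}$$
but for all $\tilde{G}_\varphi(T)$ in $K$,
$$\lambda \cdot \tilde{G}_\varphi(T) = \sum_{\omega \in \Omega} \lambda(\omega)\,\tilde{G}_\varphi(T)(\omega) = 0. \tag{3.5}$$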
Choosing each $\omega \in \Omega$ successively and taking $X$ to be 1 on this $\omega$ and zero elsewhere, (3.4) tells us that each $\lambda(\omega) > 0$. So
$$\mathbb{P}^*(\{\omega\}) := \frac{\lambda(\omega)}{\sum_{\omega' \in \Omega} \lambda(\omega')}$$
defines a probability measure equivalent to $\mathbb{P}$ (no non-empty null sets). With $\mathbb{E}^*$ as $\mathbb{P}^*$-expectation, (3.5) says that
$$\mathbb{E}^*\big(\tilde{G}_\varphi(T)\big) = 0,$$
i.e.
$$\mathbb{E}^*\Big(\sum_{\tau=1}^{T} \varphi(\tau) \cdot \Delta\tilde{S}(\tau)\Big) = 0.$$
In particular, choosing for each $i$ to hold only stock $i$,
$$\mathbb{E}^*\Big(\sum_{\tau=1}^{T} \varphi_i(\tau)\Delta\tilde{S}_i(\tau)\Big) = 0 \quad (i = 1, \ldots, d).$$
Since this holds for any predictable $\varphi$ (boundedness holds automatically as $\Omega$ is finite), the martingale transform lemma tells us that the discounted price processes $(\tilde{S}_i(t))$ are $\mathbb{P}^*$-martingales.
Note. Our situation is finite-dimensional, so all we have used here is Euclidean geometry. We have a subspace, and a cone not meeting the subspace except at the origin. Take $\lambda$ orthogonal to the subspace on the same side of the subspace as the cone. The separating hyperplane theorem holds also in infinite-dimensional situations, where it is a form of the Hahn-Banach theorem of functional analysis.
We now combine Propositions 3.2.2 and 3.2.3 as a first central theorem in this chapter.
Theorem 3.2.1 (No-Arbitrage Theorem). The market $\mathcal{M}$ is arbitrage-free if and only if there exists a probability measure $\mathbb{P}^*$ equivalent to $\mathbb{P}$ under which the discounted $d$-dimensional asset price process $\tilde{S}$ is a $\mathbb{P}^*$-martingale.
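The dichotomy of the theorem can be seen in the smallest possible case. The sketch below assumes a one-period market with one risky asset and two states, working directly with discounted prices; the function name `emm_or_arbitrage`, the closed form for the up-probability and the choice of arbitrage position are worked out for this toy case only and are not taken from the notes. Either the discounted price is a martingale under some measure with positive weights, or holding (or shorting) one share at zero initial cost is an arbitrage opportunity.

```python
from fractions import Fraction as F

def emm_or_arbitrage(s0, s_up, s_down):
    """One period, two states, discounted prices with s_down < s_up.

    Returns ("emm", q_up) if q_up * s_up + (1 - q_up) * s_down == s0 with
    0 < q_up < 1, and otherwise ("arbitrage", position), where position is the
    number of shares (+1 or -1) to hold, financed through the numeraire.
    """
    if not s_down < s_up:
        raise ValueError("need s_down < s_up")
    if s_down < s0 < s_up:
        q_up = F(s0 - s_down) / F(s_up - s_down)
        return ("emm", q_up)
    # s0 <= s_down: buying never loses and wins in the up state.
    # s0 >= s_up: short-selling never loses and wins in the down state.
    position = 1 if s0 <= s_down else -1
    return ("arbitrage", position)
```

With discounted prices 100 today and 120 or 90 tomorrow, `emm_or_arbitrage(100, 120, 90)` yields the martingale probability 1/3 for the up state; moving today's price to 90 or below, or to 120 or above, produces an arbitrage position instead.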
3.2.2 Risk-Neutral Pricing
We now turn to the main underlying question of this text, namely the pricing of contingent claims (i.e. financial derivatives). As in Chapter 1, the basic idea is to reproduce the cash flow of a contingent claim in terms of a portfolio of the underlying assets. On the other hand, the equivalence of the no-arbitrage condition and the existence of risk-neutral probability measures implies the possibility of using risk-neutral measures for pricing purposes. We will explore the relation between these two approaches in this subsection.
We say that a contingent claim $X$ is attainable if there exists a replicating strategy $\varphi \in \Phi$ such that
$$V_\varphi(T) = X.$$
So the replicating strategy generates the same time $T$ cash-flow as does $X$. Working with discounted values (recall we use $\beta$ as the discount factor) we find
$$\beta(T)X = \tilde{V}_\varphi(T) = V_\varphi(0) + \tilde{G}_\varphi(T). \tag{3.6}$$
So the discounted value of a contingent claim is given by the initial cost of setting up a replicating strategy and the gains from trading. In a highly efficient security market we expect that the law of one price holds true, that is, for a specified cash-flow there exists only one price at any time instant. Otherwise arbitrageurs would use the opportunity to cash in a riskless profit. So the no-arbitrage condition implies that for an attainable contingent claim its time $t$ price must be given by the value (initial cost) of any replicating strategy (we say the claim is uniquely replicated in that case). This is the basic idea of the arbitrage pricing theory.
Let us investigate replicating strategies a bit further. The idea is to replicate a given cash-flow at a given point in time. Using a self-financing trading strategy the investor's wealth may go negative at time $t < T$, but he must be able to cover his debt at the final date. To avoid negative wealth the concept of admissible strategies is introduced. A self-financing trading strategy $\varphi \in \Phi$ is called admissible if $V_\varphi(t) \ge 0$ for each $t = 0, 1, \ldots, T$. We write $\Phi_a$ for the class of admissible trading strategies. The modelling assumption of admissible strategies reflects the economic fact that the broker should be protected from unbounded short sales. In our current setting all processes are bounded anyway, so this distinction is not really needed and we use self-financing strategies when addressing the mathematical aspects of the theory. (In fact one can show that a security market which is arbitrage-free with respect to $\Phi_a$ is also arbitrage-free with respect to $\Phi$; see Exercises.)
We now return to the main question of the section: given a contingent claim $X$, i.e. a cash-flow at time $T$, how can we determine its value (price) at time $t < T$? For an attainable contingent claim this value should be given by the value of any replicating strategy at time $t$, i.e. there should be a unique value process (say $V_X(t)$) representing the time $t$ value of the simple contingent claim $X$. The following proposition ensures that the value processes of replicating trading strategies coincide, thus proving the uniqueness of the value process.
Proposition 3.2.4. Suppose the market $\mathcal{M}$ is arbitrage-free. Then any attainable contingent claim $X$ is uniquely replicated in $\mathcal{M}$.
Proof. Suppose there is an attainable contingent claim $X$ and strategies $\varphi$ and $\psi$ such that
$$V_\varphi(T) = V_\psi(T) = X,$$
but there exists a $\tau < T$ such that
$$V_\varphi(u) = V_\psi(u) \text{ for every } u < \tau \quad \text{and} \quad V_\varphi(\tau) \ne V_\psi(\tau).$$
Define $A := \{\omega \in \Omega : V_\varphi(\tau, \omega) > V_\psi(\tau, \omega)\}$; then $A \in \mathcal{F}_\tau$ and $\mathbb{P}(A) > 0$ (otherwise just rename the strategies). Define the $\mathcal{F}_\tau$-measurable random variable $Y := V_\varphi(\tau) - V_\psi(\tau)$ and consider the trading strategy $\xi$ defined by
$$\xi(u) = \begin{cases} \varphi(u) - \psi(u), & u \le \tau, \\ 1_{A^c}(\varphi(u) - \psi(u)) + 1_A\,(Y\beta(\tau), 0, \ldots, 0), & \tau < u \le T. \end{cases}$$
The idea here is to use $\varphi$ and $\psi$ to construct a self-financing strategy with zero initial investment (hence use their difference $\xi$) and put any gains at time $\tau$ in the savings account (i.e. invest them risk-free) up to time $T$.
We need to show formally that $\xi$ satisfies the conditions of an arbitrage opportunity. By construction $\xi$ is predictable and the self-financing condition (3.1) is clearly true for $t \ne \tau$, and for $t = \tau$ we have, using that $\varphi, \psi \in \Phi$,
$$\begin{aligned}
\xi(\tau) \cdot S(\tau) &= (\varphi(\tau) - \psi(\tau)) \cdot S(\tau) = V_\varphi(\tau) - V_\psi(\tau), \\
\xi(\tau+1) \cdot S(\tau) &= 1_{A^c}(\varphi(\tau+1) - \psi(\tau+1)) \cdot S(\tau) + 1_A\,Y\beta(\tau)S_0(\tau) \\
&= 1_{A^c}(\varphi(\tau) - \psi(\tau)) \cdot S(\tau) + 1_A\,(V_\varphi(\tau) - V_\psi(\tau))\beta(\tau)\beta^{-1}(\tau) \\
&= V_\varphi(\tau) - V_\psi(\tau).
\end{aligned}$$
Hence $\xi$ is a self-financing strategy with initial value equal to zero. Furthermore
$$\begin{aligned}
V_\xi(T) &= 1_{A^c}(\varphi(T) - \psi(T)) \cdot S(T) + 1_A\,(Y\beta(\tau), 0, \ldots, 0) \cdot S(T) \\
&= 1_A\,Y\beta(\tau)S_0(T) \ge 0
\end{aligned}$$
and
$$\mathbb{P}\{V_\xi(T) > 0\} = \mathbb{P}(A) > 0.$$
Hence the market contains an arbitrage opportunity with respect to the class $\Phi$ of self-financing strategies. But this contradicts the assumption that the market $\mathcal{M}$ is arbitrage-free.
This uniqueness property allows us now to define the important concept of an arbitrage price process.
Definition 3.2.5. Suppose the market is arbitrage-free. Let $X$ be any attainable contingent claim with time $T$ maturity. Then the arbitrage price process $\pi_X(t)$, $0 \le t \le T$, or simply arbitrage price of $X$, is given by the value process of any replicating strategy $\varphi$ for $X$.
The construction of hedging strategies that replicate the outcome of a contingent claim (for example a European option) is an important problem in both practical and theoretical applications. Hedging is central to the theory of option pricing. The classical arbitrage valuation models, such as the Black-Scholes model (Black and Scholes 1973), depend on the idea that an option can be perfectly hedged using the underlying asset (in our case the assets of the market model), so making it possible to create a portfolio that replicates the option exactly. Hedging is also widely used to reduce risk, and the kinds of delta-hedging strategies implicit in the Black-Scholes model are used by participants in option markets. We will come back to hedging problems subsequently.
Analysing the arbitrage-pricing approach we observe that the derivation of the price of a contingent claim doesn't require any specific preferences of the agents other than non-satiation, i.e. agents prefer more to less, which rules out arbitrage. So, the pricing formula for any attainable contingent claim must be independent of all preferences that do not admit arbitrage. In particular, an economy of risk-neutral investors must price a contingent claim in the same manner. This fundamental insight, due to Cox and Ross (Cox and Ross 1976) in the case of a simple economy (a riskless asset and one risky asset) and in its general form due to Harrison and Kreps (Harrison and Kreps 1979), simplifies the pricing formula enormously. In its general form the price of an attainable simple contingent claim is just the expected value of the discounted payoff with respect to an equivalent martingale measure.
Proposition 3.2.5. The arbitrage price process of any attainable contingent claim $X$ is given by the risk-neutral valuation formula
$$\pi_X(t) = \beta(t)^{-1}\,\mathbb{E}^*(X\beta(T) \mid \mathcal{F}_t) \quad \forall t = 0, 1, \ldots, T, \tag{3.7}$$
where $\mathbb{E}^*$ is the expectation operator with respect to an equivalent martingale measure $\mathbb{P}^*$.
Proof. Since we assume that the market is arbitrage-free, there exists (at least) one equivalent martingale measure $\mathbb{P}^*$. By Proposition 3.2.1 the discounted value process $\tilde{V}_\varphi$ of any self-financing strategy $\varphi$ is a $\mathbb{P}^*$-martingale. So for any attainable contingent claim $X$ with maturity $T$ and any replicating trading strategy $\varphi \in \Phi$ we have for each $t = 0, 1, \ldots, T$
$$\begin{aligned}
\pi_X(t) = V_\varphi(t) &= \beta(t)^{-1}\tilde{V}_\varphi(t) \\
&= \beta(t)^{-1}\,\mathbb{E}^*(\tilde{V}_\varphi(T) \mid \mathcal{F}_t) && \text{(as } \tilde{V}_\varphi(t) \text{ is a } \mathbb{P}^*\text{-martingale)} \\
&= \beta(t)^{-1}\,\mathbb{E}^*(\beta(T)V_\varphi(T) \mid \mathcal{F}_t) && \text{(undoing the discounting)} \\
&= \beta(t)^{-1}\,\mathbb{E}^*(\beta(T)X \mid \mathcal{F}_t) && \text{(as } \varphi \text{ is a replicating strategy for } X\text{)}.
\end{aligned}$$
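Formula (3.7) can be evaluated by brute force on a small finite $\Omega$. The sketch below assumes a two-period binomial market with made-up parameters `r`, `u`, `d`, a call payoff with strike `K`, the filtration generated by the stock moves, and a candidate measure with up-probability `q = 1/2`, chosen so that the discounted stock is a martingale for these parameters. All names are hypothetical, and the conditional expectation is computed as a weighted average over the atom of $\mathcal{F}_t$ containing $\omega$.

```python
from fractions import Fraction as F
from itertools import product

r, u, d = F(1, 20), F(1, 5), F(-1, 10)  # illustrative rates, not from the notes
S0, K, T = F(100), F(100), 2
q = F(1, 2)  # candidate risk-neutral up-probability for these parameters

omega_space = list(product("ud", repeat=T))  # paths of up/down moves

def stock(t, omega):
    """S(t, omega): apply the first t moves of the path to S0."""
    price = S0
    for move in omega[:t]:
        price *= 1 + (u if move == "u" else d)
    return price

def beta(t):
    """Discount factor beta(t) = 1 / (1 + r)^t."""
    return (1 + r) ** (-t)

def prob_star(omega):
    """P*({omega}) as a product of one-step probabilities."""
    p = F(1)
    for move in omega:
        p *= q if move == "u" else 1 - q
    return p

def cond_exp_star(X, t, omega):
    """E*(X | F_t)(omega): average over paths agreeing with omega up to time t."""
    atom = [w for w in omega_space if w[:t] == omega[:t]]
    mass = sum(prob_star(w) for w in atom)
    return sum(prob_star(w) * X(w) for w in atom) / mass

def price(t, omega):
    """Formula (3.7) for the call payoff X = (S(T) - K)^+."""
    discounted_payoff = lambda w: max(stock(T, w) - K, F(0)) * beta(T)
    return cond_exp_star(discounted_payoff, t, omega) / beta(t)
```

Under these assumptions `price(0, omega)` does not depend on `omega`, and `price(T, omega)` returns the payoff itself, as the formula requires.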
3.3 Complete Markets: Uniqueness of Equivalent Martingale Measures
The last section made clear that attainable contingent claims can be priced using an equivalent martingale measure. In this section we will discuss the question of the circumstances under which all contingent claims are attainable. This would be a very desirable property of the market $\mathcal{M}$, because we would then have solved the pricing question (at least for contingent claims) completely. Since contingent claims are merely $\mathcal{F}_T$-measurable random variables in our setting, it should be no surprise that we can give a criterion in terms of probability measures. We start with:
Definition 3.3.1. A market $\mathcal{M}$ is complete if every contingent claim is attainable, i.e. for every $\mathcal{F}_T$-measurable random variable $X \in L^0$ there exists a replicating self-financing strategy $\varphi \in \Phi$ such that $V_\varphi(T) = X$.
In the case of an arbitrage-free market $\mathcal{M}$ one can even insist on replicating non-negative contingent claims by an admissible strategy $\varphi \in \Phi_a$. Indeed, if $\varphi$ is self-financing and $\mathbb{P}^*$ is an equivalent martingale measure under which discounted prices $\tilde{S}$ are $\mathbb{P}^*$-martingales (such $\mathbb{P}^*$ exist since $\mathcal{M}$ is arbitrage-free and we can hence use the no-arbitrage theorem (Theorem 3.2.1)), $\tilde{V}_\varphi(t)$ is also a $\mathbb{P}^*$-martingale, being the martingale transform of the martingale $\tilde{S}$ by $\varphi$ (see Proposition 3.2.1). So
$$\tilde{V}_\varphi(t) = \mathbb{E}^*(\tilde{V}_\varphi(T) \mid \mathcal{F}_t) \quad (t = 0, 1, \ldots, T).$$
If $\varphi$ replicates $X$, $V_\varphi(T) = X \ge 0$, so discounting, $\tilde{V}_\varphi(T) \ge 0$, so the above equation gives $\tilde{V}_\varphi(t) \ge 0$ for each $t$. Thus all the values at each time $t$ are non-negative, not just the final value at time $T$, so $\varphi$ is admissible.
Theorem 3.3.1 (Completeness Theorem). An arbitrage-free market $\mathcal{M}$ is complete if and only if there exists a unique probability measure $\mathbb{P}^*$ equivalent to $\mathbb{P}$ under which discounted asset prices are martingales.
Proof. '$\Rightarrow$': Assume that the arbitrage-free market $\mathcal{M}$ is complete. Then for any $\mathcal{F}_T$-measurable random variable $X$ (contingent claim), there exists an admissible (so self-financing) strategy $\varphi$ replicating $X$: $X = V_\varphi(T)$. As $\varphi$ is self-financing, by Proposition 3.1.2,
$$\beta(T)X = \tilde{V}_\varphi(T) = V_\varphi(0) + \sum_{\tau=1}^{T} \varphi(\tau) \cdot \Delta\tilde{S}(\tau).$$
We know by the no-arbitrage theorem (Theorem 3.2.1) that an equivalent martingale measure $\mathbb{P}^*$ exists; we have to prove uniqueness. So, let $\mathbb{P}_1$, $\mathbb{P}_2$ be two such equivalent martingale measures. For $i = 1, 2$, $(\tilde{V}_\varphi(t))_{t=0}^{T}$ is a $\mathbb{P}_i$-martingale. So,
$$\mathbb{E}_i(\tilde{V}_\varphi(T)) = \mathbb{E}_i(\tilde{V}_\varphi(0)) = V_\varphi(0),$$
since the value at time zero is non-random ($\mathcal{F}_0 = \{\emptyset, \Omega\}$) and $\beta(0) = 1$. So
$$\mathbb{E}_1(\beta(T)X) = \mathbb{E}_2(\beta(T)X).$$
Since $X$ is arbitrary, $\mathbb{E}_1$, $\mathbb{E}_2$ have to agree on integrating all integrands. Now $\mathbb{E}_i$ is expectation (i.e. integration) with respect to the measure $\mathbb{P}_i$, and measures that agree on integrating all integrands must coincide. So $\mathbb{P}_1 = \mathbb{P}_2$, giving uniqueness as required.
'$\Leftarrow$': Assume that the arbitrage-free market $\mathcal{M}$ is incomplete: then there exists a non-attainable $\mathcal{F}_T$-measurable random variable $X$ (a contingent claim). By Proposition 3.1.3, we may confine attention to the risky assets $S_1, \ldots, S_d$, as these suffice to tell us how to handle the numéraire $S_0$.
Consider the following set of random variables:
$$\tilde{K} := \Big\{Y \in L^0 : Y = Y_0 + \sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t),\ Y_0 \in \mathbb{R},\ \varphi \text{ predictable}\Big\}.$$
(Recall that $Y_0$ is $\mathcal{F}_0$-measurable and set $\varphi = ((\varphi_1(t), \ldots, \varphi_d(t))')_{t=1}^{T}$ with predictable components.) Then by the above reasoning, the discounted value $\beta(T)X$ does not belong to $\tilde{K}$, so $\tilde{K}$ is a proper subspace of the set $L^0$ of all random variables on $\Omega$ (which may be identified with $\mathbb{R}^{|\Omega|}$). Let $\mathbb{P}^*$ be a probability measure equivalent to $\mathbb{P}$ under which discounted prices are martingales (such $\mathbb{P}^*$ exist by the no-arbitrage theorem (Theorem 3.2.1)). Define the scalar product
$$(Z, Y) \mapsto \mathbb{E}^*(ZY)$$
on random variables on $\Omega$. Since $\tilde{K}$ is a proper subspace, there exists a non-zero random variable $Z$ orthogonal to $\tilde{K}$ (since $\Omega$ is finite, $\mathbb{R}^{|\Omega|}$ is Euclidean: this is just Euclidean geometry). That is,
$$\mathbb{E}^*(ZY) = 0, \quad \forall\, Y \in \tilde{K}.$$
Choosing the special $Y = 1 \in \tilde{K}$ given by $\varphi_i(t) = 0$, $t = 1, 2, \ldots, T$; $i = 1, \ldots, d$ and $Y_0 = 1$ we find
$$\mathbb{E}^*(Z) = 0.$$
Write $\|X\|_\infty := \sup\{|X(\omega)| : \omega \in \Omega\}$, and define $\mathbb{P}^{**}$ by
$$\mathbb{P}^{**}(\{\omega\}) = \Big(1 + \frac{Z(\omega)}{2\|Z\|_\infty}\Big)\mathbb{P}^*(\{\omega\}).$$
By construction, $\mathbb{P}^{**}$ is equivalent to $\mathbb{P}^*$ (same null sets; actually, as $\mathbb{P}^* \sim \mathbb{P}$ and $\mathbb{P}$ has no non-empty null sets, neither do $\mathbb{P}^*$, $\mathbb{P}^{**}$). From $\mathbb{E}^*(Z) = 0$, we see that $\sum_{\omega \in \Omega} \mathbb{P}^{**}(\{\omega\}) = 1$, i.e. $\mathbb{P}^{**}$ is a probability measure. As $Z$ is non-zero, $\mathbb{P}^{**}$ and $\mathbb{P}^*$ are different. Now
$$\begin{aligned}
\mathbb{E}^{**}\Big(\sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t)\Big) &= \sum_{\omega \in \Omega} \mathbb{P}^{**}(\{\omega\}) \Big(\sum_{t=1}^{T} \varphi(t, \omega) \cdot \Delta\tilde{S}(t, \omega)\Big) \\
&= \sum_{\omega \in \Omega} \Big(1 + \frac{Z(\omega)}{2\|Z\|_\infty}\Big)\mathbb{P}^*(\{\omega\}) \Big(\sum_{t=1}^{T} \varphi(t, \omega) \cdot \Delta\tilde{S}(t, \omega)\Big).
\end{aligned}$$
The '1' term on the right gives
$$\mathbb{E}^*\Big(\sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t)\Big),$$
which is zero since this is a martingale transform of the $\mathbb{P}^*$-martingale $\tilde{S}(t)$ (recall martingale transforms are by definition null at zero). The '$Z$' term gives a multiple of the inner product
$$\Big(Z, \sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t)\Big),$$
which is zero as $Z$ is orthogonal to $\tilde{K}$ and $\sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t) \in \tilde{K}$. By the martingale transform lemma (Lemma C.4.1), $\tilde{S}(t)$ is a $\mathbb{P}^{**}$-martingale since $\varphi$ is an arbitrary predictable process. Thus $\mathbb{P}^{**}$ is a second equivalent martingale measure, different from $\mathbb{P}^*$. So incompleteness implies non-uniqueness of equivalent martingale measures, as required.
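The construction of the second measure can be traced in a one-period market with one risky asset and three states, which is incomplete. In the sketch below the discounted price increments `dS`, the martingale measure `p_star` and the vector `Z` are illustrative assumptions; `Z` was chosen by hand to be orthogonal, under the scalar product $\mathbb{E}^*(ZY)$, both to the constants and to the increment, which together span $\tilde{K}$ in this one-period case.

```python
from fractions import Fraction as F

dS = [F(20), F(0), F(-20)]            # discounted price increment in the 3 states
p_star = [F(1, 4), F(1, 2), F(1, 4)]  # one equivalent martingale measure
Z = [F(1), F(-1), F(1)]               # orthogonal to K-tilde under E*(ZY)

def expect(p, X):
    """Expectation of the random variable X under the point masses p."""
    return sum(pi * xi for pi, xi in zip(p, X))

# Sup norm of Z and the tilted measure P**({w}) = (1 + Z(w) / (2 ||Z||)) P*({w}).
z_norm = max(abs(z) for z in Z)
p_2star = [(1 + z / (2 * z_norm)) * p for z, p in zip(Z, p_star)]
```

Here `p_2star` comes out as (3/8, 1/4, 3/8): it has positive weights summing to one, gives the increment zero expectation, and differs from `p_star`, so two equivalent martingale measures coexist in this toy market.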
Martingale Representation.
To say that every contingent claim can be replicated means that every $\mathbb{P}^*$-martingale (where $\mathbb{P}^*$ is the risk-neutral measure, which is unique) can be written, or represented, as a martingale transform (of the discounted prices) by a replicating (perfect-hedge) trading strategy $\varphi$. In stochastic-process language, this says that all $\mathbb{P}^*$-martingales can be represented as martingale transforms of discounted prices. Such martingale representation theorems hold much more generally, and are very important. For background, see (Revuz and Yor 1991, Yor 1978).
3.4 The Cox-Ross-Rubinstein Model
In this section we consider simple discrete-time financial market models. The development of the risk-neutral pricing formula is particularly clear in this setting since we require only elementary mathematical methods. The link to the fundamental economic principles of the arbitrage pricing method can be obtained equally straightforwardly. Moreover binomial models, by their very construction, give rise to simple and efficient numerical procedures. We start with the paradigm of all binomial models, the celebrated Cox-Ross-Rubinstein model (Cox, Ross, and Rubinstein 1979).
3.4.1 Model Structure
We take $d = 1$, that is, our model consists of two basic securities. Recall that the essence of the relative pricing theory is to take the price processes of these basic securities as given and price secondary securities in such a way that no arbitrage is possible.
Our time horizon is $T$ and the set of dates in our financial market model is $t = 0, 1, \ldots, T$. Assume that the first of our given basic securities is a (riskless) bond or bank account $B$, which yields a riskless rate of return $r > 0$ in each time interval $[t, t+1]$, i.e.
$$B(t+1) = (1+r)B(t), \quad B(0) = 1.$$
So its price process is $B(t) = (1+r)^t$, $t = 0, 1, \ldots, T$. Furthermore, we have a risky asset (stock) $S$ with price process
$$S(t+1) = \begin{cases} (1+u)S(t) & \text{with probability } p, \\ (1+d)S(t) & \text{with probability } 1-p, \end{cases} \quad t = 0, 1, \ldots, T-1,$$
with $-1 < d < u$, $S_0 \in \mathbb{R}_0^+$ (see Fig. 3.1 below).
Figure 3.1: One-step tree diagram. From $S(0)$ the price moves to $S(1) = (1+u)S(0)$ with probability $p$, or to $S(1) = (1+d)S(0)$ with probability $1-p$.
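Alternatively we write this as
$$Z(t+1) := \frac{S(t+1)}{S(t)} - 1, \qquad t = 0, 1, \dots, T-1.$$
We set up a probabilistic model by considering the $Z(t)$, $t = 1, \dots, T$, as random variables defined on probability spaces $(\tilde{\Omega}_t, \tilde{\mathcal{F}}_t, \tilde{\mathbb{P}}_t)$ with
$$\tilde{\Omega}_t = \tilde{\Omega} = \{d, u\},$$
$$\tilde{\mathcal{F}}_t = \tilde{\mathcal{F}} = \mathcal{P}(\tilde{\Omega}) = \{\emptyset, \{d\}, \{u\}, \tilde{\Omega}\},$$
$$\tilde{\mathbb{P}}_t = \tilde{\mathbb{P}} \quad \text{with} \quad \tilde{\mathbb{P}}(\{u\}) = p, \quad \tilde{\mathbb{P}}(\{d\}) = 1-p, \quad p \in (0,1).$$
On these probability spaces we define
$$Z(t, u) = u \quad \text{and} \quad Z(t, d) = d, \qquad t = 1, 2, \dots, T.$$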
Our aim, of course, is to define a probability space on which we can model the basic securities $(B, S)$. Since we can write the stock price as
$$S(t) = S(0) \prod_{\tau=1}^{t} (1 + Z(\tau)), \qquad t = 1, 2, \dots, T,$$
the above definitions suggest using as the underlying probabilistic model of the financial market the product space $(\Omega, \mathcal{F}, \mathbb{P})$ (see e.g. (Williams 1991) ch. 8), i.e.
$$\Omega = \tilde{\Omega}_1 \times \dots \times \tilde{\Omega}_T = \tilde{\Omega}^{T} = \{d, u\}^{T},$$
with each $\omega \in \Omega$ representing the successive values of $Z(t)$, $t = 1, 2, \dots, T$. Hence each $\omega \in \Omega$ is a $T$-tuple $\omega = (\tilde{\omega}_1, \dots, \tilde{\omega}_T)$ and $\tilde{\omega}_t \in \tilde{\Omega} = \{d, u\}$. For the $\sigma$-algebra we use $\mathcal{F} = \mathcal{P}(\Omega)$ and the probability measure is given by
$$\mathbb{P}(\{\omega\}) = \tilde{\mathbb{P}}_1(\{\tilde{\omega}_1\}) \times \dots \times \tilde{\mathbb{P}}_T(\{\tilde{\omega}_T\}) = \tilde{\mathbb{P}}(\{\tilde{\omega}_1\}) \times \dots \times \tilde{\mathbb{P}}(\{\tilde{\omega}_T\}).$$
The role of a product space is to model independent replication of a random experiment. The $Z(t)$ above are two-valued random variables, so can be thought of as tosses of a biased coin; we need to build a probability space on which we can model a succession of such independent tosses.
Now we redefine (with a slight abuse of notation) the $Z(t)$, $t = 1, \dots, T$, as random variables on $(\Omega, \mathcal{F}, \mathbb{P})$ as (the $t$th projection)
$$Z(t, \omega) = Z(t, \tilde{\omega}_t).$$
Observe that by this definition (and the above construction) $Z(1), \dots, Z(T)$ are independent and identically distributed with
$$\mathbb{P}(Z(t) = u) = p = 1 - \mathbb{P}(Z(t) = d).$$
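The product-space construction can be made concrete for a small horizon. The following is a minimal sketch, not part of the original notes: the helper names `product_space` and `prob_up` are illustrative assumptions, and outcomes are encoded as tuples of the letters "d" and "u". It enumerates $\Omega = \{d, u\}^T$, assigns the product measure, and checks the marginal and joint probabilities of up-moves.

```python
from itertools import product

def product_space(T, p):
    """Return {omega: P({omega})} for Omega = {d, u}^T under the product measure."""
    weights = {"u": p, "d": 1.0 - p}
    space = {}
    for omega in product("du", repeat=T):
        prob = 1.0
        for coord in omega:
            prob *= weights[coord]
        space[omega] = prob
    return space

def prob_up(space, t):
    """P(Z(t) = u), where Z(t) reads off the t-th coordinate (t = 1, ..., T)."""
    return sum(pr for omega, pr in space.items() if omega[t - 1] == "u")

space = product_space(T=3, p=0.6)
total = sum(space.values())                      # total mass of the measure
marginals = [prob_up(space, t) for t in (1, 2, 3)]
# joint probability of an up-move at both t = 1 and t = 2
joint = sum(pr for omega, pr in space.items()
            if omega[0] == "u" and omega[1] == "u")
```

Each marginal equals $p$ and the joint probability factorises as $p^2$, which is the independence claimed above.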
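To model the flow of information in the market we use the obvious filtration
$$\mathcal{F}_0 = \{\emptyset, \Omega\} \quad \text{(trivial $\sigma$-field)},$$
$$\mathcal{F}_t = \sigma(Z(1), \dots, Z(t)) = \sigma(S(1), \dots, S(t)),$$
$$\mathcal{F}_T = \mathcal{F} = \mathcal{P}(\Omega) \quad \text{(class of all subsets of $\Omega$)}.$$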
This construction emphasises again that a multi-period model can be viewed as a sequence of single-period models. Indeed, in the Cox-Ross-Rubinstein case we use identical and independent single-period models. As we will see in the sequel this will make the construction of equivalent martingale measures relatively easy. Unfortunately we can hardly defend the assumption of independent and identically distributed price movements at each time period in practical applications.
Remark 3.4.1. We used this example to show explicitly how to construct the underlying probability space. Having done this in full once, we will from now on feel free to take for granted the existence of an appropriate probability space on which all relevant random variables can be defined.
3.4.2 Risk-Neutral Pricing
We now turn to the pricing of derivative assets in the Cox-Ross-Rubinstein market model. To do so we first have to discuss whether the Cox-Ross-Rubinstein model is arbitrage-free and complete. To answer these questions we have, according to our fundamental theorems (Theorems 3.2.1 and 3.3.1), to understand the structure of equivalent martingale measures in the Cox-Ross-Rubinstein model. In trying to do this we use (as is quite natural and customary) the bond price process $B(t)$ as numéraire.
Our first task is to find an equivalent martingale measure $\mathbb{Q}$ such that the $Z(1), \dots, Z(T)$ remain independent and identically distributed, i.e. a probability measure $\mathbb{Q}$ defined as a product measure via a measure $\tilde{\mathbb{Q}}$ on $(\tilde{\Omega}, \tilde{\mathcal{F}})$ such that $\tilde{\mathbb{Q}}(\{u\}) = q$ and $\tilde{\mathbb{Q}}(\{d\}) = 1 - q$. We have:
Proposition 3.4.1. (i) A martingale measure $\mathbb{Q}$ for the discounted stock price $\tilde{S}$ exists if and only if
$$d < r < u. \tag{3.8}$$
(ii) If equation (3.8) holds true, then there is a unique such measure in $\mathcal{P}$ characterised by
$$q = \frac{r - d}{u - d}. \tag{3.9}$$
Proof. Since $S(t) = \tilde{S}(t)B(t) = \tilde{S}(t)(1+r)^{t}$, we have $Z(t+1) = S(t+1)/S(t) - 1 = (\tilde{S}(t+1)/\tilde{S}(t))(1+r) - 1$. So the discounted price $(\tilde{S}(t))$ is a $\mathbb{Q}$-martingale if and only if for $t = 0, 1, \dots, T-1$
$$\mathbb{E}_{\mathbb{Q}}[\tilde{S}(t+1) \mid \mathcal{F}_t] = \tilde{S}(t) \iff \mathbb{E}_{\mathbb{Q}}[\tilde{S}(t+1)/\tilde{S}(t) \mid \mathcal{F}_t] = 1 \iff \mathbb{E}_{\mathbb{Q}}[Z(t+1) \mid \mathcal{F}_t] = r.$$
But $Z(1), \dots, Z(T)$ are mutually independent and hence $Z(t+1)$ is independent of $\mathcal{F}_t = \sigma(Z(1), \dots, Z(t))$. So
$$r = \mathbb{E}_{\mathbb{Q}}(Z(t+1) \mid \mathcal{F}_t) = \mathbb{E}_{\mathbb{Q}}(Z(t+1)) = uq + d(1-q)$$
is a weighted average of $u$ and $d$; this can be $r$ if and only if $r \in [d, u]$. As $\mathbb{Q}$ is to be equivalent to $\mathbb{P}$ and $\mathbb{P}$ has no non-empty null sets, the cases $r = d$ and $r = u$ (which would force $q = 0$ or $q = 1$) are excluded and (3.8) is proved.
To prove uniqueness and to find the value of $q$ we simply observe that under (3.8)
$$uq + d(1-q) = r$$
has a unique solution. Solving it for $q$ leads to the above formula.
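As a small numerical illustration of Proposition 3.4.1, the sketch below computes $q$ from (3.9) and confirms that the expected one-period return under $q$ equals $r$. The function name `risk_neutral_q` and the parameter values are assumptions chosen for this example only.

```python
def risk_neutral_q(u, d, r):
    """Risk-neutral up-probability q = (r - d) / (u - d), valid only if d < r < u."""
    if not d < r < u:
        raise ValueError("no equivalent martingale measure: need d < r < u")
    return (r - d) / (u - d)

u, d, r = 0.10, -0.05, 0.02          # illustrative one-period returns
q = risk_neutral_q(u, d, r)
expected_return = u * q + d * (1 - q)  # should reproduce the riskless rate r
```

If condition (3.8) fails (for instance $r \ge u$), the function raises an error instead of returning a value outside $(0, 1)$.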
From now on we assume that (3.8) holds true. Using the above Proposition we immediately get:
Corollary 3.4.1. The Cox-Ross-Rubinstein model is arbitrage-free.
Proof. By Proposition 3.4.1 there exists an equivalent martingale measure, and by the no-arbitrage theorem (Theorem 3.2.1) this is enough to guarantee that the Cox-Ross-Rubinstein model is free of arbitrage.
Uniqueness of the solution of the linear equation $uq + d(1-q) = r$ under (3.8) gives completeness of the model, by the completeness theorem (Theorem 3.3.1):
Proposition 3.4.2. The Cox-Ross-Rubinstein model is complete.
One can translate this result, on uniqueness of the equivalent martingale measure, into financial language. Completeness means that all contingent claims can be replicated. If we do this in the large, we can do it in the small by restriction, and conversely, we can build up our full model from its constituent components. To summarize:
Corollary 3.4.2. The multi-period model is complete if and only if every underlying single-period model is complete.
We can now use the risk-neutral valuation formula to price every contingent claim in the Cox-Ross-Rubinstein model.
Proposition 3.4.3. The arbitrage price process of a contingent claim $X$ in the Cox-Ross-Rubinstein model is given by
$$\pi_X(t) = B(t)\, \mathbb{E}^{*}(X/B(T) \mid \mathcal{F}_t) \qquad \forall t = 0, 1, \dots, T,$$
where $\mathbb{E}^{*}$ is the expectation operator with respect to the unique equivalent martingale measure $\mathbb{P}^{*}$ characterised by $p^{*} = (r - d)/(u - d)$.
Proof. This follows directly from Proposition 3.2.4 since the Cox-Ross-Rubinstein model is arbitrage-free and complete.
We now give simple formulas for pricing (and hedging) of European contingent claims $X = f(S_T)$ for suitable functions $f$ (in this simple framework all functions $f : \mathbb{R} \to \mathbb{R}$). We use the notation
$$F_\tau(x, p) := \sum_{j=0}^{\tau} \binom{\tau}{j} p^{j} (1-p)^{\tau-j} f\big(x(1+u)^{j}(1+d)^{\tau-j}\big). \tag{3.10}$$
Observe that this is just an evaluation of $f$ along the probability-weighted paths of the price process over $\tau$ periods, starting from $x$. Accordingly, $j$ and $\tau - j$ are the numbers of times $Z(i)$ takes the values $u$ and $d$, respectively.
Corollary 3.4.3. Consider a European contingent claim with expiry $T$ given by $X = f(S_T)$. The arbitrage price process $\pi_X(t)$, $t = 0, 1, \dots, T$, of the contingent claim is given by (set $\tau = T - t$)
$$\pi_X(t) = (1+r)^{-\tau} F_\tau(S_t, p^{*}). \tag{3.11}$$
Proof. Recall that
$$S(t) = S(0) \prod_{j=1}^{t} (1 + Z(j)), \qquad t = 1, 2, \dots, T.$$
By Proposition 3.4.3 the price $\pi_X(t)$ of a contingent claim $X = f(S_T)$ at time $t$ is
$$\begin{aligned}
\pi_X(t) &= (1+r)^{-(T-t)}\, \mathbb{E}^{*}[f(S(T)) \mid \mathcal{F}_t] \\
&= (1+r)^{-(T-t)}\, \mathbb{E}^{*}\Big[ f\Big( S(t) \prod_{i=t+1}^{T} (1 + Z(i)) \Big) \,\Big|\, \mathcal{F}_t \Big] \\
&= (1+r)^{-(T-t)}\, \mathbb{E}^{*}\Big[ f\Big( S(t) \prod_{i=t+1}^{T} (1 + Z(i)) \Big) \Big] \\
&= (1+r)^{-\tau} F_\tau(S(t), p^{*}).
\end{aligned}$$
We used the role-of-independence property of conditional expectations from Proposition B.5.1 in the next-to-last equality (so that $S(t)$ is treated as a known constant in the third line). It is applicable since $S(t)$ is $\mathcal{F}_t$-measurable and $Z(t+1), \dots, Z(T)$ are independent of $\mathcal{F}_t$.
An immediate consequence is the pricing formula for the European call option, i.e. $X = f(S_T)$ with $f(x) = (x - K)^{+}$.
Corollary 3.4.4. Consider a European call option with expiry $T$ and strike price $K$ written on (one share of) the stock $S$. The arbitrage price process $\Pi_C(t)$, $t = 0, 1, \dots, T$, of the option is given by (set $\tau = T - t$)
$$\Pi_C(t) = (1+r)^{-\tau} \sum_{j=0}^{\tau} \binom{\tau}{j} (p^{*})^{j} (1-p^{*})^{\tau-j} \big( S(t)(1+u)^{j}(1+d)^{\tau-j} - K \big)^{+}. \tag{3.12}$$
For a European put option, we can either argue similarly or use put-call parity.
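Formulas (3.10) to (3.12) translate almost line by line into code. The following is a minimal sketch under illustrative parameter values; the function names `F` and `claim_price` are assumptions made for this example, not notation fixed by the notes beyond $F_\tau$ itself.

```python
from math import comb

def F(f, tau, x, p, u, d):
    """F_tau(x, p) from (3.10): binomially weighted payoff after tau further steps."""
    return sum(
        comb(tau, j) * p**j * (1 - p)**(tau - j)
        * f(x * (1 + u)**j * (1 + d)**(tau - j))
        for j in range(tau + 1)
    )

def claim_price(f, s_t, tau, u, d, r):
    """Arbitrage price (3.11) of X = f(S_T), given S(t) = s_t and tau = T - t."""
    p_star = (r - d) / (u - d)
    return (1 + r)**(-tau) * F(f, tau, s_t, p_star, u, d)

K = 100.0
call = lambda x: max(x - K, 0.0)   # payoff (x - K)^+
put = lambda x: max(K - x, 0.0)    # payoff (K - x)^+

u, d, r = 0.10, -0.10, 0.05        # illustrative parameters with d < r < u
c1 = claim_price(call, 100.0, 1, u, d, r)   # one period: p* = 0.75, payoff 10 or 0
c3 = claim_price(call, 100.0, 3, u, d, r)
p3 = claim_price(put, 100.0, 3, u, d, r)
```

With these numbers the one-period call is worth $0.75 \cdot 10 / 1.05$, and the three-period call and put satisfy put-call parity, $c_3 - p_3 = S(0) - K(1+r)^{-3}$, as the remark on the put suggests.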
3.4.3 Hedging
Since the Cox-Ross-Rubinstein model is complete we can find unique hedging strategies for replicating contingent claims. Recall that this means we can find a self-financing portfolio $\varphi(t) = (\varphi_0(t), \varphi_1(t))$, $\varphi$ predictable, such that the value process $V_\varphi(t) = \varphi_0(t)B(t) + \varphi_1(t)S(t)$ satisfies
$$\Pi_X(t) = V_\varphi(t), \qquad \text{for all } t = 0, 1, \dots, T.$$
Using the bond as numéraire we get the discounted equation
$$\tilde{\Pi}_X(t) = \tilde{V}_\varphi(t) = \varphi_0(t) + \varphi_1(t)\tilde{S}(t), \qquad \text{for all } t = 0, 1, \dots, T.$$
By the pricing formula, Proposition 3.4.3, we know the arbitrage price process and using the restriction of predictability of $\varphi$, this leads to a unique replicating portfolio process $\varphi$. We can compute this portfolio process at any point in time as follows. The equation $\tilde{\Pi}_X(t) = \varphi_0(t) + \varphi_1(t)\tilde{S}(t)$ has to be true for each $\omega \in \Omega$ and each $t = 1, \dots, T$. Given such a $t$ we can only use information up to (and including) time $t - 1$ to ensure that $\varphi$ is predictable. Therefore we know $S(t-1)$, but we only know that $S(t) = (1 + Z(t))S(t-1)$. However, the fact that $Z(t) \in \{d, u\}$ leads to the following system of equations, which can be solved for $\varphi_0(t)$ and $\varphi_1(t)$ uniquely. Making the dependence of $\tilde{\Pi}_X$ on $\tilde{S}$ explicit, and writing $\tilde{S}^{u}_t = \tilde{S}_{t-1}(1+u)/(1+r)$ and $\tilde{S}^{d}_t = \tilde{S}_{t-1}(1+d)/(1+r)$ for the two possible values of the discounted stock price at time $t$, we have
$$\tilde{\Pi}_X(t, \tilde{S}^{u}_t) = \varphi_0(t) + \varphi_1(t)\tilde{S}^{u}_t,$$
$$\tilde{\Pi}_X(t, \tilde{S}^{d}_t) = \varphi_0(t) + \varphi_1(t)\tilde{S}^{d}_t.$$
The solution is given by
$$\varphi_0(t) = \frac{\tilde{S}^{u}_t\, \tilde{\Pi}_X(t, \tilde{S}^{d}_t) - \tilde{S}^{d}_t\, \tilde{\Pi}_X(t, \tilde{S}^{u}_t)}{\tilde{S}^{u}_t - \tilde{S}^{d}_t} = \frac{(1+u)\tilde{\Pi}_X(t, \tilde{S}^{d}_t) - (1+d)\tilde{\Pi}_X(t, \tilde{S}^{u}_t)}{u - d},$$
$$\varphi_1(t) = \frac{\tilde{\Pi}_X(t, \tilde{S}^{u}_t) - \tilde{\Pi}_X(t, \tilde{S}^{d}_t)}{\tilde{S}^{u}_t - \tilde{S}^{d}_t} = \frac{(1+r)\big(\tilde{\Pi}_X(t, \tilde{S}^{u}_t) - \tilde{\Pi}_X(t, \tilde{S}^{d}_t)\big)}{\tilde{S}_{t-1}(u - d)}.$$
Observe that we only need to have information up to time t − 1 to compute ϕ(t), hence ϕ is
predictable. We make this rather abstract construction more transparent by constructing the
hedge portfolio for the European contingent claims.
Proposition 3.4.4. The perfect hedging strategy $\varphi = (\varphi_0, \varphi_1)$ replicating the European contingent claim $f(S_T)$ with time of expiry $T$ is given by (again using $\tau = T - t$)
$$\varphi_1(t) = \frac{(1+r)^{-\tau}\big( F_\tau(S_{t-1}(1+u), p^{*}) - F_\tau(S_{t-1}(1+d), p^{*}) \big)}{S_{t-1}(u - d)},$$
$$\varphi_0(t) = \frac{(1+u)F_\tau(S_{t-1}(1+d), p^{*}) - (1+d)F_\tau(S_{t-1}(1+u), p^{*})}{(u - d)(1+r)^{T}}.$$
Proof. $(1+r)^{-\tau} F_\tau(S_t, p^{*})$ must be the value of the portfolio at time $t$ if the strategy $\varphi = (\varphi(t))$ replicates the claim:
$$\varphi_0(t)(1+r)^{t} + \varphi_1(t)S(t) = (1+r)^{-\tau} F_\tau(S_t, p^{*}).$$
Now $S(t) = S(t-1)(1 + Z(t)) = S(t-1)(1+u)$ or $S(t-1)(1+d)$, so:
$$\varphi_0(t)(1+r)^{t} + \varphi_1(t)S(t-1)(1+u) = (1+r)^{-\tau} F_\tau(S_{t-1}(1+u), p^{*}),$$
$$\varphi_0(t)(1+r)^{t} + \varphi_1(t)S(t-1)(1+d) = (1+r)^{-\tau} F_\tau(S_{t-1}(1+d), p^{*}).$$
Subtract:
$$\varphi_1(t)S(t-1)(u - d) = (1+r)^{-\tau}\big( F_\tau(S_{t-1}(1+u), p^{*}) - F_\tau(S_{t-1}(1+d), p^{*}) \big).$$
So $\varphi_1(t)$ in fact depends only on $S(t-1)$, thus yielding the predictability of $\varphi$, and
$$\varphi_1(t) = \frac{(1+r)^{-\tau}\big( F_\tau(S_{t-1}(1+u), p^{*}) - F_\tau(S_{t-1}(1+d), p^{*}) \big)}{S(t-1)(u - d)}.$$
Using any of the equations in the above system and solving for $\varphi_0(t)$ completes the proof.
To write the corresponding result for the European call, we use the following notation:
$$C(\tau, x) := \sum_{j=0}^{\tau} \binom{\tau}{j} (p^{*})^{j} (1-p^{*})^{\tau-j} \big( x(1+u)^{j}(1+d)^{\tau-j} - K \big)^{+}.$$
Then $(1+r)^{-\tau} C(\tau, x)$ is the value of the call at time $t$ (with time to expiry $\tau$) given that $S(t) = x$.
Corollary 3.4.5. The perfect hedging strategy $\varphi = (\varphi_0, \varphi_1)$ replicating the European call option with time of expiry $T$ and strike price $K$ is given by
$$\varphi_1(t) = \frac{(1+r)^{-\tau}\big( C(\tau, S_{t-1}(1+u)) - C(\tau, S_{t-1}(1+d)) \big)}{S_{t-1}(u - d)},$$
$$\varphi_0(t) = \frac{(1+u)C(\tau, S_{t-1}(1+d)) - (1+d)C(\tau, S_{t-1}(1+u))}{(u - d)(1+r)^{T}}.$$
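The hedge formulas for the call can be checked numerically. The sketch below is an illustration under assumed parameter values; the names `C` and `call_hedge` are chosen for this example. It computes $(\varphi_0(t), \varphi_1(t))$ and verifies that the portfolio reproduces the call value in both successor states and costs exactly the call value at time $t - 1$.

```python
from math import comb

def C(tau, x, K, u, d, r):
    """C(tau, x): undiscounted risk-neutral expected call payoff, tau steps from x."""
    p = (r - d) / (u - d)
    return sum(
        comb(tau, j) * p**j * (1 - p)**(tau - j)
        * max(x * (1 + u)**j * (1 + d)**(tau - j) - K, 0.0)
        for j in range(tau + 1)
    )

def call_hedge(t, s_prev, T, K, u, d, r):
    """(phi_0(t), phi_1(t)) held over (t-1, t], given S(t-1) = s_prev."""
    tau = T - t
    c_up = C(tau, s_prev * (1 + u), K, u, d, r)
    c_dn = C(tau, s_prev * (1 + d), K, u, d, r)
    phi1 = (1 + r)**(-tau) * (c_up - c_dn) / (s_prev * (u - d))
    phi0 = ((1 + u) * c_dn - (1 + d) * c_up) / ((u - d) * (1 + r)**T)
    return phi0, phi1

T, K, u, d, r = 3, 100.0, 0.10, -0.10, 0.05   # illustrative parameters
t, s_prev = 2, 110.0
phi0, phi1 = call_hedge(t, s_prev, T, K, u, d, r)
tau = T - t

# portfolio value at time t in the up and down state, and its cost at t - 1
value_up = phi0 * (1 + r)**t + phi1 * s_prev * (1 + u)
value_dn = phi0 * (1 + r)**t + phi1 * s_prev * (1 + d)
cost_prev = phi0 * (1 + r)**(t - 1) + phi1 * s_prev
```

Here `phi1` comes out non-negative, in line with the call payoff being increasing in the stock price.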
Notice that the numerator in the equation for $\varphi_1(t)$ is the difference of two values of $C(\tau, x)$, with the larger value of $x$ in the first term (recall $u > d$). When the payoff function $C(\tau, x)$ is an increasing function of $x$, as for the European call option considered here, this is non-negative. In this case, the Proposition gives $\varphi_1(t) \ge 0$: the replicating strategy does not involve short-selling. We record this as:
Corollary 3.4.6. When the payoff function is a non-decreasing function of the asset price $S(t)$, the perfect-hedging strategy replicating the claim does not involve short-selling of the risky asset.
If we do not use the pricing formula from Proposition 3.4.3 (i.e. the information on the price process), but only the final values of the option (or more generally of a contingent claim) we are still able to compute the arbitrage price and to construct the hedging portfolio by backward induction. In essence this is again only applying the one-period calculations for each time interval and each state of the world. We outline this procedure for the European call starting with the last period $[T-1, T]$. We have to choose a replicating portfolio $\varphi(T) = (\varphi_0(T), \varphi_1(T))$ based on the information available at time $T - 1$ (and so $\mathcal{F}_{T-1}$-measurable). So for each $\omega \in \Omega$ the following equation has to hold:
$$\pi_X(T, \omega) = \varphi_0(T, \omega)B(T, \omega) + \varphi_1(T, \omega)S(T, \omega).$$
Given the information $\mathcal{F}_{T-1}$ we know all but the last coordinate of $\omega$, and this gives rise to two equations (with the same notation as above):
$$\pi_X(T, S_{T-1}(1+u)) = \varphi_0(T)(1+r)^{T} + \varphi_1(T)S_{T-1}(1+u),$$
$$\pi_X(T, S_{T-1}(1+d)) = \varphi_0(T)(1+r)^{T} + \varphi_1(T)S_{T-1}(1+d).$$
Since we know the payoff structure of the contingent claim at time $T$, for example in the case of a European call $\pi_X(T, S_{T-1}(1+u)) = ((1+u)S_{T-1} - K)^{+}$ and $\pi_X(T, S_{T-1}(1+d)) = ((1+d)S_{T-1} - K)^{+}$, we can solve the above system and obtain
$$\varphi_0(T) = \frac{(1+u)\pi_X(T, S_{T-1}(1+d)) - (1+d)\pi_X(T, S_{T-1}(1+u))}{(u - d)(1+r)^{T}},$$
$$\varphi_1(T) = \frac{\pi_X(T, S_{T-1}(1+u)) - \pi_X(T, S_{T-1}(1+d))}{S_{T-1}(u - d)}.$$
Using this portfolio one can compute the arbitrage price of the contingent claim at time $T - 1$ given that the current asset price is $S_{T-1}$ as
$$\pi_X(T-1, S_{T-1}) = \varphi_0(T, S_{T-1})(1+r)^{T-1} + \varphi_1(T, S_{T-1})S(T-1).$$
Now the arbitrage prices at time $T - 1$ are known and one can repeat the procedure to successively compute the prices at $T-2, \dots, 1, 0$.
The advantage of our risk-neutral pricing procedure over this approach is that we have a single formula for the price of the contingent claim at all times $t$ at once, and do not have to resort to a backward induction merely to compute a price at one particular time $t$.
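The backward induction just outlined can be sketched in a few lines and compared with the closed-form price. This is an illustrative sketch only: the function names and the parameter values are assumptions, and the tree is stored by the number of up-moves, which relies on the recombining structure of the model.

```python
from math import comb

def price_by_backward_induction(f, s0, T, u, d, r):
    """Time-0 price via one-period replication at every node, working back from T."""
    # values[j] = claim value at the node with j up-moves; start at expiry
    values = [f(s0 * (1 + u)**j * (1 + d)**(T - j)) for j in range(T + 1)]
    for t in range(T, 0, -1):
        new_values = []
        for j in range(t):  # nodes at time t-1 have j = 0, ..., t-1 up-moves
            s_prev = s0 * (1 + u)**j * (1 + d)**(t - 1 - j)
            v_up, v_dn = values[j + 1], values[j]
            phi1 = (v_up - v_dn) / (s_prev * (u - d))
            phi0 = ((1 + u) * v_dn - (1 + d) * v_up) / ((u - d) * (1 + r)**t)
            # cost of the replicating portfolio at time t-1
            new_values.append(phi0 * (1 + r)**(t - 1) + phi1 * s_prev)
        values = new_values
    return values[0]

def price_closed_form(f, s0, T, u, d, r):
    """Risk-neutral price at time 0: discounted binomial expectation of f(S_T)."""
    p = (r - d) / (u - d)
    return (1 + r)**(-T) * sum(
        comb(T, j) * p**j * (1 - p)**(T - j)
        * f(s0 * (1 + u)**j * (1 + d)**(T - j))
        for j in range(T + 1)
    )

call = lambda x: max(x - 100.0, 0.0)
by_induction = price_by_backward_induction(call, 100.0, 5, 0.10, -0.10, 0.05)
by_formula = price_closed_form(call, 100.0, 5, 0.10, -0.10, 0.05)
```

Both routes give the same number up to rounding; the induction additionally produces the hedge at every node, whereas the formula gives the price at a single time directly.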
3.5 Binomial Approximations
Suppose we observe financial assets during a continuous time period $[0, T]$. To construct a stochastic model of the price processes of these assets (e.g. to value contingent claims) one basically has two choices: one could model the processes as continuous-time stochastic processes (for which the theory of stochastic calculus is needed) or one could construct a sequence of discrete-time models in which the continuous-time price processes are approximated by discrete-time stochastic processes in a suitable sense. We describe the second approach now by examining the asymptotic properties of a sequence of Cox-Ross-Rubinstein models.
3.5.1 Model Structure
We assume that all random variables subsequently introduced are defined on a suitable probability space $(\Omega, \mathcal{F}, \mathbb{P})$. We want to model two assets, a riskless bond $B$ and a risky stock $S$, which we now observe in a continuous-time interval $[0, T]$. To transfer the continuous-time framework into a binomial structure we make the following adjustments. Looking at the $n$th Cox-Ross-Rubinstein model in our sequence, there is a prespecified number $k_n$ of trading dates. We set $\Delta_n = T/k_n$ and divide $[0, T]$ into $k_n$ subintervals of length $\Delta_n$, namely $I_j = [j\Delta_n, (j+1)\Delta_n]$, $j = 0, \dots, k_n - 1$. We suppose that trading occurs only at the equidistant time points $t_{n,j} = j\Delta_n$, $j = 0, \dots, k_n - 1$. We fix $r_n$ as the riskless interest rate over each interval $I_j$, and hence the bond process (in the $n$th model) is given by
$$B(t_{n,j}) = (1 + r_n)^{j}, \qquad j = 0, \dots, k_n.$$
In the continuous-time model we compound continuously with spot rate $r \ge 0$ and hence the bond price process $B(t)$ is given by $B(t) = e^{rt}$. In order to approximate this process in the discrete-time framework, we choose $r_n$ such that
$$1 + r_n = e^{r\Delta_n}. \tag{3.13}$$
With this choice we have for any $j = 0, \dots, k_n$ that $(1 + r_n)^{j} = \exp(rj\Delta_n) = \exp(rt_{n,j})$. Thus we have approximated the bond process exactly at the time points of the discrete model.
Next we model the one-period returns $S(t_{n,j+1})/S(t_{n,j})$ of the stock by a family of random variables $Z_{n,i}$, $i = 1, \dots, k_n$, taking values $\{d_n, u_n\}$ with
$$\mathbb{P}(Z_{n,i} = u_n) = p_n = 1 - \mathbb{P}(Z_{n,i} = d_n)$$
for some $p_n \in (0, 1)$ (which we specify later). With these $Z_{n,i}$ we model the stock price process $S_n$ in the $n$th Cox-Ross-Rubinstein model as
$$S_n(t_{n,j}) = S_n(0) \prod_{i=1}^{j} (1 + Z_{n,i}), \qquad j = 0, 1, \dots, k_n.$$
With the specification of the one-period returns we get a complete description of the discrete dynamics of the stock price process in each Cox-Ross-Rubinstein model. We call such a finite sequence $Z_n = (Z_{n,i})_{i=1}^{k_n}$ a lattice or tree. The parameters $u_n, d_n, p_n, k_n$ differ from lattice to lattice, but remain constant throughout a specific lattice. In the triangular array $(Z_{n,i})$, $i = 1, \dots, k_n$; $n = 1, 2, \dots$ we assume that the random variables are row-wise independent (but we allow dependence between rows). The approximation of a continuous-time setting by a sequence of lattices is called the lattice approach.
It is important to stress that for each $n$ we get a different discrete stock price process $S_n(t)$ and that in general these processes do not coincide on common time points (and are also different from the price process $S(t)$).
Turning back to a specific Cox-Ross-Rubinstein model, we now have as in §3.4 a discrete-time bond and stock price process. We want arbitrage-free financial market models and therefore have to choose the parameters $u_n, d_n, p_n$ accordingly. An arbitrage-free financial market model is guaranteed by the existence of an equivalent martingale measure, and by Proposition 3.4.1 (i) the (necessary and) sufficient condition for that is
$$d_n < r_n < u_n.$$
The risk-neutrality approach implies that the expected (under an equivalent martingale measure) one-period return must equal the one-period return of the riskless bond and hence we get (see Proposition 3.4.1 (ii))
$$p^{*}_n = \frac{r_n - d_n}{u_n - d_n}. \tag{3.14}$$
So the only parameters to choose freely in the model are $u_n$ and $d_n$. In the next sections we consider some special choices.
3.5.2 The Black-Scholes Option Pricing Formula
We now choose the parameters in the above lattice approach in a special way. Assuming the risk-free rate of interest $r$ as given, we have by (3.13) $1 + r_n = e^{r\Delta_n}$, and the remaining degrees of freedom are resolved by choosing $u_n$ and $d_n$. We use the following choice:
$$1 + u_n = e^{\sigma\sqrt{\Delta_n}}, \qquad 1 + d_n = (1 + u_n)^{-1} = e^{-\sigma\sqrt{\Delta_n}}.$$
By condition (3.14) the risk-neutral probabilities for the corresponding single-period models are given by
$$p^{*}_n = \frac{r_n - d_n}{u_n - d_n} = \frac{e^{r\Delta_n} - e^{-\sigma\sqrt{\Delta_n}}}{e^{\sigma\sqrt{\Delta_n}} - e^{-\sigma\sqrt{\Delta_n}}}.$$
We can now price contingent claims in each Cox-Ross-Rubinstein model using the expectation operator with respect to the (unique) equivalent martingale measure characterised by the probabilities $p^{*}_n$ (compare §3.4.2). In particular we can compute the price $\Pi_C(t)$ at time $t$ of a European call on the stock $S$ with strike $K$ and expiry $T$ by formula (3.12) of Corollary 3.4.4. Let us reformulate this formula slightly. We define
$$a_n = \min\big\{ j \in \mathbb{N}_0 \mid S(0)(1 + u_n)^{j}(1 + d_n)^{k_n - j} > K \big\}. \tag{3.15}$$
Then we can rewrite the pricing formula (3.12) for $t = 0$ in the setting of the $n$th Cox-Ross-Rubinstein model as
$$\begin{aligned}
\Pi_C(0) &= (1 + r_n)^{-k_n} \sum_{j=a_n}^{k_n} \binom{k_n}{j} (p^{*}_n)^{j} (1 - p^{*}_n)^{k_n - j} \big( S(0)(1 + u_n)^{j}(1 + d_n)^{k_n - j} - K \big) \\
&= S(0) \Bigg[ \sum_{j=a_n}^{k_n} \binom{k_n}{j} \Big( \frac{p^{*}_n(1 + u_n)}{1 + r_n} \Big)^{j} \Big( \frac{(1 - p^{*}_n)(1 + d_n)}{1 + r_n} \Big)^{k_n - j} \Bigg] \\
&\quad - (1 + r_n)^{-k_n} K \Bigg[ \sum_{j=a_n}^{k_n} \binom{k_n}{j} (p^{*}_n)^{j} (1 - p^{*}_n)^{k_n - j} \Bigg].
\end{aligned}$$
Denoting the binomial cumulative distribution function with parameters $(n, p)$ as $B_{n,p}(\cdot)$ we see that the second bracketed expression is just the upper tail (the sum runs over $j \ge a_n$)
$$\bar{B}_{k_n, p^{*}_n}(a_n) = 1 - B_{k_n, p^{*}_n}(a_n - 1).$$
Also the first bracketed expression is $\bar{B}_{k_n, \hat{p}_n}(a_n)$ with
$$\hat{p}_n = \frac{p^{*}_n(1 + u_n)}{1 + r_n}.$$
That $\hat{p}_n$ is indeed a probability can be shown straightforwardly: by (3.14), $p^{*}_n(1 + u_n) + (1 - p^{*}_n)(1 + d_n) = 1 + r_n$, so $\hat{p}_n \in (0, 1)$ and $1 - \hat{p}_n = (1 - p^{*}_n)(1 + d_n)/(1 + r_n)$. Using this notation we have in the $n$th Cox-Ross-Rubinstein model for the price of a European call at time $t = 0$ the following formula:
$$\Pi^{(n)}_C(0) = S_n(0)\, \bar{B}_{k_n, \hat{p}_n}(a_n) - K(1 + r_n)^{-k_n}\, \bar{B}_{k_n, p^{*}_n}(a_n). \tag{3.16}$$
(We stress again that the underlying is $S_n(t)$, dependent on $n$, but $S_n(0) = S(0)$ for all $n$.) We now look at the limit of this expression.
Proposition 3.5.1. We have the following limit relation:
$$\lim_{n \to \infty} \Pi^{(n)}_C(0) = \Pi^{BS}_C(0)$$
with $\Pi^{BS}_C(0)$ given by the Black-Scholes formula (we use $S = S(0)$ to ease the notation)
$$\Pi^{BS}_C(0) = S\,N(d_1(S, T)) - K e^{-rT} N(d_2(S, T)). \tag{3.17}$$
The functions $d_1(s, t)$ and $d_2(s, t)$ are given by
$$d_1(s, t) = \frac{\log(s/K) + (r + \frac{\sigma^2}{2})t}{\sigma\sqrt{t}},$$
$$d_2(s, t) = d_1(s, t) - \sigma\sqrt{t} = \frac{\log(s/K) + (r - \frac{\sigma^2}{2})t}{\sigma\sqrt{t}},$$
and $N(\cdot)$ is the standard normal cumulative distribution function.
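The limit relation can be observed numerically. The sketch below evaluates the binomial-tail form (3.16) with the parameter choice of this section and compares it with the Black-Scholes value (3.17). The function names, the log-space summation of the binomial tail, and the parameter values ($S = K = 100$, $r = 0.05$, $\sigma = 0.2$, $T = 1$, $k_n = 500$) are assumptions made for this illustration.

```python
from math import erf, exp, lgamma, log, sqrt

def binom_tail(k, p, a):
    """P(Y >= a) for Y ~ Binomial(k, p), summed in log-space for stability."""
    total = 0.0
    for j in range(max(a, 0), k + 1):
        log_term = (lgamma(k + 1) - lgamma(j + 1) - lgamma(k - j + 1)
                    + j * log(p) + (k - j) * log(1 - p))
        total += exp(log_term)
    return total

def crr_call(S, K, r, sigma, T, k):
    """Formula (3.16) in the k-step model with 1 + u = exp(sigma * sqrt(dt))."""
    dt = T / k
    up = exp(sigma * sqrt(dt))      # 1 + u_n
    down = 1.0 / up                 # 1 + d_n
    growth = exp(r * dt)            # 1 + r_n, see (3.13)
    p_star = (growth - down) / (up - down)
    p_hat = p_star * up / growth
    # a_n from (3.15); k + 1 if the call can never end in the money
    a = next((j for j in range(k + 1) if S * up**j * down**(k - j) > K), k + 1)
    return S * binom_tail(k, p_hat, a) - K * growth**(-k) * binom_tail(k, p_star, a)

def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(S, K, r, sigma, T):
    """Black-Scholes call price (3.17)."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

bs = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
crr = crr_call(100.0, 100.0, 0.05, 0.2, 1.0, 500)
```

For these inputs the two prices agree to within a few cents; increasing the number of steps narrows the gap further, although not necessarily monotonically.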
Proof. Since $S_n(0) = S$ (say) all we have to do to prove the proposition is to show
(i) $\lim_{n \to \infty} \bar{B}_{k_n, \hat{p}_n}(a_n) = N(d_1(S, T))$,
(ii) $\lim_{n \to \infty} \bar{B}_{k_n, p^{*}_n}(a_n) = N(d_2(S, T))$.
These statements involve the convergence of distribution functions.
To show (i) we interpret
$$\bar{B}_{k_n, \hat{p}_n}(a_n) = \mathbb{P}(a_n \le Y_n \le k_n)$$
with $(Y_n)$ a sequence of random variables distributed according to the binomial law with parameters $(k_n, \hat{p}_n)$. We normalise $Y_n$ to
$$\tilde{Y}_n = \frac{Y_n - \mathbb{E}(Y_n)}{\sqrt{\mathrm{Var}(Y_n)}} = \frac{Y_n - k_n\hat{p}_n}{\sqrt{k_n\hat{p}_n(1 - \hat{p}_n)}} = \frac{\sum_{j=1}^{k_n} (B_{j,n} - \hat{p}_n)}{\sqrt{k_n\hat{p}_n(1 - \hat{p}_n)}},$$
where $B_{j,n}$, $j = 1, \dots, k_n$; $n = 1, 2, \dots$ are row-wise independent Bernoulli random variables with parameter $\hat{p}_n$. Now using the central limit theorem we know that for $\alpha_n \to \alpha$, $\beta_n \to \beta$ we have
$$\lim_{n \to \infty} \mathbb{P}(\alpha_n \le \tilde{Y}_n \le \beta_n) = N(\beta) - N(\alpha).$$
IP (a
n
≤ Y
n
≤ k
n
) = IP
_
α
n
≤
˜
Y
n
≤ β
n
_
with
α
n
=
a
n
−k
n
ˆ p
n
_
k
n
ˆ p
n
(1 − ˆ p
n
)
and β
n
=
k
n
(1 − ˆ p
n
)
_
k
n
ˆ p
n
(1 − ˆ p
n
)
.
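Using the following limiting relations:
$$\lim_{n \to \infty} \hat{p}_n = \frac{1}{2}, \qquad \lim_{n \to \infty} k_n(1 - 2\hat{p}_n)\sqrt{\Delta_n} = -T\Big( \frac{r}{\sigma} + \frac{\sigma}{2} \Big),$$
and the defining relation for $a_n$, formula (3.15), we get
$$\begin{aligned}
\lim_{n \to \infty} \alpha_n &= \lim_{n \to \infty} \frac{\dfrac{\log(K/S) + k_n\sigma\sqrt{\Delta_n}}{2\sigma\sqrt{\Delta_n}} - k_n\hat{p}_n}{\sqrt{k_n\hat{p}_n(1 - \hat{p}_n)}} \\
&= \lim_{n \to \infty} \frac{\log(K/S) + \sigma k_n\sqrt{\Delta_n}(1 - 2\hat{p}_n)}{2\sigma\sqrt{k_n\Delta_n\hat{p}_n(1 - \hat{p}_n)}} \\
&= \frac{\log(K/S) - (r + \frac{\sigma^2}{2})T}{\sigma\sqrt{T}} = -d_1(S, T).
\end{aligned}$$
Furthermore we have
$$\lim_{n \to \infty} \beta_n = \lim_{n \to \infty} \sqrt{k_n\hat{p}_n^{-1}(1 - \hat{p}_n)} = +\infty.$$
So $N(\beta_n) \to 1$, $N(\alpha_n) \to N(-d_1) = 1 - N(d_1)$, completing the proof of (i).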
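To prove (ii) we can argue in very much the same way and arrive at parameters $\alpha^{*}_n$ and $\beta^{*}_n$ with $\hat{p}_n$ replaced by $p^{*}_n$. Using the following limiting relations:
$$\lim_{n \to \infty} p^{*}_n = \frac{1}{2}, \qquad \lim_{n \to \infty} k_n(1 - 2p^{*}_n)\sqrt{\Delta_n} = T\Big( \frac{\sigma}{2} - \frac{r}{\sigma} \Big),$$
we get
$$\lim_{n \to \infty} \alpha^{*}_n = \lim_{n \to \infty} \frac{\log(K/S) + \sigma k_n\sqrt{\Delta_n}(1 - 2p^{*}_n)}{2\sigma\sqrt{k_n\Delta_n p^{*}_n(1 - p^{*}_n)}} = \frac{\log(K/S) - (r - \frac{\sigma^2}{2})T}{\sigma\sqrt{T}} = -d_2(S, T).$$
For the upper limit we get
$$\lim_{n \to \infty} \beta^{*}_n = \lim_{n \to \infty} \sqrt{k_n(p^{*}_n)^{-1}(1 - p^{*}_n)} = +\infty,$$
whence (ii) follows similarly.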
By the above proposition we have derived the classical Black-Scholes European call option valuation formula as an asymptotic limit of option prices in a sequence of Cox-Ross-Rubinstein type models with a special choice of parameters. We will therefore call these models discrete Black-Scholes models. Let us mention here that in the continuous-time Black-Scholes model the dynamics of the (stochastic) stock price process $S(t)$ are modelled by a geometric Brownian motion (or exponential Wiener process). The sample paths of this stochastic price process are almost all continuous and the probability law of $S(t)$ at any time $t$ is lognormal. In particular the time $T$ distribution of $\log\{S(T)/S(0)\}$ is $N(T\mu, T\sigma^2)$. Looking back at the construction of our sequence of Cox-Ross-Rubinstein models we see that
$$\log\frac{S_n(T)}{S(0)} = \sum_{i=1}^{k_n} \log(1 + Z_{n,i}),$$
with $\log(1 + Z_{n,i})$ Bernoulli random variables with
$$\mathbb{P}\big(\log(1 + Z_{n,i}) = \sigma\sqrt{\Delta_n}\big) = p_n = 1 - \mathbb{P}\big(\log(1 + Z_{n,i}) = -\sigma\sqrt{\Delta_n}\big).$$
By the (triangular array version of the) central limit theorem we know that $\log\frac{S_n(T)}{S(0)}$ properly normalised converges in distribution to a random variable with standard normal distribution. Doing similar calculations as in the above proposition we can compute the normalising constants and get
$$\lim_{n \to \infty} \log\frac{S_n(T)}{S(0)} \sim N\big(T(r - \sigma^2/2),\, T\sigma^2\big),$$
i.e. $\frac{S_n(T)}{S(0)}$ is in the limit lognormally distributed.
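The normalising constants can be checked directly, because the mean and variance of a sum of independent two-valued variables are available in closed form. The sketch below assumes that the up-probability is the risk-neutral one, $p_n = p^{*}_n$ from (3.14); the function name and the parameter values are illustrative.

```python
from math import exp, sqrt

def log_return_moments(r, sigma, T, k):
    """Exact mean and variance of log(S_n(T)/S(0)) in the k-step model under p*_n."""
    dt = T / k
    up, down = exp(sigma * sqrt(dt)), exp(-sigma * sqrt(dt))
    p = (exp(r * dt) - down) / (up - down)   # p*_n from (3.14)
    step = sigma * sqrt(dt)                  # log(1 + Z) takes the values +/- step
    mean = k * step * (2 * p - 1)            # k times the one-step mean
    var = k * step**2 * 4 * p * (1 - p)      # k times the one-step variance
    return mean, var

mean, var = log_return_moments(r=0.05, sigma=0.2, T=1.0, k=10_000)
```

For these values the mean is close to $T(r - \sigma^2/2) = 0.03$ and the variance close to $T\sigma^2 = 0.04$, matching the limiting normal law stated above.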
3.6 American Options
Consider a general multi-period framework. The holder of an American derivative security can 'exercise' in any period $t$ and receive payment $f(S_t)$ (or, more generally, a non-negative payment $f_t$). In order to hedge such an option, we want to construct a self-financing trading strategy $\varphi_t$ such that for the corresponding value process $V_\varphi(t)$
$$V_\varphi(0) = x \quad \text{(initial capital)}, \qquad V_\varphi(t) \ge f(S_t) \quad \forall t. \tag{3.18}$$
Such a hedging portfolio is minimal if, for a stopping time $\tau$,
$$V_\varphi(\tau) = f(S_\tau).$$
Our aim in the following will be to discuss existence and construction of such a stopping time.
3.6.1 Stopping Times, Optional Stopping and Snell Envelopes
A random variable $\tau$ taking values in $\{0, 1, 2, \dots; +\infty\}$ is called a stopping time (or optional time) if
$$\{\tau \le n\} = \{\omega : \tau(\omega) \le n\} \in \mathcal{F}_n \qquad \forall\, n \le \infty.$$
From $\{\tau = n\} = \{\tau \le n\} \setminus \{\tau \le n-1\}$ and $\{\tau \le n\} = \bigcup_{k \le n} \{\tau = k\}$, we see the equivalent characterisation
$$\{\tau = n\} \in \mathcal{F}_n \qquad \forall\, n \le \infty.$$
Call a stopping time $\tau$ bounded if there is a constant $K$ such that $\mathbb{P}(\tau \le K) = 1$. (Since $\tau(\omega) \le K$ for some constant $K$ and all $\omega \in \Omega \setminus N$ with $\mathbb{P}(N) = 0$, all identities hold true except on a null set, i.e. almost surely.)
Example. Suppose $(X_n)$ is an adapted process and we are interested in the time of first entry of $X$ into a Borel set $B$ (typically one might have $B = [c, \infty)$):
$$\tau = \inf\{n \ge 0 : X_n \in B\}.$$
Now $\{\tau \le n\} = \bigcup_{k \le n} \{X_k \in B\} \in \mathcal{F}_n$ and $\tau = \infty$ if $X$ never enters $B$. Thus $\tau$ is a stopping time.
Intuitively, think of $\tau$ as a time at which you decide to quit a gambling game: whether or not you quit at time $n$ depends only on the history up to and including time $n$, NOT the future. Thus stopping times model gambling and other situations where there is no foreknowledge, or prescience of the future; in particular, in the financial context, where there is no insider trading. Furthermore since a gambler cannot cheat the system the expectation of his hypothetical fortune (playing with unit stake) should equal his initial fortune.
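The defining property of a stopping time, that $\{\tau \le n\}$ is decided by the history up to time $n$, can be illustrated for the first-entry time of the example. In the sketch below the function names, the sample path and the set $B = [3, \infty)$ are assumptions chosen for illustration; `None` stands in for $\tau = +\infty$.

```python
def first_entry_time(path, in_B):
    """tau = inf{n >= 0 : X_n in B}; returns None (for +infinity) if B is never entered."""
    for n, x in enumerate(path):
        if in_B(x):
            return n
    return None

def stopped_by(prefix, in_B):
    """Decide the event {tau <= n} from X_0, ..., X_n only (n = len(prefix) - 1)."""
    return any(in_B(x) for x in prefix)

path = [0, 1, 2, 1, 3, 4]        # one observed path X_0, ..., X_5
in_B = lambda x: x >= 3          # B = [3, infinity)
tau = first_entry_time(path, in_B)
# {tau <= n} evaluated from the first n + 1 observations alone
decisions = [stopped_by(path[: n + 1], in_B) for n in range(len(path))]
```

The list `decisions` agrees with the comparison `tau <= n` at every $n$, even though each entry was computed without looking at later values of the path.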
Theorem 3.6.1 (Doob's Stopping-Time Principle (STP)). Let $\tau$ be a bounded stopping time and $X = (X_n)$ a martingale. Then $X_\tau$ is integrable, and
$$\mathbb{E}(X_\tau) = \mathbb{E}(X_0).$$
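Proof. Assume $\tau(\omega) \le K$ for all $\omega$, where we can take $K$ to be an integer, and write
$$X_{\tau(\omega)}(\omega) = \sum_{k=0}^{\infty} X_k(\omega) 1_{\{\tau(\omega) = k\}} = \sum_{k=0}^{K} X_k(\omega) 1_{\{\tau(\omega) = k\}}.$$
Thus using successively the linearity of the expectation operator, the martingale property of $X$, the $\mathcal{F}_k$-measurability of $\{\tau = k\}$ and finally the definition of conditional expectation, we get
$$\begin{aligned}
\mathbb{E}(X_\tau) &= \mathbb{E}\Big[ \sum_{k=0}^{K} X_k 1_{\{\tau = k\}} \Big] = \sum_{k=0}^{K} \mathbb{E}\big[ X_k 1_{\{\tau = k\}} \big] \\
&= \sum_{k=0}^{K} \mathbb{E}\big[ \mathbb{E}(X_K \mid \mathcal{F}_k) 1_{\{\tau = k\}} \big] = \sum_{k=0}^{K} \mathbb{E}\big[ X_K 1_{\{\tau = k\}} \big] \\
&= \mathbb{E}\Big[ X_K \sum_{k=0}^{K} 1_{\{\tau = k\}} \Big] = \mathbb{E}(X_K) = \mathbb{E}(X_0).
\end{aligned}$$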
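The stopping-time principle can be verified exactly on a small example by enumerating all paths. The sketch below uses a symmetric $\pm 1$ random walk started at $0$ (a martingale) and stops it on first reaching a lower or upper level, or at time $N$ at the latest; the function names and the levels $-3$ and $1$ are assumptions for this illustration. For contrast it also computes the expected running maximum, which corresponds to "stopping" at a time that needs knowledge of the future.

```python
from fractions import Fraction
from itertools import product

def expected_stopped_value(N, lower, upper):
    """E[X_{tau ^ N}] for the symmetric walk, tau = first exit from (lower, upper)."""
    total = Fraction(0)
    for steps in product((-1, 1), repeat=N):
        x = 0
        for step in steps:
            if x <= lower or x >= upper:   # tau already reached: stop moving
                break
            x += step
        total += Fraction(x, 2**N)         # each path has probability 2^(-N)
    return total

def expected_running_max(N):
    """E[max_{n <= N} X_n]: the value at a time chosen with hindsight."""
    total = Fraction(0)
    for steps in product((-1, 1), repeat=N):
        x, best = 0, 0
        for step in steps:
            x += step
            best = max(best, x)
        total += Fraction(best, 2**N)
    return total

stopped = expected_stopped_value(N=8, lower=-3, upper=1)
hindsight = expected_running_max(N=8)
```

The stopped expectation is exactly $0 = \mathbb{E}(X_0)$ despite the asymmetric exit levels, whereas the hindsight rule yields a strictly positive expectation, which is why it cannot be a stopping time.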
The stopping time principle also holds true if $X = (X_n)$ is a supermartingale; then the conclusion is
$$\mathbb{E}X_\tau \le \mathbb{E}X_0.$$
Also, alternative conditions such as
(i) $X = (X_n)$ is bounded ($|X_n(\omega)| \le L$ for some $L$ and all $n, \omega$);
(ii) $\mathbb{E}\tau < \infty$ and $(X_n - X_{n-1})$ is bounded;
suffice for the proof of the stopping time principle.
The stopping time principle is important in many areas, such as sequential analysis in statistics. We turn in the next section to related ideas specific to the gambling/financial context.
We now wish to create the concept of the $\sigma$-algebra of events observable up to a stopping time $\tau$, in analogy to the $\sigma$-algebra $\mathcal{F}_n$ which represents the events observable up to time $n$.
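Definition 3.6.1. Let $\tau$ be a stopping time. The stopping time $\sigma$-algebra $\mathcal{F}_\tau$ is defined to be
$$\mathcal{F}_\tau = \{A \in \mathcal{F} : A \cap \{\tau \le n\} \in \mathcal{F}_n \text{ for all } n\}.$$
Proposition 3.6.1. For $\tau$ a stopping time, $\mathcal{F}_\tau$ is a $\sigma$-algebra.
Proof. We simply have to check the defining properties. Clearly $\Omega, \emptyset$ are in $\mathcal{F}_\tau$. Also for $A \in \mathcal{F}_\tau$ we find
$$A^{c} \cap \{\tau \le n\} = \{\tau \le n\} \setminus (A \cap \{\tau \le n\}) \in \mathcal{F}_n,$$
thus $A^{c} \in \mathcal{F}_\tau$. Finally, for a family $A_i \in \mathcal{F}_\tau$, $i = 1, 2, \dots$ we have
$$\Big( \bigcup_{i=1}^{\infty} A_i \Big) \cap \{\tau \le n\} = \bigcup_{i=1}^{\infty} (A_i \cap \{\tau \le n\}) \in \mathcal{F}_n,$$
showing $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}_\tau$.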
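Proposition 3.6.2. Let $\sigma, \tau$ be stopping times with $\sigma \le \tau$. Then $\mathcal{F}_\sigma \subseteq \mathcal{F}_\tau$.
Proof. Since $\sigma \le \tau$ we have $\{\tau \le n\} \subseteq \{\sigma \le n\}$. So for $A \in \mathcal{F}_\sigma$ we get
$$A \cap \{\tau \le n\} = (A \cap \{\sigma \le n\}) \cap \{\tau \le n\} \in \mathcal{F}_n,$$
since $A \cap \{\sigma \le n\} \in \mathcal{F}_n$ as $A \in \mathcal{F}_\sigma$. So $A \in \mathcal{F}_\tau$.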
Proposition 3.6.3. For any adapted sequence of random variables $X = (X_n)$ and a.s. finite stopping time $\tau$, define
$$X_\tau = \sum_{n=0}^{\infty} X_n 1_{\{\tau = n\}}.$$
Then $X_\tau$ is $\mathcal{F}_\tau$-measurable.
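Proof. Let $B$ be a Borel set. We need to show $\{X_\tau \in B\} \in \mathcal{F}_\tau$. Now using the fact that on the set $\{\tau = k\}$ we have $X_\tau = X_k$, we find
$$\{X_\tau \in B\} \cap \{\tau \le n\} = \bigcup_{k=0}^{n} \{X_\tau \in B\} \cap \{\tau = k\} = \bigcup_{k=0}^{n} \{X_k \in B\} \cap \{\tau = k\}.$$
Now the sets $\{X_k \in B\} \cap \{\tau = k\} \in \mathcal{F}_k \subseteq \mathcal{F}_n$, and the result follows.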
We are now in a position to obtain an important extension of the Stopping-Time Principle, Theorem 3.6.1.
Theorem 3.6.2 (Doob's Optional-Sampling Theorem, OST). Let $X = (X_n)$ be a martingale and let $\sigma, \tau$ be bounded stopping times with $\sigma \le \tau$. Then
$$\mathbb{E}[X_\tau \mid \mathcal{F}_\sigma] = X_\sigma$$
and thus $\mathbb{E}(X_\tau) = \mathbb{E}(X_\sigma)$.
Proof. First observe that $X_\tau$ and $X_\sigma$ are integrable (use the sum representation and the fact that $\tau$ is bounded by an integer $K$) and $X_\sigma$ is $\mathcal{F}_\sigma$-measurable by Proposition 3.6.3. So it only remains to prove that
$$\mathbb{E}(1_A X_\tau) = \mathbb{E}(1_A X_\sigma) \qquad \forall A \in \mathcal{F}_\sigma. \tag{3.19}$$
For any such fixed $A \in \mathcal{F}_\sigma$, define $\rho$ by
$$\rho(\omega) = \sigma(\omega)1_A(\omega) + \tau(\omega)1_{A^{c}}(\omega).$$
Since
$$\{\rho \le n\} = (A \cap \{\sigma \le n\}) \cup (A^{c} \cap \{\tau \le n\}) \in \mathcal{F}_n,$$
$\rho$ is a stopping time, and from $\rho \le \tau$ we see that $\rho$ is bounded. So the STP (Theorem 3.6.1) implies $\mathbb{E}(X_\rho) = \mathbb{E}(X_0) = \mathbb{E}(X_\tau)$. But
$$\mathbb{E}(X_\rho) = \mathbb{E}(X_\sigma 1_A + X_\tau 1_{A^{c}}),$$
$$\mathbb{E}(X_\tau) = \mathbb{E}(X_\tau 1_A + X_\tau 1_{A^{c}}).$$
So subtracting yields (3.19).
We can establish a further characterisation of the martingale property.
Proposition 3.6.4. Let $X = (X_n)$ be an adapted sequence of random variables with $\mathbb{E}(|X_n|) < \infty$ for all $n$ and $\mathbb{E}(X_\tau) = 0$ for all bounded stopping times $\tau$. Then $X$ is a martingale.
Proof. Let $0 \le m < n < \infty$ and $A \in \mathcal{F}_m$. Define a stopping time $\tau$ by $\tau = n1_A + m1_{A^{c}}$. Then
$$0 = \mathbb{E}(X_\tau) = \mathbb{E}(X_n 1_A + X_m 1_{A^{c}}),$$
$$0 = \mathbb{E}(X_m) = \mathbb{E}(X_m 1_A + X_m 1_{A^{c}}),$$
and by subtraction we obtain $\mathbb{E}(X_m 1_A) = \mathbb{E}(X_n 1_A)$. Since this holds for all $A \in \mathcal{F}_m$, $\mathbb{E}(X_n \mid \mathcal{F}_m) = X_m$ by definition of conditional expectation. This says that $(X_n)$ is a martingale, as required.
Write $X^{\tau} = (X^{\tau}_n)$ for the sequence $X = (X_n)$ stopped at time $\tau$, where we define $X^{\tau}_n(\omega) := X_{\tau(\omega) \wedge n}(\omega)$.
Proposition 3.6.5. (i) If $X$ is adapted and $\tau$ is a stopping time, then the stopped sequence $X^{\tau}$ is adapted.
(ii) If $X$ is a martingale (super-, submartingale) and $\tau$ is a stopping time, $X^{\tau}$ is a martingale (super-, submartingale).
Proof. Let $C_j := 1_{\{j \le \tau\}}$; then
$$X_{\tau \wedge n} = X_0 + \sum_{j=1}^{n} C_j (X_j - X_{j-1})$$
(as the right is $X_0 + \sum_{j=1}^{\tau \wedge n} (X_j - X_{j-1})$, which telescopes to $X_{\tau \wedge n}$). Since $\{j \le \tau\}$ is the complement of $\{\tau < j\} = \{\tau \le j-1\} \in \mathcal{F}_{j-1}$, $(C_n)$ is predictable. So $(X^{\tau}_n)$ is adapted.
If $X$ is a martingale, so is $X^{\tau}$ as it is the martingale transform of $(X_n)$ by $(C_n)$ (use Theorem C.4.1). The right-hand side above is $X_{\tau \wedge (n-1)} + C_n(X_n - X_{n-1})$. So taking conditional expectation given $\mathcal{F}_{n-1}$ and using predictability of $(C_n)$,
$$\mathbb{E}(X_{\tau \wedge n} \mid \mathcal{F}_{n-1}) = X_{\tau \wedge (n-1)} + C_n\big( \mathbb{E}[X_n \mid \mathcal{F}_{n-1}] - X_{n-1} \big).$$
Then $C_n \ge 0$ shows that if $X$ is a supermartingale (submartingale), so is $X^{\tau}$.
We now discuss the Snell envelope, which will be an important tool for the valuation of American options. The idea is due to Snell (1952); for a textbook account, see e.g. Neveu (1975), VI.
Definition 3.6.2. If $X = (X_n)_{n=0}^{N}$ is a sequence adapted to a filtration $(\mathcal{F}_n)$ with $\mathbb{E}(|X_n|) < \infty$, the sequence $Z = (Z_n)_{n=0}^{N}$ defined by
$$\begin{cases} Z_N := X_N, \\ Z_n := \max\{X_n, \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)\} & (n \le N-1) \end{cases}$$
is called the Snell envelope of $X$.
Theorem 3.6.3. The Snell envelope $Z$ of $X$ is a supermartingale, and is the smallest supermartingale dominating $X$ (that is, with $Z_n \ge X_n$ for all $n$).
Proof. First, $Z_n \ge \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)$, so $Z$ is a supermartingale, and $Z_n \ge X_n$, so $Z$ dominates $X$.
Next, let $Y = (Y_n)$ be any other supermartingale dominating $X$; we must show $Y$ dominates $Z$ also. First, since $Z_N = X_N$ and $Y$ dominates $X$, we must have $Y_N \ge Z_N$. Assume inductively that $Y_n \ge Z_n$. Then as $Y$ is a supermartingale,
$$Y_{n-1} \ge \mathbb{E}(Y_n \mid \mathcal{F}_{n-1}) \ge \mathbb{E}(Z_n \mid \mathcal{F}_{n-1}),$$
and as $Y$ dominates $X$,
$$Y_{n-1} \ge X_{n-1}.$$
Combining,
$$Y_{n-1} \ge \max\{X_{n-1}, \mathbb{E}(Z_n \mid \mathcal{F}_{n-1})\} = Z_{n-1}.$$
By repeating this argument (or more formally, by backward induction), $Y_n \ge Z_n$ for all $n$, as required.
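The recursion in Definition 3.6.2 is easy to carry out numerically. The following is a minimal sketch under assumptions that go beyond the notes: the filtration is generated by a recombining binomial tree with up-probability `p_up`, and the reward $X_n$ depends only on the node, so that $\mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)$ reduces to a two-point average. The data layout `reward[n][j]` and the function names are hypothetical choices for this illustration.

```python
def snell_envelope(reward, p_up=0.5):
    """Backward induction Z_N = X_N, Z_n = max(X_n, E[Z_{n+1} | F_n]).

    reward[n][j] is X_n at the node reached after j up-moves in n steps
    (hypothetical layout chosen for this sketch).
    """
    N = len(reward) - 1
    Z = [list(row) for row in reward]  # copy; the last row already equals X_N
    for n in range(N - 1, -1, -1):
        for j in range(n + 1):
            # Conditional expectation of Z_{n+1} given the current node.
            continuation = p_up * Z[n + 1][j + 1] + (1 - p_up) * Z[n + 1][j]
            Z[n][j] = max(reward[n][j], continuation)
    return Z


def stopping_region(reward, Z):
    """Nodes (n, j) at which Z_n = X_n, i.e. where stopping is allowed to occur."""
    return [(n, j) for n, row in enumerate(Z)
            for j in range(len(row)) if row[j] == reward[n][j]]


# Toy reward on a two-period tree (made-up numbers).
X = [[1.0], [3.0, 0.0], [4.0, 1.0, 0.0]]
Z = snell_envelope(X)
print(Z, stopping_region(X, Z))
```

By construction every entry of `Z` dominates the corresponding entry of `X` and is at least the average of its two successors, which is the supermartingale property of Theorem 3.6.3 in this toy setting.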
Proposition 3.6.6. $\tau^* := \inf\{n \ge 0 : Z_n = X_n\}$ is a stopping time, and the stopped process $Z^{\tau^*}$ is a martingale.
Proof. Since $Z_N = X_N$, $\tau^* \in \{0, 1, \ldots, N\}$ is well-defined and clearly bounded. For $k = 0$, $\{\tau^* = 0\} = \{Z_0 = X_0\} \in \mathcal{F}_0$; for $k \ge 1$,
$$\{\tau^* = k\} = \{Z_0 > X_0\} \cap \cdots \cap \{Z_{k-1} > X_{k-1}\} \cap \{Z_k = X_k\} \in \mathcal{F}_k.$$
So $\tau^*$ is a stopping time.
As in the proof of Proposition 3.6.5,
$$Z^{\tau^*}_n = Z_{n \wedge \tau^*} = Z_0 + \sum_{j=1}^{n} C_j \Delta Z_j,$$
where $C_j = 1_{\{j \le \tau^*\}}$ is predictable. For $n \le N-1$,
$$Z^{\tau^*}_{n+1} - Z^{\tau^*}_n = C_{n+1} (Z_{n+1} - Z_n) = 1_{\{n+1 \le \tau^*\}} (Z_{n+1} - Z_n).$$
Now $Z_n := \max\{X_n, \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)\}$, and by definition of $\tau^*$,
$$Z_n > X_n \quad \text{on } \{n+1 \le \tau^*\}.$$
So from the definition of $Z_n$,
$$Z_n = \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n) \quad \text{on } \{n+1 \le \tau^*\}.$$
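We next prove
$$Z^{\tau^*}_{n+1} - Z^{\tau^*}_n = 1_{\{n+1 \le \tau^*\}} \left( Z_{n+1} - \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n) \right). \tag{3.20}$$
For, suppose first that $\tau^* \ge n+1$. Then the left of (3.20) is $Z_{n+1} - Z_n$, the right is $Z_{n+1} - \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)$, and these agree on $\{n+1 \le \tau^*\}$ by the above. The other possibility is that $\tau^* < n+1$, i.e. $\tau^* \le n$. Then the left of (3.20) is $Z_{\tau^*} - Z_{\tau^*} = 0$, while the right is zero because the indicator is zero, completing the proof of (3.20). Now apply $\mathbb{E}(\,\cdot \mid \mathcal{F}_n)$ to (3.20): since $\{n+1 \le \tau^*\} = \{\tau^* \le n\}^c \in \mathcal{F}_n$,
$$\mathbb{E}\left[ (Z^{\tau^*}_{n+1} - Z^{\tau^*}_n) \mid \mathcal{F}_n \right] = 1_{\{n+1 \le \tau^*\}} \, \mathbb{E}\left[ (Z_{n+1} - \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n)) \mid \mathcal{F}_n \right]$$
$$= 1_{\{n+1 \le \tau^*\}} \left[ \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n) - \mathbb{E}(Z_{n+1} \mid \mathcal{F}_n) \right] = 0.$$
So $\mathbb{E}(Z^{\tau^*}_{n+1} \mid \mathcal{F}_n) = Z^{\tau^*}_n$. This says that $Z^{\tau^*}$ is a martingale, as required.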
Write $\mathcal{T}_{n,N}$ for the set of stopping times taking values in $\{n, n+1, \ldots, N\}$ (a finite set, as $\Omega$ is finite). Call a stopping time $\sigma \in \mathcal{T}_{n,N}$ optimal for $(X_n)$ if
$$\mathbb{E}(X_\sigma \mid \mathcal{F}_n) = \sup\{\mathbb{E}(X_\tau \mid \mathcal{F}_n) : \tau \in \mathcal{T}_{n,N}\}.$$
We next see that the Snell envelope can be used to solve the optimal stopping problem for $(X_n)$ in $\mathcal{T}_{0,N}$. Recall that $\mathcal{F}_0 = \{\emptyset, \Omega\}$, so $\mathbb{E}(Y \mid \mathcal{F}_0) = \mathbb{E}(Y)$ for any integrable random variable $Y$.
Proposition 3.6.7. $\tau^*$ solves the optimal stopping problem for $X$:
$$Z_0 = \mathbb{E}(X_{\tau^*}) = \sup\{\mathbb{E}(X_\tau) : \tau \in \mathcal{T}_{0,N}\}.$$
Proof. To prove the first statement we use that $(Z^{\tau^*}_n)$ is a martingale and $Z_{\tau^*} = X_{\tau^*}$; then
$$Z_0 = Z^{\tau^*}_0 = \mathbb{E}\left( Z^{\tau^*}_N \right) = \mathbb{E}(Z_{\tau^*}) = \mathbb{E}(X_{\tau^*}). \tag{3.21}$$
Now for any stopping time $\tau \in \mathcal{T}_{0,N}$, since $Z$ is a supermartingale (above), so is the stopped process $Z^\tau$ (see Proposition 3.6.5). Together with the property that $Z$ dominates $X$ this yields
$$Z_0 = Z^\tau_0 \ge \mathbb{E}(Z^\tau_N) = \mathbb{E}(Z_\tau) \ge \mathbb{E}(X_\tau). \tag{3.22}$$
Combining (3.21) and (3.22) and taking the supremum over $\tau$ gives the result.
The same argument, starting at time $n$ rather than time 0, gives
Corollary 3.6.1. If $\tau^*_n := \inf\{j \ge n : Z_j = X_j\}$, then
$$Z_n = \mathbb{E}(X_{\tau^*_n} \mid \mathcal{F}_n) = \sup\{\mathbb{E}(X_\tau \mid \mathcal{F}_n) : \tau \in \mathcal{T}_{n,N}\}.$$
As we are attempting to maximise our payoff by stopping $X = (X_n)$ at the most advantageous time, the Corollary shows that $\tau^*_n$ gives the best stopping time that is realistic: it maximises our expected payoff given only information currently available.
We proceed by analysing optimal stopping times. One can characterize optimality by establishing a martingale property:
Proposition 3.6.8. The stopping time $\sigma \in \mathcal{T}_{0,N}$ is optimal for $(X_n)$ if and only if the following two conditions hold.
(i) $Z_\sigma = X_\sigma$;
(ii) $Z^\sigma$ is a martingale.
Proof. We start by showing that (i) and (ii) imply optimality. If $Z^\sigma$ is a martingale then
$$Z_0 = \mathbb{E}(Z^\sigma_0) = \mathbb{E}(Z^\sigma_N) = \mathbb{E}(Z_\sigma) = \mathbb{E}(X_\sigma),$$
where we used (i) for the last identity. Since $Z$ is a supermartingale, Proposition 3.6.5 implies that $Z^\tau$ is a supermartingale for any $\tau \in \mathcal{T}_{0,N}$. Now $Z$ dominates $X$, and so
$$Z_0 = \mathbb{E}(Z^\tau_0) \ge \mathbb{E}(Z^\tau_N) = \mathbb{E}(Z_\tau) \ge \mathbb{E}(X_\tau).$$
Combining, $\sigma$ is optimal.
Now assume that $\sigma$ is optimal. Thus
$$Z_0 = \max\{\mathbb{E}(X_\tau) : \tau \in \mathcal{T}_{0,N}\} = \mathbb{E}(X_\sigma) \le \mathbb{E}(Z_\sigma),$$
since $Z$ dominates $X$. Since $Z^\sigma$ is a supermartingale, we also have $Z_0 \ge \mathbb{E}(Z_\sigma)$. Combining,
$$\mathbb{E}(X_\sigma) = Z_0 = \mathbb{E}(Z_\sigma).$$
But $X \le Z$, so $X_\sigma \le Z_\sigma$, while by the above $X_\sigma$ and $Z_\sigma$ have the same expectation. So they must be a.s. equal: $X_\sigma = Z_\sigma$ a.s., showing (i). To see (ii), observe that for any $n \le N$
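$$\mathbb{E}(Z_\sigma) = Z_0 \ge \mathbb{E}(Z_{\sigma \wedge n}) \ge \mathbb{E}(Z_\sigma) = \mathbb{E}(\mathbb{E}(Z_\sigma \mid \mathcal{F}_n)),$$
where the second inequality follows from Doob's OST (Theorem 3.6.2) with the bounded stopping times $(\sigma \wedge n) \le \sigma$ and the supermartingale $Z$. Using that $Z$ is a supermartingale again, we also find
$$Z_{\sigma \wedge n} \ge \mathbb{E}(Z_\sigma \mid \mathcal{F}_n). \tag{3.23}$$
As above, this inequality between random variables with equal expectations forces a.s. equality: $Z_{\sigma \wedge n} = \mathbb{E}(Z_\sigma \mid \mathcal{F}_n)$ a.s. Apply $\mathbb{E}(\,\cdot \mid \mathcal{F}_{n-1})$:
$$\mathbb{E}(Z_{\sigma \wedge n} \mid \mathcal{F}_{n-1}) = \mathbb{E}(\mathbb{E}(Z_\sigma \mid \mathcal{F}_n) \mid \mathcal{F}_{n-1}) = \mathbb{E}(Z_\sigma \mid \mathcal{F}_{n-1}) = Z_{\sigma \wedge (n-1)},$$
by (3.23) with $n-1$ for $n$. This says
$$\mathbb{E}(Z^\sigma_n \mid \mathcal{F}_{n-1}) = Z^\sigma_{n-1},$$
so $Z^\sigma$ is a martingale.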
From Proposition 3.6.6 and its definition (first time when $Z$ and $X$ are equal) it follows that $\tau^*$ is the smallest optimal stopping time. To find the largest optimal stopping time we try to find the time when $Z$ 'ceases to be a martingale'. In order to do so we need a structural result of genuine interest and importance.
Theorem 3.6.4 (Doob Decomposition). Let $X = (X_n)$ be an adapted process with each $X_n \in L^1$. Then $X$ has an (essentially unique) Doob decomposition
$$X = X_0 + M + A : \quad X_n = X_0 + M_n + A_n \quad \forall n \tag{3.24}$$
with $M$ a martingale null at zero, $A$ a predictable process null at zero. If also $X$ is a submartingale ('increasing on average'), $A$ is increasing: $A_n \le A_{n+1}$ for all $n$, a.s.
Proof. If $X$ has a Doob decomposition (3.24),
$$\mathbb{E}[X_n - X_{n-1} \mid \mathcal{F}_{n-1}] = \mathbb{E}[M_n - M_{n-1} \mid \mathcal{F}_{n-1}] + \mathbb{E}[A_n - A_{n-1} \mid \mathcal{F}_{n-1}].$$
The first term on the right is zero, as $M$ is a martingale. The second is $A_n - A_{n-1}$, since $A_n$ (and $A_{n-1}$) is $\mathcal{F}_{n-1}$-measurable by predictability. So
$$\mathbb{E}[X_n - X_{n-1} \mid \mathcal{F}_{n-1}] = A_n - A_{n-1}, \tag{3.25}$$
and summation gives
$$A_n = \sum_{k=1}^{n} \mathbb{E}[X_k - X_{k-1} \mid \mathcal{F}_{k-1}], \quad \text{a.s.}$$
So set $A_0 = 0$ and use this formula to define $(A_n)$, clearly predictable. We then use (3.24) to define $(M_n)$, then a martingale, giving the Doob decomposition (3.24). To see uniqueness, assume two decompositions, i.e. $X_n = X_0 + M_n + A_n = X_0 + \tilde{M}_n + \tilde{A}_n$; then $M_n - \tilde{M}_n = \tilde{A}_n - A_n$. Thus the martingale $M_n - \tilde{M}_n$ is predictable and so must be constant a.s.
If $X$ is a submartingale, the LHS of (3.25) is $\ge 0$, so the RHS of (3.25) is $\ge 0$, i.e. $(A_n)$ is increasing.
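The construction in the proof can be traced on a small example. The sketch below is an illustration under assumptions not made in the notes: the filtration is generated by independent fair $\pm 1$ coin tosses, and the process is $X_n = S_n^2$ with $S_n$ the sum of the first $n$ tosses (a submartingale). The helper names `doob_decomposition` and `square_of_walk` are hypothetical.

```python
def doob_decomposition(x, path, p_up=0.5):
    """Return (A, M) along `path` for X_n = x(first n steps of the path).

    `path` is a tuple of +1/-1 steps of a coin-tossing filtration
    (a hypothetical toy setting chosen for this sketch).
    """
    A = [0.0]
    for k in range(1, len(path) + 1):
        prefix = path[:k - 1]
        # E[X_k | F_{k-1}] on the event that the first k-1 steps equal `prefix`.
        cond_exp = p_up * x(prefix + (1,)) + (1 - p_up) * x(prefix + (-1,))
        A.append(A[-1] + cond_exp - x(prefix))       # increment from (3.25)
    # M is then defined through (3.24): M_n = X_n - X_0 - A_n.
    M = [x(path[:n]) - x(()) - A[n] for n in range(len(path) + 1)]
    return A, M


def square_of_walk(prefix):
    """X_n = S_n**2 for the walk S_n = sum of the first n steps."""
    return sum(prefix) ** 2


A, M = doob_decomposition(square_of_walk, (1, 1, -1, 1))
print(A, M)
```

In this toy case the predictable part is $A_n = n$ on every path and the martingale part is $M_n = S_n^2 - n$; $A$ is increasing, as the theorem predicts for a submartingale.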
Although the Doob decomposition is a simple result in discrete time, the analogue in continuous time (the Doob-Meyer decomposition) is deep. This illustrates the contrasts that may arise between the theories of stochastic processes in discrete and continuous time.
Equipped with the Doob decomposition we return to the above setting and can write
$$Z = Z_0 + L + B$$
with $L$ a martingale and $B$ predictable and decreasing. Then $M = Z_0 + L$ is a martingale and $A = (-B)$ is increasing and we have $Z = M - A$.
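Definition 3.6.3. Define a random variable $\nu : \Omega \to \mathbb{N}_0$ by setting
$$\nu(\omega) = \begin{cases} N & \text{if } A_N(\omega) = 0, \\ \min\{n \ge 0 : A_{n+1} > 0\} & \text{if } A_N(\omega) > 0. \end{cases}$$
Observe that $\nu$ (bounded by $N$) is a stopping time, since
$$\{\nu = n\} = \bigcap_{k \le n} \{A_k = 0\} \cap \{A_{n+1} > 0\} \in \mathcal{F}_n$$
as $A$ is predictable.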
Proposition 3.6.9. $\nu$ is optimal for $(X_n)$, and it is the largest optimal stopping time for $(X_n)$.
Proof. We use Proposition 3.6.8. Since for $k \le \nu(\omega)$, $Z_k(\omega) = M_k(\omega) - A_k(\omega) = M_k(\omega)$, $Z^\nu$ is a martingale and thus we have (ii) of Proposition 3.6.8. To see (i) we write
$$Z_\nu = \sum_{k=0}^{N-1} 1_{\{\nu = k\}} Z_k + 1_{\{\nu = N\}} Z_N = \sum_{k=0}^{N-1} 1_{\{\nu = k\}} \max\{X_k, \mathbb{E}(Z_{k+1} \mid \mathcal{F}_k)\} + 1_{\{\nu = N\}} X_N.$$
Now $\mathbb{E}(Z_{k+1} \mid \mathcal{F}_k) = \mathbb{E}(M_{k+1} - A_{k+1} \mid \mathcal{F}_k) = M_k - A_{k+1}$. On $\{\nu = k\}$ we have $A_k = 0$ and $A_{k+1} > 0$, so $\mathbb{E}(Z_{k+1} \mid \mathcal{F}_k) < Z_k$. Hence $Z_k = \max\{X_k, \mathbb{E}(Z_{k+1} \mid \mathcal{F}_k)\} = X_k$ on the set $\{\nu = k\}$. So
$$Z_\nu = \sum_{k=0}^{N-1} 1_{\{\nu = k\}} X_k + 1_{\{\nu = N\}} X_N = X_\nu,$$
which is (i) of Proposition 3.6.8. Now take $\tau \in \mathcal{T}_{0,N}$ with $\tau \ge \nu$ and $\mathbb{P}(\tau > \nu) > 0$. From the definition of $\nu$ and the fact that $A$ is increasing, $A_\tau > 0$ with positive probability. So $\mathbb{E}(A_\tau) > 0$, and
$$\mathbb{E}(Z_\tau) = \mathbb{E}(M_\tau) - \mathbb{E}(A_\tau) = \mathbb{E}(Z_0) - \mathbb{E}(A_\tau) < \mathbb{E}(Z_0).$$
So $\tau$ cannot be optimal.
3.6.2 The Financial Model
We assume now that we work in a market model $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, which is complete with $\mathbb{P}^*$ the unique martingale measure.
Then for any hedging strategy $\varphi$ we have that under $\mathbb{P}^*$
$$M(t) = \tilde{V}_\varphi(t) = \beta(t) V_\varphi(t) \tag{3.26}$$
is a martingale. Thus we can use the STP (Theorem 3.6.1) to find for any stopping time $\tau$
$$V_\varphi(0) = M_0 = \mathbb{E}^*(\tilde{V}_\varphi(\tau)). \tag{3.27}$$
Since we require $V_\varphi(\tau) \ge f_\tau(S)$ for any stopping time, we find for the required initial capital
$$x \ge \sup_{\tau \in \mathcal{T}} \mathbb{E}^*(\beta(\tau) f_\tau(S)). \tag{3.28}$$
Suppose now that $\tau^*$ is such that $V_\varphi(\tau^*) = f_{\tau^*}(S)$; then the strategy $\varphi$ is minimal and since $V_\varphi(t) \ge f_t(S)$ for all $t$ we have
$$x = \mathbb{E}^*(\beta(\tau^*) f_{\tau^*}(S)) = \sup_{\tau \in \mathcal{T}} \mathbb{E}^*(\beta(\tau) f_\tau(S)). \tag{3.29}$$
Thus (3.29) is a necessary condition for the existence of a minimal strategy $\varphi$. We will show that it is also sufficient and call the price in (3.29) the rational price of an American contingent claim.
Now consider the problem of the option writer to construct such a strategy $\varphi$. At time $T$ the hedging strategy needs to cover $f_T$, i.e. $V_\varphi(T) \ge f_T$ is required (we write $f_t$ for $f_t(S)$ for short). At time $T-1$ the option holder can either exercise and receive $f_{T-1}$ or hold the option to expiry, in which case $B(T-1)\,\mathbb{E}^*(\beta(T) f_T \mid \mathcal{F}_{T-1})$ needs to be covered. Thus the hedging strategy of the writer has to satisfy
$$V_\varphi(T-1) = \max\{f_{T-1}, B(T-1)\,\mathbb{E}^*(\beta(T) f_T \mid \mathcal{F}_{T-1})\}. \tag{3.30}$$
Using a backwards induction argument we can show that
$$V_\varphi(t-1) = \max\{f_{t-1}, B(t-1)\,\mathbb{E}^*(\beta(t) V_\varphi(t) \mid \mathcal{F}_{t-1})\}. \tag{3.31}$$
Considering only discounted values this leads to
$$\tilde{V}_\varphi(t-1) = \max\{\tilde{f}_{t-1}, \mathbb{E}^*(\tilde{V}_\varphi(t) \mid \mathcal{F}_{t-1})\}. \tag{3.32}$$
Thus we see that $\tilde{V}_\varphi(t)$ is the Snell envelope $Z_t$ of $\tilde{f}_t$.
In particular we know that
$$Z_t = \sup_{\tau \in \mathcal{T}_t} \mathbb{E}^*(\tilde{f}_\tau \mid \mathcal{F}_t) \tag{3.33}$$
and the stopping time $\tau^* = \min\{s \ge t : Z_s = \tilde{f}_s\}$ is optimal. So
$$Z_t = \mathbb{E}^*(\tilde{f}_{\tau^*} \mid \mathcal{F}_t). \tag{3.34}$$
In case $t = 0$ we can use $\tau^*_0 = \min\{s \ge 0 : Z_s = \tilde{f}_s\}$ and then
$$x = Z_0 = \mathbb{E}^*(\tilde{f}_{\tau^*_0}) = \sup_{\tau \in \mathcal{T}_0} \mathbb{E}^*(\tilde{f}_\tau) \tag{3.35}$$
is the rational option price.
We still need to construct the strategy $\varphi$. To do this recall that $Z$ is a supermartingale and so the Doob decomposition yields
$$Z = \tilde{M} - \tilde{A} \tag{3.36}$$
with a martingale $\tilde{M}$ and a predictable, increasing process $\tilde{A}$. We write $M_t = \tilde{M}_t B_t$ and $A_t = \tilde{A}_t B_t$. Since the market is complete we know that there exists a self-financing strategy $\bar{\varphi}$ such that
$$\tilde{M}_t = \tilde{V}_{\bar{\varphi}}(t). \tag{3.37}$$
Also using (3.36) we find $Z_t B_t = V_{\bar{\varphi}}(t) - A_t$. Now on $C = \{(t, \omega) : 0 \le t < \tau^*(\omega)\}$ we have that $Z$ is a martingale and thus $A_t(\omega) = 0$. Thus we obtain from $\tilde{V}_{\bar{\varphi}}(t) = Z_t$ that
$$\tilde{V}_{\bar{\varphi}}(t) = \sup_{t \le \tau \le T} \mathbb{E}^*(\tilde{f}_\tau \mid \mathcal{F}_t) \quad \forall (t, \omega) \in C. \tag{3.38}$$
Now $\tau^*$ is the smallest exercise time and $\tilde{A}_{\tau^*(\omega)} = 0$. Thus
$$\tilde{V}_{\bar{\varphi}}(\tau^*(\omega), \omega) = Z_{\tau^*(\omega)}(\omega) = \tilde{f}_{\tau^*(\omega)}(\omega). \tag{3.39}$$
Undoing the discounting we find
$$V_{\bar{\varphi}}(\tau^*) = f_{\tau^*} \tag{3.40}$$
and therefore $\bar{\varphi}$ is a minimal hedge.
Now consider the problem of the option holder: how to find the optimal exercise time. We observe that the optimal exercise time must be an optimal stopping time, since for any other stopping time $\sigma$ (use Proposition 3.6.8)
$$\tilde{V}_\varphi(\sigma) = Z_\sigma > \tilde{f}_\sigma \tag{3.41}$$
and holding the asset longer would generate a larger payoff. Thus the holder needs to wait until $Z_\sigma = \tilde{f}_\sigma$, i.e. until (i) of Proposition 3.6.8 is true. On the other hand, with $\nu$ the largest optimal stopping time (compare Definition 3.6.3) we see that $\sigma \le \nu$. This follows since using $\bar{\varphi}$ after $\nu$ with initial capital from exercising will always yield a higher portfolio value than the strategy of exercising later. To see this recall that $V_{\bar{\varphi}}(t) = Z_t B_t + A_t$ with $A_t > 0$ for $t > \nu$. So we must have $\sigma \le \nu$, and since $A_t = 0$ for $t \le \nu$ we see that $Z^\sigma$ is a martingale. Now criterion (ii) of Proposition 3.6.8 is true and $\sigma$ is thus optimal. So
Proposition 3.6.10. A stopping time $\sigma \in \mathcal{T}_t$ is an optimal exercise time for the American option $(f_t)$ if and only if
$$\mathbb{E}^*(\beta(\sigma) f_\sigma) = \sup_{\tau \in \mathcal{T}_t} \mathbb{E}^*(\beta(\tau) f_\tau). \tag{3.42}$$
3.6.3 American Options in the Cox-Ross-Rubinstein model
We now consider how to evaluate an American put option in a standard CRR model. We assume that the time interval $[0, T]$ is divided into $N$ equal subintervals of length $\Delta$, say. Assuming the risk-free rate of interest $r$ (over $[0, T]$) as given, we have $1 + \rho = e^{r\Delta}$ (where we denote the risk-free rate of interest in each subinterval by $\rho$). The remaining degrees of freedom are resolved by choosing $u$ and $d$ as follows:
$$1 + u = e^{\sigma\sqrt{\Delta}}, \quad \text{and} \quad 1 + d = (1 + u)^{-1} = e^{-\sigma\sqrt{\Delta}}.$$
By condition (3.9) the risk-neutral probabilities for the corresponding single-period models are given by
$$p^* = \frac{\rho - d}{u - d} = \frac{e^{r\Delta} - e^{-\sigma\sqrt{\Delta}}}{e^{\sigma\sqrt{\Delta}} - e^{-\sigma\sqrt{\Delta}}}.$$
Thus the stock with initial value $S = S(0)$ is worth $S(1+u)^i (1+d)^j$ after $i$ steps up and $j$ steps down. Consequently, after $N$ steps, there are $N + 1$ possible prices, $S(1+u)^i (1+d)^{N-i}$ ($i = 0, \ldots, N$). There are $2^N$ possible paths through the tree. It is common to take $N$ of the order of 30, for two reasons:
(i) typical lengths of time to expiry of options are measured in months (9 months, say); this gives a time step around the corresponding number of days,
(ii) $2^{30}$ paths is about the order of magnitude that can be comfortably handled by computers (recall that $2^{10} = 1{,}024$, so $2^{30}$ is somewhat over a billion).
We can now calculate both the value of an American put option and the optimal exercise
strategy by working backwards through the tree (this method of backward recursion in time is a
form of the dynamic programming (DP) technique, due to Richard Bellman, which is important
in many areas of optimisation and Operational Research).
1. Draw a binary tree showing the initial stock value and having the right number, $N$, of time intervals.
2. Fill in the stock prices: after one time interval, these are $S(1+u)$ (upper) and $S(1+d)$ (lower); after two time intervals, $S(1+u)^2$, $S$ and $S(1+d)^2 = S/(1+u)^2$; after $i$ time intervals, these are $S(1+u)^j (1+d)^{i-j} = S(1+u)^{2j-i}$ at the node with $j$ 'up' steps and $i - j$ 'down' steps (the '$(i, j)$' node).
3. Using the strike price $K$ and the prices at the terminal nodes, fill in the payoffs $f^A_{N,j} = \max\{K - S(1+u)^j (1+d)^{N-j}, 0\}$ from the option at the terminal nodes underneath the terminal prices.
4. Work back down the tree, from right to left. The no-exercise values $f_{ij}$ of the option at the $(i, j)$ node are given in terms of those of its upper and lower right neighbours in the usual way, as discounted expected values under the risk-neutral measure:
$$f_{ij} = e^{-r\Delta} \left[ p^* f^A_{i+1,j+1} + (1 - p^*) f^A_{i+1,j} \right].$$
The intrinsic (or early-exercise) value of the American put at the $(i, j)$ node, that is, the value there if it is exercised early, is
$$K - S(1+u)^j (1+d)^{i-j}$$
(when this is non-negative, and so has any value). The value of the American put is the higher of these:
$$f^A_{ij} = \max\{f_{ij}, K - S(1+u)^j (1+d)^{i-j}\} = \max\left\{ e^{-r\Delta} \left( p^* f^A_{i+1,j+1} + (1 - p^*) f^A_{i+1,j} \right), K - S(1+u)^j (1+d)^{i-j} \right\}.$$
5. The initial value of the option is the value $f^A_0$ filled in at the root of the tree.
6. At each node, it is optimal to exercise early if the early-exercise value there exceeds the value $f_{ij}$ there of expected discounted future payoff.
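Steps 1 to 6 translate almost line by line into code. The following is a minimal sketch, not a production pricer: the function name `crr_american_put` and its argument names are assumptions made for this illustration, and only one time layer of option values is kept in memory at a time.

```python
import math


def crr_american_put(S0, K, r, sigma, T, N):
    """American put in the CRR tree by backward induction (steps 1-6).

    Returns the root value and the list of nodes (i, j), with j up-steps
    after i periods, at which early exercise beats continuation.
    """
    dt = T / N
    up = math.exp(sigma * math.sqrt(dt))        # 1 + u
    down = 1.0 / up                              # 1 + d
    disc = math.exp(-r * dt)                     # one-period discount factor
    p = (math.exp(r * dt) - down) / (up - down)  # risk-neutral probability p*

    # Step 3: payoffs at the terminal nodes j = 0..N.
    values = [max(K - S0 * up ** j * down ** (N - j), 0.0) for j in range(N + 1)]
    early_exercise = []

    # Step 4: work back through the tree.
    for i in range(N - 1, -1, -1):
        layer = []
        for j in range(i + 1):
            continuation = disc * (p * values[j + 1] + (1 - p) * values[j])
            intrinsic = K - S0 * up ** j * down ** (i - j)
            if intrinsic > continuation:         # step 6
                early_exercise.append((i, j))
            layer.append(max(continuation, intrinsic))
        values = layer
    return values[0], early_exercise             # step 5


price, nodes = crr_american_put(100.0, 100.0, 0.06, 0.2, 1.0, 3)
print(round(price, 4), nodes)
```

The work grows like $N^2$ because the tree recombines, so the node values (rather than the $2^N$ paths) are all that has to be stored.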
3.6.4 A Three-period Example
Assume we have two basic securities: a risk-free bond and a risky stock. The one-year risk-free interest rate (continuously compounded) is $r = 0.06$ and the volatility of the stock is 20%. We price calls and puts in a three-period Cox-Ross-Rubinstein model. The up and down movements of the stock price are given by
$$1 + u = e^{\sigma\sqrt{\Delta}} = 1.1224 \quad \text{and} \quad 1 + d = (1 + u)^{-1} = e^{-\sigma\sqrt{\Delta}} = 0.8910,$$
with $\sigma = 0.2$ and $\Delta = 1/3$. We obtain risk-neutral probabilities by (3.9):
$$p^* = \frac{e^{r\Delta} - (1 + d)}{(1 + u) - (1 + d)} = 0.5584.$$
We assume that the price of the stock at time $t = 0$ is $S(0) = 100$. To price a European call option with maturity one year ($N = 3$) and strike $K = 100$ we can either use the valuation formula (3.12) or work our way backwards through the tree. Prices of the stock and the call are given in Figure 3.2 below.

Figure 3.2: Stock and European call prices (at each node $S$ is the stock price and $c$ the call price; nodes are listed from the highest to the lowest stock price)
t = 0: S = 100, c = 11.56
t = 1: S = 112.24, c = 18.21; S = 89.10, c = 3.67
t = 2: S = 125.98, c = 27.96; S = 100, c = 6.70; S = 79.38, c = 0
t = 3: S = 141.40, c = 41.40; S = 112.24, c = 12.24; S = 89.10, c = 0; S = 70.72, c = 0

One can implement the simple evaluation formulae for the CRR and the Black-Scholes models and compare the values. Figure 3.3 is for $S = 100$, $K = 90$, $r = 0.06$, $\sigma = 0.2$, $T = 1$.
[Plot "Approximating CRR prices": horizontal axis "Approximation" (ticks 50 to 200), vertical axis "Call price" (ticks 16.694 to 16.704).]
Figure 3.3: Approximation of Black-Scholes price by Binomial models
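The three-period call valuation above can be reproduced with a few lines of backward induction. This is a sketch under the parameters stated in the example ($S(0) = 100$, $K = 100$, $r = 0.06$, $\sigma = 0.2$, $T = 1$, $N = 3$); the function name `crr_european_call` and its return convention are assumptions made for this illustration.

```python
import math


def crr_european_call(S0, K, r, sigma, T, N):
    """European call in the CRR tree; also returns 1+u, 1+d and p*."""
    dt = T / N
    up = math.exp(sigma * math.sqrt(dt))         # 1 + u
    down = 1.0 / up                               # 1 + d
    p = (math.exp(r * dt) - down) / (up - down)   # risk-neutral probability
    disc = math.exp(-r * dt)
    # Terminal payoffs at the nodes with j up-steps, j = 0..N.
    values = [max(S0 * up ** j * down ** (N - j) - K, 0.0) for j in range(N + 1)]
    # Discounted risk-neutral expectation, one layer at a time.
    for i in range(N - 1, -1, -1):
        values = [disc * (p * values[j + 1] + (1 - p) * values[j])
                  for j in range(i + 1)]
    return values[0], up, down, p


call, up, down, p_star = crr_european_call(100.0, 100.0, 0.06, 0.2, 1.0, 3)
print(round(up, 4), round(p_star, 4), round(call, 2))
```

Increasing `N` with the other arguments fixed gives the sequence of binomial prices whose convergence Figure 3.3 is meant to illustrate.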
To price a European put, with price process denoted by $p(t)$, and an American put, $P(t)$ (maturity $N = 3$, strike 100), we can for the European put either use the put-call parity (1.1), the risk-neutral pricing formula, or work backwards through the tree. For the prices of the American put we use the technique outlined in §3.6.3. Prices of the two puts are given in Figure 3.4. We indicate the early exercise times of the American put in bold type. Recall that the discrete-time rule is to exercise if the intrinsic value $K - S(t)$ is larger than the continuation value, i.e. the discounted risk-neutral expectation of the put's value one period later.
Figure 3.4: European p(.) and American P(.) put prices (nodes are listed from the highest to the lowest stock price)
t = 0: p = 5.82, P = 6.18
t = 1: p = 2.08, P = 2.08; p = 10.65, P = 11.59
t = 2: p = 0, P = 0; p = 4.76, P = 4.76; p = 18.71, P = 20.62
t = 3: p = 0, P = 0; p = 0, P = 0; p = 10.90, P = 10.90; p = 29.28, P = 29.28
Chapter 4
Continuous-time Financial Market Models
4.1 The Stock Price Process and its Stochastic Calculus
4.1.1 Continuous-time Stochastic Processes
A stochastic process $X = (X(t))_{t \ge 0}$ is a family of random variables defined on $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$. We say $X$ is adapted if $X(t) \in \mathcal{F}_t$ (i.e. $X(t)$ is $\mathcal{F}_t$-measurable) for each $t$: thus $X(t)$ is known when $\mathcal{F}_t$ is known, at time $t$.
The martingale property in continuous time is just that suggested by the discrete-time case:
Definition 4.1.1. A stochastic process $X = (X(t))_{0 \le t < \infty}$ is a martingale relative to $(\mathbb{F}, \mathbb{P})$ if
(i) $X$ is adapted, and $\mathbb{E}|X(t)| < \infty$ for all $0 \le t < \infty$;
(ii) $\mathbb{E}[X(t) \mid \mathcal{F}_s] = X(s)$ $\mathbb{P}$-a.s. ($0 \le s \le t$),
and similarly for sub- and supermartingales.
There are regularisation results, under which one can take $X(t)$ RCLL in $t$ (basically $t \mapsto \mathbb{E}X(t)$ has to be right-continuous). Then the analogues of the results for discrete-time martingales hold true.
Interpretation. Martingales model fair games. Submartingales model favourable games. Supermartingales model unfavourable games.
Brownian motion originates in work of the botanist Robert Brown in 1828. It was introduced
into ﬁnance by Louis Bachelier in 1900, and developed in physics by Albert Einstein in 1905.
Definition 4.1.2. A stochastic process $X = (X(t))_{t \ge 0}$ is a standard (one-dimensional) Brownian motion, BM or BM($\mathbb{R}$), on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$, if
(i) $X(0) = 0$ a.s.,
(ii) $X$ has independent increments: $X(t+u) - X(t)$ is independent of $\sigma(X(s) : s \le t)$ for $u \ge 0$,
(iii) $X$ has stationary increments: the law of $X(t+u) - X(t)$ depends only on $u$,
(iv) $X$ has Gaussian increments: $X(t+u) - X(t)$ is normally distributed with mean 0 and variance $u$, $X(t+u) - X(t) \sim N(0, u)$,
(v) $X$ has continuous paths: $X(t)$ is a continuous function of $t$, i.e. $t \mapsto X(t, \omega)$ is continuous in $t$ for all $\omega \in \Omega$.
We shall henceforth denote standard Brownian motion BM($\mathbb{R}$) by $W = (W(t))$ ($W$ for Wiener), though $B = (B(t))$ ($B$ for Brown) is also common. Standard Brownian motion BM($\mathbb{R}^d$) in $d$ dimensions is defined by $W(t) := (W_1(t), \ldots, W_d(t))$, where $W_1, \ldots, W_d$ are independent standard Brownian motions in one dimension (independent copies of BM($\mathbb{R}$)).
We have Wiener's theorem:
Theorem 4.1.1 (Wiener). Brownian motion exists.
For further background, see any measure-theoretic text on stochastic processes. A treatment starting directly from our main reference of measure-theoretic results, Williams (Williams 1991), is Rogers and Williams (Rogers and Williams 1994), Chapter 1. The classic is Doob's book, (Doob 1953), VIII.2. Excellent modern texts include (Karatzas and Shreve 1991, Revuz and Yor 1991) (see particularly (Karatzas and Shreve 1991), §2.2-4 for construction).
Geometric Brownian Motion
Now that we have both Brownian motion $W$ and Itô's Lemma to hand, we can introduce the most important stochastic process for us, a relative of Brownian motion: geometric (or exponential, or economic) Brownian motion.
Suppose we wish to model the time evolution of a stock price $S(t)$ (as we will, in the Black-Scholes theory). Consider how $S$ will change in some small time-interval from the present time $t$ to a time $t + dt$ in the near future. Writing $dS(t)$ for the change $S(t + dt) - S(t)$ in $S$, the return on $S$ in this interval is $dS(t)/S(t)$. It is economically reasonable to expect this return to decompose into two components, a systematic part and a random part. The systematic part could plausibly be modelled by $\mu\,dt$, where $\mu$ is some parameter representing the mean rate of return of the stock. The random part could plausibly be modelled by $\sigma\,dW(t)$, where $dW(t)$ represents the noise term driving the stock price dynamics, and $\sigma$ is a second parameter describing how much effect this noise has, that is, how much the stock price fluctuates. Thus $\sigma$ governs how volatile the price is, and is called the volatility of the stock. The role of the driving noise term is to represent the random buffeting effect of the multiplicity of factors at work in the economic environment in which the stock price is determined by supply and demand.
Putting this together, we have the stochastic differential equation
$$dS(t) = S(t)(\mu\,dt + \sigma\,dW(t)), \quad S(0) > 0, \tag{4.1}$$
due to Itô in 1944. This corrects Bachelier's earlier attempt of 1900 (he did not have the factor $S(t)$ on the right, thereby missing the interpretation in terms of returns and leading to negative stock prices!). Incidentally, Bachelier's work served as Itô's motivation in introducing Itô calculus. The mathematical importance of Itô's work was recognised early, and led on to the work of (Doob 1953), (Meyer 1976) and many others (see the memorial volume (Ikeda, Watanabe, M., and Kunita 1996) in honour of Itô's eightieth birthday in 1995). The economic importance of geometric Brownian motion was recognised by Paul A. Samuelson in his work from 1965 on (Samuelson 1965), for which Samuelson received the Nobel Prize in Economics in 1970, and by Robert Merton (see (Merton 1990) for a full bibliography), in work for which he was similarly honoured in 1997.
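To get a feel for equation (4.1), one can discretise it directly: over a small step $dt$ the price is multiplied by $1 + \mu\,dt + \sigma\,\Delta W$ with $\Delta W \sim N(0, dt)$. The sketch below uses this plain Euler scheme, which is a choice made for this illustration and not a method taken from the notes; the function name `simulate_gbm_euler` and the parameter values are likewise hypothetical.

```python
import math
import random


def simulate_gbm_euler(s0, mu, sigma, T, n_steps, rng):
    """One path of dS = S (mu dt + sigma dW) by the Euler scheme.

    `rng` is a random.Random instance; each Brownian increment is drawn
    as N(0, dt), in line with Definition 4.1.2 (iv).
    """
    dt = T / n_steps
    path = [s0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        path.append(path[-1] * (1.0 + mu * dt + sigma * dW))
    return path


rng = random.Random(0)                       # fixed seed for reproducibility
path = simulate_gbm_euler(100.0, 0.05, 0.2, 1.0, 50, rng)
print(len(path), path[0])
```

Averaging the terminal values of many such paths should come out close to $S(0)(1 + \mu\,dt)^{n}$, the exact mean of the Euler scheme, which reflects the interpretation of $\mu$ as the mean rate of return.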
4.1.2 Stochastic Analysis
Stochastic integration was introduced by K. Itô in 1944, hence its name Itô calculus. It gives a meaning to
$$\int_0^t X\,dY = \int_0^t X(s, \omega)\,dY(s, \omega),$$
for suitable stochastic processes $X$ and $Y$, the integrand and the integrator. We shall confine our attention here mainly to the basic case with integrator Brownian motion: $Y = W$. Much greater generality is possible: for $Y$ a continuous martingale, see (Karatzas and Shreve 1991) or (Revuz and Yor 1991); for a systematic general treatment, see (Protter 2004).
The first thing to note is that stochastic integrals with respect to Brownian motion, if they exist, must be quite different from the measure-theoretic integral. For, the Lebesgue-Stieltjes integrals described there have as integrators the difference of two monotone (increasing) functions, which are locally of bounded variation. But we know that Brownian motion is of infinite (unbounded) variation on every interval. So Lebesgue-Stieltjes and Itô integrals must be fundamentally different.
In view of the above, it is quite surprising that Itô integrals can be defined at all. But if we take for granted Itô's fundamental insight that they can be, it is obvious how to begin and clear enough how to proceed. We begin with the simplest possible integrands $X$, and extend successively in much the same way that we extended the measure-theoretic integral.
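Indicators.
If $X(t, \omega) = 1_{[a,b]}(t)$, there is exactly one plausible way to define $\int X\,dW$:
$$\int_0^t X(s, \omega)\,dW(s, \omega) := \begin{cases} 0 & \text{if } t \le a, \\ W(t) - W(a) & \text{if } a \le t \le b, \\ W(b) - W(a) & \text{if } t \ge b. \end{cases}$$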
Simple Functions.
Extend by linearity: if $X$ is a linear combination of indicators, $X = \sum_{i=1}^{n} c_i 1_{[a_i, b_i]}$, we should define
$$\int_0^t X\,dW := \sum_{i=1}^{n} c_i \int_0^t 1_{[a_i, b_i]}\,dW.$$
Already one wonders how to extend this from constants $c_i$ to suitable random variables, and one seeks to simplify the obvious but clumsy three-line expressions above.
We begin again, this time calling a stochastic process $X$ simple if there is a partition $0 = t_0 < t_1 < \ldots < t_n = T < \infty$ and uniformly bounded $\mathcal{F}_{t_k}$-measurable random variables $\xi_k$ ($|\xi_k| \le C$ for all $k = 0, \ldots, n$ and $\omega$, for some $C$) and if $X(t, \omega)$ can be written in the form
$$X(t, \omega) = \xi_0(\omega) 1_{\{0\}}(t) + \sum_{i=0}^{n} \xi_i(\omega) 1_{(t_i, t_{i+1}]}(t) \quad (0 \le t \le T, \ \omega \in \Omega).$$
Then if $t_k \le t < t_{k+1}$,
$$I_t(X) := \int_0^t X\,dW = \sum_{i=0}^{k-1} \xi_i (W(t_{i+1}) - W(t_i)) + \xi_k (W(t) - W(t_k)) = \sum_{i=0}^{n} \xi_i (W(t \wedge t_{i+1}) - W(t \wedge t_i)).$$
Note that by definition $I_0(X) = 0$ $\mathbb{P}$-a.s. We collect some properties of the stochastic integral defined so far:
Lemma 4.1.1. (i) $I_t(aX + bY) = a I_t(X) + b I_t(Y)$.
(ii) $\mathbb{E}(I_t(X) \mid \mathcal{F}_s) = I_s(X)$ $\mathbb{P}$-a.s. ($0 \le s < t < \infty$), hence $I_t(X)$ is a continuous martingale.
The stochastic integral for simple integrands is essentially a martingale transform, and the
above is essentially the proof that martingale transforms are martingales.
We pause to note a property of square-integrable martingales which we shall need below. Call $M(t) - M(s)$ the increment of $M$ over $(s, t]$. Then for a martingale $M$, the product of the increments over disjoint intervals has zero mean. For, if $s < t \le u < v$,
$$\mathbb{E}[(M(v) - M(u))(M(t) - M(s))] = \mathbb{E}[\mathbb{E}((M(v) - M(u))(M(t) - M(s)) \mid \mathcal{F}_u)] = \mathbb{E}[(M(t) - M(s))\,\mathbb{E}((M(v) - M(u)) \mid \mathcal{F}_u)],$$
taking out what is known (as $s, t \le u$). The inner expectation is zero by the martingale property, so the left-hand side is zero, as required.
We now can add further properties of the stochastic integral for simple functions.
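Lemma 4.1.2. (i) We have the Itô isometry
$$\mathbb{E}\left[ (I_t(X))^2 \right] = \mathbb{E}\left[ \int_0^t X(s)^2\,ds \right].$$
(ii) $\mathbb{E}\left[ (I_t(X) - I_s(X))^2 \mid \mathcal{F}_s \right] = \mathbb{E}\left[ \int_s^t X(u)^2\,du \,\Big|\, \mathcal{F}_s \right]$ $\mathbb{P}$-a.s.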
The Itô isometry above suggests that $\int_0^t X\,dW$ should be defined only for processes with
$$\int_0^t \mathbb{E}\left( X(u)^2 \right) du < \infty \quad \text{for all } t.$$
We then can transfer convergence on a suitable $L^2$-space of stochastic processes to a suitable $L^2$-space of martingales. This gives us an $L^2$-theory of stochastic integration, for which Hilbert-space methods are available.
For the financial applications we have in mind, there is a fixed time-interval, $[0, T]$ say, on which we work (e.g., an option is written at time $t = 0$, with expiry time $t = T$). Then the above becomes
$$\int_0^T \mathbb{E}(X(u)^2)\,du < \infty.$$
Approximation.
We seek a class of integrands suitably approximable by simple integrands. It turns out that:
(i) The suitable class of integrands is the class of $(\mathcal{B}([0, \infty)) \otimes \mathcal{F})$-measurable, $(\mathcal{F}_t)$-adapted processes $X$ with $\int_0^t \mathbb{E}\left( X(u)^2 \right) du < \infty$ for all $t > 0$.
(ii) Each such $X$ may be approximated by a sequence of simple integrands $X_n$ so that the stochastic integral $I_t(X) = \int_0^t X\,dW$ may be defined as the limit of $I_t(X_n) = \int_0^t X_n\,dW$.
(iii) The properties from both lemmas above remain true for the stochastic integral $\int_0^t X\,dW$ defined by (i) and (ii).
Example.
We calculate $\int W(u)\,dW(u)$. We start by approximating the integrand by a sequence of simple functions:
$$X_n(u) = \begin{cases} W(0) = 0 & \text{if } 0 \le u \le t/n, \\ W(t/n) & \text{if } t/n < u \le 2t/n, \\ \quad\vdots & \quad\vdots \\ W\left( \frac{(n-1)t}{n} \right) & \text{if } (n-1)t/n < u \le t. \end{cases}$$
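By definition,
$$\int_0^t W(u)\,dW(u) = \lim_{n \to \infty} \sum_{k=0}^{n-1} W\left( \frac{kt}{n} \right) \left( W\left( \frac{(k+1)t}{n} \right) - W\left( \frac{kt}{n} \right) \right).$$
Rearranging terms, we obtain for the sum on the right
$$\sum_{k=0}^{n-1} W\left( \frac{kt}{n} \right) \left( W\left( \frac{(k+1)t}{n} \right) - W\left( \frac{kt}{n} \right) \right) = \frac{1}{2} W(t)^2 - \frac{1}{2} \left[ \sum_{k=0}^{n-1} \left( W\left( \frac{(k+1)t}{n} \right) - W\left( \frac{kt}{n} \right) \right)^2 \right].$$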
Since the second term approximates the quadratic variation of $W$ and hence tends to $t$ for $n \to \infty$, we find
$$\int_0^t W(u)\,dW(u) = \frac{1}{2} W(t)^2 - \frac{1}{2} t. \tag{4.2}$$
Note the contrast with ordinary (Newton-Leibniz) calculus! Itô calculus requires the second term on the right (the Itô correction term), which arises from the quadratic variation of $W$.
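The rearrangement used in this example, and the convergence of the sum of squared increments to $t$, can both be observed on a simulated path. The following is a small sketch using only the standard library; the function name `ito_sum_W_dW` and the choice of $10^5$ steps are assumptions made for this illustration.

```python
import math
import random


def ito_sum_W_dW(t, n, rng):
    """Left-endpoint sums for the integral of W dW on [0, t] with n steps.

    Returns (sum of W(t_k) * dW_k, W(t), sum of dW_k**2) along one
    simulated Brownian path; `rng` is a random.Random instance.
    """
    dt = t / n
    W = 0.0
    ito_sum = 0.0
    quad_var = 0.0
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(dt))
        ito_sum += W * dW        # integrand evaluated at the LEFT endpoint
        quad_var += dW * dW      # approximates the quadratic variation
        W += dW
    return ito_sum, W, quad_var


s, w, q = ito_sum_W_dW(1.0, 100_000, random.Random(1))
# Algebraic identity from the text: s = W(t)^2 / 2 - q / 2 (up to rounding).
print(s - (0.5 * w * w - 0.5 * q), q)
```

The first printed number is zero up to floating-point rounding on every path, since the rearrangement is purely algebraic; the second is close to $t = 1$, which is where the correction term $-\tfrac{1}{2}t$ in (4.2) comes from.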
One can construct a closely analogous theory for stochastic integrals with the Brownian integrator $W$ above replaced by a square-integrable martingale integrator $M$. The properties above hold, with (i) in Lemma 4.1.2 replaced by
$$\mathbb{E}\left[ \left( \int_0^t X(u)\,dM(u) \right)^2 \right] = \mathbb{E}\left[ \int_0^t X(u)^2\,d\langle M \rangle(u) \right].$$
Quadratic Variation, Quadratic Covariation.
We shall need to extend quadratic variation and quadratic covariation to stochastic integrals. The quadratic variation of $I_t(X) = \int_0^t X(u)\,dW(u)$ is $\int_0^t X(u)^2\,du$. This is proved in the same way as the case $X \equiv 1$, that $W$ has quadratic variation process $t$. More generally, if $Z(t) = \int_0^t X(u)\,dM(u)$ for a continuous martingale integrator $M$, then $\langle Z \rangle(t) = \int_0^t X^2(u)\,d\langle M \rangle(u)$. Similarly (or by polarisation), if $Z_i(t) = \int_0^t X_i(u)\,dM_i(u)$ ($i = 1, 2$), then $\langle Z_1, Z_2 \rangle(t) = \int_0^t X_1(u) X_2(u)\,d\langle M_1, M_2 \rangle(u)$.
4.1.3 Itô's Lemma
Suppose that $b$ is adapted and locally integrable (so $\int_0^t b(s)\,ds$ is defined as an ordinary integral), and $\sigma$ is adapted and measurable with $\int_0^t \mathbb{E}\left( \sigma(u)^2 \right) du < \infty$ for all $t$ (so $\int_0^t \sigma(s)\,dW(s)$ is defined as a stochastic integral). Then
$$X(t) := x_0 + \int_0^t b(s)\,ds + \int_0^t \sigma(s)\,dW(s)$$
defines a stochastic process $X$ with $X(0) = x_0$. It is customary, and convenient, to express such an equation symbolically in differential form, in terms of the stochastic differential equation
$$dX(t) = b(t)\,dt + \sigma(t)\,dW(t), \quad X(0) = x_0. \tag{4.3}$$
Now suppose $f : \mathbb{R} \to \mathbb{R}$ is of class $C^2$. The question arises of giving a meaning to the stochastic differential $df(X(t))$ of the process $f(X(t))$, and finding it. Given a partition of $[0, t]$, i.e. $0 = t_0 < t_1 < \ldots < t_n = t$, we can use Taylor's formula to obtain
$$f(X(t)) - f(X(0)) = \sum_{k=0}^{n-1} \left[ f(X(t_{k+1})) - f(X(t_k)) \right]$$
$$= \sum_{k=0}^{n-1} f'(X(t_k)) \Delta X(t_k) + \frac{1}{2} \sum_{k=0}^{n-1} f''(X(t_k) + \theta_k \Delta X(t_k)) (\Delta X(t_k))^2$$
with $0 < \theta_k < 1$. We know that $\sum (\Delta X(t_k))^2 \to \langle X \rangle(t)$ in probability (so, taking a subsequence, with probability one), and with a little more effort one can prove
$$\sum_{k=0}^{n-1} f''(X(t_k) + \theta_k \Delta X(t_k)) (\Delta X(t_k))^2 \to \int_0^t f''(X(u))\,d\langle X \rangle(u).$$
The first sum is easily recognized as an approximating sequence of a stochastic integral; indeed, we find
$$\sum_{k=0}^{n-1} f'(X(t_k)) \Delta X(t_k) \to \int_0^t f'(X(u))\,dX(u).$$
So we have
Theorem 4.1.2 (Basic Itô formula). If $X$ has stochastic differential given by (4.3) and $f \in C^2$, then $f(X)$ has stochastic differential
$$df(X(t)) = f'(X(t))\,dX(t) + \frac{1}{2} f''(X(t))\,d\langle X\rangle(t),$$
or writing out the integrals,
$$f(X(t)) = f(x_0) + \int_0^t f'(X(u))\,dX(u) + \frac{1}{2}\int_0^t f''(X(u))\,d\langle X\rangle(u).$$
More generally, suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is a function, continuously differentiable once in its first argument (which will denote time), and twice in its second argument (space): $f \in C^{1,2}$. By the Taylor expansion of a smooth function of several variables we get for $t$ close to $t_0$ (we use subscripts to denote partial derivatives: $f_t := \partial f/\partial t$, $f_{tx} := \partial^2 f/\partial t\,\partial x$):
$$\begin{aligned}
f(t, X(t)) = {} & f(t_0, X(t_0)) \\
& + (t - t_0) f_t(t_0, X(t_0)) + (X(t) - X(t_0)) f_x(t_0, X(t_0)) \\
& + \frac{1}{2}(t - t_0)^2 f_{tt}(t_0, X(t_0)) + \frac{1}{2}(X(t) - X(t_0))^2 f_{xx}(t_0, X(t_0)) \\
& + (t - t_0)(X(t) - X(t_0)) f_{tx}(t_0, X(t_0)) + \dots,
\end{aligned}$$
which may be written symbolically as
$$df = f_t\,dt + f_x\,dX + \frac{1}{2} f_{tt}\,(dt)^2 + f_{tx}\,dt\,dX + \frac{1}{2} f_{xx}\,(dX)^2 + \dots.$$
In this, we substitute $dX(t) = b(t)\,dt + \sigma(t)\,dW(t)$ from above, to obtain
$$\begin{aligned}
df = {} & f_t\,dt + f_x\,(b\,dt + \sigma\,dW) \\
& + \frac{1}{2} f_{tt}\,(dt)^2 + f_{tx}\,dt\,(b\,dt + \sigma\,dW) + \frac{1}{2} f_{xx}\,(b\,dt + \sigma\,dW)^2 + \dots
\end{aligned}$$
Now using the formal multiplication rules $dt\,dt = 0$, $dt\,dW = 0$, $dW\,dW = dt$ (which are just shorthand for the corresponding properties of the quadratic variations), we expand
$$(b\,dt + \sigma\,dW)^2 = \sigma^2\,dt + 2b\sigma\,dt\,dW + b^2\,(dt)^2 = \sigma^2\,dt + \text{higher-order terms}$$
to get finally
$$df = \left(f_t + b f_x + \frac{1}{2}\sigma^2 f_{xx}\right)dt + \sigma f_x\,dW + \text{higher-order terms}.$$
As above the higher-order terms are irrelevant, and summarising, we obtain Itô's lemma, the analogue for the Itô or stochastic calculus of the chain rule for ordinary (Newton-Leibniz) calculus:
Theorem 4.1.3 (Itô's Lemma). If $X(t)$ has stochastic differential given by (4.3), then $f = f(t, X(t))$ has stochastic differential
$$df = \left(f_t + b f_x + \frac{1}{2}\sigma^2 f_{xx}\right)dt + \sigma f_x\,dW.$$
That is, writing $f_0$ for $f(0, x_0)$, the initial value of $f$,
$$f = f_0 + \int_0^t \left(f_t + b f_x + \frac{1}{2}\sigma^2 f_{xx}\right)dt + \int_0^t \sigma f_x\,dW.$$
We will make good use of:
Corollary 4.1.1. $\mathbb{E}\left(f(t, X(t))\right) = f_0 + \int_0^t \mathbb{E}\left(f_t + b f_x + \frac{1}{2}\sigma^2 f_{xx}\right)dt$.
Proof. $\int_0^t \sigma f_x\,dW$ is a stochastic integral, so a martingale, so its expectation is constant (= 0, as it starts at 0).
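As a short worked illustration (our own choice of example, not taken from the notes), take $X = W$, i.e. $b \equiv 0$, $\sigma \equiv 1$, $x_0 = 0$ in (4.3), and $f(t, x) = x^2$. Itô's lemma and the corollary then give:

```latex
% Assumed example: f(t,x) = x^2, X = W (b = 0, sigma = 1, x_0 = 0).
\begin{align*}
  f_t &= 0, \qquad f_x = 2x, \qquad f_{xx} = 2, \\
  d\bigl(W(t)^2\bigr) &= \Bigl(0 + 0 \cdot 2W(t) + \tfrac{1}{2} \cdot 1 \cdot 2\Bigr)\,dt + 2W(t)\,dW(t)
                       = dt + 2W(t)\,dW(t), \\
  W(t)^2 &= t + 2\int_0^t W(u)\,dW(u), \\
  \mathbb{E}\bigl[W(t)^2\bigr] &= 0 + \int_0^t \mathbb{E}[1]\,du = t .
\end{align*}
```

The extra term $t$ is exactly the correction that distinguishes the Itô chain rule from the Newton-Leibniz one.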
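The differential equation (4.1) above has the unique solution
$$S(t) = S(0)\exp\left\{\left(\mu - \frac{1}{2}\sigma^2\right)t + \sigma W(t)\right\}.$$
For, writing
$$f(t, x) := S(0)\exp\left\{\left(\mu - \frac{1}{2}\sigma^2\right)t + \sigma x\right\},$$
we have
$$f_t = \left(\mu - \frac{1}{2}\sigma^2\right)f, \qquad f_x = \sigma f, \qquad f_{xx} = \sigma^2 f,$$
and with $x = W(t)$, one has
$$dx = dW(t), \qquad (dx)^2 = dt.$$
Thus Itô's lemma gives
$$\begin{aligned}
df(t, W(t)) &= f_t\,dt + f_x\,dW(t) + \frac{1}{2} f_{xx}\,(dW(t))^2 \\
&= f\left(\left(\mu - \frac{1}{2}\sigma^2\right)dt + \sigma\,dW(t) + \frac{1}{2}\sigma^2\,dt\right) \\
&= f\,(\mu\,dt + \sigma\,dW(t)),
\end{aligned}$$
so $f(t, W(t))$ is a solution of the stochastic differential equation, and the initial condition $f(0, W(0)) = S(0)$ holds as $W(0) = 0$, giving existence.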
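The closed-form solution can be compared with a direct discretisation of $dS = S(\mu\,dt + \sigma\,dW)$. The sketch below is illustrative only: the Euler-Maruyama scheme, the parameter values and the function name `simulate_gbm` are assumptions of this example and not part of the notes. Both values are driven by the same simulated Brownian increments.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, t_end, n_steps, seed=0):
    """Return (exact solution, Euler-Maruyama value) at t_end on one Brownian path."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    sqrt_dt = math.sqrt(dt)
    w = 0.0            # Brownian motion W(t), built from its increments
    s_euler = s0       # Euler-Maruyama approximation of S(t)
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sqrt_dt)
        s_euler += s_euler * (mu * dt + sigma * dw)
        w += dw
    s_exact = s0 * math.exp((mu - 0.5 * sigma ** 2) * t_end + sigma * w)
    return s_exact, s_euler


s_exact, s_euler = simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, t_end=1.0, n_steps=20000)
print(s_exact, s_euler)
```

On a fine grid the two values should agree closely, which is a numerical reflection of the Itô correction term $-\frac{1}{2}\sigma^2 t$ in the exponent.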
4.1.4 Girsanov’s Theorem
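Consider first independent $N(0, 1)$ random variables $Z_1, \dots, Z_n$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Given a vector $\gamma = (\gamma_1, \dots, \gamma_n)$, consider a new probability measure $\tilde{\mathbb{P}}$ on $(\Omega, \mathcal{F})$ defined by
$$\tilde{\mathbb{P}}(d\omega) = \exp\left\{\sum_{i=1}^n \gamma_i Z_i(\omega) - \frac{1}{2}\sum_{i=1}^n \gamma_i^2\right\}\mathbb{P}(d\omega).$$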
As $\exp\{\cdot\} > 0$ and integrates to 1, as $\int \exp\{\gamma_i Z_i\}\,d\mathbb{P} = \exp\{\frac{1}{2}\gamma_i^2\}$, this is a probability measure. It is also equivalent to $\mathbb{P}$ (has the same null sets), again as the exponential term is positive. Also
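$$\begin{aligned}
\tilde{\mathbb{P}}(Z_i \in dz_i,\ i = 1, \dots, n)
&= \exp\left\{\sum_{i=1}^n \gamma_i Z_i - \frac{1}{2}\sum_{i=1}^n \gamma_i^2\right\}\mathbb{P}(Z_i \in dz_i,\ i = 1, \dots, n) \\
&= (2\pi)^{-n/2}\exp\left\{\sum_{i=1}^n \gamma_i z_i - \frac{1}{2}\sum_{i=1}^n \gamma_i^2 - \frac{1}{2}\sum_{i=1}^n z_i^2\right\}\prod_{i=1}^n dz_i \\
&= (2\pi)^{-n/2}\exp\left\{-\frac{1}{2}\sum_{i=1}^n (z_i - \gamma_i)^2\right\}dz_1 \dots dz_n.
\end{aligned}$$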
This says that if the $Z_i$ are independent $N(0, 1)$ under $\mathbb{P}$, they are independent $N(\gamma_i, 1)$ under $\tilde{\mathbb{P}}$. Thus the effect of the change of measure $\mathbb{P} \to \tilde{\mathbb{P}}$, from the original measure $\mathbb{P}$ to the equivalent measure $\tilde{\mathbb{P}}$, is to change the mean, from $0 = (0, \dots, 0)$ to $\gamma = (\gamma_1, \dots, \gamma_n)$.
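The finite-dimensional change of measure can be illustrated by Monte Carlo in one dimension. In the sketch below (an illustration only; the function name `tilted_moments`, the sample size and the value of gamma are assumptions of this example), we draw $Z \sim N(0,1)$ under the original measure and weight each draw by the density $\exp\{\gamma Z - \frac{1}{2}\gamma^2\}$. The weighted averages estimate the total mass, the mean and the variance under the new measure.

```python
import math
import random


def tilted_moments(gamma, n_samples=200000, seed=0):
    """Estimate mass, mean and variance of Z under the measure with density exp(gamma*Z - gamma^2/2)."""
    rng = random.Random(seed)
    total_weight = 0.0
    weighted_z = 0.0
    weighted_sq = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)                           # Z ~ N(0, 1) under P
        weight = math.exp(gamma * z - 0.5 * gamma * gamma)  # Radon-Nikodym density
        total_weight += weight
        weighted_z += weight * z
        weighted_sq += weight * (z - gamma) ** 2
    n = float(n_samples)
    return total_weight / n, weighted_z / n, weighted_sq / n


mass, mean, variance = tilted_moments(0.5)
print(mass, mean, variance)
```

Up to sampling error, the three estimates should be close to 1, gamma and 1 respectively: the new measure has total mass one, and under it Z is N(gamma, 1).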
This result extends to infinitely many dimensions, i.e., from random vectors to stochastic processes, indeed with random rather than deterministic means. Let $W = (W_1, \dots, W_d)$ be a $d$-dimensional Brownian motion defined on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ with the filtration $\mathbb{F}$ satisfying the usual conditions. Let $(\gamma(t) : 0 \le t \le T)$ be a measurable, adapted $d$-dimensional process with $\int_0^T \gamma_i(t)^2\,dt < \infty$ a.s., $i = 1, \dots, d$, and define the process $(L(t) : 0 \le t \le T)$ by
$$L(t) = \exp\left\{-\int_0^t \gamma(s)'\,dW(s) - \frac{1}{2}\int_0^t \|\gamma(s)\|^2\,ds\right\}. \tag{4.4}$$
Then $L$ is continuous, and, being the stochastic exponential of $-\int_0^t \gamma(s)'\,dW(s)$, is a local martingale. Given sufficient integrability on the process $\gamma$, $L$ will in fact be a (continuous) martingale. For this, Novikov's condition suffices:
$$\mathbb{E}\left[\exp\left\{\frac{1}{2}\int_0^T \|\gamma(s)\|^2\,ds\right\}\right] < \infty. \tag{4.5}$$
We are now in the position to state a version of Girsanov's theorem, which will be one of our main tools in studying continuous-time financial market models.
Theorem 4.1.4 (Girsanov). Let $\gamma$ be as above and satisfy Novikov's condition; let $L$ be the corresponding continuous martingale. Define the processes $\tilde{W}_i$, $i = 1, \dots, d$ by
$$\tilde{W}_i(t) := W_i(t) + \int_0^t \gamma_i(s)\,ds, \qquad (0 \le t \le T), \quad i = 1, \dots, d.$$
Then under the equivalent probability measure $\tilde{\mathbb{P}}$ (defined on $(\Omega, \mathcal{F}_T)$) with Radon-Nikodým derivative
$$\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}} = L(T),$$
the process $\tilde{W} = (\tilde{W}_1, \dots, \tilde{W}_d)$ is $d$-dimensional Brownian motion.
In particular, for $\gamma(t)$ constant $(= \gamma)$, change of measure by introducing the Radon-Nikodým derivative $\exp\left\{-\gamma W(t) - \frac{1}{2}\gamma^2 t\right\}$ corresponds to a change of drift from $c$ to $c - \gamma$. If $\mathbb{F} = (\mathcal{F}_t)$ is the Brownian filtration (basically $\mathcal{F}_t = \sigma(W(s), 0 \le s \le t)$ slightly enlarged to satisfy the usual conditions) any pair of equivalent probability measures $\mathbb{Q} \sim \mathbb{P}$ on $\mathcal{F} = \mathcal{F}_T$ is a Girsanov pair, i.e.
$$\left.\frac{d\mathbb{Q}}{d\mathbb{P}}\right|_{\mathcal{F}_t} = L(t)$$
with $L$ defined as above. Girsanov's theorem (or the Cameron-Martin-Girsanov theorem) is formulated in varying degrees of generality, discussed and proved, e.g. in (Karatzas and Shreve 1991), §3.5, (Protter 2004), III.6, (Revuz and Yor 1991), VIII, (Dothan 1990), §5.4 (discrete time), §11.6 (continuous time).
4.2 Financial Market Models
4.2.1 The Financial Market Model
We start with a general model of a frictionless (compare Chapter 1) security market where investors are allowed to trade continuously up to some fixed finite planning horizon $T$. Uncertainty in the financial market is modelled by a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a filtration $\mathbb{F} = (\mathcal{F}_t)_{0 \le t \le T}$ satisfying the usual conditions of right-continuity and completeness. We assume that the $\sigma$-field $\mathcal{F}_0$ is trivial, i.e. for every $A \in \mathcal{F}_0$ either $\mathbb{P}(A) = 0$ or $\mathbb{P}(A) = 1$, and that $\mathcal{F}_T = \mathcal{F}$.
There are $d + 1$ primary traded assets, whose price processes are given by stochastic processes $S_0, \dots, S_d$. We assume that the processes $S_0, \dots, S_d$ represent the prices of some traded assets (stocks, bonds, or options).
We have not emphasised so far that there was an implicit numéraire behind the prices $S_0, \dots, S_d$; it is the numéraire relevant for domestic transactions at time $t$. The formal definition of a numéraire is very much as in the discrete setting.
Definition 4.2.1. A numéraire is a price process $X(t)$ almost surely strictly positive for each $t \in [0, T]$.
We assume now that $S_0(t)$ is a non-dividend paying asset, which is (almost surely) strictly positive and use $S_0$ as numéraire. 'Historically' (see (Harrison and Pliska 1981)) the money market account $B(t)$, given by $B(t) = e^{r(t)}$ with a positive deterministic process $r(t)$ and $r(0) = 0$, was used as a numéraire, and the reader may think of $S_0(t)$ as being $B(t)$.
Our principal task will be the pricing and hedging of contingent claims, which we model as $\mathcal{F}_T$-measurable random variables. This implies that the contingent claims specify a stochastic cash-flow at time $T$ and that they may depend on the whole path of the underlying in $[0, T]$, because $\mathcal{F}_T$ contains all that information. We will often have to impose further integrability conditions on the contingent claims under consideration. The fundamental concept in (arbitrage) pricing and hedging contingent claims is the interplay of self-financing replicating portfolios and risk-neutral probabilities. Although the current setting is on a much higher level of sophistication, the key ideas remain the same.
We call an $\mathbb{R}^{d+1}$-valued predictable process
$$\varphi(t) = (\varphi_0(t), \dots, \varphi_d(t)), \qquad t \in [0, T],$$
with $\int_0^T \mathbb{E}(\varphi_0(t))\,dt < \infty$, $\sum_{i=0}^d \int_0^T \mathbb{E}(\varphi_i^2(t))\,dt < \infty$ a trading strategy (or dynamic portfolio process). Here $\varphi_i(t)$ denotes the number of shares of asset $i$ held in the portfolio at time $t$, to be determined on the basis of information available before time $t$; i.e. the investor selects his time $t$ portfolio after observing the prices $S(t-)$. The components $\varphi_i(t)$ may assume negative as well as positive values, reflecting the fact that we allow short sales and assume that the assets are perfectly divisible.
Definition 4.2.2. (i) The value of the portfolio $\varphi$ at time $t$ is given by the scalar product
$$V_\varphi(t) := \varphi(t) \cdot S(t) = \sum_{i=0}^d \varphi_i(t) S_i(t), \qquad t \in [0, T].$$
The process $V_\varphi(t)$ is called the value process, or wealth process, of the trading strategy $\varphi$.
(ii) The gains process $G_\varphi(t)$ is defined by
$$G_\varphi(t) := \int_0^t \varphi(u)\,dS(u) = \sum_{i=0}^d \int_0^t \varphi_i(u)\,dS_i(u).$$
(iii) A trading strategy $\varphi$ is called self-financing if the wealth process $V_\varphi(t)$ satisfies
$$V_\varphi(t) = V_\varphi(0) + G_\varphi(t) \quad \text{for all } t \in [0, T].$$
Remark 4.2.1. (i) The financial implications of the above equations are that all changes in the wealth of the portfolio are due to capital gains, as opposed to withdrawals of cash or injections of new funds.
(ii) The definition of a trading strategy includes regularity assumptions in order to ensure the existence of stochastic integrals.
Using the special numéraire $S_0(t)$ we consider the discounted price process
$$\tilde{S}(t) := \frac{S(t)}{S_0(t)} = (1, \tilde{S}_1(t), \dots, \tilde{S}_d(t))$$
with $\tilde{S}_i(t) = S_i(t)/S_0(t)$, $i = 1, 2, \dots, d$. Furthermore, the discounted wealth process $\tilde{V}_\varphi(t)$ is given by
$$\tilde{V}_\varphi(t) := \frac{V_\varphi(t)}{S_0(t)} = \varphi_0(t) + \sum_{i=1}^d \varphi_i(t)\tilde{S}_i(t)$$
and the discounted gains process $\tilde{G}_\varphi(t)$ is
$$\tilde{G}_\varphi(t) := \sum_{i=1}^d \int_0^t \varphi_i(u)\,d\tilde{S}_i(u).$$
Observe that $\tilde{G}_\varphi(t)$ does not depend on the numéraire component $\varphi_0$.
It is convenient to reformulate the self-financing condition in terms of the discounted processes:
Proposition 4.2.1. Let $\varphi$ be a trading strategy. Then $\varphi$ is self-financing if and only if
$$\tilde{V}_\varphi(t) = \tilde{V}_\varphi(0) + \tilde{G}_\varphi(t).$$
Of course, $V_\varphi(t) \ge 0$ if and only if $\tilde{V}_\varphi(t) \ge 0$.
The proof follows by the numéraire invariance theorem using $S_0$ as numéraire.
Remark 4.2.2. The above result shows that a self-financing strategy is completely determined by its initial value and the components $\varphi_1, \dots, \varphi_d$. In other words, any set of predictable processes $\varphi_1, \dots, \varphi_d$ such that the stochastic integrals $\int \varphi_i\,d\tilde{S}_i$, $i = 1, \dots, d$ exist can be uniquely extended to a self-financing strategy $\varphi$ with specified initial value $\tilde{V}_\varphi(0) = v$ by setting the cash holding as
$$\varphi_0(t) = v + \sum_{i=1}^d \int_0^t \varphi_i(u)\,d\tilde{S}_i(u) - \sum_{i=1}^d \varphi_i(t)\tilde{S}_i(t), \qquad t \in [0, T].$$
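A discrete-time analogue makes the construction of the cash holding concrete. The sketch below is an illustration under simplifying assumptions that are ours, not the notes': one risky asset, trading only on a finite grid, and `stock_holdings[k]` interpreted as the position carried from grid time k to k + 1. The function name `cash_holdings` is hypothetical.

```python
def cash_holdings(initial_value, stock_holdings, discounted_prices):
    """Discrete analogue of phi_0(t) = v + int phi_1 dS~ - phi_1(t) S~(t) for one risky asset."""
    gains = 0.0          # discounted gains accumulated so far
    cash = []
    for k, (phi, price) in enumerate(zip(stock_holdings, discounted_prices)):
        if k > 0:
            # gain from the position held over the previous period
            gains += stock_holdings[k - 1] * (price - discounted_prices[k - 1])
        cash.append(initial_value + gains - phi * price)
    return cash


prices = [1.0, 1.1, 0.9]        # discounted stock prices on the grid
holdings = [2.0, 1.0, 3.0]      # number of shares chosen at each grid time
cash = cash_holdings(10.0, holdings, prices)
wealth = [c + h * p for c, h, p in zip(cash, holdings, prices)]
print(cash, wealth)
```

By construction the discounted wealth changes only through gains on the stock position: each wealth increment equals the previous holding times the price change, which is the discrete form of the self-financing condition in Proposition 4.2.1.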
4.2.2 Equivalent Martingale Measures
We develop a relative pricing theory for contingent claims. Again the underlying concept is the link between the no-arbitrage condition and certain probability measures. We begin with:
Definition 4.2.3. A self-financing trading strategy $\varphi$ is called an arbitrage opportunity if the wealth process $V_\varphi$ satisfies the following set of conditions:
$$V_\varphi(0) = 0, \qquad \mathbb{P}(V_\varphi(T) \ge 0) = 1, \qquad \text{and} \qquad \mathbb{P}(V_\varphi(T) > 0) > 0.$$
Arbitrage opportunities represent the limitless creation of wealth through risk-free profit and thus should not be present in a well-functioning market.
The main tool in investigating arbitrage opportunities is the concept of equivalent martingale measures:
Definition 4.2.4. We say that a probability measure $\mathbb{Q}$ defined on $(\Omega, \mathcal{F})$ is an equivalent martingale measure if:
(i) $\mathbb{Q}$ is equivalent to $\mathbb{P}$,
(ii) the discounted price process $\tilde{S}$ is a $\mathbb{Q}$-martingale.
We denote the set of martingale measures by $\mathcal{P}$.
A useful criterion in determining whether a given equivalent measure is indeed a martingale measure is the observation that the growth rates relative to the numéraire of all given primary assets under the measure in question must coincide. For example, in the case $S_0(t) = B(t)$ we have:
Lemma 4.2.1. Assume $S_0(t) = B(t) = e^{r(t)}$, then $\mathbb{Q} \sim \mathbb{P}$ is a martingale measure if and only if every asset price process $S_i$ has price dynamics under $\mathbb{Q}$ of the form
$$dS_i(t) = r(t)S_i(t)\,dt + dM_i(t),$$
where $M_i$ is a $\mathbb{Q}$-martingale.
The proof is an application of Itô's formula.
In order to proceed we have to impose further restrictions on the set of trading strategies.
Definition 4.2.5. A self-financing trading strategy $\varphi$ is called tame (relative to the numéraire $S_0$) if
$$\tilde{V}_\varphi(t) \ge 0 \quad \text{for all } t \in [0, T].$$
We use the notation $\Phi$ for the set of tame trading strategies.
We next analyse the value process under equivalent martingale measures for such strategies.
Proposition 4.2.2. For $\varphi \in \Phi$, $\tilde{V}_\varphi(t)$ is a martingale under each $\mathbb{Q} \in \mathcal{P}$.
This observation is the key to our first central result:
Theorem 4.2.1. Assume $\mathcal{P} \ne \emptyset$. Then the market model contains no arbitrage opportunities in $\Phi$.
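Proof. For any $\varphi \in \Phi$ and under any $\mathbb{Q} \in \mathcal{P}$, $\tilde{V}_\varphi(t)$ is a martingale. That is,
$$\mathbb{E}_{\mathbb{Q}}\left[\tilde{V}_\varphi(t) \mid \mathcal{F}_u\right] = \tilde{V}_\varphi(u), \quad \text{for all } u \le t \le T.$$
For $\varphi \in \Phi$ to be an arbitrage opportunity we must have $\tilde{V}_\varphi(0) = V_\varphi(0) = 0$. Now
$$\mathbb{E}_{\mathbb{Q}}\left[\tilde{V}_\varphi(t)\right] = 0, \quad \text{for all } 0 \le t \le T.$$
Now $\varphi$ is tame, so $\tilde{V}_\varphi(t) \ge 0$, $0 \le t \le T$, with $\mathbb{E}_{\mathbb{Q}}\left[\tilde{V}_\varphi(t)\right] = 0$, $0 \le t \le T$, and in particular $\mathbb{E}_{\mathbb{Q}}\left[\tilde{V}_\varphi(T)\right] = 0$. But an arbitrage opportunity $\varphi$ also has to satisfy $\mathbb{P}(V_\varphi(T) \ge 0) = 1$, and since $\mathbb{Q} \sim \mathbb{P}$, this means $\mathbb{Q}(V_\varphi(T) \ge 0) = 1$. Both together yield (a non-negative random variable with zero expectation vanishes almost surely)
$$\mathbb{Q}(V_\varphi(T) > 0) = \mathbb{P}(V_\varphi(T) > 0) = 0,$$
and hence the result follows.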
4.2.3 Risk-neutral Pricing
We now assume that there exists an equivalent martingale measure $\mathbb{P}^*$ which implies that there are no arbitrage opportunities with respect to $\Phi$ in the financial market model. Until further notice we use $\mathbb{P}^*$ as our reference measure, and when using the term martingale we always assume that the underlying probability measure is $\mathbb{P}^*$. In particular, we restrict our attention to contingent claims $X$ such that $X/S_0(T) \in L^1(\mathcal{F}, \mathbb{P}^*)$.
We now define a further subclass of trading strategies:
Definition 4.2.6. A self-financing trading strategy $\varphi$ is called ($\mathbb{P}^*$-) admissible if the relative gains process
$$\tilde{G}_\varphi(t) = \int_0^t \varphi(u)\,d\tilde{S}(u)$$
is a ($\mathbb{P}^*$-) martingale. The class of all ($\mathbb{P}^*$-) admissible trading strategies is denoted $\Phi(\mathbb{P}^*)$.
By definition $\tilde{S}$ is a martingale, and $\tilde{G}$ is the stochastic integral with respect to $\tilde{S}$. We see that any sufficiently integrable processes $\varphi_1, \dots, \varphi_d$ give rise to $\mathbb{P}^*$-admissible trading strategies.
We can repeat the above argument to obtain
Theorem 4.2.2. The financial market model $\mathcal{M}$ contains no arbitrage opportunities in $\Phi(\mathbb{P}^*)$.
Under the assumption that no arbitrage opportunities exist, the question of pricing and hedging a contingent claim reduces to the existence of replicating self-financing trading strategies. Formally:
Definition 4.2.7. (i) A contingent claim $X$ is called attainable if there exists at least one admissible trading strategy such that
$$V_\varphi(T) = X.$$
We call such a trading strategy $\varphi$ a replicating strategy for $X$.
(ii) The financial market model $\mathcal{M}$ is said to be complete if any contingent claim is attainable.
Again we emphasise that this depends on the class of trading strategies. On the other hand, it does not depend on the numéraire: it is an easy exercise in the continuous asset-price process case to show that if a contingent claim is attainable in a given numéraire it is also attainable in any other numéraire and the replicating strategies are the same.
If a contingent claim $X$ is attainable, $X$ can be replicated by a portfolio $\varphi \in \Phi(\mathbb{P}^*)$. This means that holding the portfolio and holding the contingent claim are equivalent from a financial point of view. In the absence of arbitrage the (arbitrage) price process $\Pi_X(t)$ of the contingent claim must therefore satisfy
$$\Pi_X(t) = V_\varphi(t).$$
Of course the questions arise of what will happen if $X$ can be replicated by more than one portfolio, and what the relation of the price process to the equivalent martingale measure(s) is. The following central theorem is the key to answering these questions:
Theorem 4.2.3 (Risk-Neutral Valuation Formula). The arbitrage price process of any attainable claim is given by the risk-neutral valuation formula
$$\Pi_X(t) = S_0(t)\,\mathbb{E}_{\mathbb{P}^*}\left[\left.\frac{X}{S_0(T)}\,\right|\,\mathcal{F}_t\right]. \tag{4.6}$$
The uniqueness question is immediate from the above theorem:
Corollary 4.2.1. For any two replicating portfolios $\varphi, \psi \in \Phi(\mathbb{P}^*)$ we have
$$V_\varphi(t) = V_\psi(t).$$
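Proof of Theorem 4.2.3. Since $X$ is attainable, there exists a replicating strategy $\varphi \in \Phi(\mathbb{P}^*)$ such that $V_\varphi(T) = X$ and $\Pi_X(t) = V_\varphi(t)$. Since $\varphi \in \Phi(\mathbb{P}^*)$ the discounted value process $\tilde{V}_\varphi(t)$ is a martingale, and hence
$$\begin{aligned}
\Pi_X(t) = V_\varphi(t) &= S_0(t)\tilde{V}_\varphi(t) \\
&= S_0(t)\,\mathbb{E}_{\mathbb{P}^*}\left[\tilde{V}_\varphi(T) \mid \mathcal{F}_t\right]
= S_0(t)\,\mathbb{E}_{\mathbb{P}^*}\left[\left.\frac{V_\varphi(T)}{S_0(T)}\,\right|\,\mathcal{F}_t\right] \\
&= S_0(t)\,\mathbb{E}_{\mathbb{P}^*}\left[\left.\frac{X}{S_0(T)}\,\right|\,\mathcal{F}_t\right].
\end{aligned}$$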
4.2.4 The Black-Scholes Model
The Model
We concentrate on the classical Black-Scholes model
$$\begin{aligned}
dB(t) &= rB(t)\,dt, & B(0) &= 1, \\
dS(t) &= S(t)\,(b\,dt + \sigma\,dW(t)), & S(0) &= p \in (0, \infty),
\end{aligned}$$
with constant coefficients $b \in \mathbb{R}$, $r, \sigma \in \mathbb{R}_+$. We write as usual $\tilde{S}(t) = S(t)/B(t)$ for the discounted stock price process (with the bank account being the natural numéraire), and get from Itô's formula
$$d\tilde{S}(t) = \tilde{S}(t)\,\{(b - r)\,dt + \sigma\,dW(t)\}.$$
Equivalent Martingale Measure
Because we use the Brownian filtration any pair of equivalent probability measures $\mathbb{P} \sim \mathbb{Q}$ on $\mathcal{F}_T$ is a Girsanov pair, i.e.
$$\left.\frac{d\mathbb{Q}}{d\mathbb{P}}\right|_{\mathcal{F}_t} = L(t)$$
with
$$L(t) = \exp\left\{-\int_0^t \gamma(s)\,dW(s) - \frac{1}{2}\int_0^t \gamma(s)^2\,ds\right\},$$
and $(\gamma(t) : 0 \le t \le T)$ a measurable, adapted $d$-dimensional process with $\int_0^T \gamma(t)^2\,dt < \infty$ a.s. By Girsanov's theorem 4.1.4 we have
$$dW(t) = d\tilde{W}(t) - \gamma(t)\,dt,$$
where $\tilde{W}$ is a $\mathbb{Q}$-Wiener process. Thus the $\mathbb{Q}$-dynamics for $\tilde{S}$ are
$$d\tilde{S}(t) = \tilde{S}(t)\left\{(b - r - \sigma\gamma(t))\,dt + \sigma\,d\tilde{W}(t)\right\}.$$
Since $\tilde{S}$ has to be a martingale under $\mathbb{Q}$ we must have
$$b - r - \sigma\gamma(t) = 0, \qquad t \in [0, T],$$
and so we must choose
$$\gamma(t) \equiv \gamma = \frac{b - r}{\sigma}$$
(the 'market price of risk'). Indeed, this argument leads to a unique martingale measure, and we will make use of this fact later on. Using the product rule, we find the $\mathbb{Q}$-dynamics of $S$ as
$$dS(t) = S(t)\left\{r\,dt + \sigma\,d\tilde{W}(t)\right\}.$$
We see that the appreciation rate $b$ is replaced by the interest rate $r$, hence the terminology risk-neutral (or yield-equating) martingale measure.
We also know that we have a unique martingale measure $\mathbb{P}^*$ (recall $\gamma = (b - r)/\sigma$ in Girsanov's transformation).
Pricing and Hedging Contingent Claims
Recall that a contingent claim $X$ is a $\mathcal{F}_T$-measurable random variable such that $X/B(T) \in L^1(\Omega, \mathcal{F}_T, \mathbb{P}^*)$. (We write $\mathbb{E}^*$ for $\mathbb{E}_{\mathbb{P}^*}$ in this section.) By the risk-neutral valuation principle the price of a contingent claim $X$ is given by
$$\Pi_X(t) = e^{-r(T - t)}\,\mathbb{E}^*[X \mid \mathcal{F}_t],$$
with $\mathbb{E}^*$ given via the Girsanov density
$$L(t) = \exp\left\{-\left(\frac{b - r}{\sigma}\right)W(t) - \frac{1}{2}\left(\frac{b - r}{\sigma}\right)^2 t\right\}.$$
Now consider a European call with strike $K$ and maturity $T$ on the stock $S$ (so $\Phi(T) = (S(T) - K)^+$). We can evaluate the above expected value (which is easier than solving the Black-Scholes partial differential equation) and obtain:
Proposition 4.2.3 (Black-Scholes Formula). The Black-Scholes price process of a European call is given by
$$C(t) = S(t)N(d_1(S(t), T - t)) - Ke^{-r(T - t)}N(d_2(S(t), T - t)). \tag{4.7}$$
The functions $d_1(s, t)$ and $d_2(s, t)$ are given by
$$d_1(s, t) = \frac{\log(s/K) + \left(r + \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}},$$
$$d_2(s, t) = d_1(s, t) - \sigma\sqrt{t} = \frac{\log(s/K) + \left(r - \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}}.$$
Observe that we have already deduced this formula as a limit of a discrete-time setting.
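Formula (4.7) is straightforward to evaluate. The following is a minimal sketch using only the Python standard library; the function names `norm_cdf` and `call_price` and the sample parameter values are assumptions of this example, and the second argument is the time to maturity $T - t$.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function N(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price(s, time_to_maturity, strike, r, sigma):
    """Black-Scholes price (4.7) of a European call; s is the current stock price."""
    if time_to_maturity <= 0.0:
        return max(s - strike, 0.0)          # at expiry the price is the payoff
    root = sigma * math.sqrt(time_to_maturity)
    d1 = (math.log(s / strike) + (r + 0.5 * sigma ** 2) * time_to_maturity) / root
    d2 = d1 - root
    return s * norm_cdf(d1) - strike * math.exp(-r * time_to_maturity) * norm_cdf(d2)


price = call_price(s=100.0, time_to_maturity=1.0, strike=100.0, r=0.05, sigma=0.2)
print(round(price, 4))
```

For these illustrative inputs the price is about 10.45, which respects the no-arbitrage lower bound $S - Ke^{-rT}$.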
To obtain a replicating portfolio we start in the discounted setting. We have from the risk-neutral valuation principle
$$M(t) = \exp\{-rT\}\,\mathbb{E}^*[\Phi(S(T)) \mid \mathcal{F}_t].$$
Now we use Itô's lemma to find the dynamics of the $\mathbb{P}^*$-martingale $M(t) = G(t, S(t))$:
$$dM(t) = \sigma S(t) G_s(t, S(t))\,d\tilde{W}(t).$$
Using this representation, we get for the stock component of the replicating portfolio
$$h(t) = \sigma S(t) G_s(t, S(t)).$$
Now for the discounted assets the stock component is
$$\varphi_1(t) = G_s(t, S(t))B(t),$$
and using the self-financing condition the cash component is
$$\varphi_0(t) = G(t, S(t)) - G_s(t, S(t))S(t).$$
To transfer this portfolio to undiscounted values we multiply it by the discount factor, i.e. $F(t, S(t)) = B(t)G(t, S(t))$, and get:
Proposition 4.2.4. The replicating strategy in the classical Black-Scholes model is given by
$$\varphi_0 = \frac{F(t, S(t)) - F_s(t, S(t))S(t)}{B(t)}, \qquad \varphi_1 = F_s(t, S(t)).$$
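The replicating strategy can be illustrated by simulation. The sketch below is not the continuous-time strategy itself but a discretely rebalanced approximation, so a small hedging error remains. Assumptions of this example: a European call, for which $F_s = N(d_1)$, illustrative parameter values, and hypothetical function names `call_price_and_delta` and `delta_hedge_error`. The stock is simulated under the original measure with appreciation rate b; the variable `cash` plays the role of $\varphi_0(t)B(t)$.

```python
import math
import random


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price_and_delta(s, tau, strike, r, sigma):
    """Black-Scholes call price and delta N(d1) for time to maturity tau > 0."""
    root = sigma * math.sqrt(tau)
    d1 = (math.log(s / strike) + (r + 0.5 * sigma ** 2) * tau) / root
    price = s * norm_cdf(d1) - strike * math.exp(-r * tau) * norm_cdf(d1 - root)
    return price, norm_cdf(d1)


def delta_hedge_error(s0=100.0, strike=100.0, r=0.05, b=0.1, sigma=0.2,
                      maturity=1.0, n_steps=2000, seed=0):
    """Terminal value of the rebalanced hedge portfolio minus the call payoff."""
    rng = random.Random(seed)
    dt = maturity / n_steps
    s = s0
    price, delta = call_price_and_delta(s, maturity, strike, r, sigma)
    cash = price - delta * s                  # bank account position phi_0 * B
    for k in range(1, n_steps + 1):
        dw = rng.gauss(0.0, math.sqrt(dt))
        s *= math.exp((b - 0.5 * sigma ** 2) * dt + sigma * dw)
        cash *= math.exp(r * dt)              # interest on the bank account
        if k < n_steps:                       # rebalance, financed from cash
            _, new_delta = call_price_and_delta(s, maturity - k * dt, strike, r, sigma)
            cash -= (new_delta - delta) * s
            delta = new_delta
    return cash + delta * s - max(s - strike, 0.0)


error = delta_hedge_error()
print(error)
```

The hedge starts with exactly the Black-Scholes price and is never topped up, so a terminal error that is small relative to the option price illustrates that the strategy of Proposition 4.2.4 replicates the claim; the error shrinks as the rebalancing grid is refined.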
We can also use an arbitrage approach to derive the Black-Scholes formula. For this consider a self-financing portfolio which has dynamics (writing $\mu$ for the appreciation rate of the stock)
$$dV_\varphi(t) = \varphi_0(t)\,dB(t) + \varphi_1(t)\,dS(t) = (\varphi_0(t)rB(t) + \varphi_1(t)\mu S(t))\,dt + \varphi_1(t)\sigma S(t)\,dW(t).$$
Assume that the portfolio value can be written as
$$V_\varphi(t) = V(t) = f(t, S(t))$$
for a suitable function $f \in C^{1,2}$. Then by Itô's formula
$$dV(t) = \left(f_t(t, S_t) + f_x(t, S_t)S_t\mu + \frac{1}{2}S_t^2\sigma^2 f_{xx}(t, S_t)\right)dt + f_x(t, S_t)\sigma S_t\,dW_t.$$
Now we match the coefficients and find
$$\varphi_1(t) = f_x(t, S_t)$$
and
$$\varphi_0(t) = \frac{1}{rB(t)}\left(f_t(t, S_t) + \frac{1}{2}\sigma^2 S_t^2 f_{xx}(t, S_t)\right).$$
Then looking at the total portfolio value $f = \varphi_0 B + \varphi_1 S$ we find that $f(t, x)$ must satisfy the following PDE
$$f_t(t, x) + rxf_x(t, x) + \frac{1}{2}\sigma^2 x^2 f_{xx}(t, x) - rf(t, x) = 0 \tag{4.8}$$
and terminal condition $f(T, x) = (x - K)^+$.
In their original paper Black and Scholes (1973), Black and Scholes used an arbitrage pricing approach (rather than our risk-neutral valuation approach) to deduce the price of a European call as the solution of a partial differential equation (we call this the PDE approach). The idea is as follows: start by assuming that the option price $C(t)$ is given by $C(t) = f(t, S(t))$ for some sufficiently smooth function $f : [0, T] \times \mathbb{R}_+ \to \mathbb{R}$. By Itô's formula (Theorem 4.1.3) we find for the dynamics of the option price process (observe that we work under $\mathbb{P}$ so $dS = S(b\,dt + \sigma\,dW)$)
$$dC = \left(f_t(t, S) + f_s(t, S)Sb + \frac{1}{2}f_{ss}(t, S)S^2\sigma^2\right)dt + f_s S\sigma\,dW. \tag{4.9}$$
Consider a portfolio $\psi$ consisting of a short position in $\psi_1(t) = f_s(t, S(t))$ stocks and a long position in $\psi_2(t) = 1$ call and assume the portfolio is self-financing. Then its value process is
$$V_\psi(t) = -\psi_1(t)S(t) + C(t),$$
and by the self-financing condition we have (to ease the notation we omit the arguments)
$$\begin{aligned}
dV_\psi &= -\psi_1\,dS + dC \\
&= -f_s\,(Sb\,dt + S\sigma\,dW) + \left(f_t + f_s Sb + \frac{1}{2}f_{ss}S^2\sigma^2\right)dt + f_s S\sigma\,dW \\
&= \left(f_t + \frac{1}{2}f_{ss}S^2\sigma^2\right)dt.
\end{aligned}$$
So the dynamics of the value process of the portfolio do not have any exposure to the driving Brownian motion, and its appreciation rate in an arbitrage-free world must therefore equal the risk-free rate, i.e.
$$dV_\psi(t) = rV_\psi(t)\,dt = (-rf_s S + rC)\,dt.$$
Comparing the coefficients and using $C(t) = f(t, S(t))$, we must have
$$-rSf_s + rf = f_t + \frac{1}{2}\sigma^2 S^2 f_{ss}.$$
This leads again to the Black-Scholes partial differential equation (4.8) for $f$, i.e.
$$f_t + rsf_s + \frac{1}{2}\sigma^2 s^2 f_{ss} - rf = 0.$$
Since $C(T) = (S(T) - K)^+$ we need to impose the terminal condition $f(T, s) = (s - K)^+$ for all $s \in \mathbb{R}_+$.
Note.
One point in the justification of the above argument is missing: we have to show that the trading strategy short $\psi_1$ stocks and long one call is self-financing. In fact, this is not true, since $\psi_1 = \psi_1(t, S(t))$ is dependent on the stock price process. Formally, for the self-financing condition to be true we must have
$$dV_\psi(t) = -d(\psi_1(t)S(t)) + dC(t) = -\psi_1(t)\,dS(t) + dC(t).$$
Now $\psi_1(t) = \psi_1(t, S(t))$ depends on the stock price and so we have
$$d(\psi_1(t, S(t))S(t)) = \psi_1(t)\,dS(t) + S(t)\,d\psi_1(t, S(t)) + d\langle\psi_1, S\rangle(t).$$
We see that the portfolio $\psi$ is self-financing, if
$$S(t)\,d\psi_1(t, S(t)) + d\langle\psi_1, S\rangle(t) = 0.$$
It is an exercise in Itô calculus to show that this is not the case.
4.2.5 The Greeks
We will now analyse the impact of the underlying parameters in the standard Black-Scholes model on the prices of call and put options. The Black-Scholes option values depend on the (current) stock price, the volatility, the time to maturity, the interest rate and the strike price. The sensitivities of the option price with respect to the first four parameters are called the Greeks and are widely used for hedging purposes. We can determine the impact of these parameters by taking partial derivatives. Recall the Black-Scholes formula for a European call (4.7):
$$C(0) = C(S, T, K, r, \sigma) = SN(d_1(S, T)) - Ke^{-rT}N(d_2(S, T)),$$
with the functions $d_1(s, t)$ and $d_2(s, t)$ given by
$$d_1(s, t) = \frac{\log(s/K) + \left(r + \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}},$$
$$d_2(s, t) = d_1(s, t) - \sigma\sqrt{t} = \frac{\log(s/K) + \left(r - \frac{\sigma^2}{2}\right)t}{\sigma\sqrt{t}}.$$
One obtains
$$\begin{aligned}
\Delta &:= \frac{\partial C}{\partial S} = N(d_1) > 0, \\
\mathcal{V} &:= \frac{\partial C}{\partial \sigma} = S\sqrt{T}\,n(d_1) > 0, \\
\Theta &:= \frac{\partial C}{\partial T} = \frac{S\sigma}{2\sqrt{T}}\,n(d_1) + Kre^{-rT}N(d_2) > 0, \\
\rho &:= \frac{\partial C}{\partial r} = TKe^{-rT}N(d_2) > 0, \\
\Gamma &:= \frac{\partial^2 C}{\partial S^2} = \frac{n(d_1)}{S\sigma\sqrt{T}} > 0.
\end{aligned}$$
(As usual $N$ is the cumulative normal distribution function and $n$ is its density.) From the definitions it is clear that $\Delta$ (delta) measures the change in the value of the option compared with the change in the value of the underlying asset, $\mathcal{V}$ (vega) measures the change of the option compared with the change in the volatility of the underlying, and similar statements hold for $\Theta$ (theta) and $\rho$ (rho) (observe that these derivatives are in line with our arbitrage-based considerations in §1.3). Furthermore, $\Delta$ gives the number of shares in the replication portfolio for a call option (see Proposition 4.2.4), so $\Gamma$ measures the sensitivity of our portfolio to the change in the stock price.
The Black-Scholes partial differential equation (4.8) can be used to obtain the relation between the Greeks, i.e. (observe that $\Theta$ is the derivative of $C$, the price of a European call, with respect to the time to expiry $T - t$, while in the Black-Scholes PDE the partial derivative with respect to the current time $t$ appears)
$$rC = \frac{1}{2}s^2\sigma^2\Gamma + rs\Delta - \Theta.$$
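The closed-form Greeks and the relation between them can be checked numerically. The sketch below is illustrative: the function name `call_greeks`, the dictionary layout and the parameter values are assumptions of this example. Theta is taken with respect to the time to expiry, as in the text.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_greeks(s, maturity, strike, r, sigma):
    """Black-Scholes call price and Greeks; theta is d(price)/d(time to expiry)."""
    root = sigma * math.sqrt(maturity)
    d1 = (math.log(s / strike) + (r + 0.5 * sigma ** 2) * maturity) / root
    d2 = d1 - root
    discount = math.exp(-r * maturity)
    return {
        "price": s * norm_cdf(d1) - strike * discount * norm_cdf(d2),
        "delta": norm_cdf(d1),
        "vega": s * math.sqrt(maturity) * norm_pdf(d1),
        "theta": s * sigma * norm_pdf(d1) / (2.0 * math.sqrt(maturity))
                 + strike * r * discount * norm_cdf(d2),
        "rho": maturity * strike * discount * norm_cdf(d2),
        "gamma": norm_pdf(d1) / (s * root),
    }


s, maturity, strike, r, sigma = 100.0, 1.0, 100.0, 0.05, 0.2
g = call_greeks(s, maturity, strike, r, sigma)
lhs = r * g["price"]
rhs = 0.5 * s ** 2 * sigma ** 2 * g["gamma"] + r * s * g["delta"] - g["theta"]
print(lhs, rhs)
```

The two printed numbers agree up to rounding error, as the relation holds exactly for the closed-form expressions. A finite-difference quotient of the price in s gives an independent check of delta.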
Let us now compute the dynamics of the call option's price $C(t)$ under the risk-neutral martingale measure $\mathbb{P}^*$. Using formula (4.9) we find
$$dC(t) = rC(t)\,dt + \sigma N(d_1(S(t), T - t))S(t)\,d\tilde{W}(t).$$
Defining the elasticity coefficient of the option's price as
$$\eta_c(t) = \frac{\Delta(S(t), T - t)S(t)}{C(t)} = \frac{N(d_1(S(t), T - t))S(t)}{C(t)}$$
we can rewrite the dynamics as
$$dC(t) = rC(t)\,dt + \sigma\eta_c(t)C(t)\,d\tilde{W}(t).$$
So, as expected in the risk-neutral world, the appreciation rate of the call option equals the risk-free rate $r$. The volatility coefficient is $\sigma\eta_c$, and hence stochastic. It is precisely this feature that causes difficulties when assessing the impact of options in a portfolio.
4.2.6 Barrier Options
The question of whether or not a particular stock will attain a particular level within a specified period has long been an important one for risk managers. From at least 1967 – predating both CBOE and Black-Scholes in 1973 – practitioners have sought to reduce their exposure to specific risks of this kind by buying options designed with such barrier-crossing events in mind. As usual, the motivation is that buying specific options – that is, taking out specific insurance – is a cheaper way of covering oneself against a specific danger than buying a more general one.
One-barrier options specify a stock-price level, $H$ say, such that the option pays ('knocks in') or not ('knocks out') according to whether or not level $H$ is attained, from below ('up') or above ('down'). There are thus four possibilities: 'up and in', 'up and out', 'down and in' and 'down and out'. Since barrier options are path-dependent (they involve the behaviour of the path, rather than just the current price or price at expiry), they may be classified as exotic; alternatively, the four basic one-barrier types above may be regarded as 'vanilla barrier' options, with their more complicated variants, described below, as 'exotic barrier' options. Note that holding both a knock-in option and the corresponding knock-out is equivalent to the corresponding vanilla option with the barrier removed. The sum of the prices of the knock-in and the knock-out is thus the price of the vanilla – again showing the attractiveness of barrier options as being cheaper than their vanilla counterparts.
A barrier option is often designed to pay a rebate – a sum specified in advance – to compensate the holder if the option is rendered otherwise worthless by hitting/not hitting the barrier. We restrict attention to zero rebate here for simplicity.
Consider, to be specific, a down-and-out call option with strike $K$ and barrier $H$ (the other possibilities may be handled similarly). The payoff is (unless otherwise stated min and max are over $[0, T]$)
$$(S(T) - K)^+\,\mathbf{1}_{\{\min S(\cdot) \ge H\}} = (S(T) - K)\,\mathbf{1}_{\{S(T) \ge K,\ \min S(\cdot) \ge H\}},$$
so by risk-neutral pricing the value of the option is
$$DOC_{K,H} := \mathbb{E}\left[e^{-rT}(S(T) - K)\,\mathbf{1}_{\{S(T) \ge K,\ \min S(\cdot) \ge H\}}\right],$$
where $S$ is geometric Brownian motion, $S(t) = p_0\exp\{(\mu - \frac{1}{2}\sigma^2)t + \sigma W(t)\}$. Write $c := (\mu - \frac{1}{2}\sigma^2)/\sigma$; then $\min S(\cdot) \ge H$ iff $\min(ct + W(t)) \ge \sigma^{-1}\log(H/p_0)$. Writing $X$ for $X(t) := ct + W(t)$ (drifting Brownian motion with drift $c$) and $m$, $M$ for its minimum and maximum processes
$$m(t) := \min\{X(s) : s \in [0, t]\}, \qquad M(t) := \max\{X(s) : s \in [0, t]\},$$
the payoff function involves the bivariate process $(X, m)$, and the option price involves the joint law of this process.
Consider first the case $c = 0$: we require the joint law of standard Brownian motion and its maximum or minimum, $(W, M)$ or $(W, m)$. Taking $(W, M)$ for definiteness, we start the Brownian motion $W$ at the origin at time zero, choose a level $b > 0$, and run the process until the first-passage time (see Exercise 5.2)
$$\tau(b) := \inf\{t \ge 0 : W(t) \ge b\}$$
at which the level $b$ is first attained. This is a stopping time, and we may use the strong Markov property for $W$ at time $\tau(b)$. The process now begins afresh at level $b$, and by symmetry the probabilistic properties of its further evolution are invariant under reflection in the level $b$ (thought of as a mirror). This reflection principle leads to the joint density of $(W(t), M(t))$ as
$$\mathbb{P}_0(W(t) \in dx,\ M(t) \in dy) = \frac{2(2y - x)}{\sqrt{2\pi t^3}}\exp\left\{-\frac{1}{2}(2y - x)^2/t\right\} \qquad (y \ge 0,\ x \le y),$$
a formula due to Lévy. (Lévy also obtained the identity in law of the bivariate processes $(M(t) - W(t), M(t))$ and $(|W(t)|, L(t))$, where $L$ is the local time process of $W$ at zero: see e.g. Revuz and Yor (1991), VI.2). The idea behind the reflection principle goes back to work of Désiré André in 1887, and indeed further, to the method of images of Lord Kelvin (1824-1907), then Sir William Thomson, of 1848 on electrostatics. For background on this, see any good book on electromagnetism, e.g. Jeans (1925), Chapter VIII.
Lévy's formula for the joint density of $(W(t), M(t))$ may be extended to the case of general drift $c$ by the usual method for changing drift, Girsanov's theorem. The general result is
$$\mathbb{P}_0(X(t) \in dx,\ M(t) \in dy) = \frac{2(2y - x)}{\sqrt{2\pi t^3}}\exp\left\{-\frac{(2y - x)^2}{2t} + cx - \frac{1}{2}c^2 t\right\} \qquad (y \ge 0,\ x \le y).$$
See e.g. Rogers and Williams (1994), I, (13.10), or Harrison (1985), §1.8. As an alternative to the probabilistic approach above, a second approach to this formula makes explicit use of Kelvin's language – mirrors, sources, sinks; see e.g. Cox and Miller (1972), §5.7.
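A quick numerical sanity check of the joint density is that it integrates to one over the region where the maximum is non-negative and dominates the terminal value (all x up to y, including negative x). The sketch below uses a plain midpoint rule; the function names `joint_density` and `total_mass`, the truncation of the integration range and the grid size are assumptions of this example.

```python
import math


def joint_density(x, y, t, c=0.0):
    """Density of (X(t), M(t)) at (x, y) for X(s) = c*s + W(s); zero unless y >= max(x, 0)."""
    if y < 0.0 or x > y:
        return 0.0
    u = 2.0 * y - x
    return (2.0 * u / math.sqrt(2.0 * math.pi * t ** 3)
            * math.exp(-u * u / (2.0 * t) + c * x - 0.5 * c * c * t))


def total_mass(t, c=0.0, n=300):
    """Midpoint-rule integral of the joint density over 0 <= y <= span, -span <= x <= y."""
    span = 8.0 * math.sqrt(t) + abs(c) * t     # truncation; the tails beyond are negligible
    hy = span / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * hy
        hx = (y + span) / n                    # inner grid on [-span, y]
        inner = sum(joint_density(-span + (j + 0.5) * hx, y, t, c) for j in range(n))
        total += inner * hx * hy
    return total


print(total_mass(1.0), total_mass(1.0, c=0.5))
```

Both values should be very close to 1, for zero drift and for drift c = 0.5, which is consistent with the Girsanov factor $\exp\{cx - \frac{1}{2}c^2 t\}$ turning one probability density into another.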
Given such an explicit formula for the joint density of (X(t), M(t)), or equivalently (X(t), m(t)),
we can calculate the option price by integration. The factor S(T) − K, or S − K, gives rise to
two terms, in S and K, while the integrals, involving relatives of the normal density function n,
may be obtained explicitly in terms of the normal distribution function N; both features are familiar
from the Black-Scholes formula. Indeed, this resemblance makes it convenient to decompose the
price $DOC_{K,H}$ of the down-and-out call into the (Black-Scholes) price of the corresponding vanilla
call, $C_K$ say, and the knockout discount, $KOD_{K,H}$ say, by which the knockout barrier at H lowers
the price:
$$DOC_{K,H} = C_K - KOD_{K,H}.$$
The details of the integration, which are tedious, are omitted; the result is, writing $\lambda := r - \tfrac{1}{2}\sigma^2$,
$$KOD_{K,H} = p_0\,(H/p_0)^{2+2\lambda/\sigma^2}\,N(c_1) - K e^{-rT}\,(H/p_0)^{2\lambda/\sigma^2}\,N(c_2),$$
where $p_0$ is the initial stock price as usual and $c_1, c_2$ are functions of the price $p = p_0$ and time
$t = T$ given by
$$c_{1,2}(p,t) = \frac{\log\big(H^2/(pK)\big) + (r \pm \tfrac{1}{2}\sigma^2)\,t}{\sigma\sqrt{t}}$$
(the notation is that of the excellent text Musiela and Rutkowski (1997), §9.6, to which we refer
for further detail). The other cases of vanilla barrier options, and their sensitivity analysis, are
given in detail in Zhang (1997), Chapter 10.
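The decomposition of the down-and-out call into a vanilla call minus a knockout discount can be sketched numerically as follows. This is a minimal illustration, not a definitive implementation: the function names and parameter values are hypothetical, and the sketch assumes a barrier H lying at or below both the strike K and the initial price p0.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def vanilla_call(p0, K, r, sigma, T):
    """Black-Scholes price C_K of a European call."""
    d1 = (log(p0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return p0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)


def knockout_discount(p0, K, H, r, sigma, T):
    """Knockout discount KOD_{K,H} with lambda = r - sigma^2 / 2."""
    lam = r - 0.5 * sigma ** 2
    base = log(H ** 2 / (p0 * K))
    c1 = (base + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    c2 = (base + (r - 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    term1 = p0 * (H / p0) ** (2.0 + 2.0 * lam / sigma ** 2) * norm_cdf(c1)
    term2 = K * exp(-r * T) * (H / p0) ** (2.0 * lam / sigma ** 2) * norm_cdf(c2)
    return term1 - term2


def down_and_out_call(p0, K, H, r, sigma, T):
    """DOC_{K,H} = C_K - KOD_{K,H}."""
    return vanilla_call(p0, K, r, sigma, T) - knockout_discount(p0, K, H, r, sigma, T)


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    print(down_and_out_call(p0=100.0, K=100.0, H=90.0, r=0.05, sigma=0.2, T=1.0))
```

A useful sanity check is that the price vanishes when the initial price sits on the barrier, and approaches the vanilla price as the barrier moves far below the initial price.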
Chapter 5
Interest Rate Theory
5.1 The Bond Market
5.1.1 The Term Structure of Interest Rates
We start with a heuristic discussion, which we will formalize in the following section. The main
traded objects we consider are zero-coupon bonds. A zero-coupon bond is a bond that has no
coupon payments. The price of a zero-coupon bond at time t that pays, say, a sure £1 at time
T ≥ t is denoted p(t, T). All zero-coupon bonds are assumed to be default-free and have strictly
positive prices. Various different interest rates are defined in connection with zero-coupon bonds,
but we will only consider continuously compounded interest rates (which facilitates theoretical
considerations).
Using the arbitrage pricing technique, we easily obtain pricing formulas for coupon bonds.
Coupon bonds are bonds with regular interest payments, called coupons, plus a principal repayment
at maturity. Let $c_j$ be the payments at times $t_j$, $j = 1, \ldots, n$, and let F be the face value paid at
time $t_n$. Then the price of the coupon bond $B_c$ must satisfy
$$B_c = \sum_{j=1}^{n} c_j\, p(0, t_j) + F\, p(0, t_n). \tag{5.1}$$
Hence, we see that a coupon bond is equivalent to a portfolio of zero-coupon bonds.
The yield-to-maturity is defined as an interest rate per annum that equates the present value
of future cash flows to the current market value. Using continuous compounding, the
yield-to-maturity $y_c$ is defined by the relation
$$B_c = \sum_{j=1}^{n} c_j \exp\{-y_c t_j\} + F \exp\{-y_c t_n\}.$$
If for instance the $t_j$, $j = 1, \ldots, n$ are expressed in years, then $y_c$ is an annual continuously
compounded yield-to-maturity (with the continuously compounded annual interest rate, r(T),
defined by the relation $p(0, T) = \exp\{-r(T)\,(T/365)\}$).
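The pricing relation (5.1) and the defining relation for the yield-to-maturity can be illustrated with a short sketch. The function names, the bisection bracket and the flat discount curve are assumptions made only for this example; bisection is used because the right-hand side of the yield relation is strictly decreasing in the yield when all cash flows are positive.

```python
from math import exp


def coupon_bond_price(coupons, times, face, zero_price):
    """Price via (5.1): coupons c_j at times t_j plus face value F at t_n.

    zero_price(T) is a user-supplied function returning p(0, T).
    """
    pv_coupons = sum(c * zero_price(t) for c, t in zip(coupons, times))
    return pv_coupons + face * zero_price(times[-1])


def price_from_yield(y, coupons, times, face):
    """Present value of the cash flows at a continuously compounded yield y."""
    pv_coupons = sum(c * exp(-y * t) for c, t in zip(coupons, times))
    return pv_coupons + face * exp(-y * times[-1])


def yield_to_maturity(price, coupons, times, face, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve price_from_yield(y) = price for y by bisection on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if price_from_yield(mid, coupons, times, face) > price:
            lo = mid  # model price too high, so the yield must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical 3-year bond with annual coupons on a flat 4% curve.
    times = [1.0, 2.0, 3.0]
    coupons = [5.0, 5.0, 5.0]
    price = coupon_bond_price(coupons, times, 100.0, lambda T: exp(-0.04 * T))
    print(price, yield_to_maturity(price, coupons, times, 100.0))
```

On a flat curve the recovered yield coincides with the flat rate, which gives a simple consistency check.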
The term structure of interest rates is defined as the relationship between the yield-to-maturity
on a zero-coupon bond and the bond's maturity. Normally, this yields an upward sloping curve
(as in Figure 5.1), but flat and downward sloping curves have also been observed.
In constructing the term structure of interest rates, we face the additional problem that in
most economies no zero-coupon bonds with maturity greater than one year are traded (in the
USA, Treasury bills are only traded with maturity up to one year). We can, however, use prices of
coupon bonds and invert formula (5.1) for zero-coupon prices. In practice, additional complications
arise, since the maturities of coupon bonds are not equally spaced and trading in bonds with some
maturities may be too thin to give reliable prices. We refer the reader to Jarrow and Turnbull
(2000) for further discussion of these issues.
[Figure 5.1: Yield curve (vertical axis: yield; horizontal axis: maturity)]
5.1.2 Mathematical Modelling
Let $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ be a filtered probability space with a filtration $\mathbb{F} = (\mathcal{F}_t)_{t \le T^*}$ satisfying the usual
conditions (used to model the flow of information) and fix a terminal time horizon $T^*$. We assume
that all processes are defined on this probability space. The basic building blocks for our relative
pricing approach, zero-coupon bonds, are defined as follows.
Definition 5.1.1. A zero-coupon bond with maturity date T, also called a T-bond, is a contract
that guarantees the holder a cash payment of one unit on the date T. The price at time t of a bond
with maturity date T is denoted by p(t, T).
Obviously we have p(t, t) = 1 for all t. We shall assume that the price process p(t, T), t ∈ [0, T],
is adapted and strictly positive and that for every fixed t, p(t, T) is continuously differentiable in
the T variable.
Based on arbitrage considerations (recall our basic aim is to construct a market model that is
free of arbitrage), we now define several risk-free interest rates. Given three dates $t < T_1 < T_2$
the basic question is: what is the risk-free rate of return, determined at the contract time t, over
the interval $[T_1, T_2]$ of an investment of 1 at time $T_1$? To answer this question we consider the
arbitrage Table 5.1 below (compare §1.3 for the use of arbitrage tables).
Time                                   t     T_1          T_2
Sell T_1-bond                                Pay out 1
Buy p(t,T_1)/p(t,T_2) T_2-bonds                           Receive p(t,T_1)/p(t,T_2)
Net investment                         0     −1           + p(t,T_1)/p(t,T_2)
Table 5.1: Arbitrage table for forward rates
To exclude arbitrage opportunities, the equivalent constant rate of interest R over this period
(we pay out 1 at time $T_1$ and receive $e^{R(T_2 - T_1)}$ at $T_2$) has thus to be given by
$$e^{R(T_2 - T_1)} = \frac{p(t, T_1)}{p(t, T_2)}.$$
We formalize this in:
Definition 5.1.2. (i) The forward rate for the period $[T_1, T_2]$ as seen at time t is defined as
$$R(t; T_1, T_2) = -\frac{\log p(t, T_2) - \log p(t, T_1)}{T_2 - T_1}.$$
(ii) The spot rate $R(T_1, T_2)$ for the period $[T_1, T_2]$ is defined as
$$R(T_1, T_2) = R(T_1; T_1, T_2).$$
(iii) The instantaneous forward rate with maturity T, at time t, is defined by
$$f(t, T) = -\frac{\partial \log p(t, T)}{\partial T}.$$
(iv) The instantaneous short rate at time t is defined by
$$r(t) = f(t, t).$$
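The four rates of Definition 5.1.2 can be computed directly from a bond-price function. The sketch below is illustrative only: the function names, the finite-difference step h and the example curve are assumptions, and the derivative in T is approximated numerically rather than taken exactly.

```python
from math import exp, log


def forward_rate(p, t, T1, T2):
    """R(t; T1, T2) from a bond-price function p(t, T)."""
    return -(log(p(t, T2)) - log(p(t, T1))) / (T2 - T1)


def spot_rate(p, T1, T2):
    """R(T1, T2) = R(T1; T1, T2)."""
    return forward_rate(p, T1, T1, T2)


def instantaneous_forward(p, t, T, h=1e-5):
    """f(t, T) = -d/dT log p(t, T), by a central difference in T."""
    return -(log(p(t, T + h)) - log(p(t, T - h))) / (2.0 * h)


def short_rate(p, t, h=1e-5):
    """r(t) = f(t, t), by a one-sided difference since p(t, T) needs T >= t."""
    return -(log(p(t, t + h)) - log(p(t, t))) / h


def example_curve(t, T):
    """Hypothetical curve with f(t, T) = 0.03 + 0.01 (T - t)."""
    tau = T - t
    return exp(-(0.03 * tau + 0.005 * tau ** 2))


if __name__ == "__main__":
    print(forward_rate(example_curve, 0.0, 1.0, 2.0))
    print(instantaneous_forward(example_curve, 0.0, 2.0))
    print(short_rate(example_curve, 0.0))
```

For the example curve the forward rate over [1, 2] seen at time 0 is the average of the instantaneous forward rates over that interval, which is what the first relation of Lemma 5.1.1 expresses.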
Definition 5.1.3. The money account process is defined by
$$B(t) = \exp\left\{\int_0^t r(s)\,ds\right\}.$$
The interpretation of the money market account is a strategy of instantaneously reinvesting at
the current short rate.
Lemma 5.1.1. For t ≤ s ≤ T we have
$$p(t, T) = p(t, s) \exp\left\{-\int_s^T f(t, u)\,du\right\},$$
and in particular
$$p(t, T) = \exp\left\{-\int_t^T f(t, s)\,ds\right\}.$$
In what follows, we model the above processes in a generalized Black-Scholes framework. That
is, we assume that $W = (W_1, \ldots, W_d)$ is a standard d-dimensional Brownian motion and the
filtration $\mathbb{F}$ is the augmentation of the filtration generated by W(t). The dynamics of the various
processes are given as follows:
Short-rate Dynamics:
$$dr(t) = a(t)\,dt + b(t)\,dW(t), \tag{5.2}$$
Bond-price Dynamics:
$$dp(t, T) = p(t, T)\,\{m(t, T)\,dt + v(t, T)\,dW(t)\}, \tag{5.3}$$
Forward-rate Dynamics:
$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,dW(t). \tag{5.4}$$
We assume that in the above formulas, the coefficients meet standard conditions required to
guarantee the existence of the various processes, that is, existence of solutions of the various
stochastic differential equations. Furthermore, we assume that the processes are smooth enough
to allow differentiation and certain operations involving changing of order of integration and
differentiation. Since the actual conditions are rather technical, we refer the reader to Björk (1997),
Heath, Jarrow, and Morton (1992) and Protter (2004) (the latter reference for the stochastic Fubini
theorem) for these conditions.
Following Björk (1997) for formulation and proof, we now give a small toolbox for the
relationships between the processes specified above.
Proposition 5.1.1. (i) If p(t, T) satisfies (5.3), then for the forward-rate dynamics we have
$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,dW(t),$$
where α and σ are given by
$$\begin{cases} \alpha(t, T) = v_T(t, T)\,v(t, T) - m_T(t, T), \\ \sigma(t, T) = -v_T(t, T). \end{cases}$$
(ii) If f(t, T) satisfies (5.4), then the short rate satisfies
$$dr(t) = a(t)\,dt + b(t)\,dW(t),$$
where a and b are given by
$$\begin{cases} a(t) = f_T(t, t) + \alpha(t, t), \\ b(t) = \sigma(t, t). \end{cases} \tag{5.5}$$
(iii) If f(t, T) satisfies (5.4), then p(t, T) satisfies
$$dp(t, T) = p(t, T)\left\{r(t) + A(t, T) + \tfrac{1}{2} S(t, T)^2\right\} dt + p(t, T)\,S(t, T)\,dW(t),$$
where
$$A(t, T) = -\int_t^T \alpha(t, s)\,ds, \qquad S(t, T) = -\int_t^T \sigma(t, s)\,ds. \tag{5.6}$$
Here the subscript T denotes the partial derivative with respect to the maturity variable.
Proof. To prove (i) we only have to apply Itô's formula to the defining equation for the
forward rates.
To prove (ii) we start by integrating the forward-rate dynamics. This leads to
$$f(t, t) = r(t) = f(0, t) + \int_0^t \alpha(s, t)\,ds + \int_0^t \sigma(s, t)\,dW(s). \tag{5.7}$$
Writing also α and σ in integrated form
$$\alpha(s, t) = \alpha(s, s) + \int_s^t \alpha_T(s, u)\,du, \qquad \sigma(s, t) = \sigma(s, s) + \int_s^t \sigma_T(s, u)\,du,$$
and inserting this into (5.7), we find
$$r(t) = f(0, t) + \int_0^t \alpha(s, s)\,ds + \int_0^t \int_s^t \alpha_T(s, u)\,du\,ds + \int_0^t \sigma(s, s)\,dW(s) + \int_0^t \int_s^t \sigma_T(s, u)\,du\,dW(s).$$
After changing the order of integration we can identify terms to establish (ii).
For (iii) we use a technique from Heath, Jarrow, and Morton (1992); compare Björk (1997).
By the definition of the forward rates we may write the bond-price process as
$$p(t, T) = \exp\{Z(t, T)\},$$
where Z is given by
$$Z(t, T) = -\int_t^T f(t, s)\,ds. \tag{5.8}$$
Again we write (5.4) in integrated form:
$$f(t, s) = f(0, s) + \int_0^t \alpha(u, s)\,du + \int_0^t \sigma(u, s)\,dW(u).$$
We insert the integrated form in (5.8) to get
$$Z(t, T) = -\int_t^T f(0, s)\,ds - \int_0^t \int_t^T \alpha(u, s)\,ds\,du - \int_0^t \int_t^T \sigma(u, s)\,ds\,dW(u).$$
Now, splitting the integrals and changing the order of integration gives us
$$\begin{aligned}
Z(t, T) &= -\int_0^T f(0, s)\,ds - \int_0^t \int_u^T \alpha(u, s)\,ds\,du - \int_0^t \int_u^T \sigma(u, s)\,ds\,dW(u) \\
&\quad + \int_0^t f(0, s)\,ds + \int_0^t \int_u^t \alpha(u, s)\,ds\,du + \int_0^t \int_u^t \sigma(u, s)\,ds\,dW(u) \\
&= Z(0, T) - \int_0^t \int_u^T \alpha(u, s)\,ds\,du - \int_0^t \int_u^T \sigma(u, s)\,ds\,dW(u) \\
&\quad + \int_0^t f(0, s)\,ds + \int_0^t \int_0^s \alpha(u, s)\,du\,ds + \int_0^t \int_0^s \sigma(u, s)\,dW(u)\,ds.
\end{aligned}$$
The integrand (in s) of the last line is just the integrated form of the forward-rate dynamics (5.4)
over the interval [0, s], that is f(s, s). Since r(s) = f(s, s), this last line above equals
$\int_0^t r(s)\,ds$. So we obtain
$$Z(t, T) = Z(0, T) + \int_0^t r(s)\,ds - \int_0^t \int_u^T \alpha(u, s)\,ds\,du - \int_0^t \int_u^T \sigma(u, s)\,ds\,dW(u).$$
Using A and S from (5.6), the stochastic differential of Z is given by
$$dZ(t, T) = (r(t) + A(t, T))\,dt + S(t, T)\,dW(t).$$
Now we can apply Itô's lemma to the process $p(t, T) = \exp\{Z(t, T)\}$ to complete the proof.
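For completeness, the final Itô step can be written out as follows (a sketch for the scalar case; for d-dimensional W the square of S is read as the squared Euclidean norm). It uses only the differential of Z given above.

```latex
% Ito's lemma for p = exp(Z), with dZ = (r + A) dt + S dW and d<Z> = S^2 dt:
\begin{aligned}
dp(t,T) &= e^{Z(t,T)}\,dZ(t,T) + \tfrac{1}{2}\,e^{Z(t,T)}\,d\langle Z(\cdot,T)\rangle_t \\
        &= p(t,T)\,\bigl(r(t) + A(t,T)\bigr)\,dt + p(t,T)\,S(t,T)\,dW(t)
           + \tfrac{1}{2}\,p(t,T)\,S(t,T)^2\,dt \\
        &= p(t,T)\,\Bigl\{ r(t) + A(t,T) + \tfrac{1}{2}S(t,T)^2 \Bigr\}\,dt
           + p(t,T)\,S(t,T)\,dW(t),
\end{aligned}
```

which is the bond-price dynamics claimed in part (iii) of Proposition 5.1.1.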
5.1.3 Bond Pricing, Martingale Measures and Trading Strategies
We will now examine the mathematical structure of our bond-market model in more detail. As
usual, our first task is to find a convenient characterization of the no-arbitrage assumption. By
Theorem 4.2.1, absence of arbitrage is guaranteed by the existence of an equivalent martingale
measure $\mathbb{Q}$. Recall that by definition an equivalent martingale measure has to satisfy $\mathbb{Q} \sim \mathbb{P}$ and
the discounted price processes (with respect to a suitable numéraire) of the basic securities have
to be $\mathbb{Q}$-martingales. For the bond market this implies that the discounted prices of all zero-coupon
bonds with maturities $0 \le T \le T^*$ have to be $\mathbb{Q}$-martingales.
More precisely, taking the risk-free bank account B(t) as numéraire we have
Definition 5.1.4. A measure $\mathbb{Q} \sim \mathbb{P}$ defined on $(\Omega, \mathcal{F})$ is an equivalent martingale measure
for the bond market, if for every fixed $0 \le T \le T^*$ the process
$$\frac{p(t, T)}{B(t)}, \qquad 0 \le t \le T,$$
is a $\mathbb{Q}$-martingale.
Assume now that there exists at least one equivalent martingale measure, say $\mathbb{Q}$. Defining
contingent claims as $\mathcal{F}_T$-measurable random variables X such that $X/B(T) \in L^1(\mathcal{F}_T, \mathbb{Q})$ with some
$0 \le T \le T^*$ (notation: T-contingent claims), we can use the risk-neutral valuation principle (4.6)
to obtain:
Proposition 5.1.2. Consider a T-contingent claim X. Then the price process $\Pi_X(t)$, $0 \le t \le T$,
of the contingent claim is given by
$$\Pi_X(t) = B(t)\,\mathbb{E}_{\mathbb{Q}}\left[\left.\frac{X}{B(T)}\,\right|\,\mathcal{F}_t\right] = \mathbb{E}_{\mathbb{Q}}\left[\left. X e^{-\int_t^T r(s)\,ds}\,\right|\,\mathcal{F}_t\right].$$
In particular, the price process of a zero-coupon bond with maturity T is given by
$$p(t, T) = \mathbb{E}_{\mathbb{Q}}\left[\left. e^{-\int_t^T r(s)\,ds}\,\right|\,\mathcal{F}_t\right].$$
Proof. We just have to apply Theorem 4.2.3.
We thus see that the relevant dynamics of the price processes are those given under a martingale
measure $\mathbb{Q}$. The implication for model building is that it is natural to model all objects directly
under a martingale measure $\mathbb{Q}$. This approach is called martingale modelling. The price one has
to pay in using this approach lies in the statistical problems associated with parameter estimation.
5.2 Short-rate Models
Following our introductory remarks, we now look at models of the short rate of the type
$$dr(t) = a(t, r(t))\,dt + b(t, r(t))\,dW(t), \tag{5.9}$$
with functions a, b sufficiently regular and W a real-valued Brownian motion.
The crucial point in this setting is the assumption on the probability measure under which the
short rate is modelled.
If we model under the objective probability measure $\mathbb{P}$ and assume that a locally risk-free
asset B (the money market) exists, we face the question whether in an arbitrage-free market bond
prices, quite naturally viewed as derivatives with the short rate as underlying, are uniquely
determined. In contrast to the equity market setting, with a risky asset and a risk-free asset
available for trading, the short rate r is not the price of a traded asset, and hence we can only
set up portfolios consisting of putting money in the bank account. We thus face an incomplete
market situation, and the best we can hope for is to find consistency requirements for bonds of
different maturity. Given a single 'benchmark' bond, we should then be able to price all other
bonds relative to this given bond.
On the other hand, if we assume that the short rate is modelled under an equivalent martingale
measure, we can immediately price all contingent claims via the risk-neutral valuation formula.
The drawback in this case is the question of calibrating the model (we do not observe the
parameters of the process under an equivalent martingale measure, but rather under the objective
probability measure!).
5.2.1 The Term-structure Equation
Let us assume that the short-rate dynamics satisfy (5.9) under the historical probability measure
$\mathbb{P}$. In our Brownian setting, we know that each equivalent martingale measure $\mathbb{Q}$ is given in terms
of a Girsanov density
$$L(t) = \exp\left\{-\int_0^t \gamma(u)\,dW(u) - \frac{1}{2}\int_0^t \gamma(u)^2\,du\right\}, \qquad 0 \le t \le T.$$
Assume now that γ is given as γ(t) = λ(t, r(t)), with a sufficiently smooth function λ. (We will
use the notation $\mathbb{Q}(\lambda)$ to emphasize the dependence of the equivalent martingale measure on λ.)
By Girsanov's Theorem 4.1.4, we know that $\tilde{W} = W + \int \lambda\,dt$ is a $\mathbb{Q}(\lambda)$-Brownian motion. So the
$\mathbb{Q}(\lambda)$-dynamics of r are given by
$$dr(t) = \{a(t, r(t)) - b(t, r(t))\,\lambda(t, r(t))\}\,dt + b(t, r(t))\,d\tilde{W}(t).$$
Now consider a T-contingent claim X = Φ(r(T)), for some sufficiently smooth function
$\Phi : \mathbb{R} \to \mathbb{R}_+$. We know that using the risk-neutral valuation formula we obtain arbitrage-free prices for any
contingent claim by applying the expectation operator under an equivalent martingale measure
(to the discounted time T value). A slight modification of the argument used to find the
Black-Scholes PDE yields, for any $\mathbb{Q}(\lambda)$,
$$\mathbb{E}_{\mathbb{Q}(\lambda)}\left[\left. e^{-\int_t^T r(u)\,du}\,\Phi(r(T))\,\right|\,\mathcal{F}_t\right] = F(t, r(t)),$$
where $F : [0, T^*] \times \mathbb{R} \to \mathbb{R}$ satisfies the partial differential equation (we omit the arguments (t, r))
$$F_t + (a - b\lambda)F_r + \tfrac{1}{2} b^2 F_{rr} - rF = 0 \tag{5.10}$$
and terminal condition F(T, r) = Φ(r), for all $r \in \mathbb{R}$. Suppose now that the price process p(t, T)
of a T-bond is determined by the assessment, at time t, of the segment {r(τ), t ≤ τ ≤ T} of the
short rate process over the term of the bond. So we assume
$$p(t, T) = F(t, r(t); T),$$
with F a sufficiently smooth function.
Since we know that the value of a zero-coupon bond is one unit at maturity, we have the
terminal condition F(T, r; T) = 1. Thus we have
Proposition 5.2.1 (Term-structure Equation). If there exists an equivalent martingale measure
of type $\mathbb{Q}(\lambda)$ for the bond market (implying that the no-arbitrage condition holds) and the
price processes p(t, T) of T-bonds are given by a sufficiently smooth function F as above,
then F will satisfy the partial differential equation (5.10) with terminal condition F(T, r; T) = 1.
5.2.2 Martingale Modelling
We now fix an equivalent martingale measure $\mathbb{Q}$ (which we assume to exist), and return to
modelling the short-rate dynamics directly under $\mathbb{Q}$. Thus we assume that r has $\mathbb{Q}$-dynamics
$$dr(t) = a(t, r(t))\,dt + b(t, r(t))\,dW(t) \tag{5.9}$$
with W a (real-valued) $\mathbb{Q}$-Wiener process. We can immediately apply the risk-neutral valuation
technique to obtain the price process $\Pi_X(t)$ of any sufficiently integrable T-contingent claim X
by computing the $\mathbb{Q}$-expectation, i.e.
$$\Pi_X(t) = \mathbb{E}_{\mathbb{Q}}\left[\left. e^{-\int_t^T r(u)\,du}\,X\,\right|\,\mathcal{F}_t\right]. \tag{5.11}$$
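As a hedged numerical illustration of (5.11) with X = 1, the sketch below estimates a zero-coupon bond price at time 0 by Monte Carlo. Everything specific here is an assumption made for the example: the Vasicek-type Q-dynamics dr = (alpha − beta r)dt + sigma dW with constant coefficients, the Euler discretisation, the left-point rule for the integral of r, the parameter values, and the function names. The standard closed-form Vasicek bond price is included only as a benchmark.

```python
import random
from math import exp


def mc_bond_price(r0, alpha, beta, sigma, T, n_paths=2000, n_steps=50, seed=0):
    """Monte Carlo estimate of E_Q[exp(-int_0^T r(u) du)] under Euler-discretised dynamics."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = dt ** 0.5
    total = 0.0
    for _path in range(n_paths):
        r = r0
        integral = 0.0
        for _step in range(n_steps):
            integral += r * dt  # left-point approximation of the integral of r
            r += (alpha - beta * r) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        total += exp(-integral)
    return total / n_paths


def vasicek_bond_price(r0, alpha, beta, sigma, T):
    """Closed-form p(0, T) = exp(A - B r0) for the assumed Vasicek dynamics."""
    B = (1.0 - exp(-beta * T)) / beta
    A = (B - T) * (alpha / beta - sigma ** 2 / (2.0 * beta ** 2)) - sigma ** 2 * B ** 2 / (4.0 * beta)
    return exp(A - B * r0)


if __name__ == "__main__":
    args = dict(r0=0.03, alpha=0.02, beta=0.5, sigma=0.01, T=1.0)
    print(mc_bond_price(**args), vasicek_bond_price(**args))
```

The Monte Carlo estimate carries both sampling error and a small discretisation bias, so agreement with the benchmark is only approximate.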
If, additionally, the contingent claim is of the form X = Φ(r(T)) with a sufficiently smooth function
Φ, we obtain
Proposition 5.2.2 (Term-structure Equation). Consider T-contingent claims of the form
X = Φ(r(T)). Then arbitrage-free price processes are given by $\Pi_X(t) = F(t, r(t))$, where F is the
solution of the partial differential equation
$$F_t + aF_r + \frac{b^2}{2} F_{rr} - rF = 0 \tag{5.12}$$
with terminal condition F(T, r) = Φ(r) for all $r \in \mathbb{R}$. In particular, T-bond prices are given by
p(t, T) = F(t, r(t); T), with F solving (5.12) and terminal condition F(T, r; T) = 1.
Suppose we want to evaluate the price of a European call option with maturity S and strike
K on an underlying T-bond. This means we have to price the S-contingent claim
$$X = \max\{p(S, T) - K, 0\}.$$
We first have to find the price process p(t, T) = F(t, r; T) by solving (5.12) with terminal condition
F(T, r; T) = 1. Secondly, we use the risk-neutral valuation principle to obtain $\Pi_X(t) = G(t, r)$,
with G solving
$$G_t + aG_r + \frac{b^2}{2} G_{rr} - rG = 0 \quad \text{and} \quad G(S, r) = \max\{F(S, r; T) - K, 0\}, \quad \forall r \in \mathbb{R}.$$
So we are clearly in need of efficient methods of solving the above partial differential equations, or,
from a modelling point of view, we need short-rate models that facilitate this computational task.
Fortunately, there is a class of models, exhibiting an affine term structure (ATS), which allows for
simplification.
Definition 5.2.1. If bond prices are given as
$$p(t, T) = \exp\{A(t, T) - B(t, T)\,r\}, \qquad 0 \le t \le T,$$
where A(t, T) and B(t, T) are deterministic functions, we say that the model possesses an affine
term structure.
Assuming that we have such a model in which both a and $b^2$ are affine in r, say
$a(t, r) = \alpha(t) - \beta(t)r$ and $b(t, r) = \sqrt{\gamma(t) + \delta(t)r}$, we find that A and B are given as solutions of ordinary
differential equations,
$$A_t(t, T) - \alpha(t)B(t, T) + \frac{\gamma(t)}{2} B^2(t, T) = 0, \qquad A(T, T) = 0,$$
$$(1 + B_t(t, T)) - \beta(t)B(t, T) - \frac{\delta(t)}{2} B^2(t, T) = 0, \qquad B(T, T) = 0.$$
The equation for B is a Riccati equation, which can be solved analytically, see Ince (1944), §2.15,
12.51, A.21. Using the solution for B we get A by integrating.
Examples of short-rate models exhibiting an affine term structure include the following.
1. Vasicek model: $dr = (\alpha - \beta r)\,dt + \gamma\,dW$;
2. Cox-Ingersoll-Ross (CIR) model: $dr = (\alpha - \beta r)\,dt + \delta\sqrt{r}\,dW$;
3. Ho-Lee model: $dr = \alpha(t)\,dt + \gamma\,dW$;
4. Hull-White (extended Vasicek) model: $dr = (\alpha(t) - \beta(t)r)\,dt + \gamma(t)\,dW$;
5. Hull-White (extended CIR) model: $dr = (\alpha(t) - \beta(t)r)\,dt + \delta(t)\sqrt{r}\,dW$.
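The ordinary differential equations for A and B can be integrated numerically, which gives a quick way to obtain bond prices in an affine model. The sketch below is an illustration under stated assumptions: constant coefficients, a classical Runge-Kutta scheme in the time to maturity tau = T − t, and hypothetical function names. Note that in the ODEs gamma and delta are the coefficients of $b^2 = \gamma + \delta r$, so a Vasicek model with volatility `vol` corresponds to gamma = vol² and delta = 0. The closed-form Vasicek solution is included as a check.

```python
from math import exp


def affine_AB(alpha, beta, gamma, delta, tau, n_steps=2000):
    """Integrate the affine ODEs with constant coefficients in tau = T - t.

    dB/dtau = 1 - beta*B - 0.5*delta*B^2,  B(0) = 0
    dA/dtau = -alpha*B + 0.5*gamma*B^2,    A(0) = 0
    """
    def dB(B):
        return 1.0 - beta * B - 0.5 * delta * B * B

    def dA(B):
        return -alpha * B + 0.5 * gamma * B * B

    h = tau / n_steps
    A, B = 0.0, 0.0
    for _ in range(n_steps):
        # Classical RK4 for the system (A, B); the right-hand sides depend on B only.
        k1 = dB(B)
        k2 = dB(B + 0.5 * h * k1)
        k3 = dB(B + 0.5 * h * k2)
        k4 = dB(B + h * k3)
        l1 = dA(B)
        l2 = dA(B + 0.5 * h * k1)
        l3 = dA(B + 0.5 * h * k2)
        l4 = dA(B + h * k3)
        A += h * (l1 + 2.0 * l2 + 2.0 * l3 + l4) / 6.0
        B += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return A, B


def vasicek_AB(alpha, beta, vol, tau):
    """Closed-form A and B for dr = (alpha - beta r) dt + vol dW."""
    B = (1.0 - exp(-beta * tau)) / beta
    A = (B - tau) * (alpha / beta - vol ** 2 / (2.0 * beta ** 2)) - vol ** 2 * B ** 2 / (4.0 * beta)
    return A, B


def bond_price(A, B, r):
    """Affine term structure: p = exp(A - B r)."""
    return exp(A - B * r)


if __name__ == "__main__":
    A, B = affine_AB(alpha=0.02, beta=0.5, gamma=0.01 ** 2, delta=0.0, tau=5.0)
    print(bond_price(A, B, r=0.03))
```

The same routine covers the CIR case by setting gamma = 0 and delta equal to the squared volatility coefficient.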
5.3 Heath-Jarrow-Morton Methodology
5.3.1 The Heath-Jarrow-Morton Model Class
Modelling the term structure with only one explanatory variable leads to various undesirable
properties of the model (to say the least). Various authors have proposed models with more
than one state variable, e.g. the short rate and a long rate and/or intermediate rates. The
Heath-Jarrow-Morton method (compare Heath, Jarrow, and Morton (1992)) is at the far end of
this spectrum: they propose using the entire forward rate curve as their (infinite-dimensional)
state variable. More precisely, for any fixed $T \le T^*$ the dynamics of instantaneous, continuously
compounded forward rates f(t, T) are exogenously given by
$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,dW(t), \tag{5.4}$$
where W is a d-dimensional Brownian motion with respect to the underlying (objective)
probability measure $\mathbb{P}$ and α(t, T) resp. σ(t, T) are adapted $\mathbb{R}$- resp. $\mathbb{R}^d$-valued processes. For any
fixed maturity T, the initial condition of the stochastic differential equation (5.4) is determined
by the current value of the empirical (observed) forward rate for the future date T which prevails
at time 0. Observe that we have defined an infinite-dimensional stochastic system, and that by
construction we obtain a perfect fit to the observed term structure (thus avoiding the problem of
inverting the yield curve).
The exogenous specification of the family of forward rates {f(t, T); T > 0} is equivalent to a
specification of the entire family of bond prices {p(t, T); T > 0}. Furthermore, by Proposition
5.1.1 we obtain the dynamics of the bond-price processes as
$$dp(t, T) = p(t, T)\,\{m(t, T)\,dt + S(t, T)\,dW(t)\}, \tag{5.13}$$
where
$$m(t, T) = r(t) + A(t, T) + \tfrac{1}{2} S(t, T)^2, \tag{5.14}$$
$A(t, T) = -\int_t^T \alpha(t, s)\,ds$ and $S(t, T) = -\int_t^T \sigma(t, s)\,ds$ (compare (5.6)). We now explore what
conditions we must impose on the coefficients in order to ensure the existence of an equivalent
martingale measure with respect to a suitable numéraire. By Theorem 4.2.1, we then could
conclude that our bond market model is free of arbitrage.
As a first possible choice of numéraire, we use the money-market account B (assuming that
there exists a measurable version of f(t, t) in $[0, T^*]$), given by
$$B(t) = \exp\left\{\int_0^t f(u, u)\,du\right\} = \exp\left\{\int_0^t r(u)\,du\right\}.$$
So we allow investments in a savings account too. We must find an equivalent measure such that
$$Z(t, T) = \frac{p(t, T)}{B(t)}$$
is a martingale for every $0 \le T \le T^*$. We will call such a measure a risk-neutral martingale measure
to emphasize the dependence on the numéraire.
Theorem 5.3.1. Assume that the family of forward rates is given by (5.4). Then there exists
a risk-neutral martingale measure if and only if there exists an adapted process λ(t), with the
properties that
(i) $\int_0^T \lambda(t)^2\,dt < \infty$ a.s. and $\mathbb{E}(L(T)) = 1$, with
$$L(t) = \exp\left\{-\int_0^t \lambda(u)\,dW(u) - \frac{1}{2}\int_0^t \lambda(u)^2\,du\right\}. \tag{5.15}$$
(ii) For all $0 \le T \le T^*$ and for all t ≤ T, we have
$$\alpha(t, T) = \sigma(t, T)\int_t^T \sigma(t, s)\,ds + \sigma(t, T)\,\lambda(t). \tag{5.16}$$
Proof. Since we are working in a Brownian framework we know that any equivalent measure
$\mathbb{Q} \sim \mathbb{P}$ is given via a Girsanov density (5.15). Using Itô's formula and Girsanov's Theorem 4.1.4,
we find the $\mathbb{Q}$-dynamics of Z(t, T) (omitting the arguments) as:
$$dZ = Z\left(A + \tfrac{1}{2} S^2 - S\lambda\right) dt + Z S\,d\tilde{W}, \tag{5.17}$$
with $\tilde{W}$ a $\mathbb{Q}$-Brownian motion. In order for Z to be a $\mathbb{Q}$-martingale, the drift coefficient in (5.17)
has to be zero, so we obtain
$$A(t, T) + \tfrac{1}{2} S(t, T)^2 = S(t, T)\,\lambda(t). \tag{5.18}$$
Taking the derivative with respect to T, we get
$$-\alpha(t, T) - \sigma(t, T)\,S(t, T) = -\sigma(t, T)\,\lambda(t),$$
which after rearranging is (5.16).
It is possible to interpret λ as a risk premium, which has to be exogenously specified to allow the
choice of a particular risk-neutral martingale measure. In view of (5.16) this leads to a restriction
on drift and volatility coefficients in the specification of the forward rate dynamics (5.4). The
particular choice λ ≡ 0 means that we assume we model directly under a risk-neutral martingale
measure $\mathbb{Q}$. In that case the relations between the various infinitesimal characteristics for the
forward rate are known as the 'Heath-Jarrow-Morton drift condition'.
Theorem 5.3.2 (Heath-Jarrow-Morton). Assume that $\mathbb{Q}$ is a risk-neutral martingale measure
for the bond market and that the forward-rate dynamics under $\mathbb{Q}$ are given by
$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,d\tilde{W}(t), \tag{5.19}$$
with $\tilde{W}$ a $\mathbb{Q}$-Brownian motion. Then we have:
(i) the Heath-Jarrow-Morton drift condition
$$\alpha(t, T) = \sigma(t, T)\int_t^T \sigma(t, s)\,ds, \qquad 0 \le t \le T \le T^*, \quad \mathbb{Q}\text{-a.s.}, \tag{5.20}$$
(ii) and bond-price dynamics under $\mathbb{Q}$ are given by
$$dp(t, T) = p(t, T)\,r(t)\,dt + p(t, T)\,S(t, T)\,d\tilde{W}(t),$$
with S as in (5.6).
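Two simple instances may help to see what the drift condition (5.20) prescribes. Both are illustrative choices of a deterministic scalar volatility (d = 1), with constants σ > 0 and a > 0 introduced here only for the example.

```latex
% Example 1: constant volatility, sigma(t,T) = sigma.
\alpha(t,T) = \sigma \int_t^T \sigma \, ds = \sigma^2 (T - t),
\qquad S(t,T) = -\sigma (T - t).

% Example 2: exponentially damped volatility, sigma(t,T) = sigma e^{-a(T-t)}.
\alpha(t,T) = \sigma e^{-a(T-t)} \int_t^T \sigma e^{-a(s-t)} \, ds
            = \frac{\sigma^2}{a}\, e^{-a(T-t)} \bigl( 1 - e^{-a(T-t)} \bigr),
\qquad S(t,T) = -\frac{\sigma}{a} \bigl( 1 - e^{-a(T-t)} \bigr).
```

In both cases the drift is fully determined by the volatility, so under a risk-neutral martingale measure only the volatility structure remains to be specified.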
5.3.2 Forward Risk-neutral Martingale Measures
For many valuation problems in the bond market it is more suitable to use the bond price process
$p(t, T^*)$ as numéraire. We then have to find an equivalent probability measure $\mathbb{Q}^*$ such that the
auxiliary process
$$Z^*(t, T) = \frac{p(t, T)}{p(t, T^*)}, \qquad \forall t \in [0, T],$$
is a martingale under $\mathbb{Q}^*$ for all $T \le T^*$. We will call such a measure a forward risk-neutral
martingale measure. In this setting, a savings account is not used and the existence of a martingale
measure $\mathbb{Q}^*$ guarantees that there are no arbitrage opportunities between bonds of different
maturities.
In order to find sufficient conditions for the existence of such a martingale measure, we follow
the same programme as above. By (5.13) bond price dynamics under the original probability
measure $\mathbb{P}$ are given as
$$dp(t, T) = p(t, T)\,\{m(t, T)\,dt + S(t, T)\,dW(t)\},$$
with m(t, T) as in (5.14). Now applying Itô's formula to the quotient $p(t, T)/p(t, T^*)$ we find
$$dZ^*(t, T) = Z^*(t, T)\,\{\tilde{m}(t, T)\,dt + (S(t, T) - S(t, T^*))\,dW(t)\}, \tag{5.21}$$
with $\tilde{m}(t, T) = m(t, T) - m(t, T^*) - S(t, T^*)(S(t, T) - S(t, T^*))$. Again, any equivalent martingale
measure $\mathbb{Q}^*$ is given via a Girsanov density L(t) defined by a function γ(t) as in Theorem 5.3.1.
So the drift coefficient of $Z^*(t, T)$ under $\mathbb{Q}^*$ is given as
$$\tilde{m}(t, T) - (S(t, T) - S(t, T^*))\,\gamma(t).$$
Now for $Z^*(t, T)$ to be a $\mathbb{Q}^*$-martingale this coefficient has to be zero, and replacing $\tilde{m}$ with its
definition we get
$$(A(t, T) - A(t, T^*)) + \tfrac{1}{2}\left(S(t, T)^2 - S(t, T^*)^2\right) = (S(t, T^*) + \gamma(t))\,(S(t, T) - S(t, T^*)).$$
Written in terms of the coefficients of the forward-rate dynamics, this identity simplifies to
$$\int_T^{T^*} \alpha(t, s)\,ds + \frac{1}{2}\left(\int_T^{T^*} \sigma(t, s)\,ds\right)^2 = \gamma(t)\int_T^{T^*} \sigma(t, s)\,ds.$$
Taking the derivative with respect to T, we obtain
$$\alpha(t, T) + \sigma(t, T)\int_T^{T^*} \sigma(t, s)\,ds = \gamma(t)\,\sigma(t, T).$$
We have thus proved:
Theorem 5.3.3. Assume that the family of forward rates is given by (5.4). Then there exists a
forward risk-neutral martingale measure if and only if there exists an adapted process γ(t), with
the properties (i) of Theorem 5.3.1, such that
$$\alpha(t, T) = \sigma(t, T)\,\big(S(t, T^*) - S(t, T) + \gamma(t)\big), \qquad 0 \le t \le T \le T^*,$$
where $S(t, T^*) - S(t, T) = -\int_T^{T^*} \sigma(t, s)\,ds$.
5.4 Pricing and Hedging Contingent Claims
5.4.1 Gaussian HJM Framework
Assume that the dynamics of the forward rate are given under a risk-neutral martingale measure
$\mathbb{Q}$ by
$$df(t, T) = \alpha(t, T)\,dt + \sigma(t, T)\,d\tilde{W}(t), \qquad f(0, T) = \hat{f}(0, T),$$
with all processes real-valued. We restrict the class of models by assuming that the forward rate's
volatility is deterministic. The HJM drift condition (5.20) and the integrated form of the forward
rate (compare (5.7)) lead to
$$f(t, t) = r(t) = f(0, t) + \int_0^t (-\sigma(u, t)\,S(u, t))\,du + \int_0^t \sigma(u, t)\,d\tilde{W}(u),$$
which implies that the short rate (as well as the forward rates f(t, T)) has a Gaussian probability
law (hence the terminology). By Theorem 5.3.2, bond-price dynamics under $\mathbb{Q}$ are given by
$$dp(t, T) = p(t, T)\left\{r(t)\,dt + S(t, T)\,d\tilde{W}(t)\right\},$$
which we can solve in special cases explicitly for p(t, T).
To price options on zero-coupon bonds, we use the change of numéraire technique. Consider
a European call C on a $T^*$-bond with maturity $T \le T^*$ and strike K. So we consider the
T-contingent claim
$$X = (p(T, T^*) - K)^+. \tag{5.22}$$
The price of this call at time t = 0 is given as
$$C(0) = p(0, T^*)\,\mathbb{Q}^*(A) - K p(0, T)\,\mathbb{Q}_T(A),$$
with $A = \{\omega : p(T, T^*) > K\}$ and $\mathbb{Q}_T$ resp. $\mathbb{Q}^*$ the T- resp. $T^*$-forward risk-neutral measure.
Now
$$\tilde{Z}(t, T) = \frac{p(t, T^*)}{p(t, T)}$$
has $\mathbb{Q}$-dynamics (omitting the arguments and writing $S^*$ for $S(t, T^*)$)
$$d\tilde{Z} = \tilde{Z}\left\{S(S - S^*)\,dt - (S - S^*)\,d\tilde{W}(t)\right\},$$
so it has a deterministic variance coefficient. Now
$$\mathbb{Q}_T(p(T, T^*) \ge K) = \mathbb{Q}_T\left(\frac{p(T, T^*)}{p(T, T)} \ge K\right) = \mathbb{Q}_T(\tilde{Z}(T, T) \ge K).$$
Since $\tilde{Z}(t, T)$ is a $\mathbb{Q}_T$-martingale with $\mathbb{Q}_T$-dynamics
$$d\tilde{Z}(t, T) = -\tilde{Z}(t, T)\,(S(t, T) - S(t, T^*))\,dW^T(t),$$
we find that under $\mathbb{Q}_T$ (again S = S(t, T), $S^* = S(t, T^*)$)
$$\tilde{Z}(T, T) = \frac{p(0, T^*)}{p(0, T)} \exp\left\{-\int_0^T (S - S^*)\,dW^T(t) - \frac{1}{2}\int_0^T (S - S^*)^2\,dt\right\}$$
(with $W^T$ a $\mathbb{Q}_T$-Brownian motion). The stochastic integral in the exponential is Gaussian with
zero mean and variance
$$\Sigma^2(T) = \int_0^T (S(t, T) - S(t, T^*))^2\,dt.$$
So
$$\mathbb{Q}_T(p(T, T^*) \ge K) = \mathbb{Q}_T(\tilde{Z}(T, T) \ge K) = N(d_2)$$
with
$$d_2 = \frac{\log\left(\dfrac{p(0, T^*)}{K p(0, T)}\right) - \frac{1}{2}\Sigma^2(T)}{\sqrt{\Sigma^2(T)}}.$$
Similarly, for the first term
$$Z^*(t, T) = \frac{p(t, T)}{p(t, T^*)}$$
has $\mathbb{Q}$-dynamics (compare (5.21))
$$dZ^* = Z^*\left\{S^*(S^* - S)\,dt + (S - S^*)\,d\tilde{W}(t)\right\},$$
and also a deterministic variance coefficient. Now
$$\mathbb{Q}^*(p(T, T^*) \ge K) = \mathbb{Q}^*\left(\frac{1}{p(T, T^*)} \le \frac{1}{K}\right) = \mathbb{Q}^*\left(Z^*(T, T) \le \frac{1}{K}\right).$$
Under $\mathbb{Q}^*$, $Z^*(t, T)$ is a martingale with
$$dZ^*(t, T) = Z^*(t, T)\,(S(t, T) - S(t, T^*))\,dW^*(t),$$
so
$$Z^*(T, T) = \frac{p(0, T)}{p(0, T^*)} \exp\left\{\int_0^T (S - S^*)\,dW^*(t) - \frac{1}{2}\int_0^T (S - S^*)^2\,dt\right\}.$$
Again we have a Gaussian variable with the (same) variance $\Sigma^2(T)$ in the exponential. Using this
fact it follows (after some computations) that:
$$\mathbb{Q}^*(p(T, T^*) \ge K) = N(d_1),$$
with
$$d_1 = d_2 + \sqrt{\Sigma^2(T)}.$$
So, inserting $\mathbb{Q}^*(A) = N(d_1)$ and $\mathbb{Q}_T(A) = N(d_2)$ into the expression for C(0), we obtain:
Proposition 5.4.1. The price of the call option defined in (5.22) is given by
$$C(0) = p(0, T^*)\,N(d_1) - K p(0, T)\,N(d_2), \tag{5.23}$$
with parameters given as above.
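The pricing recipe $C(0) = p(0, T^*)\,\mathbb{Q}^*(A) - K p(0, T)\,\mathbb{Q}_T(A)$ can be sketched in code, taking $\mathbb{Q}^*(A) = N(d_1)$ and $\mathbb{Q}_T(A) = N(d_2)$ with $d_2 = [\log(p(0,T^*)/(K p(0,T))) - \tfrac{1}{2}\Sigma^2(T)]/\Sigma(T)$ and $d_1 = d_2 + \Sigma(T)$, as follows from the representation of $\tilde{Z}(T, T)$ above. The function names and inputs are hypothetical, and the helper for the integrated variance assumes, purely as an example, a constant forward-rate volatility.

```python
from math import erf, log, sqrt


def norm_cdf(x):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bond_call_price(p0_T, p0_Tstar, K, var_total):
    """Call with maturity T and strike K on a T*-bond in a Gaussian HJM model.

    p0_T = p(0, T), p0_Tstar = p(0, T*), var_total = Sigma^2(T) > 0.
    """
    vol = sqrt(var_total)
    d2 = (log(p0_Tstar / (K * p0_T)) - 0.5 * var_total) / vol
    d1 = d2 + vol
    return p0_Tstar * norm_cdf(d1) - K * p0_T * norm_cdf(d2)


def constant_vol_variance(sigma, T, T_star):
    """Sigma^2(T) for the assumed constant volatility sigma(t, T) = sigma.

    Then S(t, T) - S(t, T*) = sigma (T* - T), so Sigma^2(T) = sigma^2 (T* - T)^2 T.
    """
    return sigma ** 2 * (T_star - T) ** 2 * T


if __name__ == "__main__":
    # Hypothetical discount factors and volatility, for illustration only.
    var_total = constant_vol_variance(sigma=0.01, T=1.0, T_star=2.0)
    print(bond_call_price(p0_T=0.95, p0_Tstar=0.90, K=0.94, var_total=var_total))
```

As the variance tends to zero, the price tends to the discounted intrinsic value max(p(0, T*) − K p(0, T), 0), which is a convenient check on signs and on the roles of $d_1$ and $d_2$.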
5.4.2 Swaps
This section is devoted to the pricing of swaps. We consider the case of a forward swap settled in
arrears. Such a contingent claim is characterized by:
• a fixed time t, the contract time,
• dates $T_0 < T_1 < \ldots < T_n$, equally distanced, $T_{i+1} - T_i = \delta$,
• R, a prespecified fixed rate of interest,
• K, a nominal amount.
A swap contract S with K and R fixed for the period $T_0, \ldots, T_n$ is a sequence of payments, where
the amount of money paid out at $T_{i+1}$, $i = 0, \ldots, n-1$, is defined by
$$X_{i+1} = K\delta\,(L(T_i, T_i) - R).$$
The floating rate over $[T_i, T_{i+1}]$ observed at $T_i$ is a simple rate defined by
$$p(T_i, T_{i+1}) = \frac{1}{1 + \delta L(T_i, T_i)}.$$
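We do not need to specify a particular interest-rate model here; all we need is the existence of a
risk-neutral martingale measure. Using the risk-neutral pricing formula we obtain (we may use
K = 1),
$$\begin{aligned}
\Pi(t, S) &= \sum_{i=1}^{n} \mathbb{E}_{\mathbb{Q}}\left[\left. e^{-\int_t^{T_i} r(s)\,ds}\,\delta\,(L(T_{i-1}, T_{i-1}) - R)\,\right|\,\mathcal{F}_t\right] \\
&= \sum_{i=1}^{n} \mathbb{E}_{\mathbb{Q}}\left[\left. \mathbb{E}_{\mathbb{Q}}\left[\left. e^{-\int_{T_{i-1}}^{T_i} r(s)\,ds}\,\right|\,\mathcal{F}_{T_{i-1}}\right] e^{-\int_t^{T_{i-1}} r(s)\,ds} \left(\frac{1}{p(T_{i-1}, T_i)} - (1 + \delta R)\right)\right|\,\mathcal{F}_t\right] \\
&= \sum_{i=1}^{n} \big(p(t, T_{i-1}) - (1 + \delta R)\,p(t, T_i)\big) = p(t, T_0) - \sum_{i=1}^{n} c_i\,p(t, T_i),
\end{aligned}$$
with $c_i = \delta R$, $i = 1, \ldots, n-1$, and $c_n = 1 + \delta R$. Here the second line uses
$\delta\,(L(T_{i-1}, T_{i-1}) - R) = 1/p(T_{i-1}, T_i) - (1 + \delta R)$, and the inner conditional expectation equals
$p(T_{i-1}, T_i)$. So a swap is a linear combination of zero-coupon bonds, and we obtain its price
accordingly. This again shows the power of risk-neutral
pricing. Using the linearity of the expectation operator we can reduce complicated claims to sums
of simpler ones.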
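Because the swap value is a linear combination of zero-coupon bond prices, it can be evaluated directly from a discount curve. The following sketch assumes unit nominal amount and a list of discount factors p(t, T_0), ..., p(t, T_n); the function names and the flat example curve are hypothetical. The par swap rate shown is simply the value of R that sets the swap value to zero in the formula above.

```python
from math import exp


def swap_value(R, delta, zero_prices):
    """Value p(t,T_0) - sum_i c_i p(t,T_i) with c_i = delta*R (i < n), c_n = 1 + delta*R.

    zero_prices[i] is p(t, T_i) for i = 0, ..., n.
    """
    n = len(zero_prices) - 1
    coupons = [delta * R] * (n - 1) + [1.0 + delta * R]
    return zero_prices[0] - sum(c * p for c, p in zip(coupons, zero_prices[1:]))


def par_swap_rate(delta, zero_prices):
    """Fixed rate R for which the swap value is zero."""
    return (zero_prices[0] - zero_prices[-1]) / (delta * sum(zero_prices[1:]))


if __name__ == "__main__":
    # Hypothetical flat 3% curve, semi-annual payments, T_0 = 1 and n = 4.
    delta = 0.5
    dates = [1.0 + delta * i for i in range(5)]
    prices = [exp(-0.03 * T) for T in dates]
    R = par_swap_rate(delta, prices)
    print(R, swap_value(R, delta, prices))
```

Evaluating `swap_value` at the par rate should return a value that is zero up to rounding error.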
5.4.3 Caps
An interest cap is a contract where the seller of the contract promises to pay a certain amount of
cash to the holder of the contract if the interest rate exceeds a certain predetermined level (the
cap rate) at some future date. A cap can be broken down into a series of caplets. A caplet is a
contract written at t and in force over $[T_0, T_1]$, with $\delta = T_1 - T_0$, nominal amount K and cap rate
R. The relevant interest rate (LIBOR, for instance) is observed at $T_0$ and defined
by
$$p(T_0, T_1) = \frac{1}{1 + \delta L(T_0, T_0)}.$$
A caplet C is a $T_1$-contingent claim with payoff $X = K\delta\,(L(T_0, T_0) - R)^+$. Writing $L = L(T_0, T_0)$,
$p = p(T_0, T_1)$, $R^* = 1 + \delta R$, we have $L = (1 - p)/(\delta p)$ and (assuming K = 1)
$$X = \delta\,(L - R)^+ = \delta\left(\frac{1 - p}{\delta p} - R\right)^+ = \left(\frac{1}{p} - (1 + \delta R)\right)^+ = \left(\frac{1}{p} - R^*\right)^+.$$
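The risk-neutral pricing formula leads to
$$\begin{aligned}
\Pi_C(t) &= \mathbb{E}_{\mathbb{Q}}\left[\left. e^{-\int_t^{T_1} r(s)\,ds} \left(\frac{1}{p} - R^*\right)^+ \right|\,\mathcal{F}_t\right] \\
&= \mathbb{E}_{\mathbb{Q}}\left[\left. \mathbb{E}_{\mathbb{Q}}\left[\left. e^{-\int_{T_0}^{T_1} r(s)\,ds}\,\right|\,\mathcal{F}_{T_0}\right] e^{-\int_t^{T_0} r(s)\,ds} \left(\frac{1}{p} - R^*\right)^+ \right|\,\mathcal{F}_t\right] \\
&= \mathbb{E}_{\mathbb{Q}}\left[\left. p(T_0, T_1)\,e^{-\int_t^{T_0} r(s)\,ds} \left(\frac{1}{p} - R^*\right)^+ \right|\,\mathcal{F}_t\right] \\
&= \mathbb{E}_{\mathbb{Q}}\left[\left. e^{-\int_t^{T_0} r(s)\,ds}\,(1 - pR^*)^+ \right|\,\mathcal{F}_t\right] \\
&= R^*\,\mathbb{E}_{\mathbb{Q}}\left[\left. e^{-\int_t^{T_0} r(s)\,ds} \left(\frac{1}{R^*} - p\right)^+ \right|\,\mathcal{F}_t\right].
\end{aligned}$$
So a caplet is equivalent to $R^*$ put options on a $T_1$-bond with maturity $T_0$ and strike $1/R^*$.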
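The key algebraic step in this derivation is that the caplet payoff, paid at $T_1$ and discounted back to $T_0$ with the bond price $p = p(T_0, T_1)$, coincides with the payoff at $T_0$ of $R^*$ puts with strike $1/R^*$. The short check below verifies this identity numerically; the function names and the grid of bond prices are assumptions made for illustration, with nominal amount K = 1.

```python
def caplet_payoff_at_T1(p, delta, R):
    """delta * (L - R)^+ paid at T_1, where L = (1 - p) / (delta * p)."""
    L = (1.0 - p) / (delta * p)
    return delta * max(L - R, 0.0)


def put_payoff_at_T0(p, delta, R):
    """R* (1/R* - p)^+ at T_0, with R* = 1 + delta * R."""
    R_star = 1.0 + delta * R
    return R_star * max(1.0 / R_star - p, 0.0)


def max_identity_gap(delta, R, bond_prices):
    """Largest difference between p * caplet payoff and the put payoff over a grid."""
    return max(
        abs(p * caplet_payoff_at_T1(p, delta, R) - put_payoff_at_T0(p, delta, R))
        for p in bond_prices
    )


if __name__ == "__main__":
    grid = [0.90 + 0.005 * k for k in range(21)]  # hypothetical values of p(T_0, T_1)
    print(max_identity_gap(delta=0.5, R=0.04, bond_prices=grid))
```

The gap is zero up to floating-point rounding, both when the caplet ends in the money (small p) and when it expires worthless (p above 1/R*).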
Appendix A
Basic Probability Background
A.1 Fundamentals
To describe a random experiment we use a sample space Ω, the set of all possible outcomes. Each
point ω of Ω, or sample point, represents a possible random outcome of performing the random
experiment.
Examples. Write down Ω for experiments such as ﬂip a coin three times, roll two dice.
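For a set $A \subseteq \Omega$ we want to know the probability $\mathbb{P}(A)$. The class $\mathcal{F}$ of subsets of $\Omega$ whose probabilities $\mathbb{P}(A)$ are defined (call such $A$ events) should be a σ-algebra, i.e.
(i) $\emptyset, \Omega \in \mathcal{F}$.
(ii) $F \in \mathcal{F}$ implies $F^c \in \mathcal{F}$.
(iii) $F_1, F_2, \ldots \in \mathcal{F}$ implies $\bigcup_n F_n \in \mathcal{F}$.
We want a probability measure $\mathbb{P}$ defined on $\mathcal{F}$ with
(i) $\mathbb{P}(\emptyset) = 0$, $\mathbb{P}(\Omega) = 1$,
(ii) $\mathbb{P}(A) \ge 0$ for all $A$,
(iii) If $A_1, A_2, \ldots$ are disjoint, $\mathbb{P}\left(\bigcup_i A_i\right) = \sum_i \mathbb{P}(A_i)$ (countable additivity).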
Definition A.1.1. A probability space, or Kolmogorov triple, is a triple $(\Omega, \mathcal{F}, \mathbb{P})$ satisfying Kolmogorov axioms (i), (ii) and (iii) above.
A probability space is a mathematical model of a random experiment.
Examples. Assign probabilities for the above experiments.
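For the two experiments mentioned above, one possible answer can be sketched in code. The block enumerates Ω explicitly and assigns the uniform probability $\mathbb{P}(A) = |A|/|\Omega|$; the uniform assignment (fair coin, fair dice) is an assumption, and the helper name `prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample spaces: three coin flips and two dice rolls.
coin_omega = list(product("HT", repeat=3))
dice_omega = list(product(range(1, 7), repeat=2))


def prob(event, omega):
    """Uniform probability of the event {w in omega : event(w) is True}."""
    favourable = sum(1 for w in omega if event(w))
    return Fraction(favourable, len(omega))


# Two example events: exactly two heads; the two dice sum to seven.
p_two_heads = prob(lambda w: w.count("H") == 2, coin_omega)
p_sum_seven = prob(lambda w: w[0] + w[1] == 7, dice_omega)

if __name__ == "__main__":
    print(len(coin_omega), len(dice_omega), p_two_heads, p_sum_seven)
```

Since Ω is finite here, every subset can be taken as an event, so the σ-algebra is simply the power set.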
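Definition A.1.2. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable (vector) $X$ is a function $X : \Omega \to \mathbb{R}$ ($\mathbb{R}^k$) such that $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}$ for all Borel sets $B \in \mathcal{B}$ ($\mathcal{B}(\mathbb{R}^k)$).
For a random variable $X$,
$$\{\omega \in \Omega : X(\omega) \le x\} \in \mathcal{F}$$
for all $x \in \mathbb{R}$. So define the distribution function $F_X$ of $X$ by
$$F_X(x) := \mathbb{P}(\{\omega : X(\omega) \le x\}).$$
Recall: $\sigma(X)$, the σ-algebra generated by $X$.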
Some important probability distributions
• Binomial distribution: Number of successes
$$\mathbb{P}(S_n = k) = \binom{n}{k} p^k (1 - p)^{n-k}.$$
• Geometric distribution: Waiting time
$$\mathbb{P}(N = n) = p(1 - p)^{n-1}.$$
• Poisson distribution:
$$\mathbb{P}(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}.$$
• Density of Uniform distribution:
$$f(x) = \frac{1}{b - a}\, \mathbf{1}_{(a,b)}(x).$$
• Density of Exponential distribution:
$$f(x) = \lambda e^{-\lambda x}\, \mathbf{1}_{[0,\infty)}(x).$$
• Density of standard Normal distribution:
$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$
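Definition A.1.3. The expectation $\mathbb{E}$ of a random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is defined by
$$\mathbb{E}X := \int_\Omega X\, d\mathbb{P}, \quad \text{or} \quad \int_\Omega X(\omega)\, d\mathbb{P}(\omega).$$
The variance of a random variable is defined as
$$\mathrm{Var}(X) := \mathbb{E}\left[(X - \mathbb{E}(X))^2\right] = \mathbb{E}\left[X^2\right] - (\mathbb{E}X)^2.$$
If $X$ is real-valued with density $f$ (i.e. $f(x) \ge 0$, $\int f(x)\,dx = 1$),
$$\mathbb{E}X := \int x f(x)\,dx,$$
or if $X$ is discrete, taking values $x_n$ ($n = 1, 2, \ldots$) with probability function $f(x_n)$ ($\ge 0$),
$$\mathbb{E}X := \sum_n x_n f(x_n).$$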
Examples. Calculate moments for some of the above distributions.
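As one way to check hand calculations for the discrete cases, the sketch below evaluates $\mathbb{E}X = \sum_n x_n f(x_n)$ and $\mathrm{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}X)^2$ directly from the probability functions. The parameter values and the truncation of the Poisson sum at 100 terms are illustrative assumptions.

```python
import math


def binomial_pmf(k, n, p):
    """P(S_n = k) for the binomial distribution."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for the Poisson distribution."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def mean_and_variance(values, pmf):
    """Return (E X, Var X) using E X = sum x f(x) and Var X = E X^2 - (E X)^2."""
    mean = sum(x * pmf(x) for x in values)
    second_moment = sum(x * x * pmf(x) for x in values)
    return mean, second_moment - mean**2


n, p, lam = 10, 0.3, 2.5
bin_mean, bin_var = mean_and_variance(range(n + 1), lambda k: binomial_pmf(k, n, p))
# The Poisson sum is infinite; truncating at 100 terms leaves a negligible tail here.
poi_mean, poi_var = mean_and_variance(range(100), lambda k: poisson_pmf(k, lam))

if __name__ == "__main__":
    print(bin_mean, bin_var, poi_mean, poi_var)
```

The results can be compared with the closed forms $np$, $np(1-p)$ for the binomial and $\lambda$, $\lambda$ for the Poisson distribution.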
Definition A.1.4. Random variables $X_1, \ldots, X_n$ are independent if whenever $A_i \in \mathcal{B}$ (the Borel σ-algebra) for $i = 1, \ldots, n$ we have
$$\mathbb{P}\left(\bigcap_{i=1}^n \{X_i \in A_i\}\right) = \prod_{i=1}^n \mathbb{P}(\{X_i \in A_i\}).$$
Lemma A.1.1. In order for $X_1, \ldots, X_n$ to be independent it is necessary and sufficient that for all $x_1, \ldots, x_n \in (-\infty, \infty]$,
$$\mathbb{P}\left(\bigcap_{i=1}^n \{X_i \le x_i\}\right) = \prod_{i=1}^n \mathbb{P}(\{X_i \le x_i\}).$$
Theorem A.1.1 (Multiplication Theorem). If $X_1, \ldots, X_n$ are independent and $\mathbb{E}|X_i| < \infty$, $i = 1, \ldots, n$, then
$$\mathbb{E}\left(\prod_{i=1}^n X_i\right) = \prod_{i=1}^n \mathbb{E}(X_i).$$
If $X$, $Y$ are independent, with distribution functions $F$, $G$, define $Z := X + Y$ with distribution function $H$. We call $H$ the convolution of $F$ and $G$, written $H = F * G$.
Suppose $X$, $Y$ have densities $f$, $g$; then $Z$ has a density $h$ with
$$h(z) = \int_{-\infty}^{\infty} f(z - y) g(y)\,dy = \int_{-\infty}^{\infty} f(x) g(z - x)\,dx.$$
Example. Assume $t_1, \ldots, t_n$ are independent random variables that have an exponential distribution with parameter $\lambda$. Then $T = t_1 + \ldots + t_n$ has the Gamma$(n, \lambda)$ density function
$$f(x) = \frac{\lambda^n x^{n-1}}{(n-1)!}\, e^{-\lambda x}.$$
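The Gamma example can be checked numerically for small $n$. The sketch below approximates the convolution integral with a midpoint rule (a choice made here for illustration, as are the function names and the value of λ) and compares it with the Gamma density for $n = 2$ and $n = 3$.

```python
import math


def exp_density(x, lam):
    """Exponential density lam * exp(-lam * x) on [0, inf)."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def gamma_density(x, n, lam):
    """Gamma(n, lam) density lam^n x^(n-1) exp(-lam x) / (n-1)! on [0, inf)."""
    if x < 0:
        return 0.0
    return lam**n * x ** (n - 1) / math.factorial(n - 1) * math.exp(-lam * x)


def convolve_at(z, f, g, steps=2000):
    """Midpoint-rule value of int_0^z f(x) g(z - x) dx for densities supported on [0, inf)."""
    if z <= 0:
        return 0.0
    h = z / steps
    return h * sum(f((i + 0.5) * h) * g(z - (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    lam = 1.5
    f = lambda x: exp_density(x, lam)
    for z in (0.5, 1.0, 3.0):
        print(z, convolve_at(z, f, f), gamma_density(z, 2, lam))
```

Convolving the result once more with the exponential density gives the Gamma$(3, \lambda)$ density, and so on by induction.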
Definition A.1.5. If $X$ is a random variable with distribution function $F$, its moment generating function $\phi_X$ is
$$\phi(t) := \mathbb{E}(e^{tX}) = \int_{-\infty}^{\infty} e^{tx}\,dF(x).$$
The mgf takes convolution into multiplication: if $X$, $Y$ are independent,
$$\phi_{X+Y}(t) = \phi_X(t)\,\phi_Y(t).$$
Observe $\phi^{(k)}(t) = \mathbb{E}(X^k e^{tX})$ and $\phi^{(k)}(0) = \mathbb{E}(X^k)$.
For $X$ on the nonnegative integers use the generating function
$$\gamma_X(z) = \mathbb{E}(z^X) = \sum_{k=0}^{\infty} z^k\, \mathbb{P}(X = k).$$
A.2 Convolution and Characteristic Functions
The most basic operation on numbers is addition; the most basic operation on random variables
is addition of independent random variables. If X, Y are independent, with distribution functions
F, G, and
Z := X +Y,
let Z have distribution function H. Then since X + Y = Y + X (addition is commutative), H
depends on F and G symmetrically. We call H the convolution (German: Faltung) of F and G,
written
H = F ∗ G.
Suppose first that $X$, $Y$ have densities $f$, $g$. Then
$$H(z) = \mathbb{P}(Z \le z) = \mathbb{P}(X + Y \le z) = \iint_{\{(x,y):\, x+y \le z\}} f(x) g(y)\,dx\,dy,$$
since by independence of $X$ and $Y$ the joint density of $X$ and $Y$ is the product $f(x)g(y)$ of their separate (marginal) densities, and to find probabilities in the density case we integrate the joint density over the relevant region. Thus
$$H(z) = \int_{-\infty}^{\infty} f(x) \left\{ \int_{-\infty}^{z-x} g(y)\,dy \right\} dx = \int_{-\infty}^{\infty} f(x)\, G(z - x)\,dx.$$
If
$$h(z) := \int_{-\infty}^{\infty} f(x) g(z - x)\,dx$$
(and of course symmetrically with $f$ and $g$ interchanged), then integrating we recover the equation above (after interchanging the order of integration; this is legitimate, as the integrals are non-negative, by Fubini's theorem, which we quote from measure theory, see e.g. (Williams 1991), §8.2). This shows that if $X$, $Y$ are independent with densities $f$, $g$, and $Z = X + Y$, then $Z$ has density $h$, where
$$h(x) = \int_{-\infty}^{\infty} f(x - y) g(y)\,dy.$$
We write
$$h = f * g,$$
and call the density $h$ the convolution of the densities $f$ and $g$.
If $X$, $Y$ do not have densities, the argument above may still be taken as far as
$$H(z) = \mathbb{P}(Z \le z) = \mathbb{P}(X + Y \le z) = \int_{-\infty}^{\infty} F(z - y)\,dG(y)$$
(and, again, symmetrically with $F$ and $G$ interchanged), where the integral on the right is the Lebesgue-Stieltjes integral of §2.2. We again write
$$H = F * G,$$
and call the distribution function $H$ the convolution of the distribution functions $F$ and $G$.
In sum: addition of independent random variables corresponds to convolution of distribution
functions or densities.
Now we frequently need to add (or average) lots of independent random variables: for example, when forming sample means in statistics, where the bigger the sample size is, the better. But convolution involves integration, so adding $n$ independent random variables involves $n - 1$ integrations, and this is awkward to do for large $n$. One thus seeks a way to transform distributions so as to make the awkward operation of convolution as easy to handle as the operation of addition of independent random variables that gives rise to it.
Definition A.2.1. If $X$ is a random variable with distribution function $F$, its characteristic function $\phi$ (or $\phi_X$ if we need to emphasise $X$) is
$$\phi(t) := \mathbb{E}(e^{itX}) = \int_{-\infty}^{\infty} e^{itx}\,dF(x), \quad (t \in \mathbb{R}).$$
Note.
Here $i := \sqrt{-1}$. All other numbers ($t$, $x$ etc.) are real; all expressions involving $i$, such as $e^{itx}$ and $\phi(t) = \mathbb{E}(e^{itX})$, are complex numbers.
The characteristic function takes convolution into multiplication: if $X$, $Y$ are independent,
$$\phi_{X+Y}(t) = \phi_X(t)\,\phi_Y(t).$$
For, as $X$, $Y$ are independent, so are $e^{itX}$ and $e^{itY}$ for any $t$, so by the multiplication theorem (Theorem B.3.1),
$$\mathbb{E}(e^{it(X+Y)}) = \mathbb{E}(e^{itX} e^{itY}) = \mathbb{E}(e^{itX})\, \mathbb{E}(e^{itY}),$$
as required.
We list some properties of characteristic functions that we shall need.
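1. $\phi(0) = 1$. For, $\phi(0) = \mathbb{E}(e^{i \cdot 0 \cdot X}) = \mathbb{E}(e^0) = \mathbb{E}(1) = 1$.
2. $|\phi(t)| \le 1$ for all $t \in \mathbb{R}$.
Proof.
$$|\phi(t)| = \left| \int_{-\infty}^{\infty} e^{itx}\,dF(x) \right| \le \int_{-\infty}^{\infty} \left| e^{itx} \right| dF(x) = \int_{-\infty}^{\infty} 1\,dF(x) = 1.$$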
Thus in particular the characteristic function always exists (the integral deﬁning it is always
absolutely convergent). This is a crucial advantage, far outweighing the disadvantage of having to
work with complex rather than real numbers (the nuisance value of which is in fact slight).
3. $\phi$ is continuous (indeed, $\phi$ is uniformly continuous).
Proof.
$$|\phi(t + u) - \phi(t)| = \left| \int_{-\infty}^{\infty} \{e^{i(t+u)x} - e^{itx}\}\,dF(x) \right| = \left| \int_{-\infty}^{\infty} e^{itx}(e^{iux} - 1)\,dF(x) \right| \le \int_{-\infty}^{\infty} \left| e^{iux} - 1 \right| dF(x),$$
for all $t$. Now as $u \to 0$, $\left| e^{iux} - 1 \right| \to 0$, and $\left| e^{iux} - 1 \right| \le 2$. The bound on the right tends to zero as $u \to 0$ by Lebesgue's dominated convergence theorem (which we quote from measure theory: see e.g. (Williams 1991), §5.9), giving continuity; the uniformity follows as the bound holds uniformly in $t$.
4. (Uniqueness theorem): $\phi$ determines the distribution function $F$ uniquely.
Technically, $\phi$ is the Fourier-Stieltjes transform of $F$, and here we are quoting the uniqueness property of this transform. Were uniqueness not to hold, we would lose information on taking characteristic functions, and so $\phi$ would not be useful.
5. (Continuity theorem): If $X_n$, $X$ are random variables with distribution functions $F_n$, $F$ and characteristic functions $\phi_n$, $\phi$, then convergence of $\phi_n$ to $\phi$,
$$\phi_n(t) \to \phi(t) \quad (n \to \infty) \quad \text{for all } t \in \mathbb{R},$$
is equivalent to convergence in distribution of $X_n$ to $X$. This result is due to Lévy; see e.g. (Williams 1991), §18.1.
6. Moments. Suppose $X$ has $k$th moment: $\mathbb{E}|X|^k < \infty$. Take the Taylor (power-series) expansion of $e^{itx}$ as far as the $k$th power term:
$$e^{itx} = 1 + itx + \cdots + (itx)^k/k! + o\left(t^k\right),$$
where ‘$o\left(t^k\right)$’ denotes an error term of smaller order than $t^k$ for small $t$. Now replace $x$ by $X$, and take expectations. By linearity, we obtain
$$\phi(t) = \mathbb{E}(e^{itX}) = 1 + it\,\mathbb{E}X + \cdots + \frac{(it)^k}{k!}\, \mathbb{E}(X^k) + e(t),$$
where the error term $e(t)$ is the expectation of the error terms (now random, as $X$ is random) obtained above (one for each value $X(\omega)$ of $X$). It is not obvious, but it is true, that $e(t)$ is still of smaller order than $t^k$ for $t \to 0$:
$$\text{if } \mathbb{E}\left(|X|^k\right) < \infty, \quad \phi(t) = 1 + it\,\mathbb{E}(X) + \ldots + \frac{(it)^k}{k!}\, \mathbb{E}\left(X^k\right) + o\left(t^k\right) \quad (t \to 0).$$
We shall need the case $k = 2$ in dealing with the central limit theorem below.
Examples
1. Standard Normal Distribution, $N(0, 1)$. For the standard normal density $f(x) = \frac{1}{\sqrt{2\pi}} \exp\{-\frac{1}{2}x^2\}$, one has, by the process of ‘completing the square’ (familiar from when one first learns to solve quadratic equations!),
$$
\begin{aligned}
\int_{-\infty}^{\infty} e^{tx} f(x)\,dx &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left\{tx - \tfrac{1}{2}x^2\right\} dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left\{-\tfrac{1}{2}(x - t)^2 + \tfrac{1}{2}t^2\right\} dx \\
&= \exp\left\{\tfrac{1}{2}t^2\right\} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp\left\{-\tfrac{1}{2}(x - t)^2\right\} dx.
\end{aligned}
$$
The second factor on the right is 1 (it has the form of a normal integral). This gives the integral on the left as $\exp\{\frac{1}{2}t^2\}$.
Now replace $t$ by $it$ (legitimate by analytic continuation, which we quote from complex analysis, see e.g. (Burkill and Burkill 1970)). The right becomes $\exp\{-\frac{1}{2}t^2\}$. The integral on the left becomes the characteristic function of the standard normal density, which we have thus now identified (and will need below in §2.8).
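The identification $\phi(t) = \exp\{-\frac{1}{2}t^2\}$ can be checked numerically without any appeal to analytic continuation. The sketch below integrates $e^{itx} f(x)$ with a midpoint rule on a truncated range; the truncation to $[-12, 12]$, the step count and the function names are assumptions made for this illustration.

```python
import cmath
import math


def normal_density(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def char_fn_numeric(t, lo=-12.0, hi=12.0, steps=24000):
    """Midpoint-rule approximation of the integral of exp(itx) f(x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0j
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += cmath.exp(1j * t * x) * normal_density(x)
    return total * h


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, char_fn_numeric(t), math.exp(-0.5 * t * t))
```

The imaginary part of the numerical value is essentially zero, reflecting the symmetry of the density.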
2. General Normal Distribution, $N(\mu, \sigma^2)$. Consider the transformation $x \mapsto \mu + \sigma x$. Applied to a random variable $X$, this adds $\mu$ to the mean (a change of location), and multiplies the variance by $\sigma^2$ (a change of scale). One can check that if $X$ has the standard normal density above, then $\mu + \sigma X$ has density
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\tfrac{1}{2}(x - \mu)^2/\sigma^2\right\},$$
and characteristic function
$$\mathbb{E}e^{it(\mu + \sigma X)} = \exp\{i\mu t\}\, \mathbb{E}\left(e^{(i\sigma t)X}\right) = \exp\{i\mu t\} \exp\left\{-\tfrac{1}{2}(\sigma t)^2\right\} = \exp\left\{i\mu t - \tfrac{1}{2}\sigma^2 t^2\right\}.$$
Thus the general normal density and its characteristic function are
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\tfrac{1}{2}(x - \mu)^2/\sigma^2\right\}, \quad \phi(t) = \exp\left\{i\mu t - \tfrac{1}{2}\sigma^2 t^2\right\}.$$
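3. Poisson Distribution, $P(\lambda)$. Here, the probability mass function is
$$f(k) := \mathbb{P}(X = k) = e^{-\lambda} \lambda^k / k!, \quad (k = 0, 1, 2, \ldots).$$
The characteristic function is thus
$$\phi(t) = \mathbb{E}\left(e^{itX}\right) = \sum_{k=0}^{\infty} \frac{e^{-\lambda} \lambda^k}{k!}\, e^{itk} = e^{-\lambda} \sum_{k=0}^{\infty} (\lambda e^{it})^k / k! = e^{-\lambda} \exp\{\lambda e^{it}\} = \exp\{-\lambda(1 - e^{it})\}.$$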
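The series manipulation above is easy to confirm numerically. The following sketch sums the defining series term by term and compares it with the closed form; the truncation at 200 terms and the function names are assumptions for illustration.

```python
import cmath
import math


def poisson_char_fn_series(t, lam, terms=200):
    """Partial sum of sum_k exp(-lam) lam^k / k! * exp(itk)."""
    total = 0j
    weight = math.exp(-lam)  # P(X = 0)
    for k in range(terms):
        total += weight * cmath.exp(1j * t * k)
        weight *= lam / (k + 1)  # P(X = k + 1) from P(X = k)
    return total


def poisson_char_fn_closed(t, lam):
    """Closed form exp(-lam (1 - exp(it)))."""
    return cmath.exp(-lam * (1 - cmath.exp(1j * t)))


if __name__ == "__main__":
    for t in (0.0, 0.7, 2.0):
        print(t, poisson_char_fn_series(t, 3.0), poisson_char_fn_closed(t, 3.0))
```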
A.3 The Central Limit Theorem
Readers of this book will be well aware that
$$\left(1 + \frac{x}{n}\right)^n \to e^x \quad (n \to \infty) \quad \forall x \in \mathbb{R}.$$
This is the formula governing the passage from discrete to continuous compound interest. Invest one pound (or dollar) for one year at 100x% p.a.; with interest compounded $n$ times p.a., our capital after one year is $(1 + \frac{x}{n})^n$. With continuous compounding, our capital after one year is the exponential $e^x$: exponential growth corresponds to continuously compounded interest.
We need two extensions: the formula still holds with $x \in \mathbb{R}$ replaced by a complex number $z \in \mathbb{C}$:
$$\left(1 + \frac{z}{n}\right)^n \to e^z \quad (n \to \infty) \quad \forall z \in \mathbb{C},$$
and if $z_n \in \mathbb{C}$, $z_n \to z$,
$$\left(1 + \frac{z_n}{n}\right)^n \to e^z \quad (n \to \infty) \quad (z_n \to z \in \mathbb{C}).$$
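A small numerical illustration of these limits, for a real rate and for a complex argument, is sketched below. The 5% rate, the compounding frequencies and the function name `compounded` are hypothetical choices.

```python
import cmath
import math


def compounded(z, n):
    """Capital after one year at rate z compounded n times: (1 + z/n)^n."""
    return (1 + z / n) ** n


if __name__ == "__main__":
    rate = 0.05
    # Yearly, monthly, daily compounding versus continuous compounding.
    for n in (1, 12, 365, 10**6):
        print(n, compounded(rate, n))
    print("continuous", math.exp(rate))
    # Complex extension: (1 + i*pi/n)^n approaches exp(i*pi) = -1.
    print(compounded(1j * math.pi, 10**6), cmath.exp(1j * math.pi))
```

The gap between $n$-fold and continuous compounding shrinks roughly like $1/n$.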
As a ﬁrst illustration of the power of transform methods, we prove the weak law of large
numbers:
Theorem A.3.1 (Weak Law of Large Numbers). If $X_1, X_2, \ldots$ are independent and identically distributed with mean $\mu$, then
$$\frac{1}{n} \sum_{i=1}^n X_i \to \mu \quad (n \to \infty) \quad \text{in probability}.$$
Proof. If the $X_i$ have characteristic function $\phi$, then by the moment property of §2.8 with $k = 1$,
$$\phi(t) = 1 + i\mu t + o(t) \quad (t \to 0).$$
Now using the i.i.d. assumption, $\frac{1}{n} \sum_{i=1}^n X_i$ has characteristic function
$$
\begin{aligned}
\mathbb{E}\left[\exp\left\{it\, \frac{1}{n} \sum_{j=1}^n X_j\right\}\right] &= \mathbb{E}\left[\prod_{j=1}^n \exp\left\{it\, \frac{1}{n} X_j\right\}\right] = \prod_{j=1}^n \mathbb{E}\left[\exp\left\{\frac{it}{n} X_j\right\}\right] \\
&= (\phi(t/n))^n = \left(1 + \frac{i\mu t}{n} + o(1/n)\right)^n \to e^{i\mu t} \quad (n \to \infty),
\end{aligned}
$$
and $e^{i\mu t}$ is the characteristic function of the constant $\mu$ (for fixed $t$, $o(1/n)$ is an error term of smaller order than $1/n$ as $n \to \infty$). By the continuity theorem,
$$\frac{1}{n} \sum_{i=1}^n X_i \to \mu \quad \text{in distribution},$$
and as $\mu$ is constant, this says (see §2.6) that
$$\frac{1}{n} \sum_{i=1}^n X_i \to \mu \quad \text{in probability}.$$
The main result of this section is the same argument carried one stage further.
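Theorem A.3.2 (Central Limit Theorem). If $X_1, X_2, \ldots$ are independent and identically distributed with mean $\mu$ and variance $\sigma^2$, then with $N(0, 1)$ the standard normal distribution,
$$\frac{\sqrt{n}}{\sigma}\, \frac{1}{n} \sum_{i=1}^n (X_i - \mu) = \frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - \mu)/\sigma \to N(0, 1) \quad (n \to \infty) \quad \text{in distribution}.$$
That is, for all $x \in \mathbb{R}$,
$$\mathbb{P}\left(\frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - \mu)/\sigma \le x\right) \to \Phi(x) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}y^2}\,dy \quad (n \to \infty).$$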
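Proof. We first centre at the mean. If $X_i$ has characteristic function $\phi$, let $X_i - \mu$ have characteristic function $\phi_0$. Since $X_i - \mu$ has mean 0 and second moment $\sigma^2 = \mathrm{Var}(X_i) = \mathbb{E}[(X_i - \mathbb{E}X_i)^2] = \mathbb{E}[(X_i - \mu)^2]$, the case $k = 2$ of the moment property of §2.7 gives
$$\phi_0(t) = 1 - \tfrac{1}{2}\sigma^2 t^2 + o\left(t^2\right) \quad (t \to 0).$$
Now $\sqrt{n}\left(\frac{1}{n} \sum_{i=1}^n X_i - \mu\right)/\sigma$ has characteristic function
$$
\begin{aligned}
\mathbb{E}\left[\exp\left\{it\, \frac{\sqrt{n}}{\sigma} \left(\frac{1}{n} \sum_{j=1}^n X_j - \mu\right)\right\}\right] &= \mathbb{E}\left[\prod_{j=1}^n \exp\left\{\frac{it(X_j - \mu)}{\sigma\sqrt{n}}\right\}\right] = \prod_{j=1}^n \mathbb{E}\left[\exp\left\{\frac{it}{\sigma\sqrt{n}}(X_j - \mu)\right\}\right] \\
&= \left(\phi_0\left(\frac{t}{\sigma\sqrt{n}}\right)\right)^n = \left(1 - \frac{\frac{1}{2}\sigma^2 t^2}{\sigma^2 n} + o\left(\frac{1}{n}\right)\right)^n \to e^{-\frac{1}{2}t^2} \quad (n \to \infty),
\end{aligned}
$$
and $e^{-\frac{1}{2}t^2}$ is the characteristic function of the standard normal distribution $N(0, 1)$. The result follows by the continuity theorem.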
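The convergence of characteristic functions used in the proof can be watched directly in a concrete case. The sketch below takes Bernoulli($p$) summands (an assumption made for illustration, as are the function names), for which $\phi_0(t) = q e^{-itp} + p e^{itq}$ with $q = 1 - p$, and compares $(\phi_0(t/(\sigma\sqrt{n})))^n$ with $e^{-t^2/2}$.

```python
import cmath
import math


def centred_bernoulli_char_fn(t, p):
    """phi_0(t) = E[exp(it(X - p))] for X ~ Bernoulli(p)."""
    q = 1.0 - p
    return q * cmath.exp(-1j * t * p) + p * cmath.exp(1j * t * q)


def standardised_sum_char_fn(t, p, n):
    """Characteristic function (phi_0(t / (sigma sqrt(n))))^n of the standardised sum."""
    sigma = math.sqrt(p * (1.0 - p))
    return centred_bernoulli_char_fn(t / (sigma * math.sqrt(n)), p) ** n


def normal_char_fn(t):
    """Characteristic function exp(-t^2 / 2) of N(0, 1)."""
    return math.exp(-0.5 * t * t)


t, p = 1.0, 0.3
errors = [abs(standardised_sum_char_fn(t, p, n) - normal_char_fn(t)) for n in (10, 1000, 100000)]

if __name__ == "__main__":
    print(errors)
```

The error shrinks as $n$ grows, in line with the $o(1/n)$ term in the expansion of $\phi_0$.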
Note.
In Theorem A.3.2, we:
(i) centre the $X_i$ by subtracting the mean (to get mean 0);
(ii) scale the resulting $X_i - \mu$ by dividing by the standard deviation $\sigma$ (to get variance 1). Then if $Y_i := (X_i - \mu)/\sigma$ are the resulting standardised variables, $\frac{1}{\sqrt{n}} \sum_{i=1}^n Y_i$ converges in distribution to standard normal.
Example: the Binomial Case.
If each $X_i$ is Bernoulli distributed with parameter $p \in (0, 1)$,
$$\mathbb{P}(X_i = 1) = p, \quad \mathbb{P}(X_i = 0) = q := 1 - p$$
(so $X_i$ has mean $p$ and variance $pq$), then $S_n := \sum_{i=1}^n X_i$ is binomially distributed with parameters $n$ and $p$:
$$\mathbb{P}\left(\sum_{i=1}^n X_i = k\right) = \binom{n}{k} p^k q^{n-k} = \frac{n!}{(n - k)!\,k!}\, p^k q^{n-k}.$$
A direct attack on the distribution of $\frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - p)/\sqrt{pq}$ can be made via
$$\mathbb{P}\left(a \le \frac{\sum_{i=1}^n X_i - np}{\sqrt{npq}} \le b\right) = \sum_{k:\; np + a\sqrt{npq} \,\le\, k \,\le\, np + b\sqrt{npq}} \frac{n!}{(n - k)!\,k!}\, p^k q^{n-k}.$$
Since $n$, $k$ and $n - k$ will all be large here, one needs an approximation to the factorials. The required result is Stirling's formula of 1730:
$$n! \sim \sqrt{2\pi}\, e^{-n} n^{n + \frac{1}{2}} \quad (n \to \infty)$$
(the symbol $\sim$ indicates that the ratio of the two sides tends to 1). The argument can be carried through to obtain the sum on the right as a Riemann sum (in the sense of the Riemann integral: §2.2) for $\int_a^b \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}\,dx$, whence the result. This, the earliest form of the central limit theorem, is the de Moivre-Laplace limit theorem (Abraham de Moivre, 1667–1754; P.S. de Laplace, 1749–1827). The proof of the de Moivre-Laplace limit theorem sketched above is closely analogous to the passage from the discrete to the continuous Black-Scholes formula: see §4.6 and §6.4.
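To see the de Moivre-Laplace approximation at work, the sketch below computes the exact binomial probability that the standardised sum lies in $[a, b]$ and compares it with $\Phi(b) - \Phi(a)$. Evaluating the binomial terms through `math.lgamma` (to avoid overflowing factorials) is a choice made here, and the function names and parameters are hypothetical.

```python
import math


def std_normal_cdf(x):
    """Phi(x), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def log_binomial_pmf(k, n, p):
    """log P(S_n = k), computed with log-gamma to stay within floating-point range."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def binomial_window_prob(n, p, a, b):
    """Exact P(a <= (S_n - np) / sqrt(npq) <= b) for S_n ~ B(n, p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    lo = max(math.ceil(mean + a * sd), 0)
    hi = min(math.floor(mean + b * sd), n)
    return sum(math.exp(log_binomial_pmf(k, n, p)) for k in range(lo, hi + 1))


if __name__ == "__main__":
    for n in (20, 200, 2000):
        exact = binomial_window_prob(n, 0.3, -1.0, 1.0)
        print(n, exact, std_normal_cdf(1.0) - std_normal_cdf(-1.0))
```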
Local Limit Theorems.
The central limit theorem as proved above is a global limit theorem: it relates to distributions and convergence thereof. The de Moivre-Laplace limit theorem above, however, deals directly with individual probabilities in the discrete case (the sum of a large number of which is shown to approximate an integral). A limit theorem dealing with densities and convergence thereof in the density case, or with the discrete analogues of densities (such as the individual probabilities $\mathbb{P}(S_n = k)$ in the binomial case above), is called a local limit theorem.
Poisson Limit Theorem.
The de Moivre-Laplace limit theorem (convergence of binomial to normal) is only one possible limiting regime for binomial models. The next most important one has a Poisson limit in place of a normal one.
Suppose we have a sequence of binomial models $B(n, p)$, where the success probability $p = p_n$ varies with $n$, in such a way that
$$np_n \to \lambda > 0, \quad (n \to \infty). \qquad \text{(A.1)}$$
Thus $p_n \to 0$; indeed, $p_n \sim \lambda/n$. This models a situation where we have a large number $n$ of Bernoulli trials, each with small probability $p_n$ of success, but such that $np_n$, the expected total number of successes, is ‘neither large nor small, but intermediate’. Binomial models satisfying condition (A.1) converge to the Poisson model $P(\lambda)$ with parameter $\lambda > 0$.
This result is sometimes called the law of small numbers. The Poisson distribution is widely used to model statistics of accidents, insurance claims and the like, where one has a large number $n$ of individuals at risk, each with a small probability $p_n$ of generating an accident, insurance claim etc. (‘success probability’ seems a strange usage here!).
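The Poisson limit can be illustrated by comparing probability functions directly. In the sketch below, $p_n = \lambda/n$ exactly (the simplest sequence satisfying (A.1)); the function names, the value $\lambda = 2$ and the cut-off `k_max` are assumptions for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(S_n = k) for S_n ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ P(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_pmf_gap(n, lam, k_max=30):
    """Largest gap between the B(n, lam/n) and P(lam) probabilities for k <= k_max."""
    p = lam / n
    top = min(k_max, n)
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(top + 1))


gaps = [max_pmf_gap(n, 2.0) for n in (10, 100, 1000, 10000)]

if __name__ == "__main__":
    print(gaps)
```

The gaps shrink as $n$ grows, which is the local form of the convergence stated above.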
Appendix B
Facts from Probability and
Measure Theory
We will assume that most readers will be familiar with such things from an elementary course in probability and statistics; for a clear introduction see, e.g. Grimmett and Welsh (1986), or the first few chapters of ?; Ross (1997), Resnick (2001), Durrett (1999), Rosenthal (2000) are also useful.
B.1 Measure
The language of modelling financial markets involves that of probability, which in turn involves that of measure theory. This originated with Henri Lebesgue (1875–1941), in his thesis, ‘Intégrale, longueur, aire’ Lebesgue (1902). We begin with defining a measure on $\mathbb{R}$ generalising the intuitive notion of length.
The length $\mu(I)$ of an interval $I = (a, b)$, $[a, b]$, $[a, b)$ or $(a, b]$ should be $b - a$: $\mu(I) = b - a$. The length of the disjoint union $I = \bigcup_{r=1}^n I_r$ of intervals $I_r$ should be the sum of their lengths:
$$\mu\left(\bigcup_{r=1}^n I_r\right) = \sum_{r=1}^n \mu(I_r) \quad \text{(finite additivity)}.$$
Consider now an infinite sequence $I_1, I_2, \ldots$ (ad infinitum) of disjoint intervals. Letting $n$ tend to $\infty$ suggests that length should again be additive over disjoint intervals:
$$\mu\left(\bigcup_{r=1}^{\infty} I_r\right) = \sum_{r=1}^{\infty} \mu(I_r) \quad \text{(countable additivity)}.$$
For $I$ an interval, $A$ a subset of length $\mu(A)$, the length of the complement $I \setminus A := I \cap A^c$ of $A$ in $I$ should be
$$\mu(I \setminus A) = \mu(I) - \mu(A) \quad \text{(complementation)}.$$
If $A \subseteq B$ and $B$ has length $\mu(B) = 0$, then $A$ should have length 0 also:
$$A \subseteq B \text{ and } \mu(B) = 0 \;\Rightarrow\; \mu(A) = 0 \quad \text{(completeness)}.$$
The term ‘countable’ here requires comment. We must distinguish first between finite and infinite sets; then countable sets (like $\mathbb{N} = \{1, 2, 3, \ldots\}$) are the ‘smallest’, or ‘simplest’, infinite sets, as distinct from uncountable sets such as $\mathbb{R} = (-\infty, \infty)$.
Let $\mathcal{F}$ be the smallest class of sets $A \subset \mathbb{R}$ containing the intervals, closed under countable disjoint unions and complements, and complete (containing all subsets of sets of length 0 as sets of length 0). The above suggests (what Lebesgue showed) that length can be sensibly defined on the sets in $\mathcal{F}$ on the line, but on no others. There are others, but they are hard to construct (in technical language: the axiom of choice, or some variant of it such as Zorn's lemma, is needed to demonstrate the existence of non-measurable sets, but all such proofs are highly non-constructive). So: some but not all subsets of the line have a length. These are called the Lebesgue-measurable sets, and form the class $\mathcal{F}$ described above; length, defined on $\mathcal{F}$, is called Lebesgue measure $\mu$ (on the real line, $\mathbb{R}$). Turning now to the general case, we make the above rigorous. Let $\Omega$ be a set.
Definition B.1.1. A collection $\mathcal{A}_0$ of subsets of $\Omega$ is called an algebra on $\Omega$ if:
(i) $\Omega \in \mathcal{A}_0$,
(ii) $A \in \mathcal{A}_0 \Rightarrow A^c = \Omega \setminus A \in \mathcal{A}_0$,
(iii) $A, B \in \mathcal{A}_0 \Rightarrow A \cup B \in \mathcal{A}_0$.
Using this definition and induction, we can show that an algebra on $\Omega$ is a family of subsets of $\Omega$ closed under finitely many set operations.
Definition B.1.2. An algebra $\mathcal{A}$ of subsets of $\Omega$ is called a σ-algebra on $\Omega$ if for any sequence $A_n \in \mathcal{A}$ ($n \in \mathbb{N}$), we have
$$\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}.$$
Such a pair $(\Omega, \mathcal{A})$ is called a measurable space.
Thus a σ-algebra on $\Omega$ is a family of subsets of $\Omega$ closed under any countable collection of set operations.
The main examples of σ-algebras are σ-algebras generated by a class $\mathcal{C}$ of subsets of $\Omega$, i.e. $\sigma(\mathcal{C})$ is the smallest σ-algebra on $\Omega$ containing $\mathcal{C}$.
The Borel σ-algebra $\mathcal{B} = \mathcal{B}(\mathbb{R})$ is the σ-algebra of subsets of $\mathbb{R}$ generated by the open intervals (equivalently, by half-lines such as $(-\infty, x]$ as $x$ varies in $\mathbb{R}$).
As our aim is to define measures on collections of sets we now turn to set functions.
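Definition B.1.3. Let $\Omega$ be a set, $\mathcal{A}_0$ an algebra on $\Omega$ and $\mu_0$ a non-negative set function $\mu_0 : \mathcal{A}_0 \to [0, \infty]$ such that $\mu_0(\emptyset) = 0$. $\mu_0$ is called:
(i) additive, if $A, B \in \mathcal{A}_0$, $A \cap B = \emptyset \Rightarrow \mu_0(A \cup B) = \mu_0(A) + \mu_0(B)$,
(ii) countably additive, if whenever $(A_n)_{n \in \mathbb{N}}$ is a sequence of disjoint sets in $\mathcal{A}_0$ with $\bigcup_{n=1}^{\infty} A_n \in \mathcal{A}_0$ then
$$\mu_0\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu_0(A_n).$$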
Definition B.1.4. Let $(\Omega, \mathcal{A})$ be a measurable space. A countably additive map
$$\mu : \mathcal{A} \to [0, \infty]$$
is called a measure on $(\Omega, \mathcal{A})$. The triple $(\Omega, \mathcal{A}, \mu)$ is called a measure space.
Recall that our motivating example was to define a measure on $\mathbb{R}$ consistent with our geometrical knowledge of length of an interval. That means we have a suitable definition of measure on a family of subsets of $\mathbb{R}$ and want to extend it to the generated σ-algebra. The measure-theoretic tool to do so is the Carathéodory extension theorem, for which the following lemma is an inevitable prerequisite.
Lemma B.1.1. Let $\Omega$ be a set. Let $\mathcal{I}$ be a π-system on $\Omega$, that is, a family of subsets of $\Omega$ closed under finite intersections: $I_1, I_2 \in \mathcal{I} \Rightarrow I_1 \cap I_2 \in \mathcal{I}$. Let $\mathcal{A} = \sigma(\mathcal{I})$ and suppose that $\mu_1$ and $\mu_2$ are finite measures on $(\Omega, \mathcal{A})$ (i.e. $\mu_1(\Omega) = \mu_2(\Omega) < \infty$) and $\mu_1 = \mu_2$ on $\mathcal{I}$. Then
$$\mu_1 = \mu_2 \quad \text{on } \mathcal{A}.$$
Theorem B.1.1 (Carathéodory Extension Theorem). Let $\Omega$ be a set, $\mathcal{A}_0$ an algebra on $\Omega$ and $\mathcal{A} = \sigma(\mathcal{A}_0)$. If $\mu_0$ is a countably additive set function on $\mathcal{A}_0$, then there exists a measure $\mu$ on $(\Omega, \mathcal{A})$ such that
$$\mu = \mu_0 \quad \text{on } \mathcal{A}_0.$$
If $\mu_0$ is finite, then the extension is unique.
For proofs of the above and further discussion, we refer the reader to Chapter 1 and Appendix
1 of Williams (1991) and the appendix in Durrett (1996).
Returning to the motivating example $\Omega = \mathbb{R}$, we say that $A \subset \mathbb{R}$ belongs to the collection of sets $\mathcal{A}_0$ if $A$ can be written as
$$A = (a_1, b_1] \cup \ldots \cup (a_r, b_r],$$
where $r \in \mathbb{N}$, $-\infty \le a_1 < b_1 \le \ldots \le a_r < b_r \le \infty$. It can be shown that $\mathcal{A}_0$ is an algebra and $\sigma(\mathcal{A}_0) = \mathcal{B}$. For $A$ as above define
$$\mu_0(A) = \sum_{k=1}^r (b_k - a_k).$$
$\mu_0$ is well-defined and countably additive on $\mathcal{A}_0$. As intervals belong to $\mathcal{A}_0$ our geometric intuition of length is preserved. Now by Carathéodory's extension theorem there exists a measure $\mu$ on $(\Omega, \mathcal{B})$ extending $\mu_0$ on $(\Omega, \mathcal{A}_0)$. This $\mu$ is called Lebesgue measure.
With the same approach we can generalise:
(i) the area of rectangles $R = (a_1, b_1) \times (a_2, b_2)$, with or without any of its perimeter included, given by $\mu(R) = (b_1 - a_1) \times (b_2 - a_2)$, to Lebesgue measure on Borel sets in $\mathbb{R}^2$;
(ii) the volume of cuboids $C = (a_1, b_1) \times (a_2, b_2) \times (a_3, b_3)$ given by
$$\mu(C) = (b_1 - a_1) \times (b_2 - a_2) \times (b_3 - a_3)$$
to Lebesgue measure on Borel sets in $\mathbb{R}^3$;
(iii) and similarly in $k$-dimensional Euclidean space $\mathbb{R}^k$. We start with the formula for a $k$-dimensional box,
$$\mu\left(\prod_{i=1}^k (a_i, b_i)\right) = \prod_{i=1}^k (b_i - a_i),$$
and obtain Lebesgue measure $\mu$, defined on $\mathcal{B}$, in $\mathbb{R}^k$.
We are mostly concerned with a special class of measures:
Definition B.1.5. A measure $\mathbb{P}$ on a measurable space $(\Omega, \mathcal{A})$ is called a probability measure if
$$\mathbb{P}(\Omega) = 1.$$
The triple $(\Omega, \mathcal{A}, \mathbb{P})$ is called a probability space.
Observe that the above lemma and Carathéodory's extension theorem guarantee uniqueness if we construct a probability measure using the above procedure. For example the unit cube $[0, 1]^k$ in $\mathbb{R}^k$ has (Lebesgue) measure 1. Using $\Omega = [0, 1]^k$ as the underlying set in the above construction we find a unique probability (which equals length/area/volume if $k = 1/2/3$).
If a property holds everywhere except on a set of measure zero, we say it holds almost everywhere (a.e.). If it holds everywhere except on a set of probability zero, we say it holds almost surely (a.s.) (or, with probability one).
Roughly speaking, one uses addition in countable (or finite) situations, integration in uncountable ones. As the key measure-theoretic axiom of countable additivity above concerns addition, countably infinite situations (such as we meet in discrete time) fit well with measure theory. By contrast, uncountable situations (such as we meet in continuous time) do not, or at least are considerably harder to handle. This is why the discrete-time setting of Chapters 3, 4 is easier than, and precedes, the continuous-time setting of Chapters 5, 6. Our strategy is to do as much as possible to introduce the key ideas (economic, financial and mathematical) in discrete time (which, because we work with a finite time-horizon, the expiry time $T$, is actually a finite situation), before treating the harder case of continuous time.
B.2 Integral
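Let $(\Omega, \mathcal{A})$ be a measurable space. We want to define integration for a suitable class of real-valued functions.
Definition B.2.1. Let $f : \Omega \to \mathbb{R}$. For $A \subset \mathbb{R}$ define $f^{-1}(A) = \{\omega \in \Omega : f(\omega) \in A\}$. $f$ is called ($\mathcal{A}$-)measurable if
$$f^{-1}(B) \in \mathcal{A} \quad \text{for all } B \in \mathcal{B}.$$
Let $\mu$ be a measure on $(\Omega, \mathcal{A})$. Our aim now is to define, for suitable measurable functions, the (Lebesgue) integral with respect to $\mu$. We will denote this integral by
$$\mu(f) = \int_\Omega f\,d\mu = \int_\Omega f(\omega)\,\mu(d\omega).$$
We start with the simplest functions. If $A \in \mathcal{A}$ the indicator function $\mathbf{1}_A(\omega)$ is defined by
$$\mathbf{1}_A(\omega) = \begin{cases} 1, & \text{if } \omega \in A \\ 0, & \text{if } \omega \notin A. \end{cases}$$
Then define $\mu(\mathbf{1}_A) = \mu(A)$.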
The next step extends the definition to simple functions. A function $f$ is called simple if it is a finite linear combination of indicators: $f = \sum_{i=1}^n c_i \mathbf{1}_{A_i}$ for constants $c_i$ and indicator functions $\mathbf{1}_{A_i}$ of measurable sets $A_i$. One then extends the definition of the integral from indicator functions to simple functions by linearity:
$$\mu\left(\sum_{i=1}^n c_i \mathbf{1}_{A_i}\right) := \sum_{i=1}^n c_i\, \mu(\mathbf{1}_{A_i}) = \sum_{i=1}^n c_i\, \mu(A_i),$$
for constants $c_i$ and indicators of measurable sets $A_i$.
If $f$ is a non-negative measurable function, we define
$$\mu(f) := \sup\{\mu(f_0) : f_0 \text{ simple}, f_0 \le f\}.$$
The key result in integration theory, which we must use here to guarantee that the integral for non-negative measurable functions is well-defined, is:
Theorem B.2.1 (Monotone Convergence Theorem). If $(f_n)$ is a sequence of non-negative measurable functions such that $f_n$ is strictly monotonic increasing to a function $f$ (which is then also measurable), then $\mu(f_n) \to \mu(f) \le \infty$.
We quote that we can construct each non-negative measurable $f$ as the increasing limit of a sequence of simple functions $f_n$:
$$f_n(\omega) \uparrow f(\omega) \quad \text{for all } \omega \in \Omega \quad (n \to \infty), \quad f_n \text{ simple}.$$
Using the monotone convergence theorem we can thus obtain the integral of $f$ as
$$\mu(f) := \lim_{n \to \infty} \mu(f_n).$$
Since $f_n$ increases in $n$, so does $\mu(f_n)$ (the integral is order-preserving), so either $\mu(f_n)$ increases to a finite limit, or diverges to $\infty$. In the first case, we say $f$ is (Lebesgue) integrable with (Lebesgue) integral $\mu(f) = \lim \mu(f_n)$.
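The construction can be made concrete for one example. The sketch below takes $f(x) = x^2$ on $[0, 1]$ with Lebesgue measure and the dyadic simple functions $f_n = \lfloor 2^n f \rfloor / 2^n$; this particular choice of approximating sequence is a standard one but is an assumption here, as the notes do not specify the construction. Each level set $\{k/2^n \le f < (k+1)/2^n\}$ is an interval, so $\mu(f_n)$ is a finite sum.

```python
import math


def simple_integral(n):
    """mu(f_n) for f(x) = x^2 on [0, 1] and f_n = floor(2^n f) / 2^n."""
    m = 2**n
    total = 0.0
    for k in range(m):
        # The level set {x in [0, 1] : k/m <= x^2 < (k+1)/m} is the interval
        # [sqrt(k/m), sqrt((k+1)/m)), whose Lebesgue measure is its length.
        length = math.sqrt((k + 1) / m) - math.sqrt(k / m)
        total += (k / m) * length
    return total


values = [simple_integral(n) for n in range(1, 13)]

if __name__ == "__main__":
    for n, value in enumerate(values, start=1):
        print(n, value)
```

The values increase with $n$ and approach $\int_0^1 x^2\,dx = 1/3$ from below, as the monotone convergence theorem predicts.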
Finally if $f$ is a measurable function that may change sign, we split it into its positive and negative parts, $f^{\pm}$:
$$f^+(\omega) := \max(f(\omega), 0), \quad f^-(\omega) := -\min(f(\omega), 0),$$
$$f(\omega) = f^+(\omega) - f^-(\omega), \quad |f(\omega)| = f^+(\omega) + f^-(\omega).$$
If both $f^+$ and $f^-$ are integrable, we say that $f$ is too, and define
$$\mu(f) := \mu(f^+) - \mu(f^-).$$
Thus, in particular, $|f|$ is also integrable, and
$$\mu(|f|) = \mu(f^+) + \mu(f^-).$$
The Lebesgue integral thus defined is, by construction, an absolute integral: $f$ is integrable iff $|f|$ is integrable. Thus, for instance, the well-known formula
$$\int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}$$
has no meaning for Lebesgue integrals, since $\int_1^{\infty} \frac{|\sin x|}{x}\,dx$ diverges to $+\infty$ like $\int_1^{\infty} \frac{1}{x}\,dx$. It has to be replaced by the limit relation
$$\int_0^X \frac{\sin x}{x}\,dx \to \frac{\pi}{2} \quad (X \to \infty).$$
The class of (Lebesgue) integrable functions $f$ on $\Omega$ is written $L(\Omega)$ or (for reasons explained below) $L^1(\Omega)$, abbreviated to $L^1$ or $L$.
For $p \ge 1$, the $L^p$ space $L^p(\Omega)$ on $\Omega$ is the space of measurable functions $f$ with $L^p$-norm
$$\|f\|_p := \left(\int_\Omega |f|^p\,d\mu\right)^{\frac{1}{p}} < \infty.$$
The case $p = 2$ gives $L^2$, which is particularly important as it is a Hilbert space (Appendix A).
Turning now to the special case $\Omega = \mathbb{R}^k$ we recall the well-known Riemann integral. Mathematics undergraduates are taught the Riemann integral (G.B. Riemann (1826–1866)) as their first rigorous treatment of integration theory; essentially this is just a rigorisation of the school integral. It is much easier to set up than the Lebesgue integral, but much harder to manipulate.
For finite intervals $[a, b]$, we quote:
(i) for any function $f$ Riemann-integrable on $[a, b]$, it is Lebesgue-integrable to the same value (but many more functions are Lebesgue integrable);
(ii) a bounded function $f$ is Riemann-integrable on $[a, b]$ iff it is continuous a.e. on $[a, b]$. Thus the question, ‘Which functions are Riemann-integrable?’ cannot be answered without the language of measure theory, which gives one the technically superior Lebesgue integral anyway.
Suppose that $F(x)$ is a non-decreasing function on $\mathbb{R}$:
$$F(x) \le F(y) \quad \text{if } x \le y.$$
Such functions can have at most countably many discontinuities, which are at worst jumps. We may without loss redefine $F$ at jumps so as to be right-continuous. We now generalise the starting points above:
• Measure. We take $\mu((a, b]) := F(b) - F(a)$.
• Integral. We have $\mu(\mathbf{1}_{(a,b]}) = \mu((a, b]) = F(b) - F(a)$.
We may now follow through the successive extension procedures used above. We obtain:
• Lebesgue-Stieltjes measure $\mu_F$,
• Lebesgue-Stieltjes integral $\mu_F(f) = \int f\,d\mu_F$, or even $\int f\,dF$.
The approach generalises to higher dimensions; we omit further details.
If instead of being monotone non-decreasing, $F$ is the difference of two such functions, $F = F_1 - F_2$, we can define the integrals $\int f\,dF_1$, $\int f\,dF_2$ as above, and then define
$$\int f\,dF = \int f\,d(F_1 - F_2) := \int f\,dF_1 - \int f\,dF_2.$$
If $[a, b]$ is a finite interval and $F$ is defined on $[a, b]$, a finite collection of points, $x_0, x_1, \ldots, x_n$ with $a = x_0 < x_1 < \cdots < x_n = b$, is called a partition of $[a, b]$, $\mathcal{P}$ say. The sum $\sum_{i=1}^n |F(x_i) - F(x_{i-1})|$ is called the variation of $F$ over the partition. The least upper bound of this over all partitions $\mathcal{P}$ is called the variation of $F$ over the interval $[a, b]$, $V_a^b(F)$:
$$V_a^b(F) := \sup_{\mathcal{P}} \sum |F(x_i) - F(x_{i-1})|.$$
This may be $+\infty$; but if $V_a^b(F) < \infty$, $F$ is said to be of bounded variation on $[a, b]$, $F \in BV_a^b$. If $F$ is of bounded variation on all finite intervals, $F$ is said to be locally of bounded variation, $F \in BV_{loc}$; if $F$ is of bounded variation on the real line $\mathbb{R}$, $F$ is of bounded variation, $F \in BV$.
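The variation over a partition is straightforward to compute, and doing so for a few partitions gives a feel for the supremum. The sketch below uses uniform partitions and the examples $F = \sin$ on $[0, 2\pi]$ and $F(x) = x^3$ on $[-1, 2]$; the helper names and the examples are assumptions for illustration, and a uniform partition only gives a lower bound for $V_a^b(F)$ in general.

```python
import math


def variation_over_partition(F, points):
    """Sum of |F(x_i) - F(x_{i-1})| over consecutive points of the partition."""
    return sum(abs(F(b) - F(a)) for a, b in zip(points, points[1:]))


def uniform_partition(a, b, n):
    """Partition a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    return [a + (b - a) * i / n for i in range(n + 1)]


if __name__ == "__main__":
    for n in (3, 4, 40, 400):
        points = uniform_partition(0.0, 2.0 * math.pi, n)
        print(n, variation_over_partition(math.sin, points))
```

For a monotone $F$ every partition gives the same value $|F(b) - F(a)|$, while for $\sin$ on $[0, 2\pi]$ the variation over a partition reaches the total variation 4 once the partition contains the turning points $\pi/2$ and $3\pi/2$.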
We quote that the following two properties are equivalent:
(i) $F$ is locally of bounded variation,
(ii) $F$ can be written as the difference $F = F_1 - F_2$ of two monotone functions.
So the above procedure defines the integral $\int f\,dF$ when the integrator $F$ is of bounded variation.
Remark B.2.1. (i) When we pass from discrete to continuous time, we will need to handle both ‘smooth’ paths and paths that vary by jumps (of bounded variation) and ‘rough’ ones (of unbounded variation but bounded quadratic variation);
(ii) The Lebesgue-Stieltjes integral $\int g(x)\,dF(x)$ is needed to express the expectation $\mathbb{E}g(X)$, where $X$ is a random variable with distribution function $F$ and $g$ a suitable function.
B.3 Probability
As we remarked in the introduction of this chapter, the mathematical theory of probability can be
traced to 1654, to correspondence between Pascal (1623–1662) and Fermat (1601–1665). However,
the theory remained both incomplete and non-rigorous until the 20th century. It turns out that the Lebesgue theory of measure and integral sketched above is exactly the machinery needed to construct a rigorous theory of probability adequate for modelling reality (option pricing, etc.) for us. This was realised by Kolmogorov (1903–1987), whose classic book of 1933, Grundbegriffe der Wahrscheinlichkeitsrechnung (Foundations of Probability Theory), Kolmogorov (1933), inaugurated the modern era in probability.
Recall from your first course on probability that, to describe a random experiment mathematically, we begin with the sample space $\Omega$, the set of all possible outcomes. Each point $\omega$ of $\Omega$, or sample point, represents a possible – random – outcome of performing the random experiment. For a set $A \subseteq \Omega$ of points $\omega$ we want to know the probability $\mathbb{P}(A)$ (or $\Pr(A)$, $\mathrm{pr}(A)$). We clearly want
(i) $\mathbb{P}(\emptyset) = 0$, $\mathbb{P}(\Omega) = 1$,
(ii) $\mathbb{P}(A) \ge 0$ for all $A$,
(iii) If $A_1, A_2, \dots, A_n$ are disjoint, $\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} \mathbb{P}(A_i)$ (finite additivity), which, as above, we will strengthen to
(iii)* If $A_1, A_2, \dots$ (ad inf.) are disjoint,
$$\mathbb{P}\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mathbb{P}(A_i) \quad \text{(countable additivity)}.$$
(iv) If $B \subseteq A$ and $\mathbb{P}(A) = 0$, then $\mathbb{P}(B) = 0$ (completeness).
Then by (i) and (iii) (with $A = A_1$, $\Omega \setminus A = A_2$),
$$\mathbb{P}(A^c) = \mathbb{P}(\Omega \setminus A) = 1 - \mathbb{P}(A).$$
So the class $\mathcal{F}$ of subsets of $\Omega$ whose probabilities $\mathbb{P}(A)$ are defined (call such $A$ events) should be closed under countable, disjoint unions and complements, and contain the empty set $\emptyset$ and the whole space $\Omega$. Therefore $\mathcal{F}$ should be a σ-algebra and $\mathbb{P}$ should be defined on $\mathcal{F}$ according to Definition 2.1.5. Repeating this:
Definition B.3.1. A probability space, or Kolmogorov triple, is a triple $(\Omega, \mathcal{F}, \mathbb{P})$ satisfying the Kolmogorov axioms (i), (ii), (iii)*, (iv) above.
A probability space is a mathematical model of a random experiment.
Often we quantify outcomes $\omega$ of random experiments by defining a real-valued function $X$ on $\Omega$, i.e. $X : \Omega \to \mathbb{R}$. If such a function is measurable it is called a random variable.
Definition B.3.2. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable (vector) $X$ is a function $X : \Omega \to \mathbb{R}$ ($X : \Omega \to \mathbb{R}^k$) such that $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}$ for all Borel sets $B \in \mathcal{B}(\mathbb{R})$ ($B \in \mathcal{B}(\mathbb{R}^k)$).
In particular we have for a random variable $X$ that $\{\omega \in \Omega : X(\omega) \le x\} \in \mathcal{F}$ for all $x \in \mathbb{R}$. Hence we can define the distribution function $F_X$ of $X$ by
$$F_X(x) := \mathbb{P}(\{\omega : X(\omega) \le x\}).$$
The smallest σ-algebra containing all the sets $\{\omega : X(\omega) \le x\}$ for all real $x$ (equivalently, $\{X < x\}$, $\{X \ge x\}$, $\{X > x\}$) is called the σ-algebra generated by $X$, written $\sigma(X)$. Thus,
$X$ is $\mathcal{F}$-measurable (is a random variable) iff $\sigma(X) \subseteq \mathcal{F}$.
The events in the σ-algebra generated by $X$ are the events $\{\omega : X(\omega) \in B\}$, where $B$ runs through the Borel σ-algebra on the line. When the (random) value $X(\omega)$ is known, we know which of these events have happened.
Interpretation.
Think of $\sigma(X)$ as representing what we know when we know $X$, or in other words the information contained in $X$ (or in knowledge of $X$). This is reflected in the following result, due to J.L. Doob, which we quote:
$$\sigma(X) \subseteq \sigma(Y) \quad \text{if and only if} \quad X = g(Y)$$
for some measurable function $g$. For, knowing $Y$ means we know $X := g(Y)$ – but not vice versa, unless the function $g$ is one-to-one (injective), when the inverse function $g^{-1}$ exists, and we can go back via $Y = g^{-1}(X)$.
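On a finite sample space the generated σ-algebra can be listed explicitly, which makes Doob's statement easy to check by hand. The sketch below is a hypothetical illustration (a single die roll, with the parity of the face as a function of the face itself); the function names are assumptions made for this example only.

```python
from itertools import combinations

def atoms(omega, X):
    """Partition omega into the level sets {X = x}."""
    levels = {}
    for w in omega:
        levels.setdefault(X(w), set()).add(w)
    return [frozenset(s) for s in levels.values()]

def generated_sigma_algebra(omega, X):
    """All unions of atoms of X; on a finite omega this is sigma(X)."""
    parts = atoms(omega, X)
    events = set()
    for r in range(len(parts) + 1):
        for combo in combinations(parts, r):
            events.add(frozenset().union(*combo))
    return events

omega = range(1, 7)          # one roll of a die
Y = lambda w: w              # the face shown
X = lambda w: w % 2          # parity, so X = g(Y) with g(y) = y mod 2

sigma_X = generated_sigma_algebra(omega, X)
sigma_Y = generated_sigma_algebra(omega, Y)
print(len(sigma_X), len(sigma_Y), sigma_X <= sigma_Y)
```

Knowing the face determines the parity, so every event decidable from the parity is decidable from the face; the converse inclusion fails because parity is not injective.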
Note.
An extended discussion of generated σ-algebras in the finite case is given in Dothan (1990), Chapter 3. Although technically avoidable, this is useful preparation for the general case, needed for continuous time.
A measure (§2.1) determines an integral (§2.2). A probability measure $\mathbb{P}$, being a special kind of measure (a measure of total mass one), determines a special kind of integral, called an expectation.
Definition B.3.3. The expectation $\mathbb{E}$ of a random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is defined by
$$\mathbb{E} X := \int_{\Omega} X \, d\mathbb{P}, \quad \text{or} \quad \int_{\Omega} X(\omega) \, d\mathbb{P}(\omega).$$
The expectation – also called the mean – describes the location of a distribution (and so is called a location parameter). Information about the scale of a distribution (the corresponding scale parameter) is obtained by considering the variance
$$\mathrm{Var}(X) := \mathbb{E}\left[(X - \mathbb{E}(X))^2\right] = \mathbb{E}\left[X^2\right] - (\mathbb{E} X)^2.$$
If $X$ is real-valued, say, with distribution function $F$, recall that $\mathbb{E} X$ is defined in your first course on probability by
$$\mathbb{E} X := \int x f(x) \, dx \quad \text{if } X \text{ has a density } f$$
or if $X$ is discrete, taking values $x_n$ ($n = 1, 2, \dots$) with probability function $f(x_n)$ ($\ge 0$) ($\sum_n f(x_n) = 1$),
$$\mathbb{E} X := \sum_n x_n f(x_n).$$
These two formulae are the special cases (for the density and discrete cases) of the general formula
$$\mathbb{E} X := \int_{-\infty}^{\infty} x \, dF(x)$$
where the integral on the right is a Lebesgue-Stieltjes integral. This in turn agrees with the definition above, since if $F$ is the distribution function of $X$,
$$\int_{\Omega} X \, d\mathbb{P} = \int_{-\infty}^{\infty} x \, dF(x)$$
follows by the change of variable formula for the measure-theoretic integral, on applying the map $X : \Omega \to \mathbb{R}$ (we quote this: see any book on measure theory, e.g. Dudley (1989)).
Clearly the expectation operator IE is linear. It even becomes multiplicative if we consider
independent random variables.
Definition B.3.4. Random variables $X_1, \dots, X_n$ are independent if whenever $A_i \in \mathcal{B}$ for $i = 1, \dots, n$ we have
$$\mathbb{P}\left(\bigcap_{i=1}^{n} \{X_i \in A_i\}\right) = \prod_{i=1}^{n} \mathbb{P}(\{X_i \in A_i\}).$$
Using Lemma B.1.1 we can give a more tractable condition for independence:
Lemma B.3.1. In order for $X_1, \dots, X_n$ to be independent it is necessary and sufficient that for all $x_1, \dots, x_n \in (-\infty, \infty]$,
$$\mathbb{P}\left(\bigcap_{i=1}^{n} \{X_i \le x_i\}\right) = \prod_{i=1}^{n} \mathbb{P}(\{X_i \le x_i\}).$$
Now using the usual measure-theoretic steps (going from simple to integrable functions) it is easy to show:
Theorem B.3.1 (Multiplication Theorem). If $X_1, \dots, X_n$ are independent and $\mathbb{E}|X_i| < \infty$, $i = 1, \dots, n$, then
$$\mathbb{E}\left(\prod_{i=1}^{n} X_i\right) = \prod_{i=1}^{n} \mathbb{E}(X_i).$$
We now review the distributions we will mainly use in our models of financial markets.
Examples.
(i) Bernoulli distribution. Recall our arbitrage-pricing example from §1.4. There we were given a stock with price $S(0)$ at time $t = 0$. We assumed that after a period of time $\Delta t$ the stock price could have only one of two values, either $S(\Delta t) = e^{u} S(0)$ with probability $p$ or $S(\Delta t) = e^{d} S(0)$ with probability $1 - p$ ($u, d \in \mathbb{R}$). Let $R(\Delta t) = r(1)$ be a random variable modelling the logarithm of the stock return over the period $[0, \Delta t]$; then
$$\mathbb{P}(r(1) = u) = p \quad \text{and} \quad \mathbb{P}(r(1) = d) = 1 - p.$$
We say that $r(1)$ is distributed according to a Bernoulli distribution. Clearly $\mathbb{E}(r(1)) = up + d(1 - p)$ and $\mathrm{Var}(r(1)) = u^2 p + d^2 (1 - p) - (\mathbb{E}\, r(1))^2$.
The standard case of a Bernoulli distribution is given by choosing $u = 1$, $d = 0$ (which is not a very useful choice in financial modelling).
(ii) Binomial distribution. If we consider the logarithm of the stock return over $n$ periods (of equal length), say over $[0, T]$, then subdividing into the periods $1, \dots, n$ we have
$$R(T) = \log\left(\frac{S(T)}{S(0)}\right) = \log\left(\frac{S(T)}{S(T - \Delta t)} \cdots \frac{S(\Delta t)}{S(0)}\right) = \log\left(\frac{S(T)}{S(T - \Delta t)}\right) + \dots + \log\left(\frac{S(\Delta t)}{S(0)}\right) = r(n) + \dots + r(1).$$
Assuming that $r(i)$, $i = 1, \dots, n$ are independent and each $r(i)$ is Bernoulli distributed as above we have that $R(T) = \sum_{i=1}^{n} r(i)$ is binomially distributed. Linearity of the expectation operator and independence yield $\mathbb{E}(R(T)) = \sum_{i=1}^{n} \mathbb{E}(r(i))$ and $\mathrm{Var}(R(T)) = \sum_{i=1}^{n} \mathrm{Var}(r(i))$.
Again for the standard case one would use $u = 1$, $d = 0$. The shorthand notation for a binomial random variable $X$ is then $X \sim B(n, p)$ and we can compute
$$\mathbb{P}(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}, \quad \mathbb{E}(X) = np, \quad \mathrm{Var}(X) = np(1 - p).$$
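The binomial formulas for the standard case can be checked numerically. The following Python sketch builds the probability function from the formula above and recovers the mean and variance by direct summation; the values $n = 10$, $p = 0.3$ are arbitrary choices for illustration.

```python
from math import comb

def binomial_pmf(n, p):
    """List of P(X = k), k = 0..n, for X ~ B(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

n, p = 10, 0.3                       # illustrative parameters
pmf = binomial_pmf(n, p)
mean = sum(k * q for k, q in enumerate(pmf))
var = sum(k**2 * q for k, q in enumerate(pmf)) - mean**2
print(sum(pmf), mean, var)
```

Up to floating-point rounding, the probabilities sum to one and the computed mean and variance agree with $np$ and $np(1-p)$.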
(iii) Normal distribution. As we will show in the sequel, the limit of a sequence of appropriately normalised binomial distributions is the (standard) normal distribution. We say a random variable $X$ is normally distributed with parameters $\mu, \sigma^2$, in short $X \sim N(\mu, \sigma^2)$, if $X$ has density function
$$f_{\mu, \sigma^2}(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right\}.$$
One can show that $\mathbb{E}(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$, and thus a normally distributed random variable is fully described by knowledge of its mean and variance.
Returning to the above example, one of the key results of this text will be that the limiting model of a sequence of financial markets with one-period asset returns modelled by a Bernoulli distribution is a model where the distribution of the logarithms of instantaneous asset returns is normal. That means $S(t + \Delta t)/S(t)$ is lognormally distributed (i.e. $\log(S(t + \Delta t)/S(t))$ is normally distributed). Although rejected by many empirical studies (see Eberlein and Keller (1995) for a recent overview), such a model seems to be the standard in use among financial practitioners (and we will call it the standard model in the following). The main arguments against using normally distributed random variables for modelling log-returns (i.e. lognormal distributions for returns) are asymmetry and (semi-)heavy tails. We know that distributions of financial asset returns are generally rather close to being symmetric around zero, but there is a definite tendency towards asymmetry. This may be explained by the fact that the markets react differently to positive as opposed to negative information (see Shephard (1996) §1.3.4). Since the normal distribution is symmetric it is not possible to incorporate this empirical fact in the standard model. Financial time series also suggest modelling by probability distributions whose densities behave for $x \to \pm\infty$ as
$$|x|^{\rho} \exp\{-\sigma |x|\}$$
with $\rho \in \mathbb{R}$, $\sigma > 0$. This means that we should replace the normal distribution with a distribution with heavier tails. Such a model would exhibit higher probabilities of extreme events, and the passage from ordinary observations (around the mean) to extreme observations would be more sudden. Among the (classes of) distributions suggested to address these facts is the class of hyperbolic distributions (see Eberlein and Keller (1995) and §2.12 below); more general distributions of normal inverse Gaussian type (see Barndorff-Nielsen (1998), Rydberg (1996), Rydberg (1997)) appear to be very promising.
(iv) Poisson distribution. Sometimes we want to incorporate in our model of financial markets the possibility of sudden jumps. Using the standard model we model the asset price process by a continuous stochastic process, so we need an additional process generating the jumps. To do this we use point processes in general and the Poisson process in particular. For a Poisson process the probabilities of a jump and of no jump during a small interval $\Delta t$ are approximately
$$\mathbb{P}(\nu(1) = 1) \approx \lambda \Delta t \quad \text{and} \quad \mathbb{P}(\nu(1) = 0) \approx 1 - \lambda \Delta t,$$
where $\lambda$ is a positive constant called the rate or intensity. Modelling small intervals in such a way we get for the number of jumps $N(T) = \nu(1) + \dots + \nu(n)$ in the interval $[0, T]$ the probability function
$$\mathbb{P}(N(T) = k) = \frac{e^{-\lambda T} (\lambda T)^k}{k!}, \quad k = 0, 1, \dots$$
and we say that $N(T)$ has a Poisson distribution with parameter $\lambda T$. We can show $\mathbb{E}(N(T)) = \lambda T$ and $\mathrm{Var}(N(T)) = \lambda T$.
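The passage from many small intervals to the Poisson probability function can be watched numerically. In the sketch below, each of $n$ subintervals carries a jump with probability $\lambda T / n$, so the jump count is binomial; as $n$ grows its probability function approaches the Poisson one. The values of $\lambda$ and $T$, and the restriction to $k \le 10$, are assumptions made for illustration.

```python
from math import comb, exp, factorial

def poisson_pmf(k, mu):
    """P(N = k) for a Poisson variable with parameter mu."""
    return exp(-mu) * mu**k / factorial(k)

def binomial_pmf(k, n, p):
    """P(N = k) when each of n intervals jumps independently with prob. p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

lam, T = 2.0, 1.5                    # illustrative intensity and horizon
mu = lam * T

def max_gap(n):
    """Largest pointwise gap between the two pmfs for k = 0..10."""
    p = mu / n                       # jump probability per interval of length T/n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mu)) for k in range(11))

gaps = [max_gap(n) for n in (10, 100, 1000)]
print(gaps)
```

The gaps shrink as the subdivision is refined, which is the discrete approximation behind the formula for the number of jumps in $[0, T]$.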
Glossary.
Table B.1 summarises the two parallel languages, measure-theoretic and probabilistic, which we have established.

Measure-theoretic          Probabilistic
Measure                    Probability
Integral                   Expectation
Measurable set             Event
Measurable function        Random variable
Almost everywhere (a.e.)   Almost surely (a.s.)

Table B.1: Measure-theoretic and probabilistic languages
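B.4 Equivalent Measures and Radon-Nikodým Derivatives
Given two measures $\mathbb{P}$ and $\mathbb{Q}$ defined on the same σ-algebra $\mathcal{F}$, we say that $\mathbb{P}$ is absolutely continuous with respect to $\mathbb{Q}$, written
$$\mathbb{P} \ll \mathbb{Q}$$
if $\mathbb{P}(A) = 0$ whenever $\mathbb{Q}(A) = 0$, $A \in \mathcal{F}$. We quote from measure theory the vitally important Radon-Nikodým theorem:
Theorem B.4.1 (Radon-Nikodým). $\mathbb{P} \ll \mathbb{Q}$ iff there exists an ($\mathcal{F}$-)measurable function $f$ such that
$$\mathbb{P}(A) = \int_A f \, d\mathbb{Q} \quad \forall A \in \mathcal{F}.$$
(Note that since the integral of anything over a null set is zero, any $\mathbb{P}$ so representable is certainly absolutely continuous with respect to $\mathbb{Q}$ – the point is that the converse holds.)
Since $\mathbb{P}(A) = \int_A d\mathbb{P}$, this says that $\int_A d\mathbb{P} = \int_A f \, d\mathbb{Q}$ for all $A \in \mathcal{F}$. By analogy with the chain rule of ordinary calculus, we write $d\mathbb{P}/d\mathbb{Q}$ for $f$; then
$$\int_A d\mathbb{P} = \int_A \frac{d\mathbb{P}}{d\mathbb{Q}} \, d\mathbb{Q} \quad \forall A \in \mathcal{F}.$$
Symbolically,
$$\text{if } \mathbb{P} \ll \mathbb{Q}, \quad d\mathbb{P} = \frac{d\mathbb{P}}{d\mathbb{Q}} \, d\mathbb{Q}.$$
The measurable function (random variable) $d\mathbb{P}/d\mathbb{Q}$ is called the Radon-Nikodým derivative (RN-derivative) of $\mathbb{P}$ with respect to $\mathbb{Q}$.
If $\mathbb{P} \ll \mathbb{Q}$ and also $\mathbb{Q} \ll \mathbb{P}$, we call $\mathbb{P}$ and $\mathbb{Q}$ equivalent measures, written $\mathbb{P} \sim \mathbb{Q}$. Then $d\mathbb{P}/d\mathbb{Q}$ and $d\mathbb{Q}/d\mathbb{P}$ both exist, and
$$\frac{d\mathbb{P}}{d\mathbb{Q}} = 1 \Big/ \frac{d\mathbb{Q}}{d\mathbb{P}}.$$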
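For $\mathbb{P} \sim \mathbb{Q}$, $\mathbb{P}(A) = 0$ iff $\mathbb{Q}(A) = 0$: $\mathbb{P}$ and $\mathbb{Q}$ have the same null sets. Taking negations: $\mathbb{P} \sim \mathbb{Q}$ iff $\mathbb{P}, \mathbb{Q}$ have the same sets of positive measure. Taking complements: $\mathbb{P} \sim \mathbb{Q}$ iff $\mathbb{P}, \mathbb{Q}$ have the same sets of probability one (the same a.s. sets). Thus the following are equivalent:
$\mathbb{P} \sim \mathbb{Q}$ iff $\mathbb{P}, \mathbb{Q}$ have the same null sets,
iff $\mathbb{P}, \mathbb{Q}$ have the same a.s. sets,
iff $\mathbb{P}, \mathbb{Q}$ have the same sets of positive measure.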
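On a finite sample space where every point has positive mass under both measures, the Radon-Nikodým derivative is simply the ratio of point masses. The following Python sketch uses a hypothetical three-state market (the states and probabilities are invented for illustration) and exact rational arithmetic to check the defining identity and the reciprocal relation between the two derivatives.

```python
from fractions import Fraction as Fr

omega = ["down", "flat", "up"]       # hypothetical market states
Q = {"down": Fr(1, 4), "flat": Fr(1, 2), "up": Fr(1, 4)}
P = {"down": Fr(1, 2), "flat": Fr(1, 3), "up": Fr(1, 6)}

# With Q({w}) > 0 and P({w}) > 0 for all w, the measures are equivalent
# and the derivatives are ratios of point masses.
dP_dQ = {w: P[w] / Q[w] for w in omega}
dQ_dP = {w: Q[w] / P[w] for w in omega}

def measure(mu, A):
    """mu(A) for an event A given as a list of states."""
    return sum(mu[w] for w in A)

def integral(f, mu, A):
    """Integral of f over A with respect to mu."""
    return sum(f[w] * mu[w] for w in A)

A = ["flat", "up"]
print(measure(P, A), integral(dP_dQ, Q, A))
```

Integrating the derivative against the second measure over any event returns the first measure of that event, which is the content of the theorem in this elementary setting.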
Far from being an abstract theoretical result, the Radon-Nikodým theorem is of key practical importance, in two ways:
(a) It is the key to the concept of conditioning (§2.5, §2.6 below), which is of central importance throughout,
(b) The concept of equivalent measures is central to the key idea of mathematical finance, risk-neutrality, and hence to its main results, the Black-Scholes formula, fundamental theorem of asset pricing, etc. The key to all this is that prices should be the discounted expected values under an equivalent martingale measure. Thus equivalent measures, and the operation of change of measure, are of central economic and financial importance. We shall return to this later in connection with the main mathematical result on change of measure, Girsanov's theorem (see §5.7).
B.5 Conditional expectation
For basic events define
$$\mathbb{P}(A \mid B) := \mathbb{P}(A \cap B)/\mathbb{P}(B) \quad \text{if } \mathbb{P}(B) > 0. \qquad \text{(B.1)}$$
From this definition, we get the multiplication rule
$$\mathbb{P}(A \cap B) = \mathbb{P}(A \mid B)\,\mathbb{P}(B).$$
Using the partition equation $\mathbb{P}(B) = \sum_n \mathbb{P}(B \mid A_n)\,\mathbb{P}(A_n)$ with $(A_n)$ a finite or countable partition of $\Omega$, we get the Bayes rule
$$\mathbb{P}(A_i \mid B) = \frac{\mathbb{P}(A_i)\,\mathbb{P}(B \mid A_i)}{\sum_j \mathbb{P}(A_j)\,\mathbb{P}(B \mid A_j)}.$$
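The partition equation and the Bayes rule translate directly into a few lines of code. The sketch below is a minimal illustration; the two market regimes and their probabilities are hypothetical numbers chosen only to make the arithmetic visible.

```python
from fractions import Fraction as Fr

def bayes(prior, likelihood):
    """prior[a] = P(A_a), likelihood[a] = P(B | A_a).

    Returns P(B) from the partition equation and the posterior P(A_a | B).
    """
    p_b = sum(prior[a] * likelihood[a] for a in prior)
    posterior = {a: prior[a] * likelihood[a] / p_b for a in prior}
    return p_b, posterior

prior = {"bull": Fr(3, 5), "bear": Fr(2, 5)}          # hypothetical regimes
likelihood = {"bull": Fr(1, 10), "bear": Fr(1, 2)}    # P(large drop | regime)
p_drop, posterior = bayes(prior, likelihood)
print(p_drop, posterior)
```

Here the event of a large drop shifts most of the probability onto the regime under which such a drop is more likely, and the posterior probabilities again sum to one.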
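We can always write $\mathbb{P}(A) = \mathbb{E}(1_A)$ with $1_A(\omega) = 1$ if $\omega \in A$ and $1_A(\omega) = 0$ otherwise. Then the above can be written
$$\mathbb{E}(1_A \mid B) = \frac{\mathbb{E}(1_A 1_B)}{\mathbb{P}(B)}. \qquad \text{(B.2)}$$
This suggests defining, for suitable random variables $X$, the $\mathbb{P}$-average of $X$ over $B$ as
$$\mathbb{E}(X \mid B) = \frac{\mathbb{E}(X 1_B)}{\mathbb{P}(B)}. \qquad \text{(B.3)}$$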
Consider now discrete random variables $X$ and $Y$. Assume $X$ takes values $x_1, \dots, x_m$ with probabilities $f_1(x_i) > 0$, $Y$ takes values $y_1, \dots, y_n$ with probabilities $f_2(y_j) > 0$, while the vector $(X, Y)$ takes values $(x_i, y_j)$ with probabilities $f(x_i, y_j) > 0$. Then the marginal distributions are
$$f_1(x_i) = \sum_{j=1}^{n} f(x_i, y_j) \quad \text{and} \quad f_2(y_j) = \sum_{i=1}^{m} f(x_i, y_j).$$
We can use the standard definition above for the events $\{Y = y_j\}$ and $\{X = x_i\}$ to get
$$\mathbb{P}(Y = y_j \mid X = x_i) = \frac{\mathbb{P}(X = x_i, Y = y_j)}{\mathbb{P}(X = x_i)} = \frac{f(x_i, y_j)}{f_1(x_i)}.$$
Thus conditional on $X = x_i$ (given the information $X = x_i$), $Y$ takes on the values $y_1, \dots, y_n$ with (conditional) probabilities
$$f_{Y|X}(y_j \mid x_i) = \frac{f(x_i, y_j)}{f_1(x_i)}.$$
So we can compute its expectation as usual:
$$\mathbb{E}(Y \mid X = x_i) = \sum_j y_j f_{Y|X}(y_j \mid x_i) = \frac{\sum_j y_j f(x_i, y_j)}{f_1(x_i)}.$$
Now define the random variable $Z = \mathbb{E}(Y \mid X)$, the conditional expectation of $Y$ given $X$, as follows:
$$\text{if } X(\omega) = x_i, \text{ then } Z(\omega) = \mathbb{E}(Y \mid X = x_i) = z_i \text{ (say)}.$$
Observe that in this case $Z$ is given by a 'nice' function of $X$. However, a more abstract property also holds true. Since $Z$ is constant on the sets $\{X = x_i\}$ it is $\sigma(X)$-measurable (these sets generate the σ-algebra). Furthermore
$$\int_{\{X = x_i\}} Z \, d\mathbb{P} = z_i \mathbb{P}(X = x_i) = \sum_j y_j f_{Y|X}(y_j \mid x_i)\,\mathbb{P}(X = x_i) = \sum_j y_j \mathbb{P}(Y = y_j; X = x_i) = \int_{\{X = x_i\}} Y \, d\mathbb{P}.$$
Since the $\{X = x_i\}$ generate $\sigma(X)$, this implies
$$\int_G Z \, d\mathbb{P} = \int_G Y \, d\mathbb{P} \quad \forall\, G \in \sigma(X).$$
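The discrete construction can be carried out mechanically from a table of joint probabilities. The Python sketch below uses a small hypothetical joint distribution (the numbers are invented for illustration) to compute the conditional expectation of $Y$ given each value of $X$, and then checks that it integrates like $Y$ over every event generated by $X$.

```python
from fractions import Fraction as Fr

# Hypothetical joint probabilities f(x, y) for X in {0, 1}, Y in {1, 2, 3}.
joint = {(0, 1): Fr(1, 8), (0, 2): Fr(1, 8), (0, 3): Fr(1, 4),
         (1, 1): Fr(1, 4), (1, 2): Fr(1, 8), (1, 3): Fr(1, 8)}

def marginal_x(x):
    """f_1(x): sum the joint probabilities over y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def cond_exp_y_given_x(x):
    """E(Y | X = x) = sum_j y_j f(x, y_j) / f_1(x)."""
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / marginal_x(x)

z = {x: cond_exp_y_given_x(x) for x in (0, 1)}   # the values z_i of Z

def integral_over(G, g):
    """Integral of g(X, Y) over the event {X in G}."""
    return sum(g(x, y) * p for (x, y), p in joint.items() if x in G)

events = [(), (0,), (1,), (0, 1)]                # all of sigma(X)
checks = [integral_over(G, lambda x, y: z[x]) == integral_over(G, lambda x, y: y)
          for G in events]
print(z, checks)
```

Every check succeeds, so the random variable taking the value $z_i$ on the set where $X = x_i$ has the same integral as $Y$ over each event in the σ-algebra generated by $X$.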
Density case. If the random vector $(X, Y)$ has density $f(x, y)$, then $X$ has (marginal) density $f_1(x) := \int_{-\infty}^{\infty} f(x, y) \, dy$, $Y$ has (marginal) density $f_2(y) := \int_{-\infty}^{\infty} f(x, y) \, dx$. The conditional density of $Y$ given $X = x$ is:
$$f_{Y|X}(y \mid x) := \frac{f(x, y)}{f_1(x)}.$$
Its expectation is
$$\mathbb{E}(Y \mid X = x) = \int_{-\infty}^{\infty} y f_{Y|X}(y \mid x) \, dy = \frac{\int_{-\infty}^{\infty} y f(x, y) \, dy}{f_1(x)}.$$
So we define
$$c(x) = \begin{cases} \mathbb{E}(Y \mid X = x) & \text{if } f_1(x) > 0 \\ 0 & \text{if } f_1(x) = 0, \end{cases}$$
and call $c(X)$ the conditional expectation of $Y$ given $X$, denoted by $\mathbb{E}(Y \mid X)$. Observe that on sets with probability zero (i.e. $\{\omega : X(\omega) = x,\ f_1(x) = 0\}$) the choice of $c(x)$ is arbitrary, hence $\mathbb{E}(Y \mid X)$ is only defined up to a set of probability zero; we speak of different versions in such cases. With this definition we again find
$$\int_G c(X) \, d\mathbb{P} = \int_G Y \, d\mathbb{P} \quad \forall\, G \in \sigma(X).$$
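Indeed, for sets $G$ with $G = \{\omega : X(\omega) \in B\}$ with $B$ a Borel set, we find by Fubini's theorem
$$\int_G c(X) \, d\mathbb{P} = \int_{-\infty}^{\infty} 1_B(x) c(x) f_1(x) \, dx = \int_{-\infty}^{\infty} 1_B(x) f_1(x) \int_{-\infty}^{\infty} y f_{Y|X}(y \mid x) \, dy \, dx = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} 1_B(x)\, y f(x, y) \, dy \, dx = \int_G Y \, d\mathbb{P}.$$
Now these sets $G$ generate $\sigma(X)$ and by a standard technique (the π-systems lemma, see Williams (2001), §2.3) the claim is true for all $G \in \sigma(X)$.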
Example. Bivariate Normal Distribution, $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$.
$$\mathbb{E}(Y \mid X = x) = \mu_2 + \rho \frac{\sigma_2}{\sigma_1}(x - \mu_1),$$
the familiar regression line of statistics (linear model) – see Exercise 2.6.
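General case. Here, we follow Kolmogorov's construction using the Radon-Nikodým theorem. Suppose that $\mathcal{G}$ is a sub-σ-algebra of $\mathcal{F}$, $\mathcal{G} \subset \mathcal{F}$. If $Y$ is a non-negative random variable with $\mathbb{E} Y < \infty$, then
$$\mathbb{Q}(G) := \int_G Y \, d\mathbb{P} \quad (G \in \mathcal{G})$$
is non-negative, σ-additive – because
$$\int_G Y \, d\mathbb{P} = \sum_n \int_{G_n} Y \, d\mathbb{P}$$
if $G = \bigcup_n G_n$, $G_n$ disjoint – and defined on the σ-algebra $\mathcal{G}$, so it is a measure on $\mathcal{G}$.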
If $\mathbb{P}(G) = 0$, then $\mathbb{Q}(G) = 0$ also (the integral of anything over a null set is zero), so $\mathbb{Q} \ll \mathbb{P}$. By the Radon-Nikodým theorem, there exists a Radon-Nikodým derivative of $\mathbb{Q}$ with respect to $\mathbb{P}$ on $\mathcal{G}$, which is $\mathcal{G}$-measurable. Following Kolmogorov, we call this Radon-Nikodým derivative the conditional expectation of $Y$ given (or conditional on) $\mathcal{G}$, $\mathbb{E}(Y \mid \mathcal{G})$, whose existence we now have established. For $Y$ that changes sign, split into $Y = Y^+ - Y^-$, and define $\mathbb{E}(Y \mid \mathcal{G}) := \mathbb{E}(Y^+ \mid \mathcal{G}) - \mathbb{E}(Y^- \mid \mathcal{G})$. We summarize:
Definition B.5.1. Let $Y$ be a random variable with $\mathbb{E}(|Y|) < \infty$ and $\mathcal{G}$ be a sub-σ-algebra of $\mathcal{F}$. We call a random variable $Z$ a version of the conditional expectation $\mathbb{E}(Y \mid \mathcal{G})$ of $Y$ given $\mathcal{G}$, and write $Z = \mathbb{E}(Y \mid \mathcal{G})$, a.s., if
(i) $Z$ is $\mathcal{G}$-measurable;
(ii) $\mathbb{E}(|Z|) < \infty$;
(iii) for every set $G$ in $\mathcal{G}$, we have
$$\int_G Y \, d\mathbb{P} = \int_G Z \, d\mathbb{P} \quad \forall G \in \mathcal{G}. \qquad \text{(B.4)}$$
Notation. Suppose $\mathcal{G} = \sigma(X_1, \dots, X_n)$. Then
$$\mathbb{E}(Y \mid \mathcal{G}) = \mathbb{E}(Y \mid \sigma(X_1, \dots, X_n)) =: \mathbb{E}(Y \mid X_1, \dots, X_n),$$
and one can compare the general case with the motivating examples above.
To see the intuition behind conditional expectation, consider the following situation. Assume an experiment has been performed, i.e. $\omega \in \Omega$ has been realized. However, the only information we have is the set of values $X(\omega)$ for every $\mathcal{G}$-measurable random variable $X$. Then $Z(\omega) = \mathbb{E}(Y \mid \mathcal{G})(\omega)$ is the expected value of $Y(\omega)$ given this information.
We used the traditional approach to define conditional expectation via the Radon-Nikodým theorem. Alternatively, one can use Hilbert space projection theory (Neveu (1975) and Jacod and Protter (2000) follow this route). Indeed, for $Y \in L^2(\Omega, \mathcal{F}, \mathbb{P})$ one can show that the conditional expectation $Z = \mathbb{E}(Y \mid \mathcal{G})$ is the least-squares-best $\mathcal{G}$-measurable predictor of $Y$: amongst all $\mathcal{G}$-measurable random variables it minimises the quadratic distance, i.e.
$$\mathbb{E}[(Y - \mathbb{E}(Y \mid \mathcal{G}))^2] = \min\{\mathbb{E}[(Y - X)^2] : X \ \mathcal{G}\text{-measurable}\}.$$
Note.
1. To check that something is a conditional expectation: we have to check that it integrates the right way over the right sets (i.e., as in (B.4)).
2. From (B.4): if two things integrate the same way over all sets $B \in \mathcal{G}$, they have the same conditional expectation given $\mathcal{G}$.
3. For notational convenience, we shall pass between $\mathbb{E}(Y \mid \mathcal{G})$ and $\mathbb{E}_{\mathcal{G}} Y$ at will.
4. The conditional expectation thus defined coincides with any we may have already encountered – in regression or multivariate analysis, for example. However, this may not be immediately obvious. The conditional expectation defined above – via σ-algebras and the Radon-Nikodým theorem – is rightly called by Williams (Williams (1991), p.84) 'the central definition of modern probability'. It may take a little getting used to. As with all important but non-obvious definitions, it proves its worth in action: see §2.6 below for properties of conditional expectations, and Chapter 3 for its use in studying stochastic processes, particularly martingales (which are defined in terms of conditional expectations).
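We now discuss the fundamental properties of conditional expectation. From the definition, linearity of conditional expectation follows from the linearity of the integral. Further properties are given by
Proposition B.5.1. 1. If $\mathcal{G} = \{\emptyset, \Omega\}$, $\mathbb{E}(Y \mid \{\emptyset, \Omega\}) = \mathbb{E} Y$.
2. If $\mathcal{G} = \mathcal{F}$, $\mathbb{E}(Y \mid \mathcal{F}) = Y$ $\mathbb{P}$-a.s.
3. If $Y$ is $\mathcal{G}$-measurable, $\mathbb{E}(Y \mid \mathcal{G}) = Y$ $\mathbb{P}$-a.s.
4. Positivity. If $X \ge 0$, then $\mathbb{E}(X \mid \mathcal{G}) \ge 0$ $\mathbb{P}$-a.s.
5. Taking out what is known. If $Y$ is $\mathcal{G}$-measurable and bounded, $\mathbb{E}(YZ \mid \mathcal{G}) = Y\, \mathbb{E}(Z \mid \mathcal{G})$ $\mathbb{P}$-a.s.
6. Tower property. If $\mathcal{G}_0 \subset \mathcal{G}$, $\mathbb{E}[\mathbb{E}(Y \mid \mathcal{G}) \mid \mathcal{G}_0] = \mathbb{E}[Y \mid \mathcal{G}_0]$ a.s.
7. Conditional mean formula. $\mathbb{E}[\mathbb{E}(Y \mid \mathcal{G})] = \mathbb{E} Y$.
8. Role of independence. If $Y$ is independent of $\mathcal{G}$, $\mathbb{E}(Y \mid \mathcal{G}) = \mathbb{E} Y$ a.s.
9. Conditional Jensen formula. If $c : \mathbb{R} \to \mathbb{R}$ is convex, and $\mathbb{E}|c(X)| < \infty$, then
$$\mathbb{E}(c(X) \mid \mathcal{G}) \ge c(\mathbb{E}(X \mid \mathcal{G})).$$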
Proof. 1. Here $\mathcal{G} = \{\emptyset, \Omega\}$ is the smallest possible σ-algebra (any σ-algebra of subsets of $\Omega$ contains $\emptyset$ and $\Omega$), and represents 'knowing nothing'. We have to check (B.4) for $G = \emptyset$ and $G = \Omega$. For $G = \emptyset$ both sides are zero; for $G = \Omega$ both sides are $\mathbb{E} Y$.
2. Here $\mathcal{G} = \mathcal{F}$ is the largest possible σ-algebra, and represents 'knowing everything'. We have to check (B.4) for all sets $G \in \mathcal{F}$. The only integrand that integrates like $Y$ over all sets is $Y$ itself, or a function agreeing with $Y$ except on a set of measure zero.
Note. When we condition on $\mathcal{F}$ ('knowing everything'), we know $Y$ (because we know everything). There is thus no uncertainty left in $Y$ to average out, so taking the conditional expectation (averaging out remaining randomness) has no effect, and leaves $Y$ unaltered.
3. Recall that $Y$ is always $\mathcal{F}$-measurable (this is the definition of $Y$ being a random variable). For $\mathcal{G} \subset \mathcal{F}$, $Y$ may not be $\mathcal{G}$-measurable, but if it is, the proof above applies with $\mathcal{G}$ in place of $\mathcal{F}$.
Note. To say that $Y$ is $\mathcal{G}$-measurable is to say that $Y$ is known given $\mathcal{G}$ – that is, when we are conditioning on $\mathcal{G}$. Then $Y$ is no longer random (being known when $\mathcal{G}$ is given), and so counts as a constant when the conditioning is performed.
4. Let $Z$ be a version of $\mathbb{E}(X \mid \mathcal{G})$. If $\mathbb{P}(Z < 0) > 0$, then for some $n$, the set
$$G := \{Z < -n^{-1}\} \in \mathcal{G} \quad \text{and} \quad \mathbb{P}(\{Z < -n^{-1}\}) > 0.$$
Thus
$$0 \le \mathbb{E}(X 1_G) = \mathbb{E}(Z 1_G) < -n^{-1}\, \mathbb{P}(G) < 0,$$
which contradicts the positivity of $X$.
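5. First, consider the case when $Y$ is discrete. Then $Y$ can be written as
$$Y = \sum_{n=1}^{N} b_n 1_{B_n},$$
for constants $b_n$ and events $B_n \in \mathcal{G}$. Then for any $B \in \mathcal{G}$, $B \cap B_n \in \mathcal{G}$ also (as $\mathcal{G}$ is a σ-algebra), and using linearity and (B.4):
$$\int_B Y\, \mathbb{E}(Z \mid \mathcal{G}) \, d\mathbb{P} = \int_B \left(\sum_{n=1}^{N} b_n 1_{B_n}\right) \mathbb{E}(Z \mid \mathcal{G}) \, d\mathbb{P} = \sum_{n=1}^{N} b_n \int_{B \cap B_n} \mathbb{E}(Z \mid \mathcal{G}) \, d\mathbb{P}$$
$$= \sum_{n=1}^{N} b_n \int_{B \cap B_n} Z \, d\mathbb{P} = \int_B \sum_{n=1}^{N} b_n 1_{B_n} Z \, d\mathbb{P} = \int_B Y Z \, d\mathbb{P}.$$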
Since this holds for all $B \in \mathcal{G}$, the result holds by (B.4).
For the general case, we approximate a general random variable $Y$ by a sequence of discrete random variables $Y_n$, for each of which the result holds as just proved. We omit details of the proof here, which involves the standard approximation steps based on the monotone convergence theorem from measure theory (see e.g. Williams (1991), p.90, proof of (j)). We are thus left to show that $\mathbb{E}(|ZY|) < \infty$, which follows from the assumption that $Y$ is bounded and $Z \in L^1$.
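6. $\mathbb{E}_{\mathcal{G}_0} \mathbb{E}_{\mathcal{G}} Y$ is $\mathcal{G}_0$-measurable, and for $C \in \mathcal{G}_0 \subset \mathcal{G}$, using the definition of $\mathbb{E}_{\mathcal{G}_0}$, $\mathbb{E}_{\mathcal{G}}$:
$$\int_C \mathbb{E}_{\mathcal{G}_0}[\mathbb{E}_{\mathcal{G}} Y] \, d\mathbb{P} = \int_C \mathbb{E}_{\mathcal{G}} Y \, d\mathbb{P} = \int_C Y \, d\mathbb{P}.$$
So $\mathbb{E}_{\mathcal{G}_0}[\mathbb{E}_{\mathcal{G}} Y]$ satisfies the defining relation for $\mathbb{E}_{\mathcal{G}_0} Y$. Being also $\mathcal{G}_0$-measurable, it is $\mathbb{E}_{\mathcal{G}_0} Y$ (a.s.).
We also have:
6'. If $\mathcal{G}_0 \subset \mathcal{G}$, $\mathbb{E}[\mathbb{E}(Y \mid \mathcal{G}_0) \mid \mathcal{G}] = \mathbb{E}[Y \mid \mathcal{G}_0]$ a.s.
Proof. $\mathbb{E}[Y \mid \mathcal{G}_0]$ is $\mathcal{G}_0$-measurable, so $\mathcal{G}$-measurable as $\mathcal{G}_0 \subset \mathcal{G}$, so $\mathbb{E}[\,\cdot \mid \mathcal{G}]$ has no effect on it, by 3.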
Note.
6, 6' are the two forms of the iterated conditional expectations property. When conditioning on two σ-algebras, one larger (finer), one smaller (coarser), the coarser rubs out the effect of the finer, either way round. This may be thought of as the coarse-averaging property: we shall use this term interchangeably with the iterated conditional expectations property (Williams (1991) uses the term tower property).
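7. Take $\mathcal{G}_0 = \{\emptyset, \Omega\}$ in 6. and use 1.
8. If $Y$ is independent of $\mathcal{G}$, $Y$ is independent of $1_B$ for every $B \in \mathcal{G}$. So by (B.4) and linearity,
$$\int_B \mathbb{E}(Y \mid \mathcal{G}) \, d\mathbb{P} = \int_B Y \, d\mathbb{P} = \int_{\Omega} 1_B Y \, d\mathbb{P} = \mathbb{E}(1_B Y) = \mathbb{E}(1_B)\, \mathbb{E}(Y) = \int_B \mathbb{E} Y \, d\mathbb{P},$$
using the multiplication theorem for independent random variables. Since this holds for all $B \in \mathcal{G}$, the result follows by (B.4).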
9. Recall (see e.g. Williams (1991), §6.6a, §9.7h, §9.8h) that for every convex function $c$ there exists a countable sequence $((a_n, b_n))$ of points in $\mathbb{R}^2$ such that
$$c(x) = \sup_n (a_n x + b_n), \quad x \in \mathbb{R}.$$
For each fixed $n$ we use 4. to see from $c(X) \ge a_n X + b_n$ that
$$\mathbb{E}[c(X) \mid \mathcal{G}] \ge a_n \mathbb{E}(X \mid \mathcal{G}) + b_n.$$
So,
$$\mathbb{E}[c(X) \mid \mathcal{G}] \ge \sup_n (a_n \mathbb{E}(X \mid \mathcal{G}) + b_n) = c(\mathbb{E}(X \mid \mathcal{G})).$$
Remark B.5.1. If in 6, 6' we take $\mathcal{G} = \mathcal{G}_0$, we obtain:
$$\mathbb{E}[\mathbb{E}(X \mid \mathcal{G}) \mid \mathcal{G}] = \mathbb{E}(X \mid \mathcal{G}).$$
Thus the map $X \to \mathbb{E}(X \mid \mathcal{G})$ is idempotent: applying it twice is the same as applying it once. Hence we may identify the conditional expectation operator as a projection.
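The iterated conditional expectations property and the idempotence just noted can be verified exactly on a small finite space. The Python sketch below assumes three tosses of a fair coin, with the coarser σ-algebra generated by the first toss and the finer one by the first two; these choices, and the helper `cond_exp`, are illustrative assumptions rather than part of the notes.

```python
from fractions import Fraction as Fr
from itertools import product

omega = list(product((0, 1), repeat=3))      # three coin tosses
prob = {w: Fr(1, 8) for w in omega}          # fair coin (an assumption)

def cond_exp(Y, key):
    """E(Y | sigma(key)) as a dict on omega; key(w) labels the atom of w."""
    out = {}
    for w in omega:
        atom = [v for v in omega if key(v) == key(w)]
        mass = sum(prob[v] for v in atom)
        out[w] = sum(Y[v] * prob[v] for v in atom) / mass
    return out

Y = {w: Fr(sum(w)) ** 2 for w in omega}      # (number of heads)^2
coarse = lambda w: w[:1]                     # G_0: first toss only
fine = lambda w: w[:2]                       # G: first two tosses

inner = cond_exp(Y, fine)
tower = cond_exp(inner, coarse)              # E[E(Y|G)|G_0]
direct = cond_exp(Y, coarse)                 # E[Y|G_0]
reverse = cond_exp(direct, fine)             # E[E(Y|G_0)|G], property 6'
idempotent = cond_exp(inner, fine)           # E[E(Y|G)|G]
print(tower == direct, reverse == direct, idempotent == inner)
```

In all three comparisons the coarser σ-algebra determines the result, in line with the coarse-averaging interpretation.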
B.6 Modes of Convergence
So far, we have dealt with one probability measure – or its expectation operator – at a time. We shall, however, have many occasions to consider a whole sequence of them, converging (in a suitable sense) to some limiting probability measure. Such situations arise, for example, whenever we approximate a financial model in continuous time (such as the continuous-time Black-Scholes model of §6.2) by a sequence of models in discrete time (such as the discrete-time Black-Scholes model of §4.6).
In the stochastic-process setting – such as the passage from discrete to continuous Black-Scholes models mentioned above – we need concepts beyond those we have to hand, which we develop later. We confine ourselves here to setting out what we need to discuss convergence of random variables, in the various senses that are useful.
The first idea that occurs to one is to use the ordinary convergence concept in this new setting, of random variables: then if $X_n$, $X$ are random variables,
$$X_n \to X \quad (n \to \infty)$$
would be taken literally – as if the $X_n$, $X$ were non-random. For instance, if $X_n$ is the observed frequency of heads in a long series of $n$ independent tosses of a fair coin, $X = 1/2$ the expected frequency, then the above in this case would be the man-in-the-street's idea of the 'law of averages'. It turns out that the above statement is false in this case, taken literally: some qualification is needed. However, the qualification needed is absolutely the minimal one imaginable: one merely needs to exclude a set of probability zero – that is, to assert convergence on a set of probability one ('almost surely'), rather than everywhere.
Definition B.6.1. If $X_n$, $X$ are random variables, we say $X_n$ converges to $X$ almost surely –
$$X_n \to X \quad (n \to \infty) \quad \text{a.s.}$$
– if $X_n \to X$ with probability one – that is, if
$$\mathbb{P}(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = 1.$$
The loose idea of the 'law of averages' has as its precise form a statement on convergence almost surely. This is Kolmogorov's strong law of large numbers (see e.g. Williams (1991), §12.10), which is quite difficult to prove.
Weaker convergence concepts are also useful: they may hold under weaker conditions, or they
may be easier to prove.
Definition B.6.2. If $X_n$, $X$ are random variables, we say that $X_n$ converges to $X$ in probability –
$$X_n \to X \quad (n \to \infty) \quad \text{in probability}$$
– if, for all $\varepsilon > 0$,
$$\mathbb{P}(\{\omega : |X_n(\omega) - X(\omega)| > \varepsilon\}) \to 0 \quad (n \to \infty).$$
It turns out that convergence almost surely implies convergence in probability, but not in general conversely. Thus almost-sure convergence is a stronger convergence concept than convergence in probability. This comparison is reflected in the form the 'law of averages' takes for convergence in probability: this is called the weak law of large numbers, which as its name implies is a weaker form of the strong law of large numbers. It is correspondingly much easier to prove: indeed, we shall prove it in §2.8 below.
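For the coin-tossing example, convergence in probability can be inspected without simulation, because the distribution of the frequency of heads is binomial and the relevant probability is a finite sum. The Python sketch below computes it exactly for a fixed tolerance; the tolerance $1/10$ and the sample sizes are arbitrary illustrative choices.

```python
from fractions import Fraction as Fr
from math import comb

def prob_far_from_half(n, eps):
    """P(|X_n - 1/2| > eps) for X_n = (number of heads)/n, fair coin."""
    far = sum(comb(n, k) for k in range(n + 1)
              if abs(Fr(k, n) - Fr(1, 2)) > eps)
    return far / 2**n                # all 2^n outcomes are equally likely

eps = Fr(1, 10)
probs = [prob_far_from_half(n, eps) for n in (10, 100, 1000)]
print(probs)
```

The probability of a deviation larger than the tolerance falls rapidly with the number of tosses, which is what the weak law of large numbers asserts for every fixed tolerance.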
Recall the $L^p$-spaces of $p$th-power integrable functions (§2.2). We similarly define the $L^p$-spaces of $p$th-power integrable random variables: if $p \ge 1$ and $X$ is a random variable with
$$\|X\|_p := (\mathbb{E}|X|^p)^{1/p} < \infty,$$
we say that $X \in L^p$ (or $L^p(\Omega, \mathcal{F}, \mathbb{P})$ to be precise). For $X_n, X \in L^p$, there is a natural convergence concept: we say that $X_n$ converges to $X$ in $L^p$, or in $p$th mean,
$$X_n \to X \quad \text{in } L^p,$$
if
$$\|X_n - X\|_p \to 0 \quad (n \to \infty),$$
that is, if
$$\mathbb{E}(|X_n - X|^p) \to 0 \quad (n \to \infty).$$
The cases $p = 1, 2$ are particularly important: if $X_n \to X$ in $L^1$, we say that $X_n \to X$ in mean; if $X_n \to X$ in $L^2$ we say that $X_n \to X$ in mean square. Convergence in $p$th mean is not directly comparable with convergence almost surely (of course, we have to restrict to random variables in $L^p$ for the comparison even to be meaningful): neither implies the other. Both, however, imply convergence in probability.
All the modes of convergence discussed so far involve the values of random variables. Often,
however, it is only the distributions of random variables that matter. In such cases, the natural
mode of convergence is the following:
Definition B.6.3. We say that random variables $X_n$ converge to $X$ in distribution if the distribution functions of $X_n$ converge to that of $X$ at all points of continuity of the latter:
$$X_n \to X \quad \text{in distribution}$$
if
$$\mathbb{P}(\{X_n \le x\}) \to \mathbb{P}(\{X \le x\}) \quad (n \to \infty)$$
for all points $x$ at which the right-hand side is continuous.
The restriction to continuity points x of the limit seems awkward at ﬁrst, but it is both natural
and necessary. It is also quite weak: note that the function x → IP(¦X ≤ x¦), being monotone in
x, is continuous except for at most countably many jumps. The set of continuity points is thus
uncountable: ‘most’ points are continuity points.
Convergence in distribution is (by far) the weakest of the modes of convergence introduced so far: convergence in probability implies convergence in distribution, but not conversely. There is, however, a partial converse (which we shall need in §2.8): if the limit $X$ is constant (non-random), convergence in probability and in distribution are equivalent.
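A classical instance of convergence in distribution is the approach of the standardised binomial distribution to the standard normal one. The sketch below compares the two distribution functions on a small grid of points; the grid, the success probability $1/2$ and the sample sizes are assumptions chosen for illustration, and the normal distribution function is evaluated via the error function.

```python
from math import comb, erf, sqrt

def normal_cdf(x):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def standardised_binomial_cdf(n, p, x):
    """P((S_n - n p) / sqrt(n p (1 - p)) <= x) for S_n ~ B(n, p)."""
    cutoff = n * p + x * sqrt(n * p * (1 - p))
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1) if k <= cutoff)

def max_gap(n, p=0.5, grid=(-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)):
    """Largest gap between the two distribution functions on the grid."""
    return max(abs(standardised_binomial_cdf(n, p, x) - normal_cdf(x))
               for x in grid)

gaps = [max_gap(n) for n in (10, 100, 1000)]
print(gaps)
```

Because the binomial distribution function is a step function, the gap at a fixed point need not decrease monotonically in the sample size, but it becomes small for large samples.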
Weak Convergence.
If $\mathbb{P}_n$, $\mathbb{P}$ are probability measures, we say that
$$\mathbb{P}_n \to \mathbb{P} \quad (n \to \infty) \quad \text{weakly}$$
if
$$\int f \, d\mathbb{P}_n \to \int f \, d\mathbb{P} \quad (n \to \infty) \qquad \text{(B.5)}$$
for all bounded continuous functions $f$. This definition is given a full-length book treatment in Billingsley (1968), and we refer to this for background and details. For ordinary (real-valued) random variables, weak convergence of their probability measures is the same as convergence in distribution of their distribution functions. However, the weak-convergence definition above applies equally, not just to this one-dimensional case, or to the finite-dimensional (vector-valued) setting, but also to infinite-dimensional settings such as arise in convergence of stochastic processes. We shall need such a framework in the passage from discrete to continuous-time Black-Scholes models.
Appendix C
Stochastic Processes in Discrete Time
C.1 Information and Filtrations
Access to full, accurate, up-to-date information is clearly essential to anyone actively engaged in
ﬁnancial activity or trading. Indeed, information is arguably the most important determinant
of success in ﬁnancial life. Partly for simplicity, partly to reﬂect the legislation and regulations
against insider trading, we shall conﬁne ourselves to the situation where agents take decisions on
the basis of information in the public domain, and available to all. We shall further assume that
information once known remains known – is not forgotten – and can be accessed in real time.
In reality, of course, matters are more complicated. Information overload is as much of a danger
as information scarcity. The ability to retain information, organise it, and access it quickly, is one
of the main factors which will discriminate between the abilities of diﬀerent economic agents to
react to changing market conditions. However, we restrict ourselves here to the simplest possible
situation and do not diﬀerentiate between agents on the basis of their informationprocessing
abilities. Thus as time passes, new information becomes available to all agents, who continually
update their information. What we need is a mathematical language to model this information
ﬂow, unfolding with time. This is provided by the idea of a ﬁltration; we outline below the elements
of this theory that we shall need.
The Kolmogorov triples $(\Omega, \mathcal{F}, P)$, and the Kolmogorov conditional expectations $\mathbb{E}(X \mid \mathcal{B})$, give
us all the machinery we need to handle static situations involving randomness. To handle dynamic
situations, involving randomness which unfolds with time, we need further structure.
We may take the initial, or starting, time as t = 0. Time may evolve discretely, or continuously.
We postpone the continuous case to Chapter 5; in the discrete case, we may suppose time evolves
in integer steps, t = 0, 1, 2, . . . (say, stock-market quotations daily, or tick data by the second).
There may be a final time T, or time horizon, or we may have an infinite time horizon (in the
context of option pricing, the time horizon T is the expiry time).
We wish to model a situation involving randomness unfolding with time. As above, we suppose,
for simplicity, that information is never lost (or forgotten): thus, as time increases we learn more.
We recall from Chapter 2 that σ-algebras represent information or knowledge. We thus need a
sequence of σ-algebras $\{\mathcal{F}_n : n = 0, 1, 2, \ldots\}$, which are increasing:
$$\mathcal{F}_n \subset \mathcal{F}_{n+1} \quad (n = 0, 1, 2, \ldots),$$
with $\mathcal{F}_n$ representing the information, or knowledge, available to us at time n. We shall always
suppose all σ-algebras to be complete (this can be avoided, and is not always appropriate, but it
simplifies matters and suffices for our purposes). Thus $\mathcal{F}_0$ represents the initial information (if
there is none, $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the trivial σ-algebra). On the other hand,
$$\mathcal{F}_\infty := \lim_{n \to \infty} \mathcal{F}_n = \sigma\Big( \bigcup_n \mathcal{F}_n \Big)$$
represents all we ever will know (the ‘Doomsday σ-algebra’). Often, $\mathcal{F}_\infty$ will be $\mathcal{F}$ (the σ-algebra
from Chapter 2, representing ‘knowing everything’). But this will not always be so; see e.g.
Williams (1991), §15.8 for an interesting example. Such a family $\mathbb{F} := \{\mathcal{F}_n : n = 0, 1, 2, \ldots\}$
is called a filtration; a probability space endowed with such a filtration, $\{\Omega, \mathbb{F}, \mathcal{F}, P\}$, is called
a stochastic basis or filtered probability space. These definitions are due to P. A. Meyer of
Strasbourg; Meyer and the Strasbourg (and more generally, French) school of probabilists have
been responsible for the ‘general theory of (stochastic) processes’, and for much of the progress
in stochastic integration, since the 1960s; see e.g. Dellacherie and Meyer (1978), Dellacherie and
Meyer (1982), Meyer (1966), Meyer (1976).
For the special case of a finite state space $\Omega = \{\omega_1, \ldots, \omega_n\}$ and a given σ-algebra $\mathcal{F}$ on Ω
(which in this case is just an algebra) we can always find a unique finite partition $\mathcal{P} = \{A_1, \ldots, A_l\}$
of Ω, i.e. the sets $A_i$ are disjoint and $\bigcup_{i=1}^{l} A_i = \Omega$, corresponding to $\mathcal{F}$. A filtration $\mathbb{F}$ therefore
corresponds to a sequence of finer and finer partitions $\mathcal{P}_n$. At time t = 0 the agents only know that
some event ω ∈ Ω will happen; at time T < ∞ they know which specific event $\omega^*$ has happened.
During the flow of time the agents learn the specific structure of the (σ-)algebras $\mathcal{F}_n$, which means
they learn the corresponding partitions $\mathcal{P}_n$. Having the information in $\mathcal{F}_n$ revealed is equivalent to
knowing in which $A_i^{(n)} \in \mathcal{P}_n$ the event $\omega^*$ is. Since the partitions become finer, the information on
$\omega^*$ becomes more detailed with each step.
Unfortunately this nice interpretation breaks down as soon as Ω becomes infinite. It turns
out that the concept of filtrations rather than that of partitions is relevant for the more general
situations of infinite Ω, infinite T and continuous-time processes.
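The partition picture for finite Ω can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample space (three coin tosses) and the helper names `partition_at`, `is_finer` and `block_containing` are hypothetical and do not come from the notes.

```python
from itertools import product

# Hypothetical finite sample space: all outcomes of three coin tosses.
OMEGA = ["".join(w) for w in product("HT", repeat=3)]

def partition_at(n):
    """Partition of OMEGA generated by observing the first n tosses."""
    blocks = {}
    for omega in OMEGA:
        blocks.setdefault(omega[:n], []).append(omega)
    return [frozenset(block) for block in blocks.values()]

def is_finer(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(a <= b for b in coarse) for a in fine)

def block_containing(partition, omega):
    """The block of the partition in which the realised outcome lies."""
    return next(block for block in partition if omega in block)

# One partition per time step n = 0, 1, 2, 3.
partitions = [partition_at(n) for n in range(4)]
for n, p in enumerate(partitions):
    # Number of blocks, and what is known about the outcome "HTH" at time n.
    print(n, len(p), sorted(block_containing(p, "HTH")))
```

At time 0 the single block is all of Ω (nothing is known), and at time 3 every block is a singleton, which matches the description of the agents learning the outcome step by step.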
C.2 Discrete-Parameter Stochastic Processes
The word ‘stochastic’ (derived from the Greek) is roughly synonymous with ‘random’. It is perhaps
unfortunate that usage favours ‘stochastic process’ rather than the simpler ‘random process’, but
as it does, we shall follow it.
We need a framework which can handle dynamic situations, in which time evolves, and in
which new information unfolds with time. In particular, we need to be able to speak in terms of
‘the information available at time n’, or, ‘what we know at time n’. Further, we need to be able to
increase n – thereby increasing the information available as new information (typically, new price
information) comes in, and talk about the information ﬂow over time. One has a clear mental
picture of what is meant by this – there is no conceptual diﬃculty. However, what is needed
is a precise mathematical construct, which can be conveniently manipulated – perhaps in quite
complicated ways – and yet which bears the above heuristic meaning. Now ‘information’ is not
only an ordinary word, but even a technical term in mathematics – many books have been written
on the subject of information theory. However, information theory in this sense is not what we
need: for us, the emphasis is on the ﬂow of information, and how to model and describe it. With
this by way of motivation, we proceed to give some of the necessary deﬁnitions.
A stochastic process $X = \{X_n : n \in I\}$ is a family of random variables, defined on some
common probability space, indexed by an index-set I. Usually (always in this book), I represents
time (sometimes I represents space, and one calls X a spatial process). Here, $I = \{0, 1, 2, \ldots, T\}$
(finite horizon) or $I = \{0, 1, 2, \ldots\}$ (infinite horizon). The (stochastic) process $X = (X_n)_{n=0}^{\infty}$ is
said to be adapted to the filtration $\mathbb{F} = (\mathcal{F}_n)_{n=0}^{\infty}$ if
$$X_n \text{ is } \mathcal{F}_n\text{-measurable for all } n.$$
So if X is adapted, we will know the value of $X_n$ at time n. If
$$\mathcal{F}_n = \sigma(X_0, X_1, \ldots, X_n)$$
we call $(\mathcal{F}_n)$ the natural filtration of X. Thus a process is always adapted to its natural filtration.
A typical situation is that
$$\mathcal{F}_n = \sigma(W_0, W_1, \ldots, W_n)$$
is the natural filtration of some process $W = (W_n)$. Then X is adapted to $\mathbb{F} = (\mathcal{F}_n)$, i.e. each
$X_n$ is $\mathcal{F}_n$- (or $\sigma(W_0, \ldots, W_n)$-) measurable, iff
$$X_n = f_n(W_0, W_1, \ldots, W_n)$$
for some measurable function $f_n$ (non-random) of n + 1 variables.
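The characterisation $X_n = f_n(W_0, \ldots, W_n)$ can be checked mechanically on a small example. The sketch below assumes a hypothetical driving process W (a simple ±1 walk over three steps) and two illustrative processes; the names `running_max`, `look_ahead` and `is_adapted` are not from the notes.

```python
from itertools import product

def walk_paths(horizon):
    """All paths (W_0, ..., W_horizon) of a +/-1 walk started at 0."""
    paths = []
    for steps in product((-1, 1), repeat=horizon):
        path = [0]
        for step in steps:
            path.append(path[-1] + step)
        paths.append(tuple(path))
    return paths

def running_max(path, n):
    """X_n = max(W_0, ..., W_n): uses only the path up to time n."""
    return max(path[: n + 1])

def look_ahead(path, n):
    """X_n = W_{n+1} (capped at the horizon): peeks one step into the future."""
    return path[min(n + 1, len(path) - 1)]

def is_adapted(process, paths, horizon):
    """Check that process(path, n) is a function of (W_0, ..., W_n) alone."""
    for n in range(horizon + 1):
        seen = {}
        for path in paths:
            history = path[: n + 1]
            value = process(path, n)
            # Two paths with the same history must give the same value.
            if seen.setdefault(history, value) != value:
                return False
    return True

PATHS = walk_paths(3)
print(is_adapted(running_max, PATHS, 3))  # True
print(is_adapted(look_ahead, PATHS, 3))   # False
```

The running maximum is adapted because its value at time n is fixed by the history up to n, while the look-ahead process already fails at time 0, where two paths share $W_0 = 0$ but differ in $W_1$.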
Notation.
For a random variable X on $(\Omega, \mathcal{F}, \mathbb{P})$, X(ω) is the value X takes on ω (ω represents the
randomness). For a stochastic process $X = (X_n)$, it is convenient (e.g., if using suffixes, $n_i$ say) to
use $X_n$, $X(n)$ interchangeably, and we shall feel free to do this. With ω displayed, these become
$X_n(\omega)$, $X(n, \omega)$, etc.
The concept of a stochastic process is very general – and so very ﬂexible – but it is too general
for useful progress to be made without specifying further structure or further restrictions. There
are two main types of stochastic process which are both general enough to be suﬃciently ﬂexible
to model many commonly encountered situations, and suﬃciently speciﬁc and structured to have
a rich and powerful theory. These two types are Markov processes and martingales. A Markov
process models a situation in which where one is, is all one needs to know when wishing to predict
the future – how one got there provides no further information. Such a ‘lack of memory’ property,
though an idealisation of reality, is very useful for modelling purposes. We shall encounter Markov
processes more in continuous time (see Chapter 5) than in discrete time, where usage dictates that
they are called Markov chains. For an excellent and accessible recent treatment of Markov chains,
see e.g. Norris (1997). Martingales, on the other hand (see §C.3 below) model fair gambling games
– situations where there may be lots of randomness (or unpredictability), but no tendency to drift
one way or another: rather, there is a tendency towards stability, in that the chance inﬂuences
tend to cancel each other out on average.
C.3 Deﬁnition and basic properties of martingales
Excellent accounts of discrete-parameter martingales are Neveu (1975), Williams (1991) and
Williams (2001) to which we refer the reader for detailed discussions. We will summarise what we
need to use martingales for modelling in finance.
Definition C.3.1. A process $X = (X_n)$ is called a martingale relative to $(\{\mathcal{F}_n\}, \mathbb{P})$ if
(i) X is adapted (to $\{\mathcal{F}_n\}$);
(ii) $\mathbb{E}|X_n| < \infty$ for all n;
(iii) $\mathbb{E}[X_n \mid \mathcal{F}_{n-1}] = X_{n-1}$ $\mathbb{P}$-a.s. (n ≥ 1).
X is a supermartingale if in place of (iii)
$$\mathbb{E}[X_n \mid \mathcal{F}_{n-1}] \le X_{n-1} \quad \mathbb{P}\text{-a.s.} \quad (n \ge 1);$$
X is a submartingale if in place of (iii)
$$\mathbb{E}[X_n \mid \mathcal{F}_{n-1}] \ge X_{n-1} \quad \mathbb{P}\text{-a.s.} \quad (n \ge 1).$$
Martingales have a useful interpretation in terms of dynamic games: a martingale is ‘constant
on average’, and models a fair game; a supermartingale is ‘decreasing on average’, and models an
unfavourable game; a submartingale is ‘increasing on average’, and models a favourable game.
Note.
1. Martingales have many connections with harmonic functions in probabilistic potential theory.
The terminology in the inequalities above comes from this: supermartingales correspond to
superharmonic functions, submartingales to subharmonic functions.
2. X is a submartingale (supermartingale) if and only if −X is a supermartingale (submartingale);
X is a martingale if and only if it is both a submartingale and a supermartingale.
3. $(X_n)$ is a martingale if and only if $(X_n - X_0)$ is a martingale. So we may without loss of
generality take $X_0 = 0$ when convenient.
4. If X is a martingale, then for m < n using the iterated conditional expectation and the
martingale property repeatedly (all equalities are in the a.s. sense)
$$\mathbb{E}[X_n \mid \mathcal{F}_m] = \mathbb{E}[\mathbb{E}(X_n \mid \mathcal{F}_{n-1}) \mid \mathcal{F}_m] = \mathbb{E}[X_{n-1} \mid \mathcal{F}_m] = \ldots = \mathbb{E}[X_m \mid \mathcal{F}_m] = X_m,$$
and similarly for submartingales, supermartingales.
From the Oxford English Dictionary: martingale (etymology unknown)
1. 1589. An article of harness, to control a horse’s head.
2. Naut. A rope for guying down the jib-boom to the dolphin-striker.
3. A system of gambling which consists in doubling the stake when losing in order to recoup
oneself (1815).
Thackeray: ‘You have not played as yet? Do not do so; above all avoid a martingale if you do.’
Gambling games have been studied since time immemorial – indeed, the Pascal-Fermat
correspondence of 1654 which started the subject was on a problem (de Méré’s problem) related to
gambling. The doubling strategy above has been known at least since 1815.
The term ‘martingale’ in our sense is due to J. Ville (1939). Martingales were studied by Paul
Lévy (1886–1971) from 1934 on (see obituary (Loève 1973)) and by J.L. Doob (1910–) from 1940
on. The first systematic exposition was (Doob 1953). This classic book, though hard going, is
still a valuable source of information.
Examples.
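1. Mean zero random walk: $S_n = \sum_{i=1}^{n} X_i$, with $X_i$ independent with $\mathbb{E}(X_i) = 0$, is a martingale
(submartingale: positive mean; supermartingale: negative mean).
2. Stock prices: $S_n = S_0 \zeta_1 \cdots \zeta_n$ with $\zeta_i$ independent positive r.vs with existing first moment
(this is a martingale when $\mathbb{E}(\zeta_i) = 1$ for all i).
3. Accumulating data about a random variable (Williams (1991), pp. 96, 166–167). If
$\xi \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, $M_n := \mathbb{E}(\xi \mid \mathcal{F}_n)$ (so $M_n$ represents our best estimate of ξ based on knowledge at
time n), then using iterated conditional expectations
$$\mathbb{E}[M_n \mid \mathcal{F}_{n-1}] = \mathbb{E}[\mathbb{E}(\xi \mid \mathcal{F}_n) \mid \mathcal{F}_{n-1}] = \mathbb{E}[\xi \mid \mathcal{F}_{n-1}] = M_{n-1},$$
so $(M_n)$ is a martingale. One has the convergence
$$M_n \to M_\infty := \mathbb{E}[\xi \mid \mathcal{F}_\infty] \quad \text{a.s. and in } L^1.$$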
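Examples 1 and 3 can be verified exactly on a small finite model. The following sketch assumes a fair ±1 coin over three steps with the uniform measure, so that conditional expectations reduce to averages over outcomes sharing the same history; the variable ξ and all function names are illustrative choices, not part of the notes.

```python
from fractions import Fraction
from itertools import product

HORIZON = 3
# Equally likely sequences of +/-1 steps (assumed fair coin).
OMEGA = list(product((-1, 1), repeat=HORIZON))

def cond_exp(f, n, omega):
    """E[f | F_n](omega): average of f over outcomes sharing omega's first n steps."""
    block = [w for w in OMEGA if w[:n] == omega[:n]]
    return sum(Fraction(f(w)) for w in block) / len(block)

def is_martingale(process):
    """Check E[X_n | F_{n-1}] = X_{n-1} for every n and every outcome."""
    return all(
        cond_exp(process(n), n - 1, w) == process(n - 1)(w)
        for n in range(1, HORIZON + 1)
        for w in OMEGA
    )

def walk(n):
    """Example 1: S_n, the sum of the first n steps."""
    return lambda w: sum(w[:n])

def squared_walk(n):
    """S_n squared: a submartingale, not a martingale."""
    return lambda w: sum(w[:n]) ** 2

def xi(w):
    """Hypothetical integrable random variable: positive part of S_3."""
    return max(sum(w), 0)

def best_estimate(n):
    """Example 3: M_n = E[xi | F_n]."""
    return lambda w: cond_exp(xi, n, w)

print(is_martingale(walk))           # True
print(is_martingale(best_estimate))  # True
print(is_martingale(squared_walk))   # False
```

Exact rational arithmetic (`Fraction`) is used so that the equality in the martingale property is tested without rounding error.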
C.4 Martingale Transforms
Now think of a gambling game, or series of speculative investments, in discrete time. There is no
play at time 0; there are plays at times n = 1, 2, . . ., and
$$\Delta X_n := X_n - X_{n-1}$$
represents our net winnings per unit stake at play n. Thus if $X_n$ is a martingale, the game is ‘fair
on average’.
Call a process $C = (C_n)_{n=1}^{\infty}$ predictable if $C_n$ is $\mathcal{F}_{n-1}$-measurable for all n ≥ 1. Think of $C_n$
as your stake on play n ($C_0$ is not defined, as there is no play at time 0). Predictability says that
you have to decide how much to stake on play n based on the history before time n (i.e., up to
and including play n − 1). Your winnings on game n are $C_n \Delta X_n = C_n (X_n - X_{n-1})$. Your total
(net) winnings up to time n are
$$Y_n = \sum_{k=1}^{n} C_k \Delta X_k = \sum_{k=1}^{n} C_k (X_k - X_{k-1}).$$
We write
$$Y = C \bullet X, \quad Y_n = (C \bullet X)_n, \quad \Delta Y_n = C_n \Delta X_n$$
($(C \bullet X)_0 = 0$ as $\sum_{k=1}^{0}$ is empty), and call C • X the martingale transform of X by C.
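Along a single path, the martingale transform is just a running sum of stake times increment. A minimal sketch, assuming the path and the stakes are given as plain Python lists (the function name `martingale_transform` and the sample numbers are illustrative):

```python
def martingale_transform(stakes, path):
    """Return [(C.X)_0, ..., (C.X)_N] for a path X_0..X_N and stakes C_1..C_N.

    stakes[k - 1] plays the role of C_k, the stake on play k.
    """
    if len(stakes) != len(path) - 1:
        raise ValueError("need exactly one stake per increment of the path")
    winnings = [0]  # (C.X)_0 = 0: the empty sum
    for k in range(1, len(path)):
        increment = path[k] - path[k - 1]          # Delta X_k
        winnings.append(winnings[-1] + stakes[k - 1] * increment)
    return winnings

# Hypothetical path X_0..X_4 and stakes C_1..C_4.
path = [0, 1, 0, -1, 0]
stakes = [1, 1, 2, 4]
print(martingale_transform(stakes, path))  # [0, 1, 0, -2, 2]
```

Note that this computes the transform pathwise only; whether the stakes are predictable is a property of how they are chosen across all paths, which a single list cannot express.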
Theorem C.4.1. (i) If C is a bounded non-negative predictable process and X is a supermartingale,
C • X is a supermartingale null at zero.
(ii) If C is bounded and predictable and X is a martingale, C • X is a martingale null at zero.
Proof. Y = C • X is integrable, since C is bounded and X integrable. Now
$$\mathbb{E}[Y_n - Y_{n-1} \mid \mathcal{F}_{n-1}] = \mathbb{E}[C_n (X_n - X_{n-1}) \mid \mathcal{F}_{n-1}] = C_n \, \mathbb{E}[(X_n - X_{n-1}) \mid \mathcal{F}_{n-1}]$$
(as $C_n$ is bounded, so integrable, and $\mathcal{F}_{n-1}$-measurable, so can be taken out)
$$\le 0$$
in case (i), as C ≥ 0 and X is a supermartingale,
$$= 0$$
in case (ii), as X is a martingale.
Interpretation. You can’t beat the system! In the martingale case, predictability of C means
we can’t foresee the future (which is realistic and fair). So we expect to gain nothing – as we should.
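The ‘can’t beat the system’ statement can be illustrated with the doubling strategy mentioned earlier. The sketch below assumes a fair ±1 game over a fixed finite horizon (so the stakes are bounded) and enumerates all paths exactly; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def doubling_stake(previous_steps):
    """Stake on the next play, decided from past steps only (predictable)."""
    if 1 in previous_steps:            # already won once: stop playing
        return 0
    return 2 ** len(previous_steps)    # otherwise double: 1, 2, 4, ...

def total_winnings(steps, stake_rule):
    """(C.X)_N along one path, where steps[k] is the increment of play k + 1."""
    return sum(stake_rule(steps[:k]) * steps[k] for k in range(len(steps)))

def expected_winnings(horizon, stake_rule):
    """Exact expectation of (C.X)_N over all equally likely +/-1 paths."""
    paths = list(product((-1, 1), repeat=horizon))
    return sum(Fraction(total_winnings(p, stake_rule)) for p in paths) / len(paths)

print(total_winnings((-1, -1, 1, -1), doubling_stake))   # 1
print(total_winnings((-1, -1, -1, -1), doubling_stake))  # -15
print(expected_winnings(4, doubling_stake))              # 0
```

With four plays the gambler wins 1 unit on 15 of the 16 paths and loses 15 units on the remaining one, so the expectation is exactly zero, in line with part (ii) of the theorem.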
Note.
1. Martingale transforms were introduced and studied by Burkholder (1966). For a textbook
account, see e.g. Neveu (1975), VIII.4.
2. Martingale transforms are the discrete analogues of stochastic integrals. They dominate the
mathematical theory of ﬁnance in discrete time, just as stochastic integrals dominate the theory
in continuous time.
Lemma C.4.1 (Martingale Transform Lemma). An adapted sequence of real integrable random
variables $(X_n)$ is a martingale iff for any bounded predictable sequence $(C_n)$,
$$\mathbb{E}\Big( \sum_{k=1}^{n} C_k \Delta X_k \Big) = 0 \quad (n = 1, 2, \ldots).$$
Proof. If $(X_n)$ is a martingale, Y defined by $Y_0 = 0$,
$$Y_n = \sum_{k=1}^{n} C_k \Delta X_k \quad (n \ge 1)$$
is the martingale transform C • X, so is a martingale. Now
$\mathbb{E}(Y_1) = \mathbb{E}(C_1 \, \mathbb{E}(X_1 - X_0 \mid \mathcal{F}_0)) = 0$ and
we see by induction that
$$\mathbb{E}(Y_{n+1}) = \mathbb{E}(C_{n+1}(X_{n+1} - X_n)) + \mathbb{E}(Y_n) = 0.$$
Conversely, if the condition of the lemma holds, choose j, and for any $\mathcal{F}_j$-measurable set
A write $C_n = 0$ for $n \ne j + 1$, $C_{j+1} = 1_A$. Then $(C_n)$ is predictable, so the condition of the
lemma, $\mathbb{E}(\sum_{k=1}^{n} C_k \Delta X_k) = 0$, becomes (for n ≥ j + 1)
$$\mathbb{E}[1_A (X_{j+1} - X_j)] = 0.$$
Since this holds for every set $A \in \mathcal{F}_j$, the definition of conditional expectation gives
$$\mathbb{E}(X_{j+1} \mid \mathcal{F}_j) = X_j.$$
Since this holds for every j, $(X_n)$ is a martingale.
Remark C.4.1. The proof above is a good example of the value of Kolmogorov’s deﬁnition of
conditional expectation – which reveals itself, not in immediate transparency, but in its ease of
handling in proofs. We shall see in Chapter 4 the ﬁnancial signiﬁcance of martingale transforms
H • M.
Bibliography
Allingham, M., 1991, Arbitrage. Elements of financial economics. (MacMillan).
Barndorff-Nielsen, O.E., 1998, Processes of normal inverse Gaussian type, Finance and Stochastics
2, 41–68.
Billingsley, P., 1968, Convergence of probability measures. (Wiley, New York).
Björk, T., 1995, Arbitrage theory in continuous time, Notes from Ascona meeting.
Björk, T., 1997, Interest rate theory, in Financial Mathematics, ed. by W.J. Runggaldier, Lecture
Notes in Mathematics pp. 53–122. Springer, Berlin New York London.
Black, F., and M. Scholes, 1973, The pricing of options and corporate liabilities, Journal of Political
Economy 81, 637–654.
Brown, R.H., and S.M. Schaefer, 1995, Interest rate volatility and the shape of the term structure,
in S.D. Howison, F.P. Kelly, and P. Wilmott, eds.: Mathematical models in finance (Chapman
& Hall).
Burkholder, D.L., 1966, Martingale transforms, Ann. Math. Statist. 37, 1494–1504.
Burkill, J.C., and H. Burkill, 1970, A second course in mathematical analysis. (Cambridge
University Press).
Cochrane, J.H., 2001, Asset pricing. (Princeton University Press).
Cox, D.R., and H.D. Miller, 1972, The theory of stochastic processes. (Chapman and Hall, London
and New York) First published 1965 by Methuen & Co Ltd.
Cox, J.C., and S.A. Ross, 1976, The valuation of options for alternative stochastic processes,
Journal of Financial Economics 3, 145–166.
Cox, J.C., S.A. Ross, and M. Rubinstein, 1979, Option pricing: a simplified approach, J. Financial
Economics 7, 229–263.
Cox, J.C., and M. Rubinstein, 1985, Options markets. (Prentice-Hall).
Davis, M.H.A., 1994, A general option pricing formula, Preprint, Imperial College.
Dellacherie, C., and P.A. Meyer, 1978, Probabilities and potential vol. A. (Hermann, Paris).
Dellacherie, C., and P.A. Meyer, 1982, Probabilities and potential vol. B. (North Holland,
Amsterdam New York).
Doob, J.L., 1953, Stochastic processes. (Wiley).
Dothan, M.U., 1990, Prices in financial markets. (Oxford University Press).
Dudley, R.M., 1989, Real Analysis and Probability. (Wadsworth, Pacific Grove).
Duffie, D., 1992, Dynamic Asset Pricing Theory. (Princeton University Press).
Duffie, D., and R. Kan, 1995, Multifactor term structure models, in S.D. Howison, F.P. Kelly,
and P. Wilmott, eds.: Mathematical models in finance (Chapman & Hall).
Durrett, R., 1996, Probability: Theory and Examples. (Duxbury Press at Wadsworth Publishing
Company) 2nd edn.
Durrett, R., 1999, Essentials of stochastic processes. (Springer).
Eberlein, E., and U. Keller, 1995, Hyperbolic distributions in finance, Bernoulli 1, 281–299.
Edwards, F.R., and C.W. Ma, 1992, Futures and Options. (McGraw-Hill, New York).
El Karoui, N., R. Myneni, and R. Viswanathan, 1992, Arbitrage pricing and hedging of interest
rate claims with state variables: I. Theory, Université de Paris VI and Stanford University.
Grimmett, G.R., and D.J.A. Welsh, 1986, Probability: An introduction. (Oxford University Press).
Harrison, J.M., 1985, Brownian motion and stochastic flow systems. (John Wiley and Sons, New
York).
Harrison, J.M., and D.M. Kreps, 1979, Martingales and arbitrage in multiperiod securities
markets, J. Econ. Th. 20, 381–408.
Harrison, J.M., and S.R. Pliska, 1981, Martingales and stochastic integrals in the theory of
continuous trading, Stochastic Processes and their Applications 11, 215–260.
Heath, D., R. Jarrow, and A. Morton, 1992, Bond pricing and the term structure of interest rates:
a new methodology for contingent claim valuation, Econometrica 60, 77–105.
Ikeda, N., S. Watanabe, M. Fukushima, and H. Kunita (eds.), 1996, Itô stochastic calculus
and probability theory. (Springer, Tokyo Berlin New York) Festschrift for Kiyosi Itô’s eightieth
birthday, 1995.
Ince, E.L., 1944, Ordinary differential equations. (Dover, New York).
Jacod, J., and P. Protter, 2000, Probability essentials. (Springer).
Jameson, R. (ed.), 1995, Derivative credit risk. (Risk Publications, London).
Jarrow, R.A., and S.M. Turnbull, 2000, Derivative Securities. (South-Western College Publishing,
Cincinnati) 2nd edn. 1st ed. 1996.
Jeans, Sir James, 1925, The mathematical theory of electricity and magnetism. (Cambridge
University Press) 5th edn.
Karatzas, I., and S. Shreve, 1991, Brownian motion and stochastic calculus. (Springer, New York)
2nd edn.
Kolb, R.W., 1991, Understanding Futures Markets. (Kolb Publishing, Miami) 3rd edn.
Kolmogorov, A.N., 1933, Grundbegriffe der Wahrscheinlichkeitsrechnung. (Springer) English
translation: Foundations of probability theory, Chelsea, New York, (1965).
Lebesgue, H., 1902, Intégrale, longueur, aire, Annali di Mat. 7, 231–259.
Loève, M., 1973, Paul Lévy (1886–1971), obituary, Annals of Probability 1, 1–18.
Madan, D., 1998, Default risk, in D.J. Hand, and S.D. Jacka, eds.: Statistics in finance (Arnold,
London).
Merton, R.C., 1973, Theory of rational option pricing, Bell Journal of Economics and Management
Science 4, 141–183.
Merton, R.C., 1990, Continuous-time finance. (Blackwell, Oxford).
Meyer, P.A., 1966, Probability and potential. (Blaisdell, Waltham, Mass.).
Meyer, P.A., 1976, Un cours sur les intégrales stochastiques, in Séminaire de Probabilités X no.
511 in Lecture Notes in Mathematics pp. 245–400. Springer, Berlin Heidelberg New York.
Musiela, M., and M. Rutkowski, 1997, Martingale methods in financial modelling vol. 36 of
Applications of Mathematics: Stochastic Modelling and Applied Probability. (Springer, New York).
Neveu, J., 1975, Discrete-parameter martingales. (North-Holland).
Norris, J.R., 1997, Markov chains. (Cambridge University Press).
Protter, P., 2004, Stochastic Integration and Differential Equations. (Springer, New York) 2nd
edn. 1st edition, 1992.
Rennocks, John, 1997, Hedging can only defer currency volatility impact for British Steel, Financial
Times 08, Letter to the editor.
Resnick, S., 2001, A probability path. (Birkhäuser) 2nd printing.
Revuz, D., and M. Yor, 1991, Continuous martingales and Brownian motion. (Springer, New
York).
Rockafellar, R.T., 1970, Convex Analysis. (Princeton University Press, Princeton NJ).
Rogers, L.C.G., and D. Williams, 1994, Diffusions, Markov processes and martingales, Volume 1:
Foundation. (Wiley) 2nd edn. 1st ed. D. Williams, 1970.
Rosenthal, J.S., 2000, A first look at rigorous probability theory. (World Scientific).
Ross, S.M., 1997, Probability models. (Academic Press) 6th edn.
Rydberg, T.H., 1996, Generalized hyperbolic diffusions with applications towards finance,
Research Report 342, Department of Theoretical Statistics, Institute of Mathematics, University
of Århus.
Rydberg, T.H., 1997, The normal inverse Gaussian Lévy process: Simulation and approximation,
Research Report, Department of Theoretical Statistics, Institute of Mathematics, University of
Århus.
Samuelson, P.A., 1965, Rational theory of warrant pricing, Industrial Management Review 6,
13–39.
Schachermayer, W., 2000, Introduction to the mathematics of financial markets, to appear as
Springer Lecture Notes.
Shephard, N., 1996, Statistical aspects of ARCH and stochastic volatility, in D.R. Cox, D.V.
Hinkley, and O.E. Barndorff-Nielsen, eds.: Time Series Models – in econometrics, finance and
other fields (Chapman & Hall).
Snell, J.L., 1952, Applications of martingale systems theorems, Trans. Amer. Math. Soc. 73,
293–312.
Taqqu, M.S., and W. Willinger, 1987, The analysis of finite security markets using martingales,
Adv. Appl. Prob. 19, 1–25.
Williams, D., 1991, Probability with martingales. (Cambridge University Press).
Williams, D., 2001, Weighing the odds. (Cambridge University Press).
Yor, M., 1978, Sous-espaces denses dans $L^1$ et $H^1$ et représentation des martingales, in Séminaire
de Probabilités, XII no. 649 in Lecture Notes in Mathematics pp. 265–309. Springer.
Zhang, P.G., 1997, Exotic Options. (World Scientific, Singapore).
Chapter 1

Arbitrage Theory

1.1 Derivative Background

Definition 1.1.1. A derivative security, or contingent claim, is a financial contract whose value at expiration date $T$ (more briefly, expiry) is determined exactly by the price (or prices within a prespecified time interval) of the underlying financial assets (or instruments) at time $T$ (within the time interval $[0, T]$).

This section provides the institutional background on derivative securities, the main groups of underlying assets, the markets where derivative securities are traded and the financial agents involved in these activities. As our focus is on (probabilistic) models and not institutional considerations, we refer the reader to the references for excellent sources describing institutions, such as Davis (1994), Edwards and Ma (1992) and Kolb (1991).

1.1.1 Derivative Instruments

Derivative securities can be grouped under three general headings: Options, Forwards and Futures, and Swaps. During this text we will mainly deal with options, although our pricing techniques may be readily applied to forwards, futures and swaps as well.

Options. An option is a financial instrument giving one the right but not the obligation to make a specified transaction at (or by) a specified date at a specified price. Call options give one the right to buy. Put options give one the right to sell. European options give one the right to buy/sell on the specified date, the expiry date, on which the option expires or matures. American options give one the right to buy/sell at any time prior to or at expiry.

Over-the-counter (OTC) options were long ago negotiated by a broker between a buyer and a seller. In 1973 (the year of the Black-Scholes formula, perhaps the central result of the subject), the Chicago Board Options Exchange (CBOE) began trading in options on some stocks. Since then, the growth of options has been explosive. Options are now traded on all the major world exchanges, in enormous volumes. Risk magazine (12/97) estimated \$35 trillion as the gross figure for worldwide derivatives markets in 1996. By contrast, the Financial Times of 7 October 2002 (Special Report on Derivatives) gives the interest rate and currency derivatives volume as \$83 trillion, an indication of the rate of growth in recent years! The simplest call and put options are now so standard they are called vanilla options. Many kinds of options now exist, including so-called exotic options. Types include: Asian options, which depend on the average price over a period, lookback options, which depend on the maximum or minimum price over a period, and barrier options, which depend on some price level being attained or not.

Terminology. The asset to which the option refers is called the underlying asset or the underlying. The price at which the transaction to buy/sell the underlying, on/by the expiry date (if exercised), is made, is called the exercise price or strike price. We shall usually use $K$ for the strike price, time $t = 0$ for the initial time (when the contract between the buyer and the seller of the option is struck), and time $t = T$ for the expiry or final time.

Consider, say, a European call option with strike price $K$; write $S(t)$ for the value (or price) of the underlying at time $t$. If $S(t) > K$, the option is in the money; if $S(t) = K$, the option is said to be at the money; and if $S(t) < K$, the option is out of the money. The payoff from the option is $S(T) - K$ if $S(T) > K$ and $0$ otherwise (more briefly written as $(S(T) - K)^+$). Taking into account the initial payment of an investor one obtains the profit diagram below.

Figure 1.1: Profit diagram for a European call (axes: profit against $S(T)$; strike $K$ marked)

Forwards. A forward contract is an agreement to buy or sell an asset $S$ at a certain future date $T$ for a certain price $K$. The agent who agrees to buy the underlying asset is said to have a long position, the other agent assumes a short position. The settlement date is called the delivery date and the specified price is referred to as the delivery price. The forward price $f(t, T)$ is the delivery price which would make the contract have zero value at time $t$. At the time the contract is set up, $t = 0$, the forward price therefore equals the delivery price, hence $f(0, T) = K$. The forward prices $f(t, T)$ need not (and will not) necessarily be equal to the delivery price $K$ during the lifetime of the contract.

The payoff from a long position in a forward contract on one unit of an asset with price $S(T)$ at the maturity of the contract is $S(T) - K$. Compared with a call option with the same maturity and strike price $K$, we see that the investor now faces a downside risk, too. He has the obligation to buy the asset for price $K$.

Swaps. A swap is an agreement whereby two parties undertake to exchange, at known dates in the future, various financial assets (or cash flows) according to a prearranged formula that depends on the value of one or more underlying assets. Examples are currency swaps (exchange of currencies) and interest-rate swaps (exchange of fixed for floating set of interest payments).

1.1.2 Underlying securities

Stocks. The basis of modern economic life, or of the capitalist system, is the limited liability company (UK: & Co. Ltd, now plc, public limited company), the corporation (US: Inc.), 'die Aktiengesellschaft' (Germany: AG). Such companies are owned by their shareholders; the shares
• provide partial ownership of the company, pro rata with investment,
• have value, reflecting both the value of the company's (real) assets and the earning power of the company's dividends.
With publicly quoted companies, shares are quoted and traded on the Stock Exchange. Stock is the generic term for assets held in the form of shares.

Interest Rates. The value of some financial assets depends solely on the level of interest rates (or yields), e.g. Treasury (T-) notes, T-bills, T-bonds, municipal and corporate bonds. These are fixed-income securities by which national, state and local governments and large companies partially finance their economic activity. Fixed-income securities require the payment of interest in the form of a fixed amount of money at predetermined points in time, as well as repayment of the principal at maturity of the security. Interest rates themselves are notional assets, which cannot be delivered. Hedging exposure to interest rates is more complicated than hedging exposure to the price movements of a certain stock. A whole term structure is necessary for a full description of the level of interest rates, and for hedging purposes one must clarify the nature of the exposure carefully. We will discuss the subject of modelling the term structure of interest rates in Chapter 8.

Currencies. A currency is the denomination of the national units of payment (money) and as such is a financial asset. The end of fixed exchange rates and the adoption of floating exchange rates resulted in a sharp increase in exchange rate volatility. International trade, and economic activity involving it, such as most manufacturing industry, involves dealing with more than one currency. A company may wish to hedge adverse movements of foreign currencies and in doing so use derivative instruments (see for example the account of the hedging problems British Steel faced as a result of the sharp increase in the pound sterling in 96/97, Rennocks (1997)).

Indexes. An index tracks the value of a (hypothetical) basket of stocks (FTSE100, S&P500, DAX), bonds (REX), and so on. Again, these are not assets themselves. Derivative instruments on indexes may be used for hedging if no derivative instruments on a particular asset (a stock, a bond, a commodity) in question are available and if the correlation in movement between the index and the asset is significant. Furthermore, institutional funds (such as pension funds, mutual funds etc.), which manage large diversified stock portfolios, try to mimic particular stock indexes and use derivatives on stock indexes as a portfolio management tool. On the other hand, a speculator may wish to bet on a certain overall development in a market without exposing him/herself to a particular asset.

A new kind of index was generated with the Index of Catastrophe Losses (CAT-Index) by the Chicago Board of Trade (CBOT) lately. The growing number of huge natural disasters (such as hurricane Andrew 1992, the Kobe earthquake 1995) has led the insurance industry to try to find new ways of increasing its capacity to carry risks. The CBOT tried to capitalise on this problem by launching a market in insurance derivatives. Currently investors are offered options on the CAT-Index, thereby taking in effect the position of traditional reinsurance.

Derivatives are themselves assets (they are traded, have value etc.) and so can be used as underlying assets for new contingent claims: options on futures, options on baskets of options, etc. These developments give rise to so-called exotic options, demanding a sophisticated mathematical machinery to handle them.

1.1.3 Markets

Financial derivatives are basically traded in two ways: on organised exchanges and over-the-counter (OTC). Organised exchanges are subject to regulatory rules, require a certain degree of standardisation of the traded instruments (strike price, maturity dates, size of contract etc.) and have a physical location at which trade takes place. Examples are the Chicago Board Options Exchange (CBOE), which coincidentally opened in April 1973, the same year the seminal contributions on option prices by Black and Scholes (1973) and Merton (1973) were published, the London International Financial Futures Exchange (LIFFE) and the Deutsche Terminbörse (DTB).

OTC trading takes place via computers and phones between various commercial and investment banks (leading players include institutions such as Bankers Trust, Goldman Sachs (where Fischer Black worked), Citibank, Chase Manhattan and Deutsche Bank). Due to the growing sophistication of investors boosting demand for increasingly complicated, made-to-measure products, the OTC market volume is currently (as of 1998) growing at a much faster pace than trade on most exchanges.

1.1.4 Types of Traders

We can classify the traders of derivative securities in three different classes:

Hedgers. Successful companies concentrate on economic activities in which they do best. They use the market to insure themselves against adverse movements of prices, currencies, interest rates etc. Hedging is an attempt to reduce exposure to risk a company already faces. Shorter Oxford English Dictionary (OED): Hedge: 'trans. To cover oneself against loss on (a bet etc.) by betting, etc., on the other side. Also fig. 1672.'

Speculators. Speculators want to take a position in the market; they take the opposite position to hedgers. Indeed, speculation is needed to make hedging possible, in that a hedger, wishing to lay off risk, cannot do so unless someone is willing to take it on.

In speculation, available funds are invested opportunistically in the hope of making a profit: the underlying itself is irrelevant to the investor (speculator), who is only interested in the potential for possible profit that trade involving it may present. Hedging, by contrast, is typically engaged in by companies who have to deal habitually in intrinsically risky assets such as foreign exchange next year, commodities next year, etc. They may prefer to forgo the chance to make exceptional windfall profits when future uncertainty works to their advantage by protecting themselves against exceptional loss. This would serve to protect their economic base (trade in commodities, or manufacture of products using these as raw materials), and also enable them to focus their effort in their chosen area of trade or manufacture. For speculators, on the other hand, it is the market (forex, commodities or whatever) itself which is their main forum of economic activity.
Arbitrageurs. Arbitrageurs try to lock in riskless profit by simultaneously entering into transactions in two or more markets. The very existence of arbitrageurs means that there can only be very small arbitrage opportunities in the prices quoted in most financial markets. The underlying concept of this book is the absence of arbitrage opportunities (cf. §1.2).

1.1.5 Modelling Assumptions

Contingent Claim Pricing. The fundamental problem in the mathematics of financial derivatives is that of pricing. The modern theory began in 1973 with the seminal Black-Scholes theory of option pricing, Black and Scholes (1973), and Merton's extensions of this theory, Merton (1973).

To expose the relevant features, we start by discussing contingent claim pricing in the simplest (idealised) case and impose the following set of assumptions on the financial markets (we will relax these assumptions subsequently):

No market frictions    No transaction costs, no bid/ask spread, no taxes,
                       no margin requirements, no restrictions on short sales
No default risk        Implying same interest for borrowing and lending
Competitive markets    Market participants act as price takers
Rational agents        Market participants prefer more to less
No arbitrage

Table 1.1: General assumptions

All real markets involve frictions; this assumption is made purely for simplicity. We develop the theory of an ideal (frictionless) market so as to focus on the irreducible essentials of the theory and as a first-order approximation to reality. Understanding frictionless markets is also a necessary step to understand markets with frictions.

The risk of failure of a company (bankruptcy) is inescapably present in its economic activity: death is part of life, for companies as for individuals. Those risks also appear at the national level: quite apart from war, or economic collapse resulting from war, recent decades have seen default of interest payments of international debt, or the threat of it. We ignore default risk for simplicity while developing understanding of the principal aspects (for recent overviews on the subject we refer the reader to Jameson (1995), Madan (1998)).

We assume financial agents to be price takers, not price makers. This implies that even large amounts of trading in a security by one agent do not influence the security's price. Hence agents can buy or sell as much of any security as they wish without changing the security's price.

To assume that market participants prefer more to less is a very weak assumption on the preferences of market participants. Apart from this we will develop a preference-free theory.

The relaxation of these assumptions is subject to ongoing research and we will include comments on this in the text.

We want to mention the special character of the no-arbitrage assumption. If we developed a theoretical price of a financial derivative under our assumptions and this price did not coincide with the price observed, we would take this as an arbitrage opportunity in our model and go on to explore the consequences. This might lead to a relaxation of one of the other assumptions and a restart of the procedure again with no-arbitrage assumed. The no-arbitrage assumption thus has a special status that the others do not. It is the basis for the arbitrage pricing technique that we shall develop, and we discuss it in more detail below.
1.2 Arbitrage

We now turn in detail to the concept of arbitrage, which lies at the centre of the relative pricing theory. This approach works under very weak assumptions. We do not have to impose any assumptions on the tastes (preferences) and beliefs of market participants. The economic agents may be heterogeneous with respect to their preferences for consumption over time and with respect to their expectations about future states of the world. All we assume is that they prefer more to less, or more precisely, an increase in consumption without any costs will always be accepted.

The principle of arbitrage in its broadest sense is given by the following quotation from OED: '3 [Comm.]. The traffic in Bills of Exchange drawn on sundry places, and bought or sold in sight of the daily quotations of rates in the several markets. Also, the similar traffic in Stocks. 1881.'

Used in this broad sense, the term covers financial activity of many kinds, including trade in options, futures and foreign exchange. However, the term arbitrage is nowadays also used in a narrower and more technical sense. Financial markets involve both riskless (bank account) and risky (stocks, etc.) assets. To the investor, the only point of exposing oneself to risk is the opportunity, or possibility, of realising a greater profit than the riskless procedure of putting all one's money in the bank (the mathematics of which, compound interest, does not require a textbook treatment at this level). Generally speaking, the greater the risk, the greater the return required to make investment an attractive enough prospect to attract funds. Thus, for instance, a clearing bank lends to companies at higher rates than it pays to its account holders. The companies' trading activities involve risk; the bank tries to spread the risk over a range of different loans, and makes its money on the difference between high/risky and low/riskless interest rates.

The essence of the technical sense of arbitrage is that it should not be possible to guarantee a profit without exposure to risk. Were it possible to do so, arbitrageurs (we use the French spelling, as is customary) would do so, in unlimited quantity, using the market as a 'money-pump' to extract arbitrarily large quantities of riskless profit. This would, for instance, make it impossible for the market to be in equilibrium. We shall restrict ourselves to markets in equilibrium for simplicity, so we must restrict ourselves to markets without arbitrage opportunities.

The above makes it clear that a market with arbitrage opportunities would be a disorderly market, too disorderly to model. The remarkable thing is the converse. It turns out that the minimal requirement of absence of arbitrage opportunities is enough to allow one to build a model of a financial market which, while admittedly idealised, is realistic enough both to provide real insight and to handle the mathematics necessary to price standard contingent claims. We shall see that arbitrage arguments suffice to determine prices: the arbitrage pricing technique. For an accessible treatment rather different to ours, see e.g. Allingham (1991).

To explain the fundamental arguments of the arbitrage pricing technique we use the following:

Example. Consider an investor who acts in a market in which only three financial assets are traded: (riskless) bonds $B$ (bank account), stocks $S$ and European call options $C$ with strike $K = 1$ on the stock. The investor may invest today, time $t = 0$, in all three assets, leave his investment until time $t = T$ and get his returns back then (we assume the option expires at $t = T$, also). We assume the current £ prices of the financial assets are given by
$B(0) = 1, \quad S(0) = 1, \quad C(0) = 0.2$
and that at $t = T$ there can be only two states of the world: an up-state with £ prices
$B(T, u) = 1.25, \quad S(T, u) = 1.75$, and therefore $C(T, u) = 0.75$,
and a down-state with £ prices
$B(T, d) = 1.25, \quad S(T, d) = 0.75$, and therefore $C(T, d) = 0$.
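The market of this example is small enough to tabulate by machine. The following is a minimal sketch and not part of the original notes; the names `PRICES_0`, `PRICES_T`, `cost` and `returns` are illustrative assumptions. It derives the call values at expiry from the payoff $(S(T) - K)^+$ and evaluates the cost today and the state-wise return of a sample holding of 10 bonds, 10 stocks and 25 calls.

```python
# Illustrative sketch of the three-asset, two-state market (prices in GBP).
# All names are hypothetical helpers, not notation from the notes.
K = 1.0  # strike of the European call

PRICES_0 = {"bond": 1.0, "stock": 1.0, "call": 0.2}
PRICES_T = {
    "up": {"bond": 1.25, "stock": 1.75},
    "down": {"bond": 1.25, "stock": 0.75},
}
# The call value at expiry is the payoff (S(T) - K)^+ in each state.
for state in PRICES_T.values():
    state["call"] = max(state["stock"] - K, 0.0)


def cost(portfolio):
    """Price of the portfolio at t = 0."""
    return sum(n * PRICES_0[asset] for asset, n in portfolio.items())


def returns(portfolio):
    """Value of the portfolio at t = T in each state of the world."""
    return {
        name: sum(n * prices[asset] for asset, n in portfolio.items())
        for name, prices in PRICES_T.items()
    }


holding = {"bond": 10, "stock": 10, "call": 25}
print(cost(holding))
print(returns(holding))
```

Any other holding can be passed to `cost` and `returns` in the same way, which makes it easy to compare two portfolios state by state.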
Now our investor has a starting capital of £25, and divides it as in Table 1.2 below (we call such a division a portfolio).

Financial asset    Number of    Total amount in £
Bond               10           10
Stock              10           10
Call               25           5

Table 1.2: Original portfolio

Depending on the state of the world at time $t = T$ this portfolio will give the £ return shown in Table 1.3.

State of the world    Bond    Stock    Call     Total
Up                    12.5    17.5     18.75    48.75
Down                  12.5    7.5      0        20

Table 1.3: Return of original portfolio

Can the investor do better? Let us consider the restructured portfolio of Table 1.4.

Financial asset    Number of    Total amount in £
Bond               11.8         11.8
Stock              7            7
Call               29           5.8

Table 1.4: Restructured portfolio

This portfolio requires only an investment of £24.6. We compute its return in the different possible future states (Table 1.5).

State of the world    Bond     Stock    Call     Total
Up                    14.75    12.25    21.75    48.75
Down                  14.75    5.25     0        20

Table 1.5: Return of the restructured portfolio

We see that this portfolio generates the same time $t = T$ return while costing only £24.6 now, a saving of £0.4 against the first portfolio. So the investor should use the second portfolio and have a free lunch today!

In the above example the investor was able to restructure his portfolio, reducing the current (time $t = 0$) expenses without changing the return at the future date $t = T$ in both possible states of the world. So there is an arbitrage possibility in the above market situation, and the prices quoted are not arbitrage (or market) prices. If we regard (as we shall do) the prices of the bond and the stock (our underlying) as given, the option must be mispriced. We will develop in this book models of financial markets (with different degrees of sophistication) which will allow us to find methods to avoid (or to spot) such pricing errors. For the time being, let us have a closer look at the differences between portfolio 1, consisting of 10 bonds, 10 stocks and 25 call options, in short $(10, 10, 25)$, and portfolio 2, of the form $(11.8, 7, 29)$. The difference (from the point of view of portfolio 1, say) is the following portfolio, $D$: $(-1.8, 3, -4)$. Sell short three stocks (see below), buy four options and put £1.8 in your bank account. The leftover is exactly the £0.4 of the example. But what is the effect of doing that? Let us consider the consequences in the possible states of the world. From Table 1.6 below, we see in both cases that the effects of the different positions of the portfolio offset each other. But clearly the portfolio generates an income at $t = 0$ and is therefore itself an arbitrage opportunity.

world is in state up                 world is in state down
exercise option           3          option is worthless       0
buy 3 stocks at 1.75     -5.25       buy 3 stocks at 0.75     -2.25
sell bond                 2.25       sell bond                 2.25
Balance                   0          Balance                   0

Table 1.6: Difference portfolio

If we only look at the position in bonds and stocks, we can say that this position covers us against possible price movements of the option, i.e. having £1.8 in your bank account and being three stocks short has the same time $t = T$ effects as having four call options outstanding against us. We say that the bond/stock position is a hedge against the position in options.

Let us emphasise that the above arguments were independent of the preferences and plans of the investor. They were also independent of the interpretation of $t = T$: it could be a fixed time, maybe a year from now, but it could refer to the happening of a certain event, e.g. a stock hitting a certain level, exchange rates at a certain level, etc.

1.3 Arbitrage Relationships

We will in this section use arbitrage-based arguments (arbitrage pricing technique) to develop general bounds on the value of options. Such bounds, deduced from the underlying assumption that no arbitrage should be possible, allow one to test the plausibility of sophisticated financial market models.

In our analysis here we use stocks as the underlying.

1.3.1 Fundamental Determinants of Option Values

We consider the determinants of the option value in Table 1.7 below. Since we restrict ourselves to non-dividend paying stocks we don't have to consider cash dividends as another natural determinant.

Current stock price    $S(t)$
Strike price           $K$
Stock volatility       $\sigma$
Time to expiry         $T - t$
Interest rates         $r$

Table 1.7: Determinants affecting option value

We now examine the effects of the single determinants on the option prices (all other factors remaining unchanged).

We saw that at expiry the only variables that mattered were the stock price $S(T)$ and strike price $K$: remember the payoffs $C = (S(T) - K)^+$, $P = (S(T) - K)^-$ $(:= \max\{K - S(T), 0\})$. Looking at the payoffs, we see that an increase in the stock price will increase (decrease) the value of a call (put) option (recall all other factors remain unchanged). The opposite happens if the strike price is increased: the price of a call (put) option will go down (up).

When we buy an option, we bet on a favourable outcome. The actual outcome is uncertain; its uncertainty is represented by a probability density; favourable outcomes are governed by the tails of the density (right or left tail for a call or a put option). An increase in volatility flattens out the density and thickens the tails, so increases the value of both call and put options. Of course, this argument again relies on the fact that we don't suffer from (with the increase of volatility more likely) more severe unfavourable outcomes: we have the right, but not the obligation, to exercise the option.

A heuristic statement of the effects of time to expiry or interest rates is not so easy to make. In the simplest of models (no dividends, interest rates remain fixed during the period under consideration), one might argue that the longer the time to expiry the more can happen to the price of a stock. Therefore a longer period increases the possibility of movements of the stock price and hence the value of a call (put) should be higher the more time remains before expiry. But only the owner of an American-type option can react immediately to favourable price movements, whereas the owner of a European option has to wait until expiry, and only the stock price then is relevant. Observe the contrast with volatility: an increase in volatility increases the likelihood of favourable outcomes at expiry, whereas the stock price movements before expiry may cancel each other out. A longer time until expiry might also increase the possibility of adverse effects from which the stock price has to recover before expiry. We see that by using purely heuristic arguments we are not able to make precise statements. One can, however, show by explicit arbitrage arguments that an increase in time to expiry leads to an increase in the value of call options as well as put options. (We should point out that in case of a dividend-paying stock the statement is not true in general for European-type options.)

To qualify the effects of the interest rate we have to consider two aspects. An increase in the interest rate tends to increase the expected growth rate in an economy and hence the stock price tends to increase. On the other hand, the present value of any future cash flows decreases. These two effects both decrease the value of a put option, while the first effect increases the value of a call option. However, it can be shown that the first effect always dominates the second effect, so the value of a call option will increase with increasing interest rates.

The above heuristic statements, in particular the last, will be verified again in appropriate models of financial markets, see §4.5.4 and §6.2.3.

We summarise in Table 1.8 the effect of an increase of one of the parameters on the value of options on non-dividend paying stocks while keeping all others fixed:

Parameter (increase)    Call        Put
Stock price             Positive    Negative
Strike price            Negative    Positive
Volatility              Positive    Positive
Interest rates          Positive    Negative
Time to expiry          Positive    Positive

Table 1.8: Effects of parameters

We would like to emphasise again that these results all assume that all other variables remain fixed, which of course is not true in practice. For example stock prices tend to fall (rise) when interest rates rise (fall) and the observable effect on option prices may well be different from the effects deduced under our assumptions.

Cox and Rubinstein (1985), pp. 37-39, discuss other possible determining factors of option value, such as expected rate of growth of the stock price, additional properties of stock price movements, investors' attitudes toward risk, characteristics of other assets and institutional environment (tax rules, margin requirements, transaction costs, market structure). They show that in many important circumstances the influence of these variables is marginal or even vanishing.

1.3.2 Arbitrage bounds

We now use the principle of no-arbitrage to obtain bounds for option prices. We focus on European options (puts and calls) with identical underlying (say a stock $S$), strike $K$ and expiry date $T$. Furthermore we assume the existence of a risk-free bank account (bond) with constant interest rate $r$ (continuously compounded) during the time interval $[0, T]$. We start with a fundamental relationship:

Proposition 1.3.1. We have the following put-call parity between the prices of the underlying asset $S$ and European call and put options on stocks that pay no dividends:

$S + P - C = K e^{-r(T-t)}.$    (1.1)

Proof. Consider a portfolio consisting of one stock, one put and a short position in one call (the holder of the portfolio has written the call); write $V(t)$ for the value of this portfolio. Then

$V(t) = S(t) + P(t) - C(t)$

for all $t \in [0, T]$. At expiry we have

$V(T) = S(T) + (S(T) - K)^- - (S(T) - K)^+ = S(T) + K - S(T) = K.$

This portfolio thus guarantees a payoff $K$ at time $T$. Using the principle of no-arbitrage, the value of the portfolio must at any time $t$ correspond to the value of a sure payoff $K$ at $T$, that is $V(t) = K e^{-r(T-t)}$.

Having established (1.1), we concentrate on European calls in the following.

Proposition 1.3.2. The following bounds hold for European call options:

$\max\{S(t) - e^{-r(T-t)} K, 0\} = \left(S(t) - e^{-r(T-t)} K\right)^+ \le C(t) \le S(t).$

Proof. That $C \ge 0$ is obvious, otherwise 'buying' the call would give a riskless profit now and no obligation later. Similarly the upper bound $C \le S$ must hold, since violation would mean that the right to buy the stock has a higher value than owning the stock. This must be false, since a stock offers additional benefits. Now from put-call parity (1.1) and the fact that $P \ge 0$ (use the same argument as above), we have

$S(t) - K e^{-r(T-t)} = C(t) - P(t) \le C(t),$

which proves the last assertion.

It is immediately clear that an American call option can never be worth less than the corresponding European call option, for the American option has the added feature of being able to be exercised at any time until the maturity date. Hence (with the obvious notation):

$C_A(t) \ge C_E(t).$

The striking result we are going to show (due to R.C. Merton in 1973, (Merton 1990), Theorem 8.2) is:

Proposition 1.3.3. For a non-dividend paying stock we have

$C_A(t) = C_E(t).$    (1.2)

Proof. Exercising the American call at time $t < T$ generates the cash-flow $S(t) - K$. From Proposition 1.3.2 we know that the value of the call must be greater than or equal to $S(t) - K e^{-r(T-t)}$, which for $r > 0$ is greater than $S(t) - K$. Hence selling the call would have realised a higher cash-flow and the early exercise of the call was suboptimal.
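The parity relation and the call bounds can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and the sample values (stock price 10, strike 9, rate 5%, one year to expiry) are assumptions chosen for the example, not data from the notes. It confirms that the portfolio of one stock, one put and one written call pays $K$ at expiry whatever $S(T)$ is, and evaluates the bounds of Proposition 1.3.2.

```python
import math


def parity_portfolio_at_expiry(s_T, k):
    """Value at expiry of one stock + one put - one call (common strike k)."""
    call = max(s_T - k, 0.0)
    put = max(k - s_T, 0.0)
    return s_T + put - call


def call_bounds(s, k, r, tau):
    """No-arbitrage (lower, upper) bounds for a European call; tau = T - t."""
    lower = max(s - k * math.exp(-r * tau), 0.0)
    return lower, s


def put_from_call(c, s, k, r, tau):
    """European put price implied by parity S + P - C = K e^{-r tau}."""
    return c - s + k * math.exp(-r * tau)


# The parity portfolio pays K in every state at expiry.
for s_T in (5.0, 15.0, 40.0):
    assert abs(parity_portfolio_at_expiry(s_T, 15.0) - 15.0) < 1e-12

# Hypothetical inputs: S(t) = 10, K = 9, r = 5%, one year to expiry.
lower, upper = call_bounds(s=10.0, k=9.0, r=0.05, tau=1.0)
print(lower, upper)
```

With a positive interest rate the lower bound exceeds the immediate exercise value $S(t) - K$, which is the observation used in the proof of Proposition 1.3.3.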
e. We can evaluate the call in every possible state at t = T and see H = 5 (if S(T ) = 20) with probability p and H = 0 (if S(T ) = 7.e. The problem is to price a European call at t = 0 with strike K = 15 and maturity T .5 with probability p with probability 1 − p. there are two reasons why an American call should not be exercised early: (i) Insurance. How should we pick the probability measure I ? According to their preferences P investors will have diﬀerent opinions about the distribution of the price ST .CHAPTER 1. The derivative P is given by H = f (ST ).3. i. Markets do not allow arbitrage . S)− market. the better. We remark that an American put oﬀers additional value compared to a European put. First idea. they always prefer more to less. it is a random variable (for a suitable function f (.4. which synthesizes the cash ﬂow of the option. An investor who holds a call option instead of the underlying stock is ‘insured against a fall in stock price below K.1 SinglePeriod Market Models A fundamental example We consider a oneperiod model. Qualitatively. the possibility of riskfree proﬁts. BlackScholesMerton (Ross) approach. When the holder exercises the option. Therefore the price . We call this setting a (B. that is the interest rate r = 0 and the discount factor β(t) = 1. I ). i. • a risky stock S with S(0) = 10 and two possible values at t = T S(T ) = 20 7. (1. Use the noarbitrage principle and construct a hedging portfolio using only known (and already priced) securities to duplicate the payoﬀ H. he buys the stock and pays the strike price.4. Early exercise at t < T deprives the holder of the interest on K between times t and T : the later he pays out K.3) Problem. i.1.)). Our aim is to value at t = 0 a European derivative on a stock S with maturity T . If such a portfolio exists. 2. ARBITRAGE THEORY 15 Remark 1. ∀ω. 1. This is illustrated in ﬁgure (1. We assume 1. F. Investors are nonsatiable. the random payoﬀ H = (S(T ) − K)+ .e. 
(In this context we use β(t) = 1/B(t) as the discount factor).e. the price of the portfolio at t = 0 must equal the price of the derivative at t = 0. i. he loses this insurance. (ii) Interest on the strike price.e. H(ω) = V (ω). Let us assume there are two tradeable assets • a riskfree bond (bank account) with B(0) = 1 and B(T ) = 1.e. and if he exercises early.5) with probability 1 − p. K.1) The key idea now is to try to ﬁnd a portfolio combining bond and stock. We could then price the derivative using some discount factor β by using the expected value of the discounted future payoﬀ: H0 = I E(βH). Model ST as a random variable on a probability space (Ω.4 1. From the noarbitrage principle we see: If it is possible to duplicate the payoﬀ H of a derivative using a portfolio V of underlying (basic) securities. holding this portfolio today would be equivalent to holding the option – they would produce the same cash ﬂow in the future. i. we allow trading only at t = 0 and t = T = 1(say). i.
of the option should be the same as the price of constructing the portfolio, otherwise investors could just restructure their holdings in the assets and obtain a risk-free profit today.

Figure 1.2: One-period example.

            today (t = 0)    up state (t = T)               down state (t = T)
  Stock     S_0 = 10         S_1 = 20                       S_1 = 7.5
  Bond      B_0 = 1          B_1 = 1                        B_1 = 1
  Call      H_0 = ?          H_1 = max{20 - 15, 0} = 5      H_1 = max{7.5 - 15, 0} = 0

We briefly present the construction of the portfolio $\theta = (\theta_0, \theta_1)$, which in the current setting is just a simple exercise in linear algebra. If we buy $\theta_1$ stocks and invest $\theta_0$ £ in the bank account, then today's value of the portfolio is
$$V(0) = \theta_0 + \theta_1 \cdot S(0).$$
In state 1 the stock price is 20 £ and the value of the option 5 £, so
$$\theta_0 + \theta_1 \cdot 20 = 5.$$
In state 2 the stock price is 7.5 £ and the value of the option 0 £, so
$$\theta_0 + \theta_1 \cdot 7.5 = 0.$$
We solve this and get $\theta_0 = -3$ and $\theta_1 = 0.4$. So the value of our portfolio at time 0 in £ is
$$V(0) = -3 B(0) + 0.4 S(0) = -3 + 0.4 \times 10 = 1.$$
$V(0)$ is called the no-arbitrage price. Every other price allows a riskless profit: if the option is too cheap, buy it and finance yourself by selling short the above portfolio (i.e. sell the portfolio without possessing it and promise to deliver it at time $T = 1$; this is risk-free because you own the option). If on the other hand the option is too dear, write it (i.e. sell it in the market) and cover yourself by setting up the above portfolio.

We see that the no-arbitrage price is independent of the individual preferences of the investor (given by certain probability assumptions about the future, i.e. a probability measure $\mathbb{P}$). But one can identify a special, so-called risk-neutral, probability measure $\mathbb{P}^*$, such that
$$H_0 = \mathbb{E}^*(\beta H) = p^* \cdot \beta (S_1 - K) + (1 - p^*) \cdot 0 = 1.$$
In the above example we get from $1 = p^* \cdot 5 + (1 - p^*) \cdot 0$ that $p^* = 0.2$. This probability measure $\mathbb{P}^*$ is equivalent to $\mathbb{P}$, and the discounted stock price process, i.e. $\beta_t S_t$, $t = 0, 1$, follows a $\mathbb{P}^*$-martingale. In the above example this corresponds to $S(0) = p^* S(T)^{up} + (1 - p^*) S(T)^{down}$, that is $S(0) = \mathbb{E}^*(\beta S(T))$.

We will show that the above generalizes.
Indeed, we will find that the no-arbitrage condition is equivalent to the existence of an equivalent martingale measure (first fundamental theorem of asset pricing) and that the property that we can price assets using the expectation operator is equivalent to the uniqueness of the equivalent martingale measure.

Let us consider the construction of hedging strategies from a different perspective. Consider a one-period $(B, S)$-market setting with discount factor $\beta = 1$. Assume we want to replicate a derivative $H$ (that is a random variable on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$). For each hedging strategy $\theta = (\theta_0, \theta_1)$ we have an initial value of the portfolio $V(0) = \theta_0 + \theta_1 S(0)$ and a time $t = T$ value of $V(T) = \theta_0 + \theta_1 S(T)$. We can write $V(T) = V(0) + (V(T) - V(0))$ with
$$G(T) = V(T) - V(0) = \theta_1 (S(T) - S(0))$$
the gains from trading. So the costs $C(0)$ of setting up this portfolio at time $t = 0$ are given by $C(0) = V(0)$, while maintaining (or achieving) a perfect hedge at $t = T$ requires an additional capital of $C(T) = H - V(T)$. Thus we have two possibilities for finding 'optimal' hedging strategies:
• Mean-variance hedging. Find $\theta_0$ (or alternatively $V(0)$) and $\theta_1$ such that
$$\mathbb{E}\left[(H - V(T))^2\right] = \mathbb{E}\left[\big(H - (V(0) + \theta_1 (S(T) - S(0)))\big)^2\right] \to \min.$$
• Risk-minimal hedging. Minimize the cost from trading, i.e. an appropriate functional involving the costs $C(t)$, $t = 0, T$.
In our example mean-variance hedging corresponds to the standard linear regression problem, and so
$$\theta_1 = \frac{\mathrm{Cov}(H, S(T) - S(0))}{\mathrm{Var}(S(T) - S(0))} \quad \text{and} \quad V_0 = \mathbb{E}(H) - \theta_1 \mathbb{E}(S(T) - S(0)).$$
We can also calculate the optimal value of the risk functional
$$R_{min} = \mathrm{Var}(H) - \theta_1^2 \mathrm{Var}(S(T) - S(0)) = \mathrm{Var}(H)(1 - \rho^2),$$
where $\rho$ is the correlation coefficient of $H$ and $S(T)$. Therefore we can't expect a perfect hedge in general. If however $|\rho| = 1$, i.e. $H$ is a linear function of $S(T)$, a perfect hedge is possible. We call a market complete if a perfect hedge is possible for all contingent claims.
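The two routes to the hedge in the fundamental example can be compared numerically. The following is a minimal sketch, assuming NumPy is available; the helper `mean_variance_hedge` and the subjective probabilities passed to it are illustrative choices, not part of the notes. It solves the replication equations for $(\theta_0, \theta_1)$ and then computes the regression hedge for several values of $p$. Since the payoff is a linear function of $S(T)$ in a two-state model, both routes should agree and the residual risk should vanish.

```python
import numpy as np

# Fundamental example: bond B(0) = B(T) = 1, stock S(0) = 10,
# S(T) in {20, 7.5}, European call with strike K = 15.
S0, K = 10.0, 15.0
S_T = np.array([20.0, 7.5])            # up state, down state
H = np.maximum(S_T - K, 0.0)           # call payoff per state: [5, 0]

# Replication: theta0 * 1 + theta1 * S(T, w) = H(w) in both states.
A = np.column_stack([np.ones(2), S_T])
theta0, theta1 = np.linalg.solve(A, H)
price = theta0 + theta1 * S0           # no-arbitrage price V(0)

# Risk-neutral up-probability from S(0) = p* S_up + (1 - p*) S_down (r = 0).
p_star = (S0 - S_T[1]) / (S_T[0] - S_T[1])


def mean_variance_hedge(p):
    """Regression hedge under a subjective up-probability p (illustrative helper)."""
    probs = np.array([p, 1.0 - p])
    dS = S_T - S0
    mean_H, mean_dS = probs @ H, probs @ dS
    cov = probs @ ((H - mean_H) * (dS - mean_dS))
    var_dS = probs @ (dS - mean_dS) ** 2
    var_H = probs @ (H - mean_H) ** 2
    t1 = cov / var_dS                  # theta_1 = Cov / Var
    v0 = mean_H - t1 * mean_dS         # V_0 = E(H) - theta_1 E(S(T) - S(0))
    r_min = var_H - t1 ** 2 * var_dS   # residual risk R_min
    return t1, v0, r_min


for p in (0.3, 0.5, 0.9):
    print(p, mean_variance_hedge(p))
```

The regression hedge does not depend on the subjective probability $p$ here, which mirrors the remark that the no-arbitrage price is independent of the investor's beliefs.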
1.4.2 A single-period model
We proceed to formalise and extend the above example and present in detail a simple model of a financial market. Despite its simplicity it already has all the key features needed in the sequel (and the reader should not hesitate to come back here from more advanced chapters to see the bare concepts again). We introduce in passing a little of the terminology and notation of Chapter 4; see also Harrison and Kreps (1979). We use some elementary vocabulary from probability theory, which is explained in detail in Chapter 2.

We consider a single-period model, i.e. we have two time indices, say $t = 0$, which is the current time (date), and $t = T$, which is the terminal date for all economic activities considered. The financial market contains $d + 1$ traded financial assets, whose prices at time $t = 0$ are denoted by the vector $S(0) \in \mathbb{R}^{d+1}$,
$$S(0) = (S_0(0), S_1(0), \ldots, S_d(0))'$$
(where $'$ denotes the transpose of a vector or matrix). At time $T$, the owner of financial asset number $i$ receives a random payment depending on the state of the world. We model this randomness by introducing a finite probability space $(\Omega, \mathcal{F}, \mathbb{P})$, with a finite number $|\Omega| = N$ of points (each corresponding to a certain state of the world) $\omega_1, \ldots, \omega_j, \ldots, \omega_N$, each with positive probability: $\mathbb{P}(\{\omega\}) > 0$, which means that every state of the world is possible. $\mathcal{F}$ is the set of subsets of $\Omega$ (events that can happen in the world) on which $\mathbb{P}(\cdot)$ is defined (we can quantify how probable these events are), here $\mathcal{F} = \mathcal{P}(\Omega)$, the set of all subsets of $\Omega$. (In more complicated models it is not possible to define a probability measure on all subsets of the state space $\Omega$, see §2.1.) We can now write the random payment arising from financial asset $i$ as
$$S_i(T) = (S_i(T, \omega_1), \ldots, S_i(T, \omega_j), \ldots, S_i(T, \omega_N))'.$$
At time $t = 0$ the agents can buy and sell financial assets. The portfolio position of an individual agent is given by a trading strategy $\varphi$, which is an $\mathbb{R}^{d+1}$ vector,
$$\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)'.$$
Here $\varphi_i$ denotes the quantity of the $i$th asset bought at time $t = 0$, which may be negative as well as positive (recall we allow short positions).

The dynamics of our model using the trading strategy $\varphi$ are as follows: at time $t = 0$ we invest the amount $S(0)'\varphi = \sum_{i=0}^{d} \varphi_i S_i(0)$ and at time $t = T$ we receive the random payment $S(T, \omega)'\varphi = \sum_{i=0}^{d} \varphi_i S_i(T, \omega)$ depending on the realised state $\omega$ of the world. Using the $(d + 1) \times N$ matrix $S$, whose columns are the vectors $S(T, \omega)$, we can write the possible payments more compactly as $S'\varphi$.

What does an arbitrage opportunity mean in our model? As arbitrage is 'making something out of nothing', an arbitrage strategy is a vector $\varphi \in \mathbb{R}^{d+1}$ such that $S(0)'\varphi = 0$, our net investment at time $t = 0$ is zero, and
$$S(T, \omega)'\varphi \geq 0 \ \ \forall \omega \in \Omega \quad \text{and there exists an } \omega \in \Omega \text{ such that } S(T, \omega)'\varphi > 0.$$
We can equivalently formulate this as: $S(0)'\varphi < 0$, we borrow money for consumption at time $t = 0$, and
$$S(T, \omega)'\varphi \geq 0 \ \ \forall \omega \in \Omega,$$
i.e. we don't have to repay anything at $t = T$. Now this means we had a 'free lunch' at $t = 0$ at the market's expense.

We agreed that we should not have arbitrage opportunities in our model. The consequences of this assumption are surprisingly far-reaching. So assume that there are no arbitrage opportunities. If we analyse the structure of our model above, we see that every statement can be formulated in terms of Euclidean geometry or linear algebra. For instance, absence of arbitrage means that the space
$$\Gamma = \left\{ \begin{pmatrix} x \\ y \end{pmatrix},\ x \in \mathbb{R},\ y \in \mathbb{R}^N : x = -S(0)'\varphi,\ y = S'\varphi,\ \varphi \in \mathbb{R}^{d+1} \right\}$$
and the space
$$\mathbb{R}^{N+1}_+ = \left\{ z \in \mathbb{R}^{N+1} : z_i \geq 0 \ \forall\, 0 \leq i \leq N,\ \exists\, i \text{ such that } z_i > 0 \right\}$$
have no common points. A statement like that naturally points to the use of a separation theorem for convex subsets, the separating hyperplane theorem (see e.g. Rockafellar (1970) for an account of such results, or Appendix A). Using such a theorem we come to the following characterisation of no arbitrage.

Theorem 1.4.1. There is no arbitrage if and only if there exists a vector
$$\psi \in \mathbb{R}^N, \quad \psi_i > 0 \ \ \forall\, 1 \leq i \leq N,$$
such that
$$S\psi = S(0). \tag{1.4}$$
Proof. The implication '⇐' follows straightforwardly: assume that $S(T, \omega)'\varphi \geq 0$, $\omega \in \Omega$, for a vector $\varphi \in \mathbb{R}^{d+1}$. Then
$$S(0)'\varphi = (S\psi)'\varphi = \psi' S'\varphi \geq 0,$$
since $\psi_i > 0$, $\forall\, 1 \leq i \leq N$. So no arbitrage opportunities exist.

To show the implication '⇒' we use a variant of the separating hyperplane theorem. Absence of arbitrage means that $\Gamma$ and $\mathbb{R}^{N+1}_+$ have no common points. This means that $K \subset \mathbb{R}^{N+1}_+$ defined by
$$K = \left\{ z \in \mathbb{R}^{N+1}_+ : \sum_{i=0}^{N} z_i = 1 \right\}$$
and $\Gamma$ do not meet. But $K$ is a compact and convex set, and by the separating hyperplane theorem (Appendix C), there is a vector $\lambda \in \mathbb{R}^{N+1}$ such that for all $z \in K$
$$\lambda' z > 0$$
but for all $(x, y)' \in \Gamma$
$$\lambda_0 x + \lambda_1 y_1 + \ldots + \lambda_N y_N = 0.$$
Now choosing $z_i = 1$ successively we see that $\lambda_i > 0$, $i = 0, \ldots, N$, and hence by normalising we get $\psi = \lambda / \lambda_0$ with $\psi_0 = 1$. Now set $x = -S(0)'\varphi$ and $y = S'\varphi$ and the claim follows.

The vector $\psi$ is called a state-price vector. We can think of $\psi_j$ as the marginal cost of obtaining an additional unit of account in state $\omega_j$. We can now reformulate the above statement to:

There is no arbitrage if and only if there exists a state-price vector.

Using a further normalisation, we can clarify the link to our probabilistic setting. Given a state-price vector $\psi = (\psi_1, \ldots, \psi_N)$, we set $\psi_0 = \psi_1 + \ldots + \psi_N$ and for any state $\omega_j$ write $q_j = \psi_j / \psi_0$. We can now view $(q_1, \ldots, q_N)$ as probabilities and define a new probability measure on $\Omega$ by $\mathbb{Q}(\{\omega_j\}) = q_j$, $j = 1, \ldots, N$. Using this probability measure, we see that for each asset $i$ we have the relation
$$\frac{S_i(0)}{\psi_0} = \sum_{j=1}^{N} q_j S_i(T, \omega_j) = \mathbb{E}_{\mathbb{Q}}(S_i(T)).$$
Hence the normalized price of the financial security $i$ is just its expected payoff under some specially chosen 'risk-neutral' probabilities. Observe that we didn't make any use of the specific probability measure $\mathbb{P}$ in our given probability space.

So far we have not specified anything about the denomination of prices. From a technical point of view we could choose any asset $i$ as long as its price vector $(S_i(0), S_i(T, \omega_1), \ldots, S_i(T, \omega_N))'$ only contains positive entries, and express all other prices in units of this asset. We say that we use this asset as numéraire. Let us emphasise again that arbitrage opportunities do not depend on the chosen numéraire. It turns out that appropriate choice of the numéraire facilitates the probability-theoretic analysis in complex settings, and we will discuss the choice of the numéraire in detail later on.

For simplicity, let us assume that asset 0 is a riskless bond paying one unit in all states $\omega \in \Omega$ at time $T$. This means that $S_0(T, \omega) = 1$ in all states of the world $\omega \in \Omega$. By the above analysis we must have
$$\frac{S_0(0)}{\psi_0} = \sum_{j=1}^{N} q_j S_0(T, \omega_j) = \sum_{j=1}^{N} q_j \cdot 1 = 1,$$
and $\psi_0$ is the discount on riskless borrowing. Introducing an interest rate $r$, we must have $S_0(0) = \psi_0 = (1 + r)^{-T}$. We can now express the price of asset $i$ at time $t = 0$ as
$$S_i(0) = \sum_{j=1}^{N} q_j \frac{S_i(T, \omega_j)}{(1 + r)^T} = \mathbb{E}_{\mathbb{Q}}\left( \frac{S_i(T)}{(1 + r)^T} \right).$$
We rewrite this as
$$\frac{S_i(0)}{(1 + r)^0} = \mathbb{E}_{\mathbb{Q}}\left( \frac{S_i(T)}{(1 + r)^T} \right).$$
In the language of probability theory we just have shown that the processes $S_i(t)/(1 + r)^t$, $t = 0, T$, are $\mathbb{Q}$-martingales. (Martingales are the probabilists' way of describing fair games: see Chapter 3.) It is important to notice that under the given probability measure $\mathbb{P}$ (which reflects an individual agent's belief or the markets' belief) the processes $S_i(t)/(1 + r)^t$, $t = 0, T$, generally do not form $\mathbb{P}$-martingales.

We use this to shed light on the relationship of the probability measures $\mathbb{P}$ and $\mathbb{Q}$. Since $\mathbb{Q}(\{\omega\}) > 0$ for all $\omega \in \Omega$ the probability measures $\mathbb{P}$ and $\mathbb{Q}$ are equivalent and (see Chapters 2 and 3) because of the argument above we call $\mathbb{Q}$ an equivalent martingale measure. So we arrived at yet another characterisation of arbitrage:

There is no arbitrage if and only if there exists an equivalent martingale measure.

We also see that risk-neutral pricing corresponds to using the expectation operator with respect to an equivalent martingale measure. This concept lies at the heart of stochastic (mathematical) finance and will be the golden thread (or roter Faden) throughout this book.

We now know how the given prices of our $(d + 1)$ financial assets should be related in order to exclude arbitrage opportunities, but how should we price a newly introduced financial instrument? We can represent this financial instrument by its random payments $\delta(T) = (\delta(T, \omega_1), \ldots, \delta(T, \omega_j), \ldots, \delta(T, \omega_N))'$ (observe that $\delta(T)$ is a vector in $\mathbb{R}^N$) at time $t = T$ and ask for its price $\delta(0)$ at time $t = 0$. The natural idea is to use an equivalent probability measure $\mathbb{Q}$ and set
$$\delta(0) = \mathbb{E}_{\mathbb{Q}}\left( \delta(T)/(1 + r)^T \right)$$
(recall that all time $t = 0$ and time $t = T$ prices are related in this way). Unfortunately, as we don't have a unique martingale measure in general, we cannot guarantee the uniqueness of the $t = 0$ price. Put another way, we know every equivalent martingale measure leads to a reasonable relative price for our newly created financial instrument, but which measure should one choose?

The easiest way out would be if there were only one equivalent martingale measure at our disposal, and surprisingly enough the classical economic pricing theory puts us exactly in this situation! Given a set of financial assets on a market the underlying question is whether we are able to price any new financial asset which might be introduced in the market, or equivalently whether we can replicate the cash flow of the new asset by means of a portfolio of our original assets. If this is the case and we can replicate every new asset, the market is called complete.

In our financial market situation the question can be restated mathematically in terms of Euclidean geometry: do the vectors $S_i(T)$ span the whole $\mathbb{R}^N$? This leads to:

Theorem 1.4.2. Suppose there are no arbitrage opportunities. Then the model is complete if and only if the matrix equation
$$S'\varphi = \delta$$
has a solution $\varphi \in \mathbb{R}^{d+1}$ for any vector $\delta \in \mathbb{R}^N$.

Linear algebra immediately tells us that the above theorem means that the number of independent vectors in $S'$ must equal the number of states in $\Omega$. In an informal way we can say that if the financial market model contains 2 ($N$) states of the world at time $T$ it allows for 1 ($N - 1$) sources of randomness (if there is only one state we know the outcome). Likewise we can view the numéraire asset as risk-free and all other assets as risky. We can now restate the above characterisation of completeness in an informal (but intuitive) way as:

A financial market model is complete if it contains at least as many independent risky assets as sources of randomness.

The question of completeness can be expressed equivalently in probabilistic language (to be introduced in Chapter 3), as a question of representability of the relevant random variables or whether the σ-algebra they generate is the full σ-algebra.

If a financial market model is complete, traditional economic theory shows that there exists a unique system of prices. If there exists only one system of prices, and every equivalent martingale measure gives rise to a price system, we can only have a unique equivalent martingale measure. (We will come back to this important question in Chapters 4 and 6.)

The (arbitrage-free) market is complete if and only if there exists a unique equivalent martingale measure.
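The characterisations in Theorems 1.4.1 and 1.4.2 are statements about linear systems, so they can be checked numerically. Below is a minimal sketch, assuming NumPy and restricting to the square case $d + 1 = N$ (as many assets as states) so that a direct linear solve applies; the helper names `state_prices` and `is_complete` are illustrative. The numbers are those of the two-state, two-asset examples of this section.

```python
import numpy as np


def state_prices(S, S0):
    """Solve S psi = S(0); rows of S are assets, columns are states (square case only)."""
    return np.linalg.solve(S, S0)


def is_complete(S):
    """Complete iff the asset payoffs span R^N, i.e. rank(S) equals the number of states."""
    return np.linalg.matrix_rank(S) == S.shape[1]


# Bond/stock market: S(0) = (1, 150)', bond pays (1, 1), stock pays (180, 90).
S = np.array([[1.0, 1.0], [180.0, 90.0]])
S0 = np.array([1.0, 150.0])
psi = state_prices(S, S0)
no_arbitrage = bool(np.all(psi > 0))   # Theorem 1.4.1: strictly positive state prices
q = psi / psi.sum()                    # risk-neutral probabilities q_j = psi_j / psi_0

# Replicate the call payoff delta(T) = (30, 0)' by solving S' phi = delta.
delta = np.array([30.0, 0.0])
phi = np.linalg.solve(S.T, delta)
call_price = S0 @ phi                  # cost of the replicating portfolio

# Market with two risky assets and no bond: S(0) = (1, 1)'.
S_b = np.array([[0.75, 1.25], [0.5, 2.0]])
S0_b = np.array([1.0, 1.0])
psi_b = state_prices(S_b, S0_b)
q_b = psi_b / psi_b.sum()

# Exchange option Z(T) = max{S_1(T) - S_0(T), 0} in the second market.
Z = np.maximum(S_b[1] - S_b[0], 0.0)
phi_b = np.linalg.solve(S_b.T, Z)
Z0 = S0_b @ phi_b

print(psi, q, phi, call_price)
print(psi_b, q_b, phi_b, Z0)
```

Pricing by replication and pricing with state prices agree: `psi @ delta` equals `call_price`, and `psi_b @ Z` equals `Z0`.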
Example (continued).
We formalise the above example of a binary single-period model. We have $d + 1 = 2$ assets and $|\Omega| = 2$ states of the world $\Omega = \{\omega_1, \omega_2\}$. Keeping the interest rate $r = 0$ we obtain the following vectors (and matrices):
$$S(0) = \begin{pmatrix} S_0(0) \\ S_1(0) \end{pmatrix} = \begin{pmatrix} 1 \\ 150 \end{pmatrix}, \quad S_0(T) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad S_1(T) = \begin{pmatrix} 180 \\ 90 \end{pmatrix}, \quad S = \begin{pmatrix} 1 & 1 \\ 180 & 90 \end{pmatrix}.$$
We try to solve (1.4) for state prices, i.e. we try to find a vector $\psi = (\psi_1, \psi_2)'$, $\psi_i > 0$, $i = 1, 2$, such that
$$\begin{pmatrix} 1 \\ 150 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 180 & 90 \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}.$$
Now this has a solution
$$\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} 2/3 \\ 1/3 \end{pmatrix},$$
hence showing that there are no arbitrage opportunities in our market model. Furthermore, since $\psi_1 + \psi_2 = 1$ we see that we already have computed risk-neutral probabilities, and so we have found an equivalent martingale measure $\mathbb{Q}$ with
$$\mathbb{Q}(\omega_1) = \frac{2}{3}, \quad \mathbb{Q}(\omega_2) = \frac{1}{3}.$$
We now want to find out if the market is complete (or equivalently if there is a unique equivalent martingale measure). For that we introduce a new financial asset $\delta$ with random payments $\delta(T) = (\delta(T, \omega_1), \delta(T, \omega_2))'$. For the market to be complete, each such $\delta(T)$ must be in the linear span of $S_0(T)$ and $S_1(T)$. Now since $S_0(T)$ and $S_1(T)$ are linearly independent, their linear span is the whole $\mathbb{R}^2$ ($= \mathbb{R}^{|\Omega|}$) and $\delta(T)$ is indeed in the linear span. Hence we can find a replicating portfolio by solving
$$\begin{pmatrix} \delta(T, \omega_1) \\ \delta(T, \omega_2) \end{pmatrix} = \begin{pmatrix} 1 & 180 \\ 1 & 90 \end{pmatrix} \begin{pmatrix} \varphi_0 \\ \varphi_1 \end{pmatrix}.$$
Let us consider the example of a European option above. There $\delta(T, \omega_1) = 30$, $\delta(T, \omega_2) = 0$ and the above becomes
$$\begin{pmatrix} 30 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 180 \\ 1 & 90 \end{pmatrix} \begin{pmatrix} \varphi_0 \\ \varphi_1 \end{pmatrix},$$
with solution $\varphi_0 = -30$ and $\varphi_1 = \frac{1}{3}$, telling us to borrow 30 units and buy $\frac{1}{3}$ of a stock, which is exactly what we did to set up our portfolio above. Of course an alternative way of showing market completeness is to recognise that (1.4) above admits only one solution for risk-neutral probabilities, showing the uniqueness of the martingale measure.

Example. Change of numéraire.
We choose a situation similar to the above example, i.e. we have $d + 1 = 2$ assets and $|\Omega| = 2$ states of the world $\Omega = \{\omega_1, \omega_2\}$. But now we assume two risky assets (and no bond) as the financial instruments at our disposal. The price vectors are given by
$$S(0) = \begin{pmatrix} S_0(0) \\ S_1(0) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad S_0(T) = \begin{pmatrix} 3/4 \\ 5/4 \end{pmatrix}, \quad S_1(T) = \begin{pmatrix} 1/2 \\ 2 \end{pmatrix}, \quad S = \begin{pmatrix} 3/4 & 5/4 \\ 1/2 & 2 \end{pmatrix}.$$
We solve (1.4) and get state prices
$$\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} = \begin{pmatrix} 6/7 \\ 2/7 \end{pmatrix},$$
showing that there are no arbitrage opportunities in our market model. So we find an equivalent martingale measure $\mathbb{Q}$ with
$$\mathbb{Q}(\omega_1) = \frac{3}{4}, \quad \mathbb{Q}(\omega_2) = \frac{1}{4}.$$
Since we don't have a risk-free asset in our model, this normalisation (this numéraire) is very artificial, and we shall use one of the modelled assets as a numéraire, say $S_0$. Under this normalisation, we have the following asset prices (in terms of $S_0(t, \omega)$!):
$$\tilde{S}(0) = \begin{pmatrix} \tilde{S}_0(0) \\ \tilde{S}_1(0) \end{pmatrix} = \begin{pmatrix} S_0(0)/S_0(0) \\ S_1(0)/S_0(0) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
$$\tilde{S}_0(T) = \begin{pmatrix} S_0(T, \omega_1)/S_0(T, \omega_1) \\ S_0(T, \omega_2)/S_0(T, \omega_2) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad \tilde{S}_1(T) = \begin{pmatrix} S_1(T, \omega_1)/S_0(T, \omega_1) \\ S_1(T, \omega_2)/S_0(T, \omega_2) \end{pmatrix} = \begin{pmatrix} 2/3 \\ 8/5 \end{pmatrix}.$$
Since the model is arbitrage-free (recall that a change of numéraire doesn't affect the no-arbitrage property), we are able to find risk-neutral probabilities $\tilde{q}_1 = \frac{9}{14}$ and $\tilde{q}_2 = \frac{5}{14}$.

We now compute the prices for a call option to exchange $S_0$ for $S_1$. Define $Z(T) = \max\{S_1(T) - S_0(T), 0\}$ and the cash flow is given by
$$\begin{pmatrix} Z(T, \omega_1) \\ Z(T, \omega_2) \end{pmatrix} = \begin{pmatrix} 0 \\ 3/4 \end{pmatrix}.$$
There are no difficulties in finding the hedge portfolio $\varphi_0 = -\frac{3}{7}$ and $\varphi_1 = \frac{9}{14}$ and pricing the option as $Z_0 = \frac{3}{14}$.

We want to point out the following observation. Using $S_0$ as numéraire we naturally write
$$\frac{Z(T, \omega)}{S_0(T, \omega)} = \max\left\{ \frac{S_1(T, \omega)}{S_0(T, \omega)} - 1,\ 0 \right\}$$
and see that this seemingly complicated option is (by use of the appropriate numéraire) equivalent to
$$\tilde{Z}(T, \omega) = \max\left\{ \tilde{S}_1(T, \omega) - 1,\ 0 \right\},$$
a European call on the asset $\tilde{S}_1$.

1.4.3 A few financial-economic considerations

The underlying principle for modelling economic behaviour of investors (or economic agents in general) is the maximisation of expected utility, that is one assumes that agents have a utility function $U(\cdot)$ and base economic decisions on expected utility considerations. For instance, assuming a one-period model, an economic agent might have a utility function over current ($t = 0$) and future ($t = T$) values of consumption
$$U(c_0, c_T) = u(c_0) + \mathbb{E}(\beta u(c_T)),$$
where $c_t$ is consumption at time $t$. $u(\cdot)$ is a standard utility function expressing
• non-satiation: investors prefer more to less; $u$ is increasing;
• risk aversion: investors reject an actuarially fair gamble; $u$ is concave;
• and (maybe) decreasing absolute risk aversion and constant relative risk aversion.
Typical examples are power utility $u(x) = (x^\gamma - 1)/\gamma$, log utility $u(x) = \log(x)$ or quadratic utility $u(x) = x^2 + dx$ (for which only the first two properties are true).

Assume such an investor is offered at $t = 0$ at a price $p$ a random payoff $X$ at $t = T$. How much will she buy? Denote with $\xi$ the amount of the asset she chooses to buy and with $e_\tau$, $\tau = 0, T$, her original consumption. Thus, her problem is
$$\max_{\xi} \left[ u(c_0) + \mathbb{E}[\beta u(c_T)] \right] \tag{1.5}$$
such that
$$c_0 = e_0 - p\xi \quad \text{and} \quad c_T = e_T + X\xi.$$
Substituting the constraints into the objective function and differentiating with respect to $\xi$, we get the first-order condition
$$p\, u'(c_0) = \mathbb{E}[\beta u'(c_T) X] \quad \Leftrightarrow \quad p = \mathbb{E}\left[ \beta \frac{u'(c_T)}{u'(c_0)} X \right].$$
The investor buys or sells more of the asset until this first-order condition is satisfied. If we use the stochastic discount factor
$$m = \beta \frac{u'(c_T)}{u'(c_0)},$$
we obtain the central equation
$$p = \mathbb{E}(mX). \tag{1.6}$$
We can use (under regularity conditions) the random variable $m$ to perform a change of measure, i.e. define a probability measure $\mathbb{P}^*$ using
$$\mathbb{P}^*(A) = \mathbb{E}^*(1_A) = \mathbb{E}(m 1_A).$$
We write (1.6) under measure $\mathbb{P}^*$ and get
$$p = \mathbb{E}^*(X).$$
Returning to the initial pricing equation, we see that under $\mathbb{P}^*$ the investor has the utility function $u(x) = x$. Such an investor is called risk-neutral, and consequently one often calls the corresponding measure a risk-neutral measure. An excellent discussion of these issues (and further much deeper results) is given in Cochrane (2001).
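The first-order condition behind $p = \mathbb{E}(mX)$ can be checked numerically. The sketch below is only an illustration: log utility, the discount factor, the state probabilities, the endowments and the payoff are all assumed values, not taken from the notes. It evaluates the stochastic discount factor at the endowment (i.e. at $\xi = 0$), prices the payoff, and verifies with a central finite difference that the objective has zero slope in $\xi$ at that price.

```python
import math

beta = 0.95                    # subjective discount factor (assumed)
probs = [0.5, 0.5]             # subjective state probabilities (assumed)
e0, eT = 1.0, [1.2, 0.8]       # endowment today and in each state (assumed)
X = [1.0, 0.5]                 # payoff of the offered asset per state (assumed)


def u(c):
    """Log utility (one of the typical examples)."""
    return math.log(c)


def u_prime(c):
    """Marginal utility of log utility."""
    return 1.0 / c


# Stochastic discount factor m = beta * u'(c_T) / u'(c_0), evaluated at xi = 0.
m = [beta * u_prime(c) / u_prime(e0) for c in eT]
p = sum(pr * mj * xj for pr, mj, xj in zip(probs, m, X))   # p = E(mX)


def objective(xi, price):
    """u(c_0) + E[beta u(c_T)] with c_0 = e_0 - price*xi and c_T = e_T + X*xi."""
    c0 = e0 - price * xi
    future = sum(pr * u(c + xj * xi) for pr, c, xj in zip(probs, eT, X))
    return u(c0) + beta * future


h = 1e-5
slope_at_p = (objective(h, p) - objective(-h, p)) / (2 * h)
slope_too_dear = (objective(h, p + 0.1) - objective(-h, p + 0.1)) / (2 * h)
print(p, slope_at_p, slope_too_dear)
```

At the price $p = \mathbb{E}(mX)$ the investor has no incentive to move away from $\xi = 0$; at a higher price the slope is negative, so she would sell.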
Chapter 2 Financial Market Theory

2.1 Choice under Uncertainty

In a complete financial market model prices of derivative securities can be obtained by arbitrage arguments. In incomplete market situations, these financial instruments carry an intrinsic risk which cannot be hedged away. Thus in order to price these instruments further assumptions on the investors, especially regarding their preferences towards the risks involved, are needed.

2.1.1 Preferences and the Expected Utility Theorem

Let $\mathcal{X}$ be some non-empty set. An element $x \in \mathcal{X}$ will be interpreted as a possible choice of an economic agent. If presented with two choices $x, y \in \mathcal{X}$ the agent might prefer one over the other. This can be formalized as follows.

Definition 2.1.1. A binary relation $\succeq$ defined on $\mathcal{X} \times \mathcal{X}$ is called a preference relation, if it is
• transitive: $x \succeq y$, $y \succeq z \Rightarrow x \succeq z$.
• complete: for all $x, y \in \mathcal{X}$ either $x \succeq y$ or $y \succeq x$.
If $x \succeq y$ and $y \succeq x$ we write $x \sim y$ (indifference relation). $x$ is said to be strictly preferred to $y$, denoted by $x \succ y$, if $x \succeq y$ and $y \not\succeq x$.

Definition 2.1.2. A numerical representation of a preference order $\succeq$ is a function $U : \mathcal{X} \to \mathbb{R}$ such that
$$y \succeq x \Leftrightarrow U(y) \geq U(x). \tag{2.1}$$
In order to characterize existence of a numerical representation we need

Definition 2.1.3. Let $\succeq$ be a preference relation on $\mathcal{X}$. A subset $\mathcal{Z}$ of $\mathcal{X}$ is called order dense if for any pair $x, y \in \mathcal{X}$ such that $x \succ y$ there exists some $z \in \mathcal{Z}$ such that $x \succeq z \succeq y$.

Theorem 2.1.1. For the existence of a numerical representation of a preference relation $\succeq$ it is necessary and sufficient that $\mathcal{X}$ contains a countable order dense subset $\mathcal{Z}$. In particular, any preference relation admits a numerical representation if $\mathcal{X}$ is countable.

Suppose that each possible choice of our economic agent corresponds to a probability distribution on a sample space $(\Omega, \mathcal{F})$. Thus the set $\mathcal{X}$ can be identified with a subset $\mathcal{M}$ of the set $\mathcal{M}_1(\Omega, \mathcal{F})$ of all probability distributions on $(\Omega, \mathcal{F})$. In the context of the theory of choice the elements of $\mathcal{M}$ are sometimes called lotteries. We assume in the sequel that $\mathcal{M}$ is convex. The aim in the following is to characterize the preference orders $\succeq$ that allow a numerical representation of the form
$$\mu \succeq \nu \Leftrightarrow \int_\Omega u(\omega)\, \mu(d\omega) \geq \int_\Omega u(\omega)\, \nu(d\omega). \tag{2.2}$$
Definition 2.1.4. A numerical representation $U$ of a preference order $\succeq$ on $\mathcal{M}$ is called a von Neumann-Morgenstern representation if it is of form (2.2).
Any von Neumann-Morgenstern representation $U$ is affine on $\mathcal{M}$ in the sense that
$$U(\alpha\mu + (1 - \alpha)\nu) = \alpha U(\mu) + (1 - \alpha) U(\nu)$$
for all $\mu, \nu \in \mathcal{M}$ and $\alpha \in [0, 1]$. Affinity implies the two following properties or axioms:
• Independence (substitution) (I): Let $\mu, \nu, \lambda \in \mathcal{M}$ and $\alpha \in (0, 1]$, then
$$\mu \succ \nu \Rightarrow \alpha\mu + (1 - \alpha)\lambda \succ \alpha\nu + (1 - \alpha)\lambda.$$
• Archimedean (continuity) (A): If $\mu \succ \lambda \succ \nu \in \mathcal{M}$ then there exist $\alpha, \beta \in (0, 1)$ such that
$$\alpha\mu + (1 - \alpha)\nu \succ \lambda \succ \beta\mu + (1 - \beta)\nu.$$
Theorem 2.1.2. Suppose the preference relation $\succeq$ on $\mathcal{M}$ satisfies the axioms (A) and (I). Then there exists an affine numerical representation $U$ of $\succeq$. Moreover, $U$ is unique up to positive affine transformations, i.e. any other affine numerical representation $\tilde{U}$ with these properties is of the form $\tilde{U} = aU + b$ for some $a > 0$ and $b \in \mathbb{R}$.

In case of a discrete (finite) probability distribution existence of an affine numerical representation is equivalent to existence of a von Neumann-Morgenstern representation. In the general case one needs to introduce a further axiom, the so-called sure thing principle: For $\mu, \nu \in \mathcal{M}$ and $A \in \mathcal{F}$ such that $\mu(A) = 1$:
$$\delta_x \succeq \nu \text{ for all } x \in A \Rightarrow \mu \succeq \nu$$
and
$$\nu \succeq \delta_x \text{ for all } x \in A \Rightarrow \nu \succeq \mu.$$
From now on we will work within the framework of the expected utility representation.
2.1.2 Risk Aversion

We focus now on individual financial assets under the assumption that their payoff distributions at a fixed time are known. We can view these asset distributions as probability distributions on some interval $S \subseteq \mathbb{R}$. Thus we take $\mathcal{M}$ as a fixed set of Borel probability measures on $S$. We also assume that $\mathcal{M}$ is convex and contains all point masses $\delta_x$ for $x \in S$. Also, we assume that each $\mu \in \mathcal{M}$ has a well-defined expectation
$$m(\mu) = \int x\, \mu(dx) \in \mathbb{R}.$$
For assets with random payoff $\mu$ (resp. insurance contracts with random damage $\mu$), $m(\mu)$ is often called the fair price (resp. fair premium). However, actual prices (resp. premia) will typically be different due to risk premia, which can be explained within our conceptual framework. We assume in the sequel that preference relations have a von Neumann-Morgenstern representation.
Definition 2.1.5. A preference relation $\succeq$ on $\mathcal{M}$ is called monotone if
$$x > y \quad \text{implies} \quad \delta_x \succ \delta_y.$$
The preference relation is called risk averse if for $\mu \in \mathcal{M}$
$$\delta_{m(\mu)} \succ \mu \quad \text{unless } \mu = \delta_{m(\mu)}. \tag{2.3}$$
Remark 2.1.1. An economic agent is called risk-averse if his preference relation is risk averse. A risk-averse economic agent is unwilling to accept or indifferent to every actuarially fair gamble. An economic agent is strictly risk averse if he is unwilling to accept every actuarially fair gamble.

Proposition 2.1.1. A preference relation $\succeq$ is
(i) monotone, iff $u$ is strictly increasing.
(ii) risk averse, iff $u$ is strictly concave.

Proof. (i) Monotonicity is equivalent to
$$u(x) = \int u(s)\, \delta_x(ds) = U(\delta_x) > U(\delta_y) = u(y)$$
for $x > y$.
(ii) For $\mu = \alpha\delta_x + (1 - \alpha)\delta_y$ we have
$$m(\mu) = \int s\, (\alpha\delta_x + (1 - \alpha)\delta_y)(ds) = \alpha x + (1 - \alpha) y.$$
So if $\succeq$ is risk-averse, then
$$\delta_{\alpha x + (1 - \alpha) y} \succ \alpha\delta_x + (1 - \alpha)\delta_y$$
holds for all distinct $x, y \in S$ and $\alpha \in (0, 1)$. Hence
$$u(\alpha x + (1 - \alpha) y) > \alpha u(x) + (1 - \alpha) u(y),$$
so $u$ is strictly concave. Conversely, if $u$ is strictly concave, then Jensen's inequality implies risk aversion, since
$$U(\delta_{m(\mu)}) = u\left( \int x\, \mu(dx) \right) \geq \int u(x)\, \mu(dx) = U(\mu)$$
with equality iff $\mu = \delta_{m(\mu)}$.

Definition 2.1.6. A function $u : S \to \mathbb{R}$ is called utility function if it is strictly concave, strictly increasing and continuous on $S$.

By the intermediate value theorem there is for any $\mu \in \mathcal{M}$ a unique number $c(\mu)$ such that
$$u(c(\mu)) = U(\mu) = \int u\, d\mu. \tag{2.4}$$
So $\delta_{c(\mu)} \sim \mu$, i.e. there is indifference between $\mu$ and the sure amount $c(\mu)$.

Definition 2.1.7. The certainty equivalent of $\mu$, denoted by $c(\mu)$, is the number defined in (2.4). It is the amount of money for which the individual is indifferent between the lottery $\mu$ and the certain amount $c(\mu)$. The number
$$\rho(\mu) = m(\mu) - c(\mu) \tag{2.5}$$
is called the risk premium.
The certainty equivalent can be viewed as the upper price an investor would pay for the asset distribution $\mu$. Thus the fair price must be reduced by the risk premium if one wants an agent to buy the asset.

Consider now an investor who has the choice to invest a fraction of his wealth in a risk-free and the remaining fraction of his wealth in a risky asset. We want to find conditions on the distribution of the risky asset and the preferences (utility function) of the investor in order to determine his willingness for a risky investment. Formally, we consider the following optimisation problem
$$f(\lambda) = U(\mu_\lambda) = \int u\, d\mu_\lambda \to \max, \tag{2.6}$$
where $\mu_\lambda$ is the distribution of $X_\lambda = (1 - \lambda) X + \lambda c$ with $X$ an integrable random variable with non-degenerate distribution $\mu$ and $c \in S$ is a certain amount.

Proposition 2.1.2. For the maximiser $\lambda^* \in [0, 1]$ of (2.6) we have $\lambda^* = 1$ if $\mathbb{E}(X) \leq c$ and $\lambda^* > 0$ if $c \geq c(\mu)$.

Proof. By Jensen's inequality
$$f(\lambda) \leq u(\mathbb{E}[X_\lambda]) = u((1 - \lambda)\mathbb{E}(X) + \lambda c)$$
with equality iff $\lambda = 1$. It follows that $\lambda^* = 1$ if the right-hand side is increasing in $\lambda$, i.e. $\mathbb{E}(X) \leq c$. Now, strict concavity of $u$ implies
$$f(\lambda) \geq \mathbb{E}\big( (1 - \lambda) u(X) + \lambda u(c) \big) = (1 - \lambda) u(c(\mu)) + \lambda u(c)$$
with equality iff $\lambda \in \{0, 1\}$. The right-hand side is increasing in $\lambda$ if $c \geq c(\mu)$, and this implies $\lambda^* > 0$.

Remark 2.1.2. (i) (Demand for a risky asset.) The price of a risky asset must be below the expected discounted payoff in order to attract any risk-averse investor.
(ii) (Demand for insurance.) Risk aversion can create a demand for insurance even if the insurance premium lies above the fair price.
(iii) If $u \in C^1(\mathbb{R})$ then
$$\lambda^* = 1 \Leftrightarrow \mathbb{E}(X) \leq c, \qquad \lambda^* = 0 \Leftrightarrow c \leq \frac{\mathbb{E}(X u'(X))}{\mathbb{E}(u'(X))}.$$
We assume now that $\mu$ has a finite variance $\mathrm{Var}(\mu)$. We consider the Taylor expansion of a sufficiently smooth utility function $u(x)$ at $x = c(\mu)$ around $m = m(\mu)$. We have
$$u(c(\mu)) \approx u(m) + u'(m)(c(\mu) - m) = u(m) - u'(m)\rho(\mu).$$
On the other hand,
$$u(c(\mu)) = \int u(x)\, \mu(dx) = \int \left[ u(m) + u'(m)(x - m) + \frac{1}{2} u''(m)(x - m)^2 + r(x) \right] \mu(dx) \approx u(m) + \frac{1}{2} u''(m) \mathrm{Var}(\mu)$$
(where $r(x)$ denotes the remainder term in the Taylor expansion). So
$$\rho(\mu) \approx -\frac{u''(m)}{2 u'(m)} \mathrm{Var}(\mu) = \frac{1}{2} \alpha(m) \mathrm{Var}(\mu). \tag{2.7}$$
Thus $\alpha(m(\mu))$ is the factor by which an economic agent with utility function $u$ weighs the risk (measured by $\mathrm{Var}(\mu)$) in order to determine the risk premium.
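The quality of the approximation (2.7) can be illustrated with a small computation. The sketch below assumes log utility, for which $\alpha(x) = -u''(x)/u'(x) = 1/x$ and the certainty equivalent is $c(\mu) = \exp(\int \log x\, \mu(dx))$; the two-point lottery and the helper name `certainty_equivalent_log` are illustrative assumptions.

```python
import math


def certainty_equivalent_log(outcomes, probs):
    """c(mu) for u = log: the number c with log(c) = E[log X]."""
    expected_utility = sum(p * math.log(x) for x, p in zip(outcomes, probs))
    return math.exp(expected_utility)


# Hypothetical lottery: payoff 90 or 110 with equal probability.
outcomes, probs = [90.0, 110.0], [0.5, 0.5]

m = sum(p * x for x, p in zip(outcomes, probs))               # fair price m(mu)
var = sum(p * (x - m) ** 2 for x, p in zip(outcomes, probs))  # Var(mu)

c = certainty_equivalent_log(outcomes, probs)
rho_exact = m - c                 # risk premium (2.5)

alpha_m = 1.0 / m                 # Arrow-Pratt coefficient of log utility at m
rho_approx = 0.5 * alpha_m * var  # second-order approximation (2.7)

print(m, var, c, rho_exact, rho_approx)
```

For this lottery the exact premium and the approximation differ only in the third decimal place; the gap grows as the spread of the lottery increases relative to $m$.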
Definition 2.1.8. Suppose that $u$ is a twice continuously differentiable utility function on $S$. Then
$$\alpha(x) = -\frac{u''(x)}{u'(x)} \tag{2.8}$$
is called the Arrow-Pratt coefficient of absolute risk aversion of $u$ at level $x$.

Example 2.1.1. (i) Constant absolute risk aversion (CARA). Here $\alpha(x) \equiv \alpha$ for some constant. This implies a (normalized) utility function $u(x) = 1 - e^{-\alpha x}$.
(ii) Hyperbolic absolute risk aversion (HARA). Here $\alpha(x) = (1 - \gamma)/x$ for some $\gamma \in [0, 1)$. This implies
$$u(x) = \log x \ \text{ for } \gamma = 0, \qquad u(x) = x^\gamma/\gamma \ \text{ for } 0 < \gamma < 1.$$
Remark 2.1.3. For $u$ and $\tilde{u}$ two utility functions on $S$ with $\alpha$ and $\tilde{\alpha}$ the corresponding Arrow-Pratt coefficients we have
$$\alpha(x) \geq \tilde{\alpha}(x)\ \ \forall x \in S \quad \Leftrightarrow \quad \rho(\mu) \geq \tilde{\rho}(\mu)$$
with $\rho$, $\tilde{\rho}$ the corresponding risk premia.

It is also useful to analyse risk aversion in terms of the proportion of wealth that is invested in a risky asset.

Definition 2.1.9. Suppose that $u$ is a twice continuously differentiable utility function on $S$. Then
$$\alpha_R(x) = \alpha(x)\, x = -x \frac{u''(x)}{u'(x)} \tag{2.9}$$
is called the Arrow-Pratt coefficient of relative risk aversion of $u$ at level $x$.

Remark 2.1.4. (i) An individual's utility function displays decreasing (constant, increasing) absolute risk aversion if $\alpha(x)$ is decreasing (constant, increasing).
(ii) An individual's utility function displays decreasing (constant, increasing) relative risk aversion if $\alpha_R(x)$ is decreasing (constant, increasing).

2.1.3 Further measures of risk

In this section we focus on the question whether one (risky asset) distribution is preferred to another, regardless of the choice of a particular utility function. Assume $S = \mathbb{R}$ and $\mathcal{M} = \mathcal{M}_1$ (expectation exists).

Definition 2.1.10. Let $\mu, \nu$ be elements of $\mathcal{M}$. We say that $\mu$ is uniformly preferred over $\nu$, notation $\mu \succeq_{uni} \nu$, if for all utility functions $u$
$$\int u\, d\mu \geq \int u\, d\nu. \tag{2.10}$$
$\succeq_{uni}$ is also called second-order stochastic dominance.

Theorem 2.1.3. For any pair $\mu, \nu \in \mathcal{M}$ the following conditions are equivalent:
(a) $\mu \succeq_{uni} \nu$.
(b) $\int f\, d\mu \geq \int f\, d\nu$ for all increasing concave functions $f$.
(c) For all $c \in \mathbb{R}$ we have
$$\int (c - x)^+ \mu(dx) \leq \int (c - x)^+ \nu(dx). \tag{2.11}$$
(d) If $F$ and $G$ denote the respective distribution functions of $\mu$ and $\nu$, then
$$\int_{-\infty}^{c} F(x)\, dx \leq \int_{-\infty}^{c} G(x)\, dx \quad \forall c \in \mathbb{R}. \tag{2.12}$$
Proof. (a) ⇒ (b) ((b) ⇒ (a) is clear). Recall $u$ is a utility function iff it is strictly concave and strictly increasing. Choose any utility function $u_0$ with $\int u_0\, d\mu$ and $\int u_0\, d\nu$ finite. For instance,
$$u_0(x) = \begin{cases} x - e^{x/2} + 1 & \text{if } x \leq 0, \\ \sqrt{x + 1} - 1 & \text{if } x \geq 0. \end{cases}$$
Then for $f$ concave and increasing and for $\alpha \in [0, 1)$
$$u_\alpha(x) = \alpha f(x) + (1 - \alpha) u_0(x)$$
is a utility function. Hence
$$\int f\, d\mu = \lim_{\alpha \uparrow 1} \int u_\alpha\, d\mu \geq \lim_{\alpha \uparrow 1} \int u_\alpha\, d\nu = \int f\, d\nu.$$
(b) ⇔ (c). '⇒': Follows, because $f(x) = -(c - x)^+$ is concave and increasing.
'⇐': Let $f$ be an increasing concave function and $h = -f$. Then $h$ is convex and decreasing and its increasing right-hand derivative $h' := h'_+$ can be regarded as a distribution function of a non-negative (Radon) measure $\gamma$ on $\mathbb{R}$. Thus
$$h'(b) = h'(a) + \gamma((a, b]) \quad \text{for } a < b.$$
For $x < b$ we find
$$h(x) = h(b) - h'(b)(b - x) + \int_{(-\infty, b]} (z - x)^+ \gamma(dz).$$
Using (c), the fact that $h'(b) \leq 0$ and Fubini's theorem we obtain
$$\int_{(-\infty, b]} h\, d\mu = h(b)\mu((-\infty, b]) - h'(b) \int (b - x)^+ \mu(dx) + \int_{(-\infty, b]} \int (z - x)^+ \mu(dx)\, \gamma(dz)$$
$$\leq h(b)\nu((-\infty, b]) - h'(b) \int (b - x)^+ \nu(dx) + \int_{(-\infty, b]} \int (z - x)^+ \nu(dx)\, \gamma(dz) = \int_{(-\infty, b]} h\, d\nu.$$
Now letting $b \uparrow \infty$ yields $\int f\, d\mu \geq \int f\, d\nu$.
(c) ⇔ (d). By Fubini's theorem
$$\int_{-\infty}^{c} F(y)\, dy = \int_{-\infty}^{c} \int_{(-\infty, y]} \mu(dz)\, dy = \int \int 1(z \leq y \leq c)\, dy\, \mu(dz) = \int (c - z)^+ \mu(dz).$$
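Remark 2.1.5.
(i) Also: $\mu \succeq_{\mathrm{uni}} \nu \iff \int_0^{\lambda} F^{-1}(y)\,dy \ge \int_0^{\lambda} G^{-1}(y)\,dy$ for all $\lambda \in (0,1]$, where $F^{-1}$, $G^{-1}$ are the inverses, or quantile functions, of the distributions.
(ii) Taking $f(x) = x$ in condition (b) of Theorem 2.1.3 (the condition on increasing concave functions), we see $\mu \succeq_{\mathrm{uni}} \nu \Rightarrow m(\mu) \ge m(\nu)$.
(iii) For normal distributions, we have $N(m,\sigma^2) \succeq_{\mathrm{uni}} N(\tilde m,\tilde\sigma^2)$ iff $m \ge \tilde m$ and $\sigma^2 \le \tilde\sigma^2$.

If $\mu, \nu \in \mathcal{M}$ are such that $m(\mu) = m(\nu)$ and $\mu \succeq_{\mathrm{uni}} \nu$, then $\mathrm{var}(\mu) \le \mathrm{var}(\nu)$. Here
$$\mathrm{var}(\mu) = \int (x - m(\mu))^2\,\mu(dx) = \int x^2\,\mu(dx) - m(\mu)^2$$
is the variance of $\mu$ (use condition (b) with $f(x) = -x^2$; under $m(\mu) = m(\nu)$ condition (b) holds for all concave functions).

In the financial context, comparisons of portfolios with known payoff distributions often use a mean-variance approach with
$$\mu \succeq \nu \iff m(\mu) \ge m(\nu) \ \text{and} \ \mathrm{var}(\mu) \le \mathrm{var}(\nu).$$
For normal distributions $\mu \succeq \nu$ is equivalent to $\mu \succeq_{\mathrm{uni}} \nu$, but not in general.

Example 2.1.2. Let $\mu = U[-1,1]$, so $m(\mu) = 0$ and
$$\mathrm{var}(\mu) = \int_{-1}^{1} \tfrac{1}{2}x^2\,dx = \tfrac{1}{2}\Big[\tfrac{1}{3}x^3\Big]_{-1}^{1} = \tfrac{1}{3}.$$
For $\nu = p\,\delta_{-1/2} + (1-p)\,\delta_2$ with $p = \tfrac{4}{5}$ we have
$$m(\nu) = -\tfrac{1}{2}\cdot\tfrac{4}{5} + \tfrac{1}{5}\cdot 2 = 0, \qquad \mathrm{var}(\nu) = \tfrac{4}{5}\cdot\tfrac{1}{4} + \tfrac{1}{5}\cdot 4 = 1.$$
Thus $\mathrm{var}(\nu) > \mathrm{var}(\mu)$, but
$$\int \big(-\tfrac{1}{2} - x\big)^+\,\mu(dx) = \tfrac{1}{2}\int_{-1}^{-1/2}\big(-\tfrac{1}{2} - x\big)\,dx = \tfrac{1}{2}\Big[-\tfrac{1}{2}x - \tfrac{1}{2}x^2\Big]_{-1}^{-1/2} = \tfrac{1}{16}$$
and
$$\int \big(-\tfrac{1}{2} - x\big)^+\,\nu(dx) = 0.$$
So $\mu \succeq_{\mathrm{uni}} \nu$ does not hold.

A further important class of distributions is discussed in the following example.

Example 2.1.3. A real-valued random variable $Y$ on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is called log-normally distributed with parameters $\alpha \in \mathbb{R}$ and $\sigma \in \mathbb{R}_+$ if it can be written as
$$Y = \exp(\alpha + \sigma X),$$
where $X$ has a standard normal law $N(0,1)$. For log-normal distributions
$$\mu \succeq_{\mathrm{uni}} \tilde\mu \iff \sigma^2 \le \tilde\sigma^2 \ \text{and} \ \alpha + \tfrac{1}{2}\sigma^2 \ge \tilde\alpha + \tfrac{1}{2}\tilde\sigma^2.$$

Definition 2.1.11. Let $\mu$ and $\nu$ be two arbitrary probability measures on $\mathbb{R}$. We say that $\mu$ stochastically dominates $\nu$, notation $\mu \succeq_{\mathrm{mon}} \nu$, if
$$\int f\,d\mu \ge \int f\,d\nu$$
for all bounded increasing functions $f \in C(\mathbb{R})$. Stochastic dominance is also called first-order stochastic dominance.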
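The numbers in Example 2.1.2 can be verified with exact rational arithmetic. The sketch below only recomputes the quantities stated in the example; the helper names `lpm_mu` and `lpm_nu` are illustrative and the closed form for the uniform distribution is derived from its density $1/2$ on $[-1,1]$.

```python
from fractions import Fraction as Fr

p = Fr(4, 5)
nu = [(Fr(-1, 2), p), (Fr(2), 1 - p)]   # (atom, weight) pairs of nu

mean_nu = sum(w * x for x, w in nu)
var_nu = sum(w * (x - mean_nu) ** 2 for x, w in nu)
var_mu = Fr(1, 3)                        # variance of U[-1, 1]


def lpm_mu(c):
    """int (c - x)^+ mu(dx) for mu = U[-1, 1]."""
    if c <= -1:
        return Fr(0)
    if c >= 1:
        return Fr(c)                     # equals c - m(mu) = c
    return (c + 1) ** 2 / 4


def lpm_nu(c):
    """int (c - x)^+ nu(dx) for the two-point measure nu."""
    return sum(w * max(c - x, 0) for x, w in nu)


c = Fr(-1, 2)
print(mean_nu, var_nu, lpm_mu(c), lpm_nu(c))
```

At $c = -1/2$ the lower partial moment of $\mu$ exceeds that of $\nu$, so the condition on $\int (c-x)^+$ in Theorem 2.1.3 fails even though $\mu$ has the smaller variance.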
Theorem 2.1.4. For $\mu, \nu \in \mathcal{M}_1(\mathbb{R})$ the following conditions are equivalent:
(a) $\mu \succeq_{\mathrm{mon}} \nu$;
(b) for all $x$, $F(x) \le G(x)$, where $F$, $G$ are respectively the distribution functions of $\mu$, $\nu$;
(c) there exists a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with random variables $X$ and $Y$ having respective distributions $\mu$ and $\nu$ such that $X \ge Y$ $\mathbb{P}$-a.s.
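Conditions (b) and (c) of Theorem 2.1.4 can be illustrated for discrete laws. The sketch below assumes the standard quantile coupling $X = F^{-1}(U)$, $Y = G^{-1}(U)$ with a common uniform variable $U$; the notes do not spell out this construction, and the two three-point laws are hypothetical.

```python
def cdf(dist, x):
    """Distribution function of a discrete law given as {atom: probability}."""
    return sum(p for a, p in dist.items() if a <= x)


def quantile(dist, u):
    """Generalised inverse inf{x : F(x) >= u} for 0 < u < 1."""
    cum = 0.0
    for a in sorted(dist):
        cum += dist[a]
        if cum >= u:
            return a
    return max(dist)  # guards against rounding when u is close to 1


mu = {1: 0.2, 2: 0.3, 3: 0.5}   # hypothetical law of X
nu = {0: 0.3, 2: 0.4, 3: 0.3}   # hypothetical law of Y

# Condition (b): F <= G, checked at all atoms (both CDFs are step functions).
grid = sorted(set(mu) | set(nu))
cdf_ordered = all(cdf(mu, x) <= cdf(nu, x) + 1e-12 for x in grid)

# Condition (c): evaluate both quantile functions on a common grid of u values.
us = [(k + 0.5) / 1000 for k in range(1000)]
coupled = [(quantile(mu, u), quantile(nu, u)) for u in us]
pathwise_ordered = all(x >= y for x, y in coupled)
print(cdf_ordered, pathwise_ordered)
```

Feeding the same $u$ into both quantile functions keeps the marginals intact while ordering the two variables scenario by scenario.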
2.2 Optimal Portfolios

2.2.1 The mean-variance approach
Recall our one-period model with securities $S_0, S_1, \ldots, S_d$ and security prices $S_i(T)$ at the final time $t = T$. Here $S_0$ is the risk-free bond and $S_1, \ldots, S_d$ are random variables on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$. For the purpose of this section we disregard the risk-free asset and invest only in the risky assets. We consider their returns
$$R_i(T) = \frac{S_i(T)}{S_i(0)}, \quad i = 1, \ldots, d,$$
and assume we know (or have estimated) their means and covariance matrix
$$\mathbb{E}(R_i(T)) = m_i, \quad i = 1, \ldots, d, \qquad \mathrm{Cov}(R_i(T), R_j(T)) = \sigma_{ij}, \quad i, j = 1, \ldots, d.$$
(Observe that $\Sigma = (\sigma_{ij})$ is positive semi-definite.) We consider portfolio vectors $\varphi \in \mathbb{R}^d$ with $\varphi_i \ge 0$ (in order to avoid the possibility of negative final wealth).

Definition 2.2.1. An investor with initial wealth $x > 0$ is assumed to hold $\varphi_i \ge 0$ shares of security $i$, $i = 1, \ldots, d$, with
$$\sum_{i=1}^{d} \varphi_i S_i(0) = x \qquad \text{("budget equation").}$$
Then the portfolio vector $\pi = (\pi_1, \ldots, \pi_d)$ is defined as
$$\pi_i = \frac{\varphi_i \cdot S_i(0)}{x}, \quad i = 1, \ldots, d,$$
and
$$R^{\pi} = \sum_{i=1}^{d} \pi_i R_i(T)$$
is called the corresponding portfolio return.

Remark 2.2.1.
(1) The components of the portfolio vector represent the fractions of total wealth invested in the corresponding securities. In particular, we have
$$\sum_{i=1}^{d} \pi_i = \frac{1}{x}\sum_{i=1}^{d} \varphi_i S_i(0) = \frac{x}{x} = 1.$$
(2) Let $V^{\pi}(T)$ denote the final wealth corresponding to an initial wealth of $x$ and a portfolio vector $\varphi$, i.e.
$$V^{\pi}(T) = \sum_{i=1}^{d} \varphi_i S_i(T);$$
then we find
$$R^{\pi} = \sum_{i=1}^{d} \pi_i R_i(T) = \sum_{i=1}^{d} \frac{\varphi_i S_i(0)}{x}\cdot\frac{S_i(T)}{S_i(0)} = \frac{V^{\pi}(T)}{x}.$$
(3) The mean and the variance of the portfolio return are given by
$$\mathbb{E}(R^{\pi}) = \sum_{i=1}^{d} \pi_i m_i, \qquad \mathrm{Var}(R^{\pi}) = \sum_{i=1}^{d}\sum_{j=1}^{d} \pi_i \sigma_{ij} \pi_j.$$
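The passage from share holdings to portfolio weights, and the formulas of Remark 2.2.1 (3), can be sketched in a few lines. All prices, holdings, means and covariances below are hypothetical illustration values, not data from the notes; returns are gross returns $S_i(T)/S_i(0)$ as defined above.

```python
# Hypothetical market data for d = 3 risky assets.
S_init = [50.0, 20.0, 10.0]          # initial prices S_i(0)
phi = [4.0, 10.0, 60.0]              # shares held, phi_i >= 0
m = [1.05, 1.08, 1.12]               # expected gross returns m_i
Sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]         # covariance matrix (sigma_ij)

# Budget equation: initial wealth x is the cost of the holdings.
x = sum(f * s for f, s in zip(phi, S_init))

# Portfolio vector: fractions of wealth invested in each security.
pi = [f * s / x for f, s in zip(phi, S_init)]

# Mean and variance of the portfolio return R^pi.
mean_return = sum(w * mi for w, mi in zip(pi, m))
var_return = sum(pi[i] * Sigma[i][j] * pi[j]
                 for i in range(3) for j in range(3))
print(pi, mean_return, var_return)
```

The weights sum to one by construction, which is item (1) of the remark.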
We now need to consider criteria for selecting a portfolio. The basic (by now classical) idea of Markowitz was to look for a balance between risk (i.e. portfolio variance) and return (i.e. portfolio mean). He considered the problem of requiring a lower bound for the portfolio return (minimum return) and then choosing from the corresponding set the portfolio vector with the minimal variance. Alternatively, set an upper bound for the variance and determine the portfolio vector with the highest possible mean return. We consider

Definition 2.2.2. A portfolio is a frontier portfolio if it has the minimum variance among portfolios that have the same expected rate of return. The set of all frontier portfolios is called the portfolio frontier.

We now discuss briefly the assumptions of the mean-variance approach.

(1) A preference for expected return and an aversion to variance is implied by monotonicity and strict concavity of a utility function. However, for arbitrary utility functions, expected utility cannot be defined over just the expected return and variances. For $\mu \in \mathcal{M}$ with mean $m$ assume
$$U(\mu) = \int u(x)\,\mu(dx) = \int \sum_{k=0}^{\infty} \frac{1}{k!}u^{(k)}(m)(x-m)^k\,\mu(dx) = u(m) + \frac{1}{2}u''(m)\mathrm{Var}(\mu) + R_3(\mu)$$
(i.e. convergence of the Taylor series and interchangeability of integral and sum). Thus, the remainder term needs (for the general case) to be considered as well.

(2) Assuming quadratic utility, i.e.
$$u(x) = x - \frac{b}{2}x^2, \quad b > 0,$$
we find
$$U(\mu) = m - \frac{b}{2}m_2 = m - \frac{b}{2}\big(\mathrm{Var}(\mu) + m^2\big),$$
where $m_2 = \int x^2\,\mu(dx)$ is the second moment of $\mu$. Unfortunately, quadratic utility displays satiation (negative marginal utility for wealth above $1/b$) and increasing absolute risk aversion.

(3) For (jointly) normally distributed asset returns, $R^{\pi}$ is again normally distributed, thus preferences can be expressed solely in terms of mean and variance.

Proposition 2.2.1. A portfolio $p$ is a frontier portfolio if and only if the portfolio weight vector $\pi^p$ is a solution to the optimisation problem
$$\min_{\pi} \frac{1}{2}\pi^T\Sigma\pi = \min_{\pi} \frac{1}{2}\sum_i\sum_j \pi_i\pi_j\sigma_{ij} \quad \Big(= \tfrac{1}{2}\mathrm{Var}(R^{\pi})\Big) \qquad (2.13)$$
subject to
$$\pi^T m = \sum_i \pi_i m_i = \mathbb{E}(R^{\pi}) =: m_p, \qquad \pi^T\mathbf{1} = \sum_i \pi_i = 1.$$
Here $m = (m_1, \ldots, m_d)^T$ is the vector of expected returns of the assets, $\mathbf{1}$ is a $d$-vector of ones, and $m_p$ is a fixed rate of portfolio return.

We can solve (2.13) and write the solution as
$$\pi^p = g + h\,m_p \qquad (2.14)$$
with
$$g = \frac{1}{D}\big[B(\Sigma^{-1}\mathbf{1}) - A(\Sigma^{-1}m)\big], \qquad h = \frac{1}{D}\big[C(\Sigma^{-1}m) - A(\Sigma^{-1}\mathbf{1})\big],$$
where $A = \mathbf{1}^T\Sigma^{-1}m = m^T\Sigma^{-1}\mathbf{1}$, $B = m^T\Sigma^{-1}m$, $C = \mathbf{1}^T\Sigma^{-1}\mathbf{1}$ and $D = BC - A^2 > 0$.

For $m_p = 0$ we find from (2.14) that the optimal portfolio is $g$. Also, for $m_p = 1$, (2.14) implies that the optimal portfolio is $g + h$. Now for any given expected return $m_q$ we find
$$\pi^q = g + h\,m_q = (1 - m_q)\,g + m_q\,(g + h).$$
This generalizes to

Proposition 2.2.2 (Two-fund separation). The portfolio frontier can be generated by any two distinct frontier portfolios.

The covariance between the rates of return of any frontier portfolios $p$ and $q$ is
$$\mathrm{Cov}(R^p, R^q) = \frac{C}{D}\Big(m_p - \frac{A}{C}\Big)\Big(m_q - \frac{A}{C}\Big) + \frac{1}{C}. \qquad (2.15)$$
Thus, for the variances
$$\frac{\sigma^2(R^p)}{1/C} - \frac{(m_p - A/C)^2}{D/C^2} = 1,$$
which is a hyperbola in the standard deviation / expected return $(\sigma, \mu)$-space.

We use the following terminology.
(i) The portfolio having the minimum variance of all feasible portfolios is called minimum variance portfolio and denoted as mvp.
(ii) A frontier portfolio is efficient if it has a strictly higher expected return than the mvp.
(iii) Frontier portfolios that are neither mvp nor efficient are called inefficient.
(iv) The efficient frontier is the part of the curve lying above the point of global minimum of standard deviation.

We can use (2.15) to show that for any frontier portfolio $p$ (except the mvp) there exists a unique frontier portfolio $zc(p)$ which has zero covariance with $p$.

Proposition 2.2.3. If $\pi$ is any envelope portfolio, then for any other portfolio (envelope or not) $\tilde\pi$ we have the relation
$$m_{\tilde\pi} = c + \beta_{\tilde\pi}\,(m_{\pi} - c), \qquad (2.16)$$
where
$$\beta_{\tilde\pi} = \frac{\mathrm{Cov}(\pi, \tilde\pi)}{\sigma_{\pi}^2}.$$
Furthermore, $c$ is the expected return of a portfolio $\pi^*$ whose covariance with $\pi$ is zero.
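The closed-form solution (2.14) is easy to evaluate numerically. The sketch below uses hypothetical means and covariances, a small hand-written linear solver in place of a matrix library, and it ignores the constraint $\varphi_i \ge 0$ (so weights may be negative), which is an assumption of this illustration rather than of the notes.

```python
def solve(M, b):
    """Solve M z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical inputs for d = 3 assets.
m = [1.05, 1.08, 1.12]
Sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
ones = [1.0, 1.0, 1.0]

Si_one = solve(Sigma, ones)          # Sigma^{-1} 1
Si_m = solve(Sigma, m)               # Sigma^{-1} m
A, B, C = dot(ones, Si_m), dot(m, Si_m), dot(ones, Si_one)
D = B * C - A * A

g = [(B * a - A * b) / D for a, b in zip(Si_one, Si_m)]
h = [(C * b - A * a) / D for a, b in zip(Si_one, Si_m)]


def frontier_weights(m_p):
    """Frontier portfolio pi^p = g + h * m_p for target mean m_p."""
    return [gi + hi * m_p for gi, hi in zip(g, h)]


def variance(pi):
    return sum(pi[i] * Sigma[i][j] * pi[j] for i in range(3) for j in range(3))


target = 1.10
pi_p = frontier_weights(target)
print(pi_p, variance(pi_p))
```

For any target mean the weights sum to one and hit the target, and the variance agrees with the hyperbola $\sigma^2 = \frac{C}{D}(m_p - A/C)^2 + \frac{1}{C}$ obtained from (2.15) with $p = q$.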
Existence of a risk-free asset has the effect of making the efficient frontier a straight line extending from the risk-free rate to the point where the line is tangential to the original efficient frontier for the risky assets. This leads to the one-fund theorem: There is a single fund (portfolio) such that any efficient portfolio can be constructed as a combination of the fund and the risk-free asset.

The implication of Proposition 2.2.3 in the presence of a risk-free asset is that there exists a linear relationship between any security and portfolios on the efficient frontier involving so-called 'beta factors':
$$m_i - r = \beta_{i,p}\,(m_p - r), \qquad (2.17)$$
where $\beta_{i,p}$ is a linear factor defined as
$$\beta_{i,p} = \mathrm{Cov}(R_i, R_p)/\sigma_p^2.$$
This extends to any portfolio $q$, i.e.
$$m_q - r = \beta_{q,p}\,(m_p - r) \quad \text{with} \quad \beta_{q,p} = \mathrm{Cov}(R_q, R_p)/\sigma_p^2.$$

2.2.2 Capital asset pricing model

We now consider a so-called Equilibrium Model. The focus of attention is turned from the individual investor to the aggregate market for securities (and all investors) as a whole. We need assumptions on the investors' behaviour and the market as a whole.
• All investors have the same one-period horizon.
• All investors can borrow or lend at the same risk-free rate.
• The markets for risky assets are perfect. Information is freely and instantly available to all investors and no investor believes that they can affect the price of a security by their own action.
• Investors have the same estimates of the expected returns, standard deviations and covariances over the one-period horizon.
• All investors measure in the same numéraire.

Under the assumptions of mean-variance theory we have in equilibrium
1. If investors have homogeneous expectations, then they are all faced by the same efficient frontier of risky securities.
2. If there is a risk-free asset the efficient frontier collapses for all investors to a straight line which passes through the risk-free rate of return on the $m$-axis and is tangential to the efficient frontier.

Thus:
1. All investors face the same efficient frontier because they have the same views on the available securities.
2. All rational investors will hold a combination of the risk-free asset and the portfolio of assets where the straight line through the risk-free return touches the original efficient frontier. Other strategies are non-optimal.
3. Because investors share the same efficient frontier they all hold the same diversified portfolio. Because this portfolio is held in different quantities by all investors it must consist of all risky assets in proportion to their market capitalization. It is commonly called the 'market portfolio'.
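The beta relation (2.17) can be checked numerically for the tangency portfolio. The sketch below assumes the standard characterisation of the tangency weights as proportional to $\Sigma^{-1}(m - r\mathbf{1})$, which the notes do not state explicitly; the means, covariances and the gross risk-free return `r` are hypothetical.

```python
def solve(M, b):
    """Solve M z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for k in range(n):
            if k != col:
                f = aug[k][col] / aug[col][col]
                aug[k] = [a - f * c for a, c in zip(aug[k], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


m = [1.05, 1.08, 1.12]               # hypothetical expected gross returns
Sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
r = 1.02                             # hypothetical gross risk-free return

# Tangency weights (assumed form): proportional to Sigma^{-1}(m - r 1).
z = solve(Sigma, [mi - r for mi in m])
w = [zi / sum(z) for zi in z]

m_p = sum(wi * mi for wi, mi in zip(w, m))
cov_with_p = [sum(Sigma[i][j] * w[j] for j in range(3)) for i in range(3)]
var_p = sum(w[i] * cov_with_p[i] for i in range(3))
betas = [c / var_p for c in cov_with_p]

# Excess return of each asset versus beta times the portfolio excess return.
gaps = [(m[i] - r) - betas[i] * (m_p - r) for i in range(3)]
print(betas, gaps)
```

All gaps vanish up to rounding, so each asset's excess return is its beta times the excess return of the tangency portfolio.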
The line denoting the efficient frontier is called the capital market line and its equation is
$$m_p - r = (m_M - r)\,\frac{\sigma_p}{\sigma_M}, \qquad (2.18)$$
where $m_p$ is the expected return of any portfolio $p$ on the efficient frontier, $\sigma_p$ is the standard deviation of the return on portfolio $p$, $m_M$ is the expected return on the market portfolio, $\sigma_M$ is the standard deviation of the market portfolio and $r$ is the risk-free rate of return. Thus the expected return on any portfolio is a linear function of its standard deviation. The factor $(m_M - r)/\sigma_M$ is called the market price of risk.

We can also develop an equation relating the expected return of any asset to the return of the market:
$$m_i - r = (m_M - r)\,\beta_i, \qquad (2.19)$$
where $m_i$ is the expected return on security $i$, $\beta_i$ is the beta factor of security $i$, defined as $\mathrm{Cov}(R_i, R_M)/\mathrm{Var}(R_M)$, $m_M$ is the expected return on the market portfolio and $r$ is the risk-free rate of return. Equation (2.19) is called the security market line. It shows that the expected return of any security (and portfolio) can be expressed as a linear function of the security's covariance with the market as a whole.

2.2.3 Portfolio optimisation and the absence of arbitrage

Consider the standard one-period model with assets $(S_0, \ldots, S_d)$, $S_0$ the risk-free bond with interest rate $r > 0$. Denote by $\varphi = (\varphi_0, \ldots, \varphi_d)$ portfolio vectors specifying the amount of shares of the assets in the portfolio. We consider an investor with utility function $\tilde u$. A rational choice of the investor's portfolio will be based on expected utility
$$\mathbb{E}\big(\tilde u(\varphi \cdot S(T))\big)$$
of the payoff $\varphi \cdot S(T)$ at time $T$, where the portfolio $\varphi$ satisfies the budget constraint
$$\varphi \cdot S(0) \le \omega,$$
with $\omega$ the initial wealth of the investor. We consider the discounted net gain
$$\frac{\varphi \cdot S(T)}{1+r} - \varphi \cdot S(0) = \varphi \cdot Y$$
with $Y = (Y_0, Y_1, \ldots, Y_d)$ and $Y_i = \frac{S_i(T)}{1+r} - S_i(0)$. (Here $Y_0 = 0$!)

For any portfolio $\varphi$ with $\varphi \cdot S(0) < \omega$, adding the risk-free investment $\omega - \varphi \cdot S(0)$ would lead to the strictly better portfolio $(\varphi_0 + \omega - \varphi \cdot S(0), \varphi_1, \ldots, \varphi_d)$. Thus we can focus on $\varphi$ with $\varphi \cdot S(0) = \omega$. Then the payoff is an affine function of the discounted net gain:
$$\varphi \cdot S(T) = (1+r)(\varphi \cdot Y + \omega).$$
Since $Y_0 = 0$ we only need to focus on the risky assets. Define the following transformation of the original utility function $\tilde u$:
$$u(y) = \tilde u\big((1+r)(y + \omega)\big).$$
So the optimization problem is equivalent to maximizing the expected utility $\mathbb{E}(u(\varphi \cdot Y))$ among all $\varphi \in \mathbb{R}^d$ such that $\varphi \cdot Y$ is contained in the domain $D$ of $u$.

Assumption A1. Either
(a) $D = \mathbb{R}$; then we admit all $\varphi \in \mathbb{R}^d$, but assume that $u$ is bounded from above (Example: $u(x) = 1 - e^{-\alpha x}$),
or
(b) $D = [a, \infty)$ for some $a < 0$. In this case we only consider portfolios with $\varphi \cdot Y \ge a$ and assume that the expected utility generated by such portfolios is finite, i.e.
$$\mathbb{E}[u(\varphi \cdot Y)] < \infty \quad \text{for all } \varphi \in \mathbb{R}^d \text{ with } \varphi \cdot Y \ge a \qquad (2.20)$$
(Example: $u(x) = \frac{1}{\gamma}(x - c)^{\gamma}$).

Denote by
$$S(D) = \{\varphi \in \mathbb{R}^d \mid \varphi \cdot Y \in D\}$$
the set of admissible portfolios for $D$. Our aim now is to find a $\varphi^* \in S(D)$ which maximizes the expected utility $\mathbb{E}(u(\varphi \cdot Y))$ among $\varphi \in S(D)$.

Theorem 2.2.1. Let the above assumption A1 hold true. Then there exists a maximizer of the expected utility
$$\mathbb{E}[u(\varphi \cdot Y)], \quad \varphi \in S(D),$$
if and only if the market model is arbitrage-free. Moreover, there exists at most one maximizer if the market model is complete.

Proof. Uniqueness follows from the strict concavity of the function $\varphi \mapsto \mathbb{E}(u(\varphi \cdot Y))$ for complete market models.

If the model admits arbitrage, we find a vector $\xi$ with $\xi \cdot Y \ge 0$ $\mathbb{P}$-a.s. and $\mathbb{P}(\xi \cdot Y > 0) > 0$ with no initial investment. So for $\varphi^*$ optimal
$$\mathbb{E}(u(\varphi^* \cdot Y)) < \mathbb{E}(u((\varphi^* + \xi) \cdot Y)),$$
a contradiction.

Assume now that the market is arbitrage-free. In case the model is incomplete we can find a complete submodel and apply the result to this submodel. So we may assume completeness. (Recall completeness is equivalent to $\eta \cdot Y = 0 \Rightarrow \eta = 0$.) We consider the case $D = [a, \infty)$ for some $a \in (-\infty, 0)$. We show
(i) $S(D)$ is compact,
(ii) $\varphi \mapsto \mathbb{E}[u(\varphi \cdot Y)]$ is continuous.
Clearly, (i) and (ii) imply existence of a maximizer.

To show (i), assume that $(\varphi_n)$ is a diverging sequence in $S(D)$. By choosing a subsequence if necessary, we may assume that $\eta_n = \varphi_n/|\varphi_n|$ converges to some unit vector $\eta \in \mathbb{R}^d$. Then
$$\eta \cdot Y = \lim_{n \to \infty} \frac{\varphi_n \cdot Y}{|\varphi_n|} \ge \lim_{n \to \infty} \frac{a}{|\varphi_n|} = 0 \quad \mathbb{P}\text{-a.s.}$$
Under completeness $\eta \cdot Y = 0$ $\mathbb{P}$-a.s. implies $\eta = 0$. Since $\eta$ is a unit vector, we must have $\mathbb{P}(\eta \cdot Y > 0) > 0$, and so $\tilde\varphi = (-S(0) \cdot \eta, \eta)$ is an arbitrage opportunity, a contradiction.

(ii) To show continuity it suffices to construct an integrable random variable which dominates $u(\varphi \cdot Y)$ for all $\varphi \in S(D)$. Define $\eta \in \mathbb{R}^d$ by
$$\eta_i = 0 \vee \max_{\varphi \in S(D)} \varphi_i < \infty.$$
Then $\eta \cdot S(T) \ge \varphi \cdot S(T)$ for $\varphi \in S(D)$ and hence
$$\varphi \cdot Y = \frac{\varphi \cdot S(T)}{1+r} - \varphi \cdot S(0) \le \frac{\eta \cdot S(T)}{1+r} - 0 \wedge \min_{\xi \in S(D)} \xi \cdot S(0).$$
Now $\eta \cdot Y$ is bounded below by $-\eta \cdot S(0)$ and there exists some $\alpha \in (0,1]$ such that $\alpha\,\eta \cdot S(0) < |a|$. Hence $\alpha\eta \in S(D)$ and by our assumption $\mathbb{E}(u(\alpha\eta \cdot Y)) < \infty$. Applying Lemma 2.2.1 first with $b := \alpha\,S(0) \cdot \eta$ and then with $b := -0 \wedge \min_{\varphi \in S(D)} \varphi \cdot S(0)$ shows that
$$\mathbb{E}\Big[u\Big(\frac{\eta \cdot S(T)}{1+r} - 0 \wedge \min_{\varphi \in S(D)} \varphi \cdot S(0)\Big)\Big] < \infty,$$
so $u(\varphi \cdot Y)$ is dominated by an integrable random variable for all $\varphi \in S(D)$.

Lemma 2.2.1. If $D = [a, \infty)$, $b < |a|$, $0 < \alpha \le 1$, and $X$ is a non-negative random variable, then
$$\mathbb{E}[u(\alpha X - b)] < \infty \ \Rightarrow \ \mathbb{E}[u(X)] < \infty.$$

Proof. The concavity of $u$ implies that $u$ has a decreasing right-continuous derivative $u'$. Hence
$$u(\alpha X - b) = u(-b) + \int_{-b}^{\alpha X - b} u'(x)\,dx \ge u(-b) + \int_0^{\alpha X} u'(y)\,dy = u(-b) + \alpha\int_0^{X} u'(\alpha z)\,dz \ge u(-b) + \alpha\,(u(X) - u(0)).$$
This shows that $u(X)$ can be dominated by a multiple of $u(\alpha X - b)$ plus some constant.

We now turn to a characterization of the solution $\varphi^*$ of the utility maximization problem for continuously differentiable utility functions.

Proposition 2.2.4. Let $u$ be a twice continuously differentiable utility function on $D$ such that $\mathbb{E}(u(\varphi \cdot Y))$ is finite for all $\varphi \in S(D)$. Suppose that $\varphi^*$ is a solution of the utility maximization problem, and that one of the following conditions is satisfied. Either
• $u$ is defined on $D = \mathbb{R}$ and bounded from above, or
• $u$ is defined on $D = [a, \infty)$ and $\varphi^*$ is an interior point of $S(D)$.
Then
$$u'(\varphi^* \cdot Y)\,|Y| \in L^1(\mathbb{P})$$
and the following first-order condition holds:
$$\mathbb{E}[u'(\varphi^* \cdot Y) \cdot Y] = 0.$$

Proof. For $\varphi \in S(D)$ and $\varepsilon \in (0,1]$ let $\varphi_\varepsilon = \varepsilon\varphi + (1-\varepsilon)\varphi^*$ and define
$$\Delta_\varepsilon = \frac{u(\varphi_\varepsilon \cdot Y) - u(\varphi^* \cdot Y)}{\varepsilon}.$$
The concavity of $u$ implies that $\Delta_\varepsilon \ge \Delta_\delta$ for $\varepsilon \le \delta$, and so
$$\Delta_\varepsilon \uparrow u'(\varphi^* \cdot Y)\,(\varphi - \varphi^*) \cdot Y \quad \text{as } \varepsilon \downarrow 0. \qquad (2.21)$$
Since $\Delta_1 = u(\varphi \cdot Y) - u(\varphi^* \cdot Y) \in L^1(\mathbb{P})$, monotone convergence and the optimality of $\varphi^*$ yield
$$0 \ge \mathbb{E}(\Delta_\varepsilon) \uparrow \mathbb{E}\big[u'(\varphi^* \cdot Y)\,(\varphi - \varphi^*) \cdot Y\big] \quad \text{as } \varepsilon \downarrow 0. \qquad (2.22)$$
In particular, the expectation on the right-hand side is finite. Both sets of assumptions imply that $\varphi^*$ is an interior point of $S(D)$. Hence from (2.22) we find by letting $\eta = \varphi - \varphi^*$ that
$$\mathbb{E}(u'(\varphi^* \cdot Y)\,\eta \cdot Y) \le 0$$
for all $\eta$ in a small ball centered in the origin of $\mathbb{R}^d$. Replacing $\eta$ by $-\eta$ shows that the expectation must vanish, i.e.
$$\mathbb{E}(u'(\varphi^* \cdot Y)\,\eta \cdot Y) = 0 \quad \forall \eta \text{ in a small ball around the origin.}$$

We can now give a characterisation of an equivalent risk-neutral measure.
Corollary 2.2.1. Suppose that the market model is arbitrage-free and that the assumptions of Proposition 2.2.4 are satisfied for a utility function $u : D \to \mathbb{R}$. Let $\varphi^*$ be a maximizer of the expected utility. Then
$$\frac{d\mathbb{P}^*}{d\mathbb{P}} = \frac{u'(\varphi^* \cdot Y)}{\mathbb{E}(u'(\varphi^* \cdot Y))} \qquad (2.23)$$
defines an equivalent risk-neutral measure.

Proof. Recall that $\mathbb{E}^*(Y) = 0$ is the criterion for risk-neutrality, which is satisfied by Proposition 2.2.4. Hence $\mathbb{P}^*$ is a risk-neutral measure if it is well-defined, i.e. $u'(\varphi^* \cdot Y) \in L^1(\mathbb{P})$. Let
$$c = \sup\{u'(x) \mid x \in D \text{ and } |x| \le |\varphi^*|\} \le \begin{cases} u'(a) & D = [a, \infty), \\ u'(-|\varphi^*|) & D = \mathbb{R}, \end{cases}$$
which is finite by our assumption that $u$ is continuously differentiable on $D$. So
$$0 \le u'(\varphi^* \cdot Y) \le c + u'(\varphi^* \cdot Y)\,|Y|\,1_{\{|Y| \ge 1\}}$$
and the right-hand side has finite expectation.

Remark 2.2.2. We can now give a constructive proof of the first fundamental theorem of asset pricing. Suppose the model is arbitrage-free.
(i) If $Y$ is $\mathbb{P}$-a.s. bounded, then so is $u'(\varphi^* \cdot Y)$ and the measure $\mathbb{P}^*$ is an equivalent martingale measure with a bounded density.
(ii) If $Y$ is unbounded we may consider the bounded random vector
$$\tilde Y = \frac{Y}{1 + |Y|},$$
which also satisfies the no-arbitrage condition. Let $\tilde\varphi^*$ be a maximiser of the expected utility $\mathbb{E}(u(\varphi \cdot \tilde Y))$. Then an equivalent martingale measure is given by $\mathbb{P}^*$ defined via the bounded density
$$\frac{d\mathbb{P}^*}{d\mathbb{P}} = c\,\frac{u'(\tilde\varphi^* \cdot \tilde Y)}{1 + |Y|},$$
where $c$ is an appropriate constant.
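The construction of Corollary 2.2.1 can be carried out by hand in a small finite model. The sketch below makes several assumptions for illustration only: one risky asset, three scenarios with hypothetical discounted net gains `Y` and probabilities `P`, the exponential utility $u(x) = 1 - e^{-x}$ (so $D = \mathbb{R}$ and $u$ is bounded above), and bisection to solve the first-order condition $\mathbb{E}[u'(\varphi Y)Y] = 0$.

```python
import math

# Hypothetical one-asset model: discounted net gain Y on three scenarios.
Y = [-0.10, 0.05, 0.20]
P = [0.3, 0.4, 0.3]


def first_order_condition(phi):
    """E[u'(phi * Y) * Y] for u(x) = 1 - exp(-x), i.e. u'(x) = exp(-x)."""
    return sum(p * math.exp(-phi * y) * y for p, y in zip(P, Y))


# The map phi -> E[u'(phi Y) Y] is strictly decreasing, so bisection applies.
lo, hi = 0.0, 100.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if first_order_condition(mid) > 0:
        lo = mid
    else:
        hi = mid
phi_star = 0.5 * (lo + hi)

# Density (2.23): dP*/dP = u'(phi* Y) / E[u'(phi* Y)].
marginal = [math.exp(-phi_star * y) for y in Y]
norm = sum(p * w for p, w in zip(P, marginal))
Q = [p * w / norm for p, w in zip(P, marginal)]   # scenario probabilities under P*

expected_gain_under_Q = sum(q * y for q, y in zip(Q, Y))
print(phi_star, Q, expected_gain_under_Q)
```

The new probabilities are strictly positive, sum to one, and give the discounted net gain zero expectation, which is the risk-neutrality criterion $\mathbb{E}^*(Y) = 0$ used in the proof.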
Chapter 3

Discrete-time models of financial markets

3.1 The model

We will study so-called finite markets, i.e. discrete-time models of financial markets in which all relevant quantities take a finite number of values. Following the approach of Harrison and Pliska (1981) and Taqqu and Willinger (1987), it suffices, to illustrate the ideas, to work with a finite probability space $(\Omega, \mathcal{F}, \mathbb{P})$, with a finite number $|\Omega|$ of points $\omega$, each with positive probability: $\mathbb{P}(\{\omega\}) > 0$.

We specify a time horizon $T$, which is the terminal date for all economic activities considered. (For a simple option pricing model the time horizon typically corresponds to the expiry date of the option.)

As before, we use a filtration $\mathbb{F} = \{\mathcal{F}_t\}_{t=0}^{T}$ consisting of $\sigma$-algebras $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \cdots \subset \mathcal{F}_T$: we take $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the trivial $\sigma$-field, and $\mathcal{F}_T = \mathcal{F} = \mathcal{P}(\Omega)$ (here $\mathcal{P}(\Omega)$ is the power-set of $\Omega$, the class of all $2^{|\Omega|}$ subsets of $\Omega$: we need every possible subset, as they all, apart from the empty set, carry positive probability).

The financial market contains $d + 1$ financial assets. The usual interpretation is to assume one risk-free asset (bond, bank account) labelled 0, and $d$ risky assets (stocks, say) labelled 1 to $d$. While the reader may keep this interpretation as a mental picture, we prefer not to use it directly. The prices of the assets at time $t$ are random variables, $S_0(t,\omega), S_1(t,\omega), \ldots, S_d(t,\omega)$ say, non-negative and $\mathcal{F}_t$-measurable (i.e. adapted: at time $t$, we know the prices $S_i(t)$). We write $S(t) = (S_0(t), S_1(t), \ldots, S_d(t))'$ for the vector of prices at time $t$. Hereafter we refer to the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, the set of trading dates, the price process $S$ and the information structure $\mathbb{F}$, which is typically generated by the price process $S$, together as a securities market model.

It will be essential to assume that the price process of at least one asset follows a strictly positive process.

Definition 3.1.1. A numéraire is a price process $(X(t))_{t=0}^{T}$ (a sequence of random variables), which is strictly positive for all $t \in \{0, 1, \ldots, T\}$.

For the standard approach the risk-free bank account process is used as numéraire. In some applications, however, it is more convenient to use a security other than the bank account and we therefore just use $S_0$ without further specification as a numéraire. We furthermore take $S_0(0) = 1$ (that is, we reckon in units of the initial value of our numéraire), and define $\beta(t) := 1/S_0(t)$ as a discount factor.

A trading strategy (or dynamic portfolio) $\varphi$ is an $\mathbb{R}^{d+1}$ vector stochastic process $\varphi = (\varphi(t))_{t=1}^{T} = ((\varphi_0(t,\omega), \varphi_1(t,\omega), \ldots, \varphi_d(t,\omega))')_{t=1}^{T}$ which is predictable (or previsible): each $\varphi_i(t)$ is $\mathcal{F}_{t-1}$-measurable for $t \ge 1$. Here $\varphi_i(t)$ denotes the number of shares of asset $i$ held in the portfolio at time $t$, to be determined on the basis of information available before time $t$; i.e. the investor selects his time $t$ portfolio after observing the prices $S(t-1)$. However, the portfolio $\varphi(t)$ must be established before, and held until after, announcement of the prices $S(t)$. The components $\varphi_i(t)$ may assume negative as well as positive values, reflecting the fact that we allow short sales and assume that the assets are perfectly divisible.

Definition 3.1.2. The value of the portfolio at time $t$ is the scalar product
$$V_\varphi(t) = \varphi(t) \cdot S(t) := \sum_{i=0}^{d} \varphi_i(t)S_i(t), \quad (t = 1, 2, \ldots, T) \quad \text{and} \quad V_\varphi(0) = \varphi(1) \cdot S(0).$$
The process $V_\varphi(t,\omega)$ is called the wealth or value process of the trading strategy $\varphi$. The initial wealth $V_\varphi(0)$ is called the initial investment or endowment of the investor.

Now $\varphi(t) \cdot S(t-1)$ reflects the market value of the portfolio just after it has been established at time $t-1$, whereas $\varphi(t) \cdot S(t)$ is the value just after time $t$ prices are observed, but before changes are made in the portfolio. Hence
$$\varphi(t) \cdot (S(t) - S(t-1)) = \varphi(t) \cdot \Delta S(t)$$
is the change in the market value due to changes in security prices which occur between time $t-1$ and $t$. This motivates:

Definition 3.1.3. The gains process $G_\varphi$ of a trading strategy $\varphi$ is given by
$$G_\varphi(t) := \sum_{\tau=1}^{t} \varphi(\tau) \cdot (S(\tau) - S(\tau-1)) = \sum_{\tau=1}^{t} \varphi(\tau) \cdot \Delta S(\tau), \quad (t = 1, 2, \ldots, T).$$

Observe the (for now) formal similarity of the gains process $G_\varphi$ from trading in $S$ following a trading strategy $\varphi$ to the martingale transform of $S$ by $\varphi$.

Define $\tilde S(t) = (1, \beta(t)S_1(t), \ldots, \beta(t)S_d(t))'$, the vector of discounted prices, and consider the discounted value process
$$\tilde V_\varphi(t) = \beta(t)(\varphi(t) \cdot S(t)) = \varphi(t) \cdot \tilde S(t), \quad (t = 1, 2, \ldots, T)$$
and the discounted gains process
$$\tilde G_\varphi(t) := \sum_{\tau=1}^{t} \varphi(\tau) \cdot (\tilde S(\tau) - \tilde S(\tau-1)) = \sum_{\tau=1}^{t} \varphi(\tau) \cdot \Delta\tilde S(\tau), \quad (t = 1, 2, \ldots, T).$$
Observe that the discounted gains process reflects the gains from trading with assets 1 to $d$ only, which in case of the standard model (a bank account and $d$ stocks) are the risky assets.

We will only consider special classes of trading strategies.

Definition 3.1.4. The strategy $\varphi$ is self-financing, $\varphi \in \Phi$, if
$$\varphi(t) \cdot S(t) = \varphi(t+1) \cdot S(t) \quad (t = 1, 2, \ldots, T-1). \qquad (3.1)$$

Interpretation. When new prices $S(t)$ are quoted at time $t$, the investor adjusts his portfolio from $\varphi(t)$ to $\varphi(t+1)$, without bringing in or consuming any wealth.

The following result (which is trivial in our current setting, but requires a little argument in continuous time) shows that renormalising security prices (i.e. changing the numéraire) has essentially no economic effects.

Proposition 3.1.1 (Numéraire Invariance). Let $X(t)$ be a numéraire. A trading strategy $\varphi$ is self-financing with respect to $S(t)$ if and only if $\varphi$ is self-financing with respect to $X(t)^{-1}S(t)$.
Proof. Since $X(t)$ is strictly positive for all $t = 0, 1, \ldots, T$ we have the following equivalence, which implies the claim:
$$\varphi(t) \cdot S(t) = \varphi(t+1) \cdot S(t) \quad (t = 1, 2, \ldots, T-1)$$
$$\Leftrightarrow \quad \varphi(t) \cdot X(t)^{-1}S(t) = \varphi(t+1) \cdot X(t)^{-1}S(t) \quad (t = 1, 2, \ldots, T-1).$$

Corollary 3.1.1. A trading strategy $\varphi$ is self-financing with respect to $S(t)$ if and only if $\varphi$ is self-financing with respect to $\tilde S(t)$.

We now give a characterisation of self-financing strategies in terms of the discounted processes.

Proposition 3.1.2. A trading strategy $\varphi$ belongs to $\Phi$ if and only if
$$\tilde V_\varphi(t) = V_\varphi(0) + \tilde G_\varphi(t), \quad (t = 0, 1, \ldots, T). \qquad (3.2)$$

Proof. Assume $\varphi \in \Phi$. Then using the defining relation (3.1), the numéraire invariance theorem and the fact that $S_0(0) = 1$,
$$\begin{aligned}
V_\varphi(0) + \tilde G_\varphi(t) &= \varphi(1) \cdot S(0) + \sum_{\tau=1}^{t} \varphi(\tau) \cdot (\tilde S(\tau) - \tilde S(\tau-1)) \\
&= \varphi(1) \cdot \tilde S(0) + \varphi(t) \cdot \tilde S(t) + \sum_{\tau=1}^{t-1} (\varphi(\tau) - \varphi(\tau+1)) \cdot \tilde S(\tau) - \varphi(1) \cdot \tilde S(0) \\
&= \varphi(t) \cdot \tilde S(t) = \tilde V_\varphi(t).
\end{aligned}$$
Assume now that (3.2) holds true. By the numéraire invariance theorem it is enough to show the discounted version of relation (3.1). Written out for $t = 2$, (3.2) is
$$\varphi(2) \cdot \tilde S(2) = \varphi(1) \cdot \tilde S(0) + \varphi(1) \cdot (\tilde S(1) - \tilde S(0)) + \varphi(2) \cdot (\tilde S(2) - \tilde S(1)).$$
Subtracting $\varphi(2) \cdot \tilde S(2)$ on both sides gives $\varphi(2) \cdot \tilde S(1) = \varphi(1) \cdot \tilde S(1)$, which is (3.1) for $t = 1$. Proceeding similarly, or by induction, we can show $\varphi(t) \cdot \tilde S(t) = \varphi(t+1) \cdot \tilde S(t)$ for $t = 2, \ldots, T-1$ as required.

We are allowed to borrow (so $\varphi_0(t)$ may be negative) and sell short (so $\varphi_i(t)$ may be negative for $i = 1, \ldots, d$). So it is hardly surprising that if we decide what to do about the risky assets and fix an initial endowment, the numéraire will take care of itself, in the following sense.

Proposition 3.1.3. If $(\varphi_1(t), \ldots, \varphi_d(t))'$ is predictable and $V_0$ is $\mathcal{F}_0$-measurable, there is a unique predictable process $(\varphi_0(t))_{t=1}^{T}$ such that $\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)'$ is self-financing with initial value of the corresponding portfolio $V_\varphi(0) = V_0$.

Proof. If $\varphi$ is self-financing, then by Proposition 3.1.2,
$$\tilde V_\varphi(t) = V_0 + \tilde G_\varphi(t) = V_0 + \sum_{\tau=1}^{t} \big(\varphi_1(\tau)\Delta\tilde S_1(\tau) + \ldots + \varphi_d(\tau)\Delta\tilde S_d(\tau)\big).$$
On the other hand,
$$\tilde V_\varphi(t) = \varphi(t) \cdot \tilde S(t) = \varphi_0(t) + \varphi_1(t)\tilde S_1(t) + \ldots + \varphi_d(t)\tilde S_d(t).$$
Equate these:
$$\varphi_0(t) = V_0 + \sum_{\tau=1}^{t} \big(\varphi_1(\tau)\Delta\tilde S_1(\tau) + \ldots + \varphi_d(\tau)\Delta\tilde S_d(\tau)\big) - \big(\varphi_1(t)\tilde S_1(t) + \ldots + \varphi_d(t)\tilde S_d(t)\big),$$
which defines $\varphi_0(t)$ uniquely. The terms in $\tilde S_i(t)$ are
$$\varphi_i(t)\Delta\tilde S_i(t) - \varphi_i(t)\tilde S_i(t) = -\varphi_i(t)\tilde S_i(t-1),$$
which is $\mathcal{F}_{t-1}$-measurable. So
$$\varphi_0(t) = V_0 + \sum_{\tau=1}^{t-1} \big(\varphi_1(\tau)\Delta\tilde S_1(\tau) + \ldots + \varphi_d(\tau)\Delta\tilde S_d(\tau)\big) - \big(\varphi_1(t)\tilde S_1(t-1) + \ldots + \varphi_d(t)\tilde S_d(t-1)\big),$$
where, as $\varphi_1, \ldots, \varphi_d$ are predictable, all terms on the right-hand side are $\mathcal{F}_{t-1}$-measurable, so $\varphi_0$ is predictable.

Remark 3.1.1. Proposition 3.1.3 has a further important consequence: for defining a gains process $\tilde G_\varphi$ only the components $(\varphi_1(t), \ldots, \varphi_d(t))'$ are needed. If we require them to be predictable they correspond in a unique way (after fixing initial endowment) to a self-financing trading strategy. Thus for the discounted world predictable strategies and final cash-flows generated by them are all that matters.

We now turn to the modelling of derivative instruments in our current framework. This is done in the following fashion.

Definition 3.1.5. A contingent claim $X$ with maturity date $T$ is an arbitrary $\mathcal{F}_T = \mathcal{F}$-measurable random variable (which is by the finiteness of the probability space bounded). We denote the class of all contingent claims by $L^0 = L^0(\Omega, \mathcal{F}, \mathbb{P})$.

The notation $L^0$ for contingent claims is motivated by them being simply random variables in our context (and the functional-analytic spaces used later on).

A typical example of a contingent claim $X$ is an option on some underlying asset $S$; then (e.g. for the case of a European call option with maturity date $T$ and strike $K$) we have a functional relation $X = f(S)$ with some function $f$ (e.g. $X = (S(T) - K)^+$). The general definition allows for more complicated relationships which are captured by the $\mathcal{F}_T$-measurability of $X$ (recall that $\mathcal{F}_T$ is typically generated by the process $S$).

3.2 Existence of Equivalent Martingale Measures

3.2.1 The No-Arbitrage Condition

The central principle in the single period example was the absence of arbitrage opportunities, i.e. the absence of investment strategies for making profits without exposure to risk. As mentioned there this principle is central for any market model, and we now define the mathematical counterpart of this economic principle in our current setting.

Definition 3.2.1. Let $\tilde\Phi \subset \Phi$ be a set of self-financing strategies. A strategy $\varphi \in \tilde\Phi$ is called an arbitrage opportunity or arbitrage strategy with respect to $\tilde\Phi$ if $\mathbb{P}\{V_\varphi(0) = 0\} = 1$, and the terminal wealth of $\varphi$ satisfies
$$\mathbb{P}\{V_\varphi(T) \ge 0\} = 1 \quad \text{and} \quad \mathbb{P}\{V_\varphi(T) > 0\} > 0.$$

So an arbitrage opportunity is a self-financing strategy with zero initial value, which produces a non-negative final value with probability one and has a positive probability of a positive final value. Observe that arbitrage opportunities are always defined with respect to a certain class of trading strategies.

Definition 3.2.2. We say that a security market $\mathcal{M}$ is arbitrage-free if there are no arbitrage opportunities in the class $\Phi$ of trading strategies.
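The construction in the proof of Proposition 3.1.3 can be traced along a single price path. The sketch below is illustrative only: one risky asset, a deterministic numéraire $S_0(t) = (1+r)^t$, and hypothetical prices, holdings and endowment. In a genuine model each $\varphi_1(t)$ would have to be $\mathcal{F}_{t-1}$-measurable; on one fixed scenario this plays no role.

```python
r = 0.05
T = 3
S0 = [(1 + r) ** t for t in range(T + 1)]   # numeraire with S0(0) = 1
S1 = [100.0, 110.0, 99.0, 108.0]            # risky price along one scenario
phi1 = {1: 2.0, 2: -1.0, 3: 3.0}            # risky holdings phi_1(t), t = 1..T
V0 = 50.0                                   # initial endowment

S1_disc = [s / b for s, b in zip(S1, S0)]   # discounted risky price


def disc_gains(t):
    """Discounted gains up to time t from the risky holdings."""
    return sum(phi1[k] * (S1_disc[k] - S1_disc[k - 1]) for k in range(1, t + 1))


def phi0(t):
    """Numeraire holding from the proof: uses information up to t - 1 only."""
    return V0 + disc_gains(t - 1) - phi1[t] * S1_disc[t - 1]


def value(holding_date, price_date):
    """Value of the portfolio phi(holding_date) at the prices of price_date."""
    return phi0(holding_date) * S0[price_date] + phi1[holding_date] * S1[price_date]


for t in range(1, T):
    print(t, value(t, t), value(t + 1, t))   # equal: self-financing at time t
```

The two printed values coincide at every rebalancing date, the initial value equals `V0`, and the discounted value equals `V0` plus the discounted gains, as in (3.2).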
If an equivalent martingale measure exists . The fundamental insight in the singleperiod example was the equivalence of the noarbitrage condition and the existence of riskneutral probabilities.e. F Proof. A probability measure I ∗ on (Ω. P I ∗ (Vϕ (T )) = I ∗ (Vϕ (0)).3. Deﬁnition 3.2. ˜ Proposition 3. in our ﬁnitedimensional setting: a selfﬁnancing trading strategy ϕ ∈ Φ is an arbitrage opportunity if Vϕ (0) = 0. F. Proof. but Vϕ (T ) ≥ 0 (by deﬁnition).that is. i. Assume such a I ∗ exists.) ++ Using L0 we can write the arbitrage condition more compactly as ++ ˜ Vϕ (0) = Vϕ (0) = 0 ⇒ ˜ Vϕ (T ) ∈ L0 (Ω. . T ).2. ˜ Proposition 3. Recall the deﬁnition of arbitrage. We denote P F ˜ by P(S) the class of equivalent martingale measures. If the market M is arbitragefree. Vϕ (T. This and P P P ˜ ˜ Vϕ (T ) ≥ 0 force Vϕ (T ) = 0.1. ω) > 0. Then the wealth process V ﬁltration I . 1. For any selfﬁnancing strategy ϕ. then the class P(S) of equivalent martingale measures is nonempty. The next result is the key for the further development. Also each E ˜ ∗ I ({ω}) > 0 (by assumption. So no arbitrage is possible. P ++ (Observe that L0 is a cone closed under vector addition and multiplication by positive scalars.CHAPTER 3. if P(S) = ∅ – then the market M is arbitragefree. I ). ˜ ˜ By Proposition 3.2. For the multiperiod case we now use probabilistic machinery to establish the corresponding result.2. E ˜ E ˜ If the strategy is an arbitrage opportunity its initial value – the righthand side above – is ˜ zero. we have ˜ ˜ Vϕ (t) = Vϕ (0) + Gϕ (t) (t = 0. P ++ . F.e. I ) the set of random variables on (Ω. So the P initial and ﬁnal I ∗ expectations are the same. the martingale transform theorem is applicable without further restrictions. Therefore the lefthand side I ∗ (Vϕ (T )) is zero.2)). we have as before P t ˜ Vϕ (t) = Vϕ (0) + τ =1 ˜ ϕ(τ ) · ∆S(τ ). Deﬁnition 3. ω) ≥ 0 ∀ω ∈ Ω and there exists a ω ∈ Ω with Vϕ (T. Now call L0 = L0 (Ω. ˜ Proposition 3.2. 
I ) := {X ∈ L0 : X(ω) ≥ 0 ∀ω ∈ Ω and ∃ ω ∈ Ω such that X(ω) > 0}. So ˜ ˜ ˜ ˜ ˜ ˜ Vϕ (t + 1) − Vϕ (t) = Gϕ (t + 1) − Gϕ (t) = ϕ(t + 1) · (S(t + 1) − S(t)). F) and P L0 (Ω.2.2. Let I ∗ be an equivalent martingale measure (I ∗ ∈ P(S)) and ϕ ∈ Φ any P P ˜ϕ (t) is a I ∗ martingale with respect to the P selfﬁnancing strategy. F. each I ({ω}) > 0. ˜ ˜ So for ϕ ∈ Φ. so by equivalence each I ∗ ({ω}) > 0). By the selfﬁnancing property of ϕ (compare Proposition 3.1. FT ) equivalent to I is called a martingale P P ˜ ˜ measure for S if the process S follows a I ∗ martingale with respect to the ﬁltration I . . . S(t) a (vector) I ∗ martingale implies Vϕ (t) is a P ∗ martingale. For the proof (for which we follow Schachermayer (2000) we need some auxiliary observations. (3.1.4. DISCRETETIME MODELS 43 We will allow ourselves to use ‘noarbitrage’ in place of ‘arbitragefree’ when convenient.1.2. Vϕ (t) is the martingale transform of the I ∗ martingale S by ϕ (see Theorem C. .1) P and hence a I ∗ martingale itself.3. i. P Observe that in our setting all processes are bounded.
for any self-financing strategy $\varphi$.

The next lemma formulates the arbitrage condition in terms of discounted gains processes. The important advantage in using this setting (rather than a setting in terms of value processes) is that we only have to assume predictability of a vector process $(\varphi_1, \ldots, \varphi_d)$. Recall Remark 3.1.1 and Proposition 3.1.3 here: we can choose a process $\varphi_0$ in such a way that the strategy $\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)$ has zero initial value and is self-financing.

Lemma 3.2.1. In an arbitrage-free market any predictable vector process $\varphi = (\varphi_1, \ldots, \varphi_d)$ satisfies
\[ \tilde{G}_\varphi(T) \notin L^0_{++}(\Omega, \mathcal{F}, \mathbb{P}). \]
(Observe the slight abuse of notation: for the value of the discounted gains process the zeroth component of a trading strategy doesn't matter. Hence we use the operator $\tilde{G}$ for $d$-dimensional vectors as well.)

Proof. Assume $\tilde{G}_\varphi(T) \in L^0_{++}(\Omega, \mathcal{F}, \mathbb{P})$. By Proposition 3.1.3 there exists a unique predictable process $(\varphi_0(t))$ such that $\varphi = (\varphi_0, \varphi_1, \ldots, \varphi_d)$ has zero initial value and is self-financing. Then using Proposition 3.1.2,
\[ V_\varphi(T) = \beta(T)^{-1}\tilde{V}_\varphi(T) = \beta(T)^{-1}(V_\varphi(0) + \tilde{G}_\varphi(T)) = \beta(T)^{-1}\tilde{G}_\varphi(T) \geq 0, \]
and is positive somewhere (i.e. with positive probability) by definition of $L^0_{++}$. Hence $\varphi$ is an arbitrage opportunity with respect to $\Phi$. This contradicts the assumption that the market is arbitrage-free.

We now define the space of contingent claims, i.e. random variables on $(\Omega, \mathcal{F})$, which an economic agent may replicate with zero initial investment by pursuing some predictable trading strategy $\varphi$.

Definition 3.2.2. We call the subspace $K$ of $L^0(\Omega, \mathcal{F}, \mathbb{P})$ defined by
\[ K = \{X \in L^0(\Omega, \mathcal{F}, \mathbb{P}) : X = \tilde{G}_\varphi(T),\ \varphi \text{ predictable}\} \]
the set of contingent claims attainable at price 0.

We can now restate Lemma 3.2.1 in terms of spaces: a market is arbitrage-free if and only if
\[ K \cap L^0_{++}(\Omega, \mathcal{F}, \mathbb{P}) = \emptyset. \tag{3.3} \]

Proof of Proposition 3.2.3. Since our market model is finite we can use results from Euclidean geometry; in particular we can identify $L^0$ with $\mathbb{R}^{|\Omega|}$. By assumption we have (3.3), i.e. $K$ and $L^0_{++}$ do not intersect. So $K$ does not meet the subset
\[ D := \Big\{X \in L^0_{++} : \sum_{\omega \in \Omega} X(\omega) = 1\Big\}. \]
Now $D$ is a compact convex set. By the separating hyperplane theorem, there is a vector $\lambda = (\lambda(\omega) : \omega \in \Omega)$ such that for all $X \in D$
\[ \lambda \cdot X := \sum_{\omega \in \Omega} \lambda(\omega) X(\omega) > 0, \tag{3.4} \]
but for all $\tilde{G}_\varphi(T)$ in $K$,
\[ \lambda \cdot \tilde{G}_\varphi(T) = \sum_{\omega \in \Omega} \lambda(\omega)\tilde{G}_\varphi(T)(\omega) = 0. \tag{3.5} \]
Choosing each $\omega \in \Omega$ successively and taking $X$ to be 1 on this $\omega$ and zero elsewhere, (3.4) tells us that each $\lambda(\omega) > 0$. So
\[ \mathbb{P}^*(\{\omega\}) := \frac{\lambda(\omega)}{\sum_{\omega' \in \Omega} \lambda(\omega')} \]
defines a probability measure equivalent to $\mathbb{P}$ (no non-empty null sets). With $\mathbb{E}^*$ as $\mathbb{P}^*$-expectation, (3.5) says that
\[ \mathbb{E}^*\big(\tilde{G}_\varphi(T)\big) = 0, \]
i.e.
\[ \mathbb{E}^*\Big(\sum_{\tau=1}^{T} \varphi(\tau) \cdot \Delta\tilde{S}(\tau)\Big) = 0. \]
In particular, choosing for each $i$ to hold only stock $i$,
\[ \mathbb{E}^*\Big(\sum_{\tau=1}^{T} \varphi_i(\tau)\,\Delta\tilde{S}_i(\tau)\Big) = 0 \qquad (i = 1, \ldots, d). \]
Since this holds for any predictable $\varphi$ (boundedness holds automatically as $\Omega$ is finite), the martingale transform lemma tells us that the discounted price processes $(\tilde{S}_i(t))$ are $\mathbb{P}^*$-martingales.

Note. Our situation is finite-dimensional, so all we have used here is Euclidean geometry. We have a subspace, and a cone not meeting the subspace except at the origin. Take $\lambda$ orthogonal to the subspace on the same side of the subspace as the cone. The separating hyperplane theorem holds also in infinite-dimensional situations, where it is a form of the Hahn-Banach theorem of functional analysis.

We now combine Propositions 3.2.2 and 3.2.3 as a first central theorem in this chapter.

Theorem 3.2.1 (No-Arbitrage Theorem). The market $\mathcal{M}$ is arbitrage-free if and only if there exists a probability measure $\mathbb{P}^*$ equivalent to $\mathbb{P}$ under which the discounted $d$-dimensional asset price process $\tilde{S}$ is a $\mathbb{P}^*$-martingale.

3.2.2 Risk-Neutral Pricing

We now turn to the main underlying question of this text, namely the pricing of contingent claims (i.e. financial derivatives). As in chapter 1 the basic idea is to reproduce the cash flow of a contingent claim in terms of a portfolio of the underlying assets. On the other hand, the equivalence of the no-arbitrage condition and the existence of risk-neutral probability measures implies the possibility of using risk-neutral measures for pricing purposes. We will explore the relation of these two approaches in this subsection.

We say that a contingent claim is attainable if there exists a replicating strategy $\varphi \in \Phi$ such that
\[ V_\varphi(T) = X. \]
So the replicating strategy generates the same time $T$ cash-flow as does $X$. Working with discounted values (recall we use $\beta$ as the discount factor) we find
\[ \beta(T)X = \tilde{V}_\varphi(T) = V(0) + \tilde{G}_\varphi(T). \tag{3.6} \]
So the discounted value of a contingent claim is given by the initial cost of setting up a replication strategy and the gains from trading. In a highly efficient security market we expect that the law of one price holds true, that is, for a specified cash-flow there exists only one price at any time instant. Otherwise arbitrageurs would use the opportunity to cash in a riskless profit. So the no-arbitrage condition implies that for an attainable contingent claim its time $t$ price must be given by the value (initial cost) of any replicating strategy (we say the claim is uniquely replicated in that case). This is the basic idea of the arbitrage pricing theory.
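The No-Arbitrage Theorem can be illustrated numerically in a small one-period market. The sketch below is only an illustration under stated assumptions: the two markets (three states, two stocks, $r = 0$) and the helper names `solve_linear` and `martingale_weights` are hypothetical and not part of the notes. Because there are as many equations as states here, the martingale conditions have exactly one solution, so checking its sign settles whether an equivalent martingale measure exists; in larger models the candidate weights would not be unique.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve the square system a x = b exactly by Gauss-Jordan elimination."""
    n = len(a)
    # Augmented matrix with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick a row with a non-zero entry in this column (assumes a is invertible).
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def martingale_weights(r, s0, payoffs):
    """Weights q with sum(q) = 1 and sum(q * S_i(1)) = (1 + r) * S_i(0) for each stock i."""
    n_states = len(payoffs[0])
    a = [[1] * n_states] + [list(p) for p in payoffs]
    b = [1] + [(1 + Fraction(r)) * s for s in s0]
    return solve_linear(a, b)


# Hypothetical market 1: three states, two stocks, interest rate r = 0.
q_good = martingale_weights(0, [10, 10], [(12, 10, 8), (14, 9, 8)])
# Hypothetical market 2: stock 2 never ends below its initial price.
q_bad = martingale_weights(0, [10, 10], [(12, 10, 8), (13, 11, 10)])

print(q_good, all(q > 0 for q in q_good))
print(q_bad, all(q > 0 for q in q_bad))
```

In the first market all weights are strictly positive, so they define an equivalent martingale measure. In the second market the only candidate has negative entries, which matches the obvious arbitrage there: buying stock 2 with borrowed money costs nothing and pays $(3, 1, 0)$.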
Let us investigate replicating strategies a bit further. The idea is to replicate a given cash-flow at a given point in time. Using a self-financing trading strategy the investor's wealth may go negative at time $t < T$, but he must be able to cover his debt at the final date. To avoid negative wealth the concept of admissible strategies is introduced.

Definition 3.2.3. A self-financing trading strategy $\varphi \in \Phi$ is called admissible if $V_\varphi(t) \geq 0$ for each $t = 0, 1, \ldots, T$. We write $\Phi_a$ for the class of admissible trading strategies.

The modelling assumption of admissible strategies reflects the economic fact that the broker should be protected from unbounded short sales. In our current setting all processes are bounded anyway, so this distinction is not really needed and we use self-financing strategies when addressing the mathematical aspects of the theory. (In fact one can show that a security market which is arbitrage-free with respect to $\Phi_a$ is also arbitrage-free with respect to $\Phi$; see Exercises.)

We now return to the main question of the section: given a contingent claim $X$, i.e. a cash-flow at time $T$, how can we determine its value (price) at time $t < T$? For an attainable contingent claim this value should be given by the value of any replicating strategy at time $t$, i.e. there should be a unique value process (say $V_X(t)$) representing the time $t$ value of the simple contingent claim $X$. The following proposition ensures that the value processes of replicating trading strategies coincide, thus proving the uniqueness of the value process.

Proposition 3.2.4. Suppose the market $\mathcal{M}$ is arbitrage-free. Then any attainable contingent claim $X$ is uniquely replicated in $\mathcal{M}$.

Proof. Suppose there is an attainable contingent claim $X$ and strategies $\varphi$ and $\psi$ such that
\[ V_\varphi(T) = V_\psi(T) = X, \]
but there exists a $\tau < T$ such that
\[ V_\varphi(u) = V_\psi(u) \text{ for every } u < \tau \quad \text{and} \quad V_\varphi(\tau) \neq V_\psi(\tau). \]
Define $A := \{\omega \in \Omega : V_\varphi(\tau, \omega) > V_\psi(\tau, \omega)\}$; then $A \in \mathcal{F}_\tau$ and $\mathbb{P}(A) > 0$ (otherwise just rename the strategies). Define the $\mathcal{F}_\tau$-measurable random variable $Y := V_\varphi(\tau) - V_\psi(\tau)$ and consider the trading strategy $\xi$ defined by
\[ \xi(u) = \begin{cases} \varphi(u) - \psi(u), & u \leq \tau, \\ \mathbf{1}_{A^c}(\varphi(u) - \psi(u)) + \mathbf{1}_A (Y\beta(\tau), 0, \ldots, 0), & \tau < u \leq T. \end{cases} \]
The idea here is to use $\varphi$ and $\psi$ to construct a self-financing strategy with zero initial investment (hence use their difference $\xi$) and put any gains at time $\tau$ in the savings account (i.e. invest them riskfree) up to time $T$.

We need to show formally that $\xi$ satisfies the conditions of an arbitrage opportunity. By construction $\xi$ is predictable and the self-financing condition (3.1) is clearly true for $t \neq \tau$, and for $t = \tau$ we have, using that $\varphi, \psi \in \Phi$,
\[ \xi(\tau) \cdot S(\tau) = (\varphi(\tau) - \psi(\tau)) \cdot S(\tau) = V_\varphi(\tau) - V_\psi(\tau), \]
\[ \begin{aligned} \xi(\tau+1) \cdot S(\tau) &= \mathbf{1}_{A^c}(\varphi(\tau+1) - \psi(\tau+1)) \cdot S(\tau) + \mathbf{1}_A Y \beta(\tau) S_0(\tau) \\ &= \mathbf{1}_{A^c}(\varphi(\tau) - \psi(\tau)) \cdot S(\tau) + \mathbf{1}_A (V_\varphi(\tau) - V_\psi(\tau))\beta(\tau)\beta^{-1}(\tau) \\ &= V_\varphi(\tau) - V_\psi(\tau). \end{aligned} \]
Hence $\xi$ is a self-financing strategy with initial value equal to zero. Furthermore
\[ V_\xi(T) = \mathbf{1}_{A^c}(\varphi(T) - \psi(T)) \cdot S(T) + \mathbf{1}_A (Y\beta(\tau), 0, \ldots, 0) \cdot S(T) = \mathbf{1}_A Y \beta(\tau) S_0(T) \geq 0 \]
and
\[ \mathbb{P}\{V_\xi(T) > 0\} = \mathbb{P}\{A\} > 0. \]
Hence the market contains an arbitrage opportunity with respect to the class $\Phi$ of self-financing strategies. But this contradicts the assumption that the market $\mathcal{M}$ is arbitrage-free.

This uniqueness property allows us now to define the important concept of an arbitrage price process.
Definition 3.2.4. Suppose the market is arbitrage-free. Let $X$ be any attainable contingent claim with time $T$ maturity. Then the arbitrage price process $\pi_X(t)$, $0 \leq t \leq T$, or simply arbitrage price of $X$, is given by the value process of any replicating strategy $\varphi$ for $X$.

The construction of hedging strategies that replicate the outcome of a contingent claim (for example a European option) is an important problem in both practical and theoretical applications. Hedging is central to the theory of option pricing. The classical arbitrage valuation models, such as the Black-Scholes model (Black and Scholes 1973), depend on the idea that an option can be perfectly hedged using the underlying asset (in our case the assets of the market model), so making it possible to create a portfolio that replicates the option exactly. Hedging is also widely used to reduce risk, and the kinds of delta-hedging strategies implicit in the Black-Scholes model are used by participants in option markets. We will come back to hedging problems subsequently.

Analysing the arbitrage-pricing approach we observe that the derivation of the price of a contingent claim doesn't require any specific preferences of the agents other than nonsatiation, i.e. agents prefer more to less, which rules out arbitrage. So, the pricing formula for any attainable contingent claim must be independent of all preferences that do not admit arbitrage. In particular, an economy of risk-neutral investors must price a contingent claim in the same manner. This fundamental insight, due to Cox and Ross (Cox and Ross 1976) in the case of a simple economy (a riskless asset and one risky asset) and in its general form due to Harrison and Kreps (Harrison and Kreps 1979), simplifies the pricing formula enormously. In its general form the price of an attainable simple contingent claim is just the expected value of the discounted payoff with respect to an equivalent martingale measure.

Proposition 3.2.5. The arbitrage price process of any attainable contingent claim $X$ is given by the risk-neutral valuation formula
\[ \pi_X(t) = \beta(t)^{-1}\,\mathbb{E}^*\big(X\beta(T) \mid \mathcal{F}_t\big) \qquad \forall t = 0, 1, \ldots, T, \tag{3.7} \]
where $\mathbb{E}^*$ is the expectation operator with respect to an equivalent martingale measure $\mathbb{P}^*$.

Proof. Since we assume that the market is arbitrage-free there exists (at least) one equivalent martingale measure $\mathbb{P}^*$. By Proposition 3.2.1 the discounted value process $\tilde{V}_\varphi$ of any self-financing strategy $\varphi$ is a $\mathbb{P}^*$-martingale. So for any contingent claim $X$ with maturity $T$ and any replicating trading strategy $\varphi \in \Phi$ we have for each $t = 0, 1, \ldots, T$
\[ \begin{aligned} \pi_X(t) = V_\varphi(t) &= \beta(t)^{-1}\tilde{V}_\varphi(t) \\ &= \beta(t)^{-1}\,\mathbb{E}^*\big(\tilde{V}_\varphi(T) \mid \mathcal{F}_t\big) && \text{(as $\tilde{V}_\varphi(t)$ is a $\mathbb{P}^*$-martingale)} \\ &= \beta(t)^{-1}\,\mathbb{E}^*\big(\beta(T)V_\varphi(T) \mid \mathcal{F}_t\big) && \text{(undoing the discounting)} \\ &= \beta(t)^{-1}\,\mathbb{E}^*\big(\beta(T)X \mid \mathcal{F}_t\big) && \text{(as $\varphi$ is a replicating strategy for $X$).} \end{aligned} \]

3.3 Complete Markets: Uniqueness of Equivalent Martingale Measures

The last section made clear that attainable contingent claims can be priced using an equivalent martingale measure. In this section we will discuss the question of the circumstances under which all contingent claims are attainable. This would be a very desirable property of the market $\mathcal{M}$, because we would then have solved the pricing question (at least for contingent claims) completely. Since contingent claims are merely $\mathcal{F}_T$-measurable random variables in our setting, it should be no surprise that we can give a criterion in terms of probability measures. We start with:
Definition 3.3.1. A market $\mathcal{M}$ is complete if every contingent claim is attainable, i.e. for every $\mathcal{F}_T$-measurable random variable $X \in L^0$ there exists a replicating self-financing strategy $\varphi \in \Phi$ such that $V_\varphi(T) = X$.

In the case of an arbitrage-free market $\mathcal{M}$ one can even insist on replicating non-negative contingent claims by an admissible strategy $\varphi \in \Phi_a$. Indeed, if $\varphi$ is self-financing and $\mathbb{P}^*$ is an equivalent martingale measure under which discounted prices $\tilde{S}$ are $\mathbb{P}^*$-martingales (such $\mathbb{P}^*$ exist since $\mathcal{M}$ is arbitrage-free and we can hence use the no-arbitrage theorem (Theorem 3.2.1)), $\tilde{V}_\varphi(t)$ is also a $\mathbb{P}^*$-martingale, being the martingale transform of the martingale $\tilde{S}$ by $\varphi$ (see Proposition 3.2.1). So
\[ \tilde{V}_\varphi(t) = \mathbb{E}^*\big(\tilde{V}_\varphi(T) \mid \mathcal{F}_t\big) \qquad (t = 0, 1, \ldots, T). \]
If $\varphi$ replicates $X$, $V_\varphi(T) = X \geq 0$, so discounting, $\tilde{V}_\varphi(T) \geq 0$, so the above equation gives $\tilde{V}_\varphi(t) \geq 0$ for each $t$. Thus all the values at each time $t$ are non-negative, not just the final value at time $T$, so $\varphi$ is admissible.

Theorem 3.3.1 (Completeness Theorem). An arbitrage-free market $\mathcal{M}$ is complete if and only if there exists a unique probability measure $\mathbb{P}^*$ equivalent to $\mathbb{P}$ under which discounted asset prices are martingales.

Proof. '$\Rightarrow$': Assume that the arbitrage-free market $\mathcal{M}$ is complete. Then for any $\mathcal{F}_T$-measurable random variable $X$ (contingent claim), there exists an admissible (so self-financing) strategy $\varphi$ replicating $X$: $X = V_\varphi(T)$. As $\varphi$ is self-financing, by Proposition 3.1.2,
\[ \beta(T)X = \tilde{V}_\varphi(T) = V_\varphi(0) + \sum_{\tau=1}^{T} \varphi(\tau) \cdot \Delta\tilde{S}(\tau). \]
We know by the no-arbitrage theorem (Theorem 3.2.1) that an equivalent martingale measure $\mathbb{P}^*$ exists; we have to prove uniqueness. So, let $\mathbb{P}_1, \mathbb{P}_2$ be two such equivalent martingale measures. For $i = 1, 2$, $(\tilde{V}_\varphi(t))_{t=0}^{T}$ is a $\mathbb{P}_i$-martingale. So,
\[ \mathbb{E}_i(\tilde{V}_\varphi(T)) = \mathbb{E}_i(\tilde{V}_\varphi(0)) = V_\varphi(0), \]
since the value at time zero is non-random ($\mathcal{F}_0 = \{\emptyset, \Omega\}$) and $\beta(0) = 1$. So
\[ \mathbb{E}_1(\beta(T)X) = \mathbb{E}_2(\beta(T)X). \]
Since $X$ is arbitrary, $\mathbb{E}_1, \mathbb{E}_2$ have to agree on integrating all integrands. Now $\mathbb{E}_i$ is expectation (i.e. integration) with respect to the measure $\mathbb{P}_i$, and measures that agree on integrating all integrands must coincide. So $\mathbb{P}_1 = \mathbb{P}_2$, giving uniqueness as required.

'$\Leftarrow$': Assume that the arbitrage-free market $\mathcal{M}$ is incomplete: then there exists a non-attainable $\mathcal{F}_T$-measurable random variable $X$ (a contingent claim). By Proposition 3.1.3, we may confine attention to the risky assets $S_1, \ldots, S_d$, as these suffice to tell us how to handle the numéraire $S_0$.

Consider the following set of random variables:
\[ \tilde{K} := \Big\{Y \in L^0 : Y = Y_0 + \sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t),\ Y_0 \in \mathbb{R},\ \varphi \text{ predictable}\Big\}. \]
(Recall that $Y_0$ is $\mathcal{F}_0$-measurable and set $\varphi = ((\varphi_1(t), \ldots, \varphi_d(t))')_{t=1}^{T}$ with predictable components.) Then by the above reasoning, the discounted value $\beta(T)X$ does not belong to $\tilde{K}$, so $\tilde{K}$ is a proper subset of the set $L^0$ of all random variables on $\Omega$ (which may be identified with $\mathbb{R}^{|\Omega|}$). Let $\mathbb{P}^*$ be a probability measure equivalent to $\mathbb{P}$ under which discounted prices are martingales (such $\mathbb{P}^*$ exist by the no-arbitrage theorem (Theorem 3.2.1)). Define the scalar product
\[ (Z, Y) \mapsto \mathbb{E}^*(ZY) \]
on random variables on $\Omega$. Since $\tilde{K}$ is a proper subset, there exists a non-zero random variable $Z$ orthogonal to $\tilde{K}$ (since $\Omega$ is finite, $\mathbb{R}^{|\Omega|}$ is Euclidean: this is just Euclidean geometry). That is,
\[ \mathbb{E}^*(ZY) = 0 \qquad \forall\, Y \in \tilde{K}. \]
Choosing the special $Y = 1 \in \tilde{K}$ given by $\varphi_i(t) = 0$, $t = 1, 2, \ldots, T$; $i = 1, \ldots, d$ and $Y_0 = 1$ we find
\[ \mathbb{E}^*(Z) = 0. \]
Write $\|X\|_\infty := \sup\{|X(\omega)| : \omega \in \Omega\}$, and define $\mathbb{P}^{**}$ by
\[ \mathbb{P}^{**}(\{\omega\}) = \Big(1 + \frac{Z(\omega)}{2\|Z\|_\infty}\Big)\mathbb{P}^*(\{\omega\}). \]
By construction, $\mathbb{P}^{**}$ is equivalent to $\mathbb{P}^*$: each factor $1 + Z(\omega)/(2\|Z\|_\infty)$ is at least $1/2$, so the two measures have the same null sets (actually, as $\mathbb{P}^* \sim \mathbb{P}$ and $\mathbb{P}$ has no non-empty null sets, neither do $\mathbb{P}^*$, $\mathbb{P}^{**}$). From $\mathbb{E}^*(Z) = 0$, we see that $\sum_{\omega} \mathbb{P}^{**}(\omega) = 1$, i.e. $\mathbb{P}^{**}$ is a probability measure. As $Z$ is non-zero, $\mathbb{P}^{**}$ and $\mathbb{P}^*$ are different. Now
\[ \begin{aligned} \mathbb{E}^{**}\Big(\sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t)\Big) &= \sum_{\omega \in \Omega} \mathbb{P}^{**}(\omega) \Big(\sum_{t=1}^{T} \varphi(t, \omega) \cdot \Delta\tilde{S}(t, \omega)\Big) \\ &= \sum_{\omega \in \Omega} \Big(1 + \frac{Z(\omega)}{2\|Z\|_\infty}\Big)\mathbb{P}^*(\omega) \Big(\sum_{t=1}^{T} \varphi(t, \omega) \cdot \Delta\tilde{S}(t, \omega)\Big). \end{aligned} \]
The '1' term on the right gives
\[ \mathbb{E}^*\Big(\sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t)\Big), \]
which is zero since this is a martingale transform of the $\mathbb{P}^*$-martingale $\tilde{S}(t)$ (recall martingale transforms are by definition null at zero). The '$Z$' term gives a multiple of the inner product
\[ \Big(Z, \sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t)\Big), \]
which is zero as $Z$ is orthogonal to $\tilde{K}$ and $\sum_{t=1}^{T} \varphi(t) \cdot \Delta\tilde{S}(t) \in \tilde{K}$. By the martingale transform lemma (Lemma C.4.1), $\tilde{S}(t)$ is a $\mathbb{P}^{**}$-martingale since $\varphi$ is an arbitrary predictable process. Thus $\mathbb{P}^{**}$ is a second equivalent martingale measure, different from $\mathbb{P}^*$. So incompleteness implies non-uniqueness of equivalent martingale measures, as required.

Martingale Representation. To say that every contingent claim can be replicated means that every $\mathbb{P}^*$-martingale (where $\mathbb{P}^*$ is the risk-neutral measure, which is unique) can be written, or represented, as a martingale transform (of the discounted prices) by a replicating (perfect-hedge) trading strategy $\varphi$. In stochastic-process language, this says that all $\mathbb{P}^*$-martingales can be represented as martingale transforms of discounted prices. Such martingale representation theorems hold much more generally, and are very important. For background, see (Revuz and Yor 1991, Yor 1978).

3.4 The Cox-Ross-Rubinstein Model

In this section we consider simple discrete-time financial market models. The development of the risk-neutral pricing formula is particularly clear in this setting since we require only elementary mathematical methods. The link to the fundamental economic principles of the arbitrage pricing method can be obtained equally straightforwardly. Moreover binomial models, by their very construction, give rise to simple and efficient numerical procedures. We start with the paradigm of all binomial models, the celebrated Cox-Ross-Rubinstein model (Cox, Ross, and Rubinstein 1979).
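Before specialising to two outcomes per period, it may help to see why three outcomes already break completeness. The sketch below replays the construction of $\mathbb{P}^{**}$ from the proof of the Completeness Theorem in a hypothetical one-period trinomial model with $r = 0$; the numbers and the names `p_star`, `z` and `p_2star` are assumptions made for illustration only.

```python
from fractions import Fraction as Fr

# Hypothetical one-period trinomial model with r = 0 (so discounting is trivial).
s0 = Fr(10)
s1 = [Fr(12), Fr(10), Fr(8)]           # stock price in the three states
p_star = [Fr(1, 4), Fr(1, 2), Fr(1, 4)]  # one equivalent martingale measure
z = [Fr(1), Fr(-1), Fr(1)]             # candidate vector orthogonal to K-tilde


def expect(measure, values):
    """Expectation of a random variable on a finite state space."""
    return sum(m * v for m, v in zip(measure, values))


delta_s = [s - s0 for s in s1]

# K-tilde is spanned by the constant 1 and the price increment, so Z must be
# orthogonal to both under the scalar product (Z, Y) -> E*(ZY).
orth_to_one = expect(p_star, z)
orth_to_gain = expect(p_star, [zi * d for zi, d in zip(z, delta_s)])

# Second measure: P**({w}) = (1 + Z(w) / (2 ||Z||_inf)) * P*({w}).
z_norm = max(abs(v) for v in z)
p_2star = [(1 + v / (2 * z_norm)) * p for v, p in zip(z, p_star)]

# A call struck at 10 gets different "prices" under the two measures.
call = [max(s - s0, 0) for s in s1]
price_star = expect(p_star, call)
price_2star = expect(p_2star, call)

print(p_2star, expect(p_2star, delta_s))
print(price_star, price_2star)
```

Both measures make the stock a martingale, yet they disagree on the call, so the call cannot be attainable in this model. With only two outcomes per period, as in the Cox-Ross-Rubinstein model, no such non-zero $Z$ exists.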
3.4.1 Model Structure

We take $d = 1$, that is, our model consists of two basic securities. (From here on the letter $d$ is reused for the down-move return of the stock.) Recall that the essence of the relative pricing theory is to take the price processes of these basic securities as given and price secondary securities in such a way that no arbitrage is possible.

Our time horizon is $T$ and the set of dates in our financial market model is $t = 0, 1, \ldots, T$. Assume that the first of our given basic securities is a (riskless) bond or bank account $B$, which yields a riskless rate of return $r > 0$ in each time interval $[t, t+1]$, i.e.
\[ B(t+1) = (1+r)B(t), \qquad B(0) = 1. \]
So its price process is $B(t) = (1+r)^t$, $t = 0, 1, \ldots, T$. Furthermore, we have a risky asset (stock) $S$ with price process
\[ S(t+1) = \begin{cases} (1+u)S(t) & \text{with probability } p, \\ (1+d)S(t) & \text{with probability } 1-p, \end{cases} \qquad t = 0, 1, \ldots, T-1, \]
with $-1 < d < u$, $S_0 \in \mathbb{R}_0^+$ (see Fig. 3.1 below).

[Figure 3.1: One-step tree diagram. From $S(0)$ the price moves to $S(1) = (1+u)S(0)$ with probability $p$ and to $S(1) = (1+d)S(0)$ with probability $1-p$.]

Alternatively we write this as
\[ Z(t+1) := \frac{S(t+1)}{S(t)} - 1, \qquad t = 0, 1, \ldots, T-1. \]
We set up a probabilistic model by considering the $Z(t)$, $t = 1, \ldots, T$ as random variables defined on probability spaces $(\tilde{\Omega}_t, \tilde{\mathcal{F}}_t, \tilde{\mathbb{P}}_t)$ with
\[ \begin{aligned} \tilde{\Omega}_t &= \tilde{\Omega} = \{d, u\}, \\ \tilde{\mathcal{F}}_t &= \tilde{\mathcal{F}} = \mathcal{P}(\tilde{\Omega}) = \{\emptyset, \{d\}, \{u\}, \tilde{\Omega}\}, \\ \tilde{\mathbb{P}}_t &= \tilde{\mathbb{P}} \quad \text{with } \tilde{\mathbb{P}}(\{u\}) = p,\ \tilde{\mathbb{P}}(\{d\}) = 1-p,\ p \in (0, 1). \end{aligned} \]
On these probability spaces we define
\[ Z(t, u) = u \quad \text{and} \quad Z(t, d) = d, \qquad t = 1, 2, \ldots, T. \]
Our aim, of course, is to define a probability space on which we can model the basic securities $(B, S)$. Since we can write the stock price as
\[ S(t) = S(0)\prod_{\tau=1}^{t} (1 + Z(\tau)), \qquad t = 1, 2, \ldots, T, \]
the above definitions suggest using as the underlying probabilistic model of the financial market the product space $(\Omega, \mathcal{F}, \mathbb{P})$ (see e.g. (Williams 1991) ch. 8), i.e.
\[ \Omega = \tilde{\Omega}_1 \times \ldots \times \tilde{\Omega}_T = \tilde{\Omega}^T = \{d, u\}^T, \]
with each $\omega \in \Omega$ representing the successive values of $Z(t)$, $t = 1, 2, \ldots, T$. Hence each $\omega \in \Omega$ is a $T$-tuple $\omega = (\tilde{\omega}_1, \ldots, \tilde{\omega}_T)$ and $\tilde{\omega}_t \in \tilde{\Omega} = \{d, u\}$. For the $\sigma$-algebra we use $\mathcal{F} = \mathcal{P}(\Omega)$ and the probability measure is given by
\[ \mathbb{P}(\{\omega\}) = \tilde{\mathbb{P}}_1(\{\tilde{\omega}_1\}) \times \ldots \times \tilde{\mathbb{P}}_T(\{\tilde{\omega}_T\}) = \tilde{\mathbb{P}}(\{\tilde{\omega}_1\}) \times \ldots \times \tilde{\mathbb{P}}(\{\tilde{\omega}_T\}). \]
The role of a product space is to model independent replication of a random experiment. The $Z(t)$ above are two-valued random variables, so can be thought of as tosses of a biased coin; we need to build a probability space on which we can model a succession of such independent tosses.

Now we redefine (with a slight abuse of notation) the $Z(t)$, $t = 1, \ldots, T$ as random variables on $(\Omega, \mathcal{F}, \mathbb{P})$ as (the $t$th projection)
\[ Z(t, \omega) = Z(t, \tilde{\omega}_t). \]
Observe that by this definition (and the above construction) $Z(1), \ldots, Z(T)$ are independent and identically distributed with
\[ \mathbb{P}(Z(t) = u) = p = 1 - \mathbb{P}(Z(t) = d). \]
To model the flow of information in the market we use the obvious filtration
\[ \begin{aligned} \mathcal{F}_0 &= \{\emptyset, \Omega\} && \text{(trivial $\sigma$-field)}, \\ \mathcal{F}_t &= \sigma(Z(1), \ldots, Z(t)) = \sigma(S(1), \ldots, S(t)), \\ \mathcal{F}_T &= \mathcal{F} = \mathcal{P}(\Omega) && \text{(class of all subsets of $\Omega$)}. \end{aligned} \]
This construction emphasises again that a multi-period model can be viewed as a sequence of single-period models. Indeed, in the Cox-Ross-Rubinstein case we use identical and independent single-period models. As we will see in the sequel this will make the construction of equivalent martingale measures relatively easy. Unfortunately we can hardly defend the assumption of independent and identically distributed price movements at each time period in practical applications.

Remark 3.4.1. We used this example to show explicitly how to construct the underlying probability space. Having done this in full once, we will from now on feel free to take for granted the existence of an appropriate probability space on which all relevant random variables can be defined.

3.4.2 Risk-Neutral Pricing

We now turn to the pricing of derivative assets in the Cox-Ross-Rubinstein market model. To do so we first have to discuss whether the Cox-Ross-Rubinstein model is arbitrage-free and complete. To answer these questions we have, according to our fundamental theorems (Theorems 3.2.1 and 3.3.1), to understand the structure of equivalent martingale measures in the Cox-Ross-Rubinstein model. In trying to do this we use (as is quite natural and customary) the bond price process $B(t)$ as numéraire.

Our first task is to find an equivalent martingale measure $\mathbb{Q}$ such that the $Z(1), \ldots, Z(T)$ remain independent and identically distributed, i.e. a probability measure $\mathbb{Q}$ defined as a product measure via a measure $\tilde{\mathbb{Q}}$ on $(\tilde{\Omega}, \tilde{\mathcal{F}})$ such that $\tilde{\mathbb{Q}}(\{u\}) = q$ and $\tilde{\mathbb{Q}}(\{d\}) = 1 - q$. We have:

Proposition 3.4.1. (i) A martingale measure $\mathbb{Q}$ for the discounted stock price $\tilde{S}$ exists if and only if
\[ d < r < u. \tag{3.8} \]
(ii) If equation (3.8) holds true, then there is a unique such measure in $\mathcal{P}$ characterised by
\[ q = \frac{r - d}{u - d}. \tag{3.9} \]
Proof. Since $S(t) = \tilde{S}(t)B(t) = \tilde{S}(t)(1+r)^t$, we have $Z(t+1) = S(t+1)/S(t) - 1 = (\tilde{S}(t+1)/\tilde{S}(t))(1+r) - 1$. So, the discounted price $(\tilde{S}(t))$ is a $\mathbb{Q}$-martingale if and only if for $t = 0, 1, \ldots, T-1$
\[ \mathbb{E}^{\mathbb{Q}}[\tilde{S}(t+1) \mid \mathcal{F}_t] = \tilde{S}(t) \ \Leftrightarrow\ \mathbb{E}^{\mathbb{Q}}[(\tilde{S}(t+1)/\tilde{S}(t)) \mid \mathcal{F}_t] = 1 \ \Leftrightarrow\ \mathbb{E}^{\mathbb{Q}}[Z(t+1) \mid \mathcal{F}_t] = r. \]
But $Z(1), \ldots, Z(T)$ are mutually independent and hence $Z(t+1)$ is independent of $\mathcal{F}_t = \sigma(Z(1), \ldots, Z(t))$. So
\[ r = \mathbb{E}^{\mathbb{Q}}(Z(t+1) \mid \mathcal{F}_t) = \mathbb{E}^{\mathbb{Q}}(Z(t+1)) = uq + d(1-q) \]
is a weighted average of $u$ and $d$; this can be $r$ if and only if $r \in [d, u]$. As $\mathbb{Q}$ is to be equivalent to $\mathbb{P}$ and $\mathbb{P}$ has no non-empty null sets, $r = d, u$ are excluded and (3.8) is proved.

To prove uniqueness and to find the value of $q$ we simply observe that under (3.8)
\[ u \times q + d \times (1 - q) = r \]
has a unique solution. Solving it for $q$ leads to the above formula.

From now on we assume that (3.8) holds true. Using the above Proposition we immediately get:

Corollary 3.4.1. The Cox-Ross-Rubinstein model is arbitrage-free.

Proof. By Proposition 3.4.1 there exists an equivalent martingale measure and this is by the no-arbitrage theorem (Theorem 3.2.1) enough to guarantee that the Cox-Ross-Rubinstein model is free of arbitrage.

Proposition 3.4.2. The Cox-Ross-Rubinstein model is complete.

Proof. Uniqueness of the solution of the linear equation $uq + d(1-q) = r$ under (3.8) gives completeness of the model, by the completeness theorem (Theorem 3.3.1).

One can translate this result, on uniqueness of the equivalent martingale measure, into financial language. Completeness means that all contingent claims can be replicated. If we do this in the large, we can do it in the small by restriction, and conversely, we can build up our full model from its constituent components. To summarize:

Corollary 3.4.2. The multi-period model is complete if and only if every underlying single-period model is complete.

We can now use the risk-neutral valuation formula to price every contingent claim in the Cox-Ross-Rubinstein model.

Proposition 3.4.3. The arbitrage price process of a contingent claim $X$ in the Cox-Ross-Rubinstein model is given by
\[ \pi_X(t) = B(t)\,\mathbb{E}^*\big(X/B(T) \mid \mathcal{F}_t\big) \qquad \forall t = 0, 1, \ldots, T, \]
where $\mathbb{E}^*$ is the expectation operator with respect to the unique equivalent martingale measure $\mathbb{P}^*$ characterised by $p^* = (r - d)/(u - d)$.

Proof. This follows directly from Proposition 3.2.5 since the Cox-Ross-Rubinstein model is arbitrage-free and complete.

We now give simple formulas for pricing (and hedging) of European contingent claims $X = f(S_T)$ for suitable functions $f$ (in this simple framework all functions $f : \mathbb{R} \to \mathbb{R}$). We use the notation
\[ F_\tau(x, p) := \sum_{j=0}^{\tau} \binom{\tau}{j} p^j (1-p)^{\tau-j} f\big(x(1+u)^j(1+d)^{\tau-j}\big). \tag{3.10} \]
Observe that this is just an evaluation of $f$ at the possible stock prices after $\tau$ further periods, weighted by the probabilities of the corresponding paths of the price process. Accordingly, $j$ and $\tau - j$ are the numbers of times $Z(i)$ takes the two possible values $u$ and $d$, respectively.
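The quantities $q$ from (3.9) and $F_\tau(x, p)$ from (3.10) are easy to evaluate directly. The following is a minimal sketch, assuming hypothetical parameters $u = 1/5$, $d = -1/10$, $r = 1/20$ and a call payoff with strike 100; the function names `risk_neutral_prob` and `F` are illustrative choices, not notation fixed by the notes. Exact fractions are used so that the martingale identity can be checked without rounding.

```python
from fractions import Fraction
from math import comb


def risk_neutral_prob(r, u, d):
    """q = (r - d) / (u - d), defined only under condition (3.8): d < r < u."""
    if not d < r < u:
        raise ValueError("no equivalent martingale measure: need d < r < u")
    return (r - d) / (u - d)


def F(tau, x, p, f, u, d):
    """F_tau(x, p) from (3.10): binomially weighted payoff after tau periods."""
    return sum(
        comb(tau, j) * p**j * (1 - p)**(tau - j)
        * f(x * (1 + u)**j * (1 + d)**(tau - j))
        for j in range(tau + 1)
    )


# Hypothetical parameters chosen for illustration.
u, d, r = Fraction(1, 5), Fraction(-1, 10), Fraction(1, 20)
q = risk_neutral_prob(r, u, d)

# Martingale check: with f(x) = x the expected price grows at the riskless rate.
expected_price = F(3, 100, q, lambda x: x, u, d)

# Expected (undiscounted) payoff of a call with strike 100 two periods ahead.
expected_call_payoff = F(2, 100, q, lambda x: max(x - 100, 0), u, d)

print(q, expected_price == 100 * (1 + r)**3, expected_call_payoff)
```

Dividing the last quantity by $(1+r)^2$ gives the discounted expected payoff under the measure with parameter $q$.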
T. . .4. . . Recall that this means we can ﬁnd a selfﬁnancing portfolio ϕ(t) = (ϕ0 (t). this leads to a unique replicating portfolio process ϕ.1 in the nexttolast equality. j (3.4.4. t = 0. . . . T of the contingent claim is given by (set τ = T − t) πX (t) = (1 + r)−τ Fτ (St . . T. Using the bond as num´raire we get the discounted equation e ˜ ˜ ˜ ΠX (t) = Vϕ (t) = ϕ0 (t) + ϕ1 (t)S(t). 3. . . Z(T ) are independent of Ft . we can either argue similarly or use putcall parity. . 2. Proposition 3. . . . j=1 By Proposition 3. . ϕ1 (t)). 1.4. T.3 the price ΠX (t) of a contingent claim X = f (ST ) at time t is πX (t) = = (1 + r)−(T −t) I ∗ [f (S(T ))Ft ] E T (1 + r)−(T −t) I ∗ f E S(t) T (1 + Z(i)) i=t+1 Ft = = (1 + r) −(T −t) I E ∗ f S(t) i=t+1 (1 + Z(i)) (1 + r)−τ Fτ (S(t). X = f (ST ) with f (x) = (x − K)+ .CHAPTER 3.3 Hedging Since the CoxRossRubinstein model is complete we can ﬁnd unique hedging strategies for replicating contingent claims. We used the role of independence property of conditional expectations from Proposition B. 1. 1. It is applicable since S(t) is Ft measurable and Z(t + 1). .e.11) S(t) = S(0) (1 + Z(j)). . . Consider a European contigent claim with expiry T given by X = f (ST ).3. . .4. .5. An immediate consequence is the pricing formula for the European call option. . By the pricing formula. t = 1. Given such a t we only can use information up to (and including) time t − 1 to ensure that ϕ is predictable. u} . However. We can ˜ compute this portfolio process at any point in time as follows. the fact that Z(t) ∈ {d. T of the option is given by (set τ = T − t) τ ΠC (t) = (1 + r)−τ j=0 τ ∗j p (1 − p∗ )τ −j (S(t)(1 + u)j (1 + d)τ −j − K)+ . .12) For a European put option.3. Proof. DISCRETETIME MODELS 53 Corollary 3. . Recall that t (3. T . but we only know that S(t) = (1 + Z(t))S(t − 1). Corollary 3. t = 0. p∗ ). p∗ ). for all t = 0. . . such that the value process Vϕ (t) = ϕ0 (t)B(t) + ϕ1 (t)S(t) satisﬁes ΠX (t) = Vϕ (t). 
The arbitrage price process ΠC (t). .4. ϕ predictable. i. for all t = 0. The arbitrage price process πX (t). Therefore we know S(t − 1). . Consider a European call option with expiry T and strike price K written on (one share of ) the stock S. we know the arbitrage price process and using the restriction of predictability of ϕ. The equation ΠX (t) = ϕ0 (t) + ˜ ϕ1 (t)S(t) has to be true for each ω ∈ Ω and each t = 1. 1.
leads to the following system of equations, which can be solved for $\varphi_0(t)$ and $\varphi_1(t)$ uniquely. Making the dependence of $\tilde{\pi}_X$ on $\tilde{S}$ explicit, and writing $1 + \tilde{u} = (1+u)/(1+r)$ and $1 + \tilde{d} = (1+d)/(1+r)$ for the discounted one-period returns, we have
\[ \begin{aligned} \tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{u})) &= \varphi_0(t) + \varphi_1(t)\tilde{S}_{t-1}(1+\tilde{u}), \\ \tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{d})) &= \varphi_0(t) + \varphi_1(t)\tilde{S}_{t-1}(1+\tilde{d}). \end{aligned} \]
The solution is given by
\[ \begin{aligned} \varphi_0(t) &= \frac{\tilde{S}_{t-1}(1+\tilde{u})\,\tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{d})) - \tilde{S}_{t-1}(1+\tilde{d})\,\tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{u}))}{\tilde{S}_{t-1}(1+\tilde{u}) - \tilde{S}_{t-1}(1+\tilde{d})} \\ &= \frac{(1+\tilde{u})\,\tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{d})) - (1+\tilde{d})\,\tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{u}))}{\tilde{u} - \tilde{d}}, \\ \varphi_1(t) &= \frac{\tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{u})) - \tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{d}))}{\tilde{S}_{t-1}(1+\tilde{u}) - \tilde{S}_{t-1}(1+\tilde{d})} \\ &= \frac{\tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{u})) - \tilde{\pi}_X(t, \tilde{S}_{t-1}(1+\tilde{d}))}{\tilde{S}_{t-1}(\tilde{u} - \tilde{d})}. \end{aligned} \]
Observe that we only need to have information up to time $t - 1$ to compute $\varphi(t)$, hence $\varphi$ is predictable. We make this rather abstract construction more transparent by constructing the hedge portfolio for the European contingent claims.

Proposition 3.4.4. The perfect hedging strategy $\varphi = (\varphi_0, \varphi_1)$ replicating the European contingent claim $f(S_T)$ with time of expiry $T$ is given by (again using $\tau = T - t$)
\[ \begin{aligned} \varphi_1(t) &= \frac{(1+r)^{-\tau}\big(F_\tau(S_{t-1}(1+u), p^*) - F_\tau(S_{t-1}(1+d), p^*)\big)}{S_{t-1}(u - d)}, \\ \varphi_0(t) &= \frac{(1+u)F_\tau(S_{t-1}(1+d), p^*) - (1+d)F_\tau(S_{t-1}(1+u), p^*)}{(u - d)(1+r)^T}. \end{aligned} \]

Proof. $(1+r)^{-\tau}F_\tau(S_t, p^*)$ must be the value of the portfolio at time $t$ if the strategy $\varphi = (\varphi(t))$ replicates the claim:
\[ \varphi_0(t)(1+r)^t + \varphi_1(t)S(t) = (1+r)^{-\tau}F_\tau(S_t, p^*). \]
Now $S(t) = S(t-1)(1 + Z(t)) = S(t-1)(1+u)$ or $S(t-1)(1+d)$, so:
\[ \begin{aligned} \varphi_0(t)(1+r)^t + \varphi_1(t)S(t-1)(1+u) &= (1+r)^{-\tau}F_\tau(S_{t-1}(1+u), p^*), \\ \varphi_0(t)(1+r)^t + \varphi_1(t)S(t-1)(1+d) &= (1+r)^{-\tau}F_\tau(S_{t-1}(1+d), p^*). \end{aligned} \]
Subtract:
\[ \varphi_1(t)S(t-1)(u - d) = (1+r)^{-\tau}\big(F_\tau(S_{t-1}(1+u), p^*) - F_\tau(S_{t-1}(1+d), p^*)\big). \]
So $\varphi_1(t)$ in fact depends only on $S(t-1)$, thus yielding the predictability of $\varphi$, and
\[ \varphi_1(t) = \frac{(1+r)^{-\tau}\big(F_\tau(S_{t-1}(1+u), p^*) - F_\tau(S_{t-1}(1+d), p^*)\big)}{S(t-1)(u - d)}. \]
Using any of the equations in the above system and solving for $\varphi_0(t)$ completes the proof.

To write the corresponding result for the European call, we use the following notation:
\[ C(\tau, x) := \sum_{j=0}^{\tau} \binom{\tau}{j} p^{*j}(1-p^*)^{\tau-j}\big(x(1+u)^j(1+d)^{\tau-j} - K\big)^+. \]
Then $(1+r)^{-\tau}C(\tau, x)$ is the value of the call at time $t$ (with time to expiry $\tau$) given that $S(t) = x$.
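The hedge formulas of Proposition 3.4.4 can be checked numerically for a call, i.e. with $F_\tau$ specialised to $C(\tau, x)$. The sketch below assumes hypothetical parameters ($u = 1/5$, $d = -1/10$, $r = 1/20$, $K = 100$, $T = 2$, $S(0) = 100$); the function names `C` and `hedge` are illustrative. It computes $(\varphi_0(t), \varphi_1(t))$ from $S(t-1)$ alone and then verifies that the portfolio is worth the call price at time $t$ in both successor states.

```python
from fractions import Fraction
from math import comb

# Hypothetical model parameters, chosen only for illustration.
u, d, r = Fraction(1, 5), Fraction(-1, 10), Fraction(1, 20)
K, T = 100, 2
p_star = (r - d) / (u - d)


def C(tau, x):
    """C(tau, x): binomially weighted call payoff, as in the notation above."""
    return sum(
        comb(tau, j) * p_star**j * (1 - p_star)**(tau - j)
        * max(x * (1 + u)**j * (1 + d)**(tau - j) - K, 0)
        for j in range(tau + 1)
    )


def hedge(t, s_prev):
    """Portfolio (phi0(t), phi1(t)) held over (t-1, t], given S(t-1) = s_prev."""
    tau = T - t
    c_up = C(tau, s_prev * (1 + u))
    c_down = C(tau, s_prev * (1 + d))
    phi1 = (1 + r)**(-tau) * (c_up - c_down) / (s_prev * (u - d))
    phi0 = ((1 + u) * c_down - (1 + d) * c_up) / ((u - d) * (1 + r)**T)
    return phi0, phi1


t, s_prev = 1, 100
phi0, phi1 = hedge(t, s_prev)

# Replication check at time t in the up and the down state.
tau = T - t
value_up = phi0 * (1 + r)**t + phi1 * s_prev * (1 + u)
value_down = phi0 * (1 + r)**t + phi1 * s_prev * (1 + d)
price_up = (1 + r)**(-tau) * C(tau, s_prev * (1 + u))
price_down = (1 + r)**(-tau) * C(tau, s_prev * (1 + d))

print(phi0, phi1, value_up == price_up, value_down == price_down)
```

The stock holding comes out non-negative and the bond holding negative here: the call is replicated by a leveraged long position in the stock.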
The perfect hedging strategy $\varphi = (\varphi_0, \varphi_1)$ replicating the European call option with time of expiry $T$ and strike price $K$ is given by
\[ \varphi_1(t) = \frac{(1+r)^{-\tau}\bigl(C(\tau, S_{t-1}(1+u)) - C(\tau, S_{t-1}(1+d))\bigr)}{S_{t-1}(u-d)}, \]
\[ \varphi_0(t) = \frac{(1+u)\,C(\tau, S_{t-1}(1+d)) - (1+d)\,C(\tau, S_{t-1}(1+u))}{(u-d)(1+r)^{T}}. \]

Notice that the numerator in the equation for $\varphi_1(t)$ is the difference of two values of $C(\tau, x)$, with the larger value of $x$ in the first term (recall $u > d$). When the payoff function $C(\tau, x)$ is an increasing function of $x$, as for the European call option considered here, this is non-negative. In this case, the Proposition gives $\varphi_1(t) \ge 0$: the replicating strategy does not involve short-selling. We record this as:

Corollary. When the payoff function is a non-decreasing function of the asset price $S(t)$, the perfect-hedging strategy replicating the claim does not involve short-selling of the risky asset.

If we do not use the pricing formula from Proposition 3.4.3 (i.e. the information on the price process), but only the final values of the option (or more generally of a contingent claim) we are still able to compute the arbitrage price and to construct the hedging portfolio by backward induction. In essence this is again only applying the one-period calculations for each time interval and each state of the world. We outline this procedure for the European call starting with the last period $[T-1, T]$. We have to choose a replicating portfolio $\varphi(T) = (\varphi_0(T,\omega), \varphi_1(T,\omega))$ based on the information available at time $T-1$ (and so $\mathcal{F}_{T-1}$-measurable). So for each $\omega \in \Omega$ the following equation has to hold:
\[ \pi_X(T,\omega) = \varphi_0(T,\omega)B(T,\omega) + \varphi_1(T,\omega)S(T,\omega). \]
Given the information $\mathcal{F}_{T-1}$ we know all but the last coordinate of $\omega$, and this gives rise to two equations (with the same notation as above):
\[ \pi_X(T, S_{T-1}(1+u)) = \varphi_0(T)(1+r)^T + \varphi_1(T)S_{T-1}(1+u), \]
\[ \pi_X(T, S_{T-1}(1+d)) = \varphi_0(T)(1+r)^T + \varphi_1(T)S_{T-1}(1+d). \]
Since we know the payoff structure of the contingent claim at time $T$, for example in case of a European call, $\pi_X(T, S_{T-1}(1+u)) = ((1+u)S_{T-1} - K)^+$ and $\pi_X(T, S_{T-1}(1+d)) = ((1+d)S_{T-1} - K)^+$, we can solve the above system and obtain
\[ \varphi_0(T) = \frac{(1+u)\,\pi_X(T, S_{T-1}(1+d)) - (1+d)\,\pi_X(T, S_{T-1}(1+u))}{(u-d)(1+r)^T}, \]
\[ \varphi_1(T) = \frac{\pi_X(T, S_{T-1}(1+u)) - \pi_X(T, S_{T-1}(1+d))}{S_{T-1}(u-d)}. \]
Using this portfolio one can compute the arbitrage price of the contingent claim at time $T-1$ given that the current asset price is $S_{T-1}$ as
\[ \pi_X(T-1, S_{T-1}) = \varphi_0(T, S_{T-1})(1+r)^{T-1} + \varphi_1(T, S_{T-1})S(T-1). \]
Now the arbitrage prices at time $T-1$ are known and one can repeat the procedure to successively compute the prices at $T-2, \ldots, 1, 0$.

The advantage of our risk-neutral pricing procedure over this approach is that we have a single formula for the price of the contingent claim at all times $t$ at once, and do not have to go through a backward induction only to compute a price at a special time $t$.
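The backward induction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the indexing of nodes by their number of up-moves, and the bond normalisation $B(t) = (1+r)^t$ are choices made for this example, not notation fixed by the notes.

```python
def replicate_one_period(s_prev, value_up, value_down, u, d, r, t):
    """Solve the two replication equations for the period [t-1, t].

    phi0 * (1+r)**t + phi1 * s_prev * (1+u) = value_up
    phi0 * (1+r)**t + phi1 * s_prev * (1+d) = value_down
    """
    phi1 = (value_up - value_down) / (s_prev * (u - d))
    phi0 = ((1 + u) * value_down - (1 + d) * value_up) / ((u - d) * (1 + r) ** t)
    return phi0, phi1


def backward_induction_price(s0, payoff, u, d, r, T):
    """Time-0 arbitrage price obtained by replicating period by period.

    `payoff` maps the terminal stock price to the claim's value at time T.
    """
    # values[j] = claim value at the node with j up-moves at the current date
    values = [payoff(s0 * (1 + u) ** j * (1 + d) ** (T - j)) for j in range(T + 1)]
    for t in range(T, 0, -1):
        new_values = []
        for j in range(t):
            # stock price at time t-1 after j up-moves and t-1-j down-moves
            s_prev = s0 * (1 + u) ** j * (1 + d) ** (t - 1 - j)
            phi0, phi1 = replicate_one_period(
                s_prev, values[j + 1], values[j], u, d, r, t
            )
            # value of the replicating portfolio at time t-1
            new_values.append(phi0 * (1 + r) ** (t - 1) + phi1 * s_prev)
        values = new_values
    return values[0]
```

For a call payoff the computed $\varphi_1$ is non-negative at every node, in line with the corollary on short-selling.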
3.5 Binomial Approximations

Suppose we observe financial assets during a continuous time period $[0, T]$. To construct a stochastic model of the price processes of these assets (to, e.g., value contingent claims) one basically has two choices: one could model the processes as continuous-time stochastic processes (for which the theory of stochastic calculus is needed) or one could construct a sequence of discrete-time models in which the continuous-time price processes are approximated by discrete-time stochastic processes in a suitable sense. We describe the second approach now by examining the asymptotic properties of a sequence of Cox-Ross-Rubinstein models.

3.5.1 Model Structure

We assume that all random variables subsequently introduced are defined on a suitable probability space $(\Omega, \mathcal{F}, \mathbb{P})$. We want to model two assets, a riskless bond $B$ and a risky stock $S$, which we now observe in a continuous-time interval $[0, T]$. To transfer the continuous-time framework into a binomial structure we make the following adjustments. Looking at the $n$th Cox-Ross-Rubinstein model in our sequence, there is a prespecified number $k_n$ of trading dates. We set $\Delta_n = T/k_n$ and divide $[0, T]$ in $k_n$ subintervals of length $\Delta_n$, namely $I_j = [j\Delta_n, (j+1)\Delta_n]$, $j = 0, \ldots, k_n - 1$. We suppose that trading occurs only at the equidistant time points $t_{n,j} = j\Delta_n$, $j = 0, \ldots, k_n - 1$. We fix $r_n$ as the riskless interest rate over each interval $I_j$, and hence the bond process (in the $n$th model) is given by
\[ B(t_{n,j}) = (1 + r_n)^j, \qquad j = 0, \ldots, k_n. \]
In the continuous-time model we compound continuously with spot rate $r \ge 0$ and hence the bond price process $B(t)$ is given by $B(t) = e^{rt}$. In order to approximate this process in the discrete-time framework, we choose $r_n$ such that
\[ 1 + r_n = e^{r\Delta_n}. \tag{3.13} \]
With this choice we have for any $j = 0, \ldots, k_n$ that $(1 + r_n)^j = \exp(rj\Delta_n) = \exp(r t_{n,j})$. Thus we have approximated the bond process exactly at the time points of the discrete model.

Next we model the one-period returns $S(t_{n,j+1})/S(t_{n,j}) - 1$ of the stock by a family of random variables $Z_{n,i}$, $i = 1, \ldots, k_n$, taking values $\{d_n, u_n\}$ with
\[ \mathbb{P}(Z_{n,i} = u_n) = p_n = 1 - \mathbb{P}(Z_{n,i} = d_n) \]
for some $p_n \in (0, 1)$ (which we specify later). With these $Z_{n,j}$ we model the stock price process $S_n$ in the $n$th Cox-Ross-Rubinstein model as
\[ S_n(t_{n,j}) = S_n(0) \prod_{i=1}^{j} (1 + Z_{n,i}), \qquad j = 0, 1, \ldots, k_n. \]
With the specification of the one-period returns we get a complete description of the discrete dynamics of the stock price process in each Cox-Ross-Rubinstein model. We call such a finite sequence $Z_n = (Z_{n,i})_{i=1}^{k_n}$ a lattice or tree. The parameters $u_n, d_n, p_n, k_n$ differ from lattice to lattice, but remain constant throughout a specific lattice. In the triangular array $(Z_{n,i})$, $i = 1, \ldots, k_n$; $n = 1, 2, \ldots$, we assume that the random variables are row-wise independent (but we allow dependence between rows). The approximation of a continuous-time setting by a sequence of lattices is called the lattice approach.

It is important to stress that for each $n$ we get a different discrete stock price process $S_n(t)$ and that in general these processes do not coincide on common time points (and are also different from the price process $S(t)$).

Turning back to a specific Cox-Ross-Rubinstein model, we now have as in §3.4 a discrete-time bond and stock price process. We want arbitrage-free financial market models and therefore have to choose the parameters $u_n, d_n, p_n$ accordingly. An arbitrage-free financial market model is
guaranteed by the existence of an equivalent martingale measure, and by Proposition 3.4.1 (i) the (necessary and) sufficient condition for that is
\[ d_n < r_n < u_n. \]
The risk-neutrality approach implies that the expected (under an equivalent martingale measure) one-period return must equal the one-period return of the riskless bond and hence we get (see Proposition 3.4.1(ii))
\[ p_n^* = \frac{r_n - d_n}{u_n - d_n}. \tag{3.14} \]
So the only parameters to choose freely in the model are $u_n$ and $d_n$. In the next sections we consider some special choices.

3.5.2 The Black-Scholes Option Pricing Formula

We now choose the parameters in the above lattice approach in a special way. Assuming the risk-free rate of interest $r$ as given, we have by (3.13) $1 + r_n = e^{r\Delta_n}$ and the remaining degrees of freedom are resolved by choosing $u_n$ and $d_n$. We use the following choice:
\[ 1 + u_n = e^{\sigma\sqrt{\Delta_n}}, \quad\text{and}\quad 1 + d_n = (1 + u_n)^{-1} = e^{-\sigma\sqrt{\Delta_n}}. \]
By condition (3.14) the risk-neutral probabilities for the corresponding single period models are given by
\[ p_n^* = \frac{r_n - d_n}{u_n - d_n} = \frac{e^{r\Delta_n} - e^{-\sigma\sqrt{\Delta_n}}}{e^{\sigma\sqrt{\Delta_n}} - e^{-\sigma\sqrt{\Delta_n}}}. \]
We can now price contingent claims in each Cox-Ross-Rubinstein model using the expectation operator with respect to the (unique) equivalent martingale measure characterised by the probabilities $p_n^*$ (compare §3.4.2). In particular we can compute the price $\Pi_C^{(n)}(t)$ at time $t$ of a European call on the stock $S$ with strike $K$ and expiry $T$ by formula (3.12) of Corollary 3.4.2. Let us reformulate this formula slightly. We define
\[ a_n = \min\left\{ j \in \mathbb{N}_0 \;\middle|\; S(0)(1 + u_n)^j (1 + d_n)^{k_n - j} > K \right\}. \tag{3.15} \]
Then we can rewrite the pricing formula (3.12) for $t = 0$ in the setting of the $n$th Cox-Ross-Rubinstein model as
\[ \Pi_C^{(n)}(0) = (1 + r_n)^{-k_n} \sum_{j=a_n}^{k_n} \binom{k_n}{j} p_n^{*j} (1 - p_n^*)^{k_n - j} \left( S(0)(1 + u_n)^j (1 + d_n)^{k_n - j} - K \right) \]
\[ = S(0) \left[ \sum_{j=a_n}^{k_n} \binom{k_n}{j} \left( \frac{p_n^*(1 + u_n)}{1 + r_n} \right)^{j} \left( \frac{(1 - p_n^*)(1 + d_n)}{1 + r_n} \right)^{k_n - j} \right] - (1 + r_n)^{-k_n} K \left[ \sum_{j=a_n}^{k_n} \binom{k_n}{j} p_n^{*j} (1 - p_n^*)^{k_n - j} \right]. \]
Denoting the binomial cumulative distribution function with parameters $(n, p)$ as $B^{n,p}(\cdot)$ we see that the second bracketed expression is just the complementary binomial distribution function
\[ \bar{B}^{k_n, p_n^*}(a_n) = 1 - B^{k_n, p_n^*}(a_n - 1), \]
that is, the probability that a binomial random variable with parameters $(k_n, p_n^*)$ is at least $a_n$. Also the first bracketed expression is $\bar{B}^{k_n, \hat{p}_n}(a_n)$ with
\[ \hat{p}_n = \frac{p_n^*(1 + u_n)}{1 + r_n}. \]
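The rewritten pricing formula is easy to evaluate numerically. The following Python sketch assumes the parameter choice of this section ($1 + u_n = e^{\sigma\sqrt{\Delta_n}}$, $1 + d_n = 1/(1 + u_n)$, $1 + r_n = e^{r\Delta_n}$); the function names are hypothetical and chosen for this illustration only.

```python
from math import comb, exp, sqrt


def crr_parameters(r, sigma, T, k):
    """Return (1 + r_n, 1 + u_n, 1 + d_n, p_n^*) for a lattice with k steps."""
    dt = T / k
    one_plus_r = exp(r * dt)
    one_plus_u = exp(sigma * sqrt(dt))
    one_plus_d = 1.0 / one_plus_u
    p_star = (one_plus_r - one_plus_d) / (one_plus_u - one_plus_d)
    return one_plus_r, one_plus_u, one_plus_d, p_star


def binomial_tail(k, p, a):
    """P(Y >= a) for Y binomial with parameters (k, p)."""
    return sum(comb(k, j) * p**j * (1 - p) ** (k - j) for j in range(max(a, 0), k + 1))


def crr_call_price(s0, strike, r, sigma, T, k):
    """Time-0 call price as S(0) * tail(p_hat) - K * (1 + r_n)**(-k) * tail(p_star)."""
    R, U, D, p_star = crr_parameters(r, sigma, T, k)
    # a = smallest j with s0 * U**j * D**(k - j) > strike (k + 1 if no such j exists)
    a = next((j for j in range(k + 1) if s0 * U**j * D ** (k - j) > strike), k + 1)
    p_hat = p_star * U / R
    return s0 * binomial_tail(k, p_hat, a) - strike * R ** (-k) * binomial_tail(k, p_star, a)
```

Because `comb` returns exact integers, this direct summation is only meant for moderate `k`; for very fine lattices a backward recursion through the tree is numerically safer.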
That $\hat{p}_n$ is indeed a probability can be shown straightforwardly. Using this notation we have in the $n$th Cox-Ross-Rubinstein model for the price of a European call at time $t = 0$ the following formula:
\[ \Pi_C^{(n)}(0) = S_n(0)\,\bar{B}^{k_n, \hat{p}_n}(a_n) - K(1 + r_n)^{-k_n}\,\bar{B}^{k_n, p_n^*}(a_n). \tag{3.16} \]
(We stress again that the underlying is $S_n(t)$, dependent on $n$, but $S_n(0) = S(0)$ for all $n$.) We now look at the limit of this expression.

Proposition 3.5.1. We have the following limit relation:
\[ \lim_{n\to\infty} \Pi_C^{(n)}(0) = \Pi_C^{BS}(0) \]
with $\Pi_C^{BS}(0)$ given by the Black-Scholes formula (we use $S = S(0)$ to ease the notation)
\[ \Pi_C^{BS}(0) = S\,N(d_1(S, T)) - Ke^{-rT} N(d_2(S, T)). \]
The functions $d_1(s, t)$ and $d_2(s, t)$ are given by
\[ d_1(s, t) = \frac{\log(s/K) + (r + \frac{\sigma^2}{2})t}{\sigma\sqrt{t}}, \qquad d_2(s, t) = d_1(s, t) - \sigma\sqrt{t} = \frac{\log(s/K) + (r - \frac{\sigma^2}{2})t}{\sigma\sqrt{t}}, \tag{3.17} \]
and $N(\cdot)$ is the standard normal cumulative distribution function.

Proof. Since $S_n(0) = S$ (say) all we have to do to prove the proposition is to show
(i) $\lim_{n\to\infty} \bar{B}^{k_n, \hat{p}_n}(a_n) = N(d_1(S, T))$,
(ii) $\lim_{n\to\infty} \bar{B}^{k_n, p_n^*}(a_n) = N(d_2(S, T))$.
These statements involve the convergence of distribution functions.

To show (i) we interpret
\[ \bar{B}^{k_n, \hat{p}_n}(a_n) = \mathbb{P}(a_n \le Y_n \le k_n) \]
with $(Y_n)$ a sequence of random variables distributed according to the binomial law with parameters $(k_n, \hat{p}_n)$. We normalise $Y_n$ to
\[ \tilde{Y}_n = \frac{Y_n - \mathbb{E}(Y_n)}{\sqrt{\mathrm{Var}(Y_n)}} = \frac{Y_n - k_n\hat{p}_n}{\sqrt{k_n\hat{p}_n(1 - \hat{p}_n)}} = \frac{\sum_{j=1}^{k_n} (B_{j,n} - \hat{p}_n)}{\sqrt{k_n\hat{p}_n(1 - \hat{p}_n)}}, \]
where $B_{j,n}$, $j = 1, \ldots, k_n$; $n = 1, 2, \ldots$ are row-wise independent Bernoulli random variables with parameter $\hat{p}_n$. Now using the central limit theorem we know that for $\alpha_n \to \alpha$, $\beta_n \to \beta$ we have
\[ \lim_{n\to\infty} \mathbb{P}(\alpha_n \le \tilde{Y}_n \le \beta_n) = N(\beta) - N(\alpha). \]
By definition we have
\[ \mathbb{P}(a_n \le Y_n \le k_n) = \mathbb{P}(\alpha_n \le \tilde{Y}_n \le \beta_n) \]
with
\[ \alpha_n = \frac{a_n - k_n\hat{p}_n}{\sqrt{k_n\hat{p}_n(1 - \hat{p}_n)}} \quad\text{and}\quad \beta_n = \frac{k_n(1 - \hat{p}_n)}{\sqrt{k_n\hat{p}_n(1 - \hat{p}_n)}}. \]
Using the following limiting relations:
\[ \lim_{n\to\infty} \hat{p}_n = \frac{1}{2}, \qquad \lim_{n\to\infty} k_n(1 - 2\hat{p}_n)\sqrt{\Delta_n} = -T\left( \frac{r}{\sigma} + \frac{\sigma}{2} \right), \]
and the defining relation for $a_n$, formula (3.15), we get
\[ \lim_{n\to\infty} \alpha_n = \lim_{n\to\infty} \frac{\dfrac{\log(K/S) + k_n\sigma\sqrt{\Delta_n}}{2\sigma\sqrt{\Delta_n}} - k_n\hat{p}_n}{\sqrt{k_n\hat{p}_n(1 - \hat{p}_n)}} = \lim_{n\to\infty} \frac{\log(K/S) + \sigma k_n\sqrt{\Delta_n}(1 - 2\hat{p}_n)}{2\sigma\sqrt{k_n\Delta_n\hat{p}_n(1 - \hat{p}_n)}} = \frac{\log(K/S) - (r + \frac{\sigma^2}{2})T}{\sigma\sqrt{T}} = -d_1(S, T). \]
Furthermore we have
\[ \lim_{n\to\infty} \beta_n = \lim_{n\to\infty} \sqrt{k_n\hat{p}_n^{-1}(1 - \hat{p}_n)} = +\infty. \]
So $N(\beta_n) \to 1$, $N(\alpha_n) \to N(-d_1) = 1 - N(d_1)$, completing the proof of (i).

To prove (ii) we can argue in very much the same way and arrive at parameters $\alpha_n^*$ and $\beta_n^*$ with $\hat{p}_n$ replaced by $p_n^*$. Using the following limiting relations:
\[ \lim_{n\to\infty} p_n^* = \frac{1}{2}, \qquad \lim_{n\to\infty} k_n(1 - 2p_n^*)\sqrt{\Delta_n} = T\left( \frac{\sigma}{2} - \frac{r}{\sigma} \right), \]
we get
\[ \lim_{n\to\infty} \alpha_n^* = \lim_{n\to\infty} \frac{\log(K/S) + \sigma k_n\sqrt{\Delta_n}(1 - 2p_n^*)}{2\sigma\sqrt{k_n\Delta_n p_n^*(1 - p_n^*)}} = \frac{\log(K/S) - (r - \frac{\sigma^2}{2})T}{\sigma\sqrt{T}} = -d_2(S, T). \]
For the upper limit we get
\[ \lim_{n\to\infty} \beta_n^* = \lim_{n\to\infty} \sqrt{k_n (p_n^*)^{-1}(1 - p_n^*)} = +\infty, \]
whence (ii) follows similarly.

By the above proposition we have derived the classical Black-Scholes European call option valuation formula as an asymptotic limit of option prices in a sequence of Cox-Ross-Rubinstein type models with a special choice of parameters. We will therefore call these models discrete Black-Scholes models. Let us mention here that in the continuous-time Black-Scholes model the dynamics of the (stochastic) stock price process $S(t)$ are modelled by a geometric Brownian motion (or exponential Wiener process). The sample paths of this stochastic price process are almost all continuous and the probability law of $S(t)$ at any time $t$ is lognormal. In particular the time $T$ distribution of $\log\{S(T)/S(0)\}$ is $N(T\mu, T\sigma^2)$. Looking back at the construction of our sequence of Cox-Ross-Rubinstein models we see that
\[ \log\frac{S_n(T)}{S(0)} = \sum_{i=1}^{k_n} \log(1 + Z_{n,i}), \]
with $\log(1 + Z_{n,i})$ two-valued (Bernoulli-type) random variables with
\[ \mathbb{P}(\log(1 + Z_{n,i}) = \sigma\sqrt{\Delta_n}) = p_n = 1 - \mathbb{P}(\log(1 + Z_{n,i}) = -\sigma\sqrt{\Delta_n}). \]
By the (triangular array version of the) central limit theorem we know that $\log\frac{S_n(T)}{S(0)}$, properly normalised, converges in distribution to a random variable with standard normal distribution. Doing similar calculations as in the above proposition (with $p_n = p_n^*$) we can compute the normalising constants and get
\[ \lim_{n\to\infty} \log\frac{S_n(T)}{S(0)} \sim N\left( T(r - \sigma^2/2),\; T\sigma^2 \right), \]
i.e. $\frac{S_n(T)}{S(0)}$ is in the limit lognormally distributed.
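The convergence stated in Proposition 3.5.1 can be observed numerically. The sketch below is one possible check, not part of the notes: it prices the call by a backward recursion through the lattice (rather than by the binomial sum, to avoid very large binomial coefficients) and compares the result with the Black-Scholes formula. All function names are hypothetical.

```python
from math import erf, exp, log, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(s, strike, r, sigma, T):
    """S N(d1) - K exp(-rT) N(d2) with d1, d2 as in Proposition 3.5.1."""
    d1 = (log(s / strike) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return s * normal_cdf(d1) - strike * exp(-r * T) * normal_cdf(d2)


def crr_call(s, strike, r, sigma, T, k):
    """European call price in the k-step discrete Black-Scholes model."""
    dt = T / k
    up = exp(sigma * sqrt(dt))    # 1 + u_n
    down = 1.0 / up               # 1 + d_n
    growth = exp(r * dt)          # 1 + r_n
    p = (growth - down) / (up - down)
    # terminal payoffs, indexed by the number j of up-moves
    values = [max(s * up**j * down ** (k - j) - strike, 0.0) for j in range(k + 1)]
    # discounted risk-neutral expectation, one period at a time
    for step in range(k, 0, -1):
        values = [(p * values[j + 1] + (1 - p) * values[j]) / growth for j in range(step)]
    return values[0]
```

Calling `crr_call` for increasing `k` (for example 10, 50, 100, 200) and comparing with `black_scholes_call` for the same inputs shows the lattice prices oscillating around, and approaching, the Black-Scholes value.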
3.6 American Options

Consider a general multi-period framework. The holder of an American derivative security can 'exercise' in any period $t$ and receive payment $f(S_t)$ (or more generally a non-negative payment $f_t$). In order to hedge such an option, we want to construct a self-financing trading strategy $\varphi_t$ such that for the corresponding value process $V_\varphi(t)$
\[ V_\varphi(0) = x \quad\text{(initial capital)}, \qquad V_\varphi(t) \ge f(S_t) \quad \forall t. \tag{3.18} \]
Such a hedging portfolio is minimal, if for a stopping time $\tau$
\[ V_\varphi(\tau) = f(S_\tau). \]
Our aim in the following will be to discuss existence and construction of such a stopping time.

3.6.1 Stopping Times, Optional Stopping and Snell Envelopes

A random variable $\tau$ taking values in $\{0, 1, 2, \ldots; +\infty\}$ is called a stopping time (or optional time) if
\[ \{\tau \le n\} = \{\omega : \tau(\omega) \le n\} \in \mathcal{F}_n \quad \forall\, n \le \infty. \]
From $\{\tau = n\} = \{\tau \le n\} \setminus \{\tau \le n - 1\}$ and $\{\tau \le n\} = \bigcup_{k \le n} \{\tau = k\}$, we see the equivalent characterisation
\[ \{\tau = n\} \in \mathcal{F}_n \quad \forall\, n \le \infty. \]
Call a stopping time $\tau$ bounded, if there is a constant $K$ such that $\mathbb{P}(\tau \le K) = 1$. (Since $\tau(\omega) \le K$ for some constant $K$ and all $\omega \in \Omega \setminus N$ with $\mathbb{P}(N) = 0$ all identities hold true except on a null set, i.e. almost surely.)

Example. Suppose $(X_n)$ is an adapted process and we are interested in the time of first entry of $X$ into a Borel set $B$ (typically one might have $B = [c, \infty)$):
\[ \tau = \inf\{n \ge 0 : X_n \in B\}. \]
Now $\{\tau \le n\} = \bigcup_{k \le n} \{X_k \in B\} \in \mathcal{F}_n$ and $\tau = \infty$ if $X$ never enters $B$. Thus $\tau$ is a stopping time.

Intuitively, think of $\tau$ as a time at which you decide to quit a gambling game: whether or not you quit at time $n$ depends only on the history up to and including time $n$, NOT the future. Thus stopping times model gambling and other situations where there is no foreknowledge, or prescience of the future; in particular, in the financial context, where there is no insider trading. Furthermore since a gambler cannot cheat the system the expectation of his hypothetical fortune (playing with unit stake) should equal his initial fortune.

Theorem 3.6.1 (Doob's Stopping-Time Principle (STP)). Let $\tau$ be a bounded stopping time and $X = (X_n)$ a martingale. Then $X_\tau$ is integrable, and
\[ \mathbb{E}(X_\tau) = \mathbb{E}(X_0). \]
Proof. Assume $\tau(\omega) \le K$ for all $\omega$, where we can take $K$ to be an integer and write
\[ X_{\tau(\omega)}(\omega) = \sum_{k=0}^{\infty} X_k(\omega) 1_{\{\tau(\omega) = k\}} = \sum_{k=0}^{K} X_k(\omega) 1_{\{\tau(\omega) = k\}}. \]
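Thus using successively the linearity of the expectation operator, the martingale property of $X$, the $\mathcal{F}_k$-measurability of $\{\tau = k\}$ and finally the definition of conditional expectation, we get
\[ \mathbb{E}(X_\tau) = \mathbb{E}\left[ \sum_{k=0}^{K} X_k 1_{\{\tau = k\}} \right] = \sum_{k=0}^{K} \mathbb{E}\left[ X_k 1_{\{\tau = k\}} \right] = \sum_{k=0}^{K} \mathbb{E}\left[ \mathbb{E}(X_K | \mathcal{F}_k) 1_{\{\tau = k\}} \right] = \sum_{k=0}^{K} \mathbb{E}\left[ X_K 1_{\{\tau = k\}} \right] = \mathbb{E}\left[ X_K \sum_{k=0}^{K} 1_{\{\tau = k\}} \right] = \mathbb{E}(X_K) = \mathbb{E}(X_0). \]

The stopping time principle holds also true if $X = (X_n)$ is a supermartingale; then the conclusion is
\[ \mathbb{E}X_\tau \le \mathbb{E}X_0. \]
Also, alternative conditions such as
(i) $X = (X_n)$ is bounded ($|X_n(\omega)| \le L$ for some $L$ and all $n, \omega$);
(ii) $\mathbb{E}\tau < \infty$ and $(X_n - X_{n-1})$ is bounded;
suffice for the proof of the stopping time principle.

The stopping time principle is important in many areas, such as sequential analysis in statistics. We turn in the next section to related ideas specific to the gambling/financial context.

We now wish to create the concept of the $\sigma$-algebra of events observable up to a stopping time $\tau$, in analogy to the $\sigma$-algebra $\mathcal{F}_n$ which represents the events observable up to time $n$.

Definition 3.6.1. Let $\tau$ be a stopping time. The stopping time $\sigma$-algebra $\mathcal{F}_\tau$ is defined to be
\[ \mathcal{F}_\tau = \{A \in \mathcal{F} : A \cap \{\tau \le n\} \in \mathcal{F}_n, \text{ for all } n\}. \]

Proposition 3.6.1. For $\tau$ a stopping time, $\mathcal{F}_\tau$ is a $\sigma$-algebra.

Proof. We simply have to check the defining properties. Clearly $\Omega, \emptyset$ are in $\mathcal{F}_\tau$. Also for $A \in \mathcal{F}_\tau$ we find
\[ A^c \cap \{\tau \le n\} = \{\tau \le n\} \setminus (A \cap \{\tau \le n\}) \in \mathcal{F}_n, \]
thus $A^c \in \mathcal{F}_\tau$. Finally, for a family $A_i \in \mathcal{F}_\tau$, $i = 1, 2, \ldots$, we have
\[ \left( \bigcup_{i=1}^{\infty} A_i \right) \cap \{\tau \le n\} = \bigcup_{i=1}^{\infty} (A_i \cap \{\tau \le n\}) \in \mathcal{F}_n, \]
showing $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}_\tau$.

Proposition 3.6.2. Let $\sigma, \tau$ be stopping times with $\sigma \le \tau$. Then $\mathcal{F}_\sigma \subseteq \mathcal{F}_\tau$.

Proof. Since $\sigma \le \tau$ we have $\{\tau \le n\} \subseteq \{\sigma \le n\}$. So for $A \in \mathcal{F}_\sigma$ we get
\[ A \cap \{\tau \le n\} = (A \cap \{\sigma \le n\}) \cap \{\tau \le n\} \in \mathcal{F}_n, \]
since $(A \cap \{\sigma \le n\}) \in \mathcal{F}_n$ as $A \in \mathcal{F}_\sigma$. So $A \in \mathcal{F}_\tau$.

Proposition 3.6.3. For any adapted sequence of random variables $X = (X_n)$ and a.s. finite stopping time $\tau$, define
\[ X_\tau = \sum_{n=0}^{\infty} X_n 1_{\{\tau = n\}}. \]
Then $X_\tau$ is $\mathcal{F}_\tau$-measurable.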
Proof. Let $B$ be a Borel set. We need to show $\{X_\tau \in B\} \in \mathcal{F}_\tau$. Now using the fact that on the set $\{\tau = k\}$ we have $X_\tau = X_k$, we find
\[ \{X_\tau \in B\} \cap \{\tau \le n\} = \bigcup_{k=0}^{n} \{X_\tau \in B\} \cap \{\tau = k\} = \bigcup_{k=0}^{n} \{X_k \in B\} \cap \{\tau = k\}. \]
Now sets $\{X_k \in B\} \cap \{\tau = k\} \in \mathcal{F}_k \subseteq \mathcal{F}_n$, and the result follows.

We are now in position to obtain an important extension of the Stopping-Time Principle, Theorem 3.6.1.

Theorem 3.6.2 (Doob's Optional-Sampling Theorem, OST). Let $X = (X_n)$ be a martingale and let $\sigma, \tau$ be bounded stopping times with $\sigma \le \tau$. Then
\[ \mathbb{E}[X_\tau | \mathcal{F}_\sigma] = X_\sigma \]
and thus $\mathbb{E}(X_\tau) = \mathbb{E}(X_\sigma)$.

Proof. First observe that $X_\tau$ and $X_\sigma$ are integrable (use the sum representation and the fact that $\tau$ is bounded by an integer $K$) and $X_\sigma$ is $\mathcal{F}_\sigma$-measurable by Proposition 3.6.3. So it only remains to prove that
\[ \mathbb{E}(1_A X_\tau) = \mathbb{E}(1_A X_\sigma) \quad \forall A \in \mathcal{F}_\sigma. \tag{3.19} \]
For any such fixed $A \in \mathcal{F}_\sigma$, define $\rho$ by
\[ \rho(\omega) = \sigma(\omega) 1_A(\omega) + \tau(\omega) 1_{A^c}(\omega). \]
Since $\{\rho \le n\} = (A \cap \{\sigma \le n\}) \cup (A^c \cap \{\tau \le n\}) \in \mathcal{F}_n$, $\rho$ is a stopping time, and from $\rho \le \tau$ we see that $\rho$ is bounded. So the STP (Theorem 3.6.1) implies $\mathbb{E}(X_\rho) = \mathbb{E}(X_0) = \mathbb{E}(X_\tau)$. But
\[ \mathbb{E}(X_\rho) = \mathbb{E}(X_\sigma 1_A + X_\tau 1_{A^c}), \qquad \mathbb{E}(X_\tau) = \mathbb{E}(X_\tau 1_A + X_\tau 1_{A^c}). \]
So subtracting yields (3.19).

We can establish a further characterisation of the martingale property.

Proposition 3.6.4. Let $X = (X_n)$ be an adapted sequence of random variables with $\mathbb{E}(|X_n|) < \infty$ for all $n$ and $\mathbb{E}(X_\tau) = 0$ for all bounded stopping times $\tau$. Then $X$ is a martingale.

Proof. Let $0 \le m < n < \infty$ and $A \in \mathcal{F}_m$. Define a stopping time $\tau$ by $\tau = n1_A + m1_{A^c}$. Then
\[ 0 = \mathbb{E}(X_\tau) = \mathbb{E}(X_n 1_A + X_m 1_{A^c}), \qquad 0 = \mathbb{E}(X_m) = \mathbb{E}(X_m 1_A + X_m 1_{A^c}), \]
and by subtraction we obtain $\mathbb{E}(X_m 1_A) = \mathbb{E}(X_n 1_A)$. Since this holds for all $A \in \mathcal{F}_m$, $\mathbb{E}(X_n | \mathcal{F}_m) = X_m$ by definition of conditional expectation. This says that $(X_n)$ is a martingale, as required.

Write $X^\tau = (X_n^\tau)$ for the sequence $X = (X_n)$ stopped at time $\tau$, where we define $X_n^\tau(\omega) := X_{\tau(\omega) \wedge n}(\omega)$.

Proposition 3.6.5. (i) If $X$ is adapted and $\tau$ is a stopping time, then the stopped sequence $X^\tau$ is adapted.
(ii) If $X$ is a martingale (super-, submartingale) and $\tau$ is a stopping time, $X^\tau$ is a martingale (super-, submartingale).
Proof. Let $C_j := 1_{\{j \le \tau\}}$; then
\[ X_{\tau \wedge n} = X_0 + \sum_{j=1}^{n} C_j (X_j - X_{j-1}) \]
(as the right is $X_0 + \sum_{j=1}^{\tau \wedge n} (X_j - X_{j-1})$, which telescopes to $X_{\tau \wedge n}$). Since $\{j \le \tau\}$ is the complement of $\{\tau < j\} = \{\tau \le j - 1\} \in \mathcal{F}_{j-1}$, $(C_n)$ is predictable. So $(X_n^\tau)$ is adapted.

If $X$ is a martingale, so is $X^\tau$ as it is the martingale transform of $(X_n)$ by $(C_n)$ (use Theorem C.4.1). The right-hand side above is $X_{\tau \wedge (n-1)} + C_n (X_n - X_{n-1})$. So taking conditional expectation given $\mathcal{F}_{n-1}$ and using predictability of $(C_n)$,
\[ \mathbb{E}(X_{\tau \wedge n} | \mathcal{F}_{n-1}) = X_{\tau \wedge (n-1)} + C_n (\mathbb{E}[X_n | \mathcal{F}_{n-1}] - X_{n-1}). \]
Then $C_n \ge 0$ shows that if $X$ is a supermartingale (submartingale), so is $X^\tau$.

We now discuss the Snell envelope, which will be an important tool for the valuation of American options. The idea is due to Snell (1952); for a textbook account, see e.g. Neveu (1975), VI.

Definition 3.6.2. If $X = (X_n)_{n=0}^{N}$ is a sequence adapted to a filtration $\mathcal{F}_n$ with $\mathbb{E}(|X_n|) < \infty$, the sequence $Z = (Z_n)_{n=0}^{N}$ defined by
\[ Z_N := X_N, \qquad Z_n := \max\{X_n, \mathbb{E}(Z_{n+1} | \mathcal{F}_n)\} \quad (n \le N - 1) \]
is called the Snell envelope of $X$.

Theorem 3.6.3. The Snell envelope $Z$ of $X$ is a supermartingale, and is the smallest supermartingale dominating $X$ (that is, with $Z_n \ge X_n$ for all $n$).

Proof. First, $Z_n \ge \mathbb{E}(Z_{n+1} | \mathcal{F}_n)$, so $Z$ is a supermartingale, and $Z_n \ge X_n$, so $Z$ dominates $X$.
Next, let $Y = (Y_n)$ be any other supermartingale dominating $X$; we must show $Y$ dominates $Z$ also. First, since $Z_N = X_N$ and $Y$ dominates $X$, we must have $Y_N \ge Z_N$. Assume inductively that $Y_n \ge Z_n$. Then as $Y$ is a supermartingale,
\[ Y_{n-1} \ge \mathbb{E}(Y_n | \mathcal{F}_{n-1}) \ge \mathbb{E}(Z_n | \mathcal{F}_{n-1}), \]
and as $Y$ dominates $X$, $Y_{n-1} \ge X_{n-1}$. Combining,
\[ Y_{n-1} \ge \max\{X_{n-1}, \mathbb{E}(Z_n | \mathcal{F}_{n-1})\} = Z_{n-1}. \]
By repeating this argument (or more formally, by backward induction), $Y_n \ge Z_n$ for all $n$, as required.

Proposition 3.6.6. $\tau^* := \inf\{n \ge 0 : Z_n = X_n\}$ is a stopping time, and the stopped process $Z^{\tau^*}$ is a martingale.

Proof. Since $Z_N = X_N$, $\tau^* \in \{0, 1, \ldots, N\}$ is well-defined and clearly bounded. For $k = 0$, $\{\tau^* = 0\} = \{Z_0 = X_0\} \in \mathcal{F}_0$; for $k \ge 1$,
\[ \{\tau^* = k\} = \{Z_0 > X_0\} \cap \cdots \cap \{Z_{k-1} > X_{k-1}\} \cap \{Z_k = X_k\} \in \mathcal{F}_k. \]
So $\tau^*$ is a stopping time.
As in the proof of Proposition 3.6.5,
\[ Z_n^{\tau^*} = Z_{n \wedge \tau^*} = Z_0 + \sum_{j=1}^{n} C_j \Delta Z_j, \]
where $C_j = 1_{\{j \le \tau^*\}}$ is predictable. For $n \le N - 1$,
\[ Z_{n+1}^{\tau^*} - Z_n^{\tau^*} = C_{n+1}(Z_{n+1} - Z_n) = 1_{\{n+1 \le \tau^*\}}(Z_{n+1} - Z_n). \]
Now $Z_n := \max\{X_n, \mathbb{E}(Z_{n+1} | \mathcal{F}_n)\}$, and by definition of $\tau^*$,
\[ Z_n > X_n \quad\text{on } \{n + 1 \le \tau^*\}. \]
So from the definition of $Z_n$,
\[ Z_n = \mathbb{E}(Z_{n+1} | \mathcal{F}_n) \quad\text{on } \{n + 1 \le \tau^*\}. \]
We next prove
\[ Z_{n+1}^{\tau^*} - Z_n^{\tau^*} = 1_{\{n+1 \le \tau^*\}}(Z_{n+1} - \mathbb{E}(Z_{n+1} | \mathcal{F}_n)). \tag{3.20} \]
For, suppose first that $\tau^* \ge n + 1$. Then the left of (3.20) is $Z_{n+1} - Z_n$, the right is $Z_{n+1} - \mathbb{E}(Z_{n+1} | \mathcal{F}_n)$, and these agree on $\{n + 1 \le \tau^*\}$ by above. The other possibility is that $\tau^* < n + 1$, i.e. $\tau^* \le n$. Then the left of (3.20) is $Z_{\tau^*} - Z_{\tau^*} = 0$, while the right is zero because the indicator is zero, completing the proof of (3.20). Now apply $\mathbb{E}(\cdot | \mathcal{F}_n)$ to (3.20): since $\{n + 1 \le \tau^*\} = \{\tau^* \le n\}^c \in \mathcal{F}_n$,
\[ \mathbb{E}\left[ (Z_{n+1}^{\tau^*} - Z_n^{\tau^*}) | \mathcal{F}_n \right] = 1_{\{n+1 \le \tau^*\}} \mathbb{E}\left[ (Z_{n+1} - \mathbb{E}(Z_{n+1} | \mathcal{F}_n)) | \mathcal{F}_n \right] = 1_{\{n+1 \le \tau^*\}} \left[ \mathbb{E}(Z_{n+1} | \mathcal{F}_n) - \mathbb{E}(Z_{n+1} | \mathcal{F}_n) \right] = 0. \]
So $\mathbb{E}(Z_{n+1}^{\tau^*} | \mathcal{F}_n) = Z_n^{\tau^*}$. This says that $Z^{\tau^*}$ is a martingale, as required.

Write $\mathcal{T}_{n,N}$ for the set of stopping times taking values in $\{n, n + 1, \ldots, N\}$ (a finite set, as $\Omega$ is finite). Call a stopping time $\sigma \in \mathcal{T}_{n,N}$ optimal for $(X_n)$ if
\[ \mathbb{E}(X_\sigma | \mathcal{F}_n) = \sup\{\mathbb{E}(X_\tau | \mathcal{F}_n) : \tau \in \mathcal{T}_{n,N}\}. \]
We next see that the Snell envelope can be used to solve the optimal stopping problem for $(X_n)$ in $\mathcal{T}_{0,N}$. Recall that $\mathcal{F}_0 = \{\emptyset, \Omega\}$ so $\mathbb{E}(Y | \mathcal{F}_0) = \mathbb{E}(Y)$ for any integrable random variable $Y$.

Proposition 3.6.7. $\tau^*$ solves the optimal stopping problem for $X$:
\[ Z_0 = \mathbb{E}(X_{\tau^*}) = \sup\{\mathbb{E}(X_\tau) : \tau \in \mathcal{T}_{0,N}\}. \]

Proof. To prove the first statement we use that $(Z_n^{\tau^*})$ is a martingale and $Z_{\tau^*} = X_{\tau^*}$; then
\[ Z_0 = Z_0^{\tau^*} = \mathbb{E}(Z_N^{\tau^*}) = \mathbb{E}(Z_{\tau^*}) = \mathbb{E}(X_{\tau^*}). \tag{3.21} \]
Now for any stopping time $\tau \in \mathcal{T}_{0,N}$, since $Z$ is a supermartingale (above), so is the stopped process $Z^\tau$ (see Proposition 3.6.5). Together with the property that $Z$ dominates $X$ this yields
\[ Z_0 = Z_0^\tau \ge \mathbb{E}(Z_N^\tau) = \mathbb{E}(Z_\tau) \ge \mathbb{E}(X_\tau). \tag{3.22} \]
Combining (3.21) and (3.22) and taking the supremum on $\tau$ gives the result.

The same argument, starting at time $n$ rather than time 0, gives

Corollary 3.6.1. If $\tau_n^* := \inf\{j \ge n : Z_j = X_j\}$,
\[ Z_n = \mathbb{E}(X_{\tau_n^*} | \mathcal{F}_n) = \sup\{\mathbb{E}(X_\tau | \mathcal{F}_n) : \tau \in \mathcal{T}_{n,N}\}. \]

As we are attempting to maximise our payoff by stopping $X = (X_n)$ at the most advantageous time, the Corollary shows that $\tau_n^*$ gives the best stopping time that is realistic: it maximises our expected payoff given only information currently available.

We proceed by analysing optimal stopping times. One can characterize optimality by establishing a martingale property.
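The recursion of Definition 3.6.2 and the identity $Z_0 = \mathbb{E}(X_{\tau^*})$ of Proposition 3.6.7 can be tried out on a small example. The sketch below is an illustration under simplifying assumptions that are not part of the general theory: the process lives on a recombining binomial tree, each step goes up with a constant probability `p`, and $X_n$ depends only on the node, so that conditional expectation reduces to a two-point average. The function names are hypothetical.

```python
from itertools import product


def snell_envelope(payoff, N, p):
    """Snell envelope on a binomial tree.

    payoff[n][j] is X_n at the node reached after n steps with j up-moves.
    Returns Z with the same shape: Z_N = X_N, Z_n = max(X_n, E(Z_{n+1} | F_n)).
    """
    Z = [row[:] for row in payoff]
    for n in range(N - 1, -1, -1):
        for j in range(n + 1):
            continuation = p * Z[n + 1][j + 1] + (1 - p) * Z[n + 1][j]
            Z[n][j] = max(payoff[n][j], continuation)
    return Z


def expected_payoff_at_first_hit(payoff, Z, N, p):
    """E(X_{tau*}) for tau* = inf{n : Z_n = X_n}, by enumerating all 2**N paths."""
    total = 0.0
    for path in product((0, 1), repeat=N):
        prob = 1.0
        for move in path:
            prob *= p if move else 1 - p
        j = 0  # number of up-moves so far
        for n in range(N + 1):
            if Z[n][j] == payoff[n][j]:  # first time the envelope touches X
                total += prob * payoff[n][j]
                break
            j += path[n]
    return total
```

The loop always stops by time $N$ because $Z_N = X_N$, mirroring the argument that $\tau^*$ is well-defined and bounded. Enumerating all paths is exponential in $N$ and is only meant as a check on small trees.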
Proposition 3.6.8. The stopping time $\sigma \in \mathcal{T}$ is optimal for $(X_t)$ if and only if the following two conditions hold.
(i) $Z_\sigma = X_\sigma$;
(ii) $Z^\sigma$ is a martingale.

Proof. We start showing that (i) and (ii) imply optimality. If $Z^\sigma$ is a martingale then
\[ Z_0 = \mathbb{E}(Z_0^\sigma) = \mathbb{E}(Z_N^\sigma) = \mathbb{E}(Z_\sigma) = \mathbb{E}(X_\sigma), \]
where we used (i) for the last identity. Since $Z$ is a supermartingale Proposition 3.6.5 implies that $Z^\tau$ is a supermartingale for any $\tau \in \mathcal{T}_{0,N}$. Now $Z$ dominates $X$, and so
\[ Z_0 = \mathbb{E}(Z_0^\tau) \ge \mathbb{E}(Z_N^\tau) = \mathbb{E}(Z_\tau) \ge \mathbb{E}(X_\tau). \]
Combining, $\sigma$ is optimal.

Now assume that $\sigma$ is optimal. Thus
\[ Z_0 = \max\{\mathbb{E}(X_\tau);\ \tau \in \mathcal{T}_{0,N}\} = \mathbb{E}(X_\sigma) \le \mathbb{E}(Z_\sigma), \]
since $Z$ dominates $X$. Since $Z^\sigma$ is a supermartingale, we also have $Z_0 \ge \mathbb{E}(Z_\sigma)$. Combining,
\[ \mathbb{E}(X_\sigma) = Z_0 = \mathbb{E}(Z_\sigma). \]
But $X \le Z$, so $X_\sigma \le Z_\sigma$, while by the above $X_\sigma$ and $Z_\sigma$ have the same expectation. So they must be a.s. equal: $X_\sigma = Z_\sigma$ a.s., showing (i).

To see (ii), observe that for any $n \le N$
\[ \mathbb{E}(Z_\sigma) = Z_0 \ge \mathbb{E}(Z_{\sigma \wedge n}) \ge \mathbb{E}(Z_\sigma) = \mathbb{E}(\mathbb{E}(Z_\sigma | \mathcal{F}_n)), \]
where the second inequality follows from Doob's OST (Theorem 3.6.2) with the bounded stopping times $(\sigma \wedge n) \le \sigma$ and the supermartingale $Z$. Using that $Z$ is a supermartingale again, we also find
\[ Z_{\sigma \wedge n} \ge \mathbb{E}(Z_\sigma | \mathcal{F}_n). \tag{3.23} \]
As above, this inequality between random variables with equal expectations forces a.s. equality: $Z_{\sigma \wedge n} = \mathbb{E}(Z_\sigma | \mathcal{F}_n)$ a.s. Apply $\mathbb{E}(\cdot | \mathcal{F}_{n-1})$:
\[ \mathbb{E}(Z_{\sigma \wedge n} | \mathcal{F}_{n-1}) = \mathbb{E}(\mathbb{E}(Z_\sigma | \mathcal{F}_n) | \mathcal{F}_{n-1}) = \mathbb{E}(Z_\sigma | \mathcal{F}_{n-1}) = Z_{\sigma \wedge (n-1)}, \]
by (3.23) with $n - 1$ for $n$. This says
\[ \mathbb{E}(Z_n^\sigma | \mathcal{F}_{n-1}) = Z_{n-1}^\sigma, \]
so $Z^\sigma$ is a martingale.

From Proposition 3.6.6 and its definition (first time when $Z$ and $X$ are equal) it follows that $\tau^*$ is the smallest optimal stopping time. To find the largest optimal stopping time we try to find the time when $Z$ 'ceases to be a martingale'. In order to do so we need a structural result of genuine interest and importance.

Theorem 3.6.4 (Doob Decomposition). Let $X = (X_n)$ be an adapted process with each $X_n \in L^1$. Then $X$ has an (essentially unique) Doob decomposition
\[ X = X_0 + M + A : \qquad X_n = X_0 + M_n + A_n \quad \forall n \tag{3.24} \]
with $M$ a martingale null at zero, $A$ a predictable process null at zero. If also $X$ is a submartingale ('increasing on average'), $A$ is increasing: $A_n \le A_{n+1}$ for all $n$, a.s.
Proof. If $X$ has a Doob decomposition (3.24),
\[ \mathbb{E}[X_n - X_{n-1} | \mathcal{F}_{n-1}] = \mathbb{E}[M_n - M_{n-1} | \mathcal{F}_{n-1}] + \mathbb{E}[A_n - A_{n-1} | \mathcal{F}_{n-1}]. \]
The first term on the right is zero, as $M$ is a martingale. The second is $A_n - A_{n-1}$, since $A_n$ (and $A_{n-1}$) is $\mathcal{F}_{n-1}$-measurable by predictability. So
\[ \mathbb{E}[X_n - X_{n-1} | \mathcal{F}_{n-1}] = A_n - A_{n-1}, \tag{3.25} \]
and summation gives
\[ A_n = \sum_{k=1}^{n} \mathbb{E}[X_k - X_{k-1} | \mathcal{F}_{k-1}], \quad\text{a.s.} \]
So set $A_0 = 0$ and use this formula to define $(A_n)$, clearly predictable. We then use (3.24) to define $(M_n)$, then a martingale, giving the Doob decomposition (3.24).
To see uniqueness, assume two decompositions, i.e. $X_n = X_0 + M_n + A_n = X_0 + \tilde{M}_n + \tilde{A}_n$, then $M_n - \tilde{M}_n = \tilde{A}_n - A_n$. Thus the martingale $M_n - \tilde{M}_n$ is predictable and so must be constant a.s.
If $X$ is a submartingale, the LHS of (3.25) is $\ge 0$, so the RHS of (3.25) is $\ge 0$, i.e. $(A_n)$ is increasing.

Although the Doob decomposition is a simple result in discrete time, the analogue in continuous time, the Doob-Meyer decomposition, is deep. This illustrates the contrasts that may arise between the theories of stochastic processes in discrete and continuous time.

Equipped with the Doob decomposition we return to the above setting and can write
\[ Z = Z_0 + L + B \]
with $L$ a martingale and $B$ predictable and decreasing. Then $M = Z_0 + L$ is a martingale and $A = (-B)$ is increasing and we have $Z = M - A$.

Definition 3.6.3. Define a random variable $\nu : \Omega \to \mathbb{N}_0$ by setting
\[ \nu(\omega) = \begin{cases} N & \text{if } A_N(\omega) = 0, \\ \min\{n \ge 0 : A_{n+1} > 0\} & \text{if } A_N(\omega) > 0. \end{cases} \]

Observe that $\nu$ (bounded by $N$) is a stopping time, since
\[ \{\nu = n\} = \bigcap_{k \le n} \{A_k = 0\} \cap \{A_{n+1} > 0\} \in \mathcal{F}_n \]
as $A$ is predictable.

Proposition 3.6.9. $\nu$ is optimal for $(X_t)$, and it is the largest optimal stopping time for $(X_t)$.

Proof. We use Proposition 3.6.8. Since for $k \le \nu(\omega)$, $Z_k(\omega) = M_k(\omega) - A_k(\omega) = M_k(\omega)$, $Z^\nu$ is a martingale and thus we have (ii) of Proposition 3.6.8. To see (i) we write
\[ Z_\nu = \sum_{k=0}^{N-1} 1_{\{\nu = k\}} Z_k + 1_{\{\nu = N\}} Z_N = \sum_{k=0}^{N-1} 1_{\{\nu = k\}} \max\{X_k, \mathbb{E}(Z_{k+1} | \mathcal{F}_k)\} + 1_{\{\nu = N\}} X_N. \]
Now $\mathbb{E}(Z_{k+1} | \mathcal{F}_k) = \mathbb{E}(M_{k+1} - A_{k+1} | \mathcal{F}_k) = M_k - A_{k+1}$. On $\{\nu = k\}$ we have $A_k = 0$ and $A_{k+1} > 0$, so $\mathbb{E}(Z_{k+1} | \mathcal{F}_k) < Z_k$. Hence $Z_k = \max\{X_k, \mathbb{E}(Z_{k+1} | \mathcal{F}_k)\} = X_k$ on the set $\{\nu = k\}$. So
\[ Z_\nu = \sum_{k=0}^{N-1} 1_{\{\nu = k\}} X_k + 1_{\{\nu = N\}} X_N = X_\nu, \]
which is (i) of Proposition 3.6.8. Now take $\tau \in \mathcal{T}_{0,N}$ with $\tau \ge \nu$ and $\mathbb{P}(\tau > \nu) > 0$. From the definition of $\nu$ and the fact that $A$ is increasing, $A_\tau > 0$ with positive probability. So $\mathbb{E}(A_\tau) > 0$, and
\[ \mathbb{E}(Z_\tau) = \mathbb{E}(M_\tau) - \mathbb{E}(A_\tau) = \mathbb{E}(Z_0) - \mathbb{E}(A_\tau) < \mathbb{E}(Z_0). \]
So $\tau$ cannot be optimal.

3.6.2 The Financial Model

We assume now that we work in a market model $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, which is complete with $\mathbb{P}^*$ the unique martingale measure. Then for any hedging strategy $\varphi$ we have that under $\mathbb{P}^*$
\[ M(t) = \tilde{V}_\varphi(t) = \beta(t)V_\varphi(t) \]
is a martingale. Thus we can use the STP (Theorem 3.6.1) to find for any stopping time $\tau$
\[ V_\varphi(0) = M_0 = \mathbb{E}^*(\tilde{V}_\varphi(\tau)). \tag{3.26} \]
Since we require $V_\varphi(\tau) \ge f_\tau(S)$ for any stopping time we find for the required initial capital
\[ x \ge \sup_{\tau \in \mathcal{T}} \mathbb{E}^*(\beta(\tau)f_\tau(S)). \tag{3.27} \]
Suppose now that $\tau^*$ is such that $V_\varphi(\tau^*) = f_{\tau^*}(S)$; then the strategy $\varphi$ is minimal and since $V_\varphi(t) \ge f_t(S)$ for all $t$ we have
\[ x = \mathbb{E}^*(\beta(\tau^*)f_{\tau^*}(S)) \tag{3.28} \]
\[ \phantom{x} = \sup_{\tau \in \mathcal{T}} \mathbb{E}^*(\beta(\tau)f_\tau(S)). \tag{3.29} \]
Thus (3.29) is a necessary condition for the existence of a minimal strategy $\varphi$. We will show that it is also sufficient and call the price in (3.29) the rational price of an American contingent claim.

Now consider the problem of the option writer to construct such a strategy $\varphi$. At time $T$ the hedging strategy needs to cover $f_T$, i.e. $V_\varphi(T) \ge f_T$ is required (we write $f_t$ for short instead of $f_t(S)$). At time $T - 1$ the option holder can either exercise and receive $f_{T-1}$ or hold the option to expiry, in which case $B(T-1)\mathbb{E}^*(\beta(T)f_T | \mathcal{F}_{T-1})$ needs to be covered. Thus the hedging strategy of the writer has to satisfy
\[ V_\varphi(T-1) = \max\{f_{T-1},\ B(T-1)\mathbb{E}^*(\beta(T)f_T | \mathcal{F}_{T-1})\}. \tag{3.30} \]
Using a backward induction argument we can show that
\[ V_\varphi(t-1) = \max\{f_{t-1},\ B(t-1)\mathbb{E}^*(\beta(t)V_\varphi(t) | \mathcal{F}_{t-1})\}. \tag{3.31} \]
Considering only discounted values this leads to
\[ \tilde{V}_\varphi(t-1) = \max\{\tilde{f}_{t-1},\ \mathbb{E}^*(\tilde{V}_\varphi(t) | \mathcal{F}_{t-1})\}. \tag{3.32} \]
Thus we see that $\tilde{V}_\varphi(t)$ is the Snell envelope $Z_t$ of $\tilde{f}_t$. In particular we know that
\[ Z_t = \sup_{\tau \in \mathcal{T}_t} \mathbb{E}^*(\tilde{f}_\tau | \mathcal{F}_t) \tag{3.33} \]
and the stopping time $\tau^* = \min\{s \ge t : Z_s = \tilde{f}_s\}$ is optimal. So
\[ Z_t = \mathbb{E}^*(\tilde{f}_{\tau^*} | \mathcal{F}_t). \tag{3.34} \]
In case $t = 0$ we can use $\tau_0^* = \min\{s \ge 0 : Z_s = \tilde{f}_s\}$ and then
\[ x = Z_0 = \mathbb{E}^*(\tilde{f}_{\tau_0^*}) = \sup_{\tau \in \mathcal{T}_0} \mathbb{E}^*(\tilde{f}_\tau) \tag{3.35} \]
is the rational option price.

We still need to construct the strategy $\varphi$. To do this recall that $Z$ is a supermartingale and so the Doob decomposition yields
\[ Z = \tilde{M} - \tilde{A} \tag{3.36} \]
with a martingale $\tilde{M}$ and a predictable, increasing process $\tilde{A}$. We write $M_t = \tilde{M}_t B_t$ and $A_t = \tilde{A}_t B_t$. Since the market is complete we know that there exists a self-financing strategy $\bar{\varphi}$ such that
\[ \tilde{M}_t = \tilde{V}_{\bar{\varphi}}(t). \tag{3.37} \]
Also using (3.36) we find $Z_t B_t = V_{\bar{\varphi}}(t) - A_t$. Now on $C = \{(t, \omega) : 0 \le t < \tau^*(\omega)\}$ we have that $Z$ is a martingale and thus $A_t(\omega) = 0$. Thus we obtain from $\tilde{V}_{\bar{\varphi}}(t) = Z_t$ that
\[ \tilde{V}_{\bar{\varphi}}(t) = \sup_{t \le \tau \le T} \mathbb{E}^*(\tilde{f}_\tau | \mathcal{F}_t) \quad \forall\, (t, \omega) \in C. \tag{3.38} \]
Now $\tau^*$ is the smallest exercise time and $\tilde{A}_{\tau^*(\omega)} = 0$. Thus
\[ \tilde{V}_{\bar{\varphi}}(\tau^*(\omega), \omega) = Z_{\tau^*(\omega)}(\omega) = \tilde{f}_{\tau^*(\omega)}(\omega). \tag{3.39} \]
Undoing the discounting we find
\[ V_{\bar{\varphi}}(\tau^*) = f_{\tau^*} \tag{3.40} \]
and therefore $\bar{\varphi}$ is a minimal hedge.

Now consider the problem of the option holder, how to find the optimal exercise time. We observe that the optimal exercise time must be an optimal stopping time, since for any other stopping time $\sigma$ (use Proposition 3.6.8)
\[ \tilde{V}_\varphi(\sigma) = Z_\sigma > \tilde{f}_\sigma \tag{3.41} \]
and holding the asset longer would generate a larger payoff. Thus the holder needs to wait until $Z_\sigma = \tilde{f}_\sigma$, i.e. (i) of Proposition 3.6.8 is true. On the other hand with $\nu$ the largest stopping time (compare Definition 3.6.3) we see that $\sigma \le \nu$. This follows since using $\bar{\varphi}$ after $\nu$ with initial capital from exercising will always yield a higher portfolio value than the strategy of exercising later. To see this recall that $V_{\bar{\varphi}} = Z_t B_t + A_t$ with $A_t > 0$ for $t > \nu$. So we must have $\sigma \le \nu$ and since $A_t = 0$ for $t \le \nu$ we see that $Z^\sigma$ is a martingale. Now criterion (ii) of Proposition 3.6.8 is true and $\sigma$ is thus optimal. So

Proposition 3.6.10. A stopping time $\sigma \in \mathcal{T}_t$ is an optimal exercise time for the American option $(f_t)$ if and only if
\[ \mathbb{E}^*(\beta(\sigma)f_\sigma) = \sup_{\tau \in \mathcal{T}_t} \mathbb{E}^*(\beta(\tau)f_\tau). \tag{3.42} \]

3.6.3 American Options in the Cox-Ross-Rubinstein model

We now consider how to evaluate an American put option in a standard CRR model. We assume that the time interval $[0, T]$ is divided into $N$ equal subintervals of length $\Delta$ say. Assuming the risk-free rate of interest $r$ (over $[0, T]$) as given, we have $1 + \rho = e^{r\Delta}$ (where we denote the risk-free rate of interest in each subinterval by $\rho$). The remaining degrees of freedom are resolved by choosing $u$ and $d$ as follows:
\[ 1 + u = e^{\sigma\sqrt{\Delta}}, \quad\text{and}\quad 1 + d = (1 + u)^{-1} = e^{-\sigma\sqrt{\Delta}}. \]
By condition (3.9) the risk-neutral probabilities for the corresponding single period models are given by
\[ p^* = \frac{\rho - d}{u - d} = \frac{e^{r\Delta} - e^{-\sigma\sqrt{\Delta}}}{e^{\sigma\sqrt{\Delta}} - e^{-\sigma\sqrt{\Delta}}}. \]
Thus the stock with initial value $S = S(0)$ is worth $S(1+u)^i(1+d)^j$ after $i$ steps up and $j$ steps down. Consequently, after $N$ steps, there are $N + 1$ possible prices, $S(1+u)^i(1+d)^{N-i}$ ($i = 0, \ldots, N$). There are $2^N$ possible paths through the tree. It is common to take $N$ of the order of 30, for two reasons:
(i) typical lengths of time to expiry of options are measured in months (9 months, say); this gives a time step around the corresponding number of days,
(ii) $2^{30}$ paths is about the order of magnitude that can be comfortably handled by computers (recall that $2^{10} = 1{,}024$, so $2^{30}$ is somewhat over a billion).

We can now calculate both the value of an American put option and the optimal exercise strategy by working backwards through the tree (this method of backward recursion in time is a form of the dynamic programming (DP) technique, due to Richard Bellman, which is important in many areas of optimisation and Operational Research).

1. Draw a binary tree showing the initial stock value and having the right number, $N$, of time intervals.

2. Fill in the stock prices: after one time interval, these are $S(1+u)$ (upper) and $S(1+d)$ (lower); after two time intervals, $S(1+u)^2$, $S$ and $S(1+d)^2 = S/(1+u)^2$; after $i$ time intervals, these are $S(1+u)^j(1+d)^{i-j} = S(1+u)^{2j-i}$ at the node with $j$ 'up' steps and $i - j$ 'down' steps (the '$(i, j)$' node).

3. Using the strike price $K$ and the prices at the terminal nodes, fill in the payoffs $f^A_{N,j} = \max\{K - S(1+u)^j(1+d)^{N-j}, 0\}$ from the option at the terminal nodes underneath the terminal prices.

4. Work back down the tree, from right to left. The no-exercise values $f_{ij}$ of the option at the $(i, j)$ node are given in terms of those of its upper and lower right neighbours in the usual way, as discounted expected values under the risk-neutral measure:
\[ f_{ij} = e^{-r\Delta}\left[ p^* f^A_{i+1,j+1} + (1 - p^*) f^A_{i+1,j} \right]. \]
The intrinsic (or early-exercise) value of the American put at the $(i, j)$ node (the value there if it is exercised early) is
\[ K - S(1+u)^j(1+d)^{i-j} \]
(when this is non-negative, and so has any value). The value of the American put is the higher of these:
\[ f^A_{ij} = \max\{f_{ij},\ K - S(1+u)^j(1+d)^{i-j}\} = \max\left\{ e^{-r\Delta}\left( p^* f^A_{i+1,j+1} + (1 - p^*) f^A_{i+1,j} \right),\ K - S(1+u)^j(1+d)^{i-j} \right\}. \]

5. The initial value of the option is the value $f^A_0$ filled in at the root of the tree.

6. At each node, it is optimal to exercise early if the early-exercise value there exceeds the value $f_{ij}$ there of expected discounted future payoff.

3.6.4 A Three-period Example

Assume we have two basic securities: a risk-free bond and a risky stock. The one-year risk-free interest rate (continuously compounded) is $r = 0.06$ and the volatility of the stock is 20%. We price calls and puts in a three-period Cox-Ross-Rubinstein model. The up and down movements of the stock price are given by
\[ 1 + u = e^{\sigma\sqrt{\Delta}} = 1.1224 \quad\text{and}\quad 1 + d = (1 + u)^{-1} = e^{-\sigma\sqrt{\Delta}} = 0.8910, \]
with $\sigma = 0.2$ and $\Delta = 1/3$. We obtain risk-neutral probabilities by (3.9)
\[ p^* = \frac{e^{r\Delta} - (1 + d)}{u - d} = 0.5584. \]
We assume that the price of the stock at time $t = 0$ is $S(0) = 100$. To price a European call option with maturity one year ($N = 3$) and strike $K = 100$ we can either use the valuation formula (3.12) or work our way backwards through the tree. Prices of the stock and the call are given in Figure 3.2 below.

time    nodes, from top to bottom (stock price S, call price c)
t = 0   S = 100.00, c = 11.56
t = 1   S = 112.24, c = 18.21;  S = 89.10, c = 3.67
t = 2   S = 125.98, c = 27.96;  S = 100.00, c = 6.70;  S = 79.38, c = 0
t = 3   S = 141.40, c = 41.40;  S = 112.24, c = 12.24;  S = 89.10, c = 0;  S = 70.72, c = 0
Figure 3.2: Stock and European call prices

One can implement the simple evaluation formulae for the CRR- and the BS-models and compare the values. Figure 3.3 is for $S = 100$, $K = 90$, $r = 0.06$, $\sigma = 0.2$, $T = 1$.

Figure 3.3: Approximation of Black-Scholes price by Binomial models (plot "Approximating CRR prices"; horizontal axis: approximation, 50 to 200; vertical axis: call price)

To price a European put, with price process denoted by $p(t)$, and an American put, $P(t)$ (maturity $N = 3$, strike 100), we can for the European put either use the put-call parity (1.1), the risk-neutral pricing formula, or work backwards through the tree. For the prices of the American put we use the backward recursion described above. Prices of the two puts are given in Figure 3.4. We indicate the early exercise times of the American put in the figure. Recall that the discrete-time rule is to exercise if the intrinsic value $K - S(t)$ is larger than the value of holding the option for one more period, i.e. the discounted risk-neutral expectation of its value at the next time step.
time    nodes, from top to bottom (European put price p, American put price P)
t = 0   p = 5.82, P = 6.18
t = 1   p = 2.08, P = 2.08;  p = 10.65, P = 11.59
t = 2   p = 0, P = 0;  p = 4.76, P = 4.76;  p = 18.71, P = 20.62 (*)
t = 3   p = 0, P = 0;  p = 0, P = 0;  p = 10.90, P = 10.90;  p = 29.28, P = 29.28
(*) early exercise of the American put: here $P = K - S = 100 - 79.38$ exceeds the European value.
Figure 3.4: European p(.) and American P(.) put prices
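The backward recursion and the three-period example above can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the original notes: the function name `crr_price` and its arguments are hypothetical, and the tree is built with $1 + u = e^{\sigma\sqrt{\Delta}}$, $1 + d = 1/(1+u)$ and the risk-neutral probability $p^*$ exactly as in the example.

```python
import math


def crr_price(s0, strike, r, sigma, maturity, n_steps, kind="call", american=False):
    """Price a call or put in a CRR tree by backward recursion (illustrative sketch)."""
    dt = maturity / n_steps
    up = math.exp(sigma * math.sqrt(dt))   # 1 + u
    down = 1.0 / up                        # 1 + d
    disc = math.exp(-r * dt)               # one-period discount factor
    p_star = (math.exp(r * dt) - down) / (up - down)  # risk-neutral up-probability

    def payoff(s):
        return max(s - strike, 0.0) if kind == "call" else max(strike - s, 0.0)

    # Terminal payoffs; index j counts the number of up-steps.
    values = [payoff(s0 * up**j * down**(n_steps - j)) for j in range(n_steps + 1)]

    # Work back through the tree, from right to left.
    for i in range(n_steps - 1, -1, -1):
        for j in range(i + 1):
            value = disc * (p_star * values[j + 1] + (1.0 - p_star) * values[j])
            if american:
                # Compare with the early-exercise value at node (i, j).
                value = max(value, payoff(s0 * up**j * down**(i - j)))
            values[j] = value
    return values[0]


if __name__ == "__main__":
    args = dict(s0=100.0, strike=100.0, r=0.06, sigma=0.2, maturity=1.0)
    print("European call, N=3:", crr_price(n_steps=3, **args))
    print("European put,  N=3:", crr_price(n_steps=3, kind="put", **args))
    print("American put,  N=3:", crr_price(n_steps=3, kind="put", american=True, **args))
    for n in (50, 100, 200):
        print("European call, N=%d:" % n, crr_price(n_steps=n, **args))
```

With the parameters of the example, the three-period call value agrees with the root value $c = 11.56$ of Figure 3.2 up to the rounding of intermediate values. Increasing `n_steps` shows the prices settling down, which is the behaviour plotted in Figure 3.3.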
Chapter 4
Continuous-time Financial Market Models

4.1 The Stock Price Process and its Stochastic Calculus

4.1.1 Continuous-time Stochastic Processes

A stochastic process $X = (X(t))_{t \ge 0}$ is a family of random variables defined on $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$. We say $X$ is adapted if $X(t) \in \mathcal{F}_t$ (i.e. $X(t)$ is $\mathcal{F}_t$-measurable) for each $t$: thus $X(t)$ is known when $\mathcal{F}_t$ is known, at time $t$.

The martingale property in continuous time is just that suggested by the discrete-time case:

Definition 4.1.1. A stochastic process $X = (X(t))_{0 \le t < \infty}$ is a martingale relative to $(\mathbb{F}, \mathbb{P})$ if
(i) $X$ is adapted, and $\mathbb{E}|X(t)| < \infty$ for all $0 \le t < \infty$;
(ii) $\mathbb{E}[X(t) \mid \mathcal{F}_s] = X(s)$ $\mathbb{P}$-a.s. $(0 \le s \le t)$,
and similarly for sub- and supermartingales.

There are regularisation results, under which one can take $X(t)$ RCLL in $t$ (basically $t \mapsto \mathbb{E}X(t)$ has to be right-continuous). Then the analogues of the results for discrete-time martingales hold true.

Interpretation. Martingales model fair games. Submartingales model favourable games. Supermartingales model unfavourable games.

Brownian motion originates in work of the botanist Robert Brown in 1828. It was introduced into finance by Louis Bachelier in 1900, and developed in physics by Albert Einstein in 1905.

Definition 4.1.2. A stochastic process $X = (X(t))_{t \ge 0}$ is a standard (one-dimensional) Brownian motion, BM or $BM(\mathbb{R})$, on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$, if
(i) $X(0) = 0$ a.s.,
(ii) $X$ has independent increments: $X(t+u) - X(t)$ is independent of $\sigma(X(s) : s \le t)$ for $u \ge 0$,
(iii) $X$ has stationary increments: the law of $X(t+u) - X(t)$ depends only on $u$,
(iv) $X$ has Gaussian increments: $X(t+u) - X(t)$ is normally distributed with mean 0 and variance $u$, $X(t+u) - X(t) \sim N(0, u)$,
(v) $X$ has continuous paths: $X(t)$ is a continuous function of $t$, i.e. $t \mapsto X(t, \omega)$ is continuous in $t$ for all $\omega \in \Omega$.
We shall henceforth denote standard Brownian motion $BM(\mathbb{R})$ by $W = (W(t))$ ($W$ for Wiener), though $B = (B(t))$ ($B$ for Brown) is also common. Standard Brownian motion $BM(\mathbb{R}^d)$ in $d$ dimensions is defined by $W(t) := (W_1(t), \ldots, W_d(t))$, where $W_1, \ldots, W_d$ are independent standard Brownian motions in one dimension (independent copies of $BM(\mathbb{R})$).

We have Wiener's theorem:

Theorem 4.1.1 (Wiener). Brownian motion exists.

For further background, see any measure-theoretic text on stochastic processes. A treatment starting directly from our main reference of measure-theoretic results, Williams (Williams 1991), is Rogers and Williams (Rogers and Williams 1994), Chapter 1. The classic is Doob's book, (Doob 1953), VIII.2. Excellent modern texts include (Karatzas and Shreve 1991; Revuz and Yor 1991) (see particularly (Karatzas and Shreve 1991), §2.2-4 for construction).

Geometric Brownian Motion

Now that we have both Brownian motion $W$ and Itô's Lemma to hand, we can introduce the most important stochastic process for us, a relative of Brownian motion: geometric (or exponential, or economic) Brownian motion.

Suppose we wish to model the time evolution of a stock price $S(t)$ (as we will, in the Black-Scholes theory). Consider how $S$ will change in some small time-interval from the present time $t$ to a time $t + dt$ in the near future. Writing $dS(t)$ for the change $S(t+dt) - S(t)$ in $S$, the return on $S$ in this interval is $dS(t)/S(t)$. It is economically reasonable to expect this return to decompose into two components, a systematic part and a random part. The systematic part could plausibly be modelled by $\mu\,dt$, where $\mu$ is some parameter representing the mean rate of return of the stock. The random part could plausibly be modelled by $\sigma\,dW(t)$, where $dW(t)$ represents the noise term driving the stock price dynamics, and $\sigma$ is a second parameter describing how much effect this noise has, that is, how much the stock price fluctuates. Thus $\sigma$ governs how volatile the price is, and is called the volatility of the stock. The role of the driving noise term is to represent the random buffeting effect of the multiplicity of factors at work in the economic environment in which the stock price is determined by supply and demand.

Putting this together, we have the stochastic differential equation
\[ dS(t) = S(t)\,(\mu\,dt + \sigma\,dW(t)), \qquad S(0) > 0, \tag{4.1} \]
due to Itô in 1944. This corrects Bachelier's earlier attempt of 1900 (he did not have the factor $S(t)$ on the right, missing the interpretation in terms of returns, and leading to negative stock prices!). Incidentally, Bachelier's work served as Itô's motivation in introducing Itô calculus. The mathematical importance of Itô's work was recognised early, and led on to the work of (Doob 1953), (Meyer 1976) and many others (see the memorial volume (Ikeda, Watanabe, Fukushima, and Kunita 1996) in honour of Itô's eightieth birthday in 1995). The economic importance of geometric Brownian motion was recognised by Paul A. Samuelson in his work from 1965 on ((Samuelson 1965)), for which Samuelson received the Nobel Prize in Economics in 1970, and by Robert Merton (see (Merton 1990) for a full bibliography), in work for which he was similarly honoured in 1997.

4.1.2 Stochastic Analysis

Stochastic integration was introduced by K. Itô in 1944, hence its name Itô calculus. It gives a meaning to
\[ \int_0^t X\,dY = \int_0^t X(s, \omega)\,dY(s, \omega), \]
for suitable stochastic processes $X$ and $Y$, the integrand and the integrator. We shall confine our attention here mainly to the basic case with integrator Brownian motion: $Y = W$. Much greater
generality is possible: for $Y$ a continuous martingale, see (Karatzas and Shreve 1991) or (Revuz and Yor 1991); for a systematic general treatment, see (Protter 2004).

The first thing to note is that stochastic integrals with respect to Brownian motion, if they exist, must be quite different from the measure-theoretic integral. For, the Lebesgue-Stieltjes integrals described there have as integrators the difference of two monotone (increasing) functions, which are locally of bounded variation. But we know that Brownian motion is of infinite (unbounded) variation on every interval. So Lebesgue-Stieltjes and Itô integrals must be fundamentally different.

In view of the above, it is quite surprising that Itô integrals can be defined at all. But if we take for granted Itô's fundamental insight that they can be, it is obvious how to begin and clear enough how to proceed. We begin with the simplest possible integrands $X$, and extend successively in much the same way that we extended the measure-theoretic integral.

Indicators. If $X(t, \omega) = 1_{[a,b]}(t)$, there is exactly one plausible way to define $\int X\,dW$:
\[ \int_0^t X(s, \omega)\,dW(s, \omega) := \begin{cases} 0 & \text{if } t \le a, \\ W(t) - W(a) & \text{if } a \le t \le b, \\ W(b) - W(a) & \text{if } t \ge b. \end{cases} \]

Simple Functions. Extend by linearity: if $X$ is a linear combination of indicators, $X = \sum_{i=1}^n c_i 1_{[a_i, b_i]}$, we should define
\[ \int_0^t X\,dW := \sum_{i=1}^n c_i \int_0^t 1_{[a_i, b_i]}\,dW. \]
Already one wonders how to extend this from constants $c_i$ to suitable random variables, and one seeks to simplify the obvious but clumsy three-line expressions above.

We begin again, this time calling a stochastic process $X$ simple if there is a partition $0 = t_0 < t_1 < \ldots < t_n = T < \infty$ and uniformly bounded $\mathcal{F}_{t_k}$-measurable random variables $\xi_k$ ($|\xi_k| \le C$ for all $k = 0, \ldots, n-1$ and $\omega$, for some $C$) and if $X(t, \omega)$ can be written in the form
\[ X(t, \omega) = \xi_0(\omega) 1_{\{0\}}(t) + \sum_{i=0}^{n-1} \xi_i(\omega) 1_{(t_i, t_{i+1}]}(t) \qquad (0 \le t \le T,\ \omega \in \Omega). \]
Then if $t_k \le t < t_{k+1}$,
\[ I_t(X) := \int_0^t X\,dW = \sum_{i=0}^{k-1} \xi_i\,(W(t_{i+1}) - W(t_i)) + \xi_k\,(W(t) - W(t_k)) = \sum_{i=0}^{n-1} \xi_i\,(W(t \wedge t_{i+1}) - W(t \wedge t_i)). \]
Note that by definition $I_0(X) = 0$ $\mathbb{P}$-a.s. We collect some properties of the stochastic integral defined so far:

Lemma 4.1.1.
(i) $I_t(aX + bY) = aI_t(X) + bI_t(Y)$.
(ii) $\mathbb{E}(I_t(X) \mid \mathcal{F}_s) = I_s(X)$ $\mathbb{P}$-a.s. $(0 \le s < t < \infty)$, hence $I_t(X)$ is a continuous martingale.

The stochastic integral for simple integrands is essentially a martingale transform, and the above is essentially the proof that martingale transforms are martingales.
We pause to note a property of square-integrable martingales which we shall need below. Call $M(t) - M(s)$ the increment of $M$ over $(s, t]$. Then for a martingale $M$, the product of the increments over disjoint intervals has zero mean. For, if $s < t \le u < v$,
\[ \mathbb{E}[(M(v) - M(u))(M(t) - M(s))] = \mathbb{E}\big[\mathbb{E}((M(v) - M(u))(M(t) - M(s)) \mid \mathcal{F}_u)\big] = \mathbb{E}\big[(M(t) - M(s))\,\mathbb{E}((M(v) - M(u)) \mid \mathcal{F}_u)\big], \]
taking out what is known (as $s, t \le u$). The inner expectation is zero by the martingale property, so the left-hand side is zero, as required.

We now can add further properties of the stochastic integral for simple functions.

Lemma 4.1.2.
(i) We have the Itô isometry
\[ \mathbb{E}\big((I_t(X))^2\big) = \mathbb{E}\left(\int_0^t X(s)^2\,ds\right). \]
(ii) $\mathbb{E}\big((I_t(X) - I_s(X))^2 \mid \mathcal{F}_s\big) = \mathbb{E}\left(\int_s^t X(u)^2\,du \,\Big|\, \mathcal{F}_s\right)$ $\mathbb{P}$-a.s.

The Itô isometry above suggests that $\int_0^t X\,dW$ should be defined only for processes with
\[ \int_0^t \mathbb{E}(X(u)^2)\,du < \infty \quad \text{for all } t. \]
We then can transfer convergence on a suitable $L^2$-space of stochastic processes to a suitable $L^2$-space of martingales. This gives us an $L^2$-theory of stochastic integration, for which Hilbert-space methods are available.

For the financial applications we have in mind, there is a fixed time-interval, $[0, T]$ say, on which we work (e.g., an option is written at time $t = 0$, with expiry time $t = T$). Then the above becomes
\[ \int_0^T \mathbb{E}(X(u)^2)\,du < \infty. \]

Approximation. We seek a class of integrands suitably approximable by simple integrands. It turns out that:
(i) The suitable class of integrands is the class of $(\mathcal{B}([0, \infty)) \otimes \mathcal{F})$-measurable, $(\mathcal{F}_t)$-adapted processes $X$ with $\int_0^t \mathbb{E}(X(u)^2)\,du < \infty$ for all $t > 0$.
(ii) Each such $X$ may be approximated by a sequence of simple integrands $X_n$ so that the stochastic integral $I_t(X) = \int_0^t X\,dW$ may be defined as the limit of $I_t(X_n) = \int_0^t X_n\,dW$.
(iii) The properties from both lemmas above remain true for the stochastic integral $\int_0^t X\,dW$ defined by (i) and (ii).

Example. We calculate $\int_0^t W(u)\,dW(u)$. We start by approximating the integrand by a sequence of simple functions:
\[ X_n(u) = \begin{cases} W(0) = 0 & \text{if } 0 \le u \le t/n, \\ W(t/n) & \text{if } t/n < u \le 2t/n, \\ \quad\vdots & \\ W\!\left(\frac{(n-1)t}{n}\right) & \text{if } (n-1)t/n < u \le t. \end{cases} \]
By definition,
\[ \int_0^t W(u)\,dW(u) = \lim_{n \to \infty} \sum_{k=0}^{n-1} W\!\left(\frac{kt}{n}\right)\left(W\!\left(\frac{(k+1)t}{n}\right) - W\!\left(\frac{kt}{n}\right)\right). \]
Rearranging terms, we obtain for the sum on the right
\[ \sum_{k=0}^{n-1} W\!\left(\frac{kt}{n}\right)\left(W\!\left(\frac{(k+1)t}{n}\right) - W\!\left(\frac{kt}{n}\right)\right) = \frac{1}{2}W(t)^2 - \frac{1}{2}\sum_{k=0}^{n-1}\left(W\!\left(\frac{(k+1)t}{n}\right) - W\!\left(\frac{kt}{n}\right)\right)^2. \]
Since the second term approximates the quadratic variation of $W$ and hence tends to $t$ for $n \to \infty$, we find
\[ \int_0^t W(u)\,dW(u) = \frac{1}{2}W(t)^2 - \frac{1}{2}t. \tag{4.2} \]
Note the contrast with ordinary (Newton-Leibniz) calculus! Itô calculus requires the second term on the right (the Itô correction term), which arises from the quadratic variation of $W$.

One can construct a closely analogous theory for stochastic integrals with the Brownian integrator $W$ above replaced by a square-integrable martingale integrator $M$. The properties above hold, with (i) in Lemma 4.1.2 replaced by
\[ \mathbb{E}\left[\left(\int_0^t X(u)\,dM(u)\right)^2\right] = \mathbb{E}\left[\int_0^t X(u)^2\,d\langle M \rangle(u)\right]. \]

Quadratic Variation, Quadratic Covariation. We shall need to extend quadratic variation and quadratic covariation to stochastic integrals. The quadratic variation of $I_t(X) = \int_0^t X(u)\,dW(u)$ is $\int_0^t X(u)^2\,du$. This is proved in the same way as the case $X \equiv 1$, that $W$ has quadratic variation process $t$. More generally, if $Z(t) = \int_0^t X(u)\,dM(u)$ for a continuous martingale integrator $M$, then $\langle Z \rangle(t) = \int_0^t X^2(u)\,d\langle M \rangle(u)$. Similarly (or by polarisation), if $Z_i(t) = \int_0^t X_i(u)\,dM_i(u)$ $(i = 1, 2)$, then $\langle Z_1, Z_2 \rangle(t) = \int_0^t X_1(u)X_2(u)\,d\langle M_1, M_2 \rangle(u)$.

4.1.3 Itô's Lemma

Suppose that $b$ is adapted and locally integrable (so $\int_0^t b(s)\,ds$ is defined as an ordinary integral), and $\sigma$ is adapted and measurable with $\int_0^t \mathbb{E}(\sigma(u)^2)\,du < \infty$ for all $t$ (so $\int_0^t \sigma(s)\,dW(s)$ is defined as a stochastic integral). Then
\[ X(t) := x_0 + \int_0^t b(s)\,ds + \int_0^t \sigma(s)\,dW(s) \]
defines a stochastic process $X$ with $X(0) = x_0$. It is customary, and convenient, to express such an equation symbolically in differential form, in terms of the stochastic differential equation
\[ dX(t) = b(t)\,dt + \sigma(t)\,dW(t), \qquad X(0) = x_0. \tag{4.3} \]
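The calculation leading to (4.2) can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the original notes: the function name `ito_sum_w_dw` is hypothetical, and Brownian increments are simulated as independent $N(0, t/n)$ variables with Python's standard `random` module.

```python
import math
import random


def ito_sum_w_dw(t, n, seed=0):
    """Left-endpoint sums approximating the Ito integral of W against W on [0, t]."""
    rng = random.Random(seed)
    w = 0.0          # current value W(k t / n)
    ito_sum = 0.0    # sum of W(k t/n) * (W((k+1) t/n) - W(k t/n))
    quad_var = 0.0   # sum of squared increments (approximate quadratic variation)
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(t / n))
        ito_sum += w * dw
        quad_var += dw * dw
        w += dw
    return w, ito_sum, quad_var


if __name__ == "__main__":
    t = 1.0
    w_t, ito_sum, quad_var = ito_sum_w_dw(t, n=100_000)
    print("Ito sum             :", ito_sum)
    print("W(t)^2/2 - t/2      :", 0.5 * w_t**2 - 0.5 * t)
    print("quadratic variation :", quad_var)
```

The evaluation point matters: the sum uses the left endpoint $W(kt/n)$ of each interval, which is what makes the approximating integrands adapted. For large `n` the sum of squared increments should be close to $t$, so the Itô sum should be close to $\frac{1}{2}W(t)^2 - \frac{1}{2}t$ rather than to the Newton-Leibniz value $\frac{1}{2}W(t)^2$.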
Now suppose $f : \mathbb{R} \to \mathbb{R}$ is of class $C^2$. The question arises of giving a meaning to the stochastic differential $df(X(t))$ of the process $f(X(t))$, and finding it. Given a partition $\mathcal{P}$ of $[0, t]$, i.e. $0 = t_0 < t_1 < \ldots < t_n = t$, and writing $\Delta X(t_k) := X(t_{k+1}) - X(t_k)$, we can use Taylor's formula to obtain
\[ f(X(t)) - f(X(0)) = \sum_{k=0}^{n-1} \big[f(X(t_{k+1})) - f(X(t_k))\big] = \sum_{k=0}^{n-1} f'(X(t_k))\,\Delta X(t_k) + \frac{1}{2}\sum_{k=0}^{n-1} f''(X(t_k) + \theta_k \Delta X(t_k))\,(\Delta X(t_k))^2 \]
with $0 < \theta_k < 1$. We know that $\sum (\Delta X(t_k))^2 \to \langle X \rangle(t)$ in probability (so, taking a subsequence, with probability one), and with a little more effort one can prove
\[ \sum_{k=0}^{n-1} f''(X(t_k) + \theta_k \Delta X(t_k))\,(\Delta X(t_k))^2 \to \int_0^t f''(X(u))\,d\langle X \rangle(u). \]
The first sum is easily recognized as an approximating sequence of a stochastic integral; indeed, we find
\[ \sum_{k=0}^{n-1} f'(X(t_k))\,\Delta X(t_k) \to \int_0^t f'(X(u))\,dX(u). \]
So we have

Theorem 4.1.2 (Basic Itô formula). If $X$ has stochastic differential given by (4.3) and $f \in C^2$, then $f(X)$ has stochastic differential
\[ df(X(t)) = f'(X(t))\,dX(t) + \frac{1}{2}f''(X(t))\,d\langle X \rangle(t), \]
or writing out the integrals,
\[ f(X(t)) = f(x_0) + \int_0^t f'(X(u))\,dX(u) + \frac{1}{2}\int_0^t f''(X(u))\,d\langle X \rangle(u). \]

More generally, suppose that $f : \mathbb{R}^2 \to \mathbb{R}$ is a function, continuously differentiable once in its first argument (which will denote time), and twice in its second argument (space): $f \in C^{1,2}$. By the Taylor expansion of a smooth function of several variables we get for $t$ close to $t_0$ (we use subscripts to denote partial derivatives: $f_t := \partial f/\partial t$, $f_{tx} := \partial^2 f/\partial t\,\partial x$):
\begin{align*}
f(t, X(t)) = {} & f(t_0, X(t_0)) + (t - t_0)f_t(t_0, X(t_0)) + (X(t) - X(t_0))f_x(t_0, X(t_0)) \\
& + \frac{1}{2}(t - t_0)^2 f_{tt}(t_0, X(t_0)) + \frac{1}{2}(X(t) - X(t_0))^2 f_{xx}(t_0, X(t_0)) \\
& + (t - t_0)(X(t) - X(t_0))f_{tx}(t_0, X(t_0)) + \ldots,
\end{align*}
which may be written symbolically as
\[ df = f_t\,dt + f_x\,dX + \frac{1}{2}f_{tt}(dt)^2 + f_{tx}\,dt\,dX + \frac{1}{2}f_{xx}(dX)^2 + \ldots. \]
In this, we substitute $dX(t) = b(t)\,dt + \sigma(t)\,dW(t)$ from above, to obtain
\[ df = f_t\,dt + f_x(b\,dt + \sigma\,dW) + \frac{1}{2}f_{tt}(dt)^2 + f_{tx}\,dt\,(b\,dt + \sigma\,dW) + \frac{1}{2}f_{xx}(b\,dt + \sigma\,dW)^2 + \ldots. \]
Now using the formal multiplication rules $dt \cdot dt = 0$, $dt \cdot dW = 0$, $dW \cdot dW = dt$ (which are just shorthand for the corresponding properties of the quadratic variations), we expand
\[ (b\,dt + \sigma\,dW)^2 = \sigma^2\,dt + 2b\sigma\,dt\,dW + b^2(dt)^2 = \sigma^2\,dt + \text{higher-order terms} \]
to get finally
\[ df = \left(f_t + bf_x + \frac{1}{2}\sigma^2 f_{xx}\right)dt + \sigma f_x\,dW + \text{higher-order terms}. \]
As above the higher-order terms are irrelevant, and summarising, we obtain Itô's lemma, the analogue for the Itô or stochastic calculus of the chain rule for ordinary (Newton-Leibniz) calculus:

Theorem 4.1.3 (Itô's Lemma). If $X(t)$ has stochastic differential given by (4.3), then $f = f(t, X(t))$ has stochastic differential
\[ df = \left(f_t + bf_x + \frac{1}{2}\sigma^2 f_{xx}\right)dt + \sigma f_x\,dW. \]
That is, writing $f_0$ for $f(0, x_0)$, the initial value of $f$,
\[ f = f_0 + \int_0^t \left(f_t + bf_x + \frac{1}{2}\sigma^2 f_{xx}\right)dt + \int_0^t \sigma f_x\,dW. \]

We will make good use of:

Corollary 4.1.1. $\mathbb{E}(f(t, X(t))) = f_0 + \int_0^t \mathbb{E}\left(f_t + bf_x + \frac{1}{2}\sigma^2 f_{xx}\right)dt$.

Proof. $\int_0^t \sigma f_x\,dW$ is a stochastic integral, so a martingale, so its expectation is constant ($= 0$, as it starts at 0).

The differential equation (4.1) above has the unique solution
\[ S(t) = S(0)\exp\left\{\left(\mu - \frac{1}{2}\sigma^2\right)t + \sigma W(t)\right\}. \]
For, writing
\[ f(t, x) := S(0)\exp\left\{\left(\mu - \frac{1}{2}\sigma^2\right)t + \sigma x\right\}, \]
we have
\[ f_t = \left(\mu - \frac{1}{2}\sigma^2\right)f, \qquad f_x = \sigma f, \qquad f_{xx} = \sigma^2 f, \]
and with $x = W(t)$, one has $dx = dW(t)$, $(dx)^2 = dt$. Thus Itô's lemma gives
\begin{align*}
df(t, W(t)) &= f_t\,dt + f_x\,dW(t) + \frac{1}{2}f_{xx}(dW(t))^2 \\
&= f\left(\left(\mu - \frac{1}{2}\sigma^2\right)dt + \sigma\,dW(t) + \frac{1}{2}\sigma^2\,dt\right) \\
&= f\,(\mu\,dt + \sigma\,dW(t)),
\end{align*}
so $f(t, W(t))$ is a solution of the stochastic differential equation, and the initial condition $f(0, W(0)) = S(0)$ holds as $W(0) = 0$, giving existence.
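The closed-form solution of (4.1) can be compared with a direct discretisation of the equation. The sketch below is an illustration only: the function names are hypothetical, the step-by-step scheme $S_{k+1} = S_k(1 + \mu\,\Delta t + \sigma\,\Delta W_k)$ is the standard Euler-Maruyama discretisation (not discussed in the notes), and the comparison of the sample mean with $S(0)e^{\mu t}$ uses the mean of the lognormal distribution, a standard fact assumed here.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, t, n_steps, rng):
    """Return (exact, euler) terminal values driven by the same Brownian increments."""
    dt = t / n_steps
    w = 0.0       # Brownian motion W at the current grid point
    euler = s0    # discretised solution of dS = S (mu dt + sigma dW)
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        euler *= 1.0 + mu * dt + sigma * dw
        w += dw
    exact = s0 * math.exp((mu - 0.5 * sigma**2) * t + sigma * w)
    return exact, euler


def mean_terminal_value(s0, mu, sigma, t, n_paths, rng):
    """Monte Carlo estimate of E[S(t)] using the closed-form solution."""
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        total += s0 * math.exp((mu - 0.5 * sigma**2) * t + sigma * math.sqrt(t) * z)
    return total / n_paths


if __name__ == "__main__":
    exact, euler = simulate_gbm(100.0, 0.1, 0.2, 1.0, 1000, random.Random(0))
    print("exact solution :", exact)
    print("Euler scheme   :", euler)
    estimate = mean_terminal_value(100.0, 0.1, 0.2, 1.0, 20000, random.Random(1))
    print("sample mean    :", estimate, " vs S(0) e^{mu t} =", 100.0 * math.exp(0.1))
```

On a fine grid the two terminal values should be close path by path, which illustrates that the exponent needs the drift $\mu - \frac{1}{2}\sigma^2$ and not $\mu$: dropping the Itô correction would make the closed form drift away from the discretised equation.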
4.1.4 Girsanov's Theorem

Consider first independent $N(0, 1)$ random variables $Z_1, \ldots, Z_n$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Given a vector $\gamma = (\gamma_1, \ldots, \gamma_n)$, consider a new probability measure $\tilde{\mathbb{P}}$ on $(\Omega, \mathcal{F})$ defined by
\[ \tilde{\mathbb{P}}(d\omega) = \exp\left\{\sum_{i=1}^n \gamma_i Z_i(\omega) - \frac{1}{2}\sum_{i=1}^n \gamma_i^2\right\}\mathbb{P}(d\omega). \]
As $\exp\{\cdot\} > 0$ and integrates to 1, as $\int \exp\{\gamma_i Z_i\}\,d\mathbb{P} = \exp\{\frac{1}{2}\gamma_i^2\}$, this is a probability measure. It is also equivalent to $\mathbb{P}$ (has the same null sets), again as the exponential term is positive. Also
\begin{align*}
\tilde{\mathbb{P}}(Z_i \in dz_i,\ i = 1, \ldots, n) &= \exp\left\{\sum_{i=1}^n \gamma_i z_i - \frac{1}{2}\sum_{i=1}^n \gamma_i^2\right\}\mathbb{P}(Z_i \in dz_i,\ i = 1, \ldots, n) \\
&= (2\pi)^{-n/2}\exp\left\{\sum_{i=1}^n \gamma_i z_i - \frac{1}{2}\sum_{i=1}^n \gamma_i^2 - \frac{1}{2}\sum_{i=1}^n z_i^2\right\}\prod_{i=1}^n dz_i \\
&= (2\pi)^{-n/2}\exp\left\{-\frac{1}{2}\sum_{i=1}^n (z_i - \gamma_i)^2\right\}dz_1 \ldots dz_n.
\end{align*}
This says that if the $Z_i$ are independent $N(0, 1)$ under $\mathbb{P}$, they are independent $N(\gamma_i, 1)$ under $\tilde{\mathbb{P}}$. Thus the effect of the change of measure $\mathbb{P} \to \tilde{\mathbb{P}}$, from the original measure $\mathbb{P}$ to the equivalent measure $\tilde{\mathbb{P}}$, is to change the mean, from $0 = (0, \ldots, 0)$ to $\gamma = (\gamma_1, \ldots, \gamma_n)$.

This result extends to infinitely many dimensions, i.e., from random vectors to stochastic processes, indeed with random rather than deterministic means. Let $W = (W_1, \ldots, W_d)$ be a $d$-dimensional Brownian motion defined on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ with the filtration $\mathbb{F}$ satisfying the usual conditions. Let $(\gamma(t) : 0 \le t \le T)$ be a measurable, adapted $d$-dimensional process with $\int_0^T \gamma_i(t)^2\,dt < \infty$ a.s., $i = 1, \ldots, d$, and define the process $(L(t) : 0 \le t \le T)$ by
\[ L(t) = \exp\left\{-\int_0^t \gamma(s)^{\top}\,dW(s) - \frac{1}{2}\int_0^t \|\gamma(s)\|^2\,ds\right\}. \tag{4.4} \]
Then $L$ is continuous, and, being the stochastic exponential of $-\int_0^t \gamma(s)^{\top}\,dW(s)$, is a local martingale. Given sufficient integrability on the process $\gamma$, $L$ will in fact be a (continuous) martingale. For this, Novikov's condition suffices:
\[ \mathbb{E}\left(\exp\left\{\frac{1}{2}\int_0^T \|\gamma(s)\|^2\,ds\right\}\right) < \infty. \tag{4.5} \]

We are now in the position to state a version of Girsanov's theorem, which will be one of our main tools in studying continuous-time financial market models.

Theorem 4.1.4 (Girsanov). Let $\gamma$ be as above and satisfy Novikov's condition; let $L$ be the corresponding continuous martingale. Define the processes $\tilde{W}_i$, $i = 1, \ldots, d$ by
\[ \tilde{W}_i(t) := W_i(t) + \int_0^t \gamma_i(s)\,ds, \qquad (0 \le t \le T), \quad i = 1, \ldots, d. \]
Then under the equivalent probability measure $\tilde{\mathbb{P}}$ (defined on $(\Omega, \mathcal{F}_T)$) with Radon-Nikodým derivative
\[ \frac{d\tilde{\mathbb{P}}}{d\mathbb{P}} = L(T), \]
e.2 4. Sd . 0 ≤ s ≤ t) slightly enlarged to satisfy the usual conditions) any pair of equivalent probability measures Q ∼ I on F = FT is a Girsanov pair. e e e it is the num´raire relevant for domestic transactions at time t. i. . . . Uncertainty in the ﬁnancial market is modelled by a probability space (Ω. . reﬂecting the fact that we allow short sales and assume that the assets are perfectly divisible. We assume that the σﬁeld F0 is trivial. was used as a num´raire. i=0 0 I E(ϕ2 (t))dt < ∞ a trading strategy (or dynamic portfolio i process). given by B(t) = er(t) with a positive deterministic process r(t) and r(0) = 0.2.g. and that FT = F.to be determined on the basis of information available before time t. Girsanov’s theorem (or the CameronMartinGirsanov theorem) is formulated in varying degrees of generality.2. Wd ) is ddimensional Brownian motion. the investor selects his time t portfolio after observing the prices S(t−).1. F. . .6 (continuous time). Although the current setting is on a much higher level of sophistication.because FT contains all that information. §5. Sd . ϕd (t)). and the reader may think of S0 (t) as being B(t). I ) and a ﬁltration I = (Ft )0≤t≤T P F satisfying the usual conditions of rightcontinuity and completeness. which we model as FT measurable random variables. for γ(t) constant (= γ). . (Dothan 1990). . We call an I d+1 valued predictable process R ϕ(t) = (ϕ0 (t). . which is (almost surely) strictly positive and use S0 as num´raire. The formal deﬁnition of a num´raire is very much as in the discrete setting.1 Financial Market Models The Financial Market Model We start with a general model of a frictionless (compare Chapter 1) security market where investors are allowed to trade continuously up to some ﬁxed ﬁnite planning horizon T . III. bonds. T d T t ∈ [0. Q P ˜ dQ Q dI P = L(t) Ft with L deﬁned as above. The components ϕi (t) may assume negative as well as positive values. . 
If I = (Ft ) is F 2 the Brownian ﬁltration (basically Ft = σ(W (s). . T ]. We will often have to impose further integrability conditions on the contingent claims under consideration. P P There are d + 1 primary traded assets. We have not emphasised so far that there was an implicit num´raire behind the prices S0 .e. (Protter 2004). or options).4 (discrete time). whose price processes are given by stochastic processes S0 . . . . (Revuz and Yor 1991). . Deﬁnition 4. 80 In particular. for every A ∈ F0 either I (A) = 0 or I (A) = 1. discussed and proved.e. §11. in (Karatzas and Shreve 1991).6. T ] with 0 I E(ϕ0 (t))dt < ∞.CHAPTER 4. . The fundamental concept in (arbitrage) pricing and hedging contingent claims is the interplay of selfﬁnancing replicating portfolios and riskneutral probabilities. T ] . change of measure by introducing the RadonNikod´m y derivative exp −γW (t) − 1 γ 2 t corresponds to a change of drift from c to c − γ.e. . e Our principal task will be the pricing and hedging of contingent claims. . Here ϕi (t) denotes the number of shares of asset i held in the portfolio at time t .5. We assume now that S0 (t) is a nondividend paying asset. . . CONTINUOUSTIME FINANCIAL MARKET MODELS ˜ ˜ ˜ the process W = (W1 . . §3. A num´raire is a price process X(t) almost surely strictly positive for each e t ∈ [0. i. This implies that the contingent claims specify a stochastic cashﬂow at time T and that they may depend on the whole path of the underlying in [0. We assume that the processes S0 . 4. ‘Historically’ (see (Harrison and Pliska 1981)) the money e market account B(t). VIII. Sd represent the prices of some traded assets (stocks. i. the key ideas remain the same.
Definition 4.2.2.
(i) The value of the portfolio $\varphi$ at time $t$ is given by the scalar product
\[ V_\varphi(t) := \varphi(t) \cdot S(t) = \sum_{i=0}^d \varphi_i(t)S_i(t), \qquad t \in [0, T]. \]
The process $V_\varphi(t)$ is called the value process, or wealth process, of the trading strategy $\varphi$.
(ii) The gains process $G_\varphi(t)$ is defined by
\[ G_\varphi(t) := \int_0^t \varphi(u)\,dS(u) = \sum_{i=0}^d \int_0^t \varphi_i(u)\,dS_i(u). \]
(iii) A trading strategy $\varphi$ is called self-financing if the wealth process $V_\varphi(t)$ satisfies
\[ V_\varphi(t) = V_\varphi(0) + G_\varphi(t) \quad \text{for all } t \in [0, T]. \]

Remark 4.2.1.
(i) The financial implications of the above equations are that all changes in the wealth of the portfolio are due to capital gains, as opposed to withdrawals of cash or injections of new funds.
(ii) The definition of a trading strategy includes regularity assumptions in order to ensure the existence of stochastic integrals.

Using the special numéraire $S_0(t)$ we consider the discounted price process
\[ \tilde{S}(t) := \frac{S(t)}{S_0(t)} = (1, \tilde{S}_1(t), \ldots, \tilde{S}_d(t)) \]
with $\tilde{S}_i(t) = S_i(t)/S_0(t)$, $i = 1, 2, \ldots, d$. Furthermore, the discounted wealth process $\tilde{V}_\varphi(t)$ is given by
\[ \tilde{V}_\varphi(t) := \frac{V_\varphi(t)}{S_0(t)} = \varphi_0(t) + \sum_{i=1}^d \varphi_i(t)\tilde{S}_i(t) \]
and the discounted gains process $\tilde{G}_\varphi(t)$ is
\[ \tilde{G}_\varphi(t) := \sum_{i=1}^d \int_0^t \varphi_i(u)\,d\tilde{S}_i(u). \]
Observe that $\tilde{G}_\varphi(t)$ does not depend on the numéraire component $\varphi_0$.

It is convenient to reformulate the self-financing condition in terms of the discounted processes:

Proposition 4.2.1. Let $\varphi$ be a trading strategy. Then $\varphi$ is self-financing if and only if
\[ \tilde{V}_\varphi(t) = \tilde{V}_\varphi(0) + \tilde{G}_\varphi(t). \]
Of course, $V_\varphi(t) \ge 0$ if and only if $\tilde{V}_\varphi(t) \ge 0$.

The proof follows by the numéraire invariance theorem using $S_0$ as numéraire.

Remark 4.2.2. The above result shows that a self-financing strategy is completely determined by its initial value and the components $\varphi_1, \ldots, \varphi_d$. In other words, any set of predictable processes $\varphi_1, \ldots, \varphi_d$ such that the stochastic integrals $\int \varphi_i\,d\tilde{S}_i$, $i = 1, \ldots, d$ exist can be uniquely extended to a self-financing strategy $\varphi$ with specified initial value $\tilde{V}_\varphi(0) = v$ by setting the cash holding as
\[ \varphi_0(t) = v + \sum_{i=1}^d \int_0^t \varphi_i(u)\,d\tilde{S}_i(u) - \sum_{i=1}^d \varphi_i(t)\tilde{S}_i(t). \]
4.2.2 Equivalent Martingale Measures

We develop a relative pricing theory for contingent claims. Again the underlying concept is the link between the no-arbitrage condition and certain probability measures. We begin with:

Definition 4.2.3. A self-financing trading strategy $\varphi$ is called an arbitrage opportunity if the wealth process $V_\varphi$ satisfies the following set of conditions:
\[ V_\varphi(0) = 0, \qquad \mathbb{P}(V_\varphi(T) \ge 0) = 1, \qquad \text{and} \qquad \mathbb{P}(V_\varphi(T) > 0) > 0. \]

Arbitrage opportunities represent the limitless creation of wealth through risk-free profit and thus should not be present in a well-functioning market.

The main tool in investigating arbitrage opportunities is the concept of equivalent martingale measures:

Definition 4.2.4. We say that a probability measure $\mathbb{Q}$ defined on $(\Omega, \mathcal{F})$ is an equivalent martingale measure if:
(i) $\mathbb{Q}$ is equivalent to $\mathbb{P}$,
(ii) the discounted price process $\tilde{S}$ is a $\mathbb{Q}$-martingale.
We denote the set of martingale measures by $\mathcal{P}$.

A useful criterion in determining whether a given equivalent measure is indeed a martingale measure is the observation that the growth rates relative to the numéraire of all given primary assets under the measure in question must coincide. For example, in the case $S_0(t) = B(t)$ we have:

Lemma 4.2.1. Assume $S_0(t) = B(t) = e^{r(t)}$, then $\mathbb{Q} \sim \mathbb{P}$ is a martingale measure if and only if every asset price process $S_i$ has price dynamics under $\mathbb{Q}$ of the form
\[ dS_i(t) = r(t)S_i(t)\,dt + dM_i(t), \]
where $M_i$ is a $\mathbb{Q}$-martingale.

The proof is an application of Itô's formula.

In order to proceed we have to impose further restrictions on the set of trading strategies.

Definition 4.2.5. A self-financing trading strategy $\varphi$ is called tame (relative to the numéraire $S_0$) if
\[ \tilde{V}_\varphi(t) \ge 0 \quad \text{for all } t \in [0, T]. \]
We use the notation $\Phi$ for the set of tame trading strategies.

We next analyse the value process under equivalent martingale measures for such strategies.

Proposition 4.2.2. For $\varphi \in \Phi$, $\tilde{V}_\varphi(t)$ is a martingale under each $\mathbb{Q} \in \mathcal{P}$.

This observation is the key to our first central result:

Theorem 4.2.1. Assume $\mathcal{P} \neq \emptyset$. Then the market model contains no arbitrage opportunities in $\Phi$.

Proof. For any $\varphi \in \Phi$ and under any $\mathbb{Q} \in \mathcal{P}$, $\tilde{V}_\varphi(t)$ is a martingale. That is,
\[ \mathbb{E}_{\mathbb{Q}}\big(\tilde{V}_\varphi(t) \mid \mathcal{F}_u\big) = \tilde{V}_\varphi(u), \quad \text{for all } u \le t \le T. \]
For $\varphi \in \Phi$ to be an arbitrage opportunity we must have $\tilde{V}_\varphi(0) = V_\varphi(0) = 0$. Now
\[ \mathbb{E}_{\mathbb{Q}}\big(\tilde{V}_\varphi(t)\big) = 0, \quad \text{for all } 0 \le t \le T. \]
Now $\varphi$ is tame, so $\tilde{V}_\varphi(t) \ge 0$, $0 \le t \le T$, implying $\mathbb{E}_{\mathbb{Q}}(\tilde{V}_\varphi(t)) = 0$, $0 \le t \le T$, and in particular $\mathbb{E}_{\mathbb{Q}}(\tilde{V}_\varphi(T)) = 0$. But an arbitrage opportunity $\varphi$ also has to satisfy $\mathbb{P}(V_\varphi(T) \ge 0) = 1$, and since $\mathbb{Q} \sim \mathbb{P}$, this means $\mathbb{Q}(V_\varphi(T) \ge 0) = 1$. Both together yield
\[ \mathbb{Q}(V_\varphi(T) > 0) = \mathbb{P}(V_\varphi(T) > 0) = 0, \]
and hence the result follows.

4.2.3 Risk-neutral Pricing

We now assume that there exists an equivalent martingale measure $\mathbb{P}^*$ which implies that there are no arbitrage opportunities with respect to $\Phi$ in the financial market model. Until further notice we use $\mathbb{P}^*$ as our reference measure, and when using the term martingale we always assume that the underlying probability measure is $\mathbb{P}^*$. In particular, we restrict our attention to contingent claims $X$ such that $X/S_0(T) \in L^1(\mathcal{F}, \mathbb{P}^*)$.

We now define a further subclass of trading strategies:

Definition 4.2.6. A self-financing trading strategy $\varphi$ is called ($\mathbb{P}^*$-) admissible if the relative gains process
\[ \tilde{G}_\varphi(t) = \int_0^t \varphi(u)\,d\tilde{S}(u) \]
is a ($\mathbb{P}^*$-) martingale. The class of all ($\mathbb{P}^*$-) admissible trading strategies is denoted $\Phi(\mathbb{P}^*)$.

By definition $\tilde{S}$ is a martingale, and $\tilde{G}$ is the stochastic integral with respect to $\tilde{S}$. We see that any sufficiently integrable processes $\varphi_1, \ldots, \varphi_d$ give rise to $\mathbb{P}^*$-admissible trading strategies.

We can repeat the above argument to obtain

Theorem 4.2.2. The financial market model $\mathcal{M}$ contains no arbitrage opportunities in $\Phi(\mathbb{P}^*)$.

Under the assumption that no arbitrage opportunities exist, the question of pricing and hedging a contingent claim reduces to the existence of replicating self-financing trading strategies. Formally:

Definition 4.2.7.
(i) A contingent claim $X$ is called attainable if there exists at least one admissible trading strategy such that
\[ V_\varphi(T) = X. \]
We call such a trading strategy $\varphi$ a replicating strategy for $X$.
(ii) The financial market model $\mathcal{M}$ is said to be complete if any contingent claim is attainable.

Again we emphasise that this depends on the class of trading strategies. On the other hand, it does not depend on the numéraire: it is an easy exercise in the continuous asset-price process case to show that if a contingent claim is attainable in a given numéraire it is also attainable in any other numéraire and the replicating strategies are the same.

If a contingent claim $X$ is attainable, $X$ can be replicated by a portfolio $\varphi \in \Phi(\mathbb{P}^*)$. This means that holding the portfolio and holding the contingent claim are equivalent from a financial point of view. In the absence of arbitrage the (arbitrage) price process $\Pi_X(t)$ of the contingent claim must therefore satisfy
\[ \Pi_X(t) = V_\varphi(t). \]
Of course the questions arise of what will happen if $X$ can be replicated by more than one portfolio, and what the relation of the price process to the equivalent martingale measure(s) is. The following central theorem is the key to answering these questions:
Theorem 4.2.3 (Risk-Neutral Valuation Formula). The arbitrage price process of any attainable claim is given by the risk-neutral valuation formula
\[ \Pi_X(t) = S_0(t)\, \mathbb{E}_{\mathbb{P}^*}\!\left[ \frac{X}{S_0(T)} \,\Big|\, \mathcal{F}_t \right]. \tag{4.6} \]

The uniqueness question is immediate from the above theorem:

Corollary 4.2.1. For any two replicating portfolios $\varphi, \psi \in \Phi(\mathbb{P}^*)$ we have $V_\varphi(t) = V_\psi(t)$.

Proof of Theorem 4.2.3. Since $X$ is attainable, there exists a replicating strategy $\varphi \in \Phi(\mathbb{P}^*)$ such that $V_\varphi(T) = X$ and $\Pi_X(t) = V_\varphi(t)$. Since $\varphi \in \Phi(\mathbb{P}^*)$ the discounted value process $\tilde V_\varphi(t)$ is a martingale, and hence
\[ \Pi_X(t) = V_\varphi(t) = S_0(t)\tilde V_\varphi(t) = S_0(t)\, \mathbb{E}_{\mathbb{P}^*}\!\left[ \tilde V_\varphi(T) \,\big|\, \mathcal{F}_t \right] = S_0(t)\, \mathbb{E}_{\mathbb{P}^*}\!\left[ \frac{V_\varphi(T)}{S_0(T)} \,\Big|\, \mathcal{F}_t \right] = S_0(t)\, \mathbb{E}_{\mathbb{P}^*}\!\left[ \frac{X}{S_0(T)} \,\Big|\, \mathcal{F}_t \right]. \]

4.2.4 The Black-Scholes Model

The Model

We concentrate on the classical Black-Scholes model
\[ dB(t) = rB(t)\,dt, \quad B(0) = 1, \]
\[ dS(t) = S(t)\,(b\,dt + \sigma\,dW(t)), \quad S(0) = p \in (0, \infty), \]
with constant coefficients $b \in \mathbb{R}$, $r, \sigma \in \mathbb{R}_+$. We write as usual $\tilde S(t) = S(t)/B(t)$ for the discounted stock price process (with the bank account being the natural numéraire), and get from Itô's formula
\[ d\tilde S(t) = \tilde S(t)\,\{(b - r)\,dt + \sigma\,dW(t)\}. \]

Equivalent Martingale Measure

Because we use the Brownian filtration any pair of equivalent probability measures $\mathbb{P} \sim \mathbb{Q}$ on $\mathcal{F}_T$ is a Girsanov pair, i.e.
\[ \frac{d\mathbb{Q}}{d\mathbb{P}}\Big|_{\mathcal{F}_t} = L(t) \quad \text{with} \quad L(t) = \exp\left\{ -\int_0^t \gamma(s)\,dW(s) - \frac{1}{2}\int_0^t \gamma(s)^2\,ds \right\}, \]
and $(\gamma(t) : 0 \le t \le T)$ a measurable, adapted d-dimensional process with $\int_0^T \gamma(t)^2\,dt < \infty$ a.s. By Girsanov's theorem 4.1.4 we have
\[ dW(t) = d\tilde W(t) - \gamma(t)\,dt, \]
where $\tilde W$ is a $\mathbb{Q}$-Wiener process. Thus the $\mathbb{Q}$-dynamics for $\tilde S$ are
\[ d\tilde S(t) = \tilde S(t)\,\{(b - r - \sigma\gamma(t))\,dt + \sigma\,d\tilde W(t)\}. \]
Since $\tilde S$ has to be a martingale under $\mathbb{Q}$ we must have
\[ b - r - \sigma\gamma(t) = 0, \quad t \in [0, T], \]
and so we must choose
\[ \gamma(t) \equiv \gamma = \frac{b - r}{\sigma} \]
(the 'market price of risk'). Indeed, this argument leads to a unique martingale measure, and we will make use of this fact later on. Using the product rule, we find the $\mathbb{Q}$-dynamics of $S$ as
\[ dS(t) = S(t)\,\{r\,dt + \sigma\,d\tilde W(t)\}. \]
We see that the appreciation rate $b$ is replaced by the interest rate $r$, hence the terminology risk-neutral (or yield-equating) martingale measure. We also know that we have a unique martingale measure $\mathbb{P}^*$ (recall $\gamma = (b - r)/\sigma$ in Girsanov's transformation).

Pricing and Hedging Contingent Claims

Recall that a contingent claim $X$ is an $\mathcal{F}_T$-measurable random variable such that $X/B(T) \in L^1(\Omega, \mathcal{F}_T, \mathbb{P}^*)$. (We write $\mathbb{E}^*$ for $\mathbb{E}_{\mathbb{P}^*}$ in this section.) By the risk-neutral valuation principle the price of a contingent claim $X$ is given by
\[ \Pi_X(t) = e^{-r(T-t)}\, \mathbb{E}^*[\, X \mid \mathcal{F}_t \,], \]
with $\mathbb{E}^*$ given via the Girsanov density
\[ L(t) = \exp\left\{ -\frac{b - r}{\sigma}\, W(t) - \frac{1}{2}\left( \frac{b - r}{\sigma} \right)^2 t \right\}. \]
Now consider a European call with strike $K$ and maturity $T$ on the stock $S$ (so $\Phi(T) = (S(T) - K)^+$). We can evaluate the above expected value (which is easier than solving the Black-Scholes partial differential equation) and obtain:

Proposition 4.2.3 (Black-Scholes Formula). The Black-Scholes price process of a European call is given by
\[ C(t) = S(t)\,N(d_1(S(t), T - t)) - K e^{-r(T-t)}\,N(d_2(S(t), T - t)). \tag{4.7} \]
The functions $d_1(s, t)$ and $d_2(s, t)$ are given by
\[ d_1(s, t) = \frac{\log(s/K) + (r + \frac{\sigma^2}{2})\,t}{\sigma\sqrt{t}}, \qquad d_2(s, t) = d_1(s, t) - \sigma\sqrt{t} = \frac{\log(s/K) + (r - \frac{\sigma^2}{2})\,t}{\sigma\sqrt{t}}. \]

Observe that we have already deduced this formula as a limit of a discrete-time setting.

To obtain a replicating portfolio we start in the discounted setting. We have from the risk-neutral valuation principle
\[ M(t) = \exp\{-rT\}\, \mathbb{E}^*[\, \Phi(S(T)) \mid \mathcal{F}_t \,]. \]
Now we use Itô's lemma to find the dynamics of the $\mathbb{P}^*$-martingale $M(t) = G(t, S(t))$:
\[ dM(t) = \sigma S(t)\, G_s(t, S(t))\, d\tilde W(t). \]
Using this representation, we get for the stock component of the replicating portfolio
\[ h(t) = \sigma S(t)\, G_s(t, S(t)). \]
Now for the discounted assets the stock component is
\[ \varphi_1(t) = G_s(t, S(t))\, B(t), \]
and using the self-financing condition the cash component is
\[ \varphi_0(t) = G(t, S(t)) - G_s(t, S(t))\, S(t). \]
To transfer this portfolio to undiscounted values we multiply it by the discount factor, i.e. $F(t, S(t)) = B(t)\, G(t, S(t))$, and get:

Proposition 4.2.4. The replicating strategy in the classical Black-Scholes model is given by
\[ \varphi_0 = \frac{F(t, S(t)) - F_s(t, S(t))\, S(t)}{B(t)}, \qquad \varphi_1 = F_s(t, S(t)). \]

We can also use an arbitrage approach to derive the Black-Scholes formula. For this consider a self-financing portfolio which has dynamics
\[ dV_\varphi(t) = \varphi_0(t)\,dB(t) + \varphi_1(t)\,dS(t) = (\varphi_0(t)\, r B(t) + \varphi_1(t)\, b S(t))\,dt + \varphi_1(t)\, \sigma S(t)\,dW(t). \]
Assume that the portfolio value can be written as
\[ V_\varphi(t) = V(t) = f(t, S(t)) \]
for a suitable function $f \in C^{1,2}$. Then by Itô's formula
\[ dV(t) = \left( f_t(t, S_t) + f_x(t, S_t)\, S_t b + \frac{1}{2} S_t^2 \sigma^2 f_{xx}(t, S_t) \right) dt + f_x(t, S_t)\, \sigma S_t\, dW_t. \]
Now we match the coefficients and find
\[ \varphi_1(t) = f_x(t, S_t) \quad \text{and} \quad \varphi_0(t) = \frac{1}{r B(t)} \left( f_t(t, S_t) + \frac{1}{2} \sigma^2 S_t^2 f_{xx}(t, S_t) \right). \]
Then looking at the total portfolio value we find that $f(t, x)$ must satisfy the following PDE
\[ f_t(t, x) + r x f_x(t, x) + \frac{1}{2} \sigma^2 x^2 f_{xx}(t, x) - r f(t, x) = 0 \tag{4.8} \]
and terminal condition $f(T, x) = (x - K)^+$.

In their original paper, Black and Scholes (1973) used an arbitrage pricing approach (rather than our risk-neutral valuation approach) to deduce the price of a European call as the solution of a partial differential equation (we call this the PDE approach). The idea is as follows: start by assuming that the option price $C(t)$ is given by $C(t) = f(t, S(t))$ for some sufficiently smooth function $f : \mathbb{R}_+ \times [0, T] \to \mathbb{R}$. By Itô's formula (Theorem 4.1.3) we find for the dynamics of the option price process (observe that we work under $\mathbb{P}$ so $dS = S(b\,dt + \sigma\,dW)$)
\[ dC = \left\{ f_t(t, S) + f_s(t, S)\, S b + \frac{1}{2} f_{ss}(t, S)\, S^2 \sigma^2 \right\} dt + f_s\, S \sigma\, dW. \tag{4.9} \]
Consider a portfolio $\psi$ consisting of a short position in $\psi_1(t) = f_s(t, S(t))$ stocks and a long position in $\psi_2(t) = 1$ call and assume the portfolio is self-financing. Then its value process is
\[ V_\psi(t) = -\psi_1(t)\, S(t) + C(t), \]
and by the self-financing condition we have (to ease the notation we omit the arguments)
\[ dV_\psi = -\psi_1\, dS + dC = -f_s (S b\, dt + S \sigma\, dW) + \left( f_t + f_s S b + \frac{1}{2} f_{ss} S^2 \sigma^2 \right) dt + f_s S \sigma\, dW = \left( f_t + \frac{1}{2} f_{ss} S^2 \sigma^2 \right) dt. \]
So the dynamics of the value process of the portfolio do not have any exposure to the driving Brownian motion, and its appreciation rate in an arbitrage-free world must therefore equal the risk-free rate, i.e.
\[ dV_\psi(t) = r V_\psi(t)\, dt = (-r f_s S + r C)\, dt. \]
Comparing the coefficients and using $C(t) = f(t, S(t))$, we must have
\[ -r S f_s + r f = f_t + \frac{1}{2} \sigma^2 S^2 f_{ss}. \]
This leads again to the Black-Scholes partial differential equation (4.8) for $f$, i.e.
\[ f_t + r s f_s + \frac{1}{2} \sigma^2 s^2 f_{ss} - r f = 0. \]
Since $C(T) = (S(T) - K)^+$ we need to impose the terminal condition $f(T, s) = (s - K)^+$ for all $s \in \mathbb{R}_+$.

Note. One point in the justification of the above argument is missing: we have to show that the trading strategy short $\psi_1$ stocks and long one call is self-financing. In fact, this is not true, since $\psi_1 = \psi_1(t, S(t))$ is dependent on the stock price process. Formally, for the self-financing condition to be true we must have
\[ dV_\psi(t) = -d(\psi_1(t) S(t)) + dC(t) = -\psi_1(t)\, dS(t) + dC(t). \]
Now $\psi_1(t) = \psi_1(t, S(t))$ depends on the stock price and so we have
\[ d(\psi_1(t, S(t))\, S(t)) = \psi_1(t)\, dS(t) + S(t)\, d\psi_1(t, S(t)) + d\langle \psi_1, S \rangle(t). \]
We see that the portfolio $\psi$ is self-financing if
\[ S(t)\, d\psi_1(t, S(t)) + d\langle \psi_1, S \rangle(t) = 0. \]
It is an exercise in Itô calculus to show that this is not the case.

4.2.5 The Greeks

We will now analyse the impact of the underlying parameters in the standard Black-Scholes model on the prices of call and put options. The Black-Scholes option values depend on the (current) stock price, the volatility, the time to maturity, the interest rate and the strike price. The sensitivities of the option price with respect to the first four parameters are called the Greeks and are widely used for hedging purposes. We can determine the impact of these parameters by taking partial derivatives. Recall the Black-Scholes formula for a European call (4.7):
\[ C(0) = C(S, T, K, r, \sigma) = S\, N(d_1(S, T)) - K e^{-rT} N(d_2(S, T)), \]
with the functions $d_1(s, t)$ and $d_2(s, t)$ given by
\[ d_1(s, t) = \frac{\log(s/K) + (r + \frac{\sigma^2}{2})\,t}{\sigma\sqrt{t}}, \qquad d_2(s, t) = d_1(s, t) - \sigma\sqrt{t} = \frac{\log(s/K) + (r - \frac{\sigma^2}{2})\,t}{\sigma\sqrt{t}}. \]
One obtains
\[ \Delta := \frac{\partial C}{\partial S} = N(d_1) > 0, \]
\[ \mathcal{V} := \frac{\partial C}{\partial \sigma} = S\sqrt{T}\, n(d_1) > 0, \]
\[ \Theta := \frac{\partial C}{\partial T} = \frac{S\sigma}{2\sqrt{T}}\, n(d_1) + K r e^{-rT} N(d_2) > 0, \]
\[ \rho := \frac{\partial C}{\partial r} = T K e^{-rT} N(d_2) > 0, \]
\[ \Gamma := \frac{\partial^2 C}{\partial S^2} = \frac{n(d_1)}{S\sigma\sqrt{T}} > 0. \]
(As usual $N$ is the cumulative normal distribution function and $n$ is its density.) From the definitions it is clear that $\Delta$ (delta) measures the change in the value of the option compared with the change in the value of the underlying asset, $\mathcal{V}$ (vega) measures the change of the option compared with the change in the volatility of the underlying, and similar statements hold for $\Theta$ (theta) and $\rho$ (rho) (observe that these derivatives are in line with our arbitrage-based considerations in §1.4). Furthermore, $\Delta$ gives the number of shares in the replication portfolio for a call option (see Proposition 4.2.4), so $\Gamma$ measures the sensitivity of our portfolio to the change in the stock price.

The Black-Scholes partial differential equation (4.8) can be used to obtain the relation between the Greeks, i.e. (observe that $\Theta$ is the derivative of $C$, the price of a European call, with respect to the time to expiry $T - t$, while in the Black-Scholes PDE the partial derivative with respect to the current time $t$ appears)
\[ r C = \frac{1}{2} s^2 \sigma^2 \Gamma + r s \Delta - \Theta. \]
Let us now compute the dynamics of the call option's price $C(t)$ under the risk-neutral martingale measure $\mathbb{P}^*$. Using formula (4.9) we find
\[ dC(t) = r C(t)\, dt + \sigma N(d_1(S(t), T - t))\, S(t)\, d\tilde W(t). \]
Defining the elasticity coefficient of the option's price as
\[ \eta^c(t) = \frac{\Delta(S(t), T - t)\, S(t)}{C(t)} = \frac{N(d_1(S(t), T - t))\, S(t)}{C(t)}, \]
we can rewrite the dynamics as
\[ dC(t) = r C(t)\, dt + \sigma \eta^c(t)\, C(t)\, d\tilde W(t). \]
So, as expected in the risk-neutral world, the appreciation rate of the call option equals the risk-free rate $r$. The volatility coefficient is $\sigma \eta^c$, and hence stochastic. It is precisely this feature that causes difficulties when assessing the impact of options in a portfolio.
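The Black-Scholes formula (4.7) and the Greeks listed in this section can be evaluated directly. The following is a minimal Python sketch, not part of the original notes; the function names and the sample parameter values are illustrative assumptions, and $\Theta$ is taken with respect to the time to expiry, as in the text.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density n."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def d1_d2(s, k, r, sigma, t):
    """d_1(s, t) and d_2(s, t) for strike k and time to expiry t > 0."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    return d1, d1 - sigma * math.sqrt(t)

def call_price(s, k, r, sigma, t):
    """Black-Scholes price of a European call, formula (4.7)."""
    d1, d2 = d1_d2(s, k, r, sigma, t)
    return s * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

def call_greeks(s, k, r, sigma, t):
    """Greeks of the call; theta is the derivative in the time to expiry."""
    d1, d2 = d1_d2(s, k, r, sigma, t)
    disc_k = k * math.exp(-r * t)
    return {
        "delta": norm_cdf(d1),
        "vega": s * math.sqrt(t) * norm_pdf(d1),
        "theta": s * sigma * norm_pdf(d1) / (2.0 * math.sqrt(t))
                 + r * disc_k * norm_cdf(d2),
        "rho": t * disc_k * norm_cdf(d2),
        "gamma": norm_pdf(d1) / (s * sigma * math.sqrt(t)),
    }

# Illustrative (hypothetical) parameters: at-the-money call, one year to expiry.
s, k, r, sigma, t = 100.0, 100.0, 0.05, 0.2, 1.0
c = call_price(s, k, r, sigma, t)
g = call_greeks(s, k, r, sigma, t)
# Residual of the relation r*C = 0.5*s^2*sigma^2*Gamma + r*s*Delta - Theta.
pde_residual = r * c - (0.5 * s ** 2 * sigma ** 2 * g["gamma"]
                        + r * s * g["delta"] - g["theta"])
```

The variable `pde_residual` should vanish up to rounding, which gives a quick numerical check of the relation between the Greeks implied by (4.8).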
4.2.6 Barrier Options

The question of whether or not a particular stock will attain a particular level within a specified period has long been an important one for risk managers. From at least 1967, predating both CBOE and Black-Scholes in 1973, practitioners have sought to reduce their exposure to specific risks of this kind by buying options designed with such barrier-crossing events in mind. As usual, the motivation is that buying specific options (that is, taking out specific insurance) is a cheaper way of covering oneself against a specific danger than buying a more general one.

One-barrier options specify a stock-price level, $H$ say, such that the option pays ('knocks in') or not ('knocks out') according to whether or not level $H$ is attained, from below ('up') or above ('down'). There are thus four possibilities: 'up and in', 'up and out', 'down and in' and 'down and out'. Since barrier options are path-dependent (they involve the behaviour of the path, rather than just the current price or price at expiry), they may be classified as exotic; alternatively, the four basic one-barrier types above may be regarded as 'vanilla barrier' options, with their more complicated variants, described below, as 'exotic barrier' options. Note that holding both a knock-in option and the corresponding knock-out is equivalent to the corresponding vanilla option with the barrier removed. The sum of the prices of the knock-in and the knock-out is thus the price of the vanilla, again showing the attractiveness of barrier options as being cheaper than their vanilla counterparts.

A barrier option is often designed to pay a rebate (a sum specified in advance) to compensate the holder if the option is rendered otherwise worthless by hitting/not hitting the barrier. We restrict attention to zero rebate here for simplicity.

Consider, to be specific, a down-and-out call option with strike $K$ and barrier $H$ (the other possibilities may be handled similarly). The payoff is (unless otherwise stated min and max are over $[0, T]$)
\[ (S(T) - K)^+\, \mathbf{1}_{\{\min S(\cdot) \ge H\}} = (S(T) - K)\, \mathbf{1}_{\{S(T) \ge K,\ \min S(\cdot) \ge H\}}, \]
so by risk-neutral pricing the value of the option is
\[ DOC_{K,H} := \mathbb{E}\left[ e^{-rT} (S(T) - K)\, \mathbf{1}_{\{S(T) \ge K,\ \min S(\cdot) \ge H\}} \right], \]
where $S$ is geometric Brownian motion, $S(t) = p_0 \exp\{(\mu - \frac{1}{2}\sigma^2)t + \sigma W(t)\}$. Write $c := (\mu - \frac{1}{2}\sigma^2)/\sigma$; then $\min S(\cdot) \ge H$ iff $\min(ct + W(t)) \ge \sigma^{-1} \log(H/p_0)$. Writing $X$ for $X(t) := ct + W(t)$ (drifting Brownian motion with drift $c$), and $m$, $M$ for its minimum and maximum processes
\[ m(t) := \min\{X(s) : s \in [0, t]\}, \qquad M(t) := \max\{X(s) : s \in [0, t]\}, \]
the payoff function involves the bivariate process $(X, m)$, and the option price involves the joint law of this process.

Consider first the case $c = 0$: we require the joint law of standard Brownian motion and its maximum or minimum, $(W, M)$ or $(W, m)$. Taking $(W, M)$ for definiteness, we start the Brownian motion $W$ at the origin at time zero, choose a level $b > 0$, and run the process until the first-passage time (see Exercise 5.2)
\[ \tau(b) := \inf\{t \ge 0 : W(t) \ge b\} \]
at which the level $b$ is first attained. This is a stopping time, and we may use the strong Markov property for $W$ at time $\tau(b)$. The process now begins afresh at level $b$, and by symmetry the probabilistic properties of its further evolution are invariant under reflection in the level $b$ (thought of as a mirror). This reflection principle leads to the joint density of $(W(t), M(t))$ as
\[ \mathbb{P}_0(W(t) \in dx,\ M(t) \in dy) = \frac{2(2y - x)}{\sqrt{2\pi t^3}} \exp\left\{ -\frac{1}{2}(2y - x)^2 / t \right\} \quad (0 \le x \le y), \]
a formula due to Lévy. (Lévy also obtained the identity in law of the bivariate processes $(M(t) - W(t), M(t))$ and $(|W(t)|, L(t))$, where $L$ is the local time process of $W$ at zero: see e.g. Revuz and Yor (1991), VI.) The idea behind the reflection principle goes back to work of Désiré André in 1887, and indeed further, to the method of images of Lord Kelvin (1824-1907), then Sir William Thomson, of 1848 on electrostatics. For background on this, see any good book on electromagnetism, e.g. Jeans (1925), Chapter VIII.

Lévy's formula for the joint density of $(W(t), M(t))$ may be extended to the case of general drift $c$ by the usual method for changing drift, Girsanov's theorem. The general result is
\[ \mathbb{P}_0(X(t) \in dx,\ M(t) \in dy) = \frac{2(2y - x)}{\sqrt{2\pi t^3}} \exp\left\{ -\frac{(2y - x)^2}{2t} + cx - \frac{1}{2} c^2 t \right\} \quad (0 \le x \le y). \]
See e.g. Rogers and Williams (1994), I, (13.10), or Harrison (1985), §1.8. As an alternative to the probabilistic approach above, a second approach to this formula makes explicit use of Kelvin's language (mirrors, sources, sinks); see e.g. Cox and Miller (1972), §5.7.

Given such an explicit formula for the joint density of $(X(t), M(t))$, or equivalently $(X(t), m(t))$, we can calculate the option price by integration. The factor $S(T) - K$, or $S - K$, gives rise to two terms, in $S$ and $K$, while the integrals, involving relatives of the normal density function $n$, may be obtained explicitly in terms of the normal distribution function $N$; both features are familiar from the Black-Scholes formula. Indeed, this resemblance makes it convenient to decompose the price $DOC_{K,H}$ of the down-and-out call into the (Black-Scholes) price of the corresponding vanilla call, $C_K$ say, and the knockout discount, $KOD_{K,H}$ say, by which the knockout barrier at $H$ lowers the price:
\[ DOC_{K,H} = C_K - KOD_{K,H}. \]
The details of the integration, which are tedious, are omitted; the result is, writing $\lambda := r - \frac{1}{2}\sigma^2$,
\[ KOD_{K,H} = p_0 (H/p_0)^{2 + 2\lambda/\sigma^2} N(c_1) - K e^{-rT} (H/p_0)^{2\lambda/\sigma^2} N(c_2), \]
where $p_0$ is the initial stock price as usual and $c_1$, $c_2$ are functions of the price $p = p_0$ and time $t = T$ given by
\[ c_{1,2}(p, t) = \frac{\log(H^2 / pK) + (r \pm \frac{1}{2}\sigma^2)\, t}{\sigma\sqrt{t}} \]
(the notation is that of the excellent text Musiela and Rutkowski (1997), §9.6, to which we refer for further detail). The other cases of vanilla barrier options, and their sensitivity analysis, are given in detail in Zhang (1997), Chapter 10.
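The decomposition $DOC_{K,H} = C_K - KOD_{K,H}$ can be evaluated numerically. The Python sketch below is an illustration only and is not taken from the notes: the function names are hypothetical, and it assumes $0 < H \le \min(p_0, K)$, the regime in which the closed form quoted above is normally applied.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def vanilla_call(p0, k, r, sigma, t):
    """Black-Scholes price C_K of the vanilla call."""
    vol = sigma * math.sqrt(t)
    d1 = (math.log(p0 / k) + (r + 0.5 * sigma ** 2) * t) / vol
    return p0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d1 - vol)

def knockout_discount(p0, k, h, r, sigma, t):
    """Knockout discount KOD_{K,H} with lambda = r - sigma^2 / 2."""
    lam = r - 0.5 * sigma ** 2
    vol = sigma * math.sqrt(t)
    log_term = math.log(h * h / (p0 * k))
    c1 = (log_term + (r + 0.5 * sigma ** 2) * t) / vol
    c2 = (log_term + (r - 0.5 * sigma ** 2) * t) / vol
    ratio = h / p0
    return (p0 * ratio ** (2.0 + 2.0 * lam / sigma ** 2) * norm_cdf(c1)
            - k * math.exp(-r * t) * ratio ** (2.0 * lam / sigma ** 2) * norm_cdf(c2))

def down_and_out_call(p0, k, h, r, sigma, t):
    """Zero-rebate down-and-out call; assumes 0 < h <= min(p0, k)."""
    if not 0.0 < h <= min(p0, k):
        raise ValueError("this sketch assumes 0 < h <= min(p0, k)")
    return vanilla_call(p0, k, r, sigma, t) - knockout_discount(p0, k, h, r, sigma, t)

# Illustrative (hypothetical) parameters.
vanilla = vanilla_call(100.0, 100.0, 0.05, 0.2, 1.0)
barrier = down_and_out_call(100.0, 100.0, 90.0, 0.05, 0.2, 1.0)
```

Two limiting cases give a sanity check on the formula: when the stock starts on the barrier ($p_0 = H$) the option is worthless, and when the barrier is far below the stock price the knockout discount vanishes and the vanilla price is recovered.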
Chapter 5 Interest Rate Theory

5.1 The Bond Market

5.1.1 The Term Structure of Interest Rates

We start with a heuristic discussion, which we will formalize in the following section. The main traded objects we consider are zero-coupon bonds. A zero-coupon bond is a bond that has no coupon payments. The price of a zero-coupon bond at time $t$ that pays, say, a sure £ at time $T \ge t$ is denoted $p(t, T)$. All zero-coupon bonds are assumed to be default-free and have strictly positive prices. Various different interest rates are defined in connection with zero-coupon bonds, but we will only consider continuously compounded interest rates (which facilitates theoretical considerations).

Using the arbitrage pricing technique, we easily obtain pricing formulas for coupon bonds. Coupon bonds are bonds with regular interest payments, called coupons, plus a principal repayment at maturity. Let $c_j$ be the payments at times $t_j$, $j = 1, \ldots, n$, and $F$ be the face value paid at time $t_n$. Then the price of the coupon bond $B_c$ must satisfy
\[ B_c = \sum_{j=1}^{n} c_j\, p(0, t_j) + F\, p(0, t_n). \tag{5.1} \]
Hence, we see that a coupon bond is equivalent to a portfolio of zero-coupon bonds.

The yield-to-maturity is defined as an interest rate per annum that equates the present value of future cash flows to the current market value. Using continuous compounding, the yield-to-maturity $y_c$ is defined by the relation
\[ B_c = \sum_{j=1}^{n} c_j \exp\{-y_c t_j\} + F \exp\{-y_c t_n\}. \]
If for instance the $t_j$, $j = 1, \ldots, n$ are expressed in years, then $y_c$ is an annual continuously compounded yield-to-maturity (with the continuously compounded annual interest rate, $r(T)$, defined by the relation $p(0, T) = \exp\{-r(T)\,(T/365)\}$).

The term structure of interest rates is defined as the relationship between the yield-to-maturity on a zero-coupon bond and the bond's maturity. Normally, this yields an upward sloping curve (as in figure 5.1), but flat and downward sloping curves have also been observed.

In constructing the term structure of interest rates, we face the additional problem that in most economies no zero-coupon bonds with maturity greater than one year are traded (in the USA, Treasury bills are only traded with maturity up to one year). We can, however, use prices of coupon bonds and invert formula (5.1) for zero-coupon prices. In practice, additional complications arise, since the maturities of coupon bonds are not equally spaced and trading in bonds with some maturities may be too thin to give reliable prices. We refer the reader to Jarrow and Turnbull (2000) for further discussion of these issues.
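The pricing relation (5.1) and the definition of the yield-to-maturity translate into a small root-finding problem. The sketch below is an illustration in Python rather than part of the notes; the function names, the bisection bracket and the sample cash flows are assumptions chosen for the example.

```python
import math

def coupon_bond_price(coupons, times, face, discount):
    """Formula (5.1): coupons c_j at times t_j plus face value at t_n.

    `discount` is a function T -> p(0, T) giving zero-coupon bond prices.
    """
    coupon_value = sum(c * discount(t) for c, t in zip(coupons, times))
    return coupon_value + face * discount(times[-1])

def price_from_yield(y, coupons, times, face):
    """Present value of the cash flows at a flat continuously compounded yield y."""
    coupon_value = sum(c * math.exp(-y * t) for c, t in zip(coupons, times))
    return coupon_value + face * math.exp(-y * times[-1])

def yield_to_maturity(price, coupons, times, face, lo=-1.0, hi=1.0, tol=1e-12):
    """Solve price = price_from_yield(y) for y by bisection.

    For positive cash flows the present value is strictly decreasing in y,
    so a sign change on [lo, hi] brackets the unique solution.
    """
    if price_from_yield(lo, coupons, times, face) < price:
        raise ValueError("lower bracket too high")
    if price_from_yield(hi, coupons, times, face) > price:
        raise ValueError("upper bracket too low")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if price_from_yield(mid, coupons, times, face) > price:
            lo = mid  # present value too high, so the yield must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical example: three annual coupons of 5, face value 100,
# priced off a flat zero-coupon curve p(0, T) = exp(-0.04 T).
coupons, times, face = [5.0, 5.0, 5.0], [1.0, 2.0, 3.0], 100.0
price = coupon_bond_price(coupons, times, face, lambda T: math.exp(-0.04 * T))
ytm = yield_to_maturity(price, coupons, times, face)
```

With a flat zero-coupon curve the yield-to-maturity coincides with the flat rate, which is what the example recovers; for a non-flat curve the yield is a single summary rate that generally differs from every individual zero rate.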
[Figure 5.1: Yield curve (yield plotted against maturity).]

5.1.2 Mathematical Modelling

Let $(\Omega, \mathcal{F}, \mathbb{P}, \mathbb{F})$ be a filtered probability space with a filtration $\mathbb{F} = (\mathcal{F}_t)_{t \le T^*}$ satisfying the usual conditions (used to model the flow of information) and fix a terminal time horizon $T^*$. We assume that all processes are defined on this probability space. The basic building blocks for our relative pricing approach, zero-coupon bonds, are defined as follows.

Definition 5.1.1. A zero-coupon bond with maturity date $T$, also called a $T$-bond, is a contract that guarantees the holder a cash payment of one unit on the date $T$. The price at time $t$ of a bond with maturity date $T$ is denoted by $p(t, T)$.

Obviously we have $p(t, t) = 1$ for all $t$. We shall assume that the price process $p(t, T)$, $t \in [0, T]$ is adapted and strictly positive and that for every fixed $t$, $p(t, T)$ is continuously differentiable in the $T$ variable.

Based on arbitrage considerations (recall our basic aim is to construct a market model that is free of arbitrage), we now define several risk-free interest rates. Given three dates $t < T_1 < T_2$ the basic question is: what is the risk-free rate of return, determined at the contract time $t$, over the interval $[T_1, T_2]$ of an investment of 1 at time $T_1$? To answer this question we consider the arbitrage Table 5.1 below (compare §1.3 for the use of arbitrage tables).

Time              t                                    T_1          T_2
                  Sell T_1 bond                        Pay out 1
                  Buy p(t,T_1)/p(t,T_2) T_2 bonds                   Receive p(t,T_1)/p(t,T_2)
Net investment    0                                    -1           +p(t,T_1)/p(t,T_2)

Table 5.1: Arbitrage table for forward rates

To exclude arbitrage opportunities, the equivalent constant rate of interest $R$ over this period (we pay out 1 at time $T_1$ and receive $e^{R(T_2 - T_1)}$ at $T_2$) has thus to be given by
\[ e^{R(T_2 - T_1)} = \frac{p(t, T_1)}{p(t, T_2)}. \]
We formalize this in:

Definition 5.1.2. (i) The forward rate for the period $[T_1, T_2]$ as seen at time $t$ is defined as
\[ R(t; T_1, T_2) = -\frac{\log p(t, T_2) - \log p(t, T_1)}{T_2 - T_1}. \]
(ii) The spot rate $R(T_1, T_2)$ for the period $[T_1, T_2]$ is defined as
\[ R(T_1, T_2) = R(T_1; T_1, T_2). \]
(iii) The instantaneous forward rate with maturity $T$, at time $t$, is defined by
\[ f(t, T) = -\frac{\partial \log p(t, T)}{\partial T}. \]
(iv) The instantaneous short rate at time $t$ is defined by
\[ r(t) = f(t, t). \]

Definition 5.1.3. The money account process is defined by
\[ B(t) = \exp\left\{ \int_0^t r(s)\, ds \right\}. \]

The interpretation of the money market account is a strategy of instantaneously reinvesting at the current short rate.

Lemma 5.1.1. For $t \le s \le T$ we have
\[ p(t, T) = p(t, s) \exp\left\{ -\int_s^T f(t, u)\, du \right\}, \]
and in particular
\[ p(t, T) = \exp\left\{ -\int_t^T f(t, s)\, ds \right\}. \]

In what follows, we model the above processes in a generalized Black-Scholes framework. That is, we assume that $W = (W_1, \ldots, W_d)$ is a standard d-dimensional Brownian motion and the filtration $\mathbb{F}$ is the augmentation of the filtration generated by $W(t)$. The dynamics of the various processes are given as follows:

Short-rate Dynamics:
\[ dr(t) = a(t)\, dt + b(t)\, dW(t), \tag{5.2} \]
Bond-price Dynamics:
\[ dp(t, T) = p(t, T)\, \{m(t, T)\, dt + v(t, T)\, dW(t)\}, \tag{5.3} \]
Forward-rate Dynamics:
\[ df(t, T) = \alpha(t, T)\, dt + \sigma(t, T)\, dW(t). \tag{5.4} \]
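The rates of Definition 5.1.2 are simple functionals of the zero-coupon curve. The following Python sketch (an illustration, not part of the notes) computes them for a hypothetical discount curve; the curve, the function names and the finite-difference step are assumptions made for the example.

```python
import math

def forward_rate(p_t1, p_t2, t1, t2):
    """R(t; T1, T2) from the bond prices p(t, T1) and p(t, T2)."""
    return -(math.log(p_t2) - math.log(p_t1)) / (t2 - t1)

def spot_rate(p_t2, t1, t2):
    """R(T1, T2) = R(T1; T1, T2), using p(T1, T1) = 1."""
    return forward_rate(1.0, p_t2, t1, t2)

def instantaneous_forward(price_fn, t, maturity, h=1e-5):
    """f(t, T) = -d/dT log p(t, T), approximated by a central difference."""
    up = math.log(price_fn(t, maturity + h))
    down = math.log(price_fn(t, maturity - h))
    return -(up - down) / (2.0 * h)

def example_curve(t, maturity):
    """Hypothetical time-0 curve with f(0, T) = 0.03 + 0.01 T (t is ignored)."""
    return math.exp(-(0.03 * maturity + 0.005 * maturity ** 2))

p1, p2 = example_curve(0.0, 1.0), example_curve(0.0, 2.0)
fwd = forward_rate(p1, p2, 1.0, 2.0)                     # rate over [1, 2] seen at 0
inst = instantaneous_forward(example_curve, 0.0, 1.5)    # f(0, 1.5)
```

In this example the forward rate over $[1, 2]$ equals the average of the instantaneous forward rates over that interval, in line with Lemma 5.1.1, and it reproduces the no-arbitrage relation $e^{R(T_2 - T_1)} = p(t, T_1)/p(t, T_2)$ from the arbitrage table.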
T ). then p(t. T )dW (t). u)du. t) + α(t. (ii) If f (t. This leads to t t f (t. t) + 0 α(s. . T ) − mT (t. Proposition 5.4). (i) If p(t. we refer the reader to Bj¨rk (1997). Since the actual conditions are rather technical.4). T ) = p(t. To prove (ii) we start by integrating the forwardrate dynamics. T ) satisﬁes (5. t). b(t) = σ(t. existence of solutions of the various stochastic diﬀerential equations. Heath. S(t. T T α(t. T ) satisﬁes (5. Furthermore. t)ds + 0 σ(s. T )dW (t). then the short rate satisﬁes dr(t) = a(t)dt + b(t)dW (t).3). where α and σ are given by α(t. t) = r(t) = f (0. (5. Following Bj¨rk (1997) for formulation and proof. t) = σ(s.1. T )dt + σ(t. t)dW (s). where a and b are given by a(t) = fT (t. T ) satisﬁes dp(t. and Morton (1992) and Protter (2004) (the latter reference for the stochastic Fubini theorem) for these conditions. T ) = − t σ(t. T )S(t. s)ds. T ) + where A(t. (5. s)ds. INTEREST RATE THEORY 94 We assume that in the above formulas. To prove (i) we only have to apply Itˆ’s formula to the deﬁning equation for the o forward rates. σ(t. T ) r(t) + A(t. the coeﬃcients meet standard conditions required to guarantee the existence of the various processes – that is. s) + s σT (s. t).7) Writing also α and σ in integrated form t α(s.5) (iii) If f (t.CHAPTER 5. s) + s t αT (s. we now give a small toolbox for the relao tionships between the processes speciﬁed above. T ) = vT (t. Jarrow. T )v(t. T ) = − t 1 S(t. t) = α(s.6) Proof. T ) satisﬁes (5. T ). σ(s. T ) = −vT (t. (5. u)du. then for the forwardrate dynamics we have df (t. we assume that the processes are smooth enough to allow diﬀerentiation and certain operations involving changing of order of integration and o diﬀerentiation. T ) = α(t. T ) 2 2 dt + p(t.1.
s)dt + 0 σ(u. s)dsdu − t s 0 u σ(u.4) in integrated form: t t f (t.7). where Z is given by T Z(t. o By the deﬁnition of the forward rates we may write the bondprice process as p(t. T ) = − t f (0. s) + 0 α(u. T ) + 0 r(s)ds − 0 u α(u. T )dW (t). T )} to complete the proof. s)ds + t 0 T u α(u. INTEREST RATE THEORY and inserting this into (5. s)dsdW (u). s) = f (0. s)ds. the stochastic diﬀerential of Z is given by dZ(t. After changing the order of integration we can identify terms to establish (ii). T )}. o . The last line is just the integrated form of the forwardrate dynamics (5. s)dsdu − 0 u σ(u. and Morton (1992). s)dsdu − 0 t u t σ(u. s)dsdu + t 0 T u σ(u. So we obtain t t T t T Z(t. Now we can apply Itˆ’s lemma to the process p(t.6). t) + t 0 α(s. T ) = (r(t) + A(t. u)dudW (s). s). s)dW (u). s)ds − 0 t u t α(u. We insert the integrated form in (5. For (iii) we use a technique from Heath. Jarrow. Now. s)dW (u)ds. (5. s)dW (s) + 0 s σT (s. T ) = exp{Z(t.CHAPTER 5. we ﬁnd t t t 95 r(t) = f (0. t Since r(s) = f (s. Using A and S from (5. T ) − t 0 u α(u. T ) = − t f (t. T ))dt + S(t.8) to get T t T t T Z(t.8) Again we write (5. s)dsdu − 0 t σ(u. s)ds − 0 t α(u. s)ds + t 0 t s αT (s. s)ds + 0 0 α(u. s)dsdW (u). T ) = − 0 t f (0. T ) = exp{Z(t. T ) = Z(0. s]. compare Bj¨rk (1997).4) over the interval [0. u)duds + 0 σ(s. s)dsdW (u) + 0 f (0. splitting the integrals and changing the order of integration gives us T t T t T Z(t. s)dsdW (u) = Z(0. this last line above equals 0 r(s)ds. s)dsdW (u) t s + 0 f (0. s)duds + 0 0 σ(u.
5.1.3 Bond Pricing, Martingale Measures and Trading Strategies

We will now examine the mathematical structure of our bond-market model in more detail. As usual, our first task is to find a convenient characterization of the no-arbitrage assumption. By Theorem 4.2.1, absence of arbitrage is guaranteed by the existence of an equivalent martingale measure $\mathbb{Q}$. Recall that by definition an equivalent martingale measure has to satisfy $\mathbb{Q} \sim \mathbb{P}$ and the discounted price processes (with respect to a suitable numéraire) of the basic securities have to be $\mathbb{Q}$-martingales. For the bond market this implies that all zero-coupon bonds with maturities $0 \le T \le T^*$ have to be martingales. More precisely, taking the risk-free bank account $B(t)$ as numéraire we have

Definition 5.1.4. A measure $\mathbb{Q} \sim \mathbb{P}$ defined on $(\Omega, \mathcal{F}, \mathbb{P})$ is an equivalent martingale measure for the bond market, if for every fixed $0 \le T \le T^*$ the process
\[ \frac{p(t, T)}{B(t)}, \quad 0 \le t \le T, \]
is a $\mathbb{Q}$-martingale.

Assume now that there exists at least one equivalent martingale measure, say $\mathbb{Q}$. Defining contingent claims as $\mathcal{F}_T$-measurable random variables such that $X/B(T) \in L^1(\mathcal{F}_T, \mathbb{Q})$ with some $0 \le T \le T^*$ (notation: $T$-contingent claims), we can use the risk-neutral valuation principle (4.6) to obtain:

Proposition 5.1.2. Consider a $T$-contingent claim $X$. Then the price process $\Pi_X(t)$, $0 \le t \le T$ of the contingent claim is given by
\[ \Pi_X(t) = B(t)\, \mathbb{E}_{\mathbb{Q}}\!\left[ \frac{X}{B(T)} \,\Big|\, \mathcal{F}_t \right] = \mathbb{E}_{\mathbb{Q}}\!\left[ X e^{-\int_t^T r(s)\, ds} \,\Big|\, \mathcal{F}_t \right]. \]
In particular, the price process of a zero-coupon bond with maturity $T$ is given by
\[ p(t, T) = \mathbb{E}_{\mathbb{Q}}\!\left[ e^{-\int_t^T r(s)\, ds} \,\Big|\, \mathcal{F}_t \right]. \]

Proof. We just have to apply Theorem 4.2.3.

We thus see that the relevant dynamics of the price processes are those given under a martingale measure $\mathbb{Q}$. The implication for model building is that it is natural to model all objects directly under a martingale measure $\mathbb{Q}$. This approach is called martingale modelling. The price one has to pay in using this approach lies in the statistical problems associated with parameter estimation.

5.2 Short-rate Models

Following our introductory remarks, we now look at models of the short rate of the type
\[ dr(t) = a(t, r(t))\, dt + b(t, r(t))\, dW(t), \tag{5.9} \]
with functions $a$, $b$ sufficiently regular and $W$ a real-valued Brownian motion. The crucial point in this setting is the assumption on the probability measure under which the short rate is modelled.

If we model under the objective probability measure $\mathbb{P}$ and assume that a locally risk-free asset $B$ (the money market) exists, we face the question whether in an arbitrage-free market bond prices, quite naturally viewed as derivatives with the short rate as underlying, are uniquely determined. In contrast to the equity market setting, with a risky asset and a risk-free asset available for trading, the short rate $r$ is not the price of a traded asset, and hence we only can set up portfolios consisting of putting money in the bank account. We thus face an incomplete market situation, and the best we can hope for is to find consistency requirements for bonds of different maturity. Given a single 'benchmark' bond, we should then be able to price all other bonds relative to this given bond.

On the other hand, if we assume that the short rate is modelled under an equivalent martingale measure, we can immediately price all contingent claims via the risk-neutral valuation formula. The drawback in this case is the question of calibrating the model (we do not observe the parameters of the process under an equivalent martingale measure, but rather under the objective probability measure!).

5.2.1 The Term-structure Equation

Let us assume that the short-rate dynamics satisfy (5.9) under the historical probability measure $\mathbb{P}$. In our Brownian setting, we know that each equivalent martingale measure $\mathbb{Q}$ is given in terms of a Girsanov density
\[ L(t) = \exp\left\{ -\int_0^t \gamma(u)\, dW(u) - \frac{1}{2} \int_0^t \gamma(u)^2\, du \right\}, \quad 0 \le t \le T. \]
Assume now that $\gamma$ is given as $\gamma(t) = \lambda(t, r(t))$, with a sufficiently smooth function $\lambda$. (We will use the notation $\mathbb{Q}(\lambda)$ to emphasize the dependence of the equivalent martingale measure on $\lambda$.) By Girsanov's Theorem 4.1.4, we know that $\tilde W$, given by $d\tilde W = dW + \lambda\, dt$, is a $\mathbb{Q}(\lambda)$-Brownian motion. So the $\mathbb{Q}(\lambda)$-dynamics of $r$ are given by
\[ dr(t) = \{a(t, r(t)) - b(t, r(t))\, \lambda(t, r(t))\}\, dt + b(t, r(t))\, d\tilde W(t). \]
Now consider a $T$-contingent claim $X = \Phi(r(T))$, for some sufficiently smooth function $\Phi : \mathbb{R} \to \mathbb{R}_+$. We know that using the risk-neutral valuation formula we obtain arbitrage-free prices for any contingent claim by applying the expectation operator under an equivalent martingale measure (to the discounted time $T$ value). A slight modification of the argument used to find the Black-Scholes PDE yields, for any $\mathbb{Q}(\lambda)$,
\[ \mathbb{E}_{\mathbb{Q}(\lambda)}\!\left[ e^{-\int_t^T r(u)\, du}\, \Phi(r(T)) \,\Big|\, \mathcal{F}_t \right] = F(t, r(t)), \]
where $F : [0, T^*] \times \mathbb{R} \to \mathbb{R}$ satisfies the partial differential equation (we omit the arguments $(t, r)$)
\[ F_t + (a - b\lambda) F_r + \frac{1}{2} b^2 F_{rr} - r F = 0 \tag{5.10} \]
with terminal condition $F(T, r) = \Phi(r)$, for all $r \in \mathbb{R}$.

Suppose now that the price process $p(t, T)$ of a $T$-bond is determined by the assessment, at time $t$, of the segment $\{r(\tau),\ t \le \tau \le T\}$ of the short rate process over the term of the bond. So we assume
\[ p(t, T) = F(t, r(t); T), \]
with $F$ a sufficiently smooth function. Since we know that the value of a zero-coupon bond is one unit at maturity, we have the terminal condition $F(T, r; T) = 1$. Thus we have

Proposition 5.2.1 (Term-structure Equation). If there exists an equivalent martingale measure of type $\mathbb{Q}(\lambda)$ for the bond market (implying that the no-arbitrage condition holds) and the price processes $p(t, T)$ of $T$-bonds are given by a sufficiently smooth function $F$ as above, then $F$ will satisfy the partial differential equation (5.10) and terminal condition $F(T, r; T) = 1$.
5.2.2 Martingale Modelling

We now fix an equivalent martingale measure $\mathbb{Q}$ (which we assume to exist), and return to modelling the short-rate dynamics directly under $\mathbb{Q}$. Thus we assume that $r$ has $\mathbb{Q}$-dynamics

$$dr(t) = a(t, r(t))\,dt + b(t, r(t))\,dW(t) \qquad (5.9)$$

with $W$ a (real-valued) $\mathbb{Q}$-Wiener process. We can immediately apply the risk-neutral valuation technique to obtain the price process $\Pi_X(t)$ of any sufficiently integrable $T$-contingent claim $X$ by computing the $\mathbb{Q}$-expectation, i.e.

$$\Pi_X(t) = \mathbb{E}_{\mathbb{Q}}\left[ e^{-\int_t^T r(u)\,du} X \,\Big|\, \mathcal{F}_t \right]. \qquad (5.11)$$

If, additionally, the contingent claim is of the form $X = \Phi(r(T))$ with a sufficiently smooth function $\Phi$, we obtain

Proposition 5.2.1 (Term-structure Equation). Consider $T$-contingent claims of the form $X = \Phi(r(T))$. Then arbitrage-free price processes are given by $\Pi_X(t) = F(t, r(t))$, where $F$ is the solution of the partial differential equation

$$F_t + aF_r + \frac{b^2}{2}F_{rr} - rF = 0 \qquad (5.12)$$

with terminal condition $F(T, r) = \Phi(r)$ for all $r \in \mathbb{R}$. In particular, $T$-bond prices are given by $p(t,T) = F(t, r(t); T)$, with $F$ solving (5.12) and terminal condition $F(T, r; T) = 1$.

Suppose we want to evaluate the price of a European call option with maturity $S$ and strike $K$ on an underlying $T$-bond. This means we have to price the $S$-contingent claim

$$X = \max\{p(S,T) - K, 0\}.$$

We first have to find the price process $p(t,T) = F(t, r; T)$ by solving (5.12) with terminal condition $F(T, r; T) = 1$. Secondly, we use the risk-neutral valuation principle to obtain $\Pi_X(t) = G(t, r(t))$, with $G$ solving

$$G_t + aG_r + \frac{b^2}{2}G_{rr} - rG = 0 \quad \text{and} \quad G(S, r) = \max\{F(S, r; T) - K, 0\}, \ \forall r \in \mathbb{R}.$$

So we are clearly in need of efficient methods of solving the above partial differential equations, or, from a modelling point of view, we need short-rate models that facilitate this computational task. Fortunately, there is a class of models, exhibiting an affine term structure (ATS), which allows for simplification.

Definition 5.2.1. If bond prices are given as

$$p(t,T) = \exp\{A(t,T) - B(t,T)r\}, \quad 0 \le t \le T,$$

where $A(t,T)$ and $B(t,T)$ are deterministic functions, we say that the model possesses an affine term structure.

Assuming that we have such a model in which both $a$ and $b^2$ are affine in $r$, say $a(t,r) = \alpha(t) - \beta(t)r$ and $b(t,r) = \sqrt{\gamma(t) + \delta(t)r}$, we find that $A$ and $B$ are given as solutions of ordinary differential equations,

$$A_t(t,T) - \alpha(t)B(t,T) + \frac{\gamma(t)}{2}B^2(t,T) = 0, \quad A(T,T) = 0,$$

$$(1 + B_t(t,T)) - \beta(t)B(t,T) - \frac{\delta(t)}{2}B^2(t,T) = 0, \quad B(T,T) = 0.$$

The equation for $B$ is a Riccati equation, which can be solved analytically, see Ince (1944), §2.15, 12.51, A.21. Using the solution for $B$ we get $A$ by integrating.

Examples of short-rate models exhibiting an affine term structure include the following.
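The two ordinary differential equations for $A$ and $B$ can also be integrated numerically. The following is a minimal sketch, assuming constant coefficients $\alpha, \beta, \gamma, \delta$ (an assumption of this example, not of the notes) and a classical Runge-Kutta scheme in the time to maturity $\tau = T - t$. The function names are illustrative. For $\delta = 0$ the result can be compared with the closed form $B = (1 - e^{-\beta\tau})/\beta$, which solves the Riccati equation in that special case.

```python
import math


def affine_coefficients(alpha, beta, gamma, delta, tau, steps=2000):
    """Integrate the ATS ODEs in time to maturity tau = T - t with RK4.

    Writing a(tau) = A(T - tau, T) and b(tau) = B(T - tau, T), the system is
        b' = 1 - beta*b - 0.5*delta*b**2,   b(0) = 0,
        a' = -alpha*b + 0.5*gamma*b**2,     a(0) = 0.
    Here gamma and delta are the coefficients of b^2 = gamma + delta*r,
    and all coefficients are constants (assumption of this sketch).
    """
    def rhs(a, b):
        return (-alpha * b + 0.5 * gamma * b * b,
                1.0 - beta * b - 0.5 * delta * b * b)

    h = tau / steps
    a = b = 0.0
    for _ in range(steps):
        k1a, k1b = rhs(a, b)
        k2a, k2b = rhs(a + 0.5 * h * k1a, b + 0.5 * h * k1b)
        k3a, k3b = rhs(a + 0.5 * h * k2a, b + 0.5 * h * k2b)
        k4a, k4b = rhs(a + h * k3a, b + h * k3b)
        a += h * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
        b += h * (k1b + 2 * k2b + 2 * k3b + k4b) / 6.0
    return a, b


def bond_price(r, alpha, beta, gamma, delta, tau):
    """Affine bond price p = exp(A - B*r) for current short rate r."""
    a, b = affine_coefficients(alpha, beta, gamma, delta, tau)
    return math.exp(a - b * r)


def closed_form_B(beta, tau):
    """Solution of the B equation when delta = 0 (constant beta > 0)."""
    return (1.0 - math.exp(-beta * tau)) / beta
```

With $\delta = 0$ the numerical $B$ should agree with `closed_form_B` to high accuracy; for $\delta \neq 0$ the same routine integrates the full Riccati equation.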
1. Vasicek model: $dr = (\alpha - \beta r)\,dt + \gamma\,dW$;
2. Cox-Ingersoll-Ross (CIR) model: $dr = (\alpha - \beta r)\,dt + \delta\sqrt{r}\,dW$;
3. Ho-Lee model: $dr = \alpha(t)\,dt + \gamma\,dW$;
4. Hull-White (extended Vasicek) model: $dr = (\alpha(t) - \beta(t)r)\,dt + \gamma(t)\,dW$;
5. Hull-White (extended CIR) model: $dr = (\alpha(t) - \beta(t)r)\,dt + \delta(t)\sqrt{r}\,dW$.

5.3 Heath-Jarrow-Morton Methodology

5.3.1 The Heath-Jarrow-Morton Model Class

Modelling the term structure with only one explanatory variable leads to various undesirable properties of the model (to say the least). Various authors have proposed models with more than one state variable, e.g. the short rate and a long rate and/or intermediate rates. The Heath-Jarrow-Morton method (compare Heath, Jarrow, and Morton (1992)) is at the far end of this spectrum – they propose using the entire forward rate curve as their (infinite-dimensional) state variable. More precisely, for any fixed $T \le T^*$ the dynamics of instantaneous, continuously compounded forward rates $f(t,T)$ are exogenously given by

$$df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,dW(t), \qquad (5.4)$$

where $W$ is a $d$-dimensional Brownian motion with respect to the underlying (objective) probability measure $\mathbb{P}$ and $\alpha(t,T)$ resp. $\sigma(t,T)$ are adapted $\mathbb{R}$- resp. $\mathbb{R}^d$-valued processes. For any fixed maturity $T$, the initial condition of the stochastic differential equation (5.4) is determined by the current value of the empirical (observed) forward rate for the future date $T$ which prevails at time 0. Observe that we have defined an infinite-dimensional stochastic system, and that by construction we obtain a perfect fit to the observed term structure (thus avoiding the problem of inverting the yield curve).

The exogenous specification of the family of forward rates $\{f(t,T),\ T > 0\}$ is equivalent to a specification of the entire family of bond prices $\{p(t,T),\ T > 0\}$. Furthermore, by Proposition 5.1.1 we obtain the dynamics of the bond-price processes as

$$dp(t,T) = p(t,T)\{m(t,T)\,dt + S(t,T)\,dW(t)\}, \qquad (5.13)$$

where

$$m(t,T) = r(t) + A(t,T) + \frac{1}{2}\|S(t,T)\|^2, \qquad (5.14)$$

$$A(t,T) = -\int_t^T \alpha(t,s)\,ds \quad \text{and} \quad S(t,T) = -\int_t^T \sigma(t,s)\,ds$$

(compare (5.6)).

We now explore what conditions we must impose on the coefficients in order to ensure the existence of an equivalent martingale measure with respect to a suitable numéraire. By the corresponding theorem of Chapter 4, we then could conclude that our bond market model is free of arbitrage. As a first possible choice of numéraire, we use the money-market account $B$ (assuming that there exists a measurable version of $f(t,t)$ in $[0,T^*]$), given by

$$B(t) = \exp\left\{\int_0^t f(u,u)\,du\right\} = \exp\left\{\int_0^t r(u)\,du\right\}.$$

So we allow investments in a savings account too. We must find an equivalent measure such that

$$Z(t,T) = \frac{p(t,T)}{B(t)}$$

is a martingale for every $0 \le T \le T^*$. We will call such a measure risk-neutral martingale measure to emphasize the dependence on the numéraire.
Theorem 5.3.1. Assume that the family of forward rates is given by (5.4). Then there exists a risk-neutral martingale measure if and only if there exists an adapted process $\lambda(t)$, with the properties that

(i) $\int_0^T \|\lambda(t)\|^2\,dt < \infty$, a.s., and $\mathbb{E}(L(T)) = 1$, with

$$L(t) = \exp\left\{ -\int_0^t \lambda(u)'\,dW(u) - \frac{1}{2}\int_0^t \|\lambda(u)\|^2\,du \right\}. \qquad (5.15)$$

(ii) For all $0 \le T \le T^*$ and for all $t \le T$, we have

$$\alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)\,ds + \sigma(t,T)\lambda(t). \qquad (5.16)$$

Proof. Since we are working in a Brownian framework we know that any equivalent measure $\mathbb{Q} \sim \mathbb{P}$ is given via a Girsanov density (5.15). Using Itô's formula and Girsanov's theorem we find the $\mathbb{Q}$-dynamics of $Z(t,T)$ (omitting the arguments) as:

$$dZ = Z\left(A + \frac{1}{2}\|S\|^2 - S\lambda\right)dt + ZS\,d\tilde{W}, \qquad (5.17)$$

with $\tilde{W}$ a $\mathbb{Q}$-Brownian motion. In order for $Z$ to be a $\mathbb{Q}$-martingale, the drift coefficient in (5.17) has to be zero, so we obtain

$$A(t,T) + \frac{1}{2}\|S(t,T)\|^2 = S(t,T)\lambda(t). \qquad (5.18)$$

Taking the derivative with respect to $T$, we get

$$-\alpha(t,T) - \sigma(t,T)S(t,T) = -\sigma(t,T)\lambda(t),$$

which after rearranging is (5.16).

It is possible to interpret $\lambda$ as a risk premium, which has to be exogenously specified to allow the choice of a particular risk-neutral martingale measure. In view of (5.16) this leads to a restriction on drift and volatility coefficients in the specification of the forward rate dynamics (5.4). The particular choice $\lambda \equiv 0$ means that we assume we model directly under a risk-neutral martingale measure $\mathbb{Q}$. In that case the relations between the various infinitesimal characteristics for the forward rate are known as the 'Heath-Jarrow-Morton drift condition'.

Theorem 5.3.2 (Heath-Jarrow-Morton). Assume that $\mathbb{Q}$ is a risk-neutral martingale measure for the bond market and that the forward-rate dynamics under $\mathbb{Q}$ are given by

$$df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,d\tilde{W}(t), \qquad (5.19)$$

with $\tilde{W}$ a $\mathbb{Q}$-Brownian motion. Then we have:

(i) the Heath-Jarrow-Morton drift condition

$$\alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)\,ds, \quad 0 \le t \le T \le T^*, \ \mathbb{Q}\text{-a.s.}, \qquad (5.20)$$

(ii) and bond-price dynamics under $\mathbb{Q}$ are given by

$$dp(t,T) = p(t,T)r(t)\,dt + p(t,T)S(t,T)\,d\tilde{W}(t),$$

with $S$ as in (5.4).
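The drift condition (5.20) determines the drift from the volatility alone. As a small numerical illustration, the sketch below evaluates $\alpha(t,T) = \sigma(t,T)\int_t^T \sigma(t,s)\,ds$ in the scalar case with the trapezoidal rule. The two volatility functions used for checking (a constant one and an exponentially damped one) are illustrative choices, not taken from the notes, and the function names are hypothetical.

```python
import math


def hjm_drift(sigma, t, T, n=2000):
    """Drift alpha(t,T) = sigma(t,T) * int_t^T sigma(t,s) ds (scalar case).

    `sigma` is a deterministic function of (t, s); the integral is
    approximated with the composite trapezoidal rule on n subintervals.
    """
    h = (T - t) / n
    total = 0.5 * (sigma(t, t) + sigma(t, T))
    for k in range(1, n):
        total += sigma(t, t + k * h)
    return sigma(t, T) * total * h


def constant_vol(level):
    """Illustrative volatility sigma(t,s) = level."""
    return lambda t, s: level


def damped_vol(level, kappa):
    """Illustrative volatility sigma(t,s) = level * exp(-kappa*(s - t))."""
    return lambda t, s: level * math.exp(-kappa * (s - t))
```

For constant volatility the integral gives $\alpha(t,T) = \sigma^2 (T - t)$, and for the damped case $\alpha(t,T) = \sigma^2 e^{-\kappa(T-t)}(1 - e^{-\kappa(T-t)})/\kappa$, which the numerical routine should reproduce.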
5.3.2 Forward Risk-neutral Martingale Measures

For many valuation problems in the bond market it is more suitable to use the bond price process $p(t,T^*)$ as numéraire. We then have to find an equivalent probability measure $\mathbb{Q}^*$ such that the auxiliary process

$$Z^*(t,T) = \frac{p(t,T)}{p(t,T^*)}, \quad \forall t \in [0,T],$$

is a martingale under $\mathbb{Q}^*$ for all $T \le T^*$. We will call such a measure forward risk-neutral martingale measure. In this setting, a savings account is not used and the existence of a martingale measure $\mathbb{Q}^*$ guarantees that there are no arbitrage opportunities between bonds of different maturities.

In order to find sufficient conditions for the existence of such a martingale measure, we follow the same programme as above. By (5.13) bond price dynamics under the original probability measure $\mathbb{P}$ are given as

$$dp(t,T) = p(t,T)\{m(t,T)\,dt + S(t,T)\,dW(t)\},$$

with $m(t,T)$ as in (5.14). Now applying Itô's formula to the quotient $p(t,T)/p(t,T^*)$ we find

$$dZ^*(t,T) = Z^*(t,T)\{\tilde{m}(t,T)\,dt + (S(t,T) - S(t,T^*))\,dW(t)\}, \qquad (5.21)$$

with

$$\tilde{m}(t,T) = m(t,T) - m(t,T^*) - S(t,T^*)(S(t,T) - S(t,T^*)).$$

Again, any equivalent martingale measure $\mathbb{Q}^*$ is given via a Girsanov density $L(t)$ defined by a function $\gamma(t)$ as in Theorem 5.3.1. So the drift coefficient of $Z^*(t,T)$ under $\mathbb{Q}^*$ is given as

$$\tilde{m}(t,T) - (S(t,T) - S(t,T^*))\gamma(t).$$

For $Z^*(t,T)$ to be a $\mathbb{Q}^*$-martingale this coefficient has to be zero, and replacing $\tilde{m}$ with its definition we get

$$(A(t,T) - A(t,T^*)) + \frac{1}{2}\left(\|S(t,T)\|^2 - \|S(t,T^*)\|^2\right) = (S(t,T^*) + \gamma(t))(S(t,T) - S(t,T^*)).$$

Written in terms of the coefficients of the forward-rate dynamics, this identity simplifies to

$$\int_T^{T^*} \alpha(t,s)\,ds + \frac{1}{2}\left\|\int_T^{T^*} \sigma(t,s)\,ds\right\|^2 = \gamma(t)\int_T^{T^*} \sigma(t,s)\,ds.$$

Taking the derivative with respect to $T$, we obtain

$$\alpha(t,T) + \sigma(t,T)\int_T^{T^*} \sigma(t,s)\,ds = \gamma(t)\sigma(t,T).$$

Since $S(t,T) - S(t,T^*) = \int_T^{T^*} \sigma(t,s)\,ds$, we have thus proved:

Theorem 5.3.3. Assume that the family of forward rates is given by (5.4). Then there exists a forward risk-neutral martingale measure if and only if there exists an adapted process $\gamma(t)$, with the properties (i) of Theorem 5.3.1, such that

$$\alpha(t,T) = \sigma(t,T)\left(S(t,T^*) - S(t,T) + \gamma(t)\right), \quad 0 \le t \le T \le T^*.$$
5.4 Pricing and Hedging Contingent Claims

5.4.1 Gaussian HJM Framework

Assume that the dynamics of the forward rate are given under a risk-neutral martingale measure $\mathbb{Q}$ by

$$df(t,T) = \alpha(t,T)\,dt + \sigma(t,T)\,d\tilde{W}(t), \quad f(0,T) = \hat{f}(0,T),$$

with all processes real-valued. We restrict the class of models by assuming that the forward rate's volatility is deterministic. The HJM-drift condition (5.20) and the integrated form of the forward rate (compare (5.7)) lead to

$$f(t,t) = r(t) = f(0,t) + \int_0^t (-\sigma(u,t)S(u,t))\,du + \int_0^t \sigma(u,t)\,d\tilde{W}(u),$$

which implies that the short rate (as well as the forward rates $f(t,T)$) have Gaussian probability laws (hence the terminology). By Theorem 5.3.2, bond-price dynamics under $\mathbb{Q}$ are given by

$$dp(t,T) = p(t,T)\{r(t)\,dt + S(t,T)\,d\tilde{W}(t)\},$$

which we can solve in special cases explicitly for $p(t,T)$.

To price options on zero-coupon bonds, we use the change of numéraire technique. Consider a European call $C$ on a $T^*$-bond with maturity $T \le T^*$ and strike $K$. So we consider the $T$-contingent claim

$$X = (p(T,T^*) - K)^+. \qquad (5.22)$$

The price of this call at time $t = 0$ is given as

$$C(0) = p(0,T^*)\mathbb{Q}^*(A) - Kp(0,T)\mathbb{Q}^T(A),$$

with $A = \{\omega : p(T,T^*) > K\}$ and $\mathbb{Q}^T$ resp. $\mathbb{Q}^*$ the $T$- resp. $T^*$-forward risk-neutral measure. Now

$$\tilde{Z}(t,T) = \frac{p(t,T^*)}{p(t,T)}$$

has $\mathbb{Q}$-dynamics (omitting the arguments and writing $S = S(t,T)$, $S^* = S(t,T^*)$)

$$d\tilde{Z} = \tilde{Z}\{S(S - S^*)\,dt - (S - S^*)\,d\tilde{W}(t)\},$$

so a deterministic variance coefficient. Now

$$\mathbb{Q}^T(p(T,T^*) \ge K) = \mathbb{Q}^T\left(\frac{p(T,T^*)}{p(T,T)} \ge K\right) = \mathbb{Q}^T(\tilde{Z}(T,T) \ge K).$$

Since $\tilde{Z}(t,T)$ is a $\mathbb{Q}^T$-martingale with $\mathbb{Q}^T$-dynamics

$$d\tilde{Z}(t,T) = -\tilde{Z}(t,T)(S(t,T) - S(t,T^*))\,dW^T(t),$$

we find that under $\mathbb{Q}^T$ (again $S = S(t,T)$, $S^* = S(t,T^*)$)

$$\tilde{Z}(T,T) = \frac{p(0,T^*)}{p(0,T)}\exp\left\{ -\int_0^T (S - S^*)\,dW^T(t) - \frac{1}{2}\int_0^T (S - S^*)^2\,dt \right\}$$

(with $W^T$ a $\mathbb{Q}^T$-Brownian motion). The stochastic integral in the exponential is Gaussian with zero mean and variance

$$\Sigma^2(T) = \int_0^T (S(t,T) - S(t,T^*))^2\,dt.$$
So

$$\mathbb{Q}^T(p(T,T^*) \ge K) = \mathbb{Q}^T(\tilde{Z}(T,T) \ge K) = N(d_2)$$

with

$$d_2 = \frac{\log\left(\frac{p(0,T^*)}{Kp(0,T)}\right) - \frac{1}{2}\Sigma^2(T)}{\sqrt{\Sigma^2(T)}}.$$

Similarly, for the first term,

$$Z^*(t,T) = \frac{p(t,T)}{p(t,T^*)}$$

has $\mathbb{Q}$-dynamics (compare (5.21))

$$dZ^* = Z^*\{S^*(S^* - S)\,dt + (S - S^*)\,d\tilde{W}(t)\},$$

and also a deterministic variance coefficient. Now

$$\mathbb{Q}^*(p(T,T^*) \ge K) = \mathbb{Q}^*\left(\frac{1}{p(T,T^*)} \le \frac{1}{K}\right) = \mathbb{Q}^*\left(Z^*(T,T) \le \frac{1}{K}\right).$$

Under $\mathbb{Q}^*$, $Z^*(t,T)$ is a martingale with

$$dZ^*(t,T) = Z^*(t,T)(S(t,T) - S(t,T^*))\,dW^*(t),$$

so

$$Z^*(T,T) = \frac{p(0,T)}{p(0,T^*)}\exp\left\{ \int_0^T (S - S^*)\,dW^*(t) - \frac{1}{2}\int_0^T (S - S^*)^2\,dt \right\}.$$

Again we have a Gaussian variable with the (same) variance $\Sigma^2(T)$ in the exponential. Using this fact it follows (after some computations) that

$$\mathbb{Q}^*(p(T,T^*) \ge K) = N(d_1),$$

with $d_1 = d_2 + \sqrt{\Sigma^2(T)}$. So we obtain:

Proposition 5.4.1. The price of the call option defined in (5.22) is given by

$$C(0) = p(0,T^*)N(d_1) - Kp(0,T)N(d_2), \qquad (5.23)$$

with parameters given as above.

5.4.2 Swaps

This section is devoted to the pricing of swaps. We consider the case of a forward swap settled in arrears. Such a contingent claim is characterized by:

• a fixed time $t$, the contract time,
• dates $T_0 < T_1 < \ldots < T_n$, equally distanced, $T_{i+1} - T_i = \delta$,
• $R$, a prespecified fixed rate of interest,
• $K$, a nominal amount.
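The call price formula of the Gaussian HJM framework, $C(0) = p(0,T^*)N(d_1) - Kp(0,T)N(d_2)$, is straightforward to evaluate once $\Sigma^2(T)$ is known. The sketch below is one possible implementation; the function names are illustrative, and the helper for $\Sigma^2(T)$ assumes a constant forward-rate volatility $\sigma$ (an assumption of this example), for which $S(t,T) - S(t,T^*) = \sigma(T^* - T)$ and hence $\Sigma^2(T) = \sigma^2 (T^* - T)^2 T$.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gaussian_hjm_call(p0_T, p0_Tstar, K, Sigma2):
    """Price at time 0 of a call with maturity T and strike K on a T*-bond.

    p0_T and p0_Tstar are today's bond prices p(0,T) and p(0,T*);
    Sigma2 is the variance Sigma^2(T) > 0.
    """
    Sigma = math.sqrt(Sigma2)
    d2 = (math.log(p0_Tstar / (K * p0_T)) - 0.5 * Sigma2) / Sigma
    d1 = d2 + Sigma
    return p0_Tstar * norm_cdf(d1) - K * p0_T * norm_cdf(d2)


def sigma2_constant_vol(sigma, T, T_star):
    """Sigma^2(T) when sigma(t,s) = sigma is constant (illustrative case)."""
    return sigma ** 2 * (T_star - T) ** 2 * T
```

A quick sanity check is that the price lies between the intrinsic forward value $\max\{p(0,T^*) - Kp(0,T), 0\}$ and the price $p(0,T^*)$ of the underlying bond, and that it increases with $\Sigma^2(T)$.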
A swap contract $S$ with $K$ and $R$ fixed for the period $T_0, \ldots, T_n$ is a sequence of payments, where the amount of money paid out at $T_{i+1}$, $i = 0, \ldots, n-1$, is defined by

$$X_{i+1} = K\delta(L(T_i, T_i) - R).$$

The floating rate over $[T_i, T_{i+1}]$ observed at $T_i$ is a simple rate defined as

$$p(T_i, T_{i+1}) = \frac{1}{1 + \delta L(T_i, T_i)}.$$

We do not need to specify a particular interest-rate model here; all we need is the existence of a risk-neutral martingale measure. Using the risk-neutral pricing formula we obtain (we may use $K = 1$)

$$\Pi(t, S) = \sum_{i=1}^n \mathbb{E}_{\mathbb{Q}}\left[ e^{-\int_t^{T_i} r(s)\,ds}\,\delta(L(T_{i-1}, T_{i-1}) - R) \,\Big|\, \mathcal{F}_t \right]$$

$$= \sum_{i=1}^n \mathbb{E}_{\mathbb{Q}}\left[ \mathbb{E}_{\mathbb{Q}}\left[ e^{-\int_{T_{i-1}}^{T_i} r(s)\,ds} \,\Big|\, \mathcal{F}_{T_{i-1}} \right] e^{-\int_t^{T_{i-1}} r(s)\,ds}\left( \frac{1}{p(T_{i-1}, T_i)} - (1 + \delta R) \right) \Big|\, \mathcal{F}_t \right]$$

$$= \sum_{i=1}^n \left( p(t, T_{i-1}) - (1 + \delta R)p(t, T_i) \right) = p(t, T_0) - \sum_{i=1}^n c_i\,p(t, T_i),$$

with $c_i = \delta R$, $i = 1, \ldots, n-1$, and $c_n = 1 + \delta R$. So a swap is a linear combination of zero-coupon bonds, and we obtain its price accordingly. This again shows the power of risk-neutral pricing. Using the linearity of the expectation operator we can reduce complicated claims to sums of simpler ones.

5.4.3 Caps

An interest cap is a contract where the seller of the contract promises to pay a certain amount of cash to the holder of the contract if the interest rate exceeds a certain predetermined level (the cap rate) at some future date. A cap can be broken down into a series of caplets. A caplet is a contract written at $t$, in force between $[T_0, T_1]$, $\delta = T_1 - T_0$; the nominal amount is $K$ and the cap rate is denoted by $R$. The relevant interest rate (LIBOR, for instance) is observed in $T_0$ and defined by

$$p(T_0, T_1) = \frac{1}{1 + \delta L(T_0, T_0)}.$$

A caplet $C$ is a $T_1$-contingent claim with payoff $X = K\delta(L(T_0, T_0) - R)^+$. Writing $L = L(T_0, T_0)$, $p = p(T_0, T_1)$, $R^* = 1 + \delta R$, we have $L = (1 - p)/(\delta p)$ (assuming $K = 1$), and

$$X = \delta(L - R)^+ = \delta\left( \frac{1 - p}{\delta p} - R \right)^+ = \left( \frac{1}{p} - (1 + \delta R) \right)^+ = \left( \frac{1}{p} - R^* \right)^+.$$

The risk-neutral pricing formula leads to

$$\Pi_C(t) = \mathbb{E}_{\mathbb{Q}}\left[ e^{-\int_t^{T_1} r(s)\,ds}\left( \frac{1}{p} - R^* \right)^+ \Big|\, \mathcal{F}_t \right]$$

$$= \mathbb{E}_{\mathbb{Q}}\left[ \mathbb{E}_{\mathbb{Q}}\left[ e^{-\int_{T_0}^{T_1} r(s)\,ds} \,\Big|\, \mathcal{F}_{T_0} \right] e^{-\int_t^{T_0} r(s)\,ds}\left( \frac{1}{p} - R^* \right)^+ \Big|\, \mathcal{F}_t \right]$$

$$= \mathbb{E}_{\mathbb{Q}}\left[ p(T_0, T_1)\,e^{-\int_t^{T_0} r(s)\,ds}\left( \frac{1}{p} - R^* \right)^+ \Big|\, \mathcal{F}_t \right]$$

$$= \mathbb{E}_{\mathbb{Q}}\left[ e^{-\int_t^{T_0} r(s)\,ds}(1 - pR^*)^+ \,\Big|\, \mathcal{F}_t \right]$$

$$= R^*\,\mathbb{E}_{\mathbb{Q}}\left[ e^{-\int_t^{T_0} r(s)\,ds}\left( \frac{1}{R^*} - p \right)^+ \Big|\, \mathcal{F}_t \right].$$

So a caplet is equivalent to $R^*$ put options on a $T_1$-bond with maturity $T_0$ and strike $1/R^*$.
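Because the swap value is a linear combination of zero-coupon bond prices, it can be computed directly from a list of discount factors. The sketch below evaluates $p(t,T_0) - \sum_i c_i\,p(t,T_i)$ with nominal $K = 1$. The second function, which solves for the fixed rate that makes the value zero, is not stated in the notes; it is a direct consequence of setting the formula equal to zero and is included here only as an illustration. Function names and sample numbers are hypothetical.

```python
def swap_value(discounts, delta, R):
    """Value p(t,T_0) - sum_i c_i p(t,T_i) of the swap with nominal 1.

    discounts = [p(t,T_0), p(t,T_1), ..., p(t,T_n)];
    c_i = delta*R for i < n and c_n = 1 + delta*R.
    """
    n = len(discounts) - 1
    value = discounts[0]
    for i in range(1, n + 1):
        c_i = delta * R + (1.0 if i == n else 0.0)
        value -= c_i * discounts[i]
    return value


def par_swap_rate(discounts, delta):
    """Fixed rate R for which swap_value(...) is zero (derived, see lead-in)."""
    return (discounts[0] - discounts[-1]) / (delta * sum(discounts[1:]))
```

For example, with discount factors `[0.99, 0.97, 0.95, 0.93]` and `delta = 0.5`, `swap_value` at the rate returned by `par_swap_rate` vanishes up to rounding.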
Appendix A

Basic Probability Background

A.1 Fundamentals

To describe a random experiment we use a sample space $\Omega$, the set of all possible outcomes. Each point $\omega$ of $\Omega$, or sample point, represents a possible random outcome of performing the random experiment.

Examples. Write down $\Omega$ for experiments such as flip a coin three times, roll two dice.

For a set $A \subseteq \Omega$ we want to know the probability $\mathbb{P}(A)$. The class $\mathcal{F}$ of subsets of $\Omega$ whose probabilities $\mathbb{P}(A)$ are defined (call such $A$ events) should be a $\sigma$-algebra, i.e.

(i) $\emptyset, \Omega \in \mathcal{F}$.
(ii) $F \in \mathcal{F}$ implies $F^c \in \mathcal{F}$.
(iii) $F_1, F_2, \ldots \in \mathcal{F}$ then $\bigcup_n F_n \in \mathcal{F}$.

We want a probability measure defined on $\mathcal{F}$:

(i) $\mathbb{P}(\emptyset) = 0$, $\mathbb{P}(\Omega) = 1$,
(ii) $\mathbb{P}(A) \ge 0$ for all $A$,
(iii) if $A_1, A_2, \ldots$ are disjoint, $\mathbb{P}\left(\bigcup_i A_i\right) = \sum_i \mathbb{P}(A_i)$ (countable additivity).

Definition A.1.1. A probability space, or Kolmogorov triple, is a triple $(\Omega, \mathcal{F}, \mathbb{P})$ satisfying Kolmogorov axioms (i), (ii) and (iii) above.

A probability space is a mathematical model of a random experiment.

Examples. Assign probabilities for the above experiments.

Definition A.1.2. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable (vector) $X$ is a function $X : \Omega \to \mathbb{R}$ ($\mathbb{R}^k$) such that $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}$ for all Borel sets $B \in \mathcal{B}(\mathbb{R})$ ($\mathcal{B}(\mathbb{R}^k)$).

For a random variable $X$, $\{\omega \in \Omega : X(\omega) \le x\} \in \mathcal{F}$ for all $x \in \mathbb{R}$. So define the distribution function $F_X$ of $X$ by

$$F_X(x) := \mathbb{P}(\{\omega : X(\omega) \le x\}).$$

Recall: $\sigma(X)$, the $\sigma$-algebra generated by $X$.
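The two example experiments (three coin flips, two dice) have finite sample spaces that can be written down explicitly. The following sketch enumerates them and assigns the uniform probability measure, under the assumption (made here for illustration) that coin and dice are fair; the helper name `prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space for three coin flips: tuples such as ('H', 'T', 'H').
coin3 = list(product("HT", repeat=3))

# Sample space for two dice: pairs (i, j) with 1 <= i, j <= 6.
dice2 = list(product(range(1, 7), repeat=2))


def prob(event, omega):
    """Uniform probability of the event {w in omega : event(w) is true}."""
    favourable = sum(1 for w in omega if event(w))
    return Fraction(favourable, len(omega))
```

For instance, `prob(lambda w: w.count("H") >= 2, coin3)` is the probability of at least two heads, and `prob(lambda w: w[0] + w[1] == 7, dice2)` is the probability that two dice sum to seven.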
Some important probability distributions

• Binomial distribution: Number of successes
$$\mathbb{P}(S_n = k) = \binom{n}{k}p^k(1 - p)^{n-k}.$$
• Geometric distribution: Waiting time
$$\mathbb{P}(N = n) = p(1 - p)^{n-1}.$$
• Poisson distribution:
$$\mathbb{P}(X = k) = e^{-\lambda}\frac{\lambda^k}{k!}.$$
• Density of Uniform distribution:
$$f(x) = \frac{1}{b - a}\mathbf{1}_{(a,b)}(x).$$
• Density of Exponential distribution:
$$f(x) = \lambda e^{-\lambda x}\mathbf{1}_{[0,\infty)}(x).$$
• Density of standard Normal distribution:
$$f(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}.$$

Definition A.1.3. The expectation $\mathbb{E}$ of a random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is defined by

$$\mathbb{E}X := \int_\Omega X\,d\mathbb{P}, \quad \text{or} \quad \int_\Omega X(\omega)\,d\mathbb{P}(\omega).$$

The variance of a random variable is defined as

$$\mathbb{V}ar(X) := \mathbb{E}\left[(X - \mathbb{E}(X))^2\right] = \mathbb{E}X^2 - (\mathbb{E}X)^2.$$

If $X$ is real-valued with density $f$ (i.e. $f(x) \ge 0$, $\int f(x)\,dx = 1$),

$$\mathbb{E}X := \int x f(x)\,dx,$$

or if $X$ is discrete, taking values $x_n$ ($n = 1, 2, \ldots$) with probability function $f(x_n)$ ($\ge 0$),

$$\mathbb{E}X := \sum x_n f(x_n).$$

Examples. Calculate moments for some of the above distributions.
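For the discrete distributions above, mean and variance can be checked numerically straight from the definitions $\mathbb{E}X = \sum x_n f(x_n)$ and $\mathbb{V}ar(X) = \mathbb{E}X^2 - (\mathbb{E}X)^2$. The sketch below does this for the binomial distribution and for a Poisson distribution truncated at a finite number of terms (the truncation point is an assumption of the example); function names are illustrative.

```python
import math


def binomial_pmf(n, p):
    """Probabilities P(S_n = k), k = 0, ..., n."""
    return [math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def poisson_pmf(lam, kmax=60):
    """Probabilities P(X = k), k = 0, ..., kmax (truncated tail)."""
    return [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(kmax + 1)]


def mean_and_variance(pmf):
    """Mean and variance of a distribution on 0, 1, 2, ... given by its pmf."""
    mean = sum(k * w for k, w in enumerate(pmf))
    second_moment = sum(k * k * w for k, w in enumerate(pmf))
    return mean, second_moment - mean ** 2
```

One expects mean $np$ and variance $np(1-p)$ in the binomial case, and mean and variance both equal to $\lambda$ in the Poisson case (up to the truncation error).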
Definition A.1.4. Random variables $X_1, \ldots, X_n$ are independent if whenever $A_i \in \mathcal{B}$ (the Borel $\sigma$-algebra) for $i = 1, \ldots, n$ we have

$$\mathbb{P}\left(\bigcap_{i=1}^n \{X_i \in A_i\}\right) = \prod_{i=1}^n \mathbb{P}(\{X_i \in A_i\}).$$

Lemma A.1.1. In order for $X_1, \ldots, X_n$ to be independent it is necessary and sufficient that for all $x_1, \ldots, x_n \in (-\infty, \infty]$,

$$\mathbb{P}\left(\bigcap_{i=1}^n \{X_i \le x_i\}\right) = \prod_{i=1}^n \mathbb{P}(\{X_i \le x_i\}).$$

Theorem A.1.1 (Multiplication Theorem). If $X_1, \ldots, X_n$ are independent and $\mathbb{E}|X_i| < \infty$, $i = 1, \ldots, n$, then

$$\mathbb{E}\left(\prod_{i=1}^n X_i\right) = \prod_{i=1}^n \mathbb{E}(X_i).$$

If $X$, $Y$ are independent, with distribution functions $F$, $G$, define $Z := X + Y$ with distribution function $H$. We call $H$ the convolution of $F$ and $G$, written $H = F * G$.

Suppose $X$, $Y$ have densities $f$, $g$; then $Z$ has a density $h$ with

$$h(z) = \int_{-\infty}^{\infty} f(z - y)g(y)\,dy = \int_{-\infty}^{\infty} f(x)g(z - x)\,dx.$$

Example. Assume $t_1, \ldots, t_n$ are independent random variables that have an exponential distribution with parameter $\lambda$. Then $T = t_1 + \ldots + t_n$ has the Gamma$(n, \lambda)$ density function

$$f(x) = \frac{\lambda^n x^{n-1}}{(n-1)!}e^{-\lambda x}.$$

Definition A.1.5. If $X$ is a random variable with distribution function $F$, its moment generating function $\phi_X$ is

$$\phi(t) := \mathbb{E}(e^{tX}) = \int_{-\infty}^{\infty} e^{tx}\,dF(x).$$

The mgf takes convolution into multiplication: if $X$, $Y$ are independent,

$$\phi_{X+Y}(t) = \phi_X(t)\phi_Y(t).$$

Observe $\phi^{(k)}(t) = \mathbb{E}(X^k e^{tX})$ and $\phi^{(k)}(0) = \mathbb{E}(X^k)$.

For $X$ on nonnegative integers use the generating function

$$\gamma_X(z) = \mathbb{E}(z^X) = \sum_{k=0}^{\infty} z^k\,\mathbb{P}(X = k).$$

A.2 Convolution and Characteristic Functions

The most basic operation on numbers is addition; the most basic operation on random variables is addition of independent random variables. If $X$, $Y$ are independent, with distribution functions $F$, $G$, and $Z := X + Y$,
let $Z$ have distribution function $H$. Then since $X + Y = Y + X$ (addition is commutative), $H$ depends on $F$ and $G$ symmetrically. We call $H$ the convolution (German: Faltung) of $F$ and $G$, written $H = F * G$.

Suppose first that $X$, $Y$ have densities $f$, $g$. Then

$$H(z) = \mathbb{P}(Z \le z) = \mathbb{P}(X + Y \le z) = \iint_{\{(x,y)\,:\,x + y \le z\}} f(x)g(y)\,dx\,dy,$$

since by independence of $X$ and $Y$ the joint density of $X$ and $Y$ is the product $f(x)g(y)$ of their separate (marginal) densities, and to find probabilities in the density case we integrate the joint density over the relevant region. Thus

$$H(z) = \int_{-\infty}^{\infty} f(x)\left\{\int_{-\infty}^{z-x} g(y)\,dy\right\}dx = \int_{-\infty}^{\infty} f(x)G(z - x)\,dx.$$

If

$$h(z) := \int_{-\infty}^{\infty} f(x)g(z - x)\,dx$$

(and of course symmetrically with $f$ and $g$ interchanged), then integrating we recover the equation above (after interchanging the order of integration; this is legitimate, as the integrals are non-negative, by Fubini's theorem, which we quote from measure theory, see e.g. (Williams 1991), §8.2). This shows that if $X$, $Y$ are independent with densities $f$, $g$, and $Z = X + Y$, then $Z$ has density $h$, where

$$h(x) = \int_{-\infty}^{\infty} f(x - y)g(y)\,dy.$$

We write $h = f * g$, and call the density $h$ the convolution of the densities $f$ and $g$.

If $X$, $Y$ do not have densities, the argument above may still be taken as far as

$$H(z) = \mathbb{P}(Z \le z) = \mathbb{P}(X + Y \le z) = \int_{-\infty}^{\infty} F(z - y)\,dG(y)$$

(and, again, symmetrically with $F$ and $G$ interchanged), where the integral on the right is the Lebesgue-Stieltjes integral of §2.2. We again write $H = F * G$, and call the distribution function $H$ the convolution of the distribution functions $F$ and $G$.

In sum: addition of independent random variables corresponds to convolution of distribution functions or densities.

Now we frequently need to add (or average) lots of independent random variables: for example, when forming sample means in statistics, where the bigger the sample size is, the better. But convolution involves integration, so adding $n$ independent random variables involves $n - 1$ integrations, and this is awkward to do for large $n$. One thus seeks a way to transform distributions so as to make the awkward operation of convolution as easy to handle as the operation of addition of independent random variables that gives rise to it.
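In the discrete case the convolution integral becomes a sum, $h(z) = \sum_x f(x)g(z - x)$, which is easy to compute. The sketch below convolves two probability mass functions stored as dictionaries; the fair die used as an example is an illustrative choice, and the function name is hypothetical.

```python
from fractions import Fraction


def convolve(f, g):
    """Distribution of X + Y for independent X, Y with pmfs f and g.

    f and g map values to probabilities; the result h satisfies
    h(z) = sum over x of f(x) * g(z - x).
    """
    h = {}
    for x, px in f.items():
        for y, py in g.items():
            h[x + y] = h.get(x + y, Fraction(0)) + px * py
    return h


# Example: the sum of two independent fair dice.
die = {k: Fraction(1, 6) for k in range(1, 7)}
two_dice = convolve(die, die)
three_dice = convolve(two_dice, die)
```

Adding a third die needs one more convolution, which illustrates the remark that adding $n$ independent random variables requires $n - 1$ convolution steps.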
Definition A.2.1. If $X$ is a random variable with distribution function $F$, its characteristic function $\phi$ (or $\phi_X$ if we need to emphasise $X$) is

$$\phi(t) := \mathbb{E}(e^{itX}) = \int_{-\infty}^{\infty} e^{itx}\,dF(x), \quad (t \in \mathbb{R}).$$

Note. Here $i := \sqrt{-1}$. All other numbers ($t$, $x$ etc.) are real; all expressions involving $i$ such as $e^{itx}$, $\phi(t) = \mathbb{E}(e^{itX})$ are complex numbers.

The characteristic function takes convolution into multiplication: if $X$, $Y$ are independent,

$$\phi_{X+Y}(t) = \phi_X(t)\phi_Y(t).$$

For, as $X$, $Y$ are independent, so are $e^{itX}$ and $e^{itY}$ for any $t$, so by the multiplication theorem (Theorem A.1.1),

$$\mathbb{E}(e^{it(X+Y)}) = \mathbb{E}(e^{itX} \cdot e^{itY}) = \mathbb{E}(e^{itX}) \cdot \mathbb{E}(e^{itY}),$$

as required.

We list some properties of characteristic functions that we shall need.

1. $\phi(0) = 1$. For, $\phi(0) = \mathbb{E}(e^{i \cdot 0 \cdot X}) = \mathbb{E}(e^0) = \mathbb{E}(1) = 1$.

2. $|\phi(t)| \le 1$ for all $t \in \mathbb{R}$.

Proof. $$|\phi(t)| = \left|\int_{-\infty}^{\infty} e^{itx}\,dF(x)\right| \le \int_{-\infty}^{\infty} |e^{itx}|\,dF(x) = \int_{-\infty}^{\infty} 1\,dF(x) = 1.$$

Thus in particular the characteristic function always exists (the integral defining it is always absolutely convergent). This is a crucial advantage, far outweighing the disadvantage of having to work with complex rather than real numbers (the nuisance value of which is in fact slight).

3. $\phi$ is continuous (indeed, $\phi$ is uniformly continuous).

Proof. $$|\phi(t + u) - \phi(t)| = \left|\int_{-\infty}^{\infty} \{e^{i(t+u)x} - e^{itx}\}\,dF(x)\right| = \left|\int_{-\infty}^{\infty} e^{itx}(e^{iux} - 1)\,dF(x)\right| \le \int_{-\infty}^{\infty} |e^{iux} - 1|\,dF(x),$$

for all $t$. Now as $u \to 0$, $|e^{iux} - 1| \to 0$, and $|e^{iux} - 1| \le 2$. The bound on the right tends to zero as $u \to 0$ by Lebesgue's dominated convergence theorem (which we quote from measure theory: see e.g. (Williams 1991), §5.9), giving continuity; the uniformity follows as the bound holds uniformly in $t$.

4. (Uniqueness theorem): $\phi$ determines the distribution function $F$ uniquely.

Technically, $\phi$ is the Fourier-Stieltjes transform of $F$, and here we are quoting the uniqueness property of this transform. Were uniqueness not to hold, we would lose information on taking characteristic functions, and so $\phi$ would not be useful.

5. (Continuity theorem): If $X_n$, $X$ are random variables with distribution functions $F_n$, $F$ and characteristic functions $\phi_n$, $\phi$, then convergence of $\phi_n$ to $\phi$,

$$\phi_n(t) \to \phi(t) \quad (n \to \infty) \quad \text{for all } t \in \mathbb{R},$$

is equivalent to convergence in distribution of $X_n$ to $X$. This result is due to Lévy, see e.g. (Williams 1991), §18.1.

6. Moments. Suppose $X$ has $k$th moment: $\mathbb{E}|X|^k < \infty$. Take the Taylor (power-series) expansion of $e^{itx}$ as far as the $k$th power term:

$$e^{itx} = 1 + itx + \cdots + \frac{(itx)^k}{k!} + o(t^k),$$
where '$o(t^k)$' denotes an error term of smaller order than $t^k$ for small $t$. Now replace $x$ by $X$, and take expectations. By linearity, we obtain

$$\phi(t) = \mathbb{E}(e^{itX}) = 1 + it\,\mathbb{E}X + \cdots + \frac{(it)^k}{k!}\mathbb{E}(X^k) + e(t),$$

where the error term $e(t)$ is the expectation of the error terms (now random, as $X$ is random) obtained above (one for each value $X(\omega)$ of $X$). It is not obvious, but it is true, that $e(t)$ is still of smaller order than $t^k$ for $t \to 0$:

$$\text{if } \mathbb{E}|X|^k < \infty, \quad \phi(t) = 1 + it\,\mathbb{E}(X) + \ldots + \frac{(it)^k}{k!}\mathbb{E}(X^k) + o(t^k) \quad (t \to 0).$$

We shall need the case $k = 2$ in dealing with the central limit theorem below.

Examples

1. Standard Normal Distribution, $N(0, 1)$. For the standard normal density $f(x) = \frac{1}{\sqrt{2\pi}}\exp\{-\frac{1}{2}x^2\}$, one has, by the process of 'completing the square' (familiar from when one first learns to solve quadratic equations!),

$$\int_{-\infty}^{\infty} e^{tx}f(x)\,dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \exp\left\{tx - \frac{1}{2}x^2\right\}dx = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \exp\left\{-\frac{1}{2}(x - t)^2 + \frac{1}{2}t^2\right\}dx$$

$$= \exp\left\{\frac{1}{2}t^2\right\} \cdot \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \exp\left\{-\frac{1}{2}(x - t)^2\right\}dx.$$

The second factor on the right is 1 (it has the form of a normal integral). This gives the integral on the left as $\exp\{\frac{1}{2}t^2\}$.

Now replace $t$ by $it$ (legitimate by analytic continuation, which we quote from complex analysis, see e.g. (Burkill and Burkill 1970)). The right becomes $\exp\{-\frac{1}{2}t^2\}$. The integral on the left becomes the characteristic function of the standard normal density, which we have thus now identified (and will need below in §2.8).

2. General Normal Distribution, $N(\mu, \sigma^2)$. Consider the transformation $x \to \mu + \sigma x$. Applied to a random variable $X$, this adds $\mu$ to the mean (a change of location), and multiplies the variance by $\sigma^2$ (a change of scale). One can check that if $X$ has the standard normal density above, then $\mu + \sigma X$ has density

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{1}{2}(x - \mu)^2/\sigma^2\right\},$$

and characteristic function

$$\mathbb{E}e^{it(\mu + \sigma X)} = \exp\{i\mu t\}\,\mathbb{E}\left(e^{(i\sigma t)X}\right) = \exp\{i\mu t\}\exp\left\{-\frac{1}{2}(\sigma t)^2\right\} = \exp\left\{i\mu t - \frac{1}{2}\sigma^2 t^2\right\}.$$

Thus the general normal density and its characteristic function are

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac{1}{2}(x - \mu)^2/\sigma^2\right\}, \quad \phi(t) = \exp\left\{i\mu t - \frac{1}{2}\sigma^2 t^2\right\}.$$
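The formula $\phi(t) = \exp\{i\mu t - \frac{1}{2}\sigma^2 t^2\}$ can be checked numerically by integrating $e^{itx}$ against the normal density. The sketch below uses the trapezoidal rule on a truncated interval; the truncation width and number of grid points are assumptions of this example, and the function names are illustrative.

```python
import cmath
import math


def normal_cf_numeric(t, mu=0.0, sigma=1.0, half_width=12.0, n=20000):
    """Trapezoidal approximation of E[exp(itX)] for X ~ N(mu, sigma^2).

    The integral is truncated to mu +/- half_width * sigma, where the
    normal density is negligible.
    """
    lo = mu - half_width * sigma
    h = 2.0 * half_width * sigma / n
    total = 0j
    for k in range(n + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        density = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += weight * cmath.exp(1j * t * x) * density
    return total * h


def normal_cf_exact(t, mu=0.0, sigma=1.0):
    """Closed form exp(i*mu*t - sigma^2 t^2 / 2)."""
    return cmath.exp(1j * mu * t - 0.5 * sigma ** 2 * t ** 2)
```

The two functions should agree closely for moderate values of $t$; the numerical value also satisfies $|\phi(t)| \le 1$ and $\phi(0) = 1$ up to quadrature error, as in properties 1 and 2 of characteristic functions.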
3. Poisson Distribution, $P(\lambda)$. Here, the probability mass function is

$$f(k) := \mathbb{P}(X = k) = e^{-\lambda}\lambda^k/k!, \quad (k = 0, 1, 2, \ldots).$$

The characteristic function is thus

$$\phi(t) = \mathbb{E}e^{itX} = \sum_{k=0}^{\infty} \frac{e^{-\lambda}\lambda^k}{k!} \cdot e^{itk} = e^{-\lambda}\sum_{k=0}^{\infty} (\lambda e^{it})^k/k! = e^{-\lambda}\exp\{\lambda e^{it}\} = \exp\{-\lambda(1 - e^{it})\}.$$

A.3 The Central Limit Theorem

Readers of this book will be well aware that

$$\left(1 + \frac{x}{n}\right)^n \to e^x \quad (n \to \infty) \quad \forall x \in \mathbb{R}.$$

This is the formula governing the passage from discrete to continuous compound interest. Invest one pound (or dollar) for one year at $100x\%$ p.a.; with interest compounded $n$ times p.a., our capital after one year is $(1 + \frac{x}{n})^n$. With continuous compounding, our capital after one year is the exponential $e^x$: exponential growth corresponds to continuously compounded interest.

We need two extensions: the formula still holds with $x \in \mathbb{R}$ replaced by a complex number $z \in \mathbb{C}$:

$$\left(1 + \frac{z}{n}\right)^n \to e^z \quad (n \to \infty) \quad \forall z \in \mathbb{C},$$

and if $z_n \in \mathbb{C}$, $z_n \to z$,

$$\left(1 + \frac{z_n}{n}\right)^n \to e^z \quad (n \to \infty) \quad (z_n \to z \in \mathbb{C}).$$

As a first illustration of the power of transform methods, we prove the weak law of large numbers:

Theorem A.3.1 (Weak Law of Large Numbers). If $X_1, X_2, \ldots$ are independent and identically distributed with mean $\mu$, then

$$\frac{1}{n}\sum_{i=1}^n X_i \to \mu \quad (n \to \infty) \quad \text{in probability}.$$

Proof. If the $X_i$ have characteristic function $\phi$, then by the moment property of §2.8 with $k = 1$,

$$\phi(t) = 1 + i\mu t + o(t) \quad (t \to 0).$$

Now using the i.i.d. assumption, $\frac{1}{n}\sum_{i=1}^n X_i$ has characteristic function

$$\mathbb{E}\exp\left\{it \cdot \frac{1}{n}\sum_{j=1}^n X_j\right\} = \mathbb{E}\prod_{j=1}^n \exp\left\{\frac{it}{n}X_j\right\} = \prod_{j=1}^n \mathbb{E}\exp\left\{\frac{it}{n}X_j\right\} = (\phi(t/n))^n$$

$$= \left(1 + \frac{i\mu t}{n} + o(1/n)\right)^n \to e^{i\mu t} \quad (n \to \infty),$$
and $e^{i\mu t}$ is the characteristic function of the constant $\mu$ (for fixed $t$, $o(1/n)$ is an error term of smaller order than $1/n$ as $n \to \infty$). By the continuity theorem,

$$\frac{1}{n}\sum_{i=1}^n X_i \to \mu \quad \text{in distribution},$$

and as $\mu$ is constant, this says (see §2.6) that

$$\frac{1}{n}\sum_{i=1}^n X_i \to \mu \quad \text{in probability}.$$

The main result of this section is the same argument carried one stage further.

Theorem A.3.2 (Central Limit Theorem). If $X_1, X_2, \ldots$ are independent and identically distributed with mean $\mu$ and variance $\sigma^2$, then with $N(0, 1)$ the standard normal distribution,

$$\frac{\sqrt{n}}{\sigma}\,\frac{1}{n}\sum_{i=1}^n (X_i - \mu) = \frac{1}{\sqrt{n}}\sum_{i=1}^n (X_i - \mu)/\sigma \to N(0, 1) \quad (n \to \infty) \quad \text{in distribution}.$$

That is, for all $x \in \mathbb{R}$,

$$\mathbb{P}\left(\frac{1}{\sqrt{n}}\sum_{i=1}^n (X_i - \mu)/\sigma \le x\right) \to \Phi(x) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^x e^{-\frac{1}{2}y^2}\,dy \quad (n \to \infty).$$

Proof. We first centre at the mean. If $X_i$ has characteristic function $\phi$, let $X_i - \mu$ have characteristic function $\phi_0$. Since $X_i - \mu$ has mean 0 and second moment $\sigma^2 = \mathbb{V}ar(X_i) = \mathbb{E}[(X_i - \mathbb{E}X_i)^2] = \mathbb{E}[(X_i - \mu)^2]$, the case $k = 2$ of the moment property of §2.7 gives

$$\phi_0(t) = 1 - \frac{1}{2}\sigma^2 t^2 + o(t^2) \quad (t \to 0).$$

Now $\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n X_i - \mu\right)/\sigma$ has characteristic function

$$\mathbb{E}\exp\left\{it \cdot \frac{\sqrt{n}}{\sigma}\left(\frac{1}{n}\sum_{j=1}^n X_j - \mu\right)\right\} = \mathbb{E}\prod_{j=1}^n \exp\left\{\frac{it(X_j - \mu)}{\sigma\sqrt{n}}\right\} = \prod_{j=1}^n \mathbb{E}\exp\left\{\frac{it}{\sigma\sqrt{n}}(X_j - \mu)\right\}$$

$$= \left(\phi_0\left(\frac{t}{\sigma\sqrt{n}}\right)\right)^n = \left(1 - \frac{\frac{1}{2}\sigma^2 t^2}{\sigma^2 n} + o\left(\frac{1}{n}\right)\right)^n \to e^{-\frac{1}{2}t^2} \quad (n \to \infty),$$

and $e^{-\frac{1}{2}t^2}$ is the characteristic function of the standard normal distribution $N(0, 1)$. The result follows by the continuity theorem.

Note. In Theorem A.3.2, we:
(i) centre the $X_i$ by subtracting the mean (to get mean 0);
(ii) scale the resulting $X_i - \mu$ by dividing by the standard deviation $\sigma$ (to get variance 1).
Then if $Y_i := (X_i - \mu)/\sigma$ are the resulting standardised variables, $\frac{1}{\sqrt{n}}\sum_{i=1}^n Y_i$ converges in distribution to standard normal.
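The key step of the proof, $(\phi_0(t/(\sigma\sqrt{n})))^n \to e^{-t^2/2}$, can be observed numerically for a concrete distribution. The sketch below uses the exponential distribution with parameter 1 as an illustrative choice (not taken from the notes): it has $\mu = \sigma = 1$ and characteristic function $1/(1 - it)$, so the centred variable has $\phi_0(t) = e^{-it}/(1 - it)$. Function names are hypothetical.

```python
import cmath
import math


def phi0_exponential(t):
    """Characteristic function of X - 1 for X exponential with parameter 1."""
    return cmath.exp(-1j * t) / (1.0 - 1j * t)


def standardized_sum_cf(t, n):
    """Characteristic function of n^{-1/2} * sum of (X_j - mu)/sigma.

    For Exp(1) we have mu = sigma = 1, so this is phi0(t / sqrt(n)) ** n,
    evaluated through exp(n * log(.)) for numerical stability.
    """
    z = phi0_exponential(t / math.sqrt(n))
    return cmath.exp(n * cmath.log(z))


def clt_error(t, n):
    """Distance to the standard normal characteristic function exp(-t^2/2)."""
    return abs(standardized_sum_cf(t, n) - math.exp(-0.5 * t * t))
```

Evaluating `clt_error(1.0, n)` for increasing `n` shows the error shrinking, in line with the continuity theorem argument.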
Example: the Binomial Case. If each $X_i$ is Bernoulli distributed with parameter $p \in (0, 1)$,

$$\mathbb{P}(X_i = 1) = p, \quad \mathbb{P}(X_i = 0) = q := 1 - p$$

(so $X_i$ has mean $p$ and variance $pq$), then $S_n := \sum_{i=1}^n X_i$ is binomially distributed with parameters $n$ and $p$:

$$\mathbb{P}\left(\sum_{i=1}^n X_i = k\right) = \binom{n}{k}p^k q^{n-k} = \frac{n!}{(n-k)!\,k!}p^k q^{n-k}.$$

A direct attack on the distribution of $\frac{1}{\sqrt{n}}\sum_{i=1}^n (X_i - p)/\sqrt{pq}$ can be made via

$$\mathbb{P}\left(a \le \frac{\sum_{i=1}^n X_i - np}{\sqrt{npq}} \le b\right) = \sum_{k\,:\,np + a\sqrt{npq} \le k \le np + b\sqrt{npq}} \frac{n!}{(n-k)!\,k!}p^k q^{n-k}.$$

Since $n$, $k$ and $n - k$ will all be large here, one needs an approximation to the factorials. The required result is Stirling's formula of 1730:

$$n! \sim \sqrt{2\pi}\,e^{-n}n^{n + \frac{1}{2}} \quad (n \to \infty)$$

(the symbol $\sim$ indicates that the ratio of the two sides tends to 1). The argument can be carried through to obtain the sum on the right as a Riemann sum (in the sense of the Riemann integral: §2.2) for $\int_a^b \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2}\,dx$, whence the result. This, the earliest form of the central limit theorem, is the de Moivre-Laplace limit theorem (Abraham de Moivre, 1667–1754; P.S. de Laplace, 1749–1827). The proof of the de Moivre-Laplace limit theorem sketched above is closely analogous to the passage from the discrete to the continuous Black-Scholes formula: see §4.6 and §6.4.

Local Limit Theorems. The central limit theorem as proved above is a global limit theorem: it relates to distributions and convergence thereof. The de Moivre-Laplace limit theorem above, however, deals directly with individual probabilities in the discrete case (the sum of a large number of which is shown to approximate an integral). A limit theorem dealing with densities and convergence thereof in the density case, or with the discrete analogues of densities (such as the individual probabilities $\mathbb{P}(S_n = k)$ in the binomial case above) is called a local limit theorem.

Poisson Limit Theorem. The de Moivre-Laplace limit theorem (convergence of binomial to normal) is only one possible limiting regime for binomial models. The next most important one has a Poisson limit in place of a normal one.

Suppose we have a sequence of binomial models $B(n, p)$, where the success probability $p = p_n$ varies with $n$, in such a way that

$$np_n \to \lambda > 0, \quad (n \to \infty). \qquad (A.1)$$

Thus $p_n \to 0$; indeed, $p_n \sim \lambda/n$. This models a situation where we have a large number $n$ of Bernoulli trials, each with small probability $p_n$ of success, but such that $np_n$, the expected total number of successes, is 'neither large nor small, but intermediate'. Binomial models satisfying condition (A.1) converge to the Poisson model $P(\lambda)$ with parameter $\lambda > 0$.

This result is sometimes called the law of small numbers. The Poisson distribution is widely used to model statistics of accidents, insurance claims and the like, where one has a large number $n$ of individuals at risk, each with a small probability $p_n$ of generating an accident, insurance claim etc. ('success probability' seems a strange usage here!).
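Both limiting regimes for the binomial distribution can be examined numerically. The sketch below computes binomial probabilities in log space via `math.lgamma` (to avoid overflow for large $n$) and compares them with the normal approximation of the de Moivre-Laplace theorem and with the Poisson limit under $p_n = \lambda/n$. Function names and the sample parameters are illustrative assumptions.

```python
import math


def binom_pmf(n, p, k):
    """P(S_n = k) for 0 < p < 1, computed in log space; 0 outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def poisson_pmf(lam, k):
    """P(X = k) for X Poisson with parameter lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def de_moivre_laplace_gap(n, p, a, b):
    """|P(a <= (S_n - np)/sqrt(npq) <= b) - (Phi(b) - Phi(a))|."""
    s = math.sqrt(n * p * (1.0 - p))
    lo = max(math.ceil(n * p + a * s), 0)
    hi = min(math.floor(n * p + b * s), n)
    exact = sum(binom_pmf(n, p, k) for k in range(lo, hi + 1))
    return abs(exact - (norm_cdf(b) - norm_cdf(a)))


def poisson_limit_gap(n, lam, kmax=30):
    """max over k <= kmax of |P(S_n = k) - P(X = k)| with p_n = lam / n."""
    p = lam / n
    return max(abs(binom_pmf(n, p, k) - poisson_pmf(lam, k)) for k in range(kmax + 1))
```

Increasing $n$ in either function should shrink the reported gap, illustrating convergence to the normal and Poisson limits respectively.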
Appendix B

Facts from Probability and Measure Theory

We will assume that most readers will be familiar with such things from an elementary course in probability and statistics; for a clear introduction see, e.g. Grimmett and Welsh (1986), or the first few chapters of ?; Ross (1997), Resnick (2001), Durrett (1999), Rosenthal (2000) are also useful.

B.1 Measure

The language of modelling financial markets involves that of probability, which in turn involves that of measure theory. This originated with Henri Lebesgue (1875–1941), in his thesis, 'Intégrale, longueur, aire' Lebesgue (1902). We begin with defining a measure on $\mathbb{R}$ generalising the intuitive notion of length.

The length $\mu(I)$ of an interval $I = (a, b)$, $[a, b]$, $[a, b)$ or $(a, b]$ should be $b - a$: $\mu(I) = b - a$. The length of the disjoint union $I = \bigcup_{r=1}^n I_r$ of intervals $I_r$ should be the sum of their lengths:

$$\mu\left(\bigcup_{r=1}^n I_r\right) = \sum_{r=1}^n \mu(I_r) \quad \text{(finite additivity)}.$$

Consider now an infinite sequence $I_1, I_2, \ldots$ (ad infinitum) of disjoint intervals. Letting $n$ tend to $\infty$ suggests that length should again be additive over disjoint intervals:

$$\mu\left(\bigcup_{r=1}^{\infty} I_r\right) = \sum_{r=1}^{\infty} \mu(I_r) \quad \text{(countable additivity)}.$$

For $I$ an interval, $A$ a subset of length $\mu(A)$, the length of the complement $I \setminus A := I \cap A^c$ of $A$ in $I$ should be

$$\mu(I \setminus A) = \mu(I) - \mu(A) \quad \text{(complementation)}.$$

If $A \subseteq B$ and $B$ has length $\mu(B) = 0$, then $A$ should have length 0 also:

$$A \subseteq B \text{ and } \mu(B) = 0 \Rightarrow \mu(A) = 0 \quad \text{(completeness)}.$$

The term 'countable' here requires comment. We must distinguish first between finite and infinite sets; then countable sets (like $\mathbb{N} = \{1, 2, 3, \ldots\}$) are the 'smallest', or 'simplest', infinite sets, as distinct from uncountable sets such as $\mathbb{R} = (-\infty, \infty)$.

Let $\mathcal{F}$ be the smallest class of sets $A \subset \mathbb{R}$ containing the intervals, closed under countable disjoint unions and complements, and complete (containing all subsets of sets of length 0 as sets of length 0). The above suggests (what Lebesgue showed) that length can be sensibly defined on the
sets $\mathcal{F}$ on the line, but on no others. There are others, but they are hard to construct (in technical language: the axiom of choice, or some variant of it such as Zorn's lemma, is needed to demonstrate the existence of non-measurable sets, and all such proofs are highly non-constructive). So: some but not all subsets of the line have a length. These are called the Lebesgue-measurable sets, and form the class $\mathcal{F}$ described above; length, defined on $\mathcal{F}$, is called Lebesgue measure $\mu$ (on the real line, $\mathbb{R}$).

Turning now to the general case, we make the above rigorous. Let $\Omega$ be a set.

Definition B.1.1. A collection $\mathcal{A}_0$ of subsets of $\Omega$ is called an algebra on $\Omega$ if:
(i) $\Omega \in \mathcal{A}_0$,
(ii) $A \in \mathcal{A}_0 \Rightarrow A^c = \Omega \setminus A \in \mathcal{A}_0$,
(iii) $A, B \in \mathcal{A}_0 \Rightarrow A \cup B \in \mathcal{A}_0$.

Using this definition and induction, we can show that an algebra on $\Omega$ is a family of subsets of $\Omega$ closed under finitely many set operations.

Definition B.1.2. An algebra $\mathcal{A}$ of subsets of $\Omega$ is called a $\sigma$-algebra on $\Omega$ if for any sequence $A_n \in \mathcal{A}$ $(n \in \mathbb{N})$, we have
$$\bigcup_{n=1}^\infty A_n \in \mathcal{A}.$$
Such a pair $(\Omega, \mathcal{A})$ is called a measurable space.

Thus a $\sigma$-algebra on $\Omega$ is a family of subsets of $\Omega$ closed under any countable collection of set operations.

The main examples of $\sigma$-algebras are $\sigma$-algebras generated by a class $\mathcal{C}$ of subsets of $\Omega$, i.e. $\sigma(\mathcal{C})$ is the smallest $\sigma$-algebra on $\Omega$ containing $\mathcal{C}$. The Borel $\sigma$-algebra $\mathcal{B} = \mathcal{B}(\mathbb{R})$ is the $\sigma$-algebra of subsets of $\mathbb{R}$ generated by the open intervals (equivalently, by half-lines such as $(-\infty, x]$ as $x$ varies in $\mathbb{R}$).

As our aim is to define measures on collections of sets we now turn to set functions.

Definition B.1.3. Let $\Omega$ be a set, $\mathcal{A}_0$ an algebra on $\Omega$ and $\mu_0$ a non-negative set function $\mu_0 : \mathcal{A}_0 \to [0, \infty]$ such that $\mu_0(\emptyset) = 0$. $\mu_0$ is called:
(i) additive, if $A, B \in \mathcal{A}_0$, $A \cap B = \emptyset \Rightarrow \mu_0(A \cup B) = \mu_0(A) + \mu_0(B)$,
(ii) countably additive, if whenever $(A_n)_{n \in \mathbb{N}}$ is a sequence of disjoint sets in $\mathcal{A}_0$ with $\bigcup_{n=1}^\infty A_n \in \mathcal{A}_0$, then
$$\mu_0\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{n=1}^\infty \mu_0(A_n).$$

Definition B.1.4. Let $(\Omega, \mathcal{A})$ be a measurable space. A countably additive map $\mu : \mathcal{A} \to [0, \infty]$ is called a measure on $(\Omega, \mathcal{A})$. The triple $(\Omega, \mathcal{A}, \mu)$ is called a measure space.

Recall that our motivating example was to define a measure on $\mathbb{R}$ consistent with our geometrical knowledge of the length of an interval. That means we have a suitable definition of measure on a family of subsets of $\mathbb{R}$ and want to extend it to the generated $\sigma$-algebra. The measure-theoretic tool to do so is the Carathéodory extension theorem, for which the following lemma is an inevitable prerequisite.

Lemma B.1.1. Let $\Omega$ be a set. Let $\mathcal{I}$ be a $\pi$-system on $\Omega$, that is, a family of subsets of $\Omega$ closed under finite intersections: $I_1, I_2 \in \mathcal{I} \Rightarrow I_1 \cap I_2 \in \mathcal{I}$. Let $\mathcal{A} = \sigma(\mathcal{I})$ and suppose that $\mu_1$ and $\mu_2$ are finite measures on $(\Omega, \mathcal{A})$ with $\mu_1(\Omega) = \mu_2(\Omega) < \infty$ and $\mu_1 = \mu_2$ on $\mathcal{I}$. Then $\mu_1 = \mu_2$ on $\mathcal{A}$.
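On a finite set $\Omega$ the generated $\sigma$-algebra $\sigma(\mathcal{C})$ can be computed by brute force, which may help to make Definitions B.1.1 and B.1.2 concrete. The sketch below is an illustration under that finiteness assumption only (for finite $\Omega$ every algebra is already a $\sigma$-algebra); the function name `generated_sigma_algebra` is hypothetical and not taken from the notes.

```python
from itertools import combinations


def generated_sigma_algebra(omega, generators):
    """Smallest family containing the generators, closed under complement and union.

    Assumes omega is finite, so closure under finite unions is enough.
    """
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        # Close under complements (property (ii) of an algebra).
        for a in current:
            complement = omega - a
            if complement not in sets:
                sets.add(complement)
                changed = True
        # Close under pairwise unions (property (iii) of an algebra).
        for a, b in combinations(current, 2):
            union = a | b
            if union not in sets:
                sets.add(union)
                changed = True
    return sets


if __name__ == "__main__":
    sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
    for s in sorted(sigma, key=lambda s: (len(s), sorted(s))):
        print(sorted(s))
```

With generators $\{1\}$ and $\{2\}$ on $\Omega = \{1,2,3,4\}$ the atoms are $\{1\}$, $\{2\}$ and $\{3,4\}$, so the result has $2^3 = 8$ members and never separates the points 3 and 4.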
Theorem B.1.1 (Carathéodory Extension Theorem). Let $\Omega$ be a set, $\mathcal{A}_0$ an algebra on $\Omega$ and $\mathcal{A} = \sigma(\mathcal{A}_0)$. If $\mu_0$ is a countably additive set function on $\mathcal{A}_0$, then there exists a measure $\mu$ on $(\Omega, \mathcal{A})$ such that $\mu = \mu_0$ on $\mathcal{A}_0$. If $\mu_0$ is finite, then the extension is unique.

For proofs of the above and further discussion, we refer the reader to Chapter 1 and Appendix 1 of Williams (1991) and the appendix in Durrett (1996).

Returning to the motivating example $\Omega = \mathbb{R}$, we say that $A \subset \mathbb{R}$ belongs to the collection of sets $\mathcal{A}_0$ if $A$ can be written as
$$A = (a_1, b_1] \cup \ldots \cup (a_r, b_r],$$
where $r \in \mathbb{N}$, $-\infty \le a_1 < b_1 \le \ldots \le a_r < b_r \le \infty$. It can be shown that $\mathcal{A}_0$ is an algebra and $\sigma(\mathcal{A}_0) = \mathcal{B}$. For $A$ as above define
$$\mu_0(A) = \sum_{k=1}^r (b_k - a_k).$$
$\mu_0$ is well-defined and countably additive on $\mathcal{A}_0$. As intervals belong to $\mathcal{A}_0$ our geometric intuition of length is preserved. Now by Carathéodory's extension theorem there exists a measure $\mu$ on $(\Omega, \mathcal{B})$ extending $\mu_0$ on $(\Omega, \mathcal{A}_0)$. This $\mu$ is called Lebesgue measure.

With the same approach we can generalise:
(i) the area of rectangles $R = (a_1, b_1) \times (a_2, b_2)$, with or without any of its perimeter included, given by $\mu(R) = (b_1 - a_1) \times (b_2 - a_2)$, to Lebesgue measure on Borel sets in $\mathbb{R}^2$;
(ii) the volume of cuboids $C = (a_1, b_1) \times (a_2, b_2) \times (a_3, b_3)$, given by $\mu(C) = (b_1 - a_1) \cdot (b_2 - a_2) \cdot (b_3 - a_3)$, to Lebesgue measure on Borel sets in $\mathbb{R}^3$;
(iii) and similarly in $k$-dimensional Euclidean space $\mathbb{R}^k$. We start with the formula for a $k$-dimensional box,
$$\mu\Big(\prod_{i=1}^k (a_i, b_i)\Big) = \prod_{i=1}^k (b_i - a_i),$$
and obtain Lebesgue measure $\mu$, defined on $\mathcal{B}$, in $\mathbb{R}^k$.

We are mostly concerned with a special class of measures:

Definition B.1.5. A measure $\mathbb{P}$ on a measurable space $(\Omega, \mathcal{A})$ is called a probability measure if $\mathbb{P}(\Omega) = 1$. The triple $(\Omega, \mathcal{A}, \mathbb{P})$ is called a probability space.

Observe that the above lemma and Carathéodory's extension theorem guarantee uniqueness if we construct a probability measure using the above procedure. For example the unit cube $[0,1]^k$ in $\mathbb{R}^k$ has (Lebesgue) measure 1. Using $\Omega = [0,1]^k$ as the underlying set in the above construction we find a unique probability (which equals length/area/volume if $k = 1/2/3$).

If a property holds everywhere except on a set of measure zero, we say it holds almost everywhere (a.e.). If it holds everywhere except on a set of probability zero, we say it holds almost surely (a.s.) (or, with probability one).

Roughly speaking, one uses addition in countable (or finite) situations, integration in uncountable ones. As the key measure-theoretic axiom of countable additivity above concerns addition, countably infinite situations (such as we meet in discrete time) fit well with measure theory. By contrast, uncountable situations (such as we meet in continuous time) do not, or at least are considerably harder to handle. This is why the discrete-time setting of Chapters 3, 4 is easier than,
and precedes, the continuous-time setting of Chapters 5, 6. Our strategy is to do as much as possible to introduce the key ideas (economic, financial and mathematical) in discrete time (which, because we work with a finite time-horizon, the expiry time $T$, is actually a finite situation), before treating the harder case of continuous time.

B.2 Integral

Let $(\Omega, \mathcal{A})$ be a measurable space. We want to define integration for a suitable class of real-valued functions.

Definition B.2.1. Let $f : \Omega \to \mathbb{R}$. For $A \subset \mathbb{R}$ define $f^{-1}(A) = \{\omega \in \Omega : f(\omega) \in A\}$. $f$ is called ($\mathcal{A}$-)measurable if $f^{-1}(B) \in \mathcal{A}$ for all $B \in \mathcal{B}$.

Let $\mu$ be a measure on $(\Omega, \mathcal{A})$. Our aim now is to define, for suitable measurable functions, the (Lebesgue) integral with respect to $\mu$. We will denote this integral by
$$\mu(f) = \int_\Omega f\,d\mu = \int_\Omega f(\omega)\,\mu(d\omega).$$
We start with the simplest functions. If $A \in \mathcal{A}$ the indicator function $1_A(\omega)$ is defined by
$$1_A(\omega) = \begin{cases} 1, & \text{if } \omega \in A, \\ 0, & \text{if } \omega \notin A. \end{cases}$$
Then define $\mu(1_A) = \mu(A)$.

The next step extends the definition to simple functions. A function $f$ is called simple if it is a finite linear combination of indicators: $f = \sum_{i=1}^n c_i 1_{A_i}$ for constants $c_i$ and indicator functions $1_{A_i}$ of measurable sets $A_i$. One then extends the definition of the integral from indicator functions to simple functions by linearity:
$$\mu\Big(\sum_{i=1}^n c_i 1_{A_i}\Big) := \sum_{i=1}^n c_i\, \mu(1_{A_i}) = \sum_{i=1}^n c_i\, \mu(A_i),$$
for constants $c_i$ and indicators of measurable sets $A_i$.

If $f$ is a non-negative measurable function, we define
$$\mu(f) := \sup\{\mu(f_0) : f_0 \text{ simple},\ f_0 \le f\}.$$
The key result in integration theory, which we must use here to guarantee that the integral for non-negative measurable functions is well-defined, is:

Theorem B.2.1 (Monotone Convergence Theorem). If $(f_n)$ is a sequence of non-negative measurable functions such that $f_n$ increases monotonically (not necessarily strictly) to a function $f$ (which is then also measurable), then $\mu(f_n) \to \mu(f) \le \infty$.

We quote that we can construct each non-negative measurable $f$ as the increasing limit of a sequence of simple functions $f_n$:
$$f_n(\omega) \uparrow f(\omega) \text{ for all } \omega \in \Omega \quad (n \to \infty), \qquad f_n \text{ simple}.$$
Using the monotone convergence theorem we can thus obtain the integral of $f$ as
$$\mu(f) := \lim_{n \to \infty} \mu(f_n).$$
Since $f_n$ increases in $n$, so does $\mu(f_n)$ (the integral is order-preserving), so either $\mu(f_n)$ increases to a finite limit, or diverges to $\infty$. In the first case, we say $f$ is (Lebesgue) integrable with (Lebesgue) integral $\mu(f) = \lim \mu(f_n)$.
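The construction of $\mu(f)$ as an increasing limit of integrals of simple functions can be followed numerically. The sketch below is an illustration only, under the following assumptions that are not in the notes: $\Omega = [0,1]$ with Lebesgue measure, $f(x) = x^2$, and the standard dyadic approximation $f_n = \min(\lfloor 2^n f \rfloor / 2^n, n)$, which can be written as $f_n = \sum_{k=1}^{n 2^n} 2^{-n} 1_{\{f \ge k 2^{-n}\}}$. For this $f$ the level sets are intervals, $\{f \ge c\} = [\sqrt{c}, 1]$, so their measure is known in closed form.

```python
from math import sqrt


def level_set_measure(c):
    """Lebesgue measure of {x in [0, 1] : x**2 >= c} for c > 0 (an interval)."""
    return max(0.0, 1.0 - sqrt(c))


def integral_of_simple_approximation(n):
    """mu(f_n) for the dyadic simple function f_n = sum_k 2^{-n} 1_{f >= k 2^{-n}}.

    By linearity, mu(f_n) = 2^{-n} * sum_k mu({f >= k 2^{-n}}), k = 1..n*2^n.
    """
    step = 2.0 ** (-n)
    return step * sum(level_set_measure(k * step) for k in range(1, n * 2**n + 1))


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, integral_of_simple_approximation(n))
```

The printed values increase with $n$ and approach $\int_0^1 x^2\,dx = 1/3$ from below, as the monotone convergence theorem predicts.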
Finally if $f$ is a measurable function that may change sign, we split it into its positive and negative parts, $f_\pm$:
$$f_+(\omega) := \max(f(\omega), 0), \qquad f_-(\omega) := -\min(f(\omega), 0),$$
$$f(\omega) = f_+(\omega) - f_-(\omega), \qquad |f(\omega)| = f_+(\omega) + f_-(\omega).$$
If both $f_+$ and $f_-$ are integrable, we say that $f$ is too, and define
$$\mu(f) := \mu(f_+) - \mu(f_-).$$
Thus, in particular, $|f|$ is also integrable, and
$$\mu(|f|) = \mu(f_+) + \mu(f_-).$$
The Lebesgue integral thus defined is, by construction, an absolute integral: $f$ is integrable iff $|f|$ is integrable. Thus, for instance, the well-known formula
$$\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}$$
has no meaning for Lebesgue integrals, since $\int_1^\infty \frac{|\sin x|}{x}\,dx$ diverges to $+\infty$ like $\int_1^\infty \frac{1}{x}\,dx$. It has to be replaced by the limit relation
$$\int_0^X \frac{\sin x}{x}\,dx \to \frac{\pi}{2} \qquad (X \to \infty).$$
The class of (Lebesgue-) integrable functions $f$ on $\Omega$ is written $L(\Omega)$ or (for reasons explained below) $L^1(\Omega)$, abbreviated to $L^1$ or $L$.

For $p \ge 1$, the $L^p$ space $L^p(\Omega)$ on $\Omega$ is the space of measurable functions $f$ with $L^p$-norm
$$\|f\|_p := \Big(\int_\Omega |f|^p\,d\mu\Big)^{1/p} < \infty.$$
The case $p = 2$ gives $L^2$, which is particularly important as it is a Hilbert space (Appendix A).

Turning now to the special case $\Omega = \mathbb{R}^k$ we recall the well-known Riemann integral. Mathematics undergraduates are taught the Riemann integral (G.B. Riemann (1826-1866)) as their first rigorous treatment of integration theory; essentially this is just a rigorisation of the school integral. It is much easier to set up than the Lebesgue integral, but much harder to manipulate. For finite intervals $[a,b]$, we quote:
(i) for any function $f$ Riemann-integrable on $[a,b]$, it is Lebesgue-integrable to the same value (but many more functions are Lebesgue integrable);
(ii) a bounded function $f$ is Riemann-integrable on $[a,b]$ iff it is continuous a.e. on $[a,b]$.
Thus the question, 'Which functions are Riemann-integrable?' cannot be answered without the language of measure theory, which gives one the technically superior Lebesgue integral anyway.

Suppose that $F(x)$ is a non-decreasing function on $\mathbb{R}$:
$$F(x) \le F(y) \quad \text{if } x \le y.$$
Such functions can have at most countably many discontinuities, which are at worst jumps. We may without loss redefine $F$ at jumps so as to be right-continuous. We now generalise the starting points above:
• Measure. We take $\mu((a,b]) := F(b) - F(a)$.
• Integral. We have $\mu(1_{(a,b]}) = \mu((a,b]) = F(b) - F(a)$.

We may now follow through the successive extension procedures used above. We obtain:
• Lebesgue-Stieltjes measure $\mu_F$,
• Lebesgue-Stieltjes integral $\mu_F(f) = \int f\,d\mu_F$, or even $\int f\,dF$.

The approach generalises to higher dimensions; we omit further details.

If instead of being monotone non-decreasing, $F$ is the difference of two such functions, $F = F_1 - F_2$, we can define the integrals $\int f\,dF_1$, $\int f\,dF_2$ as above, and then define
$$\int f\,dF = \int f\,d(F_1 - F_2) := \int f\,dF_1 - \int f\,dF_2.$$
If $[a,b]$ is a finite interval and $F$ is defined on $[a,b]$, a finite collection of points, $x_0, x_1, \ldots, x_n$ with $a = x_0 < x_1 < \cdots < x_n = b$, is called a partition of $[a,b]$, $\mathcal{P}$ say. The sum $\sum_{i=1}^n |F(x_i) - F(x_{i-1})|$ is called the variation of $F$ over the partition. The least upper bound of this over all partitions $\mathcal{P}$ is called the variation of $F$ over the interval $[a,b]$, $V_a^b(F)$:
$$V_a^b(F) := \sup_{\mathcal{P}} \sum |F(x_i) - F(x_{i-1})|.$$
This may be $+\infty$; but if $V_a^b(F) < \infty$, $F$ is said to be of bounded variation on $[a,b]$, $F \in BV_a^b$. If $F$ is of bounded variation on all finite intervals, $F$ is said to be locally of bounded variation, $F \in BV_{loc}$; if $F$ is of bounded variation on the real line $\mathbb{R}$, $F$ is of bounded variation, $F \in BV$.

We quote that the following two properties are equivalent:
(i) $F$ is locally of bounded variation,
(ii) $F$ can be written as the difference $F = F_1 - F_2$ of two monotone functions.
So the above procedure defines the integral $\int f\,dF$ when the integrator $F$ is of bounded variation.

Remark B.2.1. (i) When we pass from discrete to continuous time, we will need to handle both 'smooth' paths and paths that vary by jumps (of bounded variation) and 'rough' ones (of unbounded variation but bounded quadratic variation).
(ii) The Lebesgue-Stieltjes integral $\int g(x)\,dF(x)$ is needed to express the expectation $\mathbb{E}g(X)$, where $X$ is a random variable with distribution function $F$ and $g$ a suitable function.

B.3 Probability

As we remarked in the introduction of this chapter, the mathematical theory of probability can be traced to 1654, to correspondence between Pascal (1623-1662) and Fermat (1601-1665). However, the theory remained both incomplete and non-rigorous until the 20th century. It turns out that the Lebesgue theory of measure and integral sketched above is exactly the machinery needed to construct a rigorous theory of probability adequate for modelling reality (option pricing, etc.) for us. This was realised by Kolmogorov (1903-1987), whose classic book of 1933, Grundbegriffe der Wahrscheinlichkeitsrechnung (Foundations of Probability Theory), Kolmogorov (1933), inaugurated the modern era in probability.

Recall from your first course on probability that, to describe a random experiment mathematically, we begin with the sample space $\Omega$, the set of all possible outcomes. Each point $\omega$ of $\Omega$, or sample point, represents a possible (random) outcome of performing the random experiment. For a set $A \subseteq \Omega$ of points $\omega$ we want to know the probability $\mathbb{P}(A)$ (or $\Pr(A)$, $\mathrm{pr}(A)$). We clearly want
(i) $\mathbb{P}(\emptyset) = 0$, $\mathbb{P}(\Omega) = 1$,
(ii) $\mathbb{P}(A) \ge 0$ for all $A$,
(iii) If $A_1, \ldots, A_n$ are disjoint, $\mathbb{P}\big(\bigcup_{i=1}^n A_i\big) = \sum_{i=1}^n \mathbb{P}(A_i)$ (finite additivity), which, as above, we will strengthen to
(iii)* If $A_1, A_2, \ldots$ (ad inf.) are disjoint,
$$\mathbb{P}\Big(\bigcup_{i=1}^\infty A_i\Big) = \sum_{i=1}^\infty \mathbb{P}(A_i) \qquad \text{(countable additivity)}.$$
(iv) If $B \subseteq A$ and $\mathbb{P}(A) = 0$, then $\mathbb{P}(B) = 0$ (completeness).

Then by (i) and (iii) (with $A = A_1$, $\Omega \setminus A = A_2$),
$$\mathbb{P}(A^c) = \mathbb{P}(\Omega \setminus A) = 1 - \mathbb{P}(A).$$
So the class $\mathcal{F}$ of subsets of $\Omega$ whose probabilities $\mathbb{P}(A)$ are defined (call such $A$ events) should be closed under countable, disjoint unions and complements, and contain the empty set $\emptyset$ and the whole space $\Omega$. Therefore $\mathcal{F}$ should be a $\sigma$-algebra and $\mathbb{P}$ should be defined on $\mathcal{F}$ according to Definition B.1.5. Repeating this:

Definition B.3.1. A probability space, or Kolmogorov triple, is a triple $(\Omega, \mathcal{F}, \mathbb{P})$ satisfying Kolmogorov axioms (i), (ii), (iii)*, (iv) above.

A probability space is a mathematical model of a random experiment.

Often we quantify outcomes $\omega$ of random experiments by defining a real-valued function $X$ on $\Omega$, i.e. $X : \Omega \to \mathbb{R}$. If such a function is measurable it is called a random variable.

Definition B.3.2. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A random variable (vector) $X$ is a function $X : \Omega \to \mathbb{R}$ ($X : \Omega \to \mathbb{R}^k$) such that $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}$ for all Borel sets $B \in \mathcal{B}(\mathbb{R})$ ($B \in \mathcal{B}(\mathbb{R}^k)$).

In particular we have for a random variable $X$ that $\{\omega \in \Omega : X(\omega) \le x\} \in \mathcal{F}$ for all $x \in \mathbb{R}$. Hence we can define the distribution function $F_X$ of $X$ by
$$F_X(x) := \mathbb{P}(\{\omega : X(\omega) \le x\}).$$
The smallest $\sigma$-algebra containing all the sets $\{\omega : X(\omega) \le x\}$ for all real $x$ (equivalently, $\{X < x\}$, $\{X \ge x\}$, $\{X > x\}$) is called the $\sigma$-algebra generated by $X$, written $\sigma(X)$. Thus,
$$X \text{ is } \mathcal{F}\text{-measurable (is a random variable) iff } \sigma(X) \subseteq \mathcal{F}.$$
The events in the $\sigma$-algebra generated by $X$ are the events $\{\omega : X(\omega) \in B\}$, where $B$ runs through the Borel $\sigma$-algebra on the line. When the (random) value $X(\omega)$ is known, we know which of these events have happened.

Interpretation. Think of $\sigma(X)$ as representing what we know when we know $X$, or in other words the information contained in $X$ (or in knowledge of $X$). This is reflected in the following result, due to J.L. Doob, which we quote:
$$\sigma(X) \subseteq \sigma(Y) \text{ if and only if } X = g(Y)$$
for some measurable function $g$. For, knowing $Y$ means we know $X := g(Y)$, but not vice versa, unless the function $g$ is one-to-one (injective), when the inverse function $g^{-1}$ exists, and we can go back via $Y = g^{-1}(X)$.
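Doob's result is easy to see on a finite sample space, where $\sigma(X)$ is determined by the partition of $\Omega$ into the level sets $\{X = x\}$. The sketch below is an illustration under that finiteness assumption; the helper names `sigma_atoms` and `is_coarser` are hypothetical. It takes a die roll $Y(\omega) = \omega$ and its parity $X = g(Y)$ with $g(y) = y \bmod 2$, and checks that $\sigma(X) \subseteq \sigma(Y)$ but not conversely.

```python
def sigma_atoms(omega, X):
    """Partition of a finite omega into level sets {X = x}: the atoms of sigma(X)."""
    atoms = {}
    for w in omega:
        atoms.setdefault(X(w), set()).add(w)
    return {frozenset(a) for a in atoms.values()}


def is_coarser(atoms_x, atoms_y):
    """True iff sigma(X) is contained in sigma(Y) (finite case).

    This holds exactly when every atom of sigma(Y) lies inside one atom of
    sigma(X), i.e. X is constant wherever Y is constant.
    """
    return all(any(ay <= ax for ax in atoms_x) for ay in atoms_y)


omega = range(1, 7)          # outcomes of one die roll
Y = lambda w: w              # the full outcome
X = lambda w: Y(w) % 2       # X = g(Y) with g(y) = y mod 2 (parity only)

if __name__ == "__main__":
    ax, ay = sigma_atoms(omega, X), sigma_atoms(omega, Y)
    print("atoms of sigma(X):", sorted(sorted(a) for a in ax))
    print("sigma(X) within sigma(Y):", is_coarser(ax, ay))
    print("sigma(Y) within sigma(X):", is_coarser(ay, ax))
```

Knowing $Y$ determines the parity $X$, so the first inclusion holds; $g$ is not injective, so the parity does not determine the roll and the second inclusion fails.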
Note. An extended discussion of generated $\sigma$-algebras in the finite case is given in Dothan's book Dothan (1990), Chapter 3. Although technically avoidable, this is useful preparation for the general case, needed for continuous time.

A measure (§2.1) determines an integral (§2.2). A probability measure $\mathbb{P}$, being a special kind of measure (a measure of total mass one), determines a special kind of integral, called an expectation.

Definition B.3.3. The expectation $\mathbb{E}$ of a random variable $X$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is defined by
$$\mathbb{E}X := \int_\Omega X\,d\mathbb{P}, \quad \text{or} \quad \int_\Omega X(\omega)\,d\mathbb{P}(\omega).$$
The expectation, also called the mean, describes the location of a distribution (and so is called a location parameter). Information about the scale of a distribution (the corresponding scale parameter) is obtained by considering the variance
$$\mathbb{V}ar(X) := \mathbb{E}\big[(X - \mathbb{E}(X))^2\big] = \mathbb{E}X^2 - (\mathbb{E}X)^2.$$
If $X$ is real-valued, say, with distribution function $F$, recall that $\mathbb{E}X$ is defined in your first course on probability by
$$\mathbb{E}X := \int x f(x)\,dx \quad \text{if } X \text{ has a density } f,$$
or if $X$ is discrete, taking values $x_n$ $(n = 1, 2, \ldots)$ with probability function $f(x_n)$ $(\ge 0)$ $\big(\sum_{x_n} f(x_n) = 1\big)$,
$$\mathbb{E}X := \sum x_n f(x_n).$$
These two formulae are the special cases (for the density and discrete cases) of the general formula
$$\mathbb{E}X := \int_{-\infty}^\infty x\,dF(x)$$
where the integral on the right is a Lebesgue-Stieltjes integral. This in turn agrees with the definition above, since if $F$ is the distribution function of $X$,
$$\int_\Omega X\,d\mathbb{P} = \int_{-\infty}^\infty x\,dF(x)$$
follows by the change of variable formula for the measure-theoretic integral, on applying the map $X : \Omega \to \mathbb{R}$ (we quote this: see any book on measure theory, e.g. Dudley (1989)).

Clearly the expectation operator $\mathbb{E}$ is linear. It even becomes multiplicative if we consider independent random variables.

Definition B.3.4. Random variables $X_1, \ldots, X_n$ are independent if whenever $A_i \in \mathcal{B}$ for $i = 1, \ldots, n$ we have
$$\mathbb{P}\Big(\bigcap_{i=1}^n \{X_i \in A_i\}\Big) = \prod_{i=1}^n \mathbb{P}(\{X_i \in A_i\}).$$
Using Lemma B.1.1 we can give a more tractable condition for independence:

Lemma B.3.1. In order for $X_1, \ldots, X_n$ to be independent it is necessary and sufficient that for all $x_1, \ldots, x_n \in (-\infty, \infty]$,
$$\mathbb{P}\Big(\bigcap_{i=1}^n \{X_i \le x_i\}\Big) = \prod_{i=1}^n \mathbb{P}(\{X_i \le x_i\}).$$
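The definitions of expectation, variance and independence can be checked exactly on a small finite probability space. The following sketch is illustrative only: the sample space (two fair dice), the names `expectation`, `variance`, `prob` and `independent`, and the use of exact fractions are assumptions made for the example. Independence is tested through the half-line criterion of Lemma B.3.1, which for discrete variables only needs to be checked at the attained values.

```python
from fractions import Fraction
from itertools import product

# Finite probability space: two fair dice, uniform measure on 36 outcomes.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}


def expectation(X):
    """E X = integral of X dP, which is a finite sum here."""
    return sum(X(w) * P[w] for w in omega)


def variance(X):
    """Var(X) = E[(X - EX)^2]."""
    m = expectation(X)
    return expectation(lambda w: (X(w) - m) ** 2)


def prob(event):
    """P(A) for the event A = {w : event(w) is true}."""
    return sum(P[w] for w in omega if event(w))


def independent(X, Y):
    """Check P(X <= x, Y <= y) = P(X <= x) P(Y <= y) at all attained values."""
    xs = {X(w) for w in omega}
    ys = {Y(w) for w in omega}
    return all(
        prob(lambda w: X(w) <= x and Y(w) <= y)
        == prob(lambda w: X(w) <= x) * prob(lambda w: Y(w) <= y)
        for x in xs
        for y in ys
    )


X1 = lambda w: w[0]          # first die
X2 = lambda w: w[1]          # second die
S = lambda w: w[0] + w[1]    # their sum, which depends on X1

if __name__ == "__main__":
    print("E X1 =", expectation(X1), " Var X1 =", variance(X1))
    print("X1, X2 independent:", independent(X1, X2))
    print("X1, S independent:", independent(X1, S))
    print("E(X1 X2) =", expectation(lambda w: X1(w) * X2(w)))
```

Exact rational arithmetic avoids rounding issues in the equality tests; for the independent pair the expectation of the product equals the product of the expectations.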
Now using the usual measure-theoretic steps (going from simple to integrable functions) it is easy to show:

Theorem B.3.1 (Multiplication Theorem). If $X_1, \ldots, X_n$ are independent and $\mathbb{E}|X_i| < \infty$, $i = 1, \ldots, n$, then
$$\mathbb{E}\Big(\prod_{i=1}^n X_i\Big) = \prod_{i=1}^n \mathbb{E}(X_i).$$
We now review the distributions we will mainly use in our models of financial markets.

Examples.

(i) Bernoulli distribution. Recall our arbitrage-pricing example from §1.4. There we were given a stock with price $S(0)$ at time $t = 0$. We assumed that after a period of time $\Delta t$ the stock price could have only one of two values, either $S(\Delta t) = e^u S(0)$ with probability $p$ or $S(\Delta t) = e^d S(0)$ with probability $1 - p$ ($u, d \in \mathbb{R}$). Let $R(\Delta t) = r(1)$ be a random variable modelling the logarithm of the stock return over the period $[0, \Delta t]$; then
$$\mathbb{P}(r(1) = u) = p \quad \text{and} \quad \mathbb{P}(r(1) = d) = 1 - p.$$
We say that $r(1)$ is distributed according to a Bernoulli distribution. Clearly $\mathbb{E}(r(1)) = up + d(1 - p)$ and $\mathbb{V}ar(r(1)) = u^2 p + d^2 (1 - p) - (\mathbb{E}\,r(1))^2$.
The standard case of a Bernoulli distribution is given by choosing $u = 1$, $d = 0$ (which is not a very useful choice in financial modelling).

(ii) Binomial distribution. If we consider the logarithm of the stock return over $n$ periods (of equal length), say over $[0, T]$, then subdividing into the periods $1, \ldots, n$ we have
$$R(T) = \log\frac{S(T)}{S(0)} = \log\Big[\frac{S(T)}{S(T - \Delta t)} \cdots \frac{S(\Delta t)}{S(0)}\Big] = \log\frac{S(T)}{S(T - \Delta t)} + \ldots + \log\frac{S(\Delta t)}{S(0)} = r(n) + \ldots + r(1).$$
Assuming that $r(i)$, $i = 1, \ldots, n$ are independent and each $r(i)$ is Bernoulli distributed as above, we have that $R(T) = \sum_{i=1}^n r(i)$ is binomially distributed. Linearity of the expectation operator and independence yield $\mathbb{E}(R(T)) = \sum_{i=1}^n \mathbb{E}(r(i))$ and $\mathbb{V}ar(R(T)) = \sum_{i=1}^n \mathbb{V}ar(r(i))$.
Again for the standard case one would use $u = 1$, $d = 0$. The shorthand notation for a binomial random variable $X$ is then $X \sim B(n, p)$ and we can compute
$$\mathbb{P}(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}, \qquad \mathbb{E}(X) = np, \qquad \mathbb{V}ar(X) = np(1 - p).$$

(iii) Normal distribution. As we will show in the sequel the limit of a sequence of appropriately normalised binomial distributions is the (standard) normal distribution. We say a random variable $X$ is normally distributed with parameters $\mu, \sigma^2$, in short $X \sim N(\mu, \sigma^2)$, if $X$ has density function
$$f_{\mu,\sigma^2}(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\Big\{-\frac{1}{2}\Big(\frac{x - \mu}{\sigma}\Big)^2\Big\}.$$
One can show that $\mathbb{E}(X) = \mu$ and $\mathbb{V}ar(X) = \sigma^2$, and thus a normally distributed random variable is fully described by knowledge of its mean and variance.
Returning to the above example, one of the key results of this text will be that the limiting model of a sequence of financial markets with one-period asset returns modelled by a Bernoulli distribution is a model where the distribution of the logarithms of instantaneous asset returns is normal. That means $S(t + \Delta t)/S(t)$ is lognormally distributed (i.e. $\log(S(t + \Delta t)/S(t))$ is normally distributed). Although rejected by many empirical studies (see Eberlein and Keller (1995) for a
recent overview), such a model seems to be the standard in use among financial practitioners (and we will call it the standard model in the following). The main arguments against using normally distributed random variables for modelling log-returns (i.e. lognormal distributions for returns) are asymmetry and (semi-) heavy tails. We know that distributions of financial asset returns are generally rather close to being symmetric around zero, but there is a definite tendency towards asymmetry. This may be explained by the fact that the markets react differently to positive as opposed to negative information (see Shephard (1996) §1.3.4). Since the normal distribution is symmetric it is not possible to incorporate this empirical fact in the standard model. Financial time series also suggest modelling by probability distributions whose densities behave for $x \to \pm\infty$ as
$$|x|^\rho \exp\{-\sigma |x|\}$$
with $\rho \in \mathbb{R}$, $\sigma > 0$. This means that we should replace the normal distribution with a distribution with heavier tails. Such a model would exhibit higher probabilities of extreme events, and the passage from ordinary observations (around the mean) to extreme observations would be more sudden. Among the suggested (classes of) distributions to be used to address these facts is the class of hyperbolic distributions (see Eberlein and Keller (1995) and §2.12 below), and more general distributions of normal inverse Gaussian type (see Barndorff-Nielsen (1998), Rydberg (1996), Rydberg (1997)) appear to be very promising.

(iv) Poisson distribution. Sometimes we want to incorporate in our model of financial markets the possibility of sudden jumps. Using the standard model we model the asset price process by a continuous stochastic process, so we need an additional process generating the jumps. To do this we use point processes in general and the Poisson process in particular. For a Poisson process the probabilities of a jump (and no jump respectively) during a small interval $\Delta t$ are approximately
$$\mathbb{P}(\nu(1) = 1) \approx \lambda \Delta t \quad \text{and} \quad \mathbb{P}(\nu(1) = 0) \approx 1 - \lambda \Delta t,$$
where $\lambda$ is a positive constant called the rate or intensity. Modelling small intervals in such a way we get for the number of jumps $N(T) = \nu(1) + \ldots + \nu(n)$ in the interval $[0, T]$ the probability function
$$\mathbb{P}(N(T) = k) = \frac{e^{-\lambda T} (\lambda T)^k}{k!}, \qquad k = 0, 1, \ldots,$$
and we say the process $N(T)$ has a Poisson distribution with parameter $\lambda T$. We can show $\mathbb{E}(N(T)) = \lambda T$ and $\mathbb{V}ar(N(T)) = \lambda T$.

Glossary. Table B.1 summarises the two parallel languages, measure-theoretic and probabilistic, which we have established.

Measure                    Probability
Integral                   Expectation
Measurable set             Event
Measurable function        Random variable
Almost-everywhere (a.e.)   Almost-surely (a.s.)

Table B.1: Measure-theoretic and probabilistic languages

B.4 Equivalent Measures and Radon-Nikodým Derivatives

Given two measures $\mathbb{P}$ and $\mathbb{Q}$ defined on the same $\sigma$-algebra $\mathcal{F}$, we say that $\mathbb{P}$ is absolutely continuous with respect to $\mathbb{Q}$, written $\mathbb{P} \ll \mathbb{Q}$,
if $\mathbb{P}(A) = 0$ whenever $\mathbb{Q}(A) = 0$, $A \in \mathcal{F}$. We quote from measure theory the vitally important Radon-Nikodým theorem:

Theorem B.4.1 (Radon-Nikodým). $\mathbb{P} \ll \mathbb{Q}$ iff there exists a ($\mathcal{F}$-) measurable function $f$ such that
$$\mathbb{P}(A) = \int_A f\,d\mathbb{Q} \qquad \forall A \in \mathcal{F}.$$
(Note that since the integral of anything over a null set is zero, any $\mathbb{P}$ so representable is certainly absolutely continuous with respect to $\mathbb{Q}$; the point is that the converse holds.)

Since $\mathbb{P}(A) = \int_A d\mathbb{P}$, this says that $\int_A d\mathbb{P} = \int_A f\,d\mathbb{Q}$ for all $A \in \mathcal{F}$. By analogy with the chain rule of ordinary calculus, we write $d\mathbb{P}/d\mathbb{Q}$ for $f$; then
$$\int_A d\mathbb{P} = \int_A \frac{d\mathbb{P}}{d\mathbb{Q}}\,d\mathbb{Q} \qquad \forall A \in \mathcal{F}.$$
Symbolically,
$$\text{if } \mathbb{P} \ll \mathbb{Q}, \qquad d\mathbb{P} = \frac{d\mathbb{P}}{d\mathbb{Q}}\,d\mathbb{Q}.$$
The measurable function (random variable) $d\mathbb{P}/d\mathbb{Q}$ is called the Radon-Nikodým derivative (RN-derivative) of $\mathbb{P}$ with respect to $\mathbb{Q}$.

If $\mathbb{P} \ll \mathbb{Q}$ and also $\mathbb{Q} \ll \mathbb{P}$, we call $\mathbb{P}$ and $\mathbb{Q}$ equivalent measures, written $\mathbb{P} \sim \mathbb{Q}$. Then $d\mathbb{P}/d\mathbb{Q}$ and $d\mathbb{Q}/d\mathbb{P}$ both exist, and
$$\frac{d\mathbb{P}}{d\mathbb{Q}} = 1 \Big/ \frac{d\mathbb{Q}}{d\mathbb{P}}.$$
For $\mathbb{P} \sim \mathbb{Q}$, $\mathbb{P}(A) = 0$ iff $\mathbb{Q}(A) = 0$: $\mathbb{P}$ and $\mathbb{Q}$ have the same null sets. Taking negations: $\mathbb{P} \sim \mathbb{Q}$ iff $\mathbb{P}, \mathbb{Q}$ have the same sets of positive measure. Taking complements: $\mathbb{P} \sim \mathbb{Q}$ iff $\mathbb{P}, \mathbb{Q}$ have the same sets of probability one (the same a.s. sets). Thus the following are equivalent:
$$\mathbb{P} \sim \mathbb{Q} \quad \text{iff} \quad \mathbb{P}, \mathbb{Q} \text{ have the same null sets},$$
$$\text{iff} \quad \mathbb{P}, \mathbb{Q} \text{ have the same a.s. sets},$$
$$\text{iff} \quad \mathbb{P}, \mathbb{Q} \text{ have the same sets of positive measure}.$$
Far from being an abstract theoretical result, the Radon-Nikodým theorem is of key practical importance, in two ways:
(a) It is the key to the concept of conditioning (§2.5, §2.6 below), which is of central importance throughout.
(b) The concept of equivalent measures is central to the key idea of mathematical finance, risk-neutrality, and hence to its main results, the Black-Scholes formula, fundamental theorem of asset pricing, etc. The key to all this is that prices should be the discounted expected values under an equivalent martingale measure. Thus equivalent measures, and the operation of change of measure, are of central economic and financial importance. We shall return to this later in connection with the main mathematical result on change of measure, Girsanov's theorem (see §5.7).

B.5 Conditional expectation

For basic events define
$$\mathbb{P}(A|B) := \mathbb{P}(A \cap B)/\mathbb{P}(B) \quad \text{if } \mathbb{P}(B) > 0. \tag{B.1}$$
From this definition, we get the multiplication rule
$$\mathbb{P}(A \cap B) = \mathbb{P}(A|B)\,\mathbb{P}(B).$$
Using the partition equation $\mathbb{P}(B) = \sum_n \mathbb{P}(B|A_n)\,\mathbb{P}(A_n)$ with $(A_n)$ a finite or countable partition of $\Omega$, we get the Bayes rule
$$\mathbb{P}(A_i|B) = \frac{\mathbb{P}(A_i)\,\mathbb{P}(B|A_i)}{\sum_j \mathbb{P}(A_j)\,\mathbb{P}(B|A_j)}.$$
We can always write $\mathbb{P}(A) = \mathbb{E}(1_A)$ with $1_A(\omega) = 1$ if $\omega \in A$ and $1_A(\omega) = 0$ otherwise. Then the above can be written
$$\mathbb{E}(1_A|B) = \frac{\mathbb{E}(1_A 1_B)}{\mathbb{P}(B)}. \tag{B.2}$$
This suggests defining, for suitable random variables $X$, the $\mathbb{P}$-average of $X$ over $B$ as
$$\mathbb{E}(X|B) = \frac{\mathbb{E}(X 1_B)}{\mathbb{P}(B)}. \tag{B.3}$$
Consider now discrete random variables $X$ and $Y$. Assume $X$ takes values $x_1, \ldots, x_m$ with probabilities $f_1(x_i) > 0$, $Y$ takes values $y_1, \ldots, y_n$ with probabilities $f_2(y_j) > 0$, while the vector $(X, Y)$ takes values $(x_i, y_j)$ with probabilities $f(x_i, y_j) > 0$. Then the marginal distributions are
$$f_1(x_i) = \sum_{j=1}^n f(x_i, y_j) \quad \text{and} \quad f_2(y_j) = \sum_{i=1}^m f(x_i, y_j).$$
We can use the standard definition above for the events $\{Y = y_j\}$ and $\{X = x_i\}$ to get
$$\mathbb{P}(Y = y_j | X = x_i) = \frac{\mathbb{P}(X = x_i, Y = y_j)}{\mathbb{P}(X = x_i)} = \frac{f(x_i, y_j)}{f_1(x_i)}.$$
Thus conditional on $X = x_i$ (given the information $X = x_i$), $Y$ takes on the values $y_1, \ldots, y_n$ with (conditional) probabilities
$$f_{Y|X}(y_j|x_i) = \frac{f(x_i, y_j)}{f_1(x_i)}.$$
So we can compute its expectation as usual:
$$\mathbb{E}(Y|X = x_i) = \sum_j y_j f_{Y|X}(y_j|x_i) = \frac{\sum_j y_j f(x_i, y_j)}{f_1(x_i)}.$$
Now define the random variable $Z = \mathbb{E}(Y|X)$, the conditional expectation of $Y$ given $X$, as follows:
$$\text{if } X(\omega) = x_i, \text{ then } Z(\omega) = \mathbb{E}(Y|X = x_i) = z_i \text{ (say)}.$$
Observe that in this case $Z$ is given by a 'nice' function of $X$. However, a more abstract property also holds true. Since $Z$ is constant on the sets $\{X = x_i\}$ it is $\sigma(X)$-measurable (these sets generate the $\sigma$-algebra). Furthermore
$$\int_{\{X = x_i\}} Z\,d\mathbb{P} = z_i\,\mathbb{P}(X = x_i) = \sum_j y_j f_{Y|X}(y_j|x_i)\,\mathbb{P}(X = x_i) = \sum_j y_j\,\mathbb{P}(Y = y_j; X = x_i) = \int_{\{X = x_i\}} Y\,d\mathbb{P}.$$
Since the $\{X = x_i\}$ generate $\sigma(X)$, this implies
$$\int_G Z\,d\mathbb{P} = \int_G Y\,d\mathbb{P} \qquad \forall\, G \in \sigma(X).$$
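The discrete construction above can be traced step by step on a small joint distribution. The sketch below is an illustration only: the joint probability table `joint` and the helper names are made-up assumptions, not data from the notes. It computes $z_i = \mathbb{E}(Y|X = x_i)$ from $f(x_i, y_j)$ and $f_1(x_i)$ and verifies the defining property $\int_{\{X = x_i\}} Z\,d\mathbb{P} = \int_{\{X = x_i\}} Y\,d\mathbb{P}$ with exact fractions.

```python
from fractions import Fraction

# Hypothetical joint probability function f(x_i, y_j); the entries sum to 1.
joint = {
    (0, 1): Fraction(1, 8),
    (0, 2): Fraction(3, 8),
    (1, 1): Fraction(1, 4),
    (1, 2): Fraction(1, 4),
}


def marginal_x(x):
    """f_1(x) = sum_j f(x, y_j)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def cond_exp_y_given_x(x):
    """E(Y | X = x) = sum_j y_j f(x, y_j) / f_1(x)."""
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / marginal_x(x)


def integral_over_x_event(x, integrand):
    """Integral of integrand(X, Y) dP over the event {X = x}."""
    return sum(integrand(xi, y) * p for (xi, y), p in joint.items() if xi == x)


if __name__ == "__main__":
    for x in (0, 1):
        z = cond_exp_y_given_x(x)
        lhs = integral_over_x_event(x, lambda xi, y: cond_exp_y_given_x(xi))
        rhs = integral_over_x_event(x, lambda xi, y: y)
        print("x =", x, " E(Y|X=x) =", z, " int Z dP =", lhs, " int Y dP =", rhs)
```

Summing the identity over all $x_i$ gives $\mathbb{E}Z = \mathbb{E}Y$, the special case $G = \Omega$ of the last display.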
Density case. If the random vector $(X, Y)$ has density $f(x, y)$, then $X$ has (marginal) density $f_1(x) := \int_{-\infty}^\infty f(x, y)\,dy$, $Y$ has (marginal) density $f_2(y) := \int_{-\infty}^\infty f(x, y)\,dx$. The conditional density of $Y$ given $X = x$ is:
$$f_{Y|X}(y|x) := \frac{f(x, y)}{f_1(x)}.$$
Its expectation is
$$\mathbb{E}(Y|X = x) = \int_{-\infty}^\infty y f_{Y|X}(y|x)\,dy = \frac{\int_{-\infty}^\infty y f(x, y)\,dy}{f_1(x)}.$$
So we define
$$c(x) = \begin{cases} \mathbb{E}(Y|X = x) & \text{if } f_1(x) > 0, \\ 0 & \text{if } f_1(x) = 0, \end{cases}$$
and call $c(X)$ the conditional expectation of $Y$ given $X$, denoted by $\mathbb{E}(Y|X)$. Observe that on sets with probability zero (i.e. $\{\omega : X(\omega) = x,\ f_1(x) = 0\}$) the choice of $c(x)$ is arbitrary, hence $\mathbb{E}(Y|X)$ is only defined up to a set of probability zero; we speak of different versions in such cases. With this definition we again find
$$\int_G c(X)\,d\mathbb{P} = \int_G Y\,d\mathbb{P} \qquad \forall\, G \in \sigma(X).$$
Indeed, for sets $G$ with $G = \{\omega : X(\omega) \in B\}$ with $B$ a Borel set, we find by Fubini's theorem
$$\int_G c(X)\,d\mathbb{P} = \int_{-\infty}^\infty 1_B(x)\,c(x)\,f_1(x)\,dx = \int_{-\infty}^\infty 1_B(x)\,f_1(x) \int_{-\infty}^\infty y f_{Y|X}(y|x)\,dy\,dx = \int_{-\infty}^\infty \int_{-\infty}^\infty 1_B(x)\,y f(x, y)\,dy\,dx = \int_G Y\,d\mathbb{P}.$$
Now these sets $G$ generate $\sigma(X)$ and by a standard technique (the $\pi$-systems lemma, see Williams (2001), §2.3) the claim is true for all $G \in \sigma(X)$.

Example. Bivariate Normal Distribution, $N(\mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \rho)$.
$$\mathbb{E}(Y|X = x) = \mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}\,(x - \mu_1),$$
the familiar regression line of statistics (linear model); see Exercise 2.6.

General case. Here, we follow Kolmogorov's construction using the Radon-Nikodým theorem. Suppose that $\mathcal{G}$ is a sub-$\sigma$-algebra of $\mathcal{F}$, $\mathcal{G} \subset \mathcal{F}$. If $Y$ is a non-negative random variable with $\mathbb{E}Y < \infty$, then
$$\mathbb{Q}(G) := \int_G Y\,d\mathbb{P} \qquad (G \in \mathcal{G})$$
is non-negative, $\sigma$-additive (because
$$\int_G Y\,d\mathbb{P} = \sum_n \int_{G_n} Y\,d\mathbb{P}$$
if $G = \bigcup_n G_n$, $G_n$ disjoint) and defined on the $\sigma$-algebra $\mathcal{G}$, so it is a measure on $\mathcal{G}$.
If $\mathbb{P}(G) = 0$, then $\mathbb{Q}(G) = 0$ also (the integral of anything over a null set is zero), so $\mathbb{Q} \ll \mathbb{P}$.

By the Radon-Nikodým theorem, there exists a Radon-Nikodým derivative of $\mathbb{Q}$ with respect to $\mathbb{P}$ on $\mathcal{G}$, which is $\mathcal{G}$-measurable. Following Kolmogorov, we call this Radon-Nikodým derivative the conditional expectation of $Y$ given (or conditional on) $\mathcal{G}$, $\mathbb{E}(Y|\mathcal{G})$, whose existence we now have established. For $Y$ that changes sign, split into $Y = Y^+ - Y^-$, and define $\mathbb{E}(Y|\mathcal{G}) := \mathbb{E}(Y^+|\mathcal{G}) - \mathbb{E}(Y^-|\mathcal{G})$. We summarize:

Definition B.5.1. Let $Y$ be a random variable with $\mathbb{E}(|Y|) < \infty$ and $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{F}$. We call a random variable $Z$ a version of the conditional expectation $\mathbb{E}(Y|\mathcal{G})$ of $Y$ given $\mathcal{G}$, and write $Z = \mathbb{E}(Y|\mathcal{G})$, a.s., if
(i) $Z$ is $\mathcal{G}$-measurable;
(ii) $\mathbb{E}(|Z|) < \infty$;
(iii) for every set $G$ in $\mathcal{G}$, we have
$$\int_G Y\,d\mathbb{P} = \int_G Z\,d\mathbb{P} \qquad \forall G \in \mathcal{G}. \tag{B.4}$$

Notation. Suppose $\mathcal{G} = \sigma(X_1, \ldots, X_n)$. Then
$$\mathbb{E}(Y|\mathcal{G}) = \mathbb{E}(Y|\sigma(X_1, \ldots, X_n)) =: \mathbb{E}(Y|X_1, \ldots, X_n),$$
and one can compare the general case with the motivating examples above.

To see the intuition behind conditional expectation, consider the following situation. Assume an experiment has been performed, i.e. $\omega \in \Omega$ has been realized. However, the only information we have is the set of values $X(\omega)$ for every $\mathcal{G}$-measurable random variable $X$. Then $Z(\omega) = \mathbb{E}(Y|\mathcal{G})(\omega)$ is the expected value of $Y(\omega)$ given this information.

We used the traditional approach to define conditional expectation via the Radon-Nikodým theorem. Alternatively, one can use Hilbert space projection theory (Neveu (1975) and Jacod and Protter (2000) follow this route). Indeed, for $Y \in L^2(\Omega, \mathcal{F}, \mathbb{P})$ one can show that the conditional expectation $Z = \mathbb{E}(Y|\mathcal{G})$ is the least-squares-best $\mathcal{G}$-measurable predictor of $Y$: amongst all $\mathcal{G}$-measurable random variables it minimises the quadratic distance, i.e.
$$\mathbb{E}[(Y - \mathbb{E}(Y|\mathcal{G}))^2] = \min\{\mathbb{E}[(Y - X)^2] : X\ \mathcal{G}\text{-measurable}\}.$$

Note.
1. To check that something is a conditional expectation: we have to check that it integrates the right way over the right sets (i.e., as in (B.4)).
2. From (B.4): if two things integrate the same way over all sets $B \in \mathcal{G}$, they have the same conditional expectation given $\mathcal{G}$.
3. For notational convenience, we shall pass between $\mathbb{E}(Y|\mathcal{G})$ and $\mathbb{E}_{\mathcal{G}} Y$ at will.
4. The conditional expectation thus defined coincides with any we may have already encountered, in regression or multivariate analysis, for example. However, this may not be immediately obvious. The conditional expectation defined above, via $\sigma$-algebras and the Radon-Nikodým theorem, is rightly called by Williams ((Williams 1991), p.84) 'the central definition of modern probability'. It may take a little getting used to. As with all important but non-obvious definitions, it proves its worth in action: see §2.6 below for properties of conditional expectations, and Chapter 3 for its use in studying stochastic processes, particularly martingales (which are defined in terms of conditional expectations).

We now discuss the fundamental properties of conditional expectation. From the definition linearity of conditional expectation follows from the linearity of the integral. Further properties are given by

Proposition B.5.1.
1. $\mathcal{G} = \{\emptyset, \Omega\}$, $\mathbb{E}(Y|\{\emptyset, \Omega\}) = \mathbb{E}Y$.
2. If $\mathcal{G} = \mathcal{F}$, $\mathbb{E}(Y|\mathcal{F}) = Y$ $\mathbb{P}$-a.s.
3. If $Y$ is $\mathcal{G}$-measurable, $\mathbb{E}(Y|\mathcal{G}) = Y$ $\mathbb{P}$-a.s.
4. Positivity. If $X \geq 0$, then $\mathbb{E}(X|\mathcal{G}) \geq 0$ $\mathbb{P}$-a.s.
5. Taking out what is known. If $Y$ is $\mathcal{G}$-measurable and bounded, $\mathbb{E}(YZ|\mathcal{G}) = Y\,\mathbb{E}(Z|\mathcal{G})$ $\mathbb{P}$-a.s.
6. Tower property. If $\mathcal{G}_0 \subset \mathcal{G}$, $\mathbb{E}[\mathbb{E}(Y|\mathcal{G})|\mathcal{G}_0] = \mathbb{E}[Y|\mathcal{G}_0]$ a.s.
7. Conditional mean formula. $\mathbb{E}[\mathbb{E}(Y|\mathcal{G})] = \mathbb{E}Y$ $\mathbb{P}$-a.s.
8. Role of independence. If $Y$ is independent of $\mathcal{G}$, $\mathbb{E}(Y|\mathcal{G}) = \mathbb{E}Y$ a.s.
9. Conditional Jensen formula. If $c : \mathbb{R} \to \mathbb{R}$ is convex, and $\mathbb{E}|c(X)| < \infty$, then
\[ \mathbb{E}(c(X)|\mathcal{G}) \geq c\,(\mathbb{E}(X|\mathcal{G})). \]

Proof.
1. Here $\mathcal{G} = \{\emptyset, \Omega\}$ is the smallest possible σ-algebra (any σ-algebra of subsets of $\Omega$ contains $\emptyset$ and $\Omega$), and represents 'knowing nothing'. We have to check (B.4) for $G = \emptyset$ and $G = \Omega$. For $G = \emptyset$ both sides are zero; for $G = \Omega$ both sides are $\mathbb{E}Y$.

2. Here $\mathcal{G} = \mathcal{F}$ is the largest possible σ-algebra, and represents 'knowing everything'. We have to check (B.4) for all sets $G \in \mathcal{F}$. The only integrand that integrates like $Y$ over all sets is $Y$ itself, or a function agreeing with $Y$ except on a set of measure zero.
Note. When we condition on $\mathcal{F}$ ('knowing everything'), we know $Y$ (because we know everything). There is thus no uncertainty left in $Y$ to average out, so taking the conditional expectation (averaging out remaining randomness) has no effect, and leaves $Y$ unaltered.

3. Recall that $Y$ is always $\mathcal{F}$-measurable (this is the definition of $Y$ being a random variable). For $\mathcal{G} \subset \mathcal{F}$, $Y$ may not be $\mathcal{G}$-measurable, but if it is, the proof above applies with $\mathcal{G}$ in place of $\mathcal{F}$.
Note. To say that $Y$ is $\mathcal{G}$-measurable is to say that $Y$ is known given $\mathcal{G}$ – that is, when we are conditioning on $\mathcal{G}$. Then $Y$ is no longer random (being known when $\mathcal{G}$ is given), and so counts as a constant when the conditioning is performed.

4. Let $Z$ be a version of $\mathbb{E}(X|\mathcal{G})$. If $\mathbb{P}(Z < 0) > 0$, then for some $n$, the set
\[ G := \{Z < -n^{-1}\} \in \mathcal{G} \quad \text{and} \quad \mathbb{P}(\{Z < -n^{-1}\}) > 0. \]
Thus
\[ 0 \leq \mathbb{E}(X 1_G) = \mathbb{E}(Z 1_G) < -n^{-1}\,\mathbb{P}(G) < 0, \]
which contradicts the positivity of $X$.

5. First, consider the case when $Y$ is discrete. Then $Y$ can be written as
\[ Y = \sum_{n=1}^{N} b_n 1_{B_n}, \]
for constants $b_n$ and events $B_n \in \mathcal{G}$. Then for any $B \in \mathcal{G}$, $B \cap B_n \in \mathcal{G}$ also (as $\mathcal{G}$ is a σ-algebra), and using linearity and (B.4):
\[ \int_B Y\,\mathbb{E}(Z|\mathcal{G})\,d\mathbb{P} = \int_B \Big(\sum_{n=1}^{N} b_n 1_{B_n}\Big)\mathbb{E}(Z|\mathcal{G})\,d\mathbb{P} = \sum_{n=1}^{N} b_n \int_{B \cap B_n} \mathbb{E}(Z|\mathcal{G})\,d\mathbb{P} = \sum_{n=1}^{N} b_n \int_{B \cap B_n} Z\,d\mathbb{P} = \int_B \sum_{n=1}^{N} b_n 1_{B_n} Z\,d\mathbb{P} = \int_B YZ\,d\mathbb{P}. \]
Since this holds for all $B \in \mathcal{G}$, the result holds by (B.4). For the general case, we approximate to a general random variable $Y$ by a sequence of discrete random variables $Y_n$, for each of which the result holds as just proved. We omit details of the proof here, which involves the standard approximation steps based on the monotone convergence theorem from measure theory (see e.g. (Williams 1991), p.90, proof of (j)). We are thus left to show that $\mathbb{E}(|ZY|) < \infty$, which follows from the assumption that $Y$ is bounded and $Z \in L^1$.

6. $\mathbb{E}_{\mathcal{G}_0} \mathbb{E}_{\mathcal{G}} Y$ is $\mathcal{G}_0$-measurable, and for $C \in \mathcal{G}_0 \subset \mathcal{G}$, using the definition of $\mathbb{E}_{\mathcal{G}_0}$, $\mathbb{E}_{\mathcal{G}}$:
\[ \int_C \mathbb{E}_{\mathcal{G}_0}[\mathbb{E}_{\mathcal{G}} Y]\,d\mathbb{P} = \int_C \mathbb{E}_{\mathcal{G}} Y\,d\mathbb{P} = \int_C Y\,d\mathbb{P}. \]
So $\mathbb{E}_{\mathcal{G}_0}[\mathbb{E}_{\mathcal{G}} Y]$ satisfies the defining relation for $\mathbb{E}_{\mathcal{G}_0} Y$. Being also $\mathcal{G}_0$-measurable, it is $\mathbb{E}_{\mathcal{G}_0} Y$ (a.s.).

We also have:
6'. If $\mathcal{G}_0 \subset \mathcal{G}$, $\mathbb{E}[\mathbb{E}(Y|\mathcal{G}_0)|\mathcal{G}] = \mathbb{E}[Y|\mathcal{G}_0]$ a.s.
Proof. $\mathbb{E}[Y|\mathcal{G}_0]$ is $\mathcal{G}_0$-measurable, so $\mathcal{G}$-measurable as $\mathcal{G}_0 \subset \mathcal{G}$, so $\mathbb{E}[\,\cdot\,|\mathcal{G}]$ has no effect on it, by 3.
Note. 6, 6' are the two forms of the iterated conditional expectations property. When conditioning on two σ-algebras, one larger (finer), one smaller (coarser), the coarser rubs out the effect of the finer, either way round. This may be thought of as the coarse-averaging property: we shall use this term interchangeably with the iterated conditional expectations property (Williams (1991) uses the term tower property).

7. Take $\mathcal{G}_0 = \{\emptyset, \Omega\}$ in 6. and use 1.

8. If $Y$ is independent of $\mathcal{G}$, $Y$ is independent of $1_B$ for every $B \in \mathcal{G}$. So by (B.4) and linearity,
\[ \int_B \mathbb{E}(Y|\mathcal{G})\,d\mathbb{P} = \int_B Y\,d\mathbb{P} = \int_\Omega 1_B Y\,d\mathbb{P} = \mathbb{E}(1_B Y) = \mathbb{E}(1_B)\,\mathbb{E}(Y) = \int_B \mathbb{E}Y\,d\mathbb{P}, \]
using the multiplication theorem for independent random variables. Since this holds for all $B \in \mathcal{G}$, the result follows by (B.4).

9. Recall (see e.g. Williams (1991), §6.6a, §9.7h, §9.8h), that for every convex function there exists a countable sequence $((a_n, b_n))$ of points in $\mathbb{R}^2$ such that
\[ c(x) = \sup_n (a_n x + b_n), \quad x \in \mathbb{R}. \]
For each fixed $n$ we use 4. to see from $c(X) \geq a_n X + b_n$ that
\[ \mathbb{E}[c(X)|\mathcal{G}] \geq a_n\,\mathbb{E}(X|\mathcal{G}) + b_n. \]
So,
\[ \mathbb{E}[c(X)|\mathcal{G}] \geq \sup_n (a_n\,\mathbb{E}(X|\mathcal{G}) + b_n) = c\,(\mathbb{E}(X|\mathcal{G})). \]

Remark B.5.1. If in 6, 6' we take $\mathcal{G} = \mathcal{G}_0$, we obtain:
\[ \mathbb{E}[\mathbb{E}(X|\mathcal{G})|\mathcal{G}] = \mathbb{E}[X|\mathcal{G}]. \]
Thus the map $X \to \mathbb{E}(X|\mathcal{G})$ is idempotent: applying it twice is the same as applying it once. Hence we may identify the conditional expectation operator as a projection.
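The defining relation (B.4) and properties 5, 6 and 7 can be checked by hand on a finite sample space, where a σ-algebra is described by the partition that generates it. The following is a minimal sketch under stated assumptions: the sample space (three fair coin tosses), the two partitions, and the helper names `cond_exp`, `expect`, `after_one` and `after_two` are all illustrative choices, not part of the notes.

```python
from fractions import Fraction

# Illustrative sample space: three fair coin tosses with the uniform measure.
omega = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
prob = {w: Fraction(1, 8) for w in omega}

def cond_exp(Y, key):
    """Return E(Y | G), where G is generated by the partition {key(w) = const}.

    On each block the value is (integral of Y over the block) / P(block),
    so the result is constant on blocks and satisfies (B.4) on each block.
    """
    mass, integral = {}, {}
    for w in omega:
        k = key(w)
        mass[k] = mass.get(k, 0) + prob[w]
        integral[k] = integral.get(k, 0) + Y[w] * prob[w]
    return {w: integral[key(w)] / mass[key(w)] for w in omega}

def expect(Y):
    """Plain expectation of Y under prob."""
    return sum(Y[w] * prob[w] for w in omega)

def after_one(w):
    return w[:1]   # coarser information G0: first toss only

def after_two(w):
    return w[:2]   # finer information G: first two tosses, so G0 is inside G

Z = {w: Fraction(sum(w) ** 2) for w in omega}       # (number of heads)^2
Y = {w: Fraction(w[0] + 2 * w[1]) for w in omega}   # bounded, G-measurable

# Property 6 (tower): E[E(Z|G)|G0] = E[Z|G0]
tower_lhs = cond_exp(cond_exp(Z, after_two), after_one)
tower_rhs = cond_exp(Z, after_one)

# Property 5 (taking out what is known): E(YZ|G) = Y E(Z|G)
Z_given_G = cond_exp(Z, after_two)
known_lhs = cond_exp({w: Y[w] * Z[w] for w in omega}, after_two)
known_rhs = {w: Y[w] * Z_given_G[w] for w in omega}

# Property 7 (conditional mean formula): E[E(Z|G)] = EZ
mean_lhs, mean_rhs = expect(Z_given_G), expect(Z)
```

Exact rational arithmetic is used so that the identities hold as equalities rather than up to rounding. On an infinite sample space partitions no longer suffice, which is why the notes work with σ-algebras and the Radon-Nikodým theorem instead.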
B.6 Modes of Convergence

So far, we have dealt with one probability measure – or its expectation operator – at a time. We shall, however, have many occasions to consider a whole sequence of them, converging (in a suitable sense) to some limiting probability measure. Such situations arise, for example, whenever we approximate a financial model in continuous time (such as the continuous-time Black-Scholes model of §6.2) by a sequence of models in discrete time (such as the discrete-time Black-Scholes model of §4.6).

In the stochastic-process setting – such as the passage from discrete to continuous Black-Scholes models mentioned above – we need concepts beyond those we have to hand, which we develop later. We confine ourselves here to setting out what we need to discuss convergence of random variables, in the various senses that are useful.

The first idea that occurs to one is to use the ordinary convergence concept in this new setting, of random variables: then if $X_n$, $X$ are random variables,
\[ X_n \to X \quad (n \to \infty) \]
would be taken literally – as if the $X_n$, $X$ were non-random. For instance, if $X_n$ is the observed frequency of heads in a long series of $n$ independent tosses of a fair coin, $X = 1/2$ the expected frequency, then the above in this case would be the man-in-the-street's idea of the 'law of averages'. It turns out that the above statement is false in this case, taken literally: some qualification is needed. However, the qualification needed is absolutely the minimal one imaginable: one merely needs to exclude a set of probability zero – that is, to assert convergence on a set of probability one ('almost surely'), rather than everywhere.

Definition B.6.1. If $X_n$, $X$ are random variables, we say $X_n$ converges to $X$ almost surely –
\[ X_n \to X \quad (n \to \infty) \quad \text{a.s.} \]
– if $X_n \to X$ with probability one – that is, if
\[ \mathbb{P}(\{\omega : X_n(\omega) \to X(\omega) \text{ as } n \to \infty\}) = 1. \]

The loose idea of the 'law of averages' has as its precise form a statement on convergence almost surely. This is Kolmogorov's strong law of large numbers (see e.g. (Williams 1991), §12.10), which is quite difficult to prove.

Weaker convergence concepts are also useful: they may hold under weaker conditions, or they may be easier to prove.

Definition B.6.2. If $X_n$, $X$ are random variables, we say that $X_n$ converges to $X$ in probability –
\[ X_n \to X \quad (n \to \infty) \quad \text{in probability} \]
– if, for all $\epsilon > 0$,
\[ \mathbb{P}(\{\omega : |X_n(\omega) - X(\omega)| > \epsilon\}) \to 0 \quad (n \to \infty). \]

It turns out that convergence almost surely implies convergence in probability, but not in general conversely. Thus almost-sure convergence is a stronger convergence concept than convergence in probability. This comparison is reflected in the form the 'law of averages' takes for convergence in probability: this is called the weak law of large numbers, which as its name implies is a weaker form of the strong law of large numbers. It is correspondingly much easier to prove: indeed, we shall prove it in §2.8 below.

Recall the $L^p$ spaces of $p$th-power integrable functions (§2.2). We similarly define the $L^p$ spaces of $p$th-power integrable random variables: if $p \geq 1$ and $X$ is a random variable with
\[ \|X\|_p := (\mathbb{E}|X|^p)^{1/p} < \infty, \]
we say that $X \in L^p$ (or $L^p(\Omega, \mathcal{F}, \mathbb{P})$ to be precise). For $X_n$, $X \in L^p$, there is a natural convergence concept: we say that $X_n$ converges to $X$ in $L^p$, or in $p$th mean,
\[ X_n \to X \quad \text{in } L^p, \]
if
\[ \|X_n - X\|_p \to 0 \quad (n \to \infty), \]
that is, if
\[ \mathbb{E}(|X_n - X|^p) \to 0 \quad (n \to \infty). \]
The cases $p = 1, 2$ are particularly important: if $X_n \to X$ in $L^1$, we say that $X_n \to X$ in mean; if $X_n \to X$ in $L^2$ we say that $X_n \to X$ in mean square. Convergence in $p$th mean is not directly comparable with convergence almost surely (of course, we have to restrict to random variables in $L^p$ for the comparison even to be meaningful): neither implies the other. Both, however, imply convergence in probability.

All the modes of convergence discussed so far involve the values of random variables. Often, however, it is only the distributions of random variables that matter. In such cases, the natural mode of convergence is the following:

Definition B.6.3. We say that random variables $X_n$ converge to $X$ in distribution if the distribution functions of $X_n$ converge to that of $X$ at all points of continuity of the latter:
\[ X_n \to X \quad \text{in distribution} \]
if
\[ \mathbb{P}(\{X_n \leq x\}) \to \mathbb{P}(\{X \leq x\}) \quad (n \to \infty) \]
for all points $x$ at which the right-hand side is continuous.

The restriction to continuity points $x$ of the limit seems awkward at first, but it is both natural and necessary. It is also quite weak: note that the function $x \to \mathbb{P}(\{X \leq x\})$, being monotone in $x$, is continuous except for at most countably many jumps. The set of continuity points is thus uncountable: 'most' points are continuity points.

Convergence in distribution is (by far) the weakest of the modes of convergence introduced so far: convergence in probability implies convergence in distribution, but not conversely. There is, however, a partial converse (which we shall need in §2.8): if the limit $X$ is constant (non-random), convergence in probability and in distribution are equivalent.

Weak Convergence. If $\mathbb{P}_n$, $\mathbb{P}$ are probability measures, we say that
\[ \mathbb{P}_n \to \mathbb{P} \quad (n \to \infty) \quad \text{weakly} \]
if
\[ \int f\,d\mathbb{P}_n \to \int f\,d\mathbb{P} \quad (n \to \infty) \qquad \text{(B.5)} \]
for all bounded continuous functions $f$. This definition is given a full-length book treatment in (Billingsley 1968), and we refer to this for background and details. For ordinary (real-valued) random variables, weak convergence of their probability measures is the same as convergence in distribution of their distribution functions. However, the weak-convergence definition above applies equally, not just to this one-dimensional case, or to the finite-dimensional (vector-valued) setting, but also to infinite-dimensional settings such as arise in convergence of stochastic processes. We shall need such a framework in the passage from discrete- to continuous-time Black-Scholes models.
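Returning to the coin-tossing example used to motivate the 'law of averages': convergence in probability of the observed frequency of heads to 1/2 can be seen numerically, because the probability in Definition B.6.2 is an exact binomial sum. The sketch below is illustrative only; the tolerance `eps = 1/10`, the sample sizes and the function name `deviation_prob` are assumptions made for the example, and the Chebyshev bound is included purely as a comparison.

```python
from fractions import Fraction
from math import comb

def deviation_prob(n, eps):
    """Exact P(|S_n / n - 1/2| > eps) for S_n ~ Binomial(n, 1/2)."""
    half = Fraction(1, 2)
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - half) > eps:
            total += Fraction(comb(n, k), 2 ** n)
    return total

eps = Fraction(1, 10)          # illustrative tolerance
sizes = (10, 100, 1000)        # illustrative numbers of tosses

# Probability that the observed frequency misses 1/2 by more than eps.
probs = [deviation_prob(n, eps) for n in sizes]

# Chebyshev's inequality: P(|S_n/n - 1/2| > eps) <= Var(S_n/n) / eps^2,
# and Var(S_n/n) = 1/(4n) for a fair coin.
bounds = [Fraction(1, 4 * n) / eps ** 2 for n in sizes]

for n, p, b in zip(sizes, probs, bounds):
    print(n, float(p), float(b))
```

The computed probabilities shrink as `n` grows and stay below the Chebyshev bounds, which is the content of the weak law in this special case. This says nothing about almost-sure convergence, which concerns whole sequences of tosses rather than one `n` at a time.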
Appendix C
Stochastic Processes in Discrete Time

C.1 Information and Filtrations

Access to full, accurate, up-to-date information is clearly essential to anyone actively engaged in financial activity or trading. Indeed, information is arguably the most important determinant of success in financial life. Partly for simplicity, partly to reflect the legislation and regulations against insider trading, we shall confine ourselves to the situation where agents take decisions on the basis of information in the public domain, and available to all. We shall further assume that information once known remains known – is not forgotten – and can be accessed in real time.

In reality, of course, matters are more complicated. Information overload is as much of a danger as information scarcity. The ability to retain information, organise it, and access it quickly, is one of the main factors which will discriminate between the abilities of different economic agents to react to changing market conditions. However, we restrict ourselves here to the simplest possible situation and do not differentiate between agents on the basis of their information-processing abilities. Thus as time passes, new information becomes available to all agents, who continually update their information. What we need is a mathematical language to model this information flow, unfolding with time. This is provided by the idea of a filtration; we outline below the elements of this theory that we shall need.

The Kolmogorov triples $(\Omega, \mathcal{F}, \mathbb{P})$, and the Kolmogorov conditional expectations $\mathbb{E}(X|\mathcal{B})$, give us all the machinery we need to handle static situations involving randomness. To handle dynamic situations, involving randomness which unfolds with time, we need further structure.

We may take the initial, or starting, time as $t = 0$. Time may evolve discretely, or continuously. We postpone the continuous case to Chapter 5; in the discrete case, we may suppose time evolves in integer steps, $t = 0, 1, 2, \ldots$ (say, stock-market quotations daily, or tick data by the second). There may be a final time $T$, or time horizon, or we may have an infinite time horizon (in the context of option pricing, the time horizon $T$ is the expiry time).

We wish to model a situation involving randomness unfolding with time. As above, we suppose, for simplicity, that information is never lost (or forgotten): thus, as time increases we learn more. We recall from Chapter 2 that σ-algebras represent information or knowledge. We thus need a sequence of σ-algebras $\{\mathcal{F}_n : n = 0, 1, 2, \ldots\}$, which are increasing:
\[ \mathcal{F}_n \subset \mathcal{F}_{n+1} \quad (n = 0, 1, 2, \ldots), \]
with $\mathcal{F}_n$ representing the information, or knowledge, available to us at time $n$. We shall always suppose all σ-algebras to be complete (this can be avoided, and is not always appropriate, but it simplifies matters and suffices for our purposes). Thus $\mathcal{F}_0$ represents the initial information (if
there is none, $\mathcal{F}_0 = \{\emptyset, \Omega\}$, the trivial σ-algebra). On the other hand,
\[ \mathcal{F}_\infty := \lim_{n \to \infty} \mathcal{F}_n = \sigma\Big(\bigcup_n \mathcal{F}_n\Big) \]
represents all we ever will know (the 'Doomsday σ-algebra'). Often, $\mathcal{F}_\infty$ will be $\mathcal{F}$ (the σ-algebra from Chapter 2, representing 'knowing everything'). But this will not always be so; see e.g. Williams (1991), §15.8 for an interesting example. Such a family $\mathbb{F} := \{\mathcal{F}_n : n = 0, 1, 2, \ldots\}$ is called a filtration; a probability space endowed with such a filtration, $\{\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P}\}$, is called a stochastic basis or filtered probability space. These definitions are due to P. A. Meyer of Strasbourg; Meyer and the Strasbourg (and more generally, French) school of probabilists have been responsible for the 'general theory of (stochastic) processes', and for much of the progress in stochastic integration, since the 1960s; see e.g. Dellacherie and Meyer (1978), Dellacherie and Meyer (1982), Meyer (1966), Meyer (1976).

For the special case of a finite state space $\Omega = \{\omega_1, \ldots, \omega_n\}$ and a given σ-algebra $\mathcal{F}$ on $\Omega$ (which in this case is just an algebra) we can always find a unique finite partition $\mathcal{P} = \{A_1, \ldots, A_l\}$ of $\Omega$, i.e. the sets $A_i$ are disjoint and $\bigcup_{i=1}^{l} A_i = \Omega$, corresponding to $\mathcal{F}$. A filtration $\mathbb{F}$ therefore corresponds to a sequence of finer and finer partitions $\mathcal{P}_n$. At time $t = 0$ the agents only know that some event $\omega \in \Omega$ will happen; at time $T < \infty$ they know which specific event $\omega^*$ has happened. During the flow of time the agents learn the specific structure of the (σ-)algebras $\mathcal{F}_n$, which means they learn the corresponding partitions $\mathcal{P}_n$. Having the information in $\mathcal{F}_n$ revealed is equivalent to knowing in which $A_i^{(n)} \in \mathcal{P}_n$ the event $\omega^*$ is. Since the partitions become finer the information on $\omega^*$ becomes more detailed with each step.

Unfortunately this nice interpretation breaks down as soon as $\Omega$ becomes infinite. It turns out that the concept of filtrations rather than that of partitions is relevant for the more general situations of infinite $\Omega$, infinite $T$ and continuous-time processes.

C.2 Discrete-Parameter Stochastic Processes

The word 'stochastic' (derived from the Greek) is roughly synonymous with 'random'. It is perhaps unfortunate that usage favours 'stochastic process' rather than the simpler 'random process', but as it does, we shall follow it.

We need a framework which can handle dynamic situations, in which time evolves, and in which new information unfolds with time. In particular, we need to be able to speak in terms of 'the information available at time $n$', or, 'what we know at time $n$'. Further, we need to be able to increase $n$ – thereby increasing the information available as new information (typically, new price information) comes in, and talk about the information flow over time. One has a clear mental picture of what is meant by this – there is no conceptual difficulty. However, what is needed is a precise mathematical construct, which can be conveniently manipulated – perhaps in quite complicated ways – and yet which bears the above heuristic meaning. Now 'information' is not only an ordinary word, but even a technical term in mathematics – many books have been written on the subject of information theory. However, information theory in this sense is not what we need: for us, the emphasis is on the flow of information, and how to model and describe it. With this by way of motivation, we proceed to give some of the necessary definitions.

A stochastic process $X = \{X_n : n \in I\}$ is a family of random variables, defined on some common probability space, indexed by an index-set $I$. Usually (always in this book), $I$ represents time (sometimes $I$ represents space, and one calls $X$ a spatial process). Here, $I = \{0, 1, 2, \ldots, T\}$ (finite horizon) or $I = \{0, 1, 2, \ldots\}$ (infinite horizon). The (stochastic) process $X = (X_n)_{n=0}^{\infty}$ is said to be adapted to the filtration $\mathbb{F} = (\mathcal{F}_n)_{n=0}^{\infty}$ if
\[ X_n \text{ is } \mathcal{F}_n\text{-measurable for all } n. \]
So if $X$ is adapted, we will know the value of $X_n$ at time $n$. If
\[ \mathcal{F}_n = \sigma(X_0, X_1, \ldots, X_n) \]
. X(n. (ii) I Xn  < ∞ for all n. .1. For a stochastic process X = (Xn ). ω).APPENDIX C. C. Thus a process is always adapted to its natural ﬁltration.3 Deﬁnition and basic properties of martingales Excellent accounts of discreteparameter martingales are Neveu (1975). and models an unfavourable game. is all one needs to know when wishing to predict the future – how one got there provides no further information. Williams (1991) and Williams (2001) to which we refer the reader for detailed discussions. in that the chance inﬂuences tend to cancel each other out on average. . . X(n) interchangeably. A typical situation is that Fn = σ(W0 . I ) if P (i) X is adapted (to {Fn }). F. (n ≥ 1). and models a fair game. Such a ‘lack of memory’ property. W1 . X(ω) is the value X takes on ω (ω represents the ranP domness). W1 . . . Then X is adapted to I = (Fn ). is very useful for modelling purposes. Wn ) is the natural ﬁltration of some process W = (Wn ). (n ≥ 1). P Martingales have a useful interpretation in terms of dynamic games: a martingale is ‘constant on average’.s. Wn ) for some measurable function fn (nonrandom) of n + 1 variables. if using suﬃxes. Norris (1997). P X is a submartingale if in place of (iii) I E[Xn Fn−1 ] ≥ Xn−1 I − a. STOCHASTIC PROCESSES IN DISCRETE TIME 135 we call (Fn ) the natural ﬁltration of X. .s. · · · . For an excellent and accessible recent treatment of Markov chains. A process X = (Xn ) is called a martingale relative to ({Fn }. Deﬁnition C.g.(or σ(W0 . P X is a supermartingale if in place of (iii) I E[Xn Fn−1 ] ≤ Xn−1 I − a. i. see e. and suﬃciently speciﬁc and structured to have a rich and powerful theory. These two types are Markov processes and martingales. where usage dictates that they are called Markov chains. .3 below) model fair gambling games – situations where there may be lots of randomness (or unpredictability). (n ≥ 1). E (iii) I E[Xn Fn−1 ] = Xn−1 I − a. We will summarise what we need to use martingales for modelling in ﬁnance. 
though an idealisation of reality.3. Wn )) measurable. . Notation. I ).e. and models a favourable game. Martingales. there is a tendency towards stability. We shall encounter Markov processes more in continuous time (see Chapter 5) than in discrete time. it is convenient (e. on the other hand (see §3. but no tendency to drift one way or another: rather. each F Xn is Fn . ni say) to use Xn . There are two main types of stochastic process which are both general enough to be suﬃciently ﬂexible to model many commonly encountered situations. etc. iﬀ Xn = fn (W0 . The concept of a stochastic process is very general – and so very ﬂexible – but it is too general for useful progress to be made without specifying further structure or further restrictions.g.s. a submartingale is ‘increasing on average’. With ω displayed. these become Xn (ω). A Markov process models a situation in which where one is. and we shall feel free to do this.. For a random variable X on (Ω. a supermartingale is ‘decreasing on average’.
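Condition (iii) of Definition C.3.1 and its two one-sided variants can be verified directly on a finite tree, using the partition picture of the natural filtration from C.1. The sketch below is a hypothetical illustration, not a construction from the notes: it takes a simple random walk with up-probability `p` (assumed to satisfy 0 < p < 1), conditions on the first n - 1 steps, and compares the conditional mean of $S_n$ with $S_{n-1}$. The function name `classify_walk` and the three-step horizon are assumptions.

```python
from fractions import Fraction
from itertools import product

def classify_walk(p, horizon=3):
    """Compare E[S_n | F_{n-1}] with S_{n-1} for a simple random walk.

    Steps are +1 with probability p and -1 otherwise (0 < p < 1 assumed).
    F_{n-1} is the natural filtration: paths grouped by their first n-1 steps.
    """
    paths = list(product((1, -1), repeat=horizon))

    def prob(w):
        ups = w.count(1)
        return p ** ups * (1 - p) ** (horizon - ups)

    signs = set()
    for n in range(1, horizon + 1):
        for prefix in product((1, -1), repeat=n - 1):
            block = [w for w in paths if w[:n - 1] == prefix]
            mass = sum(prob(w) for w in block)
            # Conditional expectation of S_n on this block of the partition.
            cond = sum(sum(w[:n]) * prob(w) for w in block) / mass
            diff = cond - sum(prefix)          # E[S_n | F_{n-1}] - S_{n-1}
            signs.add((diff > 0) - (diff < 0))

    if signs == {0}:
        return "martingale"
    if signs == {1}:
        return "submartingale"
    if signs == {-1}:
        return "supermartingale"
    return "none of these"

fair = classify_walk(Fraction(1, 2))
favourable = classify_walk(Fraction(2, 3))
unfavourable = classify_walk(Fraction(1, 3))
```

With these inputs the fair walk is classified as a martingale, the walk with `p = 2/3` as a submartingale and the walk with `p = 1/3` as a supermartingale, matching the 'fair', 'favourable' and 'unfavourable' game interpretation above.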
Note.
1. Martingales have many connections with harmonic functions in probabilistic potential theory. The terminology in the inequalities above comes from this: supermartingales correspond to superharmonic functions, submartingales to subharmonic functions.
2. $X$ is a submartingale (supermartingale) if and only if $-X$ is a supermartingale (submartingale); $X$ is a martingale if and only if it is both a submartingale and a supermartingale.
3. $(X_n)$ is a martingale if and only if $(X_n - X_0)$ is a martingale. So we may without loss of generality take $X_0 = 0$ when convenient.
4. If $X$ is a martingale, then for $m < n$ using the iterated conditional expectation and the martingale property repeatedly (all equalities are in the a.s.-sense)
\[ \mathbb{E}[X_n|\mathcal{F}_m] = \mathbb{E}[\mathbb{E}(X_n|\mathcal{F}_{n-1})|\mathcal{F}_m] = \mathbb{E}[X_{n-1}|\mathcal{F}_m] = \ldots = \mathbb{E}[X_m|\mathcal{F}_m] = X_m, \]
and similarly for submartingales, supermartingales.

From the Oxford English Dictionary: martingale (etymology unknown)
1. 1589. An article of harness, to control a horse's head.
2. Naut. A rope for guying down the jib-boom to the dolphin-striker.
3. A system of gambling which consists in doubling the stake when losing in order to recoup oneself (1815).
Thackeray: 'You have not played as yet? Do not do so; above all avoid a martingale if you do.'

Gambling games have been studied since time immemorial – indeed, the Pascal-Fermat correspondence of 1654 which started the subject was on a problem (de Méré's problem) related to gambling. The doubling strategy above has been known at least since 1815.

The term 'martingale' in our sense is due to J. Ville (1939). Martingales were studied by Paul Lévy (1886–1971) from 1934 on (see obituary (Loève 1973)) and by J.L. Doob (1910–) from 1940 on. The first systematic exposition was (Doob 1953). This classic book, though hard going, is still a valuable source of information.

Examples.
1. Mean zero random walk: $S_n = \sum_{i=1}^{n} X_i$, with $X_i$ independent with $\mathbb{E}(X_i) = 0$, is a martingale (submartingale: positive mean; supermartingale: negative mean).
2. Stock prices: $S_n = S_0 \zeta_1 \cdots \zeta_n$ with $\zeta_i$ independent positive random variables with existing first moment.
3. Accumulating data about a random variable (Williams (1991), pp. 96, 166–167). If $\xi \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, $M_n := \mathbb{E}(\xi|\mathcal{F}_n)$ (so $M_n$ represents our best estimate of $\xi$ based on knowledge at time $n$), then using iterated conditional expectations
\[ \mathbb{E}[M_n|\mathcal{F}_{n-1}] = \mathbb{E}[\mathbb{E}(\xi|\mathcal{F}_n)|\mathcal{F}_{n-1}] = \mathbb{E}[\xi|\mathcal{F}_{n-1}] = M_{n-1}, \]
so $(M_n)$ is a martingale. One has the convergence
\[ M_n \to M_\infty := \mathbb{E}[\xi|\mathcal{F}_\infty] \quad \text{a.s. and in } L^1. \]

C.4 Martingale Transforms

Now think of a gambling game, or series of speculative investments, in discrete time. There is no play at time 0; there are plays at times $n = 1, 2, \ldots$, and
\[ \Delta X_n := X_n - X_{n-1} \]
represents our net winnings per unit stake at play $n$. Thus if $X_n$ is a martingale, the game is 'fair on average'.

Call a process $C = (C_n)_{n=1}^{\infty}$ predictable if $C_n$ is $\mathcal{F}_{n-1}$-measurable for all $n \geq 1$. Think of $C_n$ as your stake on play $n$ ($C_0$ is not defined, as there is no play at time 0). Predictability says that you have to decide how much to stake on play $n$ based on the history before time $n$ (i.e., up to and including play $n - 1$). Your winnings on game $n$ are $C_n \Delta X_n = C_n (X_n - X_{n-1})$. Your total (net) winnings up to time $n$ are
\[ Y_n = \sum_{k=1}^{n} C_k \Delta X_k = \sum_{k=1}^{n} C_k (X_k - X_{k-1}). \]
We write
\[ Y = C \bullet X, \quad Y_n = (C \bullet X)_n, \quad \Delta Y_n = C_n \Delta X_n \]
($(C \bullet X)_0 = 0$ as $\sum_{k=1}^{0}$ is empty), and call $C \bullet X$ the martingale transform of $X$ by $C$.

Theorem C.4.1.
(i) If $C$ is a bounded non-negative predictable process and $X$ is a supermartingale, $C \bullet X$ is a supermartingale null at zero.
(ii) If $C$ is bounded and predictable and $X$ is a martingale, $C \bullet X$ is a martingale null at zero.

Proof. $Y = C \bullet X$ is integrable, since $C$ is bounded and $X$ integrable. Now
\[ \mathbb{E}[Y_n - Y_{n-1}|\mathcal{F}_{n-1}] = \mathbb{E}[C_n (X_n - X_{n-1})|\mathcal{F}_{n-1}] = C_n\,\mathbb{E}[(X_n - X_{n-1})|\mathcal{F}_{n-1}] \]
(as $C_n$ is bounded, so integrable, and $\mathcal{F}_{n-1}$-measurable, so can be taken out)
\[ \leq 0 \]
in case (i), as $C \geq 0$ and $X$ is a supermartingale,
\[ = 0 \]
in case (ii), as $X$ is a martingale.

Interpretation. You can't beat the system! In the martingale case, predictability of $C$ means we can't foresee the future (which is realistic and fair). So we expect to gain nothing – as we should.

Note.
1. Martingale transforms were introduced and studied by Burkholder (1966). For a textbook account, see e.g. Neveu (1975), VIII.4.
2. Martingale transforms are the discrete analogues of stochastic integrals. They dominate the mathematical theory of finance in discrete time, just as stochastic integrals dominate the theory in continuous time.

Lemma C.4.1 (Martingale Transform Lemma). An adapted sequence of real integrable random variables $(X_n)$ is a martingale iff for any bounded predictable sequence $(C_n)$,
\[ \mathbb{E}\Big(\sum_{k=1}^{n} C_k \Delta X_k\Big) = 0 \quad (n = 1, 2, \ldots). \]

Proof. If $(X_n)$ is a martingale, $Y$ defined by $Y_0 = 0$,
\[ Y_n = \sum_{k=1}^{n} C_k \Delta X_k \quad (n \geq 1) \]
is the martingale transform $C \bullet X$, so is a martingale. Now $\mathbb{E}(Y_1) = \mathbb{E}(C_1\,\mathbb{E}(X_1 - X_0|\mathcal{F}_0)) = 0$ and we see by induction that
\[ \mathbb{E}(Y_{n+1}) = \mathbb{E}(C_{n+1}(X_{n+1} - X_n)) + \mathbb{E}(Y_n) = 0. \]
Conversely, if the condition of the lemma holds, choose $j$, and for any $\mathcal{F}_j$-measurable set $A$ write $C_n = 0$ for $n \neq j + 1$, $C_{j+1} = 1_A$. Then $(C_n)$ is predictable, so the condition of the lemma, $\mathbb{E}(\sum_{k=1}^{n} C_k \Delta X_k) = 0$, becomes
\[ \mathbb{E}[1_A (X_{j+1} - X_j)] = 0. \]
Since this holds for every set $A \in \mathcal{F}_j$, the definition of conditional expectation gives
\[ \mathbb{E}(X_{j+1}|\mathcal{F}_j) = X_j. \]
Since this holds for every $j$, $(X_n)$ is a martingale.

Remark C.4.1. The proof above is a good example of the value of Kolmogorov's definition of conditional expectation – which reveals itself, not in immediate transparency, but in its ease of handling in proofs.

We shall see in Chapter 4 the financial significance of martingale transforms $H \bullet M$.
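The 'doubling' system from the dictionary entry gives a concrete instance of Theorem C.4.1(ii) and of the Martingale Transform Lemma. The following is a small sketch under stated assumptions: a fair coin-tossing game over a fixed horizon of five plays, a stake that doubles after every loss and drops to zero after the first win, and hypothetical helper names `doubling_stake` and `transform`. Over a finite horizon this stake is bounded and predictable, so the theorem applies.

```python
from fractions import Fraction
from itertools import product

def doubling_stake(history):
    """Stake on the next play, decided from past increments only (predictable).

    Stake 1 on the first play, double after every loss,
    and stop betting (stake 0) once a win has occurred.
    """
    if 1 in history:
        return 0
    return 2 ** len(history)

def transform(path, stake):
    """(C . X)_n = sum over k of C_k * (X_k - X_{k-1}) along one path.

    path holds the increments X_k - X_{k-1}; the stake for each play
    sees only the increments that came before it.
    """
    return sum(stake(path[:k]) * dx for k, dx in enumerate(path))

horizon = 5                                   # illustrative number of plays
paths = list(product((1, -1), repeat=horizon))
weight = Fraction(1, 2 ** horizon)            # fair coin: paths equally likely

winnings = [transform(w, doubling_stake) for w in paths]
expected = sum(weight * y for y in winnings)  # E[(C . X)_n]
prob_ahead = sum(weight for y in winnings if y > 0)
worst_loss = min(winnings)
```

The gambler finishes one unit ahead on every path except the all-losses path, where the accumulated loss is large enough to bring the expected winnings back to exactly zero. This is the 'you can't beat the system' interpretation in miniature.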
Prices in ﬁnancial markets. Ross. 1994. (North Holland. C. 1990.A. and P.. in S. ed.C. 1982.M. Springer. Option pricing: a simpliﬁed approach. eds. A.. 1970. (Wiley). Rubinstein. Ross. (Chapman and Hall. Ann. (Wiley. J. Notes from Ascona meeting... J. M. Burkill. J. New York). R. 145–166. Processes of normal inverse Gaussian type.R. C. 1985.C.. 1973. (Wadsworth. D. (Princeton University Press). Journal of Political Economy 72.C. (Cambridge University Press). Billingsley. S. Options markets. Journal of Financial Economics 3. 1979. Cox. 1972.H. by W. Meyer. 2001. The valuation of options for alternative stochastic processes. Probabilities and potential vol.. Arbitrage. 1991. and M. Kelly. and P. T. 37. and H. Cochrane.E. and H.L. (PrenticeHall). 1966. U. 1997.. Dudley. J. Math. and M. A general option pricing formula. Davis.Bibliography Allingham.D. o Bj¨rk.H. B. 1995. O.H. 1998. 637–659.. Wilmott. Amsterdam New York). A. 1494–1504. 229–263. A second course in mathematical analysis.P. J. Dothan. and P. and S. The pricing of options and corporate liabilities. Burkholder. Rubinstein. Interest rate theory. 53–122.A. Martingale transforms.A. J. F. Asset pricing. Finance and Stochastics 2.J. R.. Dellacherie. (Oxford University Press). Cox.. Probabilities and potential vol. and S. T. F. 1976. 1953. London and New York) First published 1965 by Methuen & Co Ltd. Paris)... The theory of stochastic processes. Burkill. Imperial College. M. D. Financial Economics 7.. Paciﬁc Grove).C. Arbitrage theory in continuous time.. in Financial Mathematics. Elements of ﬁnancial economics. Howison. Miller. BarndorﬀNielsen. and M. P. Cox. Meyer.. Scholes. J. Interest rate volatility and the shape of the term structure. 1995.. L. Schaefer. Preprint. 1978. (MacMillan). 1989. Doob.. 139 . Convergence of probability measures. (Hermann. Berlin New York London.D. 1968. 41–68. ). Stochastic processes.: Mathematical models in ﬁnance (Chapman & Hall.. Real Analysis and Probability.M. Brown.A.. Cox.. 
Dellacherie. Bj¨rk. Statist. Black. M. Runggaldier Lecture o Notes in Mathematics pp.