Elementary Calculus of Financial Mathematics
About the Series
The SIAM series on Mathematical Modeling and Computation draws attention
to the wide range of important problems in the physical and life sciences and
engineering that are addressed by mathematical modeling and computation;
promotes the interdisciplinary culture required to meet these large-scale challenges;
and encourages the education of the next generation of applied and computational
mathematicians, physical and life scientists, and engineers.
The books cover analytical and computational techniques, describe significant
mathematical developments, and introduce modern scientific and engineering
applications. The series will publish lecture notes and texts for advanced
undergraduate- or graduate-level courses in physical applied mathematics,
biomathematics, and mathematical modeling, and volumes of interest to a wide
segment of the community of applied mathematicians, computational scientists,
and engineers.
Appropriate subject areas for future books in the series include fluids, dynamical
systems and chaos, mathematical biology, neuroscience, mathematical physiology,
epidemiology, morphogenesis, biomedical engineering, reaction-diffusion in
chemistry, nonlinear science, interfacial problems, solidification, combustion,
transport theory, solid mechanics, nonlinear vibrations, electromagnetic theory,
nonlinear optics, wave propagation, coherent structures, scattering theory, earth
science, solid-state physics, and plasma physics.
A. J. Roberts, Elementary Calculus of Financial Mathematics
James D. Meiss, Differential Dynamical Systems
E. van Groesen and Jaap Molenaar, Continuum Modeling in the Physical Sciences
Gerda de Vries, Thomas Hillen, Mark Lewis, Johannes Müller, and Birgitt
Schönfisch, A Course in Mathematical Biology: Quantitative Modeling with
Mathematical and Computational Methods
Ivan Markovsky, Jan C. Willems, Sabine Van Huffel, and Bart De Moor, Exact and
Approximate Modeling of Linear Systems: A Behavioral Approach
R. M. M. Mattheij, S. W. Rienstra, and J. H. M. ten Thije Boonkkamp, Partial
Differential Equations: Modeling, Analysis, Computation
Johnny T. Ottesen, Mette S. Olufsen, and Jesper K. Larsen, Applied Mathematical
Models in Human Physiology
Ingemar Kaj, Stochastic Modeling in Broadband Communications Systems
Peter Salamon, Paolo Sibani, and Richard Frost, Facts, Conjectures, and
Improvements for Simulated Annealing
Lyn C. Thomas, David B. Edelman, and Jonathan N. Crook, Credit Scoring and Its
Applications
Frank Natterer and Frank Wübbeling, Mathematical Methods in Image Reconstruction
Per Christian Hansen, Rank-Deficient and Discrete Ill-Posed Problems: Numerical
Aspects of Linear Inversion
Michael Griebel, Thomas Dornseifer, and Tilman Neunhoeffer, Numerical
Simulation in Fluid Dynamics: A Practical Introduction
Khosrow Chadan, David Colton, Lassi Päivärinta, and William Rundell, An
Introduction to Inverse Scattering and Inverse Spectral Problems
Charles K. Chui, Wavelets: A Mathematical Tool for Signal Analysis
Editorial Board
Alejandro Aceves, Southern Methodist University
Andrea Bertozzi, University of California, Los Angeles
Bard Ermentrout, University of Pittsburgh
Thomas Erneux, Université Libre de Brussels
Bernie Matkowsky, Northwestern University
Robert M. Miura, New Jersey Institute of Technology
Michael Tabor, University of Arizona
Mathematical Modeling and Computation
Editor-in-Chief
Richard Haberman, Southern Methodist University
Society for Industrial and Applied Mathematics
Philadelphia
A. J. Roberts
University of Adelaide
Adelaide, South Australia, Australia
Copyright © 2009 by the Society for Industrial and Applied Mathematics.
All rights reserved. Printed in the United States of America. No part of this book may be
reproduced, stored, or transmitted in any manner without the written permission of the
publisher. For information, write to the Society for Industrial and Applied Mathematics,
3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688 USA.
Trademarked names may be used in this book without the inclusion of a trademark
symbol. These names are used in an editorial context only; no infringement of
trademark is intended.
Maple is a registered trademark of Waterloo Maple, Inc.
MATLAB is a registered trademark of The MathWorks, Inc. For MATLAB product
information, please contact The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA
01760-2098 USA, 508-647-7000, Fax: 508-647-7001, info@mathworks.com,
www.mathworks.com.
Library of Congress Cataloging-in-Publication Data
Roberts, A. J.
Elementary calculus of financial mathematics / A. J. Roberts.
p. cm. -- (Mathematical modeling and computation ; 15)
Includes bibliographical references and index.
ISBN 978-0-898716-67-2
1. Finance--Mathematical models. 2. Stochastic processes. 3. Investments--
Mathematics. 4. Calculus. I. Title.
HG106.R63 2009
332.01'51923--dc22 2008042349
SIAM is a registered trademark.
To Barbara, Sam, Ben, and Nicky
for their support over the years

emfm
2008/10/22
page vii
Contents
Preface
List of Algorithms
1 Financial Indices Appear to Be Stochastic Processes
    1.1 Brownian motion is also called a Wiener process
    1.2 Stochastic drift and volatility are unique
    1.3 Basic numerics simulate an SDE
    1.4 A binomial lattice prices call option
    1.5 Summary
    Exercises
2 Ito’s Stochastic Calculus Introduced
    2.1 Multiplicative noise reduces exponential growth
    2.2 Ito’s formula solves some SDEs
    2.3 The Black–Scholes equation prices options accurately
    2.4 Summary
    Exercises
3 The Fokker–Planck Equation Describes the Probability Distribution
    3.1 The probability distribution evolves forward in time
    3.2 Stochastically solve deterministic differential equations
    3.3 The Kolmogorov backward equation completes the picture
    3.4 Summary
    Exercises
4 Stochastic Integration Proves Ito’s Formula
    4.1 The Ito integral $\int_a^b f\,dW$
    4.2 The Ito formula
    4.3 Summary
    Exercises
Appendix A Extra MATLAB/SCILAB Code
Appendix B Two Alternate Proofs
    B.1 Fokker–Planck equation
    B.2 Kolmogorov backward equation
Bibliography
Index
Preface
Welcome! This book leads you on an introduction into the fascinating realm of financial
mathematics and its calculus. Modern financial mathematics relies on a deep and
sophisticated theory of random processes in time. Such randomness reflects the erratic
fluctuations in financial markets. I take on the challenge of introducing you to the crucial
concepts needed to understand and value financial options among such fluctuations. This
book supports your learning with the bare minimum of necessary prerequisite mathematics.
To deliver understanding with a minimum of analysis, the book starts with a
graphical/numerical introduction to how to adapt random walks to describe the typical erratic
fluctuations of financial markets. Then simple numerical simulations both demonstrate the
approach and suggest the symbology of stochastic calculus. The finite steps of the numerical
approach underlie the introduction of the binomial lattice model for evaluating financial
options.
Fluctuations in a financial environment may bankrupt businesses that otherwise would
grow. Discrete analysis of this problem leads to the surprisingly simple extension of classic
calculus needed to perform stochastic calculus. The key is to replace squared noise by a
mean drift: in effect, $dW^2 = dt$. This simple but powerful rule enables us to differentiate,
integrate, solve stochastic differential equations, and triumphantly derive and use the
Black–Scholes equation to accurately value financial options.
The first two chapters deal with individual realizations and simulations. However,
some applications require exploring the distribution of possibilities. The Fokker–Planck
and Kolmogorov equations link evolving probability distributions to stochastic differential
equations (SDEs). Such transformations empower us not only to value financial options
but also to model the natural fluctuations in biology models and to approximately solve
differential equations using stochastic simulation.
Lastly, the formal rules used previously are justified more rigorously by an introduction
to a sound definition of stochastic integration. Integration in turn leads to a sound
interpretation of Ito’s formula that we find so useful in financial applications.
Prerequisites
Basic algebra, calculus, data analysis, probability and Markov chains are prerequisites
for this course. There will be many times throughout this book when you will need the
concepts and techniques of such courses. Be sure you are familiar with those, and have
appropriate references on hand.
Computer simulations
Incorporated into this book are MATLAB/SCILAB scripts to enhance your ability to probe
the problems and concepts presented and thus to improve learning. You can purchase
MATLAB from the Mathworks company, http://www.mathworks.com. SCILAB is available
for free via http://www.scilab.org.
A. J. Roberts
List of Algorithms
1.1 MATLAB/SCILAB code to plot $m$ realizations of a Brownian motion/Wiener
process as shown in Figure 1.3. In SCILAB use rand(.,.,"n") instead
of randn(.,.) for $N(0, 1)$ distributed random numbers.
1.2 MATLAB/SCILAB code to draw five stochastic processes, all scaled from
the one Wiener process, with different drifts and volatilities.
1.3 In MATLAB/SCILAB, given h is the time step, this code estimates the stock
drift and stock volatility from a time series of values s. In SCILAB use
$ instead of end.
1.4 Code for simulating five realizations of the financial SDE $dS = \alpha S\,dt + \beta S\,dW$.
The for-loop steps forward in time. As a second subscript to an
array, the colon forms a row vector over all the realizations.
1.5 Code for starting one realization with very small time step, then repeat with
time steps four times as long as the previous.
1.6 MATLAB/SCILAB code for a four-step binomial lattice estimate of the value
of a call option.
2.1 Code for determining the value of the call option in Example 2.10. Observe
the use of nan to introduce unspecified boundary conditions that turn out
to be irrelevant on the selected grid just as they are for the binomial model.
In SCILAB use %nan instead of nan.
3.1 Example MATLAB/SCILAB code using (3.8) to stochastically estimate the
value of a call option, from Example 1.9, on an asset with initial price $S_0 = 35$,
strike price $X = 38.50$ after one year in which the asset fluctuates by a
factor of 1.25 and with a bond rate of 12%.
3.2 MATLAB/SCILAB code to solve a boundary value problem ODE by its
corresponding SDE. Use find to evolve only those realizations within
the domain $0 < X < 3$. Continue until all realizations reach one boundary
or the other. Estimate the expectation of the boundary values using the
conditional vectors x<=0 and x>=3 to account for the number of realizations
reaching each boundary.
4.1 MATLAB/SCILAB code (essentially an Euler method) to numerically evaluate
$\int_0^t W(s)\,dW(s)$ for $0 < t < 1$ to draw Figure 4.1.
A.1 This algorithm draws a 20 second zoom into the exponential function to
demonstrate the smoothness of our usual functions.
A.2 This algorithm draws a 60 second zoom into Brownian motion. Use the
Brownian bridge to generate a start curve and new data as the zoom proceeds.
Force the Brownian motion to pass through the origin. Note the
self-affinity as the vertical is scaled with the square root of the horizontal.
Note the infinite number of zero crossings that appear near the original one.
A.3 This algorithm random walks from the specific point (0.7, 0.4) to show that
the walkers’ first exit locations give a reasonable sample of the boundary
conditions. This algorithm compares the numerical and exact solution
$u = x^2 - y^2$.
Chapter 1
Financial Indices Appear to Be Stochastic Processes
Contents
1.1 Brownian motion is also called a Wiener process
1.2 Stochastic drift and volatility are unique
1.3 Basic numerics simulate an SDE
    1.3.1 The Euler method is the simplest
    1.3.2 Convergence is relatively slow
1.4 A binomial lattice prices call option
    1.4.1 Arbitrage value of forward contracts
    1.4.2 A one step binomial model
    1.4.3 Use a multiperiod binomial lattice for accuracy
1.5 Summary
Exercises
    Answers to selected exercises
Earnings momentum and visibility should continue to propel the market to
new highs. A forecast issued by E. F. Hutton, the Wall Street brokerage firm,
moments before the stock market plunged on October 19, 1987.
There is overwhelming evidence that share prices, financial indices, and currency values
fluctuate wildly and randomly. An example is the range of daily cotton prices shown in
Figure 1.1; another is the range of wheat prices shown in Figure 1.2. These figures demonstrate
that randomness appears to be an integral part of the dynamics of the financial market.
In a dissertation that was little appreciated at the time, the all-pervading randomness
in finance was realized by Bachelier circa 1900. Prices evolving in time form a stochastic
process because of these strong random fluctuations. This book develops the elementary
theory of such stochastic processes in order to underpin the famous Black–Scholes equation
for valuing financial options, here described by Tony Dooley:
. . . the most-used formula in all of history is the Black–Scholes formula. Each
day, it is used millions of times in millions of computer programs, making it
more used than Pythagoras’ theorem!
[Figure 1.1 plot: vertical axis S(t) = price of cotton (US$), 20 to 120; horizontal axis year, 1970 to 2000.]
Figure 1.1. Opening daily prices S(t) of cotton over nearly 30 years showing the
strong fluctuations typical of financial markets (note that there are about 21 trading days
per month, that is, 250 per year).
[Figure 1.2 plot: vertical axis price of wheat (US$), 0 to 7; horizontal axis year, 1920 to 2010.]
Figure 1.2. The range of wheat prices over nearly 90 years. The vertical bars
give the range in each year; the tick on the left of each bar is the opening price; and the
tick on the right is the closing price. The red curve is the 4-year average, blue is the 9-year
average, and cyan is the 18-year average.
The financial world is not the only example of significant random fluctuations. In
biology we generate differential equations representing the interactions of predators and
prey, for example, foxes and rabbits. Such models are often written as ordinary differential
equations (ODEs) of the form $dx/dt = \cdots$ and $dy/dt = \cdots$, where the right-hand side dots
denote life and death interactions. However, biological populations live in an environment
with random events such as drought, flood, or meteorite impact. There are fluctuations in
the populations due to such unforeseen events. Random events are especially dangerous
for populations of endangered species in which there are relatively few individuals.
Engineers may also need to analyze problems with random inputs. A truck driving
along a road shakes from a variety of causes, one of which is travelling over the essentially
random bumps in the road. In many aspects the truck’s design must account for such
stochastic vibrations.
These examples show that the study of stochastic differential equations (SDEs) is
worthwhile. Note that the use of SDEs is an admission of ignorance of the nature of
fluctuations. There are a multitude of unknown processes which influence the phenomena
of interest. Appealing to the central limit theorem, we assume that these unknown influences
accumulate to become normally distributed (alternatively called a Gaussian distribution).
Suggested activity: Do at least Exercise 1.1.
1.1 Brownian motion is also called a Wiener process
Nothing in Nature is random . . . A thing appears random only through the in-
completeness of our knowledge. Spinoza
A starting point to describe stochastic processes such as a stock price is Brownian
motion or, more technically, a Wiener process. Figure 1.3 shows an example of this process.
See the roughly qualitative similarities to the cotton prices shown in Figure 1.1 (though the
cotton prices appear to fluctuate more) and to the wheat prices in Figure 1.2.
Algorithm 1.1 MATLAB/SCILAB code to plot m realizations of a Brownian motion/
Wiener process as shown in Figure 1.3. In SCILAB use rand(.,.,"n") instead of
randn(.,.) for N(0, 1) distributed random numbers.
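m=5;                         % number of realizations
n=300;                       % number of time steps
t=linspace(0,1,n+1)';        % column vector of times 0, h, 2h, ..., 1
h=diff(t(1:2));              % time step
dw=sqrt(h)*randn(n,m);       % independent increments, each N(0,h)
w=cumsum([zeros(1,m);dw]);   % W(0)=0 followed by the cumulative sums
plot(t,w)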
Where do fluctuations in the financial indices come from? Many economic theorists
assert that fluctuations reflect the random arrival of new knowledge. As new knowledge
is made known to the traders of stocks and shares, they almost instantly assimilate the
knowledge and buy and sell accordingly. However, according to these graphs it would seem
[Figure 1.3 plot: vertical axis W(t); horizontal axis time t, 0 to 1.]
Figure 1.3. Five realizations of Brownian motion (Wiener process) W(t) generated
by Algorithm 1.1. Most figures in this book plot five realizations of a random process
to hint that stochastic processes, such as this Wiener process, are a composite of an
uncountable infinity of individual realizations.
that at least 95% of the “new knowledge” in a day is self-contradictory! The independent
fluctuations from one day to the next say to me that whatever a person discovers one day
is meaningless the next. This means to me that any such knowledge is worthless and
refutes the notion of genuine “new knowledge.”[1] But there is an alternative model for the
fluctuations. Recent computer simulations[2] show that when many interacting agents try to
outsmart each other to obtain assets the result is mayhem. Competing agents individually
generate widely fluctuating valuations of the asset, such as seen in real share prices. No
new knowledge need be hypothesized, just greed. Fortunately, it appears that the average
valuation of the asset is realistic, though no proof of this is known. Thus a recent view of
financial prices is that the average valuation is reasonable, with fluctuations due to agents
competing with each other.
Many agents look at financial indices and see trends and patterns on which they base
their recommendations. One recommendation which I heard in a public presentation was
based on Fibonacci numbers. I view this sort of trend analysis as gibberish. Random
fluctuations often seem to have short-term patterns, as you see in Figure 1.3. The problem
is that humans are very susceptible to seeing patterns which do not exist. Remember that a
random sequence must, accidentally, have some chance patterns. Despite all the analysis by
“financial experts,” sound statistical analysis shows that there are very few, if any, patterns
over time in the fluctuations of the stock market. The fluctuations are, to a very good
approximation, independent from day to day. We assume independence in our development
of suitable mathematics.
[1] See “Why economic theory is out of whack” by Mark Buchanan in New Scientist, 19 July 2008, to read
a little on financial markets’ lack of reaction to news.
[2] Read about the “El Farol” problem by W. B. Arthur, Amer. Econ. Assoc. Pap. Proc., 84, p. 406 (1994).
The Wiener process
Now we return to our main task, which is to understand the nature of Brownian motion
(Wiener process). The stochastic process shown in Figure 1.3 is named after the British
botanist Robert Brown, who first reported the motion in 1827 when observing in his microscope
the movement of tiny pollen grains due to small but incessant and random impacts of
water molecules. Wiener formalized its properties in the 20th century. Algorithm 1.1 generates
the realizations shown in Figure 1.3 by dividing time into small steps of length $\Delta t = h$
so that the $j$th time step reaches time $t_j = jh$ (assuming $t = 0$ is the start of the period of
interest). Then the algorithm determines $W_j$, the value of the process at time $t_j$, by adding
up many independent and normally distributed increments:[3]
$$W_{j+1} = W_j + \sqrt{h}\,Z_j \quad\text{where } Z_j \sim N(0, 1) \text{ and with } W_0 = 0.$$
We generate a Brownian motion, or Wiener process, $W(t)$ in the limit as the step size
$h \to 0$ so that the number of steps becomes large, e.g., $1/h$. The random increments to $W$,
namely $\Delta W_j = \sqrt{h}\,Z_j$, are chosen to scale with $\sqrt{h}$ because it eventuates that this scaling is
precisely what is needed to generate a reasonable limit as $h \to 0$, as we now see. Consider
the process $W$ at some fixed time $t = nh$, that is, we take $n$ steps of size $h$ to reach $t$, and
hence $W(t)$ is approximated by $n$ random increments of variance $h$:
$$\begin{aligned}
W_n &= W_{n-1} + \sqrt{h}\,Z_{n-1} \\
    &= W_{n-2} + \sqrt{h}\,Z_{n-2} + \sqrt{h}\,Z_{n-1} \\
    &= \cdots \\
    &= W_0 + \sum_{j=0}^{n-1} \sqrt{h}\,Z_j .
\end{aligned}$$
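The telescoping sum above can be checked numerically. The following is a minimal MATLAB sketch, not one of the book's algorithms: it steps the recursion explicitly and compares the result with the cumulative sum used in Algorithm 1.1. All variable names are illustrative assumptions; in SCILAB, rand(n,1,"n") would replace randn(n,1).

```matlab
% Sketch: step W(j+1) = W(j) + sqrt(h)*Z(j) explicitly and compare with
% the vectorized cumulative sum of the same increments.
n = 300;                  % number of time steps (as in Algorithm 1.1)
h = 1/n;                  % step size, so that t_n = n*h = 1
z = randn(n,1);           % Z_j ~ N(0,1)
w = zeros(n+1,1);         % w(1) holds W_0 = 0 (array indices start at 1)
for j = 1:n
    w(j+1) = w(j) + sqrt(h)*z(j);     % one step of the recursion
end
wsum = [0; cumsum(sqrt(h)*z)];        % W_0 followed by the partial sums
maxdiff = max(abs(w - wsum));         % differs by rounding error at most
disp(maxdiff)
```

The loop makes the time stepping explicit, whereas cumsum performs the same additions in one vectorized call; both should agree to within rounding error.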
Now firstly $W_0 = 0$ and secondly we know that a sum of normally distributed random
variables is a normally distributed random variable with mean given by the sum of means,
and variance given by the sum of the variances. Since all the increments $Z_j \sim N(0, 1)$, then
$\sqrt{h}\,Z_j \sim N(0, h)$, and the sum of $n$ such increments is
$$W(t) = \sum_{j=0}^{n-1} \sqrt{h}\,Z_j \sim N(0, nh) = N(0, t).$$
Thus we deduce that $W(t) \sim N(0, t)$.
By taking random increments $\propto \sqrt{h}$ we find that the distribution of $W$ at any time
is fixed, that is, independent of the number of discrete steps we took to get to that time.
Thus considering the step size $h \to 0$ is a reasonable limit. Similar arguments show that
the change in $W$ over any time interval of length $t$, $W(t+s) - W(s)$, is also normally
distributed $N(0, t)$ and, further, is independent of any details of the process that occurred
for times before time $s$.
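A quick numerical experiment illustrates why the square-root scaling matters. The sketch below is an illustration under assumed sample sizes, not code from the book: it builds W(1) from n steps of size h = 1/n for several n and estimates the variance in each case. If the increments scale with the square root of h, every estimate should be near t = 1.

```matlab
% Sketch: the sample variance of W(1) should not depend on the number
% of steps n used to reach t = 1.
m = 10000;                   % number of realizations (illustrative choice)
nsteps = [4 32 256];         % three discretizations of the interval [0,1]
v = zeros(size(nsteps));     % sample variance of W(1) for each n
for k = 1:numel(nsteps)
    n = nsteps(k);
    h = 1/n;
    w1 = sum(sqrt(h)*randn(n,m), 1);   % W(1) for each of the m realizations
    v(k) = var(w1);
end
disp([nsteps; v])            % second row: estimates, each expected near 1
```

Replacing sqrt(h) by h in this sketch would make the variance shrink like 1/n, which is one way to see that no other power of h gives a nontrivial limit.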
[3] Recall that saying a random variable $Z \sim N(a, b)$ means that $Z$ is normally distributed with mean $a$ and
variance $b$ (standard deviation $\sqrt{b}$).
This last property of independence is vitally important in various crucial places in the
development of SDEs. To see this normal distribution, suppose time has been discretized
with steps $\Delta t = h$, the time $s$ corresponds to step $j = \ell$, and the time $s + t$ corresponds to
step $j = \ell + n$. Then
$$\begin{aligned}
W_{\ell+n} &= W_{\ell+n-1} + \sqrt{h}\,Z_{\ell+n-1} \\
           &= W_{\ell+n-2} + \sqrt{h}\,Z_{\ell+n-2} + \sqrt{h}\,Z_{\ell+n-1} \\
           &= \cdots \\
           &= W_\ell + \sum_{j=\ell}^{\ell+n-1} \sqrt{h}\,Z_j .
\end{aligned}$$
This sum of $n$ increments is distributed $N(0, t)$ as before, and so
$$W(t+s) - W(s) = W_{\ell+n} - W_\ell \sim N(0, t).$$
Now, further, see that the sum of the increments above involves only random variables $Z_j$ for
$\ell \le j < \ell + n$, which are completely independent of the random increments $\sqrt{h}\,Z_j$ for $j < \ell$,
and hence are completely independent of the details of $W(t)$ for times $t \le s$. That is, after
time $t = s$ the changes in the process $W$ from $W(s)$ are independent of $W(t)$ for earlier
times $t \le s$.
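The independence of increments can be probed by simulation. The sketch below is a hedged illustration with assumed values s = t = 1 and illustrative names: it estimates the covariance between W(s) and the later change W(t+s) - W(s). A sample covariance near zero is consistent with independence, although on its own it does not prove it.

```matlab
% Sketch: sample covariance between W(s) and the increment W(t+s) - W(s),
% here with s = 1 and t = 1.
m = 10000;                        % number of realizations (illustrative)
n = 100;                          % steps per unit time
h = 1/n;
dw = sqrt(h)*randn(2*n, m);       % increments covering 0 < time < 2
ws = sum(dw(1:n,:), 1);           % W(s) at s = 1
inc = sum(dw(n+1:2*n,:), 1);      % W(t+s) - W(s) with t = 1
c = mean(ws.*inc);                % covariance estimate (both have zero mean)
vinc = var(inc);                  % expected to be near t = 1
disp([c vinc])
```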
Brownian motion has a prime role in stochastic calculus because of the central limit
theorem. In this context, if the increments in $W$ follow some other distribution on the
smallest “micro” times, then provided the variance of the microincrements is finite, the
process always appears Brownian on the macroscale. This is assured because the sum
of many independent random variables with finite variance tends to a normal distribution. Indeed, in
Section 1.4 and others following we approximate the increments as the binary choice of
either a step up or a step down. In effect we choose $Z_j = \pm 1$, each with probability $\frac{1}{2}$;
as the mean and variance of such a $Z_j$ are zero and one, respectively, the cumulative sum
$\sum_{j=0}^{n-1} \sqrt{h}\,Z_j$ has the appearance on the macroscale of an $N(0, nh)$ random variable, as
required.
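The binary-step claim is easy to try out. In the following sketch (illustrative sizes and names, not code from the book) each increment is a step up or down of size sqrt(h), and the sample mean and variance of the sum at t = nh = 1 are compared with those of an N(0, 1) variable.

```matlab
% Sketch: replace the normal increments by binary steps Z_j = +1 or -1
% and check the mean and variance of the sum on the macroscale.
m = 10000;                        % number of realizations (illustrative)
n = 400;                          % number of micro steps
h = 1/n;                          % so that n*h = 1
z = 2*(rand(n,m) > 0.5) - 1;      % +1 or -1, each with probability 1/2
w1 = sum(sqrt(h)*z, 1);           % approximation to W(1)
mu = mean(w1);                    % expected near 0
v = var(w1);                      % expected near n*h = 1
disp([mu v])
```

A histogram of w1 (for example with hist in MATLAB) should look roughly bell shaped, although with only n = 400 steps the values remain visibly discrete.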
Definition 1.1. Brownian motion or Wiener process or random walk, usually denoted $W(t)$,
satisfies the following properties:
• $W(t)$ is continuous;
• $W(0) = 0$;
• the change $W(t+s) - W(s) \sim N(0, t)$ for $t, s \ge 0$; and
• $W(t+s) - W(s)$ is independent of any details of the process for times earlier than $s$.
Continuous but not differentiable
Wiener proved that such a process exists and is unique in a stochastic sense. The only
property that we have not seen is that $W(t)$ is continuous, so we investigate it now. We
know that $W(t+s) - W(s) \sim N(0, t)$, so we now imagine $t$ as small and write this as
$$W(t+s) - W(s) = \sqrt{t}\,Z_t ,$$
where the random variables $Z_t \sim N(0, 1)$. Now although $Z_t$ will vary with $t$, it comes
from a normal distribution with mean 0 and variance 1, so as $t \to 0$, then almost surely the
right-hand side $\sqrt{t}\,Z_t \to 0$. Thus, almost surely $W(t+s) \to W(s)$ as $t \to 0$, and hence
$W$ is continuous (almost surely).
Although it is continuous we now demonstrate that a Wiener process, such as that
shown in Figure 1.4 (left column), is too jagged to be differentiable.[4] As Figure 1.4 (right
column) shows, recall that for a smooth function such as $f(t) = e^t$ we generally see a linear
variation near any point; for example,
$$f(t+s) - f(s) = e^{t+s} - e^s = (e^t - 1)e^s \approx t e^s ,$$
or more generally, $f(t+s) - f(s) \approx t f'(s)$. Thus we are familiar with $f(t+s) - f(s)$
decreasing linearly with $t$, and upon this are based all the familiar rules of differential and
integral calculus. In contrast, in the Wiener process, and generally for solutions of SDEs,
Figure 1.4 (left) shows $W(t+s) - W(s)$ decreasing more slowly, like $\sqrt{t}$. Thus Wiener
processes are much steeper and vastly more jagged than smooth differentiable functions.
Notionally the Wiener process has “infinite slope” and is thus nowhere differentiable. This
feature generates lots of marvelous new effects that make stochastic calculus enormously
intriguing.
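The contrast between linear and square-root scaling shows up clearly in difference quotients. The sketch below is an illustration with assumed sample sizes and names: it estimates the root-mean-square of [W(t+s) - W(s)]/t for shrinking t, alongside the quotient for the smooth function e^t at s = 0.

```matlab
% Sketch: difference quotients of a Wiener process grow like 1/sqrt(t)
% as t -> 0, whereas those of exp(t) settle to the slope f'(0) = 1.
m = 20000;                          % samples per value of t (illustrative)
tt = [1e-2 1e-4 1e-6];              % shrinking time increments t
rmsq = zeros(size(tt));
for k = 1:numel(tt)
    dw = sqrt(tt(k))*randn(m,1);    % W(t+s) - W(s) ~ N(0,t)
    rmsq(k) = sqrt(mean(dw.^2))/tt(k);   % rms of the difference quotient
end
df = (exp(tt) - 1)./tt;             % difference quotient of exp(t) at s = 0
disp([tt; rmsq; df])                % rows: t, Wiener quotient, smooth quotient
```

The second row should grow roughly as 10, 100, 1000 while the third row stays close to 1, which is the numerical face of the notional "infinite slope."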
Example 1.2. Pick a normally distributed random variable $Z \sim N(0, 1)$, then define
$W(t) = Z\sqrt{t}$. Is $W(t)$ a Wiener process?
Solution: No; although $W(t) \sim N(0, t)$, it does not satisfy all the properties of a
Wiener process. Consider each property in turn:
• true, $W(t)$ is clearly continuous;
• true, $W(0) = 0$ is satisfied;
• but $W(t+s) - W(s) = Z(\sqrt{t+s} - \sqrt{s})$, which has variance
$(\sqrt{t+s} - \sqrt{s})^2 = t + 2s - 2\sqrt{s(t+s)} \ne t$ for $s, t > 0$, and so $W(t)$ does not
satisfy the third property of a Wiener process (nor the fourth, incidentally).
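The failure of the third property can be seen with plain arithmetic. As a small check, with s = 1 and t = 1 chosen purely for illustration, the variance of the increment of Z*sqrt(t) is far from the value t that a Wiener process requires:

```matlab
% Sketch: variance of W(t+s) - W(s) when W(t) = Z*sqrt(t) and Var Z = 1.
s = 1;  t = 1;                            % illustrative times
vinc = (sqrt(t+s) - sqrt(s))^2;           % variance of Z*(sqrt(t+s)-sqrt(s))
vformula = t + 2*s - 2*sqrt(s*(t+s));     % the same quantity, expanded
disp([vinc vformula t])                   % vinc is about 0.1716, not t = 1
```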
Example 1.3. Let $W(t)$ and $\tilde{W}(t)$ be independent Wiener processes and $\rho$ a fixed number
$0 < \rho < 1$. Show that the linear combination $X(t) = \rho W(t) + \sqrt{1-\rho^2}\,\tilde{W}(t)$ is a Wiener
process.
Solution: Look at the properties in turn.
• $X(t)$ is clearly continuous as $W$ and $\tilde{W}$ are continuous and the linear combination
maintains continuity.
• $X(0) = \rho W(0) + \sqrt{1-\rho^2}\,\tilde{W}(0) = \rho \cdot 0 + \sqrt{1-\rho^2} \cdot 0 = 0$.
[4] See the amazing deep zoom of the Wiener process shown by Algorithm A.2 in Appendix A with its
unexplored features. Contrast this with a corresponding zoom, Algorithm A.1, of the exponential function
which is boringly smooth.
[Figure 1.4 plots: left and right columns of panels, zooming in from top to bottom.]
Figure 1.4. Left: From top to bottom, zoom into three realizations of a Wiener
process showing the increasing level of detail and jaggedness. Right: From top to bottom,
zoom into the smooth exponential $e^t$. Time is plotted horizontally.
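• From its definition and from the properties of scaling and adding normally distributed
independent random variables,
$$\begin{aligned}
X(t+s) - X(s) &= \rho W(t+s) + \sqrt{1-\rho^2}\,\tilde{W}(t+s) - \rho W(s) - \sqrt{1-\rho^2}\,\tilde{W}(s) \\
&= \underbrace{\rho \underbrace{[W(t+s) - W(s)]}_{\sim N(0,t)}}_{\sim N(0,\rho^2 t)}
 + \underbrace{\sqrt{1-\rho^2}\, \underbrace{[\tilde{W}(t+s) - \tilde{W}(s)]}_{\sim N(0,t)}}_{\sim N(0,(1-\rho^2)t)} \\
&\sim N(0, \rho^2 t + (1-\rho^2)t) = N(0, t) .
\end{aligned}$$
• Also from its definition, $X(t+s) - X(s) = \rho[W(t+s) - W(s)] + \sqrt{1-\rho^2}\,[\tilde{W}(t+s) - \tilde{W}(s)]$,
but neither $W(t+s) - W(s)$ nor $\tilde{W}(t+s) - \tilde{W}(s)$ depends on the earlier
details of $W$ or $\tilde{W}$, so neither does $X(t+s) - X(s)$, and hence this increment cannot
depend upon the earlier details of $X$.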
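The variance calculation in this example can be checked by simulation. The sketch below uses an assumed value ρ = 0.6 and illustrative names: it combines two independently generated Wiener processes and estimates the variance of X(1) and of the increment X(1) - X(1/2), which should be near 1 and 1/2 respectively. In SCILAB, $ would replace end, as the book notes for its own listings.

```matlab
% Sketch: X = rho*W + sqrt(1-rho^2)*Wtilde should have increments N(0,t).
rho = 0.6;                            % assumed value with 0 < rho < 1
m = 10000;                            % number of realizations (illustrative)
n = 50;                               % time steps on [0,1]
h = 1/n;
w  = cumsum(sqrt(h)*randn(n,m));      % W at times h, 2h, ..., 1
wt = cumsum(sqrt(h)*randn(n,m));      % independent second Wiener process
x  = rho*w + sqrt(1 - rho^2)*wt;      % the linear combination X(t)
vx = var(x(end,:));                   % variance of X(1), expected near 1
vinc = var(x(end,:) - x(n/2,:));      % variance of X(1) - X(1/2), near 0.5
disp([vx vinc])
```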
Summary
The Wiener process $W(t)$ is the basic stochastic process from which we build an understanding
of systems with fluctuations. The Wiener process is continuous but not differentiable;
its independent random fluctuations, $\Delta W$, scale with the square root of the time
step, $\sqrt{\Delta t}$.
1.2 Stochastic drift and volatility are unique
Applying the basic Wiener process, Figure 1.3, to the prices of assets needs refinement for
three reasons:
1. different assets have different amounts of fluctuation;
2. risky assets generally have a positive expected return, whereas the expected value of
a Wiener process $E[W(t)] = 0$ as it is distributed $N(0, t)$;
3. the Wiener process assumes that any step, $\Delta W$, is independent of the magnitude
of $W$, whereas we expect that any change in the value of an asset, $\Delta S$, would be
proportional to $S$; investors expect, for example, to get twice the return from a doubling
in investment.
Deal with each of these in turn:
1. Increase the size of the fluctuations by scaling the Wiener process to the asset value
$S(t) = \sigma W(t)$, where the scaling factor $\sigma$ is called the volatility of the asset. That is,
any increment in asset value $\Delta S = \sigma \Delta W$ is distributed $N(0, \sigma^2 \Delta t)$, where $\Delta t$ is the
time step (like $h$). Symbolically we write this as
$$dS = \sigma\,dW .$$
Interpret this differential equation as meaning $S_{j+1} = S_j + \sigma\sqrt{\Delta t}\,Z_j$ for $Z_j \sim N(0, 1)$
as before.
[Figure 1.5: plot of S(t) against time t for 0 ≤ t ≤ 1, with each curve labeled by its “μ, σ” pair.]
Figure 1.5. Algorithm 1.2 draws this realization of a Wiener process, μ = 0 and
σ = 1, with four other transformed versions, labeled “μ, σ,” with various drift μ and
volatility σ but for the same realization of the Wiener process.
2. We expect (hope) the asset will increase in value in the long term. One way to model
such growth is to ensure that the increments have a nonzero mean. Simply add a little
to each increment:
$$S_{j+1} = S_j + \mu\,\Delta t + \sigma\sqrt{\Delta t}\,Z_j ,$$
where the parameter μ is called the drift. The factor Δt is used so that μ is interpreted
as the expected rate of growth of S(t); for example, in the absence of fluctuations
(the volatility σ = 0) the price $S(t) = S_0 + \mu t$ exactly, whence μ is the slope of the
increase in price. Figure 1.5 shows a realization of a Wiener process and various S(t)
derived from it with different drifts and volatilities, as computed by Algorithm 1.2.
Equivalently we write ΔS = μΔt + σΔW, or in terms of infinitesimal differentials,
$$dS = \mu\,dt + \sigma\,dW. \tag{1.1}$$
Equation (1.1) symbolically records the general stochastic differential equation⁵ when
we allow, as we do next, the drift μ and the volatility σ to vary with price S and/or t
instead of being constant.
⁵ In the physics and engineering communities, SDEs such as (1.1) are written in the form of an
ordinary differential equation, called a Langevin equation, namely dS/dt = μ(t, S) + σ(t, S)ξ(t),
where ξ(t) represents what is called white noise. This is an SDE in a different disguise. However,
physicists and engineers usually interpret it in the Stratonovich sense, which is subtly different from the Ito
interpretation of SDEs developed in this chapter.
Algorithm 1.2 MATLAB/SCILAB code to draw five stochastic processes, all scaled from
the one Wiener process, with different drifts and volatilities.
n=300;
t=linspace(0,1,n+1)';
h=diff(t(1:2));
dw=sqrt(h)*randn(n,1);
w=cumsum([0;dw]);
plot(t,t*[0 2 2 -1 -1]+w*[1 0.5 2 0.3 3])
legend('0, 1','2, 0.5','2, 2','-1, 0.3','-1, 3',4);
3. In (1.1) μ is the absolute rate of return per unit time. However, financial investors
require a percentage rate of return. That is, we expect increments relative to the
current value to be the same. For example, for an asset with no volatility (treasury
bonds, perhaps) we obtain exponential growth (compound interest) such as
$S(t) = S_0 e^{rt}$, where r is the (continuously compounded) interest rate. Differentiating this
gives dS/dt = rS, whence dS/S = r dt, and so the relative increment in the price
of the asset is ΔS/S = rΔt. For a stochastic quantity we immediately generalize to
assume that the relative increment has both a deterministic component, as above, and
an additional stochastic component,
$$\frac{\Delta S}{S} = \alpha\,\Delta t + \beta\,\Delta W,$$
for some constants α and β called the stock drift and the stock volatility, respectively.
Figure 1.6 plots ΔS/S for cotton prices from which one could roughly estimate the
magnitude of α and β; Algorithm 1.3 provides code to numerically estimate α and β.
Rearranging and writing in terms of differentials, we suppose throughout our
applications to finance that assets satisfy the SDE
$$dS = \alpha S\,dt + \beta S\,dW \tag{1.2}$$
Algorithm 1.3 In MATLAB/SCILAB, given h is the time step, this code estimates the stock
drift and stock volatility from a time series of values s. In SCILAB use $ instead of end.
dx=diff(s)./s(1:end-1)
alpha=mean(dx)/h
beta=std(dx)/sqrt(h)
[Figure 1.6: two panels plotting ΔX_j = ΔS_j/S_j against year, 1970 to 2000: (a) daily increments; (b) yearly increments.]
Figure 1.6. Relative increments, ΔS/S, of the cotton prices of Figure 1.1 as a
function of time: (a) shows daily drift and volatility coefficients of α = 2.77×10⁻⁴/day
and β = 0.0181/√day, respectively; (b) shows that this translates into α = 6.9%/year and
β = 28.6%/√year.
with drift μ = αS and volatility σ = βS. The only straightforwardly observed
departure from the model (1.2) is that financial indices generally have larger negative
“jumps” than predicted by the Wiener dW: that is, rare falls in the financial market
are too big for a Wiener process; such rare falls reflect financial “crashes.”
However, for most purposes the stochastic model (1.2) is sound. We use it throughout our
exploration of finance.
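To see the first two refinements at work, here is a minimal Python sketch in the spirit of Algorithm 1.2 (which is MATLAB/SCILAB). It builds several processes with constant drift and volatility from one Wiener realization. The helper names wiener_path and additive_process and the seed are illustrative assumptions; plotting is omitted.

```python
import math
import random

def wiener_path(n, t_end, rng):
    """Sample W at n+1 equally spaced times on [0, t_end], with W(0) = 0."""
    h = t_end / n
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + math.sqrt(h) * rng.gauss(0.0, 1.0))
    return w

def additive_process(w, t_end, s0, mu, sigma):
    """S_j = s0 + mu*t_j + sigma*W_j, which solves dS = mu dt + sigma dW for constant mu, sigma."""
    n = len(w) - 1
    return [s0 + mu * (j * t_end / n) + sigma * w[j] for j in range(n + 1)]

rng = random.Random(2008)
w = wiener_path(300, 1.0, rng)
# The same (drift, volatility) pairs as Algorithm 1.2, all scaled from the one realization w.
paths = {(mu, sigma): additive_process(w, 1.0, 0.0, mu, sigma)
         for mu, sigma in [(0, 1), (2, 0.5), (2, 2), (-1, 0.3), (-1, 3)]}
# With sigma = 0 the price follows the straight line S0 + mu*t.
no_noise = additive_process(w, 1.0, 1.0, 2.0, 0.0)
print(no_noise[-1])
```

The pair (0, 1) reproduces the Wiener realization itself, and the zero-volatility case shows μ acting as the slope of the price.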
Table 1.1. Annual closing price of wheat and its relative increments. The mean
and standard deviation of the last column estimate the stock drift and stock volatility.
Year    Close S_j    ΔS_j      ΔS_j/S_j
2001    $3.00         $0.42     0.14
2002    $3.42         $0.36     0.11
2003    $3.78        −$0.27    −0.07
2004    $3.51        −$0.23    −0.07
2005    $3.28         $1.48     0.45
2006    $4.76
Example 1.4 (stock drift and volatility). The price of wheat in US$ per bushel closed at
the prices shown in the second column of Table 1.1. Given this data, estimate the stock
drift and stock volatility of the price of wheat.
Solution: Compute the changes in the price as shown in the third column of Table 1.1:
$\Delta S_j = S_{j+1} - S_j$. Compute the relative changes in the price as shown in the fourth column
of Table 1.1: $(\Delta S_j)/S_j = (S_{j+1} - S_j)/S_j$. The average of these five numbers in the fourth
column estimates the stock drift: α ≈ 0.11 = 11% per year. The standard deviation of
these five numbers in the fourth column estimates the stock volatility: β ≈ 0.21 = 21%
per √year.
Of course, these estimates are extremely crude because of the small amount of data
supplied by Table 1.1.
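The arithmetic of this example can be reproduced with a short Python sketch that mirrors Algorithm 1.3 (itself MATLAB/SCILAB). The function name estimate_drift_volatility is an illustrative assumption. The sketch uses the sample standard deviation (divisor n − 1), which is what the quoted 21% corresponds to.

```python
import math
import statistics

def estimate_drift_volatility(prices, h):
    """Estimate stock drift and stock volatility from prices sampled every h time units."""
    rel = [(s_next - s) / s for s, s_next in zip(prices, prices[1:])]  # relative increments
    alpha = statistics.fmean(rel) / h
    beta = statistics.stdev(rel) / math.sqrt(h)  # sample standard deviation, divisor n - 1
    return alpha, beta

wheat = [3.00, 3.42, 3.78, 3.51, 3.28, 4.76]  # closing prices from Table 1.1
alpha, beta = estimate_drift_volatility(wheat, h=1.0)
print(f"alpha ~ {alpha:.2f} per year, beta ~ {beta:.2f} per sqrt(year)")
```

With only five increments, dividing by n rather than n − 1 would change the volatility estimate noticeably, which is another sign of how crude such a small sample is.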
Definition 1.5. An Ito process satisfies an SDE such as (1.1).
This definition is ill-defined, as we have not yet pinned down precisely what is meant
by dS = μdt + σdW. Precision comes in Chapter 4. For the moment we continue to
develop and work with an intuitive understanding of the symbols. Finally, the Doob–Meyer
decomposition theorem (which I will not prove) asserts that any given Ito process X(t)
has a unique drift μ and volatility σ. Thus we are assured that there is a one-to-one
correspondence between SDEs, dX = μdt + σdW, and the stochastic process which is its
solution X(t).⁶ Figure 1.7 shows that by zooming into an Ito process, we find that at the
smallest scales the process looks like one with linear drift and constant volatility. Thus the
decomposition of the process into the SDE form dX = μdt + σdW is justified because
this form applies on small scales: we just imagine that a host of such little pictures can be
“pasted” together to form the large scale process X(t).
Summary
A stochastic process X(t) satisfies an SDE, symbolically written dX =μdt +σdW , with
some drift μ and some volatility σ. Financial assets satisfy particular SDEs of the form
dS =αSdt +βSdW .
⁶ Note that the term “process” applies to the entire ensemble of realizations of a stochastic function. A
stochastic process is not just any one realization because each realization is markedly different. Indeed, an
SDE gives rise to an infinitude of realizations. It is the ensemble of possible solutions with their probability
of being realized that is included within the term “stochastic process.”
Figure 1.7. Five realizations of the Ito process X = exp(t +0.2W(t)) shown at
different levels of magnification (the t axis is horizontal and X(t) is plotted vertically). The
top left displays the largest scale showing the exponential growth in X(t); the bottom right
displays the smallest scale view of X(t) showing that on small scales it looks just like a
stochastic process with linear drift and constant volatility, as promised by the Doob–Meyer
decomposition.
1.3 Basic numerics simulate an SDE
Section 2.2 in Chapter 2 develops algebraic solutions of SDEs. Here we resort to a simple
numerical technique to approximate solutions of SDEs. The numerical solution of SDEs
such as that for assets (1.2) or the general SDE (1.1) is based on replacing the infinitesimal
differentials with corresponding finite differences. In a sense this undoes the limit h →0
discussed in the previous section.
1.3.1 The Euler method is the simplest
Rewrite the general SDE (1.1) in terms of finite differences (increments)
$$\Delta S = \mu\,\Delta t + \sigma\,\Delta W,$$
and then evaluate it at the jth time step, recognizing that the drift and volatility are generally
functions of S and t so that
$$\Delta S_j = S_{j+1} - S_j = \mu(S_j, t_j)\,\Delta t_j + \sigma(S_j, t_j)\,\Delta W_j ,$$
Algorithm 1.4 Code for simulating five realizations of the financial SDE dS = αSdt +
βSdW. The for-loop steps forward in time. As a second subscript to an array, the colon
forms a row vector over all the realizations.
alpha=1, beta=2
n=1000; m=5;
t=linspace(0,1,n+1)';
h=diff(t(1:2));
s=ones(n+1,m);
for j=1:n
  dw=randn(1,m)*sqrt(h);
  s(j+1,:)=s(j,:)+alpha*s(j,:)*h+beta*s(j,:).*dw;
end
plot(t,s)
where $\Delta W_j$ are independent random samples from $N(0, \Delta t_j)$. Rearranging for $S_{j+1}$ gives
the recursion
$$S_{j+1} = S_j + \mu(S_j, t_j)\,\Delta t_j + \sigma(S_j, t_j)\,\Delta W_j . \tag{1.3}$$
This form allows the time steps to be different. The Euler method (1.3) is a most general
and simple method to numerically simulate the realizations of SDEs.
Usually, for simplicity, we choose a fixed step in time of $\Delta t_j = h$, whence $t_j = jh$ and
$\Delta W_j = \sqrt{h}\,Z_j$ for random variables $Z_j \sim N(0, 1)$ so that the Euler method (1.3) is typically
invoked as
$$S_{j+1} = S_j + \mu(S_j, t_j)\,h + \sigma(S_j, t_j)\sqrt{h}\,Z_j . \tag{1.4}$$
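The fixed-step recursion (1.4) translates almost line for line into code. The following is a minimal Python sketch, not the book's MATLAB/SCILAB listing; the function name euler_sde, the seed, and the choice α = 1, β = 0.5 are illustrative assumptions. The drift and volatility are passed as functions of (S, t) so that the same routine covers the general SDE (1.1).

```python
import math
import random

def euler_sde(mu, sigma, s0, t_end, n, rng):
    """Euler recursion S_{j+1} = S_j + mu(S_j,t_j) h + sigma(S_j,t_j) sqrt(h) Z_j with fixed step h."""
    h = t_end / n
    s = [s0]
    for j in range(n):
        t_j = j * h
        z = rng.gauss(0.0, 1.0)  # Z_j ~ N(0, 1), drawn independently at each step
        s.append(s[j] + mu(s[j], t_j) * h + sigma(s[j], t_j) * math.sqrt(h) * z)
    return s

# Financial SDE dS = alpha*S dt + beta*S dW with illustrative alpha = 1, beta = 0.5.
alpha, beta = 1.0, 0.5
rng = random.Random(1)
path = euler_sde(lambda s, t: alpha * s, lambda s, t: beta * s, 1.0, 1.0, 1000, rng)

# With the volatility switched off, the recursion is the deterministic Euler method for dS/dt = S.
calm = euler_sde(lambda s, t: alpha * s, lambda s, t: 0.0, 1.0, 1.0, 1000, rng)
print(len(path), calm[-1])
```

Switching the volatility off is a convenient sanity check: the result should be close to e, the exact value of the deterministic solution at t = 1.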
Example 1.6 (exponential Brownian motion). The solutions of the SDE dS = αS dt +
βS dW are collectively called exponential Brownian motion. For any given α and β, the
Euler method (1.4) reduces to
$$S_{j+1} = S_j + \alpha S_j h + \beta S_j \sqrt{h}\,Z_j . \tag{1.5}$$
With the initial condition that $S_0 = 1$, Algorithm 1.4 shows how this method may be
implemented in MATLAB/SCILAB to generate many different realizations.
For three different combinations of stock drift α and volatility β, Figures 1.8–1.10
plot five realizations of the solution S(t). See that increasing stock drift α indeed increases
the rate of growth, and that increasing the stock volatility β indeed increases the level of
fluctuations.
Notice that in this last case with relatively large volatility, Figure 1.10, the realizations,
apart from a couple of large excursions, generally seem to stay smallish in magnitude.
This is very strange, since the deterministic part of the SDE indicates that the solutions
should grow. However, it is not so. Noise can act to stabilize growth; in financial
applications, high volatility can act to keep stocks low even if they would otherwise increase. This
surprising result is supported by later algebraic results.
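One way to probe this observation without any algebra is to simulate many realizations and compare the typical (median) final value with the average. The Python sketch below does this for α = 1 and β = 2 using the Euler recursion (1.5); the function name final_values, the seed, and the choice of 200 steps and 1000 realizations are illustrative assumptions.

```python
import math
import random
import statistics

def final_values(alpha, beta, n, m, rng):
    """Euler method (1.5) on 0 <= t <= 1 with n steps; returns S(1) for m independent realizations."""
    h = 1.0 / n
    finals = []
    for _ in range(m):
        s = 1.0  # initial condition S_0 = 1
        for _ in range(n):
            s += alpha * s * h + beta * s * math.sqrt(h) * rng.gauss(0.0, 1.0)
        finals.append(s)
    return finals

rng = random.Random(42)
finals = final_values(alpha=1.0, beta=2.0, n=200, m=1000, rng=rng)
median_s = statistics.median(finals)
mean_s = statistics.fmean(finals)
print(f"median S(1) ~ {median_s:.2f}, mean S(1) ~ {mean_s:.2f}")
```

One would expect the median to sit below the starting value S_0 = 1 while the mean is pulled upward by a few large excursions, which is consistent with what Figure 1.10 suggests.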
This simple Euler method (1.4) for an SDE looks just like a Markov chain, as dis-
cussed in stochastic process modelling (see, e.g., Kao 1997). The difference is that instead
of discrete states, here we have a continuum of states parametrized by the asset price S.
[Figure 1.8: plot of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.8. Simulation of the exponential Brownian motion of asset values with
stock drift α = 1 and stock volatility β = 0.5 as generated by Algorithm 1.4.
[Figure 1.9: plot of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.9. Simulation of the exponential Brownian motion of asset values with
larger stock drift α = 2 and stock volatility β = 0.5 as generated by Algorithm 1.4.
[Figure 1.10: plot of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.10. Simulation of the exponential Brownian motion of asset values with
stock drift α = 1 and larger stock volatility β = 2 as generated by Algorithm 1.4.
But in the numerical code, time is still discrete. At times $t_j$ the system is in the state of the
asset having price $S_j$. At the next time $t_{j+1}$, the system makes a transition to a state $S_{j+1}$,
a distance $\mu_j h + \sigma_j\sqrt{h}\,Z_j$ away, where the stochastic aspect enters through the normally
distributed random variable $Z_j$. In stochastic process modeling we investigate probability
distributions, whereas here for the moment we focus on realizations. Chapter 3 addresses
probability distributions of solutions to SDEs.
Why use MATLAB/SCILAB?
True, the financial industry standard is to use spreadsheets for almost all numerical
computation. However, spreadsheets are many orders of magnitude slower than script and
programming languages (furthermore, common spreadsheet programs have notorious
errors). I do not endorse crippling your power by limiting yourself to inefficient tools.
Instead, I seek to empower you to manage large-scale numerical problems through
MATLAB/SCILAB. I quote from David Chadwick, Stop The Subversive Spreadsheet!:⁷
“The presence of a spreadsheet application in an accounting system can subvert
all the controls in all other parts of that system.” So says Ray Butler of the
Computer Audit Unit, HM Customs and Excise and Ray should know. For
ten years he has been investigating errors in spreadsheets used by companies
for calculating their VAT payments. Over this period, he has collected use-
ful data on types and frequencies of errors as well as on the effectiveness of
⁷ http://staffweb.cms.gre.ac.uk/~cd02/EUSPRIG
different audit methods not only those used by VAT officers but also those in
use by other auditors. Ray has written extensively about the phenomenon of
errors in spreadsheets (an excellent example is to be found in his article
entitled ‘The Subversive Spreadsheet’). It seems surprising that Ray would find
such a problem in such a straightforward well-defined business application but
as he is quick to point out “Even in a domain such as indirect taxation, which is
characterised by relatively simple calculations, relatively high domain
knowledge by developers, and generally well-documented calculation rules, the use
of spreadsheet applications is fraught with danger and errors.”
Ray is not alone in his interest of spreadsheet risks. Chris Conlong of the
Business Modelling Group at KPMG Consulting is also only too aware of
the problems and, when asked, frequently refers to the findings of a KPMG
survey of financial models based on spreadsheets. The survey found that 95%
of models were found to contain major errors (errors that could affect decisions
based on the results of the model), 59% of models were judged to have ‘poor’
model design, 92% of those that dealt with tax issues had significant tax errors
and 75% had significant accounting errors.
These figures are truly astounding and if extrapolated to all major organisations
throughout the world hint at potential disaster scenarios just waiting to happen.
A colleague recently remarked “Spreadsheet errors are a business time-bomb
waiting to go off. They’re a bit like the millennium bug — nobody knew
the time-bomb was there until it was pointed out and then everybody knew
and knew when it would happen. However, with the spreadsheet problem
few people know that there is a time-bomb at all and none knows when their
particular bomb may go off.”
1.3.2 Convergence is relatively slow
As with any numerical method, we expect the numerical solution to approach the true
solution as the time step h → 0. This is true for the Euler method (1.3), but the rate of
convergence is slow: for a deterministic differential equation the error of the Euler method
is generally $O(h)$, that is, the error decreases in proportion to h; for an SDE the error of the
Euler method is the larger $O(\sqrt{h})$. For example, for a deterministic differential equation
with time steps of h = 0.001 we would expect an error of about 0.1%, whereas for an SDE
the error would be about $\sqrt{0.001} \approx 0.03 = 3\%$. The relatively slow convergence of numerical
solutions of SDEs is difficult to overcome.
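The deterministic half of this comparison is easy to check. The Python sketch below applies the Euler method to the test equation dS/dt = S with S(0) = 1, whose exact value at t = 1 is e; the test equation and the function name euler_growth are illustrative assumptions, not taken from the text. It also evaluates the square root quoted for the SDE case.

```python
import math

def euler_growth(h):
    """Deterministic Euler method for dS/dt = S, S(0) = 1, integrated to t = 1."""
    n = round(1.0 / h)
    s = 1.0
    for _ in range(n):
        s += s * h
    return s

err_coarse = abs(euler_growth(0.01) - math.e)
err_fine = abs(euler_growth(0.001) - math.e)
print(f"error ratio for a tenfold smaller step: {err_coarse / err_fine:.1f}")
print(f"sqrt(0.001) = {math.sqrt(0.001):.4f}")
```

A tenfold reduction in h reduces the deterministic error by a factor close to ten, whereas an error proportional to √h would shrink only by a factor of about three.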
For now we illustrate just the convergence. One crucial issue for SDEs is that
different realizations of the Wiener process (the noise) generate quite different looking
realizations of the solution S(t). See the different realizations in each of the above figures, for
example. Thus, to examine convergence we must retain the one realization of the Wiener
process as we vary the step size h. Consequently, in the following example we generate the
Wiener process first with the finest time step, then sample it with increasingly coarse time
steps.
Example 1.7 (convergence to exponential Brownian motion). Superimpose plots of
numerical solutions of the SDE dS = S dt + 1.5S dW with initial condition $S_0 = 1$ for
different time steps h to show qualitatively the relatively slow convergence as h → 0.
[Figure 1.11: plot of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.11. One realization of the exponential Brownian motion of asset values
with stock drift α = 1 and stock volatility β = 1.5 as generated by Algorithm 1.5 with
different time steps h = 1/16 (cyan), 1/64 (red), 1/256 (green), and 1/1024 (blue) to
illustrate convergence as h → 0.
[Figure 1.12: plot of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.12. One realization of the exponential Brownian motion of asset values
with stock drift α = 1 and stock volatility β = 1.5 as generated by Algorithm 1.5 with
different time steps h = 1/16 (cyan), 1/64 (red), 1/256 (green), and 1/1024 (blue) to
illustrate convergence as h → 0.
Algorithm 1.5 Code for starting one realization with a very small time step, then repeating with
time steps four times as long as the previous.
alpha=1, beta=1.5
n=4^5;
t=linspace(0,1,n+1)';
w=[0;cumsum(sqrt(diff(t)).*randn(n,1))];
for col=['b' 'g' 'r' 'c']
  dt=diff(t); dw=diff(w);
  s=ones(n+1,1);
  for j=1:n
    s(j+1)=s(j)+alpha*s(j)*dt(j)+beta*s(j)*dw(j);
  end
  plot(t,s,col), hold on
  t=t(1:4:end); w=w(1:4:end); n=n/4;
end
Adapt the code used before. Algorithm 1.5 finds numerical solutions on 0 ≤ t ≤ 1
with n steps of length h = 1/n; first with $n = 4^5 = 1024$, then increasing the step length h
by a factor of 4 and correspondingly decreasing n by a factor of 4 each time we draw
a new approximation. We generate the Wiener process, W(t), with the smallest step
length, and then for each approximation we sample the one realization of W(t) with a
step four times larger than the previous approximation by subsampling statements such as
w=w(1:4:end).
Executing Algorithm 1.5 twice generated Figures 1.11–1.12, one for each of the two
different realizations of the Wiener process W(t). In each figure the numerical solutions
do seem to converge, albeit a little slowly.
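The subsampling idea of Algorithm 1.5 can also be written in Python; what follows is a minimal sketch rather than a translation of the MATLAB/SCILAB listing. The function name euler_on_path and the seed are illustrative assumptions, and the plotting is replaced by printing the final value of each approximation.

```python
import math
import random

def euler_on_path(alpha, beta, t, w):
    """Euler method for dS = alpha*S dt + beta*S dW along the sampled Wiener path (t, w)."""
    s = [1.0]  # initial condition S_0 = 1
    for j in range(len(t) - 1):
        dt = t[j + 1] - t[j]
        dw = w[j + 1] - w[j]
        s.append(s[j] + alpha * s[j] * dt + beta * s[j] * dw)
    return s

rng = random.Random(7)
n = 4**5
t = [j / n for j in range(n + 1)]
w = [0.0]
for j in range(n):  # one Wiener realization on the finest grid
    w.append(w[j] + math.sqrt(t[j + 1] - t[j]) * rng.gauss(0.0, 1.0))

solutions = {}
while n >= 16:
    solutions[n] = euler_on_path(1.0, 1.5, t, w)
    t, w, n = t[::4], w[::4], n // 4  # keep every fourth sample of the same realization
for steps, s in solutions.items():
    print(f"h = 1/{steps}: S(1) ~ {s[-1]:.3f}")
```

Because every coarse grid is a subset of the finest one, all four approximations are driven by the same noise, so any differences among them are due to the step size alone.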
Suggested activity: Do at least Exercise 1.7(1).
Summary
The Euler method (1.4) numerically computes realizations of SDE (1.1) provided the
time step is small—generally use a time step h ≈ 0.001 or less. In a free online article,
Higham (2001) presents more information on such basic methods for SDEs and their con-
vergence and points to more sophisticated and accurate schemes.
1.4 A binomial lattice prices call option
We use the principle that there cannot be risk-free profit, called arbitrage, to price options
in stochastic finance. The arguments employed here to introduce the rational pricing of
options look like a numerical approximation of SDEs. But we simplify even more than
previously by not only discretizing time but also discretizing asset prices. The analysis
presented here is based on research by Black, Scholes, and Merton (circa 1973) which later
was simplified, as we now see.⁸
⁸ Read A calculus of risk by Gary Stix in the May 1998 issue of Scientific American. Chapters 2 and 3 of
the book by Stampfli and Goodman (2001) provide reading to supplement this section.
Interestingly, the application in Example 1.12 shows that such valuation of options
fully justifies taking action on global warming almost irrespective of the actual probability
that global warming is occurring!
Definition 1.8. A European call option gives the buyer the right but not the obligation to
purchase an asset from the seller at a previously agreed price on a particular date.⁹ The
agreed price is called the exercise price or the strike price.
For definiteness we refer to the seller of a call option as Alice and to the buyer as
Bob.¹⁰ The call option: is sold by Alice so she obtains money to invest elsewhere; is
exercisable by Bob at the fixed price at the specified date, but can be abandoned by Bob
without penalty.
Insurance is a form of call option
Bob buys an insurance policy for his car from Alice, the insurance company. If, during
the ensuing year, Bob damages his car so that it is worth less, then he makes an insur-
ance claim—Bob exercises the option. Alternatively, if during the year Bob’s car remains
undamaged so it is worth the same (ignoring normal wear and tear), then Bob makes no
claim—Bob does not exercise the option and Alice keeps the price of the insurance policy
as profit. That Bob makes a claim on an insurance policy (or not) is closely analogous to
Bob exercising an option (or not).
Expiry value of a call option
At the expiry date, Bob is not going to buy the asset from Alice for a price above what he
could pay on the open market. Thus if the listed price of the asset is ultimately below the
strike price, then the option is worthless and Bob tosses it away. However, if the listed price
is higher than the strike price of the call option, then Bob will exercise his option and buy
the asset from Alice so that, if nothing else, he could then dispose of the asset at a profit.
The value of the call option to Bob at the expiry date is thus
C =max{0, S−X} ,
where S is the price of the asset and X is the exercise (strike) price. A call option has some
nonnegative value because, depending on the vagaries of market fluctuations, it may be
worth either nothing or something, but is never a liability.
Example 1.9 (trade a call option). Alice holds ten units of Telecom shares, an asset, at
the start of the year; we say altogether they are worth $35. She sells Bob a call option
for $4, allowing him to buy the ten shares at the end of the year at the strike price of
X = $38.50.
⁹ An American call option is the same but additionally allows the buyer the right to purchase the asset
before the particular date. Throughout this section we discuss only the easier European call option and the
corresponding put option.
¹⁰ Alice and Bob first made a name for themselves in helping to solve problems in quantum physics. They
made their contribution in the area of quantum cryptography, where they developed methods of foiling that
dastardly eavesdropping spy, known only by her code name, Eve. Since then Alice and Bob have worked
for us in financial theory.
If the price of the ten shares on the open market goes up 25% to S = $43.75, then Bob
will exercise his right to buy the shares from Alice for $38.50 and sell them immediately
for $43.75, which gives him $1.25 profit because he originally paid Alice $4 for the call
option. Alternatively, if the shares drop in price by 20% to S = $28, then Bob will not
exercise his option because he would lose money if he did (over and above the $4 he
originally paid for the option).
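The expiry value C = max{0, S − X} and Bob's resulting profit or loss in the two scenarios can be tabulated with a few lines of Python. This is a minimal sketch; the function names call_payoff and buyer_profit are illustrative assumptions, and, like the example, it ignores any interest Bob could have earned on the $4.

```python
def call_payoff(s, x):
    """Value at expiry of a call option: C = max{0, S - X}."""
    return max(0.0, s - x)

def buyer_profit(s, x, premium):
    """Bob's net result at expiry after paying the premium for the option."""
    return call_payoff(s, x) - premium

strike, premium = 38.50, 4.00
for s in (43.75, 28.00):
    print(s, call_payoff(s, strike), buyer_profit(s, strike, premium))
```

In the up state the option is worth $5.25 and Bob nets $1.25; in the down state the option is worthless and Bob is out of pocket by the $4 he paid.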
Questions: In this example, is the $4 that Bob paid to Alice a fair price? Should
Alice have asked for more, or should Bob have insisted on less? How can we decide?
We answer these questions by a model that is a Markov chain and hence looks like
a numerical solution of an SDE. Section 2.3 examines the continuous time, continuous
state version using more theory of SDEs. The principle we use is that of arbitrage: there
cannot exist the opportunity for risk-free profit. In this context, “profit” means an amount
over and above that which is obtained by investing in secure bonds. We discover that if
the call option is valued too highly, then Alice could form a portfolio of the asset and sell
options so that she is guaranteed to profit. Alternatively, if the call option is valued too low,
then Bob could buy the options with money obtained through selling the asset so that he is
guaranteed to profit. It is only when the call option is precisely and correctly valued that
nobody is guaranteed to do better than invest in bonds. With a correct valuation of options,
people must accept risk in order to better their return from bonds.
1.4.1 Arbitrage value of forward contracts
Before proceeding to the interesting case of options, let us introduce arbitrage in its appli-
cation to valuing forward contracts.
Example 1.10 (Telecom shares). Instead of giving the option to Bob of buying the Tele-
com shares of Example 1.9, suppose Alice and Bob write a forward contract: The forward
contract states that Bob will definitely purchase the asset, the ten Telecom shares, from
Alice for $38.50 at the end of one year. However, the contract involves risk: If the market
value goes above $38.50, then Bob gains because he obtains the asset at a cheaper price
than the open market price, and Alice loses; if the market value stays below $38.50, then
Bob loses because he must buy at a higher price than the open market price, and Alice cor-
respondingly gains. Who should pay what to whom at the start of the year for this forward
contract? Arbitrage determines the value of the forward contract.
Solution: Suppose Bob pays Alice F_0 for the forward contract (negative F_0 means
Alice pays Bob). Alice avoids risk by purchasing the asset at the start of the year for $35
with her own funds; she then has the asset of the shares ready for sale at the agreed price at
the end of the year. Since Bob pays her F_0 for the forward contract, Alice invests only
$35 − F_0 into this portfolio of the asset and the forward contract. Alice makes this investment to
receive $38.50 at the end of the year, risk free.
Arbitrage asserts that any risk-free return must be the same as investing in bonds.
Alternatively, Alice could invest this amount, $35 − F_0, in risk-free bonds paying, say,
12% interest over the year. Thus Alice could alternatively obtain a risk-free return of
1.12($35 − F_0) by investing in bonds. Thus arbitrage asserts $38.50 = 1.12($35 − F_0).
Rearranging, F_0 = $35 − $38.50/1.12 = $0.625. That is, Bob should pay Alice 62.5 cents
for this forward contract.
[Figure 1.13: state transition diagram with arrows from S = 35 to Sd = 28 and to Su = 43.75.]
Figure 1.13. State transition diagram for a simple binomial model of asset prices
over one period: The prices either go up by 25% or down by 20%.
[Figure 1.14: binomial tree from S = 35, C = ? branching up to Su = 43.75, C_u = 5.25 and down to Sd = 28, C_d = 0.]
Figure 1.14. Rotate the states of Figure 1.13 90° and plot time horizontally to
give a simple binomial tree model of asset prices over one period.
Now use arbitrage to value a general forward contract. Suppose Alice and Bob
contract for Bob to purchase an asset from Alice at time T for some agreed price X. Let Bob
and Alice value this forward contract at F_0 at the start of the period, that is, Bob pays
Alice F_0 to sign the contract. To avoid risk, Alice could purchase the asset immediately for
its known current price S_0. She would invest S_0 − F_0 of her funds to do this and would
then receive X at expiry time T from Bob, risk free. Being risk free, this return must be
the same as what Alice would get by investing the same amount, S_0 − F_0, in bonds. At the
expiry time T the bonds would be worth R(S_0 − F_0) for some R reflecting the interest rate
of the bonds. Arbitrage asserts X = R(S_0 − F_0), which rearranged determines
$$F_0 = S_0 - X/R \tag{1.6}$$
to be the value of a forward contract.
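As a quick check, formula (1.6) can be evaluated for the Telecom shares of Example 1.10. The Python sketch below is illustrative only; the function name forward_value is an assumption, and R = 1.12 is the 12% bond return used in that example.

```python
def forward_value(s0, x, r):
    """Arbitrage value F0 = S0 - X/R of a forward contract, equation (1.6)."""
    return s0 - x / r

f0 = forward_value(s0=35.0, x=38.50, r=1.12)
print(f"F0 = {f0:.3f}")
# Consistency check: Alice's outlay S0 - F0, grown at the bond rate R, should return X.
print(f"R*(S0 - F0) = {1.12 * (35.0 - f0):.2f}")
```

The result reproduces the 62.5 cents found in Example 1.10. A negative F_0, which occurs whenever X/R exceeds S_0, would mean that Alice pays Bob.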
Analogous reasoning values options: we need to value Alice’s investment and to
discount the expiry value by the bond rate.
1.4.2 A one step binomial model
Our initial simple analysis rests on a Markov chain with just three states representing the
possible prices of an asset now and in a year’s time. For definiteness we continue with
the numbers used in Example 1.9; Figure 1.13 shows the state transition diagram. From
the current state of the asset price being S = $35 , here we restrict our attention to the two
possibilities that the price after a year has either risen by 25% or fallen by 20%; that is, the
asset price is multiplied by a factor of u = 1.25 or d = 0.80, respectively. I do not record
any probabilities for the transitions from the current price S because the probabilities of the
price increasing or decreasing are irrelevant!
It is usual in financial applications to arrange the states vertically and to plot time
horizontally as shown in Figure 1.14. Additionally we include in this diagram the value
of the call option on the asset at the end of the year: if the asset price goes up, the call
option is worth C_u = $43.75 − $38.50 = $5.25, as the strike price agreed to between Alice
and Bob was X = $38.50, whereas if the asset price goes down the call option is worthless,
C_d = $0, because Bob, if he were to buy the asset, would prefer to purchase it at the open
market rate of S = $28 rather than from Alice for X = $38.50. But how do we determine
the value, C, of the call option at the start of the year?
A risk-free profit from overpriced options
Suppose Alice wants to buy the asset of the Telecom shares. One possible way to get
the requisite $35 is to invest $23 of her own money and to sell three call options to Bob at
the valuation of $4 each. After one year, if the asset price goes down, then the options are
worthless and Alice’s asset, and hence her hypothetical portfolio, is worth $28; whereas if
the asset price goes up, she has to buy two more lots of Telecom shares, costing $43.75×2,
in order to satisfy Bob’s demand for the three lots of shares, for which Alice receives
$38.50×3, and hence at the end of the year Alice will have $38.50×3−$43.75×2 =$28 .
The hypothetical portfolio’s value of $28, irrespective of whether the asset price rises or
falls, is better than what Alice could have gained if she had invested her $23 in bonds at,
say, 12% interest, because at the end of the year that would have only been worth $25.76.
Alice makes a risk-free profit!
This risk-free profit shows that, at $4, the call options are overpriced. The opportunity
for risk-free profit would result in traders clamoring to sell options, which would flood
the market and hence reduce the price of the option. We soon find that the market forces
the call option to have a definite price.
But before we do that, note that Alice’s choice of selling three call options is the
result of a careful balancing act.
• Suppose Alice had only sold two call options and invested $27 of her own money to
buy the asset; then if the market goes up, her portfolio is worth $38.50×2−$43.75 =
$33.25 ; whereas if the market goes down, it is worth only $28. This latter figure is a
markedly lower return than she could get from investing her $27 in bonds, which at
12% interest would be worth $30.24 at the end of the year. If the asset price increases,
then Alice has done well, but the point is that this good outcome is only achieved by
taking the risk that the portfolio will fall in value relative to bonds.
• Conversely, suppose Alice had sold four call options and invested $19 of her own
money to buy the asset; then if the asset price goes down, her portfolio is worth $28;
whereas if the asset price goes up, her portfolio is worth only $22.75. Although
these are both higher than the bond return, when the options are valued correctly, the
imbalance between the two cases results in a potential loss for Alice, and there is
risk.
Only by selling an appropriate number of options does Alice create a risk-free portfolio.
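To see the balancing act in numbers, the following Python sketch tabulates Alice's outcomes when she sells two, three, or four options at the overpriced $4. All figures come from the example above; the function and variable names are only illustrative.

```python
# Alice holds one lot of the asset and sells n call options (figures from the text).
S0, S_UP, S_DOWN = 35.00, 43.75, 28.00   # asset price now, after a rise, after a fall
STRIKE = 38.50                            # strike price X
OPTION_PRICE = 4.00                       # the overpriced valuation used in the text
BOND_FACTOR = 1.12                        # 12% bond interest over the year


def portfolio_outcomes(n_options):
    """Return (own money invested, value if price rises, value if price falls, bond value)."""
    invested = S0 - n_options * OPTION_PRICE
    call_up = max(0.0, S_UP - STRIKE)      # each sold option costs Alice this if exercised
    call_down = max(0.0, S_DOWN - STRIKE)  # worthless when the price falls
    value_up = S_UP - n_options * call_up
    value_down = S_DOWN - n_options * call_down
    return invested, value_up, value_down, BOND_FACTOR * invested


for n in (2, 3, 4):
    invested, up, down, bonds = portfolio_outcomes(n)
    print(f"n={n}: invest ${invested:.2f}, up ${up:.2f}, down ${down:.2f}, bonds ${bonds:.2f}")
```

Only for three options do the "up" and "down" values coincide; for two or four options the two outcomes differ, which is the risk described in the bullet points.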
Hedge against fluctuations
We endeavour to discover under what conditions it is impossible to make a profit (above the
bond rate) without risk. Look at the issue from Alice’s point of view. Alice could construct
a hypothetical portfolio of a unit of the asset and H call options sold to Bob, initially costing
Alice a net investment of S−HC. If the price of the asset goes down, then the call options
are worthless, and so the value of the portfolio at the end of the year is just that of the asset,
namely Sd =$28 . By the definition of a risk-free portfolio, this value must be the same as
the value of the portfolio even if the asset rises in price—a rise in price increases the value
of the asset, but also increases the value of the call option Alice sold to Bob. If the price of
the asset goes up, then Bob will elect to exercise the call option, and at the end of the year
Alice will have to sell H units of the asset to Bob at a price of X, receiving HX, but at the
expense of having to buy H−1 units of the asset at a price Su. Thus at the end of the year
the portfolio is worth
\[ HX - (H-1)Su = Su - H(Su - X) = Su - HC_u, \]
where \(C_u = \$5.25\) is the net value of each call option if the asset price has gone up. In order
for the portfolio to have the same value at the end of the year irrespective of whether the
asset price has gone up or down, we must then have \(Su - HC_u = Sd\), that is,
\(\$43.75 - H \times \$5.25 = \$28\). Solve this linear equation immediately to give us the magic result that Alice
must sell \(H = 3\) call options to insulate her portfolio from price fluctuations and ensure the
value of $28 at the end of the year.
A fair price for the option
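Now Alice would like the return on this investment to be at least as large as the return from
investing the initial capital, \(S - HC = \$35 - 3C\), in bonds at 12% interest. Thus Alice wants
\[ \$28 \ge 1.12 \times (\$35 - 3C) \quad\Rightarrow\quad C \ge \tfrac{1}{3}\left(\$35 - \$28/1.12\right) = \$\tfrac{10}{3} = \$3.33. \]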
But if the price of the call option is higher than $3.33, then the market will be flooded
with people trying to sell such options. Thus the arbitrage price of the call option must be
C =$3.33 . This is sensibly lower than the overpriced $4 we used earlier.
Value an option first by finding how one could hedge against risk, and second by
equating such a risk-free portfolio to investment in bonds.
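As a check on the arbitrage argument, this Python sketch computes how much Alice's hedged portfolio beats (or trails) bonds for a few option prices. The figures are those of the Telecom example; the name `seller_excess` is purely illustrative.

```python
S0, S_DOWN = 35.00, 28.00   # asset price now and after a fall (figures from the text)
HEDGE = 3                   # options sold so that the portfolio is worth $28 either way
BOND_FACTOR = 1.12


def seller_excess(option_price):
    """Risk-free end value minus what the same capital would earn in bonds."""
    capital = S0 - HEDGE * option_price
    return S_DOWN - BOND_FACTOR * capital


fair_price = (S0 - S_DOWN / BOND_FACTOR) / HEDGE   # price at which the excess vanishes

for price in (3.00, fair_price, 4.00):
    print(f"C = ${price:.2f}: excess over bonds = ${seller_excess(price):+.2f}")
```

A positive excess (as at $4) is a risk-free profit for the seller, a negative excess (as at $3) is a risk-free loss, and the excess vanishes only at the fair price.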
Suggested activity: Do at least Exercise 1.11.
Example 1.11 (Bob wants to buy cheap options). In the converse of Alice’s argument,
Bob may want to buy a call option to protect himself from fluctuations when he sells an
asset.
Solution: The argument is easier if we phrase it as Bob buying x units of the asset,
where x will turn out to be negative to represent that he actually sells the asset. So at the
start of the year Bob buys the three call options at price C from Alice and x units of the
asset at S = $35 each; this costs him $35x +3C which, if Bob had not bought the call
options and invested in bonds instead, would be worth 1.12×($35x+3C) at 12% interest;
if the asset price goes down, Bob’s portfolio will be worth just $28x, as the call options are
worthless; whereas if the asset price goes up, the portfolio is worth $43.75x +3 ×$5.25 .
To be protected against the fluctuations, that is, to be risk free, these last two outcomes have
to be equal:
$28x =$43.75x+3×$5.25 ⇒ x =−1.
Thus Bob sells one unit of the asset as well as buying the three call options. To be worthwhile,
this guaranteed return has to be at least as large as the return he would obtain by
investing in bonds; thus
$28x ≥1.12($35x+3C) ⇒ C ≤$3.33.
Thus Bob would only be interested in buying call options if they were priced less than $3.33;
whereas Alice was only interested in selling them for more than $3.33 . The rational price
for the call option must then be C =$3.33 .
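The step from Bob's inequality to the bound on C can be spelled out as follows. This is only the arithmetic of the example with x = −1 substituted; dividing by 1.12 preserves the direction of the inequality because 1.12 is positive.

```latex
\begin{align*}
\$28 \times (-1) &\ge 1.12\,\bigl(\$35 \times (-1) + 3C\bigr) \\
-\$25 &\ge -\$35 + 3C \\
3C &\le \$10 \\
C &\le \$\tfrac{10}{3} \approx \$3.33 .
\end{align*}
```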
Value a general option
We now derive the general formula for the one-step binomial price of a call option from
the point of view of the seller, Alice. The above example shows that the buyer and seller of
call options have opposing interests that balance at one price only.¹¹ There are two parts to
the analysis: determining the number of options to sell, and determining the fair price for
the options.
The number of options to sell. Refer to the binary tree given earlier but only pay attention
to the algebra. An asset bought by Alice at a cost of \(S\) at the start time may increase in
price by a factor \(u\) to a price \(Su\), or it may fall in price by a factor \(d\) to a price \(Sd\). By
selling \(H\) call options at the start of the year for some strike price \(X\), Alice can protect
herself against fluctuations, ensuring a risk-free outcome. At the end of the period
each call option is worth \(C_u\) or \(C_d\) depending on whether the asset price has risen
or fallen, respectively (so far we have seen only \(C_u = Su - X\) and \(C_d = 0\), but more
generally these have other values). To be risk free the portfolio must have the same
value at the end no matter which outcome eventuated: thus \(Su - HC_u = Sd - HC_d\),
which, rearranged, gives the hedge ratio
\[ H = \frac{Su - Sd}{C_u - C_d}. \tag{1.7} \]
Observe the reasonably intuitive result that, in order to make each outcome have the
same value, the number of call options sold must be the ratio of the range of the asset
price to the range of the option. Thus H is called the hedge ratio.
The fair price. Recall that for a riskless portfolio there can only be rational buyers and
sellers to establish the portfolio if the investment returns the same as the risk-free
bond interest rate. Let \(R = e^r\) be the factor by which bonds increase in price over
the time period so that \(r\) is the equivalent continuously compounded interest rate
over that time period (we used \(R = 1.12\) in the earlier example, that is, \(r = 0.1133 = 11.33\%\)).
Then the original investment by Alice of \(S - HC\), if put into bonds, would
be worth \(R(S - HC)\). Setting the equivalent return in bonds equal to the return of the
risk-free portfolio gives \(R(S - HC) = Su - HC_u \;(= Sd - HC_d)\) which, rearranged,
gives the fair price of a call option as
\[ C = \frac{S(R - u) + HC_u}{HR}. \]
¹¹ This balance between the two different views of a call option is rather like the beautiful relation between
a linear programming problem and its dual that you see in operations research.
However, we obtain a more appealing form of this expression after substituting (1.7)
and rearranging to
\[ C = \frac{1}{R}\left[ pC_u + (1-p)C_d \right], \tag{1.8} \]
\[ \text{where } p = \frac{R-d}{u-d} \text{ and } 1-p = \frac{u-R}{u-d}. \]
In the earlier example, \(p = 0.71\) and \(1 - p = 0.29\). See that the price of the call
option is just a convex combination of the possible values at the end of the period
discounted by the bond factor \(R\) to give a value for the start of the period.¹²
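The two formulas can be checked against the Telecom example with a short Python sketch. The function names are illustrative only; the numbers are those used earlier in this section, and the unsimplified price via the hedge ratio serves as a cross-check on (1.8).

```python
def hedge_ratio(S, u, d, C_up, C_down):
    """Equation (1.7): options sold per unit of asset for a risk-free portfolio."""
    return (S * u - S * d) / (C_up - C_down)


def one_step_price(R, u, d, C_up, C_down):
    """Equation (1.8): discounted convex combination of the two end values."""
    p = (R - d) / (u - d)
    return (p * C_up + (1 - p) * C_down) / R, p


# Telecom example from the text: S = $35, X = $38.50, u = 1.25, d = 0.8, R = 1.12.
S, X, u, d, R = 35.00, 38.50, 1.25, 0.80, 1.12
C_up, C_down = max(0.0, S * u - X), max(0.0, S * d - X)

H = hedge_ratio(S, u, d, C_up, C_down)
C, p = one_step_price(R, u, d, C_up, C_down)
C_via_hedge = (S * (R - u) + H * C_up) / (H * R)   # the unsimplified form, as a cross-check
print(f"H = {H:.2f}, p = {p:.4f}, C = {C:.4f}, via hedge = {C_via_hedge:.4f}")
```

Both routes should give the same price, about $3.33, with three options sold per unit of asset.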
This argument is just used to set the prices of call options. People would not actually
create a risk-free portfolio as given above because if they did, their portfolio would not
do any better than the return from bonds. In general, people buy and sell options just as
they buy and sell assets, namely, based on their own assessment of the future movement of
market prices.
Example 1.12 (support research into global warming). Let us present the case for financing
research projects as an option to ensure a future for humanity. In particular, we explore
justifying a research project into the contentious issue of global warming.¹³ Suppose
a researcher in climatology, called JR for short, submits to a country’s government a project
proposal costing 1 M$ in personnel and equipment, and promising increased knowledge to
ameliorate any significant global warming. Should the government fund the project? How
is the project to be valued?
We value the project as a European (put) option: the government is the buyer of the
option (that is, pays for JR’s project), and JR is the seller of the option. Interestingly, the
project has this value independent of the probability that global warming actually occurs!
This standard financial instrument of options values the project irrespective of the validity
of the doubts and criticisms of the nay-sayers.
Now proceed with the option valuation, with almost entirely invented figures. Suppose
the government funds JR’s project by forming a portfolio of the funded project together
with an investment into the country’s economy. Say the economy is currently valued
at 16 T$.¹⁴ Suppose that the government’s (potential) portfolio is this “bought” option together
with an investment in the economy of a (tiny) fraction φ (phi) of the economy’s value.¹⁵ Look a decade, say,
into the future to value this portfolio. In the spirit of our binomial modeling, either one of
two things will happen:
¹² The weights appearing in this convex combination appear as probabilities. But p and 1 − p have nothing
to do with the actual probability of an up or down step in the asset price. Nonetheless, the theory of
transformations between different probability distributions, the Radon–Nikodym derivative, is based on the
view that such weights do act like probabilities for some purposes.
¹³ Although global warming is now generally accepted, and its dangers estimated, for the past 30 years or
more global warming was a very contentious issue. Our valuation of research projects applies throughout
this history, but such valuation is only recently recognized.
¹⁴ 1 T$ is one tera-dollar, more commonly known as a trillion dollars. 1 G$ denotes one giga-dollar,
otherwise known as a billion dollars. 1 M$ denotes one mega-dollar, that is, a million dollars. I prefer this
scientific notation.
¹⁵ The variable φ is the reciprocal of the hedge ratio H used earlier. The reason is that now the portfolio
is one option and some fraction of the asset (the economy), whereas previously the portfolio was one asset
and any number of options.
• If global warming is a chimera, global temperatures will revert to normal, the economy
will grow, and the knowledge of JR’s project is worthless. Say the economy
grows to 24 T$ over the decade. The value of the portfolio is then simply
\(P_u = \phi \times 24\) T$.
• If global warming is real, global temperatures continue to climb, sea levels rise,
flooding productive land and valuable housing, and the economy suffers badly and
falls to a value of 12 T$. But the knowledge gained by JR’s project empowers the
government to take remedial action to limit the damage caused by global warming;
suppose the knowledge gained by JR’s project saves 3 G$. Consequently, by calling
on the knowledge gained by the research project, the government then values the
portfolio at \(P_d = 3\) G$ \(+\ \phi \times 12\) T$.
Risk-free portfolio
One generates a risk-free portfolio by choosing the (reciprocal) hedge ratio φ so that these
values at the end of the decade are identical: \(P_u = P_d\), that is, φ × 24 T$ = 3 G$ + φ × 12 T$,
which rearranged gives the hedge ratio \(\phi = \tfrac{1}{4} \times 10^{-3}\). At the end of the decade the
risk-free portfolio would thus have value 6 G$, independent of the likelihood of global warming
occurring.
Arbitrage gives initial value
Now suppose that over the decade, bonds will increase by \(33\tfrac{1}{3}\)% (roughly 3% per year).
That is, bonds will increase in value by a factor of R = 4/3. Since the portfolio is risk free,
it must have the same value as investing in bonds. Thus the initial value of the portfolio is
its final value divided by R, namely P = 6 G$/(4/3) = 4.5 G$. Let the present value of JR’s
project be V; then the portfolio would cost V + φ × 16 T$ = V + 4 G$. Consequently, as this
cost must equal the initial value of the portfolio, P = 4.5 G$, the value of JR’s research project
is V = 0.5 G$ = 500 M$.
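The chain of arithmetic in this example is easy to reproduce. The following Python sketch uses only the invented figures quoted above; the variable names are illustrative.

```python
# Invented figures from the example, in dollars.
T, G, M = 1e12, 1e9, 1e6

economy_now = 16 * T
economy_up = 24 * T        # warming is a chimera: economy grows, project knowledge worthless
economy_down = 12 * T      # warming is real: economy falls ...
project_saving = 3 * G     # ... but the project's knowledge saves this much
bond_factor = 4 / 3        # bonds grow by 33 1/3 % over the decade
project_cost = 1 * M

# Risk-free choice of phi: phi*economy_up == project_saving + phi*economy_down.
phi = project_saving / (economy_up - economy_down)
final_value = phi * economy_up

# Arbitrage: discount the certain final value, then subtract the asset part of the cost.
initial_value = final_value / bond_factor
project_value = initial_value - phi * economy_now

print(f"phi = {phi:.2e}, final = {final_value / G:.2f} G$, "
      f"initial = {initial_value / G:.2f} G$, project = {project_value / M:.0f} M$, "
      f"ratio to cost = {project_value / project_cost:.0f}")
```

Note that no probability of warming appears anywhere in the calculation, which is the point of the example.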
Since JR offers to do the project, generating the knowledge, for 1 M$, the government
should value this project, as an option, as 500 times its cost. Research into global warming
is an extremely good value for the money as precautionary insurance, independent of the
likelihood of global warming.
Analogous valuation applies to all such precautionary research projects.
Suggested activity: Do at least Exercise 1.12.
1.4.3 Use a multiperiod binomial lattice for accuracy
Make your analysis more accurate by dividing the relevant time period into more steps
and by correspondingly increasing the number of possible asset prices. The previous two
examples assumed that the time period over which the call option operates is divided into
just one time step. It also assumed that the price of an asset went up or down only in one
of two large steps. Section 2.3 applies SDEs by making the time step infinitesimally small
and asset prices continuous. As an interim stage, we decide now, for example, to model
the asset price movement with quarterly time steps (Δt = h = 1/4 year). Over a year the
asset price then has four increments, so we model the domain of asset prices by a nine-state
Markov chain: we decide on having four prices on either side of the current price S, as
shown in Figure 1.15, to allow for the possibility that the four increments of the price are
either all up or all down. Although by no means necessary, it is convenient to have the
selection of prices be at a constant ratio u = 1/d to their neighboring prices. Thus, for
example, an increment in price followed by a decrement in price will result in the price
being exactly back at the starting value, rather than somewhere in between.

State:  Sd^4   Sd^3   Sd^2   Sd     S      Su     Su^2   Su^3   Su^4
Price:  22.40  25.04  28.00  31.31  35.00  39.13  43.75  48.91  54.69

Figure 1.15. State transition diagram for a simple Markov chain model of asset
prices over multiple time steps. The prices either go up or down by, for example, a factor
of u = 1.1180 or d = 1/u = 0.8944.
But why choose u = 1/d = 1.1180? Recall that for exponential Brownian motion
(which is characteristic of asset prices), in each time step of the numerical approximation (1.5)
the prices are multiplied by a factor
\[ 1 + \beta Z\sqrt{h} \approx e^{\beta Z\sqrt{h}} \]
(upon ignoring the drift term \(\alpha h\)) where \(Z \sim N(0, 1)\). The rough approximation that prices
only either rise or fall is equivalent to assuming that \(Z = \pm 1\), whence the factors
\[ u = e^{\beta\sqrt{h}} \quad\text{and}\quad d = 1/u = e^{-\beta\sqrt{h}}. \tag{1.9} \]
Now here we investigate a quarterly model so that \(h = 1/4\), that is, \(\sqrt{h} = 1/2\). Hence here
the ratio
\[ u = e^{\beta/2} = \sqrt{e^{\beta}} = \sqrt{u_{\text{yearly}}} = \sqrt{1.25} = 1.1180. \]
The general rule is that the multiplicative factor of asset price increments should be
some constant raised to the power of \(\sqrt{h}\). This keeps the volatility comparable in
the approximations over differently sized time steps.
A further reason for making all the increments in price use the same factors u and d
is that it is then easier to apply the formula (1.8), as the parameters u and d being constant
lead to a constant parameter p. However, p also depends upon the bond rate R which also
varies with the time step h. Here we want the interest compounded over four quarters to be
the same as the yearly interest. Thus \(R_{\text{quarterly}}^4 = R_{\text{yearly}}\), and hence in this example,
\[ R_{\text{quarterly}} = 1.12^{1/4} = 1.02873. \]
Using this value for R and the earlier values for the price increments, the parameter
p = 0.6007 and correspondingly 1 − p = 0.3993.
In general, since bonds growing by compound interest increase in value like \(e^{rt}\) for
some rate r, it follows that when sampled at time steps of size h, the appropriate factor
is \((e^r)^h\). That is, the multiplicative factor for the rate of increase in the value of bonds
should be a constant raised to the power of h.
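The two scaling rules (price factors scale with the square root of h, the bond factor with h) can be collected in one small Python sketch. It assumes, as in this section, that d = 1/u; the function name and arguments are illustrative.

```python
import math


def step_factors(u_yearly, R_yearly, steps_per_year):
    """Per-step factors (u, d, R, p) for time step h = 1/steps_per_year years."""
    h = 1.0 / steps_per_year
    beta = math.log(u_yearly)            # since u = exp(beta) over one year, as in (1.9)
    u = math.exp(beta * math.sqrt(h))    # volatility factor scales with sqrt(h)
    d = 1.0 / u
    R = R_yearly ** h                    # bond factor scales with h
    p = (R - d) / (u - d)                # weight in formula (1.8)
    return u, d, R, p


u, d, R, p = step_factors(u_yearly=1.25, R_yearly=1.12, steps_per_year=4)
print(f"u = {u:.4f}, d = {d:.4f}, R = {R:.5f}, p = {p:.4f}")
```

For quarterly steps this should reproduce, to the quoted precision, the values u = 1.1180, d = 0.8944, and p = 0.6007 used in the text.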
[Figure 1.16: binary lattice with time t increasing to the right and price s increasing upward. Each node
is listed below as asset price / option value, highest price first; every node leads to the next-higher
and next-lower price in the following column.]
Start:           35.00 / 3.5
After 1 step:    39.13 / 5.31;  31.31 / 1.05
After 2 steps:   43.75 / 7.90;  35.00 / 1.79;  28.00 / 0
After 3 steps:   48.91 / 11.49;  39.13 / 3.07;  31.31 / 0;  25.04 / 0
After 4 steps:   54.69 / C_{u^4} = 16.2;  43.75 / C_{u^3 d} = 5.25;  35.00 / C_{u^2 d^2} = 0;  28.00 / C_{u d^3} = 0;  22.40 / C_{d^4} = 0
Figure 1.16. Example of a four-step binary lattice to compute the value of a call
option. The asset price is the top number of each pair, and the computed option value is
the bottom number.
Now investigate the four quarterly time steps. Lay out the states vertically with time
horizontally, as we did for the single step and as shown in Figure 1.16. See that because at
each time step the price is assumed to only either rise or fall, at each time step only half the
prices are accessible, and thus only these are drawn at any time. Although included in the
figure, as yet we do not know the value of the call option at each of the nodes. However,
at the end of the year (the last column in the figure), we do know that the value of a call
option is \(C = \max\{0, S - X\}\) for the various asset prices S. These have been recorded and
labeled so that \(C_{u^k d^l}\) denotes the value of the call option after, in any order, k increments
and l decrements in the price. Then determine the value of the call option at all other times
by proceeding from right to left. First, consider the problem of determining the value of
the call option at the start of the fourth quarter following three successive increments in
the price of the asset. The state of the system is that the asset is priced at \(Su^3 = 48.91\),
the upper state in the second column from the right in Figure 1.16. In this quarter of the
year the problem looks just like the simple one-step binomial model we analyzed earlier:
consider just the top-right part of the lattice as in Figure 1.17. At the start of this final
quarter the issues are exactly the same as outlined for the one-step binomial model: Alice
has an asset with which she can adjust her risk-free portfolio by choosing the right number
of call options valued at a strike price of X = $38.50; because it is risk free, arbitrage asserts
it must return the same value as investments in bonds. Thus the formula (1.8) applies with
an appropriate change in subscripts, namely,
\[ C_{u^3} = \frac{1}{R}\left[ pC_{u^4} + (1-p)C_{u^3 d} \right] = [0.6007 \times \$16.19 + 0.3993 \times \$5.25]/1.0287 = \$11.49. \]

[Figure 1.17: the node 48.91 / C_{u^3} = ? leads up to 54.69 / C_{u^4} = 16.2 and down to 43.75 / C_{u^3 d} = 5.25.]
Figure 1.17. Each step in using a multi-step binary lattice to value a call option
looks like a simple one step binomial approximation, such as this extract from the top-right
of Figure 1.16. Thus the previous arguments and formulae apply.
Similar reasoning applies for all other values of the call option at the start of the last quarter.
For example,
\[ C_{u^2 d} = \frac{1}{R}\left[ pC_{u^3 d} + (1-p)C_{u^2 d^2} \right] = \$3.07. \]
Once the values of the call option are determined at the start of the last quarter, the same
reasoning and formula give the values at the start of the third quarter—and so on from right
to left across the tableau to give the values already appearing in Figure 1.16. Thus a four-
step binomial lattice model estimates that the call option with strike price X = $38.50 on
the asset is worth C = $3.50 at the start of the year.
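A single right-to-left step across the lattice can be sketched in Python as follows. The parameters are those of this example; the helper `step_back` is an illustrative name, and the list ordering (lowest price first) is a choice made here for convenience.

```python
U = 1.25 ** 0.5            # quarterly up factor; down factor is 1/U
R = 1.12 ** 0.25           # quarterly bond factor
P = (R - 1 / U) / (U - 1 / U)
S0, STRIKE = 35.00, 38.50


def step_back(values):
    """Apply (1.8) to each adjacent (down, up) pair; values are ordered lowest price first."""
    return [(P * up + (1 - P) * down) / R for down, up in zip(values[:-1], values[1:])]


# Expiry values max(0, S - X) at the five accessible prices S0 * U**k, k = -4, -2, 0, 2, 4.
expiry = [max(0.0, S0 * U ** k - STRIKE) for k in range(-4, 5, 2)]
third_quarter = step_back(expiry)
print([round(c, 2) for c in third_quarter])
```

The last two entries of `third_quarter` correspond to the nodes at prices 39.13 and 48.91; applying `step_back` three more times leaves a single value, the option price at the start of the year.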
Suggested activity: Do at least Exercise 1.15.
Summary
Use expiration values of the call option to determine the values at the third step; use third-step
values for those at the second; use second-step values for those at the first; and finally,
compute the initial call option value.¹⁶ Algorithm 1.6 shows that such computations are
easily done in MATLAB/SCILAB. Each step is based upon the risk-free return of a hypothetical
portfolio of options and assets.
¹⁶ The repeated application of (1.8) makes it look like a numerical approximation to a partial differential
equation (PDE) for the value of the call option as a function of asset price s and time t, namely \(C(t, s)\).
If over a time step of 1 (year or whatever) \(R = e^r\) and \(u = e^\beta\), then for many small time steps of size h,
\(R = e^{rh} \approx 1 + rh\); \(u = e^{\beta\sqrt{h}} \approx 1 + \beta\sqrt{h}\); \(d = e^{-\beta\sqrt{h}} \approx 1 - \beta\sqrt{h}\);
and, substituting these into \(p = (R - d)/(u - d)\), \(p \approx \bigl(1 + (r/\beta)\sqrt{h}\bigr)/2\). Then
expanding \(C(t, s)\) for nearby values in s and t into Taylor series transforms (1.8) into the PDE
\[ \frac{\partial C}{\partial t} - rC + rs\frac{\partial C}{\partial s} + \frac{1}{2}\beta^2 s^2 \frac{\partial^2 C}{\partial s^2} = 0. \]
We later derive this directly from an SDE.
Algorithm 1.6. MATLAB/SCILAB code for a four-step binomial lattice estimate of the value
of a call option.
    u=1.25^(1/2), d=1/u
    r=1.12^(1/4)
    x=38.50
    s=35*u.^(-4:2:4)
    p=(r-d)/(u-d)
    c=max(0,s-x)
    for i=1:4
      c=(p*c(2:end)+(1-p)*c(1:end-1))/r
    end
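For readers without MATLAB/SCILAB, the same computation can be sketched in plain Python. This is an illustrative translation rather than the book's code: it assumes d = 1/u, a one-year option, and takes the number of steps as a parameter (the name `binomial_call` is hypothetical).

```python
def binomial_call(S0, strike, u_yearly, R_yearly, n_steps):
    """Value a European call on an n_steps lattice spanning one year (d = 1/u assumed)."""
    h = 1.0 / n_steps
    u = u_yearly ** (h ** 0.5)       # per-step up factor, as in (1.9)
    d = 1.0 / u
    R = R_yearly ** h                # per-step bond factor
    p = (R - d) / (u - d)
    # Expiry prices S0 * u**k for k = -n, -n+2, ..., n, lowest first.
    values = [max(0.0, S0 * u ** k - strike) for k in range(-n_steps, n_steps + 1, 2)]
    for _ in range(n_steps):         # roll back one time step per pass, using (1.8)
        values = [(p * up + (1 - p) * down) / R
                  for down, up in zip(values[:-1], values[1:])]
    return values[0]


print(f"{binomial_call(35.00, 38.50, 1.25, 1.12, n_steps=4):.2f}")
```

With four steps this should agree with the $3.50 found above, and with one step it reduces to the one-step price of about $3.33.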
1.5 Summary
• Many fluctuating, noisy signals, such as financial indices, are modeled using a Wiener
process, W(t), which has the following properties: W(t) is continuous; W(0) = 0 ;
the change W(t +s) −W(s) ∼ N(0, t) for t, s ≥0 ; and the change W(t +s) −W(s)
is independent of any details of the process for times earlier than s.
• An Ito process, S(t), satisfies an SDE in the form
dS = μ(S, t)dt +σ(S, t)dW,
where μ is called the drift and σ the volatility. Financial prices are assumed to have
drift μ =αS and volatility σ =βS.
• Obtain numerical solutions by discretizing the SDE
\[ \Delta S_j = \mu(S_j, t_j)\,\Delta t_j + \sigma(S_j, t_j)\,\Delta W_j \]
for (very) small time steps \(\Delta t_j\).
• Value financial options by first determining a hedge that would ensure a risk-free
result, and second requiring, by arbitrage, that the risk-free result is the same as that
obtained from secure bonds. A one-step binomial model applies these principles to
value an option as
\[ C = \frac{1}{R}\left[ pC_u + (1-p)C_d \right], \quad\text{where } p = \frac{R-d}{u-d} \text{ and } 1-p = \frac{u-R}{u-d}, \]
in which R is the bond factor, u and d the factors of increase and decrease in the price
of the underlying asset, and \(C_u\) and \(C_d\) the corresponding values of the option at the
end of the period.
• A more accurate multistep binomial lattice model repeatedly applies the above for-
mula backward in time from expiry date to the current time to determine the current
value of an option.
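As a reminder of how the discretized SDE in the summary is used, here is a minimal Python sketch of the Euler scheme for financial prices, with drift αS and volatility βS. The parameter values α = 1 and β = 0.5, the seed, and the function name are arbitrary choices made for illustration only.

```python
import math
import random


def euler_path(alpha, beta, S0, t_end, n_steps, rng):
    """Euler discretization of dS = alpha*S dt + beta*S dW on n_steps equal time steps."""
    h = t_end / n_steps
    path = [S0]
    for _ in range(n_steps):
        dW = math.sqrt(h) * rng.gauss(0.0, 1.0)   # Wiener increment ~ N(0, h)
        S = path[-1]
        path.append(S + alpha * S * h + beta * S * dW)
    return path


rng = random.Random(0)                              # fixed seed, only for repeatability
path = euler_path(alpha=1.0, beta=0.5, S0=1.0, t_end=1.0, n_steps=1000, rng=rng)
print(f"S(1) for this realization: {path[-1]:.4f}")
```

Each call with a different random stream gives a different realization; setting β = 0 removes the noise and leaves ordinary compound growth.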
Exercises
1.1. Get your friends and family to play this simple little game that illustrates a key
aspect of stochastic mathematics in application to finance.
1. Draw on a big sheet of paper a sequence of 30 squares and label them consec-
utively 0 (bankrupt), 1, 1, 2, 2, 3, 3, 4, . . . , 14, 14, 15 (millionairedom). Each
of these squares represents one state in a 30-state Markov chain. Imagine each
state represents the value of some asset such as the value of a small business
that each player is managing.
2. Give each player a token and a six-sided die.
3. At the start place each token on the second “2”, the fifth state. Imagine this
corresponds to the small business having an initial value of $200,000.
4. Each turn in the game corresponds to, say, one year in time. In each year the
business may be poor or may grow. Thus in each turn each player rolls his/her
die and moves according to the following rules:
• If a player rolls a 1 or 2, then he/she moves down some states;
• if a player rolls a 3, he/she stays in the same state;
• if a player rolls a 4, 5, or 6, he/she moves up some states.
But the number of states (squares) a player moves is given by the number
written in each square. Thus in the first move, because the fifth square/state is
a “2,” a player moving up moves from the fifth square to the seventh square,
and a player moving down moves to the third square.
That the number written in each square is (roughly) proportional to the position
of the square in the sequence corresponds to the financial reality that small
businesses usually grow/shrink by small amounts, whereas large companies
grow/shrink by large amounts. Investors expect returns in proportion to their
investments.
5. Each player continues to roll his/her die and move until reaching 0 or 15. That
is, players continue to operate their businesses until they either go bankrupt or
reach millionairedom.
Questions:
1. Why do you expect each business to grow? That is, why do you expect each
player to reach the “millionairedom” state?
2. When you play the game, roughly what proportion of players reach million-
airedom? What proportion go bankrupt?
3. How do you explain the actual results?
1.2. In MATLAB/SCILAB write a script to use randn and cumsum to generate, say,
five realizations of a Wiener process—approximate it by taking n=1000 time steps
over 0 < t ≤ 1 . Plot the realizations. Introduce a drift μ and volatility σ and plot
realizations of the resultant process; for example, investigate μ ∈ {0, ±1, ±3} and
σ ∈ {1/3, 1, 3} .
1.3. Estimate (albeit very crudely) the stock drift and stock volatility of the price of silver
(US$/ounce) from the data in Table 1.2.
Table 1.2. Annual closing prices of silver (US$/ounce).
Year Close
2000 $4.595
2001 $4.650
2002 $4.790
2003 $5.960
2004 $6.845
2005 $8.910
Table 1.3. Annual closing value of the Japanese Nikkei Index (yen).
Year Close
1992 16925.0000
1993 17417.1992
1994 19723.0996
1995 19868.1992
1996 19361.3008
1997 15258.7002
1998 13842.1699
1999 18934.3398
2000 13785.6904
2001 10542.6201
2002 8578.9502
2003 10676.6396
2004 11488.7598
2005 16111.4297
2006 17225.8300
1.4. Estimate (albeit crudely) the stock drift and stock volatility of the value of the
Japanese Nikkei Index (yen) from the data in Table 1.3. Plot the Index as a func-
tion of time and copy and paste the data into MATLAB/SCILAB or an equivalent
program, and compute estimates of the stock drift and stock volatility.
1.5. Implement the Euler code of Example 1.6. Plot numerical solutions for α = 1 and
β = 2 over 0 ≤ t ≤ 1 . Now generate simultaneously m= 300 realizations of the
solution of the SDE (for α = 1 and β = 2) and draw a histogram of their values at
time t = 1 , say, using bins of width 0.25 from 0 to 5 ; see that almost all solutions
have a smallish numerical value (less than one). Nonetheless, compute the mean of
the values at time t = 1 and see that it is roughly \(e^1 = 2.718\)!
1.6. Modify the Euler code of Example 1.6 to numerically solve and plot five different
realizations of the (Ornstein–Uhlenbeck process) solutions of the SDE
\(dS = -S\,dt + dW\) with initial conditions \(S_0 = 0, 1, 2, 3,\) and \(4\). Perhaps integrate over
\(0 \le t \le 4\).
1.7. Modify the Euler code of Example 1.6 to numerically solve and plot five different
realizations of the solutions of the following SDEs:
1. \(dX = \left[\frac{2X}{1+t} + (1+t)^2\right]dt + (1+t)^2\,dW\) with \(X_0 = 0\) and \(X_0 = 1\);
2. \(dX = dt + 2\sqrt{X}\,dW\) with \(X_0 = 1\);
3. \(dX = -\tfrac{1}{2}e^{-2X}\,dt + e^{-X}\,dW\) with \(X_0 = 0\);
4. \(dX = \left[\tfrac{1}{2}X + \sqrt{1+X^2}\right]dt + \sqrt{1+X^2}\,dW\) with \(X_0 = 0\);
5. \(dZ = Z^3\,dt - Z^2\,dW\) with \(Z(0) = 1\).
1.8. Quantitatively investigate convergence as time step h→0 at time t =1 for a range of
realizations of exponential Brownian motion (use relative differences). Summarize
your findings.
1.9. What is the value of a forward contract on an asset when
1. its current value is $60, the agreed price after a year is $57, and the bond rate
is 4% per year?
2. its current value is $35, the agreed price after two years is also $35, and the
bond rate is 5% per year?
1.10. Reconsider Example 1.10 of the valuation of a forward contract on Telecom shares.
What bond rate would cause the current value of the forward contract to be pre-
cisely $1?
1.11. Use a one-period binomial model to estimate that the hedge ratio H=2 and the fair
value of the call option is $5.57 when the initial asset price is $35, the exercise price
of the call option is also $35, the rate of interest for bonds is 10%, and the period is
one year. Assume that the asset price moves up or down by 25% per year. See that a
portfolio of the asset and either one call option or three call options sold at this price
are subject to fluctuations in the price of the asset that could result in the investor
losing.
1.12. Use a one-period binomial model to estimate the hedge ratio and the fair value of
the call option when the initial asset price is $50, the exercise price of the call option
is also $50, the rate of interest for bonds is 10%, and the period is one year. Assume
that the asset price moves up by 25% or down by 20% per year.
1.13. What interest rate for bonds would be needed for the call option of Example 1.9 to
be correctly valued at $4 ?
1.14. An asset is initially priced at $50. The exercise price of a call option on this asset
is also $50. Assume that the asset price moves up by 25% or down by 20% per
year. The call option is offered for sale at $8. Use a one-period binomial model to
estimate the interest rate for bonds that would make this the correct value for the
call option.
1.15. Suppose that volatility corresponds to +25% and −20% per year, and that the bond
interest rate is 12% per year. What would be the corresponding factors u, d, and R
for each monthly step of a twelve-step monthly binomial lattice model?
1.16. Modify the MATLAB/SCILAB code of Algorithm 1.6 to use an n-step binomial lat-
tice for estimating the value of the call option (where n is a supplied parameter).
Show that it is not until n ≥ 512 (or thereabouts) that the estimated initial value of
the call option converges to $3.40 to the nearest cent.
Use this modified code to refine the value of the call option in the situations de-
scribed in Exercises 1.11–1.12.
1.17. Alice buys a put option from Bob (see Definition 1.13), strike price X = $57 , for
some fair price P which you are to determine by arguments similar to those em-
ployed for call options. Develop one step of a binomial model to estimate the value
of the put option.
Definition 1.13. A European put option gives the buyer (of the option) the right but
not the obligation to sell an asset at a previously agreed strike price on a particular
date.
1. First, show that at the expiry of the put option, when the asset has price S, the
put option has value P = max{0, X−S} by considering the cases X < S and
X > S.
2. Second, at the start of a year Alice constructs a portfolio of one bought put
option, value P, and hedges by buying φ units of the asset (selling if φ is
negative), each of price S = $60. Explain the value of the portfolio at the end
of the year if the asset goes up in price by 25%, and if the asset goes down in
price by 20%. Deduce the risk-free hedge ratio \(\phi = \tfrac{1}{3}\).
3. Lastly, use the principle of arbitrage to determine the fair price P = $4 for the
option; assume bonds increase in value by \(4\tfrac{1}{6}\)% over the year.
1.18. Alice buys a put option from Bob (see Definition 1.13), strike price X = $39 , for
some fair price P which you are to determine by arguments similar to those em-
ployed for call options. Develop one step of a binomial model estimating the value
of the put option.
1. First, show that at the expiry of the put option, when the asset has price S, the
put option has value P = max{0, X−S} by considering the cases X < S and
X > S.
2. Second, at the start of a year Alice constructs a portfolio of one bought put
option, value P, and hedges by buying φ units of the asset (selling if φ is
negative), each of price S = $60 . Explain the value of the portfolio at the
end of the year if the asset may either double or halve in value. Deduce the
risk-free hedge ratio φ.
3. Lastly, use the principle of arbitrage to show the fair price P for the option;
assume inflation is rampant and that bonds increase in value by 20% over the
year.
1.19. Alice buys a put option from Bob (see Definition 1.13), strike price X, for some fair
price P which you are to determine by arguments similar to those employed for call
options. Investigate one step of a binomial model for determining the value of the
put option, then apply your result to a multiperiod binomial lattice.
Figure 1.18. Japanese Nikkei Index (yen) as a function of year. [Plot of the index S(t) against time t, 1992 to 2006.]
1. First, argue that at the expiry of the put option when the asset has price S, the
option has value \(P = \max\{0, X-S\}\) by considering the cases \(X < S\) and \(X > S\).
See that a put option, unlike a call option, becomes more valuable the lower
the asset price.
2. Second, at the start of a time interval Alice constructs a portfolio of one bought
put option, value P, and hedges with φ units of the asset, each of price S. At
the end of the time interval the value of the portfolio will be \(P_u + \phi Su\) if
the asset goes up in price by a factor of u and \(P_d + \phi Sd\) if the asset goes
down in price by the factor d. Argue that for a risk-free result the hedge
ratio \(\phi = (P_d - P_u)/(Su - Sd)\).
3. Lastly, argue that if such a risk-free portfolio is to give the same return as
bonds, which increase in value by a factor of R over the interval, then the value
of the put option at the start of the interval is the same expression as (1.8),
namely
\[ P = \frac{1}{R}\left[ \frac{R-d}{u-d}\,P_u + \frac{u-R}{u-d}\,P_d \right]. \]
4. Hence estimate a fair value of a put option when the initial asset price is $50,
the exercise price of the put option is also $50, the rate of interest for bonds is
10%, and the period is one year. Assume that the asset price moves up by 25%
or down by 20% per year.
5. Revise the MATLAB/SCILAB code of Exercise 1.16 to estimate the value of
the above put option using an n =4 , 32, 128, and 512 step binomial lattice.
1.20. Clearly explain the crucial features of a Wiener process that empower us to model
noisy, fluctuating dynamics. Explain how and why a Wiener process is transformed
to model general noisy, fluctuating signals.
Answers to selected exercises
1.1. 1. On average, a player moves up 16% of the time. That is, we could make a business
case that shows 16% growth per year.
2. Only about 1 in 3 reach millionairedom; the rest go bankrupt.
3. Stochastic fluctuations ruin the expected growth.
1.3. Stock drift α ≈ 15% per year; stock volatility β ≈ 13% per \(\sqrt{\text{year}}\).
1.4. See Figure 1.18. Stock drift α ≈ 2.2% per year; stock volatility β ≈ 21% per \(\sqrt{\text{year}}\).
1.9. 1. $5.19.
2. $3.25.
1.10. R =1.1324 , that is, 13.24%.
1.12. $7.58.
1.13. Approximately 22%.
1.14. R =1.1236 , that is, an interest rate of 12.36% over the period.
1.16. $4.83 and $6.89, respectively.
1.18. Hedge ratio φ= 1/10 ; price P =$4 .
1.19. 4. $3.03.
5. $2.07, $2.31, $2.34, and $2.34.
Chapter 2
Ito’s Stochastic Calculus Introduced
Contents
2.1 Multiplicative noise reduces exponential growth
    2.1.1 Linear growth with noise
    2.1.2 Exponential Brownian motion
2.2 Ito’s formula solves some SDEs
    2.2.1 Simple Ito’s formula
    2.2.2 Ito’s formula
2.3 The Black–Scholes equation prices options accurately
    2.3.1 Discretizations form a trinomial model
    2.3.2 Self-financing portfolios
2.4 Summary
Exercises
    Answers to selected exercises
This chapter investigates how to manipulate SDEs symbolically. We discover in the
algebraic solutions of the SDEs features which are also discerned in the numerical
solutions. The key is the discovery of a stochastic form of the chain rule for differentiation
called Ito’s formula. The formula is proved in Chapter 4 via a careful definition of
stochastic integration. Here we use Ito’s formula to derive the Black–Scholes PDE that values
financial options.
2.1 Multiplicative noise reduces exponential growth
We first explain the solution of two simple SDEs. The second one is the solution for
exponential Brownian motion and introduces the key ingredient we need for the description
and development of Ito’s formula.
2.1.1 Linear growth with noise
Consider the constant coefficient SDE \(dX = \mu\,dt + \sigma\,dW\); in this short section the drift μ
and volatility σ are constants. Numerically, we interpret this SDE to mean
\[ \Delta X_j = \mu\,\Delta t_j + \sigma\,\Delta W_j , \]
when discretized with time steps \(\Delta t_j = t_{j+1} - t_j\), which we usually take to be the constant
time step h. Now sum this discretization over n steps starting from time \(t_0 = 0\) when
\(W_0 = 0\):
\[ \sum_{j=0}^{n-1} (X_{j+1} - X_j) = \sum_{j=0}^{n-1} \mu\,(t_{j+1} - t_j) + \sum_{j=0}^{n-1} \sigma\,(W_{j+1} - W_j). \]
Because the drift μ and volatility σ are constant, there is massive cancellation in the terms
of this equation, leading to
\[ \begin{aligned}
 & X_n - X_0 = \mu(t_n - t_0) + \sigma(W_n - W_0) \\
 \Rightarrow\ & X_n = X_0 + \mu t_n + \sigma W_n \quad \text{as } W_0 = t_0 = 0 \\
 \Rightarrow\ & X(t) = X_0 + \mu t + \sigma W(t)
\end{aligned} \]
in the limit as \(\max_j \Delta t_j \to 0\). We have algebraically solved our first SDE!
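The telescoping sum above can be checked numerically. The following is a minimal sketch, assuming illustrative values for the drift, volatility, and step count (none of them come from the text): it steps the discretization \(\Delta X_j = \mu\,\Delta t_j + \sigma\,\Delta W_j\) along one sampled Wiener path and compares the result with \(X_0 + \mu t_n + \sigma W_n\) evaluated on the same path.

```python
import math
import random

def simulate_linear_sde(x0, mu, sigma, t_end, n, rng):
    """Step dX = mu dt + sigma dW with n equal steps; return (X_n, W_n)."""
    h = t_end / n
    x, w = x0, 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))  # Wiener increment ~ N(0, h)
        x += mu * h + sigma * dw
        w += dw
    return x, w

rng = random.Random(1)                      # fixed seed, chosen arbitrarily
x0, mu, sigma, t_end = 1.0, 0.5, 0.3, 2.0   # illustrative values only
x_n, w_n = simulate_linear_sde(x0, mu, sigma, t_end, 1000, rng)
closed_form = x0 + mu * t_end + sigma * w_n
print(x_n, closed_form)                     # agree up to rounding error
```

The two numbers agree to rounding error for any step count, because with constant coefficients the cancellation is exact rather than approximate.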
2.1.2 Exponential Brownian motion
The linear constant coefficient SDE is rather trivial. Proceed now to consider the
exponential Brownian motion SDE that, for example, is claimed to govern stock prices:
\[ dX = \underbrace{\alpha X}_{\mu}\,dt + \underbrace{\beta X}_{\sigma}\,dW. \]
As we interpreted before, this SDE means that evaluating the right-hand side at the current
time forms a numerical approximation:
\[ \Delta X_j = \alpha X_j\,\Delta t_j + \beta X_j\,\Delta W_j . \]
Divide by \(X_j\) to lead to a form with a right-hand side seen before:
\[ \frac{\Delta X_j}{X_j} = \alpha\,\Delta t_j + \beta\,\Delta W_j . \tag{2.1} \]
The right-hand side looks like the constant drift and volatility that we summed so
successfully before, but the left-hand side is problematic. However, recall that the derivative
of \(\log x\) is \(1/x\), matching the \(1/X_j\) in the left-hand side above, so perhaps differences of \(\log X\)
will lead to something useful. Indeed, they do, as we see below:
\[ \begin{aligned}
\Delta \log X_j &= \log X_{j+1} - \log X_j && \text{by definition of the difference } \Delta \\
 &= \log(X_j + X_{j+1} - X_j) - \log X_j \\
 &= \log(X_j + \Delta X_j) - \log X_j \\
 &= \log\left(1 + \frac{\Delta X_j}{X_j}\right) \\
 &= \frac{\Delta X_j}{X_j} - \frac{1}{2}\left(\frac{\Delta X_j}{X_j}\right)^2 + \cdots && \text{by Taylor series of } \log(1+x) \\
 &= \left(\alpha\,\Delta t_j + \beta\,\Delta W_j\right) - \frac{1}{2}\left(\alpha\,\Delta t_j + \beta\,\Delta W_j\right)^2 + \cdots && \text{by the SDE (2.1)} \\
 &= \alpha\,\Delta t_j + \beta\,\Delta W_j - \frac{1}{2}\alpha^2\,\Delta t_j^2 - \alpha\beta\,\Delta t_j\,\Delta W_j - \frac{1}{2}\beta^2\,\Delta W_j^2 + \cdots .
\end{aligned} \]
Here and later the ellipsis \(\cdots\) denotes the higher order terms in the Taylor series. Now sum
the right- and left-hand sides (using \(t_0 = W_0 = 0\)), where for simplicity we assume the
time step is the constant h; then
\[ \log X_n - \log X_0 = \alpha t_n + \beta W_n - \tfrac{1}{2}\alpha^2 h\,t_n - \alpha\beta h\,W_n - \tfrac{1}{2}\beta^2 t_n + \cdots , \]
where we have magically simplified the form of the quadratic terms, a crucial step which
we justify a little later. Now, taking the limit as the time step \(h \to 0\) and assuming the
higher order terms \(\to 0\) in this limit, the above expression becomes
\[ \log X(t) - \log X_0 = \alpha t + \beta W(t) - \tfrac{1}{2}\beta^2 t , \]
which rearranges into the remarkable analytic solution
\[ X(t) = X_0 \exp\left[ \left(\alpha - \tfrac{1}{2}\beta^2\right) t + \beta W(t) \right]. \tag{2.2} \]
See the astonishing feature, which we saw in the numerical solutions of Figure 1.10, that
noise, parametrized by β, may act to stabilize what would otherwise exponentially grow
by apparently reducing the growth rate from α to \(\alpha - \tfrac{1}{2}\beta^2\).
Example 2.1. The ODE \(dX = X\,dt\) has growing solutions \(X = X_0 e^{t}\), but the same ODE
with large enough multiplicative noise, say \(dX = X\,dt + 2X\,dW\), has solutions (2.2) of
\(X = X_0 \exp[-t + 2W(t)]\). Since \(W(t)\) only grows like \(\sqrt{t}\), the dominant term in the argument
of the exponential function is the \(-t\), which shows that almost surely all solutions of the
SDE decay to zero! This almost sure decay is supported by a histogram, Figure 2.1, of the
values of X(1) over many realizations. That there are a few realizations of X(1) which are
large is significant and will be discussed later.
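A rough numerical illustration of Example 2.1 samples X(1) directly from the solution (2.2) with α = 1, β = 2, and \(X_0 = 1\), using \(W(1) \sim N(0,1)\). This is only a sketch: the sample size and seed are arbitrary assumptions, and it samples the formula rather than integrating the SDE step by step.

```python
import math
import random
import statistics

rng = random.Random(2)      # arbitrary seed
n_real = 20000              # assumed sample size (Figure 2.1 uses 300 realizations)

# X(1) = X0 * exp[(alpha - beta^2/2) * 1 + beta * W(1)] with alpha = 1, beta = 2, X0 = 1
samples = [math.exp(-1.0 + 2.0 * rng.gauss(0.0, 1.0)) for _ in range(n_real)]

median_x1 = statistics.median(samples)
mean_x1 = statistics.fmean(samples)
frac_below_start = sum(x < 1.0 for x in samples) / n_real
print(median_x1, mean_x1, frac_below_start)
```

Most realizations end below the starting value, and the median sits near \(e^{-1}\), while the sample mean is pulled well above the median by a small number of large excursions.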
Now look at the detailed justification of the transformations of the three quadratic
terms used above (note the time steps are the constant h).
• First,
\[ \sum_{j=0}^{n-1} \Delta t_j^2 = \sum_{j=0}^{n-1} h^2 = n h^2 = h\,t_n \to 0 \quad \text{as } h \to 0. \]
• Second,
\[ \sum_{j=0}^{n-1} \Delta t_j\,\Delta W_j = h \sum_{j=0}^{n-1} \Delta W_j = h\,W_n \sim N(0, h^2 t_n), \]
and hence almost surely \(\to 0\) as \(h \to 0\).
Figure 2.1. Histogram of 300 realizations of X(1) from the SDE \(dX = X\,dt + 2X\,dW\)
with X(0) = 1 (Figure 1.10 plots five realizations in time). This histogram shows
most realizations decaying to zero, although some large excursions (shown gathered at
X(1) = 5) significantly affect the statistics. [Axes: count against X(1).]
• Lastly, and much more interestingly,
\[ \sum_{j=0}^{n-1} \Delta W_j^2 = \sum_{j=0}^{n-1} \left(\sqrt{h}\,Z_j\right)^2 = h \sum_{j=0}^{n-1} Z_j^2 = h \sum_{j=0}^{n-1} 1 + h \sum_{j=0}^{n-1} (Z_j^2 - 1) = t_n + Y , \]
where \(Y = h \sum_{j=0}^{n-1} (Z_j^2 - 1)\) for some random variables \(Z_j \sim N(0,1)\). Now, as
derived in Exercise 2.1, \(E[Z^p] = (p-1)(p-3)\cdots 1\) for even p, and thus
\(E[Y] = h \sum_{j=0}^{n-1} E[Z_j^2 - 1] = nh\,(E[Z^2] - 1) = 0\). Hence the value of
\(\sum_{j=0}^{n-1} \Delta W_j^2\) averages to \(t_n\), as asserted earlier.
But the sum may fluctuate wildly due to Y. We now argue that its fluctuations are
insignificant by showing that the variance of Y vanishes as the time step \(h \to 0\):
\[ \begin{aligned}
\operatorname{Var}[Y] &= h^2 \sum_{j=0}^{n-1} \operatorname{Var}[Z_j^2 - 1] && \text{as } Z_j \text{ are independent} \\
 &= h^2 n \operatorname{Var}[Z^2 - 1] && \text{as they are distributed identically} \\
 &= h^2 n\,E[(Z^2 - 1)^2] && \text{as } E[Z^2 - 1] = 0 \\
 &= h^2 n\,E[Z^4 - 2Z^2 + 1] \\
 &= h^2 n\,[3 - 2 + 1] && \text{using expectation of even powers of } Z \\
 &= 2 h^2 n \\
 &= 2 h\,t_n \to 0 && \text{as } h \to 0 \text{ for all } t_n .
\end{aligned} \]
Because the variance of Y tends to 0 and \(E[Y] = 0\), then \(Y \to 0\) almost surely as
\(h \to 0\). Consequently \(\sum_{j=0}^{n-1} \Delta W_j^2 \to t_n\) almost surely.
Since \(\sum_{j=0}^{n-1} \Delta W_j^2\) is almost surely \(\sum_{j=0}^{n-1} \Delta t_j\), this is as if “\(\Delta W_j^2 = \Delta t_j\)”,[17] which
suggests the novel symbolic rule introduced below.
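The claim that \(\sum \Delta W_j^2\) settles on \(t_n\) as the step shrinks is easy to observe. The sketch below is illustrative only: the end time, step counts, and seed are assumptions. It sums squared Wiener increments for successively finer steps and prints the predicted standard deviation \(\sqrt{2 h t_n}\) from the variance calculation above alongside each sum.

```python
import math
import random

def sum_squared_increments(t_end, n, rng):
    """Return the sum of dW_j^2 over n equal steps of a Wiener path on [0, t_end]."""
    h = t_end / n
    return sum(rng.gauss(0.0, math.sqrt(h)) ** 2 for _ in range(n))

rng = random.Random(3)   # arbitrary seed
t_end = 1.5              # illustrative end time
results = {n: sum_squared_increments(t_end, n, rng) for n in (10, 1000, 100000)}
for n, total in results.items():
    h = t_end / n
    # Var[Y] = 2 h t_n, so fluctuations about t_end have size sqrt(2 h t_end)
    print(n, total, math.sqrt(2 * h * t_end))
```

Each sum uses an independent path, so the printed values differ between rows; what matters is that the scatter about \(t_n = 1.5\) shrinks in line with \(\sqrt{2 h t_n}\).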
Summary
For stochastic calculus, the following symbolic rules for infinitesimals effectively apply:
\(dt^2 = 0\) and \(dt\,dW = 0\) (as for ordinary calculus), but, surprisingly, \(dW^2 = dt\). These
symbolic rules will be used extensively. They derive, for example, that
\(X(t) = X_0 \exp\left[(\alpha - \tfrac{1}{2}\beta^2)t + \beta W(t)\right]\) solves the financial SDE \(dX = \alpha X\,dt + \beta X\,dW\).
2.2 Ito’s formula solves some SDEs
Recall that when you were first introduced to integration in your earlier courses it was
presented as antidifferentiation. That is, you used differentiation rules to infer integration
formulae. The same is true for the solution of some SDEs that we develop in this section.
The basic rule of stochastic differentiation is Ito’s formula with the identity \(dW^2 = dt\) at its
heart. We first present a simple version of Ito’s formula and then present the full version.[18]
2.2.1 Simple Ito’s formula
Let f(t, w) be a smooth function of two arguments; smooth means that it is differentiable
at least several times and that Taylor’s theorem applies. Figures 2.2 and 2.3 show
two example surfaces of two such smooth functions. Then consider the Ito process
X(t) = f(t, W(t)), where W(t) is a Wiener process; see the black curves wiggling across
the surfaces in Figures 2.2 and 2.3. We explore these examples later: \(f(t,w) = (t+w)^2\),
whence \(X = (t + W(t))^2\), and \(f(t,w) = \exp(at + bw)\), whence \(X = \exp(at + bW(t))\).
[17] This last point is perhaps not too surprising because we know
\(E[\Delta W_j^2] = \Delta t_j\,E\left[\left(\Delta W_j/\sqrt{\Delta t_j}\right)^2\right] = \Delta t_j\). But importantly, we have shown
that the fluctuations about this expectation are negligible in the limit of continuous time.
[18] Chapters 5 and 6 of the book by Stampfli and Goodman (2001) provide appropriate reading to
supplement this section.
Figure 2.2. The smooth surface \(f = (t+w)^2\) with one realization of a Wiener
process w = W(t) evolving on the surface: the Ito process X = f(t, W(t)) is the height
of the curve as a function of time and changes due to direct evolution in time and through
evolution of w. [Surface plot of f against t and w.]
Figure 2.3. The smooth surface \(f = 0.3\exp(t + 2w)\) with one realization of a
Wiener process w = W(t) evolving on the surface: the Ito process X = f(t, W(t)) is the
height of the curve as a function of time and changes due to direct evolution in time and
through evolution of w. [Surface plot of f against t and w.]
Note that f itself is smooth; the stochastic part of X(t) comes only via the use of the
Wiener process in the evaluation of f. Now consider the change in X that occurs over some
small change in time, Δt, through the direct dependence upon t and indirectly through the
dependence upon the Wiener process W. Denoting \(X(t + \Delta t) = X + \Delta X\) and
\(W(t + \Delta t) = W + \Delta W\), observe (using a multivariable Taylor series of f) that
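\[ \begin{aligned}
X + \Delta X &= f(t + \Delta t, W + \Delta W) \\
 &= f(t, W) + \frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial w}\Delta W \\
 &\quad + \frac{1}{2}\frac{\partial^2 f}{\partial t^2}\Delta t^2 + \frac{\partial^2 f}{\partial t\,\partial w}\Delta t\,\Delta W + \frac{1}{2}\frac{\partial^2 f}{\partial w^2}\Delta W^2 + \cdots \\
 &\approx X + \frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial w}\Delta W + \frac{1}{2}\frac{\partial^2 f}{\partial w^2}\Delta t ,
\end{aligned} \]
as by the earlier rules for differentials \(\Delta t^2 = \Delta t\,\Delta W = \cdots = 0\) and \(\Delta W^2 = \Delta t\). In the limit
as \(\Delta t \to 0\) the differences become differentials, and thus
\[ dX = \left( \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial w^2} \right) dt + \frac{\partial f}{\partial w}\,dW , \tag{2.3} \]
where the partial derivatives are evaluated at (t, W). This is the simple version of Ito’s
formula and gives the differential of an Ito process X which depends directly and smoothly
upon a Wiener process.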
Example 2.2. Determine the differential of \(X(t) = (t + W(t))^2\), and hence deduce an SDE
which X(t) satisfies.
Solution: Here \(f(t,w) = (t+w)^2\) (see Figure 2.2), so \(f_t = f_w = 2(t+w)\) and
\(f_{ww} = 2\); thus
\[ dX = [2(t + W) + 1]\,dt + 2(t + W)\,dW. \]
Recognizing \(t + W = \sqrt{X}\), rewrite this as the SDE[19]
\[ dX = [2\sqrt{X} + 1]\,dt + 2\sqrt{X}\,dW. \]
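The role of the extra “+1” in the drift of Example 2.2 can be seen numerically. The sketch below is an illustration under assumed values (step count, seed, and end time are arbitrary): it takes Euler steps of \(dX = [2(t+W) + c]\,dt + 2(t+W)\,dW\) along a sampled path, once with the Ito term c = 1 and once with c = 0, and compares each against \((t+W)^2\) on the same path.

```python
import math
import random

def integrate(n, rng, ito_correction=True, t_end=1.0):
    """Euler steps of dX = [2(t+W) + c] dt + 2(t+W) dW from X(0) = 0; c = 1 is the Ito term."""
    h = t_end / n
    c = 1.0 if ito_correction else 0.0
    t = w = x = 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))
        x += (2.0 * (t + w) + c) * h + 2.0 * (t + w) * dw
        t += h
        w += dw
    return x, (t + w) ** 2   # numerical X(t_end) and (t + W)^2 on the same path

# The same seed gives the same Wiener path for both runs.
x_ito, exact_ito = integrate(20000, random.Random(4))
x_naive, exact_naive = integrate(20000, random.Random(4), ito_correction=False)
print(x_ito - exact_ito, x_naive - exact_naive)
```

With the Ito term the discrepancy is small and shrinks as the step is refined; without it the result is off by roughly the elapsed time, which is exactly the accumulated \(\tfrac{1}{2} f_{ww}\,dt\) contribution.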
Example 2.3. Determine the differential of \(X(t) = c\exp[at + bW(t)]\), and hence solve the
SDE \(dX = \alpha X\,dt + \beta X\,dW\).
Solution: Here \(f(t,w) = c\,e^{at+bw}\) (see Figure 2.3), whence \(f_t = ac\,e^{at+bw}\),
\(f_w = bc\,e^{at+bw}\) and \(f_{ww} = b^2 c\,e^{at+bw}\). Thus Ito’s formula asserts
\[ \begin{aligned}
dX &= \left( ac\,e^{at+bW} + \tfrac{1}{2} b^2 c\,e^{at+bW} \right) dt + bc\,e^{at+bW}\,dW \\
 &= c\,e^{at+bW} \left[ \left(a + \tfrac{1}{2} b^2\right) dt + b\,dW \right].
\end{aligned} \]
Rewritten as \(dX = (a + \tfrac{1}{2}b^2)X\,dt + bX\,dW\) this is the same as the given SDE provided
\(\alpha = a + \tfrac{1}{2}b^2\) and \(\beta = b\). Thus it is the solution of the SDE provided \(b = \beta\) and
\(a = \alpha - \tfrac{1}{2}\beta^2\), and hence the solution is
\[ X(t) = c\exp\left[(\alpha - \tfrac{1}{2}\beta^2)t + \beta W(t)\right] , \]
as discussed in Section 2.1.2.
[19] Both of these expressions of the stochastic dynamics are correct. Prefer the second for many purposes
as the right-hand side has no occurrences of the Wiener process W(t) except for the differential dW. In
applications where SDEs arise, we normally formulate an SDE model in such a form where the only direct
appearance of “noise” is in the differential dW. Hence prefer this second SDE.
Example 2.4. Derive the drift and volatility of the stochastic process \(Y = W(t)^3\); express
in terms of Y only.
Solution: Here Y = f(t, W), where \(f(t,w) = w^3\) so that \(f_t = 0\), \(f_w = 3w^2\) and
\(f_{ww} = 6w\). Thus Ito’s formula asserts
\[ \begin{aligned}
dY &= 0\,dt + 3W^2\,dW + \tfrac{1}{2}\,6W\,dW^2 \\
 &= 3W\,dt + 3W^2\,dW \\
 &= 3Y^{1/3}\,dt + 3Y^{2/3}\,dW.
\end{aligned} \]
Hence the stochastic process Y has drift \(\mu = 3Y^{1/3}\) and volatility \(\sigma = 3Y^{2/3}\).
2.2.2 Ito’s formula
The simple form (2.3) of Ito’s formula rests upon the Taylor series of f(t, w) and the rule
that the only quadratic differential to retain is \(dW^2 = dt\). The general form of Ito’s formula,
sometimes referred to modestly as Ito’s lemma, is the same and addresses the differential of
a function of a stochastic process in general rather than just a function of a Wiener process.
Theorem 2.5 (Ito’s formula). Let f(t, x) be a smooth function of its arguments and X be
an Ito process with drift μ(t, X) and volatility σ(t, X), that is, \(dX = \mu\,dt + \sigma\,dW\); then
\(Y(t) = f(t, X(t))\) is also an Ito process with differential
\[ dY = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\,dX^2 , \tag{2.4} \]
where in \(dX^2\) retain only \(dW^2 \to dt\), and the partial derivatives are evaluated at (t, X).
Equivalently, expanding expression (2.4) gives
\[ dY = \left( \frac{\partial f}{\partial t} + \mu\frac{\partial f}{\partial x} + \frac{1}{2}\sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt + \sigma \frac{\partial f}{\partial x}\,dW. \tag{2.5} \]
Although the formula (2.5) is more explicit, we believe (2.4) is a more memorable
form of Ito’s formula. The rigorous proof of this formula requires a well-defined stochastic
integral and so is deferred to Chapter 4.
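Formula (2.5) is mechanical enough to evaluate by machine. The helper below is a hypothetical illustration, not something from the text: the function name `ito_drift_volatility` and the use of central differences for the partial derivatives are assumptions made for this sketch. It returns the drift and volatility of Y = f(t, X) at a point, and is tried on Example 2.4.

```python
def ito_drift_volatility(f, mu, sigma, t, x, eps=1e-4):
    """Drift and volatility of Y = f(t, X) from (2.5), using central differences for derivatives."""
    f_t = (f(t + eps, x) - f(t - eps, x)) / (2 * eps)
    f_x = (f(t, x + eps) - f(t, x - eps)) / (2 * eps)
    f_xx = (f(t, x + eps) - 2 * f(t, x) + f(t, x - eps)) / eps**2
    m, s = mu(t, x), sigma(t, x)
    return f_t + m * f_x + 0.5 * s**2 * f_xx, s * f_x

# Example 2.4: Y = W^3, so the underlying process X = W has drift 0 and volatility 1.
w = 0.7   # arbitrary evaluation point
drift, vol = ito_drift_volatility(lambda t, x: x**3, lambda t, x: 0.0, lambda t, x: 1.0, 0.0, w)
print(drift, 3 * w)       # drift should be close to 3W
print(vol, 3 * w**2)      # volatility should be close to 3W^2
```

A symbolic algebra package would give the derivatives exactly; finite differences are used here only to keep the sketch dependency-free.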
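Example 2.6. Example 2.2 shows that if \(X(t) = (t + W(t))^2\), then
\(dX = [2\sqrt{X} + 1]\,dt + 2\sqrt{X}\,dW\); hence deduce dY for \(Y = e^{X}\).
Solution: Use Ito’s formula with \(f(t,x) = e^{x}\) for which \(f_t = 0\) and \(f_x = f_{xx} = e^{x}\):
\[ \begin{aligned}
dY &= f_x\,dX + \tfrac{1}{2} f_{xx}\,dX^2 \\
 &= e^{X}\left\{ [2\sqrt{X} + 1]\,dt + 2\sqrt{X}\,dW \right\} + \tfrac{1}{2} e^{X} \left\{ [2\sqrt{X} + 1]\,dt + 2\sqrt{X}\,dW \right\}^2 \\
 &= e^{X}\left\{ [2\sqrt{X} + 1]\,dt + 2\sqrt{X}\,dW + \tfrac{1}{2}\,4\left(\sqrt{X}\right)^2 dW^2 \right\} \\
 &= e^{X}\left\{ [1 + 2\sqrt{X} + 2X]\,dt + 2\sqrt{X}\,dW \right\}.
\end{aligned} \]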
Example 2.7 (product rule). Recall that in ordinary calculus the product rule for
differentiation is \(d(fg) = f\,dg + g\,df\). In stochastic calculus there is an extra term. Let X(t)
and Y(t) be stochastic processes with differentials
\[ dX = \mu\,dt + \sigma\,dW \quad \text{and} \quad dY = \nu\,dt + \rho\,dW. \]
Then one may argue, using our symbolic rules, that
\[ \begin{aligned}
d(XY) &= (X + dX)(Y + dY) - XY \\
 &= X\,dY + Y\,dX + dX\,dY \\
 &= X\,dY + Y\,dX + \sigma\rho\,dt
\end{aligned} \]
by retaining only the \(dW^2 = dt\) term in the product \(dX\,dY\). This is indeed correct, but now
deduce it using Ito’s formula.
Solution: The difficulty is that the product XY is a function of two Ito processes,
whereas Ito’s formula applies only to a function of one Ito process. However, write
\(XY = \tfrac{1}{2}[Z^2 - X^2 - Y^2]\), where Z(t) = X + Y is a new stochastic process that, because it is simply
the sum of X and Y, has differential \(dZ = (\mu + \nu)\,dt + (\sigma + \rho)\,dW\). Then use Ito’s formula
separately on each of the components in the right-hand side of
\(d(XY) = \tfrac{1}{2}[d(Z^2) - d(X^2) - d(Y^2)]\):
\[ \begin{aligned}
d(X^2) &= 0\,dt + 2X\,dX + \tfrac{1}{2}\,2\,dX^2 = (2X\mu + \sigma^2)\,dt + 2X\sigma\,dW; \\
\text{similarly } d(Y^2) &= (2Y\nu + \rho^2)\,dt + 2Y\rho\,dW; \\
\text{and } d(Z^2) &= \left[ 2Z(\mu + \nu) + (\sigma + \rho)^2 \right] dt + 2Z(\sigma + \rho)\,dW.
\end{aligned} \]
Substitute these, with Z = X + Y, into
\[ \begin{aligned}
d(XY) &= \tfrac{1}{2}\left[ d(Z^2) - d(X^2) - d(Y^2) \right] \\
 &= \tfrac{1}{2}\big[ (2X\mu + 2Y\mu + 2X\nu + 2Y\nu + \sigma^2 + 2\sigma\rho + \rho^2)\,dt \\
 &\qquad + (2X\sigma + 2Y\sigma + 2X\rho + 2Y\rho)\,dW \\
 &\qquad - (2X\mu + \sigma^2)\,dt - 2X\sigma\,dW - (2Y\nu + \rho^2)\,dt - 2Y\rho\,dW \big] \\
 &= X(\nu\,dt + \rho\,dW) + Y(\mu\,dt + \sigma\,dW) + \sigma\rho\,dt \\
 &= X\,dY + Y\,dX + \sigma\rho\,dt
\end{aligned} \]
as required.
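The extra \(\sigma\rho\,dt\) term of Example 2.7 shows up directly in a discretization. The sketch below assumes constant, purely illustrative coefficients and an arbitrary seed: it drives X and Y with the same Wiener increments, accumulates \(X\,\Delta Y + Y\,\Delta X\) and \(\Delta X\,\Delta Y\) separately, and compares the accumulated cross term with \(\sigma\rho\,t\).

```python
import math
import random

rng = random.Random(5)                      # arbitrary seed
mu, sigma, nu, rho = 0.3, 0.8, -0.2, 0.5    # illustrative constants
t_end, n = 2.0, 100000
h = t_end / n
x, y = 1.0, 2.0
x0y0 = x * y
ordinary = 0.0     # accumulates X dY + Y dX
cross = 0.0        # accumulates dX dY
for _ in range(n):
    dw = rng.gauss(0.0, math.sqrt(h))       # one increment shared by both processes
    dx = mu * h + sigma * dw
    dy = nu * h + rho * dw
    ordinary += x * dy + y * dx
    cross += dx * dy
    x += dx
    y += dy

print(x * y - x0y0, ordinary + cross)   # equal up to rounding
print(cross, sigma * rho * t_end)       # cross term approaches sigma * rho * t
```

The first pair of numbers agrees because \((X + \Delta X)(Y + \Delta Y) - XY = X\,\Delta Y + Y\,\Delta X + \Delta X\,\Delta Y\) holds exactly at every step; the second pair shows that the cross term does not vanish in the limit, unlike in ordinary calculus.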
Figure 2.4. Value C(t, S) of the call option for Example 1.9 demonstrating that
the value C is a smooth function of time t and asset price S. [Surface plot of C against S and 100t.]
Suggested activity: Do at least Exercise 2.10.
Summary
Ito’s formula is a form of chain rule for stochastic processes to empower all stochastic
calculus. It is the same as the deterministic chain rule except for the crucial addition of a
quadratic differential and the recognition that “\(dW^2 = dt\)”: if Y = f(t, X), then the
differential
\[ dY = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\,dX^2 . \]
2.3 The Black–Scholes equation prices options accurately
One application of Ito’s formula is to pricing options based on an asset. This section de-
velops the analysis that Section 1.4 began using binomial lattices to price call options. We
adopt the same line of argument. Recall that the argument is first to form a portfolio of the
asset, together with call options, which is risk free, then to sensibly require this risk-free
portfolio to give the same return as investing in risk-free bonds.
Recall that the value of a call option varies smoothly in both time t and the asset
price S, as seen in Figure 2.4. Hence, the derivatives \(\frac{\partial C}{\partial t}\), \(\frac{\partial C}{\partial S}\), and \(\frac{\partial^2 C}{\partial S^2}\) are all well defined
and smoothly varying. Consequently, in any realization, the value of a call option is an
Ito process, say C(t) = C(t, S(t)), through the dependence upon the Ito process S(t), the
realization of the asset value. Recall that the stochastic model of an asset’s value is that
its differential \(dS = \alpha S\,dt + \beta S\,dW\). Hence Ito’s formula tells us how the option value C
fluctuates in time through the fluctuating changes in asset value.
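Now find a hypothetical, risk-free portfolio that has the same return as bonds.
Construct a portfolio of one call option sold and some number φ (phi) of units of the asset. In
Section 1.4 we held one asset and sought the appropriate number of call options; here it is
preferable to do the complement, with φ as the reciprocal of the earlier H. This portfolio
has a value \(\Pi = -C(t,S) + \phi S\), which is itself also an Ito process as it is a function of
the stochastic asset value S. Thus over a small time interval dt the value of the portfolio
changes by an amount deduced via Ito’s formula (2.4):[20]
\[ \begin{aligned}
d\Pi &= -dC + \phi\,dS && \text{as } \Pi = -C + \phi S \\
 &= -\frac{\partial C}{\partial t}\,dt - \frac{\partial C}{\partial S}\,dS - \frac{1}{2}\frac{\partial^2 C}{\partial S^2}\,dS^2 + \phi\,dS && \text{as } C = C(t,S) \text{ by Ito} \\
 &= -\left[ \frac{\partial C}{\partial t} + \alpha S\frac{\partial C}{\partial S} + \frac{1}{2}\beta^2 S^2 \frac{\partial^2 C}{\partial S^2} \right] dt \\
 &\quad - \beta S\frac{\partial C}{\partial S}\,dW + \phi\alpha S\,dt + \phi\beta S\,dW && \text{as } dS = \alpha S\,dt + \beta S\,dW \\
 &= \left[ -\frac{\partial C}{\partial t} - \alpha S\frac{\partial C}{\partial S} - \frac{1}{2}\beta^2 S^2 \frac{\partial^2 C}{\partial S^2} + \alpha S\phi \right] dt \\
 &\quad + \beta S\left[ -\frac{\partial C}{\partial S} + \phi \right] dW.
\end{aligned} \]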
A portfolio is risk free when it has no stochastic fluctuations, that is, when it has zero
volatility. Make the volatility, \(\beta S\left[-\frac{\partial C}{\partial S} + \phi\right]\), of this portfolio zero by choosing a portfolio
with \(\phi = \frac{\partial C}{\partial S}\) units of the asset per sold call option.[21]
Setting \(\phi = \frac{\partial C}{\partial S}\), the portfolio changes in price according to the residual drift term
in dΠ, namely,
\[ d\Pi = \left[ -\frac{\partial C}{\partial t} - \frac{1}{2}\beta^2 S^2 \frac{\partial^2 C}{\partial S^2} \right] dt. \]
Because this portfolio is risk-free, its return must equal the return of investing the same
amount in bonds. Given the value of the portfolio is \(-C + \frac{\partial C}{\partial S}S\) and the interest rate r (so that
\(R = e^{rt}\)), the corresponding investment in bonds returns
\[ r\left[ -C + \frac{\partial C}{\partial S}S \right] dt = d\Pi = \left[ -\frac{\partial C}{\partial t} - \frac{1}{2}\beta^2 S^2 \frac{\partial^2 C}{\partial S^2} \right] dt. \]
Equating the coefficients of dt and rearranging leads to the Black–Scholes equation for the
value C(t, S) of the call option,
\[ \frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \frac{1}{2}\beta^2 S^2 \frac{\partial^2 C}{\partial S^2} = rC. \tag{2.6} \]
Involving derivatives in both time t and asset value S, this is a PDE for the option value C.
The next section discusses numerical methods of solving the Black–Scholes PDE (2.6);
such numerical methods relate closely to the binomial lattice model for valuing options.
[20] The first line of this derivation is justified properly by Section 2.3.2.
[21] \(\phi = \frac{\partial C}{\partial S}\) is analogous to the earlier hedge formula as \(1/H = (C_u - C_d)/(Su - Sd) \approx \Delta C/\Delta S\).
Example 2.8 (forward contract). The Black–Scholes PDE (2.6) straightforwardly values
forward contracts to be exactly as in (1.6). To apply the valuation to forward contracts,
solve the Black–Scholes PDE (2.6) with expiry value C(T, S) chosen appropriately for a
forward contract.[22]
Solution: Recall from Section 1.4.1 that a forward contract is a binding agreement
by Alice to sell to Bob an asset at an agreed price X at some expiry time T. At expiry
Bob values the forward contract at \(C(T,S) = S - X\) because if the asset value \(S(T) > X\),
then Bob buys the asset more cheaply than the open market price, whereas if the asset value
\(S(T) < X\), then the contract commits Bob to buying the asset at a more expensive price than
the open market price. This expiry value of \(C(T,S) = S - X\) is linear in the asset value S.
Being linear, it leads to a simple solution of the Black–Scholes PDE (2.6).
Seek solutions of the Black–Scholes PDE (2.6) that are linear in asset value S, but
which have a general variation in time. That is, seek solutions in the form
\(C(t,S) = a(t)S + b(t)\). Later we will use the information that we know at expiry that \(a(T) = 1\) and
\(b(T) = -X\) in order for the expiry value to be \(C(T,S) = S - X\). Substitute \(C(t,S) = a(t)S + b(t)\)
into the Black–Scholes PDE (2.6) to deduce
\[ \frac{da}{dt}S + \frac{db}{dt} + rSa + 0 = raS + rb. \]
The marvelous simplification here is that the second derivative term vanishes, which
enables this case to be straightforward. Now this equation has to hold for all asset values S;
thus equate the coefficients of the terms in S and the terms constant in S to determine
\[ \frac{da}{dt} = 0 \quad \text{and} \quad \frac{db}{dt} = rb. \]
• Since \(da/dt = 0\), a(t) must be constant, and since at expiry \(a(T) = 1\), then \(a(t) = 1\)
for all time.
• Since \(db/dt = rb\), then \(b(t) = \text{constant} \times e^{rt}\). The expiry condition that \(b(T) = -X\)
determines the constant to be \(-Xe^{-rT}\) so that \(b(t) = -Xe^{r(t-T)}\). Write this as
\(b(t) = -X/e^{r(T-t)}\), as \(T - t\) is the time remaining to expiry.
That is, the value of the forward contract for all time up to the expiry time T and for all
asset values S is \(C(t,S) = S - X/e^{r(T-t)}\). Consequently, at the start of the time period,
t = 0, Alice values the forward contract as \(C(0,S_0) = S_0 - X/e^{rT}\). Recalling \(R = e^{rT}\),
this valuation is identical to the earlier (1.6).
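As a sanity check on Example 2.8, the forward value \(C(t,S) = S - X/e^{r(T-t)}\) can be substituted into (2.6) numerically. The sketch below is illustrative only: the strike, rate, volatility, and evaluation point are assumed values, and the hypothetical helper `black_scholes_residual` approximates the derivatives by central differences rather than differentiating exactly.

```python
import math

def forward_value(t, s, strike, r, expiry):
    """C(t, S) = S - X / e^{r (T - t)} from Example 2.8."""
    return s - strike * math.exp(-r * (expiry - t))

def black_scholes_residual(c, t, s, r, beta, eps=1e-3):
    """Left minus right side of (2.6), with central differences for the derivatives."""
    c_t = (c(t + eps, s) - c(t - eps, s)) / (2 * eps)
    c_s = (c(t, s + eps) - c(t, s - eps)) / (2 * eps)
    c_ss = (c(t, s + eps) - 2 * c(t, s) + c(t, s - eps)) / eps**2
    return c_t + r * s * c_s + 0.5 * beta**2 * s**2 * c_ss - r * c(t, s)

strike, r, expiry, beta = 50.0, 0.10, 1.0, 0.3   # illustrative values only
c = lambda t, s: forward_value(t, s, strike, r, expiry)
residual = black_scholes_residual(c, 0.4, 55.0, r, beta)
value_at_expiry = c(expiry, 55.0)
value_now = c(0.0, 55.0)
print(residual, value_at_expiry, value_now)
```

The residual is zero up to finite-difference error whatever volatility is supplied, reflecting that the second derivative term vanishes for a value that is linear in S.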
More general algebraic solutions may be found for the Black–Scholes equation (2.6)
when applied to some options. However, we do not explore this here, as it is more valuable
to qualitatively check numerical solutions of complicated practical options.
Interpret terms to graphically solve
Before attempting to quantitatively solve such a PDE, we must learn to interpret the effects
of the various terms that appear. The interpretations are analogues of those discussed in
classical one-dimensional continuum mechanics; see, e.g., Roberts (1994). The following
interpretations of its mathematical symbols empower us to qualitatively solve the
Black–Scholes equation (2.6).
[22] In Example 2.8, we use C as the symbol for the value of the forward contract in order to be consistent
with the symbology of the Black–Scholes equation (2.6). In Section 1.4.1 we used the symbol F for the value
of the forward contract. The change in variable name is insignificant.
• First, the term \(rS\frac{\partial C}{\partial S}\) is an “advection” term, as it carries information in S-space. For
example, in the absence of the other terms, the equation \(\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} = 0\) asserts that
on characteristics \(dS/dt = rS\), that is, on curves \(S = \text{constant} \times e^{rt}\),
\[ \begin{aligned}
\frac{dC}{dt} &= \frac{\partial C}{\partial t} + \frac{\partial C}{\partial S}\frac{dS}{dt} && \text{by the chain rule} \\
 &= \frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} && \text{as } dS/dt = rS \text{ on each curve} \\
 &= 0 && \text{by the equation.}
\end{aligned} \]
Thus on the characteristics \(S \propto e^{rt}\) we would find that C is constant; in effect the
value C is carried along these characteristic curves. These characteristics are those
corresponding to the value of an investment in bonds.
• Second, the term rC on the right-hand side of the Black–Scholes equation (2.6) acts
as a source of value for the option in time. Including the source term rC, three of the
four terms in the Black–Scholes equation (2.6) form \(\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} = rC\). Along each
characteristic curve \(S \propto e^{rt}\) this PDE asserts that the value of the option grows like
\(\frac{dC}{dt} = rC\), that is, C also grows exponentially along with any bonds, as \(C \propto e^{rt}\).
• Lastly, the term \(\frac{1}{2}\beta^2 S^2 \frac{\partial^2 C}{\partial S^2}\) implies that in addition to these effects, the price of
an option “diffuses” in the S-space. This diffusion \(\frac{\partial^2 C}{\partial S^2}\) depends directly upon the
volatility, βS, of the underlying asset so that large fluctuations in a risky asset cause
large diffusion, that is, large “blurring,” of the value of an option. However, and most
importantly, remember that this term represents negative diffusion, on its own
\(\frac{\partial C}{\partial t} = -\frac{1}{2}\beta^2 S^2 \frac{\partial^2 C}{\partial S^2}\), and so the Black–Scholes equation (2.6) must be solved backward in
time.
Summary
We solve the Black–Scholes equation (2.6) with “initial” conditions specified at the future
expiry date of the option, of \(C(T,S) = \max\{0, S - X\}\) for a call option; we expect the value
of the option to be steadily discounted at the bond rate as we work backward in time (in
proportion to \(e^{rt}\)); and the value to smooth out by the fluctuation induced “diffusion.”
Example 2.9 (knock out). Exotic options often just change the boundary conditions for
the Black–Scholes equation (2.6). For example, a “knock out” is a call or put option that
additionally expires if the underlying security value S reaches a predetermined “barrier
price.” There are two varieties of knock outs: “down and outs” expire if the underlying
security falls to the barrier price, whereas “up and outs” expire if the underlying security
rises to the barrier price.
In a “down and out,” the holder of the knock out option receives a rebate c if ever the
asset value S falls below some barrier, \(S_{\min}\) say. For asset values \(S > S_{\min}\), the arguments
behind the Black–Scholes equation (2.6) still apply. Thus one values a knock out option by
solving the Black–Scholes equation for asset values \(S > S_{\min}\). At the asset value \(S = S_{\min}\)
we additionally know that the knock out option has value c, its rebate. Thus when solving
the Black–Scholes equation, we keep supplying this value c at the barrier \(S = S_{\min}\) as a
boundary condition to the values in the domain \(S > S_{\min}\).
Figure 2.5. Qualitative solution of the Black–Scholes equation (2.6) to predict the
value of an option. (a) a call option; (b) a put option; (c), (d) two other special concocted
options. The dashed line is the agreed value of the option at expiry depending upon the
asset price S, and the solid line is a rough estimate of the value of the option at some
earlier time. [Four panels, each plotting option value C against asset price S; curves in
panels (a) and (b) are labelled expiry, squash, depreciate, and value now.]
Given the enormous variety of possible options, such as the knock out, it is remarkably
useful to solve the Black–Scholes equation (2.6) qualitatively. In Figure 2.5 we follow the
three qualitative interpretations above for any given final valuation of an option plotted as a
function of the asset price S (dashed line):
• First, “squash” the option value to the left by a factor R (dot-dashed line)
corresponding to the “advection” in the Black–Scholes equation;
• second, deflate/depreciate the valuation C by a factor R, that is, squash vertically
(dotted line);
• third, smooth the corners and curves in the line to account for the “blurring” of the
option’s value by the stochastic nature of the asset price (solid line) to give the option
value at some earlier time; the longer the time period or the larger the volatility, the
more you smooth the curve.
Qualitative solutions such as these valuably check numerical or algebraic solutions of the
Black–Scholes equation (2.6).
You may want to combine and confirm the first two of the above steps by the analysis
of Exercise 2.11.
2.3.1 Discretizations form a trinomial model
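The Black–Scholes equation (2.6) is intimately connected with the binomial lattice approximation. To connect the two views, first observe that all derivatives with respect to S in the Black–Scholes equation (2.6) are multiplied by the corresponding power of S; thus these are in the form of an Euler–Cauchy differential equation.[23] Simplify such terms by transforming S into $x = \log S$, as then the derivatives simplify:
$$S\frac{\partial}{\partial S} = \frac{\partial}{\partial x} \quad\text{and}\quad S^2\frac{\partial^2}{\partial S^2} = \frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial x}.$$
The chain rule derives first
$$\frac{\partial C}{\partial S} = \frac{\partial x}{\partial S}\frac{\partial C}{\partial x} = \frac{1}{S}\frac{\partial C}{\partial x},$$
and then
$$\frac{\partial^2 C}{\partial x^2} = \left(S\frac{\partial}{\partial S}\right)^2 C = S\frac{\partial}{\partial S}\left(S\frac{\partial C}{\partial S}\right) = S\frac{\partial C}{\partial S} + S^2\frac{\partial^2 C}{\partial S^2}.$$
Such a transformation from S to $x = \log S$ also ensures that the geometric sequence of asset prices used in the binomial lattice, a factor of $u = 1/d$ from each other, becomes a straightforward arithmetic sequence in x, of constant spacing $\Delta x = \log u$. The Black–Scholes equation (2.6) then becomes
$$\frac{\partial C}{\partial t} + r\frac{\partial C}{\partial x} + \frac12\beta^2\left(\frac{\partial^2 C}{\partial x^2} - \frac{\partial C}{\partial x}\right) = rC. \tag{2.7}$$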
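As a quick numerical sanity check of the two operator identities behind (2.7), the following Python sketch compares centered differences in S with centered differences in $x = \log S$. The test function `C_of_S` is a hypothetical smooth stand-in for an option value, chosen only for illustration.

```python
import math

def C_of_S(S):
    # Hypothetical smooth test function standing in for an option value C(S).
    return S**3 + 2.0 * math.sqrt(S)

def C_of_x(x):
    # The same function expressed in the log-price variable x = log S.
    return C_of_S(math.exp(x))

def d1(f, z, eps):
    # Centered first difference.
    return (f(z + eps) - f(z - eps)) / (2 * eps)

def d2(f, z, eps):
    # Centered second difference.
    return (f(z + eps) - 2 * f(z) + f(z - eps)) / eps**2

S = 35.0
x = math.log(S)
eps = 1e-3

# S dC/dS should match dC/dx.
lhs_first = S * d1(C_of_S, S, eps)
rhs_first = d1(C_of_x, x, eps)

# S^2 d2C/dS2 should match d2C/dx2 - dC/dx.
lhs_second = S**2 * d2(C_of_S, S, eps)
rhs_second = d2(C_of_x, x, eps) - d1(C_of_x, x, eps)

print(lhs_first, rhs_first)
print(lhs_second, rhs_second)
```

The two numbers in each printed pair should agree to several significant figures; the small discrepancy is the truncation error of the differences.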
We numerically solve the transformed Black–Scholes PDE (2.7) on a grid in the tS-plane. Create a grid, such as that seen in Figure 2.4 and implicit in the multiperiod lattice of Figure 1.16, of spacing $\Delta t = h$ in time and spacing $\Delta x = \delta$ in the logarithm of the asset value.[24] Let $C_{i,j}$ denote the value of C at the jth time $t_j = jh$ and ith location $x_i = i\delta$ (say). Then a difference approximation to the time derivative is
$$\frac{\partial C}{\partial t} \approx \frac{C_{i,j} - C_{i,j-1}}{h},$$
whereas centered approximations to the x derivatives are
$$\frac{\partial C}{\partial x} \approx \frac{C_{i+1,j} - C_{i-1,j}}{2\delta} \quad\text{and}\quad \frac{\partial^2 C}{\partial x^2} \approx \frac{C_{i+1,j} - 2C_{i,j} + C_{i-1,j}}{\delta^2}.$$
Thus a finite difference approximation[25] to the transformed Black–Scholes PDE (2.7) which is backward in time and centered in log-price is
$$\frac{C_{i,j} - C_{i,j-1}}{h} + r\,\frac{C_{i+1,j} - C_{i-1,j}}{2\delta} + \frac12\beta^2\,\frac{C_{i+1,j} - 2C_{i,j} + C_{i-1,j}}{\delta^2} - \frac12\beta^2\,\frac{C_{i+1,j} - C_{i-1,j}}{2\delta} = rC_{i,j}.$$
[23] Read about Euler–Cauchy ODEs in many standard texts (see, e.g., Kreyszig 1999, §2.6).
[24] Of course one may also discretize the Black–Scholes equation (2.6) with equal $\Delta S$ rather than equal $\Delta\log S$, but then the former seems less natural and also confuses the comparison with the earlier binomial model.
[25] Issues associated with the numerical solution of such a PDE are dealt with in detail in many texts (see, e.g., Kreyszig 1999).
All option values referred to in this equation are at the jth time $t_j$, except for the time derivative where $C_{i,j-1}$ appears. Rearranging the equation for the value $C_{i,j-1}$ of the call option at the earlier time $t_{j-1}$ gives
$$C_{i,j-1} = \frac{\beta^2 h}{\delta^2}\left[\frac12 + \frac{\delta(2r - \beta^2)}{4\beta^2}\right] C_{i+1,j} + \left[1 - rh - \frac{\beta^2 h}{\delta^2}\right] C_{i,j} + \frac{\beta^2 h}{\delta^2}\left[\frac12 - \frac{\delta(2r - \beta^2)}{4\beta^2}\right] C_{i-1,j}. \tag{2.8}$$
This looks like a “trinomial” lattice approximation to the pricing of an option. Indeed, if we choose the increments in the log-price so that the term in $C_{i,j}$ vanishes, namely $\beta^2 h/\delta^2 = 1 - rh$, then (2.8) looks even more like the binomial lattice approximation,
$$C_{i,j-1} = (1 - rh)\left[\,p\,C_{i+1,j} + (1 - p)\,C_{i-1,j}\right],$$
where $p = \frac12 + \frac{\delta(2r - \beta^2)}{4\beta^2}$ and with the multiplication by $1 - rh$ playing the role of the division by the bond factor R.
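To make the reduction from trinomial to binomial form concrete, here is a small Python sketch that evaluates the three coefficients of (2.8). The function name `trinomial_coefficients` and the parameter values (borrowed from Example 2.10 with four steps per year) are illustrative assumptions.

```python
import math

def trinomial_coefficients(r, beta, h, delta):
    """Return the coefficients of C_{i+1,j}, C_{i,j}, C_{i-1,j} in (2.8)."""
    ratio = beta**2 * h / delta**2
    p = 0.5 + delta * (2 * r - beta**2) / (4 * beta**2)
    up = ratio * p
    mid = 1 - r * h - ratio
    down = ratio * (1 - p)
    return up, mid, down

r = math.log(1.12)       # bond interest rate
beta = math.log(1.25)    # volatility
h = 0.25                 # time step, assumed here to be a quarter of a year

# Choose delta so that beta^2 h / delta^2 = 1 - r h, which removes the middle term.
delta = beta * math.sqrt(h / (1 - r * h))
up, mid, down = trinomial_coefficients(r, beta, h, delta)

print(up, mid, down, up + mid + down, 1 - r * h)
```

With this choice of δ the middle coefficient vanishes to rounding error, and the remaining two sum to $1 - rh$, which is the discount factor in the binomial form.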
Example 2.10. Revisit Example 1.9 which valued a call option on an asset with initial price S = 35 over one year with volatility $\beta = \log 1.25 = 0.2231$, bond interest rate $r = \log 1.12 = 0.1133$, and strike price X = 38.50. Perform an n-step approximation, time step h = 1/n years, with $\delta \approx \beta\sqrt{2h}$ so that the coefficient of $C_{i,j}$ is approximately 0.5 to reasonably ensure stability.[26] With n = 4 time steps, Algorithm 2.1 estimates the value of the call option to be C = 3.49, whereas with n = 8 we find C = 3.40, and we find C = 3.41 for n = 16 and above. Using this discretization of the Black–Scholes equation (2.6) determines the value of an option with just n = 16 time steps, whereas with the binomial model we needed n = 256.
Suggested activity: Do at least Exercise 2.15.
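The recursion (2.8) for Example 2.10 can also be sketched in Python. This is an illustrative port rather than the book's own code: instead of padding with nan, it lets the list of values shrink by one entry at each end per time step, and the function name `trinomial_call` and its default arguments are assumptions taken from the example's data.

```python
import math

def trinomial_call(n, s0=35.0, strike=38.50,
                   beta=math.log(1.25), r=math.log(1.12), T=1.0):
    """Value a call by stepping (2.8) backward from expiry over n time steps."""
    h = T / n
    delta = beta * math.sqrt(2 * h)          # log-price spacing used in Example 2.10
    ratio = beta**2 * h / delta**2           # equals 0.5 for this delta
    p = 0.5 + delta * (2 * r - beta**2) / (4 * beta**2)
    up, mid, down = ratio * p, 1 - r * h - ratio, ratio * (1 - p)

    # Expiry values on the 2n+1 grid points x = log(s0) + k*delta, k = -n..n.
    x0 = math.log(s0)
    c = [max(0.0, math.exp(x0 + k * delta) - strike) for k in range(-n, n + 1)]

    # Each backward step loses the two outermost points, which never
    # influence the central value, just as in the binomial lattice.
    for _ in range(n):
        c = [down * c[i - 1] + mid * c[i] + up * c[i + 1]
             for i in range(1, len(c) - 1)]
    return c[0]

for n in (4, 8, 16):
    print(n, round(trinomial_call(n), 2))
```

For n = 4 this reproduces the value 3.49 quoted in the example.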
2.3.2 Self-financing portfolios
One question you may have had about the derivation of the Black–Scholes equation (2.6)
is the following: Why did I not treat the hedge ratio φ as a stochastic Ito process? After all, the hedge ratio varies with the stochastic asset price S (recall $\phi = \partial C/\partial S$) and so should be a stochastic process. The answer comes from investigating more completely the details of purchasing or selling the asset and, in particular, how such trades are financed by cash held in bonds.
[26] A useful rule of thumb to ensure stability of the trinomial model (2.8) is to choose time step h and x-step δ so that none of the three coefficients of $C_{i,j}$ and $C_{i\pm1,j}$ on the right-hand side of (2.8) are negative.

Algorithm 2.1 Code for determining the value of the call option in Example 2.10. Observe the use of nan to introduce unspecified boundary conditions that turn out to be irrelevant on the selected grid just as they are for the binomial model. In SCILAB use %nan instead of nan.
    n=4
    strike=38.50
    beta=log(1.25)
    r=log(1.12)
    h=1/n
    delta=beta*sqrt(2*h) % perhaps double
    rt=beta^2*h/delta^2
    p=0.5+delta*(2*r-beta^2)/(4*beta^2)
    x=log(35)+(-n:n)*delta;   % grid in log-price
    s=exp(x);
    c=max(0,s-strike);        % expiry value of the call
    i=2:2*n;                  % interior grid points
    for j=n:-1:1              % step backward in time using (2.8)
      c=[nan c(i)*(1-r*h-rt)+c(i+1)*rt*p+c(i-1)*rt*(1-p) nan]
    end
    estimated_value=c(n+1)

Imagine we have sold a call option on an asset and seek a time varying portfolio of φ units of the asset and ψ bonds. We transfer money from and to bonds as needed to buy and sell units of the asset. The whole portfolio then has value $\Pi = -C + \phi S + \psi B$ at any time t, where B(t) is the value of a cash bond; we assume B grows exponentially, like $e^{rt}$, though more general models may also be analyzed. Now discretize time into small intervals of length $\Delta t_j$ (perhaps each interval is one day):
• over the jth interval (the jth day) we hold $\phi_j$ units and $\psi_j$ bonds with value from the start of the day of $\Pi_j = -C_j + \phi_j S_j + \psi_j B_j$;
• at the very start of the next (j+1)th interval (the opening of the next day) we find the portfolio has value $\Pi_{j+1} = -C_{j+1} + \phi_j S_{j+1} + \psi_j B_{j+1}$;
• at the start of the (j+1)th interval (the start of the day’s trading) we immediately adjust the portfolio to make it risk free over the forthcoming (j+1)th interval by trading to then hold $\phi_{j+1}$ units and $\psi_{j+1}$ bonds with value $-C_{j+1} + \phi_{j+1} S_{j+1} + \psi_{j+1} B_{j+1}$; as the portfolio is to be self-financing, this must have the same value as $\Pi_{j+1}$, since we just trade cash bonds for units of the asset; consequently, for a self-financing portfolio,
$$\begin{aligned}
&-C_{j+1} + \phi_{j+1} S_{j+1} + \psi_{j+1} B_{j+1} = -C_{j+1} + \phi_j S_{j+1} + \psi_j B_{j+1} \\
&\Rightarrow\ S_{j+1}(\phi_{j+1} - \phi_j) + B_{j+1}(\psi_{j+1} - \psi_j) = 0 \\
&\Rightarrow\ S_{j+1}\,\Delta\phi_j + B_{j+1}\,\Delta\psi_j = 0.
\end{aligned}$$
The change in value of the portfolio from one time step to the next (from one day to the next) is thus
$$\begin{aligned}
\Delta\Pi_j &= -C_{j+1} + \phi_{j+1} S_{j+1} + \psi_{j+1} B_{j+1} - (-C_j + \phi_j S_j + \psi_j B_j) \\
&= -\Delta C_j + \phi_j (S_{j+1} - S_j) + \psi_j (B_{j+1} - B_j) + (\phi_{j+1} - \phi_j) S_{j+1} + (\psi_{j+1} - \psi_j) B_{j+1} \\
&= -\Delta C_j + \phi_j\,\Delta S_j + \psi_j\,\Delta B_j + \underbrace{S_{j+1}\,\Delta\phi_j + B_{j+1}\,\Delta\psi_j}_{=0 \text{ as self-financing}}.
\end{aligned}$$
Write this in terms of infinitesimals to see that the change in the value of such a managed, self-financing portfolio is
$$d\Pi = -dC + \phi\,dS + \psi\,dB. \tag{2.9}$$
That is, obtain the change in value of the portfolio as if the number of units of each com-
ponent is held constant—just as we did in deriving the Black–Scholes equation (2.6). Ex-
ercise 2.17 asks you to recover the Black–Scholes equation (2.6) from an ensuing more
sophisticated analysis based upon (2.9).
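The discrete identity behind (2.9) is easy to check numerically. The Python sketch below uses made-up price paths and an arbitrary sequence of hedge ratios (all hypothetical, chosen only for illustration), fixes the bond holdings by the self-financing condition $S_{j+1}\Delta\phi_j + B_{j+1}\Delta\psi_j = 0$, and confirms that the change in portfolio value equals the change computed as if the holdings were constant over each step.

```python
import random

random.seed(1)
n = 50

# Hypothetical paths for asset S, bond B, option C and hedge ratio phi.
S, B, C, phi = [35.0], [1.0], [3.4], [0.6]
for j in range(n):
    S.append(S[-1] * (1 + random.gauss(0.0005, 0.01)))
    B.append(B[-1] * 1.0004)
    C.append(max(0.0, C[-1] + random.gauss(0.0, 0.05)))
    phi.append(min(1.0, max(0.0, phi[-1] + random.gauss(0.0, 0.02))))

# Start with zero net value (an arbitrary choice), then let the
# self-financing condition determine every later bond holding.
psi = [(C[0] - phi[0] * S[0]) / B[0]]
for j in range(n):
    psi.append(psi[j] - S[j + 1] * (phi[j + 1] - phi[j]) / B[j + 1])

def value(j):
    # Portfolio value Pi_j = -C_j + phi_j S_j + psi_j B_j.
    return -C[j] + phi[j] * S[j] + psi[j] * B[j]

max_gap = 0.0
for j in range(n):
    d_pi = value(j + 1) - value(j)
    held_constant = (-(C[j + 1] - C[j])
                     + phi[j] * (S[j + 1] - S[j])
                     + psi[j] * (B[j + 1] - B[j]))
    max_gap = max(max_gap, abs(d_pi - held_constant))

print(max_gap)
```

The printed gap should be at the level of rounding error, whatever paths are used, because the identity is purely algebraic.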
Summary
Ito’s formula identifies risk-free portfolios that underpin the Black–Scholes equation (2.6)
for valuing options. Discretizations of the Black–Scholes equation empower accurate nu-
merical valuation of options, but need to be checked qualitatively.
2.4 Summary
• In stochastic calculus, differentials dW, dt, and $dW^2 = dt$ are retained, whereas $dt^2 = dt\,dW = 0$, as are higher order products.
• Ito’s formula is that if $Y = f(t, X(t))$, then
$$dY = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX + \frac12\frac{\partial^2 f}{\partial x^2}\,dX^2,$$
where $dX^2$ is to be simplified according to the above rules.
• The Black–Scholes PDE, for the value of an option at time t when the asset has price S,
$$\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2} = rC,$$
is an advection-diffusion equation to be solved backward in time from the expiry values of the option C(T, S).
• Numerical discretizations of the Black–Scholes equation are effective in valuing options.
• Self-financing portfolios of φ and ψ units, respectively, of assets with prices S and B satisfy $S\,d\phi + B\,d\psi = 0$.
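The first rule in the summary can be illustrated by summing products of increments along one simulated Wiener path. This Python sketch (seed, step count, and variable names are arbitrary assumptions) shows the sum of $\Delta W^2$ staying near the elapsed time while the sums of $\Delta t\,\Delta W$ and $\Delta t^2$ are negligible.

```python
import math
import random

random.seed(0)
T = 1.0
n = 10000
h = T / n

# Wiener increments over each step: dW = sqrt(h) * Z with Z ~ N(0, 1).
dW = [math.sqrt(h) * random.gauss(0.0, 1.0) for _ in range(n)]

sum_dW2 = sum(d * d for d in dW)     # behaves like the sum of dt, namely T
sum_dtdW = sum(h * d for d in dW)    # vanishes as h -> 0
sum_dt2 = n * h * h                  # equals T*h, which vanishes as h -> 0

print(sum_dW2, sum_dtdW, sum_dt2)
```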
Exercises
2.1. Use the probability distribution function $p(z) = \exp(-z^2/2)/\sqrt{2\pi}$ for N(0, 1) random variables Z and integration by parts to deduce that $E[Z^p] = (p-1)(p-3)\cdots 1$ for even integers p. Hence deduce for Wiener processes W(t) that $E[W(t)^p] = (p-1)(p-3)\cdots 1 \cdot t^{p/2}$ (for even integers p) by writing, at any given time, $W = Z\sqrt{t}$ where Z ∼ N(0, 1).
2.2. Show that almost surely $\sum_{j=0}^{n-1} (\Delta W_j)^4 \to 0$ as the time step $\Delta t_j = h \to 0$ for a fixed final time T = nh. HINT: Use that $E[|Z|^4] = 3$ for random variates Z ∼ N(0, 1).
2.3. Argue that almost surely $\sum_{j=0}^{n-1} |\Delta W_j|^p \to 0$ as $h \to 0$ for $p \ge 3$. HINT: Use that $E[|Z|^p]$ is finite for Z ∼ N(0, 1).
2.4. Use Ito’s formula to deduce the differential dX for the stochastic process X(t) =
2+t +exp(W(t)) .
2.5. Use Ito’s formula to find an SDE satisfied by the Ito process $X = \cos(t^2 + W)$. Write the SDE in terms of only t, X, and the differentials dt and dW.
2.6. Use Ito’s formula to show that the following are the solutions to the SDEs you solved numerically in Exercise 1.7:
1. $X = (1+t)^2 (X_0 + t + W(t))$ satisfies $dX = \left[\frac{2X}{1+t} + (1+t)^2\right] dt + (1+t)^2\,dW$;
2. $X = (1 + W(t))^2$ satisfies $dX = dt + 2\sqrt{X}\,dW$ with $X_0 = 1$;
3. $X = \log(1 + W(t))$ satisfies $dX = -\frac12 e^{-2X}\,dt + e^{-X}\,dW$ with $X_0 = 0$;
4. $X = \sinh(t + W(t))$ satisfies $dX = \left(\frac12 X + \sqrt{1 + X^2}\right) dt + \sqrt{1 + X^2}\,dW$ with $X_0 = 0$; and
5. $Z = 1/(1 + W(t))$ satisfies $dZ = Z^3\,dt - Z^2\,dW$ with $Z_0 = 1$.
What features that you saw in the numerical solutions are explained by these analytic solutions?
2.7. Let $I_0(t) = 1$, $I_1(t) = W(t)$, $I_2(t) = W(t)^2 - t$, $I_3(t) = W(t)^3 - 3tW(t)$, and $I_4(t) = W(t)^4 - 6tW(t)^2 + 3t^2$. Use Ito’s formula to show $dI_n = nI_{n-1}\,dW$. Describe the analogue with classic calculus. Use guesswork, checked with Ito’s formula, to determine corresponding $I_5$ and $I_6$.[27]
2.8. Let an Ito process $Y(t) = 1/(t + X(t))$ in terms of the Ito process $X(t) = W(t)^2$, where W(t) is a Wiener process. Use Ito’s formula to deduce an expression for dY in terms of only t, dt, Y, and dW.
2.9. Use the simple version (2.3) of Ito’s formula to generate, by substituting a variety
of functions f(t, W), a variety of SDEs whose analytic solutions you know.
2.10. Use Ito’s formula (2.4) for the following:
[27] These $I_n(t)$ are closely related to the Hermite polynomials; see (Kreyszig 1999, pp. 246–247) or (Abramowitz and Stegun 1965, Chap. 22).
• Since $X = t + W(t)$ satisfies $dX = dt + dW$, confirm that $Y = \sinh X$ satisfies $dY = \left(\frac12 Y + \sqrt{1 + Y^2}\right) dt + \sqrt{1 + Y^2}\,dW$.
• For the above Ito process Y, deduce dZ for Ito process $Z = Y^3$.
Recall that $\frac{d}{dx}\sinh x = \cosh x$, $\frac{d}{dx}\cosh x = \sinh x$, and $\cosh^2 x = 1 + \sinh^2 x$.
2.11. Consider the Black–Scholes equation (2.6) with zero volatility:
$$\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} = rC.$$
Show by algebraic differentiation that $C = f(Se^{-rt})e^{rt}$ satisfies this PDE for any differentiable function f.
Now suppose you are considering some option with specified value at the expiry time T, say $C_T(S)$. Deduce that the above function $f(S) = C_T(RS)/R$, where $R = e^{rT}$. Hence argue that all points on the specified curve $(S_T, C_T)$ (where $S_T$ denotes the value of the asset at expiry) are mapped by the above PDE to points (S, C) at time t = 0 by the simple contraction mapping $S = S_T/R$ and $C = C_T/R$. Thus comment on how the Black–Scholes equation (2.6) with nonzero volatility predicts that the initial value of an option is just a smoothed version of the final value, but scaled by the factor 1/R.
2.12. In November 2007 the company Wesfarmers purchased the Coles–Myer group. As
part of the settlement, Coles–Myer shares were transformed into Wesfarmers Par-
tially Protected Shares (WPPS). With a Floor Price specified to be $36, a Cap Price
of $45, and the Lapse Date meaning the date four years from the Issue Date, the
specifications of the WPPS included the following: On a Lapse Date determined by
Wesfarmers,
1. each Partially Protected Share will be reclassified into one Ordinary Share;
and
2. subject to clause 8, each Partially Protected Shareholder will be issued an
additional number of Ordinary Shares for each Partially Protected Share held
on that date, in accordance with the following:
(a) if the MVWAP is equal to or more than the Cap Price, no additional Ordi-
nary Shares;
(b) if the MVWAP is equal to or less than the Floor Price, 0.25 Ordinary
Shares; and
(c) if the MVWAP is between the Floor Price and the Cap Price, the number
of Ordinary Shares calculated using the following formula:
(Cap Price / MVWAP) − 1
where MVWAP means the VWAP for the period of two months immediately
preceding, but not including, the date of the Lapse Notice.
Instead of part 2(c), for simplicity treat the MVWAP as the sale price of Ordinary
Shares sold on the Australian Stock Exchange on the Lapse Date. A challenge is to
find the initial value of these WPPS using the Black–Scholes equation; for now just do the following. Translate the above WPPS conditions into an expiry condition for the Black–Scholes equation (a condition that depends upon the sale price).
[Figure 2.6: four panels (a), (b), (c), and (d), each plotting option value C against asset price S from 0 to 100.]
Figure 2.6. The expiry value of four different options on an asset.
2.13. When one purchases, as an asset, shares in a company, one also expects some reg-
ular dividends as part of the financial benefit of holding the shares. Generalize the
arguments of Section 2.3 to derive a modified Black–Scholes equation when the as-
set in the portfolio also returns dividends at a known rate δ. That is, for each share
held in the portfolio, over a small time interval Δt the dividend contributes δΔt to
increasing the value of the portfolio.
2.14. This exercise is challenging. Extend the previous modification to the case where
the dividend has both a deterministic and stochastic component; over a small time
interval the dividend’s contribution to increasing the value of the portfolio would be
δΔt +ρΔW.
2.15. Use the trinomial model to value the following option. Comment on the match with
the qualitative graphical solution.
You, a bookmaker, have a client, Bob, who wants to bet $100 that Telstra shares will
rise to be above $5 in one year’s time. The share’s current value is $4. Suppose the
interest rate for bonds is 10% per year, and the volatility of the shares is 20% per $\sqrt{\text{year}}$. Deduce that you charge Bob at least about $20 to make this bet.
2.16. Modify the trinomial MATLAB/SCILAB code for Example 2.10 to value a knock out
option (Example 2.9) which is as for Example 2.10 but additionally pays a rebate
of $1 if the asset price ever falls to $30.
2.17. Using a self-financing portfolio and assuming the interest rate on bonds is the con-
stant r, rederive the Black–Scholes equation (2.6) from (2.9).
2.18. Consider in turn each of the four options whose expiration values are plotted in
Figure 2.6. Use the qualitative arguments of Section 2.3 to sketch by hand the
option value, as a function of S, as predicted by the Black–Scholes equation at a
significantly earlier time.
Answers to selected exercises
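2.4. $dX = (1 + \frac12 e^W)\,dt + e^W\,dW = \frac12 (X - t)\,dt + (X - t - 2)\,dW$.
2.5. $dX = -\left(\frac12 X + 2t\sqrt{1 - X^2}\right) dt - \sqrt{1 - X^2}\,dW$.
2.8. $dX = dt + 2\sqrt{X}\,dW$, and hence $dY = 2Y^2 (1 - 2tY)\,dt - 2Y^2 \sqrt{\frac{1}{Y} - t}\;dW$.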
2.16. The knock out value is $3.59.
Chapter 3
The Fokker–Planck Equation Describes the Probability Distribution
Contents
3.1 The probability distribution evolves forward in time
3.1.1 Steady state probability distributions
3.1.2 Modeling large birth and death processes
3.2 Stochastically solve deterministic differential equations
3.3 The Kolmogorov backward equation completes the picture
3.4 Summary
Exercises
Answers to selected exercises
Previous chapters concentrated on the properties of individual realizations of a
stochastic process: that they have drift and volatility, and that simple numerics converge,
albeit slowly. The alternative is to discover the statistics of stochastic processes: their mean
and variance, or more generally, the probability distribution of the realizations. Compare
this with Markov chains where, instead of running many simulations of actual realizations
of the process, we typically discuss the probability distribution over the states of the chain
(see, e.g., Kao 1997, Chap. 4). In Markov chains, the transition matrix guides the evolution
of the probabilities; for SDEs, the Fokker–Planck equation describes the evolution of the
probability distribution.
Example 3.1 (Wiener processes diffuse). Consider an ensemble of realizations of a Wiener process W(t); for example, see five of the realizations shown in Figure 1.3, and the ten realizations in Figure 3.1. As time increases, these realizations spread out over w-space, that is, they diffuse. We know that at any time t, a Wiener process is distributed as an N(0, t) random variable. Thus its probability distribution function (PDF) is the Gaussian, normal distribution $p(t, w) = \exp(-w^2/2t)/\sqrt{2\pi t}$: as shown in Figure 3.2, this Gaussian spreads in time corresponding to the spreading realizations. Recall that this Gaussian distribution satisfies the PDE for diffusion (see, e.g., Kreyszig 1999, §11.5–6):
$$\frac{\partial p}{\partial t} = \frac12\frac{\partial^2 p}{\partial w^2}.$$
[Figure 3.1: plot of W(t) against time t from 0 to 1.]
Figure 3.1. Ten realizations of a Wiener process W(t).
[Figure 3.2: plot of p(t, w) against w, with curves labeled 0.2, 1, and 5.]
Figure 3.2. The spreading Gaussian probability distribution function p(t, w) of a Wiener process at three times: t = 0.2, t = 1, and t = 5.
This PDE governing the evolution of the PDF p(t, w) is the simplest example of a Fokker–
Planck equation describing the evolution of the distribution of a stochastic process.
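That the Gaussian PDF of Example 3.1 satisfies this diffusion PDE can be checked numerically. The Python sketch below evaluates the residual $\partial p/\partial t - \frac12\,\partial^2 p/\partial w^2$ by centered differences at a few sample points; the sample points and the step size `eps` are arbitrary assumptions.

```python
import math

def p(t, w):
    # Gaussian PDF of a Wiener process at time t: N(0, t).
    return math.exp(-w * w / (2 * t)) / math.sqrt(2 * math.pi * t)

def residual(t, w, eps=1e-4):
    # Centered differences for p_t and p_ww.
    p_t = (p(t + eps, w) - p(t - eps, w)) / (2 * eps)
    p_ww = (p(t, w + eps) - 2 * p(t, w) + p(t, w - eps)) / eps**2
    return p_t - 0.5 * p_ww

worst = max(abs(residual(t, w))
            for t in (0.2, 1.0, 5.0)
            for w in (-2.0, -0.5, 0.0, 1.0, 3.0))
print(worst)
```

The residual is not exactly zero only because of the finite difference and rounding errors.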
However, this view of the Wiener process is limited in that, for example, it does not
discern the nature of the increments nor the continuity of the stochastic process. Consider a
random function $X(t) = Z\sqrt{t}$, where Z is a random variable distributed N(0, 1) so X(t) is
a square-root curve with a randomly chosen coefficient; then X(t) has exactly the same
Gaussian PDF as the Wiener process but is quite different in nature. The same comment
holds if Z is instead generated randomly and independently at each time t.
Although the PDF does not discriminate between the possibilities discussed at the end of this example, useful statistics are obtained from the PDF and sometimes show overall features missed by individual realizations of the SDE! Further, when we cannot solve the SDE, we can sometimes solve the Fokker–Planck equation, and hence make at least partial analytic progress.
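Example 3.2 (expect important but rare large fluctuations). Section 2.1.2 describes how the solution of the SDE $dX = \alpha X\,dt + \beta X\,dW$ is the stochastic process $X(t) = X_0\exp[(\alpha - \frac12\beta^2)t + \beta W(t)]$. This process almost always decays to zero provided $\alpha - \frac12\beta^2 < 0$ (even when the drift rate $\alpha > 0$). However, this remarkable statement misleads as rare fluctuations are significant.
Determining the expected value of X(t) from its PDF illustrates the significance of the rare fluctuations. First, we determine that
$$E\left[e^{\beta W(t)}\right] = e^{\beta^2 t/2}, \tag{3.1}$$
using the PDF of the Wiener process. From the definition of expectation in terms of the PDF,
$$\begin{aligned}
E[e^{\beta W(t)}] &= \int_{-\infty}^{\infty} e^{\beta w}\,p(t, w)\,dw \\
&= \int_{-\infty}^{\infty} e^{\beta w}\,\frac{1}{\sqrt{2\pi t}}\,e^{-w^2/2t}\,dw \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\exp\left[-\frac{w^2}{2t} + \beta w\right] dw \quad\text{(then upon completing the square)} \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\exp\left[-\frac{1}{2t}(w - t\beta)^2 + \frac{\beta^2 t}{2}\right] dw \\
&= e^{\beta^2 t/2}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\,e^{-(w - t\beta)^2/(2t)}\,dw \\
&= e^{\beta^2 t/2}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\,e^{-u^2/(2t)}\,du \quad\text{(substituting } u = w - t\beta\text{)} \\
&= e^{\beta^2 t/2},
\end{aligned}$$
as the integral of the Gaussian is 1. This derivation confirms (3.1).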
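The identity (3.1) can also be confirmed by direct numerical quadrature of the defining integral. The following Python sketch uses a midpoint rule on a wide finite interval; the function name `expect_exp_bw`, the truncation width, and the number of panels are illustrative assumptions.

```python
import math

def expect_exp_bw(beta, t, n=20000):
    """Midpoint-rule estimate of E[exp(beta*W(t))] for W(t) ~ N(0, t)."""
    # Truncate the infinite range; the integrand is negligible beyond this.
    L = 12 * math.sqrt(t) + abs(beta) * t
    dw = 2 * L / n
    total = 0.0
    for k in range(n):
        w = -L + (k + 0.5) * dw
        pdf = math.exp(-w * w / (2 * t)) / math.sqrt(2 * math.pi * t)
        total += math.exp(beta * w) * pdf * dw
    return total

beta, t = 2.0, 1.0
numeric = expect_exp_bw(beta, t)
exact = math.exp(beta**2 * t / 2)
print(numeric, exact)
```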
Second, we conclude that the solutions to exponential Brownian motion, although almost always tending to zero for large enough β, nonetheless have a growing expectation whenever $\alpha > 0$:
$$\begin{aligned}
E[X(t)] &= E\left[X_0 e^{(\alpha - \beta^2/2)t + \beta W(t)}\right] \\
&= E\left[X_0 e^{(\alpha - \beta^2/2)t}\,e^{\beta W(t)}\right] \\
&= X_0 e^{(\alpha - \beta^2/2)t}\,E\left[e^{\beta W(t)}\right] \\
&= X_0 e^{(\alpha - \beta^2/2)t}\,e^{\beta^2 t/2} \\
&= X_0 e^{\alpha t}.
\end{aligned}$$
For example, the almost surely decaying realizations $X(t) = e^{-t + 2W(t)}$ of the SDE $dX = X\,dt + 2X\,dW$ (such that X(0) = 1) nonetheless have an exponentially growing expectation $E[X(t)] = e^t$. Thus very rare, very large fluctuations in X(t) must occur so that the expectation can grow even though almost all realizations decay to zero. The histogram of Figure 2.1 hinted at this significance of the few large fluctuations.
Further, as a specific instance of Exercise 3.1, you will also find here that $\mathrm{Var}[X(t)] = e^{6t} - e^{2t}$, which grows alarmingly quickly even though, again, almost all realizations decay to zero!
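A small sampling experiment makes the contrast between typical and expected behavior visible. The Python sketch below draws samples of $X(t) = e^{-t + 2W(t)}$ at the arbitrarily chosen time t = 4 and compares the fraction that have decayed below the initial value with the exact expectation $e^t$; the seed and sample size are assumptions. The sample mean is deliberately not used as an estimate, since it is dominated by the rare large values.

```python
import math
import random

random.seed(2)
t = 4.0
n = 20000

# W(t) ~ N(0, t), so sample W(t) = sqrt(t) * Z with Z ~ N(0, 1).
X = [math.exp(-t + 2 * math.sqrt(t) * random.gauss(0.0, 1.0)) for _ in range(n)]

fraction_below_one = sum(x < 1.0 for x in X) / n   # realizations that decayed
median = sorted(X)[n // 2]                         # a typical realization
exact_mean = math.exp(t)                           # E[X(t)] = e^t from the text

print(fraction_below_one, median, exact_mean)
```

Most samples lie below one and the median is tiny, yet the exact expectation at this time exceeds fifty.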
Example 3.3 (random walkers solve Laplace’s equation). One way that we can approximately solve Laplace’s equation $\nabla^2 u = 0$ in some domain (see, e.g., Kreyszig 1999, §11.9–11) is to choose a point of interest $(x_0, y_0)$ and release from that point a large number of random walkers (drunks!) who execute Brownian motion in both x and y: that is, $x = W_1(t)$ and $y = W_2(t)$ for two independent Wiener processes. These (drunken) walkers stick to the boundary when they first hit the boundary as shown in Figure 3.3 (see the simulation of Algorithm A.3). Then when enough walkers have become stuck to the boundary, one simply takes the average of the boundary values f(P) to estimate $u(x_0, y_0)$. Amazingly, this stochastic procedure finds the solution of the deterministic Dirichlet problem $\nabla^2 u = 0$ such that u = f(P) on the boundary.[28]
One way to see the connection between the random walkers and Laplace’s equation is to recognize that the PDF of the random walkers satisfies the two-dimensional diffusion equation $\frac{\partial p}{\partial t} = \frac12\nabla^2 p$; then when they remain stuck to the boundary, the time derivative becomes zero, and hence the walkers have somehow solved $0 = \nabla^2 p$, which is Laplace’s equation. Section 3.2 explores this useful connection between SDEs and deterministic differential equations.
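The random-walker procedure of Example 3.3 can be sketched in Python as follows. This is not Algorithm A.3; it is a minimal illustration under stated assumptions: the domain is the unit square, the boundary data is the hypothetical harmonic function $x^2 - y^2$ (so the exact interior solution is known), walkers start at the release point, and a walker that steps outside is moved to the nearest boundary point.

```python
import math
import random

def boundary_value(x, y):
    # Hypothetical boundary data f(P): the harmonic function x^2 - y^2.
    return x * x - y * y

def estimate_u(x0, y0, walkers=2000, dt=1e-3, seed=3):
    """Average the boundary values where Brownian walkers from (x0, y0) exit."""
    rng = random.Random(seed)
    step = math.sqrt(dt)
    total = 0.0
    total_sq = 0.0
    for _ in range(walkers):
        x, y = x0, y0
        while 0.0 < x < 1.0 and 0.0 < y < 1.0:
            x += step * rng.gauss(0.0, 1.0)
            y += step * rng.gauss(0.0, 1.0)
        # Stick the walker to the boundary of the unit square.
        x = min(1.0, max(0.0, x))
        y = min(1.0, max(0.0, y))
        f = boundary_value(x, y)
        total += f
        total_sq += f * f
    mean = total / walkers
    variance = max(total_sq / walkers - mean * mean, 0.0)
    # Error estimate: sample standard deviation over sqrt(number of walkers).
    return mean, math.sqrt(variance / walkers)

estimate, error = estimate_u(0.7, 0.4)
exact = 0.7**2 - 0.4**2
print(estimate, error, exact)
```

The finite time step introduces a small additional bias near the boundary, so agreement with the exact value is only approximate.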
The Fokker–Planck equation establishes a useful transformation between the solu-
tion of SDEs and certain PDEs. One great advantage of using this connection for solving
PDEs is that you do not have to compute the solution everywhere if all you need is the
solution at one or a few points. Another great advantage is that it easily handles complex
shaped domains, as it needs no underlying grid. This technique is incredibly useful for
solving problems in very high dimensional domains. For example, in investigating quan-
tum dynamics (of Bose–Einstein condensates) with 1000 atoms, say, one wants to solve
PDEs in a vector space with over $10^{1000}$ dimensions! The stochastic solution involving many realizations of just 1000 atoms is accessible via supercomputers. That there is this connection between PDEs and SDEs is established by the Fokker–Planck equation for the PDFs of the stochastic process.
[28] An estimate of the error is the sample standard deviation divided by the square root of the number of walkers who reached the boundary.
[Figure 3.3: four panels on the unit square, at time 0, time 0.05, time 0.15, and time 0.3.]
Figure 3.3. Forty random walkers (circles) released from (0.7, 0.4), spread out stochastically in time until they hit the boundaries (the unit square) where they stick to collectively estimate u(0.7, 0.4). The contour plot shows the true solution of Laplace’s equation: its value on the boundary gives the contribution of each random walker to the estimate of u(0.7, 0.4).
3.1 The probability distribution evolves forward in time
We now discover how to describe the evolution of the probability distribution function
p(t, x) for a general Ito process with some drift and volatility. For simplicity we just consider one stochastic process dependent upon one Wiener process; the generalization to multiple coupled stochastic processes involving multiple independent Wiener processes is analogous and of much practical use, but we will not explore it in this book. This section ends by showing how to model biological populations with stochastic effects representative of those endemic in the environment.
[Figure 3.4: schematic of the steps with Z = −1 and Z = +1 from $s(t, \xi_u)$ and $s(t, \xi_d)$ to $s(t+h, x)$, and of the intervals $[\xi_u, \xi_u + d\xi]$ and $[\xi_d, \xi_d + d\xi]$ mapping to $[x, x + dx]$.]
Figure 3.4. Left: either an up or down step reaches the state x at time t + h. Right: small intervals of length dξ reach a small interval of length dx over one time step.
Analogue with Markov chains
The probability distribution function p(t, x), namely, the probability of realizations being near x at time t, is analogous to the vector of probabilities p(t), where the jth component, $p_j(t)$, is the probability of being in state j at time t, as discussed in Markov chains such as birth and death processes (see, e.g., Kao 1997, §4.1).
Theorem 3.4 (Fokker–Planck equation). Consider the Ito process X(t) with drift μ(X) and volatility σ(X), and hence satisfying the SDE $dX = \mu\,dt + \sigma\,dW$, where for simplicity we restrict our attention to the autonomous case of no direct time dependence in the drift and volatility. Then the PDF of the ensemble of realizations p(t, x) satisfies the Fokker–Planck equation, alternatively called the Kolmogorov forward equation,
$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}[\mu p] + \frac{\partial^2}{\partial x^2}\left[\frac12\sigma^2 p\right]. \tag{3.2}$$
Analogue
The Fokker–Planck equation is analogous to the evolution of probability distributions in
Markov chains: p(t +1) = p(t)P for some transition matrix P. In the Fokker–Planck
equation (3.2), the x derivatives involving the drift and volatility operate exactly like the
transition matrix P—they operate to dictate how probability distributions evolve in time.
Proof. We derive this Fokker–Planck equation for the case of constant volatility σ (variable volatility makes the details much more complicated).[29] Throughout the derivation, dashes denote $\partial/\partial x$, and the drift μ is evaluated at x unless otherwise specified.
We investigate the probability that the realization at time t +h is near the value x,
that is, within the interval [x, x +dx] for some small (infinitesimal) interval length dx. In
terms of the PDF, the probability of being in this interval is
Pr {X(t +h) ∈ [x, x+dx]} =p(t +h, x)dx.
[29] Exercises 3.7–3.8 ask you to deal with specific cases of variable volatility. Appendix B.1 gives a proof for general volatility.
The realization reaches x from a variety of possible values of X(t) depending upon the Wiener increment $\Delta W = \sqrt{h}\,Z$. That Z is a normal random variable is largely immaterial: so long as Z has zero mean and unit variance, its cumulative sum, over many small time steps, will quickly become normal. Thus for simplicity and to the same effect assume $Z = \pm1$ each with probability $\frac12$. This assumption is rather like the binary model of asset price movement. As Figure 3.4 (left) shows, the process reaches x if it starts from $\xi_u$ with $Z = +1$, $\Delta W = +\sqrt{h}$, or if it starts from $\xi_d$ if $Z = -1$, $\Delta W = -\sqrt{h}$, where the values for ξ are determined from
$$x = \xi + \mu(\xi)h \pm \sigma\sqrt{h}.$$
This is a pair of implicit equations for the two ξ values. Solve the pair approximately by iteration after rearranging to $\xi = x - \mu(\xi)h \mp \sigma\sqrt{h}$ and starting from[30]
$$\begin{aligned}
\xi &\approx x; \\
\xi &\approx x - \mu(x)h \mp \sigma\sqrt{h}; \\
\xi &= x - \mu h \mp \sigma\sqrt{h} + \cdots.
\end{aligned}$$
Differentiating this shows how intervals of X at time t become slightly stretched or compressed in the evolution to time t + h, as shown in Figure 3.4 (right):
$$d\xi = \left[1 - \mu' h + \cdots\right] dx.$$
Then because $Z = \pm1$, each with probability $\frac12$,
$$\begin{aligned}
p(t+h, x)\,dx &= \Pr\{X(t+h) \in [x, x+dx]\} \\
&= \tfrac12\Pr\{X(t) \in [\xi_u, \xi_u + d\xi_u]\} + \tfrac12\Pr\{X(t) \in [\xi_d, \xi_d + d\xi_d]\} \\
&= \tfrac12\,p(t, \xi_u)\,d\xi_u + \tfrac12\,p(t, \xi_d)\,d\xi_d \\
&= \tfrac12\,p(t, x - \mu h - \sigma\sqrt{h})\,[1 - \mu' h]\,dx + \tfrac12\,p(t, x - \mu h + \sigma\sqrt{h})\,[1 - \mu' h]\,dx + \cdots.
\end{aligned}$$
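Divide by the infinitesimal dx and expand p in Taylor series about p(t, x):
$$\begin{aligned}
p(t+h, x) &= \tfrac12\left[p - (\mu h + \sigma\sqrt{h})\,p' + \tfrac12(\mu h + \sigma\sqrt{h})^2 p''\right][1 - \mu' h] \\
&\quad + \tfrac12\left[p + (-\mu h + \sigma\sqrt{h})\,p' + \tfrac12(-\mu h + \sigma\sqrt{h})^2 p''\right][1 - \mu' h] + \cdots \\
&= \left[p - \mu h p' + \tfrac12\sigma^2 h p''\right][1 - \mu' h] + \cdots \\
&= p - \mu h p' - \mu' h p + \tfrac12\sigma^2 h p'' + \cdots.
\end{aligned}$$
Putting p onto the left and dividing by h leads to
$$\frac{p(t+h, x) - p(t, x)}{h} = -(\mu p)' + \tfrac12\sigma^2 p'' + \cdots.$$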
[30] Recall that the ellipsis “· · ·” denotes small terms of higher order in time step h that we neglect. Here such terms are typically in $h^{3/2}$, $h^2$, and so on.
Take the limit as $h \to 0$ to deduce
$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}(\mu p) + \frac12\sigma^2\frac{\partial^2 p}{\partial x^2}.$$
This is the Fokker–Planck equation (3.2) for the case of constant volatility. Appendix B.1
gives an alternate and more general proof.
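The binary-step model used in the proof is simple to simulate. The Python sketch below advances an ensemble by $X \to X + \mu(X)h \pm \sigma\sqrt{h}$, each sign with probability one half, and compares the ensemble mean with a consequence of the Fokker–Planck equation that is not derived in the text: multiplying (3.2) by x and integrating by parts (assuming p decays at infinity) gives $dE[X]/dt = E[\mu(X)]$, so for the linear drift $\mu = -\alpha x$ assumed here the mean is $X_0 e^{-\alpha t}$. All names and parameter values are illustrative.

```python
import math
import random

def binary_step_ensemble(mu, sigma, x0, t_end, h, walkers, seed=4):
    """Advance an ensemble with steps mu(x)*h +/- sigma*sqrt(h)."""
    rng = random.Random(seed)
    steps = round(t_end / h)
    root_h = math.sqrt(h)
    xs = [x0] * walkers
    for _ in range(steps):
        # Z = +1 or -1, each with probability 1/2, as in the proof.
        xs = [x + mu(x) * h + sigma * root_h * (1 if rng.random() < 0.5 else -1)
              for x in xs]
    return xs

alpha = 1.0
sigma = math.sqrt(2.0)
xs = binary_step_ensemble(lambda x: -alpha * x, sigma,
                          x0=3.0, t_end=1.0, h=0.01, walkers=4000)

sample_mean = sum(xs) / len(xs)
predicted_mean = 3.0 * math.exp(-alpha * 1.0)
print(sample_mean, predicted_mean)
```

The two means agree to within sampling error and the O(h) error of the time stepping.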
Interpretation
The Fokker–Planck equation (3.2) has the following physical interpretation following from the modeling of motion in a one-dimensional continuum (see, e.g., Roberts 1994). Rewrite equation (3.2) in “conservative” form
$$\frac{\partial p}{\partial t} + \frac{\partial}{\partial x}\left[\mu p - \frac{\partial}{\partial x}(Dp)\right] = 0,$$
where $D(x) = \frac12\sigma(x)^2$. This equation describes the conservation of probability distribution p(t, x) as it “moves” along the x-axis with a flux $q = \mu p - \frac{\partial}{\partial x}(Dp)$. The second component of the flux, $-\frac{\partial}{\partial x}(Dp)$, is a diffusive term induced by the volatility of the SDE. However, physical diffusion normally appears in the flux as $-D\frac{\partial p}{\partial x}$, with D(x) being the effective diffusion coefficient. Here the additional part of the second term $-\frac{\partial}{\partial x}(Dp)$ is $-\frac{\partial D}{\partial x}p = -\sigma\sigma' p$ and is best grouped with the first term $\mu p$. Thus write the flux as
$$q = (\mu - \sigma\sigma')\,p - D\frac{\partial p}{\partial x},$$
and interpret this flux as probability distribution being carried by a mean velocity $\mu - \sigma\sigma'$ due to the drift and to the asymmetry in the noise, and as probability distribution diffusing with coefficient $D = \frac12\sigma^2$.[31]
3.1.1 Steady state probability distributions
Example 3.5 (the Ornstein–Uhlenbeck process). Imagine a car: when you press down
on it for a short time, its suspension reacts and subsequently lifts the car back to its equi-
librium height above the road. The natural decay in the dynamics of the suspension brings
the car back to its equilibrium. Now drive the car along a bumpy road: the car moves
up and down in a complex response to the bumps and to the dynamics of its suspension.
Consider the bumps as a stochastic forcing of the suspension; then we would model the
combined dynamics by an SDE. The height of the car above the road is around about its
normal height, but the bumps cause stochastic fluctuations in its height. Let us see such
behavior in mathematics.
[31] If the random noise is absent, the Fokker–Planck equation reduces to the Liouville equation $\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}(\mu p)$, which describes the dynamics of probability distribution for a deterministic differential equation. This view of the dynamics is sometimes sought when either the initial conditions are stochastic or because chaos in the dynamics, such as fluid turbulence, causes randomness to be effectively generated in the dynamics of the system.
[Figure 3.5: plot of X(t) against time t from 0 to 2.5.]
Figure 3.5. Ten realizations of an Ornstein–Uhlenbeck process X(t) with parameters $\alpha = 1$ and $\sigma = \sqrt{2}$, and all realizations starting with X(0) = 3.
The so-called Ornstein–Uhlenbeck process, met in Exercise 1.6, combines determin-
istic exponential decay (modeling the car’s suspension) with an additive noise (modeling
the forcing of the bumps). The form of the SDE is generally
$$dX = -\alpha X\,dt + \sigma\,dW \tag{3.3}$$
for some constants α and σ measuring the rate of deterministic decay and the level of
stochastic forcing, respectively. Figure 3.5 shows ten realizations, and Figure 3.6 shows
how the corresponding PDF evolves. Find the steady state PDF for this process, that is,
the density of realizations in Figure 3.5 after the initial transients, namely the shape that is
appearing at large times in Figure 3.6.
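Solution: The Fokker–Planck equation (3.2) for the PDF is
$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}(-\alpha x p) + \frac{\partial^2}{\partial x^2}\left(\tfrac12\sigma^2 p\right).$$
Obtain the differential equation for the steady state PDF by setting ∂p/∂t = 0; then this
Fokker–Planck equation becomes
$$0 = \frac{\partial}{\partial x}\left[\alpha x p + \tfrac12\sigma^2\frac{\partial p}{\partial x}\right]
\quad\Rightarrow\quad \text{constant} = \alpha x p + \tfrac12\sigma^2\frac{\partial p}{\partial x}$$
upon integrating.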
Figure 3.6. Evolving PDF p(t, x) for the Ornstein–Uhlenbeck process $dX = -X\,dt + \sqrt{2}\,dW$
with initial condition X(0) = 3 and plotted at times t = 0.1, 0.5, 2.5,
and 10. Compare with the realizations in Figure 3.5.
[Plot: p(t, x) against x for 0 ≤ x ≤ 5.]
The constant on the left-hand side is zero as the PDF p and its derivative have to vanish,
for large enough x, in order for the integral of p to be one. Rearranging it as a separable
ODE (see, e.g., Kreyszig 1999, §1.3–4) leads to
$$\tfrac12\sigma^2\int\frac{dp}{p} = \int -\alpha x\,dx
\quad\Rightarrow\quad \log p = -\alpha x^2/\sigma^2 + \text{constant}
\quad\Rightarrow\quad p = A e^{-\alpha x^2/\sigma^2}$$
for some integration constant A. This is a Gaussian distribution centered on x = 0 and
with width proportional to $\sigma/\sqrt{\alpha}$. The constant of proportionality is then well known to
be $A = \sqrt{\alpha/\pi}/\sigma$ in order to ensure that the total probability, the area under p(x), is one.
The additive noise of the Ornstein–Uhlenbeck process has just “smeared out” the stable
fixed point at X = 0. The width of the smearing is proportional to the strength of the
noise, σ, and proportional to the inverse square root of the rate of attraction of the fixed
point, $1/\sqrt{\alpha}$.
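A quick numerical check of this steady state is possible by simulating the SDE (3.3) with a simple Euler–Maruyama step and comparing the long-time sample variance with the variance σ²/(2α) of the Gaussian found above. The following Python sketch is only illustrative: the function names, the step size h = 0.01, the final time t = 5, the seed, and the number of realizations are assumptions chosen for convenience; only α = 1, σ = √2, and X(0) = 3 come from Figure 3.5.

```python
import math
import random

def simulate_ou(alpha, sigma, x0, t_end, h, rng):
    """Euler-Maruyama path of dX = -alpha*X dt + sigma dW; returns X(t_end)."""
    x = x0
    for _ in range(int(round(t_end / h))):
        x += -alpha * x * h + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
    return x

def steady_pdf(x, alpha, sigma):
    """Steady state PDF p(x) = sqrt(alpha/pi)/sigma * exp(-alpha x^2 / sigma^2)."""
    return math.sqrt(alpha / math.pi) / sigma * math.exp(-alpha * x * x / sigma**2)

rng = random.Random(1)                      # fixed seed (an arbitrary choice)
alpha, sigma = 1.0, math.sqrt(2.0)          # parameters of Figure 3.5
samples = [simulate_ou(alpha, sigma, 3.0, 5.0, 0.01, rng) for _ in range(1000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print("sample mean", mean, "sample variance", var)
print("predicted steady variance", sigma**2 / (2.0 * alpha))
```

With these parameters the predicted variance is one, so the sample variance should come out near one, up to sampling error and the small bias of the finite time step.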
Analogue
The steady state probability distribution p(x) is analogous to the steady state distributions π
found for discrete state stochastic models. The difference here is the continuum of possible
states. Just as you normalize steady state distributions, so we normalize here: for a finite
number of states $\sum_j \pi_j = 1$; here $\int p(x)\,dx = 1$.
Figure 3.7. The Gamma function (3.4) for real argument z.
[Plot: Γ(z) against z for 0 ≤ z ≤ 4.]
Gamma function
In the next example we need to use the Gamma function, Γ(z), which you have possibly
already met in other studies (see, e.g., Kreyszig 1999, pp. A54–55),
$$\Gamma(z) = \int_0^\infty x^{z-1}e^{-x}\,dx, \tag{3.4}$$
as plotted in Figure 3.7 for real z. Integration by parts shows that Γ(z+1) = zΓ(z). Hence,
for example, since Γ(1) = 1 then for integer n, Γ(n+1) = n!. Other special values are
$\Gamma(\frac12) = \sqrt{\pi}$, and hence $\Gamma(\frac32) = \frac12\sqrt{\pi}$.
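These properties are easy to confirm numerically. The sketch below uses Python's standard `math.gamma` and, as an independent check, a crude midpoint-rule approximation of the integral (3.4); the helper name `gamma_integral`, the truncation of the integral at x = 60, and the number of quadrature points are illustrative assumptions.

```python
import math

def gamma_integral(z, upper=60.0, n=200000):
    """Midpoint-rule approximation of the integral of x^(z-1) e^(-x) over 0 < x < upper.

    Intended for z >= 1 only, where the integrand is bounded near x = 0.
    """
    h = upper / n
    return h * sum(((k + 0.5) * h) ** (z - 1.0) * math.exp(-(k + 0.5) * h)
                   for k in range(n))

# Recurrence Gamma(z+1) = z Gamma(z) and the factorial property.
print(math.gamma(3.7), 2.7 * math.gamma(2.7))
print(math.gamma(5), math.factorial(4))

# Special values quoted in the text.
print(math.gamma(0.5), math.sqrt(math.pi))
print(math.gamma(1.5), 0.5 * math.sqrt(math.pi))

# Direct quadrature of (3.4) at z = 5/2, compared with (3/4) sqrt(pi).
print(gamma_integral(2.5), 0.75 * math.sqrt(math.pi))
```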
Example 3.6 (a two humped camel). Investigate the steady state PDF of the SDE
$dX = (3X - X^3)\,dt + X\,dW$ and relate it to the deterministic dynamics. Compare it with the
steady state of the SDE $dX = (3X - X^3)\,dt + 2X\,dW$ that has twice the volatility.
Solution: First, investigate the deterministic dynamics of $dX = (3X - X^3)\,dt$ as done
in courses on ODEs (see, e.g., Kreyszig 1999, §3.3–3.5). This ODE has fixed points
(equilibria) where $3X - X^3 = 0$, namely X = 0 and $X = \pm\sqrt{3}$; the fixed point at X = 0 is
unstable as the linearized dynamics are dX = 3X dt with exponentially growing solutions,
whereas the fixed points at $X = \pm\sqrt{3}$ are stable as the local dynamics, say $X = \pm\sqrt{3} + Y(t)$,
are dY = −6Y dt with exponentially decaying solutions. Thus all deterministic trajectories
evolve to one or another of the fixed points $X = \pm\sqrt{3}$.
Second, consider the steady state PDF p(x) of the SDE $dX = (3X - X^3)\,dt + X\,dW$.
It satisfies the time-independent Fokker–Planck equation
$$0 = -\frac{\partial}{\partial x}\left[(3x - x^3)p\right] + \frac{\partial^2}{\partial x^2}\left[\tfrac12 x^2 p\right].$$
One integral with respect to x leads to
$$-(3x - x^3)p + \frac{\partial}{\partial x}\left[\tfrac12 x^2 p\right] = \text{constant},$$
but this constant has to be zero, as p and its derivatives must vanish for large enough x.
Thus by expanding, rearranging, and recognizing that the ODE is separable, we obtain
$$\begin{aligned}
\tfrac12 x^2\frac{\partial p}{\partial x} = (2x - x^3)p
&\;\Rightarrow\; \int\frac{dp}{p} = \int\frac{4x - 2x^3}{x^2}\,dx \\
&\;\Rightarrow\; \log p = \int \frac{4}{x} - 2x\,dx \\
&\;\Rightarrow\; \log p = 4\log|x| - x^2 + \text{constant} \\
&\;\Rightarrow\; p = A x^4 e^{-x^2}.
\end{aligned}$$
Thus the steady state PDF, as shown in Figure 3.8, is zero near x = 0, increases away
from x = 0 by the $x^4$ factor, but soon is brought back to zero by the rapid decay of the
$e^{-x^2}$ factor. The two humps of the probability distribution correspond to the two stable
fixed points of the deterministic ODE.
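Determine the integration constant A by requiring that the area under the PDF be
one: using symmetry, integration over half the domain requires
$$\begin{aligned}
\tfrac12 &= \int_0^\infty A x^4 e^{-x^2}\,dx \\
&= \frac{A}{2}\int_0^\infty u^{3/2} e^{-u}\,du \quad\text{upon substituting } u = x^2 \\
&= \frac{A}{2}\,\Gamma(5/2) = \frac{3\sqrt{\pi}}{8}A,
\end{aligned}$$
and thus $A = 4/(3\sqrt{\pi})$.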
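Lastly, a perusal of the deterministic part of $dX = (3X - X^3)\,dt + 2X\,dW$ again suggests
that there should be two humps in the PDF near the two stable deterministic fixed
points $X = \pm\sqrt{3}$. However, the steady solutions of the corresponding Fokker–Planck equation,
$$0 = -\frac{\partial}{\partial x}\left[(3x - x^3)p\right] + \frac{\partial^2}{\partial x^2}\left[2x^2 p\right],$$
are derived via
$$\begin{aligned}
-(3x - x^3)p + \frac{\partial}{\partial x}\left[2x^2 p\right] = 0
&\;\Rightarrow\; 2x^2\frac{\partial p}{\partial x} = (-x - x^3)p \\
&\;\Rightarrow\; \int\frac{dp}{p} = \int\frac{-x - x^3}{2x^2}\,dx
\end{aligned}$$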
Figure 3.8. Steady state probability distributions for Example 3.6. Solid blue
line: $dX = (3X - X^3)\,dt + X\,dW$ with lower volatility has two humps; dashed green line:
$dX = (3X - X^3)\,dt + 2X\,dW$ with larger volatility peaks at the origin.
[Plot: PDF p(x) against x for 0 ≤ x ≤ 4.]
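$$\begin{aligned}
&\Rightarrow\; \log p = \int -\frac{1}{2x} - \tfrac12 x\,dx \\
&\Rightarrow\; \log p = -\tfrac12\log|x| - \tfrac14 x^2 + \text{constant} \\
&\Rightarrow\; p = \frac{A}{\sqrt{|x|}}\,e^{-x^2/4}.
\end{aligned}$$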
Figure 3.8 shows that doubling the level of the multiplicative noise, now 2X dW, effectively
stabilizes the fixed point at the origin. The large spike of probability at the origin has finite
area as $1/\sqrt{x}$ is integrable. Requiring the total area under the PDF to be one implies
$$\begin{aligned}
\tfrac12 &= \int_0^\infty A x^{-1/2} e^{-x^2/4}\,dx \\
&= \frac{A}{\sqrt{2}}\int_0^\infty u^{-3/4} e^{-u}\,du \quad\text{upon substituting } u = x^2/4 \\
&= \frac{A}{\sqrt{2}}\,\Gamma(1/4),
\end{aligned}$$
and thus $A = 1/(\sqrt{2}\,\Gamma(1/4)) = 0.1950$.
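Both normalization constants of Example 3.6 can be checked by direct numerical integration. The following Python sketch is one way to do so, under stated assumptions: the midpoint rule, the truncation of each integral, and the substitution x = w² used to tame the integrable spike at the origin are illustrative choices of this sketch, not steps taken in the text.

```python
import math

def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

A_low = 4.0 / (3.0 * math.sqrt(math.pi))             # volatility sigma = x
A_high = 1.0 / (math.sqrt(2.0) * math.gamma(0.25))   # volatility sigma = 2x

# Lower volatility: p = A x^4 exp(-x^2) is smooth, so integrate directly
# over 0 < x < 10 and double the result by symmetry.
area_low = 2.0 * midpoint(lambda x: A_low * x**4 * math.exp(-x * x), 0.0, 10.0)

# Higher volatility: p = A |x|^(-1/2) exp(-x^2/4) has an integrable spike at x = 0.
# Substituting x = w^2 (dx = 2 w dw) gives p dx = 2 A exp(-w^4/4) dw, which is smooth.
area_high = 2.0 * midpoint(lambda w: 2.0 * A_high * math.exp(-w**4 / 4.0), 0.0, 6.0)

print("A (lower volatility) ", A_low, " total area", area_low)
print("A (higher volatility)", A_high, " total area", area_high)
```

Both total areas should print as one to within quadrature error, and the second constant should agree with the quoted value 0.1950.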
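Figure 3.9. Transitions between the numbers of individuals in a population.
[Diagram: a chain of states 0, 1, 2, ..., n−1, n, n+1, ...; between states n and n+1 a
rightward (birth) arrow carries rate $\lambda_n$ and a leftward (death) arrow carries rate $\mu_{n+1}$.]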
3.1.2 Modeling large birth and death processes
Consider the modeling of populations by a birth and death process where we track each
and every birth and death, as shown in the states and transitions of Figure 3.9. But in
large populations we only need to know aggregated births and deaths: for example, how
many thousands of people are infected in a flu epidemic? In large populations it is grossly
inefficient to track each and every event. The Fokker–Planck equation empowers us to
transform a single event model, such as a birth and death model, into an efficient aggregate
SDE model via the evolution of the probability distribution. Here we introduce the key
idea via one example.
Example 3.7 (births/deaths in a large population). Malthus proposed one of the first
mathematical models of biological populations: The number of animals N(t) grows in
time according to dN/dt =(α−β)N, where α is the birth rate per individual and β is the
death rate per individual. When the birth rate is greater than the death rate, α > β, then
inevitably the population grows exponentially. Question: What happens in our uncertain
fluctuating environment? Answer: The population fluctuates as plotted in Figure 3.10.
This example introduces a scheme to describe the number of individuals in a popula-
tion, N(t), as a Markov birth and death process. This is then approximated as the Fokker–
Planck equation (3.2) for a stochastic version of the Malthusian model dN/dt =(α−β)N.
Let n range over the total number of individuals in the population (a nonnegative integer),[32]
and let $p_n(t)$ denote the probability that there are n individuals in the population at time t.
Then changes to the number of individuals in the population are due to births at some
rate $\lambda_n$ and deaths at some rate $\mu_n$, as shown in Figure 3.9. For a biological population
with no constraints on the number of individuals, we expect that the birth and death rates
will be proportional to the number of individuals: $\lambda_n = \alpha n$ and $\mu_n = \beta n$ for some constants
α and β. These are the key parameters and variables of the stochastic Malthusian
model.
How does the number of individuals evolve? The time rate of change of the probability
of there being n individuals is decreased by a birth to n+1 individuals or by a death
to n−1 individuals, or is increased by a death among n+1 individuals or a birth among
n−1 individuals:
$$\begin{aligned}
\frac{dp_n}{dt} &= \lambda_{n-1}p_{n-1} - (\lambda_n + \mu_n)p_n + \mu_{n+1}p_{n+1} \\
&= \alpha(n-1)p_{n-1} - (\alpha+\beta)n\,p_n + \beta(n+1)p_{n+1};
\end{aligned}$$
then, upon assuming $p_n$ varies smoothly in n, Taylor expanding all $p_{n\pm1}$ terms about n,
and using dashes to denote ∂/∂n,
32
For sexual animals, the usual practice is to count only the female of the species, as they are most closely
involved in reproduction.
Figure 3.10. Five realizations of the stochastic population model (3.6) with
growth rate α = 2 and death rate β = 1. Stochastic fluctuations potentially cause large
variations in the population growth.
[Plot: N(t) against time t for 0 ≤ t ≤ 2.]
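$$\begin{aligned}
\frac{dp_n}{dt}
&= \alpha(n-1)\left[p - p' + \tfrac12 p'' - \cdots\right] - (\alpha+\beta)n\,p
   + \beta(n+1)\left[p + p' + \tfrac12 p'' + \cdots\right] \\
&= (\beta-\alpha)p + [\beta n + \beta - \alpha n + \alpha]\,p'
   + \tfrac12[\beta n + \beta + \alpha n - \alpha]\,p'' + \cdots \\
&= \frac{\partial}{\partial n}\left[(\beta-\alpha)n\,p\right]
   + \frac{\partial^2}{\partial n^2}\left\{\tfrac12\left[(\alpha+\beta)n + (\beta-\alpha)\right]p\right\} + \cdots .
\end{aligned}$$
That is, neglecting the higher order terms (in the ellipses),
$$\frac{\partial p}{\partial t} \approx \frac{\partial}{\partial n}\left[(\beta-\alpha)n\,p\right]
   + \frac{\partial^2}{\partial n^2}\left\{\tfrac12\left[(\alpha+\beta)n + (\beta-\alpha)\right]p\right\}. \tag{3.5}$$
To confirm this form of the right-hand side, start with it and work backward, expanding the
derivatives using the product rule, until you match all the earlier terms.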
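The backward check suggested here can be written out as a short worked expansion. This sketch uses only the product rule and the dash notation for ∂/∂n from the text:

```latex
\begin{align*}
\frac{\partial}{\partial n}\big[(\beta-\alpha)n\,p\big]
  &= (\beta-\alpha)p + (\beta-\alpha)n\,p', \\
\frac{\partial^2}{\partial n^2}\Big\{\tfrac12\big[(\alpha+\beta)n+(\beta-\alpha)\big]p\Big\}
  &= (\alpha+\beta)p' + \tfrac12\big[(\alpha+\beta)n+(\beta-\alpha)\big]p'', \\
\text{and their sum is}\quad
  & (\beta-\alpha)p + \big[\beta n+\beta-\alpha n+\alpha\big]p'
     + \tfrac12\big[\beta n+\beta+\alpha n-\alpha\big]p''.
\end{align*}
```

The sum reproduces, term by term, the coefficients of p, p′, and p″ obtained from the Taylor expansion.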
The PDE (3.5) has the form of a Fokker–Planck equation (3.2) with drift $\mu = (\alpha-\beta)n$
and diffusion
$$D = \tfrac12\sigma^2 = \tfrac12\left[(\alpha+\beta)n + (\beta-\alpha)\right] \approx \tfrac12(\alpha+\beta)n$$
for a large population. Such a Fokker–Planck-like equation governs the PDF of
some SDE. Thus we could equivalently model the population by realizations of the stochastic
differential equation $dN = \mu\,dt + \sigma\,dW$, namely
$$dN = (\alpha-\beta)N\,dt + \sqrt{(\alpha+\beta)N}\,dW. \tag{3.6}$$
The first term in this SDE, $(\alpha-\beta)N\,dt$, models Malthusian growth; whereas the second,
$\sqrt{(\alpha+\beta)N}\,dW$, models the fluctuations typical of a large number of random events. Such
stochastic models realistically describe natural fluctuations in populations that we see in
the realizations of Figure 3.10.
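To see the SDE (3.6) in action, the Python sketch below steps it forward with the Euler–Maruyama scheme and compares the ensemble mean with the deterministic Malthusian prediction N(0)e^{(α−β)t}. The rates α = 2 and β = 1 are those of Figure 3.10; everything else (the function name, the initial population of 100, the step size, the seed, the number of realizations, and the rule that a realization reaching zero stays extinct) is an assumption of this sketch rather than something specified in the text.

```python
import math
import random

def simulate_population(alpha, beta, n0, t_end, h, rng):
    """Euler-Maruyama path of dN = (alpha-beta) N dt + sqrt((alpha+beta) N) dW."""
    n = n0
    for _ in range(int(round(t_end / h))):
        if n <= 0.0:
            # Extinct: drift and volatility both vanish at N = 0 (sketch's assumption).
            return 0.0
        dw = math.sqrt(h) * rng.gauss(0.0, 1.0)
        n += (alpha - beta) * n * h + math.sqrt((alpha + beta) * n) * dw
    return max(n, 0.0)

rng = random.Random(2)                       # fixed seed (an arbitrary choice)
alpha, beta, n0, t_end = 2.0, 1.0, 100.0, 1.0
finals = [simulate_population(alpha, beta, n0, t_end, 0.01, rng) for _ in range(2000)]
mean_final = sum(finals) / len(finals)
malthus = n0 * math.exp((alpha - beta) * t_end)
print("simulated mean population", mean_final)
print("Malthusian prediction   ", malthus)
```

The noise term has zero mean, so the ensemble average should track the Malthusian growth while individual realizations scatter around it.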
Suggested activity: Do at least Exercises 3.2 and 3.4(1).
Summary
The Fokker–Planck equation (3.2) empowers us to not only predict steady state distribu-
tions, but to also transform individual-based models into aggregate SDEs that incorpo-
rate realistic fluctuations. Modern cell biochemistry also finds such transformations useful
(Higham 2008).
3.2 Stochastically solve deterministic differential
equations
We now head back toward our earlier claim that the random walkers in a domain somehow
solve Laplace’s equation. The trick is to show that at any time, the average field u expe-
rienced by the random walkers at that time is always the value at the release point. For
simplicity, we address only one spatial dimension. But for wide applicability we permit the
“random walkers” to undergo a quite general stochastic process.
From this, a strong link is cemented between PDEs and SDEs. This link helps solve
the Black–Scholes equation (2.6) for the value of options in finance.
Theorem 3.8 (Feynman–Kac formula). Let u(t, x) satisfy the PDE
$$\frac{\partial u}{\partial t} + \mu(x)\frac{\partial u}{\partial x} + D(x)\frac{\partial^2 u}{\partial x^2} = 0, \tag{3.7}$$
where as usual $D(x) = \frac12\sigma(x)^2$. Let X(t) be the ensemble of solutions to the SDE
$dX = \mu(X)\,dt + \sigma(X)\,dW$ with initial condition X(s) = y; then the specific initial value
u(s, y) = E[u(t, X(t))] for all t. This identity is sometimes called the Feynman–Kac formula.
Example 3.9. The function $u(t, x) = x^2 - t$ is one of the solutions to the diffusion PDE
$\frac{\partial u}{\partial t} = -\frac12\frac{\partial^2 u}{\partial x^2}$, which is the PDE (3.7) with μ = 0 and $D = \frac12$. Show first that the
Feynman–Kac formula holds from the initial point (t, x) = (1, 2), and then that it holds from a
general initial point (t, x) = (s, y).
Solution: The diffusion PDE $\frac{\partial u}{\partial t} + \frac12\frac{\partial^2 u}{\partial x^2} = 0$ has zero drift, μ = 0, and unit volatility,
σ = 1, so the corresponding SDE is just dX = dW with solution of a Wiener process
starting from the point (t, x) = (1, 2), namely X(t) = W(t − 1) + 2.
Now at any later time t > 1, the expectation on the right-hand side of the Feynman–Kac
formula is
$$\begin{aligned}
E[u(t, X(t))] &= E\left[\{W(t-1) + 2\}^2 - t\right] \\
&= E\left[W(t-1)^2 + 4W(t-1) + 4 - t\right] \\
&= E\left[W(t-1)^2\right] + 4E[W(t-1)] + (4 - t) \\
&= \operatorname{Var}[W(t-1)] + 4E[W(t-1)] + (4 - t) \\
&= (t - 1) + 4\times 0 + (4 - t) = 3,
\end{aligned}$$
as, by definition, W(t − 1) is distributed N(0, t − 1). The left-hand side of the Feynman–Kac
formula is $u(1, 2) = 2^2 - 1 = 3$ in agreement.
Similar algebra applies for any initial point (s, y). The solution of the corresponding
SDE starting from the point (t, x) = (s, y) is X(t) = W(t − s) + y. At any later time t > s,
the expectation on the right-hand side of the Feynman–Kac formula is then
$$\begin{aligned}
E[u(t, X(t))] &= E\left[\{W(t-s) + y\}^2 - t\right] \\
&= E\left[W(t-s)^2 + 2yW(t-s) + y^2 - t\right] \\
&= E\left[W(t-s)^2\right] + 2yE[W(t-s)] + (y^2 - t) \\
&= \operatorname{Var}[W(t-s)] + 2yE[W(t-s)] + (y^2 - t) \\
&= (t - s) + 2y\times 0 + (y^2 - t) = y^2 - s,
\end{aligned}$$
as, by definition, W(t − s) is distributed N(0, t − s). The left-hand side of the Feynman–Kac
formula is $u(s, y) = y^2 - s$ in agreement.
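The same identity can be checked by brute force. The Python sketch below samples X(t) = W(t − s) + y directly, using the fact that W(t − s) is distributed N(0, t − s), and averages u(t, X(t)) = X(t)² − t. The function name, the choice t = 3, the seed, and the sample size are illustrative assumptions.

```python
import math
import random

def feynman_kac_estimate(s, y, t, m, rng):
    """Monte Carlo estimate of E[u(t, X(t))] for u = x^2 - t and dX = dW with X(s) = y."""
    total = 0.0
    for _ in range(m):
        x = y + math.sqrt(t - s) * rng.gauss(0.0, 1.0)   # X(t) = W(t - s) + y
        total += x * x - t
    return total / m

rng = random.Random(3)                       # fixed seed (an arbitrary choice)
estimate = feynman_kac_estimate(1.0, 2.0, 3.0, 100000, rng)
print("estimate of E[u(3, X(3))]:", estimate)
print("u(1, 2) = 2^2 - 1        :", 2.0**2 - 1.0)
```

The estimate should lie close to 3 whatever later time t is chosen, as the formula predicts; only the sampling error changes with t.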
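Proof. Let p(t, x|s, y) be the PDF for the general SDE of Theorem 3.8. It must satisfy
the corresponding Fokker–Planck equation:
$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}[\mu p] + \frac{\partial^2}{\partial x^2}[Dp].$$
Now consider the expected value of u over all realizations released from X = y at time s:
$$E[u(t, X(t))] = \int u(t, x)\,p(t, x|s, y)\,dx,$$
where the implicit limits of integration are over all x from −∞ to +∞, both here and
below. We show that this expectation does not change in time:
$$\begin{aligned}
\frac{\partial}{\partial t}E[u(t, X(t))] &= \frac{\partial}{\partial t}\int u(t, x)\,p(t, x|s, y)\,dx \\
&= \int \frac{\partial u}{\partial t}p + u\frac{\partial p}{\partial t}\,dx \\
&= \int \frac{\partial u}{\partial t}p - u\frac{\partial}{\partial x}[\mu p] + u\frac{\partial^2}{\partial x^2}[Dp]\,dx
\quad\text{by Fokker–Planck,}
\end{aligned}$$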
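then integrate the last two terms by parts to
$$\frac{\partial}{\partial t}E[u(t, X(t))]
= \underbrace{\left[-u\mu p + u\frac{\partial}{\partial x}[Dp]\right]_{-\infty}^{\infty}}_{=0 \text{ since } p\to 0 \text{ as } x\to\pm\infty}
+ \int \frac{\partial u}{\partial t}p + \mu p\frac{\partial u}{\partial x} - \frac{\partial u}{\partial x}\frac{\partial}{\partial x}[Dp]\,dx$$
and integrate the last term by parts again
$$\begin{aligned}
&= \underbrace{\left[-\frac{\partial u}{\partial x}Dp\right]_{-\infty}^{\infty}}_{=0 \text{ since } p\to 0 \text{ as } x\to\pm\infty}
+ \int \frac{\partial u}{\partial t}p + \mu p\frac{\partial u}{\partial x} + \frac{\partial^2 u}{\partial x^2}Dp\,dx \\
&= \int \underbrace{\left(\frac{\partial u}{\partial t} + \mu\frac{\partial u}{\partial x} + D\frac{\partial^2 u}{\partial x^2}\right)}_{=0 \text{ by the given PDE (3.7)}} p\,dx \\
&= 0.
\end{aligned}$$
That is, the expectation E[u(t, X(t))] is constant in time t; it must be always the same as
its initial value of E[u(s, X(s))] = E[u(s, y)] = u(s, y).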
Example 3.10 (solve the Black–Scholes equation stochastically). Recall that the Black–Scholes
equation (2.6) for the value of a call option is
$$\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \tfrac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2} = rC,$$
where the bond rate is r and the asset stock volatility is β. This Black–Scholes equation
is not in the form of the PDE (3.7) because of the source terms on the right-hand side.
We change variables to x and u(t, x) such that $C(t, S) = u(t, x)e^{rt}$ and x = S;[33] then
$u(t, x) = C(t, S)e^{-rt}$ has the meaning of the value of the call option discounted by the
bond rate to the current time t = 0. With this change, since $\frac{\partial C}{\partial t} = \frac{\partial u}{\partial t}e^{rt} + rC$, it is
straightforward to see that the discounted value u(t, x) satisfies the PDE
$$\frac{\partial u}{\partial t} + rx\frac{\partial u}{\partial x} + \tfrac12\beta^2 x^2\frac{\partial^2 u}{\partial x^2} = 0,$$
which is in the requisite form (3.7).
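For completeness, the substitution can be written out in full. This is a sketch of the intermediate algebra, using only the change of variables $C = u\,e^{rt}$ and x = S stated in the example:

```latex
\begin{align*}
\frac{\partial C}{\partial t} &= \frac{\partial u}{\partial t}e^{rt} + rC, \qquad
\frac{\partial C}{\partial S} = \frac{\partial u}{\partial x}e^{rt}, \qquad
\frac{\partial^2 C}{\partial S^2} = \frac{\partial^2 u}{\partial x^2}e^{rt}, \\
rC &= \frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S}
      + \tfrac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2}
    = \left(\frac{\partial u}{\partial t} + rx\frac{\partial u}{\partial x}
      + \tfrac12\beta^2 x^2\frac{\partial^2 u}{\partial x^2}\right)e^{rt} + rC .
\end{align*}
```

Cancelling rC from both sides and dividing by $e^{rt}$ leaves the source-free PDE for u.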
As in the example shown in Figure 3.11, suppose at time zero we release stochastic
particles, with path x = Y(t), to evolve according to the corresponding SDE[34]
$$dY = rY\,dt + \beta Y\,dW \quad\text{with } Y(0) = S_0, \tag{3.8}$$
where $S_0$ is the current asset price. Then according to Theorem 3.8 the current discounted
value of the option is $u(0, S_0) = E[u(t, Y(t))]$ for all time t. In particular, $u(0, S_0) = E[u(t, Y(t))]$
will hold up to the expiry time, say t = T, of the option. Upon expiry we
know the value of the option, namely C(T, S) = max{S − X, 0}, where S is the asset price,
denoted by Y for the solutions of the SDE. Hence the discounted value of each of the
stochastic particles is
$$u(T, Y(T)) = e^{-rT}\max\{Y(T) - X, 0\}.$$
Thus using $u(0, S_0) = E[u(t, Y(t))]$, the current value of the option is
$$C(0, S_0) = u(0, S_0) = e^{-rT}E[\max\{Y(T) - X, 0\}].$$
Algorithm 3.1 lists this stochastic solution of the Black–Scholes equation. Figure 3.11
plots 10 realizations of Y(t); there is nothing particularly unusual to observe.
Averaging over m = 1000 realizations, the estimated value of the call option varies over
a range of approximately 3.3–3.5. The exact value for the option of $3.40 is comfortably
within this range.
33 We unnecessarily change names from asset value S to abstract x only to match closely with the
PDE (3.7).
34 We use Y here only because X denotes the strike price for an option.
Algorithm 3.1 Example MATLAB/SCILAB code using (3.8) to stochastically estimate the
value of a call option, from Example 1.9, on an asset with initial price $S_0 = 35$, strike
price X = 38.50 after one year in which the asset fluctuates by a factor of 1.25 and with a
bond rate of 12%.
r=log(1.12)           % bond rate
b=log(1.25)           % volatility beta of the asset
x=38.50               % strike price
s0=35                 % initial asset price
m=1000;               % number of realizations
n=1000;               % number of time steps
t=linspace(0,1,n+1);
h=diff(t(1:2));       % time step
y=s0*ones(1,m);       % all realizations start at s0
for j=1:n
  y=y+r*y*h+b*y.*randn(1,m)*sqrt(h);   % Euler step of (3.8)
end
estimated_value=exp(-r)*mean(max(y-x,0))   % discounted mean payoff at expiry T=1
To estimate this value to the nearest few cents we need to increase the accuracy
of the stochastic estimate by a factor of 10. Such accuracy could only be obtained by
using 100 times as many realizations, that is, m ≈ 100,000 realizations! This is rather too
many realizations for a practical method when the alternative of solving the Black–Scholes
equation (2.6) is so quick.[35]
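The trade-off between sample size and accuracy can be explored with the following Python sketch. It differs from Algorithm 3.1 in two labelled ways: it samples Y(T) directly from the lognormal solution of (3.8), $Y(T) = S_0\exp[(r - \frac12\beta^2)T + \beta\sqrt{T}Z]$ with Z standard normal, instead of time stepping, and it compares against the standard closed-form Black–Scholes value of a European call, which is quoted here from general knowledge rather than derived in this section. The function names, seed, and sample sizes are assumptions of the sketch.

```python
import math
import random

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_scholes_call(s0, strike, r, beta, t):
    """Standard closed-form Black-Scholes value of a European call option."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * beta**2) * t) / (beta * math.sqrt(t))
    d2 = d1 - beta * math.sqrt(t)
    return s0 * normal_cdf(d1) - strike * math.exp(-r * t) * normal_cdf(d2)

def monte_carlo_call(s0, strike, r, beta, t, m, rng):
    """Estimate exp(-r t) E[max(Y(t) - strike, 0)] by sampling Y(t) exactly."""
    total = 0.0
    for _ in range(m):
        z = rng.gauss(0.0, 1.0)
        y = s0 * math.exp((r - 0.5 * beta**2) * t + beta * math.sqrt(t) * z)
        total += max(y - strike, 0.0)
    return math.exp(-r * t) * total / m

r, beta = math.log(1.12), math.log(1.25)     # parameters of Algorithm 3.1
exact = black_scholes_call(35.0, 38.50, r, beta, 1.0)
rng = random.Random(4)                       # fixed seed (an arbitrary choice)
for m in (1000, 100000):
    print(m, "realizations:", monte_carlo_call(35.0, 38.50, r, beta, 1.0, m, rng),
          " closed form:", exact)
```

The scatter of the estimate about the closed-form value should shrink by roughly a factor of ten when the number of realizations grows a hundredfold.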
The previous example reinforces the connection between PDEs and SDEs. Recall
that at the start of this chapter we claimed that random walkers could solve Laplace’s equation
by averaging over the values observed by the walkers when they first contacted the
boundary. We now use the above theory to show that a similar technique will work quite
generally. For example, Figure 3.12 shows ten realizations X(t) (the analogue of the random
walkers) initiated by being released from x = 2 and then stochastically walking until
they hit the boundaries at x = 0 or x = 3; the value of some field u at x = 2 is then the
appropriately weighted average of these two boundary values. Here we restrict our attention
to solutions of boundary value problems of ODEs.
35 This stochastic solution also opens up the possibility of using noise processes different from the Wiener
process to estimate the value of an option. However, our developed theory does not yet justify such use!
Figure 3.11. Ten example realizations Y(t) of the SDE to stochastically solve the
Black–Scholes equation: the value of the option at the expiry time t = 1 is shown at the
right. Average such values to estimate the value of the option at the initial time t = 0.
[Plot: Y(t) against time t for 0 ≤ t ≤ 1; the option values marked at expiry are
0, 5.1, 22, 2.3, 0, 0, 0, 7.3, 0, and 19.]
Theorem 3.11. Let u(x) in some domain a < x < b satisfy the ODE
$$\mu(x)\frac{\partial u}{\partial x} + D(x)\frac{\partial^2 u}{\partial x^2} = 0, \tag{3.9}$$
subject to the boundary conditions u(a) = f(a) and u(b) = f(b), where as usual
$D(x) = \frac12\sigma(x)^2$. Let X(t) be the ensemble of solutions to the SDE $dX = \mu(X)\,dt + \sigma(X)\,dW$
with initial condition X(0) = y, and let τ be the first exit time from a < X < b of each
realization. Then u at the release point is the average value of the realizations that have
“stuck” to one or other of the boundaries:[36]
$$u(y) = E[f(X(\tau))]. \tag{3.10}$$
36 One may also obtain the solution of the forced ODE $\mu u_x + D u_{xx} = g(x)$: the formula is
$u(y) = E[f(X(\tau))] - E\left[\int_0^\tau g(X(t))\,dt\right]$.
Figure 3.12. Ten realizations X(t) initiated from x = 2, then stochastically evolving
until they hit the boundaries at x = 0 or x = 3; the value of some field u at position
x = 2 is then a weighted average of its two boundary values.
[Plot: x against time t for 0 ≤ t ≤ 2 and 0 ≤ x ≤ 3.]
Proof. A little loosely, require that the drift and volatility of the SDE are as required
internally in the domain a < x < b, but reduce to zero outside and in particular at the
end points x = a and b. Then the correspondence between the ODE and the SDE is
maintained throughout the domain. The only difference is that the solutions of the SDE
“stick” to the boundary as soon as they reach it, as there the drift and volatility are both
zero. Now let T denote a time so large that almost all solutions of the SDE have reached a
boundary. Theorem 3.8, restricted to time-independent differential equations, then shows
that the solution at the release point is
u(y) = E[u(X(T))]
= E[u(X(τ))] as X(T) = X(τ) for each realization
= E[f(X(τ))] ,
as u = f on each of the boundaries.
Example 3.12. Consider the ODE $-x\frac{du}{dx} + \frac12(1 + x^2)\frac{d^2u}{dx^2} = 0$ with boundary conditions
u(0) = 1 and u(3) = 5. Solve it analytically, and also numerically estimate u(1) and u(2)
via its corresponding SDE.
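Solution: Analytically, when written in terms of v = du/dx, the ODE becomes
separable:
$$\begin{aligned}
-xv + \tfrac12(1 + x^2)\frac{dv}{dx} = 0
&\;\Rightarrow\; \tfrac12(1 + x^2)\frac{dv}{dx} = xv \\
&\;\Rightarrow\; \int\frac{dv}{v} = \int\frac{2x}{1 + x^2}\,dx \\
&\;\Rightarrow\; \log v = \log(1 + x^2) + \text{constant} \\
&\;\Rightarrow\; v = A(1 + x^2) \\
&\;\Rightarrow\; u = \int v\,dx = A\left(x + \tfrac13 x^3\right) + B.
\end{aligned}$$
With boundary conditions u(0) = 1 and u(3) = 5, determine $A = \frac13$ and B = 1 so the
analytic solution is
$$u = \tfrac13\left(x + \tfrac13 x^3\right) + 1.$$
Thus we now know that u(1) = 13/9 = 1.4444 and u(2) = 23/9 = 2.5556.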
Algorithm 3.2 MATLAB/SCILAB code to solve a boundary value problem ODE by its
corresponding SDE. Use find to evolve only those realizations within the domain 0 < X < 3.
Continue until all realizations reach one boundary or the other. Estimate the expectation
of the boundary values using the conditional vectors x<=0 and x>=3 to account for the
number of realizations reaching each boundary.
m=100; % realizations
h=0.001; % time step
x=ones(1,m); % initial release
i=1:m; % indices of realizations still inside the domain
while m>0
  x(i)=x(i)-x(i)*h+sqrt(1+x(i).^2).*randn(1,m)*sqrt(h); % Euler step
  i=find((x>0)&(x<3)); % track those as yet unstuck
  m=length(i);
end
estimated_u=mean(1*(x<=0)+5*(x>=3)) % average of boundary values
Stochastically, recognize that the ODE comes from an Ito process with drift μ = −x
and volatility $\sigma = \sqrt{1 + x^2}$. To estimate u(1), adapt Algorithm 3.1 to that of Algorithm 3.2:
this version numerically solves the SDE $dX = -X\,dt + \sqrt{1 + X^2}\,dW$ with initial condition
X(0) = 1. Figure 3.12 plots ten realizations. With just m = 100 realizations we get significant
fluctuations in the estimates of u(1) about the true value, for example, 1.72, 1.28,
1.36, 1.32, and 1.40. Similar fluctuations are obtained for estimates of u(2) such as 2.44,
2.48, 2.36, 2.56, and 2.20. Using many more realizations gives more accuracy. But at least
we roughly approximate the answer quite quickly with relatively few realizations.
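For readers without MATLAB or Scilab, the same exit-time estimate can be sketched in plain Python. This is an illustrative transcription of the idea, not the author's code: it walks one realization at a time, and the function names, the coarser time step h = 0.002, the seed, and the choice of m = 2000 realizations are all assumptions.

```python
import math
import random

def exit_value(x0, h, rng, a=0.0, b=3.0, f_a=1.0, f_b=5.0):
    """Walk dX = -X dt + sqrt(1 + X^2) dW from x0 until it leaves a < X < b;
    return the boundary value on the side where it left."""
    x = x0
    sqrt_h = math.sqrt(h)
    while a < x < b:
        x += -x * h + math.sqrt(1.0 + x * x) * sqrt_h * rng.gauss(0.0, 1.0)
    return f_a if x <= a else f_b

def exact_u(x):
    """Analytic solution u = (x + x^3/3)/3 + 1 of Example 3.12."""
    return (x + x**3 / 3.0) / 3.0 + 1.0

rng = random.Random(5)                       # fixed seed (an arbitrary choice)
m = 2000
estimate = sum(exit_value(1.0, 0.002, rng) for _ in range(m)) / m
print("estimated u(1):", estimate, " exact u(1):", exact_u(1.0))
```

Besides the sampling error, the finite time step introduces a small bias because a realization can overshoot a boundary within one step, so agreement to a couple of digits is all that should be expected.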
The binomial distribution estimates errors
In this example, as is typical, each of the m realizations reached either one boundary or
the other, x = 0 or x = 3. Let $Y_i = 1$ if the ith realization reaches the x = 3 boundary,
and conversely $Y_i = 0$ if the ith realization reaches the x = 0 boundary. Here, as in any
problem, there is a probability p that the ith realization reaches the right-hand boundary
for which $Y_i = 1$; here the estimate $\hat p \approx 0.4$. Now consider the value of the solution at
the release point x = 2: u(2) = E[f(X(τ))]. The boundary values f(X(τ)) are either 1 or 5
depending upon whether the realization reached the left or the right boundary, respectively.
That is, $f(X(\tau)) = 4Y_i + 1$. Consequently, $u(2) = E[f(X(\tau))] = E[4Y_i + 1]$, and hence
$$\hat u(2) = \frac1m\sum_{i=1}^m (4Y_i + 1) = \frac4m\sum_{i=1}^m Y_i + 1 = \frac4m Y + 1,$$
where $Y = \sum_{i=1}^m Y_i$ is distributed bin(m, p); here Y ∼ bin(m, 0.4).
• Now we know E[Y] = mp from the properties of the binomial distribution. Hence,
$E[\hat u(2)] = E[\frac4m Y + 1] = \frac4m mp + 1 = 4p + 1$ which, with $\hat p = 0.4$, estimates $\hat u(2) \approx 2.6$
as before. The expectation of the binomial agrees with the earlier mean.
• Additionally we also know the variance of a binomial distribution: Var[Y] = mp(1 − p).
Thus we know $\operatorname{Var}[\hat u(2)] = \operatorname{Var}[\frac4m Y + 1] = \frac{16}{m^2}\operatorname{Var}[Y] = \frac{16}{m^2}mp(1 - p) = 16p(1 - p)/m$
which, with $\hat p = 0.4$ and m = 100, estimates $\operatorname{Var}[\hat u(2)] \approx 0.04$. This provides
the error in our estimate of u(2).
Thus with m = 100 realizations we estimate u(2) = 2.6 ± 0.2, roughly a 10% error. Because the
variance decreases as 1/m, the error decreases as $1/\sqrt{m}$, and so we need m ≈ 10,000
realizations to estimate u(2) to a 1% error.
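The error formula is simple enough to tabulate. The short Python sketch below evaluates the estimate 4p + 1 and its standard error $\sqrt{16p(1-p)/m}$ for the two sample sizes mentioned; the function name and its argument names are hypothetical, introduced only for this illustration.

```python
import math

def boundary_estimate_error(p_hat, m, f_left=1.0, f_right=5.0):
    """Return (estimate, standard error) of u when a fraction p_hat of the
    m realizations exit at the right-hand boundary."""
    jump = f_right - f_left                              # equals 4 in this example
    estimate = f_left + jump * p_hat                     # 4 p + 1
    variance = jump**2 * p_hat * (1.0 - p_hat) / m       # 16 p (1 - p) / m
    return estimate, math.sqrt(variance)

for m in (100, 10000):
    estimate, std_error = boundary_estimate_error(0.4, m)
    print(m, "realizations: u(2) estimate", estimate, "+/-", std_error)
```

Increasing m a hundredfold reduces the standard error tenfold, which is the square-root scaling described above.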
As these last two examples show, we need enormously many realizations to solve a
PDE at all accurately via its SDE. It is not a practical method for PDEs in small numbers of
dimensions. However, suppose we wish to solve a PDE in, say, 100 dimensions, as easily
arises in finance when each day introduces another dimension. Then a finite difference
solution would require a fantastic $10^{100}$ grid points to just resolve the domain with 10 grid
points in each of the 100 dimensions, let alone try to solve the equation. In contrast, the
stochastic solution is practical because we just solve 100 coupled SDEs for, say, a few
thousand realizations to obtain a workable approximation. The SDE solution of PDEs is
practical in problems of high dimension.
Suggested activity: Do at least Exercise 3.12.
Summary
For a given differential equation for unknown u, the appropriate SDE has realizations
whose expectation of u does not change. Hence we are empowered to stochastically esti-
mate the unknown u at some initial point. This provides a method to stochastically value a
financial option.
3.3 The Kolmogorov backward equation completes the
picture
To date we either sought the steady state probability distribution, the PDF p(x), or we
assumed that the initial state of the Ito process was specified. By investigating the evolution
from a general initial state, we discover a fully informative function that is analogous to
powers of the transition matrix of Markov chains.
Example 3.13 (Wiener process). The Wiener process in Example 3.6 starts from W(0) = 0,
that is, all the probability is lumped at w = 0 at time t = 0. More generally, we might
know that at some time s the process has some value y: W(s) = y, or equivalently, all the
probability is lumped at x = y at time t = s. Because of the homogeneous nature of the
Wiener process, we write down the more general PDF just by shifting the origin in time t
and value w. Thus the PDF for the Wiener process, given that we know it was at y at
time s, is
$$p(t, x|s, y) = \frac{1}{\sqrt{2\pi(t - s)}}\exp\left[-\frac{(x - y)^2}{2(t - s)}\right].$$
The use of the vertical bar in p(t, x|s, y) is a convention to remind us that this function p is
the PDF conditional on being at y at time s.
Example 3.14 (Ornstein–Uhlenbeck process). Remarkably, the conditional PDF for an
Ornstein–Uhlenbeck process is also always a Gaussian: the Gaussian’s parameters just
have exponentially decaying transients. Recall that Figure 3.5 plots 10 realizations showing
how the realizations evolve; Figure 3.6 shows how the corresponding PDF evolves toward
a Gaussian steady state. For simplicity, I record the conditional PDF for the specific SDE
$dX = -X\,dt + \sqrt{2}\,dW$: it is
$$p(t, x|s, y) = \frac{1}{\sqrt{2\pi\left[1 - e^{-2(t-s)}\right]}}
\exp\left[-\frac{\left(x - ye^{-(t-s)}\right)^2}{2\left[1 - e^{-2(t-s)}\right]}\right].$$
This probability distribution is that for a normal random variable with mean $ye^{-(t-s)}$
decaying from y at time t = s to 0 exponentially quickly, and with variance $1 - e^{-2(t-s)}$
growing from 0 at time t = s to saturate exponentially quickly at 1 for large times. The
appearance of the steady state PDF matches the steady state found in Example 3.5.
Recall that Exercise 1.6, using many realizations of numerical simulations, also
shows the same exponential approach to the steady state distribution from a variety of
initial conditions.
This general conditional PDF has much more information about the underlying stochastic
process: it is the analogue of powers of the transition matrix of a Markov chain. This
is in contrast to p(t, x) which, for example, cannot distinguish between a Wiener process
and $Z\sqrt{t}$! Here p(t, x|s, y) “knows” about the process when started with any value y at any
time s. Thus, for example, at a small time step h into the future from time s, p(s+h, x|s, y)
will be some sharply peaked probability distribution: the peak will be at some $\bar x$ near y,
and so $\mu = (\bar x - y)/h$ is an estimate of the drift at the locale (s, y); whereas the spread
of the peak, its standard deviation $\sigma_x$, will determine the local volatility $\sigma = \sigma_x/\sqrt{h}$. By
uniqueness of drift and volatility of an Ito process, the Doob–Meyer decomposition, we
can in principle determine the SDE for any given p(t, x|s, y). The conditional PDF is fully
informative of the underlying stochastic process.
Figure 3.13. To derive the Kolmogorov backward equation (3.11), take a small
time step from (s−h, y) to time s, and see how it affects the subsequent evolution to (t, x).
[Diagram: a time axis marked s−h, s, and t; from the point y at time s−h the step Z = +1 leads
up to $\xi_u$ and the step Z = −1 leads down to $\xi_d$ at time s, and paths from both continue
to the point x at time t.]
Analogue
The conditional probability distribution p(t, x|s, y) is analogous to powers $P^n$ of the transition
matrix P in discrete stochastic systems. Recall that $P^n$ takes the probability distribution,
p(s), at some time s and maps it to the later probability distribution $p(t) = p(s)P^n$
at time t = s+n. With t − s ∼ n, x ∼ i, y ∼ j, p(t, x|s, y) is exactly analogous to the i, jth
element of $P^n$.
Theorem 3.15 (Kolmogorov backward equation). The general PDF p(t, x|s, y) for the
Ito process $dX = \mu(X)\,dt + \sigma(X)\,dW$ satisfies the Kolmogorov backward equation,
$$\frac{\partial p}{\partial s} + \mu(y)\frac{\partial p}{\partial y} + \tfrac12\sigma(y)^2\frac{\partial^2 p}{\partial y^2} = 0, \tag{3.11}$$
as well as the Fokker–Planck equation (3.2).[37]
It is straightforward, albeit tedious, to verify that the example PDFs given earlier
satisfy (3.11). See that the Kolmogorov backward equation is like an advection-diffusion
equation but with negative diffusivity: this implies it must be solved backward from t to the
starting time s. This reminds us of the Black–Scholes equation (2.6) for pricing options,
and it helped establish in Example 3.10 the striking connection with the Black–Scholes
equation.
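The tedious verification can be delegated to a computer. The Python sketch below takes the conditional PDF of Example 3.14, for which μ(y) = −y and ½σ² = 1, and estimates the left-hand side of (3.11) by central finite differences at one sample point. The function names, the difference step d, and the sample point are illustrative assumptions.

```python
import math

def ou_pdf(t, x, s, y):
    """Conditional PDF of dX = -X dt + sqrt(2) dW given X(s) = y (Example 3.14)."""
    mean = y * math.exp(-(t - s))
    var = 1.0 - math.exp(-2.0 * (t - s))
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def backward_residual(t, x, s, y, d=1e-3):
    """Central-difference estimate of p_s + mu(y) p_y + (1/2) sigma^2 p_yy
    with mu(y) = -y and sigma^2 = 2; it should be close to zero."""
    p = ou_pdf(t, x, s, y)
    p_s = (ou_pdf(t, x, s + d, y) - ou_pdf(t, x, s - d, y)) / (2.0 * d)
    p_y = (ou_pdf(t, x, s, y + d) - ou_pdf(t, x, s, y - d)) / (2.0 * d)
    p_yy = (ou_pdf(t, x, s, y + d) - 2.0 * p + ou_pdf(t, x, s, y - d)) / d**2
    return p_s - y * p_y + p_yy

print("PDF value          :", ou_pdf(1.0, 0.5, 0.0, 1.5))
print("backward residual  :", backward_residual(1.0, 0.5, 0.0, 1.5))
```

The residual should be tiny compared with the PDF value itself, limited only by the truncation error of the finite differences.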
37 The Kolmogorov backward equation and the Fokker–Planck equation are closely related. Each is the
adjoint of the other. Those who have met the definition, properties, and use of adjoints will appreciate that such
a pair of adjoint equations and their solutions provides two complementary and beautiful alternative views
of problems. The Kolmogorov backward equation is sometimes also known as the Chapman–Kolmogorov
equation.
Proof. Throughout this derivation, dashes denote ∂/∂y and the drift μ and the volatility σ
are evaluated at y unless otherwise specified. Figure 3.13 indicates how we investigate the
probability that the system arrives at (t, x) given that it starts from (s − h, y) by taking the
first small time step to (s, ξ) and combining this with p(t, x|s, ξ) to give the probability
of ultimately reaching (t, x). As for the proof of the Fokker–Planck equation (3.2), for
simplicity assume that the noise increment over the time step h is $\sigma(y)\Delta W = \sigma(y)\sqrt{h}\,Z$,
where Z = ±1 each with probability $\frac12$; thus the x value after the first small time step from
(s − h, y) is (Figure 3.13)
$$\xi = y + \mu(y)h \pm \sigma(y)\sqrt{h}.$$
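Therefore the routes to (t, x) are either via $\xi_u$ or $\xi_d$:
$$\begin{aligned}
p(t,x|s-h,y) &= \tfrac12 p(t,x|s,\xi_u) + \tfrac12 p(t,x|s,\xi_d) \\
&= \tfrac12 p(t,x|s,y+\mu h+\sigma\sqrt{h}) + \tfrac12 p(t,x|s,y+\mu h-\sigma\sqrt{h}) \\
&= \tfrac12\left[p + (\mu h+\sigma\sqrt{h})p' + \tfrac12(\mu h+\sigma\sqrt{h})^2 p''\right] \\
&\quad{} + \tfrac12\left[p + (\mu h-\sigma\sqrt{h})p' + \tfrac12(\mu h-\sigma\sqrt{h})^2 p''\right] + \cdots \\
&= p + \mu h p' + \tfrac12\sigma^2 h p'' + \cdots.
\end{aligned}$$
Here p and its y-derivatives on the right-hand side are evaluated at (t, x|s, y).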
Subtracting p(t, x|s−h, y) from both sides and dividing by h leads to
$$0 = \frac{p(t,x|s,y) - p(t,x|s-h,y)}{h} + \mu p' + \tfrac12\sigma^2 p'' + \cdots.$$
Take the limit as h → 0 to deduce
$$0 = \frac{\partial p}{\partial s} + \mu\frac{\partial p}{\partial y} + \tfrac12\sigma^2\frac{\partial^2 p}{\partial y^2}.$$
This is the Kolmogorov backward equation (3.11).
Appendix B.2 gives an alternate derivation.
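As a quick numerical sanity check of (3.11), here is a minimal sketch in plain Python. It assumes the Wiener process dX = dW (so μ = 0 and σ = 1) with its Gaussian transition PDF, and estimates the derivatives by central differences; the evaluation point and the step size are arbitrary illustrative choices.

```python
import math

def p(t, x, s, y):
    """Transition PDF of a Wiener process: X(t) is normal, mean y, variance t - s."""
    tau = t - s
    return math.exp(-(x - y) ** 2 / (2.0 * tau)) / math.sqrt(2.0 * math.pi * tau)

def backward_residual(t, x, s, y, h=1e-3):
    """Central-difference estimate of dp/ds + (1/2) d2p/dy2 (mu = 0, sigma = 1)."""
    dp_ds = (p(t, x, s + h, y) - p(t, x, s - h, y)) / (2.0 * h)
    d2p_dy2 = (p(t, x, s, y + h) - 2.0 * p(t, x, s, y) + p(t, x, s, y - h)) / h ** 2
    return dp_ds + 0.5 * d2p_dy2

# Arbitrary illustrative point: start (s, y) = (0, 0.2), end (t, x) = (1, 0.5).
residual = backward_residual(t=1.0, x=0.5, s=0.0, y=0.2)
print(residual)
```

The residual should be zero up to the discretization error of the finite differences, which shrinks as the step h is reduced.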
Summary
The conditional PDF p(t, x|s, y) can completely characterize a stochastic process: as a
function of the initial condition, X(s) = y, the conditional PDF p(t, x|s, y) satisfies the
Kolmogorov backward equation (3.11).
3.4 Summary
• The PDF p(t, x|s, y) gives the probability that a stochastic process has value x at
time t given that it had value y at an earlier time s.
• The PDF of the SDE dX = μ(X)dt + σ(X)dW satisfies both the Fokker–Planck equation (3.2),
$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}\left[\mu p\right] + \frac{\partial^2}{\partial x^2}\left[\tfrac12\sigma^2 p\right],$$
and the Kolmogorov backward equation (3.11),
$$\frac{\partial p}{\partial s} + \mu(y)\frac{\partial p}{\partial y} + \tfrac12\sigma(y)^2\frac{\partial^2 p}{\partial y^2} = 0.$$
• The PDFs of SDEs may be determined from the corresponding Fokker–Planck equation.
• The Kolmogorov backward equation enables some PDEs to be solved stochastically.
Exercises
3.1. Show that $\operatorname{Var}[X(t)] = X_0^2\left[e^{(2\alpha+\beta^2)t} - e^{2\alpha t}\right]$ for $X(t) = X_0\exp\left[(\alpha - \tfrac12\beta^2)t + \beta W(t)\right]$. Give an example showing that even when the expectation decays to zero, the variance may nonetheless grow in time.
3.2. Consider in turn each of the stochastic processes illustrated in Figures 3.14–3.17.
Each figure shows ten realizations of some Ito process, a different Ito process for
each figure. Assuming each Ito process is settling upon some steady state PDF p(x);
sketch the PDF.
3.3. The realizations plotted in Exercise 3.2 all come from Ito processes you have met
or will soon meet. Identify as best you can which Ito process shown in Exercise 3.4
corresponds to each of the four figures.
3.4. Use the Fokker–Planck equation to investigate the steady state PDFs for the following stochastic processes:
1. $dX = (2X - X^2)\,dt + X\,dW$ for X(t) ≥ 0;
2. $dX = (3X - 2X^2)\,dt + 2X\,dW$ for X(t) ≥ 0;
3. $dX = \beta\sqrt{1+X^2}\,dW$;
4. $dX = -X\,dt + \sqrt{1+X^2}\,dW$;
5. $dX = X\,dt + (1+X^2)\,dW$;
6. $dX = -X\,dt + \sqrt{2}\,(1+X^2)^{1/4}\,dW$.
3.5. Use the Fokker–Planck equation to find the structure of the steady state PDF of solutions X(t) to the SDE $dX = -2X\,dt + \sqrt{1+X^2}\,dW$.
3.6. Consider a general SDE dX = μ(X)dt + σ(X)dW. Assuming a steady state PDF exists for the ensemble of realizations, show that the steady state PDF may be written as
$$p(x) = \frac{A}{D(x)}\exp\left[\int\frac{\mu(x)}{D(x)}\,dx\right],$$
where $D(x) = \tfrac12\sigma(x)^2$ and A is a normalization constant.
Figure 3.14. Ten realizations of an Ito process (x plotted against time t for 0 ≤ t ≤ 10).
Figure 3.15. Ten realizations of an Ito process (x plotted against time t for 0 ≤ t ≤ 10).
Figure 3.16. Ten realizations of an Ito process (x plotted against time t for 0 ≤ t ≤ 10).
Figure 3.17. Ten realizations of an Ito process (x plotted against time t for 0 ≤ t ≤ 10).
3.7. Derive from first principles the Fokker–Planck equation
$$\frac{\partial p}{\partial t} = \frac{\partial^2}{\partial x^2}\left(\tfrac12 x^2 p\right)$$
for the PDF p(t, x) of solutions X(t) of the SDE dX = X dW (this SDE is the case of the financial SDE dX = αX dt + βX dW for no stock drift, α = 0, and unit stock volatility, β = 1).
• First, assume small time steps of size Δt = h and approximate the Wiener process by the up/down binomial steps $\Delta W = \pm\sqrt{h}$. Show that for the stochastic system to reach (t+h, x) it must have come from (t, ξ) for $\xi = x/(1 \pm \sqrt{h})$.
• Second, use Taylor series about p(t, x) to deduce
$$p(t+h,x) = p + hp + 2xh\frac{\partial p}{\partial x} + \tfrac12 x^2 h\frac{\partial^2 p}{\partial x^2} + \cdots,$$
where the right-hand side is evaluated at (t, x). You may use that $(1 \pm \sqrt{h})^{-1} = 1 \mp \sqrt{h} + h + \cdots$.
• Lastly, rearrange and take the limit as the time step h → 0 to derive the corresponding Fokker–Planck equation.
3.8. Generalize the proof of the Fokker–Planck equation (3.2) to include variable volatility σ(x). This generalization is quite difficult because of the need to keep track of all the details, and also you will need to show
$$\xi = x \mp \sigma\sqrt{h} + (\sigma\sigma' - \mu)h + \cdots.$$
3.9. Reanalyze Example 3.7, but suppose instead that the death rate $\mu_n = \beta n^2$ due to increased competition for limited resources by larger populations of individuals. Use its Fokker–Planck equation to then argue that the corresponding SDE is approximately
$$dN = (\alpha N - \beta N^2)\,dt + \sqrt{\alpha N + \beta N^2}\,dW.$$
3.10. Consider the PDE $\frac{\partial u}{\partial t} + \frac{\partial u}{\partial x} + \tfrac12\frac{\partial^2 u}{\partial x^2} = 0$. One solution is $u(t,x) = x^2 - 2xt + t^2 - t$. Write down the corresponding SDE, and its solution X(t), and then verify the Feynman–Kac formula for solutions with initial condition X(s) = y.
3.11. Corresponding to the Black–Scholes equation is the SDE (3.8). Use many realizations of its solution to numerically estimate the options of Exercises 1.11, 1.12, and 1.19. Roughly what errors do you expect in your estimates?
3.12. Solve analytically $2\frac{du}{dx} + x\frac{d^2u}{dx^2} = 0$ on 1 < x < 5 such that u(1) = 4 and u(5) = 0. Use the corresponding SDE to estimate u(2), u(3), and u(4) and their errors. Compare. Say we use m = 100 realizations with a time step of h = 0.001: discuss how the errors improve with increasing m and/or decreasing h.
3.13. Verify, perhaps using computer algebra, that the PDFs for the Wiener and Ornstein–
Uhlenbeck processes satisfy the Kolmogorov backward equation (3.11).
Answers to selected exercises
3.4. 1. $p \propto x^2 e^{-2x}$ (you determine the constant of proportionality in this and other answers); see the hump in probability near x = 2 corresponding to the deterministic fixed point;
2. $p \propto e^{-x}/\sqrt{x}$, but here the deterministic fixed point at x = 3/2 has been washed out by the noise which instead has stabilized the origin and hence generated an integrable peak in probability at x = 0;
3. $p \propto 1/(1+x^2)$; see that for small x the system spreads by a random walk, dX ∝ dW, but this spreading is arrested by the stabilizing effect of multiplicative noise for large x, where dX ∝ X dW, generating this “Cauchy distribution” which has long tails (curiously β has no influence on the steady state distribution);
4. $p \propto 1/(1+x^2)^2$ shows that by stabilizing the origin the previous distribution has much smaller tails;
5. $p \propto \frac{1}{(1+x^2)^2}e^{-1/(1+x^2)}$, whereas here the exponential factor in the solution does not affect the distribution very much so the PDF is much like the previous one but generated by a deterministically unstable origin which is stabilized by noise that grows quadratically with x;
6. $p \propto \frac{1}{\sqrt{1+x^2}}e^{-\sqrt{1+x^2}}$; for small x this process is like an Ornstein–Uhlenbeck process, $dX \approx -X\,dt + \sqrt{2}\,dW$, but the PDF decays a little faster for large x because the noise increases like $\sqrt{X}$.
3.5. $p \propto 1/(1+x^2)^3$.
3.12. Analytic solution is u = 5/x − 1.
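Answers of this kind can be checked numerically. A steady state PDF of the Fokker–Planck equation (3.2) that decays in the tails has zero probability flux, μp − d/dx(½σ²p) = 0. The sketch below, in plain Python with a central difference for the derivative, evaluates this flux for three of the answers above; the sample points and step size are arbitrary choices for illustration.

```python
import math

def flux(mu, sigma, p, x, h=1e-4):
    """Probability flux mu*p - d/dx(sigma^2 p / 2), using a central difference."""
    def diffusion_term(z):
        return 0.5 * sigma(z) ** 2 * p(z)
    slope = (diffusion_term(x + h) - diffusion_term(x - h)) / (2.0 * h)
    return mu(x) * p(x) - slope

# (drift, volatility, proposed unnormalized steady state PDF) for each exercise
cases = {
    "3.4(1)": (lambda x: 2 * x - x ** 2, lambda x: x,
               lambda x: x ** 2 * math.exp(-2 * x)),
    "3.4(4)": (lambda x: -x, lambda x: math.sqrt(1 + x ** 2),
               lambda x: (1 + x ** 2) ** -2),
    "3.5": (lambda x: -2 * x, lambda x: math.sqrt(1 + x ** 2),
            lambda x: (1 + x ** 2) ** -3),
}

points = (0.5, 1.0, 2.0, 3.0)
max_flux = {
    name: max(abs(flux(mu, sigma, p, x)) for x in points)
    for name, (mu, sigma, p) in cases.items()
}
print(max_flux)
```

Each reported flux should vanish to within the finite-difference error; the normalization constant does not matter because the flux is linear in p.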
Chapter 4
Stochastic Integration Proves Ito’s Formula
Contents
4.1 The Ito integral $\int_a^b f\,dW$
4.2 The Ito formula
4.3 Summary
Exercises
Answers to selected exercises
So far we have played with stochastic differentials, such as
$$dX = \mu(t,X)\,dt + \sigma(t,X)\,dW, \tag{4.1}$$
and invoked rules for their manipulation that seem to make sense in the context of stochastic Ito processes. But the derivation of the symbolic rules has been largely intuitive rather than rigorous. In this chapter we put stochastic calculus on a firm footing. The best treatment of stochastic differential forms such as (4.1) asserts that they are just shorthand for integrals such as
$$[X(t)]_a^b = X(b) - X(a) = \int_a^b \mu(t,X(t))\,dt + \int_a^b \sigma(t,X(t))\,dW.$$
In turn, interpret these integrals as being approximated by the sums
$$\int_a^b \mu(t,X(t))\,dt \approx \sum_{j=0}^{n-1}\mu_j\,\Delta t_j \quad\text{and}\quad \int_a^b \sigma(t,X(t))\,dW \approx \sum_{j=0}^{n-1}\sigma_j\,\Delta W_j.$$
But what do these integrals really mean? Can we really approximate them like this? What
properties do they have? How are they related to ordinary integrals?
Ordinary integrals
Interpret integrals such as $\int_a^b \mu(t,X(t))\,dt$ as ordinary integrals in the sense of ordinary calculus (they are Riemann–Lebesgue integrals). All the properties that we are familiar with follow provided the integrand is a smooth function of any Ito process. Thus for these integrals there is nothing that we really need to develop other than perhaps efficient methods for their numerical evaluation. In this chapter we focus exclusively upon Ito integrals of the form $\int_a^b \sigma(t,X(t))\,dW$.

Figure 4.1. Five realizations of the Ito integral $I(t) = \int_0^t W(s)\,dW(s)$ for 0 < t < 1 as computed by Algorithm 4.1 (I(t) plotted against t).
Example 4.1. Use Ito’s formula to deduce $I = \int_a^b W\,dW$. See in the numerical evaluations of Figure 4.1 that the integral is not $[\tfrac12 W^2]_a^b$ because the realizations of $I(t) = \int_0^t W(s)\,dW(s)$ are often negative, whereas $\tfrac12 W^2$, being a square, is never negative.
Solution: The indefinite stochastic integral $I = \int W\,dW$ is synonymous with the SDE dI = W dW. We guess that despite the above comments, the indefinite integral will somehow contain $\tfrac12 W^2$, so use the simple form (2.3) of Ito’s formula to see
$$d\left(\tfrac12 W^2\right) = 0\,dt + W\,dW + \tfrac12\cdot 1\,dW^2 = W\,dW + \tfrac12\,dt.$$
The W dW term on the right-hand side is the desired differential; the dt term is some adjustment. Move the $\tfrac12\,dt$ to the left-hand side to obtain
$$dI = W\,dW = d\left(\tfrac12 W^2\right) - \tfrac12\,dt = d\left(\tfrac12 W^2 - \tfrac12 t\right).$$
Thus upon “summing,” the indefinite integral $\int W\,dW = \tfrac12 W^2 - \tfrac12 t$. As seen in Figure 4.1, the integral is a fluctuating positive component superimposed upon −t/2.
Thus, further, the definite integral $\int_a^b W\,dW = \tfrac12\left[W^2 - t\right]_a^b$.
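Algorithm 4.1 MATLAB/SCILAB code (essentially an Euler method) to numerically evaluate $\int_0^t W(s)\,dW(s)$ for 0 < t < 1 to draw Figure 4.1.
m=5;                                        % number of realizations
n=1000;                                     % number of time steps
t=linspace(0,1,n+1)';                       % column of times from 0 to 1
dw=randn(n,m)*sqrt(diff(t(1:2)));           % Wiener increments, variance h
w=[zeros(1,m);cumsum(dw,1)];                % Wiener paths, W(0)=0
x=[zeros(1,m);cumsum(w(1:end-1,:).*dw,1)];  % sums of W_j*dW_j
plot(t,x)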
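For readers without MATLAB or Scilab, the following is a rough plain-Python analogue of Algorithm 4.1 for a single realization. It is a sketch rather than a translation: the function name, the seed, and the use of the standard `random` module are choices made here for illustration. It also compares the Euler sum with the formula ½(W² − t) of Example 4.1.

```python
import math
import random

def ito_integral_w_dw(n=1000, t_end=1.0, seed=0):
    """Euler sum of W_j * dW_j over n equal steps on [0, t_end].

    Returns the lists of times, Wiener path values, and running integral.
    """
    rng = random.Random(seed)
    h = t_end / n
    t, w, integral = [0.0], [0.0], [0.0]
    for j in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))
        # The integrand uses W at the start of the step (no anticipation).
        integral.append(integral[-1] + w[-1] * dw)
        w.append(w[-1] + dw)
        t.append((j + 1) * h)
    return t, w, integral

t, w, integral = ito_integral_w_dw()
exact = 0.5 * (w[-1] ** 2 - t[-1])   # formula from Example 4.1
print(integral[-1], exact)
```

The two printed numbers differ slightly for any finite n: algebraically the Euler sum equals ½W² minus half the sum of the squared increments, and that sum only approaches t as the time step shrinks.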
4.1 The Ito integral $\int_a^b f\,dW$
Here we outline the definition of an Ito integral $\int_a^b f\,dW$ and its properties. Øksendal (1998) [§3.1] and Kloeden and Platen (1992) [§3.1] give full details of the rigorous development of the integral. The development follows four steps:
1. Define integrals for piecewise constant stochastic integrands, called “step functions” φ(t, ω), such as those illustrated in Figure 4.2;
2. use such step functions to approximate arbitrarily well any reasonable stochastic integrand f(t, ω);
3. for a sequence of step functions $\phi_n \to f$ as n → ∞, define the Ito integral of f to be $\lim_{n\to\infty}\int_a^b \phi_n\,dW$;
4. Ito integral properties then follow from those for integrals of step functions.
We modify notation slightly: hereafter we use ω to denote the family of realizations of a Wiener process W(t, ω). This more explicit notation allows a more general dependence upon the Wiener process to appear in the integrands, such as a dependence upon the past history. Including the past history of the process is essential if the theory of integrals is to apply to SDEs such as dX = X dW which we interpret as $X = \int X\,dW$ where the integrand X(t, ω) depends upon all the previous values of the Wiener process W. In contrast, in $\int W\,dW$ the increment W dW to the integral depends only upon the current value of W.
Definition 4.2. The class of stochastic functions that we may integrate (over the interval [a, b]) is denoted by V. It is defined to be composed of functions f(t, ω) such that
• the expected square integral $E\left[\int_a^b f(t,\omega)^2\,dt\right] < \infty$;
• f(t, ω) depends only upon the past history of the Wiener process, W(s, ω), for s ≤ t; for example, $f = W(t,\omega)^2$, $f = e^{W(t,\omega)}$, or $f = \int_0^t W(s,\omega)\,ds$, but not f = W(t+1, ω).³⁸

³⁸ Such functions that only depend upon the previous history of the Wiener process are called variously $\mathcal{F}_t$-measurable, $\mathcal{F}_t$-adapted, or $\mathcal{F}_t$-previsible. Slight technical differences exist between these terms. However, we endeavour to work entirely in a manner so that the differences in the terms are immaterial; thus herein we just refer to a dependence only upon the past history.
Figure 4.2. Five realizations (different colors) of four different stochastic step functions φ(t, ω) ∈ S, each plotted against t for 0 ≤ t ≤ 1: different step functions φ(t, ω) have different partitions; different realizations of any one step function φ(t, ω) have the same partition but different values distributed according to some probabilities.
Definition 4.3. Let the class of step functions S ⊂ V be the class of piecewise constant functions. That is, for each step function φ(t, ω) ∈ S, there exists a finite partition $a = t_0 < t_1 < t_2 < \cdots < t_n = b$ such that $\phi(t,\omega) = \phi(t_j,\omega)$ for $t_j \le t < t_{j+1}$. Figure 4.2 plots four different members of S.
Given any such partition, we often use $\phi_j$ or $\phi_j(\omega)$ to denote $\phi(t_j,\omega)$, just as we used $W_j$ to denote $W(t_j,\omega)$.
First task: The Ito integral for step functions
Our first task, as in the outline given at the start of this section, is to define the Ito integral for step functions, such as those drawn in Figure 4.2, and investigate its key properties.
Definition 4.4. For any step function φ(t, ω) define the Ito integral
$$I(\omega) = \int_a^b \phi(t,\omega)\,dW(t,\omega) = \int_a^b \phi\,dW = \sum_{j=0}^{n-1}\phi_j\,\Delta W_j. \tag{4.2}$$
From this definition two familiar and desirable properties immediately follow, linearity and union (proofs are left as exercises), as does a third “no anticipation” property:
• Linearity:
$$\int_a^b (\alpha\phi + \beta\psi)\,dW = \alpha\int_a^b \phi\,dW + \beta\int_a^b \psi\,dW; \tag{4.3}$$
• union:
$$\int_a^b \phi\,dW + \int_b^c \phi\,dW = \int_a^c \phi\,dW; \tag{4.4}$$
• the integral $\int_a^b \phi\,dW$ depends only upon the history of the Wiener process up to time b (the integral is $\mathcal{F}_b$-adapted/measurable/previsible).
Two important properties of the Ito integral arise from its stochastic nature. These proper-
ties are so important we codify them as named theorems.
Theorem 4.5 (martingale property). For the Ito integral (4.2) of a step function φ, its mean, average, or expected value is always zero:
$$\mu_I = E\left[\int_a^b \phi(t,\omega)\,dW(t,\omega)\right] = 0. \tag{4.5}$$
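Example 4.6. Recall from Example 4.1 that
$$I(\omega) = \int_0^t W(s,\omega)\,dW(s,\omega) = \tfrac12\left[W(t,\omega)^2 - t\right].$$
Verify E[I(ω)] = 0.
Solution: From the expression for the integral,
$$\begin{aligned}
E[I] &= E\left[\tfrac12\left(W(t,\omega)^2 - t\right)\right] = \tfrac12 E\left[W(t,\omega)^2\right] - \tfrac12 E[t] \\
&= \tfrac12 t - \tfrac12 t \quad\text{as } E\left[W(t,\omega)^2\right] = \operatorname{Var}[W(t,\omega)] = t \\
&= 0.
\end{aligned}$$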
Although we only prove this martingale property³⁹ for step functions here, it does hold generally, and hence we have given an example which is not within the class of step functions.

³⁹ The term martingale refers to stochastic processes, such as the Wiener process, for which, in symbols, $E[X(t,\omega) \mid \mathcal{F}_a] = X(a,\omega)$ for t > a. The symbol $\mathcal{F}_a$, called a filtration, denotes all the history of the process up to time a. This statement says that if you know the history of the process up to time a (that is, conditional on $\mathcal{F}_a$), then the process X(t, ω) is called a martingale if the expected value thereafter is just X(a, ω). For example, a Wiener process is a martingale because, given only the history up to time a, the independence from the earlier history of subsequent increments in the Wiener process implies that the expected value of the Wiener process does not change from W(a, ω). The Ito integral may be considered a stochastic function of the end point b, and we see that as b varies the expected value of the integral is always the value it has for b = a, namely zero. Thus an Ito integral is a martingale.
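Proof. Recall the linearity of the expectation and that $\phi_j$ is independent of $\Delta W_j$, as $\phi_j = \phi(t_j,\omega)$ depends only upon the earlier history of the Wiener process, while the increment $\Delta W_j$ is independent of the earlier history. Thus for the relevant partition, such as one of those in Figure 4.2,
$$\begin{aligned}
E\left[\int_a^b \phi\,dW\right] &= E\left[\sum_{j=0}^{n-1}\phi_j\,\Delta W_j\right] && \text{by definition} \\
&= \sum_{j=0}^{n-1} E\left[\phi_j\,\Delta W_j\right] && \text{by linearity} \\
&= \sum_{j=0}^{n-1} E\left[\phi_j\right]\underbrace{E\left[\Delta W_j\right]}_{=0} && \text{by independence} \\
&= 0.
\end{aligned}$$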
Remarkably, irrespective of what integrand we choose in an Ito integral, we always know the mean value of the Ito integral, namely zero! But even more remarkably (see the next theorem), we can also readily determine the variance, that is, the spread, of an Ito integral. The two most important aspects of the stochastic $I(\omega) = \int_a^b \phi\,dW$, its mean and variance, can be determined without ever actually computing the Ito integral!
Theorem 4.7 (Ito isometry). For the Ito integral (4.2) of a step function φ(t, ω), its variance is the integral of the expectation of the squared integrand:⁴⁰
$$\sigma_I^2 = \operatorname{Var}\left[\int_a^b \phi(t,\omega)\,dW(t,\omega)\right] = E\left[\left(\int_a^b \phi\,dW\right)^2\right] = \int_a^b E\left[\phi(t,\omega)^2\right]dt. \tag{4.6}$$

⁴⁰ As if the square can be taken inside the integration and $dW^2 = dt$.

Example 4.8. Recall from Example 4.1 that the Ito integral
$$I(\omega) = \int_0^t W(s,\omega)\,dW(s,\omega) = \tfrac12\left[W(t,\omega)^2 - t\right].$$
Verify the Ito isometry for this integral.
Solution: Directly from the analytic solution,
$$\begin{aligned}
\operatorname{Var}[I(\omega)] &= \operatorname{Var}\left[\tfrac12\left(W(t,\omega)^2 - t\right)\right] \\
&= \tfrac14 E\left[(W^2 - t)^2\right] \quad\text{as } E[I(\omega)] = 0 \\
&= \tfrac14 E\left[W^4 - 2W^2 t + t^2\right] \\
&= \tfrac14\left[3t^2 - 2\cdot t\cdot t + t^2\right] \\
&= \tfrac12 t^2,
\end{aligned}$$
whereas from the Ito isometry (4.6),
$$\begin{aligned}
\operatorname{Var}[I(\omega)] &= E\left[\left(\int_0^t W(s,\omega)\,dW(s,\omega)\right)^2\right] \\
&= \int_0^t E\left[W(s,\omega)^2\right]ds \\
&= \int_0^t s\,ds \\
&= \tfrac12 t^2.
\end{aligned}$$
Since these are equal, we verify the Ito isometry for this Ito integral.
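Proof. Again we use the properties of the expectation and that the zero mean increment $\Delta W_j$ is independent of any earlier stochastic quantity. Thus for the relevant partition, as in Figure 4.2,
$$\begin{aligned}
E\left[\left(\int_a^b \phi\,dW\right)^2\right] &= E\left[\left(\sum_{j=0}^{n-1}\phi_j\,\Delta W_j\right)^2\right] \\
&= E\left[\left(\sum_{j=0}^{n-1}\phi_j\,\Delta W_j\right)\left(\sum_{k=0}^{n-1}\phi_k\,\Delta W_k\right)\right] \\
&= E\left[\sum_{j,k=0}^{n-1}\phi_j\,\Delta W_j\,\phi_k\,\Delta W_k\right] \\
&= \sum_{j,k=0}^{n-1} E\left[\phi_j\,\Delta W_j\,\phi_k\,\Delta W_k\right].
\end{aligned}$$
Now within this double sum three cases arise. If j < k, then $\Delta W_k$ is independent of $\phi_j$, $\Delta W_j$, and $\phi_k$ (as $\Delta W_k$ depends only upon times $t > t_k$ whereas $\phi_j$, $\Delta W_j$, and $\phi_k$ depend only upon times $t \le t_k$), so the expectation may be split,
$$E\left[\phi_j\,\Delta W_j\,\phi_k\,\Delta W_k\right] = E\left[\phi_j\,\Delta W_j\,\phi_k\right]\underbrace{E\left[\Delta W_k\right]}_{=0},$$
and so all such terms vanish; whereas if j > k, then $\Delta W_j$ is independent of $\phi_k$, $\Delta W_k$, and $\phi_j$ so the expectation may be split,
$$E\left[\phi_j\,\Delta W_j\,\phi_k\,\Delta W_k\right] = E\left[\phi_j\,\Delta W_k\,\phi_k\right]\underbrace{E\left[\Delta W_j\right]}_{=0},$$
and so all such terms vanish, leaving just the j = k terms. Thus
$$\begin{aligned}
E\left[\left(\int_a^b \phi\,dW\right)^2\right] &= \sum_{j=0}^{n-1} E\left[\phi_j^2\,\Delta W_j^2\right] \\
&= \sum_{j=0}^{n-1} E\left[\phi_j^2\right]E\left[\Delta W_j^2\right] && \text{by independence} \\
&= \sum_{j=0}^{n-1} E\left[\phi_j^2\right]\Delta t_j && \text{by variance of } \Delta W_j \\
&= \int_a^b E\left[\phi^2\right]dt
\end{aligned}$$
by the definition of ordinary integration for piecewise constant integrands such as $E\left[\phi(t,\omega)^2\right]$.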
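Both theorems can be checked by simulation. The following is a minimal Monte Carlo sketch in plain Python, assuming the particular step function $\phi_j = W(t_j)$ on four equal steps of [0, 1]; the sample size and seed are arbitrary. Since $E[\phi_j^2] = t_j$, the right-hand side of (4.6) is $\sum_j t_j\,\Delta t_j = 0.375$ for this choice.

```python
import math
import random

def step_ito_integral(rng, n=4):
    """One realization of sum_j phi_j dW_j with phi_j = W(t_j), n equal steps on [0, 1]."""
    h = 1.0 / n
    w, total = 0.0, 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))
        total += w * dw   # phi_j is fixed before the increment dW_j is drawn
        w += dw
    return total

rng = random.Random(1)
samples = [step_ito_integral(rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)

# Right-hand side of (4.6): E[phi_j^2] = t_j = j*h, each step has length h.
n, h = 4, 0.25
isometry = sum(j * h * h for j in range(n))
print(mean, variance, isometry)
```

The sample mean should be close to zero (the martingale property) and the sample variance close to the value given by the Ito isometry, both to within the usual Monte Carlo error, which decays like one over the square root of the number of samples.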
Suggested activity: Do at least Exercises 4.1 and 4.4.
Second task: Step functions approximate
The second task is to prove that the class of step functions S can be used to approximate
arbitrarily well any given Ito process f(t, ω) ∈ V . For example, Figure 4.3 shows a se-
quence of four step functions that approximate the given Ito process increasingly well;
they approximate the Ito process in all realizations.
Although this second task involves considerable uninteresting material, which we
omit, we introduce the interesting and important notion of a Cauchy sequence as a means
to approximation. You are familiar with this issue of approximability every time you work
with real numbers. Recall that we approximate real numbers by rationals; for example,
we often use 22/7 to approximate π. For another example, in a computer all floating
point numbers are stored and manipulated as an integer divided by a power of 2. But
these approximations are only valid to some prescribed finite error. Rigorous mathematics
establishes approximability to every error, no matter how small.
Sequences provide a mechanism for such arbitrarily accurate approximation. For example, Newton iteration $x' = \tfrac12(x + 2/x)$ provides a sequence of rational approximations to the irrational $\sqrt{2}$: one such sequence is 1, 3/2, 17/12, 577/408, 665857/470832, and so on. Similarly, a well-known approximation to the irrational π/4 is the sequence of partial sums of the series 1 − 1/3 + 1/5 − 1/7 + ···, namely 1, 2/3, 13/15, 76/105, . . . . Such approximation of reals by a sequence of rationals works because there are rational numbers that are arbitrarily close to any specified real number; we choose a suitable sequence of such rational numbers that converges to the required real number. Similarly, we can choose a sequence of stochastic step functions, such as those shown in Figure 4.3, that converges to any given Ito process.
The notion of convergence is a problem
In earlier courses you will have discussed convergence in terms of the distance between an element of the sequence and its limit; for example, $\sqrt{2} - 1$, $\sqrt{2} - 3/2$, $\sqrt{2} - 17/12$, $\sqrt{2} - 577/408$. Then you will have proved this distance tends to zero and concluded that the sequence converges to the limit. But here the limit is not in the class of things that we can as yet integrate! We can as yet only integrate step functions, not yet general Ito processes. Thus we use a different definition of convergence: instead of investigating the distance between elements of the sequence with the supposed limit, we investigate the distance between pairs of elements of the sequence. Table 4.1 gives an example. A sequence $I_n$ is termed a Cauchy sequence if, loosely, the differences $I_n - I_m \to 0$ as n, m → ∞, or more precisely, if for all ε > 0 there exists an N such that $|I_n - I_m| < \varepsilon$ for all n, m > N (see, e.g., Kreyszig 1999, §14.1). If a sequence is Cauchy, then the sequence must converge to some limit, even if we do not know what that limit is.

Figure 4.3. Five realizations of an Ito process plotted once in each subfigure (φ(t, ω) against t for 0 ≤ t ≤ 1). Superimposed are four successively refined approximations by step functions. As the number of partitions increases from top-left to bottom-right, the corresponding realizations of the step functions increasingly closely approximate the realizations of the Ito process.
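The Cauchy test is easy to experiment with. The sketch below, in plain Python with exact rational arithmetic from the standard `fractions` module, generates the first five Newton iterates for $\sqrt{2}$ starting from 1 and tabulates the pairwise distances $|I_m - I_n|$; the function name and print layout are choices made here for illustration.

```python
from fractions import Fraction

def newton_sqrt2(count):
    """First `count` Newton iterates x' = (x + 2/x)/2 for sqrt(2), starting from 1."""
    x = Fraction(1)
    iterates = [x]
    for _ in range(count - 1):
        x = (x + 2 / x) / 2
        iterates.append(x)
    return iterates

iterates = newton_sqrt2(5)

# Pairwise distances |I_m - I_n|: the quantity a Cauchy test examines.
distances = [[abs(float(a - b)) for b in iterates] for a in iterates]

for x, row in zip(iterates, distances):
    print(f"{str(x):>14}", " ".join(f"{d:8.1e}" for d in row))
```

No knowledge of the limit $\sqrt{2}$ is used anywhere: the distances between later pairs of iterates shrink rapidly, which is exactly the evidence of convergence that the Cauchy criterion relies on.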
The next subsubsection argues that Ito integrals of converging step functions form
a Cauchy sequence, and hence the Ito integrals converge to something. The advantage of
using the notion of a Cauchy sequence in a test for convergence is that it only involves
operations of the elements of the sequence and does not involve comparisons with the
actual limit. This is just what we need here because we have defined Ito integrals only for
step functions, so we have no limit for comparison. But we need to clearly define distances
between stochastic quantities.
Definition 4.9. Measure the distance between two stochastic quantities by
$$\operatorname{dist}[I(\omega), J(\omega)] = E\left[(I(\omega) - J(\omega))^2\right]. \tag{4.7}$$
If this distance is zero, then I and J are almost surely the same: realizations can only differ with probability zero. Analogously, measure the distance between two stochastic processes over a time interval [a, b] by
$$\operatorname{dist}[I(t,\omega), J(t,\omega)] = \int_a^b E\left[(I(t,\omega) - J(t,\omega))^2\right]dt. \tag{4.8}$$

Table 4.1. A sequence $I_n$ approximating $\sqrt{2}$ and the distance between pairs of elements in the sequence, $|I_m - I_n|$. This distance tends to zero as both m, n → ∞ and so is a Cauchy sequence.

  m   I_m \ I_n        n=1: 1    n=2: 3/2   n=3: 17/12   n=4: 577/408   n=5: 665857/470832
  1   1                0         0.5        0.4167       0.4142         0.4142
  2   3/2              0.5       0          0.0833       0.0858         0.0858
  3   17/12            0.4167    0.0833     0            0.0025         0.0025
  4   577/408          0.4142    0.0858     0.0025       0              2.E-06
  5   665857/470832    0.4142    0.0858     0.0025       2.E-06         0
Establishing this ability of step functions to approximate Ito processes uses techniques that do not illuminate any significant properties of stochastic integration, and so we omit all details. Suffice it to say that Øksendal (1998) uses the norm $\|f\|^2 = \int_a^b E[f^2]\,dt$ on V and proceeds in three steps: step functions may approximate bounded continuous functions; bounded continuous functions may approximate bounded functions; and bounded functions may approximate the Ito processes in V. Hence Øksendal concludes that step functions S may approximate arbitrarily well any Ito process in V. We skip these uninteresting technicalities.
Third task: The limit exists
The third task defines the Ito integral of any f(t, ω) ∈ V to be the limit of Ito integrals of
step functions. The nontrivial task here is to deduce that the limit exists and is unique.
• Given some stochastic function f(t, ω) in V.
• Find a sequence of step functions $\phi_n(t,\omega)$ that tends to f(t, ω) in the norm, that is,
$$\operatorname{dist}(\phi_n, f) = \int_a^b E\left[(\phi_n - f)^2\right]dt \to 0 \quad\text{as } n \to \infty.$$
For example, Figure 4.3 plots five realizations of one such sequence of step functions approximating an f(t, ω).
• Compute the Ito integrals $I_n(\omega) = \int_a^b \phi_n(t,\omega)\,dW(t,\omega)$ of these step functions. For example, Table 4.2 records the five realizations of the Ito integral of the step functions shown in Figure 4.3.
• The Ito integrals should converge to some number I(ω), which is stochastic because it depends upon the realization; call this the integral and denote it by $\int_a^b f\,dW$. For example, it is plausible that each column of Table 4.2 converges to five different realizations of the integral I(ω); recall that numerical approximations to stochastic integration converge slowly.

Table 4.2. The five realizations of the Ito integral $I_n(\omega)$ of the step functions shown in Figure 4.3. The color matches that of the plot.
Lemma 4.10. The Ito integrals $I_n(\omega) = \int_a^b \phi_n(t,\omega)\,dW(t,\omega)$ form a stochastic Cauchy sequence, in that $E[(I_n - I_m)^2] \to 0$ as n, m → ∞, and hence must converge.
Proof. Use the properties of Ito integrals for step functions and see their pairwise distance
$$\begin{aligned}
\operatorname{dist}(I_n, I_m) &= E\left[(I_n - I_m)^2\right] \\
&= E\left[\left(\int_a^b (\phi_n - \phi_m)\,dW\right)^2\right] && \text{by linearity} \\
&= \int_a^b E\left[(\phi_n - \phi_m)^2\right]dt && \text{by Ito isometry} \\
&= \int_a^b E\left[\left((\phi_n - f) + (f - \phi_m)\right)^2\right]dt \\
&\le \int_a^b E\left[2(\phi_n - f)^2 + 2(f - \phi_m)^2\right]dt && \text{since } (a+b)^2 \le 2a^2 + 2b^2 \text{ by Exercise 4.5} \\
&= 2\operatorname{dist}(\phi_n, f) + 2\operatorname{dist}(f, \phi_m) \\
&\to 0 \quad\text{as } n, m \to \infty \text{ by convergence to } f(t,\omega).
\end{aligned}$$
Thus the Ito integrals, $I_n(\omega)$, of the sequence of step functions are a (stochastic) Cauchy sequence. This Cauchy sequence converges almost surely to some value; we call this value the Ito integral of f(t, ω), denoted $I(\omega) = \int_a^b f(t,\omega)\,dW(t,\omega)$.
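The lemma can be illustrated numerically. The sketch below, in plain Python, assumes the integrand f = W on [0, 1] and step functions $\phi_n$ that freeze W at the left end of each of n equal steps; a 256-step approximation stands in for the limit, which is an assumption of this illustration rather than part of the proof. For these nested partitions the Ito isometry predicts $E[(I_n - I_{256})^2] = 1/(2n) - 1/512$.

```python
import math
import random

def coarse_integrals(dw, levels):
    """Ito sums of W dW with W frozen at the left end of each of n equal coarse steps."""
    fine = len(dw)
    results = []
    for n in levels:
        block = fine // n            # fine increments per coarse step
        w, frozen, total = 0.0, 0.0, 0.0
        for k, inc in enumerate(dw):
            if k % block == 0:
                frozen = w           # step function phi_n takes the value W(t_j)
            total += frozen * inc
            w += inc
        results.append(total)
    return results

levels = [4, 16, 64, 256]            # steps in each phi_n; 256 is the reference
fine, paths = 256, 1000
rng = random.Random(2)
sq_dist = [0.0] * (len(levels) - 1)
for _ in range(paths):
    dw = [rng.gauss(0.0, math.sqrt(1.0 / fine)) for _ in range(fine)]
    *coarse, reference = coarse_integrals(dw, levels)
    for i, value in enumerate(coarse):
        sq_dist[i] += (value - reference) ** 2 / paths

print(sq_dist)   # estimated E[(I_n - I_256)^2] for n = 4, 16, 64
```

The estimated mean-square distances should decrease roughly in proportion to 1/n, showing the integrals of successively refined step functions drawing together as the Cauchy property requires.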
Figure 4.4. The same five realizations of an Ito process as in Figure 4.3, plotted with a different sequence of four successively refined approximations by step functions ψ(t, ω) (ψ(t, ω) against t for 0 ≤ t ≤ 1).

Table 4.3. The five realizations of the Ito integral $J_n(\omega)$ of the step functions shown in Figure 4.4. The color matches that of the plot.
Lemma 4.11. A stochastic integral $I(\omega) = \int_a^b f(t,\omega)\,dW(t,\omega)$ is unique; that is, it is almost always one definite value in each realization, though different for different realizations.
Proof. Consider any other sequence of step functions $\psi_n(t,\omega)$ that converge to the integrand f(t, ω); for example, see those plotted in Figure 4.4 which use a different sequence of partitions than that shown in Figure 4.3. Suppose the Ito integrals $J_n(\omega)$ of this sequence converge to the value J(ω). For example, Table 4.3 gives integrals $J_n(\omega)$ for the step function sequence shown in Figure 4.4; see that these integrals are different from the integrals $I_n(\omega)$ and so potentially could converge to something different, namely J(ω) instead of I(ω). However, the two potential values of the integral, I(ω) and J(ω), are almost always the same as their distance
$$\begin{aligned}
\operatorname{dist}(I, J) &= E\left[(I - J)^2\right] \\
&= E\left[\left((I - I_n) + (I_n - J_n) + (J_n - J)\right)^2\right] \\
&\le E\left[3(I - I_n)^2 + 3(I_n - J_n)^2 + 3(J_n - J)^2\right] && \text{by Exercise 4.5} \\
&= 3E\left[(I - I_n)^2\right] + 3E\left[(I_n - J_n)^2\right] + 3E\left[(J_n - J)^2\right] \\
&\to 0 \quad\text{as } n \to \infty,
\end{aligned}$$
as the first and last terms tend to zero by the definition of the values I(ω) and J(ω) of the integrals, and the middle term also tends to zero because
$$\begin{aligned}
E\left[(I_n - J_n)^2\right] &= E\left[\left(\int_a^b (\phi_n - \psi_n)\,dW\right)^2\right] && \text{by definition and linearity} \\
&= \int_a^b E\left[(\phi_n - \psi_n)^2\right]dt && \text{by Ito isometry} \\
&= \int_a^b E\left[(\phi_n - f + f - \psi_n)^2\right]dt \\
&\le \int_a^b E\left[2(\phi_n - f)^2 + 2(f - \psi_n)^2\right]dt && \text{by Exercise 4.5} \\
&= 2\int_a^b E\left[(\phi_n - f)^2\right]dt + 2\int_a^b E\left[(f - \psi_n)^2\right]dt \\
&\to 0 \quad\text{as } \phi_n, \psi_n \to f \text{ in this norm.}
\end{aligned}$$
Thus the Ito integral $\int_a^b f(t,\omega)\,dW(t,\omega)$ exists and is almost surely unique.
Suggested activity: Do at least Exercise 4.5.
Fourth task: Properties then follow
The final task is to show that the integral properties for the Ito integral of functions f(t, ω) ∈ V follow from those for step functions. There is little of interest in the technicalities, so a proof is omitted. From the five main properties of integrals of step functions, the following are the five main properties of general stochastic integrals:
• linearity,
$$\int_a^b (\alpha f + \beta g)\,dW = \alpha\int_a^b f\,dW + \beta\int_a^b g\,dW; \tag{4.9}$$
• union,
$$\int_a^b f\,dW + \int_b^c f\,dW = \int_a^c f\,dW; \tag{4.10}$$
• the Ito integral $\int_a^b f\,dW$ depends only upon the history of the Wiener process up to time b (the integral is $\mathcal{F}_b$-adapted/measurable/previsible);
• the martingale property; that is, the mean, average, or expected value is always zero:
$$E\left[\int_a^b f\,dW\right] = 0; \tag{4.11}$$
• Ito isometry, the variance is the ordinary integral of the expectation of the squared integrand:
$$\operatorname{Var}\left[\int_a^b f\,dW\right] = E\left[\left(\int_a^b f\,dW\right)^2\right] = \int_a^b E\left[f^2\right]dt. \tag{4.12}$$
We saw some of these properties in previous examples, for example, in the integral $\int W\,dW$.
Summary
This section put stochastic integrals on a strong footing. As for ordinary integrals, the basis
is the approximation by sums over small increments in time. Consequently, stochastic
integrals have many of the usual properties such as linearity and union. But they also have
additional important properties, such as the martingale property, that their expectation is
always zero, and the Ito isometry, that their variance may be computed as an ordinary
integral. Having now put integration on a firm mathematical base, we now proceed to more
rigorous analysis of stochastic processes.
4.2 The Ito formula
So far we have seen the Ito formula (2.4)–(2.5), enabling us to differentiate stochastic Ito
processes and hence solve some SDEs. However, the integral form of the Ito formula,
stated below, is now to be rigorously established because we now have a carefully defined
Ito integral. The Ito formula also may be used to determine some integrals, as integration
forms a relatively simple subset of the cases of solving differential equations.
Theorem 4.12. Let f(t, x) be a smooth function of its arguments and X(t, ω) be an Ito
process with drift μ(t, X) and volatility σ(t, X), that is, dX = μ dt + σ dW; then Y(t, ω) =
f(t, X(t, ω)) is also an Ito process such that
\[
\big[f(t,X(t,\omega))\big]_a^b
=\int_a^b\Big(\frac{\partial f}{\partial t}+\mu\frac{\partial f}{\partial x}+\frac12\sigma^2\frac{\partial^2 f}{\partial x^2}\Big)\,dt
+\int_a^b\sigma\frac{\partial f}{\partial x}\,dW, \tag{4.13}
\]
for all intervals [a, b].
Before proceeding to prove the Ito formula, we here look at its role in determining
Ito integrals. Take the view that transforming an Ito integral, $\int\cdots\,dW$, into an ordinary
integral, $\int\cdots\,dt$, is effectively a solution of the integral. We seek to simplify $\int\cdots\,dW$ at
the acceptable cost of introducing $\int\cdots\,dt$.
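Example 4.13. We have seen that $\int W\,dW=\tfrac12W(t,\omega)^2-\tfrac12t$, but what is $I=\int X\,dX$? (See footnote 41.)
Solution: If I(ω) were an ordinary integral, we would immediately write down the
integral as $\tfrac12X^2$. So for this stochastic integral we guess I(ω) has $\tfrac12X^2$ in it, apply Ito’s
formula (4.13) to the function $f(t,X)=\tfrac12X^2$, and see what eventuates:
\[
\begin{aligned}
\big[\tfrac12X^2\big]_a^b
&=\int_a^b\big(0+\mu X+\tfrac12\sigma^2\cdot1\big)\,dt+\int_a^b\sigma X\,dW\\
&=\int_a^b X\underbrace{(\mu\,dt+\sigma\,dW)}_{dX}+\tfrac12\int_a^b\sigma^2\,dt\\
\Longrightarrow\quad\int_a^b X\,dX
&=\big[\tfrac12X^2\big]_a^b-\tfrac12\int_a^b\sigma^2\,dt.
\end{aligned}
\]
This reduces to the earlier result for $\int W\,dW$ when X = W, for which μ = 0 and σ = 1 in the right-hand side.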
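The result of Example 4.13 can be sanity-checked on a single simulated realization. The sketch below is an illustration only, in Python rather than the MATLAB/SCILAB of the book, and it assumes constant drift and volatility (the values of `mu`, `sigma`, the starting point, and the seed are arbitrary choices). It compares the left-endpoint sum $\sum_j X_j\Delta X_j$ with $\big[\tfrac12X^2\big]_0^T-\tfrac12\sigma^2T$.

```python
import math
import random

def check_x_dx(mu=0.5, sigma=0.8, t_end=1.0, n_steps=20000, seed=2):
    """Return (Ito sum of X dX, closed form from Example 4.13) for one realization."""
    rng = random.Random(seed)
    h = t_end / n_steps
    x_start = 1.0
    x = x_start
    ito_sum = 0.0
    for _ in range(n_steps):
        dx = mu * h + sigma * rng.gauss(0.0, math.sqrt(h))  # dX = mu dt + sigma dW
        ito_sum += x * dx  # X is evaluated at the left end of each step
        x += dx
    closed_form = 0.5 * (x * x - x_start * x_start) - 0.5 * sigma ** 2 * t_end
    return ito_sum, closed_form

ito_sum, closed_form = check_x_dx()
print("Ito sum:", ito_sum, " closed form:", closed_form)
```

The two numbers differ by $\tfrac12\big(\sum_j\Delta X_j^2-\sigma^2T\big)$, which shrinks as the partition is refined; this is the same quadratic term that the proof of the Ito formula controls.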
In Example 4.13 the limits of integration, a and b, were carried along in the working and
made no real difference to the algebraic manipulations other than increasing the level of
detail. It is simpler to neglect to include the limits a and b and symbolically work entirely
with indefinite integrals, as we often do hereafter.
Example 4.14. Determine $I(\omega)=\int t\,dW$ in terms of an ordinary integral.
Solution: We might expect tW(t, ω) to appear in the answer, so we use the Ito
formula (4.13) with f(t, W) = tW, for which $f_t=W$, $f_W=t$, and $f_{WW}=0$, and
we also remember that μ = 0 and σ = 1 for a Wiener process W(t, ω). Applying (4.13)
leads to
\[
tW(t,\omega)=\int\big(W+0\cdot t+\tfrac12\cdot0\big)\,dt+\int1\cdot t\,dW
\quad\Longrightarrow\quad
\int t\,dW=tW(t,\omega)-\int W\,dt.
\]
See that this looks just like ordinary integration by parts.
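A quick numerical illustration of Example 4.14 follows. It is a sketch under stated assumptions, not the author's code: Python is used instead of MATLAB/SCILAB, and the step count and seed are arbitrary. On one realization it compares $\sum_j t_j\Delta W_j$ with $TW(T)-\sum_j W_j\Delta t_j$.

```python
import math
import random

def check_t_dw(t_end=1.0, n_steps=2000, seed=3):
    """Return (Ito sum of t dW, T*W(T) minus the Riemann sum of W dt)."""
    rng = random.Random(seed)
    h = t_end / n_steps
    w = 0.0
    ito_sum = 0.0   # accumulates t_j * dW_j
    w_dt_sum = 0.0  # accumulates W_j * dt
    for j in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))
        ito_sum += (j * h) * dw
        w_dt_sum += w * h
        w += dw
    return ito_sum, t_end * w - w_dt_sum

lhs, rhs = check_t_dw()
print("sum t dW:", lhs, " tW - sum W dt:", rhs)
```

Summation by parts shows the two discrete quantities differ by exactly $h\,W(T)$ on a uniform partition, so agreement is very close even for modest `n_steps`.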
Footnote 41: Assume X(t, ω) is an Ito process with drift μ(t, ω) and volatility σ(t, ω); thus view the integral as
$I(\omega)=\int X\mu\,dt+\int X\sigma\,dW$.
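Example 4.15. Determine as far as possible $\int W^2\,dW$.
Solution: Guess that $\tfrac13W^3$ might appear, so consider $f=\tfrac13W^3$ in the integral form
of Ito’s formula (4.13):
\[
\begin{aligned}
\tfrac13W^3
&=\int\big(0+0\cdot W^2+\tfrac12\cdot2W\big)\,dt+\int1\cdot W^2\,dW\\
&=\int W\,dt+\int W^2\,dW\\
\Longrightarrow\quad\int W^2\,dW
&=\tfrac13W^3-\int W\,dt.
\end{aligned}
\]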
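As with the previous examples, the formula of Example 4.15 can be compared against sums on one simulated path. The following Python sketch is illustrative only (the book itself uses MATLAB/SCILAB; the step count and seed are assumptions). It compares $\sum_j W_j^2\Delta W_j$ with $\tfrac13W(T)^3-\sum_j W_j\Delta t_j$.

```python
import math
import random

def check_w2_dw(t_end=1.0, n_steps=20000, seed=4):
    """Return (Ito sum of W^2 dW, W(T)^3/3 minus the Riemann sum of W dt)."""
    rng = random.Random(seed)
    h = t_end / n_steps
    w = 0.0
    ito_sum = 0.0
    w_dt_sum = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))
        ito_sum += w * w * dw  # left-endpoint evaluation of W^2
        w_dt_sum += w * h
        w += dw
    return ito_sum, w ** 3 / 3.0 - w_dt_sum

lhs, rhs = check_w2_dw()
print("sum W^2 dW:", lhs, " W^3/3 - sum W dt:", rhs)
```

The discrepancy is dominated by $\sum_j W_j(\Delta W_j^2-\Delta t_j)$, whose standard deviation is about $\sqrt h$ for $T=1$, so it decays only slowly as the partition is refined.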
Now we proceed to prove the integral form (4.13) of Ito’s formula (2.5); Øksendal
(1998) [§4.1] gives more details than we present. We do this now because it is only now that
we have properly defined integration of stochastic functions, and because the differential
form we used before is viewed as symbolically equivalent to the integral form. Throughout,
observe the crucial role of the independence of Wiener increments $\Delta W_j$ from earlier events.
Proof. Consider the case where the drift μ and volatility σ are step functions (piecewise
constant) with a common partition $a=t_0<t_1<\cdots<t_n=b$ with $h=\max_j\Delta t_j$ (so that
h → 0 means that the partition is everywhere made of smaller and smaller pieces). Then
arbitrary reasonable μ and σ are approximated by a sequence of such step functions, and
so the same formula holds.
For brevity let an overdot denote ∂/∂t and a dash denote ∂/∂x so that, for example,
$\dot f_j=\left.\frac{\partial f}{\partial t}\right|_{(t_j,X_j)}$. Then the left-hand side of the Ito formula (4.13) transforms as follows:
\[
\begin{aligned}
[f]_a^b=\sum_{j=0}^{n-1}\Delta f_j
&=\sum_{j=0}^{n-1}\big[f(t_{j+1},X_{j+1})-f(t_j,X_j)\big]\\
&=\sum_{j=0}^{n-1}\big[f(t_j+\Delta t_j,X_j+\Delta X_j)-f(t_j,X_j)\big]\\
&=\sum_{j=0}^{n-1}\big[\dot f_j\Delta t_j+f'_j\Delta X_j+\tfrac12\ddot f_j\Delta t_j^2+\dot f'_j\Delta t_j\Delta X_j+\tfrac12f''_j\Delta X_j^2+R_j\big]
\end{aligned}
\]
by expanding f(t, x) in a Taylor series, assuming f(t, x) is differentiable (even though the
process X(t, ω) need not be) and where the residual term $R_j=O\big(|\Delta t_j|^3+|\Delta X_j|^3\big)$. Then
in the next few steps we use that the drift μ and volatility σ are step functions, are thus
constant on each element of the partition, and so $\Delta X_j=\mu_j\Delta t_j+\sigma_j\Delta W_j$ exactly. Then
consider in each of the following dot points each term in turn on the right-hand side of the
above:
• $\sum_{j=0}^{n-1}\dot f_j\Delta t_j\to\int_a^b\dot f\,dt$ as h → 0 (provided we refine the partition in such a way that
μ and σ are step functions in every partition) to give the first term in Ito’s integral
formula;
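• two more terms are obtained from
\[
\begin{aligned}
\sum_{j=0}^{n-1}f'_j\Delta X_j
&=\sum_{j=0}^{n-1}f'_j\mu_j\Delta t_j+\sum_{j=0}^{n-1}f'_j\sigma_j\Delta W_j && \text{as step functions}\\
&\to\int_a^b f'\mu\,dt+\int_a^b f'\sigma\,dW \quad\text{as } h\to0;
\end{aligned}
\]
• with the last term in Ito’s integral formula coming from the last of the quadratic terms
in the sum, as we now show the other quadratic terms are zero starting with
\[
\begin{aligned}
\Big|\sum_{j=0}^{n-1}\tfrac12\ddot f_j\Delta t_j^2\Big|
&\le\sum_{j=0}^{n-1}\tfrac12|\ddot f_j|\Delta t_j^2\\
&\le\tfrac12h\sum_{j=0}^{n-1}|\ddot f_j|\Delta t_j && \text{as } h=\max_j\Delta t_j\\
&\to\tfrac12h\int_a^b|\ddot f|\,dt\\
&\to0 \quad\text{as } h\to0,
\end{aligned}
\]
provided this integral exists (which we assume for f(t, x) of interest) and hence this
term must contribute nothing;
• the next quadratic term contributes nothing because
\[
\begin{aligned}
\sum_{j=0}^{n-1}\dot f'_j\Delta t_j\Delta X_j
&=\sum_{j=0}^{n-1}\dot f'_j\Delta t_j(\mu_j\Delta t_j+\sigma_j\Delta W_j)\\
&=\sum_{j=0}^{n-1}\mu_j\dot f'_j\Delta t_j^2+\underbrace{\sum_{j=0}^{n-1}\sigma_j\dot f'_j\Delta t_j\Delta W_j}_{=Y,\ \text{say}},
\end{aligned}
\]
and the first term here vanishes by the previous case (assuming $\int_a^b|\dot f'|\,dt$ exists)
whereas the second term, Y, is more delicate because it is stochastic, but we determine
\[
\begin{aligned}
E[Y^2]
&=E\Big[\Big(\sum_{j=0}^{n-1}\sigma_j\dot f'_j\Delta t_j\Delta W_j\Big)\Big(\sum_{k=0}^{n-1}\sigma_k\dot f'_k\Delta t_k\Delta W_k\Big)\Big]\\
&=E\Big[\sum_{j,k=0}^{n-1}\sigma_j\dot f'_j\Delta t_j\Delta W_j\,\sigma_k\dot f'_k\Delta t_k\Delta W_k\Big]\\
&=\sum_{j,k=0}^{n-1}E\big[\sigma_j\dot f'_j\Delta W_j\,\sigma_k\dot f'_k\Delta W_k\big]\Delta t_j\Delta t_k,
\end{aligned}
\]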
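but all terms in this sum vanish except those for which j = k because (see footnote 42), for example,
if j < k then the factor $\Delta W_k$ is independent of all other factors inside the expectation,
and so the expectation factors to
\[
E\big[\sigma_j\dot f'_j\Delta W_j\,\sigma_k\dot f'_k\Delta W_k\big]
=E\big[\sigma_j\dot f'_j\Delta W_j\,\sigma_k\dot f'_k\big]\underbrace{E[\Delta W_k]}_{=0},
\]
and similarly for j > k, thus leaving only the terms k = j in
\[
\begin{aligned}
E[Y^2]
&=\sum_{j=0}^{n-1}E\big[\sigma_j^2\dot f'^{\,2}_j\Delta W_j^2\big]\Delta t_j^2\\
&=\sum_{j=0}^{n-1}E\big[\sigma_j^2\dot f'^{\,2}_j\big]\underbrace{E\big[\Delta W_j^2\big]}_{=\Delta t_j}\Delta t_j^2 && \text{as } \Delta W_j \text{ is independent of earlier history}\\
&\le h^2\sum_{j=0}^{n-1}E\big[\sigma_j^2\dot f'^{\,2}_j\big]\Delta t_j\\
&\to h^2\int_a^b E\big[\sigma^2\dot f'^{\,2}\big]\,dt\\
&\to0 \quad\text{as } h\to0,
\end{aligned}
\]
and thus almost surely the term Y → 0 as h → 0;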
• the only significant quadratic term is
\[
\begin{aligned}
\sum_{j=0}^{n-1}\tfrac12f''_j\Delta X_j^2
&=\sum_{j=0}^{n-1}\tfrac12(\mu_j\Delta t_j+\sigma_j\Delta W_j)^2f''_j\\
&=\sum_{j=0}^{n-1}\tfrac12\mu_j^2f''_j\Delta t_j^2+\sum_{j=0}^{n-1}\mu_j\sigma_jf''_j\Delta t_j\Delta W_j+\sum_{j=0}^{n-1}\tfrac12\sigma_j^2f''_j\Delta W_j^2,
\end{aligned}
\]
and of these three terms the first two vanish by almost identical arguments to the
previous two cases which, setting $c_j=\tfrac12\sigma_j^2f''_j$ for brevity, leave the sum $\sum_{j=0}^{n-1}c_j\Delta W_j^2$
that we want to turn into the integral $\int_a^b c\,dt$ in the rather special way unique to
stochastic calculus; we thus compare $\sum_{j=0}^{n-1}c_j\Delta W_j^2$ with $\sum_{j=0}^{n-1}c_j\Delta t_j$ by showing
Footnote 42: Note that this is exactly the same argument used to prove the Ito isometry, and which we use again later.
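(similarly as before) the vanishing of
\[
\begin{aligned}
&E\Big[\Big(\sum_{j=0}^{n-1}c_j\Delta W_j^2-\sum_{j=0}^{n-1}c_j\Delta t_j\Big)^2\Big]\\
&\quad=E\Big[\Big(\sum_{j=0}^{n-1}c_j(\Delta W_j^2-\Delta t_j)\Big)^2\Big]\\
&\quad=E\Big[\Big(\sum_{j=0}^{n-1}c_j(\Delta W_j^2-\Delta t_j)\Big)\Big(\sum_{k=0}^{n-1}c_k(\Delta W_k^2-\Delta t_k)\Big)\Big]\\
&\quad=E\Big[\sum_{j,k=0}^{n-1}c_jc_k(\Delta W_j^2-\Delta t_j)(\Delta W_k^2-\Delta t_k)\Big]\\
&\quad=\sum_{j,k=0}^{n-1}\underbrace{E\big[c_jc_k(\Delta W_j^2-\Delta t_j)(\Delta W_k^2-\Delta t_k)\big]}_{=0\ \text{unless } k=j \text{ by independence of increments}}\\
&\quad=\sum_{j=0}^{n-1}E\big[c_j^2(\Delta W_j^2-\Delta t_j)^2\big]\\
&\quad=\sum_{j=0}^{n-1}E\big[c_j^2\big]E\big[(\Delta W_j^2-\Delta t_j)^2\big] \qquad\text{as } \Delta W_j^2-\Delta t_j \text{ is independent of earlier history}\\
&\quad=\sum_{j=0}^{n-1}E\big[c_j^2\big]E\big[\Delta W_j^4-2\Delta W_j^2\Delta t_j+\Delta t_j^2\big]\\
&\quad=\sum_{j=0}^{n-1}E\big[c_j^2\big]\big(3\Delta t_j^2-2\Delta t_j\Delta t_j+\Delta t_j^2\big)\\
&\quad=2\sum_{j=0}^{n-1}E\big[c_j^2\big]\Delta t_j^2\\
&\quad\le2h\sum_{j=0}^{n-1}E\big[c_j^2\big]\Delta t_j\\
&\quad\to2h\int_a^b E\big[c^2\big]\,dt\\
&\quad\to0 \quad\text{as } h\to0,
\end{aligned}
\]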
and thus almost surely,
\[
\sum_{j=0}^{n-1}c_j\Delta W_j^2\to\sum_{j=0}^{n-1}c_j\Delta t_j=\sum_{j=0}^{n-1}\tfrac12\sigma_j^2f''_j\Delta t_j\to\int_a^b\tfrac12\sigma^2f''\,dt
\]
as h → 0 to give the last term in Ito’s integral formula;
• and for the very last task I simply claim that the residuals of cubic and higher order
terms vanish, $\sum_{j=0}^{n-1}R_j\to0$ as h → 0 (part of the argument is Exercise 2.3), leaving
just the integral form of the Ito formula.
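The central estimate in the proof above is that $E\big[\big(\sum_j c_j(\Delta W_j^2-\Delta t_j)\big)^2\big]=2\sum_jE[c_j^2]\Delta t_j^2$, which is $2Th$ for $c_j=1$ on a uniform partition of $[0,T]$. The following sketch estimates the left-hand side by simulation for three partition sizes. It is an illustration under stated assumptions only: Python instead of the MATLAB/SCILAB of the book, the special case $c_j=1$, and arbitrary sample sizes and seed.

```python
import math
import random

def mean_square_gap(n_steps, t_end=1.0, n_paths=2000, seed=5):
    """Estimate E[(sum(dW^2 - dt))^2]; also return the predicted value 2*T*h."""
    rng = random.Random(seed)
    h = t_end / n_steps
    total = 0.0
    for _path in range(n_paths):
        gap = 0.0
        for _step in range(n_steps):
            dw = rng.gauss(0.0, math.sqrt(h))
            gap += dw * dw - h  # one term of sum(dW_j^2 - dt_j)
        total += gap * gap
    return total / n_paths, 2.0 * t_end * h

results = {n: mean_square_gap(n) for n in (10, 100, 1000)}
for n, (estimate, predicted) in results.items():
    print(n, estimate, predicted)
```

Each tenfold refinement of the partition should reduce the mean square gap roughly tenfold, which is the sense in which $\sum_j\Delta W_j^2$ may be replaced by $\sum_j\Delta t_j$.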
We terminate here this development of the basic theory that underpins SDEs and
their applications. With this establishment of Ito’s formula on a firm theoretical footing,
and with the working concepts developed in earlier chapters, you are now able to not only
solve practical problems such as the valuation of options, but you also have the understanding
to study many of the more theoretical books and articles in financial mathematics and
stochastic processes such as that by Øksendal (1998).
4.3 Summary
• An SDE such as dX = μ(t, X)dt + σ(t, X)dW is a convenient shorthand for the Ito
integral equation
\[ X(b,\omega)-X(a,\omega)=\int_a^b\mu(t,X(t,\omega))\,dt+\int_a^b\sigma(t,X(t,\omega))\,dW. \]
• Crucial properties of Ito integrals are
– linearity, $\int_a^b(\alpha f+\beta g)\,dW=\alpha\int_a^b f\,dW+\beta\int_a^b g\,dW$;
– union, $\int_a^b f\,dW+\int_b^c f\,dW=\int_a^c f\,dW$;
– the Ito integral $\int_a^b f\,dW$ depends only upon the history of the Wiener process
up to time b (it is $\mathcal{F}_b$-adapted/measurable/previsible);
– it is a martingale; that is, the mean, average, or expected value is always zero:
$E\big[\int_a^b f\,dW\big]=0$;
– it has Ito isometry, that is, the variance is the ordinary integral of the expectation
of the squared integrand:
\[ \operatorname{Var}\Big[\int_a^b f\,dW\Big]=E\Big[\Big(\int_a^b f\,dW\Big)^2\Big]=\int_a^b E\big[f^2\big]\,dt. \]
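• The integral form of Ito’s formula is
\[
\big[f(t,X(t,\omega))\big]_a^b
=\int_a^b\Big(\frac{\partial f}{\partial t}+\mu\frac{\partial f}{\partial x}+\frac12\sigma^2\frac{\partial^2 f}{\partial x^2}\Big)\,dt
+\int_a^b\sigma\frac{\partial f}{\partial x}\,dW
\]
for a stochastic process X satisfying dX = μ dt + σ dW.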
• In proving all of the above, a crucial feature is the independence of increments from
earlier in the history. The proofs also rest upon the Ito isometry.
Exercises
4.1. Prove the linearity (4.3) of the Ito integral for step functions.
4.2. Prove the union (4.4) of the Ito integral for step functions.
4.3. By considering the differential $d(W^3-3tW)$, deduce the Ito integral
$I(\omega)=\int_0^T\big[W(t,\omega)^2-t\big]\,dW(t,\omega)$, where W(t, ω) is a Wiener process. Hence verify the
martingale property and Ito isometry for this Ito integral.
4.4. Argue from basic principles that
\[ \int_0^T W(t,\omega)\,dW(t,\omega)=\tfrac12\big[W(T,\omega)^2-T\big] \]
by forming a partition over the time interval [0, T], approximating the integral as
$I_n(\omega)=\sum_{j=0}^{n-1}W_j\Delta W_j$, and showing that these almost surely tend to
$I(\omega)=\tfrac12\big[W(T,\omega)^2-T\big]$ by considering $I-I_n$.
1. Argue that $I-I_n=\tfrac12\sum_{j=0}^{n-1}\big[(\Delta W_j)^2-\Delta t_j\big]$ by writing
\[ \tfrac12\big[W(T,\omega)^2-T\big]=\tfrac12\sum_{j=0}^{n-1}\big[\Delta(W_j^2)-\Delta t_j\big]. \]
2. Then use some of the ideas appearing in the proof of the Ito isometry and in
the steps leading to (2.2) to show that the distance between the integral I(ω)
and the approximate $I_n(\omega)$, measured by $E[(I-I_n)^2]$, must tend to zero.
4.5. Prove that $(a+b)^2\le2a^2+2b^2$ for any real numbers a and b by considering
$(a+b)^2+(a-b)^2$. Similarly prove $(a+b+c)^2\le3a^2+3b^2+3c^2$ for any real a,
b, and c.
4.6. Reconsider $E[W(t,\omega)^k]$ for a Wiener process W(t, ω). Consider $d(W(t,\omega)^k)$ by
Ito’s formula and the martingale property of Ito integrals to deduce
\[ E\big[W(T,\omega)^k\big]=\tfrac12k(k-1)\int_0^T E\big[W(t,\omega)^{k-2}\big]\,dt \quad\text{for } k\ge2. \]
Hence determine $E[W(t,\omega)^2]$, $E[W(t,\omega)^4]$, and $E[W(t,\omega)^6]$. Compare your
solution with that for Exercise 2.1.
4.7. For a nonrandom function g(t), that is, g has no dependence upon the realizations ω
of a Wiener process W(t, ω), use Ito’s formula to show
\[ \int g(t)\,dW(t,\omega)=g(t)W(t,\omega)-\int W(t,\omega)\frac{dg}{dt}\,dt. \]
4.8. Let $I_n(t,\omega)=t^{n/2}H_n\big(W(t,\omega)/\sqrt t\big)$, where $H_n$ is the nth Hermite polynomial
(see (Kreyszig 1999, pp. 246–247) or (Abramowitz and Stegun 1965, Chap. 22)),
and the first few Hermite polynomials are $H_0(x)=1$, $H_1(x)=x$, $H_2(x)=x^2-1$,
and $H_3(x)=x^3-3x$. What are $I_0(t,\omega),\ldots,I_3(t,\omega)$? The aim of this exercise is
to show that under Ito integration the functions $I_n(t)$ are the stochastic analogue of
powers under ordinary integration. Use Ito’s formula to show that
\[ \int_0^T I_{n-1}(t,\omega)\,dW(t,\omega)=\frac1nI_n(T,\omega). \]
See how Ito integration maps $I_{n-1}$ to $\frac1nI_n$ just as ordinary integration analogously
maps the power $t^{n-1}$ to $\frac1nt^n$. Hence deduce
\[
\int\!\cdots\!\int_{0\le t_1\le\cdots\le t_n\le t}1\,dW(t_1,\omega)\,dW(t_2,\omega)\cdots dW(t_n,\omega)
=\frac1{n!}t^{n/2}H_n\big(W(t,\omega)/\sqrt t\big).
\]
Note: $H_n(x)$, among other properties, satisfy the recurrences $H'_n=nH_{n-1}$ and
$H''_n-xH'_n+nH_n=0$.
Answers to selected exercises
4.3. $I(\omega)=\tfrac13W(T,\omega)^3-TW(T,\omega)$ and $\sigma_I^2=\tfrac23T^3$.
4.5. Consider $(a+b+c)^2+(a-b)^2+(a-c)^2+(b-c)^2$.
4.6. $E[W(t,\omega)^2]=t$, $E[W(t,\omega)^4]=3t^2$, and $E[W(t,\omega)^6]=15t^3$.
4.8. $I_0=1$, $I_1=W(t,\omega)$, $I_2=W^2-t$, and $I_3=W^3-3tW$.
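The recurrence of Exercise 4.6 can be iterated mechanically, which gives a cross-check on the answers listed for it. The sketch below is illustrative only: it is in Python rather than the MATLAB/SCILAB of the book, and the function name and the (coefficient, power) representation are assumptions made for this example. It stores each even moment as $E[W(t)^k]=c\,t^p$ and applies $E[W(T)^k]=\tfrac12k(k-1)\int_0^TE[W(t)^{k-2}]\,dt$ exactly using rational arithmetic.

```python
from fractions import Fraction

def wiener_even_moments(k_max):
    """Return {k: (c, p)} meaning E[W(t)^k] = c * t**p for even k up to k_max."""
    moments = {0: (Fraction(1), 0)}  # E[W^0] = 1
    for k in range(2, k_max + 1, 2):
        c_prev, p_prev = moments[k - 2]
        # the integral of c_prev * t**p_prev over [0, T] is c_prev * T**(p_prev + 1) / (p_prev + 1)
        c = Fraction(k * (k - 1), 2) * c_prev / (p_prev + 1)
        moments[k] = (c, p_prev + 1)
    return moments

moments = wiener_even_moments(6)
for k in (2, 4, 6):
    c, p = moments[k]
    print(f"E[W^{k}] = {c} t^{p}")
```

Odd moments are omitted because they vanish by the symmetry of the Wiener process.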
Appendix A
Extra MATLAB/SCILAB Code
The following algorithms generate plots that evolve in time as they execute, and they
generate little movies. This aspect distinguishes them from the algorithms listed in the body
of the text which generate static graphs. Since they do not correspond to specific figures in
the text, they are gathered here in an appendix.
Algorithm A.1 This algorithm draws a 20 second zoom into the exponential function to
demonstrate the smoothness of our usual functions.
t=linspace(-1,1,1024);
plot(t,exp(t)),grid
title('exponential function')
t0=clock;t=0; % t is reused as the elapsed time in seconds
while t<20
  % shrink the window about the point (0,1) as time passes
  axis([0 0 1 1]+exp(-t/5)*[-1 1 -1 1])
  drawnow
  t=etime(clock,t0);
end
Algorithm A.2 This algorithm draws a 60 second zoom into Brownian motion. Use the
Brownian bridge to generate a start curve and new data as the zoom proceeds. Force the
Brownian motion to pass through the origin. Note the self-affinity as the vertical is scaled
with the square root of the horizontal. Note the infinite number of zero crossings that appear
near the original one.
tfin=60;
n=2^10+1; % increase for slower better resolution
t=linspace(-1,1,n);
h=diff(t(1:2));
x=cumsum([0,sqrt(h)*randn(1,n-1)]);
x=x-x((n+1)/2); % force through origin
j=1:2:n; i=2:2:n; k=(n+3)/4:(3*n+1)/4;
hand=plot(t,x);
title('Wiener process')
axis([-1 1 -1 1])
hold on,plot([-1.1 0 1.1],[0 0 0],'ro-'),hold off
pause
t0=clock;tt=0;
while tt<tfin
  width=exp(-tt/5);
  axis([-width width -sqrt(width) sqrt(width)])
  drawnow
  if h>width/(n/4) % interpolate new data
    h=h/2;
    x(j)=x(k); t(j)=t(k); % keep the middle half of the samples
    t(i)=0.5*(t(i-1)+t(i+1));
    x(i)=0.5*(x(i-1)+x(i+1))+sqrt(h/2)*randn(size(i)); % Brownian bridge midpoints
    set(hand,'xdata',t,'ydata',x)
  end
  tt=etime(clock,t0);
end
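The Brownian bridge step inside Algorithm A.2 can be isolated from the plotting. The following is a minimal sketch of that one idea, written in Python rather than MATLAB/SCILAB; the function name `refine_path`, the seed, and the number of refinements are assumptions for illustration. Given neighbouring samples a time `gap` apart, the midpoint is drawn with mean equal to their average and variance `gap/4`, which matches the `sqrt(h/2)` factor in the algorithm once `h` has been halved.

```python
import math
import random

def refine_path(t, x, rng):
    """Insert a Brownian-bridge midpoint between every pair of neighbouring samples."""
    new_t, new_x = [t[0]], [x[0]]
    for i in range(1, len(t)):
        gap = t[i] - t[i - 1]
        # conditional on both neighbours: mean is their average, variance is gap/4
        mid_x = 0.5 * (x[i - 1] + x[i]) + math.sqrt(gap / 4.0) * rng.gauss(0.0, 1.0)
        new_t.extend([0.5 * (t[i - 1] + t[i]), t[i]])
        new_x.extend([mid_x, x[i]])
    return new_t, new_x

rng = random.Random(6)
t = [0.0, 1.0]
x = [0.0, rng.gauss(0.0, 1.0)]  # W(0) = 0 and W(1) ~ N(0, 1)
x_end = x[-1]
for _ in range(10):
    t, x = refine_path(t, x, rng)

# sum of squared increments; for a Wiener path on [0, 1] this should be close to 1
quadratic_variation = sum((x[i] - x[i - 1]) ** 2 for i in range(1, len(x)))
print(len(x), quadratic_variation)
```

Refinement never changes existing samples, which is why the zoom in Algorithm A.2 can add detail without redrawing the coarse path.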
Algorithm A.3 This algorithm random walks from the specific point (0.7, 0.4) to show that
the walkers’ first exit locations give a reasonable sample of the boundary conditions. This
algorithm compares the numerical and exact solution $u=x^2-y^2$.
[x,y]=meshgrid(0:0.1:1);
clabel(contour(x,y,x.^2-y.^2))
hold on
m=200; % number of walkers
t=0;
z0=[0.7 0.4];
z=z0(ones(m,1),:);
plot(z(:,1),z(:,2),'b.','markersize',20)
% 'erasemode' belongs to older MATLAB graphics; drop it if your release rejects it
h=plot(z(:,1),z(:,2),'r.','erasemode','xor','markersize',20);
axis('equal'),axis([0 1 0 1])
hold off
pause
j=1:m;
t0=clock;
while length(j)>m/8
  dt=etime(clock,t0)-t;
  z(j,:)=z(j,:)+0.1*sqrt(dt)*randn(length(j),2);
  z=max(0,min(z,1)); % clamp walkers to the unit square
  t=t+dt;
  set(h,'xdata',z(:,1),'ydata',z(:,2))
  drawnow
  j=find(max(abs(z'-0.5))<0.5-1e-7); % walkers still inside
end
% pause to show all points, then discard nonexiting walks
pause(3)
j=find(max(abs(z'-0.5))>=0.5-1e-7);
set(h,'xdata',z(j,1),'ydata',z(j,2))
drawnow
actual_u=z0(1)^2-z0(2)^2
fp=z(j,1).^2-z(j,2).^2;
estimate_u=mean(fp)
error_u=std(fp)/sqrt(length(fp)-1)
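The estimate made by Algorithm A.3 can also be computed without graphics. The sketch below is an assumption-laden illustration, not a translation of the algorithm: it is in Python, it uses a fixed step size `step_sd` instead of wall-clock time steps, and it runs every walker until it exits rather than stopping early. It averages the boundary values $x^2-y^2$ at the first exit points of walkers started at (0.7, 0.4) and compares with the exact $u(0.7,0.4)=0.33$.

```python
import random

def estimate_u(start=(0.7, 0.4), n_walkers=2000, step_sd=0.02, seed=7):
    """Average x^2 - y^2 over first-exit points of random walks in the unit square."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walkers):
        x, y = start
        while 0.0 < x < 1.0 and 0.0 < y < 1.0:
            x += step_sd * rng.gauss(0.0, 1.0)
            y += step_sd * rng.gauss(0.0, 1.0)
        # project the small overshoot back onto the boundary of the square
        x = min(max(x, 0.0), 1.0)
        y = min(max(y, 0.0), 1.0)
        total += x * x - y * y
    return total / n_walkers

estimate = estimate_u()
exact = 0.7 ** 2 - 0.4 ** 2
print("estimate:", estimate, " exact:", exact)
```

The agreement is limited by sampling error, roughly 0.01 for this many walkers, and by a small bias from the overshoot at the boundary; both shrink with more walkers and smaller steps.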
Appendix B
Two Alternate Proofs
B.1 Fokker–Planck equation
This appendix provides an alternate proof of Theorem 3.4 that the Fokker–Planck
equation (3.2) governs the evolution of a PDF p(t, x) of a stochastic Ito process. This proof
uses some theory from Chapter 4 and some techniques used in continuum mechanics (see,
e.g., Roberts 1994). The proof here is more general than that of Section 3.1 in that here we
allow both drift and volatility to vary in both x and time t.
Proof. Let f(x) denote some arbitrary smooth function. For the Ito process X(t, ω),
define a new Ito process Y(t, ω) = f(X(t, ω)). Ito’s formula then tells us that
\[
\begin{aligned}
dY&=f'(X)\,dX+\tfrac12f''(X)\,dX^2\\
&=\big[f'(X)\mu(t,X)+\tfrac12f''(X)\sigma^2(t,X)\big]\,dt+f'(X)\sigma(t,X)\,dW
\end{aligned}
\]
as dX = μ(t, X)dt + σ(t, X)dW in general. Now integrate this over any time interval [a, b]
to get the Ito integral formula (4.13) for this process Y:
\[
\int_a^b dY=\int_a^b\big[f'(X)\mu(t,X)+\tfrac12f''(X)\sigma^2(t,X)\big]\,dt+\int_a^b f'(X)\sigma(t,X)\,dW.
\]
Now $\int_a^b dY=Y(b,\omega)-Y(a,\omega)=f(X(b,\omega))-f(X(a,\omega))$ from the definition Y = f(X).
Thus Ito’s integral formula asserts
\[
f(X(b,\omega))-f(X(a,\omega))=\int_a^b\big[f'(X)\mu(t,X)+\tfrac12f''(X)\sigma^2(t,X)\big]\,dt
+\int_a^b f'(X)\sigma(t,X)\,dW. \tag{B.1}
\]
Introduce the PDF p(t, x) by taking expectations: recall that
\[ E\{g(t,X)\}=\int g(t,x)p(t,x)\,dx, \]
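where the limits of the x integration are implicitly over all x. The expectation of the
left-hand side of (B.1) is
\[
\begin{aligned}
E\{f(X(b,\omega))-f(X(a,\omega))\}
&=\int f(x)p(b,x)\,dx-\int f(x)p(a,x)\,dx\\
&=\int f(x)\big[p(b,x)-p(a,x)\big]\,dx\\
&=\int f(x)\int_a^b\frac{\partial p}{\partial t}\,dt\,dx\\
&=\int_a^b\!\int f(x)\frac{\partial p}{\partial t}\,dx\,dt
\end{aligned}
\]
by the fundamental theorem of calculus that $p(b,x)-p(a,x)=\int_a^b\frac{\partial p}{\partial t}\,dt$. The factor
$\frac{\partial p}{\partial t}$ appearing in the integrand becomes the $\frac{\partial p}{\partial t}$ term in the Fokker–Planck equation (3.2).
We derive the x derivative terms in the Fokker–Planck equation from the right-hand
side of (B.1). Consider its expectation,
\[
E\Big\{\int_a^b\big[f'(X)\mu(t,X)+\tfrac12f''(X)\sigma^2(t,X)\big]\,dt\Big\}
+E\Big\{\int_a^b f'(X)\sigma(t,X)\,dW\Big\}.
\]
By the martingale property of Ito integrals (see Theorem 4.5) that $E\big\{\int_a^b\cdot\,dW\big\}=0$, the
second expectation above is zero. Hence,
\[
\begin{aligned}
E\{f(X(b,\omega))-f(X(a,\omega))\}
&=E\Big\{\int_a^b\big[f'(X)\mu(t,X)+\tfrac12f''(X)\sigma^2(t,X)\big]\,dt\Big\}\\
&=\int_a^b\Big[E\big\{f'(X)\mu(t,X)\big\}+E\big\{\tfrac12f''(X)\sigma^2(t,X)\big\}\Big]\,dt\\
&=\int_a^b\Big[\int f'(x)\mu(t,x)p(t,x)\,dx+\int\tfrac12f''(x)\sigma^2(t,x)p(t,x)\,dx\Big]\,dt.
\end{aligned}
\]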
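By integration by parts,
• $\int f'\mu p\,dx=[f\mu p]_{x=-\infty}^{\infty}-\int f\frac{\partial}{\partial x}(\mu p)\,dx=\int-f\frac{\partial}{\partial x}(\mu p)\,dx$ as the PDF p(t, x) must
go to zero as x → ±∞; if it did not, then there would be no way that the area under
the PDF could be one; whereas
• integrating by parts twice gives
\[
\int\tfrac12f''\sigma^2p\,dx
=\tfrac12\big[f'\sigma^2p-f(\sigma^2p)'\big]_{-\infty}^{\infty}+\int\tfrac12f\frac{\partial^2}{\partial x^2}(\sigma^2p)\,dx
=\int\tfrac12f\frac{\partial^2}{\partial x^2}(\sigma^2p)\,dx
\]
as again the PDF and its derivative must go to zero as x → ±∞.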
Consequently the right-hand side becomes
\[
\begin{aligned}
E\{f(X(b,\omega))-f(X(a,\omega))\}
&=\int_a^b\Big[\int-f\frac{\partial}{\partial x}(\mu p)\,dx+\int\tfrac12f\frac{\partial^2}{\partial x^2}(\sigma^2p)\,dx\Big]\,dt\\
&=\int_a^b\!\int f\Big[-\frac{\partial}{\partial x}(\mu p)+\frac12\frac{\partial^2}{\partial x^2}(\sigma^2p)\Big]\,dx\,dt.
\end{aligned}
\]
See the right-hand side of the Fokker–Planck equation (3.2) in this integrand.
Now put the two parts together. Equate the left and right-hand sides:
\[
\int_a^b\!\int f(x)\frac{\partial p}{\partial t}\,dx\,dt
=\int_a^b\!\int f(x)\Big[-\frac{\partial}{\partial x}(\mu p)+\frac12\frac{\partial^2}{\partial x^2}(\sigma^2p)\Big]\,dx\,dt.
\]
Then put all terms under one pair of integrals on the left:
\[
\int_a^b\!\int f(x)\Big[\frac{\partial p}{\partial t}+\frac{\partial}{\partial x}(\mu p)-\frac12\frac{\partial^2}{\partial x^2}(\sigma^2p)\Big]\,dx\,dt=0.
\]
All the parts of the Fokker–Planck equation (3.2) appear in this integrand, that is, all the
parts except the “= 0” bit.
Use proof by contradiction to get the = 0. Recall that f(x) is arbitrary, as is the
time interval [a, b]. The only way that the integral can be zero for all f, a, and b is if
the continuous factor in square brackets is always zero itself. To see this, suppose the
factor inside the brackets [ ] is nonzero, say positive, at some x = ξ and t = τ. Since the
expression inside the brackets [ ] is continuous, then it must be still positive in some small
time interval around τ, a < τ < b say, and in some small interval around ξ, c < ξ < d say.
Choose f(x) to be any smooth function that is positive inside the interval [c, d] and zero
outside. Then the integral $\int_a^b\!\int f(x)[\;]\,dx\,dt$ must be > 0, as the integrand is ≥ 0 and is > 0
over a finite part of the domain. This is the contradiction, as we know the integral is always
zero. Hence the supposition must be false: the factor in the brackets [ ] must be everywhere
zero. Consequently,
\[
\frac{\partial p}{\partial t}+\frac{\partial}{\partial x}(\mu p)-\frac12\frac{\partial^2}{\partial x^2}(\sigma^2p)=0,
\]
which when rearranged slightly is the Fokker–Planck equation (3.2).
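As a concrete check of the equation just derived, consider the special case of constant drift and volatility with X(0) = 0, for which the PDF is Gaussian with mean μt and variance σ²t. The following sketch evaluates the residual $\partial p/\partial t+\partial(\mu p)/\partial x-\tfrac12\partial^2(\sigma^2p)/\partial x^2$ by central finite differences. It is an illustration under those assumptions only, written in Python rather than MATLAB/SCILAB; the parameter values, the step `d`, and the sample points are arbitrary choices.

```python
import math

def gaussian_pdf(t, x, mu, sigma):
    """PDF at time t of X with dX = mu dt + sigma dW, constant mu and sigma, X(0) = 0."""
    var = sigma * sigma * t
    return math.exp(-(x - mu * t) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fokker_planck_residual(t, x, mu=0.3, sigma=0.7, d=1e-3):
    """Central-difference estimate of p_t + (mu p)_x - (sigma^2 p)_xx / 2."""
    def p(tt, xx):
        return gaussian_pdf(tt, xx, mu, sigma)
    p_t = (p(t + d, x) - p(t - d, x)) / (2.0 * d)
    p_x = (p(t, x + d) - p(t, x - d)) / (2.0 * d)
    p_xx = (p(t, x + d) - 2.0 * p(t, x) + p(t, x - d)) / (d * d)
    return p_t + mu * p_x - 0.5 * sigma ** 2 * p_xx

residuals = [fokker_planck_residual(t, x)
             for t in (0.5, 1.0, 2.0)
             for x in (-1.0, 0.0, 0.5, 1.5)]
print(max(abs(r) for r in residuals))
```

The residual is not exactly zero because of finite-difference truncation and rounding, but it should be orders of magnitude smaller than the individual terms.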
B.2 Kolmogorov backward equation
This section provides an alternate proof of Theorem 3.15 that the Kolmogorov backward
equation (3.11) governs the evolution of the conditional PDF p(t, x|s, y) of a stochastic Ito
process.
Proof. Consider the conditional PDF p(τ, ξ|s, y) for any ξ and y and for any times s <τ .
Recall that pdξ gives the probability of passing through x = ξ at time t = τ given the Ito
process X(t, ω) starts at x = y at time t =s , that is, X(s, ω) =y.
Define a new Ito process Z(t, ω) = p(τ, ξ|t, X(t, ω)) for s ≤ t ≤ τ . This is fine
because the conditional PDF p is some smooth function—we may or may not know what
it is, but it does exist. The Ito process Z also varies with ξ and τ, but we focus on its
t dependence. Apply Ito’s formula, remembering dX =μdt +σdW:
\[
\begin{aligned}
dZ&=\left.\frac{\partial p}{\partial s}\right|_{(\tau,\xi|t,X)}dt
+\left.\frac{\partial p}{\partial y}\right|_{(\tau,\xi|t,X)}dX
+\frac12\left.\frac{\partial^2p}{\partial y^2}\right|_{(\tau,\xi|t,X)}dX^2\\
&=\Big[\frac{\partial p}{\partial s}+\mu(t,X)\frac{\partial p}{\partial y}+\frac12\sigma^2(t,X)\frac{\partial^2p}{\partial y^2}\Big]_{(\tau,\xi|t,X)}dt
+\left.\frac{\partial p}{\partial y}\right|_{(\tau,\xi|t,X)}\sigma(t,X)\,dW.
\end{aligned}
\]
See that the terms in the Kolmogorov backward equation (3.11) appear in the dt term; our
task is to disentangle them somehow. For simplicity define the new Ito process
\[
K(t,\omega)=\Big[\frac{\partial p}{\partial s}+\mu(t,X)\frac{\partial p}{\partial y}+\frac12\sigma^2(t,X)\frac{\partial^2p}{\partial y^2}\Big]_{(\tau,\xi|t,X)},
\]
which contains the terms of the Kolmogorov backward equation (3.11). Our goal is then to
extract K from all the other terms.
Integrate over any time interval [a, b] such that s ≤ a < b ≤ τ to deduce
\[
Z(b,\omega)-Z(a,\omega)=\int_a^b K\,dt+\int_a^b\left.\frac{\partial p}{\partial y}\right|_{(\tau,\xi|t,X)}\sigma(t,X)\,dW.
\]
Disentangling K is done by taking the expectation of this equation:
\[
E\{Z(b,\omega)\}-E\{Z(a,\omega)\}=\int_a^b E\{K\}\,dt
+E\Big\{\int_a^b\left.\frac{\partial p}{\partial y}\right|_{(\tau,\xi|t,X)}\sigma(t,X)\,dW\Big\}. \tag{B.2}
\]
First, by the martingale property of Ito integrals (see Theorem 4.5) that $E\big\{\int_a^b\cdot\,dW\big\}=0$,
the last integral above vanishes. Second, consider E{Z(t, ω)}: from the definition of Z and
an expectation,
\[
E\{Z(t,\omega)\}=\int p(\tau,\xi|t,x)\,p(t,x|s,y)\,dx=p(\tau,\xi|s,y),
\]
as the integral is the probability of going from (s, y) to (t, x), and then from (t, x) to (τ, ξ),
integrated over all possible x values. Thus the integral must be just the probability of going
from (s, y) to (τ, ξ). But the right-hand side p(τ, ξ|s, y) is independent of the intermediate
time t, and hence the expectation E{Z(t, ω)} is constant in t. Consequently, its change
over the interval E{Z(b, ω)} − E{Z(a, ω)} = 0. Thus, equation (B.2) becomes
\[ 0=\int_a^b E\{K\}\,dt. \]
But this integral is zero for all intervals [a, b], s ≤ a < b < τ, and so, by the same argument
as used to prove the Fokker–Planck equation, the integrand
\[ E\{K(t,\omega)\}=0 \quad\text{for all } s<t<\tau. \]
We have extracted K(t, ω) at the expense of the expectation.
But K(t, ω) is a continuous Ito process, so the expectation E{K(t, ω)} must also be
continuous; hence E{K(t, ω)} = 0 for time t = s. Take the limit as t → s; then X(t, ω) → y
and
\[
E\{K(s,\omega)\}=E\Big\{\Big[\frac{\partial p}{\partial s}+\mu(s,y)\frac{\partial p}{\partial y}+\frac12\sigma^2(s,y)\frac{\partial^2p}{\partial y^2}\Big]_{(\tau,\xi|s,y)}\Big\}=0.
\]
At t = s there is nothing stochastic remaining inside the expectation. So the expectation is
irrelevant and we deduce
\[
\frac{\partial p}{\partial s}+\mu(s,y)\frac{\partial p}{\partial y}+\frac12\sigma^2(s,y)\frac{\partial^2p}{\partial y^2}=0,
\]
which must be satisfied for all τ, ξ, s, and y, as they are arbitrary, and hence proves the
Kolmogorov backward equation (3.11).
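The key step in the proof above is that $E\{Z(t,\omega)\}=\int p(\tau,\xi|t,x)\,p(t,x|s,y)\,dx$ does not depend on the intermediate time t. The following sketch checks this numerically in the special case of constant drift and volatility, where the conditional PDF is Gaussian. It is an illustration under that assumption only, in Python rather than MATLAB/SCILAB; the function names, parameter values, and the quadrature grid are arbitrary choices.

```python
import math

def transition_pdf(t, x, s, y, mu=0.2, sigma=0.9):
    """Conditional PDF p(t, x | s, y) for dX = mu dt + sigma dW with constant mu, sigma."""
    var = sigma * sigma * (t - s)
    return math.exp(-(x - y - mu * (t - s)) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def expectation_of_z(t, tau=1.0, xi=0.4, s=0.0, y=-0.1, n=4000, half_width=12.0):
    """Trapezoidal estimate of the integral over x of p(tau, xi | t, x) p(t, x | s, y)."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * dx
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * transition_pdf(tau, xi, t, x) * transition_pdf(t, x, s, y)
    return total * dx

direct = transition_pdf(1.0, 0.4, 0.0, -0.1)  # p(tau, xi | s, y)
via_intermediate = [expectation_of_z(t) for t in (0.1, 0.5, 0.9)]
print(direct, via_intermediate)
```

All three intermediate times should reproduce the direct value to within quadrature error, which is what makes $E\{Z(b,\omega)\}-E\{Z(a,\omega)\}$ vanish in equation (B.2).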
Bibliography
Abramowitz, M. and Stegun, I. A., eds. (1966), Handbook of Mathematical Functions, Dover, New York.
Higham, D. J. (2001), An algorithmic introduction to numerical simulation of stochastic differential equations, SIAM Review 43(3), 525–546. Available online at http://link.aip.org/link/?SIR/43/525/1
Higham, D. J. (2008), Modeling and simulating chemical reactions, SIAM Review 50(2), 347–368. Available online at http://link.aip.org/link/?SIR/50/347/1
Kao, E. P. C. (1997), An Introduction to Stochastic Processes, Duxbury Press, Pacific Grove, CA.
Kloeden, P. E. and Platen, E. (1992), Numerical Solution of Stochastic Differential Equations, Vol. 23 of Applications of Mathematics, Springer-Verlag, Berlin.
Kreyszig, E. (1999), Advanced Engineering Mathematics, 8th ed., Wiley, New York.
Øksendal, B. K. (1998), Stochastic Differential Equations: An Introduction with Applications, Springer-Verlag, Berlin.
Roberts, A. J. (1994), A One-Dimensional Introduction to Continuum Mechanics, World Scientific, River Edge, NJ.
Stampfli, J. and Goodman, V. (2001), The Mathematics of Finance: Modeling and Hedging, Brooks/Cole, Pacific Grove, CA.
Index
Page numbers in italics denote the page of definition of the term.
adapted, 95, 97, 106
advection-diffusion equation, 56, 85
antidifferentiation, 43
arbitrage, 20, 22, 22, 23, 25, 30, 32, 36
asset, 4, 9–11, 13, 17, 21–23, 25, 27, 30,
48–51, 54, 55, 58, 59, 78, 79
asset price, 15, 23–26, 28–30, 35, 37,
48, 52–54, 60, 67, 78
binomial lattice, 20, 28, 31, 35–37, 49,
53, 54
birth and death process, 66, 74, 90
Black–Scholes equation, 49, 50, 51, 54,
58, 60, 78, 79, 85, 90
Brownian motion, 3, 5, 6, 6, 64, see also
Wiener process
exponential, 15, 18, 29, 35, 40, 63
call option, 21, 22, 24–27, 30, 31, 35,
36, 48, 49, 51, 54, 78, 79
Cauchy distribution, 90
Cauchy sequence, 100, 101, 103
chain rule, 48, 51, 53
Chapman–Kolmogorov equation, 85
conditional PDF, 84, 121, 122
differential, 10, 11, 14, 45–48, 57, 93, 94
diffusion, 51, 61, 64, 68, 75, 76
Dirichlet problem, 64
Doob–Meyer decomposition, 13, 14, 84
drift, 10, 12–14, 33, 40, 46, 49, 66, 68,
75, 80, 82, 84, 85, 106–108,
119
Euler method, 15, 18, 20, 95
Euler–Cauchy differential equation, 53
exercise price, 21, 35, 37
Feynman–Kac formula, 76, 77, 90
filtration, 97
Fokker–Planck equation, 63, 64, 66, 68,
69, 71, 72, 74, 77, 85, 87, 90,
119–122
forward contract, 22, 23, 35, 50
Gamma function, 71
Gaussian, 3, 61, 63, 70, 84, see also
normally distributed
hedge ratio, 26, 27, 28, 35–37, 54
Hermite polynomial, 57, 113
interest rate, 11, 23, 26, 35, 49, 54, 59,
60
Ito integral, 94–96, 96, 97–99, 101–103,
103, 105–107, 113, 119, 120,
122
Ito isometry, 98, 99, 103, 105, 106, 109,
112, 113
Ito process, 13, 48, 49, 54, 57, 65, 66,
87, 93, 94, 100, 102, 106, 119,
121, 123
Ito’s formula, 43, 45, 46–49, 57, 94,
107, 113, 114, 119, 122
Ito’s lemma, 46
knock out, 51, 60
Kolmogorov backward equation, 85, 86,
90, 121–123
Kolmogorov forward equation, 66, see
also Fokker–Planck equation
Langevin equation, 10
Laplace’s equation, 64
Linearity, 97
linearity, 98, 103, 105, 113
Liouville equation, 68
Malthus, 74
Malthusian model, 74
Markov chain, 15, 22, 23, 29, 61, 66, 83,
84
martingale property, 97, 106, 113, 120
MATLAB, x, 3, 11, 15, 17, 31–34, 36,
37, 60, 79, 82, 95, 115
measurable, 95, 97, 106
norm, 102, 105
normally distributed, 3, 5, 7, 17, see also
Gaussian
ODE, see ordinary differential equation
ordinary differential equation (ODE), 3,
10, 41, 53, 70–72, 80–82
Ornstein–Uhlenbeck process, 34, 68, 69,
70, 84, 90, 91
partial differential equation (PDE), 31,
49–51, 53, 58, 61, 63–65, 75,
76, 78, 79, 83, 86, 90
PDE, see partial differential equation
PDF, see probability distribution
function
portfolio, 22, 24–28, 30, 35–37, 48, 49,
54–56, 59, 60
previsible, 95, 97, 106
probability distribution function (PDF),
57, 61, 63–66, 69–73, 75, 77,
83–87, 90, 91, 119, 120, 122
product rule, 47, 75
put option, 21, 36, 36, 37, 51
random walk, 6, 64, 76, 79, 90
risk free, 20, 22, 23–28, 30, 32, 36, 37,
48, 49
SCILAB, x, 3, 11, 15, 17, 31–34, 36, 37,
55, 60, 79, 82, 95, 115
SDE, see stochastic differential equation
self-financing, 55, 56, 60
step function, 95, 96, 97, 98, 100–105,
108, 113
stochastic calculus, 6, 7, 43, 47, 93, 110
stochastic differential equation (SDE), 3,
6, 7, 10, 11, 13–15, 17, 18, 20,
22, 28, 31, 32, 34, 35, 39–43,
45, 63–66, 68, 69, 71, 74–84,
86, 87, 90, 95, 106, 112
stochastic process, 1, 3, 5, 13, 46, 54,
57, 63, 65, 84, 87, 97, 106
stock drift, 11, 13, 15, 34, 87
stock volatility, 11, 13, 15, 34, 38, 78, 87
Stratonovich sense, 10
strike price, 21, 24, 26, 30, 31, 36, 54, 79
Taylor series, 31, 41, 45, 46, 67, 90, 108
union, 97, 106, 113
volatility, 9, 10–15, 29, 33, 49, 51, 66,
68, 84, 90, 119
white noise, 10, see also Wiener process
Wiener process, 3, 5, 6, 7, 9, 10, 18, 20,
37, 44–46, 57, 61, 63, 64, 76,
84, 87, 95, 97, 98, 106, 107,
112, 113


Editor-in-Chief
Richard Haberman Southern Methodist University

Editorial Board
Alejandro Aceves, Southern Methodist University
Andrea Bertozzi, University of California, Los Angeles
Bard Ermentrout, University of Pittsburgh
Thomas Erneux, Université Libre de Brussels
Bernie Matkowsky, Northwestern University
Robert M. Miura, New Jersey Institute of Technology
Michael Tabor, University of Arizona

Elementary Calculus of Financial Mathematics
A. J. Roberts
University of Adelaide
Adelaide, South Australia, Australia

Society for Industrial and Applied Mathematics

Copyright © 2009 by the Society for Industrial and Applied Mathematics.

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688 USA.

Trademarked names may be used in this book without the inclusion of a trademark symbol. These names are used in an editorial context only; no infringement of trademark is intended.

Maple is a registered trademark of Waterloo Maple, Inc.

MATLAB is a registered trademark of The MathWorks, Inc. For MATLAB product information, please contact The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA 01760-2098 USA, 508-647-7000, Fax: 508-647-7001, info@mathworks.com, www.mathworks.com.

Library of Congress Cataloging-in-Publication Data
Roberts, A. J.
Elementary calculus of financial mathematics / A. J. Roberts.
p. cm. -- (Mathematical modeling and computation ; 15)
Includes bibliographical references and index.
ISBN 978-0-898716-67-2
1. Finance--Mathematical models. 2. Stochastic processes. 3. Investments--Mathematics. 4. Calculus. I. Title.
HG106.R63 2009
332.01'51923--dc22 2008042349

SIAM is a registered trademark.

To Barbara, Sam, Ben, and Nicky for their support over the years.


Contents

Preface
List of Algorithms

1 Financial Indices Appear to Be Stochastic Processes
    1.1 Brownian motion is also called a Wiener process
    1.2 Stochastic drift and volatility are unique
    1.3 Basic numerics simulate an SDE
    1.4 A binomial lattice prices call option
    1.5 Summary
    Exercises

2 Ito’s Stochastic Calculus Introduced
    2.1 Multiplicative noise reduces exponential growth
    2.2 Ito’s formula solves some SDEs
    2.3 The Black–Scholes equation prices options accurately
    2.4 Summary
    Exercises

3 The Fokker–Planck Equation Describes the Probability Distribution
    3.1 The probability distribution evolves forward in time
    3.2 Stochastically solve deterministic differential equations
    3.3 The Kolmogorov backward equation completes the picture
    3.4 Summary
    Exercises

4 Stochastic Integration Proves Ito’s Formula
    4.1 The Ito integral $\int_a^b f\,dW$
    4.2 The Ito formula
    4.3 Summary
    Exercises

Appendix A Extra MATLAB/SCILAB Code

Appendix B Two Alternate Proofs
    B.1 Fokker–Planck equation
    B.2 Kolmogorov backward equation

Bibliography
Index

Preface

Welcome! This book leads you on an introduction into the fascinating realm of financial mathematics and its calculus. Modern financial mathematics relies on a deep and sophisticated theory of random processes in time. Such randomness reflects the erratic fluctuations in financial markets. I take on the challenge of introducing you to the crucial concepts needed to understand and value financial options among such fluctuations. This book supports your learning with the bare minimum of necessary prerequisite mathematics.

To deliver understanding with a minimum of analysis, the book starts with a graphical/numerical introduction to how to adapt random walks to describe the typical erratic fluctuations of financial markets. Then simple numerical simulations both demonstrate the approach and suggest the symbology of stochastic calculus. The finite steps of the numerical approach underlie the introduction of the binomial lattice model for evaluating financial options.

Fluctuations in a financial environment may bankrupt businesses that otherwise would grow. Discrete analysis of this problem leads to the surprisingly simple extension of classic calculus needed to perform stochastic calculus. The key is to replace squared noise by a mean drift: in effect, $dW^2 = dt$. This simple but powerful rule enables us to differentiate, integrate, solve stochastic differential equations, and to triumphantly derive and use the Black–Scholes equation to accurately value financial options.

The first two chapters deal with individual realizations and simulations. However, some applications require exploring the distribution of possibilities. The Fokker–Planck and Kolmogorov equations link evolving probability distributions to stochastic differential equations (SDEs). Such transformations empower us not only to value financial options but also to model the natural fluctuations in biology models and to approximately solve differential equations using stochastic simulation.

Lastly, the formal rules used previously are justified more rigorously by an introduction to a sound definition of stochastic integration. Integration in turn leads to a sound interpretation of Ito’s formula that we find so useful in financial applications.

Prerequisites

Basic algebra, calculus, data analysis, probability and Markov chains are prerequisites for this course. There will be many times throughout this book when you will need the concepts and techniques of such courses. Be sure you are familiar with those, and have appropriate references on hand.

Computer simulations

Incorporated into this book are MATLAB/SCILAB scripts to enhance your ability to probe the problems and concepts presented and thus to improve learning. You can purchase MATLAB from the MathWorks company, http://www.mathworks.com. SCILAB is available for free via http://www.scilab.org.

A. J. Roberts

List of Algorithms

1.1 MATLAB/SCILAB code to plot m realizations of a Brownian motion/Wiener process as shown in Figure 1.3. In SCILAB use rand(.,.,"n") instead of randn(.,.) for N(0, 1) distributed random numbers.

1.2 MATLAB/SCILAB code to draw five stochastic processes, all scaled from the one Wiener process, with different drifts and volatilities.

1.3 In MATLAB/SCILAB, this code estimates the stock drift and stock volatility from a time series of values s, given h is the time step. In SCILAB use $ instead of end.

1.4 Code for simulating five realizations of the financial SDE dS = αS dt + βS dW. The for-loop steps forward in time. As a second subscript to an array, the colon forms a row vector over all the realizations.

1.5 Code for starting one realization with very small time step, then repeat with time steps four times as long as the previous.

1.6 MATLAB/SCILAB code for a four-step binomial lattice estimate of the value of a call option, from Example 1.9, on an asset with initial price S0 = 35, strike price X = 38.50 after one year in which the asset fluctuates by a factor of 1.25 and with a bond rate of 12%.

2.1 Code for determining the value of the call option in Example 2.10. Observe the use of nan to introduce unspecified boundary conditions that turn out to be irrelevant on the selected grid just as they are for the binomial model. In SCILAB use %nan instead of nan.

3.1 Example MATLAB/SCILAB code using (3.8) to stochastically estimate the value of a call option.

3.2 MATLAB/SCILAB code to solve a boundary value problem ODE by its corresponding SDE. Use find to evolve only those realizations within the domain 0 < X < 3. Continue until all realizations reach one boundary or the other. Estimate the expectation of the boundary values using the conditional vectors x<=0 and x>=3 to account for the number of realizations reaching each boundary.

4.1 MATLAB/SCILAB code (essentially an Euler method) to numerically evaluate $\int_0^t W(s)\,dW(s)$ for 0 < t < 1 to draw Figure 4.1.

A.1 This algorithm draws a 20 second zoom into the exponential function to demonstrate the smoothness of our usual functions.

A.2 This algorithm draws a 60 second zoom into Brownian motion. Use the Brownian bridge to generate a start curve and new data as the zoom proceeds. Force the Brownian motion to pass through the origin. Note the self-affinity as the vertical is scaled with the square root of the horizontal. Note the infinite number of zero crossings that appear near the original one.

A.3 This algorithm random walks from the specific point (0.7, 0.4) to show that the walkers first exit locations given a reasonable sample of the boundary conditions. This algorithm compares the numerical and exact solution $u = x^2 - y^2$.

Chapter 1
Financial Indices Appear to Be Stochastic Processes

Contents
1.1 Brownian motion is also called a Wiener process
1.2 Stochastic drift and volatility are unique
1.3 Basic numerics simulate an SDE
    1.3.1 The Euler method is the simplest
    1.3.2 Convergence is relatively slow
1.4 A binomial lattice prices call option
    1.4.1 Arbitrage value of forward contracts
    1.4.2 A one step binomial model
    1.4.3 Use a multiperiod binomial lattice for accuracy
1.5 Summary
Exercises
Answers to selected exercises

    Earnings momentum and visibility should continue to propel the market to new highs.
        A forecast issued by E. F. Hutton, the Wall Street brokerage firm, moments before the stock market plunged on October 19, 1987.

There is overwhelming evidence that share prices, financial indices, and currency values fluctuate wildly and randomly. An example is the range of daily cotton prices shown in Figure 1.1; another is the range of wheat prices shown in Figure 1.2. These figures demonstrate that randomness appears to be an integral part of the dynamics of the financial market. In a dissertation that was little appreciated at the time, the all-pervading randomness in finance was realized by Bachelier circa 1900.

Prices evolving in time form a stochastic process because of these strong random fluctuations. This book develops the elementary theory of such stochastic processes in order to underpin the famous Black–Scholes equation for valuing financial options, here described by Tony Dooley:

    . . . the most-used formula in all of history is the Black–Scholes formula. Each day, it is used millions of times in millions of computer programs, making it more used than Pythagoras’ theorem!

[Figure 1.1: plot of S(t) = price of cotton (US$) against year, 1970 to 2000.]
Figure 1.1. Opening daily prices S(t) of cotton over nearly 30 years showing the strong fluctuations typical of financial markets (note that there are about 21 trading days per month, that is, 250 per year).

[Figure 1.2: plot of the price of wheat (US$) against year, 1920 to 2010.]
Figure 1.2. The range of wheat prices over nearly 90 years. The vertical bars give the range in each year; the tick on the left of each bar is the opening price, and the tick on the right is the closing price. The red curve is the 4-year average, blue is the 9-year average, and cyan is the 18-year average.

The financial world is not the only example of significant random fluctuations. In biology we generate differential equations representing the interactions of predators and prey, for example, foxes and rabbits. Such models are often written as ordinary differential equations (ODEs) of the form dx/dt = · · · and dy/dt = · · · , where the right-hand side dots denote life and death interactions. However, biological populations live in an environment with random events such as drought, flood, or meteorite impact. There are fluctuations in the populations due to such unforeseen events. Random events are especially dangerous for populations of endangered species in which there are relatively few individuals.

Engineers may also need to analyze problems with random inputs. A truck driving along a road shakes from a variety of causes, one of which is travelling over the essentially random bumps in the road. In many aspects the truck’s design must account for such stochastic vibrations.

These examples show that the study of stochastic differential equations (SDEs) is worthwhile. Note that the use of SDEs is an admission of ignorance of the nature of fluctuations. There are a multitude of unknown processes which influence the phenomena of interest. Under the central limit theorem, we assume that these unknown influences accumulate to become normally distributed, alternatively called a Gaussian distribution.

1.1 Brownian motion is also called a Wiener process

    Nothing in Nature is random . . . A thing appears random only through the incompleteness of our knowledge.
        Spinoza

A starting point to describe stochastic processes such as a stock price is Brownian motion or, more technically, a Wiener process. Figure 1.3 shows an example of this process. See the roughly qualitative similarities to the cotton prices shown in Figure 1.1 (though the cotton prices appear to fluctuate more) and to the wheat prices in Figure 1.2.

Suggested activity: Do at least Exercise 1.1.

Algorithm 1.1 MATLAB/SCILAB code to plot m realizations of a Brownian motion/Wiener process as shown in Figure 1.3. In SCILAB use rand(.,.,"n") instead of randn(.,.) for N(0, 1) distributed random numbers.

    m=5; n=300;
    t=linspace(0,1,n+1)';
    h=diff(t(1:2));
    dw=sqrt(h)*randn(n,m);
    w=cumsum([zeros(1,m);dw]);
    plot(t,w)

Where do fluctuations in the financial indices come from? Many economic theorists assert that fluctuations reflect the random arrival of new knowledge. As new knowledge is made known to the traders of stocks and shares, they almost instantly assimilate the knowledge and buy and sell accordingly. However, according to these graphs it would seem

that at least 95% of the “new knowledge” in a day is self-contradictory! The independent fluctuations from one day to the next say to me that whatever a person discovers one day is meaningless the next. This means to me that any such knowledge is worthless and refutes the notion of genuine “new knowledge.”1

But there is an alternative model for the fluctuations. Recent computer simulations2 show that when many interacting agents try to outsmart each other to obtain assets the result is mayhem. Competing agents individually generate widely fluctuating valuations of the asset. No new knowledge need be hypothesized, just greed. Fortunately, it appears that the average valuation of the asset is realistic, though no proof of this is known. Thus a recent view of financial prices is that the average valuation is reasonable, with fluctuations due to agents competing with each other.

Many agents look at financial indices and see trends and patterns on which they base their recommendations. One recommendation which I heard in a public presentation was based on Fibonacci numbers. I view this sort of trend analysis as gibberish. The problem is that humans are very susceptible to seeing patterns which do not exist. Random fluctuations often seem to have short-term patterns, such as seen in real share prices. Remember that a random sequence must, accidentally, have some chance patterns, as you see in Figure 1.3. Despite all the analysis by “financial experts,” sound statistical analysis shows that there are very few, if any, patterns over time in the fluctuations of the stock market. The fluctuations are, to a very good approximation, independent from day to day. We assume independence in our development of suitable mathematics.

[Figure 1.3: plot of W(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.3. Five realizations of Brownian motion (Wiener process) W(t) generated by Algorithm 1.1. Most figures in this book plot five realizations of a random process to hint that stochastic processes, such as this Wiener process, are a composite of an uncountable infinity of individual realizations.

1 See Why economic theory is out of whack by Mark Buchanan in the New Scientist, 19 July 2008, to read a little on financial markets’ lack of reaction to news.
2 Read about the “El Farol” problem by W. B. Arthur, Amer. Econ. Assoc. Pap. Proc., 84, p. 406 (1994).
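The construction in Algorithm 1.1 can also be sketched in Python using only the standard library. This is an illustrative analogue and not the book's code: the function name `wiener_paths`, its arguments, and the fixed seed are assumptions made here, and the plot is replaced by printing the end value of each realization.

```python
import math
import random


def wiener_paths(m=5, n=300, t_end=1.0, seed=0):
    """Return (t, w): n+1 times and m realizations of a discrete Wiener process.

    w[k][j] approximates W(t[j]) for realization k, built as in Algorithm 1.1
    by cumulatively summing increments sqrt(h)*Z with Z ~ N(0, 1).
    """
    rng = random.Random(seed)      # seeded so that runs are repeatable
    h = t_end / n                  # time step
    t = [j * h for j in range(n + 1)]
    w = []
    for _ in range(m):
        path = [0.0]               # every realization starts at W(0) = 0
        for _ in range(n):
            path.append(path[-1] + math.sqrt(h) * rng.gauss(0.0, 1.0))
        w.append(path)
    return t, w


t, w = wiener_paths()
for k, path in enumerate(w):
    print(f"realization {k}: W(1) = {path[-1]:+.3f}")
```

Each inner list plays the role of one column of `w` in the MATLAB/SCILAB listing; passing `t` and the paths to any plotting tool should give a picture qualitatively like Figure 1.3.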

The Wiener process

Now we return to our main task, which is to understand the nature of Brownian motion (Wiener process). The stochastic process shown in Figure 1.3 is named after the British botanist Robert Brown, who first reported the motion in 1827 when observing in his microscope the movement of tiny pollen grains due to small but incessant and random impacts of water molecules. Wiener formalized its properties in the 20th century.

Algorithm 1.1 generates the realizations shown in Figure 1.3 by dividing time into small steps of length $\Delta t = h$ so that the $j$th time step reaches time $t_j = jh$ (assuming $t = 0$ is the start of the period of interest). Then the algorithm determines $W_j$, the value of the process at time $t_j$, by adding up many independent and normally distributed increments:3

$$W_{j+1} = W_j + \sqrt{h}\,Z_j \quad\text{where } Z_j \sim N(0,1) \text{ and with } W_0 = 0.$$

We generate a Brownian motion, or Wiener process, $W(t)$ in the limit as the step size $h \to 0$ so that the number of steps, e.g., $1/h$, becomes large. The random increments to $W$, namely $\Delta W_j = \sqrt{h}\,Z_j$, are chosen to scale with $\sqrt{h}$ because it eventuates that this scaling is precisely what is needed to generate a reasonable limit as $h \to 0$, as we now see.

Consider the process $W$ at some fixed time $t = nh$; that is, we take $n$ steps of size $h$ to reach $t$, and hence $W(t)$ is approximated by $n$ random increments of variance $h$:

$$\begin{aligned}
W_n &= W_{n-1} + \sqrt{h}\,Z_{n-1}\\
&= W_{n-2} + \sqrt{h}\,Z_{n-2} + \sqrt{h}\,Z_{n-1}\\
&= \cdots\\
&= W_0 + \sum_{j=0}^{n-1} \sqrt{h}\,Z_j .
\end{aligned}$$

Now firstly $W_0 = 0$ and secondly we know that a sum of normally distributed random variables is a normally distributed random variable with mean given by the sum of means, and variance given by the sum of the variances. Since all the increments $Z_j \sim N(0,1)$, then $\sqrt{h}\,Z_j \sim N(0,h)$, and the sum of $n$ such increments is

$$W(t) = \sum_{j=0}^{n-1} \sqrt{h}\,Z_j \sim N(0, nh) = N(0, t).$$

Thus we deduce that $W(t) \sim N(0,t)$. By taking random increments $\propto \sqrt{h}$ we find that the distribution of $W$ at any time is fixed, that is, independent of the number of discrete steps we took to get to that time. Thus considering the step size $h \to 0$ is a reasonable limit.

Similar arguments show that the change in $W$ over any time interval of length $t$, $W(t+s) - W(s)$, is also normally distributed $N(0,t)$ and, further, is independent of any details of the process that occurred for times before time $s$.

3 Recall that saying a random variable $Z \sim N(a,b)$ means that $Z$ is normally distributed with mean $a$ and variance $b$ (standard deviation $\sqrt{b}$).
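The claim that the distribution of W(t) does not depend on the number of steps can be checked numerically. The following Python sketch (standard library only) is an illustration under assumptions made here, not a listing from the book: the helper name `sample_W`, the sample size of 4000, and the seed are arbitrary choices. It builds W(1) from n = 1, 10, and 100 increments of size sqrt(h) Z_j and reports the sample mean and variance for each n.

```python
import math
import random
import statistics


def sample_W(t, n, rng):
    """One sample of W(t) built from n increments sqrt(h)*Z_j with h = t/n."""
    h = t / n
    return sum(math.sqrt(h) * rng.gauss(0.0, 1.0) for _ in range(n))


rng = random.Random(1)   # fixed seed so that runs are repeatable
t = 1.0
results = {}
for n in (1, 10, 100):
    samples = [sample_W(t, n, rng) for _ in range(4000)]
    results[n] = (statistics.fmean(samples), statistics.pvariance(samples))
    print(f"n = {n:3d}: mean = {results[n][0]:+.3f}, variance = {results[n][1]:.3f}")
```

For every n the mean should come out near 0 and the variance near t = 1, up to sampling error. Replacing sqrt(h) by h in `sample_W` would instead make the variance shrink like 1/n, which is one way to see why the square-root scaling is the one that survives the limit h → 0.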

To see this normal distribution, suppose time has been discretized with steps $\Delta t = h$; the time $s$ corresponds to step $j = \ell$, and the time $s + t$ corresponds to step $j = \ell + n$. Then

$$\begin{aligned}
W_{\ell+n} &= W_{\ell+n-1} + \sqrt{h}\,Z_{\ell+n-1}\\
&= W_{\ell+n-2} + \sqrt{h}\,Z_{\ell+n-2} + \sqrt{h}\,Z_{\ell+n-1}\\
&= \cdots\\
&= W_\ell + \sum_{j=\ell}^{\ell+n-1} \sqrt{h}\,Z_j .
\end{aligned}$$

This sum of $n$ increments is distributed $N(0, nh) = N(0,t)$ as before, and so $W(t+s) - W(s) = W_{\ell+n} - W_\ell \sim N(0,t)$, as required. Now, further, see that the sum of the increments above involves only random variables $Z_j$ for $\ell \le j < \ell + n$, which are completely independent of the random increments $\sqrt{h}\,Z_j$ for $j < \ell$, and hence are completely independent of the details of $W(t)$ for times $t \le s$. That is, after time $t = s$ the changes in the process $W$ from $W(s)$ are independent of $W(t)$ for earlier times $t \le s$. This last property of independence is vitally important in various crucial places in the development of SDEs.

Brownian motion has a prime role in stochastic calculus because of the central limit theorem. In this context, if the increments in $W$ follow some other distribution on the smallest “micro” times, then provided the variance of the microincrements is finite, the process always appears Brownian on the macroscale. This is assured because the sum of many random variables with finite variance tends to a normal distribution. Indeed, in Section 1.4 and others following we approximate the increments as the binary choice of either a step up or a step down. In effect we choose $Z_j = \pm 1$, each with probability $\tfrac12$; as the mean and variance of such a $Z_j$ are zero and one, respectively, the cumulative sum $\sum_{j=0}^{n-1} \sqrt{h}\,Z_j$ has the appearance on the macroscale of an $N(0, nh)$ random variable.

Continuous but not differentiable

Wiener proved that such a process exists and is unique in a stochastic sense.

Definition 1.1. Brownian motion or Wiener process or random walk, usually denoted $W(t)$, satisfies the following properties:
• $W(t)$ is continuous;
• $W(0) = 0$;
• the change $W(t+s) - W(s) \sim N(0,t)$ for $t, s \ge 0$; and
• $W(t+s) - W(s)$ is independent of any details of the process for times earlier than $s$.

The only property that we have not seen is that $W(t)$ is continuous, so we investigate it now. We know that $W(t+s) - W(s) \sim N(0,t)$, so we now imagine $t$ as small and write this as

$$W(t+s) - W(s) = \sqrt{t}\,Z_t ,$$

where the random variables $Z_t \sim N(0,1)$. Now although $Z_t$ will vary with $t$, it comes from a normal distribution with mean 0 and variance 1, so as $t \to 0$, then almost surely the right-hand side $\sqrt{t}\,Z_t \to 0$. Thus, almost surely $W(t+s) \to W(s)$ as $t \to 0$, and hence $W$ is continuous (almost surely).

Although it is continuous we now demonstrate that a Wiener process, such as that shown in Figure 1.4 (left column), is too jagged to be differentiable.4 As Figure 1.4 (right column) shows, recall that for a smooth function such as $f(t) = e^t$ we generally see a linear variation near any point, for example, $f(t+s) - f(s) = e^{t+s} - e^s = (e^t - 1)e^s \approx t e^s$, or more generally, $f(t+s) - f(s) \approx t f'(s)$. Thus we are familiar with $f(t+s) - f(s)$ decreasing linearly with $t$, and upon this is based all the familiar rules of differential and integral calculus. In contrast, in the Wiener process, and generally for solutions of SDEs, Figure 1.4 (left) shows $W(t+s) - W(s)$ decreasing more slowly, like $\sqrt{t}$. Thus Wiener processes are much steeper and vastly more jagged than smooth differentiable functions. Notionally the Wiener process has “infinite slope” and is thus nowhere differentiable. This feature generates lots of marvelous new effects that make stochastic calculus enormously intriguing.

Example 1.2. Pick a normally distributed random variable $Z \sim N(0,1)$, then define $W(t) = Z\sqrt{t}$. Is $W(t)$ a Wiener process?

Solution: No; although $W(t) \sim N(0,t)$, it does not satisfy all the properties of a Wiener process. Consider each property in turn:
• true, $W(t)$ is clearly continuous;
• true, $W(0) = 0$ is satisfied;
• but $W(t+s) - W(s) = Z(\sqrt{t+s} - \sqrt{s})$, which has variance $(\sqrt{t+s} - \sqrt{s})^2 = t + 2s - 2\sqrt{s(t+s)} \ne t$ whenever $s > 0$, and so $W(t)$ does not satisfy the third property of a Wiener process (nor the fourth, incidentally).

Example 1.3. Let $W(t)$ and $\tilde W(t)$ be independent Wiener processes and $\rho$ a fixed number $0 < \rho < 1$. Show that the linear combination $X(t) = \rho W(t) + \sqrt{1-\rho^2}\,\tilde W(t)$ is a Wiener process.

Solution: Look at the properties in turn.
• $X(0) = \rho W(0) + \sqrt{1-\rho^2}\,\tilde W(0) = \rho \cdot 0 + \sqrt{1-\rho^2} \cdot 0 = 0$.
• $X(t)$ is clearly continuous as $W$ and $\tilde W$ are continuous and the linear combination maintains continuity.

4 See the amazing deep zoom of the Wiener process shown by Algorithm A.2 in Appendix A with its unexplored features. Contrast this with a corresponding zoom, Algorithm A.1, of the exponential function which is boringly smooth.

[Figure 1.4: two columns of three stacked panels, each panel a zoom into the one above; time is the horizontal axis.]
Figure 1.4. Left: From top to bottom, zoom into three realizations of a Wiener process showing the increasing level of detail and jaggedness. Right: From top to bottom, zoom into the smooth exponential $e^t$. Time is plotted horizontally.

• From its definition and from the properties of scaling and adding normally distributed independent random variables,

$$\begin{aligned}
X(t+s) - X(s) &= \rho W(t+s) + \sqrt{1-\rho^2}\,\tilde W(t+s) - \rho W(s) - \sqrt{1-\rho^2}\,\tilde W(s)\\
&= \underbrace{\rho\,\underbrace{[W(t+s) - W(s)]}_{\sim N(0,t)}}_{\sim N(0,\rho^2 t)} + \underbrace{\sqrt{1-\rho^2}\,\underbrace{[\tilde W(t+s) - \tilde W(s)]}_{\sim N(0,t)}}_{\sim N(0,(1-\rho^2)t)}\\
&\sim N\big(0, \rho^2 t + (1-\rho^2)t\big) = N(0,t).
\end{aligned}$$

• Also from its definition, $X(t+s) - X(s) = \rho[W(t+s) - W(s)] + \sqrt{1-\rho^2}\,[\tilde W(t+s) - \tilde W(s)]$, but neither $W(t+s) - W(s)$ nor $\tilde W(t+s) - \tilde W(s)$ depends on the earlier details of $W$ or $\tilde W$, so neither does $X(t+s) - X(s)$, and hence this increment cannot depend upon the earlier details of $X$.

Summary

The Wiener process $W(t)$ is the basic stochastic process from which we build an understanding of systems with fluctuations. The Wiener process is continuous but not differentiable; its independent random fluctuations, $\Delta W$, scale with the square root of the time step, $\sqrt{\Delta t}$.

1.2 Stochastic drift and volatility are unique

Applying the basic Wiener process, Figure 1.3, to the prices of assets needs refinement for three reasons:
1. different assets have different amounts of fluctuation;
2. risky assets generally have a positive expected return, whereas the expected value of a Wiener process $E[W(t)] = 0$ as it is distributed $N(0,t)$;
3. the Wiener process assumes that any step, $\Delta W$, is independent of the magnitude of $W$, whereas we expect that any change in the value of an asset, $\Delta S$, would be proportional to $S$: investors expect, for example, to get twice the return from a doubling in investment.

Deal with each of these in turn:
1. Increase the size of the fluctuations by scaling the Wiener process to the asset value $S(t) = \sigma W(t)$, where the scaling factor $\sigma$ is called the volatility of the asset. Symbolically we write this as $dS = \sigma\,dW$. Interpret this differential equation as meaning $S_{j+1} = S_j + \sigma\sqrt{\Delta t}\,Z_j$ for $Z_j \sim N(0,1)$ as before, where $\Delta t$ is the time step (like $h$). That is, any increment in asset value $\Delta S = \sigma\,\Delta W$ is distributed $N(0, \sigma^2 \Delta t)$.

2. We expect (hope) the asset will increase in value in the long term. One way to model such growth is to ensure that the increments have a nonzero mean. Simply add a little to each increment:

$$S_{j+1} = S_j + \mu\,\Delta t + \sigma\sqrt{\Delta t}\,Z_j ,$$

where the parameter $\mu$ is called the drift. The factor $\Delta t$ is used so that $\mu$ is interpreted as the expected rate of growth of $S(t)$; for example, in the absence of fluctuations (the volatility $\sigma = 0$) the price $S(t) = S_0 + \mu t$ exactly, whence $\mu$ is the slope of the increase in price. Equivalently we write $\Delta S = \mu\,\Delta t + \sigma\,\Delta W$, or in terms of infinitesimal differentials,

$$dS = \mu\,dt + \sigma\,dW . \tag{1.1}$$

Figure 1.5 shows a realization of a Wiener process and various $S(t)$ derived from it with different drifts and volatilities. Algorithm 1.2 draws this realization of a Wiener process.

Equation (1.1) symbolically records the general stochastic differential equation5 when we allow, as we do next, the drift $\mu$ and the volatility $\sigma$ to vary with price $S$ and/or $t$ instead of being constant.

[Figure 1.5: plot of S(t) against time t for 0 ≤ t ≤ 1, five curves labeled by their values of μ, σ.]
Figure 1.5. A realization of a Wiener process, $\mu = 0$ and $\sigma = 1$, with four other transformed versions, labeled “$\mu$, $\sigma$,” with various drift $\mu$ and volatility $\sigma$ but for the same realization of the Wiener process, as computed by Algorithm 1.2.

5 In the physics and engineering communities, SDEs such as (1.1) are written with the appearance of ordinary differential equations, namely $dS/dt = \mu(t,S) + \sigma(t,S)\,\xi(t)$, recognizing that $\xi(t)$ represents what is called white noise. This is an SDE in a different disguise, called a Langevin equation. However, physicists and engineers usually interpret it in the Stratonovich sense, which is subtly different from the Ito interpretation of SDEs developed in this chapter.

Algorithm 1.2 MATLAB/SCILAB code to draw five stochastic processes, all scaled from the one Wiener process, with different drifts and volatilities.

    n=300;
    t=linspace(0,1,n+1)';
    h=diff(t(1:2));
    dw=sqrt(h)*randn(n,1);
    w=cumsum([0;dw]);
    plot(t,t*[0 2 2 -1 -1]+w*[1 0.5 2 0.3 3])
    legend('0, 1','2, 0.5','2, 2','-1, 0.3','-1, 3')

Algorithm 1.3 In MATLAB/SCILAB, this code estimates the stock drift and stock volatility from a time series of values s, given h is the time step. In SCILAB use $ instead of end.

    dx=diff(s)./s(1:end-1)
    alpha=mean(dx)/h
    beta=std(dx)/sqrt(h)

3. In (1.1) $\mu$ is the absolute rate of return per unit time. However, financial investors require a percentage rate of return. That is, we expect increments relative to the current value to be the same. For example, for an asset with no volatility (treasury bonds, perhaps) we obtain exponential growth (compound interest) such as $S(t) = S_0 e^{rt}$, where $r$ is the (continuously compounded) interest rate. Differentiating this gives $dS/dt = rS$, whence $dS/S = r\,dt$, and so the relative increment in the price of the asset is $\Delta S/S = r\,\Delta t$. For a stochastic quantity we immediately generalize to assume that the relative increment has both a deterministic component, as above, and an additional stochastic component,

$$\frac{\Delta S}{S} = \alpha\,\Delta t + \beta\,\Delta W ,$$

for some constants $\alpha$ and $\beta$ called the stock drift and the stock volatility, respectively. Figure 1.6 plots $\Delta S/S$ for cotton prices from which one could roughly estimate the magnitude of $\alpha$ and $\beta$. Algorithm 1.3 provides code to numerically estimate $\alpha$ and $\beta$. Rearranging and writing in terms of differentials, we suppose throughout our applications to finance that assets satisfy the SDE

$$dS = \alpha S\,dt + \beta S\,dW \tag{1.2}$$

The SDE (1.2) is an instance of the general SDE (1.1), with drift μ = αS and volatility σ = βS.

[Figure 1.6: two panels plotting the relative increments ΔXj = ΔSj/Sj against year, 1970 to 2000: (a) daily, (b) yearly.]
Figure 1.6. Relative increments, ΔS/S, of the cotton prices of Figure 1.1 as a function of time: (a) shows daily drift and volatility coefficients of α = 2.77 × 10^−4/day and β = 0.0181/√day, respectively; (b) shows that this translates into α = 6.9%/year and β = 28.6%/√year.

The only straightforwardly observed departure from the model (1.2) is that financial indices generally have larger negative “jumps” than predicted by the Wiener dW: that is, rare falls in the financial market are too big for a Wiener process; such rare falls reflect financial “crashes.” However, for most purposes the stochastic model (1.2) is sound. We use it throughout our exploration of finance.
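The translation from daily to yearly coefficients in Figure 1.6 follows from the units: drift is per unit time, whereas volatility is per square root of time. A small Python check reproduces the quoted yearly figures under the assumption of about 250 trading days per year; that day count is our assumption, as the caption does not state it.

```python
import math

TRADING_DAYS_PER_YEAR = 250  # assumption: not stated in the text

alpha_day = 2.77e-4  # stock drift per day, Figure 1.6(a)
beta_day = 0.0181    # stock volatility per sqrt(day), Figure 1.6(a)

# Drift scales with the time unit; volatility scales with its square root.
alpha_year = alpha_day * TRADING_DAYS_PER_YEAR
beta_year = beta_day * math.sqrt(TRADING_DAYS_PER_YEAR)

print(f"alpha = {100 * alpha_year:.1f}% per year")
print(f"beta  = {100 * beta_year:.1f}% per sqrt(year)")
```

With this day count the results agree with the 6.9%/year and 28.6%/√year of the caption to the quoted precision.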

Table 1.1. Annual closing price of wheat and its relative increments.

    Year   Close Sj   ΔSj       (ΔSj)/Sj
    2001   $3.00      $0.42      0.14
    2002   $3.42      $0.36      0.11
    2003   $3.78      $−0.27    −0.07
    2004   $3.51      $−0.23    −0.07
    2005   $3.28      $1.48      0.45
    2006   $4.76

Example 1.4 (stock drift and volatility). The price of wheat in US$ per bushel closed at the prices shown in the second column of Table 1.1. Given this data, estimate the stock drift and stock volatility of the price of wheat.

Solution: Compute the changes in the price as shown in the third column of Table 1.1: ΔSj = Sj+1 − Sj. Compute the relative changes in the price as shown in the fourth column of Table 1.1: (ΔSj)/Sj = (Sj+1 − Sj)/Sj. The mean and standard deviation of the last column estimate the stock drift and stock volatility. The average of these five numbers in the fourth column estimates the stock drift: α ≈ 0.11 = 11% per year. The standard deviation of these five numbers in the fourth column estimates the stock volatility: β ≈ 0.21 = 21% per √year. Of course, these estimates are extremely crude because of the small amount of data supplied by Table 1.1.

Definition 1.5. An Ito process satisfies an SDE such as (1.1).

Figure 1.7 shows that by zooming into an Ito process, we find that at the smallest scales the process looks like one with linear drift and constant volatility. Thus the decomposition of the process into the SDE form dX = μ dt + σ dW is justified because this form applies on small scales: we just imagine that a host of such little pictures can be “pasted” together to form the large scale process X(t). This definition is ill-defined, as we have not yet pinned down precisely what is meant by dS = μ dt + σ dW. For the moment we continue to develop and work with an intuitive understanding of the symbols. Precision comes in Chapter 4.

Finally, the Doob–Meyer decomposition theorem (which I will not prove) asserts that any given Ito process X(t) has a unique drift μ and volatility σ. Thus we are assured that there is a one-to-one correspondence between SDEs, dX = μ dt + σ dW, and the stochastic process which is its solution X(t).⁶

Summary
A stochastic process X(t) satisfies an SDE, symbolically written dX = μ dt + σ dW, with some drift μ and some volatility σ. Financial assets satisfy particular SDEs of the form dS = αS dt + βS dW.

⁶ Note that the term “process” applies to the entire ensemble of realizations of a stochastic function. A stochastic process is not just any one realization because each realization is markedly different. Indeed, an SDE gives rise to an infinitude of realizations. It is the ensemble of possible solutions with their probability of being realized that is included within the term “stochastic process.”
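The estimate of Example 1.4 can be checked with a few lines of code. This is a Python sketch of the same steps as Algorithm 1.3, using only the standard library; the function name `estimate_drift_volatility` is a hypothetical name for this illustration, and the prices are the closing prices of Table 1.1 with time step h = 1 year.

```python
import statistics

def estimate_drift_volatility(prices, h):
    """Estimate stock drift and volatility from a price series with time step h."""
    # Relative increments (S_{j+1} - S_j) / S_j
    rel = [(b - a) / a for a, b in zip(prices, prices[1:])]
    alpha = statistics.mean(rel) / h
    # Sample standard deviation (divisor n - 1), the default of MATLAB's std
    beta = statistics.stdev(rel) / h ** 0.5
    return alpha, beta

wheat = [3.00, 3.42, 3.78, 3.51, 3.28, 4.76]  # closing prices, 2001 to 2006
alpha, beta = estimate_drift_volatility(wheat, h=1.0)
print(round(alpha, 2), round(beta, 2))  # prints 0.11 0.21
```

With only five increments the two numbers are, as the example warns, extremely crude estimates.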

[Figure 1.7: several panels plotting X(t) against t at successively greater magnification.]
Figure 1.7. Five realizations of the Ito process X = exp(t + 0.2 W(t)) shown at different levels of magnification (the t axis is horizontal and X(t) is plotted vertically). The top left displays the largest scale showing the exponential growth in X(t); the bottom right displays the smallest scale view of X(t) showing that on small scales it looks just like a stochastic process with linear drift and constant volatility, as promised by the Doob–Meyer decomposition.

1.3 Basic numerics simulate an SDE

Section 2.2 in Chapter 2 develops algebraic solutions of SDEs. Here we resort to a simple numerical technique to approximate solutions of SDEs. The numerical solution of SDEs such as that for assets (1.2) or the general SDE (1.1) is based on replacing the infinitesimal differentials with corresponding finite differences. In a sense this undoes the limit h → 0 discussed in the previous section.

1.3.1 The Euler method is the simplest

Rewrite the general SDE (1.1) in terms of finite differences (increments) ΔS = μ Δt + σ ΔW, and then evaluate it at the jth time step, recognizing that the drift and volatility are generally functions of S and t so that

\[ \Delta S_j = S_{j+1} - S_j = \mu(S_j,t_j)\,\Delta t_j + \sigma(S_j,t_j)\,\Delta W_j . \]

Algorithm 1.4 Code for simulating five realizations of the financial SDE dS = αS dt + βS dW. The for-loop steps forward in time. As a second subscript to an array, the colon forms a row vector over all the realizations.

    alpha=1, beta=2
    n=1000; m=5;
    t=linspace(0,1,n+1)';
    h=diff(t(1:2));
    s=ones(n+1,m);
    for j=1:n
      dw=randn(1,m)*sqrt(h);
      s(j+1,:)=s(j,:)+alpha*s(j,:)*h+beta*s(j,:).*dw;
    end
    plot(t,s)

Rearranging for S_{j+1} gives the recursion

\[ S_{j+1} = S_j + \mu(S_j,t_j)\,\Delta t_j + \sigma(S_j,t_j)\,\Delta W_j , \tag{1.3} \]

where ΔWj are independent random samples from N(0, Δtj). This form allows the time steps to be different. Usually, for simplicity, we choose a fixed step in time of Δtj = h, whence tj = jh and ΔWj = √h Zj for random variables Zj ∼ N(0, 1) so that the Euler method (1.3) is typically invoked as

\[ S_{j+1} = S_j + \mu(S_j,t_j)\,h + \sigma(S_j,t_j)\sqrt{h}\,Z_j . \tag{1.4} \]

The Euler method (1.3) is a most general and simple method to numerically simulate the realizations of SDEs.

Example 1.6 (exponential Brownian motion). The solutions of the SDE dS = αS dt + βS dW are collectively called exponential Brownian motion. For any given α and β, the Euler method (1.4) reduces to

\[ S_{j+1} = S_j + \alpha S_j h + \beta S_j \sqrt{h}\,Z_j . \tag{1.5} \]

With the initial condition that S0 = 1, Algorithm 1.4 shows how this method may be implemented in MATLAB/SCILAB to generate many different realizations. For three different combinations of stock drift α and volatility β, Figures 1.8–1.10 plot five realizations of the solution S(t). See that increasing stock drift α indeed increases the rate of growth, and that increasing the stock volatility β indeed increases the level of fluctuations. Notice that in this last case with relatively large volatility, the realizations, apart from a couple of large excursions, generally seem to stay smallish in magnitude. This is very strange, since the deterministic part of the SDE indicates that the solutions should grow. However, it is not so. Noise can act to stabilize growth. This surprising result is supported by later algebraic results. In financial applications, high volatility can act to keep stocks low even if they would otherwise increase.

This simple Euler method (1.4) for an SDE looks just like a Markov chain, as discussed in stochastic process modelling (see, e.g., Kao 1997). The difference is that instead of discrete states, here we have a continuum of states parametrized by the asset price S.
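The Euler recursion (1.5) of Example 1.6 can also be sketched outside MATLAB/Scilab. The following Python version uses only the standard library; the function name `euler_ebm` and its arguments are hypothetical names for this illustration, with defaults mirroring the n = 1000 steps and m = 5 realizations of Algorithm 1.4.

```python
import math
import random

def euler_ebm(alpha, beta, s0=1.0, n=1000, t_end=1.0, m=5, seed=None):
    """Simulate m realizations of dS = alpha*S dt + beta*S dW by recursion (1.5)."""
    rng = random.Random(seed)
    h = t_end / n
    paths = [[s0] for _ in range(m)]
    for _ in range(n):                 # step forward in time
        for path in paths:             # independent noise for each realization
            z = rng.gauss(0.0, 1.0)    # Z_j ~ N(0, 1)
            s = path[-1]
            path.append(s + alpha * s * h + beta * s * math.sqrt(h) * z)
    return paths

paths = euler_ebm(alpha=1.0, beta=2.0, seed=0)
print([round(p[-1], 3) for p in paths])  # S(1) for each of the five realizations
```

Setting beta = 0 reduces the recursion to compound growth, S_n = (1 + αh)^n, which gives a quick sanity check on the implementation.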

[Figure 1.8: plot of realizations of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.8. Simulation of the exponential Brownian motion of asset values with stock drift α = 1 and stock volatility β = 0.5 as generated by Algorithm 1.4.

[Figure 1.9: plot of realizations of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.9. Simulation of the exponential Brownian motion of asset values with larger stock drift α = 2 and stock volatility β = 0.5 as generated by Algorithm 1.4.

[Figure 1.10: plot of realizations of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.10. Simulation of the exponential Brownian motion of asset values with stock drift α = 1 and larger stock volatility β = 2 as generated by Algorithm 1.4.

But in the numerical code, time is still discrete. At times tj the system is in the state of the asset having price Sj. At the next time tj+1, the system makes a transition to a state Sj+1, a distance μj h + σj √h Zj away, where the stochastic aspect enters through the normally distributed random variable Zj. In stochastic process modeling we investigate probability distributions, whereas here for the moment we focus on realizations. Chapter 3 addresses probability distributions of solutions to SDEs.

Why use MATLAB/SCILAB? True, the financial industry standard is to use spreadsheets for almost all numerical computation. However, spreadsheets are many orders of magnitude slower than script and programming languages (furthermore, common spreadsheet programs have notorious errors). I do not endorse crippling your power by limiting yourself to inefficient tools. Instead, I seek to empower you to manage large-scale numerical problems through MATLAB/SCILAB. I quote from David Chadwick, Stop The Subversive Spreadsheet!:⁷

⁷ http://staffweb.cms.gre.ac.uk/~cd02/EUSPRIG

“The presence of a spreadsheet application in an accounting system can subvert all the controls in all other parts of that system.” So says Ray Butler of the Computer Audit Unit, HM Customs and Excise and Ray should know. For ten years he has been investigating errors in spreadsheets used by companies for calculating their VAT payments. Over this period, he has collected useful data on types and frequencies of errors as well as on the effectiveness of different audit methods not only those used by VAT officers but also those in use by other auditors. Ray has written extensively about the phenomenon of errors in spreadsheets (an excellent example is to be found in his article entitled ‘The Subversive Spreadsheet’). It seems surprising that Ray would find such a problem in such a straightforward well-defined business application but as he is quick to point out “Even in a domain such as indirect taxation, which is characterised by relatively simple calculations, relatively high domain knowledge by developers, and generally well-documented calculation rules, the use of spreadsheet applications is fraught with danger and errors.”

Ray is not alone in his interest of spreadsheet risks. Chris Conlong of the Business Modelling Group at KPMG Consulting is also only too aware of the problems and, when asked, frequently refers to the findings of a KPMG survey of financial models based on spreadsheets. The survey found that 95% of models were found to contain major errors (errors that could affect decisions based on the results of the model), 59% of models were judged to have ‘poor’ model design, 92% of those that dealt with tax issues had significant tax errors and 75% had significant accounting errors. These figures are truly astounding and if extrapolated to all major organisations throughout the world hint at potential disaster scenarios just waiting to happen.

A colleague recently remarked “Spreadsheet errors are a business time-bomb waiting to go off. They’re a bit like the millennium bug: nobody knew the time-bomb was there until it was pointed out and then everybody knew and knew when it would happen. However, with the spreadsheet problem few people know that there is a time-bomb at all and none knows when their particular bomb may go off.”

1.3.2 Convergence is relatively slow

As with any numerical method, we expect the numerical solution to approach the true solution as the time step h → 0. This is true for the Euler method (1.3), but the rate of convergence is slow: for a deterministic differential equation the error of the Euler method is generally O(h), that is, the error decreases in proportion to h; however, for an SDE the error of the Euler method is the larger O(√h). For example, for a deterministic differential equation with time steps of h = 0.001 we would expect an error of about 0.1%, whereas for an SDE the error would be about √0.001 ≈ 3%. The relatively slow convergence of numerical solutions of SDEs is difficult to overcome. For now we illustrate just the convergence.

One crucial issue for SDEs is that different realizations of the Wiener process (the noise) generate quite different looking realizations of the solution S(t). See the different realizations in each of the above figures. Thus, to examine convergence we must retain the one realization of the Wiener process as we vary the step size h. Consequently, in the following example we generate the Wiener process first with the finest time step, then sample it with increasingly coarse time steps.

Example 1.7 (convergence to exponential Brownian motion). Superimpose plots of numerical solutions of the SDE dS = S dt + 1.5S dW with initial condition S0 = 1 for different time steps h to show qualitatively the relatively slow convergence as h → 0.

[Figure 1.11: plot of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.11. One realization of the exponential Brownian motion of asset values with stock drift α = 1 and stock volatility β = 1.5 as generated by Algorithm 1.5 with different time steps h = 1/16 (cyan), 1/64 (red), 1/256 (green), and 1/1024 (blue) to illustrate convergence as h → 0.

[Figure 1.12: plot of S(t) against time t for 0 ≤ t ≤ 1.]
Figure 1.12. One realization of the exponential Brownian motion of asset values with stock drift α = 1 and stock volatility β = 1.5 as generated by Algorithm 1.5 with different time steps h = 1/16 (cyan), 1/64 (red), 1/256 (green), and 1/1024 (blue) to illustrate convergence as h → 0.
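The strategy of Example 1.7, one fine realization of the Wiener process subsampled at coarser steps, can be sketched as follows. This is a Python illustration using only the standard library, not the book's Algorithm 1.5; the function names `wiener_path` and `euler_on_path` are hypothetical. It records the Euler estimate of S(1) for h = 1/1024, 1/256, 1/64, and 1/16 along the same realization.

```python
import math
import random

def wiener_path(n, t_end=1.0, seed=None):
    """One realization W(t_j), j = 0..n, of a Wiener process on [0, t_end]."""
    rng = random.Random(seed)
    h = t_end / n
    w = [0.0]
    for _ in range(n):
        w.append(w[-1] + math.sqrt(h) * rng.gauss(0.0, 1.0))
    return w

def euler_on_path(w, alpha, beta, t_end=1.0, s0=1.0):
    """Euler estimate of S(t_end) for dS = alpha*S dt + beta*S dW along path w."""
    n = len(w) - 1
    h = t_end / n
    s = s0
    for j in range(n):
        s += alpha * s * h + beta * s * (w[j + 1] - w[j])
    return s

w = wiener_path(4 ** 5, seed=3)   # finest step, h = 1/1024
finals = {}
while len(w) - 1 >= 16:
    n = len(w) - 1
    finals[n] = euler_on_path(w, alpha=1.0, beta=1.5)
    w = w[::4]                    # keep every fourth point: same W(t), step 4h

for n, s_end in finals.items():
    print(f"h = 1/{n}: S(1) is approximately {s_end:.4f}")
```

Because each coarse path is a subsample of the fine one, every approximation sees the same W(1); differences between the printed values then reflect the step size alone rather than different noise.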

Algorithm 1.5 Code for starting one realization with very small time step, then repeat with time steps four times as long as the previous.

    alpha=1, beta=1.5
    n=4^5;
    t=linspace(0,1,n+1)';
    w=[0;cumsum(sqrt(diff(t)).*randn(n,1))];
    for col=['b' 'g' 'r' 'c']
      dt=diff(t);
      dw=diff(w);
      s=ones(n+1,1);
      for j=1:n
        s(j+1)=s(j)+alpha*s(j)*dt(j)+beta*s(j)*dw(j);
      end
      plot(t,s,col), hold on
      t=t(1:4:end);
      w=w(1:4:end);
      n=n/4;
    end

Adapt the code used before. Algorithm 1.5 finds numerical solutions on 0 ≤ t ≤ 1 with n steps of length h = 1/n, first with n = 4^5 = 1024, then increasing the step length h by a factor of 4 and correspondingly decreasing n by a factor of 4 each time we draw a new approximation. We generate the Wiener process, W(t), with the smallest step length, and then for each approximation we sample the one realization of W(t) with a step four times larger than the previous approximation by subsampling statements such as w=w(1:4:end). Executing Algorithm 1.5 twice generated Figures 1.11–1.12, one for each of the two different realizations of the Wiener process W(t). In each figure the numerical solutions do seem to converge, albeit a little slowly. In a free online article, Higham (2001) presents more information on such basic methods for SDEs and their convergence and points to more sophisticated and accurate schemes.

Summary
The Euler method (1.4) numerically computes realizations of SDE (1.1) provided the time step is small; generally use a time step h ≈ 0.001 or less.

Suggested activity: Do at least Exercise 1.7(1).

1.4 A binomial lattice prices call option

We use the principle that there cannot be risk-free profit, called arbitrage, to price options in stochastic finance. The arguments employed here to introduce the rational pricing of options look like a numerical approximation of SDEs. But we simplify even more than previously by not only discretizing time but also discretizing asset prices, as we now see. The analysis presented here is based on research by Black, Scholes, and Merton (circa 1973) which later was simplified. Chapters 2 and 3 of the book by Stampfli and Goodman (2001) provide reading to supplement this section.⁸

⁸ Read A calculus of risk by Gary Stix in the May 1998 issue of Scientific American.

Interestingly, the application in Example 1.12 shows that such valuation of options fully justifies taking action on global warming almost irrespective of the actual probability that global warming is occurring!

Definition 1.8. A European call option gives the buyer the right but not the obligation to purchase an asset from the seller at a previously agreed price on a particular date.⁹ The agreed price is called the exercise price or the strike price.

For definiteness we refer to the seller of a call option as Alice and to the buyer as Bob.¹⁰ The call option:
• is sold by Alice so she obtains money to invest elsewhere;
• is exercisable by Bob at the fixed price at the specified date;
• but can be abandoned by Bob without penalty.

Insurance is a form of call option
Bob buys an insurance policy for his car from Alice, the insurance company, at the start of the year. If, during the ensuing year, Bob damages his car so that it is worth less, then he makes an insurance claim: Bob exercises the option. Alternatively, if during the year Bob’s car remains undamaged so it is worth the same (ignoring normal wear and tear), then Bob makes no claim: Bob does not exercise the option and Alice keeps the price of the insurance policy as profit. That Bob makes a claim on an insurance policy (or not) is closely analogous to Bob exercising an option (or not).

Expiry value of a call option
At the expiry date, Bob is not going to buy the asset from Alice for a price above what he could pay on the open market. Thus if the listed price of the asset is ultimately below the strike price, then the option is worthless and Bob tosses it away. However, if the listed price is higher than the strike price of the call option, then Bob will exercise his option and buy the asset from Alice so that, if nothing else, he could then dispose of the asset at a profit. The value of the call option to Bob at the expiry date is thus

\[ C = \max\{0,\; S - X\} , \]

where S is the price of the asset and X is the exercise (strike) price. A call option has some nonnegative value because, depending on the vagaries of market fluctuations, it may be worth either nothing or something, but is never a liability.

⁹ An American call option is the same but additionally allows the buyer the right to purchase the asset before the particular date. Throughout this section we discuss only the easier European call option and the corresponding put option.

¹⁰ Alice and Bob first made a name for themselves in helping to solve problems in quantum physics. They made their contribution in the area of quantum cryptography, where they developed methods of foiling that dastardly eavesdropping spy, known only by her code name, Eve. Since then Alice and Bob have worked for us in financial theory.

Example 1.9 (trade a call option). Alice holds ten units of Telecom shares, an asset; we say altogether they are worth $35. She sells a call option to Bob, at $4 per call option, allowing him to buy the ten shares at the end of the year at the strike price of X = $38.50.

If the price of the ten shares on the open market goes up 25% to S = $43.75, then Bob will exercise his right to buy the shares from Alice for $38.50 and sell them immediately for $43.75, which gives him $1.25 profit because he originally paid Alice $4 for the call option. Alternatively, if the shares drop in price by 20% to S = $28, then Bob will not exercise his option because he would lose money if he did (over and above the $4 he originally paid for the option).

Questions: In this example, is the $4 that Bob paid to Alice a fair price? Should Alice have asked for more, or should Bob have insisted on less? How can we decide?

We answer these questions by a model that is a Markov chain and hence looks like a numerical solution of an SDE. Section 2.3 examines the continuous time, continuous state version using more theory of SDEs.

The principle we use is that of arbitrage: there cannot exist the opportunity for risk-free profit. In this context, “profit” means an amount over and above that which is obtained by investing in secure bonds. We discover that if the call option is valued too highly, then Alice could form a portfolio of the asset and sell options so that she is guaranteed to profit. Alternatively, if the call option is valued too low, then Bob could buy the options with money obtained through selling the asset so that he is guaranteed to profit. It is only when the call option is precisely and correctly valued that nobody is guaranteed to do better than invest in bonds. With a correct valuation of options, people must accept risk in order to better their return from bonds.

1.4.1 Arbitrage value of forward contracts

Before proceeding to the interesting case of options, let us introduce arbitrage in its application to valuing forward contracts.

Example 1.10 (Telecom shares). Instead of giving the option to Bob of buying the Telecom shares of Example 1.9, suppose Alice and Bob write a forward contract: The forward contract states that Bob will definitely purchase the asset, the ten Telecom shares, from Alice for $38.50 at the end of one year. However, the contract involves risk: If the market value goes above $38.50, then Bob gains because he obtains the asset at a cheaper price than the open market price, and Alice loses. Alternatively, if the market value stays below $38.50, then Bob loses because he must buy at a higher price than the open market price, and Alice correspondingly gains. Who should pay what to whom at the start of the year for this forward contract?

Solution: Arbitrage determines the value of the forward contract. Suppose Bob pays Alice F0 for the forward contract (negative F0 means Alice pays Bob). Alice avoids risk by purchasing the asset at the start of the year for $35 with her own funds; she then has the asset of the shares ready for sale at the agreed price at the end of the year. Since Bob pays her F0 for the forward contract, Alice invests only $35 − F0 into this portfolio of the asset and the forward contract. Alice makes this investment to receive $38.50 at the end of the year, risk free. Alternatively, Alice could invest this amount, $35 − F0, in risk-free bonds paying, say, 12% interest over the year. Thus Alice could alternatively obtain a risk-free return of 1.12($35 − F0) by investing in bonds. Arbitrage asserts that any risk-free return must be the same as investing in bonds. Thus arbitrage asserts $38.50 = 1.12($35 − F0). Rearranging, F0 = $35 − $38.50/1.12 = $0.625. That is, Bob should pay Alice 62.5 cents for this forward contract.
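The arithmetic of Example 1.10 amounts to F0 = S0 − X/R, where R is the factor by which bonds grow over the period (here R = 1.12). A short Python check follows; the function name `forward_value` and its argument names are hypothetical, chosen for this illustration.

```python
def forward_value(s0, strike, growth):
    """Arbitrage value F0 = S0 - X/R paid by the buyer at the start of the contract."""
    return s0 - strike / growth

f0 = forward_value(s0=35.0, strike=38.50, growth=1.12)
print(f"F0 = ${f0:.3f}")

# Consistency check: the seller's net outlay S0 - F0, if put into bonds
# instead, grows to the agreed price X, so neither route beats the other.
bond_return = 1.12 * (35.0 - f0)
print(f"bond return on S0 - F0 = ${bond_return:.2f}")
```

A negative result from `forward_value` would correspond to Alice paying Bob, as noted in the solution.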

Now use arbitrage to value a general forward contract. Suppose Alice and Bob contract for Bob to purchase an asset from Alice at time T for some agreed price X. Let Bob and Alice value this forward contract at F0 at the start of the period, that is, Bob pays Alice F0 to sign the contract. To avoid risk, Alice could purchase the asset immediately for its known current price S0. She would invest S0 − F0 of her funds to do this and would then receive X at expiry time T from Bob, risk free. Being risk free, this return must be the same as what Alice would get by investing the same amount, S0 − F0, in bonds. At the expiry time T the bonds would be worth R(S0 − F0) for some R reflecting the interest rate of the bonds. Arbitrage asserts X = R(S0 − F0), which rearranged determines

\[ F_0 = S_0 - X/R \tag{1.6} \]

to be the value of a forward contract. Analogous reasoning values options: we need to value Alice’s investment and to discount the expiry value by the bond rate.

1.4.2 A one step binomial model

Our initial simple analysis rests on a Markov chain with just three states representing the possible prices of an asset now and in a year’s time. For definiteness we continue with the numbers used in Example 1.9. From the current state of the asset price being S = $35, here we restrict our attention to the two possibilities that the price after a year has either risen by 25% or fallen by 20%, that is, the asset price is multiplied by a factor of u = 1.25 or d = 0.80, respectively. Figure 1.13 shows the state transition diagram. I do not record any probabilities for the transitions from the current price S because the probabilities of the price increasing or decreasing are irrelevant!

[Figure 1.13: state transition diagram with the three states Sd = 28, S = 35, and Su = 43.75.]
Figure 1.13. State transition diagram for a simple binomial model of asset prices over one period: The prices either go up by 25% or down by 20%.

[Figure 1.14: binomial tree from S = 35, C = ? to either Su = 43.75, Cu = 5.25 or Sd = 28, Cd = 0.]
Figure 1.14. Rotate the states of Figure 1.13 90° and plot time horizontally to give a simple binomial tree model of asset prices over one period.

It is usual in financial applications to arrange the states vertically and to plot time horizontally as shown in Figure 1.14. Additionally we include in this diagram the value of the call option on the asset at the end of the year: if the asset price goes up, the call option is worth Cu = $43.75 − $38.50 = $5.25, as the strike price agreed to between Alice and Bob was X = $38.50; whereas if the asset price goes down the call option is worthless, Cd = $0, because Bob, if he were to buy the asset, would prefer to purchase it at the open market rate of S = $28 rather than from Alice for X = $38.50. But how do we determine the value, C, of the call option at the start of the year? We soon find that the market forces the call option to have a definite price. Look at the issue from Alice’s point of view.

A risk-free profit from overpriced options
Suppose Alice wants to buy the asset of the Telecom shares. One possible way to get the requisite $35 is to invest $23 of her own money and to sell three call options to Bob at the valuation of $4 each. After one year, if the asset price goes down, then the options are worthless and Alice’s asset, and hence her hypothetical portfolio, is worth $28; whereas if the asset price goes up, in order to satisfy Bob’s demand for the three lots of shares, she has to buy two more lots of Telecom shares, costing $43.75 × 2, for which Alice receives $38.50 × 3, and hence at the end of the year Alice will have $38.50 × 3 − $43.75 × 2 = $28. The hypothetical portfolio’s value of $28, irrespective of whether the asset price rises or falls, is better than what Alice could have gained if she had invested her $23 in bonds at, say, 12% interest, because at the end of the year that would have only been worth $25.76. Alice makes a risk-free profit! This risk-free profit shows that, at $4, the call options are overpriced. The opportunity for risk-free profit would result in traders clamoring to sell options, which would flood the market and hence reduce the price of the option.

Hedge against fluctuations
We endeavour to discover under what conditions it is impossible to make a profit (above the bond rate) without risk, when the options are valued correctly. But before we do that, note that Alice’s choice of selling three call options is the result of a careful balancing act.

• Suppose Alice had only sold two call options and invested $27 of her own money to buy the asset; then if the market goes up, her portfolio is worth $38.50 × 2 − $43.75 = $33.25, whereas if the market goes down, it is worth only $28. This latter figure is a markedly lower return than she could get from investing her $27 in bonds, which at 12% interest would be worth $30.24 at the end of the year. If the asset price increases, then Alice has done well, but the point is that this good outcome is only achieved by taking the risk that the portfolio will fall in value relative to bonds.

• Conversely, suppose Alice had sold four call options and invested $19 of her own money to buy the asset; then if the asset price goes down, her portfolio is worth $28, whereas if the asset price goes up, her portfolio is worth only $22.75. Although these are both higher than the bond return, the imbalance between the two cases results in a potential loss for Alice, and there is risk.

Only by selling an appropriate number of options does Alice create a risk-free portfolio. Alice could construct a hypothetical portfolio of a unit of the asset and H call options sold to Bob, initially costing Alice a net investment of S − HC. If the price of the asset goes down, then the call options are worthless, and so the value of the portfolio at the end of the year is just that of the asset, namely Sd = $28. By the definition of a risk-free portfolio, this value must be the same as the value of the portfolio even if the asset rises in price: a rise in price increases the value of the asset, but also increases the value of the call option Alice sold to Bob. If the price of the asset goes up, then Bob will elect to exercise the call option, and at the end of the year Alice will have to sell H units of the asset to Bob at a price of X, receiving HX, but at the expense of having to buy H − 1 units of the asset at a price Su. Thus at the end of the year the portfolio is worth

\[ HX - (H-1)Su = Su - H(Su - X) = Su - HC_u , \]

where Cu = $5.25 is the net value of each call option if the asset price has gone up. In order for the portfolio to have the same value at the end of the year irrespective of whether the asset price has gone up or down, that is, to be risk free, we must then have Su − HCu = Sd, that is, $43.75 − H × $5.25 = $28. Solve this linear equation immediately to give us the magic result that Alice must sell H = 3 call options to insulate her portfolio from price fluctuations and ensure the value of $28 at the end of the year.

A fair price for the option
Now Alice would like this investment to be larger than the return from investing the initial capital, S − HC = $35 − 3C, in bonds at 12% interest. Thus Alice wants

\[ \$28 \ge 1.12 \times (\$35 - 3C) \quad\Rightarrow\quad C \ge \tfrac{1}{3}\,(\$35 - \$28/1.12) = \$\tfrac{10}{3} = \$3.33 . \]

This is sensibly lower than the overpriced $4 we used earlier. But if the price of the call option is higher than $3.33, then the market will be flooded with people trying to sell such options. Thus the arbitrage price of the call option must be C = $3.33.

Value an option first by finding how one could hedge against risk, and second by equating such a risk-free portfolio to investment in bonds.

Suggested activity: Do at least Exercise 1.11.

Example 1.11 (Bob wants to buy cheap options). Bob may want to buy a call option to protect himself from fluctuations when he sells an asset.

Solution: The argument is easier if we phrase it as Bob buying x units of the asset, where x will turn out to be negative to represent that he actually sells the asset. So at the start of the year Bob buys the three call options at price C from Alice and x units of the asset at S = $35 each; this costs him $35x + 3C which, if Bob had not bought the call options and invested in bonds instead, would be worth 1.12 × ($35x + 3C) at 12% interest. In the converse of Alice’s argument, if the asset price goes down, Bob’s portfolio will be worth just $28x, as the call options are worthless; whereas if the asset price goes up, the portfolio is worth $43.75 x + 3 × $5.25. To be protected against the fluctuations, these last two outcomes have to be equal:

\[ \$28x = \$43.75\,x + 3 \times \$5.25 \quad\Rightarrow\quad x = -1 . \]

Thus Bob sells one unit of the asset as well as buying the three call options. To be worthwhile, this guaranteed return has to be at least as large as the return he would obtain by investing in bonds; thus
\[ \$28x \ge 1.12(\$35x + 3C) \quad\Rightarrow\quad C \le \$3.33 . \]
Thus Bob would only be interested in buying call options if they were priced less than $3.33, whereas Alice was only interested in selling them for more than $3.33. The rational price for the call option must then be C = $3.33.

The above example shows that the buyer and seller of call options have opposing interests that balance at one price only.11

Value a general option

We now derive the general formula for the one-step binomial price of a call option from the point of view of the seller, Alice. There are two parts to the analysis: determining the number of options to sell, and determining the fair price for the options.

The number of options to sell  Refer to the binary tree given earlier but only pay attention to the algebra. An asset bought by Alice at a cost of S at the start time may increase in price by a factor u to a price Su, or it may fall in price by a factor d to a price Sd. By selling H call options at the start of the year for some strike price X, Alice can protect herself against fluctuations, ensuring a risk-free outcome. At the end of the period each call option is worth \(C_u\) or \(C_d\) depending on whether the asset price has risen or fallen, respectively (so far we have seen only \(C_u = Su - X\) and \(C_d = 0\), but more generally these have other values). To be risk free the portfolio must have the same value at the end no matter which outcome eventuated: thus \(Su - HC_u = Sd - HC_d\), which, rearranged, gives the hedge ratio
\[ H = \frac{Su - Sd}{C_u - C_d} . \tag{1.7} \]
Observe the reasonably intuitive result that, in order to make each outcome have the same value, the number of call options sold must be the ratio of the range of the asset price to the range of the option. Thus H is called the hedge ratio.

The fair price  Recall that for a riskless portfolio there can only be rational buyers and sellers to establish the portfolio if the investment returns the same as the risk-free bond interest rate. Let \(R = e^r\) be the factor by which bonds increase in price over the time period so that r is the equivalent continuously compounded interest rate over that time period (we used R = 1.12 in the earlier example, that is, r = 0.1133 = 11.33%). Then the original investment by Alice of S − HC, if put into bonds, would be worth R(S − HC). Setting the equivalent return in bonds equal to the return of the risk-free portfolio gives
\[ R(S - HC) = Su - HC_u \quad (= Sd - HC_d), \]
which, rearranged, gives the fair price of a call option as
\[ C = \frac{S(R - u) + HC_u}{HR} . \]

11 This balance between the two different views of a call option is rather like the beautiful relation between a linear programming problem and its dual that you see in operations research.
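The two formulas above can be checked against the earlier numbers (S = $35, X = $38.50, u = 1.25, d = 0.8, R = 1.12). The following is only a minimal sketch: the function name one_step_call and its argument order are choices made here, not anything defined in the text.

```python
def one_step_call(S, X, u, d, R):
    """One-step binomial value of a call option from the seller's viewpoint.

    Returns the hedge ratio H of (1.7) and the fair price C.
    Assumes the option pays differently in the two outcomes (Cu != Cd);
    otherwise no hedge ratio exists and the division below fails.
    """
    Cu = max(0.0, S * u - X)   # option value if the asset price rises
    Cd = max(0.0, S * d - X)   # option value if the asset price falls
    H = (S * u - S * d) / (Cu - Cd)          # hedge ratio (1.7)
    C = (S * (R - u) + H * Cu) / (H * R)     # fair price of the call
    return H, C

H, C = one_step_call(35.0, 38.50, 1.25, 0.80, 1.12)
print(f"H = {H:.2f}, C = {C:.2f}")
```

With these inputs the hedge ratio comes out as three options per asset and the price rounds to the $3.33 on which Alice and Bob agree.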

However, we obtain a more appealing form of this expression after substituting (1.7) and rearranging to
\[ C = \frac{1}{R}\left[ pC_u + (1 - p)C_d \right], \quad\text{where } p = \frac{R - d}{u - d} \text{ and } 1 - p = \frac{u - R}{u - d} . \tag{1.8} \]
In the earlier example, p = 0.71 and 1 − p = 0.29. See that the price of the call option is just a convex combination of the possible values at the end of the period discounted by the bond factor R to give a value for the start of the period.12

People would not actually create a risk-free portfolio as given above because if they did, their portfolio would not do any better than the return from bonds. This argument is just used to set the prices of call options. In general, people buy and sell options just as they buy and sell assets, based on their own assessment of the future movement of market prices.

12 The weights appearing in this convex combination appear as probabilities. But p and 1 − p have nothing to do with the actual probability of an up or down step in the asset price. Nonetheless, the theory of transformations between different probability distributions, the Radon–Nikodym derivative, is based on the view that such weights do act like probabilities for some purposes.

Example 1.12 (support research into global warming). Let us present the case for financing research projects as an option to ensure a future for humanity. In particular, with almost entirely invented figures, we explore justifying a research project into the contentious issue of global warming.13

Suppose a researcher in climatology, called JR for short, submits to a country's government a project proposal costing 1 M$ in personnel and equipment, and promising increased knowledge to ameliorate any significant global warming. Should the government fund the project? How is the project to be valued? We value the project as a European (put) option: the government is the buyer of the option (that is, pays for JR's project), and JR is the seller of the option. Interestingly, the project has this value independent of the probability that global warming actually occurs! This standard financial instrument of options values the project irrespective of the validity of the doubts and criticisms of the nay-sayers.

Now proceed with the option valuation. Suppose the government funds JR's project by forming a portfolio of the funded project together with an investment into the country's economy. Say the economy is currently valued at 16 T$.14 Suppose that the government's (potential) portfolio is this "bought" option with a (tiny) fraction φ (phi) of the economy invested in the economy.15 Look a decade, say, into the future to value this portfolio. In the spirit of our binomial modeling, either one of two things will happen:

13 Although global warming is now generally accepted, and its dangers estimated, for the past 30 years or more global warming was a very contentious issue. Our valuation of research projects applies throughout this history, but such valuation is only recently recognized.
14 1 T$ is one tera-dollar, more commonly known as a trillion dollars. 1 G$ denotes one giga-dollar, otherwise known as a billion dollars. 1 M$ denotes one mega-dollar, that is, a million dollars. I prefer this scientific notation.
15 The variable φ is the reciprocal of the hedge ratio H used earlier. The reason is that now the portfolio is one option and some fraction of the asset (the economy), whereas previously the portfolio was one asset and any number of options.

• If global warming is a chimera, global temperatures will revert to normal, the economy will grow, and the knowledge of JR's project is worthless. Say the economy grows to 24 T$ over the decade. The value of the portfolio is then simply \(P_u = \varphi\,24\) T$.
• If global warming is real, global temperatures continue to climb, sea levels rise, flooding productive land and valuable housing, and the economy suffers badly and falls to a value of 12 T$. But the knowledge gained by JR's project empowers the government to take remedial action to limit the damage caused by global warming; for example, suppose the knowledge gained by JR's project saves 3 G$. By calling on the knowledge gained by the research project, the government then values the portfolio at \(P_d = 3\) G$ \(+\ \varphi\,12\) T$.

Risk-free portfolio  One generates a risk-free portfolio by choosing the (reciprocal) hedge ratio φ so that these values at the end of the decade are identical: \(P_u = P_d\), that is, φ 24 T$ = 3 G$ + φ 12 T$, which rearranged gives the hedge ratio \(\varphi = \tfrac14 \times 10^{-3}\). At the end of the decade the risk-free portfolio would thus have value 6 G$, independent of the likelihood of global warming occurring.

Arbitrage gives initial value  Now suppose that over the decade, bonds will increase by 33⅓% (roughly 3% per year). That is, bonds will increase in value by a factor of R = 4/3. Since the portfolio is risk free, it must have the same value as investing in bonds. Thus the initial value of the portfolio is its final value divided by R, namely P = 6 G$/(4/3) = 4.5 G$. Let the present value of JR's project be V; then the portfolio would cost V + φ 16 T$ = V + 4 G$. Consequently, as this cost must equal the initial value of the portfolio, P = 4.5 G$, the value of JR's research project is V = 0.5 G$ = 500 M$. Since JR offers to do the project, generating the knowledge, for 1 M$, the government should value this project, as an option, as 500 times its cost, independent of the likelihood of global warming. Research into global warming is an extremely good value for the money as precautionary insurance. Analogous valuation applies to all such precautionary research projects.

Suggested activity: Do at least Exercise 1.12.

1.4.3 Use a multiperiod binomial lattice for accuracy

The previous two examples assumed that the time period over which the call option operates is divided into just one time step. It also assumed that the price of an asset went up or down only in one of two large steps. Make your analysis more accurate by dividing the relevant time period into more steps and by correspondingly increasing the number of possible asset prices. Section 2.3 applies SDEs by making the time step infinitesimally small and asset prices continuous. As an interim stage, we decide now, for example, to model the asset price movement with quarterly time steps (Δt = h = 1/4 year).

Over a year the asset price then has four increments, so we model the domain of asset prices by a nine state Markov chain: we decide on having four prices on either side of the current price S, to allow for the possibility that the four increments of the price are either all up or all down, as shown in Figure 1.15.

Figure 1.15. State transition diagram for a simple Markov chain model of asset prices over multiple time steps. The nine states are \(Sd^4 = 22.40\), \(Sd^3 = 25.04\), \(Sd^2 = 28.00\), \(Sd = 31.31\), \(S = 35.00\), \(Su = 39.13\), \(Su^2 = 43.75\), \(Su^3 = 48.91\), and \(Su^4 = 54.69\).

Although by no means necessary, it is convenient to have the selection of prices be at a constant ratio u = 1/d to their neighboring prices. The prices either go up or down by, for example, a factor of u = 1.1180 or d = 1/u = 0.8944. Thus, for example, an increment in price followed by a decrement in price will result in the price being exactly back at the starting value, rather than somewhere in between. A further reason for making all the increments in price use the same factors u and d is that it is then easier to apply the formula (1.8), as the parameters u and d being constant lead to a constant parameter p.

But why choose u = 1/d = 1.1180? Recall that for exponential Brownian motion (which is characteristic of asset prices), in each time step of the numerical approximation (1.5) the prices are multiplied by a factor \(1 + \beta Z\sqrt{h} \approx e^{\beta Z\sqrt{h}}\) (upon ignoring the drift term αh) where Z ∼ N(0, 1). The rough approximation that prices only either rise or fall is equivalent to assuming that Z = ±1. In general,
\[ u = e^{\beta\sqrt{h}} \quad\text{and}\quad d = 1/u = e^{-\beta\sqrt{h}} . \tag{1.9} \]
This keeps the volatility comparable in the approximations over differently sized time steps. The general rule is that the multiplicative factor of asset price increments should be proportional to some constant raised to the power of \(\sqrt{h}\), whence the factors (1.9). Now here we investigate a quarterly model so that h = 1/4, that is, \(\sqrt{h} = 1/2\). Hence here the ratio
\[ u = e^{\beta/2} = \sqrt{e^{\beta}} = \sqrt{u_{\text{yearly}}} = \sqrt{1.25} = 1.1180 . \]

The parameter p also depends upon the bond rate R, which also varies with the time step h. However, since bonds growing by compound interest increase in value like \(e^{rt}\) for some rate r, it follows that when sampled at time steps of size h, the appropriate factor is \((e^r)^h\). That is, the multiplicative factor for the rate of increase in the value of bonds should be proportional to a constant to the power of h. Here we want the interest compounded over four quarters to be the same as the yearly interest. Thus \(R_{\text{quarterly}}^4 = R_{\text{yearly}}\), and hence in this example, \(R_{\text{quarterly}} = 1.12^{1/4} = 1.02873\). Using this value for R and the earlier values for the price increments, the parameter p = 0.6007 and correspondingly 1 − p = 0.3993.
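As a check on the quarterly figures, the scaling rules can be evaluated directly. This is only an illustrative sketch: the helper name step_parameters is invented here, and it assumes the yearly factors are given as \(u_{\text{yearly}} = e^{\beta}\) and \(R_{\text{yearly}} = e^{r}\).

```python
import math

def step_parameters(u_year, R_year, h):
    """Per-step lattice parameters for a time step of h years.

    u_year is the yearly up factor e^beta; R_year is the yearly bond factor e^r.
    """
    u = u_year ** math.sqrt(h)   # e^{beta sqrt(h)}, as in (1.9)
    d = 1.0 / u                  # constant ratio u = 1/d
    R = R_year ** h              # (e^r)^h, compound interest over one step
    p = (R - d) / (u - d)        # weight appearing in formula (1.8)
    return u, d, R, p

u, d, R, p = step_parameters(1.25, 1.12, 0.25)
print(f"u = {u:.4f}, d = {d:.4f}, R = {R:.5f}, p = {p:.4f}")
```

The results agree with the values quoted above to about three decimal places; the small difference in the last digit of p arises because the quoted 0.6007 follows from the rounded values of u, d, and R.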

Lay out the states vertically with time horizontally, as shown in Figure 1.16. See that because at each time step the price is assumed to only either rise or fall, at each time step only half the prices are accessible, and thus only these are drawn at any time. Although included in the figure, as yet we do not know the value of the call option at each of the nodes. However, at the end of the year (the last column in the figure), we do know that the value of a call option is C = max{0, S − X} for the various asset prices S. These have been recorded and labeled so that \(C_{u^k d^l}\) denotes the value of the call option after, in any order, k increments and l decrements in the price. Then determine the value of the call option at all other times by proceeding from right to left, as we did for the single step and as shown in Figure 1.17.

Figure 1.16. Example of a four-step binary lattice to compute the value of a call option (price S vertically, time t horizontally). The asset price is the top number of each pair, and the computed option value is the bottom number.

Now investigate the four quarterly time steps. First, consider the problem of determining the value of the call option at the start of the fourth quarter following three successive increments in the price of the asset. The state of the system is that the asset is priced at \(Su^3 = 48.91\), the upper state in the second column from the right in Figure 1.16. In this quarter of the year the problem looks just like the simple one-step binomial model we analyzed earlier: consider just the top-right part of the lattice as in Figure 1.17.

At the start of this final quarter the issues are exactly the same as outlined for the one-step binomial model: Alice has an asset with which she can adjust her risk-free portfolio by choosing the right number of call options valued at a strike price of X = $38.50.

Because this portfolio is risk free, arbitrage asserts it must return the same value as investments in bonds. Thus the previous arguments and formulae apply. Thus the formula (1.8) applies with an appropriate change in subscripts, namely,
\[ C_{u^3} = \frac{1}{R}\left[ pC_{u^4} + (1 - p)C_{u^3 d} \right] = \left[ 0.6007 \times \$16.19 + 0.3993 \times \$5.25 \right] / 1.0287 = \$11.49 . \]
Similar reasoning applies for all other values of the call option at the start of the last quarter. For example,
\[ C_{u^2 d} = \frac{1}{R}\left[ pC_{u^3 d} + (1 - p)C_{u^2 d^2} \right] = \$3.06 . \]

Figure 1.17. Each step in using a multi-step binary lattice to value a call option looks like a simple one step binomial approximation, such as this extract from the top-right of Figure 1.16 (nodes \(Su^3 = 48.91\) with \(C_{u^3}\) to be found, \(Su^4 = 54.69\) with \(C_{u^4} = 16.19\), and \(Su^3 d = 43.75\) with \(C_{u^3 d} = 5.25\)). Each step is based upon the risk-free return of a hypothetical portfolio of options and assets.

Once the values of the call option are determined at the start of the last quarter, the same reasoning and formula give the values at the start of the third quarter, and so on from right to left across the tableau to give the values already appearing in Figure 1.16. Thus a four-step binomial lattice model estimates that the call option with strike price X = $38.50 on the asset is worth C = $3.50 at the start of the year.16

16 The repeated application of (1.8) makes it look like a numerical approximation to a partial differential equation (PDE) for the value of the call option as a function of asset price s and time t, namely C(t, s). If over a time step of 1 (year or whatever) \(R = e^r\) and \(u = e^{\beta}\), then for many small time steps of size h, \(R = e^{rh} \approx 1 + rh\), \(u = e^{\beta\sqrt{h}} \approx 1 + \beta\sqrt{h}\), \(d = e^{-\beta\sqrt{h}} \approx 1 - \beta\sqrt{h}\), and \(p \approx \bigl(1 + \tfrac{r}{\beta}\sqrt{h}\bigr)/2\). Then expanding C(t, s) for nearby values in s and t into Taylor series transforms (1.8) into the PDE
\[ \frac{\partial C}{\partial t} - rC + rs\frac{\partial C}{\partial s} + \tfrac12 \beta^2 s^2 \frac{\partial^2 C}{\partial s^2} = 0 . \]
We later derive this directly from an SDE.

Summary  Use expiration values of the call option to determine the values at the third step, use third-step values for those at the second, use second-step values for those at the first, and finally, compute the initial call option value. Algorithm 1.6 shows that such computations are easily done in MATLAB/SCILAB.

Suggested activity: Do at least Exercise 1.16.
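The right-to-left sweep just summarized can also be sketched in Python, as an alternative to the MATLAB/SCILAB of Algorithm 1.6. This is an illustration under stated assumptions rather than the text's own code: the function name binomial_call and its parameters are invented here, the lattice spans exactly one year, and the yearly factors are supplied as \(u_{\text{yearly}} = e^{\beta}\) and \(R_{\text{yearly}} = e^{r}\).

```python
import math

def binomial_call(S0, X, u_year, R_year, n):
    """Value a European call on an n-step binomial lattice spanning one year."""
    h = 1.0 / n                    # time step in years
    u = u_year ** math.sqrt(h)     # up factor per step, as in (1.9)
    d = 1.0 / u                    # down factor per step
    R = R_year ** h                # bond factor per step
    p = (R - d) / (u - d)          # weight in formula (1.8)
    # Option values at expiry, ordered from the lowest to the highest asset price.
    values = [max(0.0, S0 * u ** (2 * k - n) - X) for k in range(n + 1)]
    # Sweep from expiry back to the start, applying (1.8) at every node.
    for _ in range(n):
        values = [(p * up + (1.0 - p) * down) / R
                  for down, up in zip(values[:-1], values[1:])]
    return values[0]

price = binomial_call(35.0, 38.50, 1.25, 1.12, 4)
print(f"C = {price:.2f}")
```

Each pass of the loop shortens the list by one entry, mirroring how each column of the lattice has one fewer accessible state than the column to its right. With four quarterly steps the result rounds to the $3.50 found above.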

Algorithm 1.6  MATLAB/SCILAB code for a four-step binomial lattice estimate of the value of a call option.

    u=1.25^(1/2), d=1/u
    r=1.12^(1/4)
    x=38.50
    s=35*u.^(-4:2:4)
    p=(r-d)/(u-d)
    c=max(0,s-x)
    for i=1:4
      c=(p*c(2:end)+(1-p)*c(1:end-1))/r
    end

1.5 Summary

• Many fluctuating, noisy signals, such as financial indices, are modeled using a Wiener process, W(t), which has the following properties: W(t) is continuous; W(0) = 0; the change W(t + s) − W(s) ∼ N(0, t) for t, s ≥ 0; and the change W(t + s) − W(s) is independent of any details of the process for times earlier than s.
• An Ito process, S(t), satisfies an SDE in the form dS = μ(S, t)dt + σ(S, t)dW, where μ is called the drift and σ the volatility. Financial prices are assumed to have drift μ = αS and volatility σ = βS.
• Obtain numerical solutions by discretizing the SDE as \(\Delta S_j = \mu(S_j, t_j)\Delta t_j + \sigma(S_j, t_j)\Delta W_j\) for (very) small time steps \(\Delta t_j\).
• Value financial options by first determining a hedge that would ensure a risk-free result, and second requiring, by arbitrage, that the risk-free result is the same as that obtained from secure bonds. A one-step binomial model applies these principles to value an option as
\[ C = \frac{1}{R}\left[ pC_u + (1 - p)C_d \right], \quad\text{where } p = \frac{R - d}{u - d} \text{ and } 1 - p = \frac{u - R}{u - d}, \]
where R is the bond factor, u and d the factors of increase and decrease in the price of the underlying asset, and \(C_u\) and \(C_d\) the corresponding values of the option at the end of the period.
• A more accurate multistep binomial lattice model repeatedly applies the above formula backward in time from expiry date to the current time to determine the current value of an option.
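The discretization in the third point of the summary can be sketched as follows for financial prices, where μ = αS and σ = βS. This is a minimal illustration using only the Python standard library; the function name euler_path, the seed, and the parameter values are choices made here and are not taken from the text.

```python
import math
import random

def euler_path(S0, alpha, beta, T, n, rng):
    """One realization of dS = alpha*S dt + beta*S dW by the Euler discretization."""
    h = T / n                                   # constant time step
    S = S0
    path = [S]
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(h))       # Wiener increment ~ N(0, h)
        S = S + alpha * S * h + beta * S * dW   # Delta S = mu*h + sigma*dW
        path.append(S)
    return path

rng = random.Random(1)                          # fixed seed for repeatability
path = euler_path(1.0, 1.0, 2.0, 1.0, 1000, rng)
print(len(path), path[-1])
```

Setting β = 0 removes the noise, in which case the final value approaches \(S_0 e^{\alpha T}\) as the step shrinks, which gives a simple sanity check on the stepping.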

Exercises

1.1. In MATLAB/SCILAB write a script to use randn and cumsum to generate, say, five realizations of a Wiener process; approximate it by taking n = 1000 time steps over 0 < t ≤ 1. Plot the realizations. Introduce a drift μ and volatility σ and plot realizations of the resultant process; for example, investigate μ ∈ {0, ±1, ±3} and σ ∈ {1/3, 1, 3}.

1.2. Get your friends and family to play this simple little game that illustrates a key aspect of stochastic mathematics in application to finance.
Draw on a big sheet of paper a sequence of 30 squares and label them consecutively 0 (bankrupt), 1, 1, 2, 2, 3, 3, . . . , 14, 14, 15 (millionairedom). Each of these squares represents one state in a 30-state Markov chain. Imagine each state represents the value of some asset such as the value of a small business that each player is managing.
Give each player a token and a six-sided die. At the start place each token on the second "2", that is, the fifth state. Imagine this corresponds to the small business having an initial value of $200,000.
Each turn in the game corresponds to, say, one year in time. In each year the business may be poor or may grow. Thus in each turn each player rolls his/her die and moves according to the following rules:
• if a player rolls a 1 or 2, then he/she moves down some states;
• if a player rolls a 3, he/she stays in the same state;
• if a player rolls a 4, 5, or 6, he/she moves up some states.
But the number of states (squares) a player moves is given by the number written in each square. Thus in the first move, for example, because the fifth square/state is a "2," a player moving up moves from the fifth square to the seventh square, and a player moving down moves to the third square.
Investors expect returns in proportion to their investments. That the number written in each square is (roughly) proportional to the position of the square in the sequence corresponds to the financial reality that small businesses usually grow/shrink by small amounts, whereas large companies grow/shrink by large amounts.
Each player continues to roll his/her die and move until reaching 0 or 15. That is, players continue to operate their businesses until they either go bankrupt or reach millionairedom.
Questions:
1. Why do you expect each business to grow? That is, why do you expect each player to reach the "millionairedom" state?
2. When you play the game, roughly what proportion of players reach millionairedom? What proportion go bankrupt?
3. How do you explain the actual results?

1.3. Estimate (albeit very crudely) the stock drift and stock volatility of the price of silver (US$/ounce) from the data in Table 1.2.

Table 1.2. Annual closing prices of silver (US$/ounce).

    Year    Close
    2000    $4.595
    2001    $4.650
    2002    $4.790
    2003    $5.960
    2004    $6.845
    2005    $8.910

Table 1.3. Annual closing value of the Japanese Nikkei Index (yen).

    Year    Close
    1992    16925.0000
    1993    17417.1992
    1994    19723.0996
    1995    19868.1992
    1996    19361.3008
    1997    15258.7002
    1998    13842.1699
    1999    18934.3398
    2000    13785.6904
    2001    10542.6201
    2002     8578.9502
    2003    10676.6396
    2004    11488.7598
    2005    16111.4297
    2006    17225.8300

1.4. Estimate (albeit crudely) the stock drift and stock volatility of the value of the Japanese Nikkei Index (yen) from the data in Table 1.3. Plot the Index as a function of time and copy and paste the data into MATLAB/SCILAB or an equivalent program, and compute estimates of the stock drift and stock volatility.

1.5. Implement the Euler code of Example 1.6.
1. Plot numerical solutions for α = 1 and β = 2 over 0 ≤ t ≤ 1.
2. Now generate simultaneously m = 300 realizations of the solution of the SDE (for α = 1 and β = 2) and draw a histogram of their values at time t = 1, using bins of width 0.25 from 0 to 5, say. See that almost all solutions have a smallish numerical value (less than one).
3. Nonetheless, compute the mean of the values at time t = 1 and see that it is roughly \(e^1 = 2.718\)!

1.6. Modify the Euler code of Example 1.6 to numerically solve and plot five different realizations of the (Ornstein–Uhlenbeck process) solutions of the SDE dS = −S dt + dW with initial conditions \(S_0 = 0\), 2, and 4. Perhaps integrate over 0 ≤ t ≤ 4.

2 4. What is the value of a forward contract on an asset when 1. and the bond rate is 4% per year? 2. Summarize your findings. Suppose that volatility corresponds to +25% and −20% per year. See that a portfolio of the asset and either one call option or three call options sold at this price are subject to fluctuations in the price of the asset that could result in the investor losing. Assume that the asset price moves up by 25% or down by 20% per year. Modify the Euler code of Example 1. The call option is offered for sale at $8. d. Assume that the asset price moves up by 25% or down by 20% per year. Quantitatively investigate convergence as time step h → 0 at time t = 1 for a range of realizations of exponential Brownian motion (use relative differences). the rate of interest for bonds is 10%.10 of the valuation of a forward contract on Telecom shares. the agreed price after a year is $57. 1. the rate of interest for bonds is 10%. √ 2.10. Use a one-period binomial model to estimate that the hedge ratio H = 2 and the fair value of the call option is $5. its current value is $60.13. the exercise price of the call option is also $35. The exercise price of a call option on this asset is also $50. Use a one-period binomial model to estimate the hedge ratio and the fair value of the call option when the initial asset price is $50. dX = 1 X + 1 + X2 dt + 1 + X2 dW with X0 = 0 . An asset is initially priced at $50.11. dX = dt + 2 X dW with X0 = 1 . and that the bond interest rate is 12% per year. its current value is $35. 1.9 to be correctly valued at $4 ? 1.Exercises 35 1.57 when the initial asset price is $35. 1.15.8. Use a one-period binomial model to estimate the interest rate for bonds that would make this the correct value for the call option. and the period is one year. dZ = Z3dt − Z2dW with Z(0) = 1 .14. What would be the corresponding factors u. Reconsider Example 1. 2 5. dX = 1+t + (1 + t)2 dt + (1 + t)2dW with X0 = 0 and X0 = 1 .7. 
and R for each monthly step of a twelve-step monthly binomial lattice model? . 1. Assume that the asset price moves up or down by 25% per year. 3.6 to numerically solve and plot five different realizations of the solutions of the following SDEs: 2X 1. What bond rate would cause the current value of the forward contract to be precisely $1? 1. 1. What interest rate for bonds would be needed for the call option of Example 1. the exercise price of the call option is also $50. and the period is one year. and the bond rate is 5% per year? 1.9. the agreed price after two years is also $35. dX = − 1 e−2Xdt + e−XdW with X0 = 0 .12.

1.16. Modify the MATLAB/SCILAB code of Algorithm 1.6 to use an n-step binomial lattice for estimating the value of the call option (where n is a supplied parameter). Show that it is not until n ≥ 512 (or thereabouts) that the estimated initial value of the call option converges to $3.40 to the nearest cent. Use this modified code to refine the value of the call option in the situations described in Exercises 1.11–1.13.

Definition 1.13. A European put option gives the buyer (of the option) the right but not the obligation to sell an asset at a previously agreed strike price on a particular date.

1.17. Alice buys a put option from Bob (see Definition 1.13), strike price X = $57, for some fair price P which you are to determine by arguments similar to those employed for call options. Develop one step of a binomial model to estimate the value of the put option.
1. First, show that at the expiry of the put option, when the asset has price S, the put option has value P = max{0, X − S} by considering the cases X < S and X > S.
2. Second, at the start of a year Alice constructs a portfolio of one bought put option, value P, and hedges by buying φ units of the asset (selling if φ is negative), each of price S = $60. Explain the value of the portfolio at the end of the year if the asset goes up in price by 25%, and if the asset goes down in price by 20%. Deduce the risk-free hedge ratio φ = 1/3.
3. Lastly, assume bonds increase in value by 4⅙% over the year; use the principle of arbitrage to show that the fair price for the option is P = $4.

1.18. Alice buys a put option from Bob (see Definition 1.13), strike price X = $39, for some fair price P which you are to determine by arguments similar to those employed for call options. Develop one step of a binomial model estimating the value of the put option.
1. First, show that at the expiry of the put option, when the asset has price S, the put option has value P = max{0, X − S} by considering the cases X < S and X > S.
2. Second, at the start of a year Alice constructs a portfolio of one bought put option, value P, and hedges by buying φ units of the asset (selling if φ is negative), each of price S = $60. Explain the value of the portfolio at the end of the year if the asset may either double or halve in value. Deduce the risk-free hedge ratio φ.
3. Lastly, assume inflation is rampant and that bonds increase in value by 20% over the year; use the principle of arbitrage to determine the fair price P for the option.

1.19. Alice buys a put option from Bob (see Definition 1.13), strike price X, for some fair price P which you are to determine by arguments similar to those employed for call options. Investigate one step of a binomial model for determining the value of the put option.

1. First, argue that at the expiry of the put option when the asset has price S, the option has value P = max{0, X − S} by considering the cases X < S and X > S. See that a put option, unlike a call option, becomes more valuable the lower the asset price.
2. Second, at the start of a time interval Alice constructs a portfolio of one bought put option, value P, and hedges with φ units of the asset, each of price S. At the end of the time interval the value of the portfolio will be \(P_u + \varphi Su\) if the asset goes up in price by a factor of u and \(P_d + \varphi Sd\) if the asset goes down in price by the factor d. Argue that for a risk-free result the hedge ratio \(\varphi = (P_d - P_u)/(Su - Sd)\).
3. Lastly, argue that if such a risk-free portfolio is to give the same return as bonds, which increase in value by a factor of R over the interval, then the value of the put option at the start of the interval is the same expression as (1.8), namely
\[ P = \frac{1}{R}\left[ \frac{R - d}{u - d}P_u + \frac{u - R}{u - d}P_d \right] . \]
4. Hence estimate a fair value of a put option when the initial asset price is $50, the exercise price of the put option is also $50, the rate of interest for bonds is 10%, and the period is one year. Assume that the asset price moves up by 25% or down by 20% per year.
5. Revise the MATLAB/SCILAB code of Exercise 1.16 to estimate the value of the above put option using an n = 4, 32, 128, and 512 step binomial lattice.

1.20. Clearly explain the crucial features of a Wiener process that empower us to model noisy, fluctuating dynamics. Explain how and why a Wiener process is transformed to model general noisy, fluctuating signals.

Figure 1.18. Japanese Nikkei Index (yen), S(t), as a function of year, time t, from 1992 to 2006.

3.03. stock volatility β ≈ 21% per 1. √ year. 1.58.18. 1. respectively. Approximately 22%. Stochastic fluctuations ruin the expected growth. Hedge ratio φ = 1/10 .25 . 2. That is. a player moves up 16% of the time. 3. an interest rate of 12.19. 1. that is. 1. 1. $2.89. 1. $5.12.1324 .18. $2.83 and $6. stock volatility β ≈ 13% per √ year. $2. 1.34 . 4. Stock drift α ≈ 2. 1.13. See Figure 1. .38 Chapter 1. $4. 1.10.31. R = 1.24%. R = 1.16.14. and $2. 1. the rest go bankrupt.36% over the period.1. $3. price P = $4 .19.34. 5. Stock drift α ≈ 15% per year. Only about 1 in 3 reach millionairedom. Financial Indices Appear to Be Stochastic Processes Answers to selected exercises 1. 2.4.1236 .07. $7. 1. On average.2% per year.9. 13. that is. we could make a business case that shows 16% growth per year. $3.

Chapter 2
Ito's Stochastic Calculus Introduced

Contents
2.1 Multiplicative noise reduces exponential growth
    2.1.1 Linear growth with noise
    2.1.2 Exponential Brownian motion
2.2 Ito's formula solves some SDEs
    2.2.1 Simple Ito's formula
    2.2.2 Ito's formula
2.3 The Black–Scholes equation prices options accurately
    2.3.1 Discretizations form a trinomial model
    2.3.2 Self-financing portfolios
2.4 Summary
Exercises
Answers to selected exercises

This chapter investigates how to manipulate symbolically SDEs. The key is the discovery of a stochastic form of the chain rule for differentiation called Ito's formula. The formula is proved in Chapter 4 via a careful definition of stochastic integration. Here we use Ito's formula to derive the Black–Scholes PDE that values financial options.

2.1 Multiplicative noise reduces exponential growth

We first explain the solution of two simple SDEs. The second one is the solution for exponential Brownian motion and introduces the key ingredient we need for the description and development of Ito's formula. We discover in the algebraic solutions of the SDEs features which are also discerned in the numerical solutions.

2.1.1 Linear growth with noise

Consider the constant coefficient SDE dX = μ dt + σ dW; in this short section the drift μ and volatility σ are constants. As we interpreted before, when discretized with time steps \(\Delta t_j = t_{j+1} - t_j\), which we usually take to be the constant time step h, we interpret this SDE to mean
\[ \Delta X_j = \mu\,\Delta t_j + \sigma\,\Delta W_j . \]
Now sum this discretization over n steps starting from time \(t_0 = 0\) when \(W_0 = 0\):
\[ \sum_{j=0}^{n-1} (X_{j+1} - X_j) = \sum_{j=0}^{n-1} \mu(t_{j+1} - t_j) + \sum_{j=0}^{n-1} \sigma(W_{j+1} - W_j) . \]
Because the drift μ and volatility σ are constant, there is massive cancellation in the terms of this equation, leading to
\begin{align*}
& X_n - X_0 = \mu(t_n - t_0) + \sigma(W_n - W_0) \\
\Rightarrow\quad & X_n = X_0 + \mu t_n + \sigma W_n \quad\text{as } W_0 = t_0 = 0 \\
\Rightarrow\quad & X(t) = X_0 + \mu t + \sigma W(t) \quad\text{in the limit as } \max_j \Delta t_j \to 0 .
\end{align*}
We have algebraically solved our first SDE!

2.1.2 Exponential Brownian motion

The linear constant coefficient SDE is rather trivial. Proceed now to consider the exponential Brownian motion SDE that, for example, is claimed to govern stock prices: dX = αX dt + βX dW. Numerically, this SDE means that evaluating the right-hand side at the current time forms a numerical approximation:
\[ \Delta X_j = \alpha X_j\,\Delta t_j + \beta X_j\,\Delta W_j . \]
Divide by \(X_j\) to lead to a form with a right-hand side seen before:
\[ \frac{\Delta X_j}{X_j} = \alpha\,\Delta t_j + \beta\,\Delta W_j . \tag{2.1} \]
The right-hand side looks like the constant drift and volatility that we summed so successfully before, but the left-hand side is problematic. However, recall that the derivative of log x is 1/x, matching the \(1/X_j\) in the left-hand side above, so perhaps differences of log X will lead to something useful.
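Before any algebra, a small numerical experiment can compare the two candidate quantities: the summed relative increments \(\sum_j \Delta X_j / X_j\) and the change in log X along the same sample path. This sketch is not from the text; the function name compare_sums, the seed, the step count, and the choice α = β = 1 are all assumptions made purely for illustration.

```python
import math
import random

def compare_sums(alpha, beta, T, n, seed):
    """Step dX = alpha*X dt + beta*X dW with the discretization above.

    Returns (sum of dX_j / X_j, log X_n - log X_0) along one sample path,
    starting from X_0 = 1 so that log X_0 = 0.
    """
    rng = random.Random(seed)
    h = T / n                                  # constant time step
    x = 1.0
    relative_sum = 0.0
    for _ in range(n):
        dW = rng.gauss(0.0, math.sqrt(h))      # Wiener increment ~ N(0, h)
        dx = alpha * x * h + beta * x * dW     # discretized SDE
        relative_sum += dx / x                 # left-hand side of (2.1)
        x += dx
    return relative_sum, math.log(x)

rel, log_change = compare_sums(alpha=1.0, beta=1.0, T=1.0, n=20000, seed=0)
gap = rel - log_change
print(f"sum of dX/X = {rel:.4f}, change in log X = {log_change:.4f}, gap = {gap:.4f}")
```

The two sums do not agree: for small steps the gap settles near \(\tfrac12\beta^2 T\) rather than zero, so differences of log X are indeed useful but carry a correction that a careful expansion has to account for.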

$$= \log\left(1 + \frac{\Delta X_j}{X_j}\right)$$
$$= \frac{\Delta X_j}{X_j} - \frac12\left(\frac{\Delta X_j}{X_j}\right)^2 + \cdots \quad \text{by Taylor series of } \log(1+x)$$
$$= \alpha\,\Delta t_j + \beta\,\Delta W_j - \tfrac12\left(\alpha\,\Delta t_j + \beta\,\Delta W_j\right)^2 + \cdots \quad \text{by the SDE (2.1)}$$
$$= \alpha\,\Delta t_j + \beta\,\Delta W_j - \tfrac12\alpha^2\Delta t_j^2 - \alpha\beta\,\Delta t_j\,\Delta W_j - \tfrac12\beta^2\Delta W_j^2 + \cdots.$$

Here and later the ellipsis $\cdots$ denotes the higher order terms in the Taylor series. Now sum the right- and left-hand sides (using $t_0 = W_0 = 0$), where for simplicity we assume the time step is constant $h$; then

$$\log X_n - \log X_0 = \alpha t_n + \beta W_n - \tfrac12\alpha^2 h t_n - \alpha\beta h W_n - \tfrac12\beta^2 t_n + \cdots,$$

where we have magically simplified the form of the quadratic terms, a crucial step which we justify a little later. Now, taking the limit as the time step $h \to 0$ and assuming the higher order terms $\to 0$ in this limit, the above expression becomes

$$\log X(t) - \log X_0 = \alpha t + \beta W(t) - \tfrac12\beta^2 t,$$

which rearranges into the remarkable analytic solution

$$X(t) = X_0 \exp\left[(\alpha - \tfrac12\beta^2)t + \beta W(t)\right]. \tag{2.2}$$

See the astonishing feature, which we saw in the numerical solutions of Figure 1.10, that noise, parametrized by $\beta$, may act to stabilize what would otherwise exponentially grow by apparently reducing the growth rate from $\alpha$ to $\alpha - \tfrac12\beta^2$.

Example 2.1. The ODE $dX/dt = X$ has growing solutions $X = X_0 e^t$, but the same ODE with large enough multiplicative noise, say $dX = X\,dt + 2X\,dW$, has solutions (2.2) of $X = X_0 \exp[-t + 2W(t)]$. Since $W(t)$ only grows like $\sqrt t$, the dominant term in the argument of the exponential function is the $-t$, which shows that almost surely all solutions of the SDE decay to zero! This almost sure decay is supported by a histogram, Figure 2.1, of the values of $X(1)$ over many realizations. That there are a few realizations of $X(1)$ which are large is significant and will be discussed later.

Now look at the detailed justification of the transformations of the three quadratic terms used above (note the time steps are the constant $h$).

• First, $\sum_{j=0}^{n-1} \Delta t_j^2 = \sum_{j=0}^{n-1} h^2 = nh^2 = h t_n \to 0$ as $h \to 0$.

• Second, $\sum_{j=0}^{n-1} \Delta t_j\,\Delta W_j = h \sum_{j=0}^{n-1} \Delta W_j = h W_n \sim N(0, h^2 t_n)$, and hence almost surely $\to 0$ as $h \to 0$.

• Lastly, and much more interestingly,

$$\sum_{j=0}^{n-1} \Delta W_j^2 = \sum_{j=0}^{n-1} \left(\sqrt h\, Z_j\right)^2 = h \sum_{j=0}^{n-1} Z_j^2 = h \sum_{j=0}^{n-1} 1 + h \sum_{j=0}^{n-1} (Z_j^2 - 1) = t_n + Y,$$

where $Y = h \sum_{j=0}^{n-1} (Z_j^2 - 1)$ for some random variables $Z_j \sim N(0, 1)$. Now, as derived in Exercise 2.1, $E[Z^p] = (p-1)(p-3)\cdots 1$ for even $p$, and thus $E[Y] = h \sum_{j=0}^{n-1} E[Z_j^2 - 1] = hn\left(E[Z^2] - 1\right) = 0$. Hence the value of $\sum_{j=0}^{n-1} \Delta W_j^2$ averages to $t_n$, as asserted earlier. But the sum may fluctuate wildly due to $Y$; we now argue that its fluctuations are insignificant by showing that the variance of $Y$ vanishes as the time step $h \to 0$:

$$\begin{aligned}
\operatorname{Var}[Y] &= h^2 \sum_{j=0}^{n-1} \operatorname{Var}[Z_j^2 - 1] && \text{as the } Z_j \text{ are independent} \\
&= h^2 n \operatorname{Var}[Z^2 - 1] && \text{as they are distributed identically} \\
&= h^2 n\, E[(Z^2 - 1)^2] && \text{as } E[Z^2 - 1] = 0 \\
&= h^2 n\, E[Z^4 - 2Z^2 + 1] \\
&= h^2 n\,[3 - 2 + 1] && \text{using expectation of even powers of } Z \\
&= 2h^2 n = 2h t_n \to 0 && \text{as } h \to 0 \text{ for all } t_n.
\end{aligned}$$

Because the variance of $Y$ tends to 0 and $E[Y] = 0$, then $Y \to 0$ almost surely as $h \to 0$. Consequently $\sum_{j=0}^{n-1} \Delta W_j^2 \to t_n$ almost surely. Since $\sum_{j=0}^{n-1} \Delta W_j^2$ is almost surely $\sum_{j=0}^{n-1} \Delta t_j$, this is as if “$\Delta W_j^2 = \Delta t_j$”¹⁷ which suggests the novel symbolic rule introduced below.

¹⁷ This last point is perhaps not too surprising because we know $E[\Delta W_j^2] = \Delta t_j\, E[(\Delta W_j/\sqrt{\Delta t_j})^2] = \Delta t_j$. But, importantly, we have shown that the fluctuations about this expectation are negligible in the limit of continuous time.

Figure 2.1. Histogram (count against $X(1)$) of 300 realizations of $X(1)$ from the SDE $dX = X\,dt + 2X\,dW$ with $X(0) = 1$ (Figure 1.10 plots five realizations in time). This histogram shows most realizations decaying to zero, although some large excursions (shown gathered at $X(1) = 5$) significantly affect the statistics.

Summary
For stochastic calculus, the following symbolic rules for infinitesimals effectively apply: $dt^2 = 0$ and $dt\,dW = 0$ (as for ordinary calculus), but that surprisingly $dW^2 = dt$. These symbolic rules will be used extensively.

2.2 Ito’s formula solves some SDEs

Recall that when you were first introduced to integration in your earlier courses it was presented as antidifferentiation. That is, you used differentiation rules to infer integration formulae. The same is true for the solution of some SDEs that we develop in this section. The basic rule of stochastic differentiation is Ito’s formula with the identity $dW^2 = dt$ at its heart. We first present a simple version of Ito’s formula and then present the full version.¹⁸

¹⁸ Chapters 5 and 6 of the book by Stampfli and Goodman (2001) provide appropriate reading to supplement this section. They derive, for example, that $X(t) = X_0 \exp[(\alpha - \tfrac12\beta^2)t + \beta W(t)]$ solves the financial SDE $dX = \alpha X\,dt + \beta X\,dW$.

2.2.1 Simple Ito’s formula

Let $f(t, w)$ be a smooth function of two arguments; smooth means that it is differentiable at least several times and that Taylor’s theorem applies. Figures 2.2 and 2.3 show two example surfaces of two such smooth functions. Then consider the Ito process

$X(t) = f(t, W(t))$, where $W(t)$ is a Wiener process; see the black curves wiggling across the surfaces in Figures 2.2 and 2.3. We explore these examples later: $f(t, w) = (t + w)^2$, whence $X = (t + W(t))^2$, and $f(t, w) = \exp(at + bw)$, whence $X = \exp(at + bW(t))$. Note that $f$ itself is smooth; the stochastic part of $X(t)$ comes only via the use of the Wiener process in the evaluation of $f$.

Figure 2.2. The smooth surface $f = (t + w)^2$ with one realization of a Wiener process $w = W(t)$ evolving on the surface: the Ito process $X = f(t, W(t))$ is the height of the curve as a function of time and changes due to direct evolution in time and through evolution of $w$.

Figure 2.3. The smooth surface $f = 0.3\exp(t + 2w)$ with one realization of a Wiener process $w = W(t)$ evolving on the surface: the Ito process $X = f(t, W(t))$ is the height of the curve as a function of time and changes due to direct evolution in time and through evolution of $w$.

Now consider the change in $X$ that occurs over some small change in time, $\Delta t$, through the direct dependence upon $t$ and indirectly through the dependence upon the Wiener process $W$. Denoting $X(t + \Delta t) = X + \Delta X$ and $W(t + \Delta t) = W + \Delta W$, observe (using a multivariable Taylor series of $f$) that

$$\begin{aligned}
X + \Delta X &= f(t + \Delta t, W + \Delta W) \\
&= f(t, W) + \frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial w}\Delta W + \frac12\frac{\partial^2 f}{\partial t^2}\Delta t^2 + \frac{\partial^2 f}{\partial t\,\partial w}\Delta t\,\Delta W + \frac12\frac{\partial^2 f}{\partial w^2}\Delta W^2 + \cdots \\
&\approx X + \frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial w}\Delta W + \frac12\frac{\partial^2 f}{\partial w^2}\Delta t,
\end{aligned}$$

as by the earlier rules for differentials $\Delta t^2 = \Delta t\,\Delta W = \cdots = 0$ and $\Delta W^2 = \Delta t$. In the limit as $\Delta t \to 0$ the differences become differentials, and thus

$$dX = \left(\frac{\partial f}{\partial t} + \frac12\frac{\partial^2 f}{\partial w^2}\right)dt + \frac{\partial f}{\partial w}\,dW, \tag{2.3}$$

where the partial derivatives are evaluated at $(t, W)$. This is the simple version of Ito’s formula and gives the differential of an Ito process $X$ which depends directly and smoothly upon a Wiener process.

Example 2.2. Determine the differential of $X(t) = (t + W(t))^2$, and hence deduce an SDE which $X(t)$ satisfies.

Solution: Here $f(t, w) = (t + w)^2$ (see Figure 2.2), so $f_t = f_w = 2(t + w)$ and $f_{ww} = 2$; thus

$$dX = [2(t + W) + 1]\,dt + 2(t + W)\,dW.$$

Recognizing $t + W = \sqrt X$, rewrite this as the SDE¹⁹

$$dX = \left[2\sqrt X + 1\right]dt + 2\sqrt X\,dW.$$

¹⁹ Both of these expressions of the stochastic dynamics are correct. Prefer the second for many purposes as the right-hand side has no occurrences of the Wiener process $W(t)$ except for the differential $dW$. In applications where SDEs arise, we normally formulate an SDE model in such a form where the only direct appearance of “noise” is in the differential $dW$. Hence prefer this second SDE.

Example 2.3. Determine the differential of $X(t) = c\exp[at + bW(t)]$, and hence solve the SDE $dX = \alpha X\,dt + \beta X\,dW$.

Solution: Here $f(t, w) = c\,e^{at+bw}$ (see Figure 2.3), whence $f_t = ac\,e^{at+bw}$, $f_w = bc\,e^{at+bw}$ and $f_{ww} = b^2c\,e^{at+bw}$. Thus Ito’s formula asserts

$$\begin{aligned}
dX &= \left(ac\,e^{at+bW} + \tfrac12 b^2 c\,e^{at+bW}\right)dt + bc\,e^{at+bW}\,dW \\
&= c\,e^{at+bW}\left[\left(a + \tfrac12 b^2\right)dt + b\,dW\right].
\end{aligned}$$

Rewritten as $dX = (a + \tfrac12 b^2)X\,dt + bX\,dW$ this is the same as the given SDE provided $\alpha = a + \tfrac12 b^2$ and $\beta = b$. Thus it is the solution of the SDE provided $b = \beta$ and

$a = \alpha - \tfrac12\beta^2$, and hence the solution is

$$X(t) = c\exp\left[(\alpha - \tfrac12\beta^2)t + \beta W(t)\right],$$

as discussed in Section 2.1.2.

Example 2.4. Derive the drift and volatility of the stochastic process $Y = W(t)^3$; that is, express in terms of $Y$ only.

Solution: Here $Y = f(t, W)$, where $f(t, w) = w^3$ so that $f_t = 0$, $f_w = 3w^2$ and $f_{ww} = 6w$. Thus Ito’s formula asserts

$$\begin{aligned}
dY &= 0\,dt + 3W^2\,dW + \tfrac12\,6W\,dW^2 \\
&= 3W\,dt + 3W^2\,dW \\
&= 3Y^{1/3}\,dt + 3Y^{2/3}\,dW.
\end{aligned}$$

Hence the stochastic process $Y$ has drift $\mu = 3Y^{1/3}$ and volatility $\sigma = 3Y^{2/3}$.

2.2.2 Ito’s formula

The simple form (2.3) of Ito’s formula rests upon the Taylor series of $f(t, w)$ and the rule that the only quadratic differential to retain is $dW^2 = dt$. The general form of Ito’s formula, sometimes referred to modestly as Ito’s lemma, is the same and addresses the differential of a function of a stochastic process in general rather than just a function of a Wiener process.

Theorem 2.5 (Ito’s formula). Let $f(t, x)$ be a smooth function of its arguments and $X$ be an Ito process with drift $\mu(t, X)$ and volatility $\sigma(t, X)$, that is, $dX = \mu\,dt + \sigma\,dW$; then $Y(t) = f(t, X(t))$ is also an Ito process with differential

$$dY = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX + \frac12\frac{\partial^2 f}{\partial x^2}\,dX^2, \tag{2.4}$$

where in $dX^2$ retain only $dW^2 \to dt$ and the partial derivatives are evaluated at $(t, X)$. Equivalently, expanding expression (2.4) gives

$$dY = \left[\frac{\partial f}{\partial t} + \mu\frac{\partial f}{\partial x} + \frac12\sigma^2\frac{\partial^2 f}{\partial x^2}\right]dt + \sigma\frac{\partial f}{\partial x}\,dW. \tag{2.5}$$

Although the formula (2.5) is more explicit, we believe (2.4) is a more memorable form of Ito’s formula. The rigorous proof of this formula requires a well-defined stochastic integral and so is deferred to Chapter 4.

Example 2.6. Example 2.2 shows that if $X(t) = (t + W(t))^2$, then $dX = [2\sqrt X + 1]\,dt + 2\sqrt X\,dW$; hence deduce $dY$ for $Y = e^X$.

Solution: Use Ito’s formula with $f(t, x) = e^x$ for which $f_t = 0$ and $f_x = f_{xx} = e^x$:

$$\begin{aligned}
dY &= f_x\,dX + \tfrac12 f_{xx}\,dX^2 \\
&= e^X\left\{\left[2\sqrt X + 1\right]dt + 2\sqrt X\,dW\right\} + \tfrac12 e^X\left\{\left[2\sqrt X + 1\right]dt + 2\sqrt X\,dW\right\}^2 \\
&= e^X\left\{\left[2\sqrt X + 1\right]dt + 2\sqrt X\,dW + \tfrac12\,4X\,dW^2\right\} \\
&= e^X\left\{\left[1 + 2\sqrt X + 2X\right]dt + 2\sqrt X\,dW\right\}.
\end{aligned}$$

Example 2.7 (product rule). Recall that in ordinary calculus the product rule for differentiation is $d(fg) = f\,dg + g\,df$. In stochastic calculus there is an extra term. Let $X(t)$ and $Y(t)$ be stochastic processes with differentials $dX = \mu\,dt + \sigma\,dW$ and $dY = \nu\,dt + \rho\,dW$. Then one may argue, using our symbolic rules, that

$$d(XY) = (X + dX)(Y + dY) - XY = X\,dY + Y\,dX + dX\,dY = X\,dY + Y\,dX + \sigma\rho\,dt$$

by retaining only the $dW^2 = dt$ term in the product $dX\,dY$. This is indeed correct, but now deduce it using Ito’s formula.

Solution: The difficulty is that the product $XY$ is a function of two Ito processes, whereas Ito’s formula applies only to a function of one Ito process. However, write $XY = \tfrac12[Z^2 - X^2 - Y^2]$, where $Z(t) = X + Y$ is a new stochastic process that, because it is simply the sum of $X$ and $Y$, has differential $dZ = (\mu + \nu)\,dt + (\sigma + \rho)\,dW$. Then use Ito’s formula separately on each of the components in the right-hand side of $d(XY) = \tfrac12[d(Z^2) - d(X^2) - d(Y^2)]$:

$$d(X^2) = 0\,dt + 2X\,dX + \tfrac12\,2\,dX^2 = (2X\mu + \sigma^2)\,dt + 2X\sigma\,dW;$$

similarly $d(Y^2) = (2Y\nu + \rho^2)\,dt + 2Y\rho\,dW$, and $d(Z^2) = \left[2Z(\mu + \nu) + (\sigma + \rho)^2\right]dt + 2Z(\sigma + \rho)\,dW$. Substitute these, with $Z = X + Y$, into

$$\begin{aligned}
d(XY) &= \tfrac12\left[d(Z^2) - d(X^2) - d(Y^2)\right] \\
&= \tfrac12\big[(2X\mu + 2Y\mu + 2X\nu + 2Y\nu + \sigma^2 + 2\sigma\rho + \rho^2)\,dt + (2X\sigma + 2Y\sigma + 2X\rho + 2Y\rho)\,dW \\
&\qquad - (2X\mu + \sigma^2)\,dt - 2X\sigma\,dW - (2Y\nu + \rho^2)\,dt - 2Y\rho\,dW\big] \\
&= X(\nu\,dt + \rho\,dW) + Y(\mu\,dt + \sigma\,dW) + \sigma\rho\,dt \\
&= X\,dY + Y\,dX + \sigma\rho\,dt
\end{aligned}$$

as required.
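The extra term in the product rule can also be seen numerically. The following Python sketch is our own illustration rather than part of the derivation: it assumes constant coefficients $\mu, \sigma, \nu, \rho$, and the function name, parameter values, and seed are all hypothetical choices. Along one discretized Wiener path it accumulates the ordinary part $X\,\Delta Y + Y\,\Delta X$ and the cross part $\Delta X\,\Delta Y$ separately; the cross part should be close to $\sigma\rho T$.

```python
import math
import random


def product_rule_check(mu=0.3, sigma=0.5, nu=-0.2, rho=0.8, T=1.0, n=20000, seed=1):
    """Accumulate the pieces of d(XY) along one discretized Wiener path.

    Assumes constant drift and volatility for both processes (an illustrative
    choice). Returns (change in XY, sum of X dY + Y dX, sum of dX dY, sigma*rho*T).
    """
    rng = random.Random(seed)
    h = T / n
    x, y = 1.0, 2.0              # arbitrary starting values
    xy_start = x * y
    ordinary = 0.0               # the part given by the ordinary product rule
    cross = 0.0                  # the extra stochastic part, sum of dX dY
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))    # Wiener increment ~ N(0, h)
        dx = mu * h + sigma * dw
        dy = nu * h + rho * dw
        ordinary += x * dy + y * dx
        cross += dx * dy
        x += dx
        y += dy
    return x * y - xy_start, ordinary, cross, sigma * rho * T


if __name__ == "__main__":
    total, ordinary, cross, expected = product_rule_check()
    print("change in XY        :", total)
    print("ordinary + cross    :", ordinary + cross)
    print("cross term, expected:", cross, expected)
```

The change in $XY$ equals the sum of the two accumulated parts up to rounding, because $\Delta(XY) = X\,\Delta Y + Y\,\Delta X + \Delta X\,\Delta Y$ holds exactly at each step; only the identification of the cross part with $\sigma\rho\,dt$ relies on $dW^2 = dt$.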

Summary
Ito’s formula is a form of chain rule for stochastic processes to empower all stochastic calculus. It is the same as the deterministic chain rule except for the crucial addition of a quadratic differential and the recognition that “$dW^2 = dt$”: if $Y = f(t, X)$, then the differential $dY = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX + \frac12\frac{\partial^2 f}{\partial x^2}\,dX^2$.

Suggested activity: Do at least Exercise 2.10.

2.3 The Black–Scholes equation prices options accurately

One application of Ito’s formula is to pricing options based on an asset. This section develops the analysis that Section 1.4 began using binomial lattices to price call options. Recall that the argument is first to form a portfolio of the asset, together with call options, which is risk free, then to sensibly require this risk-free portfolio to give the same return as investing in risk-free bonds. We adopt the same line of argument.

Recall that the stochastic model of an asset’s value is that its differential $dS = \alpha S\,dt + \beta S\,dW$. Recall that the value of a call option varies smoothly in both time $t$ and the asset price $S$, as seen in Figure 2.4. Consequently, in any realization, the derivatives $\frac{\partial C}{\partial t}$, $\frac{\partial C}{\partial S}$, and $\frac{\partial^2 C}{\partial S^2}$ are all well defined and smoothly varying. Hence, through the dependence upon the Ito process $S(t)$, the realization of the asset value, the value of a call option is an Ito process, say $C(t) = C(t, S(t))$. Hence Ito’s formula tells us how the option value $C$ fluctuates in time through the fluctuating changes in asset value.

Figure 2.4. Value $C(t, S)$ of the call option for Example 1.9 demonstrating that the value $C$ is a smooth function of time $t$ and asset price $S$.
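The asset model $dS = \alpha S\,dt + \beta S\,dW$ recalled above can be explored numerically before it is put to work. The Python sketch below is only an illustration under assumed values: the function name, the parameters $\alpha = 0.1$ and $\beta = 0.2$, the step count, and the seed are our own choices. It steps the discretization $\Delta S_j = \alpha S_j\,\Delta t_j + \beta S_j\,\Delta W_j$ along one Wiener path and compares the result with the analytic solution (2.2) evaluated on the same path.

```python
import math
import random


def simulate_asset(alpha=0.1, beta=0.2, s0=1.0, T=1.0, n=4000, seed=3):
    """Return (discretized S(T), analytic S(T) from (2.2)) on one Wiener path.

    The parameter values are illustrative assumptions, not taken from the text.
    """
    rng = random.Random(seed)
    h = T / n
    s = s0
    w = 0.0
    for _ in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))     # Wiener increment ~ N(0, h)
        s += alpha * s * h + beta * s * dw    # step of dS = alpha S dt + beta S dW
        w += dw                               # accumulate W(T) for the analytic form
    exact = s0 * math.exp((alpha - 0.5 * beta ** 2) * T + beta * w)
    return s, exact


if __name__ == "__main__":
    approx, exact = simulate_asset()
    print("discretized S(T):", approx)
    print("analytic S(T)   :", exact)
```

Dropping the $-\tfrac12\beta^2$ correction in the exponent visibly spoils the agreement, which is one way to convince oneself that the correction is needed.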


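Now find a hypothetical, risk-free portfolio that has the same return as bonds. Construct a portfolio of one call option sold and some number $\phi$ (phi) of units of the asset. In Section 1.4 we held one asset and sought the appropriate number of call options; here it is preferable to do the complement, with $\phi$ as the reciprocal of the earlier $H$. This portfolio has a value $\Pi = -C(t, S) + \phi S$, which is itself also an Ito process as it is a function of the stochastic asset value $S$. Thus over a small time interval $dt$ the value of the portfolio changes by an amount deduced via Ito’s formula (2.4):²⁰

$$\begin{aligned}
d\Pi &= -dC + \phi\,dS && \text{as } \Pi = -C + \phi S \\
&= -\frac{\partial C}{\partial t}\,dt - \frac{\partial C}{\partial S}\,dS - \frac12\frac{\partial^2 C}{\partial S^2}\,dS^2 + \phi\,dS && \text{as } C = C(t, S) \text{ by Ito} \\
&= -\left[\frac{\partial C}{\partial t} + \alpha S\frac{\partial C}{\partial S} + \frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2}\right]dt - \beta S\frac{\partial C}{\partial S}\,dW + \phi\alpha S\,dt + \phi\beta S\,dW && \text{as } dS = \alpha S\,dt + \beta S\,dW \\
&= \left[-\frac{\partial C}{\partial t} + \alpha S\phi - \alpha S\frac{\partial C}{\partial S} - \frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2}\right]dt + \beta S\left[-\frac{\partial C}{\partial S} + \phi\right]dW.
\end{aligned}$$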

A portfolio is risk free when it has no stochastic fluctuations, that is, when it has zero volatility. Make the volatility, $\beta S\left[-\frac{\partial C}{\partial S} + \phi\right]$, of this portfolio zero by choosing a portfolio with $\phi = \frac{\partial C}{\partial S}$ units of the asset per sold call option.²¹

Setting $\phi = \frac{\partial C}{\partial S}$, the portfolio changes in price according to the residual drift term in $d\Pi$, namely,

$$d\Pi = \left[-\frac{\partial C}{\partial t} - \frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2}\right]dt.$$

Because this portfolio is risk free, its return must equal the return of investing the same in bonds. Given the value of the portfolio is $-C + \frac{\partial C}{\partial S}S$ and the interest rate $r$ (so that $R = e^{rt}$), the corresponding investment in bonds returns

$$r\left[-C + \frac{\partial C}{\partial S}S\right]dt = d\Pi = \left[-\frac{\partial C}{\partial t} - \frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2}\right]dt.$$

Equating the coefficients of $dt$ and rearranging leads to the Black–Scholes equation for the value $C(t, S)$ of the call option,

$$\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2} = rC. \tag{2.6}$$
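For readers who want the rearrangement spelled out, the following is a sketch of the omitted algebra, using only the risk-free condition that the bond return on the portfolio value equals the residual drift of $d\Pi$:

```latex
\begin{aligned}
r\left(-C + S\frac{\partial C}{\partial S}\right)
  &= -\frac{\partial C}{\partial t}
     - \frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2} \\
\Longleftrightarrow\quad
\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S}
  + \frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2}
  &= rC .
\end{aligned}
```

Observe that the drift $\alpha$ of the asset does not appear: with $\phi = \partial C/\partial S$ the two $\alpha$ terms in the drift of $d\Pi$ cancel, so only the volatility $\beta$ and the bond rate $r$ enter the equation.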

Involving derivatives in both time t and asset value S, this is a PDE for the option value C. The next section discusses numerical methods of solving the Black–Scholes PDE (2.6); such numerical methods relate closely to the binomial lattice model for valuing options.
²⁰ The first line of this derivation is justified properly by Section 2.3.2.

²¹ $\phi = \frac{\partial C}{\partial S}$ is analogous to the earlier hedge formula as $1/H = (C_u - C_d)/(S_u - S_d) \approx \Delta C/\Delta S$.


Example 2.8 (forward contract). The Black–Scholes PDE (2.6) straightforwardly values forward contracts to be exactly as in (1.6). To apply the valuation to forward contracts, solve the Black–Scholes PDE (2.6) with expiry value $C(T, S)$ chosen appropriately for a forward contract.²²

Solution: Recall from Section 1.4.1 that a forward contract is a binding agreement by Alice to sell to Bob an asset at an agreed price $X$ at some expiry time $T$. At expiry Bob values the forward contract at $C(T, S) = S - X$ because if the asset value $S(T) > X$, then Bob buys the asset more cheaply than the open market price, whereas if the asset value $S(T) < X$, then the contract commits Bob to buying the asset at a more expensive price than the open market price. This expiry value of $C(T, S) = S - X$ is linear in the asset value $S$. Being linear, it leads to a simple solution of the Black–Scholes PDE (2.6).

Seek solutions of the Black–Scholes PDE (2.6) that are linear in asset value $S$, but which have a general variation in time. That is, seek solutions in the form $C(t, S) = a(t)S + b(t)$. Later we will use the information that we know at expiry that $a(T) = 1$ and $b(T) = -X$ in order for the expiry value to be $C(T, S) = S - X$. Substitute $C(t, S) = a(t)S + b(t)$ into the Black–Scholes PDE (2.6) to deduce

$$\frac{da}{dt}S + \frac{db}{dt} + rSa + 0 = raS + rb.$$

The marvelous simplification here is that the second derivative term vanishes, which enables this case to be straightforward. Now this equation has to hold for all asset values $S$; thus equate the coefficients of the terms in $S$ and the terms constant in $S$ to determine

$$\frac{da}{dt} = 0 \quad\text{and}\quad \frac{db}{dt} = rb.$$

• Since $da/dt = 0$, $a(t)$ must be constant, and since at expiry $a(T) = 1$, then $a(t) = 1$ for all time.

• Since $db/dt = rb$, then $b(t) = \text{constant} \times e^{rt}$. The expiry condition that $b(T) = -X$ determines the constant to be $-Xe^{-rT}$ so that $b(t) = -Xe^{r(t-T)}$. Write this as $b(t) = -X/e^{r(T-t)}$, as $T - t$ is the time remaining to expiry.

That is, the value of the forward contract for all time up to the expiry time $T$ and for all asset values $S$ is $C(t, S) = S - X/e^{r(T-t)}$. Consequently, at the start of the time period, $t = 0$, Alice values the forward contract as $C(0, S_0) = S_0 - X/e^{rT}$. Recalling $R = e^{rT}$, this valuation is identical to the earlier (1.6).

More general algebraic solutions may be found for the Black–Scholes equation (2.6) when applied to some options. However, we do not explore this here, as it is more valuable to qualitatively check numerical solutions of complicated practical options.
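A quick numerical check of this forward-contract solution is possible with finite differences. The Python sketch below is an illustration only: the helper names are hypothetical, and the strike, interest rate, volatility, and one-year expiry are assumed values chosen for convenience (they happen to match those used later in Example 2.10). It evaluates the residual of the Black–Scholes equation (2.6) at a point, which should be zero up to truncation and rounding error.

```python
import math


def forward_value(t, s, strike=38.5, r=math.log(1.12), expiry=1.0):
    """Forward-contract value C(t,S) = S - X exp(-r (T - t)) from Example 2.8.

    strike, r and expiry are illustrative assumptions.
    """
    return s - strike * math.exp(-r * (expiry - t))


def black_scholes_residual(c, t, s, r, beta, dt=1e-4, ds=1e-2):
    """Centered-difference estimate of C_t + r S C_S + (1/2) beta^2 S^2 C_SS - r C."""
    c_t = (c(t + dt, s) - c(t - dt, s)) / (2 * dt)
    c_s = (c(t, s + ds) - c(t, s - ds)) / (2 * ds)
    c_ss = (c(t, s + ds) - 2 * c(t, s) + c(t, s - ds)) / ds ** 2
    return c_t + r * s * c_s + 0.5 * beta ** 2 * s ** 2 * c_ss - r * c(t, s)


if __name__ == "__main__":
    r, beta = math.log(1.12), math.log(1.25)
    print("value at t=0, S=35:", forward_value(0.0, 35.0))
    print("PDE residual      :", black_scholes_residual(forward_value, 0.0, 35.0, r, beta))
```

Because the solution is linear in $S$, the estimated second derivative is zero apart from rounding, so the volatility passed to the residual makes no practical difference; this mirrors the remark that the second derivative term vanishes.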

Interpret terms to graphically solve
Before attempting to quantitatively solve such a PDE, we must learn to interpret the effects of the various terms that appear; the interpretations are analogues of those discussed in classical one-dimensional continuum mechanics; see, e.g., Roberts (1994). The following interpretations of its mathematical symbols empower us to qualitatively solve the Black–Scholes equation (2.6).

• First, the term $rS\frac{\partial C}{\partial S}$ is an “advection” term, as it carries information in $S$-space. For example, in the absence of the other terms, the equation $\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} = 0$ asserts that on characteristics $dS/dt = rS$, that is, on curves $S = \text{constant} \times e^{rt}$,

$$\begin{aligned}
\frac{dC}{dt} &= \frac{\partial C}{\partial t} + \frac{\partial C}{\partial S}\frac{dS}{dt} && \text{by the chain rule} \\
&= \frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} && \text{as } dS/dt = rS \text{ on each curve} \\
&= 0 && \text{by the equation.}
\end{aligned}$$

Thus on the characteristics $S \propto e^{rt}$ we would find that $C$ is constant; in effect the value $C$ is carried along these characteristic curves. These characteristics are those corresponding to the value of an investment in bonds.

• Second, the term $rC$ on the right-hand side of the Black–Scholes equation (2.6) acts as a source of value for the option in time. Including the source term $rC$, three of the four terms in the Black–Scholes equation (2.6) form $\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} = rC$. Along each characteristic curve $S \propto e^{rt}$ this PDE asserts that the value of the option grows like $\frac{dC}{dt} = rC$, that is, $C$ also grows exponentially along with any bonds, as $C \propto e^{rt}$.

• Lastly, the term $\frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2}$ implies that in addition to these effects, the price of an option “diffuses” in the $S$-space. This diffusion $\frac{\partial^2 C}{\partial S^2}$ depends directly upon the volatility, $\beta S$, of the underlying asset so that large fluctuations in a risky asset cause large diffusion (large “blurring”) of the value of an option. However, and most importantly, remember that this term represents negative diffusion, on its own $\frac{\partial C}{\partial t} = -\frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2}$, and so the Black–Scholes equation (2.6) must be solved backward in time.

²² In this example, we use $C$ as the symbol for the value of the forward contract in order to be consistent with the symbology of the Black–Scholes equation (2.6). In Section 1.4.1 we used the symbol $F$ for the value of the forward contract. The change in variable name is insignificant.
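The first two interpretations combine into a formula that is easy to experiment with. Without the diffusion term, the equation $\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} = rC$ with expiry values $C(T, S) = g(S)$ has the solution $C(t, S) = e^{-r(T-t)}\,g\big(S e^{r(T-t)}\big)$, as substitution confirms: follow the characteristic forward to expiry, read off the expiry value, and discount it. The Python sketch below illustrates this under assumed values; the function names, strike, rate, and expiry are our own hypothetical choices.

```python
import math


def call_payoff(s, strike=38.5):
    """Expiry value max(0, S - X) of a call option (strike is an assumed value)."""
    return max(0.0, s - strike)


def advect_and_discount(t, s, payoff, r=math.log(1.12), expiry=1.0):
    """Solution of C_t + r S C_S = r C, i.e. Black-Scholes without diffusion.

    Follow the characteristic S e^{-r t} = constant from (t, S) forward to
    expiry, evaluate the payoff there, and discount back at the bond rate.
    """
    growth = math.exp(r * (expiry - t))   # bond factor over the time remaining
    return payoff(s * growth) / growth


if __name__ == "__main__":
    for s in (30.0, 35.0, 40.0, 50.0):
        print(s, advect_and_discount(0.0, s, call_payoff))
```

For the linear forward payoff $g(S) = S - X$ this reproduces the value $S - X/e^{r(T-t)}$ of Example 2.8 exactly, because there the diffusion term contributes nothing; for a call option the neglected diffusion additionally smooths the corner in the payoff.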

Summary
We solve the Black–Scholes equation (2.6) with “initial” conditions specified at the future expiry date of the option, of $C(T, S) = \max\{0, S - X\}$ for a call option; we expect the value of the option to be steadily discounted at the bond rate as we work backward in time (in proportion to $e^{rt}$); and the value to smooth out by the fluctuation-induced “diffusion.”

Example 2.9 (knock out). Exotic options often just change the boundary conditions for the Black–Scholes equation (2.6). For example, a “knock out” is a call or put option that additionally expires if the underlying security value $S$ reaches a predetermined “barrier price.” There are two varieties of knock outs: “down and outs” expire if the underlying security falls to the barrier price, whereas “up and outs” expire if the underlying security rises to the barrier price. In a “down and out,” the holder of the knock out option receives a rebate $c$ if ever the asset value $S$ falls below some barrier, $S_{\min}$ say. For asset values $S > S_{\min}$, the arguments

behind the Black–Scholes equation (2.6) still apply. Thus one values a knock out option by solving the Black–Scholes equation for asset values $S > S_{\min}$. At the asset value $S = S_{\min}$ we additionally know that the knock out option has value $c$, its rebate. Thus when solving the Black–Scholes equation, we keep supplying this value $c$ at the barrier $S = S_{\min}$ as a boundary condition to the values in the domain $S > S_{\min}$.

Given the enormous variety of possible options, such as the knock out, it is remarkably useful to solve the Black–Scholes equation (2.6) qualitatively. In Figure 2.5 we follow three qualitative steps, corresponding to the three interpretations above, for any given final valuation of an option plotted as a function of the asset price $S$ (dashed line):

• first, “squash” the option value to the left by a factor $R$ (dot-dashed line) corresponding to the “advection” in the Black–Scholes equation;

• second, deflate/depreciate the valuation $C$ by a factor $R$, that is, squash vertically (dotted line);

• third, smooth the corners and curves in the line to account for the “blurring” of the option’s value by the stochastic nature of the asset price (solid line) to give the option value at some earlier time; the longer the time period or the larger the volatility, the more you smooth the curve.

You may want to combine and confirm the first two of the above steps by the analysis of Exercise 2.16. Qualitative solutions such as these valuably check numerical or algebraic solutions of the Black–Scholes equation (2.6).

Figure 2.5. Qualitative solution of the Black–Scholes equation (2.6) to predict the value of an option (option value $C$ against asset price $S$): (a) a call option; (b) a put option; (c), (d) two other special concocted options. The dashed line is the agreed value of the option at expiry depending upon the asset price $S$, and the solid line is a rough estimate of the value of the option at some earlier time.

2.3.1 Discretizations form a trinomial model

The Black–Scholes equation (2.6) is intimately connected with the binomial lattice approximation. To connect the two views, first observe that all derivatives with respect to $S$ in the Black–Scholes equation (2.6) are multiplied by the corresponding power of $S$; thus these are in the form of an Euler–Cauchy differential equation.²³ Simplify such terms by transforming $S$ into $x = \log S$, as then the derivatives simplify: $S\frac{\partial}{\partial S} = \frac{\partial}{\partial x}$ and $S^2\frac{\partial^2}{\partial S^2} = \frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial x}$. The chain rule derives first $\frac{\partial C}{\partial x} = \frac{\partial S}{\partial x}\frac{\partial C}{\partial S} = S\frac{\partial C}{\partial S}$, and then $\frac{\partial^2 C}{\partial x^2} = \left(S\frac{\partial}{\partial S}\right)^2 C = S\frac{\partial}{\partial S}\left(S\frac{\partial C}{\partial S}\right) = S\frac{\partial C}{\partial S} + S^2\frac{\partial^2 C}{\partial S^2}$. Such a transformation from $S$ to $x = \log S$ also ensures that the geometric sequence of asset prices used in the binomial lattice, a factor of $u = 1/d$ from each other, becomes a straightforward arithmetic sequence in $x$, of constant spacing $\Delta x = \log u$. The Black–Scholes equation (2.6) then becomes

$$\frac{\partial C}{\partial t} + r\frac{\partial C}{\partial x} + \frac12\beta^2\left(\frac{\partial^2 C}{\partial x^2} - \frac{\partial C}{\partial x}\right) = rC. \tag{2.7}$$

We numerically solve the transformed Black–Scholes PDE (2.7) on a grid in the $tS$-plane. Create a grid, such as that seen in Figure 2.4 and implicit in the multiperiod lattice of Figure 1.11, of spacing $\Delta t = h$ in time and spacing $\Delta x = \delta$ in asset value.²⁴ Let $C_{i,j}$ denote the value of $C$ at the $j$th time $t_j = jh$ and $i$th location $x_i = i\delta$ (say). Then a difference approximation to the time derivative is $\frac{\partial C}{\partial t} \approx \frac{C_{i,j} - C_{i,j-1}}{h}$, whereas centered approximations to the $x$ derivatives are $\frac{\partial C}{\partial x} \approx \frac{C_{i+1,j} - C_{i-1,j}}{2\delta}$ and $\frac{\partial^2 C}{\partial x^2} \approx \frac{C_{i+1,j} - 2C_{i,j} + C_{i-1,j}}{\delta^2}$. A finite difference approximation²⁵ to the transformed Black–Scholes PDE (2.7) which is backward in time and centered in log-price is thus

$$\frac{C_{i,j} - C_{i,j-1}}{h} + r\frac{C_{i+1,j} - C_{i-1,j}}{2\delta} + \frac12\beta^2\frac{C_{i+1,j} - 2C_{i,j} + C_{i-1,j}}{\delta^2} - \frac12\beta^2\frac{C_{i+1,j} - C_{i-1,j}}{2\delta} = rC_{i,j}.$$

²³ Read about Euler–Cauchy ODEs in many standard texts (see, e.g., Kreyszig 1999).

²⁴ Of course one may also discretize the Black–Scholes equation (2.6) with equal $\Delta S$ rather than equal $\Delta\log S$, but then the former seems less natural and also confuses the comparison with the earlier binomial model.

²⁵ Issues associated with the numerical solution of such a PDE are dealt with in detail in many texts (see, e.g., Kreyszig 1999).

All option values referred to in this equation are at the $j$th time $t_j$, except for the time derivative where $C_{i,j-1}$ appears. Rearranging the equation for the value $C_{i,j-1}$ of the call option at the earlier time $t_{j-1}$ gives

$$C_{i,j-1} = \frac{\beta^2 h}{\delta^2}\left[\frac12 + \frac{\delta(2r - \beta^2)}{4\beta^2}\right]C_{i+1,j} + \left[1 - rh - \frac{\beta^2 h}{\delta^2}\right]C_{i,j} + \frac{\beta^2 h}{\delta^2}\left[\frac12 - \frac{\delta(2r - \beta^2)}{4\beta^2}\right]C_{i-1,j}. \tag{2.8}$$

This looks like a “trinomial” lattice approximation to the pricing of an option. Indeed, if we choose the increments in the log-price so that the term in $C_{i,j}$ vanishes, namely $\frac{\beta^2 h}{\delta^2} = 1 - rh$, then (2.8) looks even more like the binomial lattice approximation,

$$C_{i,j-1} = (1 - rh)\left[pC_{i+1,j} + (1 - p)C_{i-1,j}\right],$$

where $p = \frac12 + \frac{\delta(2r - \beta^2)}{4\beta^2}$ and with the multiplication by $1 - rh$ playing the role of the division by the bond factor $R$.

Example 2.10. Revisit Example 1.9 which valued a call option on an asset with initial price $S = 35$ over one year with volatility $\beta = \log 1.25 = 0.2231$, bond interest rate $r = \log 1.12 = 0.1133$, and strike price $X = 38.50$. Perform an $n$-step approximation, time step $h = 1/n$ years, with $\delta \approx \beta\sqrt{2h}$ so that the coefficient of $C_{i,j}$ is approximately 0.5 to reasonably ensure stability.²⁶ With $n = 4$ time steps, Algorithm 2.1 estimates the value of the call option to be $C = 3.49$, whereas with $n = 8$ we find $C = 3.40$, and we find $C = 3.41$ for $n = 16$ and above. Using this discretization of the Black–Scholes equation (2.6) determines the value of an option with just $n = 16$ time steps, whereas with the binomial model we needed $n = 256$.

²⁶ A useful rule of thumb to ensure stability of the trinomial model (2.8) is to choose time step $h$ and $x$-step $\delta$ so that none of the three coefficients of $C_{i,j}$ and $C_{i\pm1,j}$ on the right-hand side of (2.8) are negative.

Suggested activity: Do at least Exercise 2.15.

2.3.2 Self-financing portfolios

One question you may have had about the derivation of the Black–Scholes equation (2.6) is the following: Why did I not treat the hedge ratio $\phi$ as a stochastic Ito process? After all, the hedge ratio varies with the stochastic asset price $S$, recall $\phi = \frac{\partial C}{\partial S}$, and so should be a stochastic process. The answer comes from investigating more completely the details of purchasing or selling the asset and, in particular, how such trades are financed by cash held in bonds.

Imagine we have sold a call option on an asset and seek a time varying portfolio of $\phi$ units of the asset and $\psi$ bonds. We transfer money from and to bonds as needed to buy

and sell units of the asset. The whole portfolio then has value $\Pi = -C + \phi S + \psi B$ at any time $t$, where $B(t)$ is the value of a cash bond; we assume $B$ grows exponentially, like $e^{rt}$, though more general models may also be analyzed.

Algorithm 2.1 Code for determining the value of the call option in Example 2.10. Observe the use of nan to introduce unspecified boundary conditions that turn out to be irrelevant on the selected grid just as they are for the binomial model. In Scilab use %nan instead of nan.

    n=4                           % number of time steps
    strike=38.50
    beta=log(1.25)                % volatility
    r=log(1.12)                   % bond interest rate
    h=1/n                         % time step in years
    delta=beta*sqrt(2*h) % perhaps double
    rt=beta^2*h/delta^2
    p=0.5+delta*(2*r-beta^2)/(4*beta^2)
    x=log(35)+(-n:n)*delta;       % grid in log-price centered on S=35
    s=exp(x);
    c=max(0,s-strike);            % call values at expiry
    i=2:2*n;                      % interior grid points
    for j=n:-1:1                  % step (2.8) backward in time
      c=[nan c(i)*(1-r*h-rt)+c(i+1)*rt*p+c(i-1)*rt*(1-p) nan]
    end
    estimated_value=c(n+1)        % value at the initial price S=35

Now discretize time into small intervals of length $\Delta t_j$ (perhaps each interval is one day):

• over the $j$th interval (the $j$th day) we hold $\phi_j$ units and $\psi_j$ bonds with value from the start of the day of $\Pi_j = -C_j + \phi_j S_j + \psi_j B_j$;

• at the very start of the next $(j+1)$th interval (the opening of the next day) we find the portfolio has value $\Pi_{j+1} = -C_{j+1} + \phi_j S_{j+1} + \psi_j B_{j+1}$;

• at the start of the $(j+1)$th interval (the start of the day’s trading) we immediately adjust the portfolio to make it risk free over the forthcoming $(j+1)$th interval by trading to then hold $\phi_{j+1}$ units and $\psi_{j+1}$ bonds with value $-C_{j+1} + \phi_{j+1}S_{j+1} + \psi_{j+1}B_{j+1}$; as the portfolio is to be self-financing, since we just trade cash bonds for units of the asset, this must have the same value as $\Pi_{j+1}$; consequently, for a self-financing portfolio,

$$\begin{aligned}
& -C_{j+1} + \phi_{j+1}S_{j+1} + \psi_{j+1}B_{j+1} = -C_{j+1} + \phi_j S_{j+1} + \psi_j B_{j+1} \\
\Rightarrow\quad & S_{j+1}(\phi_{j+1} - \phi_j) + B_{j+1}(\psi_{j+1} - \psi_j) = 0 \\
\Rightarrow\quad & S_{j+1}\Delta\phi_j + B_{j+1}\Delta\psi_j = 0.
\end{aligned}$$

The change in value of the portfolio from one time step to the next (from one day to the next) is thus

$$\begin{aligned}
\Delta\Pi_j &= -C_{j+1} + \phi_{j+1}S_{j+1} + \psi_{j+1}B_{j+1} - (-C_j + \phi_j S_j + \psi_j B_j) \\
&= -\Delta C_j + \phi_j(S_{j+1} - S_j) + \psi_j(B_{j+1} - B_j) + (\phi_{j+1} - \phi_j)S_{j+1} + (\psi_{j+1} - \psi_j)B_{j+1} \\
&= -\Delta C_j + \phi_j\Delta S_j + \psi_j\Delta B_j + \underbrace{S_{j+1}\Delta\phi_j + B_{j+1}\Delta\psi_j}_{=\,0 \text{ as self-financing}}.
\end{aligned}$$

Write this in terms of infinitesimals to see that the change in the value of such a managed, self-financing portfolio is

$$d\Pi = -dC + \phi\,dS + \psi\,dB. \tag{2.9}$$

That is, obtain the change in value of the portfolio as if the number of units of each component is held constant, just as we did in deriving the Black–Scholes equation (2.6). Exercise 2.17 asks you to recover the Black–Scholes equation (2.6) from an ensuing more sophisticated analysis based upon (2.9).

Summary
Ito’s formula identifies risk-free portfolios that underpin the Black–Scholes equation (2.6) for valuing options. Discretizations of the Black–Scholes equation empower accurate numerical valuation of options.

2.4 Summary

• In stochastic calculus, differentials $dW$, $dt$, and $dW^2 = dt$ are retained, whereas $dt^2 = dt\,dW = 0$, as are higher order products.

• Ito’s formula is that if $Y = f(t, X(t))$, then $dY = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX + \frac12\frac{\partial^2 f}{\partial x^2}\,dX^2$, where $dX^2$ is to be simplified according to the above rules.

• The Black–Scholes PDE, $\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \frac12\beta^2 S^2\frac{\partial^2 C}{\partial S^2} = rC$, for the value of an option at time $t$ when the asset has price $S$, is an advection-diffusion equation to be solved backward in time from the expiry values of the option $C(T, S)$.

• Numerical discretizations of the Black–Scholes equation are effective in valuing options, but need to be checked qualitatively.

• Self-financing portfolios of $\phi$ and $\psi$ units, respectively, of assets with prices $S$ and $B$ satisfy $S\,d\phi + B\,d\psi = 0$.
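As one illustration of the numerical discretization summarized above, here is a Python sketch of the trinomial scheme (2.8) with the grid choice $\delta = \beta\sqrt{2h}$ of Example 2.10. It is our own transcription rather than the chapter's Algorithm 2.1: the function name is hypothetical, and instead of padding with nan the list of option values simply shrinks by one point at each end on every backward step, which leaves exactly one value, at the initial price, after $n$ steps.

```python
import math


def trinomial_call(n, s0=35.0, strike=38.5, beta=math.log(1.25),
                   r=math.log(1.12), T=1.0):
    """Value a call option by stepping scheme (2.8) backward from expiry.

    Default parameters are those of Example 2.10; the function name and the
    shrinking-list treatment of the grid edges are illustrative choices.
    """
    h = T / n
    delta = beta * math.sqrt(2 * h)       # log-price step, as in Example 2.10
    rt = beta ** 2 * h / delta ** 2       # equals 1/2 for this choice of delta
    p = 0.5 + delta * (2 * r - beta ** 2) / (4 * beta ** 2)
    # expiry values at the 2n+1 grid points x = log(s0) + k*delta, k = -n..n
    c = [max(0.0, s0 * math.exp(k * delta) - strike) for k in range(-n, n + 1)]
    for _ in range(n):
        # each new value uses three neighbours at the later time
        c = [c[i] * (1 - r * h - rt) + c[i + 1] * rt * p + c[i - 1] * rt * (1 - p)
             for i in range(1, len(c) - 1)]
    return c[0]                           # the single remaining value, at S = s0


if __name__ == "__main__":
    for n in (4, 8, 16, 64):
        print(n, round(trinomial_call(n), 4))
```

Running the loop over several values of $n$ gives a direct feel for how quickly the estimates settle, which can be compared with the values quoted in Example 2.10.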

Exercises

2.1. Use Ito's formula to show that the following are the solutions to the SDEs you solved numerically in Exercise 1.7:
  1. $X = (1 + W(t))^2$ satisfies $dX = dt + 2\sqrt{X}\,dW$ with $X_0 = 1$;
  2. $Z = 1/(1 + W(t))$ satisfies $dZ = Z^3\,dt - Z^2\,dW$ with $Z_0 = 1$;
  3. $X = \log(1 + W(t))$ satisfies $dX = -\frac12 e^{-2X}\,dt + e^{-X}\,dW$ with $X_0 = 0$;
  4. $X = \sinh(t + W(t))$ satisfies $dX = \left(\frac12 X + \sqrt{1 + X^2}\right)dt + \sqrt{1 + X^2}\,dW$ with $X_0 = 0$;
  5. $X = (1 + t)^2(X_0 + t + W(t))$ satisfies $dX = \left[\frac{2X}{1+t} + (1 + t)^2\right]dt + (1 + t)^2\,dW$.
What features that you saw in the numerical solutions are explained by these analytic solutions?

2.2. Use the simple version (2.3) of Ito's formula to generate, by substituting a variety of functions $f(t, W)$, a variety of SDEs whose analytic solutions you know.

2.3. Use Ito's formula to deduce the differential $dX$ for the stochastic process $X(t) = 2 + t + \exp(W(t))$, where $W(t)$ is a Wiener process. Write the SDE in terms of only $t$, $X$, and the differentials $dt$ and $dW$.

2.4. Use Ito's formula to find an SDE satisfied by the Ito process $X = \cos(t^2 + W)$.

2.5. Let an Ito process $Y(t) = 1/(t + X(t))$ in terms of the Ito process $X(t) = W(t)^2$. Use Ito's formula to deduce an expression for $dY$ in terms of only $t$, $Y$, $dt$, and $dW$.

2.6. Show that almost surely $\sum_{j=0}^{n-1} (\Delta W_j)^4 \to 0$ as the time step $\Delta t_j = h \to 0$ for a fixed final time $T = nh$. HINT: Use that $E[|Z|^4] = 3$ for random variates $Z \sim N(0, 1)$.

2.7. Argue that almost surely $\sum_{j=0}^{n-1} |\Delta W_j|^p \to 0$ as $h \to 0$ for $p \ge 3$. HINT: Use that $E[|Z|^p]$ is finite for $Z \sim N(0, 1)$.

2.8. Use the probability distribution function $p(z) = \exp(-z^2/2)/\sqrt{2\pi}$ for $N(0, 1)$ random variables $Z$ and integration by parts to deduce that $E[Z^p] = (p - 1)(p - 3) \cdots 1$ for even integers $p$. Hence deduce for Wiener processes $W(t)$ that $E[W(t)^p] = (p - 1)(p - 3) \cdots 1 \cdot t^{p/2}$ (for even integers $p$) by writing, at any given time, $W = Z\sqrt{t}$ where $Z \sim N(0, 1)$.

2.9. Let $I_0(t) = 1$, $I_1(t) = W(t)$, $I_2(t) = W(t)^2 - t$, $I_3(t) = W(t)^3 - 3tW(t)$, and $I_4(t) = W(t)^4 - 6tW(t)^2 + 3t^2$. Use Ito's formula to show $dI_n = nI_{n-1}\,dW$. Use guesswork, checked with Ito's formula, to determine corresponding $I_5$ and $I_6$.[27] Describe the analogue with classic calculus.

[27] These $I_n(t)$ are closely related to the Hermite polynomials; see (Kreyszig 1999, pp. 246–247) or (Abramowitz and Stegun 1965, Chap. 22).

2.10. Use Ito's formula (2.4) for the following:
  • Since $X = t + W(t)$ satisfies $dX = dt + dW$, confirm that $Y = \sinh X$ satisfies $dY = \left(\frac12 Y + \sqrt{1 + Y^2}\right)dt + \sqrt{1 + Y^2}\,dW$. Recall that $\frac{d}{dx}\sinh x = \cosh x$, $\frac{d}{dx}\cosh x = \sinh x$, and $\cosh^2 x = 1 + \sinh^2 x$.
  • For the above Ito process $Y$, deduce $dZ$ for Ito process $Z = Y^3$.

2.11. Consider the Black–Scholes equation (2.6) with zero volatility:
\[
\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} = rC .
\]
Show by algebraic differentiation that $C = f(Se^{-rt})e^{rt}$ satisfies this PDE for any differentiable function $f$. Now suppose you are considering some option with specified value at the expiry time $T$, say $C_T(S)$. Deduce that the above function $f(S) = C_T(RS)/R$, where $R = e^{rT}$. Hence argue that all points on the specified curve $(S_T, C_T)$ (where $S_T$ denotes the value of the asset at expiry) are mapped by the above PDE to points $(S, C)$ at time $t = 0$ by the simple contraction mapping $S = S_T/R$ and $C = C_T/R$. Thus comment on how the Black–Scholes equation (2.6) with nonzero volatility predicts that the initial value of an option is just a smoothed version of the final value, but scaled by the factor $1/R$.

2.12. In November 2007 the company Wesfarmers purchased the Coles–Myer group. As part of the settlement, Coles–Myer shares were transformed into Wesfarmers Partially Protected Shares (WPPS). With a Floor Price specified to be $36, a Cap Price of $45, and the Lapse Date meaning the date four years from the Issue Date, the specifications of the WPPS included the following:

  On a Lapse Date determined by Wesfarmers,
  1. each Partially Protected Share will be reclassified into one Ordinary Share, and
  2. subject to clause 8, each Partially Protected Shareholder will be issued an additional number of Ordinary Shares for each Partially Protected Share held on that date, in accordance with the following:
     (a) if the MVWAP is equal to or more than the Cap Price, no additional Ordinary Shares;
     (b) if the MVWAP is equal to or less than the Floor Price, 0.25 Ordinary Shares; and
     (c) if the MVWAP is between the Floor Price and the Cap Price, the number of Ordinary Shares calculated using the following formula: (Cap Price / MVWAP) − 1,
  where MVWAP means the VWAP for the period of two months immediately preceding, but not including, the date of the Lapse Notice.

A challenge is to find the initial value of these WPPS using the Black–Scholes equation. This exercise is challenging; for now just do the following. Translate the above WPPS conditions into an expiry condition for the Black–Scholes equation (a condition that depends upon the sale price). Instead of part 2(c), for simplicity treat the MVWAP as the sale price of Ordinary Shares sold on the Australian Stock Exchange on the Lapse Date.

2.13. When one purchases, as an asset, shares in a company, one also expects some regular dividends as part of the financial benefit of holding the shares. Generalize the arguments of Section 2.3 to derive a modified Black–Scholes equation when the asset in the portfolio also returns dividends at a known rate $\delta$: over a small time interval $\Delta t$ the dividend contributes $\delta\,\Delta t$ to increasing the value of the portfolio.

2.14. Extend the previous modification to the case where the dividend has both a deterministic and stochastic component. That is, for each share held in the portfolio, over a small time interval the dividend's contribution to increasing the value of the portfolio would be $\delta\,\Delta t + \rho\,\Delta W$.

2.15. Use the trinomial model to value the following option. You, a bookmaker, have a client, Bob, who wants to bet $100 that Telstra shares will rise to be above $5 in one year's time. The share's current value is $4. Suppose the interest rate for bonds is 10% per year, and the volatility of the shares is 20% per $\sqrt{\text{year}}$. Deduce that you charge Bob at least about $20 to make this bet. Comment on the match with the qualitative graphical solution.

Figure 2.6. The expiry value of four different options on an asset. [Four panels, (a) to (d), each plotting option value C against asset price S from 0 to 100.]

2.16. Consider in turn each of the four options whose expiration values are plotted in Figure 2.6. Use the qualitative arguments of Section 2.3 to sketch by hand the option value, as a function of $S$, as predicted by the Black–Scholes equation at a significantly earlier time.

2.17. Using a self-financing portfolio and assuming the interest rate on bonds is the constant $r$, rederive the Black–Scholes equation (2.6) from (2.9).

2.18. Modify the trinomial MATLAB/SCILAB code for Example 2.10 to value a knock out option (Example 2.9) which is as for Example 2.10 but additionally pays a rebate of $1 if the asset price ever falls to $30.

Answers to selected exercises

2.3. $dX = (1 + \frac12 e^{W})\,dt + e^{W}\,dW = \frac12(X - t)\,dt + (X - t - 2)\,dW$.

2.4. $dX = -\left(\frac12 X + 2t\sqrt{1 - X^2}\right)dt - \sqrt{1 - X^2}\,dW$.

2.5. $dX = dt + 2\sqrt{X}\,dW$, and hence $dY = 2Y^2(1 - 2tY)\,dt - 2Y^2\sqrt{1/Y - t}\,dW$.

2.18. The knock out value is $3.59.
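The moment formula of Exercise 2.8 can be sanity-checked numerically before attempting the integration by parts. The following is a minimal sketch, assuming NumPy's Gauss–Hermite quadrature for the weight $e^{-x^2/2}$ (`numpy.polynomial.hermite_e.hermegauss`); the helper names `normal_moment`, `odd_product`, and `wiener_moment` are illustrative, not from the text.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def normal_moment(p, nodes=20):
    """E[Z^p] for Z ~ N(0,1) by Gauss-Hermite quadrature (exact for p <= 2*nodes-1)."""
    x, w = hermegauss(nodes)          # nodes and weights for weight exp(-x^2/2)
    return float(np.sum(w * x**p) / np.sqrt(2 * np.pi))

def odd_product(p):
    """(p-1)(p-3)...1 for an even integer p."""
    result = 1
    for k in range(p - 1, 0, -2):
        result *= k
    return result

def wiener_moment(p, t):
    """E[W(t)^p] using W = Z*sqrt(t), as in the exercise."""
    return normal_moment(p) * t ** (p / 2)

for p in (2, 4, 6, 8):
    print(p, normal_moment(p), odd_product(p))
```

The quadrature is deterministic, so any disagreement with $(p-1)(p-3)\cdots 1$ would point to an algebra slip rather than sampling noise.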

Chapter 3

The Fokker–Planck Equation Describes the Probability Distribution

Contents
3.1 The probability distribution evolves forward in time
    3.1.1 Steady state probability distributions
    3.1.2 Modeling large birth and death processes
3.2 Stochastically solve deterministic differential equations
3.3 The Kolmogorov backward equation completes the picture
3.4 Summary
Exercises
Answers to selected exercises

Previous chapters concentrated on the properties of individual realizations of a stochastic process: that they have drift and volatility, and that simple numerics converge, albeit slowly. The alternative is to discover the statistics of stochastic processes: their mean and variance, or more generally, the probability distribution of the realizations. Compare this with Markov chains where, instead of running many simulations of actual realizations of the process, we typically discuss the probability distribution over the states of the chain (see, for example, Kao 1997, Chap. 4). In Markov chains, the transition matrix guides the evolution of the probabilities; for SDEs, the Fokker–Planck equation describes the evolution of the probability distribution.

Example 3.1 (Wiener processes diffuse). Consider an ensemble of realizations of a Wiener process $W(t)$; see the five realizations shown in Chapter 1 and the ten realizations in Figure 3.1. As time increases, these realizations spread out over $w$-space, that is, they diffuse. We know that at any time $t$, a Wiener process is distributed as an $N(0, t)$ random variable. Thus its probability distribution function (PDF) is the Gaussian, normal distribution $p(t, w) = \exp(-w^2/2t)/\sqrt{2\pi t}$: as shown in Figure 3.2, this Gaussian spreads in time corresponding to the spreading realizations. Recall that this Gaussian distribution satisfies the PDE for diffusion (see, e.g., Kreyszig 1999, §11.5–6):
\[
\frac{\partial p}{\partial t} = \frac12\frac{\partial^2 p}{\partial w^2} .
\]
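For readers who want the recalled fact confirmed rather than quoted, a short direct differentiation suffices. This worked check is an addition in the spirit of the example; it uses only the stated Gaussian $p(t,w) = e^{-w^2/2t}/\sqrt{2\pi t}$.

\[
\begin{aligned}
\frac{\partial p}{\partial t} &= \left(-\frac{1}{2t} + \frac{w^2}{2t^2}\right) p , \\
\frac{\partial p}{\partial w} &= -\frac{w}{t}\, p , \qquad
\frac{\partial^2 p}{\partial w^2} = \left(\frac{w^2}{t^2} - \frac{1}{t}\right) p ,
\end{aligned}
\]
so that $\frac12\,\partial^2 p/\partial w^2 = \left(\frac{w^2}{2t^2} - \frac{1}{2t}\right)p = \partial p/\partial t$, as claimed.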

Figure 3.1. Ten realizations of a Wiener process $W(t)$. [Plot of W(t) against time t.]

Figure 3.2. The spreading Gaussian probability distribution function $p(t, w)$ of a Wiener process at three times: $t = 0.2$, $t = 1$, and $t = 5$. [Plot of p(t, w) against w.]
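The spreading shown in these two figures can be reproduced with a small ensemble simulation. The sketch below is illustrative only: the function name `wiener_paths`, the ensemble size, the step size, and the random seed are assumptions chosen for the example. It builds realizations from independent $N(0, h)$ increments and compares the sample variance with $t$ at the three times of Figure 3.2.

```python
import numpy as np

def wiener_paths(n_paths, n_steps, t_final, rng):
    """Realizations of W(t) on a uniform grid, built from N(0, h) increments."""
    h = t_final / n_steps
    increments = rng.normal(0.0, np.sqrt(h), size=(n_paths, n_steps))
    w = np.cumsum(increments, axis=1)        # W at times h, 2h, ..., t_final
    t = h * np.arange(1, n_steps + 1)
    return t, w

rng = np.random.default_rng(0)               # fixed seed for reproducibility
t, w = wiener_paths(n_paths=20000, n_steps=50, t_final=5.0, rng=rng)

for i in (1, 9, 49):                          # grid times 0.2, 1.0, 5.0
    print(f"t = {t[i]:.1f}: sample variance = {w[:, i].var():.3f}")
```

The sample variance should track $t$ to within sampling error, which is the ensemble counterpart of the widening Gaussian $N(0, t)$.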

1. Section 2. This derivation confirms (3. we can solve the Fokker–Planck equation.1). However. This process almost always decays to zero provided α − 1 β2 < 2 2 0 (even when the drift rate α > 0).Chapter 3. when we cannot solve the SDE sometimes. this remarkable statement misleads as rare fluctuations are significant. as the integral of the Gaussian is 1. we determine that 2 E eβW(t) = eβ t/2 . Example 3.2 describes how the solution of the SDE dX = αX dt + βX dW is the stochastic process X(t) = exp[(α − 1 β2)t + βW(t)]. 1) so X(t) is a square-root curve with a randomly chosen coefficient. this view of the Wiener process is limited in that. (3. The same comment holds if Z is instead generated randomly and independently at each time t. Consider a √ random function X(t) = Z t.1) using the PDF of the Wiener process. nonetheless has a growing expectation . we conclude that the solutions to exponential Brownian motion. First. Second.2 (expect important but rare large fluctuations). The Fokker–Planck Equation Describes the Probability Distribution 63 This PDE governing the evolution of the PDF p(t. w) is the simplest example of a Fokker– Planck equation describing the evolution of the distribution of a stochastic process. useful statistics are obtained from the PDF and sometimes show overall features missed by individual realizations of the SDE! Further. where Z is a random variable distributed N(0. However. for example. Although the PDF does not discriminate between the possibilities discussed at the end of this example. w) dw 1 2 e−w /2t dw eβw √ 2πt −∞ ∞ ∞ ∞ 1 w2 + βw dw exp − √ 2t 2πt then upon completing the square 1 1 β2t dw exp − (w − tβ)2 + √ 2t 2 2πt −∞ ∞ 1 2 2 = eβ t/2 e−(w−tβ) /(2t) dw √ 2πt −∞ ∞ 1 2 β2 t/2 =e e−u /(2t) du √ substituting u = w − tβ 2πt −∞ 2 = eβ t/2 . From the definition of expectation in terms of the PDF E[eβW(t)] = = = = ∞ −∞ ∞ eβwp(t. and hence make at least partial analytic progress. 
although almost always tending to zero for large enough β. Determining the expected value of X(t) from its PDF illustrates the significance of the rare fluctuations. it does not discern the nature of the increments nor the continuity of the stochastic process. then X(t) has exactly the same Gaussian PDF as the Wiener process but is quite different in nature.

whenever $\alpha > 0$:
\[
\begin{aligned}
E[X(t)] &= E\left[X_0 e^{(\alpha - \beta^2/2)t + \beta W(t)}\right] \\
&= E\left[X_0 e^{(\alpha - \beta^2/2)t} e^{\beta W(t)}\right] \\
&= X_0 e^{(\alpha - \beta^2/2)t}\, E\left[e^{\beta W(t)}\right] \\
&= X_0 e^{(\alpha - \beta^2/2)t} e^{\beta^2 t/2} \\
&= X_0 e^{\alpha t} .
\end{aligned}
\]
Thus very rare, very large fluctuations in $X(t)$ must occur so that the expectation can grow even though almost all realizations decay to zero. The histogram of Figure 2.1 hinted at this significance of the few large fluctuations. For example, the almost surely decaying realizations $X(t) = e^{-t + 2W(t)}$ of the SDE $dX = X\,dt + 2X\,dW$ (such that $X(0) = 1$) nonetheless have an exponentially growing expectation $E[X(t)] = e^{t}$. Further, as a specific instance of Exercise 3.1, you will also find here that $\operatorname{Var}[X(t)] = e^{6t} - e^{2t}$, which grows alarmingly quickly even though, again, almost all realizations decay to zero!

The Fokker–Planck equation establishes a useful transformation between the solution of SDEs and certain PDEs.

Example 3.3 (random walkers solve Laplace's equation). One way that we can approximately solve Laplace's equation $\nabla^2 u = 0$ in some domain (see, e.g., Kreyszig 1999, §11.9–11) is to choose a point of interest $(x_0, y_0)$ and release from that point a large number of random walkers (drunks!) who execute Brownian motion in both $x$ and $y$: that is, $x = W_1(t)$ and $y = W_2(t)$ for two independent Wiener processes. These (drunken) walkers stick to the boundary when they first hit the boundary as shown in Figure 3.3 (see the simulation of Algorithm A.3). Then when enough walkers have become stuck to the boundary, one simply takes the average of the boundary values $f(P)$ to estimate $u(x_0, y_0)$.[28] Amazingly, this stochastic procedure finds the solution of the deterministic Dirichlet problem $\nabla^2 u = 0$ such that $u = f(P)$ on the boundary.

[28] An estimate of the error is the sample standard deviation divided by the square root of the number of walkers who reached the boundary.

One way to see the connection between the random walkers and Laplace's equation is to recognize that the PDF of the random walkers satisfies the two-dimensional diffusion equation $\partial p/\partial t = \frac12\nabla^2 p$; then when they remain stuck to the boundary, the time derivative $\partial p/\partial t$ becomes zero, and hence the walkers have somehow solved $0 = \nabla^2 p$, which is Laplace's equation. Section 3.2 explores this useful connection between SDEs and deterministic differential equations.

One great advantage of using this connection for solving PDEs is that you do not have to compute the solution everywhere if all you need is the solution at one or a few points. Another great advantage is that it easily handles complex shaped domains, as it needs no underlying grid. This technique is incredibly useful for solving problems in very high dimensional domains. For example, in investigating quantum dynamics (of Bose–Einstein condensates) with, say, 1000 atoms, one wants to solve PDEs in a vector space with over $10^{1000}$ dimensions! The stochastic solution involving

many realizations of just 1000 atoms is accessible via supercomputers. That there is this connection between PDEs and SDEs is established by the Fokker–Planck equation for the PDFs of the stochastic process.

Figure 3.3. Forty random walkers (circles) released from $(0.7, 0.4)$ spread out stochastically in time until they hit the boundaries (the unit square) where they stick to collectively estimate $u(0.7, 0.4)$. The contour plot shows the true solution of Laplace's equation: its value on the boundary gives the contribution of each random walker to the estimate of $u(0.7, 0.4)$. [Panels show the walkers on the unit square at successive times.]

3.1 The probability distribution evolves forward in time

We now discover how to describe the evolution of the probability distribution function $p(t, x)$ for a general Ito process with some drift and volatility. For simplicity we just consider one stochastic process dependent upon one Wiener process; the generalization to

multiple coupled stochastic processes involving multiple independent Wiener processes is analogous and of much practical use, but we will not explore it in this book. This section ends by showing how to model biological populations with stochastic effects representative of those endemic in the environment.

Theorem 3.4 (Fokker–Planck equation). Consider the Ito process $X(t)$ with drift $\mu(X)$ and volatility $\sigma(X)$, and hence satisfying the SDE $dX = \mu\,dt + \sigma\,dW$, where for simplicity we restrict our attention to the autonomous case of no direct time dependence in the drift and volatility. Then the PDF of the ensemble of realizations $p(t, x)$ satisfies the Fokker–Planck equation, alternatively called the Kolmogorov forward equation,
\[
\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}[\mu p] + \frac{\partial^2}{\partial x^2}\left[\tfrac12\sigma^2 p\right] . \tag{3.2}
\]

Analogue with Markov chains  The probability distribution function $p(t, x)$, namely the probability of realizations being near $x$ at time $t$, is analogous to the vector of probabilities $\mathbf{p}(t)$, where the $j$th component, $p_j(t)$, is the probability of being in a state $j$ at time $t$ discussed in Markov chains such as birth and death processes (see, e.g., Kao 1997, §4.1). The Fokker–Planck equation is analogous to the evolution of probability distributions in Markov chains: $\mathbf{p}(t + 1) = \mathbf{p}(t)P$ for some transition matrix $P$. In the Fokker–Planck equation (3.2), the $x$ derivatives involving the drift and volatility operate exactly like the transition matrix $P$: they operate to dictate how probability distributions evolve in time.

Figure 3.4. Left: either an up or down step reaches the state $x$ at time $t + h$. Right: small intervals of length $d\xi$ reach a small interval of length $dx$ over one time step. [Diagram: points $(t, \xi_u)$ and $(t, \xi_d)$ step with $Z = -1$ and $Z = +1$, respectively, to $(t + h, x)$; intervals $[\xi_u, \xi_u + d\xi]$ and $[\xi_d, \xi_d + d\xi]$ map to $[x, x + dx]$.]

Proof. We derive this Fokker–Planck equation for the case of constant volatility $\sigma$ (variable volatility makes the details much more complicated).[29] Throughout the derivation, dashes denote $\partial/\partial x$, and the drift $\mu$ is evaluated at $x$ unless otherwise specified.

[29] Exercises 3.7–3.8 ask you to deal with specific cases of variable volatility. Appendix B.1 gives a proof for general volatility.

We investigate the probability that the realization at time $t + h$ is near the value $x$, that is, within the interval $[x, x + dx]$ for some small (infinitesimal) interval length $dx$. In terms of the PDF, the probability of being in this interval is
\[
\Pr\{X(t + h) \in [x, x + dx]\} = p(t + h, x)\,dx .
\]

The realization reaches $x$ from a variety of possible values of $X(t)$ depending upon the Wiener increment $\Delta W = \sqrt{h}\,Z$. That $Z$ is a normal random variable is largely immaterial: so long as $Z$ has zero mean and unit variance, its cumulative sum, over many small time steps, will quickly become normal. Thus for simplicity and to the same effect assume $Z = \pm1$, each with probability $\frac12$. This assumption is rather like the binary model of asset price movement.

Then because $Z = \pm1$, each with probability $\frac12$, as Figure 3.4 (left) shows, the process reaches $x$ if it starts from $\xi_d$ with $Z = +1$, $\Delta W = +\sqrt{h}$, or if it starts from $\xi_u$ with $Z = -1$, $\Delta W = -\sqrt{h}$, where the values for $\xi$ are determined from
\[
x = \xi + \mu(\xi)h \pm \sigma\sqrt{h} .
\]
This is a pair of implicit equations for the two $\xi$ values. Solve the pair approximately by iteration after rearranging to $\xi = x - \mu(\xi)h \mp \sigma\sqrt{h}$ and starting from[30] $\xi \approx x$:
\[
\xi \approx x - \mu(x)h \mp \sigma\sqrt{h}, \qquad\text{that is,}\qquad \xi = x - \mu h \mp \sigma\sqrt{h} + \cdots .
\]
Differentiating this shows how intervals of $X$ at time $t$ become slightly stretched or compressed in the evolution to time $t + h$, as shown in Figure 3.4 (right):
\[
d\xi = (1 - \mu' h + \cdots)\,dx .
\]

[30] Recall that the ellipsis "$\cdots$" denotes small terms of higher order in time step $h$ that we neglect. Here such terms are typically in $h^{3/2}$, $h^2$, and so on.

Consequently,
\[
\begin{aligned}
p(t + h, x)\,dx &= \Pr\{X(t + h) \in [x, x + dx]\} \\
&= \tfrac12\Pr\{X(t) \in [\xi_u, \xi_u + d\xi_u]\} + \tfrac12\Pr\{X(t) \in [\xi_d, \xi_d + d\xi_d]\} \\
&= \tfrac12 p(t, \xi_u)\,d\xi_u + \tfrac12 p(t, \xi_d)\,d\xi_d \\
&= \tfrac12 p(t, x - \mu h - \sigma\sqrt{h})[1 - \mu' h]\,dx + \tfrac12 p(t, x - \mu h + \sigma\sqrt{h})[1 - \mu' h]\,dx + \cdots .
\end{aligned}
\]
Divide by the infinitesimal $dx$ and expand $p$ in Taylor series about $p(t, x)$:
\[
\begin{aligned}
p(t + h, x) &= \tfrac12\left\{p - (\mu h + \sigma\sqrt{h})p' + \tfrac12(\mu h + \sigma\sqrt{h})^2 p''\right\}[1 - \mu' h] \\
&\quad + \tfrac12\left\{p + (-\mu h + \sigma\sqrt{h})p' + \tfrac12(-\mu h + \sigma\sqrt{h})^2 p''\right\}[1 - \mu' h] + \cdots \\
&= \left\{p - \mu h p' + \tfrac12\sigma^2 h p''\right\}[1 - \mu' h] + \cdots \\
&= p - \mu h p' - \mu' h p + \tfrac12\sigma^2 h p'' + \cdots .
\end{aligned}
\]
Putting $p$ onto the left and dividing by $h$ leads to
\[
\frac{p(t + h, x) - p(t, x)}{h} = -(\mu p)' + \tfrac12\sigma^2 p'' + \cdots .
\]

Take the limit as $h \to 0$ to deduce
\[
\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}(\mu p) + \frac12\sigma^2\frac{\partial^2 p}{\partial x^2} .
\]
This is the Fokker–Planck equation (3.2) for the case of constant volatility. Appendix B.1 gives an alternate and more general proof.

Interpretation  The Fokker–Planck equation (3.2) has the following physical interpretation following from the modeling of motion in a one-dimensional continuum (see, e.g., Roberts 1994). Rewrite equation (3.2) in "conservative" form
\[
\frac{\partial p}{\partial t} + \frac{\partial}{\partial x}\left[\mu p - \frac{\partial}{\partial x}(Dp)\right] = 0 ,
\]
where $D(x) = \frac12\sigma(x)^2$. This equation describes the conservation of probability distribution $p(t, x)$ as it "moves" along the $x$-axis with a flux $q = \mu p - \frac{\partial}{\partial x}(Dp)$. The second component of the flux, $-\frac{\partial}{\partial x}(Dp)$, is a diffusive term induced by the volatility of the SDE, with $D(x)$ being the effective diffusion coefficient. However, physical diffusion normally appears in the flux as $-D\,\partial p/\partial x$. Here the additional part of the second term $-\frac{\partial}{\partial x}(Dp)$ is $-\frac{\partial D}{\partial x}p = -\sigma\sigma' p$ and is best grouped with the first term $\mu p$. Thus write the flux as
\[
q = (\mu - \sigma\sigma')p - D\frac{\partial p}{\partial x} ,
\]
and interpret this flux as probability distribution being carried by a mean velocity $\mu - \sigma\sigma'$ due to the drift and to the asymmetry in the noise, and as probability distribution diffusing with coefficient $D = \frac12\sigma^2$.[31]

[31] If the random noise is absent, $\sigma = 0$, the Fokker–Planck equation reduces to the Liouville equation $\partial p/\partial t = -\frac{\partial}{\partial x}(\mu p)$, which describes the dynamics of probability distribution for a deterministic differential equation. This view of the dynamics is sometimes sought when either the initial conditions are stochastic or because chaos in the dynamics, such as fluid turbulence, causes randomness to be effectively generated in the dynamics of the system.

3.1.1 Steady state probability distributions

Example 3.5 (the Ornstein–Uhlenbeck process). Imagine a car: when you press down on it for a short time, its suspension reacts and subsequently lifts the car back to its equilibrium height above the road. The natural decay in the dynamics of the suspension brings the car back to its equilibrium. Now drive the car along a bumpy road: the car moves up and down in a complex response to the bumps and to the dynamics of its suspension. The height of the car above the road is around about its normal height, but the bumps cause stochastic fluctuations in its height. If we consider the bumps as a stochastic forcing of the suspension, then we would model the combined dynamics by an SDE. Let us see such behavior in mathematics.

The so-called Ornstein–Uhlenbeck process, met in Exercise 1.6, combines deterministic exponential decay (modeling the car's suspension) with an additive noise (modeling the forcing of the bumps). The form of the SDE is generally
\[
dX = -\alpha X\,dt + \sigma\,dW \tag{3.3}
\]
for some constants $\alpha$ and $\sigma$ measuring the rate of deterministic decay and the level of stochastic forcing, respectively. Figure 3.5 shows ten realizations, and Figure 3.6 shows how the corresponding PDF evolves. Find the steady state PDF for this process, that is, the density of realizations in Figure 3.5 after the initial transients, namely the shape that is appearing at large times in Figure 3.6.

Figure 3.5. Ten realizations of an Ornstein–Uhlenbeck process $X(t)$ with parameters $\alpha = 1$ and $\sigma = \sqrt{2}$, and all realizations starting with $X(0) = 3$. [Plot of X(t) against time t.]

Solution: The Fokker–Planck equation (3.2) for the PDF is
\[
\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}(-\alpha x p) + \frac{\partial^2}{\partial x^2}\left(\tfrac12\sigma^2 p\right) .
\]
Obtain the differential equation for the steady state PDF by setting $\partial p/\partial t = 0$; then this Fokker–Planck equation becomes
\[
0 = \frac{\partial}{\partial x}\left[\alpha x p + \tfrac12\sigma^2\frac{\partial p}{\partial x}\right]
\;\Rightarrow\; \text{constant} = \alpha x p + \tfrac12\sigma^2\frac{\partial p}{\partial x}
\]
upon integrating.

Figure 3.6. Evolving PDF $p(t, x)$ for the Ornstein–Uhlenbeck process $dX = -X\,dt + \sqrt{2}\,dW$ with initial condition $X(0) = 3$ and plotted at times $t = 0.1$, $0.5$, $2.5$, and $10$. Compare with the realizations in Figure 3.5. [Plot of p(t, x) against x.]

The constant on the left-hand side is zero as the PDF $p$ and its derivative have to vanish, for large enough $x$, in order for the integral of $p$ to be one. Rearranging it as a separable ODE (see, e.g., Kreyszig 1999, §1.3–4) leads to
\[
\tfrac12\sigma^2\frac{dp}{p} = -\alpha x\,dx
\;\Rightarrow\; \log p = -\alpha x^2/\sigma^2 + \text{constant}
\;\Rightarrow\; p = A e^{-\alpha x^2/\sigma^2}
\]
for some integration constant $A$. This is a Gaussian distribution centered on $x = 0$ and with width proportional to $\sigma/\sqrt{\alpha}$. The constant of proportionality is then well known to be $A = \sqrt{\alpha/\pi}/\sigma$ in order to ensure that the total probability, the area under $p(x)$, is one.

The additive noise of the Ornstein–Uhlenbeck process has just "smeared out" the stable fixed point at $X = 0$. The width of the smearing is proportional to the strength of the noise, $\sigma$, and proportional to the inverse square root of the rate of attraction of the fixed point, $1/\sqrt{\alpha}$.

Analogue  The steady state probability distribution $p(x)$ is analogous to the steady state distributions $\pi$ found for discrete state stochastic models. The difference here is the continuum of possible states. Just as you normalize steady state distributions, so we normalize here: for a finite number of states $\sum_j \pi_j = 1$; here $\int p(x)\,dx = 1$.
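The steady state of Example 3.5 can also be seen in simulation. The following sketch assumes a plain Euler–Maruyama discretization of the SDE (3.3); the function name `simulate_ou`, the step size, the ensemble size, and the seed are illustrative choices, not from the text. The steady PDF $p = Ae^{-\alpha x^2/\sigma^2}$ is a Gaussian with zero mean and variance $\sigma^2/(2\alpha)$, so with $\alpha = 1$ and $\sigma = \sqrt2$ the ensemble at $t = 10$ should have variance close to one.

```python
import numpy as np

def simulate_ou(alpha=1.0, sigma=np.sqrt(2.0), x0=3.0,
                t_final=10.0, n_steps=1000, n_paths=20000, seed=1):
    """Euler-Maruyama ensemble for dX = -alpha X dt + sigma dW, all starting at x0."""
    rng = np.random.default_rng(seed)
    h = t_final / n_steps
    x = np.full(n_paths, x0)
    for _ in range(n_steps):
        dw = np.sqrt(h) * rng.normal(size=n_paths)   # Wiener increments
        x = x - alpha * x * h + sigma * dw
    return x

x_final = simulate_ou()
predicted_variance = 2.0 / (2 * 1.0)     # sigma^2 / (2 alpha)
print("sample mean    :", x_final.mean())
print("sample variance:", x_final.var(), "predicted:", predicted_variance)
```

The initial condition $X(0) = 3$ is forgotten by $t = 10$, matching the way the curves of Figure 3.6 settle onto a fixed shape.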

Gamma function  In the next example we need to use the Gamma function, $\Gamma(z)$, which you have possibly already met in other studies (see, for example, Kreyszig 1999, pp. A54–55),
\[
\Gamma(z) = \int_0^\infty x^{z-1}e^{-x}\,dx , \tag{3.4}
\]
as plotted in Figure 3.7 for real $z$. Integration by parts shows that $\Gamma(z + 1) = z\Gamma(z)$. Hence, since $\Gamma(1) = 1$, then for integer $n$, $\Gamma(n + 1) = n!$. Other special values are $\Gamma(\frac12) = \sqrt{\pi}$, and hence $\Gamma(\frac32) = \frac12\sqrt{\pi}$.

Figure 3.7. The Gamma function (3.4) for real argument $z$. [Plot of Γ(z) against z.]

Example 3.6 (a two humped camel). Investigate the steady state PDF of the SDE $dX = (3X - X^3)\,dt + X\,dW$ and relate it to the deterministic dynamics. Compare it with the steady state of the SDE $dX = (3X - X^3)\,dt + 2X\,dW$ that has twice the volatility.

Solution: First, investigate the deterministic dynamics of $dX = (3X - X^3)\,dt$ as done in courses on ODEs (see, e.g., Kreyszig 1999, §3.3–3.5). This ODE has fixed points (equilibria) where $3X - X^3 = 0$, namely $X = 0$ and $X = \pm\sqrt{3}$. The fixed point at $X = 0$ is unstable as the linearized dynamics are $dX = 3X\,dt$ with exponentially growing solutions, whereas the fixed points at $X = \pm\sqrt{3}$ are stable as the local dynamics, say $X = \pm\sqrt{3} + Y(t)$, are $dY = -6Y\,dt$ with exponentially decaying solutions. Thus all deterministic trajectories evolve to one or another of the fixed points $X = \pm\sqrt{3}$.

Second, consider the steady state PDF $p(x)$ of the SDE $dX = (3X - X^3)\,dt + X\,dW$. It satisfies the time-independent Fokker–Planck equation
\[
0 = -\frac{\partial}{\partial x}\left[(3x - x^3)p\right] + \frac{\partial^2}{\partial x^2}\left[\tfrac12 x^2 p\right] .
\]

One integral with respect to $x$ leads to
\[
-(3x - x^3)p + \frac{\partial}{\partial x}\left[\tfrac12 x^2 p\right] = \text{constant},
\]
but this constant has to be zero, as $p$ and its derivatives must vanish for large enough $x$. Thus by expanding, rearranging, and recognizing that the ODE is separable, we obtain
\[
\begin{aligned}
\tfrac12 x^2\frac{\partial p}{\partial x} &= (2x - x^3)p \\
\Rightarrow\; \frac{dp}{p} &= \frac{4x - 2x^3}{x^2}\,dx \\
\Rightarrow\; \log p &= \int\left(\frac4x - 2x\right)dx \\
\Rightarrow\; \log p &= 4\log|x| - x^2 + \text{constant} \\
\Rightarrow\; p &= Ax^4 e^{-x^2} .
\end{aligned}
\]
Thus the steady state PDF, as shown in Figure 3.8, is zero near $x = 0$, increases away from $x = 0$ by the $x^4$ factor, but soon is brought back to zero by the rapid decay of the $e^{-x^2}$ factor. The two humps of the probability distribution correspond to the two stable fixed points of the deterministic ODE. Determine the integration constant $A$ by requiring that the area under the PDF be one: using symmetry, integration over half the domain requires
\[
\begin{aligned}
\frac12 &= \int_0^\infty Ax^4 e^{-x^2}\,dx \\
&= \frac A2\int_0^\infty u^{3/2}e^{-u}\,du \qquad\text{upon substituting } u = x^2 \\
&= \frac A2\,\Gamma(5/2) = \frac{3\sqrt{\pi}}{8}A ,
\end{aligned}
\]
and thus $A = 4/(3\sqrt{\pi})$.

Lastly, a perusal of the deterministic part of $dX = (3X - X^3)\,dt + 2X\,dW$ again suggests that there should be two humps in the PDF near the two stable deterministic fixed points $X = \pm\sqrt{3}$. However, the steady solutions of the corresponding Fokker–Planck equation,
\[
0 = -\frac{\partial}{\partial x}\left[(3x - x^3)p\right] + \frac{\partial^2}{\partial x^2}\left[2x^2 p\right] ,
\]
are derived via
\[
\begin{aligned}
-(3x - x^3)p + \frac{\partial}{\partial x}\left[2x^2 p\right] &= 0 \\
\Rightarrow\; 2x^2\frac{\partial p}{\partial x} &= (-x - x^3)p \\
\Rightarrow\; \frac{dp}{p} &= \frac{-x - x^3}{2x^2}\,dx \\
\Rightarrow\; \log p &= \int\left(-\frac{1}{2x} - \frac12 x\right)dx \\
\Rightarrow\; \log p &= -\tfrac12\log|x| - \tfrac14 x^2 + \text{constant} \\
\Rightarrow\; p &= \frac{A}{\sqrt{|x|}}\,e^{-x^2/4} .
\end{aligned}
\]

Figure 3.8. Steady state probability distributions for Example 3.6. Solid blue line: $dX = (3X - X^3)\,dt + X\,dW$ with lower volatility has two humps; dashed green line: $dX = (3X - X^3)\,dt + 2X\,dW$ with larger volatility peaks at the origin. [Plot of PDF p(x) against x.]

Figure 3.8 shows that doubling the level of the multiplicative noise, now $2X\,dW$, effectively stabilizes the fixed point at the origin. The large spike of probability at the origin has finite area as $1/\sqrt{x}$ is integrable. Requiring the total area under the PDF to be one implies
\[
\begin{aligned}
\frac12 &= \int_0^\infty Ax^{-1/2}e^{-x^2/4}\,dx \\
&= \frac{A}{\sqrt2}\int_0^\infty u^{-3/4}e^{-u}\,du \qquad\text{upon substituting } u = x^2/4 \\
&= \frac{A}{\sqrt2}\,\Gamma(1/4) ,
\end{aligned}
\]
and thus $A = 1/(\sqrt2\,\Gamma(1/4)) = 0.1950$.
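The two normalization constants of Example 3.6 are easy to confirm numerically. This is a sketch only: the helper `midpoint_integral` and the variable names are illustrative, and the change of variable $x = y^2$ in the second integral is a convenience introduced here (not in the text) to remove the integrable $1/\sqrt{x}$ singularity before applying a simple midpoint rule.

```python
import math
import numpy as np

def midpoint_integral(f, a, b, n=200000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    x = a + h * (np.arange(n) + 0.5)
    return float(np.sum(f(x)) * h)

# Lower volatility: p(x) = A x^4 exp(-x^2) with A = 4/(3 sqrt(pi)).
a_low = 4 / (3 * math.sqrt(math.pi))
area_low = 2 * midpoint_integral(lambda x: a_low * x**4 * np.exp(-x**2), 0.0, 10.0)

# Higher volatility: p(x) = A |x|^(-1/2) exp(-x^2/4) with A = 1/(sqrt(2) Gamma(1/4)).
# Substituting x = y^2 gives dx = 2y dy, so the integrand becomes 2 A exp(-y^4/4).
a_high = 1 / (math.sqrt(2) * math.gamma(0.25))
area_high = 2 * midpoint_integral(lambda y: 2 * a_high * np.exp(-y**4 / 4), 0.0, 6.0)

print("A (low volatility)  =", a_low, " total area =", area_low)
print("A (high volatility) =", a_high, " total area =", area_high)
```

Both areas use the symmetry of the PDFs about $x = 0$, just as the derivation integrates over half the domain.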

Figure 3.9. Transitions between the numbers of individuals in a population. [State diagram: states $0, 1, 2, \ldots, n-1, n, n+1, \ldots$ with birth transitions at rates $\lambda_0, \lambda_1, \ldots, \lambda_{n-1}, \lambda_n$ to the right and death transitions at rates $\mu_1, \mu_2, \ldots, \mu_n, \mu_{n+1}$ to the left.]

3.1.2 Modeling large birth and death processes
Consider the modeling of populations by a birth and death process where we track each and every birth and death, as shown in the states and transitions of Figure 3.9. But in large populations we only need to know aggregated births and deaths: for example, how many thousands of people are infected in a flu epidemic? In large populations it is grossly inefficient to track each and every event. The Fokker–Planck equation empowers us to transform a single event model, such as a birth and death model, into an efficient aggregate SDE model via the evolution of the probability distribution. Here we introduce the key idea via one example. Example 3.7 (births/deaths in a large population). Malthus proposed one of the first mathematical models of biological populations: The number of animals N(t) grows in time according to dN/dt = (α − β)N, where α is the birth rate per individual and β is the death rate per individual. When the birth rate is greater than the death rate, α > β, then inevitably the population grows exponentially. Question: What happens in our uncertain fluctuating environment? Answer: The population fluctuates as plotted in Figure 3.10. This example introduces a scheme to describe the number of individuals in a population, N(t), as a Markov birth and death process. This is then approximated as the Fokker– Planck equation (3.2) for a stochastic version of the Malthusian model dN/dt = (α −β)N . Let n range over the total number of individuals in the population (a nonnegative integer),32 and let pn(t) denote the probability that there are n individuals in the population at time t. Then changes to the number of individuals in the population are due to births at some rate λn and deaths at some rate μn, as shown in Figure 3.9. For a biological population with no constraints on the number of individuals, we expect that the birth and death rates will be proportional to the number of individuals: λn = αn and μn = βn for some constants α and β. 
These are the key parameters and variables of the stochastic Malthusian model. How does the number of individuals evolve? The time rate of change of the probability of there being $n$ individuals is decreased by a birth to $n+1$ individuals or by a death to $n-1$ individuals, or is increased by a death among $n+1$ individuals or a birth among $n-1$ individuals:
$$\frac{dp_n}{dt} = \lambda_{n-1}p_{n-1} - (\lambda_n+\mu_n)p_n + \mu_{n+1}p_{n+1} = \alpha(n-1)p_{n-1} - (\alpha+\beta)n\,p_n + \beta(n+1)p_{n+1};$$
then upon assuming $p_n$ varies smoothly in $n$, Taylor expanding all $p_{n\pm1}$ terms about $n$, and using dashes to denote $\partial/\partial n$,
32 For sexual animals, the usual practice is to count only the female of the species, as they are most closely involved in reproduction.

$$\begin{aligned}
\frac{\partial p}{\partial t}
&= \alpha(n-1)\left[p - p' + \tfrac12 p'' - \cdots\right] - (\alpha+\beta)n\,p + \beta(n+1)\left[p + p' + \tfrac12 p'' + \cdots\right] \\
&= (\beta-\alpha)p + \left[\beta n + \beta - \alpha n + \alpha\right]p' + \tfrac12\left[\beta n + \beta + \alpha n - \alpha\right]p'' + \cdots \\
&= \frac{\partial}{\partial n}\left[(\beta-\alpha)n\,p\right] + \frac{\partial^2}{\partial n^2}\left\{\tfrac12\left[(\alpha+\beta)n + (\beta-\alpha)\right]p\right\} + \cdots.
\end{aligned}$$
To confirm this form of the right-hand side, start with it and work backward, expanding the derivatives using the product rule, until you match all the earlier terms. That is, neglecting the higher order terms (in the ellipses), the probability distribution evolves forward in time according to
$$\frac{\partial p}{\partial t} \approx \frac{\partial}{\partial n}\left[(\beta-\alpha)n\,p\right] + \frac{\partial^2}{\partial n^2}\left\{\tfrac12\left[(\alpha+\beta)n + (\beta-\alpha)\right]p\right\}. \qquad (3.5)$$
Such a Fokker–Planck-like equation governs the PDF of some SDE.

[Figure 3.10. Five realizations of the stochastic population model (3.6) with growth rate $\alpha = 2$ and death rate $\beta = 1$, plotted as $N(t)$ against time $t$. Stochastic fluctuations potentially cause large variations in the population growth.]

The PDE (3.5) has the form of a Fokker–Planck equation (3.2) with drift $\mu = (\alpha-\beta)n$ and diffusion $D = \tfrac12\sigma^2 = \tfrac12[(\alpha+\beta)n + (\beta-\alpha)] \approx \tfrac12(\alpha+\beta)n$ for a large population. Thus we could equivalently model the population by realizations of the stochastic differential equation $dN = \mu\,dt + \sigma\,dW$, namely
$$dN = (\alpha-\beta)N\,dt + \sqrt{(\alpha+\beta)N}\,dW. \qquad (3.6)$$
The first term in this SDE, $(\alpha-\beta)N\,dt$, models Malthusian growth, whereas the second, $\sqrt{(\alpha+\beta)N}\,dW$, models the fluctuations typical of a large number of random events. Such stochastic models realistically describe natural fluctuations in populations that we see in the realizations of Figure 3.10. Modern cell biochemistry also finds such transformations useful (Higham 2008).

Summary: The Fokker–Planck equation (3.2) empowers us to not only predict steady state distributions, but to also transform individual-based models into aggregate SDEs that incorporate realistic fluctuations.

Suggested activity: Do at least Exercises 3.2 and 3.4(1).

3.2 Stochastically solve deterministic differential equations
We now head back toward our earlier claim that the random walkers in a domain somehow solve Laplace's equation. The trick is to show that at any time, the average field $u$ experienced by the random walkers at that time is always the value at the release point. But for wide applicability we permit the "random walkers" to undergo a quite general stochastic process. From this, a strong link is cemented between PDEs and SDEs. This link helps solve the Black–Scholes equation (2.6) for the value of options in finance. For simplicity, we address only one spatial dimension.

Theorem 3.8 (Feynman–Kac formula). Let $u(t,x)$ satisfy the PDE
$$\frac{\partial u}{\partial t} + \mu(x)\frac{\partial u}{\partial x} + D(x)\frac{\partial^2 u}{\partial x^2} = 0, \qquad (3.7)$$
where as usual $D(x) = \tfrac12\sigma(x)^2$. Let $X(t)$ be the ensemble of solutions to the SDE $dX = \mu(X)\,dt + \sigma(X)\,dW$ with initial condition $X(s) = y$; then the specific initial value $u(s,y) = E[u(t,X(t))]$ for all $t$. This identity is sometimes called the Feynman–Kac formula.

Example 3.9. The function $u(t,x) = x^2 - t$ is one of the solutions to the diffusion PDE $\partial u/\partial t + \tfrac12\,\partial^2 u/\partial x^2 = 0$. Show first that the Feynman–Kac formula holds from the initial point $(t,x) = (1,2)$, and then that it holds from a general initial point $(t,x) = (s,y)$.

Solution: The diffusion PDE $\partial u/\partial t + \tfrac12\,\partial^2 u/\partial x^2 = 0$ has zero drift, $\mu = 0$, and unit volatility, $\sigma = 1$, so the corresponding SDE is just $dX = dW$ with solution of a Wiener process starting from the point $(t,x) = (1,2)$, namely $X(t) = W(t-1) + 2$.

Now at any later time $t > 1$, the expectation on the right-hand side of the Feynman–Kac formula is then
$$\begin{aligned}
E[u(t,X(t))] &= E\left[\{W(t-1)+2\}^2 - t\right] \\
&= E\left[W(t-1)^2 + 4W(t-1) + 4 - t\right] \\
&= E\left[W(t-1)^2\right] + 4E[W(t-1)] + (4-t) \\
&= \operatorname{Var}[W(t-1)] + 4E[W(t-1)] + (4-t) \\
&= (t-1) + 4\times 0 + (4-t) = 3,
\end{aligned}$$
as, by definition, $W(t-1)$ is distributed $N(0,t-1)$. The left-hand side of the Feynman–Kac formula is $u(1,2) = 2^2 - 1 = 3$ in agreement.

Similar algebra applies for any initial point $(s,y)$. The solution of the corresponding SDE starting from the point $(t,x) = (s,y)$ is $X(t) = W(t-s) + y$. At any later time $t > s$, the expectation on the right-hand side of the Feynman–Kac formula is
$$\begin{aligned}
E[u(t,X(t))] &= E\left[\{W(t-s)+y\}^2 - t\right] \\
&= E\left[W(t-s)^2 + 2yW(t-s) + y^2 - t\right] \\
&= E\left[W(t-s)^2\right] + 2yE[W(t-s)] + (y^2-t) \\
&= \operatorname{Var}[W(t-s)] + 2yE[W(t-s)] + (y^2-t) \\
&= (t-s) + 2y\times 0 + (y^2-t) = y^2 - s,
\end{aligned}$$
as, by definition, $W(t-s)$ is distributed $N(0,t-s)$. The left-hand side of the Feynman–Kac formula is $u(s,y) = y^2 - s$ in agreement.

Proof. Let $p(t,x|s,y)$ be the PDF for the general SDE of Theorem 3.8. It must satisfy the corresponding Fokker–Planck equation:
$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}[\mu p] + \frac{\partial^2}{\partial x^2}[Dp].$$
Now consider the expected value of $u$ over all realizations released from $X = y$ at time $s$:
$$E[u(t,X(t))] = \int u(t,x)\,p(t,x|s,y)\,dx,$$
where the implicit limits of integration are over all $x$ from $-\infty$ to $+\infty$, both here and below. We show that this expectation does not change in time:
$$\begin{aligned}
\frac{\partial}{\partial t}E[u(t,X(t))] &= \frac{\partial}{\partial t}\int u(t,x)\,p(t,x|s,y)\,dx \\
&= \int \left(\frac{\partial u}{\partial t}p + u\frac{\partial p}{\partial t}\right)dx \\
&= \int \left(\frac{\partial u}{\partial t}p - u\frac{\partial}{\partial x}[\mu p] + u\frac{\partial^2}{\partial x^2}[Dp]\right)dx \quad\text{by Fokker–Planck;}
\end{aligned}$$

then integrate the last two terms by parts to obtain
$$\frac{\partial}{\partial t}E[u(t,X(t))] = \left[-u\mu p + u\frac{\partial}{\partial x}[Dp]\right]_{-\infty}^{+\infty} + \int \left(\frac{\partial u}{\partial t}p + \frac{\partial u}{\partial x}\mu p - \frac{\partial u}{\partial x}\frac{\partial}{\partial x}[Dp]\right)dx,$$
where the bracketed boundary term is zero since $p \to 0$ as $x \to \pm\infty$; and integrate the last term by parts again to obtain
$$\begin{aligned}
\frac{\partial}{\partial t}E[u(t,X(t))] &= \left[-\frac{\partial u}{\partial x}Dp\right]_{-\infty}^{+\infty} + \int \left(\frac{\partial u}{\partial t}p + \frac{\partial u}{\partial x}\mu p + \frac{\partial^2 u}{\partial x^2}Dp\right)dx \\
&= \int \left(\frac{\partial u}{\partial t} + \mu\frac{\partial u}{\partial x} + D\frac{\partial^2 u}{\partial x^2}\right)p\,dx \quad\text{(the boundary term is again zero since } p \to 0 \text{ as } x \to \pm\infty\text{)} \\
&= 0 \quad\text{by the given PDE (3.7).}
\end{aligned}$$
That is, the expectation $E[u(t,X(t))]$ is constant in time $t$. In particular, it must be always the same as its initial value of $E[u(s,X(s))] = E[u(s,y)] = u(s,y)$.

Example 3.10 (solve the Black–Scholes equation stochastically). Recall that the Black–Scholes equation (2.6) for the value of a call option is
$$\frac{\partial C}{\partial t} + rS\frac{\partial C}{\partial S} + \tfrac12\beta^2S^2\frac{\partial^2 C}{\partial S^2} = rC,$$
where the bond rate is $r$ and the asset stock volatility is $\beta$. This Black–Scholes equation is not in the form of the PDE (3.7) because of the source terms on the right-hand side. We change variables to $x$ and $u(t,x)$ such that $C(t,S) = u(t,x)e^{rt}$ and $x = S$;[33] then $u(t,S) = C(t,S)e^{-rt}$ has the meaning of the value of the call option discounted by the bond rate to the current time $t = 0$. With this change, since $\partial C/\partial t = (\partial u/\partial t)e^{rt} + rC$, it is straightforward to see that the discounted value $u(t,x)$ satisfies the PDE
$$\frac{\partial u}{\partial t} + rx\frac{\partial u}{\partial x} + \tfrac12\beta^2x^2\frac{\partial^2 u}{\partial x^2} = 0,$$
which is in the requisite form (3.7).

33 We unnecessarily change names from asset value $S$ to abstract $x$ only to match closely with the PDE (3.7).

As in the example shown in Figure 3.11, suppose at time zero we release stochastic particles, with path $x = Y(t)$, to evolve according to the corresponding SDE[34]
$$dY = rY\,dt + \beta Y\,dW \quad\text{with } Y(0) = S_0.$$
Then according to Theorem 3.8 the current discounted value of the option is
$$u(0,S_0) = E[u(t,Y(t))] \quad\text{for all time } t, \qquad (3.8)$$
where $S_0$ is the current asset price.

34 We use $Y$ here only because $X$ denotes the strike price for an option.

In particular, $u(0,S_0) = E[u(t,Y(t))]$ will hold up to the expiry time, say $t = T$, of the option. Upon expiry we know the value of the option, namely $C(T,S) = \max\{S - X, 0\}$, where $S$ is the asset price, denoted by $Y$ for the solutions of the SDE. Hence the discounted value of each of the stochastic particles is $u(T,Y(T)) = e^{-rT}\max\{Y(T) - X, 0\}$. Thus using (3.8), the current value of the option is
$$C(0,S_0) = u(0,S_0) = e^{-rT}E\left[\max\{Y(T) - X, 0\}\right].$$

Algorithm 3.1 lists this stochastic solution of the Black–Scholes equation. Figure 3.11 plots 10 realizations of $Y(t)$; there is nothing particularly unusual to observe. Averaging over $m = 1000$ realizations, the estimated value of the call option varies over a range of approximately 3.3–3.5. The exact value for the option of $3.40 is comfortably within this range.

Algorithm 3.1 Example MATLAB/SCILAB code using (3.8) to stochastically estimate the value of a call option, from Example 1.9, on an asset with initial price S0 = 35, strike price X = 38.50 after one year in which the asset fluctuates by a factor of 1.25 and with a bond rate of 12%.

    r=log(1.12)              % bond rate
    b=log(1.25)              % volatility
    x=38.50                  % strike price
    s0=35                    % initial asset price
    m=1000;                  % realizations
    n=1000;                  % time steps
    t=linspace(0,1,n+1);
    h=diff(t(1:2));          % time step
    y=s0*ones(1,m);
    for j=1:n
      y=y+r*y*h+b*y.*randn(1,m)*sqrt(h);
    end
    estimated_value=exp(-r)*mean(max(y-x,0))

To estimate this value to the nearest few cents we need to increase the accuracy of the stochastic estimate by a factor of 10. Such accuracy could only be obtained by using 100 times as many realizations, that is, $m \approx 100{,}000$ realizations! This is rather too many realizations for a practical method when the alternative of solving the Black–Scholes equation (2.6) is so quick.[35]

35 This stochastic solution also opens up the possibility of using noise processes different from the Wiener process to estimate the value of an option. However, our developed theory does not yet justify such use!

The previous example reinforces the connection between PDEs and SDEs. Recall that at the start of this chapter we claimed that random walkers could solve Laplace's equation by averaging over the values observed by the walkers when they first contacted the boundary. We now use the above theory to show that a similar technique will work quite

generally. Here we restrict our attention to solutions of boundary value problems of ODEs. For example, Figure 3.12 shows ten realizations $X(t)$ (the analogue of the random walkers) initiated by being released from $x = 2$ and then stochastically walking until they hit the boundaries at $x = 0$ or $x = 3$; the value of some field $u$ at $x = 2$ is then the appropriately weighted average of these two boundary values.

[Figure 3.11. Ten example realizations $Y(t)$ of the SDE to stochastically solve the Black–Scholes equation, plotted against time $t$: the value of the option at the expiry time $t = 1$ is shown at the right. Average such values to estimate the value of the option at the initial time $t = 0$.]

Theorem 3.11. Let $u(x)$ in some domain $a < x < b$ satisfy the ODE
$$\mu(x)\frac{\partial u}{\partial x} + D(x)\frac{\partial^2 u}{\partial x^2} = 0, \qquad (3.9)$$
subject to the boundary conditions $u(a) = f(a)$ and $u(b) = f(b)$, where as usual $D(x) = \tfrac12\sigma(x)^2$. Let $X(t)$ be the ensemble of solutions to the SDE $dX = \mu(X)\,dt + \sigma(X)\,dW$ with initial condition $X(0) = y$, and let $\tau$ be the first exit time from $a < X < b$ of each realization. Then $u$ at the release point is the average value of the realizations that have "stuck" to one or other of the boundaries:[36]
$$u(y) = E[f(X(\tau))]. \qquad (3.10)$$

36 One may also obtain the solution of the forced ODE $\mu u_x + Du_{xx} = g(x)$: the formula is $u(y) = E[f(X(\tau))] - E\left[\int_0^\tau g(X(t))\,dt\right]$.
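As a quick sanity check on Theorem 3.11, consider the simplest case of zero drift and unit volatility. This special case is an illustrative assumption, not one worked in the text. Then $\mu = 0$ and $D = \tfrac12$, so the ODE (3.9) reduces to $u'' = 0$, and the solution is the straight line through the two boundary values:

```latex
u(y) \;=\; f(a)\,\frac{b-y}{b-a} \;+\; f(b)\,\frac{y-a}{b-a}
\;=\; f(a)\,P[X(\tau)=a] \;+\; f(b)\,P[X(\tau)=b]
\;=\; E[f(X(\tau))] .
```

Matching coefficients identifies $(y-a)/(b-a)$ as the probability that a Wiener process released at $y$ first exits the interval at $b$, which is the classical exit probability. For instance, with $a = 0$, $b = 3$, and release at $y = 2$, two thirds of such pure random walkers would be expected to stick at $x = 3$.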

[Figure 3.12. Ten realizations $X(t)$ initiated from $x = 2$, then stochastically evolving until they hit the boundaries at $x = 0$ or $x = 3$, plotted against time $t$; the value of some field $u$ at position $x = 2$ is then a weighted average of its two boundary values.]

Proof. A little loosely, require that the drift and volatility of the SDE are as required internally in the domain $a < x < b$, but reduce to zero outside and in particular at the end points $x = a$ and $b$. Then the correspondence between the ODE and the SDE is maintained throughout the domain. The only difference is that the solutions of the SDE "stick" to the boundary as soon as they reach it, as there the drift and volatility are both zero. Now let $T$ denote a time so large that almost all solutions of the SDE have reached a boundary. Theorem 3.8, restricted to time-independent differential equations, then shows that the solution at the release point is
$$\begin{aligned}
u(y) &= E[u(X(T))] \\
&= E[u(X(\tau))] \quad\text{as } X(T) = X(\tau) \text{ for each realization} \\
&= E[f(X(\tau))], \quad\text{as } u = f \text{ on each of the boundaries.}
\end{aligned}$$

Example 3.12. Consider the ODE $-x\,\dfrac{du}{dx} + \tfrac12(1+x^2)\dfrac{d^2u}{dx^2} = 0$ with boundary conditions $u(0) = 1$ and $u(3) = 5$. Solve it analytically, and also numerically estimate $u(1)$ and $u(2)$ via its corresponding SDE.

Solution: Analytically, when written in terms of $v = du/dx$, the ODE becomes separable:
$$\begin{aligned}
& -xv + \tfrac12(1+x^2)\frac{dv}{dx} = 0 \\
\Rightarrow\quad & \tfrac12(1+x^2)\frac{dv}{dx} = xv \\
\Rightarrow\quad & \frac{1}{v}\frac{dv}{dx} = \frac{2x}{1+x^2} \\
\Rightarrow\quad & \log v = \log(1+x^2) + \text{constant} \\
\Rightarrow\quad & v = A(1+x^2) \\
\Rightarrow\quad & u = \int v\,dx = A\left(x + \tfrac13x^3\right) + B.
\end{aligned}$$
With boundary conditions $u(0) = 1$ and $u(3) = 5$, determine $A = \tfrac13$ and $B = 1$ so the analytic solution is $u = \tfrac13x + \tfrac19x^3 + 1$. Thus we now know that $u(1) = 13/9 = 1.4444$ and $u(2) = 23/9 = 2.5556$.

Stochastically, recognize that the ODE comes from an Ito process with drift $\mu = -x$ and volatility $\sigma = \sqrt{1+x^2}$. Figure 3.12 plots ten realizations. To estimate $u(1)$, adapt Algorithm 3.1 to that of Algorithm 3.2: this version numerically solves the SDE $dX = -X\,dt + \sqrt{1+X^2}\,dW$ with initial condition $X(0) = 1$. Use find to evolve only those realizations within the domain $0 < X < 3$. Continue until all realizations reach one boundary or the other. Estimate the expectation of the boundary values using the conditional vectors x<=0 and x>=3 to account for the number of realizations reaching each boundary.

Algorithm 3.2 MATLAB/SCILAB code to solve a boundary value problem ODE by its corresponding SDE.

    m=100;          % realizations
    h=0.001;        % time step
    x=ones(1,m);    % initial release
    i=1:m;          % track those as yet unstuck
    while m>0
      x(i)=x(i)-x(i)*h+sqrt(1+x(i).^2).*randn(1,m)*sqrt(h);
      i=find((x>0)&(x<3));
      m=length(i);
    end
    estimated_u=mean(1*(x<=0)+5*(x>=3))

With just $m = 100$ realizations we get significant fluctuations in the estimates of $u(1)$ about the true value, for example, 1.36, 1.44, 1.56, and 1.20. Similar fluctuations are obtained for estimates of $u(2)$, such as 2.40. Using many more realizations gives more accuracy. But at least we roughly approximate the answer quite quickly with relatively few realizations.
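The stochastic estimate of Example 3.12 can also be sketched in plain Python, one realization at a time, using only the standard library. This is an illustrative port under stated assumptions rather than the book's listing: the function names `estimate_u` and `exact_u`, the `seed` argument, and the loop-per-realization structure are all introduced here for illustration.

```python
import math
import random


def exact_u(x):
    """Analytic solution u = 1 + x/3 + x^3/9 of Example 3.12."""
    return 1.0 + x / 3.0 + x ** 3 / 9.0


def estimate_u(y, m=100, h=0.001, a=0.0, b=3.0, fa=1.0, fb=5.0, seed=None):
    """Estimate u(y) = E[f(X(tau))] for dX = -X dt + sqrt(1 + X^2) dW.

    Each realization is released at X(0) = y and advanced by Euler steps of
    size h until it first leaves the open interval (a, b); it then
    contributes the boundary value fa (left) or fb (right) to the average.
    """
    rng = random.Random(seed)  # hypothetical seeding, for reproducibility only
    sqrt_h = math.sqrt(h)
    total = 0.0
    for _ in range(m):
        x = y
        while a < x < b:
            dw = rng.gauss(0.0, 1.0) * sqrt_h  # Wiener increment over one step
            x += -x * h + math.sqrt(1.0 + x * x) * dw
        total += fa if x <= a else fb
    return total / m


if __name__ == "__main__":
    for y in (1.0, 2.0):
        print(y, estimate_u(y, m=100, seed=1), exact_u(y))
```

Running the script prints, for each release point, one stochastic estimate beside the analytic value; as in the text, estimates from only 100 realizations should be expected to scatter noticeably about 13/9 and 23/9.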

The binomial distribution estimates errors. In this example, each of the $m$ realizations reached either one boundary or the other, $x = 0$ or $x = 3$. Let $Y_i = 1$ if the $i$th realization reaches the $x = 3$ boundary, and conversely $Y_i = 0$ if the $i$th realization reaches the $x = 0$ boundary. That is, there is a probability $p$ that the $i$th realization reaches the right-hand boundary for which $Y_i = 1$. Now consider the value of the solution at the release point $x = 2$: $u(2) = E[f(X(\tau))]$. The boundary values $f(X(\tau))$ are either 1 or 5 depending upon whether the realization reached the left or the right boundary, respectively. Consequently, $f(X(\tau)) = 4Y_i + 1$, and hence $u(2) = E[f(X(\tau))] = E[4Y_i + 1]$. Here, as is typical, we estimate this expectation by the average over the realizations,
$$\hat u(2) = \frac1m\sum_{i=1}^m (4Y_i + 1) = \frac4m\sum_{i=1}^m Y_i + 1 = \frac4m Y + 1,$$
where $Y = \sum_{i=1}^m Y_i$ is distributed $\operatorname{bin}(m,p)$; here the estimate $p \approx 0.4$ gives $Y \sim \operatorname{bin}(m, 0.4)$.

• Now we know $E[Y] = mp$ from the properties of the binomial distribution. Hence, $E[\hat u(2)] = E[\tfrac4m Y + 1] = \tfrac4m mp + 1 = 4p + 1$ which, with $p = 0.4$, estimates $u(2) \approx 2.6$ as before. The expectation of the binomial agrees with the earlier mean.

• Additionally we also know the variance of a binomial distribution: $\operatorname{Var}[Y] = mp(1-p)$. Thus we know $\operatorname{Var}[\hat u(2)] = \operatorname{Var}[\tfrac4m Y + 1] = \tfrac{16}{m^2}\operatorname{Var}[Y] = \tfrac{16}{m^2}mp(1-p) = 16p(1-p)/m$ which, with $p = 0.4$ and $m = 100$, estimates $\operatorname{Var}[\hat u(2)] \approx 0.04$. This provides the error in our estimate of $u(2)$.

Thus with $m = 100$ realizations we estimate $u(2) = 2.6 \pm 0.2$, a 10% error. Because the variance decreases like $1/m$, the error decreases like $1/\sqrt m$, and so we need $m \approx 10{,}000$ realizations to estimate $u(2)$ to a 1% error.

As these last two examples show, we need enormously many realizations to solve a PDE at all accurately via its SDE. It is not a practical method for PDEs in small numbers of dimensions. However, suppose we wish to solve a PDE in, say, 100 dimensions, as easily arises in finance when each day introduces another dimension. Then a finite difference solution would require a fantastic $10^{100}$ grid points to just resolve the domain with 10 grid points in each of the 100 dimensions, let alone try to solve the equation. In contrast, the stochastic solution is practical because we just solve 100 coupled SDEs for, say, a few thousand realizations to obtain a workable approximation. The SDE solution of PDEs is practical in problems of high dimension.

Summary: For a given differential equation for unknown $u$, the appropriate SDE has realizations whose expectation of $u$ does not change. Hence we are empowered to stochastically estimate the unknown $u$ at some initial point. This provides a method to stochastically value a financial option.

Suggested activity: Do at least Exercise 3.12.
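The binomial error estimate of Section 3.2 can be sketched as a small Python helper. This is a minimal sketch under stated assumptions: the function names `boundary_estimate_error` and `realizations_needed` are hypothetical, and the boundary values default to the 1 and 5 of Example 3.12.

```python
import math


def boundary_estimate_error(p_hat, m, f_left=1.0, f_right=5.0):
    """Return (estimate, standard_error) of u = E[f(X(tau))].

    p_hat is the observed fraction of the m realizations that stuck to the
    right-hand boundary, so the number reaching it is modeled as bin(m, p_hat).
    """
    jump = f_right - f_left                      # 4 in Example 3.12
    estimate = f_left + jump * p_hat             # 4p + 1
    variance = jump ** 2 * p_hat * (1.0 - p_hat) / m   # 16 p (1 - p) / m
    return estimate, math.sqrt(variance)


def realizations_needed(p_hat, rel_error, f_left=1.0, f_right=5.0):
    """Number of realizations for a standard error of rel_error * estimate."""
    jump = f_right - f_left
    estimate = f_left + jump * p_hat
    target_sd = rel_error * estimate
    return math.ceil(jump ** 2 * p_hat * (1.0 - p_hat) / target_sd ** 2)


if __name__ == "__main__":
    print(boundary_estimate_error(0.4, 100))
    print(realizations_needed(0.4, 0.01))
```

With `p_hat = 0.4` and `m = 100` the helper reproduces the estimate 2.6 with a standard error just under 0.2, which the text rounds to a 10% error. Because it does not round, its count of realizations for a 1% error comes out at several thousand, the same order of magnitude as the text's $m \approx 10{,}000$.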

3.3 The Kolmogorov backward equation completes the picture
To date we either sought the steady state probability distribution, the PDF $p(x)$, or we assumed that the initial state of the Ito process was specified. By investigating the evolution from a general initial state, we discover a fully informative function that is analogous to powers of the transition matrix of Markov chains.

Example 3.13 (Wiener process). The Wiener process in Example 3.6 starts from $W(0) = 0$, that is, all the probability is lumped at $w = 0$ at time $t = 0$. More generally, we might know that at some time $s$ the process has some value $y$: $W(s) = y$, or equivalently, all the probability is lumped at $x = y$ at time $t = s$. Because of the homogeneous nature of the Wiener process, we write down the more general PDF just by shifting the origin in time $t$ and value $w$. Thus the PDF for the Wiener process, given that we know it was at $y$ at time $s$, is
$$p(t,x|s,y) = \frac{1}{\sqrt{2\pi(t-s)}}\exp\left[-\frac{(x-y)^2}{2(t-s)}\right].$$
The use of the vertical bar in $p(t,x|s,y)$ is a convention to remind us that this function $p$ is the PDF conditional on being at $y$ at time $s$.

Example 3.14 (Ornstein–Uhlenbeck process). Recall that Figure 3.5 plots 10 realizations showing how the realizations evolve. Figure 3.6 shows how the corresponding PDF evolves toward a Gaussian steady state. The appearance of the steady state PDF matches the steady state found in Example 3.5. Recall that an exercise in Chapter 1, using many realizations of numerical simulations, also shows the same exponential approach to the steady state distribution from a variety of initial conditions. For simplicity, I record the conditional PDF for the specific SDE $dX = -X\,dt + \sqrt2\,dW$: it is
$$p(t,x|s,y) = \frac{1}{\sqrt{2\pi\left[1 - e^{-2(t-s)}\right]}}\exp\left\{-\frac{\left[x - ye^{-(t-s)}\right]^2}{2\left[1 - e^{-2(t-s)}\right]}\right\}.$$
This probability distribution is that for a normal random variable with mean $\bar x = ye^{-(t-s)}$ decaying from $y$ at time $t = s$ to 0 exponentially quickly, and with variance $1 - e^{-2(t-s)}$ growing from 0 at time $t = s$ to saturate exponentially quickly at 1 for large times. Remarkably, the conditional PDF for an Ornstein–Uhlenbeck process is also always a Gaussian: the Gaussian's parameters just have exponentially decaying transients.

This general conditional PDF has much more information about the underlying stochastic process: it is the analogue of powers of the transition matrix of a Markov chain. This is in contrast to $p(t,x)$ which, for example, cannot distinguish between a Wiener process and $Z\sqrt t$! Here $p(t,x|s,y)$ "knows" about the process when started with any value $y$ at any time $s$. Thus, for example, at a small time step $h$ into the future from time $s$, $p(s+h,x|s,y)$ will be some sharply peaked probability distribution: the peak will be at some $\bar x$ near $y$, and so $\mu = (\bar x - y)/h$ is an estimate of the drift at the locale $(s,y)$, whereas the spread

3. dashes denote ∂/∂y and the drift μ and the volatility σ are evaluated at y unless otherwise specified.15 (Kolmogorov backward equation). Those who have met the definition. properties. Each is the adjoint of the other. jth element of Pn.13. (3. ∂s ∂y ∂y as well as the Fokker–Planck equation (3. y) to time s. x|s.11). This reminds us of the Black–Scholes equation (2. The conditional PDF is fully informative of the underlying stochastic process.6) for pricing options. Analogue The conditional probability distribution p(t.37 It is straightforward. Proof.10 the striking connection with the Black–Scholes equation. and see how it affects the subsequent evolution to (t.2).11) . albeit tedious. we can in principle determine the SDE for any given p(t.3. and it helped establish in Example 3. x ∼ i . will determine the local volatility σ = σx/ h . x). √ of the peak. its standard deviation σx. Theorem 3. take a small time step from (s − h. the Doob–Meyer decomposition. The general PDF p(t.13 indicates how we investigate the 37 The Kolmogorov backward equation and the Fokker–Planck equation are closely related. y) is exactly analogous to the i. y). p(s). ∂p 1 ∂2p ∂p + μ(y) + 2 σ(y)2 2 = 0 . Figure 3. and use of adjoints will appreciate that such a pair of adjoint equations and their solutions provides two complementary and beautiful alternative views of problems.11). To derive the Kolmogorov backward equation (3. x|s. at some time s and maps it to the later probability distribution p(t) = p(s)Pn at time t = s + n . y ∼ j . to verify that the example PDFs given earlier satisfy (3. y) is analogous to powers Pn of the transition matrix P in discrete stochastic systems. See that the Kolmogorov backwards equation is like an advection-diffusion equation but with negative diffusivity: this implies it must be solved backward from t to the starting time s. y) for the Ito process dX = μ(X)dt + σ(X)dW satisfies the Kolmogorov backward equation. x|s. 
By uniqueness of drift and volatility of an Ito process. The Kolmogorov backward equation is sometimes also known as the Chapman–Kolmogorov equation. The Kolmogorov backward equation completes the picture X $ $ sx & b $$$& $ & $$$ ξu $$ s & &    Z = +1  & & ys   & d & & Z = −1d s ξ ‚ d& d E time s−h s t 85 Figure 3. With t − s ∼ n . Recall that Pn takes the probability distribution. p(t. Throughout this derivation. x|s.

probability that the system arrives at $(t,x)$ given that it starts from $(s-h,y)$ by taking the first small time step to $(s,\xi)$ and combining this with $p(t,x|s,\xi)$ to give the probability of ultimately reaching $(t,x)$. As for the proof of the Fokker–Planck equation (3.2), for simplicity assume that the Wiener increment over the time step $h$ is $\Delta W = \sqrt h\,Z$, where $Z = \pm1$ each with probability $\tfrac12$; thus the $x$ value after the first small time step from $(s-h,y)$ is (Figure 3.13)
$$\xi = y + \mu(y)h \pm \sigma(y)\sqrt h.$$
Therefore the routes to $(t,x)$ are either via $\xi_u$ or $\xi_d$:
$$\begin{aligned}
p(t,x|s-h,y) &= \tfrac12 p(t,x|s,\xi_u) + \tfrac12 p(t,x|s,\xi_d) \\
&= \tfrac12 p(t,x|s,y+\mu h+\sigma\sqrt h) + \tfrac12 p(t,x|s,y+\mu h-\sigma\sqrt h) \\
&= \tfrac12\left[p + (\mu h+\sigma\sqrt h)p' + \tfrac12(\mu h+\sigma\sqrt h)^2p''\right] \\
&\quad + \tfrac12\left[p + (\mu h-\sigma\sqrt h)p' + \tfrac12(\mu h-\sigma\sqrt h)^2p''\right] + \cdots \\
&= p + \mu h\,p' + \tfrac12\sigma^2h\,p'' + \cdots \\
&= p(t,x|s,y) + h\left[\mu p' + \tfrac12\sigma^2p''\right] + \cdots.
\end{aligned}$$
Subtracting $p(t,x|s-h,y)$ from both sides and dividing by $h$ leads to
$$0 = \frac{p(t,x|s,y) - p(t,x|s-h,y)}{h} + \mu p' + \tfrac12\sigma^2p'' + \cdots.$$
Take the limit as $h \to 0$ to deduce
$$0 = \frac{\partial p}{\partial s} + \mu\frac{\partial p}{\partial y} + \tfrac12\sigma^2\frac{\partial^2 p}{\partial y^2}.$$
This is the Kolmogorov backward equation (3.11). Appendix B.2 gives an alternate derivation.

Summary: The conditional PDF $p(t,x|s,y)$ can completely characterize a stochastic process: as a function of the initial condition, $X(s) = y$, the conditional PDF $p(t,x|s,y)$ satisfies the Kolmogorov backward equation (3.11).

3.4 Summary
• The PDF $p(t,x|s,y)$ gives the probability that a stochastic process has value $x$ at time $t$ given that it had value $y$ at an earlier time $s$.
• The PDF of the SDE $dX = \mu(X)\,dt + \sigma(X)\,dW$ satisfies both the Fokker–Planck equation (3.2),
$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}[\mu p] + \frac{\partial^2}{\partial x^2}\left[\tfrac12\sigma^2p\right],$$

and the Kolmogorov backward equation (3.11),
$$\frac{\partial p}{\partial s} + \mu(y)\frac{\partial p}{\partial y} + \tfrac12\sigma(y)^2\frac{\partial^2 p}{\partial y^2} = 0.$$
• The PDFs of SDEs may be determined from the corresponding Fokker–Planck equation.
• The Kolmogorov backward equation enables some PDEs to be solved stochastically.

Exercises
3.1. Consider a general SDE $dX = \mu(X)\,dt + \sigma(X)\,dW$. Assuming a steady state PDF exists for the ensemble of realizations, show that the steady state PDF may be written as
$$p(x) = \frac{A}{D(x)}\exp\left[\int\frac{\mu(x)}{D(x)}\,dx\right],$$
where $D(x) = \tfrac12\sigma(x)^2$ and $A$ is a normalization constant.
3.2. Consider in turn each of the stochastic processes illustrated in Figures 3.14–3.17. Each figure shows ten realizations of some Ito process, a different Ito process for each figure. Assuming each Ito process is settling upon some steady state PDF $p(x)$, sketch the PDF.
3.3. Use the Fokker–Planck equation to find the structure of the steady state PDF of solutions $X(t)$ to the SDE $dX = -2X\,dt + \sqrt{1+X^2}\,dW$.
3.4. Use the Fokker–Planck equation to investigate the steady state PDFs for the following stochastic processes:
1. $dX = (2X - X^2)\,dt + X\,dW$ for $X(t) \ge 0$.
2. $dX = (3X - 2X^2)\,dt + 2X\,dW$ for $X(t) \ge 0$.
3. $dX = \beta\sqrt{1+X^2}\,dW$.
4. $dX = -X\,dt + \sqrt{1+X^2}\,dW$.
5. $dX = -X\,dt + \sqrt2\,(1+X^2)^{1/4}\,dW$.
6. $dX = X\,dt + (1+X^2)\,dW$.
3.5. The realizations plotted in Exercise 3.2 all come from Ito processes you have met or will soon meet. Identify as best you can which Ito process shown in Exercise 3.4 corresponds to each of the four figures.
3.6. Show that $\operatorname{Var}[X(t)] = X_0^2\left[e^{(2\alpha+\beta^2)t} - e^{2\alpha t}\right]$ for $X(t) = X_0\exp\left[(\alpha - \tfrac12\beta^2)t + \beta W(t)\right]$. Give an example showing that even when the expectation decays to zero, the variance may nonetheless grow in time.

[Figure 3.14. Ten realizations of an Ito process, plotted as $x$ against time $t$.]

[Figure 3.15. Ten realizations of an Ito process, plotted as $x$ against time $t$.]

[Figure 3.16. Ten realizations of an Ito process, plotted as $x$ against time $t$.]

[Figure 3.17. Ten realizations of an Ito process, plotted as $x$ against time $t$.]

One solution is u(t. 3. Reanalyze Example 3. Use the corresponding SDE to estimate u(2). ∂x 2 ∂x √ where the right-hand side is evaluated at (t.9. x) = x2 − 2xt + t2 − t. use Taylor series about p(t. and its solution X(t). ∂t ∂x 2 ∂x2 Write down the corresponding SDE.7.001 : discuss how the errors improve with increasing m and/or decreasing h. ξ) for ξ = x/(1 ± h) . Derive from first principles the Fokker–Planck equation ∂2 ∂p = 2 ( 1 x2p) ∂t ∂x 2 for the PDF p(t.12.12. perhaps using computer algebra. 3.2) to include variable volatility σ(x). This generalization is quite difficult because of the need to keep track of all the details.19. x) to deduce p(t + h. • Lastly. that the PDFs for the Wiener and Ornstein– Uhlenbeck processes satisfy the Kolmogorov backward equation (3. Compare.7.8. x). 1. Roughly what errors do you expect in your estimates? 3. 2 2 . β = 1). You may use that (1± h)−1 = √ 1∓ h+ h + ··· . assume small time steps of size Δt = h √ approximate the Wiener proand cess by the up/down binomial steps ΔW = ± h .13.11. Use many realizations of its solution to numerically estimate the options of Exercises 1. • First. 3.11. α = 0 .8). and also you will need to show √ ξ = x ∓ σ h + (σσ − μ)h + · · · . and 1. but suppose instead that the death rate μn = βn2 due to increased competition for limited resources by larger populations of individuals.11). Show that for the stochastic √ system to reach (t + h. Consider the PDE ∂u + ∂u + 1 ∂ u = 0 . The Fokker–Planck Equation Describes the Probability Distribution 3. Solve analytically 2 du + x d u = 0 on 1 < x < 5 such that u(1) = 4 and u(5) = dx dx2 0 . 3. rearrange and take the limit as the time step h → 0 to derive the corresponding Fokker–Planck equation. x) = p + hp + 2xh ∂p 1 2 ∂2p + x h 2 + ··· . Generalize the proof of the Fokker–Planck equation (3. Verify. Corresponding to the Black–Scholes equation is the SDE (3. 3. and unit stock volatility. Say we use m = 100 realizations with a time step of h = 0. 
• Second. x) it must have come from (t. Use its Fokker–Planck equation to then argue that the corresponding SDE is approximately dN = (αN − βN2)dt + αN + βN2 dW .10. x) of solutions X(t) of the SDE dX = X dW (this SDE is the case of the financial SDE dX = αX dt + βX dW for no stock drift. and then verify the Feynman–Kac formula for solutions with initial condition X(s) = y .90 Chapter 3. u(3) and u(4) and their errors.

Answers to selected exercises
3.3. $p \propto 1/(1+x^2)^3$.
3.4.
1. $p \propto x^2e^{-2x}$ (you determine the constant of proportionality in this and other answers); see the hump in probability near $x = 2$ corresponding to the deterministic fixed point.
2. $p \propto e^{-x}/\sqrt x$, but here the deterministic fixed point at $x = 3/2$ has been washed out by the noise which instead has stabilized the origin and hence generated an integrable peak in probability at $x = 0$.
3. $p \propto 1/(1+x^2)$; see that for small $x$ the system spreads by a random walk, $dX \propto dW$, but this spreading is arrested by the stabilizing effect of multiplicative noise for large $x$, where $dX \propto X\,dW$, generating this "Cauchy distribution" which has long tails (curiously $\beta$ has no influence on the steady state distribution).
4. $p \propto 1/(1+x^2)^2$ shows that by stabilizing the origin the previous distribution has much smaller tails.
5. $p \propto \dfrac{1}{\sqrt{1+x^2}}e^{-\sqrt{1+x^2}}$; for small $x$ this process is like an Ornstein–Uhlenbeck process, $dX \approx -X\,dt + \sqrt2\,dW$, but because the noise increases like $\sqrt X$, the PDF decays a little faster for large $x$.
6. $p \propto \dfrac{1}{(1+x^2)^2}e^{-1/(1+x^2)}$, whereas here the exponential factor in the solution does not affect the distribution very much so the PDF is much like the previous one but generated by a deterministically unstable origin which is stabilized by noise that grows quadratically with $x$.
3.12. Analytic solution is $u = 5/x - 1$.


Chapter 4
Stochastic Integration Proves Ito's Formula

Contents
4.1 The Ito integral $\int_a^b f\,dW$
4.2 The Ito formula
4.3 Summary
Exercises
Answers to selected exercises

So far we have played with stochastic differentials, such as
$$dX = \mu(t,X)\,dt + \sigma(t,X)\,dW, \qquad (4.1)$$
and invoked rules for their manipulation that seem to make sense in the context of stochastic Ito processes. But the derivation of the symbolic rules has been largely intuitive rather than rigorous. In this chapter we put stochastic calculus on a firm footing. The best treatment of stochastic differential forms such as (4.1) asserts that they are just shorthand for integrals such as
$$[X(t)]_a^b = X(b) - X(a) = \int_a^b \mu(t,X(t))\,dt + \int_a^b \sigma(t,X(t))\,dW.$$
In turn, interpret these integrals as being approximated by the sums
$$\int_a^b \mu(t,X(t))\,dt \approx \sum_{j=0}^{n-1}\mu_j\,\Delta t_j \quad\text{and}\quad \int_a^b \sigma(t,X(t))\,dW \approx \sum_{j=0}^{n-1}\sigma_j\,\Delta W_j.$$
But what do these integrals really mean? Can we really approximate them like this? What properties do they have? How are they related to ordinary integrals?

Ordinary integrals: Interpret integrals such as $\int_a^b \mu(t,X(t))\,dt$ as ordinary integrals in the sense of ordinary calculus (they are Riemann–Lebesgue integrals). All the properties that we are familiar

5 I(t) 0.6 0. the definite integral a W dW = 1 W 2 − t a . the dt term is some adjustment.8 0. Example 4. Use Ito’s formula to deduce I = a W dW .1. Five realizations of the Ito integral I(t) = 0 W(s)dW(s) for 0 < t < 1 as computed by Algorithm 4.3 0.1. the indefinite integral will somehow contain 1 W 2. whereas 1 W 2.0 0. Thus for these integrals there is nothing that we really need to develop other than perhaps efficient methods for their numerical evaluation. being a square.7 0. 2 2 2 The W dW term on the right-hand side is the desired differential.9 1. so use the simple form (2. further.5 0.94 1. 0 2 Solution: The indefinite stochastic integral I = W dW is synonymous with the SDE dI = W dW .2 0. See in the numerical evaluations of Figure 4.0 t Figure 4. Move the 1 dt to the left-hand side to obtain 2 dI = W dW = d 1 W 2 − 1 dt = d 1 W 2 − 1 t . 2 2 the integral is a fluctuating positive component superimposed upon −t/2. As seen in Figure 4. 2 2 2 2 Thus upon “summing.1 0. Stochastic Integration Proves Ito’s Formula 0. In this chapter we focus exclusively upon Ito integrals of b the form a σ(t.3) of Ito’s formula to see 2 d 1 W 2 = 0 dt + W dW + 1 · 1 dW 2 = W dW + 1 dt .1.0 Chapter 4. We guess that despite the above comments.4 0.0 0.1 that the integral is not [ 1 W 2]b because the realizations of I(t) = a 2 t W(s)dW(s) are often negative. b b Thus. 2 b t .” the indefinite integral W dW = 1 W 2 − 1 t . with follow provided the integrand is a smooth function of any Ito process. is always positive. X(t))dW .1.
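The closed form of Example 4.1 can be compared with the left-endpoint sum that defines the integral. The following is only a minimal MATLAB sketch under stated assumptions: the step count `n`, the number of realizations `m`, and the variable names `itoSum`, `exact`, and `gap` are illustrative choices, not taken from the text.

```matlab
% Sketch: compare the left-endpoint (Euler) sum for int_0^1 W dW
% with the closed form (W(1)^2 - 1)/2 of Example 4.1.
% n, m and all variable names are illustrative assumptions.
m  = 5;                              % number of realizations
n  = 100000;                         % number of time steps on [0,1]
dt = 1/n;
dw = randn(n,m)*sqrt(dt);            % Wiener increments, variance dt
w  = [zeros(1,m); cumsum(dw,1)];     % Wiener paths with W(0)=0
itoSum = sum(w(1:end-1,:).*dw, 1);   % sum_j W_j*dW_j for each realization
exact  = (w(end,:).^2 - 1)/2;        % (W(t)^2 - t)/2 evaluated at t=1
gap    = max(abs(itoSum - exact));   % largest discrepancy over realizations
disp([itoSum; exact])
```

The discrepancy is exactly half of $1 - \sum_j \Delta W_j^2$, so it shrinks only like $1/\sqrt{n}$; this is one way to see where the $-\tfrac12 t$ correction comes from.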

3. • f(t. The class of stochastic functions that we may integrate (over the interval [a. ω) depends only upon the past history of the Wiener process. in W dW the increment W dW to the integral depends only upon the current value of W.m). ω). ω)2. t for example. Ito integral properties then follow from those for integrals of step functions. Define integrals for piecewise constant stochastic integrands—called “step functions” φ(t. In contrast. but not f = W(t + 1. ω). use such step functions to approximate arbitrarily well any reasonable stochastic integrand f(t.n+1)’. ω) depends upon all the previous values of the Wiener process W.ω). 2.1. This more explicit notation allows a more general dependence upon the Wiener process to appear in the integrands. dw=randn(n. The development follows four steps: 1. ω)2dt] < ∞ . then define the Ito integral of f b to be the limn→ ∞ a φn dW . 4.2. b]) is denoted by V. or f = 0 W(s. x=[zeros(1.m). Slight technical differences exist between these terms.m)*sqrt(diff(t(1:2))). or Ft -previsible. ω)—such as those illustrated in Figure 4.x) 4. w=[zeros(1.1 The Ito integral b f dW a b Here we outline the definition of an Ito integral a f dW and its properties. W(s. However. for a sequence of step functions φn → f as n → ∞ . ω).:). Øksendal (1998) [§3. The Ito integral a f dW b 95 Algorithm 4.1 M ATLAB /S CILAB code (essentially an Euler method) to numerically evalt uate 0 W(s)dW(s) for 0 < t < 1 to draw Figure 4.1. such as a dependence upon the past history.1] and Kloeden and Platen (1992) [§3. for s ≤ t.1)]. thus herein we just refer to a dependence only upon the past history. ω). m=5. We modify notation slightly: hereafter we use ω to denote the family of realizations of a Wiener process W(t. 38 Such functions that only depend upon the previous history of the Wiener process are called variously F b .1] give full details of the rigorous development of the integral.cumsum(dw. plot(t. Definition 4.4. f = eW(t. Ft -adapted. t=linspace(0. 
It is defined to be composed of functions f(t. ω)ds. ω) such that • the expected square integral E[ a f(t.1. f = W(t. Including the past history of the process is essential if the theory of integrals is to apply to SDEs such as dX = X dW which we interpret as X = X dW where the integrand X(t. n=1000.1)].2.cumsum(w(1:end-1.38 tmeasurable.*dw. we endeavour to work entirely in a manner so that the differences in the terms are immaterial.

Figure 4.2. Five realizations (different colors) of four different stochastic step functions $\phi(t,\omega) \in S$: different step functions $\phi(t,\omega)$ have different partitions; different realizations of the one step function $\phi(t,\omega)$ have the same partition but different values distributed according to some probabilities. (Four panels, each plotting $\phi(t,\omega)$ against $t$ for $0 \le t \le 1$.)

Definition 4.3. Let the class of step functions $S \subset V$ be the class of piecewise constant functions. That is, for each step function $\phi(t,\omega) \in S$, there exists a finite partition $a = t_0 < t_1 < t_2 < \cdots < t_n = b$ such that $\phi(t,\omega) = \phi(t_j,\omega)$ for $t_j \le t < t_{j+1}$. Figure 4.2 plots four different members of $S$. Given any such partition, we often use $\phi_j$ or $\phi_j(\omega)$ to denote $\phi(t_j,\omega)$, just as we used $W_j$ to denote $W(t_j,\omega)$.

First task: The Ito integral for step functions

Our first task, as in the outline given at the start of this chapter, is to define the Ito integral for step functions, such as those drawn in Figure 4.2, and investigate its key properties.

Definition 4.4. For any step function $\phi(t,\omega)$ define the Ito integral

$$I(\omega) = \int_a^b \phi(t,\omega)\,dW(t,\omega) = \int_a^b \phi\,dW = \sum_{j=0}^{n-1} \phi_j\,\Delta W_j . \tag{4.2}$$
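Definition (4.2) is a finite sum, so it can be evaluated directly for one realization. The sketch below is illustrative only: the partition `tj` and the choice of step values $\phi_j = W(t_j)$ are assumptions made for the example, chosen so that each $\phi_j$ uses only the history up to $t_j$.

```matlab
% Sketch of definition (4.2) for one realization of a step function.
% The partition tj and the choice phi_j = W(t_j) are illustrative assumptions.
tj  = [0 0.25 0.5 0.75 1];           % partition a=t_0<t_1<...<t_n=b
n   = numel(tj) - 1;                 % number of pieces
dW  = randn(1,n).*sqrt(diff(tj));    % increments Delta W_j over each piece
Wj  = [0 cumsum(dW(1:n-1))];         % W(t_j) for j=0,...,n-1, with W(a)=0
phi = Wj;                            % step value on [t_j,t_{j+1}) uses only history
I   = sum(phi.*dW);                  % the Ito integral (4.2)
disp(I)
```

Each run draws new increments and hence a new value of $I(\omega)$; the partition stays fixed while the values $\phi_j$ vary with the realization.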


From this definition two familiar and desirable properties immediately follow, linearity and union (proofs are left as exercises), as does a third "no anticipation" property:

• linearity:

$$\int_a^b (\alpha\phi + \beta\psi)\,dW = \alpha\int_a^b \phi\,dW + \beta\int_a^b \psi\,dW ; \tag{4.3}$$

• union:

$$\int_a^b \phi\,dW + \int_b^c \phi\,dW = \int_a^c \phi\,dW ; \tag{4.4}$$

• the integral $\int_a^b \phi\,dW$ depends only upon the history of the Wiener process up to time $b$ (the integral is $\mathcal{F}_b$-adapted/measurable/previsible).

Two important properties of the Ito integral arise from its stochastic nature. These properties are so important we codify them as named theorems.

Theorem 4.5 (martingale property). For the Ito integral (4.2) of a step function $\phi$, its mean, average, or expected value is always zero:

$$\mu_I = E\left[\int_a^b \phi(t,\omega)\,dW(t,\omega)\right] = 0 . \tag{4.5}$$

Example 4.6. Recall from Example 4.1 that

$$I(\omega) = \int_0^t W(s,\omega)\,dW(s,\omega) = \tfrac12\big[W(t,\omega)^2 - t\big] .$$

Verify $E[I(\omega)] = 0$.

Solution: From the expression for the integral,

$$\begin{aligned}
E[I] &= E\big[\tfrac12\big(W(t,\omega)^2 - t\big)\big] \\
&= \tfrac12 E[W(t,\omega)^2] - \tfrac12 E[t] \\
&= \tfrac12 t - \tfrac12 t \quad\text{as } E[W(t,\omega)^2] = \operatorname{Var}[W(t,\omega)] = t \\
&= 0 .
\end{aligned}$$

Although we only prove this martingale property$^{39}$ for step functions here, it does hold generally, and hence we have given an example which is not within the class of step functions.

39 The term martingale refers to stochastic processes, such as the Wiener process, for which, in symbols, $E[X(t,\omega) \mid \mathcal{F}_a] = X(a,\omega)$ for $t > a$. The symbol $\mathcal{F}_a$, called a filtration, denotes all the history of the process up to time $a$. This statement says that if you know the history of the process up to time $a$ (that is, conditional on $\mathcal{F}_a$), then the process $X(t,\omega)$ is called a martingale if the expected value thereafter is just $X(a,\omega)$. For example, a Wiener process is a martingale because, given only the history up to time $a$, the independence from the earlier history of subsequent increments in the Wiener process implies that the expected value of the Wiener process does not change from $W(a,\omega)$. The Ito integral may be considered a stochastic function of the end point $b$, and we see that as $b$ varies the expected value of the integral is always the value it has for $b = a$, namely zero. Thus an Ito integral is a martingale.


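Proof. Recall the linearity of the expectation and that $\phi_j$ is independent of $\Delta W_j$, as $\phi_j = \phi(t_j,\omega)$ depends only upon the earlier history of the Wiener process, while the increment $\Delta W_j$ is independent of the earlier history. Thus for the relevant partition, such as one of those in Figure 4.2,

$$\begin{aligned}
E\left[\int_a^b \phi\,dW\right] &= E\left[\sum_{j=0}^{n-1} \phi_j\,\Delta W_j\right] && \text{by definition} \\
&= \sum_{j=0}^{n-1} E\big[\phi_j\,\Delta W_j\big] && \text{by linearity} \\
&= \sum_{j=0}^{n-1} E[\phi_j]\,\underbrace{E[\Delta W_j]}_{=0} && \text{by independence} \\
&= 0 .
\end{aligned}$$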
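The role of independence in this proof can be seen in a small Monte Carlo experiment. This is a hedged sketch, not part of the text's development: the sample sizes and the names `Ito` and `Antic` are assumptions. It contrasts step values $\phi_j = W_j$, which use only the history, with the anticipating choice $W_{j+1}$, which peeks at the current increment and so falls outside the class $V$.

```matlab
% Sketch: Monte Carlo look at the zero mean of an Ito sum, contrasted with
% an anticipating sum. Sizes m, n and the variable names are assumptions.
m  = 20000;  n = 50;  dt = 1/n;
dW = randn(n,m)*sqrt(dt);              % increments on [0,1]
W  = [zeros(1,m); cumsum(dW,1)];       % paths, (n+1)-by-m
Ito   = sum(W(1:n,:).*dW, 1);          % phi_j = W_j depends only on history
Antic = sum(W(2:n+1,:).*dW, 1);        % phi_j = W_{j+1} peeks at Delta W_j
meanIto   = mean(Ito);                 % sample mean, expected near 0
meanAntic = mean(Antic);               % sample mean, expected near 1
disp([meanIto meanAntic])
```

The two sums differ by exactly $\sum_j \Delta W_j^2$, whose expectation is the interval length, so the anticipating version loses the zero mean as soon as $\phi_j$ is no longer independent of $\Delta W_j$.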

Remarkably, irrespective of what integrand we choose in an Ito integral, we always know the mean value of the Ito integral, namely zero! But even more remarkably (see the next theorem), we can also readily determine the variance, that is, the spread, of an Ito integral. The two most important aspects of the stochastic $I(\omega) = \int_a^b \phi\,dW$, its mean and variance, can be determined without ever actually computing the Ito integral!

Theorem 4.7 (Ito isometry). For the Ito integral (4.2) of a step function $\phi(t,\omega)$, its variance is the integral of the expectation of the squared integrand:$^{40}$

$$\sigma_I^2 = \operatorname{Var}\left[\int_a^b \phi(t,\omega)\,dW(t,\omega)\right] = E\left[\left(\int_a^b \phi\,dW\right)^2\right] = \int_a^b E\big[\phi(t,\omega)^2\big]\,dt . \tag{4.6}$$
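The isometry (4.6) can also be probed numerically for a step function whose right-hand side is known in closed form. In the sketch below the step values $\phi_j = \cos W(t_j)$ are an assumed example, for which $E[\phi_j^2] = \tfrac12(1 + e^{-2t_j})$ because $W(t_j)$ is normal with variance $t_j$; the sample sizes and names are likewise illustrative.

```matlab
% Sketch: Monte Carlo check of the Ito isometry for the step function
% phi_j = cos(W(t_j)). The integrand and sizes are illustrative assumptions.
m   = 20000;  n = 50;  dt = 1/n;
tj  = (0:n-1)*dt;                      % left end points t_j of each piece
dW  = randn(n,m)*sqrt(dt);
W   = [zeros(1,m); cumsum(dW,1)];
phi = cos(W(1:n,:));                   % step values, using history only
I   = sum(phi.*dW, 1);                 % Ito integrals, one per realization
lhs = var(I);                          % sample variance of the integral
rhs = sum((1 + exp(-2*tj))/2)*dt;      % sum_j E[phi_j^2]*Delta t_j
disp([lhs rhs])
```

The right-hand side is an ordinary deterministic sum; only the left-hand side needs random numbers, which is the practical content of the theorem.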

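Example 4.8. Recall from Example 4.1 that the Ito integral

$$I(\omega) = \int_0^t W(s,\omega)\,dW(s,\omega) = \tfrac12\big[W(t,\omega)^2 - t\big] .$$

Verify the Ito isometry for this integral.

Solution: Directly from the analytic solution,

$$\begin{aligned}
\operatorname{Var}[I(\omega)] &= \operatorname{Var}\big[\tfrac12\big(W(t,\omega)^2 - t\big)\big] \\
&= \tfrac14 E\big[(W^2 - t)^2\big] \quad\text{as } E[I(\omega)] = 0 \\
&= \tfrac14 E\big[W^4 - 2W^2 t + t^2\big] \\
&= \tfrac14\big[3t^2 - 2\cdot t\cdot t + t^2\big] = \tfrac12 t^2 ,
\end{aligned}$$

whereas from the Ito isometry (4.6),

$$\operatorname{Var}[I(\omega)] = E\left[\left(\int_0^t W(s,\omega)\,dW(s,\omega)\right)^2\right] = \int_0^t E\big[W(s,\omega)^2\big]\,ds = \int_0^t s\,ds = \tfrac12 t^2 .$$

Since these are equal, we verify the Ito isometry for this Ito integral.

40 As if the square can be taken inside the integration and $dW^2 = dt$.

Proof. Again we use the properties of the expectation and that the zero mean increment $\Delta W_j$ is independent of any earlier stochastic quantity. Thus for the relevant partition, as in Figure 4.2,

$$\begin{aligned}
E\left[\left(\int_a^b \phi\,dW\right)^2\right] &= E\left[\left(\sum_{j=0}^{n-1} \phi_j\,\Delta W_j\right)^2\right] \\
&= E\left[\left(\sum_{j=0}^{n-1} \phi_j\,\Delta W_j\right)\left(\sum_{k=0}^{n-1} \phi_k\,\Delta W_k\right)\right] \\
&= E\left[\sum_{j,k=0}^{n-1} \phi_j\,\Delta W_j\,\phi_k\,\Delta W_k\right] \\
&= \sum_{j,k=0}^{n-1} E\big[\phi_j\,\Delta W_j\,\phi_k\,\Delta W_k\big] .
\end{aligned}$$

Now within this double sum three cases arise: If $j < k$, then $\Delta W_k$ is independent of $\phi_j$, $\Delta W_j$, and $\phi_k$ (as $\Delta W_k$ depends only upon times $t > t_k$ whereas $\phi_j$, $\Delta W_j$ and $\phi_k$ depend only upon times $t \le t_k$) so the expectation may be split

$$E\big[\phi_j\,\Delta W_j\,\phi_k\,\Delta W_k\big] = E\big[\phi_j\,\Delta W_j\,\phi_k\big]\,\underbrace{E[\Delta W_k]}_{=0} ,$$

and so all such terms vanish, whereas if $j > k$, then $\Delta W_j$ is independent of $\phi_k$, $\Delta W_k$, and $\phi_j$ so the expectation may be split,

$$E\big[\phi_j\,\Delta W_j\,\phi_k\,\Delta W_k\big] = E\big[\phi_j\,\Delta W_k\,\phi_k\big]\,\underbrace{E[\Delta W_j]}_{=0} ,$$

and so all such terms vanish, leaving just the $j = k$ terms. Thus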

$$\begin{aligned}
E\left[\left(\int_a^b \phi\,dW\right)^2\right] &= \sum_{j=0}^{n-1} E\big[\phi_j^2\,\Delta W_j^2\big] \\
&= \sum_{j=0}^{n-1} E\big[\phi_j^2\big]\,E\big[\Delta W_j^2\big] && \text{by independence} \\
&= \sum_{j=0}^{n-1} E\big[\phi_j^2\big]\,\Delta t_j && \text{by variance of } \Delta W_j \\
&= \int_a^b E\big[\phi^2\big]\,dt
\end{aligned}$$

by the definition of ordinary integration for piecewise constant integrands such as $E[\phi(t,\omega)^2]$.

Suggested activity: Do at least Exercises 4.1 and 4.4.

Second task: Step functions approximate

The second task is to prove that the class of step functions $S$ can be used to approximate arbitrarily well any given Ito process $f(t,\omega) \in V$. Although this second task involves considerable uninteresting material, which we omit, we introduce the interesting and important notion of a Cauchy sequence as a means to approximation.

You are familiar with this issue of approximability every time you work with real numbers. Recall that we approximate real numbers by rationals; for example, we often use $22/7$ to approximate $\pi$, and in a computer all floating point numbers are stored and manipulated as an integer divided by a power of 2. But these approximations are only valid to some prescribed finite error. Rigorous mathematics establishes approximability to every error, no matter how small. Sequences provide a mechanism for such arbitrarily accurate approximation. For example, a well-known approximation to the irrational $\pi/4$ is the sequence of partial sums of the series $1 - 1/3 + 1/5 - 1/7 + \cdots$, namely $1$, $2/3$, $13/15$, $76/105$, and so on. For another example, Newton iteration $x = \tfrac12(x + 2/x)$ provides a sequence of rational approximations to the irrational $\sqrt2$: one such sequence is $1$, $3/2$, $17/12$, $577/408$, $665857/470832$, and so on. Such approximation of reals by a sequence of rationals works because there are rational numbers that are arbitrarily close to any specified real number; we chose a suitable sequence of such rational numbers that converge to the required real number. Similarly, we can choose a sequence of stochastic step functions, such as those shown in Figure 4.3, that converge to any given Ito process. For example, Figure 4.3 shows a sequence of four step functions that approximate the given Ito process increasingly well; they approximate the Ito process in all realizations.

The notion of convergence is a problem. In earlier courses you will have discussed convergence in terms of the distance between an element of the sequence and its limit, for example, $\sqrt2 - 1$, $\sqrt2 - 3/2$, $\sqrt2 - 17/12$,

$\sqrt2 - 577/408$, and so on. Then you will have proved this distance tends to zero and concluded that the sequence converges to the limit. But here the limit is not in the class of things that we can as yet integrate! We can as yet only integrate step functions, so we have no limit for comparison. Thus we use a different definition of convergence: instead of investigating the distance between elements of the sequence with the supposed limit, we investigate the distance between pairs of elements of the sequence. A sequence $I_n$ is termed a Cauchy sequence if, loosely, the differences $I_n - I_m \to 0$ as $n, m \to \infty$, or more precisely, if for all $\epsilon > 0$ there exists an $N$ such that $\|I_n - I_m\| < \epsilon$ for all $n, m > N$ (see, e.g., Kreyszig 1999, §14.1). Table 4.1 gives an example.

Figure 4.3. Five realizations of an Ito process plotted once in each subfigure. Superimposed are four successively refined approximations by step functions. As the number of partitions increases from top-left to bottom-right, the corresponding realizations of the step functions increasingly closely approximate the realizations of the Ito process. (Four panels, each plotting $\phi(t,\omega)$ against $t$.)

The advantage of using the notion of a Cauchy sequence in a test for convergence is that it only involves operations of the elements of the sequence and does not involve comparisons with the actual limit. If a sequence is Cauchy, then the sequence must converge to some limit, even if we do not know what that limit is. This is just what we need here because we have defined Ito integrals only for step functions, not yet general Ito processes. The next subsubsection argues that Ito integrals of converging step functions form a Cauchy sequence, and hence the Ito integrals converge to something. But we need to clearly define distances between stochastic quantities.

Definition 4.9. Measure the distance between two stochastic quantities by

$$\operatorname{dist}[I(\omega), J(\omega)] = E\big[(I(\omega) - J(\omega))^2\big] . \tag{4.7}$$
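The Cauchy idea can be tried on the Newton iterates for $\sqrt2$ quoted earlier: the pairwise distances are computable without ever referring to the limit. This is a minimal sketch; the array names `x` and `D` are illustrative assumptions.

```matlab
% Sketch: pairwise distances |I_m - I_n| between Newton iterates for sqrt(2).
% The names x and D are illustrative assumptions.
x = zeros(1,5);
x(1) = 1;                              % starting guess
for k = 1:4
    x(k+1) = (x(k) + 2/x(k))/2;        % Newton step for x^2 = 2
end
[Im, In] = meshgrid(x, x);             % all pairs of iterates
D = abs(Im - In);                      % D(n,m) = |I_m - I_n|
disp(D)
```

Entries far from the top-left corner of `D` are tiny, which is the Cauchy property in miniature: no comparison with $\sqrt2$ itself is needed.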

Table 4.1. A sequence $I_n$ approximating $\sqrt2$ and the distance between pairs of elements in the sequence, $|I_m - I_n|$. This distance tends to zero as both $m, n \to \infty$ and so is a Cauchy sequence.

     m   I_m              n=1      n=2      n=3      n=4      n=5
     1   1                0        0.5      0.4167   0.4142   0.4142
     2   3/2              0.5      0        0.0833   0.0858   0.0858
     3   17/12            0.4167   0.0833   0        0.0025   0.0025
     4   577/408          0.4142   0.0858   0.0025   0        2.E-06
     5   665857/470832    0.4142   0.0858   0.0025   2.E-06   0

If this distance is zero, then $I$ and $J$ are almost surely the same: realizations can only differ with probability zero. Analogously, measure the distance between two stochastic processes over a time interval $[a,b]$ by

$$\operatorname{dist}[I(t,\omega), J(t,\omega)] = \int_a^b E\big[(I(t,\omega) - J(t,\omega))^2\big]\,dt . \tag{4.8}$$

Establishing this ability of step functions to approximate Ito processes uses techniques that do not illuminate any significant properties of stochastic integration, and so we omit all details. Suffice it to say that Øksendal (1998) uses the norm $\|f\|^2 = \int_a^b E[f^2]\,dt$ on $V$ and proceeds in three steps: step functions may approximate bounded continuous functions, bounded continuous functions may approximate bounded functions, and bounded functions may approximate the Ito processes in $V$. Hence Øksendal concludes that step functions $S$ may approximate arbitrarily well any Ito process in $V$. We skip these uninteresting technicalities. For example, Figure 4.3 plots five realizations of one such sequence of step functions approximating an $f(t,\omega)$ in $V$.

Third task: The limit exists

The third task defines the Ito integral of any $f(t,\omega) \in V$ to be the limit of Ito integrals of step functions. The nontrivial task here is to deduce that the limit exists and is unique.

• Given some stochastic function $f(t,\omega)$ in $V$.
• Find a sequence of step functions $\phi_n(t,\omega)$ that tends to $f(t,\omega)$ in the norm, that is, $\operatorname{dist}(\phi_n, f) = \int_a^b E\big[(\phi_n - f)^2\big]\,dt \to 0$ as $n \to \infty$.
• Compute the Ito integrals $I_n(\omega) = \int_a^b \phi_n(t,\omega)\,dW(t,\omega)$ of these step functions. For example, Table 4.2 records the five realizations of the Ito integral of the step functions shown in Figure 4.3.

Table 4.2. The five realizations of the Ito integral $I_n(\omega)$ of the step functions shown in Figure 4.3. The color matches that of the plot.

• The Ito integrals should converge to some number $I(\omega)$, which is stochastic because it depends upon the realization; call this the integral and denote it by $\int_a^b f\,dW$. For example, it is plausible that each column of Table 4.2 converges to five different realizations of the integral $I(\omega)$; recall that numerical approximations to stochastic integration converge slowly.

Lemma 4.10. The Ito integrals $I_n(\omega) = \int_a^b \phi_n(t,\omega)\,dW(t,\omega)$ form a stochastic Cauchy sequence, in that $E[(I_n - I_m)^2] \to 0$ as $n, m \to \infty$.

Proof. Use the properties of Ito integrals for step functions and see their pairwise distance

$$\begin{aligned}
\operatorname{dist}(I_n, I_m) &= E\big[(I_n - I_m)^2\big] \\
&= E\left[\left(\int_a^b (\phi_n - \phi_m)\,dW\right)^2\right] && \text{by linearity} \\
&= \int_a^b E\big[(\phi_n - \phi_m)^2\big]\,dt && \text{by Ito isometry} \\
&= \int_a^b E\big[((\phi_n - f) + (f - \phi_m))^2\big]\,dt \\
&\le \int_a^b E\big[2(\phi_n - f)^2 + 2(f - \phi_m)^2\big]\,dt && \text{since } (a+b)^2 \le 2a^2 + 2b^2 \text{ by Exercise 4.5} \\
&= 2\operatorname{dist}(\phi_n, f) + 2\operatorname{dist}(f, \phi_m) \\
&\to 0 \quad\text{as } n, m \to \infty \text{ by convergence to } f(t,\omega) .
\end{aligned}$$

Thus the Ito integrals, $I_n(\omega)$, of the sequence of step functions are a (stochastic) Cauchy sequence, and hence must converge. This Cauchy sequence converges almost surely to some value; we call this value the Ito integral of $f(t,\omega)$, denoted $I(\omega) = \int_a^b f(t,\omega)\,dW(t,\omega)$.
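Lemma 4.10 can be watched at work for the integrand $f = W$. In this hedged sketch the step functions $\phi_n$ freeze $W$ at the points of nested partitions with 4, 16, 64, and 256 pieces; the fine grid size `N`, the sample size `m`, and all names are assumptions made for the illustration.

```matlab
% Sketch: mean-square distances between Ito integrals of successively
% refined step-function approximations to f = W on [0,1].
% N, m, ks and the variable names are illustrative assumptions.
m  = 2000;  N = 1024;  dt = 1/N;
dW = randn(N,m)*sqrt(dt);              % fine-grid increments
W  = [zeros(1,m); cumsum(dW,1)];       % fine-grid paths, (N+1)-by-m
ks = [4 16 64 256];                    % number of pieces in each phi_n
I  = zeros(numel(ks), m);
for i = 1:numel(ks)
    step = N/ks(i);                    % fine steps per coarse piece
    idx  = 1:step:N;                   % rows of W at partition points t_j
    Wj   = W(idx,:);                   % phi_n = W(t_j) on [t_j,t_{j+1})
    dWj  = W(idx+step,:) - Wj;         % increments over the coarse pieces
    I(i,:) = sum(Wj.*dWj, 1);          % Ito integral of phi_n
end
d = zeros(1, numel(ks)-1);
for i = 1:numel(ks)-1
    d(i) = mean((I(i+1,:) - I(i,:)).^2);   % estimate of E[(I_n - I_m)^2]
end
disp(d)
```

By the Ito isometry the first distance should be near $3/32$, and each further refinement by a factor of four should reduce the distance by about a factor of four, so the estimates in `d` decrease steadily toward zero.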

Figure 4.4. The same five realizations of an Ito process as in Figure 4.3, plotted with a different sequence of four successively refined approximations by step functions $\psi(t,\omega)$. (Four panels, each plotting $\psi(t,\omega)$ against $t$.)

Table 4.3. The five realizations of the Ito integral $J_n(\omega)$ of the step functions shown in Figure 4.4. The color matches that of the plot.

Lemma 4.11. A stochastic integral $I(\omega) = \int_a^b f(t,\omega)\,dW(t,\omega)$ is unique, that is, though different for different realizations, it is almost always one definite value in each realization.

Proof. Consider any other sequence of step functions $\psi_n(t,\omega)$ that converge to the integrand $f(t,\omega)$; for example, see those plotted in Figure 4.4 which use a different sequence

of partitions than that shown in Figure 4.3. Suppose the Ito integrals $J_n(\omega)$ of this sequence converge to the value $J(\omega)$. For example, Table 4.3 gives integrals $J_n(\omega)$ for the step function sequence shown in Figure 4.4; see that these integrals are different from the integrals $I_n(\omega)$ and so potentially could converge to something different, namely $J(\omega)$ instead of $I(\omega)$. However, the two potential values of the integral, $I(\omega)$ and $J(\omega)$, are almost always the same as their distance

$$\begin{aligned}
\operatorname{dist}(I, J) &= E\big[(I - J)^2\big] \\
&= E\big[((I - I_n) + (I_n - J_n) + (J_n - J))^2\big] \\
&\le E\big[3(I - I_n)^2 + 3(I_n - J_n)^2 + 3(J_n - J)^2\big] && \text{by Exercise 4.5} \\
&= 3E\big[(I - I_n)^2\big] + 3E\big[(I_n - J_n)^2\big] + 3E\big[(J_n - J)^2\big] \\
&\to 0 \quad\text{as } n \to \infty ,
\end{aligned}$$

as the first and last terms tend to zero by the definition of the values $I(\omega)$ and $J(\omega)$ of the integrals, and the middle term also tends to zero because

$$\begin{aligned}
E\big[(I_n - J_n)^2\big] &= E\left[\left(\int_a^b (\phi_n - \psi_n)\,dW\right)^2\right] && \text{by definition and linearity} \\
&= \int_a^b E\big[(\phi_n - \psi_n)^2\big]\,dt && \text{by Ito isometry} \\
&= \int_a^b E\big[(\phi_n - f + f - \psi_n)^2\big]\,dt \\
&\le \int_a^b E\big[2(\phi_n - f)^2 + 2(f - \psi_n)^2\big]\,dt && \text{by Exercise 4.5} \\
&= 2\int_a^b E\big[(\phi_n - f)^2\big]\,dt + 2\int_a^b E\big[(f - \psi_n)^2\big]\,dt \\
&\to 0 \quad\text{as } \phi_n, \psi_n \to f \text{ in this norm.}
\end{aligned}$$

Thus the Ito integral $\int_a^b f(t,\omega)\,dW(t,\omega)$ exists and is almost surely unique.

Suggested activity: Do at least Exercise 4.5.

Fourth task: Properties then follow

The final task is to show that the integral properties for the Ito integral of functions $f(t,\omega) \in V$ follow from those for step functions. There is little of interest in the technicalities, so a proof is omitted. From the five main properties of integrals of step functions, the following are the five main properties of general stochastic integrals:

• linearity,

$$\int_a^b (\alpha f + \beta g)\,dW = \alpha\int_a^b f\,dW + \beta\int_a^b g\,dW ; \tag{4.9}$$

• union,

$$\int_a^b f\,dW + \int_b^c f\,dW = \int_a^c f\,dW ; \tag{4.10}$$

• the Ito integral $\int_a^b f\,dW$ depends only upon the history of the Wiener process up to time $b$ (the integral is $\mathcal{F}_b$-adapted/measurable/previsible);

• the martingale property, that is, the mean, average, or expected value is always zero:

$$E\left[\int_a^b f\,dW\right] = 0 ; \tag{4.11}$$

• Ito isometry, that is, the variance is the ordinary integral of the expectation of the squared integrand:

$$\operatorname{Var}\left[\int_a^b f\,dW\right] = E\left[\left(\int_a^b f\,dW\right)^2\right] = \int_a^b E\big[f^2\big]\,dt . \tag{4.12}$$

We saw some of these properties in previous examples.

Summary

This section put stochastic integrals on a strong footing. As for ordinary integrals, the basis is the approximation by sums over small increments in time. Consequently, stochastic integrals have many of the usual properties such as linearity and union. But they also have additional important properties, such as the martingale property, that their expectation is always zero, and the Ito isometry, that their variance may be computed as an ordinary integral.

4.2 The Ito formula

So far we have seen the Ito formula (2.4)–(2.5), enabling us to differentiate stochastic Ito processes and hence solve some SDEs. Having now put integration on a firm mathematical base, we now proceed to more rigorous analysis of stochastic processes. However, the integral form of the Ito formula, stated below, is now to be rigorously established because we now have a carefully defined Ito integral. The Ito formula also may be used to determine some integrals, for example, the integral $\int W\,dW$, as integration forms a relatively simple subset of the cases of solving differential equations.

Theorem 4.12. Let $f(t,x)$ be a smooth function of its arguments and $X(t,\omega)$ be an Ito process with drift $\mu(t,X)$ and volatility $\sigma(t,X)$, that is, $dX = \mu\,dt + \sigma\,dW$; then $Y(t,\omega) = f(t,X(t,\omega))$ is also an Ito process such that

$$\big[f(t,X(t,\omega))\big]_a^b = \int_a^b \left(\frac{\partial f}{\partial t} + \mu\frac{\partial f}{\partial x} + \tfrac12\sigma^2\frac{\partial^2 f}{\partial x^2}\right)dt + \int_a^b \sigma\frac{\partial f}{\partial x}\,dW \tag{4.13}$$

for all intervals $[a,b]$.

Before proceeding to prove the Ito formula, we here look at its role in determining Ito integrals.

Example 4.13. We have seen that $\int W\,dW = \tfrac12 W(t,\omega)^2 - \tfrac12 t$, but what is $I = \int X\,dX$?$^{41}$

Solution: If $I(\omega)$ was an ordinary integral, we would immediately write down the integral as $\tfrac12 X^2$. So for this stochastic integral we guess $I(\omega)$ has $\tfrac12 X^2$ in it, apply Ito's formula (4.13) to the function $f(t,X) = \tfrac12 X^2$, and see what eventuates:

$$\begin{aligned}
\big[\tfrac12 X^2\big]_a^b &= \int_a^b \big(0 + \mu X + \tfrac12\sigma^2\cdot 1\big)\,dt + \int_a^b \sigma X\,dW \\
&= \int_a^b X(\mu\,dt + \sigma\,dW) + \tfrac12\int_a^b \sigma^2\,dt \\
&= \int_a^b X\,dX + \tfrac12\int_a^b \sigma^2\,dt \\
\Rightarrow\quad \int_a^b X\,dX &= \big[\tfrac12 X^2\big]_a^b - \tfrac12\int_a^b \sigma^2\,dt .
\end{aligned}$$

This reduces to $\int W\,dW$ for which $\sigma = 1$ in the right-hand side.

41 Assume $X(t,\omega)$ is an Ito process with drift $\mu(t,\omega)$ and volatility $\sigma(t,\omega)$, thus view the integral as $I(\omega) = \int X\mu\,dt + \int X\sigma\,dW$.

In Example 4.13 the limits of integration, $a$ and $b$, were carried along in the working and made no real difference to the algebraic manipulations other than increasing the level of detail. It is simpler to neglect to include the limits $a$ and $b$ and symbolically work entirely with indefinite integrals, as we often do hereafter.

Example 4.14. Determine $I(\omega) = \int t\,dW$ in terms of an ordinary integral. Take the view that transforming an Ito integral, $\int\cdots dW$, into an ordinary integral, $\int\cdots dt$, is effectively a solution of the integral. We seek to simplify $\int\cdots dW$ at the acceptable cost of introducing $\int\cdots dt$.

Solution: We might expect $tW(t,\omega)$ to appear in the answer, so we use the Ito formula (4.13) with $f(t,W) = tW(t,\omega)$, for which $f_t = W$, $f_W = t$, and $f_{WW} = 0$, and we also remember that $\mu = 0$ and $\sigma = 1$ for a Wiener process $W(t,\omega)$. Applying (4.13) leads to

$$tW(t,\omega) = \int \big(W + 0\cdot t + \tfrac12\cdot 1^2\cdot 0\big)\,dt + \int 1\cdot t\,dW \quad\Rightarrow\quad \int t\,dW = tW(t,\omega) - \int W\,dt .$$

See that this looks just like ordinary integration by parts.
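The result of Example 4.14 can be checked on a single simulated path over $[0,1]$. This is only a sketch under stated assumptions: the step count `n` and the names `lhs` and `rhs` are illustrative, and the ordinary integral $\int W\,dt$ is approximated by a left Riemann sum.

```matlab
% Sketch: check int_0^1 t dW = t*W - int_0^1 W dt on one simulated path.
% The step count n and the variable names are illustrative assumptions.
n  = 100000;  dt = 1/n;
t  = (0:n)'*dt;                        % time grid on [0,1]
dW = randn(n,1)*sqrt(dt);              % Wiener increments
W  = [0; cumsum(dW)];                  % path with W(0)=0
lhs = sum(t(1:n).*dW);                 % Ito sum for int t dW
rhs = t(end)*W(end) - sum(W(1:n))*dt;  % t*W minus left Riemann sum of W
disp([lhs rhs])
```

Summation by parts shows that the two sides differ by exactly $\Delta t\,W(1)$ here, so the agreement improves in direct proportion to the step size, unlike the slower $\sqrt{\Delta t}$ behaviour seen for $\int W\,dW$.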

Example 4.15. Determine as far as possible $\int W^2\,dW$.

Solution: Guess that $\tfrac13 W^3$ might appear, so consider $f = \tfrac13 W^3$ in the integral form of Ito's formula (4.13):

$$\tfrac13 W^3 = \int \big(0 + 0\cdot W^2 + \tfrac12\cdot 1\cdot 2W\big)\,dt + \int 1\cdot W^2\,dW = \int W\,dt + \int W^2\,dW \quad\Rightarrow\quad \int W^2\,dW = \tfrac13 W^3 - \int W\,dt .$$

Now we proceed to prove the integral form (4.13) of Ito's formula (2.5). We do this now because it is only now that we have properly defined integration of stochastic functions, and because the differential form we used before is viewed as symbolically equivalent to the integral form. Øksendal (1998) [§4.1] gives more details than we present.

Proof. Consider the case where the drift $\mu$ and volatility $\sigma$ are step functions (piecewise constant) with a common partition $a = t_0 < t_1 < \cdots < t_n = b$ with $h = \max_j \Delta t_j$ (so that $h \to 0$ means that the partition is everywhere made of smaller and smaller pieces). Then arbitrary reasonable $\mu$ and $\sigma$ are approximated by a sequence of such step functions, and so the same formula holds. Throughout, observe the crucial role of the independence of Wiener increments $\Delta W_j$ from earlier events. For brevity let an overdot denote $\partial/\partial t$ and a dash denote $\partial/\partial x$ so that, for example, $\dot f_j = \frac{\partial f}{\partial t}(t_j, X_j)$. Then the left-hand side of the Ito formula (4.13) transforms as follows:

$$\begin{aligned}
[f]_a^b &= \sum_{j=0}^{n-1} \Delta f_j \\
&= \sum_{j=0}^{n-1} \big[f(t_{j+1}, X_{j+1}) - f(t_j, X_j)\big] \\
&= \sum_{j=0}^{n-1} \big[f(t_j + \Delta t_j, X_j + \Delta X_j) - f(t_j, X_j)\big] \\
&= \sum_{j=0}^{n-1} \big[\dot f_j\,\Delta t_j + f'_j\,\Delta X_j + \tfrac12\ddot f_j\,\Delta t_j^2 + \dot f'_j\,\Delta t_j\,\Delta X_j + \tfrac12 f''_j\,\Delta X_j^2 + R_j\big]
\end{aligned}$$

by expanding $f(t,x)$ in a Taylor series, assuming $f(t,x)$ is differentiable (even though the process $X(t,\omega)$ need not be) and where the residual term $R_j = O\big(|\Delta t_j|^3 + |\Delta X_j|^3\big)$. Then in the next few steps we use that the drift $\mu$ and volatility $\sigma$ are step functions, are thus constant on each element of the partition, and so $\Delta X_j = \mu_j\,\Delta t_j + \sigma_j\,\Delta W_j$ exactly. Then consider in each of the following dot points each term in turn on the right-hand side of the above.

• $\sum_{j=0}^{n-1} \dot f_j\,\Delta t_j \to \int_a^b \dot f\,dt$ as $h \to 0$ (provided we refine the partition in such a way that $\mu$ and $\sigma$ are step functions in every partition) to give the first term in Ito's integral formula;

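• two more terms are obtained from

$$\sum_{j=0}^{n-1} f'_j\,\Delta X_j = \sum_{j=0}^{n-1} f'_j\,\mu_j\,\Delta t_j + \sum_{j=0}^{n-1} f'_j\,\sigma_j\,\Delta W_j \quad\text{(as step functions)}\quad \to \int_a^b f'\mu\,dt + \int_a^b f'\sigma\,dW \quad\text{as } h \to 0 ;$$

• with the last term in Ito's integral formula coming from the last of the quadratic terms in the sum, as we now show the other quadratic terms are zero starting with

$$\left|\sum_{j=0}^{n-1} \tfrac12\ddot f_j\,\Delta t_j^2\right| \le \sum_{j=0}^{n-1} \tfrac12|\ddot f_j|\,\Delta t_j^2 \le \tfrac12 h\sum_{j=0}^{n-1} |\ddot f_j|\,\Delta t_j \quad(\text{as } h = \max_j \Delta t_j)\quad \to \tfrac12 h\int_a^b |\ddot f|\,dt \to 0 \quad\text{as } h \to 0 ,$$

provided this integral exists (which we assume for $f(t,x)$ of interest) and hence this term must contribute nothing;

• the next quadratic term contributes nothing because

$$\sum_{j=0}^{n-1} \dot f'_j\,\Delta t_j\,\Delta X_j = \sum_{j=0}^{n-1} \dot f'_j\,\Delta t_j(\mu_j\,\Delta t_j + \sigma_j\,\Delta W_j) = \sum_{j=0}^{n-1} \mu_j\dot f'_j\,\Delta t_j^2 + \underbrace{\sum_{j=0}^{n-1} \sigma_j\dot f'_j\,\Delta t_j\,\Delta W_j}_{=Y,\ \text{say}} ,$$

and the first term here vanishes by the previous case (assuming $\int_a^b |\mu\dot f'|\,dt$ exists) whereas the second term, $Y$, is more delicate because it is stochastic, but we determine

$$\begin{aligned}
E[Y^2] &= E\left[\left(\sum_{j=0}^{n-1} \sigma_j\dot f'_j\,\Delta t_j\,\Delta W_j\right)\left(\sum_{k=0}^{n-1} \sigma_k\dot f'_k\,\Delta t_k\,\Delta W_k\right)\right] \\
&= E\left[\sum_{j,k=0}^{n-1} \sigma_j\dot f'_j\,\Delta t_j\,\Delta W_j\,\sigma_k\dot f'_k\,\Delta t_k\,\Delta W_k\right] \\
&= \sum_{j,k=0}^{n-1} E\big[\sigma_j\dot f'_j\,\Delta W_j\,\sigma_k\dot f'_k\,\Delta W_k\big]\,\Delta t_j\,\Delta t_k ,
\end{aligned}$$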

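but all terms in this sum vanish except those for which $j = k$ because,$^{42}$ for example, if $j < k$ then the factor $\Delta W_k$ is independent of all other factors inside the expectation, and so the expectation factors to

$$E\big[\sigma_j\dot f'_j\,\Delta W_j\,\sigma_k\dot f'_k\,\Delta W_k\big] = E\big[\sigma_j\dot f'_j\,\Delta W_j\,\sigma_k\dot f'_k\big]\,\underbrace{E[\Delta W_k]}_{=0} ,$$

and similarly for $j > k$, thus leaving only the terms $k = j$ in

$$\begin{aligned}
E[Y^2] &= \sum_{j=0}^{n-1} E\big[\sigma_j^2\dot f'^2_j\,\Delta W_j^2\big]\,\Delta t_j^2 \\
&= \sum_{j=0}^{n-1} E\big[\sigma_j^2\dot f'^2_j\big]\,\underbrace{E\big[\Delta W_j^2\big]}_{=\Delta t_j}\,\Delta t_j^2 && \text{as } \Delta W_j \text{ is independent of earlier history} \\
&\le h^2\sum_{j=0}^{n-1} E\big[\sigma_j^2\dot f'^2_j\big]\,\Delta t_j \\
&\to h^2\int_a^b E\big[\sigma^2\dot f'^2\big]\,dt \to 0 \quad\text{as } h \to 0 ,
\end{aligned}$$

and thus almost surely the term $Y \to 0$ as $h \to 0$;

• the only significant quadratic term is

$$\sum_{j=0}^{n-1} \tfrac12 f''_j\,\Delta X_j^2 = \sum_{j=0}^{n-1} \tfrac12 f''_j\big(\mu_j\,\Delta t_j + \sigma_j\,\Delta W_j\big)^2 = \sum_{j=0}^{n-1} \tfrac12\mu_j^2 f''_j\,\Delta t_j^2 + \sum_{j=0}^{n-1} \mu_j\sigma_j f''_j\,\Delta t_j\,\Delta W_j + \sum_{j=0}^{n-1} \tfrac12\sigma_j^2 f''_j\,\Delta W_j^2 ,$$

and of these three terms the first two vanish by almost identical arguments to the previous two cases which, setting $c_j = \tfrac12\sigma_j^2 f''_j$ for brevity, leave the sum $\sum_{j=0}^{n-1} c_j\,\Delta W_j^2$ that we want to turn into the integral $\int_a^b c\,dt$ in the rather special way unique to stochastic calculus; we thus compare $\sum_{j=0}^{n-1} c_j\,\Delta W_j^2$ with $\sum_{j=0}^{n-1} c_j\,\Delta t_j$ by showing

42 Note that this is exactly the same argument used to prove the Ito isometry, and which we use again later.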

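(similarly as before) the vanishing of

$$\begin{aligned}
E\left[\left(\sum_{j=0}^{n-1} c_j\,\Delta W_j^2 - \sum_{j=0}^{n-1} c_j\,\Delta t_j\right)^2\right]
&= E\left[\left(\sum_{j=0}^{n-1} c_j\big(\Delta W_j^2 - \Delta t_j\big)\right)^2\right] \\
&= E\left[\left(\sum_{j=0}^{n-1} c_j\big(\Delta W_j^2 - \Delta t_j\big)\right)\left(\sum_{k=0}^{n-1} c_k\big(\Delta W_k^2 - \Delta t_k\big)\right)\right] \\
&= E\left[\sum_{j,k=0}^{n-1} c_j c_k\big(\Delta W_j^2 - \Delta t_j\big)\big(\Delta W_k^2 - \Delta t_k\big)\right] \\
&= \sum_{j,k=0}^{n-1} \underbrace{E\big[c_j c_k\big(\Delta W_j^2 - \Delta t_j\big)\big(\Delta W_k^2 - \Delta t_k\big)\big]}_{=0 \text{ unless } k=j \text{ by independence of increments}} \\
&= \sum_{j=0}^{n-1} E\big[c_j^2\big(\Delta W_j^2 - \Delta t_j\big)^2\big] \\
&= \sum_{j=0}^{n-1} E\big[c_j^2\big]\,E\big[\big(\Delta W_j^2 - \Delta t_j\big)^2\big] && \text{as } \Delta W_j^2 - \Delta t_j \text{ independent of earlier history} \\
&= \sum_{j=0}^{n-1} E\big[c_j^2\big]\,E\big[\Delta W_j^4 - 2\Delta W_j^2\,\Delta t_j + \Delta t_j^2\big] \\
&= \sum_{j=0}^{n-1} E\big[c_j^2\big]\big(3\Delta t_j^2 - 2\Delta t_j\,\Delta t_j + \Delta t_j^2\big) \\
&= 2\sum_{j=0}^{n-1} E\big[c_j^2\big]\,\Delta t_j^2 \\
&\le 2h\sum_{j=0}^{n-1} E\big[c_j^2\big]\,\Delta t_j \\
&\to 2h\int_a^b E\big[c^2\big]\,dt \to 0 \quad\text{as } h \to 0 ,
\end{aligned}$$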

and thus almost surely,

$$\sum_{j=0}^{n-1} c_j\,\Delta W_j^2 \to \sum_{j=0}^{n-1} c_j\,\Delta t_j = \sum_{j=0}^{n-1} \tfrac12\sigma_j^2 f''_j\,\Delta t_j \to \int_a^b \tfrac12\sigma^2 f''\,dt \quad\text{as } h \to 0$$

to give the last term in Ito's integral formula;

• and for the very last task I simply claim that the residuals of cubic and higher order terms vanish, $\sum_{j=0}^{n-1} R_j \to 0$ as $h \to 0$ (part of the argument is Exercise 2.3), leaving just the integral form of the Ito formula.

With this establishment of Ito's formula on a firm theoretical footing, and with the working concepts developed in earlier chapters, you are now able to not only solve practical problems such as the valuation of options, but you also have the understanding to study many of the more theoretical books and articles in financial mathematics and stochastic processes such as that by Øksendal (1998). We terminate here this development of the basic theory that underpins SDEs and their applications.

4.3 Summary

• An SDE such as $dX = \mu(t,X)\,dt + \sigma(t,X)\,dW$ is a convenient shorthand for the Ito integral equation

$$X(b,\omega) - X(a,\omega) = \int_a^b \mu(t,X(t,\omega))\,dt + \int_a^b \sigma(t,X(t,\omega))\,dW .$$

• Crucial properties of Ito integrals are
 – linearity, $\int_a^b (\alpha f + \beta g)\,dW = \alpha\int_a^b f\,dW + \beta\int_a^b g\,dW$;
 – union, $\int_a^b f\,dW + \int_b^c f\,dW = \int_a^c f\,dW$;
 – the Ito integral $\int_a^b f\,dW$ depends only upon the history of the Wiener process up to time $b$ (it is $\mathcal{F}_b$-adapted/measurable/previsible);
 – it is martingale, that is, the mean, average, or expected value is always zero: $E\big[\int_a^b f\,dW\big] = 0$;
 – it has Ito isometry, that is, the variance is the ordinary integral of the expectation of the squared integrand: $\operatorname{Var}\big[\int_a^b f\,dW\big] = E\big[\big(\int_a^b f\,dW\big)^2\big] = \int_a^b E[f^2]\,dt$.

• The integral form of Ito's formula is

$$\big[f(t,X(t,\omega))\big]_a^b = \int_a^b \left(\frac{\partial f}{\partial t} + \mu\frac{\partial f}{\partial x} + \tfrac12\sigma^2\frac{\partial^2 f}{\partial x^2}\right)dt + \int_a^b \sigma\frac{\partial f}{\partial x}\,dW$$

for a stochastic process $X$ satisfying $dX = \mu\,dt + \sigma\,dW$.

• In proving all of the above, a crucial feature is the independence of increments from earlier in the history. The proofs also rest upon the Ito isometry.

Exercises

4.1. Argue from basic principles that $\int_0^T W(t,\omega)\,dW(t,\omega) = \tfrac12\big[W(T,\omega)^2 - T\big]$ by forming a partition over the time interval $[0,T]$, approximating the integral as $I_n(\omega) = \sum_{j=0}^{n-1} W_j\,\Delta W_j$, and showing that these almost surely tend to $I(\omega) = \tfrac12\big[W(T,\omega)^2 - T\big]$ by considering $I - I_n$. Argue that $I - I_n = \tfrac12\sum_{j=0}^{n-1}\big[(\Delta W_j)^2 - \Delta t_j\big]$ by writing $W(T,\omega)^2 - T = \sum_{j=0}^{n-1}\big[\Delta(W_j^2) - \Delta t_j\big]$. Then use some of the ideas appearing in the proof of the Ito isometry and in the steps leading to (2.2) to show that the distance between the integral $I(\omega)$ and the approximate $I_n(\omega)$, measured by $E[(I - I_n)^2]$, must tend to zero.

4.2. By considering the differential $d(W^3 - 3tW)$, where $W(t,\omega)$ is a Wiener process, deduce the Ito integral $I(\omega) = \int_0^T \big[W(t,\omega)^2 - t\big]\,dW(t,\omega)$. Hence verify the martingale property and Ito isometry for this Ito integral.

4.3. Prove the linearity (4.3) of the Ito integral for step functions.

4.4. Prove the union (4.4) of the Ito integral for step functions.

4.5. Prove that $(a+b)^2 \le 2a^2 + 2b^2$ for any real numbers $a$ and $b$ by considering $(a+b)^2 + (a-b)^2$. Similarly prove $(a+b+c)^2 \le 3a^2 + 3b^2 + 3c^2$ for any real $a$, $b$, and $c$.

4.6. Reconsider $E[W(t,\omega)^k]$ for a Wiener process $W(t,\omega)$. Consider $d(W(t,\omega)^k)$ by Ito's formula and the martingale property of Ito integrals to deduce

$$E[W(T,\omega)^k] = \tfrac12 k(k-1)\int_0^T E[W(t,\omega)^{k-2}]\,dt \quad\text{for } k \ge 2 .$$

Hence determine $E[W(t,\omega)^2]$, $E[W(t,\omega)^4]$, and $E[W(t,\omega)^6]$. Compare your solution with that for Exercise 2.1.

4.7. For a nonrandom function $g(t)$, that is, $g$ has no dependence upon the realizations $\omega$ of a Wiener process $W(t,\omega)$, use Ito's formula to show
$$\int g(t)\,dW(t,\omega) = g(t)W(t,\omega) - \int W(t,\omega)\frac{dg}{dt}\,dt .$$

4.8. Let $I_n(t,\omega) = t^{n/2} H_n\!\left(W(t,\omega)/\sqrt{t}\right)$, where $H_n$ is the $n$th Hermite polynomial. What are $I_0(t,\omega), \ldots, I_3(t,\omega)$? Use Ito's formula to show that
$$\int_0^T I_{n-1}(t,\omega)\,dW(t,\omega) = \frac1n I_n(T,\omega) .$$
See how Ito integration maps $I_{n-1}$ to $\frac1n I_n$ just as ordinary integration analogously maps the power $t^{n-1}$ to $\frac1n t^n$. Continue to hence deduce
$$\int\!\cdots\!\int_{0\le t_1\le\cdots\le t_n\le t} dW(t_1,\omega)\,dW(t_2,\omega)\cdots dW(t_n,\omega) = \frac{1}{n!}\,t^{n/2} H_n\!\left(W(t,\omega)/\sqrt{t}\right) .$$
The aim of this exercise is to show that under Ito integration the functions $I_n(t)$ are the stochastic analogue of powers under ordinary integration.

Note: $H_n(x)$ (see (Kreyszig 1999, pp. 246–247) or (Abramowitz and Stegun 1965, Chap. 22)), among other properties, satisfy the recurrences $H_n' = nH_{n-1}$ and $H_n'' - xH_n' + nH_n = 0$, and the first few Hermite polynomials are $H_0(x) = 1$, $H_1(x) = x$, $H_2(x) = x^2 - 1$, and $H_3(x) = x^3 - 3x$.

Answers to selected exercises

4.4. Consider $(a+b+c)^2 + (a-b)^2 + (a-c)^2 + (b-c)^2$.

4.5. $I(\omega) = \tfrac13 W(T,\omega)^3 - T\,W(T,\omega)$ and $\sigma^2 = \tfrac23 T^3$.

4.6. $E[W(t,\omega)^2] = t$, $E[W(t,\omega)^4] = 3t^2$, and $E[W(t,\omega)^6] = 15t^3$.

4.8. $I_0 = 1$, $I_1 = W(t,\omega)$, $I_2 = W^2 - t$, and $I_3 = W^3 - 3tW$.
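The answer to Exercise 4.5 can be checked numerically. The following is a minimal Monte Carlo sketch, not an algorithm from the text: it approximates the Ito integral of $W^2 - t$ by left-endpoint sums over many sample paths and compares the sample mean with zero (martingale property) and the sample variance with $\tfrac23 T^3$ (Ito isometry). The final time T, the number of steps n, and the number of paths m are illustrative assumptions.

```matlab
% Monte Carlo sketch for Exercise 4.5 (all parameter values are assumptions)
T = 1;                          % final time
n = 400;                        % time steps per sample path
m = 5000;                       % number of sample paths
dt = T/n;
t = (0:n-1)*dt;                 % left endpoints t_j of each time step
dW = sqrt(dt)*randn(m,n);       % independent Wiener increments
W = [zeros(m,1), cumsum(dW,2)]; % W(:,j) is the path value at time t_j
f = W(:,1:n).^2 - repmat(t,m,1);% integrand W^2 - t at the left endpoints
I_n = sum(f.*dW,2);             % Ito sums, one per sample path
WT = W(:,n+1);                  % W(T) for each path
I_exact = WT.^3/3 - T*WT;       % closed form from d(W^3 - 3tW)
mean_I = mean(I_n)              % martingale property: should be near 0
var_I = var(I_n)                % Ito isometry: should be near 2*T^3/3
rms_gap = sqrt(mean((I_n-I_exact).^2)) % shrinks as n increases
```

The left-endpoint evaluation of the integrand is what makes these sums approximate the Ito integral; the values printed vary from run to run because the paths are random.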

Appendix A
Extra MATLAB/SCILAB Code

The following algorithms generate plots that evolve in time as they execute, and they generate little movies. This aspect distinguishes them from the algorithms listed in the body of the text which generate static graphs. Since they do not correspond to specific figures in the text, they are gathered here in an appendix.

Algorithm A.1 This algorithm draws a 20 second zoom into the exponential function to demonstrate the smoothness of our usual functions.

    t=linspace(-1,1,1024);
    plot(t,exp(t)),grid
    title('exponential function')
    t0=clock;t=0;
    while t<20
      % shrink the window about the point (0,1) as time passes
      axis([0 0 1 1]+exp(-t/5)*[-1 1 -1 1])
      drawnow
      t=etime(clock,t0);   % elapsed time in seconds
    end

Algorithm A.2 This algorithm draws a 60 second zoom into Brownian motion. Use the Brownian bridge to generate a start curve and new data as the zoom proceeds. Force the Brownian motion to pass through the origin. Note the self-affinity as the vertical is scaled with the square root of the horizontal. Note the infinite number of zero crossings that appear near the original one.

    n=2^10+1; % increase for slower better resolution
    t=linspace(-1,1,n);
    h=diff(t(1:2));
    x=cumsum([0,sqrt(h)*randn(1,n-1)]);
    x=x-x((n+1)/2); % force through origin
    hand=plot(t,x);
    title('Wiener process')
    axis([-1 1 -1 1])
    hold on,plot([-1.1 0 1.1],[0 0 0],'ro-'),hold off
    pause
    t0=clock;tt=0;tfin=60;
    while tt<tfin
      width=exp(-tt/5);
      axis([-width width -sqrt(width) sqrt(width)])
      drawnow
      if h>width/(n/4) % interpolate new data
        h=h/2;
        % keep the middle half of the data at the odd indices
        j=1:2:n; k=(n+3)/4:(3*n+1)/4;
        t(j)=t(k); x(j)=x(k);
        % fill the even indices by Brownian bridge midpoints
        i=2:2:n;
        t(i)=0.5*(t(i-1)+t(i+1));
        x(i)=0.5*(x(i-1)+x(i+1))+sqrt(h/2)*randn(size(i));
        set(hand,'xdata',t,'ydata',x)
      end
      tt=etime(clock,t0);
    end

Algorithm A.3 This algorithm random walks from the specific point (0.7, 0.4) to show that the walkers' first exit locations give a reasonable sample of the boundary conditions. This algorithm compares the numerical and exact solution $u = x^2 - y^2$.

    [x,y]=meshgrid(0:0.1:1);
    clabel(contour(x,y,x.^2-y.^2))
    hold on
    m=200
    t=0;
    z0=[0.7 0.4];
    z=z0(ones(m,1),:);
    plot(z(:,1),z(:,2),'r.','markersize',20)
    % 'erasemode' is accepted only by older MATLAB releases
    h=plot(z(:,1),z(:,2),'b.','markersize',20,'erasemode','xor');
    axis('equal'),axis([0 1 0 1])
    hold off
    pause
    j=1:m;
    t0=clock;
    while length(j)>m/8
      dt=etime(clock,t0)-t;
      t=t+dt;
      z(j,:)=z(j,:)+0.1*sqrt(dt)*randn(length(j),2);
      z=max(0,min(z,1));
      set(h,'xdata',z(:,1),'ydata',z(:,2))
      drawnow
      j=find(max(abs(z'-0.5))<0.5-1e-7);
    end
    % pause to show all points, then discard nonexiting walks
    pause(3)
    j=find(max(abs(z'-0.5))>=0.5-1e-7);
    set(h,'xdata',z(j,1),'ydata',z(j,2))
    drawnow
    actual_u=z0(1)^2-z0(2)^2
    fp=z(j,1).^2-z(j,2).^2;
    estimate_u=mean(fp)
    error_u=std(fp)/sqrt(length(fp)-1)
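The estimate at the heart of Algorithm A.3 can also be computed without any animation. The following is a sketch of that idea under stated assumptions, not the author's algorithm: it replaces the wall-clock time step with a fixed step dt, runs every walker until it reaches the boundary of the unit square, and uses more walkers. The values of m and dt are illustrative choices.

```matlab
% Non-animated variant of Algorithm A.3 (fixed time step is an assumption)
m = 2000;                 % number of walkers (assumed value)
dt = 0.01;                % fixed time step (assumed value)
z0 = [0.7 0.4];           % start point as in Algorithm A.3
z = z0(ones(m,1),:);      % every walker starts at z0
j = 1:m;                  % indices of walkers still inside the square
while ~isempty(j)
  z(j,:) = z(j,:) + 0.1*sqrt(dt)*randn(length(j),2); % random step
  z = max(0,min(z,1));                      % clip overshoots to the boundary
  j = find(max(abs(z'-0.5)) < 0.5-1e-7);    % walkers not yet on the boundary
end
actual_u = z0(1)^2 - z0(2)^2
fp = z(:,1).^2 - z(:,2).^2;   % boundary values at the first exit locations
estimate_u = mean(fp)
error_u = std(fp)/sqrt(m-1)
```

Clipping each overshooting walker back onto the boundary introduces a small bias that decreases with dt, so the estimate should agree with actual_u only to within a few multiples of error_u plus that bias.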


Appendix B
Two Alternate Proofs

B.1 Fokker–Planck equation

This appendix provides an alternate proof of Theorem 3.4 that the Fokker–Planck equation (3.2) governs the evolution of a PDF $p(t,x)$ of a stochastic Ito process. The proof here is more general than that of Section 3.1 in that here we allow both drift and volatility to vary in both $x$ and time $t$. This proof uses some theory from Chapter 4 and some techniques used in continuum mechanics (see, e.g., Roberts 1994).

Proof. Let $f(x)$ denote some arbitrary smooth function. For the Ito process $X(t,\omega)$, define a new Ito process $Y(t,\omega) = f(X(t,\omega))$ from the definition $Y = f(X)$. Ito's formula then tells us that
$$dY = f'(X)\,dX + \tfrac12 f''(X)\,dX^2 = \left[f'(X)\mu(t,X) + \tfrac12 f''(X)\sigma^2(t,X)\right] dt + f'(X)\sigma(t,X)\,dW$$
as $dX = \mu(t,X)\,dt + \sigma(t,X)\,dW$ in general. Now integrate this over any time interval $[a,b]$ to get the Ito integral formula (4.12) for this process $Y$:
$$\int_a^b dY = \int_a^b \left[f'(X)\mu(t,X) + \tfrac12 f''(X)\sigma^2(t,X)\right] dt + \int_a^b f'(X)\sigma(t,X)\,dW .$$
Now $\int_a^b dY = Y(b,\omega) - Y(a,\omega) = f(X(b,\omega)) - f(X(a,\omega))$. Thus Ito's integral formula asserts
$$f(X(b,\omega)) - f(X(a,\omega)) = \int_a^b \left[f'(X)\mu(t,X) + \tfrac12 f''(X)\sigma^2(t,X)\right] dt + \int_a^b f'(X)\sigma(t,X)\,dW . \tag{B.1}$$
Introduce the PDF $p(t,x)$ by taking expectations: recall that
$$E\{g(t,X)\} = \int g(t,x)\,p(t,x)\,dx,$$

where the limits of the $x$ integration are implicitly over all $x$. The expectation of the left-hand side of (B.1) is
$$E\{f(X(b,\omega)) - f(X(a,\omega))\} = \int f(x)p(b,x)\,dx - \int f(x)p(a,x)\,dx = \int f(x)\left[p(b,x) - p(a,x)\right] dx = \int f(x)\int_a^b \frac{\partial p}{\partial t}\,dt\,dx = \int_a^b\!\!\int f(x)\frac{\partial p}{\partial t}\,dx\,dt$$
by the fundamental theorem of calculus that $p(b,x) - p(a,x) = \int_a^b \frac{\partial p}{\partial t}\,dt$. The factor $\partial p/\partial t$ appearing in the integrand becomes the $\partial p/\partial t$ term in the Fokker–Planck equation (3.2).

We derive the $x$ derivative terms in the Fokker–Planck equation from the right-hand side of (B.1). Consider its expectation,
$$E\{f(X(b,\omega)) - f(X(a,\omega))\} = E\left\{\int_a^b \left[f'(X)\mu(t,X) + \tfrac12 f''(X)\sigma^2(t,X)\right] dt\right\} + E\left\{\int_a^b f'(X)\sigma(t,X)\,dW\right\}$$
$$= \int_a^b E\left\{f'(X)\mu(t,X) + \tfrac12 f''(X)\sigma^2(t,X)\right\} dt = \int_a^b\!\!\int \left[f'(x)\mu(t,x)p(t,x) + \tfrac12 f''(x)\sigma^2(t,x)p(t,x)\right] dx\,dt .$$
By the martingale property of Ito integrals (see Theorem 4.5) that $E\left\{\int_a^b \cdot\,dW\right\} = 0$, the second expectation above is zero. By integration by parts,
• $\int f'\mu p\,dx = \big[f\mu p\big]_{x=-\infty}^{\infty} - \int f\frac{\partial}{\partial x}(\mu p)\,dx = -\int f\frac{\partial}{\partial x}(\mu p)\,dx$ as the PDF $p(t,x)$ must go to zero as $x \to \pm\infty$; if it did not, then there would be no way that the area under the PDF could be one, whereas
• integrating by parts twice gives $\int \tfrac12 f''\sigma^2 p\,dx = \tfrac12\big[f'\sigma^2 p - f(\sigma^2 p)'\big]_{-\infty}^{\infty} + \tfrac12\int f\frac{\partial^2}{\partial x^2}(\sigma^2 p)\,dx = \tfrac12\int f\frac{\partial^2}{\partial x^2}(\sigma^2 p)\,dx$ as again the PDF and its derivative must go to zero as $x \to \pm\infty$.

Consequently the right-hand side becomes
$$E\{f(X(b,\omega)) - f(X(a,\omega))\} = \int_a^b \left[-\int f\frac{\partial}{\partial x}(\mu p)\,dx + \tfrac12\int f\frac{\partial^2}{\partial x^2}(\sigma^2 p)\,dx\right] dt = \int_a^b\!\!\int f\left[-\frac{\partial}{\partial x}(\mu p) + \tfrac12\frac{\partial^2}{\partial x^2}(\sigma^2 p)\right] dx\,dt .$$
See the right-hand side of the Fokker–Planck equation (3.2) in this integrand.

Now put the two parts together. Equate the left and right-hand sides:
$$\int_a^b\!\!\int f(x)\frac{\partial p}{\partial t}\,dx\,dt = \int_a^b\!\!\int f(x)\left[-\frac{\partial}{\partial x}(\mu p) + \tfrac12\frac{\partial^2}{\partial x^2}(\sigma^2 p)\right] dx\,dt .$$
Then put all terms under one pair of integrals on the left:
$$\int_a^b\!\!\int f(x)\left[\frac{\partial p}{\partial t} + \frac{\partial}{\partial x}(\mu p) - \tfrac12\frac{\partial^2}{\partial x^2}(\sigma^2 p)\right] dx\,dt = 0 .$$
All the parts of the Fokker–Planck equation (3.2) appear in this integrand; that is, all the parts except the "= 0" bit.

Use proof by contradiction to get the = 0. Recall that $f(x)$ is arbitrary, as is the time interval $[a,b]$. The only way that the continuous factor in the integrand in square brackets can be zero for all $f$, $a$, and $b$ is if it is always zero itself. To see this, suppose the factor inside the brackets [ ] is nonzero, say positive, at some $x = \xi$ and $t = \tau$. Since the expression inside the brackets [ ] is continuous, then it must be still positive in some small time interval around $\tau$, $a < \tau < b$ say, and in some small interval around $\xi$, $c < \xi < d$ say. Choose $f(x)$ to be any smooth function that is positive inside the interval $[c,d]$ and zero outside. Then the integral $\int_a^b\!\int f(x)[\ ]\,dx\,dt$ must be > 0, as the integrand is ≥ 0 and is > 0 over a finite part of the domain. This is the contradiction, as we know the integral is always zero. Hence the supposition must be false: the factor in the brackets [ ] must be everywhere zero, that is,
$$\frac{\partial p}{\partial t} + \frac{\partial}{\partial x}(\mu p) - \tfrac12\frac{\partial^2}{\partial x^2}(\sigma^2 p) = 0,$$
which when rearranged slightly is the Fokker–Planck equation (3.2).

B.2 Kolmogorov backward equation

This section provides an alternate proof of Theorem 3.15 that the Kolmogorov backward equation (3.11) governs the evolution of the conditional PDF $p(t,x|s,y)$ of a stochastic Ito process.

Proof. Consider the conditional PDF $p(\tau,\xi|s,y)$ for any $\xi$ and $y$ and for any times $s < \tau$. Recall that $p\,d\xi$ gives the probability of passing through $x = \xi$ at time $t = \tau$ given the Ito process $X(t,\omega)$ starts at $x = y$ at time $t = s$, that is, $X(s,\omega) = y$.

Define a new Ito process $Z(t,\omega) = p(\tau,\xi|t,X(t,\omega))$ for $s \le t \le \tau$. This is fine because the conditional PDF $p$ is some smooth function: we may or may not know what it is, but it does exist. The Ito process $Z$ also varies with $\xi$ and $\tau$, but we focus on its $t$ dependence. Apply Ito's formula, remembering $dX = \mu\,dt + \sigma\,dW$:
$$dZ = \left[\frac{\partial p}{\partial s}\,dt + \frac{\partial p}{\partial y}\,dX + \tfrac12\frac{\partial^2 p}{\partial y^2}\,dX^2\right]_{(\tau,\xi|t,X)} = \left[\frac{\partial p}{\partial s} + \mu(t,X)\frac{\partial p}{\partial y} + \tfrac12\sigma^2(t,X)\frac{\partial^2 p}{\partial y^2}\right]_{(\tau,\xi|t,X)} dt + \left[\frac{\partial p}{\partial y}\right]_{(\tau,\xi|t,X)} \sigma(t,X)\,dW .$$
See that the terms in the Kolmogorov backward equation (3.11) appear in the $dt$ term; our task is to disentangle them somehow. For simplicity define the new Ito process
$$K(t,\omega) = \left[\frac{\partial p}{\partial s} + \mu(t,X)\frac{\partial p}{\partial y} + \tfrac12\sigma^2(t,X)\frac{\partial^2 p}{\partial y^2}\right]_{(\tau,\xi|t,X)},$$
which contains the terms of the Kolmogorov backward equation (3.11). Our goal is then to extract $K$ from all the other terms. Integrate over any time interval $[a,b]$ such that $s \le a < b \le \tau$ to deduce
$$Z(b,\omega) - Z(a,\omega) = \int_a^b K\,dt + \int_a^b \left[\frac{\partial p}{\partial y}\right]_{(\tau,\xi|t,X)} \sigma(t,X)\,dW .$$
Disentangling $K$ is done by taking the expectation of this equation:
$$E\{Z(b,\omega)\} - E\{Z(a,\omega)\} = \int_a^b E\{K\}\,dt + E\left\{\int_a^b \left[\frac{\partial p}{\partial y}\right]_{(\tau,\xi|t,X)} \sigma(t,X)\,dW\right\} . \tag{B.2}$$
First, by the martingale property of Ito integrals (see Theorem 4.5) that $E\left\{\int_a^b \cdot\,dW\right\} = 0$, the last integral above vanishes. Second, consider $E\{Z(t,\omega)\}$: from the definition of $Z$ and an expectation,
$$E\{Z(t,\omega)\} = \int p(\tau,\xi|t,x)\,p(t,x|s,y)\,dx = p(\tau,\xi|s,y),$$
as the integral is the probability of going from $(s,y)$ to $(t,x)$, and then from $(t,x)$ to $(\tau,\xi)$, integrated over all possible $x$ values. Thus the integral must be just the probability of going from $(s,y)$ to $(\tau,\xi)$. But the right-hand side $p(\tau,\xi|s,y)$ is independent of the intermediate time $t$, and hence the expectation $E\{Z(t,\omega)\}$ is constant in $t$, and so, its change over the interval $E\{Z(b,\omega)\} - E\{Z(a,\omega)\} = 0$. Consequently, equation (B.2) becomes
$$0 = \int_a^b E\{K\}\,dt .$$
But this integral is zero for all intervals $[a,b]$, $s \le a < b < \tau$. Thus, by the same argument as used to prove the Fokker–Planck equation, the integrand $E\{K(t,\omega)\} = 0$ for all $s < t < \tau$.

We have extracted $K(t,\omega)$ at the expense of the expectation. But $K(t,\omega)$ is a continuous Ito process, so the expectation $E\{K(t,\omega)\}$ must also be continuous; hence $E\{K(t,\omega)\} = 0$ for time $t = s$. Take the limit as $t \to s$; then $X(t,\omega) \to y$ and
$$E\{K(s,\omega)\} = E\left\{\left[\frac{\partial p}{\partial s} + \mu(s,y)\frac{\partial p}{\partial y} + \tfrac12\sigma^2(s,y)\frac{\partial^2 p}{\partial y^2}\right]_{(\tau,\xi|s,y)}\right\} = 0 .$$
At $t = s$ there is nothing stochastic remaining inside the expectation. So the expectation is irrelevant and we deduce
$$\frac{\partial p}{\partial s} + \mu(s,y)\frac{\partial p}{\partial y} + \tfrac12\sigma^2(s,y)\frac{\partial^2 p}{\partial y^2} = 0,$$
which must be satisfied for all $\tau$, $\xi$, $s$, and $y$, as they are arbitrary, and hence proves the Kolmogorov backward equation (3.11).
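Both equations proved in this appendix can be checked numerically in the simplest case. The following sketch assumes constant drift and volatility, for which the conditional PDF of $dX = \mu\,dt + \sigma\,dW$ is the Gaussian with mean $y + \mu(t-s)$ and variance $\sigma^2(t-s)$. It evaluates both residuals by central finite differences at one arbitrary point; the values of mu, sigma, the test point, and the step h are illustrative assumptions, and both residuals should be close to zero.

```matlab
% Finite-difference check for constant mu and sigma (assumed values)
mu = 0.3;  sigma = 0.8;
% conditional PDF p(t,x|s,y) of dX = mu dt + sigma dW started at X(s) = y
p = @(t,x,s,y) exp(-(x-y-mu*(t-s)).^2./(2*sigma^2*(t-s))) ...
    ./sqrt(2*pi*sigma^2*(t-s));
t = 1.0;  x = 0.4;  s = 0.2;  y = -0.1;   % arbitrary test point with s < t
h = 1e-3;                                  % finite-difference step
% Fokker-Planck residual in the forward variables (t,x)
pt  = (p(t+h,x,s,y) - p(t-h,x,s,y))/(2*h);
px  = (p(t,x+h,s,y) - p(t,x-h,s,y))/(2*h);
pxx = (p(t,x+h,s,y) - 2*p(t,x,s,y) + p(t,x-h,s,y))/h^2;
fp_residual = pt + mu*px - 0.5*sigma^2*pxx
% Kolmogorov backward residual in the backward variables (s,y)
ps  = (p(t,x,s+h,y) - p(t,x,s-h,y))/(2*h);
py  = (p(t,x,s,y+h) - p(t,x,s,y-h))/(2*h);
pyy = (p(t,x,s,y+h) - 2*p(t,x,s,y) + p(t,x,s,y-h))/h^2;
kb_residual = ps + mu*py + 0.5*sigma^2*pyy
```

With constant coefficients the derivatives of $\mu p$ and $\sigma^2 p$ reduce to $\mu\,\partial p/\partial x$ and $\sigma^2\,\partial^2 p/\partial x^2$, which is why the forward residual takes this simple form; for varying coefficients the products would need differencing as a whole.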


Bibliography

Abramowitz, M., and Stegun, I. A., eds. (1966). Handbook of Mathematical Functions. Dover, New York.

Higham, D. J. (2001). An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Review 43(3), 525–546. Available online at http://link.aip.org/link/?SIR/43/525/1

Higham, D. J. (2008). Modeling and simulating chemical reactions. SIAM Review 50(2), 347–368. Available online at http://link.aip.org/link/?SIR/50/347/1

Kao, E. P. C. (1997). An Introduction to Stochastic Processes. Duxbury Press, Pacific Grove, CA.

Kloeden, P. E., and Platen, E. (1992). Numerical Solution of Stochastic Differential Equations. Vol. 23 of Applications of Mathematics. Springer-Verlag, Berlin.

Kreyszig, E. (1999). Advanced Engineering Mathematics, 8th ed. Wiley, New York.

Øksendal, B. K. (1998). Stochastic Differential Equations: An Introduction with Applications. Springer-Verlag, Berlin.

Roberts, A. J. (1994). A One-Dimensional Introduction to Continuum Mechanics. World Scientific, River Edge, NJ.

Stampfli, J., and Goodman, V. (2001). The Mathematics of Finance: Modeling and Hedging. Brooks/Cole, Pacific Grove, CA.


