
AN INTRODUCTION TO

STOCHASTIC CALCULUS
Marta Sanz-Solé
Facultat de Matemàtiques
Universitat de Barcelona
January 30, 2009
Contents

1 A review of the basics on stochastic processes
1.1 The law of a stochastic process
1.2 Sample paths

2 The Brownian motion
2.1 Equivalent definitions of Brownian motion
2.2 A construction of Brownian motion
2.3 Path properties of Brownian motion
2.4 The martingale property of Brownian motion
2.5 Markov property

3 Itô’s calculus
3.1 Itô’s integral
3.2 The Itô integral as a stochastic process
3.3 An extension of the Itô integral
3.4 A change of variables formula: Itô’s formula
3.4.1 One-dimensional Itô’s formula
3.4.2 Multidimensional version of Itô’s formula

4 Applications of the Itô formula
4.1 Burkholder-Davis-Gundy inequalities
4.2 Representation of L2 Brownian functionals
4.3 Girsanov’s theorem

5 Local time of Brownian motion and Tanaka’s formula

6 Stochastic differential equations
6.1 Examples of stochastic differential equations
6.2 A result on existence and uniqueness of solution
6.3 Some properties of the solution
6.4 Markov property of the solution

7 Numerical approximations of stochastic differential equations

8 Continuous time martingales
8.1 Doob’s inequalities for martingales
8.2 Local martingales
8.3 Quadratic variation of a local martingale

9 Stochastic integrals with respect to continuous martingales

10 Appendix 1: Conditional expectation

11 Appendix 2: Stopping times

1 A review of the basics on stochastic processes

This chapter introduces the notion of a stochastic process and some general definitions related to it. For a more complete account of the topic, we refer the reader to [?]. Let us start with a definition.

Definition 1.1 A stochastic process with state space S is a family {Xi , i ∈ I} of random variables Xi : Ω → S indexed by a set I.

To make progress in the analysis of such an object, one needs to put some structure on the index set I and on the state space S. In this course, we shall mainly deal with the particular cases I = N, Z+ , R+ , and S either a countable set or a subset of R.
The basic problem statisticians are interested in is the analysis of the probability law (mostly described by some parameters) of characters exhibited by populations. For a fixed character described by a random variable X, they use a finite number of independent copies of X (a sample of X). For many purposes, it is interesting to have samples of any size, and therefore to consider sequences Xn , n ≥ 1. It is important here to insist on the word copies, meaning that the circumstances around the different outcomes of X do not change. It is a static world. Hence, they deal with stochastic processes {Xn , n ≥ 1} consisting of independent and identically distributed random variables.
However, this is not the setting we are interested in here. Instead, we would like to give stochastic models for phenomena of the real world which evolve as time passes. Stochasticity is a modeling choice made in the face of incomplete knowledge and extreme complexity. Evolution, in contrast with statics, is what we observe in most phenomena in Physics, Chemistry, Biology, Economics, Life Sciences, etc.
Stochastic processes are well suited for modeling stochastic evolution phe-
nomena. The interesting cases correspond to families of random variables Xi
which are not independent. In fact, the famous classes of stochastic processes
are described by means of types of dependence between the variables of the
process.

1.1 The law of a stochastic process

The probabilistic features of a stochastic process are gathered in the joint distributions of its variables, as given in the next definition.

Definition 1.2 The finite-dimensional joint distributions of the process {Xi , i ∈ I} consist of the multi-dimensional probability laws of any finite family of random vectors (Xi1 , . . . , Xim ), where i1 , . . . , im ∈ I and m ≥ 1 is arbitrary.

Let us give an important example.

Example 1.1 A stochastic process {Xt , t ≥ 0} is said to be Gaussian if its finite-dimensional joint distributions are Gaussian laws.
Remember that in this case, the law of the random vector (Xt1 , . . . , Xtm ) is characterized by two parameters: the mean vector

µ(t1 , . . . , tm ) = E(Xt1 , . . . , Xtm ) = (E(Xt1 ), . . . , E(Xtm ))

and the covariance matrix

Λ(t1 , . . . , tm ) = (Cov(Xti , Xtj ))_{1≤i,j≤m} .
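Since the law of (Xt1 , . . . , Xtm ) is determined by µ and Λ, one can simulate such a vector directly. A minimal sketch; the covariance kernel Cov(Xs , Xt ) = s ∧ t and all numerical values are our illustrative choices, not part of the notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed times t1, ..., tm at which we look at the process (our choice).
times = np.array([0.5, 1.0, 2.0])
mu = np.zeros(len(times))             # mean vector mu(t1, ..., tm)
cov = np.minimum.outer(times, times)  # Lambda, here with Cov(X_s, X_t) = s ^ t

# One realization of the random vector (X_t1, ..., X_tm).
sample = rng.multivariate_normal(mu, cov)

# Sanity check: the empirical covariance of many draws approaches Lambda.
draws = rng.multivariate_normal(mu, cov, size=200_000)
emp_cov = np.cov(draws, rowvar=False)
```

The whole finite-dimensional law is encoded in the pair (mu, cov), which is exactly the content of the two displayed parameters.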

In the sequel we shall assume that I ⊂ R+ and S ⊂ R, either countable or uncountable, and denote by RI the set of real-valued functions defined on I. A stochastic process {Xt , t ≥ 0} can be viewed as a random vector

X : Ω → RI .

Putting the appropriate σ-field of events on RI , say B(RI ), one can define, as for random variables, the law of the process as the mapping

PX (B) = P (X −1 (B)), B ∈ B(RI ).

Mathematical results from measure theory tell us that PX is defined by means of a procedure of extension of measures on cylinder sets, given by the family of all possible finite-dimensional joint distributions. This is a deep result.
In Example 1.1, we have defined a class of stochastic processes by means of the type of its finite-dimensional joint distributions. But does such an object exist? In other words, can one define a stochastic process by giving only its finite-dimensional joint distributions? Roughly speaking, the answer is yes, after adding some extra condition. The precise statement is a famous result by Kolmogorov that we now quote.

Theorem 1.1 Consider a family

{Pt1 ,...,tn , t1 < . . . < tn , n ≥ 1, ti ∈ I} (1.1)

where:

1. Pt1 ,...,tn is a probability on Rn ,

2. if {ti1 < . . . < tim } ⊂ {t1 < . . . < tn }, the probability law Pti1 ,...,tim is the marginal distribution of Pt1 ,...,tn .

Then there exists a stochastic process {Xt , t ∈ I}, defined on some probability space, whose finite-dimensional joint distributions are given by (1.1). That is, the law of the random vector (Xt1 , . . . , Xtn ) is Pt1 ,...,tn .

One can apply this theorem to Example 1.1 to show the existence of Gaussian processes, as follows.
Let K : I × I → R be a symmetric, nonnegative definite function. That means:

• for any s, t ∈ I, K(t, s) = K(s, t);

• for any natural number n, arbitrary t1 , . . . , tn ∈ I and x1 , . . . , xn ∈ R,

Σ_{i,j=1}^{n} K(ti , tj ) xi xj ≥ 0.

Then there exists a Gaussian process {Xt , t ∈ I} such that E(Xt ) = 0 for any t ∈ I and Cov(Xti , Xtj ) = K(ti , tj ), for any ti , tj ∈ I.
To prove this result, fix t1 , . . . , tn ∈ I and set µ = (0, . . . , 0) ∈ Rn , Λ = (K(ti , tj ))_{1≤i,j≤n} and

Pt1 ,...,tn = N (0, Λ).

We denote by (Xt1 , . . . , Xtn ) a random vector with law Pt1 ,...,tn . For any subset {ti1 , . . . , tim } of {t1 , . . . , tn }, it holds that

(Xti1 , . . . , Xtim ) = A(Xt1 , . . . , Xtn ),

where A is the m × n selection matrix with entries A(l, j) = δ_{tj , til} , l = 1, . . . , m, j = 1, . . . , n, and δ_{s,t} denotes the Kronecker delta.
By the properties of Gaussian vectors, the random vector (Xti1 , . . . , Xtim ) has an m-dimensional normal distribution with zero mean and covariance matrix AΛA^t . By the definition of A, it is trivial to check that

AΛA^t = (K(til , tik ))_{1≤l,k≤m} .

Hence, the assumptions of Theorem 1.1 hold true and the result follows.
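The marginal-consistency step, namely that AΛA^t extracts the sub-covariance, can be checked numerically. A sketch; the kernel K(s, t) = s ∧ t and the chosen subset of times are illustrative assumptions of ours:

```python
import numpy as np

# Times t1, ..., tn and the kernel K(s, t) = min(s, t) (illustrative choices).
t = np.array([0.5, 1.0, 1.5, 2.0])
Lam = np.minimum.outer(t, t)          # Lambda = (K(t_i, t_j))

# Selection matrix A with entries delta_{t_j, t_{i_l}}: row l picks out t_{i_l}.
idx = [0, 2]                          # the subset {t_1, t_3}
A = np.zeros((len(idx), len(t)))
for row, j in enumerate(idx):
    A[row, j] = 1.0

# A Lambda A^t should equal the sub-covariance (K(t_{i_l}, t_{i_k})).
sub = A @ Lam @ A.T
expected = np.minimum.outer(t[idx], t[idx])
```

This is the condition (2) of Theorem 1.1 in matrix form: marginals of the prescribed Gaussian laws are again of the prescribed form.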

1.2 Sample paths
In the previous discussion, stochastic processes are considered as random
vectors. In the context of modeling, what matters are observed values of the
process. Observations correspond to fixed values of ω ∈ Ω. This new point
of view leads to the next definition.

Definition 1.3 The sample paths of a stochastic process {Xt , t ∈ I} are the
family of functions indexed by ω ∈ Ω, X(ω) : I → S, defined by X(ω)(t) =
Xt (ω).

Example 1.2 Consider random arrivals of customers at a store. We set our clock at zero and measure the times between two consecutive arrivals. They are random variables X1 , X2 , . . . . Set S0 = 0 and Sn = X1 + · · · + Xn , n ≥ 1. Sn is the time of the n-th arrival. The process we would like to introduce is Nt , giving the number of customers who have visited the store during the time interval [0, t], t ≥ 0.
Clearly, N0 = 0 and for t > 0, Nt = k if and only if

Sk ≤ t < Sk+1 .

The stochastic process {Nt , t ≥ 0} takes values in Z+ . Its sample paths are increasing right-continuous functions, with jumps of size one at the random times Sn , n ≥ 1. It is a particular case of a counting process. Sample paths of counting processes are always increasing right-continuous functions whose jumps are natural numbers.
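The construction Nt = k iff Sk ≤ t < Sk+1 can be sketched directly. The exponential interarrival law below is our own choice (it makes Nt a Poisson process); the counting construction itself is the one from the example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Interarrival times X_1, X_2, ... (exponential law: our illustrative choice).
X = rng.exponential(scale=1.0, size=50)
S = np.concatenate(([0.0], np.cumsum(X)))   # S_0 = 0, S_n = X_1 + ... + X_n

def N(t):
    """N_t = k if and only if S_k <= t < S_{k+1}."""
    return int(np.searchsorted(S, t, side="right")) - 1
```

A plot of t → N(t) would show the increasing right-continuous step paths described above, with unit jumps at the arrival times Sn.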

Example 1.3 Evolution of prices of risky assets can be described by real-valued stochastic processes {Xt , t ≥ 0} with continuous, although very rough, sample paths. They are generalizations of the Brownian motion.
The Brownian motion, also called the Wiener process, is a Gaussian process {Bt , t ≥ 0} with the following parameters:

E(Bt ) = 0,
E(Bs Bt ) = s ∧ t.

This defines the finite-dimensional distributions, and therefore the existence of the process follows from Kolmogorov’s theorem (see Theorem 1.1).

Before giving a heuristic motivation for the preceding definition of Brownian
motion we introduce two further notions.

A stochastic process {Xt , t ∈ I} has independent increments if for any t1 < t2 < . . . < tk the random variables Xt2 − Xt1 , . . . , Xtk − Xtk−1 are independent.

A stochastic process {Xt , t ∈ I} has stationary increments if for any t1 < t2 , the law of the random variable Xt2 − Xt1 is the same as that of Xt2 −t1 .
Brownian motion is named after Robert Brown, a Scottish botanist who observed and reported in 1827 the irregular movements of pollen particles suspended in a liquid. Assume that, when starting the observation, the pollen particle is at position x = 0. Denote by Bt (one coordinate of) the position of the particle at time t > 0. For physical reasons, the trajectories must be continuous functions, and because of the erratic movement, it seems reasonable to say that {Bt , t ≥ 0} is a stochastic process. It also seems reasonable to assume that the change in position of the particle during the time interval [t, t + s] is independent of its previous positions at times τ < t, and therefore to assume that the process has independent increments. The fact that such increments must be stationary is explained by kinetic theory, assuming that the temperature during the experiment remains constant.
The model for the law of Bt was given by Einstein in 1905. More precisely, Einstein’s definition of Brownian motion is that of a stochastic process with independent and stationary increments such that the law of an increment Bt − Bs , s < t, is Gaussian with zero mean and E(Bt − Bs )^2 = t − s.
This definition is equivalent to the one given before.

2 The Brownian motion
2.1 Equivalent definitions of Brownian motion
This chapter is devoted to the study of Brownian motion, the process introduced in Example 1.3, which we recall now.
Definition 2.1 The stochastic process {Bt , t ≥ 0} is a one-dimensional
Brownian motion if it is Gaussian, zero mean and with covariance function
given by E (Bt Bs ) = s ∧ t.
The existence of such a process is ensured by Kolmogorov’s theorem. Indeed, it suffices to check that

(s, t) → Γ(s, t) = s ∧ t

is nonnegative definite. That means, for any ti , tj ≥ 0 and any real numbers ai , i = 1, . . . , m,

Σ_{i,j=1}^{m} ai aj Γ(ti , tj ) ≥ 0.

But

s ∧ t = ∫_0^∞ 1_{[0,s]} (r) 1_{[0,t]} (r) dr.

Hence,

Σ_{i,j=1}^{m} ai aj (ti ∧ tj ) = Σ_{i,j=1}^{m} ai aj ∫_0^∞ 1_{[0,ti ]} (r) 1_{[0,tj ]} (r) dr
                              = ∫_0^∞ ( Σ_{i=1}^{m} ai 1_{[0,ti ]} (r) )^2 dr ≥ 0.

Notice also that, since E(B0^2 ) = 0, the random variable B0 is zero almost surely.
Each random variable Bt , t > 0, of the Brownian motion has a density, given by

pt (x) = (2πt)^{−1/2} exp(−x^2 /(2t)),

while for t = 0, its "density" is a Dirac mass at zero, δ_{0} .
Differentiating pt (x) once with respect to t, and then twice with respect to x, easily yields

∂_t pt (x) = (1/2) ∂_{xx} pt (x),
p0 (x) = δ_{0} .

This is the heat equation on R with initial condition p0 (x) = δ_{0} . That means that, as time evolves, the density of the random variables of the Brownian motion behaves like a diffusive physical phenomenon.
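The heat-equation property of the Gaussian density can be verified numerically with finite differences. A sketch; the evaluation point and step sizes are arbitrary choices of ours:

```python
import numpy as np

def p(t, x):
    """Density of B_t: p_t(x) = (2 pi t)^(-1/2) exp(-x^2 / (2t))."""
    return np.exp(-x**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

# Central finite differences at an arbitrary point (t, x).
t0, x0, h = 1.0, 0.3, 1e-4
dp_dt = (p(t0 + h, x0) - p(t0 - h, x0)) / (2.0 * h)
d2p_dx2 = (p(t0, x0 + h) - 2.0 * p(t0, x0) + p(t0, x0 - h)) / h**2
residual = dp_dt - 0.5 * d2p_dx2   # should be close to zero
```

The residual vanishes up to discretization error, which is the pointwise content of the heat equation above.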
There are equivalent definitions of Brownian motion, such as the one given in the next result.

Proposition 2.1 A stochastic process {Xt , t ≥ 0} is a Brownian motion if and only if

(i) X0 = 0, a.s.,

(ii) for any 0 ≤ s < t, the random variable Xt − Xs is independent of the σ-field generated by Xr , 0 ≤ r ≤ s, denoted σ(Xr , 0 ≤ r ≤ s), and Xt − Xs is a N (0, t − s) random variable.

Proof: Let us assume first that {Xt , t ≥ 0} is a Brownian motion. Then E(X0^2 ) = 0. Thus, X0 = 0 a.s.
Let Hs and H̃s be the vector subspaces of L2 (Ω) spanned by (Xr , 0 ≤ r ≤ s) and (Xs+u − Xs , u ≥ 0), respectively. Since for any 0 ≤ r ≤ s

E (Xr (Xs+u − Xs )) = r ∧ (s + u) − r ∧ s = r − r = 0,

Hs and H̃s are orthogonal in L2 (Ω). Since all the variables involved are jointly Gaussian, orthogonality implies independence; consequently, Xt − Xs is independent of the σ-field σ(Xr , 0 ≤ r ≤ s).
Since linear combinations of Gaussian random variables are also Gaussian, Xt − Xs is normal, with E(Xt − Xs ) = 0 and

E(Xt − Xs )^2 = t + s − 2s = t − s.

This ends the proof of properties (i) and (ii).
Assume now that (i) and (ii) hold true. Then the finite-dimensional distributions of {Xt , t ≥ 0} are multidimensional normal, and for 0 ≤ s ≤ t,

E(Xt Xs ) = E ((Xt − Xs + Xs )Xs ) = E ((Xt − Xs )Xs ) + E(Xs^2 )
          = E(Xt − Xs )E(Xs ) + E(Xs^2 ) = E(Xs^2 ) = s = s ∧ t.

Remark 2.1 We shall see later that Brownian motion has continuous sample paths. The description of the process given in the preceding proposition tells us that such a process is a model for a random evolution which starts from x = 0 at time t = 0, in which the law of an increment depends only on its length (stationarity), and in which the future evolution of the process is independent of its past (Markov property).

Remark 2.2 It is easy to prove that if B = {Bt , t ≥ 0} is a Brownian motion, so is −B = {−Bt , t ≥ 0}. Moreover, for any λ > 0, the process

Bλ = { λ^{−1} B_{λ^2 t} , t ≥ 0 }

is also a Brownian motion. This means that, zooming in or out, we will observe the same sort of behaviour.

2.2 A construction of Brownian motion

There are several ways to obtain a Brownian motion. Here we shall give P. Lévy’s construction, which also provides the continuity of the sample paths. Before going through the details of this construction, we mention an alternative.
Brownian motion as limit of a random walk
Let {ξj , j ∈ N} be a sequence of independent, identically distributed random variables with mean zero and variance σ^2 > 0. Consider the sequence of partial sums defined by S0 = 0, Sn = ξ1 + · · · + ξn . We have already seen in previous chapters that the sequence {Sn , n ≥ 0} is a Markov chain, and also a martingale.
Let us consider the continuous time stochastic process defined by linear interpolation of {Sn , n ≥ 0}, as follows. For any t ≥ 0, let [t] denote its integer part. Then set

Yt = S_{[t]} + (t − [t])ξ_{[t]+1} , (2.1)

for any t ≥ 0.
The next step is to scale the sample paths of {Yt , t ≥ 0}. By analogy with the scaling in the statement of the central limit theorem, we set

Bt^{(n)} = (σ√n)^{−1} Y_{nt} , (2.2)

t ≥ 0.
A famous result in probability theory, Donsker’s theorem, tells us that the sequence of processes {Bt^{(n)} , t ≥ 0}, n ≥ 1, converges in law to the Brownian motion. The reference sample space is the set of continuous functions vanishing at zero. Hence, in proving the statement, we obtain continuity of the sample paths of the limit.
Donsker’s theorem is the infinite-dimensional version of the above mentioned central limit theorem. Considering s = k/n, t = (k + 1)/n, the increment Bt^{(n)} − Bs^{(n)} = (σ√n)^{−1} ξ_{k+1} is a random variable with mean zero and variance t − s. Hence Bt^{(n)} is not that far from the Brownian motion, and this is what Donsker’s theorem proves.
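The rescaling (2.2) is easy to simulate. A sketch with ±1 steps (so σ = 1), evaluated on the grid {k/n} where the interpolation term of (2.1) vanishes; all numerical sizes are our choices:

```python
import numpy as np

rng = np.random.default_rng(2)

# One path of B^(n) on the grid {k/n}.
n = 100_000
xi = rng.choice([-1.0, 1.0], size=n)          # i.i.d., mean 0, variance 1
S = np.concatenate(([0.0], np.cumsum(xi)))    # S_0 = 0, S_k = xi_1 + ... + xi_k
B = S / np.sqrt(n)                            # B^(n)_{k/n} = S_k / (sigma sqrt(n))

# Many independent copies of the endpoint B^(n)_1 should look like N(0, 1).
m, steps = 2_000, 500
finals = rng.choice([-1.0, 1.0], size=(m, steps)).sum(axis=1) / np.sqrt(steps)
```

Plotting B against the grid shows the characteristic rough Brownian-looking path; the empirical mean and variance of `finals` reflect the central limit scaling.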

P. Lévy’s construction of Brownian motion

An important ingredient in the procedure is a sequence of functions defined on [0, 1], termed Haar functions, defined as follows:

h0 (t) = 1,
h_n^k (t) = 2^{n/2} 1_{[2k/2^{n+1} , (2k+1)/2^{n+1} [} (t) − 2^{n/2} 1_{[(2k+1)/2^{n+1} , (2k+2)/2^{n+1} [} (t),

where n ≥ 0 and k ∈ {0, 1, . . . , 2^n − 1}.


The set of functions (h0 , h_n^k ) is a CONS (complete orthonormal system) of L2 ([0, 1], B([0, 1]), λ), where λ stands for the Lebesgue measure. Consequently, for any f ∈ L2 ([0, 1], B([0, 1]), λ), we can write the expansion

f = ⟨f, h0 ⟩h0 + Σ_{n=0}^{∞} Σ_{k=0}^{2^n −1} ⟨f, h_n^k ⟩h_n^k , (2.3)

where the notation ⟨·, ·⟩ means the inner product in L2 ([0, 1], B([0, 1]), λ).
Using (2.3), we define an isometry between L2 ([0, 1], B([0, 1]), λ) and L2 (Ω, F, P ) as follows. Consider a family (N0 , N_n^k ) of independent random variables with law N (0, 1). Then, for f ∈ L2 ([0, 1], B([0, 1]), λ), set

I(f ) = ⟨f, h0 ⟩N0 + Σ_{n=0}^{∞} Σ_{k=0}^{2^n −1} ⟨f, h_n^k ⟩N_n^k .

Clearly,

E ( I(f )^2 ) = ||f ||_2^2 .

Hence I defines an isometry between L2 ([0, 1], B([0, 1]), λ) and the space of random variables L2 (Ω, F, P ). Moreover, since

I(f ) = lim_{m→∞} ( ⟨f, h0 ⟩N0 + Σ_{n=0}^{m} Σ_{k=0}^{2^n −1} ⟨f, h_n^k ⟩N_n^k ),

the random variable I(f ) is N (0, ||f ||_2^2 ), and by Parseval’s identity

E(I(f )I(g)) = ⟨f, g⟩, (2.4)

for any f, g ∈ L2 ([0, 1], B([0, 1]), λ).


Theorem 2.1 The process B = {Bt = I(1_{[0,t]} ), t ∈ [0, 1]} defines a Brownian motion indexed by [0, 1]. Moreover, the sample paths are continuous, almost surely.

Proof: By construction B0 = 0. Notice that for 0 ≤ s ≤ t ≤ 1, Bt − Bs = I(1_{]s,t]} ). Hence, by virtue of (2.4), the process B has independent increments and Bt − Bs has a N (0, t − s) law. From this, by a change of variables, one obtains that the finite-dimensional distributions of {Bt , t ∈ [0, 1]} are Gaussian. By Proposition 2.1, we obtain the first statement.
Our next aim is to prove that the series appearing in

Bt = I(1_{[0,t]} ) = ⟨1_{[0,t]} , h0 ⟩N0 + Σ_{n=0}^{∞} Σ_{k=0}^{2^n −1} ⟨1_{[0,t]} , h_n^k ⟩N_n^k
                  = g0 (t)N0 + Σ_{n=0}^{∞} Σ_{k=0}^{2^n −1} g_n^k (t)N_n^k (2.5)

converges uniformly. In the last term we have introduced the Schauder functions, defined as follows:

g0 (t) = ⟨1_{[0,t]} , h0 ⟩ = t,
g_n^k (t) = ⟨1_{[0,t]} , h_n^k ⟩ = ∫_0^t h_n^k (s) ds,

for any t ∈ [0, 1].
By construction, for each fixed n the functions g_n^k , k = 0, . . . , 2^n − 1, are nonnegative, have disjoint supports and satisfy

g_n^k (t) ≤ 2^{−n/2} .
Thus,

sup_{t∈[0,1]} | Σ_{k=0}^{2^n −1} g_n^k (t)N_n^k | ≤ 2^{−n/2} sup_{0≤k≤2^n −1} |N_n^k |.

The next step consists in proving that sup_k |N_n^k | is bounded by some constant depending on n such that, when multiplied by 2^{−n/2} , the resulting series converges.
For this, we will use a result on large deviations for Gaussian measures along with the first Borel-Cantelli lemma.
Lemma 2.1 For any random variable X with law N (0, 1) and for any a ≥ 1,

P (|X| ≥ a) ≤ e^{−a^2 /2} .

Proof: We clearly have

P (|X| ≥ a) = (2/√(2π)) ∫_a^∞ e^{−x^2 /2} dx ≤ (2/√(2π)) ∫_a^∞ (x/a) e^{−x^2 /2} dx
            = (2/(a√(2π))) e^{−a^2 /2} ≤ e^{−a^2 /2} ,

where we have used that 1 ≤ x/a on the domain of integration and 2/(a√(2π)) ≤ 1.

We now move to the Borel-Cantelli based argument. By the preceding lemma,

P ( sup_{0≤k≤2^n −1} |N_n^k | > 2^{n/4} ) ≤ Σ_{k=0}^{2^n −1} P ( |N_n^k | > 2^{n/4} )
                                        ≤ 2^n exp ( −2^{n/2−1} ).

It follows that

Σ_{n=0}^{∞} P ( sup_{0≤k≤2^n −1} |N_n^k | > 2^{n/4} ) < +∞,

and by the first Borel-Cantelli lemma

P ( lim inf_{n→∞} { sup_{0≤k≤2^n −1} |N_n^k | ≤ 2^{n/4} } ) = 1.

That is, a.s. there exists n0 , which may depend on ω, such that

sup_{0≤k≤2^n −1} |N_n^k | ≤ 2^{n/4}

for any n ≥ n0 . Hence, we have proved

sup_{t∈[0,1]} | Σ_{k=0}^{2^n −1} g_n^k (t)N_n^k | ≤ 2^{−n/2} sup_{0≤k≤2^n −1} |N_n^k | ≤ 2^{−n/4} ,

a.s., for n big enough, which proves the a.s. uniform convergence of the series (2.5).
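The construction can be implemented directly by truncating the series (2.5) at a finite level and sampling the coefficients. A sketch under numerical choices of ours (grid, truncation level, sample sizes); truncating at level 8 leaves a variance deficit of at most 2^{−9}:

```python
import numpy as np

rng = np.random.default_rng(3)

def schauder(n, k, t):
    """g_n^k(t) = integral_0^t h_n^k(s) ds: a tent supported on
    [2k/2^(n+1), (2k+2)/2^(n+1)] with peak value 2^(-n/2 - 1)."""
    left = 2 * k / 2 ** (n + 1)
    mid = (2 * k + 1) / 2 ** (n + 1)
    right = (2 * k + 2) / 2 ** (n + 1)
    up = 2 ** (n / 2) * np.clip(t - left, 0.0, mid - left)
    down = 2 ** (n / 2) * np.clip(t - mid, 0.0, right - mid)
    return up - down

# Basis: g_0(t) = t plus the Schauder tents up to a truncation level.
t = np.linspace(0.0, 1.0, 513)
basis = [t] + [schauder(n, k, t) for n in range(8) for k in range(2 ** n)]
G = np.stack(basis)                       # shape (256, 513)

# Each path is a sum of basis functions weighted by independent N(0, 1) draws.
m = 5_000
paths = rng.standard_normal((m, G.shape[0])) @ G

var_half = paths[:, 256].var()                    # Var(B_{1/2}) ~ 1/2
cov_est = np.mean(paths[:, 128] * paths[:, 384])  # E(B_{1/4} B_{3/4}) ~ 1/4
```

Each row of `paths` is one approximate Brownian path; the empirical variance and covariance at dyadic times match E(Bs Bt) = s ∧ t.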

Next we discuss how, from Theorem 2.1, we can get a Brownian motion indexed by R+ . To this end, let us consider a sequence B^k , k ≥ 1, of independent Brownian motions indexed by [0, 1]. That means, for each k ≥ 1, B^k = {Bt^k , t ∈ [0, 1]} is a Brownian motion, and for different values of k they are independent. Then we define a Brownian motion recursively as follows. Set Bt = Bt^1 for t ∈ [0, 1]; for k ≥ 1 and t ∈ [k, k + 1], set

Bt = B_1^1 + B_1^2 + · · · + B_1^k + B_{t−k}^{k+1} .

Such a process is Gaussian, zero mean, with E(Bt Bs ) = s ∧ t. Hence it is a Brownian motion.

We end this section by giving the notion of d-dimensional Brownian motion, for a natural number d ≥ 1. For d = 1 it is the process we have seen so far. For d > 1, it is the process defined by

Bt = (Bt^1 , Bt^2 , . . . , Bt^d ), t ≥ 0,

where the components are independent one-dimensional Brownian motions.

2.3 Path properties of Brownian motion

We already know that the trajectories of Brownian motion are a.s. continuous functions. However, since the process is a model for particles wandering erratically, one expects rough behaviour. This section is devoted to proving some results that make these facts more precise.
Firstly, it is possible to prove that the sample paths of Brownian motion are γ-Hölder continuous. The main tool for this is Kolmogorov’s continuity criterion:
Proposition 2.2 Let {Xt , t ≥ 0} be a stochastic process satisfying the following property: for some positive real numbers α, β and C,

E (|Xt − Xs |^α ) ≤ C|t − s|^{1+β} .

Then almost surely, the sample paths of the process are γ-Hölder continuous for any γ < β/α.

The law of the random variable Bt − Bs is N (0, t − s). Thus, it is possible to compute the moments, and we have

E ( (Bt − Bs )^{2k} ) = ((2k)! / (2^k k!)) (t − s)^k ,

for any k ∈ N. Therefore, Proposition 2.2 (applied with α = 2k, β = k − 1) yields that almost surely, the sample paths of the Brownian motion are γ-Hölder continuous for any γ ∈ (0, 1/2).
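The even-moment formula for Brownian increments can be checked by Monte Carlo. A sketch; the time points and sample size are our own choices:

```python
import math

import numpy as np

rng = np.random.default_rng(4)

# E[(B_t - B_s)^(2k)] = (2k)! / (2^k k!) * (t - s)^k.
s0, t0 = 0.3, 1.1
incr = rng.normal(0.0, np.sqrt(t0 - s0), size=1_000_000)

def even_moment(k, dt):
    """The exact even moment of a N(0, dt) increment."""
    return math.factorial(2 * k) / (2 ** k * math.factorial(k)) * dt ** k

emp2 = np.mean(incr ** 2)   # k = 1: expect t - s
emp4 = np.mean(incr ** 4)   # k = 2: expect 3 (t - s)^2
```

For k = 2 the formula gives 3(t − s)^2, the classical fourth moment of a centered Gaussian, which is the quantity fed into Kolmogorov's criterion.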
Nowhere differentiability
We shall prove that the exponent γ = 1/2 above is sharp. As a consequence we will obtain a celebrated result by Dvoretzky, Erdős and Kakutani telling us that a.s. the sample paths of Brownian motion are nowhere differentiable. We gather these results in the next theorem.

Theorem 2.2 Fix any γ ∈ (1/2, 1]; then a.s. the sample paths of {Bt , t ≥ 0} are nowhere Hölder continuous with exponent γ. As a consequence, a.s. the sample paths are nowhere differentiable and are of infinite variation on each finite interval.

Proof: Let γ ∈ (1/2, 1] and assume that a sample path t → Bt (ω) is γ-Hölder continuous at s ∈ [0, 1). Then

|Bt (ω) − Bs (ω)| ≤ C|t − s|^γ ,

for any t ∈ [0, 1] and some constant C > 0.
Let n be big enough and let i = [ns] + 1; by the triangle inequality, for j ≥ i,

|B_{j/n} (ω) − B_{(j+1)/n} (ω)| ≤ |Bs (ω) − B_{j/n} (ω)| + |Bs (ω) − B_{(j+1)/n} (ω)|
                               ≤ C ( |s − j/n|^γ + |s − (j + 1)/n|^γ ).

Hence, restricting to j = i, i + 1, . . . , i + N − 1, we obtain

|B_{j/n} (ω) − B_{(j+1)/n} (ω)| ≤ C (N/n)^γ = M/n^γ ,

with M = CN^γ . Define

A_{M,n}^i = { |B_{j/n} (ω) − B_{(j+1)/n} (ω)| ≤ M/n^γ , j = i, i + 1, . . . , i + N − 1 }.

We have seen that the set of trajectories where t → Bt (ω) is γ-Hölder continuous at some s is included in

∪_{M=1}^{∞} ∪_{k=1}^{∞} ∩_{n=k}^{∞} ∪_{i=1}^{n} A_{M,n}^i .

Next we prove that this set has null probability. Indeed,

P ( ∩_{n=k}^{∞} ∪_{i=1}^{n} A_{M,n}^i ) ≤ lim inf_{n→∞} P ( ∪_{i=1}^{n} A_{M,n}^i )
                                       ≤ lim inf_{n→∞} Σ_{i=1}^{n} P ( A_{M,n}^i )
                                       ≤ lim inf_{n→∞} n [ P ( |B_{1/n} | ≤ M/n^γ ) ]^N ,

where we have used that the random variables B_{j/n} − B_{(j+1)/n} are N (0, 1/n) and independent. But

P ( |B_{1/n} | ≤ M/n^γ ) = √(n/(2π)) ∫_{−M n^{−γ}}^{M n^{−γ}} e^{−nx^2 /2} dx
                         = (1/√(2π)) ∫_{−M n^{1/2−γ}}^{M n^{1/2−γ}} e^{−x^2 /2} dx ≤ C n^{1/2−γ} .

Hence, by taking N such that N (γ − 1/2) > 1,

P ( ∩_{n=k}^{∞} ∪_{i=1}^{n} A_{M,n}^i ) ≤ lim inf_{n→∞} n [ C n^{1/2−γ} ]^N = 0.

Since this holds for any k and M , we get

P ( ∪_{M=1}^{∞} ∪_{k=1}^{∞} ∩_{n=k}^{∞} ∪_{i=1}^{n} A_{M,n}^i ) = 0.

This ends the proof of the first part of the theorem.

If a sample path t → Bt (ω) were differentiable at some s, then it would be Hölder continuous at s with exponent γ = 1, and we have just proved that this may only happen with probability zero.
It is known that a real function of finite variation on some finite interval is differentiable almost everywhere on that interval. Since the sample paths are a.s. nowhere differentiable, they must be of infinite variation on each finite interval. This yields the last assertion of the theorem and ends its proof.

Quadratic variation
The notion of quadratic variation provides a measure of the roughness of a function. The existence of variations of different orders is also important in procedures of approximation via a Taylor expansion, and in the development of infinitesimal calculus. We will study here the existence of the quadratic variation, i.e. the variation of order two, of the Brownian motion. As shall be explained in more detail in the next chapter, this provides an explanation of the fact that the rules of Itô’s stochastic calculus differ from those of classical deterministic differential calculus.
Fix a finite interval [0, T ] and consider a sequence of partitions Πn = (0 = t_0^n ≤ t_1^n ≤ . . . ≤ t_{rn}^n = T ). We assume that

lim_{n→∞} |Πn | = 0,

where |Πn | denotes the norm of the partition Πn :

|Πn | = sup_{j=0,...,rn −1} (t_{j+1}^n − t_j^n ).

Set ∆k B = B_{t_k^n} − B_{t_{k−1}^n} .

Proposition 2.3 The sequence { Σ_{k=1}^{rn} (∆k B)^2 , n ≥ 1 } converges in L2 (Ω) to the constant random variable T . That is,

lim_{n→∞} E [ ( Σ_{k=1}^{rn} (∆k B)^2 − T )^2 ] = 0.

Proof: For the sake of simplicity, we shall omit the dependence on n. Set ∆k t = tk − tk−1 . Notice that the random variables (∆k B)^2 − ∆k t, k = 1, . . . , rn , are independent and centered. Thus,

E [ ( Σ_{k=1}^{rn} (∆k B)^2 − T )^2 ] = E [ ( Σ_{k=1}^{rn} ( (∆k B)^2 − ∆k t ) )^2 ]
                                     = Σ_{k=1}^{rn} E [ ( (∆k B)^2 − ∆k t )^2 ]
                                     = Σ_{k=1}^{rn} [ 3(∆k t)^2 − 2(∆k t)^2 + (∆k t)^2 ]
                                     = 2 Σ_{k=1}^{rn} (∆k t)^2 ≤ 2T |Πn |,

which clearly tends to zero as n tends to infinity. 


This proposition, together with the continuity of the sample paths of Brownian motion, yields

sup_n Σ_{k=1}^{rn} |∆k B| = ∞, a.s.,

something that we already know from Theorem 2.2.
Indeed, assume that V := sup_n Σ_{k=1}^{rn} |∆k B| < ∞. Then

Σ_{k=1}^{rn} (∆k B)^2 ≤ ( sup_k |∆k B| ) Σ_{k=1}^{rn} |∆k B| ≤ V sup_k |∆k B|.

By the uniform continuity of the sample paths on [0, T ], sup_k |∆k B| → 0 as n → ∞. We would obtain lim_{n→∞} Σ_{k=1}^{rn} (∆k B)^2 = 0 a.s., which contradicts the result proved in Proposition 2.3.
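Both phenomena are visible in simulation: along finer partitions the sum of squared increments stabilizes at T, while the sum of absolute increments grows without bound. A sketch; T and the resolutions are our choices:

```python
import numpy as np

rng = np.random.default_rng(5)

# Brownian increments over uniform partitions of [0, T].
T = 2.0
results = {}
for n in (2 ** 8, 2 ** 12, 2 ** 16):
    dB = rng.normal(0.0, np.sqrt(T / n), size=n)   # Delta_k B, k = 1, ..., n
    qv = float(np.sum(dB ** 2))                    # sum of squared increments
    tv = float(np.sum(np.abs(dB)))                 # sum of absolute increments
    results[n] = (qv, tv)

qv_fine, tv_fine = results[2 ** 16]
```

The quadratic sums hover near T = 2 at every resolution, whereas the absolute sums grow roughly like the square root of the number of increments.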

2.4 The martingale property of Brownian motion

We start this section by giving the definition of a martingale for continuous time stochastic processes. First, we introduce the appropriate notion of filtration, as follows.
A family {Ft , t ≥ 0} of sub σ–fields of F is termed a filtration if

1. F0 contains all the sets of F of null probability,

2. For any 0 ≤ s ≤ t, Fs ⊂ Ft .

If in addition
∩s>t Fs = Ft ,
for any t ≥ 0, the filtration is said to be right-continuous.

Definition 2.2 A stochastic process {Xt , t ≥ 0} is a martingale with respect to the filtration {Ft , t ≥ 0} if each variable belongs to L1 (Ω) and moreover

1. Xt is Ft –measurable for any t ≥ 0,

2. for any 0 ≤ s ≤ t, E(Xt /Fs ) = Xs .

If the equality in (2) is replaced by ≤ (respectively, ≥), we have a supermartingale (respectively, a submartingale).
Given a stochastic process {Xt , t ≥ 0}, there is a natural way to define a filtration by considering

Ft = σ(Xs , 0 ≤ s ≤ t), t ≥ 0.

To ensure that property (1) of a filtration holds, one needs to complete the σ-fields. In general, there is no reason to expect right-continuity. However, for the Brownian motion, the natural filtration possesses this property.
An integrable stochastic process with centered independent increments possesses the martingale property with respect to its natural filtration. Indeed, for 0 ≤ s ≤ t,

E(Xt − Xs /Fs ) = E(Xt − Xs ) = 0.

Hence, a Brownian motion possesses the martingale property with respect to its natural filtration.
Other examples of martingales with respect to the same filtration, related to the Brownian motion, are

1. {Bt^2 − t, t ≥ 0},

2. {exp(aBt − a^2 t/2), t ≥ 0}, a ∈ R.

Indeed, for the first example, let us consider 0 ≤ s ≤ t. Then,

E(Bt^2 /Fs ) = E((Bt − Bs + Bs )^2 /Fs )
            = E((Bt − Bs )^2 /Fs ) + 2E((Bt − Bs )Bs /Fs ) + E(Bs^2 /Fs ).
Since Bt − Bs is independent of Fs , owing to the properties of the conditional expectation, we have

E((Bt − Bs )^2 /Fs ) = E((Bt − Bs )^2 ) = t − s,
E((Bt − Bs )Bs /Fs ) = Bs E(Bt − Bs /Fs ) = 0,
E(Bs^2 /Fs ) = Bs^2 .

Consequently,

E(Bt^2 − Bs^2 /Fs ) = t − s.

For the second example, we also use the property of independent increments, as follows:

E( exp(aBt − a^2 t/2) /Fs ) = exp(aBs ) E( exp(a(Bt − Bs ) − a^2 t/2) /Fs )
                            = exp(aBs ) E( exp(a(Bt − Bs ) − a^2 t/2) ).

Using the density of the random variable Bt − Bs , one can easily check that

E( exp(a(Bt − Bs ) − a^2 t/2) ) = exp( a^2 (t − s)/2 − a^2 t/2 ) = exp(−a^2 s/2).

Therefore, we obtain

E( exp(aBt − a^2 t/2) /Fs ) = exp(aBs − a^2 s/2).
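One consequence of the martingale property is that E(exp(aBt − a^2 t/2)) = 1 for every t, since the process starts at 1. This is easy to confirm by Monte Carlo; the value of a, the times and the sample size below are our choices:

```python
import numpy as np

rng = np.random.default_rng(6)

# Check E[exp(a B_t - a^2 t / 2)] = 1 at several times t.
a = 0.7
means = {}
for t in (0.5, 1.0, 2.0):
    Bt = rng.normal(0.0, np.sqrt(t), size=1_000_000)
    means[t] = float(np.mean(np.exp(a * Bt - a ** 2 * t / 2.0)))
```

The constancy of the mean in t is exactly the martingale identity E(Mt) = E(M0) for Mt = exp(aBt − a^2 t/2).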

2.5 Markov property

For any 0 ≤ s < t, x ∈ R and A ∈ B(R), we set

p(s, t, x, A) = (2π(t − s))^{−1/2} ∫_A exp( −|x − y|^2 /(2(t − s)) ) dy. (2.6)

Actually, p(s, t, x, A) is the probability that a Normal random variable with mean x and variance t − s takes values in a fixed set A.
Let us prove the following identity:

P {Bt ∈ A / Fs } = p(s, t, Bs , A), (2.7)

which means that, conditionally on the past of the Brownian motion up to time s, the law of Bt at a future time t only depends on Bs .

Let f : R → R be a bounded measurable function. Then, since Bs is Fs –measurable and Bt − Bs is independent of Fs , we obtain

E (f (Bt )/ Fs ) = E (f (Bs + (Bt − Bs )) / Fs ) = E (f (x + Bt − Bs )) |_{x=Bs} .

The random variable x + Bt − Bs is N (x, t − s). Thus,

E (f (x + Bt − Bs )) = ∫_R f (y) p(s, t, x, dy),

and consequently,

E (f (Bt )/ Fs ) = ∫_R f (y) p(s, t, Bs , dy).

This yields (2.7) by taking f = 1A .


Going back to (2.6), we notice that the function x → p(s, t, x, A) is measurable, and the mapping A → p(s, t, x, A) is a probability.
Let us prove the additional property, called the Chapman-Kolmogorov equation: for any 0 ≤ s ≤ u ≤ t,

p(s, t, x, A) = ∫_R p(u, t, y, A) p(s, u, x, dy). (2.8)

We recall that the sum of two independent Normal random variables is again Normal, with mean the sum of the respective means and variance the sum of the respective variances. In terms of densities, writing f_{N(m,σ)} for the density of a Normal law with mean m and variance σ, this reads

(f_{N(x,σ1)} ∗ f_{N(0,σ2)})(z) = ∫_R f_{N(x,σ1)} (y) f_{N(0,σ2)} (z − y) dy = f_{N(x,σ1+σ2)} (z).

Using this fact, we obtain

∫_R p(u, t, y, A) p(s, u, x, dy) = ∫_A dz ( f_{N(x,u−s)} ∗ f_{N(0,t−u)} )(z)
                                = ∫_A dz f_{N(x,t−s)} (z) = p(s, t, x, A),

proving (2.8).
This equation is the time-continuous analogue of the property enjoyed by the transition probability matrices of a Markov chain, namely

Π^(m+n) = Π^(m) Π^(n) ,

meaning that evolutions in m + n steps are obtained by concatenating m-step and n-step evolutions. In (2.8), m + n is replaced by the real time t − s, m by t − u, and n by u − s, respectively.
We are now prepared to give the definition of a Markov process.
Consider a mapping

p : R+ × R+ × R × B(R) → R+ ,

satisfying the properties

(i) for any fixed s, t ∈ R+ , A ∈ B(R),

x → p(s, t, x, A)

is B(R)–measurable,

(ii) for any fixed s, t ∈ R+ , x ∈ R,

A → p(s, t, x, A)

is a probability,

(iii) Equation (2.8) holds.

Such a function p is termed a Markovian transition function. Let us also fix a probability µ on B(R).

Definition 2.3 A real valued stochastic process {Xt , t ∈ R+ } is a Markov


process with initial law µ and transition probability function p if

(a) the law of X0 is µ,

(b) for any 0 ≤ s ≤ t,

P {Xt ∈ A/Fs } = p(s, t, Xs , A).

Therefore, we have proved that the Brownian motion is a Markov process with initial law the Dirac measure at 0 and transition probability function p as defined in (2.6).

Strong Markov property

Throughout this section, (Ft , t ≥ 0) will denote the natural filtration associated with a Brownian motion {Bt , t ≥ 0}, and stopping times will always refer to this filtration.

Theorem 2.3 Let T be a stopping time. Then, conditionally on {T < ∞}, the process defined by

Bt^T = B_{T +t} − B_T , t ≥ 0,

is a Brownian motion independent of FT .
Proof: Assume that T < ∞ a.s. We shall prove that for any A ∈ FT , any
choice of parameters 0 ≤ t1 < · · · < tp and any continuous and bounded
function f on Rp , we have
E[1_A f(B^T_{t_1}, \cdots, B^T_{t_p})] = P(A)\, E[f(B_{t_1}, \cdots, B_{t_p})].    (2.9)
This suffices to prove all the assertions of the theorem. Indeed, by taking
A = Ω, we see that the finite dimensional distributions of B and B T coincide.
On the other hand, (2.9) states the independence of 1A and the random vector
(BtT1 , · · · , BtTp ). By a monotone class argument, we get the independence of
1A and B T .
The continuity of the sample paths of B implies, a.s.,

f(B^T_{t_1}, \cdots, B^T_{t_p}) = f(B_{T+t_1} - B_T, \ldots, B_{T+t_p} - B_T)
    = \lim_{n\to\infty} \sum_{k=1}^{\infty} 1_{\{(k-1)2^{-n} < T \le k2^{-n}\}}\, f(B_{k2^{-n}+t_1} - B_{k2^{-n}}, \ldots, B_{k2^{-n}+t_p} - B_{k2^{-n}}).

Since f is bounded, we can apply bounded convergence and write

E[1_A f(B^T_{t_1}, \cdots, B^T_{t_p})]
    = \lim_{n\to\infty} \sum_{k=1}^{\infty} E[1_{\{(k-1)2^{-n} < T \le k2^{-n}\}}\, 1_A\, f(B_{k2^{-n}+t_1} - B_{k2^{-n}}, \ldots, B_{k2^{-n}+t_p} - B_{k2^{-n}})].

Since A ∈ FT , the event A ∩ {(k − 1)2−n < T ≤ k2−n } ∈ Fk2−n . Hence, by


the Markov property proved before,
E[1_{\{(k-1)2^{-n} < T \le k2^{-n}\}}\, 1_A\, f(B_{k2^{-n}+t_1} - B_{k2^{-n}}, \ldots, B_{k2^{-n}+t_p} - B_{k2^{-n}})]
    = P[A \cap \{(k-1)2^{-n} < T \le k2^{-n}\}]\, E[f(B_{t_1}, \cdots, B_{t_p})].

Summing over k on both sides of the preceding identity yields (2.9), and this
finishes the proof when T < ∞ a.s.
If P(T = ∞) > 0, we can argue as before and obtain

E[1_{A \cap \{T<\infty\}} f(B^T_{t_1}, \cdots, B^T_{t_p})] = P(A \cap \{T < \infty\})\, E[f(B_{t_1}, \cdots, B_{t_p})].

An interesting consequence of the preceding property is given in the next
proposition.

Proposition 2.4 (Reflection principle). For any t > 0, set St = sups≤t Bs .
Then, for any a ≥ 0 and b ≤ a,

P {St ≥ a, Bt ≤ b} = P {Bt ≥ 2a − b} . (2.10)

As a consequence, the probability laws of S_t and |B_t| are the same.

Proof: Consider the stopping time

Ta = inf{t ≥ 0, Bt = a},

which is finite a.s. We have

P\{S_t \ge a, B_t \le b\} = P\{T_a \le t, B_t \le b\} = P\{T_a \le t, B^{T_a}_{t-T_a} \le b - a\}.

Indeed, B^{T_a}_{t-T_a} = B_t - B_{T_a} = B_t - a, and B and B^{T_a} have the same law.
Moreover, we know that these processes are independent of F_{T_a}.
This last property, along with the fact that B^{T_a} and -B^{T_a} have the same
law, yields that (T_a, B^{T_a}) has the same distribution as (T_a, -B^{T_a}).
Define H = \{(s, w) \in \mathbb{R}_+ \times C(\mathbb{R}_+; \mathbb{R}) : s \le t, w(t - s) \le b - a\}. Then

P\{T_a \le t, B^{T_a}_{t-T_a} \le b - a\} = P\{(T_a, B^{T_a}) \in H\} = P\{(T_a, -B^{T_a}) \in H\}
    = P\{T_a \le t, -B^{T_a}_{t-T_a} \le b - a\} = P\{T_a \le t, 2a - b \le B_t\}
    = P\{2a - b \le B_t\},

where the last equality is a consequence of the inclusion \{2a - b \le B_t\} \subset
\{T_a \le t\}. This ends the proof of (2.10).
For the second assertion, we notice that \{B_t \ge a\} \subset \{S_t \ge a\}. This fact, along
with (2.10) applied with b = a, yields the identities

P\{S_t \ge a\} = P\{S_t \ge a, B_t \le a\} + P\{S_t \ge a, B_t \ge a\}
    = 2P\{B_t \ge a\} = P\{|B_t| \ge a\},

since the event \{B_t = a\} has probability zero.
The proof is now complete.
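The identity in law of S_t and |B_t| can be seen on simulated paths. The sketch below is a pure-Python illustration, not part of the text: the discrete-time maximum slightly undershoots the continuous supremum, so the comparison uses a generous tolerance; step counts, seed and tolerance are arbitrary choices.

```python
import math
import random

random.seed(1)
t, a = 1.0, 0.8
n_steps, n_paths = 600, 6000
dt = t / n_steps
sd = math.sqrt(dt)

hits, tails = 0, 0
for _ in range(n_paths):
    b, running_max = 0.0, 0.0
    for _ in range(n_steps):
        b += random.gauss(0.0, sd)
        if b > running_max:
            running_max = b
    hits += running_max >= a      # approximates the event {S_t >= a}
    tails += abs(b) >= a          # the event {|B_t| >= a}

p_sup = hits / n_paths
p_abs = tails / n_paths
# the two probabilities agree up to Monte Carlo and discretization error
assert abs(p_sup - p_abs) < 0.05
```

Both counters estimate 2P{B_t ≥ a} ≈ 0.42 for these parameter values, as predicted by the proposition.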




3 Itô’s calculus
Itô's calculus was developed in the 1950s by Kiyoshi Itô in an attempt to
give rigorous meaning to some differential equations driven by Brownian
motion, appearing in the study of problems related with continuous
time Markov processes. Roughly speaking, one could say that Itô's calculus
is an analogue of the classical Newton and Leibniz calculus for stochastic
processes. In fact, in classical mathematical analysis, there are several extensions
of the Riemann integral \int f(x)\,dx. For example, if g is an increasing bounded
function (or the difference of two functions of this class), the Lebesgue-Stieltjes
integral gives a precise meaning to the integral \int f(x)\,g(dx), for some set of
functions f. However, before Itô's development, no theory allowing nowhere
differentiable integrators g was known. Brownian motion, introduced in the
preceding chapter, is an example of a stochastic process whose sample paths, although
continuous, are nowhere differentiable. Therefore, the Lebesgue-Stieltjes
integral does not apply to the sample paths of Brownian motion.
There are many motivations coming from a variety of disciplines to consider
stochastic differential equations driven by a Brownian motion. Such an object
is defined as

dXt = σ(t, Xt )dBt + b(t, Xt )dt,


X0 = x0 ,

or in integral form,
Z t Z t
Xt = x0 + σ(s, Xs )dBs + b(s, Xs )ds. (3.1)
0 0

The first notion to be introduced is that of the stochastic integral. In fact, in (3.1)
the integral \int_0^t b(s, X_s)\,ds might be defined pathwise, but this is not the case
for \int_0^t \sigma(s, X_s)\,dB_s, because of the roughness of the paths of the integrator.
More explicitly, it is not possible to fix ω ∈ Ω, then to consider the path
\sigma(s, X_s(\omega)), and finally to integrate with respect to B_s(\omega).

3.1 Itô’s integral


Along this section, we will consider a Brownian motion B = {Bt , t ≥ 0}
defined on a probability space (Ω, F, P ). We also will consider a filtration
(Ft , t ≥ 0) satisfying the following properties:
1. B is adapted to (Ft , t ≥ 0),

2. for every t ≥ 0, the σ-field generated by {B_u − B_t, u ≥ t} is independent of F_t.

Notice that these two properties are satisfied if (Ft , t ≥ 0) is the natural
filtration associated to B.
We fix a finite time horizon T and define L^2_{a,T} to be the set of stochastic processes
u = {u_t, t ∈ [0, T]} satisfying the following conditions:

(i) u is adapted and jointly measurable in (t, ω), with respect to the product
σ-field B([0, T ]) × F.
(ii) \int_0^T E(u_t^2)\, dt < \infty.

The notation L2a,T evokes the two properties -adaptedness and square
integrability- described before.
Consider first the subset of L^2_{a,T} consisting of step processes, that is, stochastic
processes which can be written as

u_t = \sum_{j=1}^{n} u_j 1_{[t_{j-1}, t_j)}(t),    (3.2)

with 0 = t0 ≤ t1 ≤ · · · ≤ tn = T and where uj , j = 1, . . . , n, are Ftj−1 –


measurable square integrable random variables. We shall denote by E the
set of these processes.
For step processes, the Itô stochastic integral is defined by the very natural
formula

\int_0^T u_t\, dB_t = \sum_{j=1}^{n} u_j (B_{t_j} - B_{t_{j-1}}),    (3.3)

which we may compare with the Lebesgue integral of simple functions. Notice
that \int_0^T u_t\, dB_t is a random variable. Of course, we would like to be able to
consider more general integrands than step processes. Therefore, we must try
to extend the definition (3.3). For this, we have to use tools from Functional
Analysis based upon a very natural idea: If we are able to prove that (3.3)
gives a continuous functional between two metric spaces, then the stochastic
integral defined for the very particular class of step stochastic processes could
be extended to a more general class given by the closure of this set with
respect to a suitable norm.
The idea of continuity is made precise by the isometry property:

E\left(\int_0^T u_t\, dB_t\right)^2 = E\left(\int_0^T u_t^2\, dt\right).    (3.4)

Let us prove (3.4) for step processes. Writing \Delta_j B = B_{t_j} - B_{t_{j-1}}, clearly

E\left(\int_0^T u_t\, dB_t\right)^2 = \sum_{j=1}^{n} E\left(u_j^2 (\Delta_j B)^2\right) + 2\sum_{j<k} E\left(u_j u_k (\Delta_j B)(\Delta_k B)\right).

The measurability property of the random variables u_j, j = 1, \ldots, n, implies
that u_j^2 is independent of (\Delta_j B)^2. Hence, the contribution of the first term
on the right-hand side of the preceding identity is equal to

\sum_{j=1}^{n} E(u_j^2)(t_j - t_{j-1}) = \int_0^T E(u_t^2)\, dt.

For the second term, we notice that for fixed j and k, j < k, the random
variables uj uk ∆j B are independent of ∆k B. Therefore,

E(uj uk (∆j B)(∆k B)) = E(uj uk (∆j B))E(∆k B) = 0.

Thus, we have (3.4).


This property tells us that the stochastic integral is a continuous functional
defined on E, endowed with the norm of L^2(\Omega \times [0, T]), taking values in the
set L^2(\Omega) of square integrable random variables.
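The isometry (3.4) is easy to test by simulation for a step process with deterministic heights, where both sides can be computed explicitly. The following pure-Python sketch is illustrative only; the particular heights, interval lengths, seed and tolerance are arbitrary choices.

```python
import math
import random

random.seed(2)
# a step process with deterministic heights u_j on [t_{j-1}, t_j)
heights = [1.0, -2.0, 0.5]
lengths = [1.0, 1.0, 1.0]     # the interval lengths t_j - t_{j-1}

# exact value of E(∫ u_t^2 dt) for this step process
exact = sum(u * u * d for u, d in zip(heights, lengths))   # = 5.25

# Monte Carlo estimate of E(∫ u_t dB_t)^2 using (3.3): the integral is
# sum_j u_j (B_{t_j} - B_{t_{j-1}}) with independent Gaussian increments
# of variance t_j - t_{j-1}
n = 50_000
acc = 0.0
for _ in range(n):
    integral = sum(u * random.gauss(0.0, math.sqrt(d))
                   for u, d in zip(heights, lengths))
    acc += integral * integral
estimate = acc / n

assert abs(estimate - exact) < 0.15
```

The two quantities agree within Monte Carlo error, as (3.4) predicts.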
Other properties of the stochastic integral of step processes

1. The stochastic integral is a centered random variable. Indeed,

E\left(\int_0^T u_t\, dB_t\right) = E\left(\sum_{j=1}^{n} u_j (B_{t_j} - B_{t_{j-1}})\right) = \sum_{j=1}^{n} E(u_j)\, E(B_{t_j} - B_{t_{j-1}}) = 0,

where we have used that the random variables u_j and B_{t_j} - B_{t_{j-1}} are
independent and, moreover, E(B_{t_j} - B_{t_{j-1}}) = 0.

2. Linearity: if u^1, u^2 are two step processes and a, b ∈ \mathbb{R}, then clearly
au^1 + bu^2 is also a step process and

\int_0^T (au^1 + bu^2)(t)\, dB_t = a\int_0^T u^1(t)\, dB_t + b\int_0^T u^2(t)\, dB_t.

The next step consists of identifying a set of random processes, bigger than E,
in which E is dense with respect to the norm of L^2(\Omega \times [0, T]). This is precisely
the set denoted before by L^2_{a,T}. Indeed, we have the following result, which is
a crucial fact in Itô's theory.

Proposition 3.1 For any u ∈ L^2_{a,T} there exists a sequence (u^n, n ≥ 1) ⊂ E
such that

\lim_{n\to\infty} \int_0^T E\left((u^n_t - u_t)^2\right) dt = 0.

Sketch of the proof: Assume that u has continuous sample paths, a.s. An
approximating sequence can be defined as follows:

u^n(t) = \sum_{k=0}^{[nT]} u\left(\frac{k}{n}\right) 1_{[\frac{k}{n}, \frac{k+1}{n})}(t).

By continuity, we have \lim_{n\to\infty} \sup_k \sup_{t \in \Delta_k} |u^n(t) - u(t)| = 0, where
\Delta_k = [\frac{k}{n}, \frac{k+1}{n}) denotes the k-th subinterval. Hence, by uniform integrability, also

\lim_{n\to\infty} \sup_k \sup_{t \in \Delta_k} E|u^n(t) - u(t)|^2 = 0.

Then

E\int_0^T |u^n(t) - u(t)|^2\, dt = \sum_{k=0}^{[nT]} E\int_{\frac{k}{n}}^{\frac{k+1}{n} \wedge T} \left|u\left(\frac{k}{n}\right) - u(t)\right|^2 dt
    \le T \sup_k \sup_{t \in \Delta_k} E|u^n(t) - u(t)|^2,

and the result follows.


For non continuous integrands, one proceeds in two steps. Firstly, continuous
processes related to u are constructed via convolution. Then one can apply
the first step of the proof.
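For u_t = B_t, which is continuous and adapted, the L² error of the step approximation above can even be computed in closed form: each subinterval contributes \int (t - k/n)\,dt = 1/(2n^2), so E\int_0^T |u^n_t - u_t|^2 dt = T/(2n). The following pure-Python sketch checks this rate by simulation; grid sizes, seed and tolerance are arbitrary choices.

```python
import math
import random

random.seed(3)
T = 1.0
n_blocks = 10          # the "n" of the approximating step process u^n
sub = 100              # sub-grid points per block, to discretize the dt-integral
m = n_blocks * sub
dt = T / m

n_paths = 2000
acc = 0.0
for _ in range(n_paths):
    b = 0.0
    block_start_value = 0.0
    err = 0.0
    for i in range(m):
        if i % sub == 0:
            block_start_value = b    # u^n freezes B at the left endpoint k/n
        b += random.gauss(0.0, math.sqrt(dt))
        err += (b - block_start_value) ** 2 * dt
    acc += err
estimate = acc / n_paths   # Monte Carlo value of E ∫ |u^n - u|^2 dt

exact = T / (2 * n_blocks)  # closed-form value, = 0.05 here
assert abs(estimate - exact) < 0.01
```

The estimate matches T/(2n), illustrating the convergence claimed in Proposition 3.1.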

Owing to this fact, we can give the following definition.

Definition 3.1 The Itô stochastic integral of a process u ∈ L^2_{a,T} is

\int_0^T u_t\, dB_t := L^2(\Omega)\text{-}\lim_{n\to\infty} \int_0^T u^n_t\, dB_t.    (3.5)

For this definition to make sense, one needs to make sure that if the
process u is approximated by two different sequences, say u^{n,1} and u^{n,2}, the
resulting definitions of the stochastic integral coincide. This
is proved using the isometry property. Indeed,

E\left(\int_0^T u^{n,1}_t\, dB_t - \int_0^T u^{n,2}_t\, dB_t\right)^2 = \int_0^T E\left(u^{n,1}_t - u^{n,2}_t\right)^2 dt
    \le 2\int_0^T E\left(u^{n,1}_t - u_t\right)^2 dt + 2\int_0^T E\left(u^{n,2}_t - u_t\right)^2 dt \to 0.
By its very definition, the stochastic integral defined in Definition 3.1 satisfies
the isometry property as well. Moreover,

(a) stochastic integrals are centered random variables:

E\left(\int_0^T u_t\, dB_t\right) = 0;

(b) stochastic integration is a linear operator:

\int_0^T (au_t + bv_t)\, dB_t = a\int_0^T u_t\, dB_t + b\int_0^T v_t\, dB_t.

Remember that these facts are true for processes in E, as has been mentioned
before. The extension to processes in L2a,T is done by applying Proposition
3.1. For the sake of illustration we prove (a).
Consider an approximating sequence u^n in the sense of Proposition 3.1. By
the construction of the stochastic integral \int_0^T u_t\, dB_t, it holds that

\lim_{n\to\infty} E\left(\int_0^T u^n_t\, dB_t\right) = E\left(\int_0^T u_t\, dB_t\right).

Since E\left(\int_0^T u^n_t\, dB_t\right) = 0 for every n ≥ 1, this concludes the proof.
We end this section with an interesting example.
We end this section with an interesting example.

Example 3.1 For the Brownian motion B, the following formula holds:

\int_0^T B_t\, dB_t = \frac{1}{2}\left(B_T^2 - T\right).

Let us remark that we would rather expect \int_0^T B_t\, dB_t = \frac{1}{2}B_T^2, by analogy
with the rules of deterministic calculus.


To prove this identity, we consider the particular sequence of approximating
step processes introduced in Proposition 3.1, as follows:

u^n_t = \sum_{j=1}^{n} B_{t_{j-1}} 1_{(t_{j-1}, t_j]}(t).

According to Definition 3.1,

\int_0^T B_t\, dB_t = \lim_{n\to\infty} \sum_{j=1}^{n} B_{t_{j-1}}\left(B_{t_j} - B_{t_{j-1}}\right),

in the L^2(\Omega) norm. Clearly,

\sum_{j=1}^{n} B_{t_{j-1}}\left(B_{t_j} - B_{t_{j-1}}\right) = \frac{1}{2}\sum_{j=1}^{n}\left(B_{t_j}^2 - B_{t_{j-1}}^2\right) - \frac{1}{2}\sum_{j=1}^{n}\left(B_{t_j} - B_{t_{j-1}}\right)^2
    = \frac{1}{2}B_T^2 - \frac{1}{2}\sum_{j=1}^{n}\left(B_{t_j} - B_{t_{j-1}}\right)^2.    (3.6)

We conclude by using Proposition 2.3.
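On a simulated path, the Riemann-Itô sums above can be compared with ½(B_T² − T); by (3.6) the gap is exactly ½(Σ(ΔB)² − T) in each realization, and it shrinks as the partition is refined. A single-path pure-Python sketch (grid size, seed and tolerances are arbitrary choices):

```python
import math
import random

random.seed(4)
T, n = 1.0, 100_000
dt = T / n
sd = math.sqrt(dt)

b = 0.0
riemann_ito = 0.0    # sum of B_{t_{j-1}} (B_{t_j} - B_{t_{j-1}})
quad_var = 0.0       # sum of (B_{t_j} - B_{t_{j-1}})^2
for _ in range(n):
    db = random.gauss(0.0, sd)
    riemann_ito += b * db
    quad_var += db * db
    b += db

# identity (3.6), exact for every path and every partition
assert abs(riemann_ito - (0.5 * b * b - 0.5 * quad_var)) < 1e-9
# the limit: the sums approach (B_T^2 - T)/2 because quad_var -> T
assert abs(riemann_ito - 0.5 * (b * b - T)) < 0.03
```

The first assertion is pure algebra; the second one reflects the L² convergence of the quadratic sums to T (Proposition 2.3).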

3.2 The Itô integral as a stochastic process


The indefinite Itô stochastic integral of a process u ∈ L^2_{a,T} is defined as
follows:

\int_0^t u_s\, dB_s := \int_0^T u_s 1_{[0,t]}(s)\, dB_s,    (3.7)

t ∈ [0, T].
For this definition to make sense, we need that for any t ∈ [0, T], the process
\{u_s 1_{[0,t]}(s), s \in [0, T]\} belongs to L^2_{a,T}. This is clearly true.
Obviously, properties of the integral mentioned in the previous section, like
zero mean, isometry, linearity, also hold for the indefinite integral.
The rest of the section is devoted to the study of important properties of the
stochastic process given by an indefinite Itô integral.
Proposition 3.2 The process \{I_t = \int_0^t u_s\, dB_s, t \in [0, T]\} is a martingale.

Proof: We first establish the martingale property for any approximating
sequence

I^n_t = \int_0^t u^n_s\, dB_s,   t ∈ [0, T],

where u^n converges to u in L^2(\Omega \times [0, T]). This suffices to prove the proposition,
since L^2(\Omega)-limits of martingales are again martingales.
Let u^n_t, t ∈ [0, T], be defined by the right-hand side of (3.2). Fix 0 ≤ s ≤ t ≤
T and assume that t_{k-1} < s ≤ t_k ≤ t_l < t ≤ t_{l+1}. Then

I^n_t - I^n_s = u_k(B_{t_k} - B_s) + \sum_{j=k+1}^{l} u_j(B_{t_j} - B_{t_{j-1}}) + u_{l+1}(B_t - B_{t_l}).

Using properties (g) and (f), respectively, of the conditional expectation (see
Appendix 2) yields

E(I^n_t - I^n_s / F_s) = E(u_k(B_{t_k} - B_s)/F_s) + \sum_{j=k+1}^{l} E\left(E\left(u_j \Delta_j B / F_{t_{j-1}}\right)/F_s\right)
    + E\left(u_{l+1} E\left(B_t - B_{t_l} / F_{t_l}\right)/F_s\right) = 0.

This finishes the proof of the proposition. □
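The martingale property can be observed numerically for the integrand u_s = B_s of Example 3.1: the increment I_t − I_s should be uncorrelated with any function of the path up to time s, here sign(I_s). A pure-Python Monte Carlo sketch (times, grid sizes, seed and tolerances are arbitrary choices):

```python
import math
import random

random.seed(5)
s_time, t_time = 0.5, 1.0
n_steps, n_paths = 200, 20_000
dt = t_time / n_steps
sd = math.sqrt(dt)

acc_mean = 0.0
acc_orth = 0.0
for _ in range(n_paths):
    b, integral, integral_at_s = 0.0, 0.0, 0.0
    for i in range(n_steps):
        db = random.gauss(0.0, sd)
        integral += b * db          # I_t = ∫ B dB via left-point sums
        b += db
        if (i + 1) * dt <= s_time:
            integral_at_s = integral
    acc_mean += integral
    # martingale => increments are orthogonal to functions of the past
    acc_orth += (integral - integral_at_s) * math.copysign(1.0, integral_at_s)

assert abs(acc_mean / n_paths) < 0.025   # E(I_t) = 0
assert abs(acc_orth / n_paths) < 0.02    # E[(I_t - I_s) sign(I_s)] = 0
```

Both averages vanish within Monte Carlo error, consistent with Proposition 3.2.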


A proof not very different from that of Proposition 2.3 yields the following.

Proposition 3.3 For any process u ∈ L^2_{a,T},

L^1(\Omega)\text{-}\lim_{n\to\infty} \sum_{j=1}^{n} \left(\int_{t_{j-1}}^{t_j} u_s\, dB_s\right)^2 = \int_0^t u_s^2\, ds.

That means, the quadratic variation of the indefinite stochastic integral is
given by the process \{\int_0^t u_s^2\, ds, t \in [0, T]\}.
The isometry property of the stochastic integral can be extended in the following
sense. Let p ∈ [2, ∞). Then,

E\left|\int_0^t u_s\, dB_s\right|^p \le C(p)\, E\left(\int_0^t u_s^2\, ds\right)^{\frac{p}{2}}.    (3.8)

Here C(p) is a positive constant depending on p. This is Burkholder's inequality.

A combination of Burkholder's inequality and Kolmogorov's continuity criterion
allows us to deduce the continuity of the sample paths of the indefinite
stochastic integral. Indeed, assume that \int_0^T E(u_r^p)\, dr < ∞ for any
p ∈ [2, ∞). Using first (3.8) and then Hölder's inequality (be smart!) implies

E\left|\int_s^t u_r\, dB_r\right|^p \le C(p)\, E\left(\int_s^t u_r^2\, dr\right)^{\frac{p}{2}}
    \le C(p)\, |t - s|^{\frac{p}{2}-1} \int_s^t E(u_r^p)\, dr
    \le C(p)\, |t - s|^{\frac{p}{2}-1}.

Since p ≥ 2 is arbitrary, Theorem 1.1 yields that the sample paths of
\int_0^t u_s\, dB_s, t ∈ [0, T], are γ-Hölder continuous for any γ ∈ (0, \frac{1}{2}).

3.3 An extension of the Itô integral


In Section 3.1 we introduced the set L^2_{a,T} and defined the
stochastic integral of processes of this class with respect to Brownian
motion. In this section we shall consider a larger class of integrands. The
notation and underlying filtration are the same as in Section 3.1.
Let Λ^2_{a,T} be the set of real valued processes u adapted to the filtration (F_t, t ≥
0), jointly measurable in (t, ω) with respect to the product σ-field B([0, T]) ×
F and satisfying

P\left\{\int_0^T u_t^2\, dt < \infty\right\} = 1.    (3.9)

Clearly L^2_{a,T} ⊂ Λ^2_{a,T}. Our aim is to define the stochastic integral for processes
in Λ^2_{a,T}. For this we shall follow the same approach as in Section 3.1. Firstly,
we start with step processes of the form (3.2) belonging to Λ^2_{a,T}
and define the integral as in (3.3). The extension to processes in Λ^2_{a,T} needs
two ingredients. The first one is an approximation result that we now state
without proof. The reader may consult, for instance, [1].

Proposition 3.4 Let u ∈ Λ^2_{a,T}. There exists a sequence of step processes
(u^n, n ≥ 1) of the form (3.2) belonging to Λ^2_{a,T} such that

\lim_{n\to\infty} \int_0^T |u^n_t - u_t|^2\, dt = 0,

a.s.

The second ingredient gives a connection between stochastic integrals of step


processes in Λ2a,T and their quadratic variation, as follows.

Proposition 3.5 Let u be a step process in Λ^2_{a,T}. Then for any ε > 0,
N > 0,

P\left\{\left|\int_0^T u_t\, dB_t\right| > ε\right\} \le P\left\{\int_0^T u_t^2\, dt > N\right\} + \frac{N}{ε^2}.    (3.10)

Proof: It is based on a truncation argument. Let u be given by the right-hand
side of (3.2). Fix N > 0 and define

v^N_t = u_j, if t ∈ [t_{j-1}, t_j) and \sum_{i=1}^{j} u_i^2 (t_i - t_{i-1}) \le N,
v^N_t = 0, if t ∈ [t_{j-1}, t_j) and \sum_{i=1}^{j} u_i^2 (t_i - t_{i-1}) > N.

Since the random variables u_1, \ldots, u_j are F_{t_{j-1}}-measurable, the process
\{v^N_t, t \in [0, T]\} belongs to L^2_{a,T}. Indeed, by definition,

\int_0^T |v^N_t|^2\, dt \le N.

Moreover, if \int_0^T u_t^2\, dt \le N, necessarily u_t = v^N_t for any t ∈ [0, T]. Then, by
considering the decomposition

\left\{\left|\int_0^T u_t dB_t\right| > ε\right\} = \left\{\left|\int_0^T u_t dB_t\right| > ε, \int_0^T u_t^2 dt > N\right\} \cup \left\{\left|\int_0^T u_t dB_t\right| > ε, \int_0^T u_t^2 dt \le N\right\},

we obtain

P\left\{\left|\int_0^T u_t dB_t\right| > ε\right\} \le P\left\{\left|\int_0^T v^N_t dB_t\right| > ε\right\} + P\left\{\int_0^T u_t^2 dt > N\right\}.

We finally apply Chebyshev's inequality along with the isometry property of
the stochastic integral for processes in L^2_{a,T} and get

P\left\{\left|\int_0^T v^N_t dB_t\right| > ε\right\} \le \frac{1}{ε^2} E\left(\int_0^T v^N_t dB_t\right)^2 \le \frac{N}{ε^2}.

This ends the proof of the result. □

The extension
Fix u ∈ Λ^2_{a,T} and consider a sequence of step processes (u^n, n ≥ 1) of the
form (3.2) belonging to Λ^2_{a,T} such that

\lim_{n\to\infty} \int_0^T |u^n_t - u_t|^2\, dt = 0,    (3.11)

in probability.
By Proposition 3.5, for any ε > 0, N > 0 we have

P\left\{\left|\int_0^T (u^n_t - u^m_t)\, dB_t\right| > ε\right\} \le P\left\{\int_0^T (u^n_t - u^m_t)^2\, dt > N\right\} + \frac{N}{ε^2}.

Fix ε > 0 and take N small enough so that \frac{N}{ε^2} \le \frac{ε}{2}. Using (3.11), for n, m
big enough,

P\left\{\int_0^T (u^n_t - u^m_t)^2\, dt > N\right\} \le \frac{ε}{2}.

Consequently, we have proved that the sequence of stochastic integrals of step
processes

\left(\int_0^T u^n_t\, dB_t, n \ge 1\right)

is Cauchy in probability. We then define

\int_0^T u_t\, dB_t = \lim_{n\to\infty} \int_0^T u^n_t\, dB_t,    (3.12)

where the limit is taken in probability. It is easy to check that this definition
is indeed independent of the particular approximating sequence used in the
construction.

3.4 A change of variables formula: Itô’s formula


Like in Example 3.1, we can prove the following formula, valid for any t ≥ 0:
Z t
Bt2 =2 Bs dBs + t. (3.13)
0

If the sample paths of {Bt , t ≥ 0} were sufficiently smooth -for example, of


bounded variation- we would rather have
Z t
Bt2 =2 Bs dBs . (3.14)
0

Why is it so? Consider a similar decomposition as the one given in (3.6)


obtained by restricting the time interval to [0, t]. More concretely, consider
the partition of [0, t] defined by 0 = t0 ≤ t1 ≤ · · · ≤ tn = t,
n−1
X 
Bt2 = Bt2j+1 − Bt2j
j=0
n−1
X   n−1
X 2
=2 Btj Btj+1 − Btj + Btj+1 − Btj , (3.15)
j=0 j=0

34
where we have used that B0 = 0.
Consider a sequence of partitions of [0, t] whose mesh tends to zero. We
already know that
\sum_{j=0}^{n-1}\left(B_{t_{j+1}} - B_{t_j}\right)^2 \to t,

in L^2(\Omega). This gives the extra contribution in the development
of B_t^2 in comparison with the classical calculus approach.
Notice that, if B were of bounded variation, then we could argue as follows:

\sum_{j=0}^{n-1}\left(B_{t_{j+1}} - B_{t_j}\right)^2 \le \left(\sup_{0 \le j \le n-1} |B_{t_{j+1}} - B_{t_j}|\right) \sum_{j=0}^{n-1} |B_{t_{j+1}} - B_{t_j}|.

By the continuity of the sample paths of the Brownian motion, the first factor
on the right-hand side of the preceding inequality tends to zero as the mesh
of the partition tends to zero, while the second factor remains finite, by the
bounded variation property.
Summarising. Differential calculus with respect to the Brownian motion
should take into account second order differential terms. Roughly speaking

(dBt )2 = dt.

A precise meaning to this formal formula is given in Proposition 2.3.
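The heuristic (dB_t)² = dt can be illustrated by simulation: the quadratic sums Σ(B_{t_{j+1}} − B_{t_j})² concentrate around t, with L² error E(Σ(ΔB)² − t)² = 2t²/n on a uniform partition with n intervals. A pure-Python sketch (partition sizes, path count and seed are arbitrary choices):

```python
import math
import random

random.seed(6)
t, n_paths = 1.0, 2000

for n in (100, 400, 1600):
    dt = t / n
    sd = math.sqrt(dt)
    mse = 0.0
    for _ in range(n_paths):
        q = sum(random.gauss(0.0, sd) ** 2 for _ in range(n))
        mse += (q - t) ** 2
    mse /= n_paths
    # E (sum (ΔB)^2 - t)^2 = 2 t^2 / n, so the mean squared error
    # decays at rate 1/n as the partition is refined
    assert 0.5 * 2 * t * t / n < mse < 2.0 * 2 * t * t / n
```

Each refinement of the partition divides the observed mean squared error accordingly, in line with Proposition 2.3.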

3.4.1 One dimensional Itô’s formula


In this section, we shall extend the formula (3.13) and write an expression
for f (t, Bt ) for a class of functions f which include f (x) = x2 .

Definition 3.2 Let {v_t, t ∈ [0, T]} be an adapted stochastic process whose
sample paths are almost surely Lebesgue integrable, that is, \int_0^T |v_t|\, dt < ∞ a.s.
Consider a stochastic process {u_t, t ∈ [0, T]} belonging to Λ^2_{a,T} and a random
variable X_0. The stochastic process defined by

X_t = X_0 + \int_0^t u_s\, dB_s + \int_0^t v_s\, ds,    (3.16)

t ∈ [0, T], is termed an Itô process.

An alternative writing of (3.16) in differential form is

dXt = ut dBt + vt dt.

To warm up, we state a particular version of the Itô formula. By C^{1,2} we
denote the set of functions on [0, T] × \mathbb{R} which are jointly continuous in
(t, x), continuously differentiable in t and twice continuously differentiable in x,
with derivatives jointly continuous.

Theorem 3.1 Let f : [0, T] × \mathbb{R} → \mathbb{R} be a function in C^{1,2} and X be an
Itô process with decomposition given in (3.16). The following formula holds
true:

f(t, X_t) = f(0, X_0) + \int_0^t ∂_s f(s, X_s)\, ds + \int_0^t ∂_x f(s, X_s) u_s\, dB_s
    + \int_0^t ∂_x f(s, X_s) v_s\, ds + \frac{1}{2}\int_0^t ∂^2_{xx} f(s, X_s) u_s^2\, ds.    (3.17)

An idea of the proof. Consider a sequence of partitions of [0, t], for example
the one defined by t^n_j = \frac{jt}{n}. In the sequel, we omit the superscript
n for the sake of simplicity. We can write

f(t, X_t) - f(0, X_0) = \sum_{j=0}^{n-1}\left[f(t_{j+1}, X_{t_{j+1}}) - f(t_j, X_{t_j})\right]
    = \sum_{j=0}^{n-1}\left[f(t_{j+1}, X_{t_j}) - f(t_j, X_{t_j})\right] + \left[f(t_{j+1}, X_{t_{j+1}}) - f(t_{j+1}, X_{t_j})\right]
    = \sum_{j=0}^{n-1}\left[∂_s f(\bar t_j, X_{t_j})(t_{j+1} - t_j) + ∂_x f(t_{j+1}, X_{t_j})(X_{t_{j+1}} - X_{t_j})\right]
    + \frac{1}{2}\sum_{j=0}^{n-1} ∂^2_{xx} f(t_{j+1}, \bar X_j)(X_{t_{j+1}} - X_{t_j})^2,    (3.18)

with \bar t_j \in (t_j, t_{j+1}) and \bar X_j an intermediate (random) point on the segment
determined by X_{t_j} and X_{t_{j+1}}.
In fact, this follows from a Taylor expansion of the function f up to the
first order in the variable s, and up to the second order in the variable x.
The asymmetry in the orders is due to the existence of the quadratic variation
of the processes involved. The expression (3.18) is the analogue of (3.15).

The former is much simpler for two reasons. Firstly, there is no s-variable;
secondly, f is a polynomial of second degree, and therefore it has an exact
Taylor expansion. But both formulas have the same structure.
When passing to the limit as n → ∞, we obtain

\sum_{j=0}^{n-1} ∂_s f(t_j, X_{t_j})(t_{j+1} - t_j) \to \int_0^t ∂_s f(s, X_s)\, ds,

\sum_{j=0}^{n-1} ∂_x f(t_j, X_{t_j})(X_{t_{j+1}} - X_{t_j}) \to \int_0^t ∂_x f(s, X_s) u_s\, dB_s + \int_0^t ∂_x f(s, X_s) v_s\, ds,

\sum_{j=0}^{n-1} ∂^2_{xx} f(t_j, \bar X_j)(X_{t_{j+1}} - X_{t_j})^2 \to \int_0^t ∂^2_{xx} f(s, X_s) u_s^2\, ds,

in probability.


Itô's formula (3.17) can be written in the formally simple differential form

df(t, X_t) = ∂_t f(t, X_t)\, dt + ∂_x f(t, X_t)\, dX_t + \frac{1}{2} ∂^2_{xx} f(t, X_t)(dX_t)^2,    (3.19)

where (dX_t)^2 is computed using the formal composition rules

dB_t \times dB_t = dt,   dB_t \times dt = dt \times dB_t = 0,   dt \times dt = 0.

Consider in Theorem 3.1 the particular case where f : \mathbb{R} → \mathbb{R} is a function
in C^2 (twice continuously differentiable). Then formula (3.17) becomes

f(X_t) = f(X_0) + \int_0^t f'(X_s) u_s\, dB_s + \int_0^t f'(X_s) v_s\, ds
    + \frac{1}{2}\int_0^t f''(X_s) u_s^2\, ds.    (3.20)

Example 3.2 Consider the function

f(t, x) = e^{\left(\mu - \frac{\sigma^2}{2}\right)t + \sigma x},

with μ, σ ∈ \mathbb{R}.
Applying formula (3.17) to X_t := B_t -a Brownian motion- yields

f(t, B_t) = 1 + \mu\int_0^t f(s, B_s)\, ds + \sigma\int_0^t f(s, B_s)\, dB_s.

Hence, the process {Y_t = f(t, B_t), t ≥ 0} satisfies the equation

Y_t = 1 + \mu\int_0^t Y_s\, ds + \sigma\int_0^t Y_s\, dB_s.

The equivalent differential form of this identity is the linear stochastic differential
equation

dY_t = \mu Y_t\, dt + \sigma Y_t\, dB_t,
Y_0 = 1.    (3.21)

Black and Scholes proposed, as a model of a market with a single risky asset with
initial value S_0 = 1, the process S_t = Y_t. We have seen that such a process is
in fact the solution to a linear stochastic differential equation (see (3.21)).
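The closed-form solution Y_t = exp((μ − σ²/2)t + σB_t) of (3.21) can be cross-checked by simulation: its expectation is e^{μt}, and an Euler discretization of (3.21) driven by the same Brownian increments stays close to it pathwise. A pure-Python sketch; the parameter values, grid sizes, seed and tolerances are arbitrary choices.

```python
import math
import random

random.seed(7)
mu, sigma, T = 0.1, 0.3, 1.0

# 1) E(Y_T) = e^{mu T} for Y_T = exp((mu - sigma^2/2) T + sigma B_T)
n = 50_000
acc = 0.0
for _ in range(n):
    bT = random.gauss(0.0, math.sqrt(T))
    acc += math.exp((mu - 0.5 * sigma * sigma) * T + sigma * bT)
assert abs(acc / n - math.exp(mu * T)) < 0.01

# 2) an Euler scheme for dY = mu Y dt + sigma Y dB, driven by the same
#    increments as the closed-form solution, stays pathwise close to it
n_steps, n_paths = 500, 2000
dt = T / n_steps
sd = math.sqrt(dt)
err = 0.0
for _ in range(n_paths):
    y, b = 1.0, 0.0
    for _ in range(n_steps):
        db = random.gauss(0.0, sd)
        y += mu * y * dt + sigma * y * db
        b += db
    exact = math.exp((mu - 0.5 * sigma * sigma) * T + sigma * b)
    err += abs(y - exact)
assert err / n_paths < 0.05
```

The −σ²/2 correction in the exponent is exactly the Itô term of (3.17); dropping it would make the first assertion fail.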

3.4.2 Multidimensional version of Itô’s formula


Consider an m-dimensional Brownian motion \{(B^1_t, \cdots, B^m_t), t \ge 0\} and p
real-valued Itô processes, as follows:

dX^i_t = \sum_{l=1}^{m} u^{i,l}_t\, dB^l_t + v^i_t\, dt,    (3.22)

i = 1, \ldots, p. We assume that each of the processes u^{i,l} belongs to Λ^2_{a,T}
and that \int_0^T |v^i_t|\, dt < ∞, a.s. Following a similar plan as for Theorem 3.1, we
will prove the following:
will prove the following:

Theorem 3.2 Let f : [0, ∞) × \mathbb{R}^p → \mathbb{R} be a function of class C^{1,2} and
X = (X^1, \ldots, X^p) be given by (3.22). Then

f(t, X_t) = f(0, X_0) + \int_0^t ∂_s f(s, X_s)\, ds + \sum_{k=1}^{p} \int_0^t ∂_{x_k} f(s, X_s)\, dX^k_s
    + \frac{1}{2}\sum_{k,l=1}^{p} \int_0^t ∂^2_{x_k,x_l} f(s, X_s)\, dX^k_s\, dX^l_s,    (3.23)

where, in order to compute dX^k_s\, dX^l_s, we have to apply the following rules:

dB^k_s\, dB^l_s = δ_{k,l}\, ds,    (3.24)
dB^k_s\, ds = 0,
(ds)^2 = 0,

where δ_{k,l} denotes the Kronecker symbol.

We remark that the identity (3.24) is a consequence of the independence of
the components of the Brownian motion.

Example 3.3 Consider the particular case m = 1, p = 2 and f(x, y) = xy.
That is, f does not depend on t, and we have denoted a generic point of \mathbb{R}^2 by
(x, y). Then the above formula (3.23) yields

X^1_t X^2_t = X^1_0 X^2_0 + \int_0^t X^1_s\, dX^2_s + \int_0^t X^2_s\, dX^1_s + \int_0^t u^1_s u^2_s\, ds.    (3.25)

Proof of Theorem 3.2: Let Π_n = \{0 = t^n_0 < \cdots < t^n_{p_n} = t\} be a sequence
of partitions of [0, t] such that \lim_{n\to\infty} |Π_n| = 0. First we consider the
decomposition

f(t, X_t) - f(0, X_0) = \sum_{i=0}^{p_n-1}\left[f(t^n_{i+1}, X_{t^n_{i+1}}) - f(t^n_i, X_{t^n_i})\right],

and then Taylor's formula for each term in the last sum. We obtain

f(t^n_{i+1}, X_{t^n_{i+1}}) - f(t^n_i, X_{t^n_i}) = ∂_s f(\bar t^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i)
    + \sum_{k=1}^{p} ∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right)(X^k_{t^n_{i+1}} - X^k_{t^n_i})
    + \sum_{k,l=1}^{p} \frac{f^{k,l}_{n,i}}{2}(X^k_{t^n_{i+1}} - X^k_{t^n_i})(X^l_{t^n_{i+1}} - X^l_{t^n_i}),

with \bar t^n_i \in (t^n_i, t^n_{i+1}) and

\inf_{θ\in[0,1]} ∂^2_{x_k,x_l} f\left(t^n_i, X^1_{t^n_i} + θ(X^1_{t^n_{i+1}} - X^1_{t^n_i}), \cdots\right) \le f^{k,l}_{n,i}
    \le \sup_{θ\in[0,1]} ∂^2_{x_k,x_l} f\left(t^n_i, X^1_{t^n_i} + θ(X^1_{t^n_{i+1}} - X^1_{t^n_i}), \cdots\right).

We now proceed to prove the convergence of each contribution to the sum. In
order to simplify the arguments, we shall impose more restrictive assumptions
on the function f and the processes u^{i,l}. Notice that the process X has
continuous sample paths, a.s.
First term

\sum_{i=0}^{p_n-1} ∂_s f(\bar t^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i) \to \int_0^t ∂_s f(s, X_s)\, ds,    (3.26)

39
a.s. Indeed,

\left|\sum_{i=0}^{p_n-1} ∂_s f(\bar t^n_i, X_{t^n_i})(t^n_{i+1} - t^n_i) - \int_0^t ∂_s f(s, X_s)\, ds\right|
    = \left|\sum_{i=0}^{p_n-1} \int_{t^n_i}^{t^n_{i+1}} \left[∂_s f(\bar t^n_i, X_{t^n_i}) - ∂_s f(s, X_s)\right] ds\right|
    \le \sum_{i=0}^{p_n-1} \int_{t^n_i}^{t^n_{i+1}} \left|∂_s f(\bar t^n_i, X_{t^n_i}) - ∂_s f(s, X_s)\right| ds
    \le t \sup_{0 \le i \le p_n-1} \left(\left|∂_s f(\bar t^n_i, X_{t^n_i}) - ∂_s f(s, X_s)\right| 1_{(t^n_i, t^n_{i+1}]}(s)\right).

The continuity of ∂_s f along with that of the process X implies

\lim_{n\to\infty} \sup_{0 \le i \le p_n-1} \left(\left|∂_s f(\bar t^n_i, X_{t^n_i}) - ∂_s f(s, X_s)\right| 1_{(t^n_i, t^n_{i+1}]}(s)\right) = 0.

This gives (3.26).

Second term
Fix k = 1, \ldots, p. We next prove

\sum_{i=0}^{p_n-1} ∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right)(X^k_{t^n_{i+1}} - X^k_{t^n_i}) \to \int_0^t ∂_{x_k} f(s, X_s)\, dX^k_s    (3.27)

in probability, which by (3.22) amounts to checking two convergences:

\sum_{i=0}^{p_n-1} ∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) \int_{t^n_i}^{t^n_{i+1}} u^{k,l}_t\, dB^l_t \to \int_0^t ∂_{x_k} f(s, X_s) u^{k,l}_s\, dB^l_s,    (3.28)

\sum_{i=0}^{p_n-1} ∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) \int_{t^n_i}^{t^n_{i+1}} v^k_t\, dt \to \int_0^t ∂_{x_k} f(s, X_s) v^k_s\, ds,    (3.29)

for any l = 1, \ldots, m.

We start with (3.28). Assume that ∂_{x_k} f is bounded and that the processes
u^{k,l} are in L^2_{a,T}. Then

E\left|\sum_{i=0}^{p_n-1} ∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) \int_{t^n_i}^{t^n_{i+1}} u^{k,l}_t\, dB^l_t - \int_0^t ∂_{x_k} f(s, X_s) u^{k,l}_s\, dB^l_s\right|^2
    = E\left|\sum_{i=0}^{p_n-1} \int_{t^n_i}^{t^n_{i+1}} \left[∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) - ∂_{x_k} f(s, X_s)\right] u^{k,l}_s\, dB^l_s\right|^2
    = \sum_{i=0}^{p_n-1} E\left|\int_{t^n_i}^{t^n_{i+1}} \left[∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) - ∂_{x_k} f(s, X_s)\right] u^{k,l}_s\, dB^l_s\right|^2
    = \sum_{i=0}^{p_n-1} \int_{t^n_i}^{t^n_{i+1}} E\left\{\left[∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) - ∂_{x_k} f(s, X_s)\right] u^{k,l}_s\right\}^2 ds,

where we have applied successively that stochastic integrals on disjoint
intervals are independent of each other and are centered random variables,
along with the isometry property.
By the continuity of ∂_{x_k} f and of the process X,

\sup_{0 \le i \le p_n-1} \left(\left|∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) - ∂_{x_k} f(s, X_s)\right| 1_{(t^n_i, t^n_{i+1}]}(s)\right) \to 0,    (3.30)

a.s. Then, by bounded convergence,

\sup_{0 \le i \le p_n-1} E\left(\left[∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) - ∂_{x_k} f(s, X_s)\right] u^{k,l}_s 1_{(t^n_i, t^n_{i+1}]}(s)\right)^2 \to 0,

and hence we get (3.28) in the sense of L^2(\Omega) convergence.


For the proof of (3.29) we also assume that ∂_{x_k} f is bounded. Then,

\left|\sum_{i=0}^{p_n-1} ∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) \int_{t^n_i}^{t^n_{i+1}} v^k_t\, dt - \int_0^t ∂_{x_k} f(s, X_s) v^k_s\, ds\right|
    = \left|\sum_{i=0}^{p_n-1} \int_{t^n_i}^{t^n_{i+1}} \left[∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) - ∂_{x_k} f(s, X_s)\right] v^k_s\, ds\right|
    \le \sum_{i=0}^{p_n-1} \int_{t^n_i}^{t^n_{i+1}} \left|∂_{x_k} f\left(t^n_i, X^1_{t^n_i}, \ldots, X^p_{t^n_i}\right) - ∂_{x_k} f(s, X_s)\right| |v^k_s|\, ds.

By virtue of (3.30) and bounded convergence, we obtain (3.29) in the sense of
a.s. convergence.

Third term
Given the rules (3.24), it suffices to prove that

\sum_{i=0}^{p_n-1} f^{k,l}_{n,i}\, (X^k_{t^n_{i+1}} - X^k_{t^n_i})(X^l_{t^n_{i+1}} - X^l_{t^n_i}) \to \sum_{j=1}^{m} \int_0^t ∂^2_{x_k,x_l} f(s, X_s)\, u^{k,j}_s u^{l,j}_s\, ds,    (3.31)

which will be a consequence of the following convergences:

\sum_{i=0}^{p_n-1} f^{k,l}_{n,i} \left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\, dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} u^{l,j'}_s\, dB^{j'}_s\right) \to δ_{j,j'} \int_0^t ∂^2_{x_k,x_l} f(s, X_s)\, u^{k,j}_s u^{l,j}_s\, ds,    (3.32)

\sum_{i=0}^{p_n-1} f^{k,l}_{n,i} \left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\, dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} v^l_s\, ds\right) \to 0,    (3.33)

\sum_{i=0}^{p_n-1} f^{k,l}_{n,i} \left(\int_{t^n_i}^{t^n_{i+1}} v^k_s\, ds\right)\left(\int_{t^n_i}^{t^n_{i+1}} v^l_s\, ds\right) \to 0.    (3.34)

Let us start by arguing on (3.32). We assume that ∂^2_{x_k,x_l} f is bounded and
that u^{l,j} ∈ L^2_{a,T} for any l = 1, \ldots, p, j = 1, \ldots, m. Suppose first that j ≠ j'. Then
the convergence to zero follows from the fact that the stochastic integrals

\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\, dB^j_s,   \int_{t^n_i}^{t^n_{i+1}} u^{l,j'}_s\, dB^{j'}_s,

are independent.
Assume j = j'. Then,

E\left|\sum_{i=0}^{p_n-1} f^{k,l}_{n,i} \left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\, dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} u^{l,j}_s\, dB^j_s\right) - \int_0^t ∂^2_{x_k,x_l} f(s, X_s)\, u^{k,j}_s u^{l,j}_s\, ds\right|
    \le T_1 + T_2,

with

T_1 = E\left|\sum_{i=0}^{p_n-1} f^{k,l}_{n,i} \left[\left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\, dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} u^{l,j}_s\, dB^j_s\right) - \int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s u^{l,j}_s\, ds\right]\right|,

T_2 = E\left|\sum_{i=0}^{p_n-1} f^{k,l}_{n,i} \int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s u^{l,j}_s\, ds - \int_0^t ∂^2_{x_k,x_l} f(s, X_s)\, u^{k,j}_s u^{l,j}_s\, ds\right|.

Since ∂^2_{x_k,x_l} f is bounded,

T_1 \le C\, E\left|\sum_{i=0}^{p_n-1} \left[\left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\, dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} u^{l,j}_s\, dB^j_s\right) - \int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s u^{l,j}_s\, ds\right]\right|.

This tends to zero as n → ∞ (see Proposition 3.3).
As for T_2, we have

T_2 \le E\left(\sup_i \left|f^{k,l}_{n,i} - ∂^2_{x_k,x_l} f(s, X_s)\right| 1_{(t^n_i, t^n_{i+1}]}(s) \int_0^t |u^{k,j}_s u^{l,j}_s|\, ds\right).

By continuity,

\lim_{n\to\infty} \sup_i \left(\left|f^{k,l}_{n,i} - ∂^2_{x_k,x_l} f(s, X_s)\right| 1_{(t^n_i, t^n_{i+1}]}(s)\right) = 0.

Thus, by bounded convergence, we obtain \lim_{n\to\infty} T_2 = 0.
Hence we have proved (3.32) in L^1(\Omega).


Next we prove (3.33) in L^1(\Omega), assuming that ∂^2_{x_k,x_l} f is bounded and,
additionally, that the processes u^{k,j} ∈ L^2_{a,T} and v^l ∈ L^2(\Omega \times [0, T]). In this
case, by the Cauchy-Schwarz inequality,

E\left|\sum_{i=0}^{p_n-1} f^{k,l}_{n,i} \left(\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\, dB^j_s\right)\left(\int_{t^n_i}^{t^n_{i+1}} v^l_s\, ds\right)\right|
    \le C\, E\left(\sup_i \left|\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\, dB^j_s\right| \int_0^t |v^l_s|\, ds\right)
    \le C\, \left\|\sup_i \left|\int_{t^n_i}^{t^n_{i+1}} u^{k,j}_s\, dB^j_s\right|\right\|_{L^2(\Omega)} \|v^l\|_{L^2(\Omega \times [0,T])}.

The first factor tends to zero as n → ∞, while the second one is bounded.
The proof of (3.34) is easier. Indeed, assuming that ∂^2_{x_k,x_l} f is bounded,

\left|\sum_{i=0}^{p_n-1} f^{k,l}_{n,i} \left(\int_{t^n_i}^{t^n_{i+1}} v^k_s\, ds\right)\left(\int_{t^n_i}^{t^n_{i+1}} v^l_s\, ds\right)\right|
    \le C \left(\sup_i \int_{t^n_i}^{t^n_{i+1}} |v^l_s|\, ds\right) \int_0^t |v^k_s|\, ds.

The first factor tends to zero as n → ∞, while the second one is bounded
a.s.
The proof of the theorem is now complete. □


4 Applications of the Itô formula
This chapter is devoted to giving some important results that use the Itô
formula in parts of their proofs.

4.1 Burkholder-Davis-Gundy inequalities


Theorem 4.1 Let u ∈ L^2_{a,T} and set M_t = \int_0^t u_s\, dB_s. Define

M^*_t = \sup_{s \in [0,t]} |M_s|.

Then, for any p > 0, there exist two positive constants c_p, C_p such that

c_p\, E\left(\int_0^T u_s^2\, ds\right)^{\frac{p}{2}} \le E\left((M^*_T)^p\right) \le C_p\, E\left(\int_0^T u_s^2\, ds\right)^{\frac{p}{2}}.    (4.1)

Proof: We will only prove here the right-hand side of (4.1), for p ≥ 2. Consider
the function f(x) = |x|^p, for which we have

f'(x) = p|x|^{p-1}\operatorname{sign}(x),   f''(x) = p(p-1)|x|^{p-2}.

Then, according to (3.20), we obtain

|M_t|^p = \int_0^t p|M_s|^{p-1}\operatorname{sign}(M_s)\, u_s\, dB_s + \frac{1}{2}\int_0^t p(p-1)|M_s|^{p-2} u_s^2\, ds.

Applying the expectation operator to both sides of the above identity yields

E(|M_t|^p) = \frac{p(p-1)}{2}\, E\left(\int_0^t |M_s|^{p-2} u_s^2\, ds\right).

We next apply Hölder's inequality with exponents \frac{p}{p-2} and q = \frac{p}{2} and get

E\left(\int_0^t |M_s|^{p-2} u_s^2\, ds\right) \le E\left((M^*_t)^{p-2} \int_0^t u_s^2\, ds\right)
    \le \left[E\left((M^*_t)^p\right)\right]^{\frac{p-2}{p}} \left[E\left(\int_0^t u_s^2\, ds\right)^{\frac{p}{2}}\right]^{\frac{2}{p}}.

Doob's inequality implies

E\left((M^*_t)^p\right) \le \left(\frac{p}{p-1}\right)^p E(|M_t|^p).

Combining the three estimates and dividing by \left[E\left((M^*_t)^p\right)\right]^{\frac{p-2}{p}}, we obtain

E\left((M^*_t)^p\right) \le C_p\, E\left(\int_0^t u_s^2\, ds\right)^{\frac{p}{2}}, \quad \text{with } C_p = \left[\left(\frac{p}{p-1}\right)^p \frac{p(p-1)}{2}\right]^{\frac{p}{2}}.

This ends the proof of the upper bound. □
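For p = 2 and u ≡ 1 (so that M = B and ∫_0^T u²ds = T), inequality (4.1) combines the isometry lower bound E(M*_T)² ≥ E M_T² = T with Doob's upper bound 4T, and both can be seen on simulated paths. A pure-Python sketch; grid sizes and seed are arbitrary choices.

```python
import math
import random

random.seed(8)
T, n_steps, n_paths = 1.0, 500, 5000
dt = T / n_steps
sd = math.sqrt(dt)

acc = 0.0
for _ in range(n_paths):
    b, running_max = 0.0, 0.0
    for _ in range(n_steps):
        b += random.gauss(0.0, sd)
        running_max = max(running_max, abs(b))
    acc += running_max ** 2
second_moment = acc / n_paths   # estimates E (M_T^*)^2 with u ≡ 1, M = B

# lower bound: E (M_T^*)^2 >= E M_T^2 = T (isometry);
# upper bound: 4T (Doob's L^2 maximal inequality)
assert T < second_moment < 4 * T
```

The empirical value lies strictly between the two theoretical bounds.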




4.2 Representation of L2 Brownian functionals


We already know that for any process u ∈ L^2_{a,T}, the stochastic integral process
\{\int_0^t u_s\, dB_s, t \in [0, T]\} is a martingale. The next result is a kind of reciprocal
statement. In the proof we shall use a technical ingredient that we state
without proof.
In the sequel we denote by F∞ the σ-field generated by (Bt , t ≥ 0).

Lemma 4.1 The vector space generated by the random variables

\exp\left(i\sum_{j=1}^{n} λ_j\left(B_{t_j} - B_{t_{j-1}}\right)\right),

for 0 = t_0 < t_1 < \cdots < t_n and λ_1, \cdots, λ_n ∈ \mathbb{R}, is dense in L^2_{\mathbb{C}}(\Omega, F_\infty).

Theorem 4.2 Let Z ∈ L^2(\Omega, F_\infty). There exists a unique process h ∈ L^2_{a,\infty}
such that

Z = E(Z) + \int_0^\infty h_s\, dB_s.    (4.2)

Hence, for any martingale M bounded in L^2, there exist a unique process
h ∈ L^2_{a,\infty} and a constant C such that

M_t = C + \int_0^t h_s\, dB_s.    (4.3)

Proof: We start with the proof of (4.2). Let $\mathcal H$ be the vector space consisting of random variables $Z \in L^2(\Omega, \mathcal F_\infty)$ such that (4.2) holds. Firstly, we argue the uniqueness of $h$. This is an easy consequence of the isometry property of the stochastic integral. Indeed, if there were two processes $h$ and $h'$ satisfying (4.2), then
$$E\Big(\int_0^\infty (h_s-h'_s)^2\,ds\Big) = E\Big(\Big(\int_0^\infty (h_s-h'_s)\,dB_s\Big)^2\Big) = 0.$$
This yields $h = h'$ in $L^2([0,\infty)\times\Omega)$.
We now turn to the existence of $h$. Any $Z \in \mathcal H$ satisfies
$$E(Z^2) = (E(Z))^2 + E\Big(\int_0^\infty h_s^2\,ds\Big).$$
From this it follows that if $(Z_n,\ n\ge 1)$ is a sequence of elements of $\mathcal H$ converging to $Z$ in $L^2(\Omega, \mathcal F_\infty)$, then the sequence $(h^n,\ n\ge 1)$ of processes in the corresponding representations is Cauchy in $L^2_{a,\infty}$. Denoting by $h$ its limit, we have
$$Z = E(Z) + \int_0^\infty h_s\,dB_s.$$
Hence $\mathcal H$ is closed in $L^2(\Omega, \mathcal F_\infty)$.


Fix $0 = t_0 < t_1 < \cdots < t_n < \infty$ and $\lambda_1,\dots,\lambda_n \in \mathbb R$. Set
$$f(s) = \sum_{j=1}^n \lambda_j\,1_{[t_{j-1},t_j]}(s),$$
and
$$\mathcal E_t^f = \exp\Big(i\int_0^t f(s)\,dB_s + \frac12\int_0^t f^2(s)\,ds\Big).$$
By the Itô formula,
$$\mathcal E_t^f = 1 + i\int_0^t \mathcal E_s^f f(s)\,dB_s.$$
Hence the real and imaginary parts of $\exp\big(i\sum_{j=1}^n \lambda_j(B_{t_j}-B_{t_{j-1}})\big)$ belong to $\mathcal H$. Since by Lemma 4.1 these random variables are dense in $L^2(\Omega, \mathcal F_\infty)$, we conclude $\mathcal H = L^2(\Omega, \mathcal F_\infty)$.
The representation (4.3) is a consequence of the following fact: the martingale $M$ converges in $L^2(\Omega)$ as $t\to\infty$ to a random variable $M_\infty \in L^2(\Omega)$, and $M_t = E(M_\infty|\mathcal F_t)$. Applying (4.2) to $Z := M_\infty$ and taking conditional expectations on both sides, we obtain
$$M_t = E(M_\infty|\mathcal F_t) = E(M_\infty) + \int_0^t h_s\,dB_s.$$
The proof of the theorem is now complete. $\square$




Example 4.1 Consider $Z = B_T^3$. In order to find the process $h$ in the integral representation, we first apply Itô's formula, yielding
$$B_T^3 = \int_0^T 3B_t^2\,dB_t + 3\int_0^T B_t\,dt.$$
An integration by parts gives
$$\int_0^T B_t\,dt = TB_T - \int_0^T t\,dB_t.$$
Thus,
$$B_T^3 = \int_0^T 3\big[B_t^2 + T - t\big]\,dB_t.$$
Notice that $E(B_T^3) = 0$. Then $h_t = 3\big[B_t^2 + T - t\big]$.
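A numerical check of this representation (a sketch; the step size and sample count are arbitrary choices of ours): approximating the Itô integral of $h_t = 3[B_t^2 + T - t]$ by left-point Riemann sums should reproduce $B_T^3$ path by path, up to a small discretization error.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_steps, n_paths = 1.0, 2000, 2000
dt = T / n_steps
t = np.arange(n_steps) * dt           # left endpoints t_i

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])   # B_{t_i}

# Left-point Ito sums for the integrand h_t = 3 (B_t^2 + T - t).
ito = np.sum(3.0 * (B_left ** 2 + (T - t)) * dB, axis=1)
Z = B[:, -1] ** 3                     # Z = B_T^3, with E(Z) = 0

mse = np.mean((Z - ito) ** 2)
print(mse)   # small compared with Var(B_T^3) = E(B_T^6) = 15
```

The mean-square discrepancy shrinks as the partition is refined, in line with the $L^2$ convergence of the Riemann sums to the Itô integral.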

4.3 Girsanov’s theorem


It is well known that if $X$ is a multidimensional Gaussian random variable, any affine transformation maps $X$ to a multidimensional Gaussian random variable as well. The simplest version of Girsanov's theorem extends this result to Brownian motion. Before giving the precise statement and the proof, let us introduce some preliminaries.

Lemma 4.2 Let $L$ be a positive random variable ($L > 0$ a.s.) such that $E(L) = 1$. Set
$$Q(A) = E(1_A L), \quad A \in \mathcal F. \tag{4.4}$$
Then $Q$ defines a probability on $\mathcal F$, equivalent to $P$, with density $L$. Conversely, if $P$ and $Q$ are two probabilities on $\mathcal F$ with $Q \ll P$, then there exists a nonnegative random variable $L$ such that $E(L) = 1$ and (4.4) holds.

Proof: It is clear that $Q$ defines a $\sigma$-additive function on $\mathcal F$. Moreover, since
$$Q(\Omega) = E(1_\Omega L) = E(L) = 1,$$
$Q$ is indeed a probability.
Let $A \in \mathcal F$ be such that $Q(A) = 0$. Since $L > 0$ a.s., we must have $P(A) = 0$. Conversely, for any $A \in \mathcal F$ with $P(A) = 0$, we have $Q(A) = 0$ as well.
The second assertion of the lemma is the Radon-Nikodym theorem. $\square$


If we denote by $E_Q$ the expectation operator with respect to the probability $Q$ defined above, one has
$$E_Q(X) = E(XL).$$
Indeed, this formula is easily checked for simple random variables and then extended to any random variable $X \in L^1(Q)$ by the usual approximation argument.
Consider now a Brownian motion $\{B_t,\ t\in[0,T]\}$. Fix $\lambda \in \mathbb R$ and let
$$L_t = \exp\Big(-\lambda B_t - \frac{\lambda^2}2 t\Big). \tag{4.5}$$
Itô's formula yields
$$L_t = 1 - \lambda\int_0^t L_s\,dB_s.$$
Hence, the process $\{L_t,\ t\in[0,T]\}$ is a positive martingale and $E(L_t) = 1$ for any $t\in[0,T]$. Set
$$Q(A) = E(1_A L_T), \quad A \in \mathcal F_T. \tag{4.6}$$
By Lemma 4.2, the probability $Q$ is equivalent to $P$ on the $\sigma$-field $\mathcal F_T$.


By the martingale property of $\{L_t,\ t\in[0,T]\}$, the same conclusion is true on $\mathcal F_t$, for any $t\in[0,T]$. Indeed, let $A \in \mathcal F_t$; then
$$Q(A) = E(1_A L_T) = E\big(E(1_A L_T|\mathcal F_t)\big) = E\big(1_A E(L_T|\mathcal F_t)\big) = E(1_A L_t).$$

Next, we give a technical result.

Lemma 4.3 Let $X$ be a random variable and let $\mathcal G$ be a sub-$\sigma$-field of $\mathcal F$ such that
$$E\big(e^{iuX}|\mathcal G\big) = e^{-\frac{u^2\sigma^2}2}, \quad u \in \mathbb R.$$
Then the random variable $X$ is independent of the $\sigma$-field $\mathcal G$ and its probability law is Gaussian, with zero mean and variance $\sigma^2$.

Proof: By the definition of the conditional expectation, for any $A \in \mathcal G$,
$$E\big(1_A e^{iuX}\big) = P(A)\,e^{-\frac{u^2\sigma^2}2}.$$
In particular, for $A := \Omega$, we see that the characteristic function of $X$ is that of an $N(0,\sigma^2)$.
Moreover, for any $A \in \mathcal G$ with $P(A) > 0$,
$$E_A\big(e^{iuX}\big) = e^{-\frac{u^2\sigma^2}2},$$
saying that the law of $X$ conditioned on $A$ is also $N(0,\sigma^2)$. Thus,
$$P\big((X\le x)\cap A\big) = P(A)\,P_A(X\le x) = P(A)\,P(X\le x),$$
yielding the independence of $X$ and $\mathcal G$. $\square$


Theorem 4.3 (Girsanov's theorem) Let $\lambda \in \mathbb R$ and set
$$W_t = B_t + \lambda t.$$
On the probability space $(\Omega, \mathcal F_T, Q)$, with $Q$ given in (4.6), the process $\{W_t,\ t\in[0,T]\}$ is a standard Brownian motion.

Proof: We will check that, on the probability space $(\Omega, \mathcal F_T, Q)$, any increment $W_t - W_s$, $0\le s<t\le T$, is independent of $\mathcal F_s$ and has the $N(0,t-s)$ distribution. That is,
$$E_Q\big(e^{iu(W_t-W_s)}|\mathcal F_s\big) = e^{-\frac{u^2}2(t-s)},$$
equivalently, for any $A \in \mathcal F_s$,
$$E_Q\big(e^{iu(W_t-W_s)}\,1_A\big) = Q(A)\,e^{-\frac{u^2}2(t-s)}.$$
Indeed, writing
$$L_t = \exp\Big(-\lambda(B_t-B_s) - \frac{\lambda^2}2(t-s)\Big)\exp\Big(-\lambda B_s - \frac{\lambda^2}2 s\Big),$$
we have
$$E_Q\big(e^{iu(W_t-W_s)}\,1_A\big) = E\big(1_A\,e^{iu(W_t-W_s)}L_t\big) = E\Big(1_A\,e^{iu(B_t-B_s)+iu\lambda(t-s)-\lambda(B_t-B_s)-\frac{\lambda^2}2(t-s)}\,L_s\Big).$$
Since $B_t-B_s$ is independent of $\mathcal F_s$, the last expression is equal to
$$E(1_A L_s)\,E\big(e^{(iu-\lambda)(B_t-B_s)}\big)\,e^{iu\lambda(t-s)-\frac{\lambda^2}2(t-s)} = Q(A)\,e^{\frac{(iu-\lambda)^2}2(t-s)+iu\lambda(t-s)-\frac{\lambda^2}2(t-s)} = Q(A)\,e^{-\frac{u^2}2(t-s)}.$$
The proof is now complete. $\square$
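A numerical illustration of this change of measure (a sketch; the parameter values are ours): weighting samples of $B_T$ with the density $L_T$ of (4.5)–(4.6) should make $W_T = B_T + \lambda T$ look like a centered Gaussian with variance $T$; in particular $E_Q(B_T) = -\lambda T$ and $E_Q(W_T^2) = T$.

```python
import numpy as np

rng = np.random.default_rng(2)
T, lam, n_paths = 1.0, 0.5, 100_000

B_T = rng.normal(0.0, np.sqrt(T), size=n_paths)
L_T = np.exp(-lam * B_T - 0.5 * lam ** 2 * T)   # density dQ/dP from (4.5)

# Expectations under Q are computed by weighting with L_T, as E_Q(X) = E(X L_T).
mean_L = np.mean(L_T)                            # should be close to 1
EQ_B = np.mean(B_T * L_T)                        # should be close to -lam * T
EQ_W2 = np.mean((B_T + lam * T) ** 2 * L_T)      # Var_Q(W_T) = T
print(mean_L, EQ_B, EQ_W2)
```

The weighted averages match the moments of a standard Brownian motion at time $T$ under $Q$, up to Monte Carlo error.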


5 Local time of Brownian motion and
Tanaka’s formula
This chapter deals with a very particular extension of Itô's formula. More precisely, we would like to have a decomposition of the positive submartingale $|B_t - x|$, for a fixed $x \in \mathbb R$, analogous to the Itô formula. Notice that the function $f(y) = |y-x|$ does not belong to $C^2(\mathbb R)$. A natural way to proceed is to regularize the function $f$, for instance by convolution with an approximation of the identity, and then pass to the limit. Assuming that this is feasible, the question of identifying the limit of the term involving the second order derivative remains open. This leads us to a process, introduced by Paul Lévy, termed the local time of $B$ at $x$.

Definition 5.1 Let $B = \{B_t,\ t\ge 0\}$ be a Brownian motion and let $x \in \mathbb R$. The local time of $B$ at $x$ is defined as the stochastic process
$$L(t,x) = \lim_{\epsilon\to 0}\frac1{2\epsilon}\int_0^t 1_{(x-\epsilon,x+\epsilon)}(B_s)\,ds = \lim_{\epsilon\to 0}\frac1{2\epsilon}\,\lambda\{s\in[0,t]: B_s\in(x-\epsilon,x+\epsilon)\}, \tag{5.1}$$
where $\lambda$ denotes the Lebesgue measure on $\mathbb R$.

We see that $L(t,x)$ measures the time spent by the process $B$ at $x$ during a period of time of length $t$; more precisely, it is the density of this occupation time.
We shall see later that the above limit exists in $L^2$ (it also exists a.s.), a fact that is not obvious at all.
Local time enters naturally in the extension of the Itô formula alluded to before. In fact, we have the following result.
Theorem 5.1 For any $t\ge 0$ and $x\in\mathbb R$, a.s.,
$$(B_t-x)^+ = (B_0-x)^+ + \int_0^t 1_{[x,\infty)}(B_s)\,dB_s + \frac12 L(t,x), \tag{5.2}$$
where $L(t,x)$ is given by (5.1), the limit being in $L^2$.

Proof: The heuristics behind formula (5.2) is the following. In the sense of distributions, $f(y) = (y-x)^+$ has as first and second order derivatives $f'(y) = 1_{[x,\infty)}(y)$ and $f''(y) = \delta_x(y)$, respectively, where $\delta_x$ denotes the Dirac delta measure at $x$. Hence we expect a formula like
$$(B_t-x)^+ = (B_0-x)^+ + \int_0^t 1_{[x,\infty)}(B_s)\,dB_s + \frac12\int_0^t \delta_x(B_s)\,ds.$$

However, we have to give a meaning to the last integral.
Approximation procedure. We are going to approximate the function $f(y) = (y-x)^+$. For this, we fix $\epsilon>0$ and define
$$f_x^\epsilon(y) = \begin{cases} 0, & \text{if } y \le x-\epsilon,\\[2pt] \dfrac{(y-x+\epsilon)^2}{4\epsilon}, & \text{if } x-\epsilon \le y \le x+\epsilon,\\[2pt] y-x, & \text{if } y \ge x+\epsilon,\end{cases}$$
which clearly has as derivatives
$$(f_x^\epsilon)'(y) = \begin{cases} 0, & \text{if } y \le x-\epsilon,\\[2pt] \dfrac{y-x+\epsilon}{2\epsilon}, & \text{if } x-\epsilon \le y \le x+\epsilon,\\[2pt] 1, & \text{if } y \ge x+\epsilon,\end{cases} \qquad (f_x^\epsilon)''(y) = \begin{cases} 0, & \text{if } y < x-\epsilon,\\[2pt] \dfrac1{2\epsilon}, & \text{if } x-\epsilon < y < x+\epsilon,\\[2pt] 0, & \text{if } y > x+\epsilon.\end{cases}$$
Let $(\varphi_n,\ n\ge 1)$ be a sequence of $C^\infty$ functions with compact supports decreasing to $\{0\}$. For instance, we may consider the function
$$\varphi(y) = c\,\exp\big(-(1-y^2)^{-1}\big)\,1_{\{|y|<1\}},$$
with a constant $c$ such that $\int_{\mathbb R}\varphi(z)\,dz = 1$, and then take
$$\varphi_n(y) = n\varphi(ny).$$
Set
$$g_n(y) = (\varphi_n * f_x^\epsilon)(y) = \int_{\mathbb R} f_x^\epsilon(y-z)\varphi_n(z)\,dz.$$
It is well known that $g_n \in C^\infty$, that $g_n$ and $g_n'$ converge uniformly on $\mathbb R$ to $f_x^\epsilon$ and $(f_x^\epsilon)'$, respectively, and that $g_n''$ converges pointwise to $(f_x^\epsilon)''$, except at the points $x+\epsilon$ and $x-\epsilon$.
We then have an Itô formula for $g_n$, as follows:
$$g_n(B_t) = g_n(B_0) + \int_0^t g_n'(B_s)\,dB_s + \frac12\int_0^t g_n''(B_s)\,ds. \tag{5.3}$$
Convergence of the terms in (5.3) as $n\to\infty$. The function $(f_x^\epsilon)'$ is bounded, and so is $g_n'$. Indeed,
$$|g_n'(y)| = \Big|\int_{\mathbb R} (f_x^\epsilon)'(y-z)\varphi_n(z)\,dz\Big| = \Big|\int_{-1/n}^{1/n} (f_x^\epsilon)'(y-z)\varphi_n(z)\,dz\Big| \le \|(f_x^\epsilon)'\|_\infty.$$
Moreover,
$$\big[g_n'(B_s) - (f_x^\epsilon)'(B_s)\big]\,1_{[0,t]}(s) \to 0,$$
uniformly in $s$ and in $\omega$. Hence, by bounded convergence,
$$E\int_0^t \big|g_n'(B_s) - (f_x^\epsilon)'(B_s)\big|^2\,ds \to 0.$$
Then the isometry property of the stochastic integral implies
$$E\Big(\Big(\int_0^t \big[g_n'(B_s) - (f_x^\epsilon)'(B_s)\big]\,dB_s\Big)^2\Big) \to 0,$$
as $n\to\infty$.
as n → ∞.
We next deal with the second order term. Since the law of each $B_s$, $s>0$, has a density,
$$P\{B_s = x+\epsilon\} = P\{B_s = x-\epsilon\} = 0.$$
Thus,
$$\lim_{n\to\infty} g_n''(B_s) = (f_x^\epsilon)''(B_s),$$
a.s. for any fixed $s$. Using Fubini's theorem, we see that this convergence also holds for almost every $s$, a.s. In fact,
$$\int_\Omega dP\int_0^t ds\,1_{\{(f_x^\epsilon)''(B_s)\ne\lim_{n\to\infty}g_n''(B_s)\}} = 0.$$
We have
$$\sup_{y\in\mathbb R}|g_n''(y)| \le \frac1{2\epsilon}.$$
Indeed,
$$|g_n''(y)| = \frac1{2\epsilon}\Big|\int_{\mathbb R}\varphi_n(z)\,1_{(x-\epsilon,x+\epsilon)}(y-z)\,dz\Big| \le \frac1{2\epsilon}\int_{y-x-\epsilon}^{y-x+\epsilon}|\varphi_n(z)|\,dz \le \frac1{2\epsilon}.$$
Then, by bounded convergence,
$$\int_0^t g_n''(B_s)\,ds \to \int_0^t (f_x^\epsilon)''(B_s)\,ds,$$
a.s. and in $L^2$.
Thus, passing to the limit in (5.3) yields
$$f_x^\epsilon(B_t) = f_x^\epsilon(B_0) + \int_0^t (f_x^\epsilon)'(B_s)\,dB_s + \frac12\int_0^t \frac1{2\epsilon}1_{(x-\epsilon,x+\epsilon)}(B_s)\,ds. \tag{5.4}$$
Convergence of (5.4) as $\epsilon\to 0$. Since $f_x^\epsilon(y) \to (y-x)^+$ as $\epsilon\to 0$ and
$$|f_x^\epsilon(B_t) - f_x^\epsilon(B_0)| \le |B_t - B_0|,$$
we have
$$f_x^\epsilon(B_t) - f_x^\epsilon(B_0) \to (B_t-x)^+ - (B_0-x)^+,$$
in $L^2$. Moreover,
$$E\Big(\int_0^t \big[(f_x^\epsilon)'(B_s) - 1_{[x,\infty)}(B_s)\big]^2\,ds\Big) \le E\Big(\int_0^t 1_{(x-\epsilon,x+\epsilon)}(B_s)\,ds\Big) \le \int_0^t \frac{2\epsilon}{\sqrt{2\pi s}}\,ds,$$
which clearly tends to zero as $\epsilon\to 0$. Hence, by the isometry property of the stochastic integral,
$$\int_0^t (f_x^\epsilon)'(B_s)\,dB_s \to \int_0^t 1_{[x,\infty)}(B_s)\,dB_s,$$
in $L^2$. Consequently, we have proved that
$$\frac1{2\epsilon}\int_0^t 1_{(x-\epsilon,x+\epsilon)}(B_s)\,ds$$
converges in $L^2$ as $\epsilon\to 0$ and that formula (5.2) holds. $\square$

We state without proof two further properties of local time.

1. The occupation-density property of local time is made precise by the following identity, valid for any $t\ge 0$ and every $a\le b$:
$$\int_a^b L(t,x)\,dx = \int_0^t 1_{(a,b)}(B_s)\,ds.$$

2. The stochastic integral $\int_0^t 1_{[x,\infty)}(B_s)\,dB_s$ has a jointly continuous version in $(t,x)\in(0,\infty)\times\mathbb R$. Hence, by (5.2), so does the local time $\{L(t,x),\ (t,x)\in(0,\infty)\times\mathbb R\}$.
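A numerical illustration of Definition 5.1 (a sketch; the value of $\epsilon$ and the discretization are our choices and introduce a small bias): taking expectations in (5.1) gives $E[L(t,0)] = \int_0^t (2\pi s)^{-1/2}\,ds = \sqrt{2t/\pi}$, which the occupation-time estimator should approach.

```python
import numpy as np

rng = np.random.default_rng(3)
T, n_steps, n_paths, eps = 1.0, 1000, 20_000, 0.05
dt = T / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(dB, axis=1)

# Approximate L(T, 0) by (1 / 2 eps) * Leb{s <= T : B_s in (-eps, eps)}.
occupation = np.sum(np.abs(B) < eps, axis=1) * dt
L_est = occupation / (2.0 * eps)

# E[L(T, 0)] = sqrt(2 T / pi) ~ 0.7979 for T = 1.
print(np.mean(L_est))
```

Shrinking $\epsilon$ (while refining the grid accordingly) reduces the residual bias of the estimator.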
The next result, which follows easily from Theorem 5.1, is known as Tanaka's formula.

Theorem 5.2 For any $(t,x)\in[0,\infty)\times\mathbb R$, we have
$$|B_t-x| = |B_0-x| + \int_0^t \mathrm{sign}(B_s-x)\,dB_s + L(t,x). \tag{5.5}$$

Proof: We will use the following relations: $|x| = x^+ + x^-$, $x^- = \max(-x,0) = (-x)^+$. Hence, by virtue of (5.2), we only need a formula for $(-B_t+x)^+$. Notice that we already have it, since the process $-B$ is also a Brownian motion. More precisely,
$$(-B_t+x)^+ = (-B_0+x)^+ + \int_0^t 1_{[-x,\infty)}(-B_s)\,d(-B_s) + \frac12 L^-(t,-x),$$
where we have denoted by $L^-(t,-x)$ the local time of $-B$ at $-x$. We have the following facts:
$$\int_0^t 1_{[-x,\infty)}(-B_s)\,d(-B_s) = -\int_0^t 1_{(-\infty,x]}(B_s)\,dB_s,$$
$$L^-(t,-x) = \lim_{\epsilon\to 0}\frac1{2\epsilon}\int_0^t 1_{(-x-\epsilon,-x+\epsilon)}(-B_s)\,ds = \lim_{\epsilon\to 0}\frac1{2\epsilon}\int_0^t 1_{(x-\epsilon,x+\epsilon)}(B_s)\,ds = L(t,x).$$
Thus, we have proved
$$(B_t-x)^- = (B_0-x)^- - \int_0^t 1_{(-\infty,x]}(B_s)\,dB_s + \frac12 L(t,x). \tag{5.6}$$
Adding up (5.2) and (5.6) yields (5.5). Indeed,
$$1_{[x,\infty)}(B_s) - 1_{(-\infty,x]}(B_s) = \begin{cases} 1, & \text{if } B_s > x,\\ -1, & \text{if } B_s < x,\\ 0, & \text{if } B_s = x,\end{cases}$$
which is identical to $\mathrm{sign}(B_s-x)$. $\square$
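Tanaka's formula can also be checked numerically (a sketch; constants are ours). Discretizing $\int_0^T \mathrm{sign}(B_s)\,dB_s$ with left-point sums, the quantity $|B_T| - \int_0^T \mathrm{sign}(B_s)\,dB_s$ approximates $L(T,0)$, whose mean is $E|B_T| = \sqrt{2T/\pi}$ (the sign-integral has zero mean).

```python
import numpy as np

rng = np.random.default_rng(4)
T, n_steps, n_paths = 1.0, 1000, 20_000
dt = T / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])

# Tanaka at x = 0: L(T, 0) = |B_T| - |B_0| - int_0^T sign(B_s) dB_s.
sign_integral = np.sum(np.sign(B_left) * dB, axis=1)
L_tanaka = np.abs(B[:, -1]) - sign_integral

# E[L(T, 0)] = E|B_T| = sqrt(2 T / pi) ~ 0.7979 for T = 1.
print(np.mean(L_tanaka))
```

The sample mean agrees with $\sqrt{2T/\pi}$ up to Monte Carlo error, and the pathwise values are comparable with the occupation-time estimator of (5.1).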




6 Stochastic differential equations
In this section we shall introduce stochastic differential equations driven by a
multi-dimensional Brownian motion. Under suitable properties on the coeffi-
cients, we shall prove a result on existence and uniqueness of solution. Then
we shall establish properties of the solution, like existence of moments of any
order and the Hölder property of the sample paths.

The setting
We consider a $d$-dimensional Brownian motion $B = \{B_t = (B_t^1,\dots,B_t^d),\ t\ge 0\}$, $B_0 = 0$, defined on a probability space $(\Omega,\mathcal F,P)$, along with a filtration $(\mathcal F_t,\ t\ge 0)$ satisfying the following properties:

1. $B$ is adapted to $(\mathcal F_t,\ t\ge 0)$;

2. for every $t\ge 0$, the $\sigma$-field generated by $\{B_u - B_t,\ u\ge t\}$ is independent of $\mathcal F_t$.

We also consider functions
$$b : [0,\infty)\times\mathbb R^m \to \mathbb R^m, \qquad \sigma : [0,\infty)\times\mathbb R^m \to L(\mathbb R^d;\mathbb R^m).$$
When necessary we will use the coordinate description
$$b(t,x) = \big(b^i(t,x)\big)_{1\le i\le m}, \qquad \sigma(t,x) = \big(\sigma^i_j(t,x)\big)_{1\le i\le m,\,1\le j\le d}.$$

By a stochastic differential equation we mean an expression of the form
$$dX_t = \sigma(t,X_t)\,dB_t + b(t,X_t)\,dt, \quad t\in(0,\infty), \qquad X_0 = x, \tag{6.1}$$
where $x$ is an $m$-dimensional random vector independent of the Brownian motion.
We can also consider any time value $u\ge 0$ as the initial one; in this case, we must write $t\in(u,\infty)$ and $X_u = x$ in (6.1). For the sake of simplicity we will assume here that $x$ is deterministic.
The formal expression (6.1) has to be understood as follows:
$$X_t = x + \int_0^t \sigma(s,X_s)\,dB_s + \int_0^t b(s,X_s)\,ds, \tag{6.2}$$

or coordinate-wise,
$$X_t^i = x^i + \sum_{j=1}^d\int_0^t \sigma^i_j(s,X_s)\,dB_s^j + \int_0^t b^i(s,X_s)\,ds, \quad i = 1,\dots,m.$$
Strong existence and path-wise uniqueness
We now give the notions of existence and uniqueness of solution that we will consider throughout this chapter.

Definition 6.1 An $m$-dimensional, measurable and $\mathcal F_t$-adapted stochastic process $(X_t,\ t\ge 0)$ is a strong solution to (6.2) if the following conditions are satisfied:

1. the processes $(\sigma^i_j(s,X_s),\ s\ge 0)$ belong to $L^2_{a,\infty}$, for any $1\le i\le m$, $1\le j\le d$;

2. the processes $(b^i(s,X_s),\ s\ge 0)$ belong to $L^1_{a,\infty}$, for any $1\le i\le m$;

3. equation (6.2) holds true for the fixed Brownian motion introduced before.

Definition 6.2 Equation (6.2) has a path-wise unique solution if any two strong solutions $X_1$ and $X_2$ in the sense of the previous definition are indistinguishable, that is,
$$P\{X_1(t) = X_2(t),\ \text{for any } t\ge 0\} = 1.$$

Hypotheses on the coefficients
We shall refer to (H) for the following set of hypotheses:

1. linear growth:
$$\sup_t\big[|b(t,x)| + |\sigma(t,x)|\big] \le L(1+|x|); \tag{6.3}$$

2. Lipschitz continuity in the $x$ variable, uniformly in $t$:
$$\sup_t\big[|b(t,x)-b(t,y)| + |\sigma(t,x)-\sigma(t,y)|\big] \le L|x-y|. \tag{6.4}$$

In (6.3), (6.4), $L$ stands for a positive constant.
6.1 Examples of stochastic differential equations
When the functions $\sigma$ and $b$ have a linear structure, the solution to (6.2) admits an explicit form. This is not surprising, as the same is true for ordinary differential equations. We deal with this question in this section. More precisely, suppose that
$$\sigma(t,x) = \Sigma(t) + F(t)x, \tag{6.5}$$
$$b(t,x) = c(t) + D(t)x. \tag{6.6}$$

Example 1 Assume $\sigma(t,x) = \Sigma(t)$, where for any $t\ge 0$, $\Sigma(t)$ is an $m\times d$ matrix, and $b(t,x) = c(t) + Dx$, where for any $t\ge 0$, $c(t)\in\mathbb R^m$ and $D$ is an $m\times m$ diagonal matrix with identical entries. Now equation (6.2) reads
$$X_t = X_0 + \int_0^t \Sigma(s)\,dB_s + \int_0^t\big[c(s) + DX_s\big]\,ds,$$
and has a unique solution given by
$$X_t = e^{Dt}X_0 + \int_0^t e^{D(t-s)}\big(c(s)\,ds + \Sigma(s)\,dB_s\big), \tag{6.7}$$
where, for an $m\times m$ matrix $M$,
$$e^M = \sum_{k=0}^\infty \frac{M^k}{k!}.$$

To check (6.7) we proceed as in the deterministic case. First we consider the equation
$$dX_t = DX_t\,dt,$$
with initial condition $X_0$, whose solution is
$$X_t = e^{Dt}X_0, \quad t\ge 0.$$
Then we use the variation of constants method and write
$$X_t = X_0(t)\,e^{Dt}.$$
A priori $X_0(t)$ may be random. However, since $e^{Dt}$ is differentiable, the Itô differential of $X_t$ is given by
$$dX_t = dX_0(t)\,e^{Dt} + X_0(t)\,De^{Dt}\,dt.$$
Equating the right-hand side of the preceding identity with
$$\Sigma(t)\,dB_t + \big(c(t) + DX_t\big)\,dt$$
yields
$$dX_0(t)\,e^{Dt} + DX_t\,dt = \Sigma(t)\,dB_t + \big(c(t) + DX_t\big)\,dt,$$
that is,
$$dX_0(t) = e^{-Dt}\big[\Sigma(t)\,dB_t + c(t)\,dt\big].$$
In integral form,
$$X_0(t) = x + \int_0^t e^{-Ds}\big[\Sigma(s)\,dB_s + c(s)\,ds\big].$$
Plugging the right-hand side of this equation into $X_t = X_0(t)e^{Dt}$ yields (6.7).
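Formula (6.7) can be tested against a direct Euler simulation in the scalar case (a sketch; the coefficient values below are hypothetical): for constant $c$, constant $\Sigma = \mathtt{sig}$ and $D = d < 0$, (6.7) says that $X_T$ is Gaussian with mean $e^{dT}x_0 + (c/d)(e^{dT}-1)$ and variance $\mathtt{sig}^2(e^{2dT}-1)/(2d)$.

```python
import numpy as np

rng = np.random.default_rng(5)
# Scalar instance of Example 1: dX = (c + d X) dt + sig dB (hypothetical constants).
x0, c, d, sig = 1.0, 0.5, -1.0, 0.3
T, n_steps, n_paths = 1.0, 1000, 20_000
dt = T / n_steps

X = np.full(n_paths, x0)
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X = X + (c + d * X) * dt + sig * dB   # crude Euler discretization

# Moments predicted by the explicit solution (6.7) in the scalar case.
mean_exact = np.exp(d * T) * x0 + (c / d) * (np.exp(d * T) - 1.0)
var_exact = sig ** 2 * (np.exp(2 * d * T) - 1.0) / (2 * d)
print(np.mean(X), mean_exact, np.var(X), var_exact)
```

The simulated moments match the closed-form ones up to discretization and Monte Carlo error.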

6.2 A result on existence and uniqueness of solution


This section is devoted to proving the following result.

Theorem 6.1 Assume that the functions $\sigma$ and $b$ satisfy the assumptions (H). Then there exists a path-wise unique strong solution to (6.2).

Before giving a proof of this theorem, we recall a version of Gronwall's lemma that will be used repeatedly in the sequel.

Lemma 6.1 Let $u,v : [\alpha,\beta] \to \mathbb R_+$ be functions such that $u$ is Lebesgue integrable and $v$ is measurable and bounded. Assume that
$$v(t) \le c + \int_\alpha^t u(s)v(s)\,ds, \tag{6.8}$$
for some constant $c\ge 0$ and for any $t\in[\alpha,\beta]$. Then
$$v(t) \le c\,\exp\Big(\int_\alpha^t u(s)\,ds\Big). \tag{6.9}$$

Proof of Theorem 6.1
Let us introduce the Picard iteration scheme
$$X_t^0 = x, \qquad X_t^n = x + \int_0^t \sigma(s,X_s^{n-1})\,dB_s + \int_0^t b(s,X_s^{n-1})\,ds, \quad n\ge 1, \tag{6.10}$$
$t\ge 0$. Let us restrict the time interval to $[0,T]$, with $T>0$. We shall prove that the sequence of stochastic processes defined recursively by (6.10) converges uniformly to a process $X$ which is a strong solution of (6.2). Eventually, we shall prove path-wise uniqueness.
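The Picard scheme (6.10) can be run numerically on a fixed discretized Brownian path (a sketch; the coefficients $b$, $\sigma$ below are hypothetical Lipschitz choices): successive iterates become indistinguishable very quickly, mirroring the factorial decay in (6.12).

```python
import numpy as np

rng = np.random.default_rng(6)
T, n_steps, n_iter, x0 = 1.0, 1000, 12, 1.0
dt = T / n_steps
dB = rng.normal(0.0, np.sqrt(dt), size=n_steps)   # one fixed Brownian path

def b(x):                # hypothetical drift, Lipschitz constant 0.2
    return -0.2 * x

def sigma(x):            # hypothetical diffusion, Lipschitz constant 0.1
    return 0.1 * np.cos(x)

def picard_step(X):
    """One iteration of (6.10), with left-point Ito/Riemann sums."""
    Y = np.empty(n_steps + 1)
    Y[0] = x0
    Y[1:] = x0 + np.cumsum(sigma(X[:-1]) * dB + b(X[:-1]) * dt)
    return Y

X = np.full(n_steps + 1, x0)      # X^0 = x
sup_diffs = []
for _ in range(n_iter):
    X_new = picard_step(X)
    sup_diffs.append(np.max(np.abs(X_new - X)))
    X = X_new

print(sup_diffs[0], sup_diffs[-1])   # rapid decay of sup |X^{n+1} - X^n|
```

With small Lipschitz constants the sup-distance between consecutive iterates contracts at every step, so a handful of iterations already fixes the path to high accuracy.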
Step 1: We prove by induction on $n$ that, for any $t\in[0,T]$,
$$E\Big(\sup_{0\le s\le t}|X_s^n|^2\Big) < \infty. \tag{6.11}$$

Indeed, this property is clearly true for $n=0$, since in this case $X_t^0$ is constant and equal to $x$. Suppose that (6.11) holds true for $n = 0,\dots,m-1$. Applying Burkholder's and Hölder's inequalities, we obtain
$$E\Big(\sup_{0\le s\le t}|X_s^m|^2\Big) \le C\Big(|x|^2 + E\Big(\sup_{0\le s\le t}\Big|\int_0^s \sigma(u,X_u^{m-1})\,dB_u\Big|^2\Big) + E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,X_u^{m-1})\,du\Big|^2\Big)\Big)$$
$$\le C\Big(|x|^2 + E\int_0^t |\sigma(u,X_u^{m-1})|^2\,du + E\Big(\int_0^t |b(u,X_u^{m-1})|\,du\Big)^2\Big)$$
$$\le C\Big(|x|^2 + E\int_0^t\big(1+|X_u^{m-1}|^2\big)\,du\Big) \le C\Big[|x|^2 + T + T\,E\Big(\sup_{0\le s\le T}|X_s^{m-1}|^2\Big)\Big].$$
Hence (6.11) is proved.


Step 2: As in Step 1, we prove by induction on $n$ that
$$E\Big(\sup_{0\le s\le t}|X_s^{n+1}-X_s^n|^2\Big) \le \frac{(Ct)^{n+1}}{(n+1)!}. \tag{6.12}$$
Indeed, consider first the case $n=0$, for which we have
$$X_s^1 - x = \int_0^s \sigma(u,x)\,dB_u + \int_0^s b(u,x)\,du.$$
Burkholder's inequality yields
$$E\Big(\sup_{0\le s\le t}\Big|\int_0^s \sigma(u,x)\,dB_u\Big|^2\Big) \le C\int_0^t |\sigma(u,x)|^2\,du \le Ct\,(1+|x|^2).$$
Similarly,
$$E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,x)\,du\Big|^2\Big) \le Ct\int_0^t |b(u,x)|^2\,du \le Ct^2(1+|x|^2).$$
With this, (6.12) is established for $n=0$.


Assume that (6.12) holds for natural numbers $m\le n-1$. Then, as we did for $n=0$, we can consider the decomposition
$$E\Big(\sup_{0\le s\le t}|X_s^{n+1}-X_s^n|^2\Big) \le 2\big(A(t)+B(t)\big),$$
with
$$A(t) = E\Big(\sup_{0\le s\le t}\Big|\int_0^s\big(\sigma(u,X_u^n)-\sigma(u,X_u^{n-1})\big)\,dB_u\Big|^2\Big),$$
$$B(t) = E\Big(\sup_{0\le s\le t}\Big|\int_0^s\big(b(u,X_u^n)-b(u,X_u^{n-1})\big)\,du\Big|^2\Big).$$

Using first Burkholder's inequality and then Hölder's inequality along with the Lipschitz property of the coefficient $\sigma$, we obtain
$$A(t) \le C(L)\int_0^t E\big(|X_s^n-X_s^{n-1}|^2\big)\,ds.$$
By the induction assumption we can upper bound the last expression by
$$C(L)\int_0^t \frac{(Cs)^n}{n!}\,ds \le C(T,L)\,\frac{(Ct)^{n+1}}{(n+1)!}.$$
Similarly, applying Hölder's inequality along with the Lipschitz property of the coefficient $b$ and the induction assumption yields
$$B(t) \le C(T,L)\,\frac{(Ct)^{n+1}}{(n+1)!}.$$

Step 3: The sequence of processes $\{X_t^n,\ t\in[0,T]\}$, $n\ge 0$, converges uniformly in $t$ to a stochastic process $\{X_t,\ t\in[0,T]\}$ which satisfies (6.2).
Indeed, applying first Chebyshev's inequality and then (6.12), we have
$$P\Big(\sup_{0\le t\le T}\big|X_t^{n+1}-X_t^n\big| > \frac1{2^n}\Big) \le 2^{2n}\,\frac{(CT)^{n+1}}{(n+1)!},$$
which clearly implies
$$\sum_{n=0}^\infty P\Big(\sup_{0\le t\le T}\big|X_t^{n+1}-X_t^n\big| > \frac1{2^n}\Big) < \infty.$$
Hence, by the first Borel-Cantelli lemma,
$$P\Big(\liminf_n\Big\{\sup_{0\le t\le T}\big|X_t^{n+1}-X_t^n\big| \le \frac1{2^n}\Big\}\Big) = 1.$$
In other words, a.s. there exists a natural number $m_0(\omega)$ such that
$$\sup_{0\le t\le T}\big|X_t^{n+1}-X_t^n\big| \le \frac1{2^n},$$
for any $n\ge m_0(\omega)$. The Weierstrass criterion for convergence of series of functions then implies that
$$X_t^m = x + \sum_{k=0}^{m-1}\big[X_t^{k+1}-X_t^k\big]$$
converges uniformly on $[0,T]$, a.s. Let us denote by $X = \{X_t,\ t\in[0,T]\}$ the limit. Obviously the process $X$ has a.s. continuous paths.
To conclude the proof, we must check that $X$ satisfies equation (6.2) on $[0,T]$. The continuity properties of $\sigma$ and $b$ imply the convergences
$$\sigma(t,X_t^n) \to \sigma(t,X_t), \qquad b(t,X_t^n) \to b(t,X_t),$$
as $n\to\infty$, uniformly in $t\in[0,T]$, a.s. Therefore,
$$\Big|\int_0^t\big[b(s,X_s^n)-b(s,X_s)\big]\,ds\Big| \le L\int_0^t |X_s^n-X_s|\,ds \le Lt\sup_{0\le s\le t}|X_s^n-X_s| \to 0,$$
as $n\to\infty$, a.s. This proves the a.s. convergence of the sequence of path-wise integrals.
As for the stochastic integrals, we will prove that
$$\int_0^t \sigma(s,X_s^n)\,dB_s \to \int_0^t \sigma(s,X_s)\,dB_s \tag{6.13}$$
as $n\to\infty$, with the convergence in probability.
Indeed, applying the extension of Lemma 3.5 to processes of $\Lambda_{a,T}$, we have, for each $\epsilon, N > 0$,
$$P\Big(\Big|\int_0^t\big(\sigma(s,X_s^n)-\sigma(s,X_s)\big)\,dB_s\Big| > \epsilon\Big) \le P\Big(\int_0^t |\sigma(s,X_s^n)-\sigma(s,X_s)|^2\,ds > N\Big) + \frac N{\epsilon^2}.$$
The first term on the right-hand side of this inequality converges to zero as $n\to\infty$. Since $\epsilon, N > 0$ are arbitrary, this yields the convergence stated in (6.13).
Summarising, by considering if necessary a subsequence $\{X_t^{n_k},\ t\in[0,T]\}$, we have proved the a.s. convergence, uniformly in $t\in[0,T]$, to a stochastic process $\{X_t,\ t\in[0,T]\}$ which satisfies (6.2), and moreover
$$E\Big(\sup_{0\le t\le T}|X_t|^2\Big) < \infty.$$
In order to conclude that $X$ is a strong solution to (6.1), we have to check that the required measurability and integrability conditions hold. This is left as an exercise to the reader.
Step 4: Path-wise uniqueness.
Let $X_1$ and $X_2$ be two strong solutions to (6.1). Proceeding in a similar way as in Step 2, we easily get
$$E\Big(\sup_{0\le u\le t}|X_1(u)-X_2(u)|^2\Big) \le C\int_0^t E\Big(\sup_{0\le u\le s}|X_1(u)-X_2(u)|^2\Big)\,ds.$$
Hence, from Lemma 6.1 we conclude
$$E\Big(\sup_{0\le u\le T}|X_1(u)-X_2(u)|^2\Big) = 0,$$
proving that $X_1$ and $X_2$ are indistinguishable. $\square$




6.3 Some properties of the solution


We start this section by studying the $L^p$-moments of the solution to (6.2).

Theorem 6.2 Let the assumptions of Theorem 6.1 hold and suppose in addition that the initial condition is a random variable $X_0$, independent of the Brownian motion. Fix $p\in[2,\infty)$ and $t\in[0,T]$. There exists a positive constant $C = C(p,t,L)$ such that
$$E\Big(\sup_{0\le s\le t}|X_s|^p\Big) \le C\,\big(1+E|X_0|^p\big). \tag{6.14}$$

Proof: From (6.2) it follows that
$$E\Big(\sup_{0\le s\le t}|X_s|^p\Big) \le C(p)\Big(E|X_0|^p + E\Big(\sup_{0\le s\le t}\Big|\int_0^s \sigma(u,X_u)\,dB_u\Big|^p\Big) + E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,X_u)\,du\Big|^p\Big)\Big).$$

Applying first Burkholder's inequality and then Hölder's inequality yields
$$E\Big(\sup_{0\le s\le t}\Big|\int_0^s \sigma(u,X_u)\,dB_u\Big|^p\Big) \le C(p)\,E\Big(\int_0^t |\sigma(s,X_s)|^2\,ds\Big)^{\frac p2} \le C(p,t)\,E\int_0^t |\sigma(s,X_s)|^p\,ds$$
$$\le C(p,L,t)\int_0^t\big(1+E|X_s|^p\big)\,ds \le C(p,L,t)\int_0^t\Big(1+E\Big(\sup_{0\le u\le s}|X_u|^p\Big)\Big)\,ds.$$

For the path-wise integral, we apply Hölder's inequality to obtain
$$E\Big(\sup_{0\le s\le t}\Big|\int_0^s b(u,X_u)\,du\Big|^p\Big) \le C(p,t)\,E\int_0^t |b(s,X_s)|^p\,ds \le C(p,L,t)\int_0^t\Big(1+E\Big(\sup_{0\le u\le s}|X_u|^p\Big)\Big)\,ds.$$

Define
$$\varphi(t) = E\Big(\sup_{0\le s\le t}|X_s|^p\Big).$$
We have established that
$$\varphi(t) \le C(p,L,t)\Big(E|X_0|^p + 1 + \int_0^t \varphi(s)\,ds\Big).$$
Then, with Lemma 6.1 we end the proof of (6.14). $\square$



It is clear that the solution to (6.2) depends on the initial value $X_0$. Consider two initial conditions $X_0$, $Y_0$ (recall that these should be $m$-dimensional random vectors independent of the Brownian motion). Denote by $X(X_0)$, $X(Y_0)$ the corresponding solutions to (6.2). With a proof very similar to that of Theorem 6.2, we can obtain the following.

Theorem 6.3 The assumptions are the same as in Theorem 6.1. Then,
$$E\Big(\sup_{0\le s\le t}|X_s(X_0)-X_s(Y_0)|^p\Big) \le C(p,L,t)\,E|X_0-Y_0|^p, \tag{6.15}$$
for any $p\in[2,\infty)$, where $C(p,L,t)$ is some positive constant depending on $p$, $L$ and $t$.

The sample paths of the solution of a stochastic differential equation possess the same regularity as those of the Brownian motion. We next discuss this fact.

Theorem 6.4 The assumptions are the same as in Theorem 6.1. Let $p\in[2,\infty)$ and $0\le s\le t\le T$. There exists a positive constant $C = C(p,L,T)$ such that
$$E\big(|X_t-X_s|^p\big) \le C(p,L,T)\,\big(1+E|X_0|^p\big)\,|t-s|^{\frac p2}. \tag{6.16}$$

Proof: By virtue of (6.2) we can write
$$E\big(|X_t-X_s|^p\big) \le C(p)\Big(E\Big|\int_s^t \sigma(u,X_u)\,dB_u\Big|^p + E\Big|\int_s^t b(u,X_u)\,du\Big|^p\Big).$$
Burkholder's inequality and then Hölder's inequality with respect to the Lebesgue measure on $[s,t]$ yield
$$E\Big|\int_s^t \sigma(u,X_u)\,dB_u\Big|^p \le C(p)\,E\Big(\int_s^t |\sigma(u,X_u)|^2\,du\Big)^{\frac p2} \le C(p)\,|t-s|^{\frac p2-1}E\int_s^t |\sigma(u,X_u)|^p\,du$$
$$\le C(p,L,T)\,|t-s|^{\frac p2-1}\int_s^t\big(1+E(|X_u|^p)\big)\,du.$$
By using the estimate (6.14) of Theorem 6.2, we have
$$\int_s^t\big(1+E(|X_u|^p)\big)\,du \le C(p,T,L)\,|t-s|\Big(1+E\Big(\sup_{0\le u\le t}|X_u|^p\Big)\Big) \le C(p,T,L)\,|t-s|\,\big(1+E|X_0|^p\big). \tag{6.17}$$
This ends the estimate of the $L^p$ moment of the stochastic integral.

The estimate of the path-wise integral follows from Hölder's inequality and (6.17). Indeed, we have
$$E\Big|\int_s^t b(u,X_u)\,du\Big|^p \le |t-s|^{p-1}E\int_s^t |b(u,X_u)|^p\,du \le C(p,L)\,|t-s|^{p-1}\int_s^t\big(1+E(|X_u|^p)\big)\,du \le C(p,T,L)\,|t-s|^p\,\big(1+E|X_0|^p\big).$$
Hence we have proved (6.16). $\square$



We can now apply Kolmogorov's continuity criterion (see Proposition 2.2) to prove the following.

Corollary 6.1 With the same assumptions as in Theorem 6.1, the sample paths of the solution to (6.2) are Hölder continuous of any degree $\alpha\in(0,\frac12)$.

Remark 6.1 Assume that in Theorem 6.3 the initial conditions are deterministic, denoted by $x$ and $y$, respectively. An extension of Kolmogorov's continuity criterion to stochastic processes indexed by a multi-dimensional parameter yields that the sample paths of the stochastic process $\{X_t(x),\ t\in[0,T],\ x\in\mathbb R^m\}$ are jointly Hölder continuous in $(t,x)$, of degree $\alpha<\frac12$ in $t$ and $\beta<1$ in $x$, respectively.

6.4 Markov property of the solution


In Section 2.5 we discussed the Markov property of a real-valued Brownian motion. With the obvious change of $\mathbb R$ into $\mathbb R^n$, for arbitrary $n\ge 1$, the property extends to multi-dimensional Brownian motion. In this section we prove that the solution to the SDE (6.2) inherits the Markov property from the Brownian motion. To establish this fact we need some preliminary results.

Lemma 6.2 Let $(\Omega,\mathcal F,P)$ be a probability space and $(E,\mathcal E)$ be a measurable space. Consider two independent sub-$\sigma$-fields $\mathcal G$, $\mathcal H$ of $\mathcal F$, along with mappings
$$X : (\Omega,\mathcal H) \to (E,\mathcal E)$$
and
$$\Psi : E\times\Omega \to \mathbb R^m,$$
measurable with respect to $\mathcal E\otimes\mathcal G$ and such that $\omega\mapsto\Psi(X(\omega),\omega)$ belongs to $L^1(\Omega)$. Then,
$$E\big(\Psi(X,\cdot)\,|\,\mathcal H\big) = \Phi(X)(\cdot), \tag{6.18}$$
with $\Phi(x)(\cdot) = E(\Psi(x,\cdot))$.

Proof: Assume first that $\Psi(x,\omega) = f(x)Z(\omega)$, with a $\mathcal G$-measurable $Z$ and an $\mathcal E$-measurable $f$. Then, by the properties of the mathematical expectation,
$$E\big(\Psi(X,\cdot)\,|\,\mathcal H\big) = E\big(f(X(\cdot))Z(\cdot)\,|\,\mathcal H\big) = f(X(\cdot))\,E(Z).$$
Indeed, we use that $X$ is $\mathcal H$-measurable and that $\mathcal G$, $\mathcal H$ are independent. Clearly,
$$f(X(\cdot))\,E(Z) = f(x)E(Z)\big|_{x=X(\cdot)} = E(\Psi(x,\cdot))\big|_{x=X(\cdot)} = \Phi(X).$$
This yields (6.18).
The result extends to any $\mathcal E\otimes\mathcal G$-measurable function $\Psi$ by a monotone class argument. $\square$


Lemma 6.3 Fix $u\ge 0$ and let $\eta$ be an $\mathcal F_u$-measurable random variable in $L^2(\Omega)$. Consider the SDE
$$Y_t^\eta = \eta + \int_u^t \sigma(s,Y_s^\eta)\,dB_s + \int_u^t b(s,Y_s^\eta)\,ds,$$
with coefficients $\sigma$ and $b$ satisfying the assumptions (H). Then, for any $t\ge u$,
$$Y_t^{\eta(\omega)}(\omega) = X_t^{x,u}(\omega)\big|_{x=\eta(\omega)},$$
where $X_t^{x,u}$, $t\ge u$, denotes the solution to
$$X_t^{x,u} = x + \int_u^t \sigma(s,X_s^{x,u})\,dB_s + \int_u^t b(s,X_s^{x,u})\,ds. \tag{6.19}$$

Proof: Suppose first that $\eta$ is a step function,
$$\eta = \sum_{i=1}^r c_i 1_{A_i}, \quad A_i\in\mathcal F_u.$$
By virtue of the local property of the stochastic integral, on the set $A_i$,
$$X_t^{c_i,u}(\omega) = X_t^{x,u}(\omega)\big|_{x=\eta(\omega)} = Y_t^\eta(\omega).$$
Let now $(\eta_n,\ n\ge 1)$ be a sequence of simple $\mathcal F_u$-measurable random variables converging in $L^2(\Omega)$ to $\eta$. By Theorem 6.3 we have
$$L^2(\Omega)-\lim_{n\to\infty} Y_t^{\eta_n} = Y_t^\eta.$$
By taking if necessary a subsequence, we may assume that the limit holds a.s. Then, a.s.,
$$Y_t^{\eta(\omega)}(\omega) = \lim_{n\to\infty} Y_t^{\eta_n(\omega)}(\omega) = \lim_{n\to\infty} X_t^{x,u}(\omega)\big|_{x=\eta_n(\omega)} = X_t^{x,u}(\omega)\big|_{x=\eta(\omega)},$$
where, in the last equality, we have applied the joint continuity in $(t,x)$ of $X_t^{x,u}$. $\square$

As a consequence of the preceding lemma, we have $X_t^{x,s} = X_t^{X_u^{x,s},\,u}$ for any $0\le s\le u\le t$, a.s.
For any $\Gamma\in\mathcal B(\mathbb R^m)$, set
$$p(s,t,x,\Gamma) = P\{X_t^{x,s}\in\Gamma\}, \tag{6.20}$$
so, for fixed $0\le s\le t$ and $x\in\mathbb R^m$, $p(s,t,x,\cdot)$ is the law of the random variable $X_t^{x,s}$.
Theorem 6.5 The stochastic process $\{X_t^{x,s},\ t\ge s\}$ is a Markov process with initial distribution $\mu = \delta_{\{x\}}$ and transition probability function given by (6.20).

Proof: According to Definition 2.3, we have to check that (6.20) defines a Markovian transition function and that
$$P\{X_t^{x,s}\in\Gamma\,|\,\mathcal F_u\} = p(u,t,X_u^{x,s},\Gamma). \tag{6.21}$$
We start by proving this identity. For this, we shall apply Lemma 6.2 in the following setting:
$$(E,\mathcal E) = (\mathbb R^m,\mathcal B(\mathbb R^m)), \qquad \mathcal G = \sigma(B_{r+u}-B_u,\ r\ge 0), \qquad \mathcal H = \mathcal F_u,$$
$$\Psi(x,\omega) = 1_\Gamma\big(X_t^{x,u}(\omega)\big),\ u\le t, \qquad X := X_u^{x,s},\ s\le u.$$
Clearly the $\sigma$-fields $\mathcal G$ and $\mathcal H$ defined above are independent and $X_u^{x,s}$ is $\mathcal F_u$-measurable. Moreover,
$$\Phi(x) = E(\Psi(x,\cdot)) = E\big(1_\Gamma(X_t^{x,u})\big) = P\{X_t^{x,u}\in\Gamma\} = p(u,t,x,\Gamma).$$

Thus, Lemma 6.3 and then Lemma 6.2 yield
$$P\{X_t^{x,s}\in\Gamma\,|\,\mathcal F_u\} = P\big\{X_t^{X_u^{x,s},\,u}\in\Gamma\,\big|\,\mathcal F_u\big\} = E\big(1_\Gamma\big(X_t^{X_u^{x,s},\,u}\big)\,\big|\,\mathcal F_u\big) = E\big(\Psi(X_u^{x,s},\omega)\,|\,\mathcal F_u\big) = \Phi(X_u^{x,s}) = p(u,t,X_u^{x,s},\Gamma).$$
Since $x\mapsto X_t^{x,s}$ is continuous a.s., the mapping $x\mapsto p(s,t,x,\Gamma)$ is also continuous and thus measurable. Moreover, by its very definition, $\Gamma\mapsto p(s,t,x,\Gamma)$ is a probability. We now prove that the Chapman-Kolmogorov equation is satisfied (see (2.8)).
Indeed, fix $0\le s\le u\le t$; by property (c) of the conditional expectation we have
$$p(s,t,x,\Gamma) = E\big(1_\Gamma(X_t^{x,s})\big) = E\big(E\big(1_\Gamma(X_t^{x,s})\,|\,\mathcal F_u\big)\big) = E\big(P\{X_t^{x,s}\in\Gamma\,|\,\mathcal F_u\}\big).$$
By (6.21), this last expression is $E\big(p(u,t,X_u^{x,s},\Gamma)\big)$. But
$$E\big(p(u,t,X_u^{x,s},\Gamma)\big) = \int_{\mathbb R^m} p(u,t,y,\Gamma)\,\mathcal L_{X_u^{x,s}}(dy),$$
where $\mathcal L_{X_u^{x,s}}$ denotes the probability law of $X_u^{x,s}$. By definition,
$$\mathcal L_{X_u^{x,s}}(dy) = p(s,u,x,dy).$$
Therefore,
$$p(s,t,x,\Gamma) = \int_{\mathbb R^m} p(u,t,y,\Gamma)\,p(s,u,x,dy).$$
The proof of the theorem is now complete. $\square$


7 Numerical approximations of stochastic


differential equations
In this section we consider a fixed time interval $[0,T]$. Let $\pi = \{0 = \tau_0 < \tau_1 < \cdots < \tau_N = T\}$ be a partition of $[0,T]$. The Euler-Maruyama scheme for the SDE (6.2) based on the partition $\pi$ is the stochastic process $X^\pi = \{X_t^\pi,\ t\in[0,T]\}$ defined iteratively as follows:
$$X_{\tau_{n+1}}^\pi = X_{\tau_n}^\pi + \sigma(\tau_n,X_{\tau_n}^\pi)\big(B_{\tau_{n+1}}-B_{\tau_n}\big) + b(\tau_n,X_{\tau_n}^\pi)(\tau_{n+1}-\tau_n), \quad n = 0,\dots,N-1, \qquad X_0^\pi = x. \tag{7.1}$$
Notice that the values $X_{\tau_n}^\pi$, $n = 0,\dots,N-1$, are determined by the values of $B_{\tau_n}$, $n = 1,\dots,N$.
We can extend the definition of $X^\pi$ to any value of $t\in[0,T]$ by setting
$$X_t^\pi = X_{\tau_j}^\pi + \sigma(\tau_j,X_{\tau_j}^\pi)(B_t-B_{\tau_j}) + b(\tau_j,X_{\tau_j}^\pi)(t-\tau_j), \tag{7.2}$$
for $t\in[\tau_j,\tau_{j+1})$.


The stochastic process {Xtπ , t ∈ [0, T ]} defined by (7.2) can be written as a
stochastic differential equation. This notation will be suitable for comparison
with the solution of (6.2). Indeed, for any t ∈ [0, T ] set π(t) = sup{τl ∈
π; τl ≤ t}; then
Z t h     i
Xtπ =x π
σ π(s), Xπ(s) π
dBs + b π(s), Xπ(s) ds . (7.3)
0
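A minimal implementation of the scheme (7.1) (a sketch; the test equation and its parameters are our hypothetical example): for geometric Brownian motion $dX = \mu X\,dt + \sigma X\,dB$, the exact solution $X_T = x_0\exp\big((\mu-\sigma^2/2)T + \sigma B_T\big)$ is available, so the strong error of the scheme can be measured directly.

```python
import numpy as np

rng = np.random.default_rng(7)
# Euler-Maruyama for dX = mu X dt + sig X dB, X_0 = x0 (hypothetical constants).
x0, mu, sig = 1.0, 0.05, 0.2
T, n_steps, n_paths = 1.0, 1000, 5000
dt = T / n_steps

X = np.full(n_paths, x0)
B_T = np.zeros(n_paths)
for _ in range(n_steps):
    dB = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    X = X + mu * X * dt + sig * X * dB    # one step of scheme (7.1)
    B_T += dB

# Exact solution evaluated on the same Brownian paths.
X_exact = x0 * np.exp((mu - 0.5 * sig ** 2) * T + sig * B_T)
strong_err = np.mean(np.abs(X - X_exact))
print(strong_err)   # small, consistent with Theorem 7.1
```

The per-path comparison against the exact solution is what the $L^p$ estimate (7.5) controls.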

The next theorem gives the rate of convergence of the Euler-Maruyama scheme to the solution of (6.2) in the $L^p$ norm.

Theorem 7.1 We assume that the hypotheses (H) are satisfied. Moreover, we suppose that there exists $\alpha\in(0,1)$ such that
$$|\sigma(t,x)-\sigma(s,x)| + |b(t,x)-b(s,x)| \le C(1+|x|)\,|t-s|^\alpha, \tag{7.4}$$
where $C$ is some positive constant. Then, for any $p\in[1,\infty)$,
$$E\Big(\sup_{0\le t\le T}|X_t^\pi-X_t|^p\Big) \le C(T,p,x)\,|\pi|^{\beta p}, \tag{7.5}$$
where $|\pi|$ denotes the mesh of the partition $\pi$ and $\beta = \frac12\wedge\alpha$.
Proof: We shall apply the following result, which can be proved by arguing in a similar way as in Theorem 6.2:
$$\sup_\pi E\Big(\sup_{0\le s\le t}|X_s^\pi|^p\Big) \le C(p,T). \tag{7.6}$$
Set
$$Z_t = \sup_{0\le s\le t}|X_s^\pi-X_s|.$$
Applying Burkholder's and Hölder's inequalities, we obtain
$$E(Z_t^p) \le 2^{p-1}\Big(t^{\frac p2-1}\int_0^t E\big(\big|\sigma(\pi(s),X_{\pi(s)}^\pi)-\sigma(s,X_s)\big|^p\big)\,ds + t^{p-1}\int_0^t E\big(\big|b(\pi(s),X_{\pi(s)}^\pi)-b(s,X_s)\big|^p\big)\,ds\Big).$$
The assumptions on the coefficients yield
$$\big|\sigma(\pi(s),X_{\pi(s)}^\pi)-\sigma(s,X_s)\big| \le \big|\sigma(\pi(s),X_{\pi(s)}^\pi)-\sigma(\pi(s),X_{\pi(s)})\big| + \big|\sigma(\pi(s),X_{\pi(s)})-\sigma(s,X_{\pi(s)})\big| + \big|\sigma(s,X_{\pi(s)})-\sigma(s,X_s)\big|$$
$$\le C_T\Big[\big|X_{\pi(s)}^\pi-X_{\pi(s)}\big| + \big(1+|X_{\pi(s)}|\big)\,|s-\pi(s)|^\alpha + \big|X_{\pi(s)}-X_s\big|\Big] \le C_T\Big[Z_s + \big(1+|X_{\pi(s)}|\big)\,(s-\pi(s))^\alpha + \big|X_{\pi(s)}-X_s\big|\Big],$$
and similarly for the coefficient $b$.


Hence, we have
$$E\big(\big|\sigma(\pi(s),X_{\pi(s)}^\pi)-\sigma(s,X_s)\big|^p\big) \le C(p,T)\Big[E(|Z_s|^p) + (1+|x|^p)\big(|\pi|^{\alpha p}+|\pi|^{\frac p2}\big)\Big] \le C(p,T)\Big[E(|Z_s|^p) + (1+|x|^p)\big(|\pi|^\alpha+|\pi|^{\frac12}\big)^p\Big],$$
and a similar estimate for $b$. Consequently,
$$E(Z_t^p) \le C(p,T,x)\Big(\int_0^t E(Z_s^p)\,ds + \big(|\pi|^\alpha+|\pi|^{\frac12}\big)^p\,T\Big).$$
With Gronwall's lemma we conclude
$$E(Z_t^p) \le C(p,T,x)\,|\pi|^{\beta p},$$
with $\beta = \frac12\wedge\alpha$. $\square$


Remark 7.1 If the coefficients $\sigma$ and $b$ do not depend on $t$, a similar proof gives $\beta = \frac12$ in (7.5).
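The strong rate of Theorem 7.1 can be observed empirically (a sketch; the test equation and all parameters are ours): running the scheme on dyadic coarsenings of one set of fine Brownian increments and regressing $\log$-error on $\log|\pi|$ should give a slope close to $\beta = \frac12$.

```python
import numpy as np

rng = np.random.default_rng(8)
# Strong-rate study for the (hypothetical) GBM dX = mu X dt + sig X dB,
# whose exact solution is known in closed form.
x0, mu, sig, T = 1.0, 0.05, 0.5, 1.0
n_fine, n_paths = 2 ** 10, 2000
dB_fine = rng.normal(0.0, np.sqrt(T / n_fine), size=(n_paths, n_fine))
B_T = dB_fine.sum(axis=1)
X_exact = x0 * np.exp((mu - 0.5 * sig ** 2) * T + sig * B_T)

errors, meshes = [], []
for k in range(4, 11):                  # n = 2^k steps, mesh |pi| = T / n
    n = 2 ** k
    dt = T / n
    dB = dB_fine.reshape(n_paths, n, n_fine // n).sum(axis=2)
    X = np.full(n_paths, x0)
    for j in range(n):
        X = X + mu * X * dt + sig * X * dB[:, j]
    errors.append(np.mean(np.abs(X - X_exact)))
    meshes.append(dt)

slope, _ = np.polyfit(np.log(meshes), np.log(errors), 1)
print(slope)   # should be near the theoretical strong order 1/2
```

The fitted slope is a crude estimate of $\beta$; it matches the $\frac12$ predicted by Remark 7.1 for time-independent coefficients.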

Consider now a sequence of partitions $(\pi_n,\ n\ge 1)$ of $[0,T]$ whose meshes satisfy $\sum_{n\ge 1}|\pi_n|^\delta < \infty$ for some arbitrarily small $\delta>0$. Fix $\gamma\in(0,\beta)$ and choose $p$ so large that $(\beta-\gamma)p \ge \delta$; then Chebyshev's inequality and (7.5) imply
$$\sum_{n\ge 1} P\Big(|\pi_n|^{-\gamma}\sup_{0\le s\le T}|X_s^{\pi_n}-X_s| > \epsilon\Big) \le C(p,T,x)\,\epsilon^{-p}\sum_{n\ge 1}|\pi_n|^{(\beta-\gamma)p} < \infty.$$
The Borel-Cantelli lemma then yields
$$|\pi_n|^{-\gamma}\sup_{0\le s\le T}|X_s^{\pi_n}-X_s| \to 0, \quad \text{a.s.}, \tag{7.7}$$
as $n\to\infty$.

8 Continuous time martingales
In this chapter we shall study some properties of martingales (respectively, supermartingales and submartingales) whose sample paths are continuous. We consider a filtration $\{\mathcal F_t,\ t\ge 0\}$ as introduced in Section 2.4 and refer to Definition 2.2 for the notion of martingale (respectively, supermartingale, submartingale). We notice that this definition can in fact be extended to families of random variables $\{X_t,\ t\in\mathbb T\}$, where $\mathbb T$ is an ordered set; in particular, we can consider discrete time parameter processes.
We start by listing some elementary but useful properties.

1. For a martingale (respectively, supermartingale, submartingale) the


function t 7→ E(Xt ) is a constant (respectively, decreasing, increasing)
function.

2. Let {Xt , t ≥ 0} be a martingale and let f : R → R be a convex


function. Assume further that f (Xt ) ∈ L1 (Ω), for any t ≥ 0. Then
the stochastic process {f (Xt ), t ≥ 0} is a submartingale. The same
conclusion holds true for a submartingale if, additionally, the convex
function f is increasing.

The first assertion follows easily from property (c) of the conditional expec-
tation. The second assertion can be proved using Jensen’s inequality.

8.1 Doob’s inequalities for martingales


In the first part of this section, we will deal with discrete parameter martin-
gales.

Proposition 8.1 Let {Xn , 0 ≤ n ≤ N } be a submartingale. For any λ > 0,


the following inequalities hold:
   
   λ P( sup_n X_n ≥ λ ) ≤ E( X_N 1_{{sup_n X_n ≥ λ}} ) ≤ E( |X_N| 1_{{sup_n X_n ≥ λ}} ).   (8.1)

Proof: Consider the stopping time

T = inf{n : Xn ≥ λ} ∧ N.

Then

   E(X_N) ≥ E(X_T) = E( X_T 1_{{sup_n X_n ≥ λ}} ) + E( X_T 1_{{sup_n X_n < λ}} )
          ≥ λ P( sup_n X_n ≥ λ ) + E( X_N 1_{{sup_n X_n < λ}} ).

By subtracting E( X_N 1_{{sup_n X_n < λ}} ) from the first and the last term above, we obtain the first inequality of (8.1). The second one is obvious.

As a consequence of this proposition we have the following.

Proposition 8.2 Let {Xn , 0 ≤ n ≤ N } be either a martingale or a positive


submartingale. Fix p ∈ [1, ∞) and λ ∈ (0, ∞). Then,
 
   λ^p P( sup_n |X_n| ≥ λ ) ≤ E( |X_N|^p ).   (8.2)

Moreover, for any p ∈ ]1, ∞),

   E( |X_N|^p ) ≤ E( sup_n |X_n|^p ) ≤ ( p/(p−1) )^p E( |X_N|^p ).   (8.3)

Proof: Without loss of generality, we may assume that E(|X_N|^p) < ∞, since otherwise (8.2) holds trivially.
According to property 2 above, the process {|X_n|^p, 0 ≤ n ≤ N} is a submartingale and then, by Proposition 8.1 applied to this process,

   µ P( sup_n |X_n|^p ≥ µ ) = µ P( sup_n |X_n| ≥ µ^{1/p} )
                            = λ^p P( sup_n |X_n| ≥ λ ) ≤ E( |X_N|^p ),

where for any µ > 0 we have written λ = µ^{1/p}.
We now prove the second inequality of (8.3); the first is obvious.
Set X* = sup_n |X_n|, for which we have

   λ P( X* ≥ λ ) ≤ E( |X_N| 1_{{X* ≥ λ}} ).

Fubini's theorem yields

   E( (X* ∧ k)^p ) = E( ∫_0^{X*∧k} p λ^{p−1} dλ )
      = ∫_Ω dP ∫_0^∞ 1_{{λ ≤ X*∧k}} p λ^{p−1} dλ
      = ∫_0^k p λ^{p−1} dλ ∫_{{λ ≤ X*}} dP
      = ∫_0^k p λ^{p−1} P( X* ≥ λ ) dλ
      ≤ ∫_0^k p λ^{p−2} E( |X_N| 1_{{X* ≥ λ}} ) dλ
      = p E( |X_N| ∫_0^{k∧X*} λ^{p−2} dλ )
      = (p/(p−1)) E( |X_N| (X* ∧ k)^{p−1} ).

Applying Hölder's inequality with exponents p/(p−1) and p yields

   E( (X* ∧ k)^p ) ≤ (p/(p−1)) [E( (X* ∧ k)^p )]^{(p−1)/p} [E( |X_N|^p )]^{1/p}.

Consequently,

   [E( (X* ∧ k)^p )]^{1/p} ≤ (p/(p−1)) [E( |X_N|^p )]^{1/p}.

Letting k → ∞ and using monotone convergence, we end the proof.
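Both inequalities of Proposition 8.2 can be checked by Monte Carlo on the simplest discrete-time martingale, the symmetric ±1 random walk. The sketch below (assuming NumPy; the sample sizes and the level λ are arbitrary illustrative choices) estimates the two sides of (8.2) and (8.3) with p = 2:

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, N = 20000, 100
# the symmetric +/-1 random walk is a martingale with X_0 = 0
steps = rng.choice([-1.0, 1.0], size=(n_paths, N))
X = np.cumsum(steps, axis=1)                   # X_1, ..., X_N for each path
running_max = np.max(np.abs(X), axis=1)        # sup_n |X_n|
final_p = np.abs(X[:, -1]) ** 2                # |X_N|^p with p = 2

lam, p = 25.0, 2
# (8.2): lambda^p P(sup_n |X_n| >= lambda) <= E|X_N|^p
lhs_maximal = lam**p * np.mean(running_max >= lam)
rhs_maximal = np.mean(final_p)                 # E|X_N|^2 = N = 100 here
# (8.3): E(sup_n |X_n|^p) <= (p/(p-1))^p E|X_N|^p
lhs_lp = np.mean(running_max**p)
rhs_lp = (p / (p - 1))**p * np.mean(final_p)
```

With these parameters both empirical left-hand sides sit well below the corresponding right-hand sides, as the proposition guarantees for the true expectations.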

It is not difficult to extend the above results to martingales (submartingales)
with continuous sample paths. In fact, for a given T > 0 we define
   D = Q ∩ [0, T],
   D_n = D ∩ { k/2^n : k ∈ Z_+ },

where Q denotes the set of rational numbers.
We can now apply (8.2) and (8.3) to the corresponding processes indexed by D_n. By letting n tend to ∞ we obtain

   λ^p P( sup_{t∈D} |X_t| ≥ λ ) ≤ sup_{t∈D} E( |X_t|^p ),   p ∈ [1, ∞),

and

   E( sup_{t∈D} |X_t|^p ) ≤ ( p/(p−1) )^p sup_{t∈D} E( |X_t|^p ),   p ∈ ]1, ∞).
By the continuity of the sample paths we can finally state the following result.

Theorem 8.1 Let {Xt , t ∈ [0, T ]} be either a continuous martingale or a
continuous positive submartingale. Then
   λ^p P( sup_{t∈[0,T]} |X_t| ≥ λ ) ≤ sup_{t∈[0,T]} E( |X_t|^p ),   p ∈ [1, ∞),   (8.4)

   E( sup_{t∈[0,T]} |X_t|^p ) ≤ ( p/(p−1) )^p sup_{t∈[0,T]} E( |X_t|^p ),   p ∈ ]1, ∞).   (8.5)

Inequality (8.4) is termed Doob's maximal inequality, while (8.5) is called Doob's Lp inequality.

8.2 Local martingales


Throughout this section we will consider a fixed filtration (Ft, t ≥ 0) and use the notation X^T for the stochastic process {X^T_t = X_{t∧T}, t ≥ 0}, where T is a stopping time.
Definition 8.1 An adapted continuous process M = (Mt, t ≥ 0), with M0 = 0 a.s., is a (continuous) local martingale if there exists an increasing sequence of stopping times (Tn, n ≥ 1) such that Tn ↑ ∞ a.s. and, for each n ≥ 1, the process M^{Tn} is a martingale.
We say that (Tn , n ≥ 1) reduces M .
Here are some interesting properties.
(a) Each continuous martingale is a continuous local martingale. In fact,
the sequence Tn = n reduces M .

(b) If M is a continuous local martingale, then for any stopping time T ,


M T is also a local martingale.
The next result helps to understand the relationship between the notions of
martingale and local martingale.

Proposition 8.3 1. A positive local martingale is a supermartingale.

2. Let M be a local martingale for which there exists a random variable


Z ∈ L1 (Ω) such that |Mt | ≤ Z for any t ≥ 0. Then M is a martingale.

3. Let M be a local martingale. The sequence of stopping times given by

Tn = inf{t ≥ 0, |Mt | ≥ n}

reduces M .

Proof: We start by proving (1). Consider a sequence (Tn , n ≥ 1) of stopping
times that reduce M . Then for 0 ≤ s ≤ t and each n,

Ms∧Tn = E (Mt∧Tn |Fs ) . (8.6)

Since the random variables are positive, by letting n tend to infinity and applying Fatou's lemma for conditional expectations, we obtain

   M_s = lim_{n→∞} E( M_{t∧T_n} | F_s ) ≥ E( liminf_{n→∞} M_{t∧T_n} | F_s ) = E( M_t | F_s ).

Moreover, for s = 0 we obtain E(M_t) ≤ E(M_0), so M_t ∈ L1(Ω). This ends the proof of the first assertion.
To prove the second one, we notice that by bounded convergence, from (8.6) we obtain

   M_s = E( M_t | F_s ).

Let us finally prove (3). The process M^{T_n} is a local martingale (see property (b) above) and it is bounded by n. Thus, by (2), it is also a martingale. Consequently, the sequence T_n = inf{t ≥ 0 : |M_t| ≥ n}, n ≥ 1, reduces M.

The next statement gives a feeling of the roughness of the sample paths of a
continuous local martingale.

Proposition 8.4 Let M be a continuous local martingale with sample paths


of bounded variation, a.s. Then M is indistinguishable from the constant
process 0.

Proof: Define the sequence of stopping times

   τ_n = inf{ t > 0 : ∫_0^t |dM_s| ≥ n },   n ≥ 1.

Fix n and set N^n = M^{τ_n}. This is a local martingale satisfying ∫_0^∞ |dN^n_s| ≤ n, and therefore |N^n_t| ≤ n for any t ≥ 0. Hence, by Proposition 8.3, N^n is a bounded martingale. In the sequel we shall omit the superscript n to simplify the notation.

Fix t > 0 and consider a partition 0 = t_0 < t_1 < · · · < t_p = t of [0, t]. Then

   E(N_t²) = E( Σ_{i=1}^p [ N_{t_i}² − N_{t_{i−1}}² ] )
           = Σ_{i=1}^p E( ( N_{t_i} − N_{t_{i−1}} )² )
           ≤ E( sup_i |N_{t_i} − N_{t_{i−1}}| Σ_{i=1}^p |N_{t_i} − N_{t_{i−1}}| )
           ≤ n E( sup_i |N_{t_i} − N_{t_{i−1}}| ),

where the second identity above is a consequence of the martingale property of N, and the last inequality of the bound on the total variation of N.
Considering a sequence of partitions whose mesh tends to zero, the continuity of the sample paths of N and bounded convergence make the last expectation tend to zero; the preceding estimate thus yields E(N_t²) = E(M²_{t∧τ_n}) = 0. Eventually, letting n → ∞ implies E(M_t²) = 0. This finishes the proof of the proposition.


8.3 Quadratic variation of a local martingale


Throughout this section we will consider a fixed t > 0 and an increasing sequence of partitions of [0, t] whose mesh tends to zero. Points of the n-th partition will be generically denoted by t^n_k, k = 0, 1, . . . , p_n. We will also consider a continuous local martingale M and define

   ⟨M⟩^n_t = Σ_{k=1}^{p_n} ( M_{t^n_k} − M_{t^n_{k−1}} )²,
   (Δ^n_k M)_t = M_{t^n_k} − M_{t^n_{k−1}}.

Theorem 8.2 The sequence (⟨M⟩^n_t, n ≥ 1), t ∈ [0, T], converges uniformly in t, in probability, to a continuous, increasing process ⟨M⟩ = (⟨M⟩_t, t ∈ [0, T]) with ⟨M⟩_0 = 0. That is, for any ε > 0,

   P( sup_{t∈[0,T]} |⟨M⟩^n_t − ⟨M⟩_t| > ε ) → 0,   (8.7)

as n → ∞. The process ⟨M⟩ is the unique process satisfying the above conditions and such that M² − ⟨M⟩ is a continuous local martingale.
If M is bounded in L^p(Ω), then the convergence of ⟨M⟩^n to ⟨M⟩ takes place in L^{p/2}(Ω).

Proof: Uniqueness follows from Proposition 8.4. Indeed, assume there were two increasing processes ⟨M⟩^i, i = 1, 2, such that M² − ⟨M⟩^i is a continuous local martingale. By subtracting these two processes we get that ⟨M⟩^1 − ⟨M⟩^2 is of bounded variation and, at the same time, a continuous local martingale. Hence ⟨M⟩^1 and ⟨M⟩^2 are indistinguishable.
The existence of hM i is proved through different steps, as follows.
Step 1. Assume that M is bounded. We shall prove that (⟨M⟩^n_t, n ≥ 1) is, uniformly in t ∈ [0, T], a Cauchy sequence in probability.
Let m > n. We have the following:

   E( |⟨M⟩^n_t − ⟨M⟩^m_t|² )
      = E[ ( Σ_{k=1}^{p_n} [ (Δ^n_k M)²_t − Σ_{j: t^m_j ∈ [t^n_{k−1}, t^n_k[} (Δ^m_j M)²_t ] )² ]
      = 4 E[ ( Σ_{k=1}^{p_n} Σ_{j: t^m_j ∈ [t^n_{k−1}, t^n_k[} (Δ^m_j M)_t ( M_{t^m_{j−1}} − M_{t^n_{k−1}} ) )² ]
      = 4 E[ Σ_{k=1}^{p_n} Σ_{j: t^m_j ∈ [t^n_{k−1}, t^n_k[} (Δ^m_j M)²_t ( M_{t^m_{j−1}} − M_{t^n_{k−1}} )² ]
      ≤ 4 E[ sup_{j,k} | M_{t^m_{j−1}} − M_{t^n_{k−1}} |² Σ_{j=1}^{p_m} (Δ^m_j M)²_t ]
      ≤ 4 [ E( sup_{j,k} | M_{t^m_{j−1}} − M_{t^n_{k−1}} |⁴ ) ]^{1/2} [ E( ( Σ_{j=1}^{p_m} (Δ^m_j M)²_t )² ) ]^{1/2},

where the suprema are over pairs j, k with t^m_j ∈ [t^n_{k−1}, t^n_k[, and the last step is Cauchy–Schwarz's inequality.

Let us now consider the last expression. The first factor tends to zero as n and m tend to infinity, because M is continuous and bounded. The second factor is easily seen to be bounded uniformly in m. Thus we have proved

   lim_{n,m→∞} E( |⟨M⟩^n_t − ⟨M⟩^m_t|² ) = 0,   (8.8)

for any t ∈ [0, T].


One can easily check that for any n ≥ 1, {M_t² − ⟨M⟩^n_t, t ∈ [0, T]} is a continuous martingale. Therefore {⟨M⟩^n_t − ⟨M⟩^m_t, t ∈ [0, T]} is a martingale for any n, m ≥ 1. Hence, Doob's inequality yields

   E( sup_{t∈[0,T]} |⟨M⟩^n_t − ⟨M⟩^m_t|² ) ≤ 4 E( |⟨M⟩^n_T − ⟨M⟩^m_T|² ).

This yields the existence of a process hM i satisfying the required conditions.

Step 2. For a general continuous local martingale M, let us consider an increasing sequence of stopping times (Tn, n ≥ 1) converging to infinity a.s. such that M^{Tn} is a bounded martingale. By Step 1, there exists an increasing process ⟨M^n⟩ associated with M^n := M^{Tn}. Moreover, for any m ≤ n, ⟨M^n⟩_{t∧Tm} = ⟨M^m⟩_t. Fix t ∈ [0, T]. There exists n ≥ 1 such that t ≤ Tn; then we set

   ⟨M⟩_t = ⟨M⟩_{t∧Tn} = ⟨M^n⟩_t.

We have

   P( sup_{t∈[0,T]} |⟨M⟩^n_t − ⟨M⟩_t| > ε ) ≤ P( sup_{t∈[0,T]} |⟨M⟩^n_{t∧Tk} − ⟨M⟩_{t∧Tk}| > ε ) + P( Tk < T ).

For any δ > 0, we can choose k ≥ 1 such that P(Tk < T) < δ/2. Thus, we obtain

   lim_{n→∞} P( sup_{t∈[0,T]} |⟨M⟩^n_t − ⟨M⟩_t| > ε ) = 0.

The assertion on L^{p/2}(Ω) convergence can be proved using martingale inequalities.
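For Brownian motion one has ⟨B⟩_t = t, so the convergence of Theorem 8.2 can be observed directly: sums of squared increments over refining dyadic partitions approach t. A minimal sketch (assuming NumPy; the grid sizes are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_fine = 1.0, 2**16
# one Brownian path sampled on a fine grid of 2^16 intervals
dB = rng.normal(0.0, np.sqrt(T / n_fine), size=n_fine)
B = np.concatenate([[0.0], np.cumsum(dB)])

def qv_dyadic(B, level):
    """<B>^n_T: the sum of squared increments over the dyadic partition
    of [0, T] into 2^level intervals, read off the finely sampled path."""
    stride = (len(B) - 1) // 2**level
    return float(np.sum(np.diff(B[::stride]) ** 2))

# for Brownian motion <B>_t = t, so these sums should approach T = 1
qv = [qv_dyadic(B, level) for level in (4, 8, 12)]
```

The variance of the dyadic sum at level n is 2T²/2^n, so the fluctuation around T visibly shrinks as the partition is refined.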


Given two continuous local martingales M and N we define the cross variation by

   ⟨M, N⟩_t = (1/2) [ ⟨M + N⟩_t − ⟨M⟩_t − ⟨N⟩_t ],   t ∈ [0, T].   (8.9)

From the properties of the quadratic variation (see Theorem 8.2) we see that the process {⟨M, N⟩_t, t ∈ [0, T]} is a process of bounded variation and it is the unique process (up to indistinguishability) such that {M_t N_t − ⟨M, N⟩_t, t ∈ [0, T]} is a continuous local martingale. It is also clear that

   lim_{n→∞} Σ_{k=1}^{p_n} ( M_{t^n_k} − M_{t^n_{k−1}} )( N_{t^n_k} − N_{t^n_{k−1}} ) = ⟨M, N⟩_t,   (8.10)

uniformly in t ∈ [0, T], in probability.


This result together with Schwarz's inequality implies

   |⟨M, N⟩_t| ≤ ( ⟨M⟩_t ⟨N⟩_t )^{1/2},

and more generally, by setting ⟨M, N⟩^t_s = ⟨M, N⟩_t − ⟨M, N⟩_s, for 0 ≤ s ≤ t ≤ T,

   |⟨M, N⟩^t_s| ≤ ( ⟨M⟩^t_s ⟨N⟩^t_s )^{1/2}.

This inequality (a sort of Cauchy-Schwarz’s inequality) is a particular case
of the result stated in the next proposition.

Proposition 8.5 Let M, N be two continuous local martingales and H, K be two measurable processes. Then

   ∫_0^∞ |H_s||K_s| |d⟨M, N⟩_s| ≤ ( ∫_0^∞ H_s² d⟨M⟩_s )^{1/2} ( ∫_0^∞ K_s² d⟨N⟩_s )^{1/2}.   (8.11)

9 Stochastic integrals with respect to continuous martingales
This chapter aims to give an outline of the main ideas of the extension of the Itô stochastic integral to integrators which are continuous local martingales. We start by describing precisely the spaces involved in the construction of such a notion. Throughout the chapter, we consider a fixed probability space (Ω, F, P) endowed with a filtration (Ft, t ≥ 0).
We denote by H2 the space of continuous martingales M , bounded in L2 (Ω),
with M0 = 0 a.s. This is a Hilbert space endowed with the inner product

(M, N )H2 = E[hM, N i∞ ].

A stochastic process (Xt , t ≥ 0) is said to be progressively measurable if for


any t ≥ 0, the mapping (s, ω) 7→ Xs (ω) defined on [0, t] × Ω is measurable
with respect to the σ-field B([0, t]) ⊗ Ft .
For any M ∈ H² we define L²(M) as the set of progressively measurable processes H such that

   E( ∫_0^∞ H_s² d⟨M⟩_s ) < ∞.

Notice that this is an L² space of measurable mappings defined on R_+ × Ω with respect to the measure dP d⟨M⟩. Hence it is also a Hilbert space, the natural inner product being

   (H, K)_{L²(M)} = E( ∫_0^∞ H_s K_s d⟨M⟩_s ).

Let E be the linear subspace of L²(M) consisting of processes of the form

   H_s(ω) = Σ_{i=0}^p H_i(ω) 1_{(t_i, t_{i+1}]}(s),   (9.1)

where 0 = t_0 < t_1 < . . . < t_{p+1}, and for each i, H_i is an F_{t_i}-measurable, bounded random variable.
Stochastic processes belonging to E are termed elementary. They are related to L²(M) as follows.
Proposition 9.1 Fix M ∈ H2 . The set E is dense in L2 (M ).
Proof: We will prove that if K ∈ L2 (M ) and is orthogonal to E then K = 0.
For this, we fix 0 ≤ s < t and consider the process

H = F 1]s,t] ,

with F an Fs-measurable and bounded random variable.
Saying that K is orthogonal to H can be written as
 Z t 
E F Ku dhM iu = 0.
s

Consider the stochastic process

   X_t = ∫_0^t K_u d⟨M⟩_u.

Notice that X_t ∈ L¹(Ω). In fact,

   E|X_t| ≤ E( ∫_0^t |K_u| d⟨M⟩_u ) ≤ ( E ∫_0^t |K_u|² d⟨M⟩_u )^{1/2} ( E⟨M⟩_t )^{1/2}.

We have thus proved that E (F (Xt − Xs )) = 0, for any 0 ≤ s < t and any
Fs -measurable and bounded random variable F . This shows that the process
(Xt , t ≥ 0) is a martingale. At the same time, (Xt , t ≥ 0) is also a process of
bounded variation. Hence
   ∫_0^t K_u d⟨M⟩_u = 0,   ∀t ≥ 0,

which implies that K = 0 in L2 (M ).




Stochastic integral of processes in E

Proposition 9.2 Let M ∈ H², H ∈ E as in (9.1). Define

   (H·M)_t = Σ_{i=0}^p H_i ( M_{t_{i+1}∧t} − M_{t_i∧t} ).

Then

(i) H·M ∈ H².

(ii) The mapping H 7→ H·M extends to an isometry from L²(M) to H².

The stochastic process {(H·M)_t, t ≥ 0} is called the stochastic integral of the process H with respect to M and is also denoted by ∫_0^t H_s dM_s.

Proof of (i): The martingale property follows from the measurability proper-
ties of H and the martingale property of M . Moreover, since H is bounded,
(H.M ) is bounded in L2 (Ω).

Proof of (ii): We prove first that the mapping

H ∈ E 7→ H.M

is an isometry from E to H2 .
Clearly H 7→ H·M is linear. Moreover, H·M is a finite sum of terms of the form

   M^i_t = H_i ( M_{t_{i+1}∧t} − M_{t_i∧t} ),

each one being a martingale, orthogonal to each other. It is easy to check that

   ⟨M^i⟩_t = H_i² ( ⟨M⟩_{t_{i+1}∧t} − ⟨M⟩_{t_i∧t} ).

Hence,

   ⟨H·M⟩_t = Σ_{i=0}^p H_i² ( ⟨M⟩_{t_{i+1}∧t} − ⟨M⟩_{t_i∧t} ).

Consequently,

   E⟨H·M⟩_∞ = ‖H·M‖²_{H²} = E[ Σ_{i=0}^p H_i² ( ⟨M⟩_{t_{i+1}} − ⟨M⟩_{t_i} ) ]
            = E( ∫_0^∞ H_s² d⟨M⟩_s ) = ‖H‖²_{L²(M)}.

Since E is dense in L2 (M ) this isometry extends to a unique isometry from


L2 (M ) into H2 . The extension is termed the stochastic integral of the process
H with respect to M .
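The isometry of Proposition 9.2 can be tested by Monte Carlo in the basic case where M is a Brownian motion, so that d⟨M⟩_s = ds. In the sketch below (assuming NumPy), the elementary integrand H_i = sign(W_{t_i}) is an illustrative choice; it is F_{t_i}-measurable and bounded, as (9.1) requires:

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_grid, T = 20000, 64, 1.0
dt = T / n_grid
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_grid))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

# elementary integrand (9.1): H_s = H_i on (t_i, t_{i+1}], with t_i = i/4
# and H_i = sign(W_{t_i}) (taking sign(0) := 1), so |H_s| = 1
t_idx = np.array([0, 16, 32, 48])              # grid indices of t_0, ..., t_3
H = np.sign(W[:, t_idx])
H[H == 0.0] = 1.0

# (H.M)_T = sum_i H_i (W_{t_{i+1}} - W_{t_i})
t_idx_next = np.array([16, 32, 48, 64])
HM_T = np.sum(H * (W[:, t_idx_next] - W[:, t_idx]), axis=1)

lhs = np.mean(HM_T**2)                         # E[(H.M)_T^2]
rhs = np.mean(np.sum(H**2 * 0.25, axis=1))     # E[ int_0^T H_s^2 ds ] (= 1 here)
```

Since |H| = 1, the right-hand side equals T = 1 exactly, and the empirical second moment of the stochastic integral matches it up to Monte Carlo error.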


10 Appendix 1: Conditional expectation
Roughly speaking, a conditional expectation of a random variable is the
mean value with respect to a modified probability after having incorporated
some a priori information. The simplest case corresponds to conditioning
with respect to an event B ∈ F. In this case, the conditional expectation is
the mathematical expectation computed on the modified probability space
(Ω, F, P (·/B)).
However, in general, additional information cannot be described so easily. Assuming that we know about some events B1, . . . , Bn, we also know about those that can be derived from them, like unions, intersections and complements. This explains the choice of a σ-field to store the known information and to work with it.
In the sequel, we denote by G an arbitrary σ-field included in F and by X
a random variable with finite expectation (X ∈ L1 (Ω)). Our final aim is to
give a definition of the conditional expectation of X given G. However, in
order to motivate this notion, we shall start with more simple situations.
Conditional expectation given an event
Let B ∈ F be such that P (B) 6= 0. The conditional expectation of X given
B is the real number defined by the formula
   E(X/B) = (1/P(B)) E( 1_B X ).   (10.1)

It immediately follows that

• E(X/Ω) = E(X),

• E(1_A /B) = P(A/B).

With the definition (10.1), the conditional expectation coincides with the expectation with respect to the conditional probability P(·/B). We check this fact with a discrete random variable X = Σ_{i=1}^∞ a_i 1_{A_i}. Indeed,

   E(X/B) = (1/P(B)) E( Σ_{i=1}^∞ a_i 1_{A_i∩B} ) = Σ_{i=1}^∞ a_i P(A_i ∩ B)/P(B)
          = Σ_{i=1}^∞ a_i P(A_i/B).
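Formula (10.1) is straightforward to implement on a finite probability space. A small sketch using exact rational arithmetic (the fair-die example is an illustrative choice, not from the text):

```python
from fractions import Fraction

# finite probability space: a fair die, Omega = {1, ..., 6} with uniform P
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}

def cond_expectation(X, B):
    """E(X/B) = E(1_B X) / P(B), as in (10.1); X a function on Omega, B an event."""
    pB = sum(P[w] for w in B)
    return sum(X(w) * P[w] for w in B) / pB

B_even = {w for w in omega if w % 2 == 0}   # the event "the result is even"
e = cond_expectation(lambda w: w, B_even)   # = (2 + 4 + 6)/3 = 4
```

Taking B = Ω recovers the ordinary expectation E(X) = 7/2, in agreement with the first bullet above.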

Conditional expectation given a discrete random variable
Let Y = Σ_{i=1}^∞ y_i 1_{A_i}, A_i = {Y = y_i}. The conditional expectation of X given Y is the random variable defined by

   E(X/Y) = Σ_{i=1}^∞ E(X/Y = y_i) 1_{A_i}.   (10.2)

Notice that, knowing Y means knowing all the events that can be described
in terms of Y . Since Y is discrete, they can be described in terms of the
basic events {Y = yi }. This may explain the formula (10.2).
The following properties hold:

(a) E (E(X/Y )) = E(X);

(b) if the random variables X and Y are independent, then E(X/Y ) =


E(X).

For the proof of (a) we notice that, since E(X/Y) is a discrete random variable,

   E( E(X/Y) ) = Σ_{i=1}^∞ E(X/Y = y_i) P(Y = y_i) = E( X Σ_{i=1}^∞ 1_{{Y = y_i}} ) = E(X).

Let us now prove (b). The independence of X and Y yields

   E(X/Y) = Σ_{i=1}^∞ [ E( X 1_{{Y = y_i}} ) / P(Y = y_i) ] 1_{A_i}
          = Σ_{i=1}^∞ E(X) 1_{A_i} = E(X).

The next proposition states two properties of the conditional expectation that motivate Definition 10.1.

Proposition 10.1 1. The random variable Z := E(X/Y ) is σ(Y )-


measurable; that is, for any Borel set B ∈ B, Z −1 (B) ∈ σ(Y ),

2. for any A ∈ σ(Y ), E( 1_A E(X/Y) ) = E( 1_A X ).

Proof: Set ci = E(X/{Y = yi }) and let B ∈ B. Then

Z −1 (B) = ∪i:ci ∈B {Y = yi } ∈ σ(Y ),

proving the first property.
To prove the second one, it suffices to take A = {Y = y_k}. In this case

   E( 1_{{Y = y_k}} E(X/Y) ) = E( 1_{{Y = y_k}} E(X/Y = y_k) )
      = E( 1_{{Y = y_k}} E( X 1_{{Y = y_k}} ) / P(Y = y_k) ) = E( X 1_{{Y = y_k}} ).


Conditional expectation given a σ-field

Definition 10.1 The conditional expectation of X given G is a random vari-


able Z satisfying the properties
1. Z is G-measurable; that is, for any Borel set B ∈ B, Z −1 (B) ∈ G,
2. for any G ∈ G,
E( Z 1_G ) = E( X 1_G ).

We will denote the conditional expectation Z by E(X/G).


Notice that the conditional expectation is not a number but a random variable. There is nothing strange in this, since conditioning depends on the observations.
Condition (1) tells us that events that can be described by means of E(X/G) are in G, whereas condition (2) tells us that on events in G the random variables X and E(X/G) have the same mean value.

The existence of E(X/G) is not a trivial issue. You should trust mathematicians and believe that there is a theorem in measure theory (the Radon–Nikodym Theorem) which ensures its existence.
Before stating properties of the conditional expectation, we are going to
explain how to compute it in two particular situations.

Example 10.1 Let G be the σ-field (actually, the field) generated by a finite partition G1, . . . , Gm. Then

   E(X/G) = Σ_{j=1}^m [ E( X 1_{G_j} ) / P(G_j) ] 1_{G_j}.   (10.3)

Formula (10.3) can be checked using Definition 10.1. It tells us that, on each generator of G, the conditional expectation is constant: it is the average of X over the generator, that is, E(X 1_{G_j}) weighted by the mass P(G_j) of the generator.
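Formula (10.3) can likewise be turned into a short computation. The sketch below (the sample space, partition and random variable are illustrative choices, not from the text) builds E(X/G) on a finite partition and verifies the defining property (2) of Definition 10.1 on each generator, using exact rational arithmetic:

```python
from fractions import Fraction

# Omega = {0, ..., 11} with uniform P; G generated by the partition "residue mod 3"
omega = list(range(12))
P = {w: Fraction(1, 12) for w in omega}
partition = [[w for w in omega if w % 3 == r] for r in range(3)]
X = {w: Fraction(w * w) for w in omega}     # an integrable random variable

def cond_exp_partition(X, partition):
    """E(X/G) via (10.3): constant E(X 1_Gj)/P(Gj) on each cell Gj."""
    Z = {}
    for G in partition:
        c = sum(X[w] * P[w] for w in G) / sum(P[w] for w in G)
        for w in G:
            Z[w] = c
    return Z

Z = cond_exp_partition(X, partition)
# defining property (2) of Definition 10.1, checked on every generator G
checks = [sum(Z[w] * P[w] for w in G) == sum(X[w] * P[w] for w in G)
          for G in partition]
```

Summing over all of Ω also recovers property (c): E(E(X/G)) = E(X).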

Example 10.2 Let G be the σ-field generated by random variables Y1, . . . , Ym, that is, the σ-field generated by events of the form Y1^{−1}(B1), . . . , Ym^{−1}(Bm), with B1, . . . , Bm arbitrary Borel sets. Assume in addition that the joint distribution of the random vector (X, Y1, . . . , Ym) has a density f. Then

   E(X/Y1, . . . , Ym) = ∫_{−∞}^∞ x f(x/Y1, . . . , Ym) dx,   (10.4)

with

   f(x/y1, . . . , ym) = f(x, y1, . . . , ym) / ∫_{−∞}^∞ f(x, y1, . . . , ym) dx.   (10.5)

In (10.5), we recognize the conditional density of X given Y1 = y1 , . . . , Ym =


ym . Hence, in (10.4) we first compute the conditional expectation E(X/Y1 =
y1 , . . . , Ym = ym ) and finally, replace the real values y1 , . . . , ym by the random
variables Y1 , . . . , Ym .
We now list some important properties of the conditional expectation.

(a) Linearity: for any random variables X, Y and real numbers a, b

E(aX + bY /G) = aE(X/G) + bE(Y /G).

(b) Monotony: If X ≤ Y then E(X/G) ≤ E(Y /G).

(c) The mean value of a random variable is the same as that of its conditional
expectation: E(E(X/G)) = E(X).

(d) If X is a G-measurable random variable, then E(X/G) = X.

(e) Let X be independent of G, meaning that any set of the form X −1 (B),
B ∈ B is independent of G. Then E(X/G) = E(X).

(f) Factorization: If Y is a bounded, G-measurable random variable,

E(Y X/G) = Y E(X/G).

(g) If Gi , i = 1, 2 are σ-fields with G1 ⊂ G2 ,

E(E(X/G1 )/G2 ) = E(E(X/G2 )/G1 ) = E(X/G1 ).

(h) Assume that X is a random variable independent of G and Z another G-measurable random variable. For any measurable function h(x, z) such that the random variable h(X, Z) is in L1(Ω),

   E( h(X, Z)/G ) = E( h(X, z) )|_{z=Z}.

For those readers who are interested in the proofs of these properties, we give
some indications.
Property (a) follows from the definition of the conditional expectation and
the linearity of the operator E.
Property (b) is a consequence of the monotony property of the operator E and a result in measure theory stating that if G-measurable random variables Z1 and Z2 satisfy

   E( Z1 1_G ) ≤ E( Z2 1_G )

for any G ∈ G, then Z1 ≤ Z2 a.s.
Here, we apply this result to Z1 = E(X/G), Z2 = E(Y /G).
Taking G = Ω in condition (2) above, we prove (c). Property (d) is obvious.
Constant random variables are measurable with respect to any σ-field. Therefore E(X) is G-measurable. Assuming that X is independent of G yields

E(X11G ) = E(X)E(11G ) = E(E(X)11G ).

This proves (e).


Property (h) is very intuitive: since X is independent of G, it does not enter the game of conditioning. Moreover, the measurability of Z means that, for the purpose of conditioning, one can treat it as a constant.

11 Appendix 2: Stopping times


Throughout this section we consider a fixed filtration (see Section 2.4)
(Ft , t ≥ 0).

Definition 11.1 A mapping T : Ω → [0, ∞] is termed a stopping time with respect to the filtration (Ft, t ≥ 0) if for any t ≥ 0

{T ≤ t} ∈ Ft .

It is easy to see that if S and T are stopping times with respect to the same
filtration then T ∧ S, T ∨ S and T + S are also stopping times.
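A standard example of a stopping time is the first time a process reaches a level: whether {T ≤ n} has occurred is decided by the path up to time n alone. A minimal sketch for a ±1 random walk (assuming NumPy; the level and horizon are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
path = np.cumsum(rng.choice([-1, 1], size=200))   # X_1, ..., X_200, a +/-1 walk

def hitting_time(path, level):
    """T = inf{n : X_n >= level}; np.inf if the level is never reached.
    T is a stopping time: {T <= n} is decided by X_1, ..., X_n alone."""
    hits = np.nonzero(path >= level)[0]
    return int(hits[0]) + 1 if hits.size else np.inf

T = hitting_time(path, 5)

# {T <= n} depends only on the path up to time n: recomputing T from the
# truncated path agrees whenever T <= n, and reports "not yet" otherwise
n = 100
T_trunc = hitting_time(path[:n], 5)
consistent = (T_trunc == T) if T <= n else (T_trunc == np.inf)
```

By contrast, a "last visit" time (e.g. the last n ≤ 200 with X_n = 0) cannot be computed from a truncated path, which is exactly why it fails to be a stopping time.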

Definition 11.2 For a given stopping time T , the σ-field of events prior to
T is the following

FT = {A ∈ F : A ∩ {T ≤ t} ∈ Ft , for all t ≥ 0}.

Let us prove that FT is actually a σ-field. By the definition of stopping time, Ω ∈ FT. Assume that A ∈ FT. Then

   A^c ∩ {T ≤ t} = {T ≤ t} ∩ (A ∩ {T ≤ t})^c ∈ Ft.

Hence, with any A, FT also contains A^c.


Let now (An , n ≥ 1) ⊂ FT . We clearly have

(∪∞ ∞
n=1 An ) ∩ {T ≤ t} = ∪n=1 (An ∩ {T ≤ t}) ∈ Ft .

This completes the proof.

Some properties related to stopping times

1. Any stopping time T is FT -measurable. Indeed, let s ≥ 0, then

{T ≤ s} ∩ {T ≤ t} = {T ≤ s ∧ t} ∈ Fs∧t ⊂ Ft .

2. If {Xt, t ≥ 0} is a process with a.s. continuous sample paths and (Ft)-adapted, then XT is FT-measurable. Indeed, the continuity implies

   X_T = lim_{n→∞} Σ_{i=0}^∞ X_{i2^{−n}} 1_{{i2^{−n} < T ≤ (i+1)2^{−n}}}.

Let us now check that for any s ≥ 0, the random variable Xs 1_{{s<T}} is FT-measurable. This fact, along with the property {T ≤ (i + 1)2^{−n}} ∈ FT, shows the result.
Let A ∈ B(R) and t ≥ 0. The set

{Xs ∈ A} ∩ {s < T } ∩ {T ≤ t}

is empty if s ≥ t. Otherwise it is equal to {Xs ∈ A} ∩ {s < T ≤ t},


which belongs to Ft .
