
David Nualart nualart@mat.ub.es

1 Stochastic Processes

1.1 Probability Spaces and Random Variables

In this section we recall the basic vocabulary and results of probability theory. A probability space associated with a random experiment is a triple (Ω, F, P) where:

(i) Ω is the set of all possible outcomes of the random experiment, and it is called the sample space.

(ii) F is a family of subsets of Ω which has the structure of a σ-field:
a) ∅ ∈ F
b) If A ∈ F, then its complement A^c also belongs to F
c) A1, A2, ... ∈ F =⇒ ∪_{i=1}^∞ Ai ∈ F

(iii) P is a function which associates a number P(A) to each set A ∈ F with the following properties:
a) 0 ≤ P(A) ≤ 1
b) P(Ω) = 1
c) For any sequence A1, A2, ... of disjoint sets in F (that is, Ai ∩ Aj = ∅ if i ≠ j),

P(∪_{i=1}^∞ Ai) = Σ_{i=1}^∞ P(Ai).

The elements of the σ-field F are called events and the mapping P is called a probability measure. In this way we have the following interpretation of this model:

P(F) = "probability that the event F occurs"

The set ∅ is called the empty event and it has probability zero. Indeed, the additivity property (iii,c) implies P(∅) + P(∅) + ··· = P(∅). The set Ω is also called the certain set and by property (iii,b) it has probability one. Usually, there will be other events A ⊂ Ω such that P(A) = 1. If a statement holds for all ω in a set A with P(A) = 1, then it is customary

to say that the statement is true almost surely, or that the statement holds for almost all ω ∈ Ω.

The axioms a), b) and c) lead to the following basic rules of the probability calculus:

P(A ∪ B) = P(A) + P(B) if A ∩ B = ∅,
P(A^c) = 1 − P(A),
A ⊂ B =⇒ P(A) ≤ P(B).

Example 1 Consider the experiment of flipping a coin once.

Ω = {H, T} (the possible outcomes are "Heads" and "Tails")
F = P(Ω) (F contains all subsets of Ω)
P({H}) = P({T}) = 1/2

Example 2 Consider an experiment that consists of counting the number of traffic accidents at a given intersection during a specified time interval.

Ω = {0, 1, 2, 3, ...}
F = P(Ω) (F contains all subsets of Ω)
P({k}) = e^{−λ} λ^k / k! (Poisson probability with parameter λ > 0)

Given an arbitrary family U of subsets of Ω, the smallest σ-field containing U is, by definition,

σ(U) = ∩ {G, G is a σ-field, U ⊂ G}.

The σ-field σ(U) is called the σ-field generated by U. For instance, the σ-field generated by the open subsets (or rectangles) of R^n is called the Borel σ-field of R^n and it will be denoted by B_{R^n}.

Example 3 Consider a finite partition P = {A1, ..., An} of Ω. The σ-field generated by P is formed by the unions Ai1 ∪ ··· ∪ Aik where {i1, ..., ik} is an arbitrary subset of {1, ..., n}. Thus, the σ-field σ(P) has 2^n elements.
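Example 3 lends itself to a direct computational check: enumerating all unions of blocks of a partition reproduces the 2^n count. The following is a minimal sketch; the particular partition of Ω = {1, ..., 6} into three blocks is a hypothetical choice of ours.

```python
from itertools import combinations

def sigma_field_of_partition(partition):
    # The sigma-field generated by a finite partition consists of all
    # unions of blocks, so it has exactly 2^n elements (Example 3).
    n = len(partition)
    events = set()
    for r in range(n + 1):
        for idx in combinations(range(n), r):
            events.add(frozenset().union(*[partition[i] for i in idx]))
    return events

# A hypothetical partition of Omega = {1,...,6} into n = 3 blocks.
blocks = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]
F = sigma_field_of_partition(blocks)  # 2^3 = 8 events, including the empty set and Omega
```

Because the blocks are disjoint and nonempty, distinct index sets give distinct unions, which is exactly why the count is 2^n.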

Example 4 We pick a real number at random in the interval [0, 2]. Here Ω = [0, 2], F is the Borel σ-field of [0, 2], and the probability of an interval [a, b] ⊂ [0, 2] is

P([a, b]) = (b − a)/2.

Example 5 Let an experiment consist of measuring the lifetime of an electric bulb. The sample space Ω is the set [0, ∞) of nonnegative real numbers, and F is the Borel σ-field of [0, ∞). The probability that the lifetime is larger than a fixed value t ≥ 0 is

P([t, ∞)) = e^{−λt}.

A random variable is a mapping

X: Ω −→ R, ω → X(ω),

which is F-measurable, that is, X^{−1}(B) ∈ F for any Borel set B in R. The random variable X assigns a value X(ω) to each outcome ω in Ω. The measurability condition means that given two real numbers a ≤ b, the set of all outcomes ω for which a ≤ X(ω) ≤ b is an event. We will denote this event by {a ≤ X ≤ b} for short, instead of {ω ∈ Ω : a ≤ X(ω) ≤ b}.

• A random variable defines a σ-field {X^{−1}(B), B ∈ B_R} ⊂ F called the σ-field generated by X.

• A random variable defines a probability measure on the Borel σ-field B_R by PX = P ◦ X^{−1}, that is,

PX(B) = P(X^{−1}(B)) = P({ω : X(ω) ∈ B}).

The probability measure PX is called the law or the distribution of X.

We will say that a random variable X has a probability density fX if fX(x) is a nonnegative function on R, measurable with respect to the Borel σ-field and such that

P{a < X < b} = ∫_a^b fX(x) dx, for all a < b.

Notice that ∫_{−∞}^{+∞} fX(x) dx = 1. Random variables admitting a probability density are called absolutely continuous.

We say that a random variable X is discrete if it takes a finite or countable number of different values xk. Discrete random variables do not have densities, and their law is characterized by the probability function

pk = P(X = xk).

Example 6 In the experiment of flipping a coin once, the random variable given by X(H) = 1, X(T) = −1 represents the earning of a player who receives or loses an euro according as the outcome is heads or tails. This random variable is discrete with

P(X = 1) = P(X = −1) = 1/2.

Example 7 If A is an event in a probability space, the random variable

1_A(ω) = 1 if ω ∈ A, and 1_A(ω) = 0 if ω ∉ A,

is called the indicator function of A. Its probability law is called the Bernoulli distribution with parameter p = P(A).

The function FX: R → [0, 1] defined by

FX(x) = P(X ≤ x) = PX((−∞, x])

is called the distribution function of the random variable X.

Example 8 We say that a random variable X has the normal law N(m, σ²) if

P(a < X < b) = (1/√(2πσ²)) ∫_a^b e^{−(x−m)²/(2σ²)} dx

for all a < b.

Example 9 We say that a random variable X has the binomial law B(n, p) if

P(X = k) = (n choose k) p^k (1 − p)^{n−k}, for k = 0, 1, ..., n.

• The distribution function FX is non-decreasing, right continuous, and satisfies

lim_{x→−∞} FX(x) = 0, lim_{x→+∞} FX(x) = 1.

• If the random variable X is absolutely continuous with density fX, then

FX(x) = ∫_{−∞}^x fX(y) dy,

and if, in addition, the density is continuous, then FX′(x) = fX(x).

The mathematical expectation (or expected value) of a random variable X is defined as the integral of X with respect to the probability measure P:

E(X) = ∫_Ω X dP.

In particular, if X is a discrete variable that takes the values α1, α2, ... on the sets A1, A2, ..., then its expectation will be

E(X) = α1 P(A1) + α2 P(A2) + ···.

Notice that E(1_A) = P(A), so the notion of expectation is an extension of the notion of probability.

If X is a non-negative random variable it is possible to find discrete random variables Xn, n = 1, 2, ..., such that X1(ω) ≤ X2(ω) ≤ ··· and lim_{n→∞} Xn(ω) = X(ω) for all ω. Then E(X) = lim_{n→∞} E(Xn) ≤ +∞, and this limit exists because the sequence E(Xn) is non-decreasing. If X is an arbitrary random variable, its expectation is defined by

E(X) = E(X⁺) − E(X⁻),

where X⁺ = max(X, 0) and X⁻ = −min(X, 0), provided that both E(X⁺) and E(X⁻) are finite. Note that this is equivalent to saying that E(|X|) < ∞, and in this case we will say that X is integrable.

A simple computational formula for the expectation of a non-negative random variable is as follows:

E(X) = ∫_0^∞ P(X > t) dt.

In fact,

E(X) = ∫_Ω X dP = ∫_Ω (∫_0^{+∞} 1_{{X>t}} dt) dP = ∫_0^∞ P(X > t) dt.

The expectation of a random variable X can be computed by integrating the function x with respect to the probability law of X:

E(X) = ∫_Ω X(ω) dP(ω) = ∫_{−∞}^∞ x dPX(x).

More generally, if g: R → R is a Borel measurable function and E(|g(X)|) < ∞, then the expectation of g(X) can be computed by integrating the function g with respect to the probability law of X:

E(g(X)) = ∫_Ω g(X(ω)) dP(ω) = ∫_{−∞}^∞ g(x) dPX(x).

The integral ∫_{−∞}^∞ g(x) dPX(x) can be expressed in terms of the probability density or the probability function of X:

∫_{−∞}^∞ g(x) dPX(x) = ∫_{−∞}^∞ g(x) fX(x) dx, if fX(x) is the density of X,
∫_{−∞}^∞ g(x) dPX(x) = Σ_k g(xk) P(X = xk), if X is discrete.
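The formula E(X) = ∫_0^∞ P(X > t) dt can be verified numerically. The sketch below integrates the tail of an exponential law with the trapezoidal rule and recovers the mean 1/λ; the choice of law, the parameter λ = 0.5 and the truncation point are ours.

```python
import math

def tail_integral(survival, upper, steps=200_000):
    # Trapezoidal approximation of E(X) = \int_0^upper P(X > t) dt.
    h = upper / steps
    total = 0.5 * (survival(0.0) + survival(upper))
    for i in range(1, steps):
        total += survival(i * h)
    return total * h

lam = 0.5
# For an exponential law, P(X > t) = exp(-lam * t) and E(X) = 1/lam.
mean = tail_integral(lambda t: math.exp(-lam * t), upper=80.0)
```

The truncation at 80 is harmless here because the neglected tail is of order e^{−40}.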

A random variable X is said to have a finite moment of order p ≥ 1 provided E(|X|^p) < ∞. In this case, the pth moment of X is defined by mp = E(X^p).

Example 10 If X is a random variable with normal law N(0, σ²) and λ is a real number, then

E(exp(λX)) = (1/√(2πσ²)) ∫_{−∞}^∞ e^{λx} e^{−x²/(2σ²)} dx
= e^{σ²λ²/2} (1/√(2πσ²)) ∫_{−∞}^∞ e^{−(x−σ²λ)²/(2σ²)} dx
= e^{σ²λ²/2}.

Example 11 If X is a random variable with Poisson distribution of parameter λ > 0, then

E(X) = Σ_{n=0}^∞ n e^{−λ} λ^n / n! = λ e^{−λ} Σ_{n=1}^∞ λ^{n−1}/(n − 1)! = λ.

The variance of a random variable X is defined by

σ² = Var(X) = E((X − E(X))²) = E(X²) − [E(X)]²,

provided E(X²) < ∞. The variance of X measures the deviation of X from its expected value. For instance, if X is a random variable with normal law N(m, σ²) we have

P(m − 1.96σ ≤ X ≤ m + 1.96σ) = P(−1.96 ≤ (X − m)/σ ≤ 1.96) = Φ(1.96) − Φ(−1.96) = 0.95,

where Φ is the distribution function of the law N(0, 1). That is, the probability that the random variable X takes values in the interval [m − 1.96σ, m + 1.96σ] is equal to 0.95.

If X and Y are two random variables with E(X²) < ∞ and E(Y²) < ∞, then their covariance is defined by

cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y).
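The 0.95 coverage of [m − 1.96σ, m + 1.96σ] quoted above can be reproduced from Φ. A short check, using the standard identity Φ(x) = (1 + erf(x/√2))/2:

```python
import math

def Phi(x):
    # Distribution function of N(0, 1), written with the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

coverage = Phi(1.96) - Phi(-1.96)  # probability mass of [m - 1.96σ, m + 1.96σ]
```

By standardization the answer does not depend on m or σ, which is why no parameters appear.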

The set of random variables with finite pth moment is denoted by L^p(Ω, F, P).

The characteristic function of a random variable X is defined by

ϕX(t) = E(e^{itX}).

The moments of a random variable can be computed from the derivatives of the characteristic function at the origin:

mn = (1/i^n) ϕX^{(n)}(t)|_{t=0},

for n = 1, 2, 3, ...

We say that X = (X1, ..., Xn) is an n-dimensional random vector if its components are random variables. This is equivalent to saying that X is a random variable with values in R^n.

The mathematical expectation of an n-dimensional random vector X is, by definition, the vector

E(X) = (E(X1), ..., E(Xn)).

The covariance matrix of an n-dimensional random vector X is, by definition, the matrix ΓX = (cov(Xi, Xj))_{1≤i,j≤n}. This matrix is clearly symmetric. Moreover, it is non-negative definite, which means that

Σ_{i,j=1}^n ΓX(i, j) ai aj ≥ 0

for all real numbers a1, ..., an. Indeed,

Σ_{i,j=1}^n ΓX(i, j) ai aj = Σ_{i,j=1}^n ai aj cov(Xi, Xj) = Var(Σ_{i=1}^n ai Xi) ≥ 0.

As in the case of real-valued random variables we introduce the law or distribution of an n-dimensional random vector X as the probability measure defined on the Borel σ-field of R^n by

PX(B) = P(X^{−1}(B)) = P(X ∈ B).

We will say that a random vector X has a probability density fX if fX(x) is a nonnegative function on R^n, measurable with respect to the Borel σ-field and such that

P{ai < Xi < bi, i = 1, ..., n} = ∫_{a_n}^{b_n} ··· ∫_{a_1}^{b_1} fX(x1, ..., xn) dx1 ··· dxn,

for all ai < bi. Notice that

∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} fX(x1, ..., xn) dx1 ··· dxn = 1.

We say that an n-dimensional random vector X has a multidimensional normal law N(m, Γ), where m ∈ R^n and Γ is a symmetric positive definite matrix, if X has the density function

fX(x1, ..., xn) = ((2π)^n det Γ)^{−1/2} e^{−(1/2) Σ_{i,j=1}^n (xi − mi)(xj − mj)(Γ^{−1})_{ij}}.

In that case, m = E(X) and Γ = ΓX.

If the matrix Γ is diagonal,

Γ = diag(σ1², ..., σn²),

then the density of X is the product of n one-dimensional normal densities:

fX(x1, ..., xn) = Π_{i=1}^n (1/√(2πσi²)) e^{−(xi − mi)²/(2σi²)}.

There exist degenerate normal distributions which have a singular covariance matrix Γ. These distributions do not have densities, and the law of a random variable X with a (possibly degenerate) normal law N(m, Γ) is determined by its characteristic function:

E(e^{i t′X}) = exp(i t′m − (1/2) t′Γt),

where t ∈ R^n. In this formula t′ denotes a row vector (1 × n matrix) and t a column vector (n × 1 matrix).

If X is an n-dimensional normal vector with law N(m, Γ) and A is a matrix of order m × n, then AX is an m-dimensional normal vector with law N(Am, AΓA′).

We recall some basic inequalities of probability theory:

• Chebyshev's inequality: If λ > 0,

P(|X| > λ) ≤ (1/λ^p) E(|X|^p).

• Schwarz's inequality:

E(XY) ≤ √(E(X²) E(Y²)).

• Hölder's inequality:

E(XY) ≤ [E(|X|^p)]^{1/p} [E(|Y|^q)]^{1/q},

where p, q > 1 and 1/p + 1/q = 1.

• Jensen's inequality: If ϕ: R → R is a convex function such that the random variables X and ϕ(X) have finite expectation, then

ϕ(E(X)) ≤ E(ϕ(X)).

In particular, for ϕ(x) = |x|^p, with p ≥ 1, we obtain |E(X)|^p ≤ E(|X|^p).

We recall the different types of convergence for a sequence of random variables Xn, n = 1, 2, 3, ...:

Almost sure convergence: Xn → X almost surely, if

lim_{n→∞} Xn(ω) = X(ω)

for all ω ∉ N, where P(N) = 0.

Convergence in probability: Xn → X in probability, if

lim_{n→∞} P(|Xn − X| > ε) = 0

for all ε > 0.

Convergence in mean of order p ≥ 1: Xn → X in L^p, if

lim_{n→∞} E(|Xn − X|^p) = 0.

Convergence in law: Xn → X in law, if

lim_{n→∞} FXn(x) = FX(x)

for any point x where the distribution function FX is continuous.

• The almost sure convergence implies the convergence in probability. Conversely, the convergence in probability implies the existence of a subsequence which converges almost surely.

• The convergence in mean of order p implies the convergence in probability. Indeed, applying Chebyshev's inequality yields

P(|Xn − X| > ε) ≤ (1/ε^p) E(|Xn − X|^p).

• The convergence in probability implies the convergence in law, the reciprocal being also true when the limit is constant.

• The almost sure convergence implies the convergence in mean of order p ≥ 1, if the random variables Xn are bounded in absolute value by a fixed nonnegative random variable Y possessing a finite pth moment (dominated convergence theorem):

|Xn| ≤ Y, E(Y^p) < ∞.
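The basic inequalities recalled above can be illustrated on simulated data. The sketch below checks Chebyshev (with p = 2), Schwarz and Jensen (with the convex function ϕ(x) = x²) on samples from laws of our own choosing; the sample sizes and thresholds are likewise arbitrary.

```python
import math
import random

random.seed(0)
N = 100_000
xs = [random.expovariate(1.0) for _ in range(N)]   # Exp(1) samples
ys = [random.gauss(0.0, 1.0) for _ in range(N)]    # N(0, 1) samples

def mean(v):
    return sum(v) / len(v)

# Chebyshev with p = 2: P(|X| > lam) <= E(X^2) / lam^2
lam = 3.0
cheb_lhs = mean([1.0 if abs(x) > lam else 0.0 for x in xs])
cheb_rhs = mean([x * x for x in xs]) / lam ** 2

# Schwarz: E(XY) <= sqrt(E(X^2) E(Y^2))
schwarz_lhs = mean([x * y for x, y in zip(xs, ys)])
schwarz_rhs = math.sqrt(mean([x * x for x in xs]) * mean([y * y for y in ys]))

# Jensen with phi(x) = x^2: (E X)^2 <= E(X^2)
jensen_lhs = mean(xs) ** 2
jensen_rhs = mean([x * x for x in xs])
```

Empirical averages replace expectations here, so the checks are only illustrative, but the margins are large for these laws.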

Independence is a basic notion in probability theory. Two events A, B ∈ F are said to be independent provided P(A ∩ B) = P(A)P(B). Given an arbitrary collection of events {Ai, i ∈ I}, we say that the events of the collection are independent provided

P(Ai1 ∩ ··· ∩ Aik) = P(Ai1) ··· P(Aik)

for every finite subset of indexes {i1, ..., ik} ⊂ I.

A collection of classes of events {Gi, i ∈ I} is independent if any collection of events {Ai, i ∈ I} such that Ai ∈ Gi for all i ∈ I is independent.

A collection of random variables {Xi, i ∈ I} is independent if the collection of σ-fields {Xi^{−1}(B_R), i ∈ I} is independent. This means that

P(Xi1 ∈ Bi1, ..., Xik ∈ Bik) = P(Xi1 ∈ Bi1) ··· P(Xik ∈ Bik)

for every finite set of indexes {i1, ..., ik} ⊂ I, and for all Borel sets Bj.

Suppose that X, Y are two independent random variables with finite expectation. Then the product XY also has finite expectation and

E(XY) = E(X) E(Y).

More generally, if X1, ..., Xn are independent random variables,

E[g1(X1) ··· gn(Xn)] = E[g1(X1)] ··· E[gn(Xn)],

where gi are measurable functions such that E[|gi(Xi)|] < ∞.

The components of a random vector are independent if and only if the density or the probability function of the random vector is equal to the product of the marginal densities, or probability functions.

The conditional probability of an event A given another event B such that P(B) > 0 is defined by

P(A|B) = P(A ∩ B)/P(B).

The conditional probability P(A|B) represents the probability of the event A modified by the additional information that the event B has occurred. We see that A and B are independent if and only if P(A|B) = P(A). The mapping

A −→ P(A|B)

defines a new probability on the σ-field F concentrated on the set B. The mathematical expectation of an integrable random variable X with respect to this new probability will be the conditional expectation of X given B, and it can be computed as follows:

E(X|B) = (1/P(B)) E(X 1_B).

The following are the two main limit theorems in probability theory.

Theorem 1 (Law of Large Numbers) Let {Xn, n ≥ 1} be a sequence of independent, identically distributed random variables such that E(|X1|) < ∞. Then,

(X1 + ··· + Xn)/n → m almost surely,

where m = E(X1).

Theorem 2 (Central Limit Theorem) Let {Xn, n ≥ 1} be a sequence of independent, identically distributed random variables such that E(X1²) < ∞. Set m = E(X1) and σ² = Var(X1). Then,

(X1 + ··· + Xn − nm)/(σ√n) → N(0, 1) in law.

1.2 Stochastic Processes: Definitions and Examples

A stochastic process with state space S is a collection of random variables {Xt, t ∈ T} defined on the same probability space (Ω, F, P). The set T is called its parameter set. If T = N = {0, 1, 2, ...}, the process is said to be a discrete parameter process. If T is not countable, the process is said to have a continuous parameter. In the latter case the usual examples are T = R₊ = [0, ∞) and T = [a, b] ⊂ R. The index t represents time, and one thinks of Xt as the "state" or the "position" of the process at time t. The state space is R in most usual examples, and then the process is said to be real-valued. There will also be examples where S is N, the set of all integers, or a finite set.

For every fixed ω ∈ Ω, the mapping

t −→ Xt(ω)

defined on the parameter set T, is called a realization, trajectory, sample path or sample function of the process.

Let {Xt, t ∈ T} be a real-valued stochastic process and {t1 < ··· < tn} ⊂ T. The probability distribution

Pt1,...,tn = P ◦ (Xt1, ..., Xtn)^{−1}

of the random vector

(Xt1, ..., Xtn): Ω −→ R^n

is called a finite-dimensional marginal distribution of the process {Xt, t ∈ T}. The following theorem, due to Kolmogorov, establishes the existence of a stochastic process associated with a given family of finite-dimensional distributions satisfying a consistency condition:

Theorem 3 Consider a family of probability measures

{Pt1,...,tn, t1 < ··· < tn, n ≥ 1, ti ∈ T}

such that:

1. Pt1,...,tn is a probability on R^n.

2. (Consistency condition): If {tk1 < ··· < tkm} ⊂ {t1 < ··· < tn}, then Ptk1,...,tkm is the marginal of Pt1,...,tn corresponding to the indexes k1, ..., km.

Then, there exists a real-valued stochastic process {Xt, t ≥ 0} defined on some probability space (Ω, F, P) which has the family {Pt1,...,tn} as finite-dimensional marginal distributions.

A real-valued process {Xt, t ≥ 0} is called a second order process provided E(Xt²) < ∞ for all t ∈ T. The mean and the covariance function of a second order process {Xt, t ≥ 0} are defined by

mX(t) = E(Xt),
ΓX(s, t) = Cov(Xs, Xt) = E((Xs − mX(s))(Xt − mX(t))).

The variance of the process {Xt, t ≥ 0} is defined by

σX²(t) = ΓX(t, t) = Var(Xt).

Example 12 Let X and Y be independent random variables. Consider the stochastic process with parameter t ∈ [0, ∞)

Xt = tX + Y.

The sample paths of this process are lines with random coefficients. The finite-dimensional marginal distributions are given by

P(Xt1 ≤ x1, ..., Xtn ≤ xn) = ∫_R FX(min_{1≤i≤n} (xi − y)/ti) PY(dy).

Example 13 Consider the stochastic process

Xt = A cos(ϕ + λt),

where A and ϕ are independent random variables such that E(A) = 0, E(A²) < ∞ and ϕ is uniformly distributed on [0, 2π]. This is a second order process with

mX(t) = 0,
ΓX(s, t) = (1/2) E(A²) cos λ(t − s).

Example 14 Arrival process: Consider the process of arrivals of customers at a store, and suppose the experiment is set up to measure the interarrival times. Suppose that the interarrival times are positive random variables X1, X2, .... Then, for each t ≥ 0, we put Nt = k if and only if the integer k is such that

X1 + ··· + Xk ≤ t < X1 + ··· + Xk+1,

and we put Nt = 0 if t < X1. Then Nt is the number of arrivals in the time interval [0, t]. Notice that for each t ≥ 0, Nt is a random variable taking values in the set S = N. Thus, {Nt, t ≥ 0} is a continuous time process with values in the state space N. The sample paths of this process are non-decreasing, right continuous, and they increase by jumps of size 1 at the points X1 + ··· + Xk. On the other hand, Nt < ∞ for all t ≥ 0 if and only if

Σ_{k=1}^∞ Xk = ∞.
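The mean and covariance claimed in Example 13 can be checked by Monte Carlo. In the sketch below A is taken standard normal (any choice with E(A) = 0 and E(A²) < ∞ would do), and the values of λ, s and t are arbitrary choices of ours.

```python
import math
import random

random.seed(11)

lam = 2.0
s, t = 0.7, 1.9
reps = 200_000

acc_cov, acc_mean = 0.0, 0.0
for _ in range(reps):
    A = random.gauss(0.0, 1.0)                # E(A) = 0, E(A^2) = 1
    phi = random.uniform(0.0, 2.0 * math.pi)  # uniform phase
    xs = A * math.cos(phi + lam * s)
    xt = A * math.cos(phi + lam * t)
    acc_cov += xs * xt
    acc_mean += xt

cov_hat = acc_cov / reps
mean_hat = acc_mean / reps
target = 0.5 * math.cos(lam * (t - s))  # (1/2) E(A^2) cos λ(t - s)
```

The averaging over the uniform phase is what kills the cos(2ϕ + λ(s + t)) cross term and leaves ½E(A²) cos λ(t − s).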

Then. 1. A real-valued stochastic process {Xt . The dynamics of the process is as follows. t ∈ T } is equivalent to another stochastic process {Yt . and from 2 you jump to 3 with probability 1/3. Example 16 Let X and Y be random variables with joint Gaussian distribution. t) of a Gaussian process determine its ﬁnite-dimensional marginal distributions. σ 2 ). Then there exists a Gaussian process with mean m and covariance function Γ. t ∈ T } such that the random variables Xt are independent and with the same law N (0. Example 17 Gaussian white noise: Consider a stochastic process {Xt . 3}. ΓX (s. .Example 15 Consider a discrete time stochastic process {Xn .j=1 for all ti ∈ T . 17 . 2. and n ≥ 1. otherwise stay at 2. . Y ) + Var(Y ). You move from state 1 to state 2 with probability 1. which is nonnegative deﬁnite. that is n Γ(ti . 2. t) = 1 if s = t 0 if s = t Deﬁnition 4 A stochastic process {Xt . t ∈ T } if for each t ∈ T P {Xt = Yt } = 1. suppose that we are given an arbitrary function m : T → R. The mean mX (t) and the covariance function ΓX (s. this process is Gaussian with mean and covariance functions mX (t) = 0 ΓX (s. This is an example of a Markov chain. and a symmetric function Γ : T × T → R. Conversely. . tj )ai aj ≥ 0 i.} with a ﬁnite number of states S = {1. t) = t2 Var(X) + 2tCov(X. n = 0. is Gaussian with mean and covariance functions mX (t) = tE(X) + E(Y ). t ∈ T } is said to be Gaussian or normal if its ﬁnite-dimensional marginal distributions are multi-dimensional Gaussian laws. Then the process Xt = tX + Y . t ≥ 0. ai ∈ R. From state 3 you move either to 1 or to 2 with equal probability 1/2.

However. such that E (|Xt |p ) < ∞. Two equivalent processes may have quite diﬀerent sample paths. they are also indistinguishable. The following continuity criterion by Kolmogorov provides a suﬃcient condition of this type: 18 . The process {Xt . Let {Xt . Deﬁnition 5 Two stochastic processes {Xt . s→t Deﬁnition 7 Fix p ≥ 1. Example 18 Let ξ be a nonnegative random variable with continuous distribution function. t ∈ T }be a real-valued stochastic process. t ∈ T } are said to be indistinguishable if X· (ω) = Y· (ω) for all ω ∈ N . t ∈ T }. where T is an interval of R. Two discrete time stochastic processes which are equivalent. s→t Continuity in mean of order p implies continuity in probability. In order to show that a given stochastic process have continuous sample paths it is enough to have suitable estimations on the moments of the increments of the process. where T is an interval of R. then they are indistinguishable. t ≥ 0} is said to be continuous in mean of order p if lim E (|Xt − Xs |p ) = 0. t ∈ T }. for any ε > 0 and every t ∈ T lim P (|Xt − Xs | > ε) = 0.We also say that {Xt . The processes Xt = 0 Yt = 0 if ξ = t 1 if ξ = t are equivalent but their sample paths are diﬀerent. for all t ∈ T . / Two stochastic process which have right continuous sample paths and are equivalent. Deﬁnition 6 A real-valued stochastic process {Xt . t ∈ T } and {Yt . the continuity in probability (or in mean of order p) does not necessarily implies that the sample paths of the process are continuous. ∞). with P (N ) = 0. t ∈ T } is a version of {Yt . Set T = [0. is said to be continuous in probability if.

E(Z 2k ) = 19 . From the expression E(eλZ ) = e 2 λ 1 2 2 σ α (2) . t ∈ T . in comparison |t − s|. σ 2 ). Condition (1) also provides some information about the modulus of continuity of the sample paths of the process.2 The headway X between two vehicles at a ﬁxed instant is a random variable with P (X ≤ t) = 1 − 0. para todo s. 1. That means.3. t ∈ T } be a real-valued stochastic process and T is a ﬁnite interval.Proposition 8 (Kolmogorov continuity criterion) Let {Xt . More precisely.4. there exists a version of the process {Xt .3 Let Z be a random variable with law N (0. 2 with probabilities 0. for each ε > 0 there exists a random variable Gε such that.1 Consider a random variable X taking the values −2.4e−0. with probability one. t ≥ 0. Find the expected value and the variance of the headway.03t . e−X . |Xt (ω) − Xs (ω)| ≤ Gε (ω)|t − s| p −ε . t ∈ T . deduce the following formulas for the moments of Z: (2k)! 2k σ . 2k k! E(Z 2k−1 ) = 0.02t − 0. which is the order of magnitude of Xt (ω) − Xs (ω). 3X 2 + 5. for a ﬁxed ω ∈ Ω. 0. Moreover. 1. E(Gp ) < ∞.6e−0. Suppose that there exist constants a > 1 and p > 0 such that E (|Xt − Xs |p ) ≤ cT |t − s|α (1) for all s. Then. 0. t ∈ T } with continuous sample paths.3 respectively. Compute the expected values of X. 0. ε Exercises 1.

1.4 Let Z be a random variable with Poisson distribution with parameter λ. Show that the characteristic function of Z is

ϕZ(t) = exp(λ(e^{it} − 1)).

As an application compute E(Z²), Var(Z) and E(Z³).

1.5 Let {Yn, n ≥ 1} be independent and identically distributed non-negative random variables. Put Z0 = 0, and Zn = Y1 + ··· + Yn for n ≥ 1. We think of Zn as the time of the nth arrival into a store. The stochastic process {Zn, n ≥ 0} is called a renewal process. Let Nt be the number of arrivals during (0, t].

a) Show that P(Nt ≥ n) = P(Zn ≤ t), for all n ≥ 1 and t ≥ 0.

b) Show that for almost all ω, lim_{t→∞} Nt(ω) = ∞.

c) Show that almost surely lim_{t→∞} Z_{Nt}/Nt = a, where a = E(Y1).

d) Using the inequalities Z_{Nt} ≤ t < Z_{Nt+1} show that almost surely

lim_{t→∞} Nt/t = 1/a.

1.6 Let X and U be independent random variables, where U is uniform in [0, 2π], and the probability density of X is, for x > 0,

fX(x) = 2x³ e^{−x⁴/2}.

Show that the process Xt = X² cos(2πt + U) is Gaussian and compute its mean and covariance functions.
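Exercises 1.2 and 1.5 d) both invite a quick numerical sanity check. The sketch below computes E(X) and Var(X) of the headway from the tail formula E(X) = ∫_0^∞ P(X > t) dt (together with the analogous identity E(X²) = ∫_0^∞ 2t P(X > t) dt), and simulates a renewal process with Uniform(1, 3) interarrival times (so a = 2) to observe Nt/t → 1/a. All concrete choices beyond the exercise statements are ours, including the assumed tail 0.4e^{−0.02t} + 0.6e^{−0.03t}.

```python
import math
import random

# --- Exercise 1.2: moments of the headway from tail integrals ---
def survival(t):
    # P(X > t) = 0.4 e^{-0.02 t} + 0.6 e^{-0.03 t}
    return 0.4 * math.exp(-0.02 * t) + 0.6 * math.exp(-0.03 * t)

def integrate(f, upper, steps=200_000):
    # Trapezoidal rule on [0, upper].
    h = upper / steps
    total = 0.5 * (f(0.0) + f(upper))
    for i in range(1, steps):
        total += f(i * h)
    return total * h

headway_mean = integrate(survival, 2000.0)                        # E(X) = 0.4/0.02 + 0.6/0.03
headway_second = integrate(lambda t: 2 * t * survival(t), 4000.0)  # E(X^2)
headway_var = headway_second - headway_mean ** 2

# --- Exercise 1.5 d): N_t / t -> 1/a for a renewal process ---
random.seed(5)

def renewal_count(draw, horizon):
    # Number of renewals Z_n = Y1 + ... + Yn that occur by `horizon`.
    clock, n = 0.0, 0
    while True:
        clock += draw()
        if clock > horizon:
            return n
        n += 1

a = 2.0  # a = E(Y1) for Uniform(1, 3) interarrival times
ratio = renewal_count(lambda: random.uniform(1.0, 3.0), 10_000.0) / 10_000.0
```

The truncation points make the neglected exponential tails negligible, and a single long horizon is enough for the renewal ratio to settle near 1/a = 0.5.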

2 Jump Processes

2.1 The Poisson Process

A random variable T: Ω → (0, ∞) has exponential distribution of parameter λ > 0 if

P(T > t) = e^{−λt} for all t ≥ 0.

Then T has a density function

fT(t) = λ e^{−λt} 1_{(0,∞)}(t).

The mean of T is given by E(T) = 1/λ, and its variance is Var(T) = 1/λ².

The exponential distribution plays a fundamental role in continuous-time Markov processes because of the following result.

Proposition 9 (Memoryless property) A random variable T: Ω → (0, ∞) has an exponential distribution if and only if it has the following memoryless property:

P(T > s + t | T > s) = P(T > t) for all s, t ≥ 0.

Proof. Suppose first that T has exponential distribution of parameter λ > 0. Then

P(T > s + t | T > s) = P(T > s + t)/P(T > s) = e^{−λ(s+t)}/e^{−λs} = e^{−λt} = P(T > t).

The converse implication follows from the fact that the function g(t) = P(T > t) satisfies

g(s + t) = g(s) g(t) for all s, t ≥ 0, and g(0) = 1.

A stochastic process {Nt, t ≥ 0} defined on a probability space (Ω, F, P) is said to be a Poisson process of rate λ if it verifies the following properties:

i) N0 = 0;

ii) for any n ≥ 1 and for any 0 ≤ t1 < ··· < tn the increments Ntn − Ntn−1, ..., Nt2 − Nt1 are independent random variables;

iii) for any 0 ≤ s < t, the increment Nt − Ns has a Poisson distribution with parameter λ(t − s), that is,

P(Nt − Ns = k) = e^{−λ(t−s)} [λ(t − s)]^k / k!, k = 0, 1, 2, ...,

where λ > 0 is a fixed constant.

Conditions ii) and iii) mean that the Poisson process has independent and stationary increments. Notice that conditions i) to iii) characterize the finite-dimensional marginal distributions of the process {Nt, t ≥ 0}.

A concrete construction of a Poisson process can be done as follows. Consider a sequence {Xn, n ≥ 1} of independent random variables with exponential law of parameter λ. Set T0 = 0 and for n ≥ 1, Tn = X1 + ··· + Xn. Notice that lim_{n→∞} Tn = ∞ almost surely, because by the strong law of large numbers

lim_{n→∞} Tn/n = 1/λ.

Let {Nt, t ≥ 0} be the arrival process associated with the interarrival times Xn. That is,

Nt = Σ_{n=1}^∞ n 1_{{Tn ≤ t < Tn+1}}.    (3)

Proposition 10 The stochastic process {Nt, t ≥ 0} defined in (3) is a Poisson process with parameter λ > 0.

Proof. Clearly N0 = 0. We first show that Nt has a Poisson distribution with parameter λt. We have

P(Nt = n) = P(Tn ≤ t < Tn+1) = P(Tn ≤ t < Tn + Xn+1)
= ∫_{{x ≤ t < x + y}} fTn(x) λ e^{−λy} dx dy
= ∫_0^t fTn(x) e^{−λ(t−x)} dx.

The random variable Tn has a gamma distribution Γ(n, λ) with density

fTn(x) = (λ^n x^{n−1} e^{−λx} / (n − 1)!) 1_{(0,∞)}(x).

Hence,

P(Nt = n) = e^{−λt} (λt)^n / n!.

Now we will show that Nt+s − Ns is independent of the random variables {Nr, r ≤ s}, and that it has a Poisson distribution of parameter λt. Consider the event

{Ns = k} = {Tk ≤ s < Tk+1},

where 0 ≤ k. On this event the interarrival times of the process {Nt+s − Ns, t ≥ 0} are

X′1 = Xk+1 − (s − Tk) and X′n = Xk+n for n ≥ 2.

Then, conditional on {Ns = k} and on the values of the random variables X1, ..., Xk, due to the memoryless property of the exponential law and the independence of the Xn, the interarrival times X′n, n ≥ 1, are independent and with exponential distribution of parameter λ. Hence, {Nt+s − Ns, t ≥ 0} has the same distribution as {Nt, t ≥ 0}, and it is independent of {Nr, r ≤ s}.

Notice that E(Nt) = λt. Thus λ is the expected number of arrivals in an interval of unit length, or in other words, λ is the arrival rate. On the other hand, the expected time until a new arrival is 1/λ. Finally, Var(Nt) = λt.

We have seen that the sample paths of the Poisson process are discontinuous with jumps of size 1. However, the Poisson process is continuous in mean of order 2:

E[(Nt − Ns)²] = λ(t − s) + [λ(t − s)]² −→ 0 as s → t.

Notice that we cannot apply here the Kolmogorov continuity criterion.

The Poisson process with rate λ > 0 can also be characterized as an integer-valued process, starting from 0, with non-decreasing paths, with independent increments, and such that, as h ↓ 0, uniformly in t,

P(Xt+h − Xt = 0) = 1 − λh + o(h),
P(Xt+h − Xt = 1) = λh + o(h).
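Construction (3) is easy to simulate: add up exponential interarrival times and count how many fall in [0, t]. The sketch below compares the empirical mean and variance of Nt with λt; the rate, horizon and number of repetitions are our choices.

```python
import random

random.seed(1)

def count_arrivals(lam, horizon):
    # Construction (3): count partial sums Tn = X1 + ... + Xn <= horizon,
    # with Xi independent Exp(lam) interarrival times.
    t, count = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > horizon:
            return count
        count += 1

lam, horizon, reps = 2.0, 5.0, 20_000
counts = [count_arrivals(lam, horizon) for _ in range(reps)]
mean_hat = sum(counts) / reps
var_hat = sum((c - mean_hat) ** 2 for c in counts) / reps
# For a Poisson law of parameter lam * horizon = 10, mean and variance agree.
```

The agreement of the empirical mean and variance is the simplest visible signature of the Poisson law.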

Example 1 An item has a random lifetime whose distribution is exponential with parameter λ = 0.0002 (time is measured in hours). The expected lifetime of an item is 1/λ = 5000 hours, and the variance is 1/λ² = 25 × 10⁶ hours². When it fails, it is immediately replaced by an identical item; etc. If Nt is the number of failures in [0, t], we may conclude that the process of failures {Nt, t ≥ 0} is a Poisson process with rate λ > 0.

Suppose that the cost of a replacement is β euros, and suppose that the discount rate of money is α > 0. So, the present value of all future replacement costs is

C = Σ_{n=1}^∞ β e^{−αTn}.

Its expected value will be

E(C) = Σ_{n=1}^∞ β E(e^{−αTn}) = βλ/α.

For β = 800, α = 0.24/(365 × 24) we obtain E(C) = 5840 euros.

The following result is an interesting relation between Poisson processes and uniform distributions. It says that the jumps of a Poisson process are as randomly distributed as possible.

Proposition 11 Let {Nt, t ≥ 0} be a Poisson process. Then, conditional on {Nt, t ≥ 0} having exactly n jumps in the interval [s, s + t], the times at which jumps occur are uniformly and independently distributed on [s, s + t].

Proof. We will show this result only for n = 1. By stationarity of increments, it suffices to consider the case s = 0. Then, for 0 ≤ u ≤ t,

P(T1 ≤ u | Nt = 1) = P({T1 ≤ u} ∩ {Nt = 1}) / P(Nt = 1)
= P({Nu = 1} ∩ {Nt − Nu = 0}) / P(Nt = 1)
= (λu e^{−λu} e^{−λ(t−u)}) / (λt e^{−λt})
= u/t.
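The value E(C) = 5840 in Example 1 can be re-derived numerically. Since Tn has a Γ(n, λ) law, E(e^{−αTn}) = (λ/(λ + α))^n, so E(C) is a geometric series; the sketch below sums it directly and compares with the closed form βλ/α.

```python
lam = 0.0002               # failure rate per hour
alpha = 0.24 / (365 * 24)  # annual discount rate 0.24, converted to hours
beta = 800.0               # replacement cost in euros

# E(e^{-alpha Tn}) = (lam / (lam + alpha))^n for a Gamma(n, lam) arrival time,
# so E(C) = beta * sum_n r^n with r = lam / (lam + alpha).
r = lam / (lam + alpha)
series, term = 0.0, 1.0
for _ in range(200_000):
    term *= r
    series += term
expected_cost = beta * series       # truncated geometric series
closed_form = beta * lam / alpha    # = 5840 euros
```

The truncation is harmless: r < 0.88 here, so the neglected tail underflows long before 200 000 terms.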

2.2 Superposition of Poisson Processes

Let L = {Lt, t ≥ 0} and M = {Mt, t ≥ 0} be two independent Poisson processes with respective rates λ and µ. The process Nt = Lt + Mt is called the superposition of the processes L and M.

Proposition 12 N = {Nt, t ≥ 0} is a Poisson process of rate λ + µ.

Proof. Clearly, the process N has independent increments and N0 = 0. Then, it suffices to show that for each 0 ≤ s < t, the random variable Nt − Ns has a Poisson distribution of parameter (λ + µ)(t − s):

P(Nt − Ns = n) = Σ_{k=0}^n P(Lt − Ls = k, Mt − Ms = n − k)
= Σ_{k=0}^n (e^{−λ(t−s)} [λ(t − s)]^k / k!) · (e^{−µ(t−s)} [µ(t − s)]^{n−k} / (n − k)!)
= e^{−(λ+µ)(t−s)} [(λ + µ)(t − s)]^n / n!.

Poisson processes are unique in this regard.
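Proposition 12 can also be confirmed at the level of distributions: convolving the Poisson laws of Lt − Ls and Mt − Ms reproduces the Poisson law of parameter (λ + µ)(t − s). A sketch with arbitrary rates of our choosing:

```python
import math

def poisson_pmf(k, mu):
    return math.exp(-mu) * mu ** k / math.factorial(k)

lam, mu, t = 1.5, 2.5, 1.0  # hypothetical rates and time increment
diffs = []
for n in range(20):
    # Convolution of the two increment laws, exactly as in the proof.
    conv = sum(poisson_pmf(k, lam * t) * poisson_pmf(n - k, mu * t)
               for k in range(n + 1))
    diffs.append(abs(conv - poisson_pmf(n, (lam + mu) * t)))
max_diff = max(diffs)
```

This is the same binomial expansion that closes the proof, evaluated term by term.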

2.3 Decomposition of Poisson Processes

Let N = {Nt, t ≥ 0} be a Poisson process with rate λ. Let {Yn, n ≥ 1} be a sequence of independent Bernoulli random variables with parameter p ∈ (0, 1), independent of N. That is,

P(Yn = 1) = p, P(Yn = 0) = 1 − p.

Set Sn = Y1 + ··· + Yn. We think of Sn as the number of successes in the first n trials, and we suppose that the nth trial is performed at the time Tn of the nth arrival. In this way, the number of successes obtained during the time interval [0, t] is Mt = S_{Nt}, and the number of failures is Lt = Nt − S_{Nt}.
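The factorization at the heart of Proposition 13 below — a Poisson count split by independent Bernoulli trials produces two independent Poisson laws of rates λp and λ(1 − p) — can be checked term by term. The rate, success probability and time increment here are our own choices.

```python
import math

def poisson_pmf(j, mu):
    return math.exp(-mu) * mu ** j / math.factorial(j)

lam, p, dt = 2.0, 0.3, 1.5  # hypothetical rate, success probability, t - s
max_err = 0.0
for m in range(12):
    for k in range(12):
        # P(N increment = m + k) * P(m successes among m + k trials)
        joint = (poisson_pmf(m + k, lam * dt)
                 * math.comb(m + k, m) * p ** m * (1 - p) ** k)
        # Product of the two claimed Poisson laws of rates lam*p and lam*(1-p)
        split = poisson_pmf(m, lam * p * dt) * poisson_pmf(k, lam * (1 - p) * dt)
        max_err = max(max_err, abs(joint - split))
```

The identity is exact, so the only discrepancy visible numerically is floating-point rounding.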

Proposition 13 The processes L = {Lt, t ≥ 0} and M = {Mt, t ≥ 0} are independent Poisson processes with respective rates λ(1 − p) and λp.

Proof. It is sufficient to show that for each 0 ≤ s < t, the event

A = {Mt − Ms = m, Lt − Ls = k}

is independent of the random variables {Mu, Lu, u ≤ s}, and that it has probability

P(A) = (e^{−λp(t−s)} [λp(t − s)]^m / m!) · (e^{−λ(1−p)(t−s)} [λ(1 − p)(t − s)]^k / k!).

The σ-field generated by the random variables {Mu, Lu, u ≤ s} coincides with the σ-field H generated by {Nu, u ≤ s, Y1, ..., Y_{Ns}}. We have

A = {Nt − Ns = m + k, S_{Nt} − S_{Ns} = m}.

Clearly the event A is independent of H. Finally,

P(A) = Σ_{n=0}^∞ P(A ∩ {Ns = n})
= Σ_{n=0}^∞ P(Ns = n, Nt − Ns = m + k, Sm+k+n − Sn = m)
= Σ_{n=0}^∞ P(Ns = n, Nt − Ns = m + k) P(Sm+k+n − Sn = m)
= P(Nt − Ns = m + k) P(Sm+k = m)
= (e^{−λ(t−s)} [λ(t − s)]^{m+k} / (m + k)!) · ((m + k)!/(m! k!)) p^m (1 − p)^k
= (e^{−λp(t−s)} [λp(t − s)]^m / m!) · (e^{−λ(1−p)(t−s)} [λ(1 − p)(t − s)]^k / k!).

2.4 Compound Poisson Processes

Let N = {Nt, t ≥ 0} be a Poisson process with rate λ. Consider a sequence {Yn, n ≥ 1} of independent and identically distributed random variables, which are also independent of the process N. Set Sn = Y1 + ··· + Yn. Then, the process

Zt = Y1 + ··· + Y_{Nt} = S_{Nt},

The σ-ﬁeld generated by the random variables {Zu . Let us compute the characteristic function of this increment ∞ E(e ix(Zt −Zs ) ) = m. then E(Zt ) = µλt. . Nt − Ns = k) E(eixY1 ) k=0 k = = exp e−λ(t−s) [λ(t − s)]k k! ϕY1 (x) − 1 λ(t − s) . the sales to customers arriving in (0. Proposition 14 The compound Poisson process has independent increments. The amount of money spent by the nth customer is a random variable Yn which is independent of all the arrival times (including his own) and all the amounts spent by others. Example 2 Arrivals of customers into a store form a Poisson process N . As a consequence. Proof. Y1 . If E(Y1 ) = µ.Nt −Ns =k} ) m. YNs }. Fix 0 ≤ s < t. and the law of an increment Zt − Zs has characteristic function e(ϕY1 (x)−1)λ(t−s) .k=0 ∞ E(eixSk )P (Ns = m. t] is Nt . and we set Y0 = 0. the increment Zt − Zs = SNt − SNs is independent of the σ-ﬁeld generated by the random variables {Zu .Nt −Ns =k} ) E(eix(Sm+k −Sm ) 1{Ns =m. u ≤ s. 27 . u ≤ s}.k=0 ∞ = = m. The total amount spent by the ﬁrst n customers is Sn = Y1 + · · · + Yn if n ≥ 1. . . t] total Zt = SNt .k=0 ∞ E(eix(Zt −Zs ) 1{Ns =m.with Zt = 0 if Nt = 0. Since the number of customers arriving in (0. is called a compound Poisson process. where ϕY1 (x) denotes the characteristic function of Y1 . u ≤ s} is included into the σ-ﬁeld H generated by{Nu . .
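A minimal simulation sketch of the compound Poisson process (the exponential jump law with mean µ and all parameter values are our own illustrative assumptions) can be checked against the identity E(Zt) = µλt:

```python
import random

rng = random.Random(2)
lam, mu, t, n_paths = 3.0, 2.0, 5.0, 10000

total = 0.0
for _ in range(n_paths):
    # Draw N_t ~ Poisson(lam * t) by counting exponential arrivals in [0, t].
    n = 0
    s = rng.expovariate(lam)
    while s <= t:
        n += 1
        s += rng.expovariate(lam)
    # Z_t = Y_1 + ... + Y_{N_t} with i.i.d. jumps; here Y_i ~ Exp with mean mu.
    total += sum(rng.expovariate(1.0 / mu) for _ in range(n))

# E(Z_t) = mu * lam * t = 30 for these parameters.
print(total / n_paths)
```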

2.5 Non-Stationary Poisson Processes

A non-stationary Poisson process is a stochastic process {Nt, t ≥ 0} defined on a probability space (Ω, F, P) which verifies the following properties:

i) N0 = 0,

ii) for any n ≥ 1 and for any 0 ≤ t1 < · · · < tn the increments Nt2 − Nt1, . . . , Ntn − Ntn−1 are independent random variables,

iii) for any t ≥ 0, the variable Nt has a Poisson distribution with parameter a(t), where a is a non-decreasing function.

Assume that a is continuous, and define the time inverse of a by

τ(t) = inf{s : a(s) > t},   t ≥ 0.

Then τ is also a non-decreasing function, and the process Mt = N_{τ(t)} is a stationary Poisson process with rate 1.

Example 3 The times when successive demands occur for an expensive "fad" item form a non-stationary Poisson process N with the arrival rate at time t being λ(t) = 5625 t e^{−3t}. That is,

a(t) = ∫_0^t λ(s) ds = 625(1 − e^{−3t} − 3t e^{−3t}).

The total demand N∞ for this item throughout [0, ∞) has the Poisson distribution with parameter 625. If, based on this analysis, the company has manufactured 700 such items, the probability of a shortage would be

P(N∞ > 700) = Σ_{k=701}^∞ e^{−625} 625^k / k!.

Using the normal distribution to approximate this, we have

P(N∞ > 700) = P((N∞ − 625)/√625 > 3) ≈ ∫_3^∞ (1/√(2π)) e^{−x²/2} dx = 0.0013.
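The two computations of Example 3 can be reproduced numerically (the helper names below are ours; `erfc` gives the Gaussian tail probability exactly):

```python
import math

def a(t):
    # cumulative rate a(t) = integral of 5625 s e^{-3s} ds = 625(1 - e^{-3t} - 3t e^{-3t})
    return 625.0 * (1.0 - math.exp(-3*t) - 3*t*math.exp(-3*t))

# a is non-decreasing with a(infinity) = 625, the parameter of the total demand.
print(a(10.0))            # essentially 625

# Normal approximation to the shortage probability:
# P(N_inf > 700) ~ P(Z > (700 - 625)/sqrt(625)) = P(Z > 3)
shortage = 0.5 * math.erfc(3 / math.sqrt(2))
print(round(shortage, 4))
```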

The time of sale of the last item is T700. Since T700 ≤ t if and only if Nt ≥ 700, its distribution is

P(T700 ≤ t) = P(Nt ≥ 700) = 1 − Σ_{k=0}^{699} e^{−a(t)} a(t)^k / k!.

Exercises

2.1 Let {Nt, t ≥ 0} be a Poisson process with rate λ = 15. Compute

a) P(N6 = 9)

b) P(N6 = 9, N20 = 13, N56 = 27)

c) P(N20 = 13 | N9 = 6)

d) P(N9 = 6 | N20 = 13)

2.2 Arrivals of customers into a store form a Poisson process N with rate λ = 20 per hour. Find the expected number of sales made during an eight-hour business day if the probability that a customer buys something is 0.30.

2.3 A department store has 3 doors. Arrivals at each door form Poisson processes with rates λ1 = 110, λ2 = 90, λ3 = 160 customers per hour. 30% of all customers are male. The probability that a male customer buys something is 0.80, and the probability of a female customer buying something is 0.10. An average purchase is 4.50 euros.

a) What is the average worth of total sales made in a 10 hour day?

b) What is the probability that the third female customer to purchase anything arrives during the first 15 minutes? What is the expected time of her arrival?
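The computations behind Exercise 2.1 reduce to Poisson probabilities via independent, stationary increments; a minimal sketch (the helper function is ours, not part of the notes) also cross-checks part d) against the binomial "uniformly scattered arrivals" argument:

```python
import math

def pois_pmf(k, m):
    """P(X = k) for X ~ Poisson(m)."""
    return math.exp(-m) * m**k / math.factorial(k)

lam = 15.0
# a) N_6 ~ Poisson(6 * lam)
p_a = pois_pmf(9, 6 * lam)
# c) by independent, stationary increments: P(N20 = 13 | N9 = 6) = P(N11 = 7)
p_c = pois_pmf(7, 11 * lam)
# d) P(N9 = 6 | N20 = 13), computed two equivalent ways:
#    via the definition of conditional probability ...
p_d1 = pois_pmf(6, 9 * lam) * pois_pmf(7, 11 * lam) / pois_pmf(13, 20 * lam)
#    ... and via the binomial law: each of the 13 arrivals falls in [0, 9]
#    independently with probability 9/20.
p_d2 = math.comb(13, 6) * (9 / 20)**6 * (11 / 20)**7
print(p_a, p_c, p_d1, p_d2)
```

Part b) factorizes the same way: P(N6 = 9, N20 = 13, N56 = 27) = P(N6 = 9) P(N14 = 4) P(N36 = 14).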

3 Markov Chains

Let I be a countable set that will be the state space of the stochastic processes studied in this section. A probability distribution π on I is formed by numbers 0 ≤ πi ≤ 1 such that Σ_{i∈I} πi = 1. If I has N elements, then π is an N-vector. We say that a matrix P = (pij, i, j ∈ I) is stochastic if 0 ≤ pij ≤ 1 and for each i ∈ I we have

Σ_{j∈I} pij = 1.

This means that the rows of the matrix form a probability distribution on I. If I is finite we label the states 1, 2, . . . , N; then P is a square N × N matrix, and in the general case P will be an infinite matrix. Examples of stochastic matrices:

P1 = ( 1−α   α
        β   1−β ),   (4)

where 0 < α, β < 1, and

P2 = ( 0    1    0
       0    1/2  1/2
       1/2  0    1/2 ).   (5)

A Markov chain will be a discrete time stochastic process with state space I. The elements pij will represent the probability that Xn+1 = j conditional on Xn = i.

Definition 15 We say that a stochastic process {Xn, n ≥ 0} is a Markov chain with initial distribution π and transition matrix P if for each n ≥ 0 and i0, i1, . . . , in+1 ∈ I:

(i) P(X0 = i0) = π_{i0},

(ii) P(Xn+1 = in+1 | X0 = i0, . . . , Xn = in) = p_{in in+1}.

Condition (ii) is a particular version of the so-called Markov property. This property refers to the lack of memory of a given stochastic system.

Notice that conditions (i) and (ii) determine the finite dimensional marginal distributions of the stochastic process. In fact, we have

P(X0 = i0, X1 = i1, . . . , Xn = in)
  = P(X0 = i0) P(X1 = i1 | X0 = i0) · · · P(Xn = in | X0 = i0, . . . , Xn−1 = in−1)
  = π_{i0} p_{i0 i1} · · · p_{in−1 in}.

The matrix P² defined by (P²)_{ik} = Σ_{j∈I} pij pjk is again a stochastic matrix. Similarly Pⁿ is a stochastic matrix, and we will denote the identity by P⁰,

(P⁰)_{ij} = δ_{ij} = 1 if i = j, 0 if i ≠ j.

We write pij^{(n)} = (Pⁿ)_{ij} for the (i, j) entry in Pⁿ. We call pij^{(n)} the n-step transition probability from i to j. We also denote by πP the product of the vector π times the matrix P. With this notation we have

P(Xn = j) = (πPⁿ)_j,   (6)

P(Xn+m = j | Xm = i) = pij^{(n)}.   (7)

Proof of (6):

P(Xn = j) = Σ_{i0∈I} · · · Σ_{in−1∈I} P(X0 = i0, . . . , Xn−1 = in−1, Xn = j)
          = Σ_{i0∈I} · · · Σ_{in−1∈I} π_{i0} p_{i0 i1} · · · p_{in−1 j} = (πPⁿ)_j.

Proof of (7): Assume P(Xm = i) > 0. Then

P(Xn+m = j | Xm = i)
  = Σ_{i0,...,im−1} Σ_{im+1,...,in+m−1} π_{i0} p_{i0 i1} · · · p_{im−1 i} p_{i im+1} · · · p_{in+m−1 j} / P(Xm = i)
  = Σ_{im+1,...,in+m−1} (πP^m)_i p_{i im+1} · · · p_{in+m−1 j} / P(Xm = i)
  = pij^{(n)}.

Property (7) says that the probability that the chain moves from state i to state j in n steps is the (i, j) entry of the nth power of the transition matrix P.

The following examples give some methods for calculating pij^{(n)}:

Example 1 Consider the two-state Markov chain with transition matrix P1 given in (4). From the relation P1^{n+1} = P1^n P1 we can write

p11^{(n+1)} = p12^{(n)} β + p11^{(n)} (1 − α).

We also know that p12^{(n)} + p11^{(n)} = 1, so by eliminating p12^{(n)} we get a recurrence relation for p11^{(n)}:

p11^{(n+1)} = β + p11^{(n)} (1 − α − β),   p11^{(0)} = 1.

This recurrence relation has a unique solution:

p11^{(n)} = β/(α+β) + (α/(α+β)) (1 − α − β)^n   for α + β > 0,
p11^{(n)} = 1                                    for α + β = 0.

Example 2 Consider the three-state chain with matrix P2 given in (5). Let us compute p11^{(n)}. First compute the eigenvalues of P2: 1, i/2, −i/2. From this we deduce that

p11^{(n)} = a + b (i/2)^n + c (−i/2)^n.

The answer we want is real, and

(±i/2)^n = (1/2)^n (cos(nπ/2) ± i sin(nπ/2)),

so it makes sense to rewrite p11^{(n)} in the form

p11^{(n)} = α + (1/2)^n (β cos(nπ/2) + γ sin(nπ/2))

for constants α, β and γ. The first values of p11^{(n)} are easy to write down, so we get equations to solve for α, β and γ:

1 = p11^{(0)} = α + β
0 = p11^{(1)} = α + (1/2) γ
0 = p11^{(2)} = α − (1/4) β

so α = 1/5, β = 4/5, γ = −2/5 and

p11^{(n)} = 1/5 + (1/2)^n ((4/5) cos(nπ/2) − (2/5) sin(nπ/2)).

Example 3 Sums of independent random variables. Put X0 = 0, and Xn = Y1 + · · · + Yn for n ≥ 1, where Y1, Y2, . . . are independent and identically distributed discrete random variables such that P(Y1 = k) = pk, for k = 0, 1, 2, . . . . Then

P(Xn+1 = in+1 | X0 = i0, . . . , Xn = in) = P(Yn+1 = in+1 − in | X0 = i0, . . . , Xn = in) = p_{in+1 − in}

by the independence of Yn+1 and X0, . . . , Xn. Thus, {Xn, n ≥ 0} is a Markov chain whose transition probabilities are pij = p_{j−i} if i ≤ j, and pij = 0 if i > j. The initial distribution is π0 = 1, πi = 0 for i > 0. In matrix form,

P = ( p0  p1  p2  p3  · · ·
      0   p0  p1  p2  · · ·
      0   0   p0  p1  · · ·
      0   0   0   p0  · · ·
      ·   ·   ·   ·       ).

Let Nn be the number of successes in n Bernoulli trials, where the probability of a success in any one is p. This is a particular case of a sum of independent random variables in which the variables Yn have a Bernoulli distribution. The state space of this process is {0, 1, 2, . . .}. The transition matrix is, in this case,

P = ( 1−p  p    0   · · ·
      0    1−p  p   · · ·
      ·    ·    ·       ).

Moreover,

pij^{(n)} = (n choose j−i) p^{j−i} (1 − p)^{n−j+i}.
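The closed forms derived in Examples 1 and 2 can be checked against direct matrix powers; the sketch below (with illustrative values α = 1/3, β = 1/4, chosen by us) uses exact rational arithmetic so that the two-state formula matches exactly:

```python
from fractions import Fraction as F
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(P, n):
    R = [[F(int(i == j)) for j in range(len(P))] for i in range(len(P))]
    for _ in range(n):
        R = matmul(R, P)
    return R

# Example 1: two-state chain (4) with alpha = 1/3, beta = 1/4.
a, b = F(1, 3), F(1, 4)
P1 = [[1 - a, a], [b, 1 - b]]
for n in range(8):
    closed = b/(a+b) + a/(a+b) * (1 - a - b)**n
    assert matpow(P1, n)[0][0] == closed

# Example 2: three-state chain (5).
h = F(1, 2)
P2 = [[0, 1, 0], [0, h, h], [h, 0, h]]
for n in range(8):
    closed = 0.2 + 0.5**n * (0.8*math.cos(n*math.pi/2) - 0.4*math.sin(n*math.pi/2))
    assert abs(float(matpow(P2, n)[0][0]) - closed) < 1e-12
print("closed forms match matrix powers")
```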

Example 4 Remaining Lifetime. Consider some piece of equipment which is now in use. When it fails, it is replaced immediately by an identical one. When that one fails it is again replaced by an identical one, and so on. Let pk be the probability that a new item lasts for k units of time, k = 1, 2, . . . . Let Xn be the remaining lifetime of the item in use at time n. That is,

Xn+1 = 1_{Xn ≥ 1}(Xn − 1) + 1_{Xn = 0}(Zn+1 − 1),

where Zn+1 is the lifetime of the item installed at time n. Since the lifetimes of successive items installed are independent, {Xn, n ≥ 0} is a Markov chain whose state space is {0, 1, 2, . . .}. The transition probabilities are as follows. For i ≥ 1,

pij = P(Xn+1 = j | Xn = i) = P(Xn − 1 = j | Xn = i) = 1 if j = i − 1, and 0 if j ≠ i − 1,

and for i = 0,

p0j = P(Xn+1 = j | Xn = 0) = P(Zn+1 − 1 = j | Xn = 0) = p_{j+1}.

Thus the transition matrix is

P = ( p1  p2  p3  · · ·
      1   0   0   · · ·
      0   1   0   · · ·
      ·   ·   ·       ).

Proposition 16 (Markov property) Let {Xn, n ≥ 0} be a Markov chain with transition matrix P and initial distribution π. Then, conditional on Xm = i, {Xn+m, n ≥ 0} is a Markov chain with transition matrix P and initial distribution δi, independent of the random variables X0, . . . , Xm.

Proof. Consider an event of the form A = {X0 = i0, . . . , Xm = im},

where im = i. We have to show that

P({Xm = im, . . . , Xm+n = im+n} ∩ A | Xm = i) = δ_{i im} p_{im im+1} · · · p_{im+n−1 im+n} P(A | Xm = i).

This follows from

P({Xm = im, . . . , Xm+n = im+n} ∩ A | Xm = i)
  = P(X0 = i0, . . . , Xm = im, Xm+1 = im+1, . . . , Xm+n = im+n) / P(Xm = i)
  = δ_{i im} p_{im im+1} · · · p_{im+n−1 im+n} P(A) / P(Xm = i)
  = δ_{i im} p_{im im+1} · · · p_{im+n−1 im+n} P(A | Xm = i).

3.1 Classification of the States

We say that i leads to j, and we write i → j, if

P(Xn = j for some n ≥ 0 | X0 = i) > 0.

We say that i communicates with j, and write i ↔ j, if i → j and j → i. For distinct states i and j the following statements are equivalent:

1. i → j

2. p_{i0 i1} p_{i1 i2} · · · p_{in−1 in} > 0 for some states i0, i1, . . . , in with i0 = i and in = j

3. pij^{(n)} > 0 for some n ≥ 0.

The equivalence between 1. and 3. follows from

pij^{(n)} ≤ P(Xn = j for some n ≥ 0 | X0 = i) ≤ Σ_{n=0}^∞ pij^{(n)},

and the equivalence between 2. and 3. follows from

pij^{(n)} = Σ_{i1,...,in−1} p_{i i1} · · · p_{in−1 j}.

It is clear from 2. that i → j and j → k imply i → k. Also i → i for any state i. So ↔ is an equivalence relation on I, which partitions I into communicating classes. We say that a class C is closed if i ∈ C and i → j imply j ∈ C.

A state i is absorbing if {i} is a closed class. A chain with transition matrix P where I is a single class is called irreducible. The class structure can be deduced from the diagram of transition probabilities. For the example

P = ( 1/2  1/2  0    0    0    0
      0    0    1    0    0    0
      1/3  0    0    1/3  1/3  0
      0    0    0    1/2  1/2  0
      0    0    0    0    0    1
      0    0    0    0    1    0 ),

the classes are {1, 2, 3}, {4}, and {5, 6}, with only {5, 6} being closed.

3.2 Hitting Times and Absorption Probabilities

Let {Xn, n ≥ 0} be a Markov chain with transition matrix P. The hitting time of a subset A of I is the random variable H^A with values in {0, 1, 2, . . .} ∪ {∞} defined by

H^A = inf{n ≥ 0 : Xn ∈ A},   (8)

where we agree that the infimum of the empty set is ∞. The probability starting from i that the chain ever hits A is then

h_i^A = P(H^A < ∞ | X0 = i).

When A is a closed class, h_i^A is called the absorption probability. The mean time taken for the chain to reach A, if P(H^A < ∞) = 1, is given by

k_i^A = E(H^A | X0 = i) = Σ_{n=0}^∞ n P(H^A = n).

The vector of hitting probabilities h^A = (h_i^A, i ∈ I) satisfies the following system of linear equations:

h_i^A = 1                      for i ∈ A,
h_i^A = Σ_{j∈I} pij h_j^A      for i ∉ A.   (9)

In fact, if X0 = i ∈ A, then H^A = 0, so h_i^A = 1. On the other hand, if X0 = i ∉ A, then H^A ≥ 1, so by the Markov property

P(H^A < ∞ | X1 = j, X0 = i) = P(H^A < ∞ | X0 = j) = h_j^A

and

h_i^A = P(H^A < ∞ | X0 = i) = Σ_{j∈I} P(H^A < ∞, X1 = j | X0 = i)
      = Σ_{j∈I} P(H^A < ∞ | X1 = j) P(X1 = j | X0 = i) = Σ_{j∈I} pij h_j^A.

The vector of mean hitting times k^A = (k_i^A, i ∈ I) satisfies the following system of linear equations:

k_i^A = 0                          for i ∈ A,
k_i^A = 1 + Σ_{j∉A} pij k_j^A      for i ∉ A.   (10)

In fact, if X0 = i ∈ A, then H^A = 0, so k_i^A = 0. If X0 = i ∉ A, then H^A ≥ 1, so by the Markov property

E(H^A | X1 = j, X0 = i) = 1 + E(H^A | X0 = j),

and

k_i^A = E(H^A | X0 = i) = Σ_{j∈I} E(H^A 1_{X1=j} | X0 = i)
      = Σ_{j∈I} E(H^A | X1 = j, X0 = i) P(X1 = j | X0 = i)
      = 1 + Σ_{j∉A} pij k_j^A.

Remark: The systems of equations (9) and (10) may have more than one solution. In this case, the vector of hitting probabilities h^A and the vector of mean hitting times k^A are the minimal non-negative solutions of these systems.

Example 5 Consider the chain with the matrix

P = ( 1    0    0    0
      1/2  0    1/2  0
      0    1/2  0    1/2
      0    0    0    1 ).

Questions: Starting from 1, what is the probability of absorption in 3? How long does it take until the chain is absorbed in 0 or 3?

Set hi = P(hit 3 | X0 = i). Clearly h0 = 0, h3 = 1, and

h1 = (1/2) h0 + (1/2) h2,
h2 = (1/2) h1 + (1/2) h3.

Hence, h2 = 2/3 and h1 = 1/3. On the other hand, if ki = E(hit {0, 3} | X0 = i), then k0 = k3 = 0, and

k1 = 1 + (1/2) k0 + (1/2) k2,
k2 = 1 + (1/2) k1 + (1/2) k3.

Hence, k1 = 2 and k2 = 2.

Example 6 (Gambler's ruin) Fix 0 < p = 1 − q < 1, and consider the Markov chain with state space {0, 1, 2, . . .} and transition probabilities

p00 = 1,   p_{i,i+1} = p,   p_{i,i−1} = q   for i = 1, 2, . . . .

Then Xn represents the fortune of a gambler after the nth play, if he starts with a capital of i euros, and at any time with probability p he earns 1 euro and with probability q he loses 1 euro. What is the probability of ruin? Here 0 is an absorbing state and we are interested in the probability of hitting zero. Set hi = P(hit 0 | X0 = i). Then h satisfies

h0 = 1,   hi = p h_{i+1} + q h_{i−1}   for i = 1, 2, . . . .

If p ≠ q this recurrence relation has a general solution

hi = A + B (q/p)^i.

If p < q, then the restriction 0 ≤ hi ≤ 1 forces B = 0, so hi = 1 for all i. If p > q, then since h0 = 1 we get a family of solutions

hi = (q/p)^i + A (1 − (q/p)^i);

for a non-negative solution we must have A ≥ 0, so the minimal non-negative solution is hi = (q/p)^i. Finally, if p = q the recurrence relation has a general solution hi = A + Bi, and again the restriction 0 ≤ hi ≤ 1 forces B = 0, so hi = 1 for all i.

The previous model represents the case where a gambler plays against a casino. Suppose now that the opponent gambler has an initial fortune of N − i, N being the total capital of both gamblers. In that case the state space is finite, I = {0, 1, . . . , N}, and the transition matrix of the Markov chain is

P = ( 1  0  0  · · ·        0
      q  0  p
         q  0  p
            · · ·
               q  0  p
      0  · · ·     0  0  1 ).

If we set hi = P(hit 0 | X0 = i), then h satisfies

h0 = 1,
hi = p h_{i+1} + q h_{i−1}   for i = 1, 2, . . . , N − 1,
hN = 0.

The solution to these equations is

hi = ((q/p)^i − (q/p)^N) / (1 − (q/p)^N)   if p ≠ q,

and

hi = (N − i)/N   if p = q = 1/2.

Notice that letting N → ∞ we recover the results for the case of an infinite fortune.
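A short sketch (parameter values are illustrative choices of ours) checks that the closed-form ruin probability satisfies the boundary conditions and the recurrence hi = p h_{i+1} + q h_{i−1}, and that the finite-N answer recovers the infinite-casino limit (q/p)^i as N grows:

```python
def ruin_prob(i, N, p):
    """P(hit 0 | X0 = i) for the gambler's ruin on {0, ..., N}."""
    q = 1 - p
    if p == q:
        return (N - i) / N
    r = q / p
    return (r**i - r**N) / (1 - r**N)

p, N = 0.6, 10
q = 1 - p
# boundary conditions and the recurrence h_i = p h_{i+1} + q h_{i-1}
assert ruin_prob(0, N, p) == 1 and ruin_prob(N, N, p) == 0
for i in range(1, N):
    lhs = ruin_prob(i, N, p)
    rhs = p * ruin_prob(i + 1, N, p) + q * ruin_prob(i - 1, N, p)
    assert abs(lhs - rhs) < 1e-12

# Letting N grow, we recover the infinite-fortune answer (q/p)^i for p > q.
print(ruin_prob(3, 10**6, p), (q / p)**3)
```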

Example 7 (Birth-and-death chain) Consider the Markov chain with state space {0, 1, 2, . . .} and transition probabilities

p00 = 1,   p_{i,i+1} = pi,   p_{i,i−1} = qi   for i = 1, 2, . . . ,

where, for i = 1, 2, . . . , we have 0 < pi = 1 − qi < 1. Such a chain may serve as a model for the size of a population, recorded each time it changes, pi being the probability that we get a birth before a death in a population of size i. As in the preceding example, 0 is an absorbing state and we wish to calculate the absorption probability starting from i. Set hi = P(hit 0 | X0 = i). Then h satisfies

h0 = 1,   hi = pi h_{i+1} + qi h_{i−1}   for i = 1, 2, . . . .

This recurrence relation has variable coefficients, so the usual technique fails. Consider ui = h_{i−1} − hi. Then pi u_{i+1} = qi ui, so

u_{i+1} = (qi/pi) ui = (qi q_{i−1} · · · q1)/(pi p_{i−1} · · · p1) u1 = γi u1,

where

γi = (qi q_{i−1} · · · q1)/(pi p_{i−1} · · · p1),

with γ0 = 1. Hence,

hi = h0 − (u1 + · · · + ui) = 1 − A(γ0 + γ1 + · · · + γ_{i−1}),

where A = u1 is a constant to be determined. There are two different cases:

(i) If Σ_{i=0}^∞ γi = ∞, the restriction 0 ≤ hi ≤ 1 forces A = 0, and hi = 1 for all i.

(ii) If Σ_{i=0}^∞ γi < ∞, we can take A > 0 as long as 1 − A(γ0 + γ1 + · · · + γ_{i−1}) ≥ 0 for all i. Thus the minimal non-negative solution occurs when A = (Σ_{i=0}^∞ γi)^{−1}, and then

hi = (Σ_{j=i}^∞ γj) / (Σ_{j=0}^∞ γj).

In this case, for i = 1, 2, . . . , we have hi < 1, so the population survives with positive probability.
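The minimal-solution formula in case (ii) is easy to evaluate numerically; the sketch below (truncation length and the sample sequences pi are our own choices) cross-checks it against the gambler's ruin, which is the special case of constant pi:

```python
def bd_absorption(i, p_seq, terms=2000):
    """Minimal solution h_i = (sum_{j>=i} gamma_j) / (sum_{j>=0} gamma_j),
    with gamma_j = (q_j...q_1)/(p_j...p_1); the series is truncated at `terms`."""
    gammas, g = [1.0], 1.0
    for j in range(1, terms):
        pj = p_seq(j)
        g *= (1 - pj) / pj
        gammas.append(g)
    return sum(gammas[i:]) / sum(gammas)

# Sanity check: constant p_i = p reduces to the gambler's ruin answer (q/p)^i.
p = 0.7
for i in range(5):
    assert abs(bd_absorption(i, lambda j: p) - ((1 - p) / p)**i) < 1e-6

# A population chain where births get likelier as the population grows:
# p_j = j/(j+1) gives gamma_j = 1/j!, so h_1 = (e - 1)/e.
print(bd_absorption(1, lambda j: j / (j + 1)))
```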

3.3 Stopping Times

Consider a non-decreasing family of σ-fields

F0 ⊂ F1 ⊂ F2 ⊂ · · ·

in a probability space (Ω, F, P). That is, Fn ⊂ F for all n. Intuitively, Fn represents the information available at time n. A random variable T : Ω → {0, 1, 2, . . .} ∪ {∞} is called a stopping time if the event {T = n} belongs to Fn for all n = 0, 1, 2, . . . . Intuitively, given the information available at time n, you know whether or not T has occurred by that time. The condition {T = n} ∈ Fn for all n ≥ 0 is equivalent to {T ≤ n} ∈ Fn for all n ≥ 0. This is an immediate consequence of the relations

{T ≤ n} = ∪_{j=0}^n {T = j},
{T = n} = {T ≤ n} ∩ ({T ≤ n − 1})^c.

A discrete-time stochastic process {Xn, n ≥ 0} is said to be adapted to the family {Fn, n ≥ 0} if Xn is Fn-measurable for all n.

Example 8 Consider a discrete time stochastic process {Xn, n ≥ 0}, and let Fn^X be the σ-field generated by the random variables {X0, . . . , Xn}. The process {Xn, n ≥ 0} is always adapted to the family Fn^X. Let A be a subset of the state space. Then the first hitting time H^A introduced in (8) is a stopping time because

{H^A = n} = {X0 ∉ A, . . . , X_{n−1} ∉ A, Xn ∈ A}.

The extension of the notion of stopping time to continuous time will be inspired by this equivalence. Consider now a continuous parameter non-decreasing family of σ-fields {Ft, t ≥ 0} in a probability space (Ω, F, P). A random variable T : Ω → [0, ∞] is called a stopping time if the event {T ≤ t} belongs to Ft for all t ≥ 0.

Example 9 Consider a real-valued stochastic process {Xt, t ≥ 0} with continuous trajectories, adapted to {Ft, t ≥ 0}. The first passage time for a level a ∈ R defined by

Ta := inf{t > 0 : Xt = a}

is a stopping time because

{Ta ≤ t} = { sup_{0≤s≤t} Xs ≥ a } = { sup_{0≤s≤t, s∈Q} Xs ≥ a } ∈ Ft.

Properties of stopping times:

1. If S and T are stopping times, so are S ∨ T and S ∧ T. In fact, this is a consequence of the relations

{S ∨ T ≤ t} = {S ≤ t} ∩ {T ≤ t},
{S ∧ T ≤ t} = {S ≤ t} ∪ {T ≤ t}.

2. Given a stopping time T, we can define the σ-field

FT = {A : A ∩ {T ≤ t} ∈ Ft, for all t ≥ 0}.

FT is a σ-field because it contains the empty set, it is stable by complements due to

A^c ∩ {T ≤ t} = (A ∪ {T > t})^c = ((A ∩ {T ≤ t}) ∪ {T > t})^c,

and it is stable by countable intersections.

3. If S and T are stopping times such that S ≤ T, then FS ⊂ FT. In fact, if A ∈ FS, then

A ∩ {T ≤ t} = [A ∩ {S ≤ t}] ∩ {T ≤ t} ∈ Ft

for all t ≥ 0.

4. Let {Xt} be an adapted stochastic process (with discrete or continuous parameter) and let T be a stopping time. If the parameter is continuous, we assume that the trajectories of the process {Xt} are right-continuous. Then the random variable

XT(ω) = X_{T(ω)}(ω)

is FT-measurable. In discrete time this property follows from the relation

{XT ∈ B} ∩ {T = n} = {Xn ∈ B} ∩ {T = n} ∈ Fn

for any subset B of the state space (Borel set if the state space is R).

Proposition 17 (Strong Markov property) Let {Xn, n ≥ 0} be a Markov chain with transition matrix P and initial distribution π. Let T be a stopping time of Fn^X. Then, conditional on T < ∞ and XT = i, {X_{T+n}, n ≥ 0} is a Markov chain with transition matrix P and initial distribution δi, independent of X0, . . . , XT.

Proof. Let B be an event determined by the random variables X0, . . . , XT. By the Markov property at time m,

P({X_{T+1} = j1, . . . , X_{T+n} = jn} ∩ B ∩ {T = m} ∩ {XT = i})
  = P({X_{m+1} = j1, . . . , X_{m+n} = jn} ∩ B ∩ {T = m} ∩ {XT = i})
  = P(X1 = j1, . . . , Xn = jn | X0 = i) P(B ∩ {T = m} ∩ {XT = i}).

Now sum over m = 0, 1, 2, . . . and divide by P(T < ∞, XT = i) to obtain

P({X_{T+1} = j1, . . . , X_{T+n} = jn} ∩ B | T < ∞, XT = i)
  = p_{i j1} p_{j1 j2} · · · p_{jn−1 jn} P(B | T < ∞, XT = i).

3.4 Recurrence and Transience

Let {Xn, n ≥ 0} be a Markov chain with transition matrix P. We say that a state i is recurrent if

P(Xn = i for infinitely many n | X0 = i) = 1.

We say that a state i is transient if

P(Xn = i for infinitely many n | X0 = i) = 0.

Thus a recurrent state is one to which you keep coming back, and a transient state is one which you eventually leave for ever. We shall show that every state is either recurrent or transient. We denote by Ti the first passage time to state i, which is a stopping time defined by

Ti = inf{n ≥ 1 : Xn = i},

where inf ∅ = ∞. We define inductively the rth passage time Ti^{(r)} to state i by

Ti^{(0)} = 0,   Ti^{(1)} = Ti,

and, for r = 1, 2, . . . ,

Ti^{(r+1)} = inf{n ≥ Ti^{(r)} + 1 : Xn = i}.

The length of the rth excursion to i is then

Si^{(r)} = Ti^{(r)} − Ti^{(r−1)}   if Ti^{(r−1)} < ∞,
Si^{(r)} = 0                       otherwise.

We will make use of the notation P(· | X0 = i) = Pi(·) and E(· | X0 = i) = Ei(·).

Lemma 18 For r = 2, 3, . . . , conditional on Ti^{(r−1)} < ∞, Si^{(r)} is independent of {Xm : m ≤ Ti^{(r−1)}} and

P(Si^{(r)} = n | Ti^{(r−1)} < ∞) = P(Ti = n | X0 = i).

Proof. Apply the strong Markov property to the stopping time T = Ti^{(r−1)}. It is clear that XT = i on T < ∞. So, conditional on T < ∞, {X_{T+n}, n ≥ 0} is a Markov chain with transition matrix P and initial distribution δi, independent of X0, . . . , XT. But

Si^{(r)} = inf{n ≥ 1 : X_{T+n} = i},

so Si^{(r)} is the first passage time of {X_{T+n}, n ≥ 0} to state i.

The number of visits Vi to state i is defined by

Vi = Σ_{n=0}^∞ 1_{Xn=i},

and

Ei(Vi) = Σ_{n=0}^∞ Pi(Xn = i) = Σ_{n=0}^∞ pii^{(n)}.

We can also compute the distribution of Vi under Pi in terms of the return probability

fi = Pi(Ti < ∞).

Lemma 19 For r = 0, 1, 2, . . . , we have Pi(Vi > r) = fi^r.

Proof. Observe that if X0 = i, then {Vi > r} = {Ti^{(r)} < ∞}. When r = 0 the result is true. Suppose inductively that it is true for r; then

Pi(Vi > r + 1) = Pi(Ti^{(r+1)} < ∞)
             = Pi(Ti^{(r)} < ∞ and Si^{(r+1)} < ∞)
             = Pi(Si^{(r+1)} < ∞ | Ti^{(r)} < ∞) Pi(Ti^{(r)} < ∞)
             = fi fi^r = fi^{r+1},

where we have used Lemma 18.

The next theorem provides two criteria for recurrence or transience, one in terms of the return probability, the other in terms of the n-step transition probabilities:

Theorem 20 Every state i is recurrent or transient, and:

(i) if Pi(Ti < ∞) = 1, then i is recurrent and Σ_{n=0}^∞ pii^{(n)} = ∞;

(ii) if Pi(Ti < ∞) < 1, then i is transient and Σ_{n=0}^∞ pii^{(n)} < ∞.

Proof. If Pi(Ti < ∞) = 1, then, by Lemma 19,

Pi(Vi = ∞) = lim_{r→∞} Pi(Vi > r) = lim_{r→∞} [Pi(Ti < ∞)]^r = 1,

so i is recurrent and

Σ_{n=0}^∞ pii^{(n)} = Ei(Vi) = ∞.

On the other hand, if fi = Pi(Ti < ∞) < 1, then, by Lemma 19,

Σ_{n=0}^∞ pii^{(n)} = Ei(Vi) = Σ_{r=0}^∞ Pi(Vi > r) = Σ_{r=0}^∞ fi^r = 1/(1 − fi) < ∞,

so Pi(Vi = ∞) = 0 and i is transient.

From this theorem we can solve completely the problem of recurrence or transience for Markov chains with finite state space. First we show that recurrence and transience are class properties.

Theorem 21 Let C be a communicating class. Then either all states in C are transient or all are recurrent.

Proof. Take any pair of states i, j ∈ C and suppose that i is transient. There exist n, m ≥ 0 with pij^{(n)} > 0 and pji^{(m)} > 0, and, for all r ≥ 0,

pii^{(n+r+m)} ≥ pij^{(n)} pjj^{(r)} pji^{(m)},

so

Σ_{r=0}^∞ pjj^{(r)} ≤ (1/(pij^{(n)} pji^{(m)})) Σ_{r=0}^∞ pii^{(n+r+m)} < ∞.

Hence, j is also transient.

Theorem 22 Every recurrent class is closed.

Proof. Let C be a class which is not closed. Then there exist i ∈ C, j ∉ C and m ≥ 1 with

Pi(Xm = j) > 0.

Since we have

Pi({Xm = j} ∩ {Xn = i for infinitely many n}) = 0,

this implies that

Pi(Xn = i for infinitely many n) < 1,

so i is not recurrent, and so neither is C.

Theorem 23 Every finite closed class is recurrent.

Proof. Suppose C is closed and finite and that {Xn, n ≥ 0} starts in C. Then for some i ∈ C we have

0 < P(Xn = i for infinitely many n)
  = P(Xn = i for some n) Pi(Xn = i for infinitely many n)

by the strong Markov property. This shows that i is not transient, so C is recurrent.
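The communicating-class decomposition used in these theorems can be computed mechanically from any finite transition matrix; a sketch (the 6-state matrix is an illustrative choice of ours, with states labelled 0 to 5):

```python
def communicating_classes(P):
    """Partition the states of a finite stochastic matrix P into
    communicating classes, via depth-first reachability on the graph."""
    n = len(P)
    def reachable(i):
        seen, stack = {i}, [i]
        while stack:
            u = stack.pop()
            for v in range(n):
                if P[u][v] > 0 and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen
    reach = [reachable(i) for i in range(n)]
    classes = []
    for i in range(n):
        cls = {j for j in range(n) if j in reach[i] and i in reach[j]}
        if cls not in classes:
            classes.append(cls)
    return classes

P = [[1/2, 1/2, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0],
     [1/3, 0, 0, 1/3, 1/3, 0],
     [0, 0, 0, 1/2, 1/2, 0],
     [0, 0, 0, 0, 0, 1],
     [0, 0, 0, 0, 1, 0]]
classes = communicating_classes(P)
print(classes)   # three classes; only {4, 5} is closed
```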

So, any state in a non-closed class is transient. Finite closed classes are recurrent, but infinite closed classes may be transient.

Example 10 Consider the Markov chain with state space I = {1, 2, . . . , 10} whose transition matrix P has its nonzero entries (the fractions 1/2, 1/3 and 1/4) placed as in its transition diagram. From the transition graph we see that {1, 3}, {2, 9} and {6} are irreducible closed sets. These sets can be reached from the states 4, 5, 7, 8 and 10 (but not vice versa). Hence state 6 is absorbing, the states 1, 2, 3 and 9 are recurrent, and the states 4, 5, 7, 8 and 10 are transient.

Example 11 (Random walk) Let {Xn, n ≥ 0} be a Markov chain with state space Z = {. . . , −1, 0, 1, . . .} and transition probabilities

p_{i,i+1} = p,   p_{i,i−1} = q,   for i ∈ Z,

where 0 < p = 1 − q < 1. This chain is called a random walk: if a person is at position i after step n, his next step leads him to either i + 1 or i − 1 with respective probabilities p and q. The Markov chain of Example 6 (Gambler's ruin) is a random walk on N = {0, 1, 2, . . .} with 0 as absorbing state. All states can be reached from each other, so the chain is irreducible. Suppose we start at i. It is clear that we cannot return to i after an odd number of steps, so pii^{(2n+1)} = 0 for all n. Any given sequence of steps of length 2n from i to i occurs with probability p^n q^n, there being n steps up and n steps down, and the number of such sequences is the number of ways of choosing the n steps up from 2n. Thus

pii^{(2n)} = (2n choose n) p^n q^n.
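The combinatorial formula for pii^{(2n)} and its Stirling-type approximation (4pq)^n/√(πn) can be checked numerically; the sketch below (our own helper, computed in logarithms to avoid floating-point underflow) also contrasts the divergent partial sums in the symmetric case with the convergent ones in the asymmetric case:

```python
import math

def p00(n, p):
    """p_00^(2n) = C(2n, n) (pq)^n, computed via logarithms to avoid underflow."""
    logv = math.lgamma(2*n + 1) - 2*math.lgamma(n + 1) + n*math.log(p*(1 - p))
    return math.exp(logv)

# Stirling: p_00^(2n) ~ (4pq)^n / sqrt(pi n); the ratio tends to 1.
n = 500
for p in (0.5, 0.3):
    approx = math.exp(n*math.log(4*p*(1 - p)) - 0.5*math.log(math.pi*n))
    print(p, p00(n, p) / approx)          # close to 1

# Recurrence vs transience: the partial sums of p_00^(2n) keep growing
# for p = 1/2 (like sum 1/sqrt(pi n)) but converge for p != 1/2.
print(sum(p00(k, 0.5) for k in range(1, 5000)))
print(sum(p00(k, 0.3) for k in range(1, 5000)))
```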

Using Stirling's formula n! ∼ √(2πn)(n/e)^n we obtain

pii^{(2n)} = ((2n)! / (n!)²) p^n q^n ∼ (4pq)^n / √(πn)

as n tends to infinity. In the symmetric case p = q = 1/2, we have 4pq = 1, so

pii^{(2n)} ∼ 1/√(πn),

and the random walk is recurrent because Σ_n 1/√n = ∞. On the other hand, if p ≠ q, then 4pq = r < 1, so

pii^{(2n)} ∼ r^n/√(πn),

and the random walk is transient because Σ_n r^n/√n < ∞.

Assuming X0 = 0, we can write Xn = Y1 + · · · + Yn, where the Yi are independent random variables, identically distributed with P(Y1 = 1) = p and P(Y1 = −1) = 1 − p. By the law of large numbers we know that

Xn/n → 2p − 1   a.s.

So limn→∞ Xn = +∞ if p > 1/2 and limn→∞ Xn = −∞ if p < 1/2. This also implies that the chain is transient if p ≠ q.

3.5 Invariant Distributions

We say a distribution π on the state space I is invariant for a Markov chain with transition probability matrix P if

πP = π.

If the initial distribution π of a Markov chain is invariant, then the distribution of Xn is π for all n. In fact, from (6) this distribution is πPⁿ = π. Invariant distributions are related to the limit behavior of Pⁿ.

Let us first show the following simple result. Let the state space I be finite. Then invariant distributions always exist. In fact, invariant distributions are row eigenvectors of P with eigenvalue 1. The row sums of P are all 1, so the column vector of ones is an eigenvector with eigenvalue 1. So P must have a row eigenvector with eigenvalue 1.

Proposition 24 Let the state space I be finite. Suppose for some i ∈ I that

pij^{(n)} → πj   as n → ∞, for all j ∈ I.

Then π = (πj, j ∈ I) is an invariant distribution.

Proof. We have

Σ_{j∈I} πj = Σ_{j∈I} lim_{n→∞} pij^{(n)} = lim_{n→∞} Σ_{j∈I} pij^{(n)} = 1

and

πj = lim_{n→∞} pij^{(n+1)} = lim_{n→∞} Σ_{k∈I} pik^{(n)} pkj = Σ_{k∈I} (lim_{n→∞} pik^{(n)}) pkj = Σ_{k∈I} πk pkj,

where we have used the finiteness of I to justify the interchange of summation and limit operation. Hence π is an invariant distribution. Notice that the πj are the entries of the rows of limn→∞ Pⁿ. This result is no longer true for infinite Markov chains.

Example 12 Consider the two-state Markov chain given in (4). The invariant distribution is

(β/(α+β), α/(α+β)).

Example 13 Consider the three-state Markov chain given in (5). To find an invariant distribution we must solve the linear equation πP = π. This equation together with the restriction π1 + π2 + π3 = 1 yields π = (1/5, 2/5, 2/5).

Consider a Markov chain {Xn, n ≥ 0}, irreducible and recurrent, with transition matrix P. For a fixed state i, consider for each j the expected time spent in j between visits to i:

γj^i = Ei( Σ_{n=0}^{Ti−1} 1_{Xn=j} ).
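The invariant distribution of Example 13 can be verified exactly with rational arithmetic (a small sketch of ours; `piP` is the row-vector product πP):

```python
from fractions import Fraction as F

# Chain (5): pi P2 = pi with pi1 + pi2 + pi3 = 1 gives pi = (1/5, 2/5, 2/5).
h = F(1, 2)
P2 = [[0, 1, 0], [0, h, h], [h, 0, h]]
pi = [F(1, 5), F(2, 5), F(2, 5)]

piP = [sum(pi[i] * P2[i][j] for i in range(3)) for j in range(3)]
assert piP == pi and sum(pi) == 1
print(piP)
```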

Proposition 25 Let {Xn, n ≥ 0} be an irreducible and recurrent Markov chain with transition matrix P. The numbers {γj^i, j ∈ I} form a measure on the state space I with the following properties:

1. 0 < γj^i < ∞, and γi^i = 1.

2. γ^i P = γ^i, that is, γ^i is an invariant measure for the stochastic matrix P.

Moreover, up to a multiplicative constant, it is the unique invariant measure, and

Σ_{j∈I} γj^i = Ei(Ti) := mi

is the expected return time in i. Notice that although the state i is recurrent, mi can be infinite. If the expected return time mi = Ei(Ti) is finite, then we say that i is positive recurrent. That is, a positive recurrent state satisfies Pi(Ti < ∞) = 1 and Ei(Ti) < ∞. A recurrent state which fails to have this stronger property is called null recurrent. If a state is positive recurrent, all states in its class are also positive recurrent. In a finite Markov chain all recurrent states are positive recurrent.

Let γ be an invariant measure for P. Then one of the following situations occurs:

(i) γ(I) = +∞. Then the chain is null recurrent and there is no invariant distribution.

(ii) γ(I) < ∞. Then the chain is positive recurrent and there is a unique invariant distribution given by

πi = 1/mi.

Example 14 The symmetric random walk of Example 11 is irreducible and recurrent in the case p = 1/2. The measure πi = 1 for all i is invariant. So we are in case (i) and there is no invariant distribution. The chain is null recurrent.

Example 15 Consider the asymmetric random walk on {0, 1, 2, . . .} with reflection at 0. The transition probabilities are

p_{i,i+1} = p,   p_{i,i−1} = 1 − p = q,

for i = 1, 2, . . . , and p01 = 1. So the transition matrix is

P = ( 0  1  0  · · ·
      q  0  p
         q  0  p
            ·  ·  · ).

This is an irreducible Markov chain. If p > q, then, by the results obtained in the case of the random walk on the integers, limn→∞ Xn = +∞, and the chain is transient; in that case there is no bounded invariant measure. If p = 1/2, we already know, by the results in the case of the random walk on the integers, that the chain is recurrent.

The invariant measure equation πP = π leads to the recursive system

π0 = π1 q,
π1 = π0 + π2 q,
πi = π_{i−1} p + π_{i+1} q   for i = 2, 3, . . . .

We obtain π2 q = π1 p, and recursively this implies πi p = π_{i+1} q. If p = 1/2, then πi must be constant for i ≥ 1, so the total mass is infinite: again there is no invariant distribution and the chain is null recurrent. Finally, if p < q, the chain is positive recurrent and the unique invariant distribution is

πi = (1/(2q)) (1 − p/q) (p/q)^{i−1}   if i ≥ 1,

and π0 = (1/2)(1 − p/q).

3.6 Convergence to Equilibrium

We shall study the limit behavior of the n-step transition probabilities pij^{(n)} as n → ∞. We have seen above that if the limit limn→∞ pij^{(n)} exists

for some i and for all j, then the limit must be an invariant distribution. However, the limit does not always exist, as the following example shows.

Example 16 Consider the two-state Markov chain

P = ( 0  1
      1  0 ).

Then P^{2n} = I and P^{2n+1} = P for all n. Thus, pij^{(n)} fails to converge, for all i, j.

Given a state i ∈ I, set I(i) = {n ≥ 1 : pii^{(n)} > 0}. The greatest common divisor of I(i) is called the period of the state i. If a state i verifies pii^{(n)} > 0 for all sufficiently large n, then the period of i is one and this state is called aperiodic. In an irreducible chain all states have the same period.

Proposition 26 Suppose that the transition matrix P is irreducible, positive recurrent and aperiodic. Let πi = 1/mi be the unique invariant distribution. Then

lim_{n→∞} pij^{(n)} = πj

for all i, j ∈ I.

Proposition 27 Suppose that the transition matrix P is irreducible and positive recurrent with period d. Let πi = 1/mi be the unique invariant distribution. Then, for all i, j ∈ I, there exists 0 ≤ r < d, with r = 0 if i = j, such that

lim_{n→∞} pij^{(nd+r)} = d πj

and pij^{(nd+s)} = 0 for all s ≠ r, 0 ≤ s < d.

Finally, let us state the following version of the ergodic theorem.

Proposition 28 Let {X_n, n ≥ 0} be an irreducible and positive recurrent Markov chain with transition matrix P, and let π be the invariant distribution. Then, for any function f on I integrable with respect to π, we have

(1/n) Σ_{k=0}^{n−1} f(X_k) → Σ_{i∈I} f(i)π_i  a.s.

In particular,

(1/n) Σ_{k=0}^{n−1} 1_{{X_k = j}} → π_j  a.s.

3.7 Continuous Time Markov Chains

Let I be a countable set. The basic data for a continuous-time Markov chain on I are given in the form of a so-called Q-matrix. By definition, a Q-matrix on I is any matrix Q = (q_ij : i, j ∈ I) which satisfies the following conditions:

(i) q_ii ≤ 0 for all i;
(ii) q_ij ≥ 0 for all i ≠ j;
(iii) Σ_{j∈I} q_ij = 0 for all i.

In each row of Q we can choose the off-diagonal entries to be any nonnegative real numbers, subject to the constraint that the off-diagonal row sum is finite:
q_i = Σ_{j≠i} q_ij < ∞.
The diagonal entry q_ii is then −q_i, making the total row sum zero.

A Q-matrix Q defines a stochastic matrix Π = (π_ij : i, j ∈ I), called the jump matrix, as follows. If q_i ≠ 0, the ith row of Π is
π_ij = q_ij/q_i if j ≠ i, and π_ii = 0.
If q_i = 0, then π_ij = δ_ij. From Π and the diagonal elements q_i we can recover the matrix Q.
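The passage from a Q-matrix to its jump matrix is mechanical; a small sketch (the Q-matrix here is hypothetical, and includes an absorbing state to exercise the q_i = 0 case):

```python
from fractions import Fraction as F

def jump_matrix(Q):
    """Pi from Q: pi_ij = q_ij/q_i (j != i) and pi_ii = 0 when q_i > 0;
    the identity row (pi_ij = delta_ij) when q_i = 0."""
    n = len(Q)
    Pi = []
    for i in range(n):
        qi = -Q[i][i]        # q_i = -q_ii equals the off-diagonal row sum
        if qi == 0:
            Pi.append([F(int(i == j)) for j in range(n)])
        else:
            Pi.append([F(0) if j == i else F(Q[i][j], qi) for j in range(n)])
    return Pi

# A hypothetical Q-matrix (not from the text); state 1 is absorbing (q_1 = 0).
Pi = jump_matrix([[-3, 2, 1],
                  [0, 0, 0],
                  [4, 0, -4]])
```

Each row of the resulting Π sums to one, as a jump matrix must.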

Example 17 The Q-matrix

Q =
( −2   1   1 )
(  1  −1   0 )      (11)
(  2   1  −3 )

has the associated jump matrix

Π =
(  0   1/2  1/2 )
(  1    0    0  )
( 2/3  1/3   0  )

Here is the construction of a continuous-time Markov chain with Q-matrix Q and initial distribution π. Consider a discrete-time Markov chain {Y_n, n ≥ 0} with initial distribution π and transition matrix Π, and let ξ_1, ξ_2, . . . be independent exponential random variables of parameter 1, independent of the chain {Y_n, n ≥ 0}. Denote by R_1, R_2, . . . the holding times in the successive states: we want that, conditional on Y_0, Y_1, . . . , Y_{n−1}, the holding times R_1, . . . , R_n are independent exponential random variables of parameters q_{Y_0}, . . . , q_{Y_{n−1}}. To achieve this, set
R_n = ξ_n/q_{Y_{n−1}},
with the convention that R_{n+1} = ∞ if q_{Y_n} = 0. Set T_0 = 0, T_n = R_1 + · · · + R_n, and
X_t = Y_n if T_n ≤ t < T_{n+1}.
The stochastic process {X_t, t ≥ 0}, starting from X_0 = Y_0, will visit successively the states Y_0, Y_1, Y_2, . . ., with R_1, R_2, . . . the holding times in these states.

The random time
ζ = sup_n T_n = Σ_{n=1}^∞ R_n
is called the explosion time. We say that the Markov chain X_t is not explosive if P(ζ = ∞) = 1. The following are sufficient conditions for no explosion:
1) I is finite;
2) sup_{i∈I} q_i < ∞;

3) X_0 = i and i is recurrent for the jump chain.

Set
P(t) = Σ_{k=0}^∞ (t^k Q^k)/k! .
Then {P(t), t ≥ 0} is a continuous family of matrices that satisfies the following properties:

(i) Semigroup property: P(s)P(t) = P(s + t), P(0) = I.

(ii) For all t ≥ 0, the matrix P(t) is stochastic. In fact, if Q has zero row sums then so does Q^n for every n:
Σ_{k∈I} q_ik^(n) = Σ_{k∈I} Σ_{j∈I} q_ij^(n−1) q_jk = Σ_{j∈I} q_ij^(n−1) Σ_{k∈I} q_jk = 0.
So
Σ_{j∈I} p_ij(t) = 1 + Σ_{n=1}^∞ (t^n/n!) Σ_{j∈I} q_ij^(n) = 1.
Furthermore, since P(t) = I + tQ + O(t²), we get that p_ij(t) ≥ 0 for all i, j when t is sufficiently small, and since P(t) = P(t/n)^n it follows that p_ij(t) ≥ 0 for all i, j and for all t.

(iii) Forward equation: P′(t) = P(t)Q = QP(t). In particular,
Q = dP(t)/dt |_{t=0}.

The family {P(t), t ≥ 0} is called the semigroup generated by Q.

Proposition 29 Let {X_t, t ≥ 0} be a continuous-time Markov chain with Q-matrix Q. Then, for all times 0 ≤ t_0 ≤ t_1 ≤ · · · ≤ t_{n+1} and all states i_0, i_1, . . . , i_{n+1}, we have
P(X_{t_{n+1}} = i_{n+1} | X_{t_0} = i_0, . . . , X_{t_n} = i_n) = p_{i_n i_{n+1}}(t_{n+1} − t_n).
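For a finite state space the series defining P(t) can be summed numerically; the sketch below truncates it for the Q-matrix of Example 17 and checks properties (i) and (ii):

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def semigroup(Q, t, terms=60):
    """P(t) = sum_k t^k Q^k / k!, truncated; adequate for moderate t."""
    n = len(Q)
    P = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in P]
    for k in range(1, terms):
        term = mat_mul(term, Q)
        term = [[x * t / k for x in row] for row in term]      # t^k Q^k / k!
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return P

Q = [[-2.0, 1.0, 1.0], [1.0, -1.0, 0.0], [2.0, 1.0, -3.0]]    # Example 17
Ps, Pt, Pst = semigroup(Q, 0.3), semigroup(Q, 0.7), semigroup(Q, 1.0)
row_err = max(abs(sum(row) - 1.0) for row in Ps)              # stochastic rows
semi_err = max(abs(mat_mul(Ps, Pt)[i][j] - Pst[i][j])
               for i in range(3) for j in range(3))           # P(s)P(t)=P(s+t)
```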

A probability distribution (or, more generally, a measure) π on the state space I is said to be invariant for a continuous-time Markov chain {X_t, t ≥ 0} if πP(t) = π for all t ≥ 0. In fact, a measure π is invariant if and only if

πQ = 0.   (12)

If {X_t, t ≥ 0} is a continuous-time Markov chain which is irreducible and recurrent (that is, the associated jump matrix Π is recurrent) with Q-matrix Q, then there is a unique (up to multiplication by constants) solution of (12), which is strictly positive. Moreover, if we set α_i = π_i q_i, then Equation (12) is equivalent to saying that α is invariant for the jump matrix Π. In fact, we have α_i(π_ij − δ_ij) = π_i q_ij for all i, j, so
(α(Π − I))_j = Σ_{i∈I} α_i(π_ij − δ_ij) = Σ_{i∈I} π_i q_ij = 0.

Let {X_t, t ≥ 0} be an irreducible continuous-time Markov chain with Q-matrix Q. The following statements are equivalent:
(i) the jump chain Π is positive recurrent;
(ii) Q is non-explosive and has an invariant distribution π.
Moreover, under these assumptions, we have
lim_{t→∞} p_ij(t) = π_j = 1/(q_j m_j),
where m_i = E_i(T_i) is the expected return time to the state i. If we choose an invariant distribution π as initial distribution of the Markov chain {X_t, t ≥ 0}, then the distribution of X_t is π for all t ≥ 0.

Example 18 We calculate p_11(t) for the continuous-time Markov chain with the Q-matrix (11). Q has eigenvalues 0, −2, −4, so we put
p_11(t) = a + be^{−2t} + ce^{−4t}
for some constants a, b, c. To determine the constants we use p_11(0) = 1, p′_11(0) = q_11 = −2 and p′′_11(0) = q_11^(2) = 7, which give
p_11(t) = 3/8 + (1/4)e^{−2t} + (3/8)e^{−4t}.
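These constants can be double-checked numerically: the eigenvalues 0, −2, −4 annihilate the characteristic polynomial of Q, and finite differences of the closed form recover p′_11(0) = −2 and p′′_11(0) = 7 (pure Python):

```python
import math

Q = [[-2, 1, 1], [1, -1, 0], [2, 1, -3]]          # the Q-matrix (11)

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def charpoly(lam):
    return det3([[Q[i][j] - lam * (i == j) for j in range(3)] for i in range(3)])

eigs_ok = [charpoly(l) for l in (0, -2, -4)]      # should all vanish

p = lambda t: 3/8 + math.exp(-2*t)/4 + 3*math.exp(-4*t)/8
Q2_11 = sum(Q[0][m] * Q[m][0] for m in range(3))  # (Q^2)_{11}
h = 1e-5
d1 = (p(h) - p(-h)) / (2 * h)                     # ~ p11'(0) = q_11
d2 = (p(h) - 2 * p(0) + p(-h)) / h ** 2           # ~ p11''(0) = (Q^2)_{11}
```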

Example 19 Let {N_t, t ≥ 0} be a Poisson process with rate λ > 0. Then it is a continuous-time Markov chain with state space {0, 1, 2, . . .} and Q-matrix

Q =
( −λ   λ           )
(      −λ   λ      )
(           −λ  λ  )
(        · · ·     )

Example 20 (Birth process) A birth process is a generalization of the Poisson process in which the parameter λ is allowed to depend on the current state of the process. The data for a birth process consist of birth rates 0 ≤ q_i < ∞, where i = 0, 1, 2, . . .. Then, a birth process X = {X_t, t ≥ 0} is a continuous-time Markov chain with state space {0, 1, 2, . . .} and Q-matrix

Q =
( −q_0  q_0             )
(       −q_1  q_1       )
(             −q_2  q_2 )
(          · · ·        )

That is, conditional on X_0 = i, the holding times R_1, R_2, . . . are independent exponential random variables of parameters q_i, q_{i+1}, . . ., respectively, and the jump chain is given by Y_n = i + n.

Concerning the explosion time, two cases are possible:
(i) if Σ_{j=0}^∞ 1/q_j < ∞, then ζ < ∞ a.s.;
(ii) if Σ_{j=0}^∞ 1/q_j = ∞, then ζ = ∞ a.s.

In fact, if Σ_{j=0}^∞ 1/q_j < ∞, then, by monotone convergence,
E_i(Σ_{n=1}^∞ R_n) = Σ_{n=1}^∞ 1/q_{i+n−1} < ∞,
so Σ_{n=1}^∞ R_n < ∞ a.s. If Σ_{n=1}^∞ 1/q_{i+n−1} = ∞, then Π_{n=1}^∞ (1 + 1/q_{i+n−1}) = ∞, and by monotone convergence and independence
E(exp(−Σ_{n=1}^∞ R_n)) = Π_{n=1}^∞ E(e^{−R_n}) = Π_{n=1}^∞ (1 + 1/q_{i+n−1})^{−1} = 0,
so Σ_{n=1}^∞ R_n = ∞ a.s., that is, ζ = ∞.
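The dichotomy is easy to see numerically: with the hypothetical rates q_j = (j + 1)² the series Σ 1/q_j converges (explosion in finite time), while the linear rates q_j = j + 1 give a divergent harmonic sum (no explosion):

```python
import math

N = 10 ** 6
# Quadratic birth rates q_j = (j+1)^2: sum 1/q_j converges (to pi^2/6),
# so the corresponding birth process explodes in finite time.
s_quad = sum(1.0 / (j + 1) ** 2 for j in range(N))
# Linear birth rates q_j = j+1: the sum is the harmonic series, divergent,
# so there is no explosion.
s_lin = sum(1.0 / (j + 1) for j in range(N))
```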

3.8 Birth and Death Processes

This is an important class of continuous-time Markov chains with applications in biology, demography, and queueing theory. The random variable X_t will represent the size of a population at time t, and the state space is I = {0, 1, 2, . . .}. A “birth” increases the size by one and a “death” decreases it by one. The parameters of this continuous-time Markov chain are:

1) p_i = probability that a birth occurs before a death if the size of the population is i;
2) α_i = parameter of the exponential time until the next birth or death.

Define λ_i = p_i α_i and µ_i = (1 − p_i)α_i. We have p_i = λ_i/(λ_i + µ_i), so these parameters can be thought of as the respective time rates of births and deaths at an instant at which the population size is i. Note that p_0 = 1. The parameters λ_i and µ_i are arbitrary non-negative numbers, except that µ_0 = 0 and λ_i + µ_i > 0 for all i ≥ 1.

Particular case (Simple birth process): Consider a population in which each individual gives birth after an exponential time of parameter λ, all independently. If i individuals are present, then the first birth will occur after an exponential time of parameter iλ. Then we have i + 1 individuals and, by the memoryless property, the process begins afresh. Hence the size of the population performs a birth process with rates q_i = iλ, i = 0, 1, 2, . . .. Note that Σ_{i=1}^∞ 1/(iλ) = ∞, so ζ = ∞ and there is no explosion in finite time.

However, the mean population size grows exponentially: E(X_t) = e^{λt} if X_0 = 1. Indeed, let T be the time of the first birth. Then, if µ(t) = E(X_t),
µ(t) = E(X_t 1_{{T≤t}}) + E(X_t 1_{{T>t}}) = ∫_0^t λe^{−λs} 2µ(t − s) ds + e^{−λt}.
Setting r = t − s we obtain
e^{λt} µ(t) = 2λ ∫_0^t e^{λr} µ(r) dr + 1,
and differentiating we get µ′(t) = λµ(t), so E(X_t) = e^{λt}.

The jump matrix of a birth and death process is, in the case λ_0 > 0,

Π =
(   0     1             )
( 1−p_1   0   p_1       )
(       1−p_2  0   p_2  )
(          · · ·        )

and the Q-matrix is

Q =
( −λ_0    λ_0                     )
(  µ_1  −λ_1−µ_1   λ_1            )
(        µ_2   −λ_2−µ_2   λ_2     )
(             · · ·               )

The matrix Π is irreducible. Notice that 1/(λ_i + µ_i) is the expected time spent in state i. A necessary and sufficient condition for non-explosion is then:

Σ_{i=0}^∞ (1/(λ_i + µ_i)) Σ_{n=0}^∞ π_ii^(n) = ∞.   (13)

On the other hand, the equation πQ = 0 satisfied by invariant measures leads to the system

µ_1 π_1 = λ_0 π_0,
λ_0 π_0 + µ_2 π_2 = (λ_1 + µ_1)π_1,
λ_{i−1} π_{i−1} + µ_{i+1} π_{i+1} = (λ_i + µ_i)π_i, for i = 2, 3, . . . .

Solving these equations consecutively, we get
π_i = (λ_0 λ_1 · · · λ_{i−1})/(µ_1 · · · µ_i) π_0, for i = 1, 2, . . . .
As a consequence, an invariant distribution exists if and only if
c = Σ_{i=1}^∞ (λ_0 λ_1 · · · λ_{i−1})/(µ_1 · · · µ_i) < ∞.
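The product-form solution and the balance equations can be checked numerically for any concrete rates; the sketch below uses the hypothetical choice λ_i = 1/(i + 1), µ_i = 2, for which c < ∞, and normalizes the solution to a distribution:

```python
# Hypothetical birth/death rates (any rates with c < infinity would do):
lam = lambda i: 1.0 / (i + 1)      # lambda_i, i >= 0
mu = lambda i: 2.0                 # mu_i, i >= 1
N = 60                             # truncation level

w = [1.0]                          # w_i = (lam_0...lam_{i-1})/(mu_1...mu_i)
for i in range(1, N):
    w.append(w[-1] * lam(i - 1) / mu(i))
total = sum(w)                     # = 1 + c (truncated)
pi = [x / total for x in w]

err0 = abs(mu(1) * pi[1] - lam(0) * pi[0])
err = max(abs(lam(i - 1) * pi[i - 1] + mu(i + 1) * pi[i + 1]
              - (lam(i) + mu(i)) * pi[i])
          for i in range(1, N - 1))
```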

In that case the invariant distribution is

π_0 = 1/(1 + c),
π_i = (1/(1 + c)) (λ_0 λ_1 · · · λ_{i−1})/(µ_1 · · · µ_i) if i ≥ 1.   (14)

Example 21 Suppose λ_i = λi and µ_i = µi for some λ, µ > 0. This corresponds to the case where each individual of the population has an exponential lifetime with parameter µ, and each individual can give birth to another after an exponential time with parameter λ. Notice that λ_0 = 0. The jump matrix is

Π =
(  1    0        )
( 1−p   0   p    )
(     1−p  0  p  )
(      · · ·     )

with p = λ/(λ + µ). So the jump chain is that of the Gambler’s ruin problem (Example 6). If λ ≤ µ we know that with probability one the population will become extinct. If λ > µ, condition (13) becomes
Σ_{i=1}^∞ (1/(i(λ + µ))) Σ_{n=0}^∞ π_ii^(n) ≥ Σ_{i=1}^∞ 1/(i(λ + µ)) = ∞,
and the chain is non-explosive.

A variation of this example is the birth and death process with linear growth and immigration: λ_i = λi + a, µ_i = µi, where λ, µ, a > 0. Again there is no explosion, because Σ_{i=1}^∞ 1/(i(λ + µ) + a) = ∞. Let us compute µ_i(t) := E_i(X_t), the average size of the population if the initial size is i. The equality P′(t) = P(t)Q leads to the system of differential equations
p′_ij(t) = Σ_{k=0}^∞ p_ik(t) q_kj,
that is,
p′_ij(t) = p_{i,j−1}(t)(λ(j − 1) + a) − p_ij(t)[(λ + µ)j + a] + p_{i,j+1}(t)µ(j + 1), for all j ≥ 1,
and
p′_i0(t) = −a p_i0(t) + µ p_i1(t).

Taking into account that µ_i(t) = Σ_{j=1}^∞ j p_ij(t), and using the above equations, we obtain
µ′_i(t) = Σ_{j=1}^∞ j p′_ij(t) = Σ_{j=1}^∞ p_ij(t) ((λj + a)(j + 1) − j[(λ + µ)j + a] + µ(j − 1)j)
= Σ_{j=1}^∞ p_ij(t)(a + j(λ − µ)) = a + (λ − µ)µ_i(t).
The solution to this differential equation with initial condition µ_i(0) = i yields
µ_i(t) = at + i if λ = µ,
µ_i(t) = (a/(λ − µ))(e^{(λ−µ)t} − 1) + i e^{(λ−µ)t} if λ ≠ µ.
Notice that in the case λ ≥ µ, µ_i(t) converges to infinity as t goes to infinity, and if λ < µ, µ_i(t) converges to a/(µ − λ).

3.9 Queues

The basic mathematical model for queues is as follows. There is a line of customers waiting for service. On arrival each customer must wait until a server is free, giving priority to earlier arrivals. It is assumed that the times between arrivals are independent random variables with some distribution, and the times taken to serve customers are also independent random variables of another distribution. The quantity of interest is the continuous-time stochastic process {X_t, t ≥ 0} recording the number of customers in the queue at time t. This is always taken to include both those being served and those waiting to be served.

3.9.1 M/M/1/∞ queue

Here M means memoryless, 1 stands for one server, and infinite queues are permissible. Customers arrive according to a Poisson process with rate λ, and service times are exponential of parameter µ. Under these assumptions, {X_t, t ≥ 0} is a continuous-time Markov chain with state space

{0, 1, 2, . . .}. Actually, it is a special birth and death process with Q-matrix

Q =
( −λ     λ               )
(  µ   −λ−µ    λ         )
(        µ   −λ−µ   λ    )
(          · · ·         )

In fact, suppose at time 0 there are i > 0 customers in the queue. Denote by T the time taken to serve the first customer and by S the time of the next arrival. Then the first jump time is R_1 = inf(S, T), which is an exponential random variable of parameter λ + µ. In fact,
P(T < S, inf(S, T) > x) = ∫_x^∞ e^{−λt} µe^{−µt} dt = (µ/(λ + µ)) e^{−(λ+µ)x} = P(T < S) P(inf(S, T) > x),
so the events {T < S} and {T > S} are independent of inf(S, T). Moreover, conditional on {T < S}, S − T is exponential of parameter λ:
P(S − T > x|T < S) = ((λ + µ)/µ) P(S − T > x, T < S)
= ((λ + µ)/µ) ∫_0^∞ e^{−λ(t+x)} µe^{−µt} dt = e^{−λx},
and, similarly, conditional on {T > S}, T − S is exponential of parameter µ.

Thus, X_{R_1} = i − 1 with probability µ/(λ + µ) and X_{R_1} = i + 1 with probability λ/(λ + µ), and, conditional on X_{R_1} = j, the process {X_t, t ≥ 0} begins afresh from j at time R_1. The holding times are independent and exponential with parameters λ, λ + µ, λ + µ, . . ., respectively; hence the chain is non-explosive. The jump matrix is

Π =
(    0        1               )
( µ/(λ+µ)    0     λ/(λ+µ)    )
(        µ/(λ+µ)  0  λ/(λ+µ)  )
(           · · ·             )
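The memorylessness computations above are easy to confirm by simulation (seeded; the rates λ = 1, µ = 2 are arbitrary):

```python
import random

random.seed(5)
lam, mu = 1.0, 2.0
n = 200_000

count, excess_sum = 0, 0.0
for _ in range(n):
    T = random.expovariate(mu)    # service completion time
    S = random.expovariate(lam)   # time of the next arrival
    if T < S:
        count += 1
        excess_sum += S - T       # residual time until the arrival

frac = count / n                  # should approach mu/(lam+mu) = 2/3
mean_excess = excess_sum / count  # memorylessness: E(S-T | T<S) = 1/lam
```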

The ﬁrst service time 63 . If i servers are occupied. The ratio r is called the traﬃc intensity for the system. . so the queue growths without limit in the long term. . the ﬁrst service is completed at the minimum of i independent exponential times of parameter µ. The average numbers of customers in the queue in equilibrium is given by ∞ ∞ Eπ (Xt ) = i=1 Pπ (Xt ≥ i) = i=1 λ µ i = λ .2 M/M/1/m queue 1 1 = . 2. If λ = µ the chain is null recurrent. µ−λ Also. . the mean time to return to zero is given by m0 = 1 µ = q0 π 0 λ(µ − λ) so the mean length of time that the server is continuously busy is given by m0 − 3. . using (14) or Example 13 : π j = (1 − r)rj .λ is that of the random walk of Example 13 with p = λ+µ . that is. for i = 0. . 3. there are s servers. If λ < µ.3 M/M/s/∞ queue Arrivals from a Poisson process with rate λ. . the chain is positive recurrent and the unique invariant distribution is. 1. m} is a recurrent class.9. . The states m + 1. . . a customer arriving to ﬁnd m or more customers in the system leaves and never returns (we do allow initial sizes of greater than m. the chain is transient and we have limt→∞ Xt = +∞. . starting with less than m customers. λi = 0 for i ≥ m and µi = µ for all i ≥ 1. however). and the waiting room is of inﬁnite size. Hence. This is a birth and death process with λi = λ. are all transient and C = {0. service times are exponential with parameter µ. If λ > µ. . the total number in the system cannot exceed m. λ where r = µ .9. 1. m + 2. m − 1. q0 µ−λ This is a queueing system like the preceding one with a waiting room of ﬁnite capacity m − 1.

The first service time is therefore exponential of parameter iµ. The total service rate increases to a maximum of sµ, when all servers are working. Then the queue size is a birth and death process with λ_i = λ for all i ≥ 0, µ_i = iµ for i = 1, . . . , s, and µ_i = sµ for i ≥ s + 1. We emphasize that the queue size includes the customers who are currently being served. The chain is positive recurrent when λ < sµ. The invariant distribution is
π_i = C (λ/µ)^i (1/i!) for i = 0, 1, . . . , s,
π_i = C (λ/µ)^i s^{s−i} (1/s!) for i ≥ s + 1,
with some normalizing constant C.

In the particular case s = ∞ there are infinitely many servers, so that no customer ever waits. In this case the normalizing constant is C = e^{−λ/µ} and π is a Poisson distribution of parameter λ/µ. This model applies to very large parking lots: the customers are the vehicles, the servers are the stalls, and the service times are the times vehicles remain parked.

This model also fits the short-run behavior of the size (measured in terms of the number of families living in them) of larger cities. The arrivals are the families moving into the city, and a service time is the amount of time spent there by a family before it moves out. In the USA, 1/µ is about 4 years. If the city has a stable population size of about 1,000,000 families, then the arrival rate must be λ = 250,000 families per year. Approximating the Poisson distribution by a normal one, we obtain
P(997,000 ≤ X_t ≤ 1,003,000) ≈ 0.9974,
that is, the number of families in the city is about one million plus or minus three thousand.

3.9.4 Telephone exchange

A variation on the M/M/s queue is to turn away customers who cannot be served immediately. This queue might serve as a simple model for a telephone exchange, where the maximum number of calls that can be connected at once is s: when the exchange is full, the additional calls are lost. The queue size is then a birth and death process with λ_i = λ for i = 0, 1, . . . , s − 1, λ_i = 0 for i ≥ s, and µ_i = iµ. The invariant distribution of this irreducible chain is
π_i = ((λ/µ)^i (1/i!)) / (Σ_{j=0}^s (λ/µ)^j (1/j!)), for i = 0, 1, . . . , s,

which is called the truncated Poisson distribution. By the ergodic theorem, the long-run proportion of time that the exchange is full, and hence the long-run proportion of calls that are lost, is given by Erlang’s formula:

π_s = ((λ/µ)^s (1/s!)) / (Σ_{j=0}^s (λ/µ)^j (1/j!)).

Exercises

3.1 Let {X_n, n ≥ 0} be a Markov chain with state space {1, 2, 3}, transition matrix

P =
(  0   1/3  2/3 )
( 1/4  3/4   0  )
( 2/5   0   3/5 )

and initial distribution π = (1/5, 2/5, 2/5). Compute the following probabilities:
(a) P(X_1 = 2, X_2 = 2, X_4 = 1, X_5 = 3)
(b) P(X_5 = 2, X_6 = 2 | X_2 = 2)

3.2 Classify the states of the Markov chains with the following transition matrices:

P1 =
(  0    0    0    1    0  )
(  1    0    0    0    0  )
( 0.5   0    0    0   0.5 )
( 0.8   0   0.1   0   0.1 )
( 0.2   0    0    0   0.8 )

P2 =
( 0.5   0    0    0   0.5 )
(  0   0.9   0   0.1   0  )
(  0    0   0.3   0   0.7 )
(  0   0.7   0   0.3   0  )
(  0   0.1  0.4  0.2  0.3 )

P3 =
(  0    0    0   1/2  1/2 )
(  0   1/2   0   1/2   0  )
(  0    0    1    0    0  )
(  0   1/4  1/4  1/4  1/4 )
( 1/2   0    0    0   1/2 )

3.3 Find all invariant distributions of the transition matrix P3 in Exercise 3.2.

3.4 In Example 4, instead of considering the remaining lifetime at time n, let us consider the age Y_n of the equipment in use at time n. For any n, Y_{n+1} = 0 if the equipment failed during the period n + 1, and Y_{n+1} = Y_n + 1 if it did not fail during the period n + 1.

(a) Show that
P(Y_{n+1} = i + 1 | Y_n = i) = 1 − p_1 if i = 0,
P(Y_{n+1} = i + 1 | Y_n = i) = (1 − p_1 − · · · − p_{i+1})/(1 − p_1 − · · · − p_i) if i ≥ 1.

(b) Show that {Y_n, n ≥ 0} is a Markov chain with state space {0, 1, 2, . . .} and find its transition matrix.

(c) Classify the states of this Markov chain.

3.5 Consider a system whose successive states form a Markov chain with state space {1, 2, 3} and transition matrix

Π =
(  0    1    0  )
( 0.4   0   0.6 )
(  0    1    0  )

Suppose the holding times in states 1, 2, 3 are all independent and have exponential distributions with parameters 2, 5, 3, respectively. Compute the Q-matrix of this Markov system.

3.6 Suppose the arrivals at a counter form a Poisson process with rate λ, and suppose each arrival is of type a or of type b with respective probabilities p and 1 − p = q. Let X_t be the type of the last arrival before t. Show that {X_t, t ≥ 0} is a continuous-time Markov chain with state space {a, b}. Compute the distribution of the holding times, the Q-matrix, and the transition semigroup P(t).

3.7 Consider a queueing system M/M/1/2 with capacity 2, for the case λ = 2, µ = 8. Suppose a penalty cost of one dollar per unit time is applicable for each customer present in the system, and suppose the discount rate is α, where α > 0. Compute the Q-matrix and the potential matrix U^α = (αI − Q)^{−1}. Show that the expected total discounted cost, starting at i, is (U^α g)_i, where g = (0, 1, 2).

4 Martingales

We will first introduce the notion of conditional expectation of a random variable X with respect to a σ-field B ⊂ F in a probability space (Ω, F, P).

4.1 Conditional Expectation

Consider an integrable random variable X defined in a probability space (Ω, F, P), and a σ-field B ⊂ F. We define the conditional expectation of X given B (denoted by E(X|B)) to be any integrable random variable Z that verifies the following two properties:

(i) Z is measurable with respect to B.

(ii) For all A ∈ B,
E(Z 1_A) = E(X 1_A).

It can be proved that there exists a unique (up to modifications on sets of probability zero) random variable satisfying these properties. That is, if Z and Z′ verify the above properties, then Z = Z′, P-almost surely.

Property (ii) implies that for any bounded and B-measurable random variable Y we have

E(E(X|B) Y) = E(XY).   (15)

Example 1 Consider the particular case where the σ-field B is generated by a finite partition {B_1, . . . , B_m}. In this case, the conditional expectation E(X|B) is a discrete random variable that takes the constant value E(X|B_j) on each set B_j:

E(X|B) = Σ_{j=1}^m (E(X 1_{B_j})/P(B_j)) 1_{B_j}.

Here are some rules for the computation of conditional expectations in the general case:

Rule 1 The conditional expectation is linear:
E(aX + bY |B) = aE(X|B) + bE(Y |B).

Rule 2 A random variable and its conditional expectation have the same expectation:
E(E(X|B)) = E(X).
This follows from property (ii) taking A = Ω.

Rule 3 If X and B are independent, then E(X|B) = E(X). In fact, the constant E(X) is clearly B-measurable, and for all A ∈ B we have
E(X 1_A) = E(X)E(1_A) = E(E(X) 1_A).

Rule 4 If X is B-measurable, then E(X|B) = X.

Rule 5 If Y is a bounded and B-measurable random variable, then
E(Y X|B) = Y E(X|B).
In fact, the random variable Y E(X|B) is integrable and B-measurable, and for all A ∈ B, applying (15) to the bounded B-measurable random variable Y 1_A, we have
E(E(X|B) Y 1_A) = E(XY 1_A).
This rule means that B-measurable random variables behave as constants and can be factorized out of the conditional expectation with respect to B.

Rule 6 Given two σ-fields C ⊂ B, then
E(E(X|B)|C) = E(E(X|C)|B) = E(X|C).

Rule 7 Consider two random variables X and Z, such that Z is B-measurable and X is independent of B. Consider a measurable function h(x, z) such that the composition h(X, Z) is an integrable random variable. Then
E(h(X, Z)|B) = E(h(X, z))|_{z=Z}.
That is, we first compute the conditional expectation E(h(X, z)) for any fixed value z of the random variable Z and, afterwards, we replace z by Z.

. . Ym ) of the variables Y1 . . the following monotone property holds: X ≤ Y ⇒ E(X|B) ≤ E(Y |B). . . . . Ym ) is a function g(Y1 . Ym ).Conditional expectation has properties similar to those of ordinary expectation. . In this case. Ym . . . . . . Ym ) = a p(dx|Y1 . For instance. Jensen inequality also holds. . . . . . In this case this conditional expectation is the mean of the conditional distribution of X given Y1 . we obtain |E(X|B)|p ≤ E(|X|p |B). . ym ). ym ) = R xp(dx|y1 . . .where g(y1 . this implies that E(X|Y1 . . hence. . That is. such that for all a < b b (16) P (a ≤ X ≤ b|Y1 . . . . Notice that the conditional expectation E(X|Y1 . . . . . . . . (17) We can deﬁne the conditional probability of an even C ∈ F given a σ-ﬁeld B as P (C|B) = E(1C |B). . In particular. . . . then E(|E(X|B)|p ) < ∞ and E(|E(X|B)|p ) ≤ E(|X|p ). Ym . . if ϕ is a convex function such that E(|ϕ(X)|) < ∞. 70 . . . . . . . Then. . we deduce that if E(|X|p ) < ∞. . This implies |E(X|B) | ≤ E(|X| |B). . . Ym ). . Ym ) = R xp(dx|Y1 . taking expectations. . Ym ). Suppose that the σ-ﬁeld B is generated by a ﬁnite collection of random variables Y1 . Ym is a family of distributions p(dx|y1 . . . ym of the random variables Y1 . . . . . Ym . . ym ) parameterized by the possible values y1 . if we take ϕ(x) = |x|p with p ≥ 1. . . . Ym . . . . then ϕ (E(X|B)) ≤ E(ϕ(X)|B). . The conditional distribution of X given Y1 . we will denote the conditional expectation of X given B by E(X|Y1 .

Y_m have a joint density f(x, y_1, . . . , y_m), then the conditional distribution has the density
f(x|y_1, . . . , y_m) = f(x, y_1, . . . , y_m) / ∫_{−∞}^{+∞} f(x, y_1, . . . , y_m) dx,
and
E(X|Y_1, . . . , Y_m) = ∫_{−∞}^{+∞} x f(x|Y_1, . . . , Y_m) dx.

The set of all square integrable random variables, denoted by L²(Ω, F, P), is a Hilbert space with the scalar product
⟨Z, Y⟩ = E(ZY).
The set of square integrable and B-measurable random variables, denoted by L²(Ω, B, P), is a closed subspace of L²(Ω, F, P). Then, given a random variable X such that E(X²) < ∞, the conditional expectation E(X|B) is the projection of X on the subspace L²(Ω, B, P). In fact, we have:

(i) E(X|B) belongs to L²(Ω, B, P), because it is a B-measurable random variable and it is square integrable due to (17).

(ii) X − E(X|B) is orthogonal to the subspace L²(Ω, B, P). In fact, for all Z ∈ L²(Ω, B, P) we have, using Rule 5,
E[(X − E(X|B))Z] = E(XZ) − E(E(X|B)Z) = E(XZ) − E(E(XZ|B)) = 0.

As a consequence, E(X|B) is the random variable in L²(Ω, B, P) that minimizes the mean square error:

E[(X − E(X|B))²] = min_{Y ∈ L²(Ω,B,P)} E[(X − Y)²].   (18)

This follows from the relation
E[(X − Y)²] = E[(X − E(X|B))²] + E[(E(X|B) − Y)²],
and it means that the conditional expectation is the optimal estimator of X given the σ-field B.
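Both the defining property (ii) and the projection property (18) can be illustrated on a small finite example (a hypothetical X and partition; uniform probability):

```python
# Finite sample space with uniform P; X is an arbitrary hypothetical r.v.;
# B is generated by the two-block partition below.
omega = range(6)
X = {0: 1.0, 1: 4.0, 2: 1.0, 3: 2.0, 4: 8.0, 5: 2.0}
partition = [{0, 1, 2}, {3, 4, 5}]

def block_mean(f, A):
    return sum(f[w] for w in A) / len(A)

# E(X|B) is constant on each block: E(X 1_B)/P(B) reduces to the block mean.
Z = {w: block_mean(X, B) for B in partition for w in B}

# Defining property (ii): E(Z 1_A) = E(X 1_A) for the generating sets A.
err = max(abs(sum(Z[w] - X[w] for w in B)) for B in partition)

# Projection property (18): among B-measurable Y (constant c1, c2 on the
# two blocks) the mean square error is minimized at Z.
def mse(c1, c2):
    Y = {w: (c1 if w in partition[0] else c2) for w in omega}
    return sum((X[w] - Y[w]) ** 2 for w in omega) / 6

best = mse(Z[0], Z[3])
grid = [mse(c1, c2) for c1 in (1.0, 2.0, 3.0) for c2 in (3.0, 4.0, 5.0)]
```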

4.2 Discrete Time Martingales

In this section we consider a probability space (Ω, F, P) and a nondecreasing sequence of σ-fields
F_0 ⊂ F_1 ⊂ F_2 ⊂ · · · ⊂ F_n ⊂ · · ·
contained in F. A sequence of real random variables M = {M_n, n ≥ 0} is called a martingale with respect to the σ-fields {F_n, n ≥ 0} if:

(i) For each n ≥ 0, M_n is F_n-measurable (that is, M is adapted to the filtration {F_n, n ≥ 0}).
(ii) For each n ≥ 0, E(|M_n|) < ∞.
(iii) For each n ≥ 0,
E(M_{n+1}|F_n) = M_n.

Notice that the martingale property implies that E(M_n) = E(M_0) for all n ≥ 0. On the other hand, condition (iii) can also be written as
E(∆M_n|F_{n−1}) = 0
for all n ≥ 1, where ∆M_n = M_n − M_{n−1}.

The sequence M = {M_n, n ≥ 0} is called a supermartingale (or submartingale) if property (iii) is replaced by E(M_{n+1}|F_n) ≤ M_n (or E(M_{n+1}|F_n) ≥ M_n).

Example 2 Suppose that {ξ_n, n ≥ 1} are independent centered random variables. Set M_0 = 0 and M_n = ξ_1 + · · · + ξ_n. Then M_n is a martingale with respect to the sequence of σ-fields F_n = σ(ξ_1, . . . , ξ_n) for n ≥ 1, and F_0 = {∅, Ω}. In fact,
E(M_{n+1}|F_n) = E(M_n + ξ_{n+1}|F_n) = M_n + E(ξ_{n+1}|F_n) = M_n + E(ξ_{n+1}) = M_n.

Example 3 Suppose that {ξ_n, n ≥ 1} are independent random variables such that P(ξ_n = 1) = p, P(ξ_n = −1) = 1 − p, with 0 < p < 1. Then

M_n = ((1 − p)/p)^{ξ_1+···+ξ_n}, M_0 = 1, is a martingale with respect to the sequence of σ-fields F_n = σ(ξ_1, . . . , ξ_n) for n ≥ 1, and F_0 = {∅, Ω}. In fact,
E(M_{n+1}|F_n) = E(((1 − p)/p)^{ξ_1+···+ξ_{n+1}}|F_n)
= ((1 − p)/p)^{ξ_1+···+ξ_n} E(((1 − p)/p)^{ξ_{n+1}}|F_n)
= M_n E(((1 − p)/p)^{ξ_{n+1}}) = M_n.

In the two previous examples, {F_n} is the filtration generated by the process {M_n}. Usually, when the filtration is not mentioned, we will take F_n = σ(M_0, . . . , M_n), for all n ≥ 0. This is always possible due to the following result:

Lemma 30 Suppose {M_n, n ≥ 0} is a martingale with respect to a filtration {G_n}. Let F_n = σ(M_0, . . . , M_n) ⊂ G_n. Then {M_n, n ≥ 0} is a martingale with respect to the filtration {F_n}.

Proof. We have, by Rule 6 of conditional expectations,
E(M_{n+1}|F_n) = E(E(M_{n+1}|G_n)|F_n) = E(M_n|F_n) = M_n.

Some elementary properties of martingales:

1. If {M_n} is a martingale, then for all m ≥ n we have
E(M_m|F_n) = M_n.
In fact,
M_n = E(M_{n+1}|F_n) = E(E(M_{n+2}|F_{n+1})|F_n) = E(M_{n+2}|F_n) = · · · = E(M_m|F_n).

2. {M_n} is a submartingale if and only if {−M_n} is a supermartingale.

3. If {M_n} is a martingale and ϕ is a convex function such that E(|ϕ(M_n)|) < ∞ for all n ≥ 0, then {ϕ(M_n)} is a submartingale. In fact, by Jensen’s inequality for the conditional expectation, we have
E(ϕ(M_{n+1})|F_n) ≥ ϕ(E(M_{n+1}|F_n)) = ϕ(M_n).
In particular, if {M_n} is a martingale such that E(|M_n|^p) < ∞ for all n ≥ 0 and for some p ≥ 1, then {|M_n|^p} is a submartingale.

4. If {M_n} is a submartingale and ϕ is a convex and increasing function such that E(|ϕ(M_n)|) < ∞ for all n ≥ 0, then {ϕ(M_n)} is a submartingale, because
E(ϕ(M_{n+1})|F_n) ≥ ϕ(E(M_{n+1}|F_n)) ≥ ϕ(M_n).
In particular, if {M_n} is a submartingale, then {M_n^+} and {M_n ∨ a} are submartingales.

Suppose that {F_n, n ≥ 0} is a given filtration. We say that {H_n, n ≥ 1} is a predictable sequence of random variables if, for each n ≥ 1, H_n is F_{n−1}-measurable. We define the martingale transform of a martingale {M_n, n ≥ 0} by a predictable sequence {H_n, n ≥ 1} as the sequence
(H · M)_n = M_0 + Σ_{j=1}^n H_j ∆M_j.

Proposition 31 If {M_n, n ≥ 0} is a (sub)martingale and {H_n, n ≥ 1} is a bounded (nonnegative) predictable sequence, then the martingale transform {(H · M)_n} is a (sub)martingale.

Proof. Clearly, for each n ≥ 0 the random variable (H · M)_n is F_n-measurable and integrable. On the other hand, for n ≥ 0 we have
E((H · M)_{n+1} − (H · M)_n|F_n) = E(H_{n+1}∆M_{n+1}|F_n) = H_{n+1}E(∆M_{n+1}|F_n) = 0.

We may think of H_n as the amount of money a gambler will bet at time n. Suppose that ∆M_n = M_n − M_{n−1} is the amount the gambler can win or lose at every step of the game if the bet is 1 Euro, and that M_0 is the initial capital of the gambler. Then M_n will be the fortune of the gambler at time n.

(H · M)_n will be the fortune of the gambler if he uses the gambling system {H_n}. The fact that {M_n} is a martingale means that the game is fair. So, the previous proposition tells us that if a game is fair, it is also fair regardless of the gambling system {H_n}.

A famous gambling system is defined by H_1 = 1 and, for n ≥ 2, H_n = 2H_{n−1} if ξ_{n−1} = −1 and H_n = 1 if ξ_{n−1} = 1, where M_n = M_0 + ξ_1 + · · · + ξ_n and {ξ_n, n ≥ 1} are independent random variables such that P(ξ_n = 1) = P(ξ_n = −1) = 1/2. In other words, we double our bet when we lose, so that if we lose k times and then win, our net winnings will be
−1 − 2 − 4 − · · · − 2^{k−1} + 2^k = 1.
This system seems to provide us with a “sure win”, but this is not true, due to the above proposition.

Example 4 Suppose that S_n^0, S_n^1, . . . , S_n^d are adapted and positive random variables which represent the prices at time n of d + 1 financial assets. We assume that S_n^0 = (1 + r)^n, where r > 0 is the interest rate, so the asset number 0 is non-risky. We denote by S_n = (S_n^0, S_n^1, . . . , S_n^d) the vector of prices at time n.

In this context, a portfolio is a family of predictable sequences {φ_n^i, n ≥ 1}, i = 0, . . . , d, such that φ_n^i represents the number of assets of type i held at time n. We set φ_n = (φ_n^0, φ_n^1, . . . , φ_n^d). The value of the portfolio at time n ≥ 1 is then
V_n = φ_n^0 S_n^0 + φ_n^1 S_n^1 + · · · + φ_n^d S_n^d = φ_n · S_n.

We say that a portfolio is self-financing if for all n ≥ 1
V_n = V_0 + Σ_{j=1}^n φ_j · ∆S_j,
where V_0 denotes the initial capital of the portfolio. This condition is equivalent to
φ_n · S_n = φ_{n+1} · S_n
for all n ≥ 1. Define the discounted prices by
S̃_n = (1 + r)^{−n} S_n = (1, (1 + r)^{−n} S_n^1, . . . , (1 + r)^{−n} S_n^d).
Then the discounted value of the portfolio is
Ṽ_n = (1 + r)^{−n} V_n = φ_n · S̃_n.
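A short seeded simulation of the doubling system shows both sides of the story: every episode indeed nets +1, but the bets needed along the way grow without bound, which is why the proposition is not contradicted:

```python
import random

random.seed(7)

def play_episode():
    """Bet 1, doubling after each loss, and stop at the first win.
    Returns (net winnings, largest bet placed)."""
    bet, net, max_bet = 1, 0, 1
    while True:
        max_bet = max(max_bet, bet)
        if random.random() < 0.5:      # win: collect the current bet and stop
            return net + bet, max_bet
        net -= bet                     # loss: double the bet and continue
        bet *= 2

episodes = [play_episode() for _ in range(10_000)]
nets = [e[0] for e in episodes]
worst_bet = max(e[1] for e in episodes)  # the capital the gambler must risk
```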

The self-financing condition can then be written as
φ_n · S̃_n = φ_{n+1} · S̃_n;
therefore
Ṽ_{n+1} − Ṽ_n = φ_{n+1} · (S̃_{n+1} − S̃_n),
and summing in n we obtain
Ṽ_n = V_0 + Σ_{j=1}^n φ_j · ∆S̃_j.
In particular, if d = 1, then Ṽ_n = (φ¹ · S̃¹)_n is the martingale transform of the sequence {S̃_n^1} by the predictable sequence {φ_n^1}. As a consequence, if {S̃_n^1} is a martingale and {φ_n^1} is a bounded sequence, then Ṽ_n will also be a martingale.

We say that a probability Q equivalent to P (that is, P(A) = 0 ⇔ Q(A) = 0) is a risk-free probability if in the probability space (Ω, F, Q) the sequence of discounted prices S̃_n is a martingale with respect to F_n. Then, the sequence of discounted values of any self-financing portfolio will also be a martingale with respect to F_n, provided the φ_n are bounded.

In the particular case of the binomial model (also called the Cox-Ross-Rubinstein model), we assume that the random variables
T_n = S_n/S_{n−1} = 1 + ∆S_n/S_{n−1},  n = 1, . . . , N,
are independent and take two different values 1 + a and 1 + b, with probabilities p and 1 − p, respectively, where a < r < b. In this example, the risk-free probability will be
p = (b − r)/(b − a).
In fact, for this choice of p,
E(T_n) = (1 + a)p + (1 + b)(1 − p) = 1 + r,
and therefore
E(S̃_n|F_{n−1}) = (1 + r)^{−n} E(S_n|F_{n−1}) = (1 + r)^{−n} E(T_n S_{n−1}|F_{n−1})
= (1 + r)^{−n} S_{n−1} E(T_n|F_{n−1}) = (1 + r)^{−1} S̃_{n−1} E(T_n) = S̃_{n−1}.
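With the risk-free probability p = (b − r)/(b − a) in hand, discounted expectation under Q prices any payoff; the sketch below (hypothetical parameters a = −0.1, b = 0.2, r = 0.05) checks the direct binomial expectation of a call payoff against backward induction on the price tree:

```python
from math import comb

# Hypothetical binomial-model parameters with a < r < b.
a, b, r = -0.1, 0.2, 0.05
S0, K, N = 100.0, 100.0, 5
p = (b - r) / (b - a)            # risk-free probability (here 0.5)

# Direct formula (1+r)^(-N) E_Q[(S_N - K)^+]; under Q the factor 1+a has
# probability p and the factor 1+b has probability 1-p.
v_direct = sum(comb(N, k) * (1 - p) ** k * p ** (N - k)
               * max(S0 * (1 + b) ** k * (1 + a) ** (N - k) - K, 0.0)
               for k in range(N + 1)) / (1 + r) ** N

# Backward induction on the recombining tree gives the same price.
values = [max(S0 * (1 + b) ** k * (1 + a) ** (N - k) - K, 0.0)
          for k in range(N + 1)]
for step in range(N, 0, -1):
    values = [((1 - p) * values[k + 1] + p * values[k]) / (1 + r)
              for k in range(step)]
v_backward = values[0]
```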

Consider a random variable H ≥ 0, F_N-measurable, which represents the payoff of a derivative at the maturity time N on the asset. For instance, for a European call option with strike price K, H = (S_N − K)^+. The derivative is replicable if there exists a self-financing portfolio such that V_N = H. We will take the value V_n of this portfolio as the price of the derivative at time n. In order to compute this price we make use of the martingale property of the sequence Ṽ_n, and we obtain the following general formula for derivative prices:
V_n = (1 + r)^{−(N−n)} E_Q(H|F_n).
In fact,
Ṽ_n = E_Q(Ṽ_N |F_n) = (1 + r)^{−N} E_Q(H|F_n),
and multiplying by (1 + r)^n gives the formula. In particular, if the σ-field F_0 is trivial,
V_0 = (1 + r)^{−N} E_Q(H).

Example 5 Suppose that T is a stopping time. Then the process H_n = 1_{{T≥n}} is predictable. In fact,
{T ≥ n} = {T ≤ n − 1}^c ∈ F_{n−1}.
The martingale transform of {M_n} by this sequence is
(H · M)_n = M_0 + Σ_{j=1}^n 1_{{T≥j}}(M_j − M_{j−1}) = M_0 + Σ_{j=1}^{T∧n}(M_j − M_{j−1}) = M_{T∧n}.
As a consequence, if {M_n} is a (sub)martingale, then the stopped process {M_{T∧n}} will be a (sub)martingale.

Theorem 32 (Optional Stopping Theorem) Suppose that {M_n} is a submartingale and S ≤ T ≤ m are two stopping times bounded by a fixed time m. Then
E(M_T |F_S) ≥ M_S,
with equality in the martingale case.

This theorem implies that E(M_T) ≥ E(M_S), with equality in the martingale case.

Proof. We make the proof only in the martingale case. Notice first that M_T is integrable because

|M_T| ≤ Σ_{n=0}^m |M_n|.

Consider the predictable process H_n = 1_{{S < n ≤ T} ∩ A}, where A ∈ F_S. Notice that {H_n} is predictable because

{S < n ≤ T} ∩ A = {T ≤ n − 1}^c ∩ ({S ≤ n − 1} ∩ A) ∈ F_{n−1}.

Moreover, the random variables H_n are nonnegative and bounded by one. Therefore, by Proposition 31, (H · M)_n is a martingale. We have

(H · M)_0 = M_0,
(H · M)_m = M_0 + 1_A (M_T − M_S).

The martingale property of (H · M)_n implies that E((H · M)_0) = E((H · M)_m). Hence,

E(1_A (M_T − M_S)) = 0

for all A ∈ F_S, and this implies that E(M_T | F_S) = M_S, because M_S is F_S-measurable.

Theorem 33 (Doob's Maximal Inequality) Suppose that {M_n} is a submartingale and λ > 0. Then

P( sup_{0≤n≤N} M_n ≥ λ ) ≤ (1/λ) E( M_N 1_{sup_{0≤n≤N} M_n ≥ λ} ).

Proof. Consider the stopping time

T = inf{n ≥ 0 : M_n ≥ λ} ∧ N.

Then, by the Optional Stopping Theorem,

E(M_N) ≥ E(M_T) = E(M_T 1_{sup_{0≤n≤N} M_n ≥ λ}) + E(M_T 1_{sup_{0≤n≤N} M_n < λ})
≥ λ P( sup_{0≤n≤N} M_n ≥ λ ) + E(M_N 1_{sup_{0≤n≤N} M_n < λ}).

As a consequence, if {M_n} is a martingale and p ≥ 1, applying Doob's maximal inequality to the submartingale {|M_n|^p} we obtain

P( sup_{0≤n≤N} |M_n| ≥ λ ) ≤ (1/λ^p) E(|M_N|^p),

which is a generalization of Chebyshev's inequality.

Theorem 34 (The Martingale Convergence Theorem) If {M_n} is a submartingale such that sup_n E(M_n^+) < ∞, then M_n → M a.s., where M is an integrable random variable.

As a consequence, any nonnegative martingale converges almost surely. However, the convergence may not be in the mean.

Example 6 Suppose that {ξ_n, n ≥ 1} are independent random variables with distribution N(0, σ²). Set M_0 = 1 and

M_n = exp( Σ_{j=1}^n ξ_j − nσ²/2 ),   n ≥ 1.

Then, {M_n} is a nonnegative martingale such that M_n → 0 a.s., by the strong law of large numbers, but E(M_n) = 1 for all n.

Example 7 (Branching processes) Suppose that {ξ_i^n, i ≥ 0} are nonnegative independent identically distributed random variables. Define a sequence {Z_n} by Z_0 = 1 and, for n ≥ 1,

Z_{n+1} = ξ_1^{n+1} + ··· + ξ_{Z_n}^{n+1}  if Z_n > 0,
Z_{n+1} = 0  if Z_n = 0.

The process {Z_n} is called a Galton-Watson process. The random variable Z_n is the number of people in the nth generation, and each member of a generation gives birth independently to an identically distributed number of

children. Set µ = E(ξ_i^n). The distribution p_k = P(ξ_i^n = k) is called the offspring distribution.

The process Z_n/µ^n is a martingale. In fact,

E(Z_{n+1} | F_n) = Σ_{k=1}^∞ E(Z_{n+1} 1_{Z_n = k} | F_n)
= Σ_{k=1}^∞ E( (ξ_1^{n+1} + ··· + ξ_k^{n+1}) 1_{Z_n = k} | F_n )
= Σ_{k=1}^∞ 1_{Z_n = k} E( ξ_1^{n+1} + ··· + ξ_k^{n+1} | F_n )
= Σ_{k=1}^∞ 1_{Z_n = k} E( ξ_1^{n+1} + ··· + ξ_k^{n+1} )
= Σ_{k=1}^∞ 1_{Z_n = k} kµ = µ Z_n.

This implies that E(Z_n) = µ^n. So, Z_n/µ^n is a nonnegative martingale, and it converges almost surely to a limit. This limit is zero if µ ≤ 1 and ξ_i^n is not identically one; actually, in this case Z_n = 0 for all n sufficiently large. This is intuitive: if each individual on the average gives birth to at most one child, the species dies out. If µ > 1, the limit of Z_n/µ^n has a chance of being nonzero. In this case ρ = P(Z_n = 0 for some n) < 1 is the unique solution in [0, 1) of ϕ(ρ) = ρ, where ϕ(s) = Σ_{k=0}^∞ p_k s^k is the generating function of the offspring distribution.

The following result establishes the convergence of the martingale in mean of order p in the case p > 1.

Theorem 35 If {M_n} is a martingale such that sup_n E(|M_n|^p) < ∞ for some p > 1, then M_n → M almost surely and in mean of order p. Moreover, M_n = E(M | F_n) for all n.

Example 8 Consider the symmetric random walk {S_n, n ≥ 0}. That is, S_0 = 0 and, for n ≥ 1, S_n = ξ_1 + ··· + ξ_n, where {ξ_n, n ≥ 1} are independent

random variables such that P(ξ_n = 1) = P(ξ_n = −1) = 1/2. Then {S_n} is a martingale. Fix b < 0 < a and set

T = inf{n ≥ 0 : S_n ∉ (b, a)}.

We are going to show that E(T) = |ab| = −ab. We know that P(T < ∞) = 1, because the random walk is recurrent. By the Optional Stopping Theorem, {S_{T∧n}} is a martingale, so

E(S_{T∧n}) = E(S_0) = 0.

This martingale converges almost surely and in mean of order one because it is uniformly bounded; hence

S_{T∧n} → S_T ∈ {b, a}  a.s.,

and E(S_T) = 0 = a P(S_T = a) + b P(S_T = b). From this equation, together with P(S_T = a) + P(S_T = b) = 1, we obtain the absorption probabilities:

P(S_T = a) = −b/(a − b),
P(S_T = b) = a/(a − b).

Now {S_n² − n} is also a martingale, and by the same argument this implies E(S_T²) = E(T). Hence,

E(T) = a² P(S_T = a) + b² P(S_T = b) = −a²b/(a − b) + ab²/(a − b) = −ab.

4.3 Continuous Time Martingales

Consider a nondecreasing family of σ-fields {F_t, t ≥ 0}. A real continuous time process M = {M_t, t ≥ 0} is called a martingale with respect to the σ-fields {F_t, t ≥ 0} if:

(i) For each t ≥ 0, M_t is F_t-measurable (that is, M is adapted to the filtration {F_t, t ≥ 0}).
(ii) For each t ≥ 0, E(|M_t|) < ∞.
(iii) For each s ≤ t, E(M_t | F_s) = M_s.

Property (iii) can also be written as follows:

E(M_t − M_s | F_s) = 0.
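The exit-time identities for the symmetric random walk above, E(T) = −ab and P(S_T = a) = −b/(a − b), can be checked by simulation; a minimal Monte Carlo sketch:

```python
import random

# Exit of the symmetric random walk from (b, a) with b < 0 < a.
random.seed(2)
a, b, M = 3, -2, 100_000
total_T = hits_a = 0
for _ in range(M):
    S = T = 0
    while b < S < a:
        S += 1 if random.random() < 0.5 else -1
        T += 1
    total_T += T
    hits_a += (S == a)
print(total_T / M)   # close to -a*b = 6
print(hits_a / M)    # close to -b/(a - b) = 0.4
```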

Notice that if the time runs in a finite interval [0, T], then we have

M_t = E(M_T | F_t),

and this implies that the terminal random variable M_T determines the martingale. As in the discrete time case, the expectation of a martingale is constant: E(M_t) = E(M_0). In a similar way we can introduce the notions of continuous time submartingale and supermartingale.

Most of the properties of discrete time martingales hold in continuous time. For instance, we have the following version of Doob's maximal inequality:

Proposition 36 Let {M_t, 0 ≤ t ≤ T} be a martingale with continuous trajectories. Then, for all p ≥ 1 and all λ > 0 we have

P( sup_{0≤t≤T} |M_t| > λ ) ≤ (1/λ^p) E(|M_T|^p).

This inequality allows us to estimate moments of sup_{0≤t≤T} |M_t|. For instance,

E( sup_{0≤t≤T} |M_t|² ) ≤ 4 E(|M_T|²).

Exercises

4.1 Let X and Y be two independent random variables such that P(X = 1) = P(Y = 1) = p and P(X = 0) = P(Y = 0) = 1 − p. Set Z = 1_{X+Y=0}. Compute E(X|Z) and E(Y|Z). Are these random variables still independent?

4.2 Let {Y_n}_{n≥1} be a sequence of independent random variables uniformly distributed in [−1, 1]. Set S_0 = 0 and S_n = Y_1 + ··· + Y_n if n ≥ 1. Check whether the following sequences are martingales:

M_n^{(1)} = Σ_{k=1}^n S_{k−1} Y_k,  M_0^{(1)} = 0,
M_n^{(2)} = S_n² − n/3,  M_0^{(2)} = 0.

4.3 Consider a sequence of independent random variables {X_n}_{n≥1} with laws N(0, σ²). Define

Y_n = exp( a Σ_{k=1}^n X_k − nσ² ),

where a is a real parameter and Y_0 = 1. For which values of a is the sequence Y_n a martingale?

4.4 Let Y_1, Y_2, ... be nonnegative independent and identically distributed random variables with E(Y_n) = 1. Show that X_0 = 1, X_n = Y_1 Y_2 ··· Y_n defines a martingale. Show that the almost sure limit of X_n is zero if P(Y_n = 1) < 1 (apply the strong law of large numbers to log Y_n).

4.5 Let S_n be the total assets of an insurance company at the end of the year n. In year n, premiums totaling c > 0 are received and claims ξ_n are paid, where ξ_n has the normal distribution N(µ, σ²) and µ < c. The company is ruined if assets drop to 0 or less. Show that

P(ruin) ≤ exp( −2(c − µ)S_0 / σ² ).

4.6 Let S_n be an asymmetric random walk with p > 1/2, and let T = inf{n : S_n = 1}. Show that S_n − (p − q)n is a martingale. Use this property to check that E(T) = 1/(2p − 1). Using the fact that (S_n − (p − q)n)² − σ²n is a martingale, where σ² = 1 − (p − q)², show that

Var(T) = (1 − (p − q)²) / (p − q)³.

5 Stochastic Calculus

5.1 Brownian motion

In 1827 Robert Brown observed the complex and erratic motion of grains of pollen suspended in a liquid. It was later discovered that such irregular motion comes from the extremely large number of collisions of the suspended pollen grains with the molecules of the liquid. In the 1920s, Norbert Wiener presented a mathematical model for this motion based on the theory of stochastic processes. The position of a particle at each time t ≥ 0 is a three dimensional random vector B_t.

The mathematical definition of a Brownian motion is the following:

Definition 37 A stochastic process {B_t, t ≥ 0} is called a Brownian motion if it satisfies the following conditions:

i) B_0 = 0.
ii) For all 0 ≤ t_1 < ··· < t_n, the increments B_{t_n} − B_{t_{n−1}}, ..., B_{t_2} − B_{t_1} are independent random variables.
iii) If 0 ≤ s < t, the increment B_t − B_s has the normal distribution N(0, t − s).
iv) The process {B_t} has continuous trajectories.

Remarks:

1) Brownian motion is a Gaussian process. In fact, the probability distribution of the random vector (B_{t_1}, ..., B_{t_n}), for 0 < t_1 < ··· < t_n, is normal, because this vector is a linear transformation of the vector (B_{t_1}, B_{t_2} − B_{t_1}, ..., B_{t_n} − B_{t_{n−1}}), which has a joint normal distribution because its components are independent and normal.

2) The mean and autocovariance functions of the Brownian motion are E(B_t) = 0 and, for s ≤ t,

E(B_s B_t) = E(B_s (B_t − B_s + B_s)) = E(B_s (B_t − B_s)) + E(B_s²) = s = min(s, t).

3) The autocovariance function Γ_X(s, t) = min(s, t) is nonnegative definite because it can be written as

min(s, t) = ∫_0^∞ 1_{[0,s]}(r) 1_{[0,t]}(r) dr,

so

Σ_{i,j=1}^n a_i a_j min(t_i, t_j) = Σ_{i,j=1}^n a_i a_j ∫_0^∞ 1_{[0,t_i]}(r) 1_{[0,t_j]}(r) dr = ∫_0^∞ ( Σ_{i=1}^n a_i 1_{[0,t_i]}(r) )² dr ≥ 0.

Therefore, by Kolmogorov's theorem there exists a Gaussian process with zero mean and covariance function Γ_X(s, t) = min(s, t). On the other hand, it is easy to show that a Gaussian process with zero mean and autocovariance function Γ_X(s, t) = min(s, t) satisfies the above conditions i), ii) and iii). Indeed, the increment B_t − B_s has the normal distribution N(0, t − s), and this implies that for any natural number k we have

E( (B_t − B_s)^{2k} ) = ( (2k)! / (2^k k!) ) (t − s)^k.  (19)

So, choosing k = 2, it is enough because

E( (B_t − B_s)^4 ) = 3 (t − s)²,

and Kolmogorov's continuity criterion allows us to choose a version of this process with continuous trajectories.

4) In the definition of the Brownian motion we have assumed that the probability space (Ω, F, P) is arbitrary. The mapping

Ω → C([0, ∞), R),  ω ↦ B_·(ω),

induces a probability measure P_B = P ∘ B^{−1}, called the Wiener measure, on the space of continuous functions C = C([0, ∞), R) equipped with its Borel σ-field B_C. Then we can take as canonical probability space for the Brownian motion the space (C, B_C, P_B). In this canonical space, the random variables are the evaluation maps: X_t(ω) = ω(t).
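The covariance identity E(B_s B_t) = min(s, t) is easy to check by simulation; a sketch, sampling the pair (B_s, B_t) from independent normal increments:

```python
import math
import random

# Monte Carlo check of the autocovariance E(B_s B_t) = min(s, t).
random.seed(9)
s, t, M = 0.4, 1.0, 200_000
acc = 0.0
for _ in range(M):
    Bs = random.gauss(0.0, math.sqrt(s))
    Bt = Bs + random.gauss(0.0, math.sqrt(t - s))  # independent increment
    acc += Bs * Bt
print(acc / M)   # close to min(s, t) = 0.4
```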

5.1.1 Regularity of the trajectories

From Kolmogorov's continuity criterion and using (19) we get that, for all ε > 0, there exists a random variable G_{ε,T} such that

|B_t − B_s| ≤ G_{ε,T} |t − s|^{1/2 − ε},

for all s, t ∈ [0, T]. That is, the trajectories of the Brownian motion are Hölder continuous of order 1/2 − ε for all ε > 0. Intuitively, this means that

∆B_t = B_{t+∆t} − B_t ≅ (∆t)^{1/2}.

This approximation is exact in mean: E((∆B_t)²) = ∆t.

5.1.2 Quadratic variation

Fix a time interval [0, t] and consider a subdivision π of this interval:

0 = t_0 < t_1 < ··· < t_n = t.

The norm of the subdivision π is defined by |π| = max_k ∆t_k, where ∆t_k = t_k − t_{k−1}. Set ∆B_k = B_{t_k} − B_{t_{k−1}}. Then, if t_j = jt/n, we have

Σ_{k=1}^n |∆B_k| ≅ n (t/n)^{1/2} → ∞,

whereas

Σ_{k=1}^n (∆B_k)² ≅ n (t/n) = t.

These properties can be formalized as follows. First, we will show that Σ_{k=1}^n (∆B_k)² converges in mean square to the length of the interval as the

norm of the subdivision tends to zero:

E[ ( Σ_{k=1}^n (∆B_k)² − t )² ] = E[ ( Σ_{k=1}^n ( (∆B_k)² − ∆t_k ) )² ]
= Σ_{k=1}^n E[ ( (∆B_k)² − ∆t_k )² ]
= Σ_{k=1}^n ( 3(∆t_k)² − 2(∆t_k)² + (∆t_k)² )
= 2 Σ_{k=1}^n (∆t_k)² ≤ 2 t |π| → 0  as |π| → 0,

where the cross terms vanish because the summands (∆B_k)² − ∆t_k are independent and centered.

On the other hand, the total variation, defined by

V = sup_π Σ_{k=1}^n |∆B_k|,

is infinite with probability one. In fact, using the continuity of the trajectories of the Brownian motion, we have

Σ_{k=1}^n (∆B_k)² ≤ sup_k |∆B_k| Σ_{k=1}^n |∆B_k| ≤ V sup_k |∆B_k| → 0  as |π| → 0  (20)

if V < ∞, which contradicts the fact that Σ_{k=1}^n (∆B_k)² converges in mean square to t as |π| → 0.

5.1.3 Self-similarity

For any a > 0, the process {a^{−1/2} B_{at}, t ≥ 0} is a Brownian motion. In fact, it is easy to check that this process satisfies properties (i) to (iv).
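The path properties of the two preceding subsections are easy to probe numerically; a sketch, sampling the increments ∆B_k as independent N(0, ∆t) variables:

```python
import math
import random

# For a fine subdivision, the quadratic variation sum (dB_k)^2 is close
# to t, while the total variation sum |dB_k| grows like sqrt(n).
random.seed(3)
t, n = 1.0, 100_000
dt = t / n
dB = [random.gauss(0.0, math.sqrt(dt)) for _ in range(n)]
qv = sum(x * x for x in dB)
tv = sum(abs(x) for x in dB)
print(qv)   # close to t = 1.0
print(tv)   # of order sqrt(2*n*t/pi), i.e. large
```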


5.1.4 Stochastic Processes Related to Brownian Motion

1.- Brownian bridge: Consider the process

X_t = B_t − t B_1,  t ∈ [0, 1].

It is a centered normal process with autocovariance function

E(X_t X_s) = min(s, t) − st,

which satisfies X_0 = 0, X_1 = 0.

2.- Brownian motion with drift: Consider the process

X_t = σ B_t + µt,  t ≥ 0,

where σ > 0 and µ ∈ R are constants. It is a Gaussian process with

E(X_t) = µt,  Γ_X(s, t) = σ² min(s, t).

3.- Geometric Brownian motion: It is the stochastic process proposed by Black, Scholes and Merton as a model for the prices of financial assets. By definition this process is given by

X_t = e^{σB_t + µt},  t ≥ 0,

where σ > 0 and µ ∈ R are constants. That is, this process is the exponential of a Brownian motion with drift. This process is not Gaussian, and the probability distribution of X_t is lognormal.

5.1.5 Simulation of the Brownian Motion

Brownian motion can be regarded as the limit of a symmetric random walk. Indeed, fix a time interval [0, T]. Consider n independent and identically distributed random variables ξ_1, ..., ξ_n with zero mean and variance T/n. Define the partial sums

R_k = ξ_1 + ··· + ξ_k,  k = 1, ..., n.

By the Central Limit Theorem, the sequence R_n converges in law, as n tends to infinity, to the normal distribution N(0, T). Consider the continuous stochastic process S_n(t) defined by linear interpolation from the values

S_n(kT/n) = R_k,  k = 0, ..., n.
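A minimal sketch of this random-walk construction of a Brownian path, using Gaussian summands for simplicity:

```python
import math
import random

# Partial sums R_k of i.i.d. variables with mean zero and variance T/n.
random.seed(4)
T, n = 1.0, 1000
R = [0.0]
for _ in range(n):
    R.append(R[-1] + random.gauss(0.0, math.sqrt(T / n)))
# R[k] approximates B(kT/n); in particular R[n] is approximately N(0, T).
print(R[0], R[-1])
```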

Then, a functional version of the Central Limit Theorem, known as Donsker Invariance Principle, says that the sequence of stochastic processes Sn (t) converges in law to the Brownian motion on [0, T ]. This means that for any continuous and bounded function ϕ : C([0, T ]) → R, we have E(ϕ(Sn )) → E(ϕ(B)), as n tends to inﬁnity. The trajectories of the Brownian motion can also be simulated by means of Fourier series with random coeﬃcients. Suppose that {en , n ≥ 0} is an orthonormal basis of the Hilbert space L2 ([0, T ]). Suppose that {Zn , n ≥ 0} are independent random variables with law N (0, 1). Then, the random series

Σ_{n=0}^∞ Z_n ∫_0^t e_n(s) ds

converges uniformly on [0, T], for almost all ω, to a Brownian motion {B_t, t ∈ [0, T]}, that is,

sup_{0≤t≤T} | Σ_{n=0}^N Z_n ∫_0^t e_n(s) ds − B_t | → 0  a.s.

as N → ∞. This convergence also holds in mean square. Notice that

E[ ( Σ_{n=0}^N Z_n ∫_0^t e_n(r) dr ) ( Σ_{n=0}^N Z_n ∫_0^s e_n(r) dr ) ]
= Σ_{n=0}^N ∫_0^t e_n(r) dr ∫_0^s e_n(r) dr
= Σ_{n=0}^N ⟨1_{[0,t]}, e_n⟩_{L²([0,T])} ⟨1_{[0,s]}, e_n⟩_{L²([0,T])}
→ ⟨1_{[0,t]}, 1_{[0,s]}⟩_{L²([0,T])} = s ∧ t

as N → ∞.
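A simulation sketch of such a random series, using the trigonometric basis e_0(t) = 1/√(2π), e_n(t) = cos(nt/2)/√π on [0, 2π] (which gives the Paley-Wiener representation discussed next); the truncation level M is an arbitrary illustrative choice:

```python
import math
import random

# Truncated random Fourier series approximating a Brownian path.
random.seed(5)
M = 200
Z = [random.gauss(0.0, 1.0) for _ in range(M + 1)]

def B(t):
    # Z_0 * t / sqrt(2*pi) + (2/sqrt(pi)) * sum Z_n * sin(n*t/2) / n
    s = Z[0] * t / math.sqrt(2 * math.pi)
    s += (2 / math.sqrt(math.pi)) * sum(
        Z[n] * math.sin(n * t / 2) / n for n in range(1, M + 1))
    return s

print(B(0.0))        # the series starts at 0
print(B(math.pi))    # one sample of the approximate path
```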

In particular, if we take the orthonormal basis of L²([0, 2π]) formed by the trigonometric functions e_0(t) = 1/√(2π) and e_n(t) = (1/√π) cos(nt/2) for n ≥ 1, we obtain the Paley-Wiener representation of Brownian motion on [0, 2π]:

B_t = Z_0 t/√(2π) + (2/√π) Σ_{n=1}^∞ Z_n sin(nt/2)/n,  t ∈ [0, 2π].

In order to use this formula to get a simulation of Brownian motion, we have to choose some number M of trigonometric functions and a number N of discretization points:

Z_0 t_j/√(2π) + (2/√π) Σ_{n=1}^M Z_n sin(nt_j/2)/n,

where t_j = 2πj/N, j = 0, 1, ..., N.

5.2 Martingales Related with Brownian Motion

Consider a Brownian motion {B_t, t ≥ 0} defined on a probability space (Ω, F, P). For any time t, we define the σ-field F_t generated by the random variables {B_s, s ≤ t} and the events in F of probability zero. That is, F_t is the smallest σ-field that contains the sets of the form

{B_s ∈ A} ∪ N,

where 0 ≤ s ≤ t, A is a Borel subset of R, and N ∈ F is such that P(N) = 0. Notice that F_s ⊂ F_t if s ≤ t; that is, {F_t, t ≥ 0} is a nondecreasing family of σ-fields. We say that {F_t, t ≥ 0} is a filtration in the probability space (Ω, F, P).

We say that a stochastic process {u_t, t ≥ 0} is adapted (to the filtration F_t) if for all t the random variable u_t is F_t-measurable.

The inclusion of the events of probability zero in each σ-field F_t has the following important consequences:

a) Any version of an adapted process is adapted.
b) The family of σ-fields is right-continuous: for all t ≥ 0, ∩_{s>t} F_s = F_t.

Example 1 Let B_t be a Brownian motion and F_t the filtration generated by B_t. Then, the processes

B_t,
B_t² − t,
exp(aB_t − a²t/2)

are martingales. In fact, B_t is a martingale because

E(B_t − B_s | F_s) = E(B_t − B_s) = 0.

For B_t² − t, using the properties of the conditional expectation, we can write, for s < t,

E(B_t² | F_s) = E((B_t − B_s + B_s)² | F_s)
= E((B_t − B_s)² | F_s) + 2E((B_t − B_s) B_s | F_s) + E(B_s² | F_s)
= E((B_t − B_s)²) + 2B_s E(B_t − B_s | F_s) + B_s²
= t − s + B_s².

Finally, for exp(aB_t − a²t/2) we have

E(e^{aB_t − a²t/2} | F_s) = e^{aB_s} E(e^{a(B_t − B_s) − a²t/2} | F_s)
= e^{aB_s} e^{−a²t/2} E(e^{a(B_t − B_s)})
= e^{aB_s} e^{−a²t/2} e^{a²(t−s)/2}
= e^{aB_s − a²s/2}.

As an application of the martingale property of this last process we will compute the probability distribution of the arrival time of the Brownian motion to some fixed level. Consider the stopping time

τ_a = inf{t ≥ 0 : B_t = a},

where a > 0. The process M_t = e^{λB_t − λ²t/2} is a martingale such that

E(M_t) = E(M_0) = 1.

By the Optional Stopping Theorem we obtain

E(M_{τ_a ∧ N}) = 1

for all N ≥ 1. Notice that

M_{τ_a ∧ N} = exp( λB_{τ_a ∧ N} − λ²(τ_a ∧ N)/2 ) ≤ e^{λa},

because B_{τ_a ∧ N} ≤ a. On the other hand,

lim_{N→∞} M_{τ_a ∧ N} = M_{τ_a}  if τ_a < ∞,
lim_{N→∞} M_{τ_a ∧ N} = 0  if τ_a = ∞,

and the dominated convergence theorem implies

E( 1_{τ_a < ∞} M_{τ_a} ) = 1,

that is,

E( 1_{τ_a < ∞} exp(−λ²τ_a/2) ) = e^{−λa}.

Letting λ ↓ 0 we obtain P(τ_a < ∞) = 1 and, consequently,

E( exp(−λ²τ_a/2) ) = e^{−λa}.

With the change of variables λ²/2 = α, we get

E( exp(−ατ_a) ) = e^{−√(2α) a}.  (21)

From this expression we can compute the distribution function of the random variable τ_a:

P(τ_a ≤ t) = ∫_0^t ( a e^{−a²/(2s)} / √(2πs³) ) ds.

On the other hand, the expectation of τ_a can be obtained by computing the derivative of (21) with respect to the variable α:

E( τ_a exp(−ατ_a) ) = ( a / √(2α) ) e^{−√(2α) a},

and letting α ↓ 0 we obtain E(τ_a) = +∞.
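The density of τ_a can be integrated numerically and compared with the closed form P(τ_a ≤ t) = erfc(a/√(2t)) given by the reflection principle (a standard fact, quoted here as an assumption rather than derived); a sketch:

```python
import math

# Midpoint-rule integration of the density of tau_a on (0, t]:
#   P(tau_a <= t) = integral of a*exp(-a^2/(2s)) / sqrt(2*pi*s^3) ds.
a, t, n = 1.0, 2.0, 100_000
h = t / n
integral = sum(
    a * math.exp(-a * a / (2 * s)) / math.sqrt(2 * math.pi * s ** 3) * h
    for s in (h * (k + 0.5) for k in range(n)))
print(integral, math.erfc(a / math.sqrt(2 * t)))  # should agree closely
```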

5.3 Stochastic Integrals

We want to define stochastic integrals of the form ∫_0^T u_t dB_t. One possibility is to interpret this integral as a path-wise Riemann-Stieltjes integral. That is, we consider a sequence of partitions of the interval [0, T],

τ_n : 0 = t_0^n < t_1^n < ··· < t_{k_n}^n = T,

and intermediate points

σ_n : t_i^n ≤ s_i^n ≤ t_{i+1}^n,  i = 0, ..., k_n − 1,

such that sup_i (t_{i+1}^n − t_i^n) → 0 as n → ∞. Given two functions f and g on the interval [0, T], the Riemann-Stieltjes integral ∫_0^T f dg is defined as

lim_{n→∞} Σ_{i=1}^{k_n} f(s_{i−1}^n) ∆g_i,

where ∆g_i = g(t_i^n) − g(t_{i−1}^n), provided this limit exists and is independent of the sequences τ_n and σ_n.

The Riemann-Stieltjes integral ∫_0^T f dg exists if f is continuous and g has bounded variation, that is,

sup_τ Σ_i |∆g_i| < ∞.

In particular, if g is continuously differentiable, ∫_0^T f dg = ∫_0^T f(t) g′(t) dt.

We know that the trajectories of Brownian motion have infinite variation on any finite interval. So, we cannot use this result to define the path-wise Riemann-Stieltjes integral ∫_0^T u_t(ω) dB_t(ω) for a continuous process u. Note, however, that if u has continuously differentiable trajectories, then the path-wise Riemann-Stieltjes integral ∫_0^T u_t(ω) dB_t(ω) exists and it can be computed integrating by parts:

∫_0^T u_t dB_t = u_T B_T − ∫_0^T u_t′ B_t dt.

We are going to construct the integral ∫_0^T u_t dB_t by means of a global probabilistic approach. Denote by L²_{a,T} the space of stochastic processes u = {u_t, t ∈ [0, T]} such that:

a) u is adapted and measurable (the mapping (s, ω) ↦ u_s(ω) is measurable on the product space [0, T] × Ω with respect to the product σ-field B_{[0,T]} × F).

b) E( ∫_0^T u_t² dt ) < ∞.

Under condition a) it can be proved that there is a version of u which is progressively measurable. This condition means that, for all t ∈ [0, T], the mapping (s, ω) ↦ u_s(ω) on [0, t] × Ω is measurable with respect to the product σ-field B_{[0,t]} × F_t. This condition is slightly stronger than being adapted and measurable, and it is needed to guarantee that random variables of the form ∫_0^t u_s ds are F_t-measurable.

Condition b) means that the moment of second order of the process is integrable on the time interval [0, T]. In fact, by Fubini's theorem we deduce

E( ∫_0^T u_t² dt ) = ∫_0^T E(u_t²) dt.

Also, condition b) means that the process u, as a function of the two variables (t, ω), belongs to the Hilbert space L²([0, T] × Ω).

We will define the stochastic integral ∫_0^T u_t dB_t for processes u in L²_{a,T} as the limit in mean square of the integrals of simple processes. By definition, a process u in L²_{a,T} is a simple process if it is of the form

u_t = Σ_{j=1}^n φ_j 1_{(t_{j−1}, t_j]}(t),  (22)

where 0 = t_0 < t_1 < ··· < t_n = T and the φ_j are square integrable F_{t_{j−1}}-measurable random variables.

Given a simple process u of the form (22) we define the stochastic integral of u with respect to the Brownian motion B as

∫_0^T u_t dB_t = Σ_{j=1}^n φ_j ( B_{t_j} − B_{t_{j−1}} ).  (23)

The stochastic integral defined on the space E of simple processes possesses the following isometry property:

E[ ( ∫_0^T u_t dB_t )² ] = E( ∫_0^T u_t² dt ).  (24)

Proof of the isometry property: Set ∆B_j = B_{t_j} − B_{t_{j−1}}. Then

E( φ_i φ_j ∆B_i ∆B_j ) = 0  if i ≠ j,
E( φ_i φ_j ∆B_i ∆B_j ) = E(φ_j²) (t_j − t_{j−1})  if i = j,

because if i < j the random variables φ_i φ_j ∆B_i and ∆B_j are independent, and if i = j the random variables φ_i² and (∆B_i)² are independent. So, we obtain

E[ ( ∫_0^T u_t dB_t )² ] = Σ_{i,j=1}^n E( φ_i φ_j ∆B_i ∆B_j ) = Σ_{i=1}^n E(φ_i²)(t_i − t_{i−1}) = E( ∫_0^T u_t² dt ).

The extension of the stochastic integral to processes in the class L²_{a,T} is based on the following approximation result:

Lemma 38 If u is a process in L²_{a,T}, then there exists a sequence of simple processes u^{(n)} such that

lim_{n→∞} E( ∫_0^T |u_t − u_t^{(n)}|² dt ) = 0.  (25)

Proof: The proof of this Lemma will be done in two steps:

1. Suppose first that the process u is continuous in mean square. In this case, we can choose the approximating sequence

u_t^{(n)} = Σ_{j=1}^n u_{t_{j−1}} 1_{(t_{j−1}, t_j]}(t),

where t_j = jT/n. The continuity in mean square of u implies that

E( ∫_0^T |u_t − u_t^{(n)}|² dt ) ≤ T sup_{|t−s|≤T/n} E(|u_t − u_s|²),

which converges to zero as n tends to infinity.

2. Suppose now that u is an arbitrary process in the class L²_{a,T}. Then, we need to show that there exists a sequence v^{(n)} of processes in L²_{a,T}, continuous in mean square and such that

lim_{n→∞} E( ∫_0^T |u_t − v_t^{(n)}|² dt ) = 0.  (26)

In order to find this sequence we set

v_t^{(n)} = n ∫_{t−1/n}^t u_s ds = n ( ∫_0^t u_s ds − ∫_0^t u_{s−1/n} ds ),

with the convention u_s = 0 if s < 0. These processes are continuous in mean square (actually, they have continuous trajectories) and they belong to the class L²_{a,T}. Furthermore, (26) holds because for each (t, ω) we have

∫_0^T |u(t, ω) − v^{(n)}(t, ω)|² dt → 0  as n → ∞

(this is a consequence of the fact that v_t^{(n)} is defined from u_t by means of a convolution with the kernel n 1_{[−1/n, 0]}), and we can apply the dominated convergence theorem on the product space [0, T] × Ω because

∫_0^T |v^{(n)}(t, ω)|² dt ≤ ∫_0^T |u(t, ω)|² dt.

Definition 39 The stochastic integral of a process u in L²_{a,T} is defined as the following limit in mean square:

∫_0^T u_t dB_t = lim_{n→∞} ∫_0^T u_t^{(n)} dB_t,  (27)

where u^{(n)} is an approximating sequence of simple processes that satisfy (25).

Notice that the limit (27) exists because the sequence of random variables ∫_0^T u_t^{(n)} dB_t is Cauchy in the space L²(Ω), due to the isometry property (24):

E[ ( ∫_0^T u_t^{(n)} dB_t − ∫_0^T u_t^{(m)} dB_t )² ] = E( ∫_0^T |u_t^{(n)} − u_t^{(m)}|² dt )
≤ 2 E( ∫_0^T |u_t^{(n)} − u_t|² dt ) + 2 E( ∫_0^T |u_t − u_t^{(m)}|² dt ) → 0

as n, m → ∞. On the other hand, the limit (27) does not depend on the approximating sequence u^{(n)}.

Properties of the stochastic integral:

1.- Isometry:

E[ ( ∫_0^T u_t dB_t )² ] = E( ∫_0^T u_t² dt ).

2.- Mean zero:

E( ∫_0^T u_t dB_t ) = 0.

3.- Linearity:

∫_0^T (a u_t + b v_t) dB_t = a ∫_0^T u_t dB_t + b ∫_0^T v_t dB_t.

4.- Local property:

∫_0^T u_t dB_t = 0 almost surely on the set G = { ∫_0^T u_t² dt = 0 }.
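The isometry property can be illustrated by a Monte Carlo sketch for a simple process whose values φ_j are deterministic constants (hence trivially F_{t_{j−1}}-measurable); the concrete numbers are arbitrary:

```python
import math
import random

# Monte Carlo check of E[(int u dB)^2] = E[int u^2 dt] for a simple process.
random.seed(6)
phi = [1.0, -2.0, 0.5]          # values on (t_{j-1}, t_j]
t = [0.0, 0.3, 0.7, 1.0]
M = 200_000
acc = 0.0
for _ in range(M):
    integral = sum(
        phi[j] * random.gauss(0.0, math.sqrt(t[j + 1] - t[j]))
        for j in range(len(phi)))
    acc += integral ** 2
lhs = acc / M                                        # E[(int u dB)^2]
rhs = sum(phi[j] ** 2 * (t[j + 1] - t[j]) for j in range(len(phi)))
print(lhs, rhs)   # both close to 1.975
```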

The local property holds because on the set G the approximating sequence

u_t^{(n,m)} = Σ_{j=1}^n ( m ∫_{(j−1)T/n − 1/m}^{(j−1)T/n} u_s ds ) 1_{((j−1)T/n, jT/n]}(t)

vanishes.

Example 1

∫_0^T B_t dB_t = (1/2) B_T² − (1/2) T.  (28)

The process B_t being continuous in mean square, we can choose as approximating sequence

u_t^{(n)} = Σ_{j=1}^n B_{t_{j−1}} 1_{(t_{j−1}, t_j]}(t),

where t_j = jT/n, and we obtain

∫_0^T B_t dB_t = lim_{n→∞} Σ_{j=1}^n B_{t_{j−1}} ( B_{t_j} − B_{t_{j−1}} )
= (1/2) lim_{n→∞} Σ_{j=1}^n ( B_{t_j}² − B_{t_{j−1}}² ) − (1/2) lim_{n→∞} Σ_{j=1}^n ( B_{t_j} − B_{t_{j−1}} )²
= (1/2) B_T² − (1/2) T.

If x_t is a continuously differentiable function such that x_0 = 0, we know that

∫_0^T x_t dx_t = ∫_0^T x_t x_t′ dt = (1/2) x_T².

Notice that in the case of Brownian motion an additional term appears: −T/2.

Example 2 Consider a deterministic function g such that ∫_0^T g(s)² ds < ∞. The stochastic integral ∫_0^T g_s dB_s is a normal random variable with law N(0, ∫_0^T g(s)² ds).
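The Riemann-sum computation in Example 1 can be reproduced numerically; a sketch:

```python
import math
import random

# The sums B_{t_{j-1}} (B_{t_j} - B_{t_{j-1}}) approach B_T^2/2 - T/2.
random.seed(7)
T, n = 1.0, 200_000
dt = T / n
B = riemann = 0.0
for _ in range(n):
    dB = random.gauss(0.0, math.sqrt(dt))
    riemann += B * dB        # left-endpoint (Ito) evaluation
    B += dB
print(riemann, B * B / 2 - T / 2)   # should be close
```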

5.4 Indefinite Stochastic Integrals

Consider a stochastic process u in the space L²_{a,T}. Then, for any t ∈ [0, T], the process u 1_{[0,t]} also belongs to L²_{a,T}, and we can define its stochastic integral:

∫_0^t u_s dB_s := ∫_0^T u_s 1_{[0,t]}(s) dB_s.

In this way we have constructed a new stochastic process

{ ∫_0^t u_s dB_s, 0 ≤ t ≤ T },

which is the indefinite integral of u with respect to B.

Properties of the indefinite integrals:

1. Additivity: For any a ≤ b ≤ c we have

∫_a^b u_s dB_s + ∫_b^c u_s dB_s = ∫_a^c u_s dB_s.

2. Factorization: If a < b and A is an event of the σ-field F_a, then

∫_a^b 1_A u_s dB_s = 1_A ∫_a^b u_s dB_s.

Actually, this property holds replacing 1_A by any bounded and F_a-measurable random variable.

3. Martingale property: The indefinite stochastic integral M_t = ∫_0^t u_s dB_s of a process u ∈ L²_{a,T} is a martingale with respect to the filtration F_t.

Proof of the martingale property: Consider a sequence u^{(n)} of simple processes such that

lim_{n→∞} E( ∫_0^T |u_t − u_t^{(n)}|² dt ) = 0.

Set M_n(t) = ∫_0^t u_s^{(n)} dB_s. If φ_j is the value of the simple stochastic process u^{(n)} on each interval (t_{j−1}, t_j], j = 1, ..., n, and s ≤ t_k ≤

t_{m−1} ≤ t, we have

E( M_n(t) − M_n(s) | F_s )
= E( φ_k (B_{t_k} − B_s) + Σ_{j=k+1}^{m−1} φ_j ∆B_j + φ_m (B_t − B_{t_{m−1}}) | F_s )
= E( φ_k (B_{t_k} − B_s) | F_s ) + Σ_{j=k+1}^{m−1} E( E( φ_j ∆B_j | F_{t_{j−1}} ) | F_s ) + E( E( φ_m (B_t − B_{t_{m−1}}) | F_{t_{m−1}} ) | F_s )
= φ_k E( B_{t_k} − B_s | F_s ) + Σ_{j=k+1}^{m−1} E( φ_j E( ∆B_j | F_{t_{j−1}} ) | F_s ) + E( φ_m E( B_t − B_{t_{m−1}} | F_{t_{m−1}} ) | F_s )
= 0.

Then, the process M_n is a martingale with continuous trajectories. Finally, the result follows from the fact that the convergence in mean square M_n(t) → M_t implies the convergence in mean square of the conditional expectations.

4. Continuity: Suppose that u belongs to the space L²_{a,T}. Then, the stochastic integral M_t = ∫_0^t u_s dB_s has a version with continuous trajectories.

Proof of continuity: With the same notation as above, the process M_n is a martingale with continuous trajectories. Doob's maximal inequality applied to the continuous martingale M_n − M_m with p = 2 yields

P( sup_{0≤t≤T} |M_n(t) − M_m(t)| > λ ) ≤ (1/λ²) E( |M_n(T) − M_m(T)|² )
= (1/λ²) E( ∫_0^T |u_t^{(n)} − u_t^{(m)}|² dt ) → 0  as n, m → ∞.

We can choose an increasing sequence of natural numbers n_k, k = 1, 2, ..., such that

P( sup_{0≤t≤T} |M_{n_{k+1}}(t) − M_{n_k}(t)| > 2^{−k} ) ≤ 2^{−k}.

The events A_k := { sup_{0≤t≤T} |M_{n_{k+1}}(t) − M_{n_k}(t)| > 2^{−k} } verify

Σ_{k=1}^∞ P(A_k) < ∞.

Hence, the Borel-Cantelli lemma implies that P(lim sup_k A_k) = 0, or P(lim inf_k A_k^c) = 1. That means, there exists a set N of probability zero such that for all ω ∉ N there exists k_1(ω) such that, for all k ≥ k_1(ω),

sup_{0≤t≤T} |M_{n_{k+1}}(t, ω) − M_{n_k}(t, ω)| ≤ 2^{−k}.

As a consequence, if ω ∉ N, the sequence M_{n_k}(t, ω) is uniformly convergent on [0, T] to a continuous function J_t(ω). On the other hand, we know that for any t ∈ [0, T], M_{n_k}(t) converges in mean square to ∫_0^t u_s dB_s. So, J_t(ω) = ∫_0^t u_s dB_s almost surely, for all t ∈ [0, T], and we have proved that the indefinite stochastic integral possesses a continuous version.

5. Maximal inequality for the indefinite integral M_t = ∫_0^t u_s dB_s of a process u ∈ L²_{a,T}: For all λ > 0,

P( sup_{0≤t≤T} |M_t| > λ ) ≤ (1/λ²) E( ∫_0^T u_t² dt ).

6. Stochastic integration up to a stopping time: If u belongs to the space L²_{a,T} and τ is a stopping time bounded by T, then the process u 1_{[0,τ]} also belongs to L²_{a,T} and we have

∫_0^T u_t 1_{[0,τ]}(t) dB_t = ∫_0^τ u_t dB_t.  (29)

Proof of (29): The proof will be done in two steps:

(a) Suppose first that the process u has the form u_t = F 1_{(a,b]}(t), where 0 ≤ a < b ≤ T and F ∈ L²(Ω, F_a, P). The stopping times

τ_n = Σ_{i=1}^{2^n} t_i^n 1_{A_i^n},  where t_i^n = iT/2^n and A_i^n = { (i−1)T/2^n ≤ τ < iT/2^n },

form a nonincreasing sequence which converges to τ. For any n we have

∫_0^{τ_n} u_t dB_t = F ( B_{b∧τ_n} − B_{a∧τ_n} )

and

∫_0^T u_t 1_{[0,τ_n]}(t) dB_t = Σ_{i=1}^{2^n} 1_{A_i^n} ∫_0^T 1_{(a∧t_i^n, b∧t_i^n]}(t) F dB_t
= Σ_{i=1}^{2^n} 1_{A_i^n} F ( B_{b∧t_i^n} − B_{a∧t_i^n} )
= F ( B_{b∧τ_n} − B_{a∧τ_n} ).

Taking the limit as n tends to infinity we deduce the equality in the case of a simple process.

(b) In the general case, it suffices to approximate the process u by simple processes. The convergence of the right-hand side of (29) follows from Doob's maximal inequality.

The martingale M_t = ∫_0^t u_s dB_s has a nonzero quadratic variation, like the Brownian motion:

Proposition 40 (Quadratic variation) Let u be a process in L²_{a,T}. Then

Σ_{j=1}^n ( ∫_{t_{j−1}}^{t_j} u_s dB_s )² → ∫_0^t u_s² ds

in L¹(Ω) as n tends to infinity, where t_j = jt/n.

5.5 Extensions of the Stochastic Integral

Itô's stochastic integral ∫_0^T u_s dB_s can be defined for classes of processes larger than L²_{a,T}.

A) First, we can replace the filtration F_t by a larger one H_t such that the Brownian motion B_t is a martingale with respect to H_t. In fact, we only need the property E(B_t − B_s | H_s) = 0.

Notice that this martingale property also implies that E((B_t − B_s)² | H_s) = t − s, because

E((B_t − B_s)² | H_s) = E( 2 ∫_s^t B_r dB_r + t − s | H_s )
= 2 lim_n E( Σ_{i=1}^n B_{t_{i−1}} ( B_{t_i} − B_{t_{i−1}} ) | H_s ) + t − s
= t − s.

Example 3 Let {B_t, t ≥ 0} be a d-dimensional Brownian motion. That is, the components {B_k(t), t ≥ 0}, k = 1, ..., d, are independent Brownian motions. Denote by F_t^{(d)} the filtration generated by B_t and the sets of probability zero. Then, each component B_k(t) is a martingale with respect to F_t^{(d)}, but F_t^{(d)} is not the filtration generated by B_k(t). The above extension allows us to define stochastic integrals of the form

∫_0^T B_2(s) dB_1(s),
∫_0^T sin( B_1²(s) + B_2²(s) ) dB_2(s).

B) The second extension consists in replacing property b) by the weaker assumption:

b') P( ∫_0^T u_t² dt < ∞ ) = 1.

We denote by L_{a,T} the space of processes that verify properties a) and b'). The stochastic integral is extended to the space L_{a,T} by means of a localization argument. Suppose that u belongs to L_{a,T}. For each n ≥ 1 we define the stopping time

τ_n = inf{ t ≥ 0 : ∫_0^t u_s² ds = n },  (30)

where, by convention, τ_n = T if ∫_0^T u_s² ds < n. In this way we obtain a nondecreasing sequence of stopping times such that τ_n ↑ T. Furthermore,

t < τ_n ⟺ ∫_0^t u_s² ds < n.

Set u_t^{(n)} = u_t 1_{[0,τ_n]}(t). The process u^{(n)} = {u_t^{(n)}, 0 ≤ t ≤ T} belongs to L²_{a,T} since E( ∫_0^T (u_t^{(n)})² dt ) ≤ n. If n ≤ m, on the set {t ≤ τ_n} we have

∫_0^t u_s^{(n)} dB_s = ∫_0^t u_s^{(m)} dB_s,

because by (29) we can write

∫_0^t u_s^{(n)} dB_s = ∫_0^t u_s^{(m)} 1_{[0,τ_n]}(s) dB_s = ∫_0^{t∧τ_n} u_s^{(m)} dB_s.

As a consequence, there exists an adapted and continuous process denoted by ∫_0^t u_s dB_s such that, for any n ≥ 1,

∫_0^t u_s^{(n)} dB_s = ∫_0^t u_s dB_s  if t ≤ τ_n.

The stochastic integral of processes in the space L_{a,T} is linear and has continuous trajectories. However, it may have infinite expectation and variance. Instead of the isometry property, there is a continuity property in probability, as it follows from the next proposition:

Proposition 41 Suppose that u ∈ L_{a,T}. For all K, δ > 0 we have

P( |∫_0^T u_s dB_s| ≥ K ) ≤ P( ∫_0^T u_s² ds ≥ δ ) + δ/K².

Proof. Consider the stopping time defined by

τ = inf{ t ≥ 0 : ∫_0^t u_s² ds = δ },

with the convention that τ = T if ∫_0^T u_s² ds < δ. We have

P( |∫_0^T u_s dB_s| ≥ K ) ≤ P( ∫_0^T u_s² ds ≥ δ ) + P( |∫_0^T u_s dB_s| ≥ K, ∫_0^T u_s² ds < δ ),

and, on the other hand,

P( |∫_0^T u_s dB_s| ≥ K, ∫_0^T u_s² ds < δ ) = P( |∫_0^τ u_s dB_s| ≥ K, τ = T )
≤ (1/K²) E[ ( ∫_0^τ u_s dB_s )² ]
= (1/K²) E( ∫_0^τ u_s² ds ) ≤ δ/K².

As a consequence of the above proposition, if u^{(n)} is a sequence of processes in the space L_{a,T} which converges to u ∈ L_{a,T} in probability, that is,

P( ∫_0^T |u_s^{(n)} − u_s|² ds > ε ) → 0  for all ε > 0,

then

∫_0^T u_s^{(n)} dB_s → ∫_0^T u_s dB_s  in probability.

5.6 Itô's Formula

Itô's formula is the stochastic version of the chain rule of the ordinary differential calculus. Consider the following example:

∫_0^t B_s dB_s = (1/2) B_t² − (1/2) t,

which can be written as

B_t² = ∫_0^t 2B_s dB_s + t,

In differential notation, $d(B_t^2) = 2B_t\,dB_t + dt$. Formally, this follows from a Taylor development of $B_t^2$ as a function of $B_t$, with the convention $(dB_t)^2 = dt$. The process $B_t^2$ can be expressed as the sum of an indefinite stochastic integral $\int_0^t 2B_s\,dB_s$ plus a differentiable function. More generally, we will see that any process of the form $f(B_t)$, where $f$ is twice continuously differentiable, can be expressed as the sum of an indefinite stochastic integral plus a process with differentiable trajectories. This leads to the definition of Itô processes. Denote by $L^1_{a,T}$ the space of processes $v$ which satisfy properties a) and

b'') $P\left(\int_0^T |v_t|\,dt < \infty\right) = 1$.

Definition 42 A continuous and adapted stochastic process $\{X_t, 0 \leq t \leq T\}$ is called an Itô process if it can be expressed in the form
$$X_t = X_0 + \int_0^t u_s\,dB_s + \int_0^t v_s\,ds, \tag{31}$$
where $u$ belongs to the space $L_{a,T}$ and $v$ belongs to the space $L^1_{a,T}$. In differential notation we will write $dX_t = u_t\,dB_t + v_t\,dt$.

Theorem 43 (Itô's formula) Suppose that $X$ is an Itô process of the form (31). Let $f(t,x)$ be a function twice differentiable with respect to the variable $x$ and once differentiable with respect to $t$, with continuous partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial^2 f}{\partial x^2}$ and $\frac{\partial f}{\partial t}$ (we say that $f$ is of class $C^{1,2}$). Then, the process $Y_t = f(t, X_t)$ is again an Itô process with the representation
$$Y_t = f(0, X_0) + \int_0^t \frac{\partial f}{\partial x}(s, X_s)\,u_s\,dB_s + \int_0^t \frac{\partial f}{\partial t}(s, X_s)\,ds + \int_0^t \frac{\partial f}{\partial x}(s, X_s)\,v_s\,ds + \frac{1}{2}\int_0^t \frac{\partial^2 f}{\partial x^2}(s, X_s)\,u_s^2\,ds.$$

1. In differential notation Itô's formula can be written as
$$df(t, X_t) = \frac{\partial f}{\partial t}(t, X_t)\,dt + \frac{\partial f}{\partial x}(t, X_t)\,dX_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(t, X_t)\,(dX_t)^2,$$
where $(dX_t)^2$ is computed using the product rule
$$dB_t \cdot dB_t = dt, \qquad dB_t \cdot dt = 0, \qquad dt \cdot dt = 0.$$

2. The process $Y_t$ is an Itô process with the representation
$$Y_t = Y_0 + \int_0^t \tilde u_s\,dB_s + \int_0^t \tilde v_s\,ds,$$
where
$$Y_0 = f(0, X_0), \qquad \tilde u_t = \frac{\partial f}{\partial x}(t, X_t)\,u_t, \qquad \tilde v_t = \frac{\partial f}{\partial t}(t, X_t) + \frac{\partial f}{\partial x}(t, X_t)\,v_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(t, X_t)\,u_t^2.$$
Notice that $\tilde u_t \in L_{a,T}$ and $\tilde v_t \in L^1_{a,T}$ due to the continuity of $X$.

3. In the particular case $u_t = 1$, $v_t = 0$, $X_0 = 0$, the process $X_t$ is the Brownian motion $B_t$, and Itô's formula has the following simple version:
$$f(t, B_t) = f(0, 0) + \int_0^t \frac{\partial f}{\partial x}(s, B_s)\,dB_s + \int_0^t \frac{\partial f}{\partial t}(s, B_s)\,ds + \frac{1}{2}\int_0^t \frac{\partial^2 f}{\partial x^2}(s, B_s)\,ds.$$

4. In the particular case where $f$ does not depend on the time we obtain the formula
$$f(X_t) = f(X_0) + \int_0^t f'(X_s)\,u_s\,dB_s + \int_0^t f'(X_s)\,v_s\,ds + \frac{1}{2}\int_0^t f''(X_s)\,u_s^2\,ds.$$

Itô's formula follows from a Taylor development up to the second order. We will explain the heuristic ideas that lead to Itô's formula using such a development. Suppose $v = 0$. Fix $t > 0$ and consider the times $t_j = \frac{jt}{n}$. Taylor's formula up to the second order gives
$$f(X_t) - f(0) = \sum_{j=1}^n \big(f(X_{t_j}) - f(X_{t_{j-1}})\big) = \sum_{j=1}^n f'(X_{t_{j-1}})\,\Delta X_j + \frac{1}{2}\sum_{j=1}^n f''(\overline X_j)\,(\Delta X_j)^2, \tag{32}$$
where $\Delta X_j = X_{t_j} - X_{t_{j-1}}$ and $\overline X_j$ is an intermediate value between $X_{t_{j-1}}$ and $X_{t_j}$. The first summand in the above expression converges in probability to $\int_0^t f'(X_s)\,u_s\,dB_s$, whereas the second summand converges in probability to $\frac{1}{2}\int_0^t f''(X_s)\,u_s^2\,ds$.

Some examples of application of Itô's formula:

Example 4 If $f(x) = x^2$ and $X_t = B_t$, we obtain
$$B_t^2 = 2\int_0^t B_s\,dB_s + t,$$
because $f'(x) = 2x$ and $f''(x) = 2$.

Example 5 If $f(x) = x^3$ and $X_t = B_t$, we obtain
$$B_t^3 = 3\int_0^t B_s^2\,dB_s + 3\int_0^t B_s\,ds,$$
because $f'(x) = 3x^2$ and $f''(x) = 6x$. More generally, if $n \geq 2$ is a natural number,
$$B_t^n = n\int_0^t B_s^{n-1}\,dB_s + \frac{n(n-1)}{2}\int_0^t B_s^{n-2}\,ds.$$

Example 6 If $f(t,x) = e^{ax - \frac{a^2}{2}t}$ and $X_t = B_t$, then for $Y_t = e^{aB_t - \frac{a^2}{2}t}$ we obtain
$$Y_t = 1 + a\int_0^t Y_s\,dB_s,$$
because
$$\frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial x^2} = 0. \tag{33}$$
This important example leads to the following remarks:

1. If a function $f(t,x)$ of class $C^{1,2}$ satisfies the equality (33), then the stochastic process $f(t, B_t)$ will be an indefinite integral plus a constant term. Therefore, $f(t, B_t)$ will be a martingale provided $f$ satisfies
$$E\int_0^t \Big(\frac{\partial f}{\partial x}(s, B_s)\Big)^2 ds < \infty \quad \text{for all } t \geq 0.$$

2. The solution of the stochastic differential equation $dY_t = aY_t\,dB_t$ is not $Y_t = e^{aB_t}$, but $Y_t = e^{aB_t - \frac{a^2}{2}t}$.

Example 7 Suppose that $f(t)$ is a continuously differentiable function on $[0,T]$. Itô's formula applied to the function $f(t)x$ yields
$$f(t)B_t = \int_0^t f(s)\,dB_s + \int_0^t B_s f'(s)\,ds,$$
and we obtain the integration by parts formula
$$\int_0^t f(s)\,dB_s = f(t)B_t - \int_0^t B_s f'(s)\,ds.$$
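The martingale property of Example 6 can be illustrated numerically (a sketch; the values of $a$, $t$ and the sample size are illustrative, and NumPy is assumed): since $Y_t = e^{aB_t - a^2 t/2}$ is a martingale started at 1, its expectation stays equal to 1, whereas $E(e^{aB_t}) = e^{a^2 t/2} > 1$.

```python
# Monte Carlo check that Y_t = exp(a B_t - a^2 t / 2) has expectation 1.
import numpy as np

rng = np.random.default_rng(1)
a, t, n_paths = 0.8, 2.0, 400_000
B_t = rng.normal(0.0, np.sqrt(t), size=n_paths)   # B_t ~ N(0, t)
Y_t = np.exp(a * B_t - 0.5 * a**2 * t)

print(Y_t.mean())   # close to 1, while np.exp(a * B_t).mean() is close to e^{a^2 t/2}
```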

2 i.2 .We are going to present a multidimensional version of Itˆ’s formula. . Consider an n-dimensional Itˆ process of the form o t t t 1 1 1 m Xt1 = X0 + 0 u11 dBs + · · · + 0 u1m dBs + 0 vs ds s s . Xt )dXti dXtj . . Bt ) is an m-dimensional Brownian motion. .T Then. Xt ) is again an Itˆ process with the representation o dYtk = ∂fk (t.j=1 ∂xi ∂xj The product of diﬀerentials dXti dXtj is computed by means of the product rules: j i dBt dBt = i dBt dt = 0 (dt)2 = 0. a. n t n t nm t n1 m 1 n Xt = X0 + 0 us dBs + · · · + 0 us dBs + 0 vs ds In diﬀerential notation we can write m dXti or = l=1 l i uil dBt + vt dt t dXt = ut dBt + vt dt. . . . Bt . 0 if i = j dt if i = j In this way we obtain m dXti dXtj = k=1 uik ujk t t 110 dt = (ut ut )ij dt. . Xt )dt + ∂t n 2 n i=1 ∂fk (t. ∞) × Rn → Rp is a function of class C 1. the process Yt = f (t. where vt is an n-dimensional process and ut is a process with values in the set of n × m matrices and we assume that the components of u belong to La. Supo 1 2 m pose that Bt = (Bt .T and those of v belong to L1 . if f : [0. Xt )dXti ∂xi + 1 ∂ fk (t. .

As a consequence we can deduce the following integration by parts formula: Suppose that $X_t$ and $Y_t$ are Itô processes. Then,
$$X_t Y_t = X_0 Y_0 + \int_0^t X_s\,dY_s + \int_0^t Y_s\,dX_s + \int_0^t dX_s\,dY_s.$$

5.7 Itô's Integral Representation

Consider a process $u$ in the space $L^2_{a,T}$. We know that the indefinite stochastic integral $X_t = \int_0^t u_s\,dB_s$ is a martingale with respect to the filtration $F_t$. The aim of this subsection is to show that any square integrable martingale is of this form. We start with the integral representation of square integrable random variables.

Theorem 44 (Itô's integral representation) Consider a random variable $F$ in $L^2(\Omega, F_T, P)$. Then, there exists a unique process $u$ in the space $L^2_{a,T}$ such that
$$F = E(F) + \int_0^T u_s\,dB_s.$$

Proof. Suppose first that $F$ is of the form
$$F = \exp\Big(\int_0^T h_s\,dB_s - \frac{1}{2}\int_0^T h_s^2\,ds\Big), \tag{34}$$
where $h$ is a deterministic function such that $\int_0^T h_s^2\,ds < \infty$. Define
$$Y_t = \exp\Big(\int_0^t h_s\,dB_s - \frac{1}{2}\int_0^t h_s^2\,ds\Big).$$
By Itô's formula applied to the function $f(x) = e^x$ and the process $X_t = \int_0^t h_s\,dB_s - \frac{1}{2}\int_0^t h_s^2\,ds$, we obtain
$$dY_t = Y_t\Big(h(t)\,dB_t - \frac{1}{2}h^2(t)\,dt\Big) + \frac{1}{2}Y_t\,\big(h(t)\,dB_t\big)^2 = Y_t\,h(t)\,dB_t,$$

that is,
$$Y_t = 1 + \int_0^t Y_s\,h(s)\,dB_s.$$
Hence,
$$F = Y_T = 1 + \int_0^T Y_s\,h(s)\,dB_s,$$
and we get the desired representation because $E(F) = 1$ and
$$E\int_0^T Y_s^2\,h^2(s)\,ds = \int_0^T E(Y_s^2)\,h^2(s)\,ds = \int_0^T e^{\int_0^s h_r^2\,dr}\,h^2(s)\,ds \leq \exp\Big(\int_0^T h_s^2\,ds\Big)\int_0^T h^2(s)\,ds < \infty.$$
By linearity, the representation holds for linear combinations of exponentials of the form (34). In the general case, any random variable $F$ in $L^2(\Omega, F_T, P)$ can be approximated in mean square by a sequence $F_n$ of linear combinations of exponentials of the form (34). Then, we have
$$F_n = E(F_n) + \int_0^T u_s^{(n)}\,dB_s.$$
The sequence $F_n$ is a Cauchy sequence in $L^2(\Omega, F_T, P)$. Hence,
$$E\big((F_n - F_m)^2\big) \xrightarrow{n,m\to\infty} 0.$$
By the isometry of the stochastic integral,
$$E\big((F_n - F_m)^2\big) = \big(E(F_n - F_m)\big)^2 + E\Big(\Big(\int_0^T \big(u_s^{(n)} - u_s^{(m)}\big)\,dB_s\Big)^2\Big) \geq E\int_0^T \big(u_s^{(n)} - u_s^{(m)}\big)^2\,ds,$$

and, therefore,
$$E\int_0^T \big(u_s^{(n)} - u_s^{(m)}\big)^2\,ds \xrightarrow{n,m\to\infty} 0.$$
This means that $u^{(n)}$ is a Cauchy sequence in $L^2([0,T]\times\Omega)$; consequently, it will converge to a process $u$ in $L^2([0,T]\times\Omega)$. We can show that the process $u$, as an element of $L^2([0,T]\times\Omega)$, has a version which is adapted, because there exists a subsequence $u^{(n)}(t,\omega)$ which converges to $u(t,\omega)$ for almost all $(t,\omega)$. So, $u \in L^2_{a,T}$. Applying again the isometry property, and taking into account that $E(F_n)$ converges to $E(F)$, we obtain
$$F = \lim_{n\to\infty} F_n = \lim_{n\to\infty}\Big(E(F_n) + \int_0^T u_s^{(n)}\,dB_s\Big) = E(F) + \int_0^T u_s\,dB_s.$$
Finally, uniqueness also follows from the isometry property: Suppose that $u^{(1)}$ and $u^{(2)}$ are processes in $L^2_{a,T}$ such that
$$F = E(F) + \int_0^T u_s^{(1)}\,dB_s = E(F) + \int_0^T u_s^{(2)}\,dB_s.$$
Then
$$0 = E\Big(\Big(\int_0^T \big(u_s^{(1)} - u_s^{(2)}\big)\,dB_s\Big)^2\Big) = E\int_0^T \big(u_s^{(1)} - u_s^{(2)}\big)^2\,ds,$$
and, hence, $u^{(1)}(t,\omega) = u^{(2)}(t,\omega)$ for almost all $(t,\omega)$.

Theorem 45 (Martingale representation theorem) Suppose that $\{M_t, t \in [0,T]\}$ is a martingale with respect to the filtration $F_t$, such that $E(M_T^2) < \infty$. Then there exists a unique stochastic process $u$ in the space $L^2_{a,T}$ such that
$$M_t = E(M_0) + \int_0^t u_s\,dB_s \quad \text{for all } t \in [0,T].$$

Proof. Applying Itô's representation theorem to the random variable $F = M_T$ we obtain a unique process $u \in L^2_{a,T}$ such that
$$M_T = E(M_T) + \int_0^T u_s\,dB_s = E(M_0) + \int_0^T u_s\,dB_s.$$
Suppose $0 \leq t \leq T$. We obtain
$$M_t = E[M_T | F_t] = E(M_0) + E\Big[\int_0^T u_s\,dB_s \Big| F_t\Big] = E(M_0) + \int_0^t u_s\,dB_s.$$

Example 8 We want to find the integral representation of $F = B_T^3$. By Itô's formula,
$$B_T^3 = 3\int_0^T B_t^2\,dB_t + 3\int_0^T B_t\,dt,$$
and integrating by parts,
$$\int_0^T B_t\,dt = T B_T - \int_0^T t\,dB_t = \int_0^T (T - t)\,dB_t.$$
So, we obtain the representation
$$B_T^3 = \int_0^T \big(3B_t^2 + 3(T - t)\big)\,dB_t.$$

5.8 Girsanov Theorem

Girsanov theorem says that a Brownian motion with drift $B_t + \lambda t$ can be seen as a Brownian motion without drift, with a change of probability. We first discuss changes of probability by means of densities. Suppose that $L \geq 0$ is a nonnegative random variable on a probability space $(\Omega, F, P)$ such that $E(L) = 1$. Then, $Q(A) = E(\mathbf{1}_A L)$ defines a new probability.
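The integral representation of Example 8 lends itself to a direct numerical check (a sketch with illustrative step count, assuming NumPy): approximating the stochastic integral by left-endpoint sums on a simulated path should reproduce $B_T^3$.

```python
# Numerical check of  B_T^3 = ∫_0^T (3 B_t^2 + 3(T - t)) dB_t  on one path.
import numpy as np

rng = np.random.default_rng(2)
T, n = 1.0, 200_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.concatenate(([0.0], np.cumsum(dB)))
t_grid = np.linspace(0.0, T, n + 1)

# left-endpoint evaluation of the integrand, as required for Itô integrals
integrand = 3.0 * B[:-1] ** 2 + 3.0 * (T - t_grid[:-1])
stochastic_integral = np.sum(integrand * dB)

print(stochastic_integral, B[-1] ** 3)   # the two values nearly agree
```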

In fact, $Q$ is a σ-additive measure such that $Q(\Omega) = E(L) = 1$. We say that $L$ is the density of $Q$ with respect to $P$ and we write $\frac{dQ}{dP} = L$. The expectation of a random variable $X$ in the probability space $(\Omega, F, Q)$ is computed by the formula $E_Q(X) = E(XL)$. The probability $Q$ is absolutely continuous with respect to $P$, that means,
$$P(A) = 0 \Longrightarrow Q(A) = 0.$$
If $L$ is strictly positive, then the probabilities $P$ and $Q$ are equivalent (that is, mutually absolutely continuous), that means,
$$P(A) = 0 \Longleftrightarrow Q(A) = 0.$$
The next example is a simple version of Girsanov theorem.

Example 9 Let $X$ be a random variable with distribution $N(m, \sigma^2)$. Consider the random variable
$$L = e^{-\frac{m}{\sigma^2}X + \frac{m^2}{2\sigma^2}},$$
which satisfies $E(L) = 1$. Suppose that $Q$ has density $L$ with respect to $P$. On the probability space $(\Omega, F, Q)$ the variable $X$ has the characteristic function
$$E_Q\big(e^{itX}\big) = E\big(e^{itX}L\big) = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty} e^{-\frac{(x-m)^2}{2\sigma^2} - \frac{mx}{\sigma^2} + \frac{m^2}{2\sigma^2} + itx}\,dx = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty} e^{-\frac{x^2}{2\sigma^2} + itx}\,dx = e^{-\frac{\sigma^2 t^2}{2}},$$
so, under $Q$, $X$ has distribution $N(0, \sigma^2)$.

Let $\{B_t, t \in [0,T]\}$ be a Brownian motion. Fix a real number $\lambda$ and consider the martingale
$$L_t = \exp\Big(-\lambda B_t - \frac{\lambda^2}{2}t\Big). \tag{35}$$
We know that the process $\{L_t, t \in [0,T]\}$ is a positive martingale with expectation 1 which satisfies the linear stochastic differential equation
$$L_t = 1 - \int_0^t \lambda L_s\,dB_s.$$
The random variable $L_T$ is a density in the probability space $(\Omega, F_T, P)$ which defines a probability $Q$ given by
$$Q(A) = E(\mathbf{1}_A L_T),$$
for all $A \in F_T$. The martingale property of the process $L_t$ implies that, for any $t \in [0,T]$, in the space $(\Omega, F_t, P)$ the probability $Q$ has density $L_t$ with respect to $P$. In fact, if $A$ belongs to the σ-field $F_t$ we have
$$Q(A) = E(\mathbf{1}_A L_T) = E\big(E(\mathbf{1}_A L_T | F_t)\big) = E\big(\mathbf{1}_A E(L_T | F_t)\big) = E(\mathbf{1}_A L_t).$$

Theorem 46 (Girsanov theorem) In the probability space $(\Omega, F_T, Q)$ the stochastic process
$$W_t = B_t + \lambda t, \qquad t \in [0,T],$$
is a Brownian motion.

In order to prove Girsanov theorem we need the following technical result:

Lemma 47 Suppose that $X$ is a real random variable and $G$ is a σ-field such that
$$E\big(e^{iuX} \big| G\big) = e^{-\frac{u^2\sigma^2}{2}}.$$
Then, the random variable $X$ is independent of the σ-field $G$ and it has the normal distribution $N(0, \sigma^2)$.

P ((X ≤ x) ∩ A) = P (A)Φ(x/σ) = P (A)P (X ≤ x). (36) Ls = E(1A Ls )E e(iu−λ)(Bt −Bs ) eiuλ(t−s)− = Q(A)e λ2 (t−s) 2 = Q(A)e− . EQ 1A eiu(Wt −Ws ) = Q(A)e− In order to show (36) we write EQ 1A eiu(Wt −Ws ) = E 1A eiu(Wt −Ws ) Lt = E 1A eiu(Bt −Bs )+iuλ(t−s)−λ(Bt −Bs )− 2 (iu−λ)2 (t−s)+iuλ(t−s)− λ (t−s) 2 2 u2 (t−s) 2 λ2 (t−s) 2 u2 (t−s) 2 .Proof. A ∈ Fs . σ 2 ). t − s). 1). That is. On the other hand. for all s < t. the characteristic function of X with respect to the conditional probability given A is again that of a normal distribution N (0. where Φ is the distribution function of the law N (0. Thus. for any A ∈ G. and this implies the independence of X and G . u ∈ R. these properties follow from the following relation. For any A ∈ G we have E 1A eiuX = P (A)e− u2 σ 2 2 . for all s < t ≤ T the increment Wt − Ws is independent of Fs and has the normal distribution N (0. Proof of Girsanov theorem. the law of X given A is again a normal distribution N (0. It is enough to show that in the probability space (Ω. σ 2 ): PA (X ≤ x) = Φ(x/σ) . Taking into account the previous lemma. choosing A = Ω we obtain that the characteristic function of X is that of a normal distribution N (0. Thus. FT . Girsanov theorem admits the following generalization: 117 . σ 2 ): EA eiuX = e− u2 σ 2 2 . Q).

Theorem 48 Let $\{\theta_t, t \in [0,T]\}$ be an adapted stochastic process satisfying the following Novikov condition:
$$E\Big[\exp\Big(\frac{1}{2}\int_0^T \theta_t^2\,dt\Big)\Big] < \infty. \tag{37}$$
Then, the process
$$W_t = B_t + \int_0^t \theta_s\,ds$$
is a Brownian motion with respect to the probability $Q$ defined by $Q(A) = E(\mathbf{1}_A L_T)$, where
$$L_t = \exp\Big(-\int_0^t \theta_s\,dB_s - \frac{1}{2}\int_0^t \theta_s^2\,ds\Big).$$
Notice that again $L_t$ satisfies the linear stochastic differential equation
$$L_t = 1 - \int_0^t \theta_s L_s\,dB_s.$$
For the process $L_t$ to be a density we need $E(L_t) = 1$, and condition (37) ensures this property.

As an application of Girsanov theorem we will compute the probability distribution of the hitting time of a level $a$ by a Brownian motion with drift. Let $\{B_t, t \geq 0\}$ be a Brownian motion. Fix a real number $\lambda$, and define
$$L_t = \exp\Big(-\lambda B_t - \frac{\lambda^2}{2}t\Big).$$
Let $Q$ be the probability on each σ-field $F_t$ such that for all $t > 0$
$$\frac{dQ}{dP}\Big|_{F_t} = L_t.$$
By Girsanov theorem, in the probability space $(\Omega, F_T, Q)$ the process $\widetilde B_t := B_t + \lambda t$ is a Brownian motion in the time interval $[0,T]$, for all $T > 0$. That is, in this space $B_t$ is a Brownian motion with drift $-\lambda t$. Set
$$\tau_a = \inf\{t \geq 0,\ B_t = a\},$$

where $a \neq 0$. For any $t \geq 0$ the event $\{\tau_a \leq t\}$ belongs to the σ-field $F_{\tau_a \wedge t}$, because for any $s \geq 0$
$$\{\tau_a \leq t\} \cap \{\tau_a \wedge t \leq s\} = \{\tau_a \leq t\} \cap \{\tau_a \leq s\} = \{\tau_a \leq t \wedge s\} \in F_{s \wedge t} \subset F_s.$$
Consequently, using the Optional Stopping Theorem we obtain
$$\begin{aligned} Q\{\tau_a \leq t\} &= E\big(\mathbf{1}_{\{\tau_a \leq t\}} L_t\big) = E\big(\mathbf{1}_{\{\tau_a \leq t\}} E(L_t | F_{\tau_a \wedge t})\big) = E\big(\mathbf{1}_{\{\tau_a \leq t\}} L_{\tau_a \wedge t}\big) \\ &= E\big(\mathbf{1}_{\{\tau_a \leq t\}} L_{\tau_a}\big) = E\big(\mathbf{1}_{\{\tau_a \leq t\}} e^{-\lambda a - \frac{1}{2}\lambda^2 \tau_a}\big) = \int_0^t e^{-\lambda a - \frac{1}{2}\lambda^2 s} f(s)\,ds, \end{aligned}$$
where $f$ is the density of the random variable $\tau_a$. We know that
$$f(s) = \frac{|a|}{\sqrt{2\pi s^3}}\,e^{-\frac{a^2}{2s}}.$$
Hence, with respect to $Q$ the random variable $\tau_a$ has the probability density
$$\frac{|a|}{\sqrt{2\pi s^3}}\,e^{-\frac{(a + \lambda s)^2}{2s}}, \qquad s > 0.$$
Letting $t \uparrow \infty$ we obtain
$$Q\{\tau_a < \infty\} = e^{-\lambda a}\,E\big(e^{-\frac{1}{2}\lambda^2 \tau_a}\big) = e^{-\lambda a - |\lambda a|}.$$

If λ = 0 (Brownian motion without drift), the probability to reach the level is one. If −λa > 0 (the drift −λ and the level a have the same sign) this probability is also one. If −λa < 0 (the drift −λ and the level a have opposite sign) this probability is e−2λa .
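The opposite-sign case can be checked by simulation (a sketch; $\lambda = a = 1$, the horizon and the step size are illustrative choices, and NumPy is assumed). With drift $-\lambda t$ and level $a$ of opposite signs, the predicted hitting probability is $e^{-2\lambda a} = e^{-2} \approx 0.135$; discrete monitoring of the maximum slightly underestimates it.

```python
# Monte Carlo sketch of Q{τ_a < ∞} = e^{-2λa} for the drifted motion B_t - λt.
import numpy as np

lam, a = 1.0, 1.0                            # drift -λt and level a (opposite signs)
T_max, n_steps, n_paths = 30.0, 10_000, 20_000
dt = T_max / n_steps
rng = np.random.default_rng(3)

pos = np.zeros(n_paths)                      # current position of each path
running_max = np.zeros(n_paths)
for _ in range(n_steps):
    pos += -lam * dt + rng.normal(0.0, np.sqrt(dt), size=n_paths)
    np.maximum(running_max, pos, out=running_max)

prob = float(np.mean(running_max >= a))      # fraction of paths that reached a
print(prob, np.exp(-2.0 * lam * a))
```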

5.9

Application of Stochastic Calculus to Hedging and Pricing of Derivatives

The model suggested by Black and Scholes to describe the behavior of prices is a continuous-time model with one risky asset (a share with price $S_t$ at time $t$) and a risk-less asset (with price $S_t^0$ at time $t$). We suppose that $S_t^0 = e^{rt}$, where $r > 0$ is the instantaneous interest rate, and that $S_t$ is given by the geometric Brownian motion
$$S_t = S_0\,e^{\mu t - \frac{\sigma^2}{2}t + \sigma B_t},$$

where $S_0$ is the initial price, $\mu$ is the rate of growth of the price ($E(S_t) = S_0 e^{\mu t}$), and $\sigma$ is called the volatility. We know that $S_t$ satisfies the linear stochastic differential equation
$$dS_t = \sigma S_t\,dB_t + \mu S_t\,dt, \qquad \text{or} \qquad \frac{dS_t}{S_t} = \sigma\,dB_t + \mu\,dt.$$
This model has the following properties:

a) The trajectories $t \mapsto S_t$ are continuous.

b) For any $s < t$, the relative increment $\frac{S_t - S_s}{S_s}$ is independent of the σ-field generated by $\{S_u, 0 \leq u \leq s\}$.

c) The law of $\frac{S_t}{S_s}$ is lognormal with parameters $\big(\mu - \frac{\sigma^2}{2}\big)(t - s)$ and $\sigma^2(t - s)$.
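The property $E(S_t) = S_0 e^{\mu t}$ quoted above can be verified by simulating the lognormal law of $S_t$ directly (a sketch with illustrative parameter values, assuming NumPy):

```python
# Monte Carlo check that S_t = S0 exp(µt - σ²t/2 + σ B_t) has mean S0 e^{µt}.
import numpy as np

rng = np.random.default_rng(4)
S0, mu, sigma, t = 100.0, 0.1, 0.3, 1.0
B_t = rng.normal(0.0, np.sqrt(t), size=500_000)   # B_t ~ N(0, t)
S_t = S0 * np.exp(mu * t - 0.5 * sigma**2 * t + sigma * B_t)

print(S_t.mean(), S0 * np.exp(mu * t))            # the two values nearly agree
```

The $-\sigma^2 t/2$ term in the exponent exactly compensates the convexity of the exponential, which is why the growth rate of the mean is $\mu$ and not $\mu + \sigma^2/2$.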

Fix a time interval $[0,T]$. A portfolio or trading strategy is a stochastic process
$$\phi = \{(\alpha_t, \beta_t),\ 0 \leq t \leq T\}$$
such that the components are measurable and adapted processes satisfying
$$\int_0^T |\alpha_t|\,dt < \infty, \qquad \int_0^T (\beta_t)^2\,dt < \infty.$$

The component $\alpha_t$ is the quantity of the non-risky asset and the component $\beta_t$ is the quantity of shares in the portfolio. The value of the portfolio at time $t$ is then
$$V_t(\phi) = \alpha_t e^{rt} + \beta_t S_t.$$

We say that the portfolio $\phi$ is self-financing if its value is an Itô process with differential
$$dV_t(\phi) = r\alpha_t e^{rt}\,dt + \beta_t\,dS_t.$$
The discounted prices are defined by
$$\widetilde S_t = e^{-rt}S_t = S_0 \exp\Big((\mu - r)t - \frac{\sigma^2}{2}t + \sigma B_t\Big).$$

Then, the discounted value of a portfolio will be
$$\widetilde V_t(\phi) = e^{-rt}V_t(\phi) = \alpha_t + \beta_t \widetilde S_t.$$
Notice that
$$d\widetilde V_t(\phi) = -re^{-rt}V_t(\phi)\,dt + e^{-rt}\,dV_t(\phi) = -r\beta_t \widetilde S_t\,dt + e^{-rt}\beta_t\,dS_t = \beta_t\,d\widetilde S_t.$$
By Girsanov theorem there exists a probability $Q$ such that on the probability space $(\Omega, F_T, Q)$ the process
$$W_t = B_t + \frac{\mu - r}{\sigma}\,t$$
is a Brownian motion. Notice that in terms of the process $W_t$ the Black and Scholes model is
$$S_t = S_0 \exp\Big(rt - \frac{\sigma^2}{2}t + \sigma W_t\Big),$$
and the discounted prices form a martingale:
$$\widetilde S_t = e^{-rt}S_t = S_0 \exp\Big(-\frac{\sigma^2}{2}t + \sigma W_t\Big).$$

This means that $Q$ is a risk-neutral (non-risky) probability. The discounted value of a self-financing portfolio $\phi$ will be
$$\widetilde V_t(\phi) = V_0(\phi) + \int_0^t \beta_u\,d\widetilde S_u,$$
and it is a martingale with respect to $Q$ provided
$$\int_0^T E\big(\beta_u^2 \widetilde S_u^2\big)\,du < \infty. \tag{38}$$

consider the square integrable martingale Mt = EQ e−rT h|Ft . The price of a replicable derivative with payoﬀ h at time t ≤ T is given by Vt (φ) = EQ (e−r(T −t) h|Ft ). This is a consequence of the integral representation theorem. 122 . We say that a derivative is replicable if there exists such a portfolio. V0 (θ) = EQ (e−rT h). any derivative satisfying EQ (h2 ) < ∞ is replicable. In particular. • In the case of an European put option with exercise price equal to K. Q-almost surely.Notice that a self-ﬁnancing portfolio satisfying (38) cannot be an arbitrage (that is. • In the case of an European call option with exercise price equal to K. V0 (φ) = 0. we have h = (ST − K)+ . we have h = (K − ST )+ . In the Black and Scholes model. the Black and Scholes model is complete. A self-ﬁnancing portfolio φ satisfying (38) replicates the derivative if VT (θ) = h. which follows from the martingale property of Vt (φ) with respect to Q: EQ (e−rT h|Ft ) = EQ (VT (φ)|Ft ) = Vt (φ) = e−rt Vt (φ). which contradicts the fact that P ( VT (φ) > 0) > 0. That means. In fact. so VT (φ) = 0. and P ( VT (φ) > 0) > 0). Suppose also that EQ (h2 ) < ∞. because using the martingale property we obtain EQ VT (θ) = V0 (θ) = 0. (39) if φ replicates h. VT (φ) ≥ 0. Consider a derivative which produces a payoﬀ at maturity time T equal to an FT -measurable nonnegative random variable h.

St ). The value of this derivative at time t will be Vt = EQ e−r(T −t) g(ST )|Ft = e−r(T −t) EQ g(St er(T −t) eσ(WT −Wt )−σ Hence. it is a self-ﬁnancing portfolio because dVt (φ) = = = = rert Vt (φ)dt + ert dVt (φ) rert Mt dt + ert dMt rert Mt dt + ert Kt dWt rert αt dt − rert β t St dt + σert β t St dWt = rert αt dt − rert β t St dt + ert β t dSt = rert αt dt + β t dSt . its ﬁnal value will be VT (φ) = erT VT (φ) = erT MT = h. Deﬁne the self-ﬁnancing portfolio φt = (αt . On the other hand.We know that there exists an adapted and measurable stochastic process Kt T 2 verifying 0 EQ (Ks )ds < ∞ such that t Mt = M0 + 0 Ks dWs . Vt = F (t. 123 (40) 2 /2(T −t) )|Ft . . Consider the particular case h = g(ST ). β t ) by βt = αt . σ St = Mt − β t St . so. Kt The discounted value of this portfolio is Vt (φ) = αt + β t St = Mt .

(42) 124 . Su )Su du ∂x ∂x 0 0 ∂F 1 t ∂2F 2 (u. x) = e−r(T −t) EQ g(xer(T −t) eσ(WT −Wt )−σ 2 /2(T −t) ) . x) σ 2 x2 = rF (t. x) ∂F 1 ∂ 2F ∂F (t. St ) = (t. ∞). 0 Comparing these expressions. the function F (t. applying Itˆ’s formula to (40) we o obtain Vt = V0 + t t ∂F ∂F (u. x) + rx (t. St ) ∂t 2 ∂x ∂F +rSt (t. ∂x ∂ 2F ∂F 1 rF (t. ∂t ∂x 2 ∂x2 F (T. we deduce the equations o ∂F (t. x).2 . Su )du + (u. St ). Therefore. is continuous and piece-wise diﬀerentiable) which include the cases g(x) = (x − K)+ . ∂x βt = The support of the probability distribution of the random variable St is [0. (41) Under general hypotheses on g (for instance. g(x) = (K − x)+ . and taking into account the uniqueness of the representation of an Itˆ process. x) is of class C 1. we know that Vt is an Itˆ process with the representation o t t Vt = V0 + 0 σβ u Su dWu + rVu du. x) = g(x). St ) + σ 2 St2 2 (t. ∂u 2 0 ∂x2 t σ + 0 On the other hand. Su )Su dWu + r (u. Then. if g has linear growth. St ). Su ) σ 2 Su du. the above equalities lead to the following partial diﬀerential equation for the function F (t. x) + (t.where F (t.

and the replicating portfolio will be given by βt = ∂F (t. and we get 1 F (t. St ) − β t St ) . In the particular case of an European call option with exercise price K and maturity T .The replicating portfolio is given by βt = αt ∂F (t. x) = e−r(T −t) EQ g(xer(T −t) eσ(WT −Wt )−σ 1 = e−rθ √ 2π ∞ −∞ 2 /2(T −t) ) g(xerθ− σ2 θ+σ 2 √ θy )e−y 2 /2 dy. St ). ∂x = e−rt (F (t. where θ = T − t. = σ T −t x log K + r + σ2 (T − t) √ = σ T −t 2 2 ∞ −∞ e−y 2 /2 xe− σ2 θ+σ 2 √ θy − Ke−rθ + dy xΦ(d+ ) − Ke−r(T −t) Φ(d− ). St ). ∂x Exercises 125 . The price at time t of the option is Ct = F (t. St ) = Φ(d+ ). Formula (41) can be written as F (t. x) = √ 2π = where d− d+ x log K + r − σ2 (T − t) √ . g(x) = (x − K)+ .

and (Bt1 . Given ρ > 0.1 Let Bt be a Brownian motion. 5.6 Check if the following processes are martingales. Show that the process Bt = U Bt is a Brownian motion. Bt2 ) assuming t1 < t < t2 . U U = I). Show that the process Bt = Bt0 +t − Bt0 .7 Find the stochastic integral representation on the time interval [0. where Bt is a Brownian motion: 3 Xt = Bt − 3tBt s Xt = t2 Bt − 2 0 sBs ds 1 Xt = (Bt + t) exp(−Bt − t) 2 Xt = B1 (t)B2 (t). Bt2 .2 Let Bt be a two-dimensional Brownian motion. Fix a time t0 ≥ 0. t ≥ 0 is a Brownian motion. T ] of Xt = et/2 cos Bt Xt = et/2 sin Bt 126 .5 Let Bt be a Brownian motion. 5. In the last example.5. 5. B1 and B2 are independent Brownian motions. compute: P (|Bt | < ρ). 5.3 Let Bt be a n-dimensional Brownian motion. 5. Consider an orthogonal n × n matrix U (that is. Is it a Gaussian process? 5. Find the law of Bt conditioned by Bt1 .4 Compute the mean and autocovariance function of the geometric Brownian motion.

1. Bt ). a) Show that for each 0 ≤ t < 1.8 Let p(t. where {Bt . x) = 0. a) Give an explicit expression for St in terms of t and Bt . and p(1. c) Compute the price at time zero of a derivative whose payoﬀ at time 2 T is ST . b) Fix a maturity time T . but Ht2 dt = ∞.the following random variables: F = BT 2 F = BT F = eBT T F = 0 3 BT Bt dt F = F = sin BT T F = 0 2 tBt dt √ 5. Bt ). x) = 1/ 1 − t exp(−x2 /2(1 − t)). for 0 ≤ t < 1.9 The price of a ﬁnancial asset follows the Black-Scholes model: dSt = 3dt + 2dBt St with initial condition S0 = 100. 5. 127 . x ∈ R. Find a risk-less probability by means of Girsanov theorem. Suppose r = 1. Bs )dBs . Deﬁne Mt = p(t. 0 ∂x Show that 1 0 Ht2 dt < ∞ almost surely. ∂x t ∂p (s. 0 ≤ t ≤ 1} is a Brownian motion. Mt = M0 + b) Set Ht = E 1 0 ∂p (t.

The coeﬃcients b(t. drift and diffusion coeﬃcient. the increment Bt − Bs is independent of Fs . (44) That is. respectively. P ). Xt )∆t and variance σ(t. 128 . Xt )2 ∆t. the solution will be an Itˆ process {Xt . The main result on the existence and uniqueness of solutions is the following. t ≥ 0} deﬁned on a probability space (Ω. The approximate distribution of this increment will the the normal distribution with mean b(t. Suppose that {Ft . which is a random variable independent of the Brownian motion Bt .6 Stochastic Diﬀerential Equations Consider a Brownian motion {Bt . Xs )dBs . Xs )ds + 0 σ(s. A formal meaning of Equation (??) is obtained by rewriting it in integral form. We aim to solve stochastic diﬀerential equations of the form dXt = b(t. dt For instance. x) and σ(t. If the drift vanishes. The increment ∆Xt = Xt+∆t − Xt can be approximatively decomposed into the sum of b(t. x) are called. The solutions of o stochastic diﬀerential equations are called diﬀusion processes. The stochastic diﬀerential equation (43) has the following heuristic interpretation. x) = b(t)x. then we have (43) is the ordinary diﬀerential equation: dXt = b(t. Xt )dBt (43) with an initial condition X0 . using stochastic integrals: t t Xt = X0 + 0 b(s. F. Xt )dt + σ(t. Xt )∆t plus the term σ(t. in the linear case b(t. the solution of this equation is Rt Xt = X0 e 0 b(s)ds . Xt )∆Bt which is interpreted as a random impulse. t ≥ 0} is a ﬁltration such that Bt is Ft -adapted and for any 0 ≤ s < t. t ≥ 0}. Xt ).

For example. 0 ≤ t ≤ 1.The linear growth condition (47. y)| |σ(t. x)| |σ(t. Then. x)| ≤ ≤ ≤ ≤ D1 |x − y| D2 |x − y| C1 (1 + |x|) C2 (1 + |x|). For example. 2. x) − σ(t. Suppose that X0 is a random variable inde2 pendent of the Brownian motion {Bt .. T ]. the process Xt is n-dimensional. and the coeﬃcients are functions b : [0. dt 129 1 . t ∈ [0. T ] × Rn → Rn . T ] × Rn → Rn×m . Suppose that the coeﬃcients of Equation (??) satisfy the following Lipschitz and linear growth properties: |b(t. 0 ≤ t ≤ T } and such that E(X0 ) < ∞. t ∈ [0. the deterministic diﬀerential equation dXt 2/3 = 3Xt . the deterministic differential equation dXt = Xt2 . y ∈ R. T ]} such that T E 0 |Xs |2 ds < ∞. Remarks: 1.. X0 = 0. which satisﬁes Equation (44). 0 ≤ t < 1. T ]..Lipschitz condition (45. 1−t . X0 = 1.This result is also true in higher dimensions. σ : [0.48) ensures that the solution does not explode in the time interval [0.Theorem 49 Fix a time interval [0. dt has the unique solution Xt = which diverges at time t = 1. when Bt is an m-dimensional Brownian motion.46) ensures that the solution is unique. x) − b(t. 3. there exists a unique continuous and adapted stochastic process {Xt . T ]. (45) (46) (47) (48) for all x. y)| |b(t.

Let us see some examples.has inﬁnitely many solutions because for each a > 0. Application: Consider the Black-Scholes model for the prices of a ﬁnancial asset. the solution of the homogeneous linear stochastic differential equation dXt = b(t)Xt dt + σ(t)Xt dBt where b(t) and σ(t) are continuous functions.. In this example. with time-dependent coeﬃcients µ(t) y σ(t) > 0: dSt = St (µ(t)dt + σ(t)dBt ). More generally. A) Linear equations. the coeﬃcient b(x) = 3x2/3 does not satisfy the Lipschitz condition because the derivative of b is not bounded. The geometric Brownian motion Xt = X0 e 2 µ− σ t+σBt 2 solves the linear stochastic diﬀerential equation dXt = µXt dt + σXt dBt .1 Explicit solutions of stochastic diﬀerential equations Itˆ’s formula allows us to ﬁnd explicit solutions to some particular stochastic o diﬀerential equations. the function Xt = 0 if t ≤ a (t − a)3 if t > a is a solution. 130 . x) are diﬀerentiable in the variable x. is Xt = X0 exp t 0 b(s) − 1 σ 2 (s) ds + 2 t 0 σ(s)dBs . ∂b the Lipschitz condition means that the partial derivatives ∂x and ∂σ ∂x are bounded by the constants D1 and D2 . respectively. 4. 6. x) and σ(t.If the coeﬃcients b(t.

This is a nonhomogeneous linear equation and to solve it we will make use of the method of variation of constants. where the parameters σ 2 and r are replace by Σ2 = 1 T −t 1 R = T −t T t T σ 2 (s)ds.The solution to this equation is t St = S0 exp 0 1 µ(s) − σ 2 (s) ds + 2 t σ(s)dBs . The solution of the homogeneous equation dxt = −axt dt x0 = x 131 . σ > 0 and m is a real number. there exists a risk-free probability under which the process t W t = Bt + 0 µ(s) − r(s) ds σ(s) Rt 0 is a Brownian motion and the discounted prices St = St e− martingales: t r(s)ds are St = S0 exp 0 σ(s)dWs − 1 2 t 0 σ 2 (s)ds . r(s)ds. 0 If the interest rate r(t) is also a continuous function of the time. where a. Consider the stochastic diﬀerential equation dXt = a (m − Xt ) dt + σdBt X0 = x. In this way we can deduce a generalization of the Black-Scholes formula for the price of an European call option. t B) Ornstein-Uhlenbeck process.

In fact. σ2 1 − e−2at 2a and it converges. 2a This distribution is called invariant or stationary. Thus.is xt = xe−at . Cov (Xt . Suppose that the initial condition X0 has distribution ν. for each t > 0 the law of Xt will be also ν. The process Yt satisﬁes dYt = aXt eat dt + eat dXt = ameat dt + σeat dBt . that is. Xs ) = σ 2 e−a(t+s) E 0 t s 0 ear dBr ear dBr = σ 2 e−a(t+s) 0 2 t∧s e2ar dr = σ e−a|t−s| − e−a(t+s) . which implies Xt = m + (x − m)e−at + σe−at t as e dBs . then. as t tends to inﬁnity to the normal law ν = N (m. Yt = x + m(eat − 1) + σ 0 t eas dBs . Xt = m + (X0 − m)e−at + σe−at 0 t eas dBs 132 . σ2 ). 0 The stochastic process Xt is Gaussian. Then we make the change of variables Xt = Yt e−at . 2a The law of Xt is the normal distribution N (m + (x − m)e−at . Yt = Xt eat . Its mean and covariance function are: E(Xt ) = m + (x − m)e−at .

b are σ constants. (52) This leads to the following formula for the price of bonds: P (t. s ≥ t. we obtain r(s) = r(t)e−a(s−t) + b(1 − e−a(s−t) ) + σe−as t s ear dBr .and. T ) = E e− t r(s)ds |Ft . (53) 133 . Formula (50) follows from the property PR(T. T ) = 1 and the fact t that the discounted price of the bond e− 0 r(s)ds P (t. Suppose we are working under the risk-free probability. (49) where a. 2a Examples of application of this model: 1. VarXt = e −2at VarX0 + σ e 2 −2at t 2 E 0 e dBs as σ2 = . T ) = A(t. r(s)ds condi- From this expression we deduce that the law of tioned by Ft is normal with mean (r(t) − b) and variance − σ2 1 − e−a(T −t) 3 2a 2 T t 1 − e−a(T −t) + b(T − t) a (51) + σ2 a2 (T − t) − 1 − e−a(T −t) a . Solving the stochastic diﬀerential equation (49) between the times t and s. T ) is a martingale. Vasicek model for rate interest r(t): dr(t) = a(b − r(t ))dt + σ dBt . therefore E(Xt ) = m + (E(X0 ) − m)e−at = m. T )e−B(t.T )r(t) . Then the price of a bond with maturity T is given by RT (50) P (t.

such that f satisﬁes the required Lipschitz and linear growth conditions in the variable x. where t Ft = exp 0 c(s)dBs − 1 2 t 0 c2 (s)ds . T ) = 1 − e−a(T −t) . X0 = x. x) and c(t) are deterministic continuous functions. We assume that the volatility σ(t) = f (Yt ) is a function of an Ornstein-Uhlenbeck process. that is. where Bt and Wt Brownian motions which may be correlated: E(Bt Ws ) = ρ(s ∧ t). Xt )dt + c(t)Xt dBt .where B(t. Y0 = x. which can be solved by the usual methods. (54) where f (t. Ft Yt )dt. 134 . a (B(t. dSt = St (µdt + f (Yt )dBt ) dYt = a (m − Yt ) dt + βdWt . (55) b) Equation (55) is an ordinary diﬀerential equation with random coeﬃcients. T ) = exp − . Then Yt = Ft−1 Xt satisﬁes dYt = Ft−1 f (t. is a solution to Equation (54) if f = 0 and x = 1. Black-Scholes model with stochastic volatility. C) Consider the stochastic diﬀerential equation dXt = f (t. T ) − T + t) (a2 b − σ 2 /2) σ 2 B(t. a2 4a 2. This equation can be solved by the following procedure: a) Set Xt = Ft Yt . T )2 A(t.

Consider the equation dXt = (a(t) + b(t)Xt ) dt + (c(t) + d(t)Xt ) dBt . β(t) α(t) Finally. Using the method of variation of constants. with initial condition X0 = x. We know that t t Ut = exp 0 b(s)ds + 0 d(s)dBs − 1 2 t 0 d2 (s)ds . 135 . with U0 = 1 and V0 = x. b. Xt = x exp t 0 f (s)ds + t 0 c(s)dBs − 1 2 t 2 c (s)ds 0 . Xt = Ut x + t 0 −1 [a(s) − c(s)d(s)] Us ds + t 0 −1 c(s) Us dBs = Ut α(t) + β(t)d(t)Ut = Ut β(t) = c(t) Ut−1 = [a(t) − c(t)d(t)] Ut−1 . diﬀerentiating (56) yields a(t) c(t) that is. c and d are continuous functions. D) General linear stochastic diﬀerential equations.For instance. we propose a solution of the form Xt = Ut Vt (56) where dUt = b(t)Ut dt + d(t)Ut dBt and dVt = α(t)dt + β(t)dBt . and Yt = x exp 0 t f (s)ds . On the other hand. suppose that f (t. x) = f (t)x. Equation (55) reads dYt = f (t)Yt dt. Hence. where a. In that case.

and it can be transformed into a o stochastic diﬀerential equation in the Stratonovich sense. Example 1 The Cox-Ingersoll-Ross model for interest rates: dr(t) = a(b − r(t))dt + σ . Xs )ds 2 + 0 (σσ ) (s. Xs )ds − 0 1 (σσ ) (s. Xs ) ◦ dBs . is written using the Itˆ’s stochastic integral. r(t)dWt . and the diﬀusion coeﬃcient σ satisﬁes the H¨lder condition o |σ(x) − σ(y)| ≤ D|x − y|α . 136 . Xs )ds + 2 t σ(s. In this way we obtain t t Xt = X0 + 0 b(s. Xt ) = σ(0. X0 ) + 0 t 1 σ b − σ σ 2 (s. 0 because the Itˆ’s decomposition of the process σ(s. the drift b is Lipschitz. Xs )dBs . the equation dXt = |Xt |r dBt X0 = 0 has a unique solution if r ≥ 1/2. Suppose that the coeﬃcients b and σ do not depend on time. In that case.The stochastic diﬀerential equation t t Xt = X0 + 0 b(s. there exists a unique solution. Xs )dBs . using the formula that relates both types of integrals. Xs )ds + 0 σ(s. Xs ) is o t σ(t. where α ≥ 1 . 2 For example. Yamada and Watanabe proved in 1971 that Lipschitz condition on the diﬀusion coeﬃcient could be weakened in the following way.

6.2 Numerical approximations

Many stochastic diﬀerential equations cannot be solved explicitly. For this reason, it is convenient to develop numerical methods that provide approximated simulations of these equations.

Consider the stochastic diﬀerential equation

dX_t = b(X_t) dt + σ(X_t) dB_t,

with initial condition X_0 = x. Fix a time interval [0, T] and consider the partition

t_i = iT/n,   i = 0, 1, . . . , n.   (57)

The length of each subinterval is δ_n = T/n.

Euler's method consists in the following recursive scheme:

X^{(n)}(t_i) = X^{(n)}(t_{i−1}) + b(X^{(n)}(t_{i−1}))δ_n + σ(X^{(n)}(t_{i−1}))∆B_i,

i = 1, . . . , n, where ∆B_i = B_{t_i} − B_{t_{i−1}}. The initial value is X^{(n)}_0 = x. Inside the interval (t_{i−1}, t_i) the value of the process X^{(n)} is obtained by linear interpolation. The process X^{(n)} is a function of the Brownian motion, and we can measure the error that we make if we replace X by X^{(n)}:

e_n = ( E |X_T − X^{(n)}_T|² )^{1/2}.

It holds that e_n is of the order δ_n^{1/2}, that is,

e_n ≤ c δ_n^{1/2}   if n ≥ n_0.

In order to simulate a trajectory of the solution using Euler's method, it suﬃces to simulate the values of n independent random variables ξ_1, . . . , ξ_n with distribution N(0, 1), and replace ∆B_i by √δ_n ξ_i.

Euler's method can be improved by adding a correction term; this leads to Milstein's method. Let us explain how this correction is obtained.
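The Euler scheme just described can be sketched in a few lines. Geometric Brownian motion dX_t = rX_t dt + αX_t dB_t is used as the test equation because its explicit solution X_t = x exp((r − α²/2)t + αB_t) is available for comparison on the same path; the coefficients and parameter values are illustrative assumptions, not from the text.

```python
import numpy as np

# Euler scheme for dX_t = r X_t dt + alpha X_t dB_t (geometric Brownian
# motion), compared with the explicit solution on the same Brownian path.
rng = np.random.default_rng(0)
r, alpha, x = 0.05, 0.2, 1.0
T, n = 1.0, 100_000
delta = T / n
xi = rng.standard_normal(n)        # i.i.d. N(0,1) random variables
dB = np.sqrt(delta) * xi           # Delta B_i = sqrt(delta_n) * xi_i

X = x
for i in range(n):
    X += r * X * delta + alpha * X * dB[i]

B_T = dB.sum()
X_exact = x * np.exp((r - 0.5 * alpha**2) * T + alpha * B_T)
print(abs(X - X_exact))            # pathwise error of the Euler scheme
```

The observed pathwise error shrinks as n grows, consistent with the order-δ^{1/2} strong error stated above.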

in order to improve the approximation. + σ(X(ti−1 )) + ti ti−1 s (σσ ) (Xr )dBr dBs ti−1 The dominant term is the residual Ri is the double stochastic integral (σσ ) (Xr )dBr dBs . ti−1 In Milstein’s method we apply Itˆ’s formula to the processes b(Xs ) and σ(Xs ) o that appear in (58). 2 = 138 .The exact value of the increment of the solution between two consecutive points of the partition is ti ti X(ti ) = X(ti−1 ) + ti−1 b(Xs )ds + ti−1 σ(Xs )dBs . ti−1 and one can show that the other terms are of lower order and can be neglected. ti−1 ti σ(Xs )dBs ≈ σ(X(ti−1 ))∆Bi . The rules of Itˆ stochastic calculus lead to o ti ti−1 s ti dBr dBs = ti−1 ti−1 Bs − Bti−1 dBs Bti − Bti−1 − δ n 1 2 2 Bti − Bti−1 − Bti−1 2 1 = (∆Bi )2 − δ n . This double stochastic integral can also be approximated by ti s (σσ ) (X(ti−1 )) ti−1 ti−1 dBr dBs . In this way we obtain X(ti ) − X(ti−1 ) ti s = ti−1 ti b(X(ti−1 )) + ti−1 s 1 bb + b σ 2 (Xr )dr + 2 s (σb ) (Xr )dBr ds ti−1 s 1 bσ + σ σ 2 (Xr )dr + 2 ti−1 ti−1 = b(X(ti−1 ))δ n + σ(X(ti−1 ))∆Bi + Ri . (58) Euler’s method is based on the approximation of these exact values by ti b(Xs )ds ≈ b(X(ti−1 ))δ n .

In conclusion, Milstein's method consists in the following recursive scheme:

X^{(n)}(t_i) = X^{(n)}(t_{i−1}) + b(X^{(n)}(t_{i−1}))δ_n + σ(X^{(n)}(t_{i−1}))∆B_i
+ (1/2)(σσ')(X^{(n)}(t_{i−1})) ( (∆B_i)² − δ_n ).

One can show that the error e_n is of order δ_n, that is,

e_n ≤ c δ_n   if n ≥ n_0.

6.3 Markov property of diﬀusion processes

Consider an n-dimensional diﬀusion process {X_t, t ≥ 0} which satisﬁes the stochastic diﬀerential equation

dX_t = b(t, X_t) dt + σ(t, X_t) dB_t,   (59)

where B is an m-dimensional Brownian motion and the coeﬃcients b and σ are functions which satisfy the conditions of Theorem 49. We will show that such a diﬀusion process satisﬁes the Markov property, which says that the future values of the process depend only on its present value, and not on the past values of the process (if the present value is known).

Deﬁnition 50 We say that an n-dimensional stochastic process {X_t, t ≥ 0} is a Markov process if for every s < t we have

E(f(X_t)|X_r, r ≤ s) = E(f(X_t)|X_s)   (60)

for any bounded Borel function f on Rⁿ. In particular, property (60) says that for every Borel set C ∈ B_{Rⁿ} we have

P(X_t ∈ C|X_r, r ≤ s) = P(X_t ∈ C|X_s).

The probability law of Markov processes is described by the so-called transition probabilities:

P(C, t, x, s) = P(X_t ∈ C|X_s = x),

where 0 ≤ s < t, C ∈ B_{Rⁿ} and x ∈ Rⁿ. That is, P(·, t, x, s) is the probability law of X_t conditioned by X_s = x. If this conditional distribution has a density, we will denote it by p(y, t, x, s). The fact that X_t is a Markov process with transition probabilities P(·, t, x, s) means that for all 0 ≤ s < t and C ∈ B_{Rⁿ} we have

P(X_t ∈ C|X_r, r ≤ s) = P(C, t, X_s, s).

For example, the real-valued Brownian motion B_t is a Markov process with transition probabilities given by

p(y, t, x, s) = (2π(t − s))^{−1/2} e^{ −(x−y)²/(2(t−s)) }.

That is, P(·, t, x, s) is the normal distribution N(x, t − s). In fact,

P(B_t ∈ C|F_s) = P(B_t − B_s + B_s ∈ C|F_s) = P(B_t − B_s + x ∈ C)|_{x=B_s},

because B_t − B_s is independent of F_s.

We will denote by {X_t^{s,x}, t ≥ s} the solution of the stochastic diﬀerential equation (59) on the time interval [s, ∞) and with initial condition X_s^{s,x} = x. If s = 0, we will write X_t^x = X_t^{0,x}. One can show that there exists a continuous version (in all the parameters s, t, x) of the process {X_t^{s,x}, 0 ≤ s ≤ t, x ∈ Rⁿ}. On the other hand, X_t^x for t ≥ s satisﬁes the stochastic diﬀerential equation

X_t^x = X_s^x + ∫_s^t b(u, X_u^x) du + ∫_s^t σ(u, X_u^x) dB_u.

Hence, for every 0 ≤ s ≤ t we have the property

X_t^x = X_t^{s, X_s^x}.   (61)

On the other hand, X_t^{s,y} satisﬁes

X_t^{s,y} = y + ∫_s^t b(u, X_u^{s,y}) du + ∫_s^t σ(u, X_u^{s,y}) dB_u,

and substituting y by X_s^x we obtain that the processes X_t^x and X_t^{s,X_s^x} are solutions of the same equation on the time interval [s, ∞) with initial condition X_s^x. The uniqueness of solutions allows us to conclude.

Theorem 51 (Markov property of diﬀusion processes) Let f be a bounded Borel function on Rⁿ. Then, for every 0 ≤ s < t we have

E[f(X_t)|F_s] = E[f(X_t^{s,x})]|_{x=X_s}.

Proof. Using (61) and the properties of conditional expectation we obtain

E[f(X_t)|F_s] = E[f(X_t^{s,X_s})|F_s] = E[f(X_t^{s,x})]|_{x=X_s},

because the process {X_t^{s,x}, x ∈ Rⁿ} is independent of F_s and the random variable X_s is F_s-measurable.

This theorem says that diﬀusion processes possess the Markov property and their transition probabilities are given by

P(C, t, x, s) = P(X_t^{s,x} ∈ C).

Moreover, if a diﬀusion process is time homogeneous (that is, the coeﬃcients do not depend on time), then the Markov property can be written as

E[f(X_t)|F_s] = E[f(X_{t−s}^x)]|_{x=X_s}.

Example 2 Let us compute the transition probabilities of the Ornstein-Uhlenbeck process. To do this we have to solve the stochastic diﬀerential equation

dX_t = a(m − X_t) dt + σ dB_t

in the time interval [s, ∞) with initial condition x. The solution is

X_t^{s,x} = m + (x − m)e^{−a(t−s)} + σe^{−at} ∫_s^t e^{ar} dB_r,

and, therefore, P(·, t, x, s) = L(X_t^{s,x}) is a normal distribution with parameters

E(X_t^{s,x}) = m + (x − m)e^{−a(t−s)},
Var(X_t^{s,x}) = σ²e^{−2at} ∫_s^t e^{2ar} dr = (σ²/(2a)) (1 − e^{−2a(t−s)}).

6.4 Feynman-Kac Formula

Consider an n-dimensional diﬀusion process {X_t, t ≥ 0} which satisﬁes the stochastic diﬀerential equation

dX_t = b(t, X_t) dt + σ(t, X_t) dB_t,

where B is an m-dimensional Brownian motion. Suppose that the coeﬃcients b and σ satisfy the hypotheses of Theorem 49 and X_0 = x_0 is constant.

We can associate to this diﬀusion process a second order diﬀerential operator, depending on time, that will be denoted by A_s. This operator is called the generator of the diﬀusion process and it is given by

A_s f(x) = Σ_{i=1}^n b^i(s, x) ∂f/∂x_i + (1/2) Σ_{i,j=1}^n (σσ^T)^{i,j}(s, x) ∂²f/∂x_i∂x_j.

In this expression f is a function on [0, ∞) × Rⁿ of class C^{1,2}. The matrix σσ^T(s, x) is the symmetric and nonnegative deﬁnite matrix given by

(σσ^T)^{i,j}(s, x) = Σ_{k=1}^m σ^{i,k}(s, x) σ^{j,k}(s, x).

The relation between the operator A_s and the diﬀusion process comes from Itô's formula: Let f(t, x) be a function of class C^{1,2}. Then, f(t, X_t) is an Itô process with diﬀerential

df(t, X_t) = ( ∂f/∂t + A_t f )(t, X_t) dt + Σ_{i=1}^n Σ_{j=1}^m ∂f/∂x_i (t, X_t) σ^{i,j}(t, X_t) dB_t^j.   (62)

As a consequence, if

E ∫_0^t | ( ∂f/∂x_i σ^{i,j} )(s, X_s) |² ds < ∞   (63)

for every t > 0 and every i, j, then the process

M_t = f(t, X_t) − ∫_0^t ( ∂f/∂s + A_s f )(s, X_s) ds   (64)

is a martingale. A suﬃcient condition for (63) is that the partial derivatives ∂f/∂x_i have linear growth, that is,

| ∂f/∂x_i (s, x) | ≤ C(1 + |x|^N).   (65)

In particular, if f satisﬁes the equation ∂f/∂t + A_t f = 0 and (65) holds, then f(t, X_t) is a martingale.

The martingale property of this process leads to a probabilistic interpretation of the solution of a parabolic equation with ﬁxed terminal value. Indeed, if the function f(t, x) satisﬁes

∂f/∂t + A_t f = 0,
f(T, x) = g(x)

in [0, T] × Rⁿ, then the martingale property of the process f(t, X_t) implies

f(t, X_t) = E(f(T, X_T)|X_t) = E(g(X_T)|X_t) = E(g(X_T^{t,x}))|_{x=X_t},

that is,

f(t, x) = E(g(X_T^{t,x}))   (66)

almost surely with respect to the law of X_t.

Consider a continuous function q(x) bounded from below. Applying again Itô's formula one can show that, if f is of class C^{1,2} and satisﬁes (65), then the process

M_t = e^{−∫_0^t q(X_s) ds} f(t, X_t) − ∫_0^t e^{−∫_0^s q(X_r) dr} ( ∂f/∂s + A_s f − qf )(s, X_s) ds

is a martingale. In fact,

dM_t = e^{−∫_0^t q(X_s) ds} Σ_{i=1}^n Σ_{j=1}^m ∂f/∂x_i (t, X_t) σ^{i,j}(t, X_t) dB_t^j.

If the function f satisﬁes the equation ∂f/∂s + A_s f − qf = 0, then the process

e^{−∫_0^t q(X_s) ds} f(t, X_t)   (67)

will be a martingale.

Suppose that f(t, x) satisﬁes

∂f/∂t + A_t f − qf = 0,
f(T, x) = g(x)

on [0, T] × Rⁿ. Then, the martingale property of the process (67) implies

f(t, X_t) = E( e^{−∫_t^T q(X_s) ds} f(T, X_T) | F_t ).

Finally, the Markov property yields

E( e^{−∫_t^T q(X_s) ds} f(T, X_T) | F_t ) = E( e^{−∫_t^T q(X_s^{t,x}) ds} g(X_T^{t,x}) )|_{x=X_t},

so that

f(t, x) = E( e^{−∫_t^T q(X_s^{t,x}) ds} g(X_T^{t,x}) ).   (68)

Formulas (66) and (68) are called the Feynman-Kac formulas.

Example 3 Consider the Black-Scholes model, where the interest rate and the volatility depend on time and on the asset price:

dS_t = r(t, S_t)S_t dt + σ(t, S_t)S_t dW_t.

Consider a derivative with payoﬀ g(S_T) at the maturity time T. The value of the derivative at time t is given, under the risk-free probability, by the formula

V_t = E( e^{−∫_t^T r(s, S_s) ds} g(S_T) | F_t ).

The Markov property implies V_t = f(t, S_t), where

f(t, x) = E( e^{−∫_t^T r(s, S_s^{t,x}) ds} g(S_T^{t,x}) ).
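The expectation representation above already gives a Monte Carlo valuation method. For constant coefficients r(t, x) = r and σ(t, x) = σ (a simplifying assumption for illustration), S_T^{t,x} is lognormal and can be sampled exactly; for a call payoff g(x) = (x − K)⁺ the estimate can be compared with the classical Black-Scholes formula. All numerical values below are illustrative choices, not from the text.

```python
import numpy as np
from math import log, sqrt, exp, erf

def norm_cdf(d):
    # standard normal distribution function via the error function
    return 0.5 * (1.0 + erf(d / sqrt(2.0)))

# Illustrative parameters: constant rate and volatility, call payoff
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0

# Closed-form Black-Scholes price, for comparison
d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
d2 = d1 - sigma * sqrt(T)
bs_price = S0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# Monte Carlo estimate of f(0, S0) = E[ e^{-rT} (S_T^{0,S0} - K)^+ ],
# sampling S_T exactly from its lognormal law
rng = np.random.default_rng(0)
Z = rng.standard_normal(400_000)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * sqrt(T) * Z)
mc_price = exp(-r * T) * np.maximum(S_T - K, 0.0).mean()

print(bs_price, mc_price)   # the two prices agree to Monte Carlo accuracy
```

The Monte Carlo estimate converges to the closed-form price at the usual 1/√N rate, which illustrates that the expectation in (68) and the pricing partial differential equation determine the same function.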

Then, the Feynman-Kac formula says that the function f(t, x) satisﬁes the following parabolic partial diﬀerential equation with terminal value:

∂f/∂t + r(t, x)x ∂f/∂x + (1/2)σ²(t, x)x² ∂²f/∂x² − r(t, x)f = 0,
f(T, x) = g(x).

Exercises

6.1 Show that the following processes satisfy the indicated stochastic diﬀerential equations:

(i) The process X_t = B_t/(1 + t) satisﬁes

dX_t = − (1/(1 + t)) X_t dt + (1/(1 + t)) dB_t,   X_0 = 0.

(ii) The process X_t = sin B_t, with B_0 = a ∈ (−π/2, π/2), satisﬁes

dX_t = − (1/2) X_t dt + √(1 − X_t²) dB_t

for t < T = inf{s > 0 : B_s ∉ (−π/2, π/2)}.

(iii) (X_1(t), X_2(t)) = (t, e^t B_t) satisﬁes

(dX_1, dX_2) = (1, X_2) dt + (0, e^{X_1}) dB_t.

(iv) (X_1(t), X_2(t)) = (cosh B_t, sinh B_t) satisﬁes

(dX_1, dX_2) = (1/2)(X_1, X_2) dt + (X_2, X_1) dB_t.

(v) X_t = (cos B_t, sin B_t) satisﬁes

dX_t = − (1/2) X_t dt + M X_t dB_t,

where M is the matrix with rows (0, −1) and (1, 0). The components of the process X_t satisfy X_1(t)² + X_2(t)² = 1 and, as a consequence, X_t can be regarded as a Brownian motion on the circle of radius 1.

(vi) The process X_t = (x^{1/3} + (1/3)B_t)³, x > 0, satisﬁes

dX_t = (1/3) X_t^{1/3} dt + X_t^{2/3} dB_t,   X_0 = x.

6.2 Consider an n-dimensional Brownian motion B_t and constants α_i, i = 1, . . . , n. Solve the stochastic diﬀerential equation

dX_t = rX_t dt + X_t Σ_{k=1}^n α_k dB_k(t).

6.3 Solve the following stochastic diﬀerential equations:

dX_t = r dt + αX_t dB_t,   X_0 = x,
dX_t = (1/X_t) dt + αX_t dB_t,   X_0 = x > 0,
dX_t = X_t^γ dt + αX_t dB_t,   X_0 = x > 0.

For which values of the parameters α, γ does the solution explode?

6.4 Solve the following stochastic diﬀerential equations:

(i) (dX_1, dX_2) = (1, 0) dt + (dB_1, X_1 dB_2),
(ii) dX_1(t) = X_2(t) dt + α dB_1(t),   dX_2(t) = X_1(t) dt + β dB_2(t).

6.5 The nonlinear stochastic diﬀerential equation

dX_t = rX_t(K − X_t) dt + βX_t dB_t,   X_0 = x > 0,

is used to model the growth of a population of size X_t in a random and crowded environment. The constant K > 0 is called the carrying capacity of the environment, the constant r ∈ R is a measure of the quality of the environment and β ∈ R is a measure of the size of the noise in the system. Show that the unique solution to this equation is given by

X_t = exp( (rK − (1/2)β²)t + βB_t ) / ( x^{−1} + r ∫_0^t exp( (rK − (1/2)β²)s + βB_s ) ds ).

6.6 Find the generator of the following diﬀusion processes:

a) dX_t = µX_t dt + σ dB_t (Ornstein-Uhlenbeck process), µ and σ are constants;
b) dX_t = rX_t dt + αX_t dB_t (geometric Brownian motion), α and r are constants;
c) dX_t = r dt + αX_t dB_t, α and r are constants;
d) dY_t = (dt, dX_t), where X_t is the process introduced in a);
e) (dX_1, dX_2) = (1, X_2) dt + (0, e^{X_1}) dB_t;
f) (dX_1, dX_2) = (1, 0) dt + (dB_1, X_1 dB_2);
h) X(t) = (X_1, . . . , X_n), where dX_k(t) = r_k X_k dt + X_k Σ_{j=1}^n α_{kj} dB_j, 1 ≤ k ≤ n.

6.7 Find diﬀusion processes whose generators are:

a) Af(x) = f'(x) + f''(x), f ∈ C_0²(R);
b) Af(t, x) = ∂f/∂t + cx ∂f/∂x + (1/2)α²x² ∂²f/∂x², f ∈ C_0²(R²);
c) Af(x_1, x_2) = 2x_2 ∂f/∂x_1 + log(1 + x_1² + x_2²) ∂f/∂x_2 + (1/2)(1 + x_1²) ∂²f/∂x_1² + x_1 ∂²f/∂x_1∂x_2 + (1/2) ∂²f/∂x_2², f ∈ C_0²(R²).

6.8 Show the formulas (51), (52) and (53).

References

1. E. Çinlar: Introduction to Stochastic Processes. Prentice-Hall, 1975.
2. I. Karatzas and S. E. Shreve: Brownian Motion and Stochastic Calculus. Springer-Verlag, 1991.
3. F. C. Klebaner: Introduction to Stochastic Calculus with Applications.
4. D. Lamberton and B. Lapeyre: Introduction to Stochastic Calculus Applied to Finance. Chapman and Hall, 1996.
5. T. Mikosch: Elementary Stochastic Calculus. World Scientiﬁc, 2000.
6. J. R. Norris: Markov Chains. Cambridge Series in Statistical and Probabilistic Mathematics, 1997.
7. B. Øksendal: Stochastic Diﬀerential Equations. Springer-Verlag, 1998.
