
Chapter 3 Martingales and Stopping Times

In this chapter, we will meet two of the most important concepts of modern probability theory and its applications to option pricing.

3.1 Filtrations

Notation: Z+ = N ∪ {0} = {0, 1, 2, 3, . . .} is "time starting at zero"; however, we can (and sometimes will) also use N for indexing processes.

We need to be able to model the flow of information in time. The standard way of doing this is to use a filtration of sub-σ-algebras.

Definition. A filtration is a sequence (Fn, n ∈ Z+) of sub-σ-algebras of F such that each Fn ⊆ Fn+1.

The standard example of a filtration is obtained when we observe a random process (Xn, n ∈ Z+); e.g. Xn might be the price of a stock at time n. We then take Fn = σ{X0, X1, . . . , Xn} to be the smallest sub-σ-algebra of F for which X0, X1, . . . , Xn are all measurable. This is called the natural filtration of the process (Xn, n ∈ Z+) and is denoted by (F^X_n, n ∈ Z+).

Definition. Let (Fn, n ∈ Z+) be a filtration. A stochastic process Y = (Yn, n ∈ Z+) is said to be adapted to this filtration if each Yn is Fn-measurable.

If the process Y is adapted, then all the information we need to predict the random variable Yn can be found in Fn, i.e. all the information about the observation at time n is (in principle) already known at this time.
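To make this concrete, here is a small illustrative Python sketch (ours, not part of the notes; the function name and the three-toss sample space are arbitrary choices). It lists the atoms of the natural filtration of a coin-tossing process: two outcomes lie in the same atom of Fn precisely when they agree on the first n tosses, and the atoms refine as n grows, which is exactly the condition Fn ⊆ Fn+1.

    from itertools import product

    def atoms_of_natural_filtration(n, num_tosses=3):
        """Atoms of F_n = sigma{X_1, ..., X_n} for a coin-tossing process:
        outcomes that agree on the first n tosses are indistinguishable
        given the information available at time n."""
        sample_space = list(product("HT", repeat=num_tosses))
        atoms = {}
        for omega in sample_space:
            atoms.setdefault(omega[:n], []).append(omega)  # key = first n tosses
        return list(atoms.values())

    # F_0 is trivial (one atom, the whole space) and the atoms refine as
    # n grows -- precisely the filtration condition F_n ⊆ F_{n+1}.
    for n in range(4):
        print(n, [len(atom) for atom in atoms_of_natural_filtration(n)])

Running this prints atom sizes [8], [4, 4], [2, 2, 2, 2] and then eight singletons: each extra observation splits the atoms further.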

• Any stochastic process is automatically adapted to its own natural filtration.
• If X and Y are two stochastic processes and Y is adapted to the natural filtration of X, then each Yn = fn(X0, X1, . . . , Xn), where fn is a measurable function from R^{n+1} to R.

3.2 Martingales

3.2.1 Basic Ideas

We fix a filtration (Fn, n ∈ Z+) of F.

Definition. A stochastic process X = (Xn, n ∈ Z+) is said to be a (discrete-parameter) martingale if

• it is adapted,
• it is integrable (i.e. each E(|Xn|) < ∞),
• E(Xn|Fn−1) = Xn−1 for all n ∈ N.

The third of these conditions is the most important; it is sometimes called "the martingale property".

A martingale can be thought of as a "fair game". Consider a game where rounds take place at times 1, 2, . . .. You stake £1 in each round. Your fortune (which may be negative) at time n is £Xn, so Xn − Xn−1 is your winnings per unit stake at time n. The sub-σ-algebra Fn contains all the information about the history of the game up to and including time n. By (CE1), the martingale property is equivalent to E(Xn − Xn−1|Fn−1) = 0, i.e. the average winnings in game n are zero given knowledge of the history of the game up to and including time n − 1. Another way of looking at a martingale is that it is a process where the best estimate (i.e. the least squares predictor) of Xn at time n − 1 is in fact Xn−1.

The martingale property is equivalent to

    E(Xn|Fm) = Xm    (3.1)

for all 0 ≤ m ≤ n; you can establish this in Problem 15.
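As an illustrative check of the fair-game interpretation (our own sketch, not part of the notes; the seed and sample sizes are arbitrary), the following Python code simulates the £1-stake coin game and estimates E(Xn|Fn−1) empirically. For this game the history matters only through the value of Xn−1, so we bucket paths by that value; each bucket's average of Xn should be close to Xn−1.

    import random

    def fortune_path(n_rounds):
        """Fortune in the fair game: stake one pound per round, win or lose 1."""
        x, path = 0, [0]
        for _ in range(n_rounds):
            x += random.choice([-1, 1])
            path.append(x)
        return path

    random.seed(0)
    n = 10
    paths = [fortune_path(n) for _ in range(200_000)]

    # Monte Carlo check of E(X_n | F_{n-1}) = X_{n-1}: bucket paths by X_{n-1}
    # and average X_n within each bucket.
    buckets = {}
    for p in paths:
        buckets.setdefault(p[n - 1], []).append(p[n])
    for value, nxt in sorted(buckets.items()):
        print(f"X_{n-1} = {value:+d}: average X_{n} ≈ {sum(nxt) / len(nxt):+.3f}")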

Using (3.1) and (CE5) we see that if X = (Xn, n ∈ Z+) is a martingale then for all m, n with m < n,

    E(Xn) = E(E(Xn|Fm)) = E(Xm),

so we have the important result that the mean of a martingale is constant in time.

Martingales are important because (as we will see)

• they are ubiquitous in probability, statistics, finance and also analysis;
• they have a beautiful theoretical development.

Before proceeding further with martingales themselves, let's take a look at some close relatives. We have seen that martingales correspond to "fair games". If a game is biased in your favour, we obtain a submartingale, i.e. E(Xn − Xn−1|Fn−1) ≥ 0 for all n ∈ N, and if it is biased against you, we obtain a supermartingale, i.e. E(Xn − Xn−1|Fn−1) ≤ 0 for all n ∈ N. Formally, an adapted integrable process (Xn, n ∈ Z+) is a submartingale if E(Xn|Fn−1) ≥ Xn−1 for all n ∈ N, and a supermartingale if E(Xn|Fn−1) ≤ Xn−1 for all n ∈ N.

The following are obvious:

• (Xn, n ∈ Z+) is a submartingale if and only if (−Xn, n ∈ Z+) is a supermartingale.
• (Xn, n ∈ Z+) is a martingale if and only if it is both a sub- and a supermartingale.

You can check that the mean of a submartingale increases in time, while that of a supermartingale decreases; a numerical illustration follows below.
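Here is our own illustrative sketch of that last remark (the coin game is just for concreteness). For the fair game, the fortune Xn is a martingale, while Xn² is a submartingale, since E(Xn²|Fn−1) = Xn−1² + 2Xn−1 E(Xn − Xn−1|Fn−1) + E((Xn − Xn−1)²|Fn−1) = Xn−1² + 1 ≥ Xn−1². The simulation shows the flat mean of the martingale against the increasing mean of the submartingale (in fact E(Xn²) = n here).

    import random

    random.seed(1)
    trials, horizon = 100_000, 20
    mean_x = [0.0] * (horizon + 1)    # fortune X_n: a martingale
    mean_x2 = [0.0] * (horizon + 1)   # X_n squared: a submartingale

    for _ in range(trials):
        x = 0
        for n in range(1, horizon + 1):
            x += random.choice([-1, 1])
            mean_x[n] += x / trials
            mean_x2[n] += x * x / trials

    print([round(m, 2) for m in mean_x[5:21:5]])   # hovers near 0: constant mean
    print([round(m, 2) for m in mean_x2[5:21:5]])  # near 5, 10, 15, 20: E(X_n^2) = n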

3.2.2 Examples of Martingales

(E1) Let X be an integrable random variable and (Fn, n ∈ Z+) be an arbitrary filtration. Define a process (Xn, n ∈ Z+) by Xn = E(X|Fn). Then it is easy to see that (Xn, n ∈ Z+) is a martingale by using (CE4).

(E2) Sums of Independent Zero Mean Random Variables
Let (Yn, n ∈ N) be a sequence of independent integrable random variables, each with zero mean. Define a filtration by Fn = σ{Y1, Y2, . . . , Yn}, and define Xn = Y1 + Y2 + · · · + Yn. Then (Xn, n ∈ N) is a martingale since by (CE1), (CE3) and (CE6),

    E(Xn|Fn−1) = Y1 + Y2 + · · · + Yn−1 + E(Yn|Fn−1) = Xn−1 + E(Yn) = Xn−1.

(E3) Products of Independent Unit Mean Random Variables
This is the same set-up as in Example 2, the only difference being that we now take each E(Yn) = 1 and define Xn = Y1 Y2 · · · Yn. This yields a martingale since by (CE3) and (CE6),

    E(Xn|Fn−1) = Y1 Y2 · · · Yn−1 E(Yn|Fn−1) = Xn−1 E(Yn) = Xn−1.

(E4) Likelihood Ratios
Let f and g be two pdfs with g(x) ≠ 0 for all x ∈ R. Suppose that (Yn, n ∈ N) is a sequence of i.i.d.¹ random variables, each with pdf g. Take Fn = σ{Y1, Y2, . . . , Yn} and let

    Xn = f(Y1)f(Y2) · · · f(Yn) / (g(Y1)g(Y2) · · · g(Yn)).

We get a martingale (by (E3)) since each

    E(f(Yj)/g(Yj)) = ∫_R (f(y)/g(y)) g(y) dy = ∫_R f(y) dy = 1.

¹ i.i.d. means independent and identically distributed.
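A quick numerical sketch of (E4) (ours; the two normal densities, the seed and the sample sizes are arbitrary choices): with samples drawn from g, the likelihood ratio Xn should have mean 1 at every n.

    import math, random

    def normal_pdf(y, mu):
        return math.exp(-((y - mu) ** 2) / 2) / math.sqrt(2 * math.pi)

    f = lambda y: normal_pdf(y, 1.0)   # some other pdf f
    g = lambda y: normal_pdf(y, 0.0)   # the true pdf of the samples

    random.seed(2)
    trials, horizon = 200_000, 5
    means = [0.0] * (horizon + 1)
    for _ in range(trials):
        x = 1.0
        for n in range(1, horizon + 1):
            y = random.gauss(0.0, 1.0)   # Y_n has pdf g
            x *= f(y) / g(y)             # X_n = f(Y_1)...f(Y_n) / g(Y_1)...g(Y_n)
            means[n] += x / trials
    print([round(m, 3) for m in means[1:]])  # each E(X_n) should be close to 1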

3.2.3 The Martingale Transform

The martingale transform is the discrete-time analogue of the important stochastic integral concept which we'll meet later on in continuous time.

Let's return to the idea of betting in a game. When we discussed martingales as describing a "fair game" we only bet a unit stake in each round. Now let's make this more interesting by allowing an arbitrary stake or gambling strategy. The idea behind the definition of a previsible process is that your stake in game n will depend on the history of the game up to and including game n − 1. This leads us to introduce the following concept.

Definition. Let (Fn, n ∈ Z+) be a filtration. A process (Cn, n ∈ N) is said to be previsible if Cn is Fn−1-measurable for all n ∈ N.

Note that

• C0 does not exist;
• any previsible process is adapted.

Recall the interpretation of a martingale (Xn, n ∈ Z+) wherein Xn − Xn−1 is the winnings per unit stake on game n. Using a previsible strategy, the total winnings on game n are Cn(Xn − Xn−1), and hence the total winnings up to and including game n are

    Yn = Σ_{j=1}^{n} Cj(Xj − Xj−1).    (3.2)

We call (Yn, n ∈ N) the martingale transform of C by X. It is sometimes written succinctly as Y = C • X.
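Below is a small Python sketch of the martingale transform (ours, not part of the notes; the "double after every loss" strategy and all numbers are arbitrary choices). The stake for round j is fixed before the j-th outcome is revealed, so the strategy is previsible; over a fixed finite horizon each Cn is bounded, and in line with the theorem that follows, the mean final winnings stay at zero.

    import random

    def martingale_transform(C, X):
        """(C • X)_n = sum_{j=1}^n C_j (X_j - X_{j-1}); X = [X_0, ..., X_N],
        C = [C_1, ..., C_N], with C_j chosen using information up to time j-1."""
        Y, y = [0.0], 0.0
        for j in range(1, len(X)):
            y += C[j - 1] * (X[j] - X[j - 1])
            Y.append(y)
        return Y

    random.seed(3)
    trials, horizon, mean_final = 100_000, 10, 0.0
    for _ in range(trials):
        X = [0]
        for _ in range(horizon):
            X.append(X[-1] + random.choice([-1, 1]))
        # "Double the stake after every losing round": previsible, since the
        # stake for round j is set before the j-th outcome is revealed.
        C, stake = [], 1
        for j in range(1, horizon + 1):
            C.append(stake)
            stake = 2 * stake if X[j] < X[j - 1] else 1
        mean_final += martingale_transform(C, X)[-1] / trials
    print(round(mean_final, 2))  # stays near 0: "you can't beat the system"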

The next result shows that "you can't beat the system".

Theorem 3.2.1 Suppose that C is bounded, i.e. there exists K > 0 such that |Cn(ω)| ≤ K for all ω ∈ Ω, n ∈ N, and non-negative, i.e. Cn(ω) ≥ 0 for all ω ∈ Ω, n ∈ N.
(i) If X = (Xn, n ∈ Z+) is a supermartingale then Y = C • X is a supermartingale.
(ii) If X = (Xn, n ∈ Z+) is a martingale then Y = C • X is a martingale.

Proof. (i) Y is adapted since sums and products of Fn-measurable random variables are themselves Fn-measurable. That Y is integrable follows from the fact that X is integrable and C is bounded. To get the supermartingale property for Y we argue as follows, using (CE3) and the supermartingale property of X:

    E(Yn − Yn−1|Fn−1) = E(Cn(Xn − Xn−1)|Fn−1) = Cn E(Xn − Xn−1|Fn−1) ≤ 0.

(ii) is proved similarly, and in this case you can even drop the requirement that C be non-negative.

3.3 Stopping Times

3.3.1 Stopping Times and Stopped Processes

Suppose that your strategy is to exit from a game the first time your fortune reaches a certain high or low point (or you want to sell your shares as soon as their price reaches a certain value). We can model this by using a stopping time T relative to a filtration (Fn, n ∈ Z+).

Definition. A stopping time T is a random variable for which
(T1) T takes values in Z+ ∪ {∞} = {0, 1, 2, . . . , ∞};
(T2) the event (T = n) ∈ Fn for each n ∈ Z+.

Recall that (T = n) = {ω ∈ Ω; T(ω) = n}. The intuition behind (T1) is simply that T is a random time and so should take non-negative values; we include the point at infinity so that we can capture the idea of waiting "forever" for an event to take place. (T2) is the key part of the definition. It encapsulates the idea that the information needed to determine whether or not to stop after the nth game depends on the history of the game up to and including time n.

It is sometimes useful to replace (T2) with an equivalent axiom:

(T2)′ The event (T ≤ n) ∈ Fn for each n ∈ Z+.

To see that these are equivalent, note that if (T2) holds then (T ≤ n) = ∪_{i=0}^{n} (T = i) ∈ Fn, and so (T2)′ holds. Conversely, if (T2)′ holds then (T ≤ n − 1) ∈ Fn−1 ⊆ Fn, and hence (T = n) = (T ≤ n) − (T ≤ n − 1) ∈ Fn, and so (T2) holds.
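The measurability condition (T2) can be checked mechanically on a finite sample space. In this illustrative sketch (ours; the function names and the six-step horizon are arbitrary), the event (T = n) must not split an atom of Fn, i.e. outcomes agreeing on the first n steps must agree on whether T = n. A first hitting time (formally introduced next) passes the test; the last visit to zero, which peeks at the future, fails it.

    from itertools import product

    def walk(omega):
        """Partial sums S_0 = 0, S_n = omega_1 + ... + omega_n."""
        s, path = 0, [0]
        for step in omega:
            s += step
            path.append(s)
        return path

    def is_stopping_time(T, horizon=6):
        """Check (T2) on the sample space {-1,+1}^horizon: the event (T = n)
        must not split an atom of F_n."""
        space = list(product([-1, 1], repeat=horizon))
        for n in range(horizon + 1):
            decided = {}
            for omega in space:
                val = (T(omega) == n)
                if decided.setdefault(omega[:n], val) != val:
                    return False   # (T = n) splits an atom of F_n
        return True

    def first_hit(omega):   # first n >= 1 with S_n = 2 ("infinity" otherwise)
        return next((n for n, s in enumerate(walk(omega)) if s == 2), 10**9)

    def last_zero(omega):   # last visit to 0: needs future information
        return max(n for n, s in enumerate(walk(omega)) if s == 0)

    print(is_stopping_time(first_hit))   # True
    print(is_stopping_time(last_zero))   # False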

Here is an important example of a stopping time. Let (Xn, n ∈ Z+) be an adapted process and let B be a Borel set. We define

    T = min{n ∈ N; Xn ∈ B},

so T is the first time (after time zero) that the process moves into the set B. T is a stopping time as

    (T ≤ n) = ∪_{i=1}^{n} (Xi ∈ B) ∈ Fn.

• Stopping times defined in this way are called first hitting times.
• T will not be a stopping time if the process (Xn, n ∈ Z+) fails to be adapted.

Now let (Xn, n ∈ Z+) be an adapted process and T be a stopping time. We introduce:

• The stopped random variable XT, so for each ω ∈ Ω, XT(ω) = X_{T(ω)}(ω).
• The stopped process (X_{T∧n}, n ∈ Z+),² so for each ω ∈ Ω and n ∈ Z+, X_{T∧n}(ω) = X_{T(ω)∧n}(ω).

So for example if chance selects an ω for which T(ω) = 5, then XT(ω) = X5(ω) and (for this value of ω only) (X_{T∧n}, n ∈ Z+) = (X0, X1, X2, X3, X4, X5, X5, X5, . . .). From a practical point of view, if Xn is the price of a share at time n and T is the selling time, then XT is the price at the selling time and (X_{T∧n}, n ∈ Z+) is the history of the share price up to and including the time you sell (a small sketch of a stopped path follows after Theorem 3.3.1).

Theorem 3.3.1 If S and T are stopping times (with respect to the same filtration) then so are
(i) S + T,
(ii) αT (where α ∈ N),
(iii) S ∧ T,
(iv) S ∨ T.

Proof. (i) is Problem 20 and (ii) is obvious. For (iii) and (iv) use

    (S ∧ T ≤ n) = (S ≤ n) ∪ (T ≤ n) ∈ Fn,
    (S ∨ T ≤ n) = (S ≤ n) ∩ (T ≤ n) ∈ Fn.

² S ∧ T = min{S, T}, S ∨ T = max{S, T}.
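As promised above, here is a minimal sketch of a stopped path (ours; the path and the target set B = {2} are made up). It computes a first hitting time and the stopped process X_{T∧n}, which follows X until T and then freezes.

    def first_hitting_time(path, B):
        """T = min{n >= 1 : X_n in B}; None stands in for T = infinity."""
        for n in range(1, len(path)):
            if path[n] in B:
                return n
        return None

    def stopped_path(path, T):
        """The stopped process (X_{T∧n}): follow X until time T, then freeze."""
        if T is None:
            return list(path)
        return [path[min(n, T)] for n in range(len(path))]

    X = [0, 1, 0, 1, 2, 3, 2, 3]
    T = first_hitting_time(X, {2})
    print(T)                   # 4: the first time the path enters B = {2}
    print(stopped_path(X, T))  # [0, 1, 0, 1, 2, 2, 2, 2]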

Given an adapted process X and a stopping time T, we would like to know as much as possible about XT. It turns out that we can get some information about its mean, and we'll now explore this further.

We begin by introducing a gambling strategy based on T. Suppose that you always bet one pound but you stop playing after time T. Your stake process is C^T = (C^T_n, n ∈ N) where

    C^T_n = 1_{T≥n} = 1 if T ≥ n, and 0 if T < n.

This process is previsible since C^T_n = 0 iff T ≤ n − 1 and (T ≤ n − 1) ∈ Fn−1, while C^T_n = 1 iff T ≥ n and (T ≥ n) = (T ≤ n − 1)^c ∈ Fn−1. The winnings process is the martingale transform

    (C^T • X)_n = Σ_{j=1}^{n} C^T_j (Xj − Xj−1).

If T = k < n, then

    (C^T • X)_n = Σ_{j=1}^{k} (Xj − Xj−1) = Xk − X0 = XT − X0 = X_{T∧n} − X0.

If T > n − 1, then

    (C^T • X)_n = Σ_{j=1}^{n} (Xj − Xj−1) = Xn − X0 = X_{T∧n} − X0.

In either case, (C^T • X)_n = X_{T∧n} − X0. Combining this result with that of theorem 3.2.1, we deduce the following.

Theorem 3.3.2
1. If X is a supermartingale and T is a stopping time, then (X_{T∧n}, n ∈ Z+) is a supermartingale and so E(X_{T∧n}) ≤ E(X0).
2. If X is a martingale and T is a stopping time, then (X_{T∧n}, n ∈ Z+) is a martingale and so E(X_{T∧n}) = E(X0).
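A quick Monte Carlo sanity check of theorem 3.3.2 (2) (our own sketch; the level, horizon and seed are arbitrary): for the fair-game fortune and T the first hitting time of level 3, the mean of X_{T∧n} should remain at E(X0) = 0.

    import random

    random.seed(4)
    trials, horizon = 200_000, 25
    mean_stopped = 0.0
    for _ in range(trials):
        x = 0
        for _ in range(horizon):
            if x == 3:                 # T (first hit of 3) has occurred: freeze
                break
            x += random.choice([-1, 1])
        mean_stopped += x / trials     # x is now X_{T ∧ horizon}
    print(round(mean_stopped, 3))      # near 0 = E(X_0), as the theorem asserts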

It would be nice if we could take the limit as n → ∞ in these results and deduce (in the martingale case) that E(XT) = E(X0), provided T is finite (a.s.). Unfortunately this doesn't always work; indeed, there is no reason why XT should even be integrable in general. The great twentieth century probabilist Joseph Doob found conditions which allow you to do this. The following result is called Doob's optional stopping theorem in his honour.

Theorem 3.3.3 (Optional stopping) Let X be a supermartingale and T be a stopping time. If either
(i) T is bounded, or
(ii) X is bounded and T is a.s. finite,
then XT is integrable and E(XT) ≤ E(X0). If X is a martingale and either (i) or (ii) holds, then E(XT) = E(X0).

Proof. We only consider the supermartingale case here (the extension to martingales is straightforward). First suppose that (i) holds, so there exists N ∈ N such that T(ω) ≤ N for all ω ∈ Ω. From theorem 3.3.2 we have E(X_{T∧n}) ≤ E(X0) for all n ∈ N. Now take n = N; then T ∧ N = T, and so we get E(XT) ≤ E(X0) as was required. If (ii) holds, there exists K > 0 such that |Xn(ω)| ≤ K for all n ∈ Z+, ω ∈ Ω. Hence we obtain |X_{T∧n} − X0| ≤ |X_{T∧n}| + |X0| ≤ 2K. By the dominated convergence theorem (theorem 1.1) we have

    E(XT) − E(X0) = lim_{n→∞} E(X_{T∧n} − X0) ≤ 0.

Note: For the proof of theorem 3.3.3 (ii), it is enough for (X_{T∧n}, n ∈ Z+) to be bounded.

3.3.2 Applications of Stopping Times

1. Hitting Times for a Random Walk

Let (Xn, n ∈ N) be a sequence of i.i.d. random variables with each P(Xn = 1) = P(Xn = −1) = 1/2. The simple (symmetric) random walk is the process (Sn, n ∈ Z+) defined as follows: S0 = 0, Sn = X1 + X2 + · · · + Xn.
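Before analysing this walk exactly, a simulation (ours; the cap and trial count are arbitrary) shows why Doob's hypotheses are needed. For T = min{n ∈ N; Sn = 1}, T turns out to be a.s. finite, yet ST = 1 by definition, so E(ST) = 1 ≠ 0 = E(S0): neither (i) nor (ii) of the theorem applies, since T is unbounded and the stopped walk is unbounded below. The sketch estimates P(T ≤ cap), which approaches 1 as the cap grows.

    import random

    random.seed(5)
    trials, cap, hit = 20_000, 5_000, 0
    for _ in range(trials):
        s = 0
        for _ in range(cap):
            s += random.choice([-1, 1])
            if s == 1:
                hit += 1
                break
    print(hit / trials)   # tends to 1 as cap grows: T is a.s. finite, yet
                          # S_T = 1 forces E(S_T) = 1 ≠ 0 = E(S_0)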

We work with the filtration (Fn, n ∈ Z+) where F0 = {∅, Ω} and, for n ∈ N, Fn = σ{X1, X2, . . . , Xn} = σ{S1, S2, . . . , Sn}. Now consider the stopping time

    T = min{n ∈ N; Sn = 1}.

We would like to know the distribution of T. Our first observation is that P(T = 2m) = 0 for all m ∈ Z+; indeed, after 2m steps the path must comprise k steps of −1 and 2m − k steps of +1 (0 ≤ k ≤ 2m), and hence the walk must be at a point which is an even number of steps away from the origin.

To get precise information about the probabilities P(T = 2m − 1) for m ∈ N, we reason as follows. Let θ ∈ R. Then for each n ∈ N,

    E(e^{θXn}) = (1/2)(e^θ + e^{−θ}) = cosh(θ),

hence E(sech(θ) e^{θXn}) = 1. By example (E3) we then have that (M^θ_n, n ∈ Z+) is a martingale, where M^θ_0 = 1 and, for n ≥ 1,

    M^θ_n = (sech(θ))^n e^{θSn}.

Now for all θ ∈ R we have cosh(θ) ≥ 1 ⇒ sech(θ) ≤ 1. Furthermore, if θ > 0, then for each n ∈ N, e^{θS_{T∧n}} ≤ e^{θS_T} = e^θ; hence (M^θ_{T∧n}, n ∈ Z+) is bounded (for θ > 0). By theorem 3.3.2 (2),

    E(M^θ_{T∧n}) = E((sech(θ))^{T∧n} e^{θS_{T∧n}}) = 1.

Although we cannot rule out at this stage that P(T = ∞) > 0, a technical argument which we will not give here³ allows us to still use Doob's optional stopping theorem (theorem 3.3.3) to get E(M^θ_T) = 1, i.e. E((sech(θ))^T e^{θS_T}) = 1. Since S_T = 1, this gives

    E((sech(θ))^T) = e^{−θ}.    (3.3)

³ Effectively we may define M^θ_T = 0 when T = ∞, since lim_{n→∞} (sech(θ))^n = 0.
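Equation (3.3) can be checked numerically (an illustrative sketch of ours; θ, the cap and the trial count are arbitrary). Paths that fail to hit 1 before the cap are treated as T = ∞ and contribute 0, consistent with the footnote's convention.

    import math, random

    random.seed(6)

    def hit_time(cap=10_000):
        """First n with S_n = 1; None if not reached within cap steps."""
        s = 0
        for n in range(1, cap + 1):
            s += random.choice([-1, 1])
            if s == 1:
                return n
        return None

    theta = 0.5
    sech = 1.0 / math.cosh(theta)
    trials, acc = 20_000, 0.0
    for _ in range(trials):
        T = hit_time()
        if T is not None:          # T = "infinity" contributes 0, matching
            acc += sech ** T       # the convention M_T = 0 when T = infinity
    print(round(acc / trials, 3), round(math.exp(-theta), 3))  # both near 0.607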

Now put α = sech(θ) = 2/(e^θ + e^{−θ}). Then e^θ and e^{−θ} are the two solutions of the quadratic equation αr² − 2r + α = 0. On solving this in the usual way, we obtain e^{−θ} = α^{−1}(1 − √(1 − α²)). Substituting into (3.3) yields

    E(α^T) = α^{−1}(1 − √(1 − α²)),    (3.4)

i.e.

    Σ_{n=0}^{∞} α^n P(T = n) = α^{−1}(1 − √(1 − α²)).

Now you can check the binomial series expansion (for y ∈ R with |y| ≤ 1)

    (1 − y)^{1/2} = Σ_{m=0}^{∞} C(1/2, m) (−y)^m,

where for m ∈ N the generalised binomial coefficient is

    C(1/2, m) = (1/2)(1/2 − 1)(1/2 − 2) · · · (1/2 − m + 1) / m!.

Substituting in (3.4) we obtain

    Σ_{n=0}^{∞} α^n P(T = n) = Σ_{m=1}^{∞} (−1)^{m−1} C(1/2, m) α^{2m−1}.    (3.5)

Equating coefficients yields

    P(T = 2m − 1) = (−1)^{m−1} C(1/2, m).

In particular you can compute P(T = 1) = 1/2, P(T = 3) = 1/8, P(T = 5) = 1/16, P(T = 7) = 5/128, so e.g. P(T ≤ 7) = 93/128 ≈ 0.7266.
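The coefficient formula is easy to evaluate exactly; here is a small sketch of ours using exact rational arithmetic, which reproduces the values just quoted.

    from fractions import Fraction
    from math import factorial

    def binom_half(m):
        """The generalised binomial coefficient C(1/2, m), exactly."""
        num = Fraction(1)
        for k in range(m):
            num *= Fraction(1, 2) - k
        return num / factorial(m)

    def p_T(m):
        """P(T = 2m - 1) = (-1)^(m-1) C(1/2, m)."""
        return (-1) ** (m - 1) * binom_half(m)

    print([str(p_T(m)) for m in range(1, 5)])   # ['1/2', '1/8', '1/16', '5/128']
    print(sum(p_T(m) for m in range(1, 5)))     # P(T <= 7) = 93/128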

Calculate Xn . observe Yn . so for each n ∈ N Xn = and X0 = 1. To explore further. This recipe appears rather strange at first glance. It can be shown that T is finite. The strategy is as follows: n=0 Repeat n = n + 1. g(Y1 )g(Y2 ) · · · g(Yn ) . more detailed analysis shows that we still have E0 (XT ) ≤ E0 (X0 ) in this case (where the subscript zero on the expectation emphasises that probabilities are calculated under H0 ). Although X is not bounded. we obtain 1 = E0 (X0 ) ≥ E0 (XT ) ≥ α−1 P (XT ≥ α−1 ) = α−1 P (reject H0 |H0 ). f (Y1 )f (Y2 ) · · · f (Yn ) 35 . n ∈ Z+ ) be the likelihood ratio martingale under H0 as described in (E4). Xn ≥ α−1 or Xn ≤ β}. Now using Markov’s inequality (Problem 8). Until Xn ≥ α−1 or Xn ≤ β If Xn ≥ α−1 accept H1 Otherwise accept H0 .Let (Xn . introduce the stopping time T = min{n ≥ 0. Hence P (reject H0 |H0 ) ≤ α as was required. A similar argument (assuming H1 this time) shows that P (reject H1 |H1 ) ≤ β.