Byron Schmuland
I returned, and saw under the sun, that the race is not to the swift, nor
the battle to the strong, neither yet bread to the wise, nor yet riches to
men of understanding, nor yet favour to men of skill; but time and chance
happeneth to them all. Ecclesiastes 9:11.
Contents
3. Optimal stopping
A. Strategies for winning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
B. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
C. Algorithms to find optimal strategies . . . . . . . . . . . . . . . . . . . . . . . . . 45
D. Binomial pricing model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4. Martingales
A. Conditional expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
B. Martingales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
C. Optional sampling theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
D. Martingale convergence theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5. Brownian motion
A. Basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
B. Reflection principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
C. Dirichlet problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6. Stochastic integration
A. Integration with respect to random walk . . . . . . . . . . . . . . . . . . . . . 78
B. Integration with respect to Brownian motion . . . . . . . . . . . . . . . . . 79
C. Itô's formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Basic definitions
Let (X_n)_{n=0}^∞ be a stochastic process taking values in a state space S that
has N states. To understand the behaviour of this process we will need to
calculate probabilities like

    P(X_0 = i_0, X_1 = i_1, . . . , X_n = i_n).        (1.1)
    = (26/52)(25/51)(26/50) = .12745.

(b) With replacement

    P(X_0 = R, X_1 = R, X_2 = B)
        = P(X_0 = R) P(X_1 = R | X_0 = R) P(X_2 = B | X_1 = R, X_0 = R)
        = (26/52)(26/52)(26/52) = .12500.
    P =
          B     R
    B   1/2   1/2
    R   1/2   1/2

Every row of the matrix sums to one: Σ_{j∈S} p(i, j) = 1. Such a matrix is
called a stochastic matrix.
    [Figure: sample path 0 0 0 1 1 1 0 of the one-phone system. The two arrows on top represent incoming calls and the arrow below is the completed call.]

Note that the second incoming call was lost since the phone was already busy
when it came in.
    P =
          0      1
    0   1−p     p
    1    q     1−q
    [Figure: sample path 0 0 0 1 1 2 1 of the two-phone system, again with the incoming calls marked above and the completed call marked below.]

This time the second incoming call is answered on the second phone.
    P =
            0                 1                 2
    0     1−p                 p                 0
    1   (1−p)q     (1−p)(1−q) + pq         p(1−q)
    2      0                  q                1−q
    [Figure: transition diagram of a random walk on the states 0, 1, . . . , N, with arrows joining each interior state j to its neighbours j−1 and j+1.]
    [Figure: "1d Random Walk" — three sample paths of 10000 steps, together with snapshots of the walk at Time = 0, 1, 2, 100, 200, 300.]
Calculating probabilities
Consider a two state Markov chain with p = 1/4 and q = 1/6 so that
    P =
          0     1
    0   3/4   1/4
    1   1/6   5/6
To find the probability that the process follows a certain path, you multiply
the initial probability with conditional probabilities. For example, what is
the chance that the process begins with 01010?
As a second example, let's find the chance that the process begins with
00000.
    P(X_1 = 1, X_2 = 0, X_3 = 1, X_4 = 0 | X_0 = 0) = (1/4)(1/6)(1/4)(1/6) = 1/576 = .00174.   (1.2)

    P(X_1 = 0, X_2 = 0, X_3 = 0, X_4 = 0 | X_0 = 0) = (3/4)^4 = 81/256 = .31641.   (1.3)
    P^4 =
              0              1
    0    3245/6912      3667/6912
    1    3667/10368     6701/10368

so that P(X_4 = 0 | X_0 = 0) = 3245/6912 = .46947.
In general, the distribution of X_n is obtained by multiplying the initial
distribution π_0 by the nth power of P:

    π_0 P^n = (π_0(1), . . . , π_0(i), . . . , π_0(N)) ( p_n(i, j) )_{i,j∈S},

whose jth entry is

    Σ_{i∈S} π_0(i) p_n(i, j) = Σ_{i∈S} P(X_0 = i) P(X_n = j | X_0 = i)
                             = Σ_{i∈S} P(X_0 = i, X_n = j)
                             = P(X_n = j).

In other words, the vector π_n = π_0 P^n gives the distribution of X_n.
Starting the chain at state 0, so that π_0 = (1, 0), we get

    π_4 = π_0 P^4 = (1, 0) ( 3245/6912   3667/6912 ; 3667/10368   6701/10368 )
        = (3245/6912, 3667/6912) = (.46947, .53053).

On the other hand, if we flip a coin to choose the starting position, then
π_0 = (1/2, 1/2) and π_4 = (17069/41472, 24403/41472) = (.41158, .58842).
Invariant Probabilities
Example. Let's try to find invariant probability vectors for some Markov
chains.
1.  Suppose that P = ( 0  1 ; 1  0 ). An invariant probability vector
    π = (π_1, π_2) must satisfy

        (π_1, π_2) = (π_1, π_2) ( 0  1 ; 1  0 ),

    that is, π_1 = π_2. Together with π_1 + π_2 = 1, this gives π = (1/2, 1/2).

2.  If P = ( 1  0 ; 0  1 ), then every probability vector π satisfies π = πP!

3.  P = ( 3/4  1/4 ; 1/6  5/6 ). Solving π = πP together with π_1 + π_2 = 1
    gives π = (2/5, 3/5).
Theorem.   A probability vector π is invariant if and only if there is a
probability vector v such that π = lim_n vP^n.

Proof: (⇒) Suppose that π is invariant, and choose π_0 = π. Then
π_1 = π_0 P = πP = π. Repeating this argument shows that π_n = π for all
n ≥ 1. Therefore π = lim_n πP^n (in this case, we say that the Markov chain
is in equilibrium).

(⇐) Suppose that π = lim_n vP^n. Multiply both sides on the right by P to
obtain

    πP = (lim_n vP^n) P = lim_n vP^{n+1} = π.   ∎
Let's investigate the general 2 × 2 matrix P = ( 1−p  p ; q  1−q ). It has
eigenvalues 1 and λ = 1 − (p + q). If p + q > 0, then P can be diagonalized
as P = QDQ^{-1}, where

    Q = ( 1   p ; 1  −q ),   D = ( 1  0 ; 0  λ ),
    Q^{-1} = ( q/(p+q)   p/(p+q) ; 1/(p+q)  −1/(p+q) ).

Using these matrices, it is easy to find powers of the matrix P. For example,
P² = (QDQ^{-1})(QDQ^{-1}) = QD²Q^{-1}. In the same way, for every n ≥ 1 we
have

    P^n = QD^nQ^{-1} = Q ( 1  0 ; 0  λ^n ) Q^{-1}
        = (1/(p+q)) ( q + pλ^n    p − pλ^n ;  q − qλ^n    p + qλ^n ).

If also p + q < 2, then |λ| < 1, so λ^n → 0 and

    P^n → (1/(p+q)) ( q  p ; q  p ) = ( π ; π ),

where π = (q/(p+q), p/(p+q)). For any probability vector v we have

    lim_n vP^n = v lim_n P^n = v ( π ; π ) = π.

This means that π is the unique limiting vector for P, and hence the unique
invariant probability vector.
The next result is valid for any Markov chain, ergodic or not.
Theorem.   If P is a stochastic matrix, then (1/n) Σ_{k=1}^n P^k → M. The
set of invariant probability vectors is the set of all convex combinations of
rows of M.

Proof: We assume the convergence result and prove the second statement.
Note that any vector that is a convex combination of rows of M can be
written π = vM for some probability vector v.

(⇒) If π is an invariant probability vector, then π = πP^k for every k.
Therefore

    π = π ( (1/n) Σ_{k=1}^n P^k ) → πM,

which shows that π is a convex combination of the rows of M.

(⇐) Suppose that π = vM for some probability vector v. Then

    πP = vMP = lim_n v ( (1/n) Σ_{k=1}^n P^k ) P
             = lim_n (1/n) v (P² + · · · + P^{n+1})
             = lim_n [ (1/n) v (P + · · · + P^n) + (1/n) v (P^{n+1} − P) ]
             = vM + 0
             = π.   ∎
We sketch the argument for the convergence result (1/n) Σ_{k=1}^n P^k → M.
The (i, j) entry of the approximating matrix (1/n) Σ_{k=1}^n P^k can be
expressed in terms of probability as

    (1/n) Σ_{k=1}^n (P^k)_{ij} = (1/n) Σ_{k=1}^n P_i(X_k = j)
                               = (1/n) Σ_{k=1}^n E_i(1_{(X_k=j)})
                               = E_i ( (1/n) Σ_{k=1}^n 1_{(X_k=j)} ).

This is the expected value of the random variable representing the average
number of visits the Markov chain makes to state j during the first n time
periods. A law of large numbers type result will be used to show why this
average converges.
Define the return time of the state j as T_j := inf{k ≥ 1 | X_k = j}. We
use the convention that the infimum of the empty set is ∞. There are
two possibilities for the sequence 1_{(X_k=j)}: if T_j = ∞, then it is just a
sequence of zeros, and (1/n) Σ_{k=1}^n 1_{(X_k=j)} = 0. On the other hand, if
T_j < ∞, then the history of the process up to T_j is irrelevant and we may
just as well start counting visits to j from time T_j. This leads to the
equation

    m_ij = P_i(T_j < ∞) m_jj.

Putting i = j above, we discover that if P_j(T_j < ∞) < 1, then m_jj = 0.
Thus m_ij = 0 for all i ∈ S, and hence π_j = 0 for any invariant probability
vector π. If P_j(T_j < ∞) = 1, then in fact E_j(T_j) < ∞ (for a finite state
space!).
The following example shows the first n + 1 values of the sequence 1_{(X_k=j)},
where we assume that the (ℓ+1)th visit to state j occurs at time n. The
random variable T_j^s is defined as the time between the (s−1)th and sth
visit. These are independent, identically distributed random variables with
the same distribution as T_j.

    1 0000 1 00000 1 0000000000000 1  · · ·  0000000 1
     |—T_j^1—||—T_j^2—||————T_j^3————|       |—T_j^ℓ—|
The average number of visits to state j up to time n can be represented as
the inverse of the average amount of time between visits. The law of large
numbers gives

    (1/n) Σ_{k=1}^n 1_{(X_k=j)} ≈ ℓ / (T_j^1 + · · · + T_j^ℓ) → 1 / E_j(T_j).

We conclude that (1/n) Σ_{k=1}^n p_k(j, j) → m_jj = 1/E_j(T_j).
Example. Consider the Markov chain on {1, 2, 3, 4} with transition matrix

    P =
        (  0    1/3   1/3   1/3 )
        ( 1/2    0     0    1/2 )
        ( 1/2    0     0    1/2 )
        ( 1/3   1/3   1/3    0  )

To find the invariant probability vector, we rewrite the equation π = πP
as (I − P^t)π^t = 0, where P^t is the transpose of the matrix P and I is the
identity matrix. The usual procedure of row reduction leads to

        ( 1   0   0   −1  )
        ( 0   1   0  −2/3 )
        ( 0   0   1  −2/3 )
        ( 0   0   0    0  )

This last matrix tells us that π_1 − π_4 = 0, π_2 − 2π_4/3 = 0, and
π_3 − 2π_4/3 = 0; in other words, the invariant vector is
(π_4, 2π_4/3, 2π_4/3, π_4). Because π_1 + π_2 + π_3 + π_4 = 1, we need
π_4 = 3/10, so the unique invariant probability vector is

    π = ( 3/10, 2/10, 2/10, 3/10 ).
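Row reduction is one route; numerically, the same vector appears if we simply iterate π_{n+1} = π_n P, since this chain is aperiodic. A small sketch in plain Python (the variable names are ours):

```python
# Four-state chain from the example above.
P = [[0,   1/3, 1/3, 1/3],
     [1/2, 0,   0,   1/2],
     [1/2, 0,   0,   1/2],
     [1/3, 1/3, 1/3, 0  ]]

pi = [0.25, 0.25, 0.25, 0.25]       # any starting distribution works
for _ in range(1000):
    pi = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]

print([round(x, 6) for x in pi])    # → [0.3, 0.2, 0.2, 0.3]
```

This is exactly the "chain in equilibrium" statement of the theorem above: vP^n converges to the invariant vector.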
2. Random walk.
    [Figure: transition diagram of the random walk on the states 0, 1, . . . , N; each interior state j moves to its neighbours j−1 and j+1 with probability 1/2.]
The equations π = πP read

    2π_0 = π_1
    π_1 − 2π_0 = π_2 − π_1
    π_j − π_{j−1} = π_{j+1} − π_j,   for j = 2, . . . , N−2
    π_{N−1} − 2π_N = π_{N−2} − π_{N−1}
    2π_N = π_{N−1}.

Combining the first two equations shows that π_2 = π_1, and then the middle
set of equations implies that π_1 = π_2 = · · · = π_{N−1}. If π_1 > 0, then
both π_0 and π_N equal π_1/2. From Σ_{j=0}^N π_j = 1, we get the unique
solution

    π_0 = π_N = 1/(2N),   π_j = 1/N,  j = 1, . . . , N−1.
Each page links to some of the others:

    A → B, C     B → C, D     C → D, E, F     D → A     E → F     F → A

so the link matrix g (with g_ij = 1 when page i links to page j) is

          A  B  C  D  E  F
    A     0  1  1  0  0  0
    B     0  0  1  1  0  0
    C     0  0  0  1  1  1
    D     1  0  0  0  0  0
    E     0  0  0  0  0  1
    F     1  0  0  0  0  0

The transition probabilities are p_ij = p g_ij / Σ_k g_ik + (1 − p)/n. With
n = 6 and p = .85, we get

    P = (1/40) ×
        (  1    18    18    1      1     1   )
        (  1     1    18   18      1     1   )
        (  1     1     1   37/3   37/3  37/3 )
        ( 35     1     1    1      1     1   )
        (  1     1     1    1      1    35   )
        ( 35     1     1    1      1     1   )

Using software to solve the matrix equation π = πP, we get

    π = (.2763, .1424, .2030, .1431, .0825, .1526),

so the pages ranked according to their PageRank are A, C, F, D, B, E.
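One way to reproduce this computation is power iteration on the PageRank matrix. A sketch for the six-page example above (the `links` encoding is ours):

```python
# Build P with entries p*g_ij/deg(i) + (1-p)/n, then iterate pi P.
links = {'A': 'BC', 'B': 'CD', 'C': 'DEF', 'D': 'A', 'E': 'F', 'F': 'A'}
pages = 'ABCDEF'
n, p = len(pages), 0.85

P = [[p * (pages[j] in links[pages[i]]) / len(links[pages[i]]) + (1 - p) / n
      for j in range(n)] for i in range(n)]

pi = [1 / n] * n
for _ in range(200):
    pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

print({pg: round(x, 4) for pg, x in zip(pages, pi)})
```

Since each row of P has all entries at least (1 − p)/n, the chain is ergodic and the iteration converges quickly to the ranking A, C, F, D, B, E.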
Classification of states
Letting R_j^s denote the time of the sth return to state j,

    P_j(R_j^s < ∞ for every s ≥ 1) = { 0 if j is transient;  1 if j is recurrent }.
The probability of infinitely many visits to state j is either zero or one,
according as the state j is transient or recurrent.
Definition. Two states i, j ∈ S are said to communicate if there exist
m, n ≥ 0 so that p^m_ij > 0 and p^n_ji > 0.
By this definition, every state communicates with itself (reflexive). Also, if
i communicates with j, then j communicates with i (symmetric). Finally, if
i and j communicate, and j and k communicate, then i and k communicate
(transitive). Therefore communication is an equivalence relation and we
can divide the state space into disjoint sets called communication classes.
If i is transient, then m_ii = 0 and the equation m_ji = P_j(T_i < ∞) m_ii
shows that m_ji = 0 for all j ∈ S. The jth row of the matrix M is invariant
for P, and hence for any power of P, so that

    0 = m_ji = Σ_{k∈S} m_jk p^n_ki ≥ m_jj p^n_ji.
Lemma.
After grouping the states into communication classes, the recurrent classes
R_1, . . . , R_r and the transient states give P the block form

    P =
        ( P_1   0   · · ·   0    0 )
        (  0   P_2  · · ·   0    0 )
        (  ⋮          ⋱          ⋮ )
        (  0    0   · · ·  P_r   0 )
        (         S              Q )

Each recurrent class R_ℓ forms a little Markov chain with transition matrix
P_ℓ. When you take powers you get

    P^n =
        ( P_1^n    0    · · ·    0     0  )
        (   0    P_2^n  · · ·    0     0  )
        (   ⋮             ⋱            ⋮  )
        (   0      0    · · ·  P_r^n   0  )
        (          S_n                Q^n )

for an appropriate matrix S_n.
Averaging the powers P^n, each block P_ℓ^n averages to a matrix Π_ℓ and the
transient entries average to 0, giving the matrix M:

    M =
        ( Π_1   0   · · ·   0    0 )
        (  0   Π_2  · · ·   0    0 )
        (  ⋮          ⋱          ⋮ )
        (  0    0   · · ·  Π_r   0 )
        (  *    *   · · ·   *    0 )

If i and j are in the same recurrent class R_ℓ, the argument in the Lemma
above shows that P_i(T_j < ∞) = 1 and so m_ij = m_jj. That is, the rows of
Π_ℓ are identical and give the unique invariant probability vector for P_ℓ.
1.  If P = ( 0  1 ; 1  0 ), then there is only the one recurrent class
    R_1 = {0, 1}. The invariant probability must be unique and have strictly
    positive entries.

2.  If P = ( 1  0 ; 0  1 ), then there are two recurrent classes R_1 = {0}
    and R_2 = {1}. The invariant measures are π = a(1, 0) + (1 − a)(0, 1) for
    0 ≤ a ≤ 1. That is, all probability vectors!
3.  Suppose we have

    P =
        ( 1/2  1/2   0    0    0  )
        ( 1/6  5/6   0    0    0  )
        (  0    0   3/4  1/4   0  )
        (  0    0   1/6  5/6   0  )
        ( 1/5  1/5  1/5  1/5  1/5 )

    The classes are R_1 = {0, 1}, R_2 = {2, 3}, and T_1 = {4}. The invariant
    measures are π = a(1/4, 3/4, 0, 0, 0) + (1 − a)(0, 0, 2/5, 3/5, 0) for
    0 ≤ a ≤ 1. None of these puts mass on the transient state.
4.  Take a random walk with absorbing boundaries at 0 and N. We can
    reach 0 from any state in the interior, but we can't get back. The
    interior states must therefore be transient. Each boundary point is
    recurrent, so R_1 = {0}, R_2 = {N}, and T_1 = {1, 2, . . . , N−1}, and the
    invariant probability vectors are

    π = a(1, 0, 0, . . . , 0, 0) + (1 − a)(0, 0, . . . , 0, 1)
      = (a, 0, 0, . . . , 0, 1 − a) for 0 ≤ a ≤ 1.
Hitting times
Partition the state space S into two pieces D and E. We suppose that for
every starting point in D it is possible to reach the set E. We are interested
in the transition from D to E.
    [Figure: the state space S partitioned into the two pieces D and E; the chain starts in D and eventually makes a transition into E.]
Let Q be the matrix of transition probabilities from the set D into itself,
and S the matrix of transition probabilities of D into E.
The row sums of (I − Q)^{-1} give the expected amount of time spent until
the chain hits E.

The matrix (I − Q)^{-1} S gives the probability distribution of the first
state visited in E.
Examples.
1. The rat.
    P =
          1     2     3     4
    1 (  0    1/3   1/3   1/3 )
    2 ( 1/2    0     0    1/2 )
    3 ( 1/2    0     0    1/2 )
    4 ( 1/3   1/3   1/3    0  )
Taking E = {4} and D = {1, 2, 3},

    Q =
          1     2     3
    1 (  0    1/3   1/3 )
    2 ( 1/2    0     0  )
    3 ( 1/2    0     0  )

and

    S =
    1 ( 1/3 )
    2 ( 1/2 )
    3 ( 1/2 )
We calculate

    I − Q =
           1      2      3
    1 (   1    −1/3   −1/3 )
    2 ( −1/2     1      0  )
    3 ( −1/2     0      1  )

and the row sums of (I − Q)^{-1} give

    ( E_1(T_4) )   ( 5/2 )
    ( E_2(T_4) ) = ( 9/4 )
    ( E_3(T_4) )   ( 9/4 )

and the first entry answers our question: E_1(T_4) = 5/2.
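The row sums of (I − Q)^{-1} solve the linear system (I − Q)t = 1, so they can be found by Gaussian elimination. A self-contained sketch with exact fractions (the `solve` helper is ours):

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]

# I - Q for the rat example; (I-Q)t = 1 gives the expected hitting times.
IQ = [[F(1),     F(-1, 3), F(-1, 3)],
      [F(-1, 2), F(1),     F(0)],
      [F(-1, 2), F(0),     F(1)]]
t = solve(IQ, [F(1), F(1), F(1)])
print(t)   # [Fraction(5, 2), Fraction(9, 4), Fraction(9, 4)]
```

The first entry is the answer E_1(T_4) = 5/2 from the text.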
2. $100 or bust.
Consider a random walk on the graph pictured below. You keep moving
until you hit either $100 or ruin. What is the probability that you end up
ruined?
    [Figure: a graph with three interior nodes; one exit leads to $100 and another to ruin, and the walk begins at the bottom left corner.]
E consists of the states $100 and ruin, so the Q and S matrices look like:

    Q =
        (  0   1/3  1/3 )
        ( 1/4   0   1/4 )
        ( 1/3  1/3   0  )

    S =
        ( 1/3   0  )
        ( 1/4  1/4 )
        (  0   1/3 )

where the two columns of S correspond to $100 and ruin respectively.
A bit of linear algebra gives

    (I − Q)^{-1} S =
        ( 11/8  2/3   5/8 ) ( 1/3   0  )   ( 5/8  3/8 )
        (  1/2  4/3   1/2 ) ( 1/4  1/4 ) = ( 1/2  1/2 )
        (  5/8  2/3  11/8 ) (  0   1/3 )   ( 3/8  5/8 )
Starting from the bottom left hand corner there is a 5/8 chance of being
ruined before hitting the money. Hey! Did you notice that if we start in
the center, then getting ruined is a 50–50 proposition? Why doesn't this
surprise me?
3. The spider and the fly.

    [Figure: a cube, with the fly at one corner and the spider at the diagonally opposite corner.]
To begin with, it helps to squash the cube flat and label the corners to see
what is going on.
    [Figure: the cube squashed flat, with its corners labeled.]
Label the fly's corner 1 and the spider's corner 8, with corners 2, 3, 4
adjacent to 1 and corners 5, 6, 7 adjacent to 8 (say 5 ~ 2, 3, 6 ~ 2, 4, and
7 ~ 3, 4). Taking E = {8} and D = {1, . . . , 7}, each move goes to one of the
three neighbouring corners with probability 1/3, so

    Q =
          1    2    3    4    5    6    7
    1 (  0   1/3  1/3  1/3   0    0    0  )
    2 ( 1/3   0    0    0   1/3  1/3   0  )
    3 ( 1/3   0    0    0   1/3   0   1/3 )
    4 ( 1/3   0    0    0    0   1/3  1/3 )
    5 (  0   1/3  1/3   0    0    0    0  )
    6 (  0   1/3   0   1/3   0    0    0  )
    7 (  0    0   1/3  1/3   0    0    0  )

    S = ( 0, 0, 0, 0, 1/3, 1/3, 1/3 )^t.

The row sums of (I − Q)^{-1} give the expected times to reach the spider;
the first entry is E_1(T_8) = 10.
4. Random walk.
Take the random walk on S = {0, 1, 2, 3, 4} with absorbing boundaries.
    [Figure: from each interior state the walk steps right with probability p and left with probability 1 − p; the boundary states 0 and 4 are absorbing.]
With D = {1, 2, 3} and E = {0, 4},

    Q =
          1     2     3
    1 (  0     p     0  )
    2 ( 1−p    0     p  )
    3 (  0    1−p    0  )

and

    S =
          0      4
    1 ( 1−p     0 )
    2 (  0      0 )
    3 (  0      p )

Writing Δ = (1−p)² + p², we get

    (I − Q)^{-1} = (1/Δ) ×
        ( 1 − p(1−p)      p          p²      )
        (    1−p          1          p       )
        (   (1−p)²       1−p     1 − p(1−p)  )

and the row sums give

    ( E_1(T_E) )            ( 1 + 2p²      )
    ( E_2(T_E) ) = (1/Δ) ×  (     2        )
    ( E_3(T_E) )            ( 1 + 2(1−p)²  )
Matrix multiplication gives

    (I − Q)^{-1} S = (1/Δ) ×
               0                     4
    1 ( (1 − p + p²)(1−p)           p³        )
    2 (      (1−p)²                 p²        )
    3 (      (1−p)³           (1 − p + p²) p  )
Starting from the middle state, the expected length of the game is

    E(length of game) = 2 / ((1−p)² + p²),

and the probability of ending in ruin is

    P(ruin) = (1−p)² / ((1−p)² + p²).

    [Figure: the graph of P(ruin) as a function of p; it decreases from 1 at p = 0 to 0 at p = 1, and equals 0.5 at p = 0.5.]
Coin patterns. How many times, on average, must we toss a fair coin before
the pattern HHH appears? Keep track of the current progress toward the
pattern:

    no useful progress: State 3     H: State 2     HH: State 1     HHH: State 0.

With E = {0} and D = {1, 2, 3},

    Q =
          1     2     3
    1 (  0     0    1/2 )
    2 ( 1/2    0    1/2 )
    3 (  0    1/2   1/2 )

    (I − Q)^{-1} =
          1   2   3
    1 (  2   2   4 )
    2 (  2   4   6 )
    3 (  2   4   8 )

The row totals give the expected waiting times; starting from scratch,
E(T) = 2 + 4 + 8 = 14 tosses.
We can apply the same idea to other patterns; let's take THT. Now the
states record the progress toward THT (for example, after HH we are at
State 3, after HHT at State 2):

    no useful progress: State 3     T: State 2     TH: State 1     THT: State 0.

This time

    Q =
          1     2     3
    1 (  0     0    1/2 )
    2 ( 1/2   1/2    0  )
    3 (  0    1/2   1/2 )

    (I − Q)^{-1} =
          1   2   3
    1 (  2   2   2 )
    2 (  2   4   2 )
    3 (  2   4   4 )

and the row totals show that, starting from scratch, E(T) = 2 + 4 + 4 = 10:
this pattern arrives four tosses sooner on average than the previous one.
Consider a chain on {1, . . . , N} in which the current element j is replaced
by one of the better elements 1, . . . , j−1, chosen uniformly at random, and
the best element 1 is absorbing:

    P =
            1        2        3      · · ·    N−1     N
    1  (    1        0        0      · · ·     0      0 )
    2  (    1        0        0      · · ·     0      0 )
    3  (   1/2      1/2       0      · · ·     0      0 )
    4  (   1/3      1/3      1/3     · · ·     0      0 )
       (    ⋮                          ⋱                )
    N  ( 1/(N−1)  1/(N−1)  1/(N−1)   · · ·  1/(N−1)   0 )

We are trying to hit E = {1}, and so

    Q =
        (    0        0        0      · · ·   0 )
        (   1/2       0        0      · · ·   0 )
        (   1/3      1/3       0      · · ·   0 )
        (    ⋮                   ⋱              )
        ( 1/(N−1)  1/(N−1)  1/(N−1)   · · ·   0 )

with rows and columns indexed by 2, . . . , N. A bit of experimentation with
Maple will convince you that

    (I − Q)^{-1} =
        (  1     0     0    · · ·     0       0 )
        ( 1/2    1     0    · · ·     0       0 )
        ( 1/2   1/3    1    · · ·     0       0 )
        (  ⋮                  ⋱                 )
        ( 1/2   1/3   1/4   · · ·  1/(N−1)    1 )

Taking row totals shows that E(T_j) = 1 + (1/2) + (1/3) + · · · + (1/(j−1)).
Even if we begin with the worst element, we have E(T_N) = 1 + (1/2) +
(1/3) + · · · + (1/(N−1)) ≈ log(N). It takes an average of log(N) steps to
get the best element. The average case is much faster than the worst case
analysis might lead you to believe.
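The row totals can be checked with a short recursion: since the chain only moves to lower-numbered states, E(T_j) = 1 + (1/(j−1)) Σ_{i<j} E(T_i). A sketch:

```python
import math

N = 1000
t = [0.0] * (N + 1)       # t[j] = expected steps to reach state 1 from j
for j in range(2, N + 1):
    t[j] = 1 + sum(t[1:j]) / (j - 1)

# t[N] equals the harmonic sum 1 + 1/2 + ... + 1/(N-1), which grows like log(N).
H = sum(1 / i for i in range(1, N))
print(t[N], H, math.log(N))
```

A short induction shows the recursion collapses to exactly the harmonic sum in the text.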
The n step transition probabilities satisfy the Chapman–Kolmogorov
equation

    p_{m+n}(x, y) = Σ_{z∈S} p_m(x, z) p_n(z, y).   (2.4)

The expected number of visits to the state x is

    E_x ( Σ_{n=0}^∞ 1_{(X_n=x)} ) = Σ_{n=0}^∞ P_x(X_n = x) = Σ_{n=0}^∞ p_n(x, x).

Theorem.   The state x is recurrent if and only if

    Σ_{n=0}^∞ p_n(x, x) = ∞.
Example. Random walk on Z.

    [Figure: from each state x the walk steps to x+1 with probability p and to x−1 with probability 1 − p.]
Take x = 0 and assume that X_0 = 0. Let's find p_{2n}(0, 0). In order that
X_{2n} = 0, there must have been n steps to the left and n steps to the
right. The number of such paths is (2n choose n), and the probability of each
path is p^n(1−p)^n, so

    p_{2n}(0, 0) = (2n choose n) p^n (1−p)^n = ((2n)! / (n! n!)) p^n (1−p)^n.

Using Stirling's formula n! ≈ √(2πn) (n/e)^n, this gives

    p_{2n}(0, 0) ≈ (4p(1−p))^n / √(πn).

Since 4p(1−p) = 1 when p = 1/2 and 4p(1−p) < 1 otherwise, the series
Σ_n p_n(0, 0) diverges if and only if p = 1/2: the walk is recurrent exactly
when it is symmetric.
Theorem.   The symmetric random walk in Z^d is recurrent if d ≤ 2 and
transient if d ≥ 3.
Let a(y) = P_y(X_n hits state x before z), so that a(x) = 1 and a(z) = 0.
The harmonic equation for a rearranges to

    a(y) − a(y + 1) = (q_y / p_y) [a(y − 1) − a(y)],

and iterating this relation gives

    a(y) − a(y + 1) = r_{x+1} · · · r_y [a(x) − a(x + 1)],   where r_y = q_y / p_y,

with the convention that the empty product r_{x+1} · · · r_x equals 1.
Summing over y = x, . . . , w − 1 gives

    1 − a(w) = Σ_{y=x}^{w−1} r_{x+1} · · · r_y [a(x) − a(x + 1)].   (1)

Taking w = z, so that a(z) = 0, shows that

    1 = Σ_{y=x}^{z−1} r_{x+1} · · · r_y [a(x) − a(x + 1)],

and plugging this back into (1) and solving for a(w) gives

    a(w) = ( Σ_{y=w}^{z−1} r_{x+1} · · · r_y ) / ( Σ_{y=x}^{z−1} r_{x+1} · · · r_y ).   (2)
Consequences

1. Let's define the function b by

    b(y) = P_y(X_n will hit state z before x).

This function is also harmonic, but satisfies the opposite boundary
conditions b(x) = 0 and b(z) = 1. Equation (1) is valid for any harmonic
function, so let's plug in b and multiply by −1 to get

    b(w) = b(w) − b(x) = Σ_{y=x}^{w−1} r_{x+1} · · · r_y [b(x + 1) − b(x)].   (3)

2. Letting z → ∞ in (2) gives the probability α(w) that the process ever
hits x starting from w:

    α(w) = lim_z ( Σ_{y=w}^{z−1} r_{x+1} · · · r_y ) / ( Σ_{y=x}^{z−1} r_{x+1} · · · r_y ).   (4)

If the denominator of (4) diverges, i.e., Σ_{y=x}^∞ r_{x+1} · · · r_y = ∞,
then lim_z b(w) = 0, so lim_z a(w) = α(w) = 1 for all w. On the other hand,
if Σ_{y=x}^∞ r_{x+1} · · · r_y < ∞, then

    α(w) = ( Σ_{y=w}^∞ r_{x+1} · · · r_y ) / ( Σ_{y=x}^∞ r_{x+1} · · · r_y ).   (5)

This shows that α(w) decreases to zero as w → ∞.
3. For constant probabilities, r_y ≡ r = q/p and (2) becomes

    a(w) = (r^z − r^w) / (r^z − r^x)   if r ≠ 1,
    a(w) = (z − w) / (z − x)           if r = 1.

Letting z → ∞ gives the probability we ever hit x from the right:
α(w) = 1 ∧ r^{w−x}.
4. Notice that α(x) = 1. The process is guaranteed to hit state x when you
start there, for the simple reason that we count the visit at time zero! Let's
work out the chance of a return to state x. Conditioning on the position
X_1 at time 1, we have

    P_x(T_x < ∞) = q P_{x−1}(hit x) + (1 − (p + q)) P_x(hit x) + p P_{x+1}(hit x)
                 = q (1 ∧ r^{−1}) + (1 − (p + q)) · 1 + p (1 ∧ r)
                 = (q ∧ p) + (1 − (p + q)) + (p ∧ q)
                 = 1 − |p − q|.

This shows that the chain is recurrent if and only if p = q.
Theorem.   Suppose X_n is a genuine d-dimensional random walk with
Σ_x |x| p(0, x) < ∞. The walk is recurrent if d = 1, 2 and Σ_x x p(0, x) = 0.
Otherwise it is transient.
Lemma.   If i and j communicate, then Σ_k p^k_ii < ∞ if and only if
Σ_k p^k_jj < ∞.

Proof.   Choose n and m so p^m_ij > 0 and p^n_ji > 0. Then for every k ≥ 0,
we have p^{n+k+m}_jj ≥ p^n_ji p^k_ii p^m_ij, so that

    Σ_k p^{n+k+m}_jj ≥ p^m_ij p^n_ji Σ_k p^k_ii.

Therefore Σ_k p^k_jj < ∞ implies Σ_k p^k_ii < ∞. Reversing the roles of i
and j gives the result.   ∎
Example.   For the symmetric random walk on Z, p_{2k}(0, 0) ≈ 1/√(πk), so
the averages satisfy

    m_00 = lim_n (1/2n) Σ_{k=1}^{2n} p^k_00 = lim_n (1/2n) Σ_{k=1}^{n} p_{2k}(0, 0)
         ≤ lim_n (1/2n) Σ_{k=1}^{n} 1/√(πk) = 0.

The walk is recurrent, yet m_00 = 0; by the relation m_00 = 1/E_0(T_0), the
expected return time is infinite.
Branching processes
    [Figure: the family tree of a branching process.]

Each individual independently produces a random number Y of offspring,
with distribution P(Y = k) = p_k for k = 0, 1, 2, 3, . . .
Let μ = Σ_{j=0}^∞ j p_j be the mean number of offspring per individual.
Each of the X_n individuals of generation n contributes, on average, μ
children to generation n + 1, so that

    E(X_{n+1}) = Σ_{k=0}^∞ E(X_{n+1} | X_n = k) P(X_n = k)
               = Σ_{k=0}^∞ kμ P(X_n = k)
               = μ E(X_n).

By induction, we discover that E(X_n) = μ^n E(X_0).

If μ < 1, then E(X_n) → 0 as n → ∞. The estimate

    E(X_n) = Σ_{k=0}^∞ k P(X_n = k) ≥ Σ_{k=1}^∞ P(X_n = k) = P(X_n ≥ 1)

shows that lim_n P(X_n = 0) = 1. Now, for a branching process, the state
0 is absorbing so we can draw the stronger conclusion that P(lim_n X_n =
0) = 1. In other words, the branching process is guaranteed to become
extinct if μ < 1.
Extinction.   Let's define a sequence a_n = P(X_n = 0 | X_0 = 1), that
is, the probability that the population is extinct at the nth generation,
starting with a single individual. Conditioning on X_1 and using the Markov
property we get

    a_{n+1} = P(X_{n+1} = 0 | X_0 = 1)
            = Σ_{k=0}^∞ P(X_{n+1} = 0 | X_1 = k) p_k
            = Σ_{k=0}^∞ (a_n)^k p_k.

If we define φ(s) = Σ_{k=0}^∞ p_k s^k, then the equation above can be written
as a_{n+1} = φ(a_n). Note that φ(0) = P(Y = 0), and φ(1) = 1.
Also φ′(s) = Σ_{k=0}^∞ p_k k s^{k−1} ≥ 0, and φ′(1) = E(Y). Finally, note
that φ″(s) = Σ_{k=0}^∞ p_k k(k−1) s^{k−2} ≥ 0, and if p_0 + p_1 < 1, then
φ″(s) > 0 for s > 0.

The sequence (a_n) is defined through the equations a_0 = 0 and a_{n+1} = φ(a_n)
for n ≥ 1. Since a_0 = 0, we trivially have a_0 ≤ a_1. Apply φ to both sides
of the inequality to obtain a_1 ≤ a_2. Continuing in this way we find that
the sequence (a_n) is nondecreasing. Since (a_n) is bounded above by 1, we
conclude that a_n ↑ a for some constant a.

The value a gives the probability that the population will eventually
become extinct. It is the smallest solution to the equation a = φ(a). The
following pictures sketch the proof that a = 1 (extinction is certain) if and
only if E(Y) ≤ 1.
    [Figure: cobweb diagrams of the iteration a_{n+1} = φ(a_n). When E(Y) ≤ 1 the iterates a_0 ≤ a_1 ≤ a_2 ≤ · · · climb the diagonal to the fixed point a = 1; when E(Y) > 1 the graph of φ crosses the diagonal before s = 1 and the iterates converge to the smaller fixed point a < 1.]
Examples.
1. p_0 = 1/4, p_1 = 1/4, p_2 = 1/2. This gives us μ = 5/4 and φ(s) =
1/4 + s/4 + s²/2. Solving φ(s) = s gives two solutions {1/2, 1}. Therefore
a = 1/2.
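The fixed point can also be found by simply iterating a_{n+1} = φ(a_n) from a_0 = 0, mirroring the cobweb picture. A quick sketch for this offspring distribution:

```python
# Extinction probability for p0 = 1/4, p1 = 1/4, p2 = 1/2:
# iterate a_{n+1} = phi(a_n) starting from a_0 = 0.
def phi(s):
    return 1/4 + s/4 + s**2 / 2

a = 0.0
for _ in range(200):
    a = phi(a)
print(a)   # → 0.4999... , approaching the smallest root 1/2 of phi(s) = s
```

The iterates increase toward 1/2 and never cross it, just as the monotonicity argument above predicts.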
Optimal Stopping
The payoff function rewards only reaching state N:

    f(x) = 1_{{N}}(x).

    [Figure: the graph of f on the states 0, 1, . . . , N; it is 0 everywhere except for the value 1 at x = N.]
Examples.
1. T ≡ 0, i.e., don't gamble.
2. T ≡ 1, i.e., play once, then stop.
3. T = wait for 3 reds in a row, then bet on black; repeat until you hit 0
or N.

We will always assume that P(T < ∞) = 1 in this section.
Definition. The value function v : S → [0, ∞) is defined as the largest
expected profit possible from that starting point;

    v(x) = sup_T E(f(X_T) | X_0 = x).

Taking one step and then continuing optimally is itself a legitimate
strategy, and its expected profit is

    Σ_{y∈S} v(y) p(x, y) = (P v)(x).

Therefore v(x) ≥ (P v)(x) for all x ∈ S. Such a function is called
superharmonic.
If u is any superharmonic function with u ≥ f, then for the optimal
stopping time T_opt,

    u(x) = E(u(X_0) | X_0 = x) ≥ E(u(X_{T_opt}) | X_0 = x)
         ≥ E(f(X_{T_opt}) | X_0 = x) = v(x).   ∎

Examples
Examples
For the random walk absorbed at 0 and N with payoff f = 1_{{N}}, the value
function is

    v(x) = (1 − (q/p)^x) / (1 − (q/p)^N)   if p ≠ q,
    v(x) = x / N                           if p = q.
3. Zarin case. The following excerpt is taken from What is the Worth of
Free Casino Credit? by Michael Orkin and Richard Kakigi, published in
the January 1995 issue of the American Mathematical Monthly.

In 1980, a compulsive gambler named David Zarin used a generous
credit line to run up a huge debt playing craps in an Atlantic City
casino. When the casino finally cut off Zarin's credit, he owed over
$3 million. Due in part to New Jersey's laws protecting compulsive
gamblers, the debt was deemed unenforceable by the courts, leading
the casino to settle with Zarin for a small fraction of the amount he
owed. Later, the Internal Revenue Service tried to collect taxes on the
approximately $3 million Zarin didn't repay, claiming that cancellation
of the debt made it taxable income. Since Zarin had never actually
received any cash (he was always given chips, which he promptly lost
at the craps table), an appellate court finally ruled that Zarin had no
tax obligation. The courts never asked what Zarin's credit line was
actually worth.
Mathematically, the payoff function is the positive part of x − k, where k
is the units of free credit:

    f(x) = (x − k)^+

    [Figure: the graph of f, equal to zero up to k and rising linearly through k+1, k+2, . . .]
Since the state zero is absorbing, we have v(0) = 0. On the other hand,
v(x) > 0 = f(x) for x = 1, . . . , k, so that 1, . . . , k ∉ E. Starting at k, the
optimal strategy is to keep playing until you hit 0 or N for some N > k
which is to be determined. In fact, N is the smallest element in E greater
than k.

We have to eliminate the possibility that N = ∞, that is, E = {0}. But
the strategy T_{{0}} gives a value function that is identically zero. As this is
impossible, we know N < ∞.
The optimal strategy is T_{{0,N}} for some N. Using the previous example we
can calculate directly that

    E(f(X_{T_{0,N}}) | X_0 = k) = (N − k) (1 − (q/p)^k) / (1 − (q/p)^N).
For any choice of p and q, we choose N to maximize the right hand side.
In the Zarin case, we may assume he played the pass line bet which gives
the best odds of p = 244/495 and q = 251/495, so that q/p = 251/244. We
also assume that he bets boldly, making the maximum bet of $15,000 each
time. Then three million dollars equals k = 200 free units, and trial and
error gives N = 235 and v(200) = 12.977 units = $194,655.
     N   Expected Profit (units)   Expected Profit ($)
    232        12.9169                193,754.12
    233        12.9486                194,228.91
    234        12.9684                194,526.29
    235        12.9771                194,655.80
    236        12.9751                194,626.58
    237        12.9632                194,447.42
    238        12.9418                194,126.71
In general, we have the approximate formula N ≈ k + 1/ln(q/p), and the
probability of reaching N is approximately 1/e = .36788. Therefore the
approximate value of k free units of credit is

v(k) ≈ 1 / ( exp(1) · ln(q/p) ),

which is independent of k!
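As a check on the trial-and-error computation, here is a short sketch (the variable names are my own, not from the notes) that searches over N and compares the exact maximizer with the 1/(e ln(q/p)) approximation:

```python
import math

# Pass-line odds and Zarin's line of credit, as given in the notes.
p, q = 244 / 495, 251 / 495
r = q / p                      # = 251/244
k = 200                        # $3,000,000 of credit at $15,000 per bet

def expected_profit(N):
    # E[f(X_T)] = (N - k)(1 - (q/p)^k) / (1 - (q/p)^N)
    return (N - k) * (1 - r ** k) / (1 - r ** N)

best_N = max(range(k + 1, k + 200), key=expected_profit)
best_value = expected_profit(best_N)

# The approximations N ~ k + 1/ln(q/p) and v(k) ~ 1/(e ln(q/p)).
approx_N = k + 1 / math.log(r)
approx_value = 1 / (math.e * math.log(r))
print(best_N, best_value, approx_N, approx_value)
```

The brute-force search reproduces the table's maximizer N = 235, and the approximation lands within a few hundredths of a unit of the exact value.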
Algorithm. Define

u1(x) = f(x) if x is absorbing, and u1(x) = sup f otherwise.

Then let u2 = max(P u1, f), u3 = max(P u2, f), etc. The sequence (un)
decreases to the function v.
Example. How much would you pay for the following financial opportunity?
I assume that you follow a random walk on the graph. There is no payoff
except $100 at state 4, and state 5 is absorbing.
[Diagram: a graph on five vertices. Vertex 1 is adjacent to 2, 3, 4; vertex 2 to 1, 3, 5; vertex 3 to 1, 2, 4, 5; vertex 4 (the $100 bill) to 1, 3, 5; vertex 5 is absorbing.]
In vector form, the value function is f = (0, 0, 0, 100, 0). (For ease of
typesetting, we will render these column vectors as row vectors.) The
P operation takes a vector u and gives

Pu = ( (u(2)+u(3)+u(4))/3, (u(1)+u(3)+u(5))/3, (u(1)+u(2)+u(4)+u(5))/4, (u(1)+u(3)+u(5))/3, u(5) ).
The initial vector is u1 = (100, 100, 100, 100, 0) and P u1 = (100, 200/3, 75, 200/3, 0).
Taking the maximum of this with f puts the fourth coordinate back up to
100, giving u2 = (100, 200/3, 75, 100, 0).
Applying this procedure to u2 gives u3 = (725/9, 175/3, 200/3, 100, 0).
Putting this on a computer, and repeating 15 times, yields (in decimal format)

u15 = (62.503, 37.503, 50.002, 100.00, 0.00).
These give the value, or fair price, of the different starting positions on the
graph.
We may guess that if the algorithm were continued, the values would
converge to u∞ = v = (62.5, 37.5, 50.0, 100.0, 0.0). We can confirm this
guess by checking the equation v = max(P v, f).
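The algorithm is easy to put on a computer. This sketch (my own transcription of the averaging operator read off the graph) iterates u_{n+1} = max(P u_n, f) until it stabilizes:

```python
# Payoff: $100 at state 4; state 5 is absorbing.
f = [0.0, 0.0, 0.0, 100.0, 0.0]

def P(u):
    # One step of averaging over neighbours, read off the graph.
    return [(u[1] + u[2] + u[3]) / 3,
            (u[0] + u[2] + u[4]) / 3,
            (u[0] + u[1] + u[3] + u[4]) / 4,
            (u[0] + u[2] + u[4]) / 3,
            u[4]]

# u1(x) = f(x) at absorbing states, sup f elsewhere.
u = [100.0, 100.0, 100.0, 100.0, 0.0]
for _ in range(200):                 # u_{n+1} = max(P u_n, f)
    u = [max(a, b) for a, b in zip(P(u), f)]
print([round(x, 3) for x in u])
```

After a couple hundred iterations the vector agrees with v = (62.5, 37.5, 50, 100, 0) to many decimal places, confirming the guessed fixed point of v = max(P v, f).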
[Diagram: one-period binomial model. At time 0 the portfolio holds the option, bond, and stock worth C, B, S; at time 1 the stock moves to uS or dS, the bond grows to rB, and the option is worth Cu or Cd.]
C = (1/r) [ ((r − d)/(u − d)) Cu + ((u − r)/(u − d)) Cd ].

To ease the notation let p = (r − d)/(u − d), so that the price can be written
C = (p Cu + (1 − p) Cd)/r. The worthless portfolio device is the same as
the usual replicating portfolio device.
Call Option
A call option gives the holder the right (but not the obligation) to buy
stock at a later time for K dollars. The value K is called the strike price.
The value of the option at time 1 is given by

Cu = (uS − K)⁺   and   Cd = (dS − K)⁺.
The price at time 0 is

C = max( (S − K)⁺, (1/r){p Cu + (1 − p) Cd} ) = (1/r){p Cu + (1 − p) Cd}.

A call option is never exercised early.
A put option gives the buyer the right (but not the obligation) to sell stock
for K dollars. That is,

Pu = (K − uS)⁺   and   Pd = (K − dS)⁺,

and

P = max( (K − S)⁺, (1/r){p Pu + (1 − p) Pd} ).
For example, with u = 2, d = 1/2, r = 1.05 (so p = 11/30), K = 150, and
current stock price S = 100, we get Pu = 0, Pd = 100, and

P = max( 50, (1/1.05){ (11/30) · 0 + (19/30) · 100 } ) = 60.32.
Call Option

[Tree: strike K = 150 on the stock with S = 100, u = 2, d = 1/2, r = 1.05. At each node the first (red) number is the stock price and the second (green) number is the value of the call.]

Time 3:  800 → 650;   200 → 50;    50 → 0;   12.50 → 0
Time 2:  400 → 257.14;  100 → 17.46;  25 → 0
Time 1:  200 → 100.33;   50 → 6.10
Time 0:  100 → 38.71
This tree explains the price of a call option with terminal time n = 3.
The red numbers are the possible stock values and the green numbers are
the current value of the call option. These are calculated by starting at
the right hand side and working left, using our formula. The end result is
relatively simple, since a call option is never exercised early.
C = (1/rⁿ) Σ_{j=0}^{n} (n choose j) pʲ (1 − p)ⁿ⁻ʲ ( uʲ dⁿ⁻ʲ S − K )⁺.
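Assuming the tree's parameters are S = 100, u = 2, d = 1/2, r = 1.05, K = 150 (my reading of the figure), the closed-form sum can be checked against backward induction:

```python
from math import comb

S, K, u, d, r, n = 100.0, 150.0, 2.0, 0.5, 1.05, 3
p = (r - d) / (u - d)          # = 11/30

# Closed form: C = r^{-n} sum_j C(n,j) p^j (1-p)^{n-j} (u^j d^{n-j} S - K)^+
C = sum(comb(n, j) * p ** j * (1 - p) ** (n - j)
        * max(u ** j * d ** (n - j) * S - K, 0.0)
        for j in range(n + 1)) / r ** n

# Backward induction, taking the max with immediate exercise at each node;
# it returns the same number, illustrating that a call is never exercised early.
vals = [max(u ** j * d ** (n - j) * S - K, 0.0) for j in range(n + 1)]
for m in range(n - 1, -1, -1):
    vals = [max(max(u ** j * d ** (m - j) * S - K, 0.0),
                (p * vals[j + 1] + (1 - p) * vals[j]) / r)
            for j in range(m + 1)]
print(round(C, 2), round(vals[0], 2))
```

Both computations give the 38.71 at the root of the tree.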
Put Option

[Tree: the corresponding put option with strike K = 150. At each node the first (red) number is the stock price and the second (green) number is the value of the put; boxed nodes are where the option is exercised.]

Time 3:  800 → 0;   200 → 0;    50 → 100 [exercise];  12.50 → 137.50 [exercise]
Time 2:  400 → 0;   100 → 60.32;  25 → 125 [early exercise]
Time 1:  200 → 36.38;   50 → 100 [early exercise]
Time 0:  100 → 73.02
This tree explains the price of a put option with terminal time n = 3. The
red numbers are the possible stock values and the green numbers are the
current value of the put option. These are calculated by starting at the
right hand side and working left, using our formula, but always taking the
maximum with the result of immediate exercise. There are boxes around
the nodes where the option would be exercised; note that two of them are
early exercise.
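The same backward induction, now keeping the maximum with immediate exercise at every interior node, reproduces the put tree (the parameters are my reading of the figure):

```python
S, K, u, d, r, n = 100.0, 150.0, 2.0, 0.5, 1.05, 3
p = (r - d) / (u - d)

# vals[j] = put value with j up-moves; work backwards from time n.
vals = [max(K - u ** j * d ** (n - j) * S, 0.0) for j in range(n + 1)]
early = []            # interior nodes (time, stock price) exercised early
for m in range(n - 1, -1, -1):
    new = []
    for j in range(m + 1):
        stock = u ** j * d ** (m - j) * S
        cont = (p * vals[j + 1] + (1 - p) * vals[j]) / r
        exer = max(K - stock, 0.0)
        if exer > cont:
            early.append((m, stock))
        new.append(max(exer, cont))
    vals = new
print(round(vals[0], 2), sorted(early))
```

The root value 73.02 matches the tree, and the two early-exercise nodes come out at time 1 (stock 50) and time 2 (stock 25), exactly the boxed interior nodes.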
[Diagram: the nodes of the tree at level N − 1.]
Continuing this way, we can prove that u_{j+1}(x) = v(x) for x at level N − j.
In particular, u_k(x) = v(x) for all x when k ≥ N + 1. The algorithm of
starting at the far right hand side and working backwards gives us the value
function v, which gives the correct price of the option at each state. We
exercise the option at any state where v(x) = f(x).
MARTINGALES
52
Martingales
Conditional Expectation
Let X be the roll of a fair die. Then

E(X | X is even) = Σ_{x=1}^{6} x P(X = x | X is even)
= (1 · 0) + 2 · (1/3) + (3 · 0) + 4 · (1/3) + (5 · 0) + 6 · (1/3)
= 4.
Similar calculations show that E(X | X is odd) = 3.
We can combine these results as follows. Define a function

φ(p) = 4 if p = 0, and φ(p) = 3 if p = 1.
Information    Best estimate of X
none           E(X | no info) = E(X)
partial        E(X | P) = φ(P), where φ(p) = 4 if p = 0 and 3 if p = 1
complete       E(X | X) = ψ(X), where ψ(x) = x
Example. Suppose you roll two fair dice and let X be the number on the
first die, and Y be the total on both dice. Calculate (a) E(Y | X) and (b)
E(X | Y).

(a)

E(Y | X)(x) = Σ_y y P(Y = y | X = x) = Σ_{w=1}^{6} (x + w) · (1/6) = x + 3.5,

so that E(Y | X) = X + 3.5. The variable w in the sum above stands for
the value on the second die.
(b)

E(X | Y)(y) = Σ_x x P(X = x | Y = y)
            = Σ_x x · P(X = x, Y = y) / P(Y = y)
            = Σ_x x · P(X = x, Y − X = y − x) / P(Y = y)
            = Σ_x x · P(X = x) P(Y − X = y − x) / P(Y = y).
Now

P(Y = y) = (y − 1)/36 for y = 2, 3, 4, 5, 6, 7,   and   (13 − y)/36 for y = 8, 9, 10, 11, 12,

and

P(Y − X = y − x) = 1/6   for y − 6 ≤ x ≤ y − 1.
For 2 ≤ y ≤ 7 we get

E(X | Y)(y) = Σ_{x=1}^{y−1} x · (1/36) / ((y − 1)/36) = (1/(y − 1)) Σ_{x=1}^{y−1} x = (1/(y − 1)) · (y − 1)y/2 = y/2.
For 7 ≤ y ≤ 12 we get

E(X | Y)(y) = Σ_{x=y−6}^{6} x · (1/36) / ((13 − y)/36) = (1/(13 − y)) Σ_{x=y−6}^{6} x = y/2.
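Both answers can be verified by brute-force enumeration of the 36 equally likely outcomes; this sketch (my own) uses exact rational arithmetic:

```python
from itertools import product
from fractions import Fraction

# All 36 equally likely rolls: X = first die, Y = total of both.
outcomes = [(x, x + w) for x, w in product(range(1, 7), repeat=2)]

def E_Y_given_X(x):
    ys = [y for xx, y in outcomes if xx == x]
    return Fraction(sum(ys), len(ys))

def E_X_given_Y(y):
    xs = [x for x, yy in outcomes if yy == y]
    return Fraction(sum(xs), len(xs))

# (a) E(Y | X = x) = x + 7/2, and (b) E(X | Y = y) = y/2.
check_a = all(E_Y_given_X(x) == Fraction(2 * x + 7, 2) for x in range(1, 7))
check_b = all(E_X_given_Y(y) == Fraction(y, 2) for y in range(2, 13))
print(check_a, check_b)
```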
Properties:
1. E( E(Y | Fn) ) = E(Y)
2. E(aY1 + bY2 | Fn) = a E(Y1 | Fn) + b E(Y2 | Fn)
3. If Y is a function of Fn, then E(Y | Fn) = Y
4. For m < n, we have E( E(Y | Fn) | Fm ) = E(Y | Fm)
5. If Y is independent of Fn, then E(Y | Fn) = E(Y)
Example 3.
Martingales
Definition. The sequence M0, M1, . . . of random variables is called a martingale (with respect to (Fn)n≥0) if
(a) E(|Mn|) < ∞ for n ≥ 0,
(b) (Mn)n≥0 is adapted to (Fn)n≥0,
(c) E(Mn+1 | Fn) = Mn for n ≥ 0.
Note that for m < n,

E(Mn − Mm | Fm) = E( Σ_{j=m}^{n−1} (Mj+1 − Mj) | Fm )
               = Σ_{j=m}^{n−1} E( E(Mj+1 − Mj | Fj) | Fm )
               = 0,

so that E(Mn | Fm) = Mm.
Another note: Suppose (Mn) is an (Fn) martingale, and define FnM =
σ(M0, M1, . . . , Mn). Then Mn ∈ FnM for all n, and FnM ⊆ Fn. Therefore

E(Mn+1 | FnM) = E( E(Mn+1 | Fn) | FnM ) = E(Mn | FnM) = Mn,

so (Mn) is an (FnM)-martingale.
Example 1. Let X1, X2, . . . be independent random variables with mean μ.
Put S0 = 0 and Sn = X1 + · · · + Xn for n ≥ 1. Then Mn := Sn − nμ is an
(Fn) martingale.

Proof.

E(Mn+1 − Mn | Fn) = E(Xn+1 − μ | Fn)
                  = E(Xn+1 | Fn) − μ
                  = E(Xn+1) − μ
                  = 0.
The same argument covers a gambler who wagers Bn+1 ∈ Fn on the next
fair game:

E(Bn+1 Xn+1 | Fn) = Bn+1 E(Xn+1 | Fn) = Bn+1 E(Xn+1) = 0.

A famous instance is the double-or-nothing strategy

Bj = 2^{j−1} if X1 = X2 = · · · = Xj−1 = −1, and Bj = 0 otherwise.
P(Xn+1 = k + 1 | Xn = k) = k/(n + 2)   and   P(Xn+1 = k | Xn = k) = 1 − k/(n + 2).

This gives

E(Xn+1 | Xn = k) = (k + 1) · k/(n + 2) + k · (1 − k/(n + 2)) = k · (n + 3)/(n + 2),

so that

E(Xn+1 | Fn) = Xn · (n + 3)/(n + 2),

and dividing we obtain

E( Xn+1/((n + 1) + 2) | Fn ) = Xn/(n + 2).
P(ST = N) = (1 − (q/p)ʲ) / (1 − (q/p)ᴺ)   and   E(T) = (p − q)⁻¹ [ N (1 − (q/p)ʲ)/(1 − (q/p)ᴺ) − j ].
In the symmetric case,

E(T) = j(N − j)/(p + q).
[Plot: E(T) as a function of the starting point j = 0, 5, 10, 15, 20.]
Waiting for patterns: In tossing a fair coin, how long on average until
you see the pattern HTH?
Imagine a gambler who wants to see HTH and follows the "play until you
lose" strategy: at time 1 he bets one dollar on H; if the toss is T he loses
and quits, otherwise he wins one dollar. Now he has two dollars to bet on
T; if the toss is H he loses and quits, otherwise he wins two more dollars.
In that case, he bets his four dollars on H; if the toss is T he loses and
quits, otherwise he wins four dollars and stops.
His winnings W¹n form a martingale with W¹0 = 0.
Now imagine that at each time j ≥ 1 another gambler begins and bets
on the same coin tosses using the same strategy. These guys' winnings are
labelled W²n, W³n, . . . Note that Wʲn = 0 for n < j.
Define Wn = Σ_{j=1}^{n} Wʲn, the total winnings, and let T be the first time the
pattern is completed. By optional stopping E(WT) = E(W0) = 0. From
the casino's point of view this means that the average income equals the
average payout.
Income:      $1  $1  $1  $1  $1  $1  $1  $1
Coin tosses:  H   ·   ·   ·   ·   H   T   H
Payout:      $0  $0  $0  $0  $0  $8  $0  $2
Examining this diagram, we see that the total income is T dollars, while
the total payout is 8 + 2 = 10 dollars, and conclude that E(T ) = 10.
Fortunately, you don't need to go through the whole analysis every time
you solve one of these problems; just figure out how much the casino has to
pay out. For instance, if the desired pattern is HHH, then the casino pays
out the final three bettors a total of 8 + 4 + 2 = 14 dollars, thus E(T) = 14.
Example. If a monkey types on a keyboard, randomly choosing letters,
how long on average before we see the word MONKEY? Answer: 26⁶ =
308915776.
Guessing Red: A friend turns over the cards of a well shuffled deck one
at a time. You can stop anytime you choose and bet that the next card is
red. What is the best strategy?
Solution: Let Rn be the number of red cards left after n cards have been
turned over. Then

Rn+1 = Rn with probability 1 − p, and Rn+1 = Rn − 1 with probability p,

where p = Rn/(52 − n), the proportion of reds left. Taking expectations
we get

E(Rn+1 | Rn) = Rn − Rn/(52 − n) = Rn · (52 − (n + 1))/(52 − n),

so that

E( Rn+1/(52 − (n + 1)) | Rn ) = Rn/(52 − n).

In other words, the proportion of red cards remaining is a martingale. By
optional stopping, if you bet after T cards have been turned over, your
chance of winning is E(R_T/(52 − T)) = R0/52 = 1/2: every strategy does
exactly as well as betting on the very first card.
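A quick simulation (my own sketch) illustrates the conclusion that no stopping strategy beats 1/2:

```python
import random

random.seed(1)

def play(strategy, trials=20000):
    wins = 0
    for _ in range(trials):
        deck = [1] * 26 + [0] * 26          # 1 = red, 0 = black
        random.shuffle(deck)
        reds = 26                           # reds among the remaining cards
        for n in range(52):
            if n == 51 or strategy(n, reds):
                wins += deck[n]             # bet that the next card is red
                break
            reds -= deck[n]
    return wins / trials

# Stop as soon as the remaining reds strictly outnumber the blacks.
ahead = lambda n, reds: 2 * reds > 52 - n
# Just bet on the very first card.
naive = lambda n, reds: True

p_ahead, p_naive = play(ahead), play(naive)
print(p_ahead, p_naive)
```

Both strategies win almost exactly half the time, as the martingale argument predicts.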
Proof. Let (Xn) be the Markov chain with transition matrix P on state
space S = {1, 2, . . . , n}. A function u : S → R is harmonic if and only if the
vector u = (u(1), u(2), . . . , u(n))ᵀ satisfies P u = u, i.e., u is a right eigenvector for the eigenvalue 1. Clearly the constant functions are harmonic; we
want to show that they are the only ones.
Suppose u is harmonic, so that u(Xn) is a (bounded!) martingale. Let
x, y ∈ S and let Ty := inf{n ≥ 0 : Xn = y} be the first time the chain hits
state y. Since the chain is communicating, we have P(Ty < ∞ | X0 = x) = 1
and so

u(y) = Ex( u(X_Ty) ) = Ex( u(X0) ) = u(x).
Wn = Σ_{j=1}^{n} Bj (Mj − Mj−1),   n ≥ 1.
The following diagram explains the relationship between the two martingales.
[Diagram: a path of the M process oscillating between the levels a and b, with the distance |Mn − a| indicated, and below it the corresponding path of the W process.]
Examples.
1. Pólya's urn. Let Mn be the proportion of red balls in Pólya's urn at
time n. Then (Mn) is a martingale and 0 ≤ Mn ≤ 1, so supn E(|Mn|) ≤ 1.
Therefore Mn → M∞ for some random variable M∞. It turns out that M∞
has a uniform distribution on (0, 1).
[Plot: sample paths of the proportion of red balls in Pólya's urn over the first 100 draws.]
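For an urn that starts with one red and one black ball, the uniform limit law can be seen exactly at every finite time: the number of reds after n draws is uniform on {1, . . . , n + 1}. A short exact computation (my own) confirms this:

```python
from fractions import Fraction

# Start with 1 red and 1 black ball; dist maps (reds in urn) -> probability.
n = 10
dist = {1: Fraction(1)}
for step in range(n):
    total = step + 2                 # balls in the urn before this draw
    new = {}
    for reds, prob in dist.items():
        p_red = Fraction(reds, total)
        new[reds + 1] = new.get(reds + 1, Fraction(0)) + prob * p_red
        new[reds] = new.get(reds, Fraction(0)) + prob * (1 - p_red)
    dist = new
print(sorted(dist.items()))
```

Every red count from 1 to n + 1 comes out with probability exactly 1/(n + 1), the discrete shadow of the uniform distribution of M∞.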
2. Branching process. Let Xn be a branching process with mean offspring
number μ, and put Mn := Xn/μⁿ so that (Mn) is a martingale. If μ ≤ 1,
then Mn → M∞ = 0 (extinction). If μ > 1, then (Mn) is uniformly
integrable and Mn → M∞ where E(M∞) = 1.
3. Random harmonic series. The harmonic series diverges, but not the
alternating harmonic series:

1 + 1/2 + 1/3 + 1/4 + · · · + 1/j + · · · = ∞,
1 − 1/2 + 1/3 − 1/4 + · · · + (−1)^{j+1}/j + · · · = ln 2.

Here the positive and negative terms partly cancel, allowing the series to
converge.
Let's choose plus and minus signs at random, by tossing a fair coin. Formally, let (εj)j≥1 be independent random variables with common distribution P(εj = 1) = P(εj = −1) = 1/2. Then the martingale convergence
theorem shows that the sequence Mn = Σ_{j=1}^{n} εj/j converges almost surely.
The limit M∞ := Σ_{j=1}^{∞} εj/j has the smooth density pictured below.
[Plot: the density of Σ_{j=1}^{∞} εj/j, a smooth bump supported on roughly (−3, 3) with peak height about 0.25.]
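A quick simulation (mine, not from the notes) of the partial sums Mn shows the limit has mean 0 and variance Σ 1/j² = π²/6:

```python
import math
import random

random.seed(2)
N, samples = 1000, 3000
vals = []
for _ in range(samples):
    s = 0.0
    for j in range(1, N + 1):
        s += random.choice((-1.0, 1.0)) / j   # a random-sign harmonic sum
    vals.append(s)

mean = sum(vals) / samples
var = sum((v - mean) ** 2 for v in vals) / samples
print(round(mean, 3), round(var, 3), round(math.pi ** 2 / 6, 3))
```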
BROWNIAN MOTION
Brownian motion
Basic properties
69
y Rd .
E‖Xt − X0‖² = Σ_{j=1}^{d} E(Xᵗʲ − X₀ʲ)² = dt,

so in time t the process travels roughly √(dt) units. In fact, the average
speed of Brownian motion over [0, t] is E(‖Xt − X0‖)/t ≈ √(d/t). For
large t this is near zero, while for small t it is near ∞.
Proposition. Xt is not differentiable at t = 0, i.e.,

P(ω : X′t(ω) exists at t = 0) = 0.

Proof:

{ω : X′t(ω) exists at t = 0} ⊆ { ω : sup_{0<t≤1} ‖Xt(ω) − X0(ω)‖/t < ∞ }
⊆ ∪_{k=1}^{∞} { ω : sup_n 2^{n−1} ‖X_{2^{−n}}(ω) − X_{2^{−(n−1)}}(ω)‖ < k }.

Define Ak = { ω : sup_n 2^{n−1} ‖X_{2^{−n}} − X_{2^{−(n−1)}}‖ < k }. The random variables Zn := 2^{n−1}(X_{2^{−n}} − X_{2^{−(n−1)}}) are independent multivariate normal
with variances growing in n, so

P(Ak) = Π_n P(‖Zn‖ < k) = 0,

and therefore P(ω : X′t(ω) exists at t = 0) ≤ Σ_k P(Ak) = 0.

A more complicated argument gives

P(ω : t ↦ Xt(ω) is not differentiable at any t ≥ 0) = 1.
[Plot: a sample path of Brownian motion.]
By the three ingredients (1)–(3) that define Brownian motion, we see that
for any fixed s ≥ 0, the process (X_{t+s} − X_s) is a Brownian motion independent of Fs that starts at the origin. In other words, (X_{t+s}) is a Brownian
motion, independent of Fs, with random starting point Xs.
An important generalization says that if T is a finite stopping time then
(X_{T+t}) is independent of F_T, with random starting point X_T.
Suppose Xt is a standard 1-dimensional Brownian motion starting at x,
and let x < b. We will prove that

P(Xs ≥ b for some 0 ≤ s ≤ t) = 2 P(Xt ≥ b).

This follows from stopping the process at Tb, the first time (Xt) hits the
point {b}, then using symmetry. The picture below will help you to understand the calculation:

P(Xt ≥ b) = P(Xt ≥ b | Tb ≤ t) P(Tb ≤ t) = (1/2) P(Tb ≤ t),

which gives the result.
Reflection principle
[Figure: a Brownian path hitting the level b at time Tb, together with its reflection across b after Tb.]
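A discretized simulation (my own sketch; the discrete grid slightly underestimates the running maximum) illustrates the identity P(Xs ≥ b for some s ≤ t) = 2P(Xt ≥ b):

```python
import random

random.seed(3)
b, t, steps, paths = 1.0, 1.0, 400, 10000
dt = t / steps
hit = end_above = 0
for _ in range(paths):
    x = m = 0.0
    for _ in range(steps):
        x += random.gauss(0.0, dt ** 0.5)   # Brownian increment
        m = max(m, x)                       # running maximum
    hit += m >= b
    end_above += x >= b
p_hit, p_end = hit / paths, end_above / paths
print(p_hit, 2 * p_end)
```

The two printed numbers agree to within a few percent, and p_end itself is close to P(X1 ≥ 1) = 1 − Φ(1) ≈ 0.159.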
This shows that 1-dimensional Brownian motion will eventually hit every
value greater than the starting position. Since −Xt is also a Brownian
motion, we argue that it will also hit every value less than the starting
point:

Px(Xt hits a) = Px(−Xt hits −a) = 1.

Now we use the strong Markov property again to show that

Px(Xt hits b, then hits a) = Px(Xt hits b) Pb(Xt hits a) = 1.

In particular it must return to its starting point. You can extend this
argument to prove

Px(Xt hits all points infinitely often) = 1.

Now let T be the hitting time of the set {a, b}. Since (Xt) is a martingale,
we have

x = Ex(X0) = Ex(XT) = a Px(XT = a) + b Px(XT = b).

Using the fact that Px(XT = a) + Px(XT = b) = 1, we can conclude that

Px(XT = b) = (x − a)/(b − a).

Just like for the symmetric random walk, (Xt² − t) is a martingale, so

Ex(X0² − 0) = Ex(XT² − T), i.e., x² = a² Px(XT = a) + b² Px(XT = b) − Ex(T).

The previous result plus a little algebra shows that

Ex(T) = (b − x)(x − a).

If we let a → −∞, we find that, although Px(Tb < ∞) = 1, we have Ex(Tb) = ∞.
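Both martingale identities can be illustrated with the symmetric random walk, for which they hold exactly; this simulation is my own sketch:

```python
import random

random.seed(4)
a, b, x0, trials = -10, 10, 0, 3000
hits_b = total_T = 0
for _ in range(trials):
    x, T = x0, 0
    while a < x < b:                    # walk until it exits [a, b]
        x += random.choice((-1, 1))
        T += 1
    hits_b += x == b
    total_T += T
p_b = hits_b / trials
mean_T = total_T / trials
# Predictions: P(hit b) = (x0 - a)/(b - a) = 0.5, E(T) = (b - x0)(x0 - a) = 100.
print(p_b, mean_T)
```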
Define u(t, x) = Ex(f(Xt)). To find the spatial derivatives of u we use the
translation invariance of Brownian motion:

Ey(f(Xt)) = Ex( f(Xt + (y − x)) )
= Ex( f(Xt) + ⟨(∇f)(Xt), y − x⟩ + (1/2)⟨y − x, D²f(Xt)(y − x)⟩ + o(‖y − x‖²) )
= Ex(f(Xt)) + ⟨Ex((∇f)(Xt)), y − x⟩ + (1/2)⟨y − x, Ex(D²f(Xt))(y − x)⟩ + o(‖y − x‖²).

In particular, we have D²u(t, x) = Ex(D²f(Xt)) and hence Δu(t, x) =
Ex((Δf)(Xt)). In other words, u satisfies the heat equation

∂u/∂t (t, x) = (1/2) Δu(t, x).
[Figure: the linear harmonic function (x − a)/(b − a) on the interval [a, b].]
That's very nice, but let me plant the seeds of doubt by asking three
questions and looking at a couple of counterexamples.
Questions
1. Do we know that Px(T < ∞) = 1?
2. Is u continuous at the boundary?
3. Is there more than one solution to the Dirichlet problem?
Example 1. In R², let D = {(x1, x2) : x1 > 0}, the open right half plane.
The functions u1(x) ≡ 0 and u2(x) = x1 are both harmonic and equal zero
on ∂D.
[Figure: an annulus centered at the origin with inner radius R1 and outer radius R2; the point x lies in between.]
The probability that Brownian motion reaches the outer boundary first is
given by v(x) = Ex(g(XT)) where g(x) = 1 if |x| = R2 and g(x) = 0 if
|x| = R1. The function v will be harmonic in between. Now, the symmetry
of Brownian motion implies that the probability is the same for all x with
a common radius. So we can write

v(x) = φ(r), where r = ( Σ_{i=1}^{d} xᵢ² )^{1/2}.
Taking derivatives of the function r, we find

∂ᵢ r = (1/2) ( Σ_{i=1}^{d} xᵢ² )^{−1/2} · 2xᵢ = xᵢ/r,

so that

∂ᵢ[φ(r)] = φ′(r) ∂ᵢ r = φ′(r) xᵢ/r,

∂ᵢᵢ[φ(r)] = φ″(r) xᵢ²/r² + φ′(r) ( 1/r − xᵢ²/r³ ),

and summing over i,

Σ_{i=1}^{d} ∂ᵢᵢ[φ(r)] = φ″(r) r²/r² + φ′(r) ( d/r − r²/r³ ) = φ″(r) + φ′(r) (d − 1)/r.

Setting φ″(r) + φ′(r)(d − 1)/r = 0 and solving gives
v(x) = ( ln|x| − ln(R1) ) / ( ln(R2) − ln(R1) )   if d = 2,

v(x) = ( R1^{2−d} − |x|^{2−d} ) / ( R1^{2−d} − R2^{2−d} )   if d ≥ 3.
We learn something interesting by taking limits as R2 → ∞. For d = 2,

Px( Xt ever hits B(0, R1) ) = lim_{R2→∞} (1 − v(x)) = 1.

Two-dimensional Brownian motion will hit any ball, no matter how small,
from any starting point. If we pursue this argument, we can divide the
plane using a fine grid, and find that 2-dimensional Brownian motion will
visit every section infinitely often.
On the other hand, if we leave R2 alone and let R1 → 0, we get

Px( Xt = 0 before |Xt| = R2 ) = lim_{R1→0} (1 − v(x)) = 0,

so Brownian motion never hits the point 0. For d ≥ 3, letting R2 → ∞
gives

Px( Xt ever hits B(0, R1) ) = (R1/|x|)^{d−2}.

Since this is less than one, we see that Brownian motion is transient when
d ≥ 3.
It turns out that whether or not d-dimensional Brownian motion will hit
a set depends on its fractional dimension. The process can hit sets of
dimension greater than d − 2, but cannot hit sets of dimension less than
d − 2. In the d − 2 case, it depends on the particular set.
STOCHASTIC INTEGRATION
78
Stochastic integration

Integration with respect to random walk

Let Sn be a symmetric random walk with steps Xi = Si − Si−1, and let
Bi ∈ Fi−1 be a betting strategy. The gambler's total winnings after n
games are

Wn = Σ_{i=1}^{n} Bi Xi = Σ_{i=1}^{n} Bi (Si − Si−1).

Squaring, and using Xi² = 1 together with the fact that the cross terms
have mean zero, we get the variance formula

E(Wn²) = E( Σᵢ Bᵢ² + 2 Σ_{i<j} Bi Bj Xi Xj ) = Σᵢ E(Bᵢ²).
Yt = Y0 for 0 ≤ t < t1,  Y1 for t1 ≤ t < t2,  . . . ,  Yn for tn ≤ t < ∞.

We assume E(Yᵢ²) < ∞ and Yi ∈ F_{ti} for all i. Then it makes sense to
define, for tj < t ≤ tj+1,

Zt = ∫₀ᵗ Ys dWs = Σ_{i=1}^{j} Y_{i−1} (W_{ti} − W_{ti−1}) + Yj (Wt − W_{tj}).

[Figure: a step-function path of (Yt) with jumps at the times t1, t2, t3.]
Here are some facts about the integral we've defined.
Linearity:

∫₀ᵗ (aXs + bYs) dWs = a ∫₀ᵗ Xs dWs + b ∫₀ᵗ Ys dWs.

Martingale property: conditioning the sum term by term, as for the random
walk, gives E(Zt | Fs) = Zs for s ≤ t.
The linearity, martingale property, and variance formula carry over to (Zt).

An example. Let f be a differentiable nonrandom function. Then

Zt = ∫₀ᵗ f(s) dWs = ( Wt f(t) − W0 f(0) ) − ∫₀ᵗ Ws df(s).
Then Zt is a normal random variable with mean zero and variance ∫₀ᵗ f²(s) ds.
We can show that Zt has independent increments as well, so that Z is just
a time-changed Brownian motion:

Zt = B( ∫₀ᵗ f²(s) ds ).
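For f(s) = s on [0, 1], the variance ∫₀¹ s² ds = 1/3 can be checked by simulating the Riemann sums (my own discretization, not from the notes):

```python
import random

random.seed(5)
t, steps, paths = 1.0, 200, 10000
dt = t / steps
vals = []
for _ in range(paths):
    z = 0.0
    for i in range(steps):
        z += (i * dt) * random.gauss(0.0, dt ** 0.5)   # f(s) = s
    vals.append(z)
mean = sum(vals) / paths
var = sum(v * v for v in vals) / paths
print(round(mean, 3), round(var, 3))    # compare with N(0, 1/3)
```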
Itô's formula

For a smooth nonrandom function f, telescoping gives

f(t) = f(0) + Σ_{j=0}^{n−1} [ f((j+1)t/n) − f(jt/n) ]
     = f(0) + Σ_{j=0}^{n−1} f′(jt/n)(t/n) + Σ_{j=0}^{n−1} o(t/n)
     → f(0) + ∫₀ᵗ f′(s) ds + 0.
In a similar vein, let Wt be a Brownian motion and write f(Wt) as a telescoping sum:

f(Wt) = f(W0) + Σ_{j=0}^{n−1} [ f(W_{(j+1)t/n}) − f(W_{jt/n}) ]
      = f(W0) + Σ_{j=0}^{n−1} f′(W_{jt/n}) ( W_{(j+1)t/n} − W_{jt/n} )
            + (1/2) Σ_{j=0}^{n−1} f″(W_{jt/n}) ( W_{(j+1)t/n} − W_{jt/n} )²
            + Σ_{j=0}^{n−1} o( ( W_{(j+1)t/n} − W_{jt/n} )² ).

The intuition behind Itô's formula is that you can replace ( W_{(j+1)t/n} − W_{jt/n} )²
by t/n with only a small amount of error. Therefore

f(Wt) = f(W0) + Σ_{j=0}^{n−1} f′(W_{jt/n}) ( W_{(j+1)t/n} − W_{jt/n} )
      + (1/2) Σ_{j=0}^{n−1} f″(W_{jt/n}) (t/n) + Σ_{j=0}^{n−1} o(t/n) + error,

and letting n → ∞ we obtain Itô's formula:

f(Wt) = f(W0) + ∫₀ᵗ f′(Ws) dWs + (1/2) ∫₀ᵗ f″(Ws) ds.
Example. Suppose we want to calculate ∫₀ᵗ Ws dWs. The definition gets us
nowhere, so we try to apply the usual rules of calculus:

∫₀ᵗ Ws dWs = Wt² − W0² − ∫₀ᵗ Ws dWs,

which implies ∫₀ᵗ Ws dWs = [Wt² − W0²]/2. The only problem is that this
formula is false! Since W0 = 0, we can see that it is fishy by taking
expectations on both sides: the left hand side gives zero but the right hand
side is strictly positive.
The moral of this example is that the usual rules of calculus do not apply
to stochastic integrals. So how do we calculate ∫₀ᵗ Ws dWs correctly? Let
f(t) = t², so f′(t) = 2t and f″(t) = 2. From Itô's formula we find

Wt² = W0² + ∫₀ᵗ 2Ws dWs + (1/2) ∫₀ᵗ 2 ds = 2 ∫₀ᵗ Ws dWs + t,

and therefore

∫₀ᵗ Ws dWs = (1/2) ( Wt² − t ).
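The formula can be tested pathwise: summing Ws dWs with left endpoints over a fine grid should match (Wt² − t)/2 up to discretization error. The sketch below is mine:

```python
import random

random.seed(6)
t, steps = 1.0, 10000
dt = t / steps
errs = []
for _ in range(5):                     # five independent paths
    W = ito_sum = 0.0
    for _ in range(steps):
        dW = random.gauss(0.0, dt ** 0.5)
        ito_sum += W * dW              # left endpoint: the Ito convention
        W += dW
    errs.append(abs(ito_sum - 0.5 * (W * W - t)))
print(max(errs))
```

The discrepancy is exactly (t − Σ(dW)²)/2, which is tiny for a fine grid; using right endpoints instead would shift the answer by about t, which is the Itô correction in action.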
A more advanced version of Itô's formula can handle functions that depend
on t as well as x:

f(t, Wt) = f(0, W0) + ∫₀ᵗ ∂s f(s, Ws) ds + ∫₀ᵗ ∂x f(s, Ws) dWs + (1/2) ∫₀ᵗ ∂xx f(s, Ws) ds.
(∗)   dSt = μ St dt + σ St dWt.

How do we solve this equation? Guess! Let Xt = exp(at + bWt). From Itô's
formula with f(t, x) = exp(at + bx) we get

Xt = X0 + ∫₀ᵗ a Xs ds + ∫₀ᵗ b Xs dWs + (1/2) ∫₀ᵗ b² Xs ds
   = X0 + ( a + (1/2) b² ) ∫₀ᵗ Xs ds + b ∫₀ᵗ Xs dWs.

Matching coefficients with (∗) gives b = σ and a + σ²/2 = μ, so that

St = S0 exp( σ Wt + (μ − σ²/2) t ).
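Assuming (∗) is the geometric Brownian motion equation dSt = μSt dt + σSt dWt (my reading of the garbled original), an Euler discretization of (∗) and the exact solution can be compared along the same noise path:

```python
import math
import random

random.seed(7)
mu, sigma, S0 = 0.1, 0.2, 100.0      # illustrative parameters, my choice
t, steps = 1.0, 1000
dt = t / steps

W, S_euler = 0.0, S0
for _ in range(steps):
    dW = random.gauss(0.0, dt ** 0.5)
    S_euler += mu * S_euler * dt + sigma * S_euler * dW   # Euler step for (*)
    W += dW

# Exact solution driven by the same Brownian path.
S_exact = S0 * math.exp(sigma * W + (mu - sigma ** 2 / 2) * t)
rel_err = abs(S_euler - S_exact) / S_exact
print(round(S_exact, 2), round(S_euler, 2), rel_err)
```

On a grid this fine the two agree to well under a percent, which is a handy sanity check that the guessed exponential really does solve (∗).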