
1 Markov Chains

We will adopt the following conventions: random variables will be denoted by capital letters $X_t$ and $Y_t$, while realizations of random variables will be denoted by corresponding lowercase letters $x_t$ and $y_t$. A stochastic process will be a sequence of random variables, say $\{X_t\}$ and $\{Y_t\}$, while a sample path will be a sequence of realizations $\{x_t\}$ and $\{y_t\}$.
Roughly speaking, a stochastic process $X_t$ has the Markov property if the probability distributions satisfy

$$\Pr(X_{t+1} \leq x \mid x_t, x_{t-1}, \dots, x_{t-k}) = \Pr(X_{t+1} \leq x \mid x_t)$$

for any $k \geq 1$. A Markov chain is a stochastic process with this property that takes values in a finite set. A Markov chain $(x, P, \pi_0)$ is characterized by a triple of objects: a state space identified with an $n$-vector $x$, an $n$-by-$n$ transition matrix $P$, and an initial distribution, an $n$-vector $\pi_0$.
Let
$$x = [x_1 \; x_2 \; \cdots \; x_n]'$$
Then the transition matrix $P = [p_{ij}]$ has elements with the interpretation
$$p_{ij} = \Pr(X_{t+1} = x_j \mid X_t = x_i)$$
So, fix a row $i$. Then the elements in each of the $j$ columns give the conditional probabilities of transiting from state $x_i$ to $x_j$. In order for these to be well-defined probabilities, we require
$$0 \leq p_{ij} \leq 1, \quad i, j = 1, 2, \dots, n$$
$$\sum_{j=1}^{n} p_{ij} = 1, \quad i = 1, 2, \dots, n$$
For example, if $n = 2$ and
$$P = \begin{bmatrix} 1/2 & 1/2 \\ 1/10 & 9/10 \end{bmatrix}$$
then the conditional probability of moving from state $i = 1$ to $i = 2$ is $p_{12} = 1/2$, while the conditional probability of moving from state $i = 2$ to $i = 1$ is only $p_{21} = 1/10$. Notice that the diagonal elements of $P$ give a sense of the persistence of the chain, because these elements give the probability of staying in a given state.
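As a quick sanity check, the defining properties of a transition matrix are easy to verify in MATLAB. The following is a minimal sketch (my own illustration, not part of the course files):

% Verify that the example P is a valid transition matrix
P = [1/2 1/2; 1/10 9/10];
assert(all(P(:) >= 0 & P(:) <= 1))       % probabilities lie in [0,1]
assert(all(abs(sum(P, 2) - 1) < 1e-12))  % each row sums to one
disp(diag(P)')                           % diagonal: persistence of each state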
Similarly, the initial distribution $\pi_0 = [\pi_{0,i}]$ has elements with the interpretation
$$\pi_{0,i} = \Pr(X_0 = x_i)$$
And in order for these to be well-defined probabilities, we require
$$0 \leq \pi_{0,i} \leq 1, \quad i = 1, 2, \dots, n$$
$$\sum_{i=1}^{n} \pi_{0,i} = 1$$
In applications we will often set the initial distribution to have $\pi_{0,i} = 1$ for some $i$ and zero everywhere else, so that the chain is started in state $i$ with probability 1.
2 Higher Order Transitions
The main convenience of the Markov chain model is the ease with which it can
be manipulated using ordinary linear algebra. For example,
$$\Pr(X_{t+2} = x_j \mid X_t = x_i) = \sum_{k=1}^{n} \Pr(X_{t+2} = x_j \mid X_{t+1} = x_k) \Pr(X_{t+1} = x_k \mid X_t = x_i)$$
$$= \sum_{k=1}^{n} p_{kj} p_{ik} = (P^2)_{ij}$$
where $(P^2)_{ij}$ is the $ij$-th element of the matrix $P^2 = PP$. More generally,
$$\Pr(X_{t+s} = x_j \mid X_t = x_i) = (P^s)_{ij}$$
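To make this concrete, here is a small MATLAB check (illustrative, not one of the uploaded files) that the matrix product reproduces the summed one-step probabilities:

% Verify (P^2)_{ij} = sum over k of p_{ik} p_{kj}
P  = [1/2 1/2; 1/10 9/10];        % two-state example from Section 1
P2 = P * P;                       % matrix of two-step probabilities
i = 1; j = 2;
direct = sum(P(i,:) .* P(:,j)');  % sum over intermediate states k
assert(abs(direct - P2(i,j)) < 1e-12)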
The Markov chain gives a law of motion for probability distributions over a finite set of possible state values. The sequence of unconditional probability distributions is $\{\pi_t\}_{t=0}^{\infty}$, where each $\pi_t$ is an $n$-vector. Let $\pi_t = [\pi_{t,i}]$ be a vector whose elements have the interpretation
$$\pi_{t,i} = \Pr(X_t = x_i)$$
Then the sequence of probability distributions is computed using
$$\pi_1' = \pi_0' P$$
$$\pi_2' = \pi_1' P = \pi_0' P^2$$
$$\vdots$$
$$\pi_t' = \pi_{t-1}' P = \pi_0' P^t$$
where the prime ($'$) denotes the transpose. In short, probability distributions $\pi_t'$ evolve according to the linear homogeneous system of difference equations
$$\pi_{t+1} = P' \pi_t$$
The conditional and unconditional moments of the Markov chain can be calculated using similar reasoning. For example,
$$E X_0 = \pi_0' x$$
$$E X_1 = \pi_1' x = \pi_0' P x$$
$$\vdots$$
$$E X_t = \pi_t' x = \pi_0' P^t x$$
while conditional expectations are
$$E[X_{t+k} \mid X_t = x] = P^k x$$
Similarly, let $g$ be an $n$-vector that represents a function with typical element $g(x)$ on the state $x$. Then
$$E[g(X_{t+k}) \mid X_t = x] = P^k g$$
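These recursions translate directly into MATLAB. Here is a small sketch (the state values and horizon are hypothetical choices of mine):

% Iterate the unconditional distribution and compute moments
P   = [1/2 1/2; 1/10 9/10];   % two-state example from Section 1
x   = [0; 1];                 % hypothetical state values
pi0 = [1; 0];                 % start in state 1 with probability 1
T   = 10;
pit = pi0;
for t = 1:T
    pit = P' * pit;           % pi_{t+1} = P' * pi_t
end
EXT     = pit' * x;           % unconditional mean E X_T = pi_T' x
EX_cond = (P^3) * x;          % vector of E[X_{t+3} | X_t = x_i], i = 1,...,n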
3 Stationary Distribution
A steady state or stationary probability distribution is a vector $\pi$ such that
$$\pi' = \pi' P$$
or, equivalently,
$$(P' - I)\pi = 0$$
Solving this matrix equation gives us $\pi$. Clearly, for a non-trivial $\pi$ (i.e., $\pi \neq 0$), we need $\det(P' - I) = 0$, which will always be satisfied for the problems we will work with. Will such a non-trivial $\pi$ be unique? Yes, if $P$ is regular, i.e., $0 < p_{ij} < 1$ for each $i, j$ (I will not prove this result).
4 Computing Stationary Distributions By Hand
Consider a two-state Markov chain $(x, P, \pi_0)$ with transition matrix
$$P = \begin{bmatrix} 1-p & p \\ q & 1-q \end{bmatrix}$$
for $0 < p, q < 1$. Then this Markov chain has a unique invariant distribution $\pi$, which we can solve for as follows
$$0 = (I - P')\pi$$
or
$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 1-p & q \\ p & 1-q \end{bmatrix} \right) \begin{bmatrix} \pi_1 \\ \pi_2 \end{bmatrix} = \begin{bmatrix} p & -q \\ -p & q \end{bmatrix} \begin{bmatrix} \pi_1 \\ \pi_2 \end{bmatrix}$$
Carrying out the calculations, we find that
$$0 = p\pi_1 - q\pi_2$$
$$0 = -p\pi_1 + q\pi_2$$
These two equations only tell us one piece of information, namely
$$\pi_2 = \frac{p}{q}\,\pi_1$$
But we also know that the elements must satisfy
$$\pi_1 + \pi_2 = 1$$
So we can solve these two equations in two unknowns to get
$$\pi_1 = \frac{q}{p+q}, \qquad \pi_2 = \frac{p}{p+q}$$
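A quick numerical check of these formulas in MATLAB (a sketch with arbitrary illustrative values of $p$ and $q$):

% Verify the by-hand stationary distribution for arbitrary p, q
p = 0.3; q = 0.1;                          % hypothetical values
P = [1-p, p; q, 1-q];
pibar = [q; p] / (p + q);                  % [q/(p+q); p/(p+q)]
assert(norm(P' * pibar - pibar) < 1e-12)   % confirms pi' = pi' P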
5 Computing Stationary Distributions in MATLAB
One way to compute the stationary distribution using a computer is to use brute force. So if $P$ is the transition matrix, simply compute $P^T$ for some large $T$. Then you will get back a matrix with identical rows that are equal to the chain's stationary distribution. For example, if
$$P = \begin{bmatrix} 0.7 & 0.2 & 0.1 \\ 0 & 0.5 & 0.5 \\ 0 & 0.9 & 0.1 \end{bmatrix}$$
then
$$P^{1000} = \begin{bmatrix} 0 & 0.6429 & 0.3571 \\ 0 & 0.6429 & 0.3571 \\ 0 & 0.6429 & 0.3571 \end{bmatrix}$$
and the stationary distribution is $\pi = [0 \;\; 0.6429 \;\; 0.3571]'$, which is what we would get from the "by-hand" method of the last section. This is not very elegant, of course. A neater approach is to use MATLAB to compute the eigenvalues and eigenvectors of $P$. Why eigenvalues and eigenvectors?
Return to the equation that describes the steady state
$$(P' - I)\pi = 0$$
and notice that solving this equation for $\pi$ amounts to solving for the eigenvector of $P'$ corresponding to an eigenvalue equal to 1. Can we be sure that $P'$ will have an eigenvalue equal to 1? Yes, always (this follows from the property that $0 \leq p_{ij} \leq 1$ and $\sum_{j=1}^{n} p_{ij} = 1$). Can we be sure that it will have only one eigenvalue equal to 1, so that the eigenvector corresponding to it will be unique? Yes, provided $P'$ is regular (as defined in Section 3). Again, I will not prove these last two claims. The requirement that $\sum_{i=1}^{n} \pi_i = 1$ is a normalization of the eigenvector.
Note that in the example above, $P$ does not satisfy the regularity property. So how did we get convergence? We got it because the regularity property is only sufficient for uniqueness. In this course, we may not always have a regular transition matrix, but $P$ will always be chosen by me (for examples, problem sets, and exams) so that the stationary distribution is unique.
Step 1: Compute the matrix of eigenvalues and eigenvectors of $P'$ (remember the transpose!). In MATLAB,

[V,D] = eig(P')

gives a matrix of eigenvectors $V$ and a diagonal matrix $D$ whose entries are the eigenvalues of $P'$ (which are the same as the eigenvalues of $P$).
Step 2: Now since $P$ is a transition matrix, one of the eigenvalues is 1. Pick the column of $V$ associated with the eigenvalue 1. With the matrix $P$ given above, this will be the second column and we will have

v = V(:,2) =
$$\begin{bmatrix} 0 \\ 0.8742 \\ 0.4856 \end{bmatrix}$$
Step 3: Finally, normalize the eigenvector to sum to one, that is

pibar = v/sum(v) =
$$\begin{bmatrix} 0 \\ 0.6429 \\ 0.3571 \end{bmatrix}$$
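Putting the three steps together in one script (a sketch of my own; the index-finding line assumes the relevant eigenvalue is the one closest to 1):

% Steps 1-3 combined for the example transition matrix
P = [0.7 0.2 0.1; 0 0.5 0.5; 0 0.9 0.1];
[V, D] = eig(P');                  % eigenvectors/eigenvalues of P'
[~, k] = min(abs(diag(D) - 1));    % locate the eigenvalue equal to 1
v = V(:, k);                       % corresponding eigenvector
pibar = v / sum(v)                 % normalize so the elements sum to one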
6 Simulating Markov Chains in MATLAB
In the problem set, you will have to simulate an $n$-state Markov chain $(\mathbf{x}, P, \pi_0)$ for $t = 0, 1, 2, \dots, T$ time periods. I use a bold $\mathbf{x}$ to distinguish the vector of possible state values from sample realizations from the chain. Iterating on the Markov chain will produce a sample path $\{x_t\}_{t=0}^{T}$ where for each $t$, $x_t \in \mathbf{x}$. In the exposition below, I suppose that $n = 2$ for simplicity so that the transition matrix can be written
$$P = \begin{bmatrix} p_1 & 1-p_1 \\ p_2 & 1-p_2 \end{bmatrix}$$
Step 1: Set values for each of $(\mathbf{x}, P, \pi_0)$.

Step 2: Determine the initial state $x_0$. To do this, draw a random variable from a uniform distribution on $[0, 1]$. Call that realization $\varepsilon_0$. In MATLAB, this can be done with the rand() command. If the number $\varepsilon_0 \leq \pi_{0,1}$, set $x_0 = \mathbf{x}(1)$. Otherwise, set $x_0 = \mathbf{x}(2)$.
Step 3: Draw a vector of length $T$ of independent random variables from a uniform distribution on $[0, 1]$. Call a typical realization $\varepsilon_t$. Again, in MATLAB, this can be done with the command rand(T,1). Now if the current state is $x_t = \mathbf{x}(i)$, check whether $\varepsilon_t \leq p_{i1}$, the probability of transiting to state 1. If so, set $x_{t+1} = \mathbf{x}(1)$; otherwise, set $x_{t+1} = \mathbf{x}(2)$. Iterating in this manner builds up an entire simulation, as in the sketch below.
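The following is a minimal sketch of this procedure for the two-state case (my own illustrative code, not the uploaded simulate_markov.m; the state values and transition matrix are hypothetical):

% Two-state Markov chain simulation sketch
x   = [0; 1];                 % hypothetical state values
P   = [0.9 0.1; 0.2 0.8];     % hypothetical transition matrix
pi0 = [1; 0];                 % initial distribution
T   = 100;
s   = zeros(T+1, 1);          % indices of visited states
if rand() <= pi0(1), s(1) = 1; else, s(1) = 2; end   % Step 2
eps = rand(T, 1);                                    % Step 3 draws
for t = 1:T
    if eps(t) <= P(s(t), 1)   % probability of transiting to state 1
        s(t+1) = 1;
    else
        s(t+1) = 2;
    end
end
xpath = x(s);                 % sample path {x_t} for t = 0,...,T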
The MATLAB file markov_example.m (uploaded to BB) implements this procedure for a specific chain $(\mathbf{x}, P, \pi_0)$. When this file is executed, it calls another MATLAB file simulate_markov.m (also uploaded to BB) which performs the simulation for an arbitrary chain $(\mathbf{x}, P, \pi_0)$ and specified simulation length $T$. The latter file is a function file.
As an example, suppose the growth rate of log GDP is
$$x_{t+1} = \log(y_{t+1}) - \log(y_t)$$
and suppose that $X_t$ follows a 3-state Markov chain with
$$x = (\mu - \sigma, \; \mu, \; \mu + \sigma) = (-0.02, \; 0.02, \; 0.04)$$
and transition probabilities
$$P = \begin{bmatrix} 0.5 & 0.5 & 0 \\ 0.03 & 0.9 & 0.07 \\ 0 & 0.2 & 0.80 \end{bmatrix}$$
The idea here is that the economy can either be shrinking, $x = \mu - \sigma < 0$, growing at its usual pace, $x = \mu > 0$, or growing even faster. If the economy is in recession, there is a 50/50 chance of reverting back to the average growth rate. If the economy is growing at an average pace, there is a slight probability of it falling into recession and a slightly bigger probability of it growing even faster, and so on. We simulate a Markov chain on $x$ and then recover the level of output via the sum
$$\log(y_t) = \log(y_0) + \sum_{k=1}^{t} x_k, \quad t \geq 1$$
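In MATLAB, this sum is a one-line cumulative sum. A small illustrative fragment (assuming xpath is a $(T+1)$-vector holding the simulated growth rates $x_0, \dots, x_T$):

% Recover log output from simulated growth rates
logy0 = 0;                                  % initial condition log(y_0) = 0
logy  = logy0 + [0; cumsum(xpath(2:end))];  % log(y_t) for t = 0,...,T
y     = exp(logy);                          % level of output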
The MATLAB file gdpgrowth_example.m (uploaded to BB) produces a simulation of this stochastic growth process assuming $\pi_0 = [0 \; 1 \; 0]'$, simulation length $T = 100$, and initial condition $\log(y_0) = 0$. This MATLAB file also calls the function file simulate_markov.m.
7 Homework
Work through the programs mentioned above. For information on MATLAB function files (what is a function file? how does one code one?), go to the Help section of MATLAB (once you have opened the program) and look up the index entry on "function".