Mathematical Modeling and Simulation
Nguyen V.M. Man, Ph.D.
Applied Statistician
September 6, 2010
Contact: mnguyen@cse.hcmut.edu.vn
or mannvm@uef.edu.vn
Contents

0.1 Mathematical modeling and simulation: Why?
0.2 Mathematical modeling and simulation: How?
0.3 Cautions
0.4 Typical applications
0.5 Computing Software

1 Dynamic Systems
1.1 Introduction
1.2 Discrete Dynamic Systems: a case study
1.3 Continuous Dynamic Systems

2 Stochastic Techniques
2.1 Generating functions
2.2 Convolutions
2.3 Compound distributions
2.4 Introductory Stochastic Processes
2.5 Markov Chains (MC), a key tool in modeling random phenomena
2.6 Classification of States
2.7 Limiting probabilities and Stationary distribution of a MC
2.8 Exercises

3 Simulation
3.1 Introductory Simulation
3.2 Generation of random numbers
3.3 Transforming random numbers into input data
3.4 Measurement of output data
3.5 Analysis of output: making meaningful inferences
3.6 Simulation languages
3.7 Research 1: Simulation of Queueing systems with multiclass customers

4 Probabilistic Modeling
4.1 Markovian Models
4.1.1 Exponential distribution
4.1.2 Poisson process
4.2 Bayesian Modeling in Probabilistic Nets

5 Statistical Modeling in Quality Engineering
5.1 Introduction to Statistical Modeling (SM)
5.2 DOE in Statistical Quality Control
5.3 How to measure factor interactions?
5.4 What should we do to bring experiments into daily life?

6 New directions and Conclusion
6.1 Black-Scholes model in Finance
6.2 Drug Resistance and Design of Anti-HIV drugs
6.3 Epidemic Modeling
6.4 Conclusion

7 Appendices
7.1 Appendix A: Theory of stochastic matrix for MC
7.2 Appendix B: Spectral Theorem for Diagonalizable Matrices
Keywords: linear algebra, computational algebra, graph, random processes,
simulation, combinatorics, statistics, Markov chains, discrete time processes
Introduction

We present a few specific mathematical modeling techniques used in various applications such as statistical simulation of service systems, reliability engineering, financial engineering, biomathematics, pharmaceutical science, and environmental science. The course is aimed at graduates in Applied Mathematics, Computer Science and Applied Statistics at HCM City.

The aims of the course.
This lecture integrates mathematical and computing techniques into the modeling and simulation of industrial and biological processes.

The structure of the course. The course consists of three parts:
Part I: Introductory specific topics
Part II: Methods and Tools
Part III: Connections and research projects

Working method.
Each group of 2 graduates is expected to carry out a small independent research project (max 25 pages, font size 11, 1.5 line spacing, Times New Roman) on a chosen topic and submit their report at the end of the course [week 15].

Examination. The grading will be based on performance in:
* hand-ins of homework assignments (weight 20% of the grade)
* a written report of group work on a small project topic (20%) and three oral presentations about the project (20%)
* a final exam (40%) covering the basic mathematical and statistical methods that have been introduced

Literature. References are listed in the bibliography and will be discussed in the lectures.
Prerequisites. The participants will benefit from a solid knowledge of advanced calculus and discrete mathematics, basic knowledge of symbolic computing, ordinary and partial differential equations, and programming experience with Matlab, Scilab, R, Maple (or an equivalent language).
Part I: Introductory specific topics and case studies

The Hamming distance between two binary states u, v of length n is
d(u, v) = \sum_{i=1}^{n} |u_i - v_i|.
The weight of a binary state/vector is defined to be
wt(u) = d(u, 0) = \sum_{i=1}^{n} |u_i - 0| = \sum_{i=1}^{n} u_i.
The Hamming distance d(., .) defined on some binary space V is also called the Hamming metric, and the space V equipped with the Hamming metric d(., .) is called a Hamming metric space.
Definition 3 (State-transition graph). The state-transition graph G = (V, E) of a developing system S is a directed graph where
- the vertices V consist of all feasible states that the system can realize,
- the edges E consist of arcs e = (u, v) such that state u can reach state v during the evolution of the concerned system.
Very often, the changing of states in a state-transition graph G = (V, E) can be tracked mathematically by measuring the Hamming distance between an original state u = (u_1, u_2, ..., u_n) and its effect state v = (v_1, v_2, ..., v_n).
Example 3 (The farmer's crossing-river problem, cont.). The states of the river-crossing process are binary vectors of length 4,
u = (u_f, u_g, u_c, u_w) = (u_1, u_2, u_3, u_4) in {0, 1}^4,
if we encode the left bank L and the right bank R by 0, 1 as done above!
In our specific example above, V can hold all 16 = 2^4 possible states if no system invariants were found and imposed on S. With Constraint I, V can be redefined as V := V \ {(1, 0, 0, 1), (0, 1, 1, 0)}.
We understand that when the farmer is rowing his boat, for instance from a left river bank u = (u_1, u_2, u_3, u_4) to a right river bank v = (v_1, v_2, v_3, v_4) (or the other way round), his position must change. The changing of state u to v creates an edge e = (u, v) in E, indeed! More precisely, the edge e = (u, v) is truly determined iff:
if u_1 = L (i.e. 0) then v_1 = R (i.e. 1), or the other way round.
Aha, we have just found another invariant that must always hold for the process to run: an edge e = (u, v) exists only if, equivalently,
Invariant 2: u_1 + v_1 = 1, where the sum is binary plus.
Combining this with the fact that the small boat can accommodate at most one of the farmer's belongings, we realize that a starting state u changes at most two of its coordinates to become the resulting state v. Hence, the third invariant is found:
Invariant 3: d(u, v) = \sum_{i=1}^{4} |u_i - v_i| <= 2.
Decomposition

Knowing how to describe a process or system by a state-transition graph G = (V, E) is not enough! The reason is that we sometimes wish
- to search among all eligible states in V to find best solutions, or
- to determine an optimal path running through that search space V.
This comes down to listing all states in V efficiently! In that situation, we could split the search space into several small-enough pieces, a procedure usually called Decomposition, and then list all elements in those pieces, called Brute force.

Example 4 (The farmer's crossing-river problem, cont.). The set of eligible states V consists of two parts: one holds every state corresponding to the position L of the farmer, and the other holds every state corresponding to the position R of the farmer.
This observation tells us to decompose the state vertices V into two subsets:
V_L = {u_L = (0, u_2, u_3, u_4)} in {0, 1}^4, and V_R = {u_R = (1, u_2, u_3, u_4)} in {0, 1}^4.
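To make this concrete, here is a minimal brute-force sketch in R (the language used for all code sketches in these notes, since R is among the course prerequisites). It assumes the encoding above (banks L = 0, R = 1; coordinates ordered farmer, goat, cabbage, wolf) and builds V and E from Constraint I and Invariants 2-3 only.

# Brute-force sketch: enumerate states, impose Constraint I, and build
# edges from Invariants 2 and 3. Assumed encoding: 0 = left bank L,
# 1 = right bank R; coordinates (farmer, goat, cabbage, wolf).
states <- as.matrix(expand.grid(rep(list(0:1), 4)))   # all 16 = 2^4 states
colnames(states) <- c("f", "g", "c", "w")

# Constraint I: remove the two forbidden states (1,0,0,1) and (0,1,1,0)
keys <- apply(states, 1, paste, collapse = "")
V <- states[!(keys %in% c("1001", "0110")), ]

# Hamming distance d(u, v) = sum |u_i - v_i|
d <- function(u, v) sum(abs(u - v))

# Invariant 2: the farmer changes banks; Invariant 3: d(u, v) <= 2
E <- list()
for (i in seq_len(nrow(V))) for (j in seq_len(nrow(V))) {
  u <- V[i, ]; v <- V[j, ]
  if (u[1] != v[1] && d(u, v) <= 2) E[[length(E) + 1]] <- c(i, j)
}
length(E)   # number of feasible arcs in the state-transition graph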
A continuous dynamic system S is described by a state equation of the form
\dot{x} = f(x, u, t),
where x is the state vector, u the control input and t time.

Chapter 2

Stochastic techniques

2.1 Generating functions

Consider a power series
A(x) = \sum_{j=0}^{\infty} a_j x^j.   (2.1)
If the series converges in some real interval -x_0 < x < x_0, the function A(x) is called the generating function of the sequence {a_j}.
Fact 2. If the sequence {a_j} is bounded by some constant K, then A(x) converges at least for |x| < 1. [Prove it!]
Fact 3. In case the sequence {a_j} represents probabilities, we introduce the restriction
a_j >= 0,  \sum_{j=0}^{\infty} a_j = 1.
The corresponding function A(x) is then a probability-generating function.
We consider the (point) probability distribution and the tail probability of a random variable X, given by
P[X = j] = p_j,  P[X > j] = q_j;
then the usual distribution function is
P[X <= j] = 1 - q_j.
The probability-generating function now is
P(x) = \sum_{j=0}^{\infty} p_j x^j = E(x^X),
where E denotes the expectation operator.
Also we can define a generating function for the tail probabilities:
Q(x) = \sum_{j=0}^{\infty} q_j x^j.
Q(x) is not a probability-generating function, however.
Fact 4.
a/ P(1) = \sum_{j=0}^{\infty} p_j 1^j = 1, and
|P(x)| <= \sum_{j=0}^{\infty} |p_j x^j| <= \sum_{j=0}^{\infty} p_j <= 1 if |x| <= 1. So P(x) is absolutely convergent at least for |x| <= 1.
b/ Q(x) is absolutely convergent at least for |x| < 1.
c/ Connection between P(x) and Q(x): (check this!)
(1 - x) Q(x) = 1 - P(x),  or  P(x) + Q(x) = 1 + x Q(x).
Mean and variance of a probability distribution:
m = E(X) = \sum_{j=0}^{\infty} j p_j = P'(1) = \sum_{j=0}^{\infty} q_j = Q(1).  (why!?)
Recall that the variance of the probability distribution {p_j} is
\sigma^2 = E(X(X - 1)) + E(X) - [E(X)]^2,
so we need to know
E[X(X - 1)] = \sum_{j=0}^{\infty} j(j - 1) p_j = P''(1) = 2 Q'(1).
Therefore,
\sigma^2 = P''(1) + P'(1) - [P'(1)]^2.  (check this!)
Exercise: Find the formula of the r-th factorial moment
\mu_{[r]} = E(X(X - 1)(X - 2) ... (X - r + 1)).
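As a quick numerical illustration of the mean and variance formulas above (an assumed example, not from the text): for a Poisson(lambda) variable the pgf is P(x) = e^{lambda (x - 1)}, so P'(1) = lambda, P''(1) = lambda^2, and the variance formula returns lambda. A short R sketch checks this by numerical differentiation:

# Check m = P'(1) and sigma^2 = P''(1) + P'(1) - [P'(1)]^2 numerically
# for the (assumed) Poisson(3) pgf P(x) = exp(3*(x - 1)).
lambda <- 3
pgf <- function(x) exp(lambda * (x - 1))
h <- 1e-5
P1 <- (pgf(1 + h) - pgf(1 - h)) / (2 * h)             # P'(1), approx 3
P2 <- (pgf(1 + h) - 2 * pgf(1) + pgf(1 - h)) / h^2    # P''(1), approx 9
c(mean = P1, variance = P2 + P1 - P1^2)               # both approx lambda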
Finding a generating function from a recurrence. Multiply both sides by x^n and sum over n.
Example: Fibonacci sequence
f_n = f_{n-1} + f_{n-2}  ==>  F(x) = x + x F(x) + x^2 F(x).
Finding a recurrence from a generating function.
Whenever we know F(x), we find its power series expansion P; the coefficients of P before x^n are the Fibonacci numbers.
How? Just remember how to find the partial-fraction expansion of F(x), in particular the basic expansion
1/(1 - \lambda x) = 1 + \lambda x + \lambda^2 x^2 + ...
In general, if G(x) is a generating function of a sequence (g_n), then
G^{(n)}(0) = n! g_n.
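A sketch of this "recurrence from a generating function" idea in R, assuming the closed form F(x) = x/(1 - x - x^2) derived above: expanding F(x) into a power series by long division recovers the Fibonacci coefficients.

# Expand F(x) = x / (1 - x - x^2) as a power series by long division;
# the coefficient of x^n is the n-th Fibonacci number f_n.
n   <- 10
num <- c(0, 1)                      # numerator: x
den <- c(1, -1, -1)                 # denominator: 1 - x - x^2
f   <- numeric(n + 1)               # f[k+1] will hold the coefficient of x^k
r   <- c(num, numeric(n + 2))       # working remainder, zero-padded
for (k in 0:n) {
  f[k + 1] <- r[k + 1]              # lowest remaining coefficient is f_k
  # subtract f_k * x^k * den(x) from the remainder
  r[(k + 1):(k + 3)] <- r[(k + 1):(k + 3)] - f[k + 1] * den
}
f                                   # 0 1 1 2 3 5 8 13 21 34 55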
Multiple random variables. We consider probabilities involving simultaneously the numerical values of several random variables and investigate their mutual couplings. In this section, we extend the concepts of PMF and expectation developed so far to multiple random variables.
Consider two discrete random variables X, Y : S -> R associated with the same experiment. The joint PMF of X and Y is defined by
p_{X,Y}(x, y) = P[X = x, Y = y]
for all pairs of numerical values (x, y) that X and Y can take. We will use the abbreviated notation P(X = x, Y = y) instead of the more precise notations P[(X = x) and (Y = y)] or P[X = x and Y = y]. For the pair of random variables X, Y, we say
Definition 4. X and Y are independent if for all x, y in R, we have
P[X = x, Y = y] = P[X = x] P[Y = y],  i.e.  p_{X,Y}(x, y) = p_X(x) p_Y(y),
or in terms of conditional probability,
P(X = x | Y = y) = P(X = x).
This can be extended to the so-called mutual independence of a finite number n of random variables.
Expectation. The expectation operator defines the expected value of a random variable X as
Definition 5.
E(X) = \sum_{x in Range(X)} P[X = x] x.
If we consider X as a function from a sample space S to the naturals N, then
E(X) = \sum_{i=0}^{\infty} P[X > i].  (Why?)
Functions of Multiple Random Variables. When there are multiple random variables of interest, it is possible to generate new random variables by considering functions involving several of these random variables. In particular, a function Z = g(X, Y) of the random variables X and Y defines another random variable. Its PMF can be calculated from the joint PMF p_{X,Y} according to
p_Z(z) = \sum_{(x,y): g(x,y)=z} p_{X,Y}(x, y).
Furthermore, the expected value rule for functions naturally extends and takes the form
E[g(X, Y)] = \sum_{(x,y)} g(x, y) p_{X,Y}(x, y).
Theorem 6. We have two important results about expectation.
1. (Linearity) E(X + Y) = E(X) + E(Y) for any pair of random variables X, Y.
2. (Independence) E(X Y) = E(X) E(Y) for any pair of independent random variables X, Y.
2.2 Convolutions
Now we consider two nonnegative independent integer-valued random variables X and Y, having the probability distributions
P[X = j] = a_j,  P[Y = k] = b_k.   (2.2)
The joint probability of the event (X = j, Y = k) is obviously a_j b_k. We form a new random variable
S = X + Y;
then the event {S = r} comprises the mutually exclusive events
(X = 0, Y = r), (X = 1, Y = r - 1), ..., (X = r, Y = 0).
Fact 5. The probability distribution of the sum S then is
P[S = r] = c_r = a_0 b_r + a_1 b_{r-1} + ... + a_r b_0.
Proof.
p_S(r) = P(X + Y = r) = \sum_{(x,y): x+y=r} P(X = x and Y = y) = \sum_x p_X(x) p_Y(r - x).
Definition 7. This method of compounding two sequences of numbers (not necessarily probabilities) is called convolution. The notation
{c_j} = {a_j} * {b_j}
will be used.
Fact 6. Define the generating functions of the sequences {a_j}, {b_j} and {c_j} by
A(x) = \sum_{j=0}^{\infty} a_j x^j,  B(x) = \sum_{j=0}^{\infty} b_j x^j,  C(x) = \sum_{j=0}^{\infty} c_j x^j;
it follows that C(x) = A(x) B(x). [check this!]
In practical applications, the sum of several independent integer-valued random variables X_i can be defined:
S_n = X_1 + X_2 + ... + X_n,  n in Z^+.
If the X_i have a common probability distribution {p_j}, with probability-generating function P(x), then the probability-generating function of S_n is P(x)^n. Clearly, the distribution of S_n is the n-fold convolution
{p_j} * {p_j} * ... * {p_j}  (n factors) = {p_j}^{*n}.
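A small numerical sketch of Facts 5 and 6 in R, using two assumed toy distributions: the convolution is computed directly, and the identity C(x) = A(x)B(x) is checked at a test point.

# Assumed example distributions: X ~ {a_j} on {0,1,2}, Y ~ {b_k} on {0,1,2}.
a <- c(0.2, 0.5, 0.3)                      # P(X = 0), P(X = 1), P(X = 2)
b <- c(0.6, 0.3, 0.1)                      # P(Y = 0), P(Y = 1), P(Y = 2)
cc <- convolve(a, rev(b), type = "open")   # c_r = sum_j a_j b_{r-j}
round(cc, 4)                               # distribution of S = X + Y on {0,...,4}

# C(x) = A(x) B(x): compare at a test point x = 0.7
A <- function(x) sum(a * x^(0:2))
B <- function(x) sum(b * x^(0:2))
C <- function(x) sum(cc * x^(0:4))
all.equal(C(0.7), A(0.7) * B(0.7))         # TRUE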
2.3 Compound distributions

In our discussion so far of sums of random variables, we have always assumed that the number of variables in the sum is known and fixed, i.e., it is nonrandom. We now generalize the previous concept of convolution to the case where the number N of random variables X_k contributing to the sum is itself a random variable! In particular, we consider the sum
S_N = X_1 + X_2 + ... + X_N, where
P[X_k = j] = f_j,  P[N = n] = g_n,  P[S_N = l] = h_l.   (2.3)
The probability-generating functions of X, N and S are
F(x) = \sum_j f_j x^j,  G(x) = \sum_n g_n x^n,  H(x) = \sum_l h_l x^l.   (2.4)
Compute H(x) with respect to F(x) and G(x). Prove that
H(x) = G(F(x)).
Example 6. A remote village has three gas stations, and each one of them is open on any given day with probability 1/2, independently of the others. The amount of gas available in each gas station is unknown and is uniformly distributed between 0 and 1000 gallons. We wish to characterize the distribution of the total amount of gas available at the gas stations that are open.
The number N of open gas stations is a binomial random variable with p = 1/2, and the corresponding transform is
G_N(x) = (1 - p + p e^x)^3 = (1/8)(1 + e^x)^3.
The transform (probability-generating function) F_X(x) associated with the amount of gas available in an open gas station is
F_X(x) = (e^{1000x} - 1) / (1000x).
The transform H_S(x) associated with the total amount S of gas available at the three gas stations of the village that are open is the same as G_N(x), except that each occurrence of e^x is replaced with F_X(x), i.e.,
H_S(x) = G(F(x)) = (1/8)(1 + F_X(x))^3.
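The distribution of S in Example 6 can also be explored by straightforward Monte Carlo; the sketch below assumes exactly the stated model (N ~ Binomial(3, 1/2), amounts Uniform(0, 1000)) and estimates the mean of S, which should be near E(N)E(X) = 1.5 x 500 = 750.

# Monte Carlo sketch of Example 6.
set.seed(1)
R <- 1e5
N <- rbinom(R, size = 3, prob = 0.5)          # number of open stations
S <- sapply(N, function(n) if (n == 0) 0 else sum(runif(n, 0, 1000)))
mean(S)                                        # approx 750 gallons
mean(S == 0)                                   # P(no station open) = 1/8
hist(S, breaks = 60, main = "Total available gas S", xlab = "gallons")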
Application in Large Deviation theory
We are interested in a practical situation in the insurance industry, originally studied in 1932 by F. Esscher (Notices of the AMS, Feb 2008).
Problem: too many claims could be made against the insurance company; we worry about the total claim amount exceeding the reserve fund set aside for paying these claims.
Our aim: to compute the probability of this event.
Modeling. Each individual claim is a random variable, we assume some distribution for it, and the total claim is then the sum S of a large number of (independent or not) random variables. The probability that this sum exceeds a certain reserve amount is the tail probability of the sum S.
Large Deviation theory, pioneered by Esscher, requires the calculation of moment generating functions! If the random variables are independent, then the moment generating function of the sum is the product of the individual ones; but if they are not (as in a Markov chain), then there is no longer just one moment generating function!
Research project: study Large Deviation theory to solve this problem.
2.4 Introductory Stochastic Processes

The concept. A stochastic process is just a collection (usually infinite) of random variables, denoted X_t or X(t), where the parameter t often represents time. The state space of a stochastic process consists of all realizations x of X_t; i.e., X_t = x says the random process is in state x at time t. Stochastic processes can be generally subdivided into four distinct categories depending on whether t and X_t are discrete or continuous:
1. Discrete processes: both are discrete, such as the Bernoulli process (die rolling) or discrete time Markov chains.
2. Continuous time discrete state processes: the state space of X_t is discrete and the index set, e.g. the time set T of t, is continuous, such as an interval of the reals R.
- Poisson process: the number of clients X(t) who have entered ACB from the time it opened until time t. X(t) has the Poisson distribution with mean E[X(t)] = \lambda t (\lambda being the arrival rate).
- Continuous time Markov chain.
- Queueing process: people not only enter but also leave the bank; we need the distribution of service time (the time a client spends in ACB).
3. Continuous processes: both X_t and t are continuous, such as a diffusion process (Brownian motion).
4. Discrete time continuous state processes: X_t is continuous and t is discrete; the so-called TIME SERIES, such as
- monthly fluctuations of the inflation rate of Vietnam,
- daily fluctuations of a stock market.
Examples
1. Discrete processes: the random walk model consisting of positions X_t of an object (a drunkard) at discrete time points t during 24 hours, whose directional distance from a particular point 0 is measured in integer units. Here T = {0, 1, 2, ..., 24}.
2. Continuous time discrete state processes: X_t is the number of births in a given population during the time period [0, t]. Here T = R^+ = [0, \infty) and the state space is {0, 1, 2, ...}. The sequence of failure times of a machine is a specific instance.
3. Continuous processes: X_t is the population density at time t in T = R^+ = [0, \infty), and the state space of X_t is R^+.
4. TIME SERIES of daily fluctuations of a stock market.

Which interesting characteristics of a stochastic process do we want to know? We know a stochastic process is a mathematical model of a probabilistic experiment that evolves in time and generates a sequence of numerical values. Three interesting aspects are:
(a) We tend to focus on the dependencies in the sequence of values generated by the process. For example, how do future prices of a stock depend on past values?
(b) We are often interested in long-term averages involving the entire sequence of generated values. For example, what is the fraction of time that a machine is idle?
(c) We sometimes wish to characterize the likelihood or frequency of certain boundary events. For example, what is the probability that within a given hour all circuits of some telephone system become simultaneously busy, or what is the frequency with which some buffer in a computer network overflows with data?
A few fundamental properties and categories
1. STATIONARY property: A process is stationary when all the X(t) have the same distribution. That means, for any \tau, the distribution of a stationary process is unaffected by a shift in the time origin, and X(t) and X(t + \tau) have the same distribution. For the first-order distribution,
F_X(x; t) = F_X(x; t + \tau) = F_X(x), and f_X(x; t) = f_X(x).
These processes are found in Arrival-Type Processes, in which we are interested in occurrences that have the character of an arrival, such as message receptions at a receiver, job completions in a manufacturing cell, customer purchases at a store, etc. We will focus on models in which the interarrival times (the times between successive arrivals) are independent random variables.
- The case where arrivals occur in discrete time and the interarrival times are geometrically distributed is the Bernoulli process.
- The case where arrivals occur in continuous time and the interarrival times are exponentially distributed is the Poisson process.
The Bernoulli process and the Poisson process will be investigated in detail in the Stochastic Processes course.
2. MARKOVIAN (memoryless) property: Many processes with the memoryless property arise from experiments that evolve in time and in which the future evolution exhibits a probabilistic dependence on the past. As an example, the future daily prices of a stock are typically dependent on past prices. However, in a Markov process, we assume a very special type of dependence: the next value depends on past values only through the current value; that is, X_{i+1} depends only on X_i, and not on any previous values.
2.5 Markov Chains (MC), a key tool in modeling random phenomena

We discuss the concept of discrete time Markov chains, or just Markov chains, in this section. Suppose we have a sequence M of consecutive trials, numbered n = 0, 1, 2, .... The outcome of the n-th trial is represented by the random variable X_n, which we assume to be discrete and to take one of the values j in a finite set Q of discrete outcomes/states {e_1, e_2, e_3, ..., e_s}.
M is called a (discrete time) Markov chain if, while occupying Q states at each of the unit time points 0, 1, 2, 3, ..., n-1, n, n+1, ..., M satisfies the following property, called the Markov property or memoryless property:
P(X_{n+1} = j | X_n = i, ..., X_0 = a) = P(X_{n+1} = j | X_n = i), for all n = 0, 1, 2, ....
(In each time step n to n+1, the process can stay at the same state e_i (at both n, n+1) or move to another state e_j (at n+1) with respect to the memoryless rule, which says that the future behavior of the system depends only on the present and not on its past history.)
Definition 8 (One-step transition probability).
Denote the absolute probability of outcome j at the n-th trial by
p_j(n) = P(X_n = j).   (2.5)
The one-step transition probability, denoted
p_{ij}(n + 1) = P(X_{n+1} = j | X_n = i),
is defined as the conditional probability that the process is in state j at time n+1 given that the process was in state i at the previous time n, for all i, j in Q.
Independence of time: homogeneous Markov chains. If the state transition probabilities p_{ij}(n + 1) in a Markov chain M are independent of the time n, they are said to be stationary, time homogeneous or just homogeneous. The state transition probability in a homogeneous chain can then be written without mentioning the time point n:
p_{ij} = P(X_{n+1} = j | X_n = i).   (2.6)
Unless stated otherwise, we assume and will work with homogeneous Markov chains M. The one-step transition probabilities given by (2.6) must satisfy
\sum_{j=1}^{s} p_{ij} = 1 for each i = 1, 2, ..., s, and p_{ij} >= 0.
Transition Probability Matrix. In practical applications, we are likely given the initial distribution (i.e. the probability distribution of the starting position of the concerned object at time point 0) and the transition probabilities, and we want to determine the probability distribution of the position X_n for any time point n > 0. The Markov property, quantitatively described through transition probabilities, can be represented in the state transition matrix P = [p_{ij}]:
P = \begin{pmatrix} p_{11} & p_{12} & p_{13} & \dots & p_{1s} \\ p_{21} & p_{22} & p_{23} & \dots & p_{2s} \\ p_{31} & p_{32} & p_{33} & \dots & p_{3s} \\ \vdots & \vdots & \vdots & & \vdots \end{pmatrix}   (2.7)
Briefly, we have
Definition 9. A (homogeneous) Markov chain M is a triple (Q, p, P) in which:
- Q is a finite set of states (identified with an alphabet),
- p(0) are the initial probabilities (at the initial time point n = 0),
- P = [p_{ij}] is the matrix of state transition probabilities, in which
p_{ij} = P(X_{n+1} = j | X_n = i),
and such that the memoryless property is satisfied, i.e.,
P[X_{n+1} = j | X_n = i, ..., X_0 = a] = P[X_{n+1} = j | X_n = i], for all n.
In practice, the initial probabilities p(0) are obtained at the current time (the beginning of a study), and the transition probability matrix P is found from empirical observations in the past. In most cases, the major concern is using P and p(0) to predict the future.
Example 7. The Coopmart chain (denoted C) in SG currently controls 60% of the daily processed-food market; their rivals Maximart and other brands (denoted M) take the other share. Data from the previous years (2006 and 2007) show that 88% of C's customers remained loyal to C, while 12% switched to rival brands. In addition, 85% of M's customers remained loyal to M, while the other 15% switched to C. Assuming that these trends continue, use MC theory to determine C's share of the market (a) in 5 years and (b) over the long run.
Proposed solution. Suppose that the brand attraction is time homogeneous. For a sample of large enough size n, we denote the customers' attention in year n by a random variable X_n. The market share probability of the whole population can then be approximated using the sample statistics, e.g.
P(X_n = C) = |{x : X_n(x) = C}| / n, and P(X_n = M) = 1 - P(X_n = C).
Set n = 0 for the current time; the initial probabilities then are
p(0) = [0.6, 0.4] = [P(X_0 = C), P(X_0 = M)].
Obviously we want to know the market share probabilities p(n) = [P(X_n = C), P(X_n = M)] at any year n > 0. We now introduce a transition probability matrix whose rows and columns are labeled C and M (in that order):
P = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix} = \begin{pmatrix} 0.88 & 0.12 \\ 0.15 & 0.85 \end{pmatrix},   (2.8)
where a = p_{CM} = P[X_{n+1} = M | X_n = C] = 0.12 and b = p_{MC} = P[X_{n+1} = C | X_n = M] = 0.15.
The n-step transition probabilities p^{(n)}_{ij} satisfy the Chapman-Kolmogorov equations
p^{(n)}_{ij} = \sum_{h=1}^{s} p^{(n-k)}_{ih} p^{(k)}_{hj},  0 < k < n.
This results in the matrix notation
P^{(n)} = P^{(n-k)} P^{(k)}.
Since P^{(1)} = P, we get P^{(2)} = P^2, and in general P^{(n)} = P^n.
Let p^{(n)} denote the vector form of the probability mass distribution (pmf, or absolute probability distribution) associated with X_n of a Markov process, that is
p^{(n)} = [p_1(n), p_2(n), p_3(n), ..., p_s(n)],
where each p_i(n) is defined as in (2.5).
Proposition 10. The absolute probability distribution p^{(n)} at any stage n of a Markov chain is given in matrix form by
p^{(n)} = P^n p^{(0)}, where p^{(0)} = p is the initial probability vector.   (2.10)
Proof. We employ two facts:
* P^{(n)} = P^n, and
* the absolute probability distribution p^{(n+1)} at any stage n+1 (associated with X_{n+1}) can be found from the 1-step transition matrix P = [p_{ij}] and the distribution
p^{(n)} = [p_1(n), p_2(n), p_3(n), ..., p_s(n)]
at stage n (associated with X_n):
p_j(n + 1) = \sum_{i=1}^{s} p_{ij} p_i(n), or in matrix notation p^{(n+1)} = P p^{(n)}.
Then just do the induction:
p^{(n+1)} = P p^{(n)} = P (P p^{(n-1)}) = ... = P^{n+1} p^{(0)}.
Example 8 (The Coopmart chain, cont.). (a/) C's share of the market in 5 years can be computed by
p^{(5)} = [p_C(5), p_M(5)] = P^5 p^{(0)}.
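A short R sketch of Example 8 follows. It assumes the row-stochastic matrix of (2.8), so the pmf evolves as the row-vector product p(n) = p(0) P^n, which is the component formula p_j(n+1) = \sum_i p_i(n) p_{ij} written out; the long-run shares agree with the two-state closed form (b, a)/(a+b) derived in Section 2.7.

# Market shares of Example 7/8, with P row-stochastic as in (2.8).
P <- matrix(c(0.88, 0.12,
              0.15, 0.85), nrow = 2, byrow = TRUE,
            dimnames = list(c("C", "M"), c("C", "M")))
p0 <- c(C = 0.6, M = 0.4)

matpow <- function(M, n) Reduce(`%*%`, replicate(n, M, simplify = FALSE))
p0 %*% matpow(P, 5)          # (a) shares in 5 years
p0 %*% matpow(P, 200)        # (b) long run, approx stationary
c(0.15, 0.12) / 0.27         # two-state closed form (b, a)/(a + b)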
2.6 Classification of States

Accessible states. State j is said to be accessible from state i if for some n >= 0, p^{(n)}_{ij} > 0, and we write i -> j. Two states i and j accessible to each other are said to communicate, and we write i <-> j. If all states communicate with each other, then we say that the Markov chain is irreducible.
Recurrent states. Let A(i) be the set of states that are accessible from i. We say that i is recurrent if for every j that is accessible from i, i is also accessible from j; that is, for all j in A(i) we have that i in A(j).
When we start at a recurrent state i, we can only visit states j in A(i) from which i is accessible. Thus, from any future state, there is always some probability of returning to i and, given enough time, this is certain to happen. By repeating this argument, if a recurrent state is visited once, it will be revisited an infinite number of times.
Transient states. A state is called transient if it is not recurrent. In particular, there are states j in A(i) such that i is not accessible from j. After each visit to state i, there is positive probability that the state enters such a j. Given enough time, this will happen, and state i cannot be visited after that. Thus, a transient state will only be visited a finite number of times.
If i is a recurrent state, the set of states A(i) that are accessible from i forms a recurrent class (or simply class), meaning that states in A(i) are all accessible from each other, and no state outside A(i) is accessible from them. Mathematically, for a recurrent state i, we have A(i) = A(j) for all j that belong to A(i), as can be seen from the definition of recurrence. It can be seen that at least one recurrent state must be accessible from any given transient state. This is intuitively evident, and a more precise justification is given in the theoretical problems section. It follows that there must exist at least one recurrent state, and hence at least one class. Thus, we reach the following conclusion.
Markov Chain Decomposition.
- A MC can be decomposed into one or more recurrent classes, plus possibly some transient states.
- A recurrent state is accessible from all states in its class, but is not accessible from recurrent states in other classes.
- A transient state is not accessible from any recurrent state.
- At least one, possibly more, recurrent states are accessible from a given transient state.
Remark 7. For the purpose of understanding the long-term behavior of Markov chains, it is important to analyze chains that consist of a single recurrent class.
For the purpose of understanding short-term behavior, it is also important to analyze the mechanism by which any particular class of recurrent states is entered starting from a given transient state.
Periodic states.
Absorption probabilities. In this section, we study the short-term behavior of Markov chains. We first consider the case where the Markov chain starts at a transient state. We are interested in the first recurrent state to be entered, as well as in the time until this happens. When focusing on such questions, the subsequent behavior of the Markov chain (after a recurrent state is encountered) is immaterial. State j is said to be an absorbing state if p_{jj} = 1; that is, once state j is reached, it is never left. We assume, without loss of generality, that every recurrent state k is absorbing:
p_{kk} = 1,  p_{kj} = 0 for all j != k.
- If there is a unique absorbing state k, its steady-state probability is 1 (because all other states are transient and have zero steady-state probability), and it will be reached with probability 1, starting from any initial state.
- If there are multiple absorbing states, the probability that one of them will eventually be reached is still 1, but the identity of the absorbing state to be entered is random and the associated probabilities may depend on the starting state.
In the sequel, we fix a particular absorbing state, denoted by s, and consider the absorption probability a_i that s is eventually reached, starting from i:
a_i = P(X_n eventually becomes equal to the absorbing state s | X_0 = i).
Absorption probabilities can be obtained by solving a system of linear equations:
a_s = 1;  a_i = 0, for all absorbing i != s;  a_i = \sum_{j=1}^{m} p_{ij} a_j, for all transient i.
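The linear system above is easy to solve numerically. The R sketch below uses an assumed 4-state example (states 1, 2 transient; 3, 4 absorbing) and fixes s = 3: for transient i, the equation a_i = \sum_j p_{ij} a_j becomes (I - P_TT) a_T = P_{T,s}.

# Absorption probabilities into state s = 3 for an assumed 4-state chain.
P <- matrix(c(0.3, 0.3, 0.2, 0.2,
              0.4, 0.2, 0.1, 0.3,
              0,   0,   1,   0,
              0,   0,   0,   1), nrow = 4, byrow = TRUE)
trans <- 1:2; s <- 3
A   <- diag(length(trans)) - P[trans, trans]   # I - P_TT
rhs <- P[trans, s]                             # one-step hits of s
a   <- solve(A, rhs)
a   # absorption probabilities a_1, a_2 (a_3 = 1, a_4 = 0 by definition)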
2.7 Limiting probabilities and Stationary distribution of a MC

Definition 11. A probability vector p* is a stationary distribution of the chain if
p* P = p*.
This equation indicates that a stationary distribution p* is a left eigenvector of P with eigenvalue 1. In general, we also wish to know the limiting probabilities
p* = \lim_{n \to \infty} p^{(n)} = \lim_{n \to \infty} P^n p^{(0)}.
We need some general results to determine the stationary distribution p* of a Markov chain.
A) Markov chains that have two states. At first we investigate the case of Markov chains that have two states, say Q = {e_1, e_2}. Let a = p_{e_1 e_2} and b = p_{e_2 e_1} be the state transition probabilities between distinct states in a two-state Markov chain; its state transition matrix is
P = \begin{pmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{pmatrix} = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix}, where 0 < a < 1, 0 < b < 1.   (2.11)
Proposition 12.
a) The n-step transition probability matrix is given by
P^{(n)} = P^n = \frac{1}{a+b} \left( \begin{pmatrix} b & a \\ b & a \end{pmatrix} + (1-a-b)^n \begin{pmatrix} a & -a \\ -b & b \end{pmatrix} \right).
b) Find the limit matrix when n -> \infty.
To prove this basic Proposition 12 (computing the transition probability matrix of two-state Markov chains), we use a fundamental result of linear algebra that is recalled in Appendix B.
Proof. The eigenvalues of the state transition matrix P, found by solving the characteristic equation
c(\lambda) = |\lambda I - P| = 0,
are \lambda_1 = 1 and \lambda_2 = 1 - a - b. The spectral decomposition of a square matrix says P can be decomposed into two constituent matrices E_1, E_2 (since only two eigenvalues were found):
E_1 = \frac{1}{\lambda_1 - \lambda_2} [P - \lambda_2 I],  E_2 = \frac{1}{\lambda_2 - \lambda_1} [P - \lambda_1 I].
That means E_1, E_2 are mutually orthogonal matrices, i.e. E_1 E_2 = 0 = E_2 E_1, and
P = \lambda_1 E_1 + \lambda_2 E_2;  E_1^2 = E_1,  E_2^2 = E_2.
Hence P^n = \lambda_1^n E_1 + \lambda_2^n E_2 = E_1 + (1-a-b)^n E_2, or
P^{(n)} = P^n = \frac{1}{a+b} \left( \begin{pmatrix} b & a \\ b & a \end{pmatrix} + (1-a-b)^n \begin{pmatrix} a & -a \\ -b & b \end{pmatrix} \right).
b) The limit matrix when n -> \infty (note that |1-a-b| < 1):
\lim_{n \to \infty} P^n = \frac{1}{a+b} \begin{pmatrix} b & a \\ b & a \end{pmatrix}.
B) Markov chains that have more than two states. For s > 2 it is cumbersome to compute the constituent matrices E_i of P, so we employ the so-called regular property. A Markov chain is regular if there exists m in N such that P^{(m)} = P^m > 0 (every entry is positive).
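Proposition 12 and its limit are easy to check numerically; the R sketch below uses the assumed values a = 0.12, b = 0.15 from the Coopmart example.

# Numerical check of Proposition 12 for assumed a, b.
a <- 0.12; b <- 0.15; n <- 5
P  <- matrix(c(1 - a, a, b, 1 - b), 2, byrow = TRUE)
Pn <- Reduce(`%*%`, replicate(n, P, simplify = FALSE))   # P^n by repeated product
E1 <- matrix(c(b, a, b, a), 2, byrow = TRUE) / (a + b)
E2 <- matrix(c(a, -a, -b, b), 2, byrow = TRUE) / (a + b)
all.equal(Pn, E1 + (1 - a - b)^n * E2)    # TRUE: the closed form holds
E1                                         # the limit matrix lim P^n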
2.8 Exercises

A/ Simple skills.
Let Z_1, Z_2, ... be independent identically distributed random variables with P(Z_n = 1) = p and P(Z_n = -1) = q = 1 - p for all n. Let
X_n = \sum_{i=1}^{n} Z_i,  n = 1, 2, ...,
and X_0 = 0. The collection of random variables {X_n, n >= 0} is a random process, and it is called the simple random walk X(n) in one dimension.
(a) Describe the simple random walk X(n).
(b) Construct a typical sample sequence (or realization) of X(n).
(c) Find the probability that X(n) = 2 after four steps.
(d) Verify the result of part (c) by enumerating all possible sample sequences that lead to the value X(n) = 2 after four steps.
(e) Find the mean and variance of the simple random walk X(n). Find the autocorrelation function R_X(n, m) of the simple random walk X(n).
(f) Show that the simple random walk X(n) is a Markov chain.
(g) Find its one-step transition probabilities.
(h) Derive the first-order probability distribution of the simple random walk X(n).
Solution.
(a) The simple random walk X(n) is a discrete-parameter (or time), discrete-state random process. The state space is E = {..., -2, -1, 0, 1, 2, ...}, and the index parameter set is T = {0, 1, 2, ...}.
(b) A sample sequence x(n) of a simple random walk X(n) can be produced by tossing a coin every second and letting x(n) increase by unity if a head H appears and decrease by unity if a tail T appears. Thus, for instance, we have a small realization of X(n) in Table 2.1:

n             0   1   2   3   4   5   6   7   8   9   10
Coin tossing      H   T   T   H   H   H   T   H   H   T
x_n           0   1   0  -1   0   1   2   1   2   3   2

Table 2.1: Simple random walk from coin tossing

The sample sequence x(n) obtained above is plotted in the (n, x(n))-plane. The simple random walk X(n) specified in this problem is said to be unrestricted because there are no bounds on the possible values of X. The simple random walk process is often used in the following primitive gambling model: toss a coin; if a head appears, you win one dollar; if a tail appears, you lose one dollar.
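For parts (b) and (c), a simulation sketch in R: one sample path as in Table 2.1, and a Monte Carlo estimate of P(X(4) = 2), whose exact value for p = 1/2 is C(4,3) p^3 q = 0.25.

# (b) one realization of the simple random walk with a fair coin
set.seed(123)
Z <- sample(c(1, -1), size = 10, replace = TRUE)   # heads = +1, tails = -1
c(0, cumsum(Z))                                    # X_0, X_1, ..., X_10

# (c) Monte Carlo estimate of P(X(4) = 2); exact value 4 * (1/2)^4 = 0.25
mean(replicate(1e5, sum(sample(c(1, -1), 4, replace = TRUE)) == 2))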
B/ Concepts.
1.-2. Classify the states of the Markov chains with the following transition matrices:
P = \begin{pmatrix} 0 & 0 & 0.5 & 0.5 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix};  P_3 = \begin{pmatrix} 0.3 & 0.4 & 0 & 0 & 0.3 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.6 & 0.4 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix}
3. Verify the transitivity property of the Markov chain; that is, if i -> j and j -> k, then i -> k. (Hint: use the Chapman-Kolmogorov equations.)
4. Show that in a finite-state Markov chain, not all states can be transient.
C/ Markov Chains and Modeling.
Consider the chain on states {N, M, L, S}, with rows and columns of P labeled N, M, L, S (in that order):
P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0.4 & 0 & 0.6 & 0 \\ 0.2 & 0 & 0.1 & 0.7 \\ 0 & 0 & 0 & 1 \end{pmatrix}
Is P regular? ergodic? Find the long-term distribution matrix L = \lim_{m \to \infty} P^m. What is your conclusion? (Remark that the states N and S are called absorbing states.)
Chapter 3
Simulation
This chapter is aimed at providing a brief introduction to simulation methods and tools within Industrial Statistics, Computational Mathematics and Operations Research.

3.1 Introductory Simulation

Practical Motivation. When an organisation realises that a system is not operating as desired, it will look for ways to improve its performance. To do so, sometimes it is possible to experiment with the real system and, through observation and the aid of Statistics, reach valid conclusions towards future system improvement. However, experiments with a real system may entail ethical and/or economical problems, which may be avoided by dealing with a prototype, a physical model.
Sometimes it is not feasible or possible to build a prototype, yet we may obtain a mathematical model describing, through equations and constraints, the essential behaviour of the system. This analysis may sometimes be done through analytical or numerical methods, but the model may be too complex to be dealt with. Statistically, in the design phase of a system there is no system available, and we cannot rely on measurements for generating a pdf. In such extreme cases, we may use simulation. Large complex system simulation has become common practice in many industrial areas. Essentially, simulation consists of
(i) building a computer model that describes the behaviour of a system; and
(ii) experimenting with this model to reach conclusions that support decisions.
Once we have a computer simulation model of the actual system, we need to generate values for the random quantities that are part of the system input (to the model).
Note that besides Simulation, two other key methods used to solve practical problems in OR are Linear Programming and Statistical Methods.
In this chapter, from the statistical point of view, we introduce key concepts, methods and tools from simulation with the Industrial Statistics orientation in mind. The major parts of this chapter are from [8] and [28]. We mainly consider the problem within Step (ii) only; to conduct Step (i) rightly and meaningfully, a close collaboration with experts in specific areas is vital. Topics discussing Step (i) are covered in the other chapters. We learn:
1. How to generate random numbers?
2. How to transform random numbers into input data?
3. How to measure/record output data?
4. How to analyze and interpret output data and make meaningful inferences?
3.2 Generation of random numbers

General concepts. Ref. 3.1, [8] and [28].
The most basic computational component in simulation involves the generation of random variables distributed uniformly between 0 and 1. These can then be used to generate other random variables, both discrete and continuous, depending on the practical context. A few major requirements for meaningfully reasonable/reliable simulation:
- the simulation is run long enough to obtain an estimate of the operating characteristics of the system;
- the number of runs should also be large enough to obtain reliable estimates;
- the fact that the result of each run is a random sample implies that a simulation is a statistical experiment, which must be conducted using statistical tools such as: i) point estimation, ii) confidence intervals and iii) hypothesis testing.
A schematic diagram to mathematically simulate a system. If a system S is described by a discrete random variable X, a fundamental diagram to simulate S is:
a random number generator G -> uniform random variable U -> pdf or cdf of X.
3.3 Transforming random numbers into input data

Ref. 3.2, [8].
Now some advanced simulation techniques.
Practicality: we use G to randomly compute specific values of X in the last two phases of this diagram, using the so-called discrete inverse transform method, in which we write the cdf of X as F(k) = \sum_{i=0}^{k} p(i) in [0, 1]; then:
- generate a uniform random number U in [0, 1] by G,
- find the value X = k by determining the interval [F(k-1), F(k)] containing U; mathematically this means finding the preimage F^{-1}(U).
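A minimal R sketch of the discrete inverse transform method, with an assumed pmf p = (0.2, 0.5, 0.3) on {0, 1, 2}: findInterval locates the subinterval of the cumulative sums containing U.

# Discrete inverse transform: generate X from its cdf F(k) = cumsum(p).
p  <- c(0.2, 0.5, 0.3)                 # assumed pmf on k = 0, 1, 2
Fk <- cumsum(p)                        # F(0), F(1), F(2)
gen_X <- function(m) {
  U <- runif(m)                        # uniform random numbers from G
  findInterval(U, Fk)                  # k such that F(k-1) < U <= F(k)
}
set.seed(7)
table(gen_X(1e4)) / 1e4                # relative frequencies approx p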
The Transformation Method
Generally, we need an algorithm, named the Transformation Method, described in two steps:
Step 1: use an algorithm A to generate variates V_n, n = 1, 2, ... of a random variable V (V = U in the above example) with a specific cdf F_V(v) in the continuous case or pdf f_V(v) in the discrete case. Then
Step 2: employ an appropriate transformation g(.) to generate a variate of X, namely X_n = g(V_n).
Theorem 14 (Relationship of V and X). Consider a random variable V with pdf f_V(v) and a given transformation X = g(V). Denote by v_1, v_2, ..., v_n the real roots of the equation
x - g(v) = 0.   (3.1)
Then the pdf of the random variable X is given by
f_X(x) = \sum_{l=1}^{n} f_V(v_l) \frac{1}{|dg/dv (v_l)|}.
Given x, if Eq. (3.1) has no real solutions, then the pdf f_X(x) = 0.
Proof. DIY
The two most important uses of the Transformation Method are:
A) Linear (affine when b != 0) case: X = g(V) = aV + b, where a, b in R. Then
f_X(x) = \frac{1}{|a|} f_V\left(\frac{x - b}{a}\right).
B) Inverse case: X = g(V) = F_X^{-1}(V), where F_X(x) is the cdf of the random variable X.
Theorem 15 (Inverse case). Consider a random variable V with uniform cdf F_V(v) = v, v in [0, 1]. Then the transformation X = g(V) = F_X^{-1}(V) gives variates x of X with cdf F_X(x).
Proof. For any real number a, by the monotonicity of the cdf F_X,
P(X <= a) = P[F_X^{-1}(V) <= a] = P[V <= F_X(a)] = F_V(F_X(a)) = F_X(a).
Using this, an algorithm is formulated for generating variates of a random variable X:
1. Invert the given cdf F_X(x) to find its inverse F_X^{-1}.
2. Generate a uniform variate V in [0, 1].
3. Generate variates x via the transformation x = F_X^{-1}(V).
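A sketch of the three-step algorithm in R for an assumed continuous target, the exponential distribution with rate 2, whose cdf F_X(x) = 1 - e^{-2x} inverts in closed form to F_X^{-1}(v) = -log(1 - v)/2.

# Inverse-transform generation of Exp(rate = 2) variates.
set.seed(42)
rate <- 2
V <- runif(1e5)                      # Step 2: uniform variates on [0, 1]
x <- -log(1 - V) / rate              # Steps 1 and 3: x = F_X^{-1}(V)
c(sample.mean = mean(x), true.mean = 1 / rate)   # both approx 0.5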
Example 9. Consider a Bernoulli random variable X ~ B(p) where p = P(X = 1). The cdf F_X(x) = P(X <= x) is a step (stair-case) function u(.). [That is, u(t) = b_i if a_i <= t < a_{i+1}, where (a_i)_i is an ascending sequence.] Here
F_X(x) = 0 if x < 0;  F_X(x) = 1 - p if 0 <= x < 1;  F_X(x) = (1 - p) + p = 1 if 1 <= x.
How to generate X? We employ V ~ UniDist([0, 1]) and the fact that the inverse is
F_X^{-1}(V) = u(V - (1 - p)),
where u is the unit step function: X = 1 exactly when V > 1 - p.
Example 10. Consider a binomial random variable X ~ BinomDist(n, p), with p the success probability in each trial. X takes values in {0, 1, ..., n} and the distribution is given by the probability function
p(k) = P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}.
We employ V ~ UniDist([0, 1]) and use
F_X(x) = P(X <= x) = V  ==>  x = F_X^{-1}(V) = u(V),
in which the parameters of the step function u(V) are given by:
u(V) = k if \sum_{i=0}^{k-1} p(i) < V <= \sum_{i=0}^{k} p(i), k in {1, ..., n};  u(V) = 0 if V <= p(0).
How is this done? Simply split the vertical interval [0, 1] into n+1 subintervals, with the length of the k-th subinterval equal to p(k) = P(X = k), k in {0, 1, ..., n}.
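A sketch of Example 10 in R with assumed parameters n = 4, p = 0.3: the interval [0, 1] is split into n + 1 subintervals of lengths p(k), and u(V) returns the index of the subinterval containing V.

# Binomial variates via the step function u(V).
n <- 4; p <- 0.3
pk <- dbinom(0:n, n, p)              # p(k) = C(n,k) p^k (1-p)^(n-k)
Fk <- cumsum(pk)                     # right endpoints of the subintervals
u  <- function(V) findInterval(V, Fk)
set.seed(11)
x <- u(runif(1e5))
rbind(simulated = as.numeric(table(factor(x, levels = 0:n))) / 1e5,
      exact     = round(pk, 4))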
Recall from Section 2.5 that a homogeneous DTMC is described by its state transition matrix P = [p_{ij}], with \sum_{j=1}^{s} p_{ij} = 1 for each i = 1, 2, ..., s and p_{ij} >= 0.
Definition 17. A probability vector p* with p* P = p* is a stationary distribution of the chain.
Question: how to find a stationary distribution of a Markov chain?
Consider a homogeneous DTMC X_n described by the transition matrix P = [p_{ij}]. How do we generate sample paths of X_n? Two issues are involved here:
a) only steady-state results are of interest;
b) transient results are of interest as well.
In case a), we want to generate values for a single stationary random variable: p* is a one-dimensional pdf, and the algorithm after Theorem 15 suffices.
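One practical answer to the question above, sketched in R for an assumed 3-state chain: p* is the left eigenvector of P for eigenvalue 1, i.e. an eigenvector of t(P), normalized to sum to 1.

# Stationary distribution p* with p* P = p*, via eigen(t(P)).
P <- matrix(c(0.5, 0.3, 0.2,
              0.1, 0.6, 0.3,
              0.2, 0.2, 0.6), nrow = 3, byrow = TRUE)
e <- eigen(t(P))                            # left eigenvectors of P
v <- Re(e$vectors[, which.min(abs(e$values - 1))])
p_star <- v / sum(v)
p_star                                      # check: p_star %*% P = p_star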
Instances of synchronous and asynchronous simulation. We illustrate both strategies by describing how to sample from a Markov chain with state space S and transition matrix
P = (p_{ij}), with p_{ij} = P(X(n+1) = j | X(n) = i).
The obvious way to simulate the (n+1)-th transition, given X(n), is:
Generate X(n+1) ~ {p_{x(n)j} : j in S}.
This synchronous approach has the potential shortcoming that X(n) = X(n+1), with the corresponding computational effort lost.
Alternatively, we may simulate T_n, the time until the next change of state, and then sample the new state X(n + T_n). If X(n) = s, T_n follows a geometric distribution GeomDist(p_ss) of parameter p_ss, and X(n + T_n) will have a discrete distribution with mass function
{p_{sj} / (1 - p_ss) : j in S \ {s}}.
Should we wish to sample N transitions of the chain, assuming X(0) = i_0, we do:
  Do t = 0, X(0) = i_0
  While t < N
    Sample h ~ GeomDist(p_{x(t)x(t)})
    Sample X(t+h) ~ {p_{x(t)j} / (1 - p_{x(t)x(t)}) : j in S \ {x(t)}}
    Do t = t + h
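The pseudo-code above translates directly into R; the sketch below assumes a 3-state example chain and uses rgeom (which counts failures before the first success), so that the holding time in state s is rgeom(1, 1 - p_ss) + 1.

# Asynchronous sampling of a Markov chain: jump times and new states.
set.seed(3)
P <- matrix(c(0.5, 0.3, 0.2,
              0.1, 0.6, 0.3,
              0.2, 0.2, 0.6), nrow = 3, byrow = TRUE)
N <- 20; t <- 0; s <- 1                    # X(0) = i_0 = 1
path <- c(time = 0, state = s)
while (t < N) {
  h <- rgeom(1, 1 - P[s, s]) + 1           # T_n: time to the next change
  q <- P[s, ]; q[s] <- 0                   # mass function p_sj / (1 - p_ss)
  s <- sample(seq_len(nrow(P)), 1, prob = q / (1 - P[s, s]))
  t <- t + h
  path <- rbind(path, c(t, s))
}
path                                       # sampled transitions of the chain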
Two key strategies for asynchronous simulation.
One is that of event scheduling. The simulation time advances until the next event, and the corresponding activities are executed. If we have k types of events (1, 2, ..., k), we maintain a list of events, ordered according to their execution times (t_1, t_2, ..., t_k). A routine R_i associated with the i-th type of event is started at time
\tau_i = min(t_1, t_2, ..., t_k).
An alternative strategy is that of process interaction. A process represents an entity and the set of actions that it experiences throughout its life within the model. The system behaviour may be described as a set of processes that interact, for example, competing for limited resources. A list of processes is maintained, ordered according to the occurrence of the next event. Processes may be interrupted, their routines having multiple entry points, designated reactivation points.
Each execution of the program corresponds to a replication, which amounts to simulating the system behaviour for a long enough period of time, providing average performance measures, say X(n), after n customers have been processed. If the system is stable,
X(n) -> X as n -> \infty.
If, e.g., processing 1000 jobs is considered long enough, we associate with each replication j of the experiment the output X_j(1000). After several replications, we would analyse the results as described in the next section.
3.5 Analysis of output: making meaningful inferences

Ref. 3.4, [8] and [21], Section 5.

3.6 Simulation languages

Use the JMT system or OpenModelica.
3.7 Research 1: Simulation of Queueing systems with multiclass customers

Classical queueing models have been extensively studied since the 1960s, during the emergence of the Internet. One of the pioneers of the field is Leonard Kleinrock at UCLA. In fact, queueing models are applied not only in networks and systems of computers but also in any service system of an economy that involves resource allocation and/or sharing. In Europe, the project called Euro-NGI (European Network of Excellence Project on Design and Engineering of the Next-Generation Internet) was created just a few years ago.
We restrict ourselves to studying and simulating basic queueing systems such as M/M/1, M/M/1/K and M/G/1 systems. Now, how to improve the work in [8]?
Proof. If (1) is proved then, by Theorem 20, P = [p_{ij}] is ergodic. Hence, when P = [p_{ij}] is regular, the limit matrix L = \lim_{m \to \infty} P^m does exist. By the Spectral Decomposition (7.1),
P = E_1 + \lambda_2 E_2 + ... + \lambda_k E_k, where all |\lambda_i| < 1, i = 2, ..., k.
Then, by (7.2),
L = \lim_{m \to \infty} P^m = \lim_{m \to \infty} (E_1 + \lambda_2^m E_2 + ... + \lambda_k^m E_k) = E_1.
Let the vector p* satisfy p* P = p*, i.e. p* (P - 1 I) = 0 (p* is a left eigenvector of P for the eigenvalue \lambda_1 = 1); then
L = [p*; ...; p*], i.e. every row of L equals p*.
Corollary 22. A few important remarks: (a) for a regular MC, the long-term behavior does not depend on the initial state distribution probabilities p(0); (b) in general, the limiting distributions are influenced by the initial distributions p(0) whenever the stochastic matrix P = [p_{ij}] is ergodic but not regular. (See more at problem D.)
Example 13. Consider a Markov chain with two states and transition probability matrix
P = \begin{pmatrix} 3/4 & 1/4 \\ 1/2 & 1/2 \end{pmatrix}
(a) Find the stationary distribution p* of the chain.
(b) Evaluate P^n.
(c) Find \lim_{n \to \infty} P^n.
7.2 Appendix B: Spectral Theorem for Diagonalizable Matrices

Consider a square matrix P of order s with spectrum \sigma(P) = {\lambda_1, \lambda_2, ..., \lambda_k} consisting of its eigenvalues. A few basic facts:
- If (\lambda_1, x_1), (\lambda_2, x_2), ..., (\lambda_k, x_k) are eigenpairs for P, then S = {x_1, ..., x_k} is a linearly independent set. If B_i is a basis for the null space N(P - \lambda_i I), then B = B_1 \cup B_2 \cup ... \cup B_k is a linearly independent set.
- P is diagonalizable if and only if P possesses a complete set of eigenvectors (i.e. a set of s linearly independent eigenvectors). Moreover, H^{-1} P H = D = Diagmat(\lambda_1, \lambda_2, ..., \lambda_s) if and only if the columns of H constitute a complete set of eigenvectors and the \lambda_j's are the associated eigenvalues, i.e., each (\lambda_j, H[:, j]) is an eigenpair for P.
Spectral Theorem for Diagonalizable Matrices. A square matrix P of order s with spectrum \sigma(P) = {\lambda_1, \lambda_2, ..., \lambda_k} consisting of eigenvalues is diagonalizable if and only if there exist constituent matrices E_1, E_2, ..., E_k (called the spectral set) such that
P = \lambda_1 E_1 + \lambda_2 E_2 + ... + \lambda_k E_k,   (7.1)
where the E_i's have the following properties:
- E_i E_j = 0 whenever i != j, and E_i^2 = E_i for all i = 1..k;
- E_1 + E_2 + ... + E_k = I.
In practice we employ (7.1) in two ways:
Way 1: if we know the decomposition (7.1) explicitly, then we can compute powers
P^m = \lambda_1^m E_1 + \lambda_2^m E_2 + ... + \lambda_k^m E_k, for any integer m > 0.   (7.2)
Way 2: if we know P is diagonalizable, then we find the constituent matrices E_i by:
* finding the nonsingular matrix H = (x_1 | x_2 | ... | x_k), where each x_i is a basis (right) eigenvector of the null subspace
N(P - \lambda_i I) = {v : (P - \lambda_i I) v = 0} = {v : P v = \lambda_i v};
** then P = H D H^{-1} = (x_1 | x_2 | ... | x_k) D H^{-1}, where D = diag(\lambda_1, ..., \lambda_k) is the diagonal matrix and
H^{-1} = K^t = \begin{pmatrix} y_1^t \\ y_2^t \\ \vdots \\ y_k^t \end{pmatrix}  (i.e. K = (y_1 | y_2 | ... | y_k)).
Here each y_i is a basis (left) eigenvector, satisfying y_i^t P = \lambda_i y_i^t.
The constituent matrices are E_i = x_i y_i^t.
Example 14. Diagonalize the following matrix and provide its spectral decomposition.
P = \begin{pmatrix} 1 & -4 & -4 \\ 8 & -11 & -8 \\ -8 & 8 & 5 \end{pmatrix}.
The characteristic equation is p(\lambda) = det(P - \lambda I) = 0, i.e. \lambda^3 + 5\lambda^2 + 3\lambda - 9 = 0.
So \lambda = 1 is a simple eigenvalue, and \lambda = -3 is repeated twice (its algebraic multiplicity is 2). Any set of vectors x satisfying x in N(P - \lambda I), i.e. (P - \lambda I) x = 0, can be taken as a basis of the eigenspace (null space) N(P - \lambda I). Bases for the eigenspaces are:
N(P - 1 I) = span{ [1, 2, -2]^t };  N(P + 3 I) = span{ [1, 1, 0]^t, [1, 0, 1]^t }.
It is easy to check that these three eigenvectors x_i form a linearly independent set, so P is diagonalizable. The nonsingular matrix (also called the similarity transformation matrix)
H = (x_1 | x_2 | x_3) = \begin{pmatrix} 1 & 1 & 1 \\ 2 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix}
will diagonalize P, and since P = H D H^{-1} we have
H^{-1} P H = D = Diagmat(\lambda_1, \lambda_2, \lambda_2) = Diagmat(1, -3, -3) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & -3 \end{pmatrix}.
Here,
H^{-1} = \begin{pmatrix} 1 & -1 & -1 \\ -2 & 3 & 2 \\ 2 & -2 & -1 \end{pmatrix},
which implies y_1^t = [1, -1, -1], y_2^t = [-2, 3, 2], y_3^t = [2, -2, -1]. Therefore, the constituent matrices are
E_1 = x_1 y_1^t = \begin{pmatrix} 1 & -1 & -1 \\ 2 & -2 & -2 \\ -2 & 2 & 2 \end{pmatrix};  E_2 = x_2 y_2^t = \begin{pmatrix} -2 & 3 & 2 \\ -2 & 3 & 2 \\ 0 & 0 & 0 \end{pmatrix};  E_3 = x_3 y_3^t = \begin{pmatrix} 2 & -2 & -1 \\ 0 & 0 & 0 \\ 2 & -2 & -1 \end{pmatrix}.
Obviously,
P = \lambda_1 E_1 + \lambda_2 E_2 + \lambda_3 E_3 = E_1 - 3 E_2 - 3 E_3 = \begin{pmatrix} 1 & -4 & -4 \\ 8 & -11 & -8 \\ -8 & 8 & 5 \end{pmatrix}.
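The decomposition of Example 14 (with the signs as reconstructed above) can be verified numerically in R:

# Verify P = sum(lambda_i E_i) and (7.2) for m = 5, Example 14.
P <- matrix(c( 1,  -4,  -4,
               8, -11,  -8,
              -8,   8,   5), nrow = 3, byrow = TRUE)
H <- matrix(c(1, 1, 1,
              2, 1, 0,
             -2, 0, 1), nrow = 3, byrow = TRUE)     # columns x_1, x_2, x_3
K <- solve(H)                                       # rows are the y_i^t
E <- lapply(1:3, function(i) H[, i] %*% t(K[i, ]))  # E_i = x_i y_i^t
lam <- c(1, -3, -3)
all.equal(Reduce(`+`, Map(`*`, lam, E)), P)         # TRUE: identity (7.1)
P5 <- Reduce(`%*%`, replicate(5, P, simplify = FALSE))
all.equal(Reduce(`+`, Map(function(l, Ei) l^5 * Ei, lam, E)), P5)  # (7.2)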
Bibliography

[1] Arjeh M. Cohen, Computer Algebra in Industry: Problem Solving in Practice, Wiley, 1993.
[2] Nguyen, V. M. Man and the DAG group at Eindhoven University of Technology, www.mathdox.org/nguyen, 2005.
[3] Nguyen, V. M. Man, Computer-Algebraic Methods for the Construction of Designs of Experiments, Ph.D. thesis, Technische Universiteit Eindhoven, 2005, www.mathdox.org/nguyen.
[4] Nguyen, Van Minh Man, Department of Computer Science, Faculty of CSE, HCMUT, Vietnam, www.cse.hcmut.edu.vn/~mnguyen.
[5] Brouwer, E. Andries, Cohen, M. Arjeh and Nguyen, V. M. Man, Orthogonal arrays of strength 3 and small run sizes, Journal of Statistical Planning and Inference, 136 (2007), www.cse.hcmut.edu.vn/~mnguyen/OrthogonalArray-strength3.pdf.
[6] Nguyen, V. M. Man, Constructions of strength 3 mixed orthogonal arrays, Journal of Statistical Planning and Inference, 138 (Jan 2008), www.cse.hcmut.edu.vn/~mnguyen/Specific-Constructions-OAs.pdf.
[7] Eric D. Schoen and Nguyen, V. M. Man, Enumeration and Classification of Orthogonal Arrays, Faculty of Applied Economics, University of Antwerp, Belgium, 2007.
[8] Huynh, V. Linh and Nguyen, V. M. Man, Discrete Event Modeling in Optimization for Project Management, B.E. thesis, HCMUT, 69 pages, 2008.
[9] T. Beth, D. Jungnickel and H. Lenz, Design Theory, vol. II, pp. 880, Encyclopedia of Mathematics and Its Applications, Cambridge University Press, 1999.
[10] Glonek, G.F.V. and Solomon, P.J., Factorial and time course designs for cDNA microarray experiments, Biostatistics 5, 89-111, 2004.
[11] N. J. A. Sloane, A Library of Orthogonal Arrays, http://www.research.att.com/~njas/oadir/index.html.
[12] Warren Kuhfeld, http://support.sas.com/techsup/technote/ts723.html.
[13] Hedayat, A. S., Sloane, N. J. A. and Stufken, J., Orthogonal Arrays, Springer-Verlag, 1999.
[14] Madhav, S. P., iSixSigma LLC, Design Of Experiment For Software Testing, isixsigma.com/library/content/c030106a.asp, 2004.
[15] Bernd Sturmfels, Solving Polynomial Systems, AMS, 2002.
[16] OpenModelica project, Sweden, 2006, www.ida.liu.se/~pelab/modelica/OpenModelica.html.
[17] Singular: Computer Algebra System for polynomial computations, Germany, 2006, www.singular.uni-kl.de.
[18] Sudhir Gupta, Balanced Factorial Designs for cDNA Microarray Experiments, Communications in Statistics: Theory and Methods, Volume 35, Number 8 (2006), pp. 1469-1476.
[19] Morris W. Hirsch and Stephen Smale, Differential Equations, Dynamical Systems and Linear Algebra, 1980.
[20] James R. Thompson, Simulation: A Modeler's Approach, Wiley, 2000.
[21] David Insua and Jesus Palomo, Simulation in Industrial Statistics, SAMSI, 2005.
[22] Ruth J. Williams, Introduction to the Mathematics of Finance, AMS, vol. 72, 2006.
[23] C. S. Tapiero, Risk and Financial Management: Mathematical Methods, Wiley, 2004.
[24] A. K. Basu, Introduction to Stochastic Processes, Alpha Science, 2005.
[25] L. Kleinrock, Queueing Systems, vol. 2, John Wiley & Sons, 1976.
[26] L. Kleinrock, Time-shared systems: a theoretical treatment, Journal of the ACM 14 (2), 1967, 242-261.
[27] S. G. Gilmour, Fundamentals of Statistics I, Lecture Notes, School of Mathematical Sciences, Queen Mary, University of London, 2006.
[28] M. Parlar, Interactive Operations Research with Maple, Birkhauser, 2000.
[29] Tim Holliday, Pistone, Riccomagno and Wynn, The Application of Computational Algebraic Geometry to the Analysis of Design of Experiments: A Case Study.
Copyright 2010 by
Lecturer: Nguyen V. M. Man, Ph.D. in Statistics
Working area: Algebraic Statistics, Experimental Designs, Statistical Optimization and Operations Research
Institution: University of Technology of HCMC
Address: 268 Ly Thuong Kiet, Dist. 10, HCMC, Vietnam
Ehome: www.cse.hcmut.edu.vn/~mnguyen
Email: mnguyen@cse.hcmut.edu.vn
mannvm@uef.edu.vn