
# WEEK 1 STOCHASTIC PROCESS

## Stochastic model vs Deterministic model

o A stochastic model recognizes the random nature of its inputs, while
a deterministic model does not contain any random
components, so both its inputs and outputs are fixed

## Modeling process

o Identify the objective of the model
o Construct the model (determine assumptions, inputs, data and outputs)
o Fit the model (collect data, analyse data, estimate model parameters)
o Test the model (validate the model)
o Implement the model
o Examine reasonability and sensitivity of output
o Report the results and make recommendations

## Stochastic Process; collection of random variables {X(t), t ∈ T}

o T is the index set of the process and represents the time parameter
o X(t) is the state of the process at time t
o The set of values that X(t) can take is called the state space
o Can be classified as discrete time process or continuous time
process
Discrete time; index set is finite/countable (i.e. 1, 2, 3)
Continuous Time; index set is continuous
o Can be classified as discrete state space or continuous state
space
Discrete state; X(t) is a discrete random variable
Continuous State Space; X(t) is a continuous RV
o Sample path or realization is a particular assignment of
possible values
o Increments of the process can be independent and/or
stationary
Independent when future increments are independent of
the past and the present; that is, X(t+i) - X(i) is
independent of X(s) for s < i
Stationary when increments over a length of time have
the same distribution regardless of the starting point; that is,
X(t2+i) - X(t1+i) has the same distribution as X(t2) - X(t1)

Markov Process
o Markov property; Pr( X(tn) ≤ xn | X(t1) = x1, …, X(tn-1) = xn-1 ) =
Pr( X(tn) ≤ xn | X(tn-1) = xn-1 )
o Implication; only the current state influences the future
o A Markov chain is just a Markov process with a discrete index set
and a discrete state space
o Hence for a Markov chain; Pr( Xt+1 = xt+1 | Xt = xt, …, X0 = x0 ) =
Pr( Xt+1 = xt+1 | Xt = xt )
o Conditional probabilities; Pij(t,t+1) = Pr( Xt+1 = j | Xt = i ), also
known as one-step transition probabilities

o Relevance of Markov chains; given the information at time t, we can
predict the next state of the process with probability p

## Homogenous Markov Chain

If the one-step transition probabilities do not depend on time, then
the Markov chain is homogenous and the transition probabilities are
stationary

## N-step transition probability; Pij(n) = Pr( Xt+n = j | Xt = i )

o Chapman-Kolmogorov equations allow the computation of n-step
transition probabilities using matrix multiplication
o P(n+m) = P(n) x P(m)
o Forward Equation; P(n+1) = P(n) . P
o Backward Equation; P(n+1) = P . P(n)
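The Chapman-Kolmogorov relation can be sketched directly: the n-step matrix is just the one-step matrix raised to the nth power. The two-state transition matrix below is a made-up example for illustration:

```python
import numpy as np

# Hypothetical one-step transition matrix for a 2-state chain (rows sum to 1)
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

# Chapman-Kolmogorov: the n-step matrix is the nth matrix power, P(n) = P^n
P3 = np.linalg.matrix_power(P, 3)

# P3[i, j] = Pr(X_{t+3} = j | X_t = i); each row is still a distribution
print(P3)
```

Multiplying out `P @ P @ P` by hand gives the same matrix, which is exactly the forward/backward recursion applied twice.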

Classification of States
o Absorbing state if Pii = 1
o State j is accessible from state i if Pij(n) > 0 for some n >= 0
o States i and j communicate if each is accessible from the other
(i.e. you can move to the other state and then return)
o If two states communicate, then they are in the same class

## Irreducible Markov chain

o It is irreducible if there is only one class and all states
communicate with each other
o Properties; in the long run, it will always return to current state
and long run probability of being in a state is positive for all
states

## Recurrent vs Transient States

o A state is recurrent if the probability of returning to the state
you started in is 1, if less than 1 then it is transient
o If state is recurrent, the process returns to the state infinitely
often

o If a state is transient, the number of visits is a RV with the
expected number of visits given by 1/(1-fi), where fi is the
probability of returning to the state
o In a finite state MC, at least one state must be recurrent,
otherwise after a finite amount of time no states would be
visited
o Recurrence and transience are class properties (if one state is
recurrent, so are all states in its class)
o All states in an irreducible finite MC are recurrent

Period of States
o d(i) is the greatest common divisor of all n >= 1 such that Pii(n) > 0
o If Pii(n) = 0 for all n, then d(i) = 0, meaning it is impossible to
return to the state
o A state with period 1 is aperiodic (i.e. in the long run the chain can
return to the state at any step)
o Period is a class property, therefore the period of all states in a
class is the same

## Positive Recurrent If a state is recurrent and the expected time

until the process returns to state i is finite, then the state is positive
recurrent. For a finite MC, all recurrent states are positive recurrent.
Positive recurrence is also a class property

# WEEK 2 DISCRETE TIME MC

Limiting probabilities
o If state j is transient (i.e. fj < 1) then lim t→∞ Pij(t) = 0
o For an irreducible and ergodic MC, lim t→∞ Pij(t) = πj and the
limiting probability is independent of the current state i
o The limiting probability is also the probability of the process
being in that state in the long run
o Also equals the long run proportion of time the MC is in that
state
o In matrix format; 0 = (P - I)^T * π
o To solve the system, must also use the condition that the limiting
probabilities sum to 1
o πj = 1/mjj, where mjj is the expected number of transitions until
the MC returns to state j
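A minimal sketch of solving 0 = (P - I)^T * π together with the normalisation condition, for a made-up irreducible two-state chain:

```python
import numpy as np

# Hypothetical transition matrix for an irreducible, ergodic 2-state chain
P = np.array([[0.5, 0.5],
              [0.2, 0.8]])

n = P.shape[0]
# Stack the balance equations (P - I)^T * pi = 0 on top of the
# normalisation row sum(pi) = 1, then solve by least squares
A = np.vstack([(P - np.eye(n)).T, np.ones(n)])
b = np.append(np.zeros(n), 1.0)
pi = np.linalg.lstsq(A, b, rcond=None)[0]

print(pi)  # limiting/stationary distribution
```

For this matrix the balance equation 0.5*pi0 = 0.2*pi1 plus normalisation gives pi = (2/7, 5/7).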

## Mean time spent in transient states

Time Reversible MC
o Used for Monte Carlo simulations and finding limiting
probabilities
o A MC is time reversible if Qij = Pij for all i and j
o Qij = Pr( Xt = j | Xt+1 = i )
o For a time reversible MC, the rate at which the process goes from
state i to state j is the same as the rate at which it goes from j to i
(i.e. πi * Pij = πj * Pji )

Branching Process
o If each individual produces j offspring with probability pj,
then the population size of the nth generation is a branching
process Xn, with P00 = 1 and hence state 0 is absorbing, while
all other states are transient if p0 > 0
o Expected population size (μ = mean number of offspring)
E( Xn ) = μ^n * X0
o Variance of population size (σ² = variance of the offspring
distribution)
Var( Xn ) = σ² * μ^(n-1) + μ² * Var( Xn-1 )

## Probability of extinction; the smallest non-negative root of G(s) = s,

where G is the offspring probability generating function (given that
GX1 = GY, as the first generation only has one person)
o Important properties of PGF;

o If μ > 1, then extinction is not certain
o If μ <= 1, then extinction is certain (assuming p1 < 1)
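The branching process above can be simulated directly; the offspring distribution and seed below are made up for illustration, with mean offspring number μ = 0.5 < 1 (so extinction is certain):

```python
import random

def simulate_branching(offspring_pmf, generations, x0=1, seed=42):
    """Simulate generation sizes X_0, X_1, ... of a branching process.

    offspring_pmf: list of (j, p_j) pairs; each individual independently
    produces j offspring with probability p_j.
    """
    rng = random.Random(seed)
    values = [j for j, _ in offspring_pmf]
    weights = [p for _, p in offspring_pmf]
    x, sizes = x0, [x0]
    for _ in range(generations):
        # Each of the x current individuals reproduces independently
        x = sum(rng.choices(values, weights=weights)[0] for _ in range(x))
        sizes.append(x)
        if x == 0:       # P00 = 1: state 0 is absorbing
            break
    return sizes

# Subcritical example: mu = 0*0.6 + 1*0.3 + 2*0.1 = 0.5 < 1
path = simulate_branching([(0, 0.6), (1, 0.3), (2, 0.1)], generations=100)
```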

## Poisson Process counting process of the number of events up to a

particular time (i.e. a simple discrete-state continuous-time Markov
process)
o The time between events for a Poisson process follows an
exponential distribution

Important properties of the exponential distribution
o Distribution of waiting times for Poisson processes follows
Gamma (n, λ) where n is the number of events and λ is the
rate at which events occur
o Minimum; min (X1, X2, …) follows an exponential distribution with
rate = the sum of all the λi
o The probability that Xi is the smallest is given by λi divided by the
sum of all the λi

## Counting Process Definition

o N(t) >= 0
o N(t) is integer-valued
o N(s) <= N(t) for s < t (i.e. non-decreasing)

## Potential Properties of counting process

o Independent increments; if N(s) and N(t) - N(s) are
independent
o Stationary increments; N(t+s) - N(q+s) has the same distribution
as N(t) - N(q), meaning the number of events in a time period
depends only on the length of the time period

Poisson Process
o Special counting process where N(0) = 0, with independent
increments + stationary increments, and the expected number of
events in any interval of length t is λ*t
o Alternative definition; N(0) = 0, independent & stationary
increments, and the following conditions are satisfied;
Pr( N(h) = 1 ) = λh + o(h)
Pr( N(h) >= 2 ) = o(h)
o The last conditions mean that the rate at which one claim arrives is
λ, but the rate at which 2 or more claims arrive simultaneously is
zero (o(h) denotes a function whose limit divided by h is zero;
f(h)/h → 0 as h → 0)

Inter-arrival Time;
o Tn is the time between the (n-1)th and nth event. It follows an
exponential RV with rate λ.
o In a Poisson process, the inter-arrival times are i.i.d. exponential
and, since the exponential is memoryless, the time until the next
claim is independent of the time elapsed since the last claim
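The i.i.d. exponential inter-arrival times give a direct way to simulate a Poisson process path; a minimal sketch (the rate and horizon below are arbitrary example values):

```python
import random

def poisson_arrival_times(rate, horizon, seed=1):
    """Arrival times S_n = T_1 + ... + T_n on (0, horizon], where the
    inter-arrival times T_n are i.i.d. Exponential(rate)."""
    rng = random.Random(seed)
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)   # T_n ~ Exp(rate)
        if t > horizon:
            return times
        times.append(t)

arrivals = poisson_arrival_times(rate=2.0, horizon=10.0)
# len(arrivals) is one realisation of N(10) ~ Poisson(2 * 10)
```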

Waiting Time;
o Waiting time = arrival time of the nth event (i.e. Sn = T1 + T2 +
… + Tn )
o Follows a Gamma distribution (n, λ)
o Convolutions of exponential variables; when the rates for T1, T2, …
are different
Sn is said to have a hypoexponential distribution
When n = 2, the probability density function is
f(t) = ( λ1*λ2/(λ2 - λ1) ) * ( exp(-λ1*t) - exp(-λ2*t) )
General case; f(t) = Σi Ci,n * λi * exp(-λi*t), where
Ci,n = product over j ≠ i of λj/(λj - λi)
## Ordered Statistics; the joint pdf of the ordered statistics is; n! * the

product of the individual pdfs
Arrival times distribution; given N(t) = n, the arrival times are
distributed as the ordered statistics of n UNIF(0,t) variables;
f(s1,s2,…,sn | N(t) = n) = n!/t^n
Sj = cumulative time for j events to occur
Number of claims distribution;

## Sum of Independent Poisson Processes; if N(t) = N1(t) + N2(t) then N(t)

is also a Poisson process with rate λ1 + λ2

## Thinning of Poisson Process; Suppose a Poisson process with rate λ

has 2 types of events; Type I w.p. p and Type II w.p. 1-p
o Using N(t) = N1(t) + N2(t); N1(t) has rate λp for Type I events
and N2(t) has rate λ(1-p) for Type II events
o Both processes are independent
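The thinning construction can be sketched as follows (the arrival times below are a made-up example):

```python
import random

def thin(arrival_times, p, seed=7):
    """Classify each event as Type I w.p. p, Type II w.p. 1-p.
    For a Poisson(lambda) input, the two outputs are independent
    Poisson processes with rates lambda*p and lambda*(1-p)."""
    rng = random.Random(seed)
    type1, type2 = [], []
    for t in arrival_times:
        (type1 if rng.random() < p else type2).append(t)
    return type1, type2

type1, type2 = thin([0.3, 0.9, 1.4, 2.2, 3.8], p=0.5)
```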

## Non-homogenous Poisson process with intensity function λ(t) has

the following properties;
o N(0) = 0
o Independent increments
o Unit jumps;
Pr( N(t+h) - N(t) = 1 ) = λ(t)h + o(h)
Pr( N(t+h) - N(t) >= 2 ) = o(h)
o λ(t) can be deterministic or a process itself
o Mean value function; m(t) = ∫0^t λ(s) ds, so the number of events
in (t, t+s] follows a Poisson distribution with mean m(t+s) - m(t)
o A non-homogenous process can be converted into a homogenous
process with λ = 1 through the time change given by m(t)

## Compound Poisson Process; Poisson process with random claim

sizes

o Modelling aggregate claim size; X(t) = sum of all individual
claims Y given by the process N(t)
o Expected value of X(t) = λt*E(Y)
o Var( X(t) ) = λt*E(Y²)


## Markov Jump Process (MJP) continuous-time Markov process with

discrete state space
o Transition probability; Pr( X(t+s) = j | X(t) = i ) = Pij (t, t+s)
o Homogenous & stationary if; Pr( X(t+s) = j | X(t) = i ) = Pr( X(s)
= j | X(0) = i ) = Pij (s)
o Hence the transition probabilities do not depend on t if the process
is homogenous & stationary

First holding time; the time spent in a state before the first jump is
memoryless
o Follows an exponential distribution with parameter vi, where vi is
the rate at which the process makes a transition when in state i
o E(Ti) = 1/vi
o Pr( Ti > s ) = exp(-vi*s); the occupancy probability
o PDF of the holding time; vi*exp(-vi*t)
o Occupation time; total time the process spends in state j during
(0,t), given that it starts in state i
o Calculating occupation time; integrate Pij(s) over s from 0 to t

## Instantaneous transition rate; qij

o Theorem; vi = Σ qij , summing over j ≠ i
o Rate out of state i = sum of the rates of entering each other state j

## Embedded MC considering the sequence of states visited, ignoring

the time spent in each state (a discrete time MC)
o Transition probability; Pij = qij / vi when i ≠ j
o Pii = 0
o qii = -vi (so each row of the generator matrix Q sums to zero)

## Kolmogorov's Backward Equation

Note that all Kolmogorov equations must satisfy the initial condition;
P(0) = Identity Matrix

## Kolmogorov's Equations in Matrix format

o P(t) = [Pij(t)]
o Q = [qij]
o Backward equation; dP(t)/dt = Q . P(t)
o Forward equation; dP(t)/dt = P(t) . Q
o With initial condition; P(0) = I

Limiting probabilities (Pj) for a MJP are independent of the initial
state and exist if the following conditions are met;
o All states communicate
o The MJP is positive recurrent (hence the mean time to return to a
state is finite)
o Pj is also the long-run proportion of time that the process is in
state j

## For the embedded Markov chain, let the limiting probability be πi

Then the proportion of time the continuous MJP spends in state i is;
Pi = (πi/vi) / Σj (πj/vj)
Where πi is the proportion of transitions that go into state i, and
1/vi is the mean time spent in state i during a visit (from the original
MJP)

## Calculating Transition probabilities; Homogenous MJP

Method 1
o Using the equation; P(t) = P(0) * exp(Qt)
o exp(Qt) = lim n→∞ (I + Q*t/n)^n
o Choose n = 2^k
o Let M = I + Q*t/n and square it k times
Method 2
o exp(-Qt) = lim n→∞ (I - Q*t/n)^n
o exp(Qt) = ( (I - Q*t/n)^(-1) )^n
o Compute the inverse component and then raise it to the power
n = 2^k
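Method 1 can be sketched as repeated squaring; the generator Q below is a made-up two-state example with v0 = 0.5 and v1 = 0.3:

```python
import numpy as np

def transition_matrix(Q, t, k=20):
    """Approximate P(t) = exp(Qt) by (I + Q*t/n)^n with n = 2^k:
    form M = I + Q*t/n once, then square it k times (Method 1)."""
    n = 2 ** k
    M = np.eye(Q.shape[0]) + Q * t / n
    for _ in range(k):
        M = M @ M
    return M

# Hypothetical generator: off-diagonal entries are the rates q_ij,
# diagonal entries are -v_i, so each row sums to zero
Q = np.array([[-0.5, 0.5],
              [0.3, -0.3]])
P = transition_matrix(Q, t=2.0)
# Each row of P(t) should be a probability distribution
```

For a two-state chain the result can be checked against the closed form P01(t) = (q01/(q01+q10)) * (1 - exp(-(q01+q10)t)).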
Calculating Transition probabilities; Non-homogenous MJP
o See examples

## WEEK 5 APPLICATIONS

## Birth & Death Process Special case of continuous time MJP

o Continuous time MC; increases at rate λn & decreases at rate
μn
o Embedded MC; increases w.p. λn/( λn + μn ) and decreases w.p.
μn/( λn + μn )
o Combined MC; v0 = λ0, vi = λi + μi
o Pij = qij / vi
o For the embedded MC; P01 = 1
Pi,i+1 = λi/( λi + μi )
Pi,i-1 = 1 - Pi,i+1

## For homogenous case; Expected time starting in state i and entering

state i + 1, E[Ti]

## For i = 0, E[T0] = 1/λ0

Using the recursive relationship E[Ti] = 1/λi + (μi/λi)*E[Ti-1], we
obtain the value of E[Ti] for each i

## For homogenous case; Transition probabilities

Kolmogorov's Equations

## Poisson Process Application

o Pure birth process with rate λ

## Sickness Model (where state 0 = healthy, state 1 = sick)

Non-homogenous

General Case;
o Transition rates for forward equation;

## Two methods; Inverse transform method & acceptance-rejection

method
Both need the probability/cumulative distribution function and use the
fact that the cumulative probabilities F(X) follow UNIF (0,1)

## Procedure to generate Pseudo-Random Numbers First step

o Starts with seed X0 and specify positive integers a, c and m
o Use formula Xn+1 = (aXn + c) modulo m
o Modulo function means Xn+1 is the remainder after dividing aXn
+ c by m
o Divide Xn+1 by m to approximate UNIF (0,1)
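The congruential recursion above can be sketched directly; the constants a, c and m below are illustrative textbook-style choices, not a recommendation for production use:

```python
def lcg_uniforms(seed, a, c, m, n):
    """Linear congruential generator: X_{k+1} = (a*X_k + c) mod m.
    Dividing each X_k by m gives approximate UNIF(0,1) draws."""
    x, out = seed, []
    for _ in range(n):
        x = (a * x + c) % m   # remainder after dividing a*x + c by m
        out.append(x / m)
    return out

# Illustrative parameters (hypothetical example values)
u = lcg_uniforms(seed=12345, a=1103515245, c=12345, m=2**31, n=5)
```

Note that the same seed always reproduces the same stream, which is why simulations seeded this way are repeatable.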

## Inverse Transform Method Continuous RV

o Compute Fx^(-1)
o Generate a pseudo-random number U from UNIF(0,1)
o Set X = Fx^(-1)(U), where U is the number generated
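A sketch for the exponential distribution, whose inverse CDF has a closed form: F(x) = 1 - exp(-λx), so F^(-1)(u) = -ln(1-u)/λ. The rate used below is arbitrary:

```python
import math
import random

def exp_inverse_transform(lam, u):
    """Inverse transform for Exp(lam): X = F^{-1}(U) = -ln(1 - U)/lam."""
    return -math.log(1.0 - u) / lam

rng = random.Random(0)
sample = [exp_inverse_transform(2.0, rng.random()) for _ in range(5000)]
# the sample mean should be close to 1/lam = 0.5
```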

## Inverse Transform Method Discrete RV

o Generate U using pseudo-random number

## Inverse Transform Method Geometric Distribution

o Generate U
o Round up log(U)/log(1-p)
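A sketch of the geometric case; rounding up log(U)/log(1-p) works because U and 1-U are both UNIF(0,1):

```python
import math

def geometric_inverse_transform(p, u):
    """Geometric(p) on {1, 2, ...} via inverse transform:
    X = ceil(log(u) / log(1 - p))."""
    return max(1, math.ceil(math.log(u) / math.log(1.0 - p)))

# Check of the logic: X > n iff u < (1-p)^n, so Pr(X > n) = (1-p)^n
x = geometric_inverse_transform(0.3, 0.5)
```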

## Acceptance-Rejection Method Trying to generate r.v. X with density

function f
o Choose a distribution g(.) which can be simulated
o Set a constant, c, such that f(y)/g(y) <= c for all y
o Simulate Y with density g(.)
o Simulate U from UNIF (0,1)
o If U <= f(Y)/(c*g(Y)) then set X = Y, otherwise reject and return
to step 3
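A sketch of the procedure for the half-normal density f(y) = sqrt(2/π)·exp(-y²/2) (the density of |Z|), taking g to be Exp(1); then c = sqrt(2e/π) and the acceptance ratio f(Y)/(c·g(Y)) simplifies to exp(-(Y-1)²/2):

```python
import math
import random

def sample_half_normal(rng):
    """Acceptance-rejection for the density of |Z| using g = Exp(1)
    and c = sqrt(2e/pi); here f(Y)/(c*g(Y)) = exp(-(Y-1)^2 / 2)."""
    while True:
        y = rng.expovariate(1.0)   # simulate Y with density g
        u = rng.random()           # simulate U ~ UNIF(0,1)
        if u <= math.exp(-(y - 1.0) ** 2 / 2.0):
            return y               # accept: set X = Y
        # otherwise reject and return to the simulation step

rng = random.Random(3)
xs = [sample_half_normal(rng) for _ in range(5000)]
# the sample mean should be near E|Z| = sqrt(2/pi) ≈ 0.8
```

Combined with a random sign (the first bullet below), this yields standard normal draws.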

Common Situations
o Simulating standard normal; simulate X = |Z| and U ~ UNIF(0,1);
if U < 0.5, set Z = -X, if U > 0.5, set Z = X
o Generating a normal RV with mean μ and variance σ²; simulate a
standard normal Z and set X = μ + σZ
o Simulating a log-normal RV Y; simulate a normal RV X then set Y
= exp(X)
o Gamma distribution (n, λ); generate n exponential variables
and then add them
Generate n random numbers Ui from Uniform (0,1)
Set X = (-1/λ) * Σ log(Ui)
o Simulating a Poisson RV using the inverse transform method
o Simulating Chi-Squared; simulate n independent standard
normals (Zi) and then use; X = Σ Zi²
o Alternative method; for even degrees of freedom (n = 2k)
Generate k random numbers Ui and use the following
formula; X = -2 * Σ log(Ui)
For odd degrees of freedom; simulate the distribution with
degree n-1 and then add Z², where Z is the simulated
value from a standard normal distribution
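The gamma and even-degree chi-squared recipes above can be sketched as follows (the parameters are arbitrary example values):

```python
import math
import random

def gamma_from_uniforms(n, lam, rng):
    """Gamma(n, lam) as a sum of n Exp(lam) draws:
    X = (-1/lam) * sum(log(U_i))."""
    return -sum(math.log(rng.random()) for _ in range(n)) / lam

def chi_squared_even(dof, rng):
    """Chi-squared with even degrees of freedom dof = 2k:
    X = -2 * sum of k logged uniforms (a sum of k Exp(1/2) draws)."""
    assert dof % 2 == 0
    return -2.0 * sum(math.log(rng.random()) for _ in range(dof // 2))

rng = random.Random(5)
g = [gamma_from_uniforms(3, 2.0, rng) for _ in range(5000)]
# the sample mean should be near n/lam = 1.5
```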

## Monte Carlo Simulation Procedure

o Generate a random vector X(1) with joint density f and compute
Y(1) = g( X(1) )
o Repeat the process until r random vectors have been generated
o Find the average of the Y(i), Ȳ, to estimate E( g(X) )
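A minimal sketch of the procedure, using a toy function where the true value E[U²] = 1/3 is known:

```python
import random

def monte_carlo_estimate(g, sample_x, r, seed=0):
    """Crude Monte Carlo: average g(X^(i)) over r simulated inputs."""
    rng = random.Random(seed)
    return sum(g(sample_x(rng)) for _ in range(r)) / r

# Toy example: estimate E[g(U)] for g(u) = u^2 and U ~ UNIF(0,1);
# the exact answer is 1/3
est = monte_carlo_estimate(lambda u: u * u, lambda rng: rng.random(),
                           r=20000)
```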

## Mean and Variance of the Estimate

o Expected value of the estimate = expected value of the function g
o Variance = Var( Y(i) )/r, where Var( Y(i) ) can be estimated using
the sample variance; S² = Σ ( Y(i) - Ȳ )² / (r-1)
o Use the antithetic variables technique to reduce the variance

o Procedure for antithetic variates
Generate a set of n variates and input into function g to
determine Y(1)
Generate another set of n variates negatively correlated with
the first set (e.g. 1-U for each U) and input into function g
to determine Y(k+1)
Repeat these steps k times

o Control variates evaluating the expected value of a function, g(X),
using the known expected value (μ) of another function, f(X),
via the following formula;
E( g(X) ) = a*μ + E( g(X) - a*f(X) )

## Importance Sampling Maybe questions will help

Number of Simulations
o Selecting r so that the estimate is within a desired accuracy of

o To estimate u;

## Time series model;

o {xt} is the observed data, {Xt} denotes the time series, where
each Xt is a r.v.

## Classical decomposition method

o Plot the series and examine the main features
o Determine whether there is a trend or seasonal component
o Determine if a data transformation is necessary (e.g. a Box-Cox
transformation); usually when the variance of the data changes over
time, it needs to be transformed in order to analyse it

## Classical Decomposition Model

o Xt = Tt + St + Nt
o Tt = deterministic trend
o St = seasonal component; St = St-d & the sum of St over a period d = 0
o Nt is the random component with E[Nt] = 0

General Procedure
o Remove or model trend and seasonal effects
o Analyse Nt by choosing a probability model, estimating unknown
parameters, checking the model for goodness of fit and using the
fitted model to enhance our understanding

## Removing & Modeling

o Moving average filter; estimates the trend and eliminates
seasonal effects
o Differencing; eliminates trend and seasonal effects

## Moving Average Filter

o To estimate the trend, use a linear filter to eliminate seasonal
effects and residuals
o Linear filter;

o Estimating the seasonal component;
Run the moving average filter over a complete cycle (d)
Use Xt - T̂t to estimate the seasonal effect

Differencing method
o Differencing again would remove the trend component in the
new Xt
o Use E[Nt] = 0 to compute the expected value of the difference
equation and equate the coefficients of the trend

Seasonal Differencing

## Stationarity of time series

o Weakly stationary if E[Xt] = μ for all t and Cov( Xt+h, Xt ) = γ(h)
for all t and h
o Means that the mean is constant and the covariance,
Cov(Xs,Xt), depends only on the lag t-s
o Functions
## {Xt} is integrated of order 0, I(0), if the process itself is stationary

Integrated of order 1 if the increments (Xt - Xt-1) form a stationary
process
Integrated of order n if the nth difference series is stationary (i.e.
∇^n Xt)
Note that the polynomial degree of the trend (n) implies that for any
k >= n, the process's kth difference series is stationary
{Zt} is i.i.d. noise or strong white noise if the Zt are i.i.d. with
mean zero (i.e. E[Zt] = 0 and Cov(Zt, Zs) = 0 for s ≠ t)
That means the autocovariance function = 0 for all lags h other than 0
Zt is white noise, but not i.i.d., if Zt ~ N(0,σ²) when t is odd and
Zt = Yt - λ when t is even, where Yt ~ Exp(1/λ)

## Modelling of Residual/Noise depends on the dependence structure of

the de-trended & de-seasonalised time series
The dependence structure can be modelled using an ARIMA model
Use the autocorrelation function to detect the appropriate model
Time series models can only be calibrated efficiently for stationary
random processes
A non-stationary process can be transformed into a stationary one
through a moving average filter or differencing
Two types of processes;
o Auto Regressive Process
o Moving Average Process

## Moving Average Process

MA(1); {Xt} is a first order moving average process if there is a
constant θ and a process {Zt} ~ WN(0,σ²) such that Xt = Zt + θ*Zt-1

## Often modelled using the following equation; Xt = c + Zt + θ*Zt-1,

where c is the constant trend term

## Conditional mean is stochastic; E[Xt+1 | Xt, Xt-1,…] = θ*Zt ≠ E[Xt+1]

The conditional variance is constant; Var[Xt+1 | Xt, Xt-1,…] = σ² ≠ Var(Xt+1)
Hence the conditional mean and variance do not equal their
unconditional counterparts

## MA(q); Moving average process of order q if; Xt = Zt + θ1*Zt-1 + θ2*Zt-2 + … + θq*Zt-q

{Zt} ~ WN(0,σ²)
As such Xt depends on the past q noise terms

## Test for autocorrelations

o Don't understand

## Autoregressive process; AR(1); the value depends on the previous value

of the process and thus it slowly converges to its mean
{Xt} is a first order autoregressive process if;
o {Xt} is stationary
o Xt = φ*Xt-1 + Zt , where φ is a constant and {Zt} ~ WN(0,σ²)
o Zt is uncorrelated with Xs, where s < t

## Same as MA, the conditional mean is stochastic; E[Xt+1 | Xt, Xt-1,…] = φ*Xt

≠ E[Xt+1]
The conditional variance is constant; Var[Xt+1 | Xt, Xt-1,…] = σ² ≠ Var(Xt+1)

## ACF Detection of AR & MA

o MA(q) has a non-zero ACF for lags 1 to q and zero ACF beyond lag q
o AR(1) has an exponentially decaying non-zero ACF for all lags
o AR(p) has a generally decaying ACF
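The sample ACF used for this detection can be sketched as follows (the short input series is purely illustrative):

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations rho(h) = gamma(h)/gamma(0), where
    gamma(h) = (1/n) * sum_t (x_t - xbar)(x_{t+h} - xbar)."""
    n = len(x)
    xbar = sum(x) / n
    def gamma(h):
        return sum((x[t] - xbar) * (x[t + h] - xbar)
                   for t in range(n - h)) / n
    g0 = gamma(0)
    return [gamma(h) / g0 for h in range(max_lag + 1)]

# For an MA(q) series the sample ACF should be near zero beyond lag q;
# for an AR(1) series it should decay roughly geometrically
acf = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
```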

## Duality between AR and MA;

o AR(1) is equivalent to an MA(∞) process

## Autoregressive Moving Average Model (ARMA)

o {Xt} is an ARMA(1,1) process if;
{Xt} is stationary
Xt - φ*Xt-1 = Zt + θ*Zt-1
Can also be written as; φ(B)*Xt = θ(B)*Zt
Where φ(B) = 1 - φB & θ(B) = 1 + θB

## {Xt} is an ARMA process of order (p,q) if {Xt} is stationary & φ(B)*Xt

= θ(B)*Zt

Properties of stationarity;
o The covariance structure does not change over time
o The process can be represented by a linear process
o If {Xt} is stationary and Yt = ψ(B) * Xt, then {Yt} is also
stationary
o Don't understand stationarity need to review later

## An ARMA(p,q) process is causal if there exist constants {ψj} such

that;
o The sum of |ψj| is less than infinity
o Xt = Σ ψj*Zt-j

## Theorem; a process satisfying φ(B)*Xt = θ(B)*Zt is causal if and only if

the roots of the equation φ(z) = 0 are outside the unit circle

## A linear process is invertible if there exist constants {πj} such that

o The sum of |πj| is less than infinity
o Zt = Σ πj*Xt-j

As with the causality condition, if the roots of θ(z) = 0 are outside the
unit circle then the process is invertible