
MH4500 TIME SERIES ANALYSIS

Chapter 3 (Part 1): Stationary linear time series

This chapter introduces the most commonly used stationary linear time
series models, the autoregressive moving average (ARMA) models. These models
have assumed great importance in modeling real-world processes.

1 The Moving average

DEFINITION 1: The process {Xt} defined by

Xt = Zt + ∑_{j=1}^{q} θj Zt−j = ∑_{j=0}^{q} θj Zt−j ,   θ0 = 1,

where Zt ∼ WN(0, σ²), is called a moving average process of order q and is
denoted by MA(q). We always assume θi = 0 for i ≤ −1.

Properties of MA(q) are as follows.

(a) E[Xt ] = 0 since E[Zt ] = 0.

(b) Its variance is

var(Xt) = var(Zt + ∑_{j=1}^{q} θj Zt−j)
        = σ² + ∑_{j=1}^{q} θj² σ²
        = σ² (1 + ∑_{j=1}^{q} θj²),

where σ² = var(Zt).

(c) Its covariance functions are

cov(Xt−k, Xt) = cov(∑_{j=0}^{q} θj Zt−k−j , ∑_{l=0}^{q} θl Zt−l)
             = ∑_{j=0}^{q} ∑_{l=0}^{q} θj θl cov(Zt−k−j , Zt−l) = ∑_{j=0}^{q−k} θj+k θj σ²
             = (θk + θ1 θk+1 + · · · + θq−k θq) σ²   for k ≤ q.

When k > q, cov(Xt−k , Xt ) = 0.

(d) Its correlation functions are

ρ(k) = 1                                                          for k = 0,
ρ(k) = (θk + θ1 θk+1 + · · · + θq−k θq) / (1 + ∑_{j=1}^{q} θj²)   for 1 ≤ k ≤ q,     (1.1)
ρ(k) = 0                                                          for k > q.

(e) Generally, the ACF of an MA(q) time series cuts off after lag q.

(f) Since the mean and covariance function are both independent of time t, we
conclude that MA(q) is stationary.

EXAMPLE 1 (MA(2) process): If Zt ∼ W N(0, σ 2 ), then {Xt } defined by

Xt = Zt + θ1 Zt−1 + θ2 Zt−2

is an MA(2) process. Find its ACFs.


SOLUTION: It follows that the ACFs are given by

ρ(1) = (θ1 + θ1 θ2) / (1 + θ1² + θ2²),
ρ(2) = θ2 / (1 + θ1² + θ2²),
ρ(k) = 0 ∀ k ≥ 3.
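
As a quick numerical check, here is a short R sketch (the θ values are illustrative
choices, not taken from the notes) comparing the closed-form ACFs above with R's
built-in ARMAacf:

    theta1 <- 0.6; theta2 <- 0.3                  # illustrative MA(2) coefficients
    # closed-form values from (1.1)
    rho1 <- (theta1 + theta1 * theta2) / (1 + theta1^2 + theta2^2)
    rho2 <- theta2 / (1 + theta1^2 + theta2^2)
    c(rho1, rho2)
    ARMAacf(ma = c(theta1, theta2), lag.max = 4)  # lags 0..4; zero beyond lag 2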

EXAMPLE 2 Let Zt ∼ WN(0, σ²). Consider ∇^k Zt, which appeared when applying
the differencing operator to produce a stationary process. Prove that it is stationary.
SOLUTION It is easily seen that ∇Zt = Zt − Zt−1 is an MA(1) and ∇²Zt = Zt − 2Zt−1 +
Zt−2 is an MA(2). Similarly one may see that ∇^k Zt is an MA(k) model. Therefore
it is stationary.
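
A small simulation sketch in R illustrates this: differencing white noise once gives
an MA(1) with θ1 = −1, so by (1.1) its lag-one ACF is θ1/(1 + θ1²) = −1/2.

    set.seed(42)
    z <- rnorm(10000)                          # white noise
    dz <- diff(z)                              # MA(1) with theta1 = -1
    acf(dz, lag.max = 5, plot = FALSE)$acf[2]  # sample rho(1), close to -0.5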

DEFINITION 2 Let

Xt = ∑_{j=0}^{∞} ψj Zt−j

where ψ0 = 1, Zt ∼ WN(0, σ²) and

∑_{j=0}^{∞} |ψj| < ∞.

{Xt} is called a general linear process or an MA(∞) process.

As before, the properties of MA(∞) can be verified as follows.

(a) E(Xt) = 0.

(b) Var(Xt) = σ² ∑_{j=0}^{∞} ψj².

(c) Its correlation functions are

ρ(k) = 1                                           if k = 0,
ρ(k) = (∑_{j=0}^{∞} ψj ψj+k) / (∑_{j=0}^{∞} ψj²)   if k ≠ 0.

EXAMPLE 3 Let

Xt = ∑_{j=0}^{∞} ϕ^j Zt−j

where |ϕ| < 1. Clearly,

∑_{j=0}^{∞} |ϕ^j| = ∑_{j=0}^{∞} |ϕ|^j = 1 / (1 − |ϕ|) < ∞.

Therefore Xt is a general linear process. Is it stationary?


Note that

Xt = Zt + ϕZt−1 + ϕ²Zt−2 + ϕ³Zt−3 + · · ·
   = Zt + ϕ(Zt−1 + ϕZt−2 + ϕ²Zt−3 + · · · )
   = Zt + ϕXt−1

∴ Xt = ϕXt−1 + Zt

or
Xt − ϕXt−1 = Zt
Xt is called an autoregressive process of order one (or AR(1)). For AR(1), we
further have

Var(Xt) = σ² / (1 − ϕ²)

and

γ(k) = σ² ϕ^k / (1 − ϕ²),

∴ ρ(k) = γ(k)/γ(0) = ϕ^k.
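
A one-line check in R (ϕ = 0.7 is an arbitrary illustrative value):

    phi <- 0.7
    ARMAacf(ar = phi, lag.max = 5)   # theoretical ACF at lags 0..5
    phi^(0:5)                        # identical: rho(k) = phi^k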

2 Autoregressive models
DEFINITION 3 : Assume that {Zt } is white noise. Then the process {Xt } defined
by
Xt = ∑_{i=1}^{p} ϕi Xt−i + Zt

is called an autoregressive process of order p, denoted by AR(p).

Recall the backshift operator B,

BXt = Xt−1 and B^i Xt = Xt−i

for i = 1, 2, . . ..
Define the pth degree polynomial

ϕ(z) = 1 − ϕ1 z − · · · − ϕp z^p.

We can now rewrite the model of Xt as

ϕ(B)Xt = Zt ,

where ϕ(B) = 1 − ∑_{j=1}^{p} ϕj B^j. It is called the AR characteristic polynomial.

THEOREM 1 AR(p) is stationary if all the roots of the polynomial ϕ(z) are
outside the unit circle.

Assume that there is a sequence of constants {ψj : j = 0, 1, . . .} such that

Xt = (ϕ(B))^{−1} Zt = ∑_{j=0}^{∞} ψj Zt−j .                (2.1)

It follows that {Xt} is stationary if and only if

∑_{j=0}^{∞} |ψj| < ∞.

Properties of stationary AR(p) are listed below.

(a) EXt = 0

(b) E(Zt Xt−k ) = 0 for any integer k > 0

(c) γ(0) = ϕ1 γ(1) + · · · + ϕp γ(p) + σ 2 .

(d) For k ≥ 1, γ(k) = ϕ1 γ(k − 1) + ϕ2 γ(k − 2) + · · · + ϕp γ(k − p). Thus

    γ(1) = ϕ1 γ(0) + ϕ2 γ(1) + · · · + ϕp γ(p − 1)
    γ(2) = ϕ1 γ(1) + ϕ2 γ(0) + · · · + ϕp γ(p − 2)
    · · ·
    γ(p) = ϕ1 γ(p − 1) + ϕ2 γ(p − 2) + · · · + ϕp γ(0).

THEOREM 2 The Yule-Walker equations are

    ρ(1) = ϕ1 + ϕ2 ρ(1) + · · · + ϕp ρ(p − 1)
    ρ(2) = ϕ1 ρ(1) + ϕ2 + · · · + ϕp ρ(p − 2)
    · · ·
    ρ(p) = ϕ1 ρ(p − 1) + ϕ2 ρ(p − 2) + · · · + ϕp .

PROOF Fact (d) may be obtained by multiplying both sides of the AR(p) model
by Xt−k, taking expectations on both sides, and then using (b) together with the
fact that γ(k) = Cov(Xt, Xt−k) = E(Xt Xt−k), which holds because EXt = 0. Fact (c)
can be proved similarly. Fact (b) can be seen from (2.1). 
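
As a numerical sanity check, this R sketch (with arbitrary stationary AR(2)
coefficients) verifies the Yule-Walker equations:

    phi <- c(0.5, 0.3)                         # illustrative AR(2)
    rho <- ARMAacf(ar = phi, lag.max = 2)      # named vector: lags 0, 1, 2
    rho["1"] - (phi[1] + phi[2] * rho["1"])    # ~ 0: first equation
    rho["2"] - (phi[1] * rho["1"] + phi[2])    # ~ 0: second equation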

EXAMPLE 4 (AR(1) process continued): Consider an AR(1) model of the form

Xt = ϕ1 Xt−1 + Zt , Zt ∼ W N(0, σ 2 ).

◃ (i) Show that Xt is stationary if |ϕ1 | < 1.

◃ (ii) Find the ACFs.

SOLUTION: (i) If |ϕ1| < 1, then it has been shown that {Xt} is stationary (refer to
Example 3).
(ii) Referring to Example 3, we have, as k → ∞,

|ρ(k)| = |ϕ1|^k = e^{−k log(|ϕ1|^{−1})} → 0

exponentially. This fact is very useful in identifying an AR(1) model. Note the
exponential decay of the ACF of an AR(1).
Alternatively, by the Yule-Walker equations,

ρ(k) = ϕ1 ρ(k − 1).

This yields
ρ(k) = ϕ1^k ρ(0) = ϕ1^k.

MTH451/MH4500 TIME SERIES ANALYSIS
Chapter 3 (Part 2): Stationary linear time series


1 Stationary ARMA models

DEFINITION 1: (The ARMA(p, q) Process). The process {Xt} is said to be an
auto-regressive moving average or ARMA(p, q) process if {Xt} is stationary and,
for every t,

Xt − ϕ1 Xt−1 − · · · − ϕp Xt−p
= Zt + θ1 Zt−1 + · · · + θq Zt−q , (1.1)

where Zt ∼ W N(0, σ 2 ).

We say that {Xt} is an ARMA(p, q) process with mean µ if {Xt − µ} is an
ARMA(p, q) process.
Model (1.1) can be written symbolically in the more compact form

ϕ(B)Xt = θ(B)Zt , t = 0, ±1, ±2, . . . ,

where ϕ and θ are the pth and qth degree polynomials

ϕ(z) = 1 − ϕ1 z − · · · − ϕp z^p

and

θ(z) = 1 + θ1 z + · · · + θq z^q,

and B is the backward shift operator defined by

B^j Xt = Xt−j ,   j = 0, ±1, ±2, . . . .

EXAMPLE 1 (The AR(p) Process). If θ(z) ≡ 1 then

ϕ(B)Xt = Zt

and the process is said to be an autoregressive process of order p (or AR(p)).
This model can also be written as

Xt = ϕ1 Xt−1 + ϕ2 Xt−2 + · · · + ϕp Xt−p + Zt .

EXAMPLE 2 (The MA(q) Process). If ϕ(z) ≡ 1 then

Xt = θ(B)Zt = Zt + θ1 Zt−1 + · · · + θq Zt−q

and the process is said to be a moving-average process of order q (or MA(q)).

EXAMPLE 3 (Calculation of ACVFs of ARMA(1,1)), i.e.

Xt − ϕ1 Xt−1 = Zt + θ1 Zt−1 .

Write
(1 − ϕ1 B)Xt = Zt + θ1 Zt−1 .
If |ϕ1| < 1, we have

Xt = 1/(1 − ϕ1 B) (Zt + θ1 Zt−1) = ∑_{j=0}^{∞} ϕ1^j B^j (Zt + θ1 Zt−1)

   = ∑_{j=0}^{∞} ϕ1^j (Zt−j + θ1 Zt−1−j),

i.e.

Xt = Zt + (ϕ1 + θ1)Zt−1 + ϕ1(ϕ1 + θ1)Zt−2 + ϕ1²(ϕ1 + θ1)Zt−3 + · · ·
   = Zt + ψ1 Zt−1 + ψ2 Zt−2 + ψ3 Zt−3 + · · · ,

where ψj = ϕ1^{j−1}(ϕ1 + θ1) for j ≥ 1. Note that

1 + |ψ1| + |ψ2| + |ψ3| + · · · = 1 + |ϕ1 + θ1| · 1/(1 − |ϕ1|) < ∞.

Therefore, the ARMA(1,1) is a stationary time series if |ϕ1 | < 1.


It follows from the property of MA(∞) that, for h ≥ 1,

γ(h) = σ² ∑_{j=0}^{∞} ψj+h ψj = ϕ1^{h−1} σ² (ϕ1 + θ1)(1 + (ϕ1 + θ1) ∑_{j=1}^{∞} ϕ1^{2j−1}) = ϕ1^{h−1} γ(1)

and

ρ(h) = ϕ1^{h−1} ρ(1). 
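
In R one can check the geometric decay of the ACF, and a series like the one in
Figure 1 below can be simulated along these lines (a sketch; the seed is arbitrary):

    phi1 <- 0.5; theta1 <- 0.8
    rho <- ARMAacf(ar = phi1, ma = theta1, lag.max = 4)
    rho["2"] / rho["1"]; rho["3"] / rho["2"]   # both ratios equal phi1
    set.seed(1)
    ts.sim <- arima.sim(model = list(ar = phi1, ma = theta1), n = 200)
    plot(ts.sim); acf(ts.sim)                  # compare with Figure 1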

[Figure 1: Plot of a simulated ARMA(1, 1) series with ϕ1 = 0.5, θ1 = 0.8 (top) and its sample ACF (bottom).]

Conclusion: the ACF of ARMA(1,1) dies down (without cut-off).
Alternatively, as in AR(p), one may use the Yule-Walker equations to obtain the
ACVFs of ARMA(1,1); we omit the details here.
This suggests another approach to verifying that a process is stationary:
convert it to an MA(∞).

THEOREM 1 The ARMA process is stationary if all roots of the polynomial ϕ(z)
are outside the unit circle.

EXAMPLE 4 Consider the model

Xt = Xt−1 + Zt .

Note that ϕ(z) = 1 − z. Its root is 1, which does not lie outside the unit circle. It
follows that the process is not stationary. 
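
The root condition can be checked numerically; a minimal R sketch using polyroot
(coefficients listed in increasing powers of z):

    abs(polyroot(c(1, -1)))     # phi(z) = 1 - z: root 1 on the unit circle, not stationary
    abs(polyroot(c(1, -0.5)))   # phi(z) = 1 - 0.5z: root 2 outside, stationary AR(1)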

2 The Partial autocorrelation coefficient function (PACF)


The partial autocorrelation coefficient function, like the autocorrelation function,
conveys vital information regarding the dependence structure of a stationary pro-
cess.

2.1 Motivation of PACF


To motivate the idea, consider an AR(1) model,

Xt = ϕXt−1 + Zt .

Then

γ(2) = cov(Xt, Xt−2) = cov(ϕXt−1 + Zt, Xt−2)
     = cov(ϕ²Xt−2 + ϕZt−1 + Zt, Xt−2) = ϕ²γ(0),

because Zt and Zt−1 are both uncorrelated with Xt−2 (the so-called causal property).
The correlation between Xt and Xt−2 is not equal to zero because Xt depends
on Xt−2 through Xt−1. Suppose we break this chain of dependence by removing
Xt−1. That is, consider the correlation between Xt − ϕXt−1 and Xt−2 − ϕXt−1,
i.e. the correlation between Xt and Xt−2 with the linear dependence of each
on Xt−1 removed. In this way, we have broken the dependence chain between
Xt and Xt−2. Indeed,

cov(Xt − ϕXt−1, Xt−2 − ϕXt−1) = cov(Zt, Xt−2 − ϕXt−1) = 0.

2.2 Definition of PACF
The aim of introducing PACF is to detect the relation between xt and xt−h with
the effect of xt−1 , · · · , xt−h+1 removed.

xt, xt−1, · · · , xt−h+1, xt−h.

DEFINITION 2 For a stationary process {Xt}, the partial autocorrelation function
(PACF) is defined as

ϕ11 = corr(X1, X0) = ρ(1)
ϕkk = corr(Xk − f1,k−1, X0 − f2,k−1), k ≥ 2,

where f1,k−1 = f1(Xk−1, · · · , X1) and f2,k−1 = f2(Xk−1, · · · , X1) minimize, respec-
tively, the mean square linear prediction errors

E(Xk − f1,k−1)², E(X0 − f2,k−1)².

Usually, f1(Xk−1, · · · , X1) = β1 X1 + · · · + βk−1 Xk−1.

For a stationary process {Xt}, the partial autocorrelation function (PACF) at
lag k can be interpreted as the correlation between X0 and Xk after removing
their linear dependence on X1, ..., Xk−1.
EXAMPLE 5 If xt ∼ AR(p), i.e.

xt = ϕ1 xt−1 + · · · + ϕp xt−p + Zt ,

then its PACFs are

ϕ1,1 = ρ(1),
ϕp,p = ϕp ,
ϕk,k = 0 for k > p.

When k > p, the linear combination minimizing the mean square linear pre-
diction error is

f1,k−1 = ϕ1 Xk−1 + · · · + ϕp Xk−p

(obtaining such an expression is complicated and beyond the scope of the course).
Then the PACF at lag k > p is

ϕk,k = corr(Xk − f1,k−1, X0 − f2,k−1) = corr(Zk, X0 − f2,k−1) = 0,

because Zk is uncorrelated with the past values Xk−j for j > 0. The calculation
of ϕp,p = ϕp is also beyond the scope of the course. 

THEOREM 2 Generally, the PACF of an AR(p) time series cuts off after lag p.

The general formula for ϕhh of an ARMA process is the last component of

ϕ_h = Γ_h^{−1} g_h ,

where Γ_h = [γ(i − j)]_{i,j=1}^{h} and g_h = [γ(1), γ(2), · · · , γ(h)]^T (this formula
is for your reference only).
Summary
Identification of ARMA models

model        ACF                    PACF
AR(p)        dies down              cuts off after lag p
MA(q)        cuts off after lag q   dies down
ARMA(p,q)    dies down              dies down
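
This table can be seen in simulation; a short R sketch with illustrative coefficients:

    set.seed(123)
    ar2 <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 500)
    ma2 <- arima.sim(model = list(ma = c(0.6, 0.3)), n = 500)
    op <- par(mfrow = c(2, 2))
    acf(ar2); pacf(ar2)   # ACF dies down, PACF cuts off after lag 2
    acf(ma2); pacf(ma2)   # ACF cuts off after lag 2, PACF dies down
    par(op)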

3 ARIMA models for nonstationary time series


We have already discussed the class of ARMA models for representing stationary
series. A generalization of this class, which incorporates a wide range of non-
stationary series, is provided by the ARIMA processes, which, after differencing
finitely many times, reduce to ARMA models.

DEFINITION 3 (Auto-regressive Integrated Moving Average (ARIMA) Process):
A process {Xt} is ARIMA(p, d, q) iff {∇^d Xt} is a stationary ARMA(p, q) pro-
cess.

ARIMA processes are not stationary when d > 0.

EXAMPLE 6 Suppose that {Xt} is a stationary process. Let Yt = Xt + c1 + c2 t + c3 t².
Is {Yt} stationary? (1 − B)Yt? (1 − B)²Yt?
Answer: (1 − B)²Yt is stationary. Indeed, {Yt} is not stationary since its mean
depends on t; (1 − B)Yt = ∇Xt + c2 + c3(2t − 1) still contains a linear trend; and
(1 − B)²Yt = ∇²Xt + 2c3 has constant mean. 
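
A quick R illustration of this example (the constants c1 = 1, c2 = 0.2, c3 = 0.01
and the AR(1) choice for {Xt} are arbitrary):

    set.seed(7)
    t <- 1:300
    x <- arima.sim(model = list(ar = 0.5), n = 300)  # a stationary {Xt}
    y <- x + 1 + 0.2 * t + 0.01 * t^2                # Yt = Xt + c1 + c2 t + c3 t^2
    acf(diff(y))                   # one difference: a linear trend remains
    acf(diff(y, differences = 2))  # two differences: consistent with stationarity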

EXAMPLE 7 (ARIMA(0,1,0)): Consider

(1 − B)Xt = Zt ,   or   Xt = X0 + ∑_{i=1}^{t} Zi ,

where Zt is white noise. This special case is also called a random walk. It
can be classified as ARIMA(0,1,0). 
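
A random walk is one line in R, and its slowly decaying sample ACF already
signals nonstationarity (a sketch with an arbitrary seed):

    set.seed(2)
    x <- cumsum(rnorm(200))   # X_t = Z_1 + ... + Z_t, taking X_0 = 0
    plot.ts(x); acf(x)        # SACF dies down very slowly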

EXAMPLE 8 (ARIMA(0, 1, 1)): Consider a time series model of the form

Xt = Xt−1 + Zt + θZt−1 ,                (3.1)

where {Zt} is white noise.
Model (3.1) is an ARIMA(0, 1, 1) since

∇Xt = Zt + θZt−1

defines {∇Xt} as an ARMA(0, 1) = MA(1). 

Aside: Notation: If p, d or q are zero, names can be abbreviated in the obvious


way. e.g., ARMA(1,0) ≡ AR(1); ARIMA(0,0,3) ≡ MA(3).
[Figure 2: Plot of a simulated ARIMA(0, 1, 1) series with θ1 = 0.8 (top) and its sample ACF (bottom).]

4 Why do we investigate AR and MA models?


EXAMPLE 9 Consider the following data set of xt :

1.0445, -0.1338, 0.6706, 0.3755, -0.5110, -0.2352, 0.1595, 1.6258, -1.6739, 2.4478,
-3.1019, 2.6860, -0.9905, 1.2113, -0.0929, 0.9905, 0.5213, -0.1139, -0.4062, 0.5438

[Plot of the data xt against time t = 1, . . . , 20.]

We cannot make any “good” prediction using a trend-and-seasonality model.
Suppose we try the following model:

xt = 0.40 − 0.82xt−1 .

This is an AR(1) model (we will explain how the values 0.40 and −0.82 are
obtained later in the course).
The fitted values are (time: prediction) 2: -0.4569, 3: 0.5130, 4: -0.1491, 5:
0.0938, 6: 0.8235, 7: 0.5965, 8: 0.2716, 9: -0.9354, 10: 1.7809, 11: -1.6121,
12: 2.9564, 13: -1.8082, 14: 1.2183, 15: -0.5942, 16: 0.47939, 17: -0.4124, 18:
-0.0262, 19: 0.4966, 20: 0.73730,
which give reasonable predictions. 
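
A minimal R sketch of this kind of fit, regressing xt on xt−1 by least squares
(the estimates need not reproduce 0.40 and −0.82 exactly, since those may come
from a different estimation method):

    x <- c(1.0445, -0.1338, 0.6706, 0.3755, -0.5110, -0.2352, 0.1595, 1.6258,
           -1.6739, 2.4478, -3.1019, 2.6860, -0.9905, 1.2113, -0.0929, 0.9905,
           0.5213, -0.1139, -0.4062, 0.5438)
    fit <- lm(x[-1] ~ x[-20])   # regress x_t on x_{t-1}
    coef(fit)                   # intercept and slope
    fitted(fit)                 # one-step predictions for t = 2, ..., 20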

AR models are USEFUL for prediction


For the reasons above (and more), we need to investigate the models AR(p), MA(q),
. . . and, more generally, stationary time series. To this end, we need to answer the
following questions.

◃ How do we check stationarity?

◃ What are the ACF and PACF?

◃ What are the properties of AR, MA and ARMA models?

◃ How do we calculate the ACF and PACF of a given ARMA model?


5 Selecting a model based on SACF and SPACF

What will the SACF look like if the time series is not stationary? Here is an
example.

[Figure 3: Sample ACF of a simulated ARIMA(0, 1, 1) series with θ1 = 0.8.]

For a stationary TS (time series), the SACF will die down very quickly or
even cut off. Otherwise, the time series is not stationary.
Given observations of a time series, one approach to fitting a model
to the data is to match the sample ACF and sample PACF of the data with the
ACF and PACF of the model, respectively. Suppose that z1, · · · , zn is a realization
(sample, observations) from a stationary time series. The sample ACF (SACF) at
lag h is

r_h = ∑_{t=1}^{n−h} (zt − z̄)(zt+h − z̄) / ∑_{t=1}^{n} (zt − z̄)²

and the sample PACF (SPACF) is

r1,1 = r1

rk,k = (rk − ∑_{j=1}^{k−1} rk−1,j rk−j) / (1 − ∑_{j=1}^{k−1} rk−1,j rj),   for k = 2, 3, · · ·

rk,j = rk−1,j − rk,k rk−1,k−j ,   for j = 1, 2, · · · , k − 1,

where z̄ = ∑_{t=1}^{n} zt / n.
The standard error of rh is

s_rh = ( (1 + 2 ∑_{j=1}^{h−1} rj²) / n )^{1/2}

and the standard error of rhh is

s_rhh = (1/n)^{1/2}.
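
A short R sketch computing the SACF, its standard errors and the significance
bounds for a simulated MA(1) (illustrative parameters):

    set.seed(9)
    z <- arima.sim(model = list(ma = 0.8), n = 200)
    n <- length(z)
    r <- acf(z, lag.max = 20, plot = FALSE)$acf[-1]    # r_1, ..., r_20
    s_r <- sqrt((1 + 2 * cumsum(c(0, r[-20]^2))) / n)  # s_rh for h = 1..20
    which(abs(r) > 1.96 / sqrt(n))                     # significant SACF lags
    pacf(z, plot = FALSE)                              # SPACF; bound 1.96/sqrt(n)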

Recall that

model          ACF                    PACF
AR(p)          dies down quickly      cuts off after lag p
MA(q)          cuts off after lag q   dies down quickly
ARMA(p,q)      dies down quickly      dies down quickly
Nonstationary  dies down slowly       —

THEOREM 3 If the SACF rh is significantly different from zero for 0 ≤ h ≤ q
and negligible for h > q, then an MA(q) model might be suitable for the data.

How do we classify an SACF value as negligible? If (with probability approximately 0.95)

|rh| ≤ 1.96 s_rh ,

then the SACF at that lag is considered negligible. In practice, we frequently use
the more stringent bound 1.96/√n.

THEOREM 4 If the SPACF rhh is significantly different from zero for 0 ≤ h ≤ p
and negligible for h > p, then an AR(p) model might be suitable for the data.

How do we classify an SPACF value as negligible? If (with probability approximately 0.95)

|rhh| ≤ 1.96/√n ,

then the SPACF at that lag is considered negligible.
There is no theoretical model for which both the ACF has nonzero values at lags
1, · · · , q and cuts off after lag q, and the PACF has nonzero values at lags 1, · · · , p
and cuts off after lag p. However, in practice the SACF sometimes has spikes at
lags 1, · · · , q and cuts off after lag q, while the SPACF has spikes at lags 1, · · · , p
and cuts off after lag p. If this occurs, experience indicates that we should
determine which of the SACF and SPACF cuts off more abruptly. If the SACF
cuts off more abruptly, then an MA(q) model might be appropriate.
Here are some examples. Can you identify appropriate models for them?
[Sample ACF and sample PACF plots for Time series 1 through 9; for each series, one panel shows the SACF and the other the SPACF.]
