
LECTURE 10: Non-stationary models: Seasonality and ARIMA.

5. Non-Stationary TS. Differencing. (Based on [BD], pp. 28–35)


5.1 Elimination of trend. The first step in the analysis of any TS is to plot the data.
If the TS behaves as if it has no fixed mean level, then the data are non-stationary.
Often, however, although there is no fixed mean, different parts of the series display
a certain kind of homogeneity, in the sense that the local behavior (on short intervals) of
the series is similar, apart from a difference in level and “trend”.
Example: Population of US.
This kind of non-stationarity is called homogeneous. By suitable differencing of such
a process we can arrive at a stationary process. This technique is due to Box
and Jenkins.
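(i) Def. Denote by $\nabla^d X_t$ the $d$th difference of $X_t$ for all $t$, and define:
$$\nabla X_t \equiv \nabla^1 X_t := X_t - X_{t-1}.$$
Note: $\nabla X_t = (1 - B)X_t$.

$$\nabla^2 X_t = \nabla(\nabla X_t) = \nabla(X_t - X_{t-1}) = \nabla X_t - \nabla X_{t-1} = X_t - 2X_{t-1} + X_{t-2}.$$

Note: $\nabla^2 X_t = (1 - B)^2 X_t$.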

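In general,
$$\nabla^d X_t = (1 - B)^d X_t = C_d^0 X_t - C_d^1 X_{t-1} + C_d^2 X_{t-2} - \ldots + (-1)^d C_d^d X_{t-d},$$
where $C_d^j$ denotes the binomial coefficient $\binom{d}{j}$.
We call $\nabla X_t = X_t - X_{t-1}$ “the differenced process”.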
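The binomial expansion of $(1 - B)^d$ can be checked numerically. The following is a minimal sketch, not taken from [BD]: the helper name `difference` and the data values are hypothetical, and NumPy's `np.diff` is used only as an independent reference.

```python
from math import comb

import numpy as np


def difference(x, d):
    """d-th difference of x via the binomial expansion of (1 - B)^d."""
    x = np.asarray(x, dtype=float)
    # Coefficient of X_{t-j} is (-1)^j * C(d, j), for j = 0, ..., d.
    coeffs = [(-1) ** j * comb(d, j) for j in range(d + 1)]
    # The d-th difference is defined only from index d onward (0-based).
    return np.array([
        sum(c * x[t - j] for j, c in enumerate(coeffs))
        for t in range(d, len(x))
    ])


x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])  # arbitrary illustrative data
for d in (1, 2, 3):
    # Repeated first differencing (np.diff) agrees with the expansion.
    assert np.allclose(difference(x, d), np.diff(x, n=d))
```

Each differencing step shortens the series by one observation, so the $d$th difference of $n$ points has $n - d$ values.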
Example (i): $X_t = bt + S_t$, where $S_t$ is a stationary process. (Linear trend).
Then for $W_t = \nabla X_t = X_t - X_{t-1}$ we have:

$$W_t = bt + S_t - b(t - 1) - S_{t-1} = b + (S_t - S_{t-1}).$$

Thus, $W_t$ is stationary.

In some cases it is necessary to difference more than once in order to achieve
stationarity.
Example (ii): $X_t = bt^2 + S_t$, where $S_t$ is a stationary process. (Quadratic trend).
Then take
$$\begin{aligned}
W_t = \nabla^2 X_t &= X_t - 2X_{t-1} + X_{t-2} \\
&= b[t^2 - 2(t - 1)^2 + (t - 2)^2] + S_t - 2S_{t-1} + S_{t-2} \\
&= 2b + (S_t - 2S_{t-1} + S_{t-2}) \\
&= \text{stationary TS}.
\end{aligned}$$
Conclude: If the trend is polynomial of order $k$, then $W_t = \nabla^k X_t$ is stationary
(differencing $k$ times eliminates polynomial trend of order $k$).
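The quadratic-trend example can be verified on simulated data. This is only a sketch under stated assumptions: the values of `n` and `b`, the seed, and the use of Gaussian white noise as the stationary part $S_t$ are all illustrative choices, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)    # seed chosen arbitrarily
n, b = 200, 0.5                   # illustrative sample size and trend coefficient
t = np.arange(1, n + 1)
s = rng.normal(size=n)            # stand-in for the stationary part S_t
x = b * t**2 + s                  # quadratic trend plus stationary noise

w = np.diff(x, n=2)               # W_t = X_t - 2 X_{t-1} + X_{t-2}
s_part = s[2:] - 2 * s[1:-1] + s[:-2]

# The trend contributes only the constant 2b; what remains is built from S alone.
residual = w - (2 * b + s_part)
print(np.max(np.abs(residual)))   # zero up to floating-point rounding
```

After two differences the series fluctuates around the constant level $2b$ and no longer grows with $t$.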
Example (iii). The Random Walk Model. ([BD], pp. 9, 17; also example 3.1
(vi) of Lecture 3)
This model is often used to represent non-stationary data (income, price changes,
stock prices):
$$X_t = X_{t-1} + Z_t$$
or
$$X_t = \sum_{j=0}^{t-1} Z_{t-j}, \quad t = 1, 2, \ldots$$

Note: $EX_t = 0$, $EX_s X_t = \sigma_Z^2 \min(s, t)$ (calculated in 3.1 (vi) of Lecture 3), i.e. the
TS is non-stationary. Note also that the differenced process $W_t = X_t - X_{t-1} \equiv Z_t$ is
stationary and that the ACF for $W$ is $\rho_k = 0$, $k > 0$.
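A small simulation illustrates both properties of the random walk. This is a hedged sketch: Gaussian noise, the seed, the number of paths and the indices $s = 10$, $t = 30$ are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)            # seed chosen arbitrarily
n, n_paths, sigma = 50, 20000, 1.0        # illustrative sizes
z = rng.normal(scale=sigma, size=(n_paths, n))
x = np.cumsum(z, axis=1)                  # x[:, t-1] = Z_1 + ... + Z_t = X_t

# Differencing recovers the white noise: W_t = X_t - X_{t-1} = Z_t.
w = np.diff(x, axis=1)

# Monte Carlo estimate of E[X_s X_t], to compare with sigma^2 * min(s, t).
s_idx, t_idx = 10, 30
emp = np.mean(x[:, s_idx - 1] * x[:, t_idx - 1])
print(emp, sigma**2 * min(s_idx, t_idx))
```

The estimate depends on $s$ and $t$ through $\min(s, t)$ rather than through the lag $t - s$ alone, which is exactly why the walk is non-stationary.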
5.2 Elimination of trend and seasonality via differencing. The Classical
Decomposition model.
Examples of seasonal data: Sunspots, Accidental Deaths, etc.
If the data is a realization of a process of the type

$$(*) \quad X_t = m_t + s_t + S_t,$$

$m_t$ = trend component, polynomial of order $k$
$s_t$ = seasonal component, i.e. periodic with period $d$: $s_{t+d} = s_t$, $\sum_{j=1}^{d} s_j = 0$
$S_t$ = stationary process
[The Classical Decomposition Model]
Convert to a stationary process via differencing:
take the lag-$d$ difference $W_t := \nabla_d X_t = X_t - X_{t-d}$;
then $W_t = (m_t - m_{t-d})$ (trend) $+ \, (S_t - S_{t-d})$ (stationary), because $s_t - s_{t-d} = 0$. Usually, one needs to
difference further to remove the remaining trend $m_t - m_{t-d}$.
Note: $\nabla_d X_t = (1 - B^d)X_t$ removes periodicity (seasonality) with period $d$. The lag-$d$ difference
$\nabla_d$ should not be confused with the $d$th difference $\nabla^d = (1 - B)^d$.
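The effect of the lag-$d$ difference can be seen on a noise-free toy series. In this sketch the period, the seasonal pattern and the trend slope are hypothetical values, and the stationary part is omitted so that the check is exact.

```python
import numpy as np

d = 12                                   # period, e.g. monthly data (illustrative)
n = 10 * d
t = np.arange(n)
pattern = np.arange(d) - (d - 1) / 2     # one period of s_t; its values sum to zero
season = np.tile(pattern, n // d)        # periodic extension: s_{t+d} = s_t
trend = 0.3 * t                          # linear trend m_t with an illustrative slope
x = trend + season                       # stationary part left out on purpose

w = x[d:] - x[:-d]                       # lag-d difference (1 - B^d) X_t
print(w[:3])                             # the seasonal part cancels; m_t - m_{t-d} = 0.3 * d remains

w2 = np.diff(w)                          # one more ordinary difference
```

Here the lag-$d$ difference leaves the constant $0.3 d$ because the trend is linear. For a higher-order polynomial trend, further ordinary differences would be needed.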
(d) ARIMA(p, d, q): this is a nonstationary model such that after $d$ difference
operations we obtain ARMA(p, q). Recall:

$$\nabla X_t = X_t - X_{t-1} = (1 - B)X_t,$$

$$\nabla^2 X_t = \nabla(X_t - X_{t-1}) = X_t - 2X_{t-1} + X_{t-2} = (1 - B)^2 X_t,$$

etc. The model can then be written as:

$$W_t - \phi_1 W_{t-1} - \phi_2 W_{t-2} - \ldots - \phi_p W_{t-p} = Z_t - \theta_1 Z_{t-1} - \ldots - \theta_q Z_{t-q}$$

for $W_t = \nabla^d X_t$. Other ways to write it:

$$\phi(B)(1 - B)^d X_t = \theta(B)Z_t$$

or
$$\phi^*(B)X_t = \theta(B)Z_t \quad \text{with} \quad \phi^*(z) = \phi(z)(1 - z)^d.$$

We require: $\phi(z) \neq 0$, $\theta(z) \neq 0$ for all $|z| \le 1$. Note that $\phi^*(z)$ has a unit root of order
$d$.
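The polynomial $\phi^*(z) = \phi(z)(1 - z)^d$ can be expanded and its unit root checked directly. This is a minimal sketch with a hypothetical AR(1) factor $\phi(z) = 1 - 0.5z$ and $d = 2$. Coefficients are stored in ascending powers of $z$, as `numpy.polynomial.polynomial` expects.

```python
import numpy as np
from numpy.polynomial import polynomial as P

phi = np.array([1.0, -0.5])          # phi(z) = 1 - 0.5 z (illustrative AR part)
d = 2                                # illustrative order of differencing
one_minus_z = np.array([1.0, -1.0])  # 1 - z

# phi*(z) = phi(z) * (1 - z)^d
phi_star = P.polymul(phi, P.polypow(one_minus_z, d))
print(phi_star)                      # coefficients of 1 - 2.5 z + 2 z^2 - 0.5 z^3

# Order of the root at z = 1: count how many successive derivatives vanish there.
order, c = 0, phi_star
while abs(P.polyval(1.0, c)) < 1e-12:
    order += 1
    c = P.polyder(c)
print(order)                         # equals d
```

The remaining root of this $\phi^*$ is $z = 2$, which lies outside the unit circle, consistent with the requirement on $\phi$.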
• Useful for representing data with trend.
• Typical feature: slowly decaying ACF.
• Unit root in AR part: if $X_t$ comes from the process $\phi^*(B)X_t = \theta(B)Z_t$ with
$\phi^*(z) = (1 - z)\phi(z)$, then difference to eliminate the unit root: for $W_t = X_t - X_{t-1}$
we have the model $\phi(B)W_t = \theta(B)Z_t$.
• Unit root in MA part: if $X_t$ comes from the process $\phi(B)X_t = \theta^*(B)Z_t$ with
$\theta^*(z) = (1 - z)\theta(z)$, then the series was overdifferenced. To see this, let $Y_t$ be an invertible ARMA
process: $\phi(B)Y_t = \theta(B)Z_t$. Let $X_t = Y_t - Y_{t-1}$. Then $\phi(B)X_t = \phi(B)(1 - B)Y_t =
(1 - B)\theta(B)Z_t$.
Thus, if the estimated $\phi$ or $\theta$ has a root close to $z = 1$ (a near unit root), check your differencing.
• Effect of overdifferencing on the variance and ACF
∗ Suppose we consider the MA(1) process: $X_t = (1 + \theta B)Z_t$. Its ACF is given by:

$$\rho_X(1) = \frac{\theta}{1 + \theta^2}, \quad \rho_X(k) = 0, \ k \ge 2.$$

The first difference of this process, $W_t = X_t - X_{t-1}$, is the MA(2) process:

$$W_t = (1 - B)(1 + \theta B)Z_t \equiv (1 - (1 - \theta)B - \theta B^2)Z_t.$$

Its ACF has $\rho_W(2) \neq 0$ (for $\theta \neq 0$), i.e. the ACF structure is more complicated.

Also, the variance of the original process is $\gamma_X(0) = (1 + \theta^2)\sigma_Z^2$, and the variance of the
differenced process $W_t$ is $\gamma_W(0) = 2(1 - \theta + \theta^2)\sigma_Z^2$. Thus, $\gamma_W(0) - \gamma_X(0) = (1 - \theta)^2 \sigma_Z^2 > 0$
for $\theta \neq 1$, i.e. the variance of the overdifferenced MA process is larger than that of the original
process. A similar calculation can be made for an AR(1) process.
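The variance and ACF formulas above can be reproduced from the MA coefficients alone, using $\gamma(k) = \sigma_Z^2 \sum_j \psi_j \psi_{j+k}$ for a finite moving average. This is a sketch only: the helper `ma_acvf` is a hypothetical name, and $\theta = 0.6$, $\sigma_Z^2 = 1$ are illustrative values.

```python
import numpy as np


def ma_acvf(psi, sigma2=1.0):
    """Autocovariances gamma(0), ..., gamma(q) of X_t = sum_j psi[j] * Z_{t-j}."""
    psi = np.asarray(psi, dtype=float)
    q = len(psi) - 1
    return sigma2 * np.array([psi[: len(psi) - k] @ psi[k:] for k in range(q + 1)])


theta = 0.6                                  # illustrative MA(1) coefficient
psi_x = np.array([1.0, theta])               # X_t = (1 + theta B) Z_t
psi_w = np.convolve([1.0, -1.0], psi_x)      # W_t = (1 - B) X_t, coefficients [1, theta - 1, -theta]

g_x, g_w = ma_acvf(psi_x), ma_acvf(psi_w)
rho_x, rho_w = g_x / g_x[0], g_w / g_w[0]

print(g_w[0] - g_x[0], (1 - theta) ** 2)     # variance increase caused by differencing
print(rho_w[2])                              # nonzero: the ACF now extends to lag 2
```

Multiplying the MA polynomial by $1 - B$ is a convolution of coefficient sequences, which is why `np.convolve` gives the coefficients of $W_t$.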
