
Lecture 8

Yufei Cai, Langtian Ma, Zhuorang Liu, Yifei Feng


March 24, 2023

1 Examples of ARIMA(p, d, q)
Definition 1 (ARIMA(p, d, q)). We say that Xt ∼ ARIMA(p, d, q) if (1 − B)^d Xt ∼ ARMA(p, q), where B is the back-shift operator with BXt = Xt−1. That is to say,

ϕ(B)(1 − B)^d Xt = θ(B)Wt

where ϕ(B) = 1 − ϕ1 B − ϕ2 B^2 − ... − ϕp B^p, θ(B) = 1 + θ1 B + θ2 B^2 + ... + θq B^q, and Wt ∼ WN(0, σ^2).

1.1 ARIMA(p, 1, q)
Definition 2 (ARIMA(p, 1, q)). Xt ∼ ARIMA(p, 1, q) if Yt ≜ Xt − Xt−1 ∼
ARMA(p, q), i.e.,

Yt − ϕ1 Yt−1 − ... − ϕp Yt−p = Wt + θ1 Wt−1 + ... + θq Wt−q

where Wt ∼ W N (0, σ 2 ).
Example 1.1. Simulated ARIMA(2, 1, 1) with ϕ1 = 0.7, ϕ2 = −0.3, θ1 = −0.4

library(ggplot2)
library(forecast)
set.seed(123)
tsdata <- arima.sim(list(order = c(2, 1, 1),
                         ar = c(0.7, -0.3),
                         ma = -0.4), n = 200) + 10
autoplot(tsdata) +
  geom_point() +
  labs(title = "ARIMA(2,1,1) Simulation", x = "Time", y = "Value")

Proposition 1. The characteristic polynomial of Xt ∼ ARIMA(p, 1, q) is ϕ̃(z) = (1 − z)(1 − ϕ1 z − ϕ2 z^2 − ... − ϕp z^p), which has z = 1 as a root.
Proof. By the definition, (Xt − Xt−1 ) − ϕ1 (Xt−1 − Xt−2 ) − ... − ϕp (Xt−p −
Xt−p−1 ) = Wt + θ1 Wt−1 + ... + θq Wt−q , which indicates (1 − B)(1 − ϕ1 B −
ϕ2 B 2 − ... − ϕp B p )Xt = Wt + θ1 Wt−1 + ... + θq Wt−q .
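As a quick numerical check (a sketch using the coefficients of Example 1.1, ϕ1 = 0.7, ϕ2 = −0.3, which are illustrative here), we can expand ϕ̃(z) = (1 − z)(1 − 0.7z + 0.3z^2) by convolving coefficient vectors and confirm with `polyroot` that z = 1 is among the roots:

```r
# Characteristic polynomial of the ARMA part: phi(z) = 1 - 0.7 z + 0.3 z^2
phi <- c(1, -0.7, 0.3)
# Multiply by (1 - z): shift-and-subtract of the coefficient vector
phi_tilde <- c(phi, 0) - c(0, phi)   # coefficients of (1 - z) * phi(z)
print(phi_tilde)                      # 1.0 -1.7  1.0 -0.3
roots <- polyroot(phi_tilde)
# z = 1 should appear (numerically) among the roots
min(Mod(roots - 1))
```

The unit root is what makes the process non-stationary: differencing once removes the factor (1 − z) and leaves the stationary ARMA polynomial.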

1.2 IMA(d, q) (= ARIMA(0, d, q))

If there is no AR part in ARIMA, i.e., ARIMA(0, d, q), we call it an integrated moving-average model, IMA(d, q).
Definition 3 (IMA(d, q)). Xt ∼ IMA(d, q) (or ARIMA(0, d, q)) if Yt ≜ (1 −
B)d Xt ∼ ARMA(0, q), i.e.,

(1 − B)d Xt = Wt + θ1 Wt−1 + ... + θq Wt−q

where Wt ∼ W N (0, σ 2 ).
Example 1.2. Simulated ARIMA(0, 1, 2) with θ1 = 0.7, θ2 = −0.3
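A simulation in the style of Example 1.1 can be sketched as follows (the MA coefficients θ1 = 0.7, θ2 = −0.3 are chosen here purely for illustration):

```r
library(ggplot2)
library(forecast)
set.seed(123)
# Simulate ARIMA(0,1,2): no AR part, first-order differencing, MA(2) innovations
tsdata <- arima.sim(list(order = c(0, 1, 2), ma = c(0.7, -0.3)), n = 200) + 10
autoplot(tsdata) +
  geom_point() +
  labs(title = "ARIMA(0,1,2) Simulation", x = "Time", y = "Value")
```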

1.2.1 IMA(1,1)
Definition 4 (IMA(1, 1)). Xt ∼ IMA(1, 1) if Xt − Xt−1 = (1 − B)Xt = Wt +
θWt−1 where Wt ∼ W N (0, σ 2 ).
Assume the initial condition Xj = 0 for j ≤ −m − 1.

Proposition 2. If Xt ∼ IMA(1, 1), then

1. Var(Xt) = (1 + θ^2 + (t + m)(1 + θ)^2) σ^2, which increases as t grows.

2. Corr(Xt+k, Xt) ≈ sqrt((t + m)/(t + m + k)) ≈ 1 for large m and moderate k.

Remark. The formula written in the lesson is slightly different from the formula
in the textbook. Here we use the formula given by the textbook.

Proof. Using the initial condition,

Xt = Σ_{j=−m}^{t} (Xj − Xj−1)
   = (Wt + θWt−1) + (Wt−1 + θWt−2) + ... + (W−m + θW−m−1)
   = Wt + (1 + θ)Wt−1 + (1 + θ)Wt−2 + ... + (1 + θ)W−m + θW−m−1.

As a result, Var(Xt) = (1 + θ^2 + (t + m)(1 + θ)^2) σ^2, which increases as t grows.
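The variance formula can be checked numerically: Xt is a linear combination of the Wj, so its variance is σ^2 times the sum of squared coefficients. A small sketch (with illustrative values θ = −0.4, t = 50, m = 10, σ^2 = 1, all assumed):

```r
theta <- -0.4; t <- 50; m <- 10; sigma2 <- 1
# Coefficients of W_t, W_{t-1}, ..., W_{-m}, W_{-m-1} in the expansion of X_t
coefs <- c(1, rep(1 + theta, t + m), theta)
var_direct  <- sigma2 * sum(coefs^2)
var_formula <- (1 + theta^2 + (t + m) * (1 + theta)^2) * sigma2
c(var_direct, var_formula)   # the two values agree
```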

For all k ∈ ℕ+,

Cov(Xt+k, Xt)
= Cov( Σ_{j=−m}^{t+k} (Xj − Xj−1), Σ_{j=−m}^{t} (Xj − Xj−1) )
= Cov( Σ_{j=−m}^{t} (Xj − Xj−1) + Σ_{j=t+1}^{t+k} (Xj − Xj−1), Σ_{j=−m}^{t} (Xj − Xj−1) )
= Cov( Xt + Σ_{j=t+1}^{t+k} (Xj − Xj−1), Xt )
= Var(Xt) + Cov( Σ_{j=t+1}^{t+k} (Xj − Xj−1), Xt )
= Var(Xt) + Cov( (Xt+1 − Xt) + ... + (Xt+k − Xt+k−1), Xt )
= Var(Xt) + Cov( (Wt+1 + θWt) + ... + (Wt+k + θWt+k−1), Xt )
= Var(Xt) + Cov( (Wt+1 + θWt) + ... + (Wt+k + θWt+k−1), Wt )
= Var(Xt) + θ Var(Wt)
= (1 + θ^2 + (t + m)(1 + θ)^2) σ^2 + θ σ^2
= (1 + θ + θ^2 + (t + m)(1 + θ)^2) σ^2,

where the second-to-last step uses the fact that Wt is the only white-noise term appearing in both arguments, with coefficient 1 in Xt. Therefore

Corr(Xt+k, Xt)
= Cov(Xt+k, Xt) / ( sqrt(Var(Xt+k)) sqrt(Var(Xt)) )
= (1 + θ + θ^2 + (t + m)(1 + θ)^2) / ( sqrt(1 + θ^2 + (t + k + m)(1 + θ)^2) sqrt(1 + θ^2 + (t + m)(1 + θ)^2) )
≈ sqrt((t + m)/(t + m + k))
≈ 1

for large m and moderate k.

1.3 ARI(p, d) (= ARIMA(p, d, 0))

If there is no MA part in ARIMA(p, d, q), i.e., ARIMA(p, d, 0), we call it an autoregressive integrated model, ARI(p, d).

Definition 5 (ARI(p, d)). Xt ∼ ARI(p, d) (or ARIMA(p, d, 0)) if Yt ≜ (1 − B)^d Xt ∼ ARMA(p, 0), i.e.,

(1 − ϕ1 B − ϕ2 B 2 − ... − ϕp B p )(1 − B)d Xt = Wt

where Wt ∼ W N (0, σ 2 ).

Example 1.3. Simulated ARIMA(2, 1, 0) with ϕ1 = 0.7, ϕ2 = −0.3
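Analogously to Example 1.1, such a path can be simulated with `arima.sim` (a sketch; the AR coefficients match the example above and the plotting setup is the same):

```r
library(ggplot2)
library(forecast)
set.seed(123)
# Simulate ARIMA(2,1,0): AR(2) dynamics on the first difference, no MA part
tsdata <- arima.sim(list(order = c(2, 1, 0), ar = c(0.7, -0.3)), n = 200) + 10
autoplot(tsdata) +
  geom_point() +
  labs(title = "ARIMA(2,1,0) Simulation", x = "Time", y = "Value")
```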

1.3.1 ARI(1, 1)
Definition 6 (ARI(1, 1)). Xt ∼ ARI(1, 1) if (Xt −Xt−1 )−ϕ(Xt−1 −Xt−2 ) = Wt
where Wt ∼ W N (0, σ 2 ).
Proposition 3. If Xt ∼ ARI(1, 1), then the characteristic polynomial of Xt is ϕ̃(z) = 1 − (1 + ϕ)z + ϕz^2 = (1 − z)(1 − ϕz), which has z = 1 as a root.

Proof. By the definition, Xt = (1 + ϕ)Xt−1 − ϕXt−2 + Wt, or Xt − (1 + ϕ)Xt−1 + ϕXt−2 = Wt, which implies (1 − (1 + ϕ)B + ϕB^2)Xt = Wt, or (1 − B)(1 − ϕB)Xt = Wt.

2 Model specification
Main problem: given the data, which model should we fit?
Recall that the sample covariance function is

γ̂(h) = (1/n) Σ_{t=1}^{n−|h|} (X_{t+|h|} − X̄)(X_t − X̄).

Then the sample ACF is defined by

ρ̂(h) = γ̂(h) / γ̂(0).

The following theorem provides a powerful way to conduct hypothesis tests on different models.
Theorem 1. If Xt = Σ_j ψj W_{t−j} with E|Wt|^4 < ∞, then

(ρ̂(1), ρ̂(2), ..., ρ̂(k))^T ∼ AN( (ρ(1), ρ(2), ..., ρ(k))^T, V/n )

as n → ∞, where AN stands for asymptotically normal and V is the covariance matrix. Additionally, we have

V_ij = Σ_{h=1}^{∞} (ρ(i + h) + ρ(i − h) − 2ρ(i)ρ(h)) · (ρ(j + h) + ρ(j − h) − 2ρ(j)ρ(h)),

which is called Bartlett's formula.
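To make Bartlett's formula concrete, here is a small helper (a sketch: `bartlett_V` and `rho_wn` are names introduced here, and the infinite sum is truncated at `H` terms). For white noise, where ρ(h) = 0 for h ≠ 0, the only surviving term in each diagonal entry is h = i, so V reduces to the identity matrix:

```r
bartlett_V <- function(rho, k, H = 200) {
  # rho(h): true ACF, defined for all integer h; returns the k x k matrix V
  V <- matrix(0, k, k)
  for (i in 1:k) for (j in 1:k) {
    h <- 1:H
    a <- sapply(h, function(s) rho(i + s) + rho(i - s) - 2 * rho(i) * rho(s))
    b <- sapply(h, function(s) rho(j + s) + rho(j - s) - 2 * rho(j) * rho(s))
    V[i, j] <- sum(a * b)
  }
  V
}
# White noise: rho(0) = 1, rho(h) = 0 otherwise
rho_wn <- function(h) as.numeric(h == 0)
bartlett_V(rho_wn, k = 3)   # identity matrix
```

The same helper can be reused with other true ACFs to reproduce the MA(1) and AR(1) entries derived below.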


Therefore, for any linear process {Xt}_{t=1}^{n}, the general form of the rejection region is

|ρ̂(h) − ρ(h)| > sqrt(V_hh / n) · z_{0.025},

where ρ̂(h) is the estimated ACF, ρ(h) is the true ACF under the null hypothesis, and V/n is the covariance matrix of the asymptotic normal distribution under the null hypothesis. Then we only need to calculate ρ̂(h) and V. The following are some examples.
Example 2.1 (Z-test for white noise). For a white noise process, we have

(ρ̂(1), ρ̂(2), ..., ρ̂(k))^T ∼ AN( (0, 0, ..., 0)^T, I/n ).

For a process {Xt}, we want to test H0: {Xt} is white noise. Since under H0 we have ρ̂(h) ∼ AN(0, 1/n), we can reject H0 if

|ρ̂(h)| > z_{0.025}/√n = 1.96/√n.
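This is exactly the dashed confidence band that R's `acf` plot draws by default. A sketch of carrying the test out by hand on simulated white noise:

```r
set.seed(123)
n <- 500
x <- rnorm(n)                                           # simulated white noise
rho_hat <- acf(x, lag.max = 10, plot = FALSE)$acf[-1]   # sample ACF, lags 1..10
bound <- 1.96 / sqrt(n)
# H0 (white noise) is rejected at lag h whenever |rho_hat(h)| exceeds the bound
which(abs(rho_hat) > bound)
```

Note that with 10 lags tested at the 5% level, one spurious rejection is not unusual even for genuine white noise.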
Example 2.2 (Z-test for MA(1)). For MA(1), we have Xt = Wt + θWt−1, so ρ(1) = θ/(1 + θ^2) with |ρ(1)| ≤ 1/2, and ρ(h) = 0 for |h| > 1.

Proposition 4. By Bartlett's formula, we have V11 = 1 − 3ρ^2(1) + 4ρ^4(1) and V22 = 1 + 2ρ^2(1).

Proof. Since ρ(h) = 0 when |h| > 1, Bartlett's formula gives

V11 = Σ_{h=1}^{∞} (ρ(1 + h) + ρ(1 − h) − 2ρ(1)ρ(h))^2
    = (ρ(0) − 2ρ^2(1))^2 + (ρ(−1))^2
    = 1 − 3ρ^2(1) + 4ρ^4(1)

and

V22 = Σ_{h=1}^{∞} (ρ(2 + h) + ρ(2 − h) − 2ρ(2)ρ(h))^2
    = (ρ(1))^2 + (ρ(0))^2 + (ρ(−1))^2
    = 1 + 2ρ^2(1).

Then the rejection region for H0: Xt is an MA(1) is

|ρ̂(1) − ρ(1)| > sqrt(V11/n) · z_{0.025}          for h = 1,
|ρ̂(h) − ρ(h)| = |ρ̂(h)| > sqrt(Vhh/n) · z_{0.025}  for h > 1,

where Vhh = V22 for every h > 1 (the same three ACF terms survive in each diagonal entry).
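For a concrete MA(1) null, say θ = −0.4 (an illustrative value), ρ(1), V11, and V22, and hence the rejection thresholds, can be computed directly:

```r
theta <- -0.4
rho1 <- theta / (1 + theta^2)        # ACF at lag 1; here about -0.345
V11 <- 1 - 3 * rho1^2 + 4 * rho1^4
V22 <- 1 + 2 * rho1^2
n <- 500
# Rejection thresholds for the lag-1 test and the lag-h (h > 1) tests
c(sqrt(V11 / n) * 1.96, sqrt(V22 / n) * 1.96)
```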
Example 2.3 (Z-test for AR(1)). For AR(1), we have Xt − ϕXt−1 = Wt, and ρ(h) = ϕ^|h|.
Proposition 5. By Bartlett's formula, we have

Vii = (1 + ϕ^2)(1 − ϕ^{2i}) / (1 − ϕ^2) − 2iϕ^{2i}.

When i = 1, this gives V11 = 1 − ϕ^2, and for large i we have

Vii ≈ (1 + ϕ^2) / (1 − ϕ^2).
Remark. The formula written in the lesson is slightly different from the formula
in the textbook. Here we use the formula given by the textbook.

Proof. Since ρ(h) = ϕ^|h|, Bartlett's formula gives

Vii = Σ_{h=1}^{∞} (ρ(i + h) + ρ(i − h) − 2ρ(i)ρ(h))^2
    = Σ_{h=1}^{∞} (ϕ^{i+h} + ρ(i − h) − 2ϕ^{i+h})^2
    = Σ_{h=1}^{i} (ϕ^{i−h} − ϕ^{i+h})^2 + Σ_{h=i+1}^{∞} (ϕ^{h−i} − ϕ^{i+h})^2
    = ϕ^{2i} Σ_{h=1}^{i} (ϕ^{2h} + ϕ^{−2h} − 2) + (ϕ^{2i} + ϕ^{−2i} − 2) Σ_{h=i+1}^{∞} ϕ^{2h}
    = ϕ^{2i} ( (ϕ^2 − ϕ^{2i+2})/(1 − ϕ^2) + (ϕ^{−2} − ϕ^{−2i−2})/(1 − ϕ^{−2}) − 2i ) + (ϕ^{2i} + ϕ^{−2i} − 2) · ϕ^{2i+2}/(1 − ϕ^2)
    = (1 + ϕ^2 − ϕ^{2i} − ϕ^{2i+2})/(1 − ϕ^2) − 2iϕ^{2i}
    = (1 + ϕ^2)(1 − ϕ^{2i})/(1 − ϕ^2) − 2iϕ^{2i}.

For large i, both ϕ^{2i} and 2iϕ^{2i} are approximately 0, so Vii ≈ (1 + ϕ^2)/(1 − ϕ^2) for large i.
Letting i = 1, we have V11 = (1 + ϕ^2)(1 − ϕ^2)/(1 − ϕ^2) − 2ϕ^2 = 1 − ϕ^2.
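The closed form can be checked against a direct truncation of Bartlett's sum (a sketch with an illustrative ϕ = 0.6; `Vii_direct` and `Vii_closed` are names introduced here, and the infinite sum is truncated at `H` terms):

```r
phi <- 0.6
rho <- function(h) phi^abs(h)        # AR(1) ACF
Vii_direct <- function(i, H = 500) {
  h <- 1:H
  sum((rho(i + h) + rho(i - h) - 2 * rho(i) * rho(h))^2)
}
Vii_closed <- function(i) {
  (1 + phi^2) * (1 - phi^(2 * i)) / (1 - phi^2) - 2 * i * phi^(2 * i)
}
c(Vii_direct(1), Vii_closed(1))      # both equal 1 - phi^2 = 0.64
c(Vii_direct(5), Vii_closed(5))      # the two also agree at larger lags
```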
Example 2.4 (Z-test for MA(q)). For a general MA(q) process, we have ρ(h) = 0 when |h| > q.
Proposition 6. For k > q, Bartlett's formula gives

Vkk = 1 + 2 Σ_{j=1}^{q} ρ^2(j).

Proof. Since k > q, we have ρ(k) = 0 and ρ(k + h) = 0 for all h ≥ 1, so only the ρ(k − h) terms survive:

Vkk = Σ_{h=1}^{∞} (ρ(k + h) + ρ(k − h) − 2ρ(k)ρ(h))^2
    = (ρ(q))^2 + (ρ(q − 1))^2 + ... + (ρ(1))^2 + (ρ(0))^2 + (ρ(−1))^2 + ... + (ρ(−q))^2
    = 1 + 2 Σ_{j=1}^{q} ρ^2(j).

3 Implementing an ARIMA Model in R
We use the AirPassengers data set (shipped with base R's datasets package) to give an example of implementing an ARIMA model with the forecast library.
First we do some visualization to explore the data.
# Load necessary packages
library(forecast)
library(ggplot2)
# Load data
data(AirPassengers)
ts_data <- AirPassengers
# Plot the time series
autoplot(ts_data) + xlab("Year") + ylab("Passengers")

We can observe an obvious trend in the data; the time series is not stationary.
Then, to obtain further insight, we may apply the `decompose()` function, which decomposes the data into trend, seasonality, and remainder (residuals).
# Decompose the time series
decomposed_ts <- decompose(ts_data)
# Plot the decomposed time series
autoplot(decomposed_ts)

To make the time series stationary, we transform it by differencing. Here we use first-order differencing, i.e., Yt = Xt − Xt−1.

# Difference the time series to move toward stationarity
diff_ts <- diff(ts_data)
autoplot(diff_ts) + xlab("Year") + ylab("Passengers")

Though the differenced data oscillates a lot, which may indicate high noise, its mean is close to zero, suggesting the transformed data is closer to stationary. We may therefore fit an ARIMA(p, 1, q) model to the data.
In fact, the `auto.arima` function can select a suitable ARIMA model for a given time series by searching through a range of candidate models.
# Train/test split
train_data <- window(ts_data, end = c(1958, 12))
test_data <- window(ts_data, start = c(1959, 1))
# Determine the best ARIMA model parameters
auto_arima_model <- auto.arima(train_data)
# Output the model parameter information
auto_arima_model

The output is:

Series: train_data
ARIMA(1,1,0)(0,1,0)[12]

Coefficients:
          ar1
      -0.2397
s.e.   0.0935

sigma^2 = 103.6:  log likelihood = -399.64
AIC=803.28   AICc=803.4   BIC=808.63

In the output, the first set of numbers (1, 1, 0) gives the p, d, and q parameters of the ARIMA model, respectively. Note that d = 1 coincides with our earlier observation. The second set of numbers (0, 1, 0) gives the seasonal parameters of the model. The "Coefficients" section of the output specifies ϕ1 = −0.2397 in the ARIMA model. Since the seasonal part of time series modeling has not been covered in our lectures, we do not provide further explanation here.
We can use `checkresiduals` to evaluate the model.

# Diagnostic checking of the model
checkresiduals(auto_arima_model)

We can observe from the plot that the residuals exhibit low autocorrelation and an approximately normal distribution.
Finally, we can use the model to forecast future data.
# Forecasting
forecast_result <- forecast(auto_arima_model, h = 12)

# Visualize the forecasting result
plot(forecast_result, main = "ARIMA Model Forecast for AirPassengers Data")
lines(test_data, col = "red")
legend("topleft", legend = c("Actual", "Forecast"), col = c("red", "blue"), lty = c(1, 1))

We can see that the forecasting result is satisfactory.

