
CHAPTER 4

ARIMA MODEL
(BOX – JENKINS METHODOLOGY)

KTEE418 – Forecasts in Economics and Business


4.1.1 STATIONARITY OF TIME SERIES
Stationarity: Yt is (weakly) stationary if:
• The series varies about a fixed level (no growth or decline) over time (constant mean):
E(Yt) = μ
• The variation of the series around the mean does not change over time (constant variance):
Var(Yt) = σ²
• The autocovariance between Yt and Yt−k depends only on the interval k:
Cov(Yt, Yt−k) = Cov(Ys, Ys−k) = γk
EXAMPLE OF STATIONARY SERIES

• White noise:
✓ E(ut) = μ
✓ Var(ut) = σ²
✓ Cov(ut, ut+k) = 0 for all k ≠ 0
Is a white noise process stationary?
• Random series in EVIEWS (very much like the disturbance in a regression model)
=> Generate a random series in EVIEWS and observe its properties.
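As a rough Python analogue of that EViews exercise (numpy is my substitution here, not course material), we can generate a white-noise series and check the three stationarity conditions empirically:

```python
import numpy as np

# Generate white noise and verify constant mean, constant variance,
# and (near-)zero autocovariance at every non-zero lag.
rng = np.random.default_rng(42)
u = rng.normal(loc=0.0, scale=1.0, size=10_000)  # white noise, mu=0, sigma=1

print("mean:", u.mean())      # close to 0
print("variance:", u.var())   # close to 1

def autocov(x, k):
    """Sample autocovariance of x at lag k."""
    xc = x - x.mean()
    return (xc[k:] * xc[:-k]).mean() if k > 0 else xc.var()

for k in (1, 2, 5):
    print(f"autocov at lag {k}:", autocov(u, k))  # all close to 0
```

The same properties hold for any lag k, which is exactly the stationarity definition above.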
4.1.1 STATIONARITY OF TIME SERIES
=> Time series with trends, or with seasonality, are not stationary.
4.1.1 STATIONARITY OF TIME SERIES
Which series is stationary?
NON-STATIONARY SERIES
• When any of the three conditions is not met, the series is non-stationary.
• Random Walk

Yt = Yt−1 + ut
• Is a random walk stationary?
• The random walk is an autoregressive model of order 1, AR(1).
• The coefficient on Yt−1 is 1 => unit root process.
NON-STATIONARY SERIES

• Random walk without drift

Yt = Yt−1 + ut

• Random walk with drift

Yt = β1 + Yt−1 + ut

• Random walk with drift and trend

Yt = β1 + β2Tt + Yt−1 + ut
INTEGRATED SERIES
• One way to make a non-stationary time series stationary is to compute the differences between consecutive observations – differencing.
d(Yt) = ΔYt = Yt − Yt−1
d(Yt, 2) = Δ²Yt = ΔYt − ΔYt−1 = Yt − 2Yt−1 + Yt−2
• A random walk is not stationary, but the 1st difference of a random walk is stationary. (Prove it!)
• A non-stationary series whose 1st difference is stationary is an integrated series of order 1, denoted I(1).
• A non-stationary series whose 2nd difference is stationary is an integrated series of order 2, denoted I(2).
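A minimal simulation sketch of the I(1) idea (illustrative Python, not from the slides): build a random walk from white-noise shocks, then check that differencing recovers the stationary shocks exactly.

```python
import numpy as np

# Random walk: Y_t = Y_{t-1} + u_t, which is the cumulative sum of shocks.
rng = np.random.default_rng(0)
u = rng.normal(size=5_000)
y = np.cumsum(u)      # non-stationary level
dy = np.diff(y)       # first difference: dY_t = Y_t - Y_{t-1} = u_t

# The level wanders (its spread grows with t), but the first difference
# is literally the white-noise series we started from.
print("first difference equals the shocks:", np.allclose(dy, u[1:]))
print("variance of the level over the full sample:", y.var())
print("variance of the difference:", dy.var())
```

So the random walk is I(1): one round of differencing produces a stationary series.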
1st DIFFERENCE OF RANDOM WALK
• Random walk without drift

Yt = Yt−1 + ut
ΔYt = ut

• Random walk with drift

Yt = β1 + Yt−1 + ut
ΔYt = β1 + ut

• Random walk with drift and trend

Yt = β1 + β2Tt + Yt−1 + ut
ΔYt = β1 + β2Tt + ut
4.1.2 TEST FOR STATIONARITY
Unit Root Test
(Dickey – Fuller Unit Root Test)
• For series Yt:
Yt = ρYt−1 + ut
ΔYt = Yt − Yt−1 = ρYt−1 + ut − Yt−1 = (ρ − 1)Yt−1 + ut
ΔYt = δYt−1 + ut
• If δ = 0 (ρ = 1) => the series is non-stationary
• Testing hypothesis:
H0: δ = 0 => the series is non-stationary
H1: δ < 0 => the series is stationary
=> Small p-value of the test indicates stationarity.
4.1.2 TEST FOR STATIONARITY

Perform the Unit Root Test


In EVIEWS:
Double click to open the series
→ View → Unit Root Test
4.2 AUTOCORRELATION AND
PARTIAL AUTOCORRELATION
• Autocorrelation coefficient ACF ρk: the correlation coefficient between Yt and Yt−k
✓ ρk is an unconditional correlation coefficient.
✓ It does not adjust for the correlation due to the intermediate relations of Yt with Yt−1, Yt−2, …, Yt−k+1.
✓ Autocorrelation function at lag k:
Yt = ρk·Yt−k + ut
✓ Calculating the sample k-th order autocorrelation coefficient:

rk = [ Σ_{t=k+1}^{n} (Yt − Ȳ)(Yt−k − Ȳ) ] / [ Σ_{t=1}^{n} (Yt − Ȳ)² ] = cov(Yt, Yt−k) / var(Yt)
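The sample formula above can be implemented directly; the AR(1) series here is simulated so the estimates can be checked against the theoretical ACF φ1^k (a sketch, not course code):

```python
import numpy as np

def sample_acf(y, k):
    """Sample autocorrelation r_k: lagged cross-products over the
    full-sample sum of squares, exactly as in the formula above."""
    y = np.asarray(y, dtype=float)
    yc = y - y.mean()
    return (yc[k:] * yc[:-k]).sum() / (yc ** 2).sum()

# Simulate an AR(1) with phi_1 = 0.8; its theoretical ACF is 0.8 ** k.
rng = np.random.default_rng(7)
u = rng.normal(size=2_000)
ar1 = np.empty_like(u)
ar1[0] = u[0]
for t in range(1, len(u)):
    ar1[t] = 0.8 * ar1[t - 1] + u[t]

for k in (1, 2, 3):
    print(f"r_{k} = {sample_acf(ar1, k):.3f}  (theory: {0.8 ** k:.3f})")
```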
Q-STATISTICS
• Q-statistics provides an overall test for significant
autocorrelation.
• Null hypothesis: All population autocorrelation coefficients up
to lag m are simultaneously equal to 0.
• H0: ρ1 = ρ2 = ⋯ = ρ𝑚 = 0
Q(m) = n(n + 2) · Σ_{k=1}^{m} rk² / (n − k)

• Q follows χ²(m)
• P-value(Q) < α => Reject H0 => There is non-zero
autocorrelation within the first m lags
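The Q(m) formula can be computed by hand as a check (statsmodels' acorr_ljungbox reports the same statistic); this is an illustrative sketch:

```python
import numpy as np
from scipy.stats import chi2

def ljung_box(y, m):
    """Q(m) = n(n+2) * sum_{k=1}^{m} r_k**2 / (n - k), with the
    p-value from the chi-square(m) distribution."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    yc = y - y.mean()
    denom = (yc ** 2).sum()
    q = 0.0
    for k in range(1, m + 1):
        r_k = (yc[k:] * yc[:-k]).sum() / denom
        q += r_k ** 2 / (n - k)
    q *= n * (n + 2)
    pvalue = chi2.sf(q, df=m)   # Q ~ chi-square(m) under H0
    return q, pvalue

rng = np.random.default_rng(3)
q, p = ljung_box(rng.normal(size=1_000), m=10)
print(f"Q(10) = {q:.2f}, p-value = {p:.3f}")  # compare p with alpha
```

A strongly autocorrelated series (e.g. a random walk) gives a huge Q and a p-value near zero.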
4.2 AUTOCORRELATION AND
PARTIAL AUTOCORRELATION
• Partial Autocorrelation Coefficient 𝝆𝒌𝒌 : is the
partial correlation coefficient between the series Yt and lag
of itself Yt-k
✓“Partial correlation” between two variables is the amount of
correlation between them which is not explained by their
mutual correlations with a specified set of other variables.
✓ Partial autocorrelation function at lag k:

Yt = ak1Yt−1 + ak2Yt−2 + … + akkYt−k + ut

✓ ρkk = akk is the k-th partial autocorrelation coefficient between Yt and Yt−k (the correlation after adjusting for the effects of the intervening values Yt−1, Yt−2, …, Yt−k+1).
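The regression definition of akk above translates directly into code: regress Yt on its first k lags and keep the last coefficient (a rough numpy sketch; adding an intercept to the regression is my assumption):

```python
import numpy as np

def pacf_at_lag(y, k):
    """k-th partial autocorrelation a_kk: the last coefficient from an
    OLS regression of Y_t on Y_{t-1}, ..., Y_{t-k} (plus an intercept)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Column j holds lag j+1: values y[t-(j+1)] for t = k, ..., n-1.
    X = np.column_stack([y[k - j - 1 : n - j - 1] for j in range(k)])
    X = np.column_stack([np.ones(n - k), X])
    coefs, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    return coefs[-1]   # a_kk

# For an AR(1) with phi_1 = 0.7, the PACF cuts off after lag 1.
rng = np.random.default_rng(5)
u = rng.normal(size=3_000)
y = np.empty_like(u)
y[0] = u[0]
for t in range(1, len(u)):
    y[t] = 0.7 * y[t - 1] + u[t]

print("a_11 =", round(pacf_at_lag(y, 1), 3))   # near 0.7
print("a_22 =", round(pacf_at_lag(y, 2), 3))   # near 0
```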
ACF AND PACF PLOTS
• Correlogram is the graph of the autocorrelations for
various lags of a time series.
• In EVIEWS:
• Double click to choose the series → View →
Correlogram
• This is the graph for sample autocorrelations and
partial autocorrelations of the series.
ACF AND PACF PLOTS
Date: 08/01/19  Time: 17:35
Sample: 1990Q1 2016Q4
Included observations: 91

Lag  AC      PAC     Q-Stat   Prob
 1   0.792   0.792    59.011  0.000
 2   0.858   0.619   129.02   0.000
 3   0.765   0.069   185.33   0.000
 4   0.888   0.534   262.10   0.000
 5   0.703  −0.43…   310.78   0.000
 6   0.760  −0.12…   368.29   0.000
 7   0.668  −0.01…   413.19   0.000
 8   0.777   0.226   474.76   0.000
 9   0.602  −0.13…   512.11   0.000
10   0.650  −0.13…   556.27   0.000
11   0.559  −0.05…   589.34   0.000
12   0.655   0.103   635.35   0.000
ACF AND PACF PLOTS

• Q-Stat and Prob: the Ljung-Box Q statistics at different lags and the corresponding p-values.
• Prob of Q-Stat at lag k < α => reject H0 => there is non-zero autocorrelation within the first k lags.
ANALYZING DATA PATTERNS USING
CORRELOGRAM
• Data patterns, including components such as trend and
seasonality, can be studied using autocorrelations.
ANALYZING DATA PATTERNS USING
CORRELOGRAM
Series        Autocorrelation pattern in the correlogram
Stationary    Dies down to zero after a few lags
Trend         Large and declines very slowly as the lag increases
Seasonality   Spikes at the seasonal lags (e.g., lags 4, 8, 12 for quarterly data)
Random        All autocorrelations close to zero (inside the significance bounds)
CORRELOGRAM OF
STATIONARY SERIES
CORRELOGRAM OF
SERIES WITH TREND
CORRELOGRAM OF
SERIES WITH SEASONALITY
CORRELOGRAM OF SERIES WITH
TREND AND SEASONALITY
CORRELOGRAM OF RANDOM SERIES
4.3 AR MODEL
• Auto-Regression Process
• In an autoregression model, we forecast the variable
of interest using a linear combination of past values of
the variable.
• AR(1) Model

Yt = φ0 + φ1Yt−1 + ut
➢ ut satisfies OLS's basic assumptions.
➢ φ0 is the constant level of the series.
➢ −1 < φ1 < 1 => stationary series
➢ If |φ1| > 1 => explosive series
4.3 AR MODEL
• Auto-Regressive Process of order p, AR(p):
Yt = φ0 + φ1Yt−1 + φ2Yt−2 + … + φpYt−p + ut
Yt = φ0 + Σ_{i=1}^{p} φiYt−i + ut

• Autoregressive models are appropriate for stationary time series.
• Stationarity constraints:
➢ For the AR(2) model: −1 < φ2 < 1, φ1 + φ2 < 1, φ2 − φ1 < 1
➢ When p ≥ 3, the restrictions are much more complicated.
4.3 AR MODEL
• How to determine the order p of the
autoregressive process?
=> Use the partial autocorrelation plot PACF.
• Choose the lag orders where the partial autocorrelation coefficients rise above the significance bounds in the PACF plot.
• There may be several appropriate lag orders
=> you can try AR models with different orders, then evaluate them and decide which model is adequate.
4.4 MA MODEL
• 1st-order moving average process MA(1)

Yt = μ + ut + θ1ut−1
• μ is the level (mean) of the process
• Yt depends on previous values of the errors (the error term at time t − 1 for an MA(1) process)
4.4 MA MODEL
Consider an MA(1) process: Yt = μ + ut + θ1ut−1
• E(Yt) = ?
• Var(Yt) = ?
• Cov(Yt, Yt−1) = E[(ut + θ1ut−1)(ut−1 + θ1ut−2)] = θ1σ²
=> r(Yt, Yt−1) ≠ 0 and does not depend on time t
• Cov(Yt, Yt−2) = 0
=> r(Yt, Yt−2) = 0
In general, for an MA(q) process, r(Yt, Yt−k) ≠ 0 for k ≤ q and r(Yt, Yt−k) = 0 for k > q.
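The moment results above can be verified by simulation (the parameter values θ1 = 0.6 and μ = 5 are illustrative); for an MA(1), the theoretical lag-1 autocorrelation is θ1 / (1 + θ1²) and all higher lags are zero:

```python
import numpy as np

# Simulate an MA(1): Y_t = mu + u_t + theta1 * u_{t-1}.
rng = np.random.default_rng(21)
u = rng.normal(size=20_000)
theta1 = 0.6
y = 5.0 + u[1:] + theta1 * u[:-1]

def sample_acf(x, k):
    xc = x - x.mean()
    return (xc[k:] * xc[:-k]).sum() / (xc ** 2).sum()

print("theory r_1 :", theta1 / (1 + theta1 ** 2))
print("sample r_1 :", round(sample_acf(y, 1), 3))
print("sample r_2 :", round(sample_acf(y, 2), 3))   # near 0: ACF cuts off
```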
4.4 MA MODEL
• q-th order Moving Average Process MA(q)

Yt = μ + ut + θ1ut−1 + θ2ut−2 + … + θqut−q
Yt = μ + ut + Σ_{j=1}^{q} θjut−j

• "Moving average" refers to the fact that the deviation of the response from its mean is a linear combination of current and past errors:
Yt − μ = ut + θ1ut−1 + θ2ut−2 + … + θqut−q
Yt+1 − μ = ut+1 + θ1ut + θ2ut−1 + … + θqut−q+1
4.4 MA MODEL
• A stationary AR(p) process can be written as an MA(∞)
process
• For example: AR(1) with −1 < φ1 < 1
Yt = φ0 + φ1Yt−1 + ut
   = φ0 + φ1(φ1Yt−2 + ut−1) + ut = φ0 + φ1²Yt−2 + φ1ut−1 + ut
   = φ0 + φ1²(φ1Yt−3 + ut−2) + φ1ut−1 + ut
   …
   = φ0 + φ1^q·Yt−q + Σ_{i=1}^{q−1} φ1^i·ut−i + ut

• As q → ∞, φ1^q → 0 => Yt = φ0 + ut + Σ_{i=1}^{∞} φ1^i·ut−i = MA(∞)
4.4 MA MODEL
• Any MA(q) process is stationary (∀q)
• If a MA process can be written as an AR process => the
MA process is invertible.
• For example:
Yt = ut + 0.5ut−1
Yt = ut + 0.5(Yt−1 − 0.5ut−2)
Yt = ut + 0.5Yt−1 − 0.5²Yt−2 + … = ut + Σ_{j=1}^{∞} (−1)^{j+1}(0.5)^j·Yt−j
Yt is an invertible process.
• The invertibility constraints are similar to the stationarity
constraints.
4.4 MA MODEL
• How to determine q?
=> Use the autocorrelation plot ACF.
• We can select the order q for the MA(q) model from the ACF if this plot has a sharp cut-off after lag q.
• If a gradually (geometrically) declining ACF or a sinusoidal ACF is observed, then q = 0.
4.5 ARMA MODEL
• Most time series are mixed processes containing both AR and MA terms.
• An ARMA(p,q) process:

Yt = C + φ1Yt−1 + φ2Yt−2 + … + φpYt−p + θ1ut−1 + θ2ut−2 + … + θqut−q + ut
Yt = C + Σ_{i=1}^{p} φiYt−i + Σ_{j=1}^{q} θjut−j + ut

• Orders p and q in an ARMA model are determined from the patterns of the sample autocorrelations and partial autocorrelations.
CHOOSING THE LAG ORDER
Model        ACF                    PACF
MA(q)        Cuts off after lag q   Dies down gradually
AR(p)        Dies down gradually    Cuts off after lag p
ARMA(p, q)   Dies down gradually    Dies down gradually
AR MODEL
MA MODEL
ARMA MODEL
4.6 ARIMA MODEL
I(d): integrated series of order d (the series is stationary after taking the d-th difference)
Example of ARIMA(p,1,q):

ΔYt = C + φ1ΔYt−1 + φ2ΔYt−2 + … + φpΔYt−p + θ1ut−1 + θ2ut−2 + … + θqut−q + ut

• AR, MA, and ARMA models are appropriate for stationary time series.
• If your series is not stationary, you can try to adjust the original series by differencing it until it becomes stationary, then fit an ARMA model to the differenced series.
4.7 FORECASTING USING ARIMA MODELS
(BOX – JENKINS APPROACH)
1. Check for stationarity of the series. If the series is
not stationary, transform it into a stationary series.
2. Determine p, q using ACF and PACF plots.
3. Estimate the model
4. Check the model's adequacy:
1. Random residuals
2. Is the term with the highest lag order significant? If not, reduce the lag orders (p, q).
3. Quality of the forecast (MAPE ≤ 5%)
5. Forecasting with the model
ARIMA MODEL SELECTION CRITERIA
1. The residual of the model is a random series (use
correlogram or tests for autocorrelation and
heteroskedasticity)
2. AIC/SBC/HQC criteria are as small as possible

AIC = (RSS/n)·e^(2k/n),  SBC = (RSS/n)·n^(k/n),  HQC = (RSS/n)·(ln n)^(2k/n)
ARIMA MODEL SELECTION CRITERIA

3. Forecasting error evaluation criteria are as small as


possible (RMSE, MAPE)
4. Plot the graph to compare between the actual series and
the forecasted series:
➢Observe the turning points
➢Compare the forecasted values and the actual values for
the most recent periods
5. Are the coefficients statistically significant? Which model
has more significant coefficients?
4.7 FORECASTING USING ARIMA MODELS

Use workfile KTE418, page arima


• Step 1
Double click gdp → view → unit root test →
conclusion
• Step 2
Double click gdp → view → correlogram →
choose p and q based on ACF and PACF plots.
4.7 FORECASTING USING ARIMA MODELS

• Step 3: Estimate the model


ls d(gdp) c ar(1) ar(8) ar(12) ma(1) ma(8) ma(12)
ls d(gdp) c ar(1) ma(1)
....
You can try some combinations of the AR and
MA terms.
4.7 FORECASTING USING ARIMA MODELS
• Step 4: Model check
• Random residuals:
➢ View → Residual Diagnostics → Correlogram – Q-statistics → OK.
➢ If none of the residual autocorrelation coefficients is significant => random residuals
• Evaluate forecast quality: MAPE & RMSE
➢In-sample forecast: From the estimation output →
forecast
➢Forecast sample: If your series is long, then you can
choose one part of the series to make forecasts and
evaluate (about 20%).
4.7 FORECASTING USING ARIMA MODELS

• Step 5: Out-of-sample forecast


• Expand the range and sample of the series (if necessary): add the periods to be forecasted.
• Forecast → Fill in Forecast range: period to be
forecasted → OK
SEASONAL ARIMA MODEL
• Check for seasonality
Use workfile page sarima → view → graph →
seasonal → ok (or use the Kruskal–Wallis test).
If the original series has seasonal component:
Option 1: Separate the seasonal component
(using moving average technique) and use
ARIMA model to forecast for the adjusted series.
Option 2: Use S-ARIMA with MA, SMA and/or
AR, SAR terms.
SEASONAL ARIMA MODEL
• ARIMA(p,d,q)×(P,D,Q)S
• Where:
➢ p = number of non-seasonal autoregressive (AR) terms
➢ d = number of non-seasonal differences
➢ q = number of non-seasonal moving average (MA) terms
➢ P = number of seasonal autoregressive (SAR) terms
➢ D = number of seasonal differences
➢ Q = number of seasonal moving average (SMA) terms
• Example (S = 4, quarterly data):
➢ ARIMA(1,1,2)×(1,0,0)4

ΔYt = C + φ1ΔYt−1 + θ1ut−1 + θ2ut−2 + Φ1ΔYt−4 − φ1Φ1ΔYt−5 + ut

➢ ARIMA(1,0,2)×(0,0,1)4

Yt = C + φ1Yt−1 + θ1ut−1 + θ2ut−2 + Θ1ut−4 + θ1Θ1ut−5 + θ2Θ1ut−6 + ut