Lecture 14
ARIMA - Identification, Estimation & Seasonalities
ARMA Process
We defined the ARMA(p,q) model:
$$\phi(L)(y_t - \mu) = \theta(L)\varepsilon_t$$
Let $x_t = y_t - \mu$. Then,
$$\phi(L)\,x_t = \theta(L)\varepsilon_t$$
Autocovariance Function
We define the autocovariance function, $\gamma(t-j)$, as: $\gamma(t-j) = E[y_t\, y_{t-j}]$.
For an AR(p) process, WLOG with $\mu = 0$ (or demeaned $y_t$), we get:
$$\gamma(t-j) = E[y_t\, y_{t-j}] = E[\phi_1 y_{t-1} y_{t-j} + \phi_2 y_{t-2} y_{t-j} + \cdots + \phi_p y_{t-p} y_{t-j} + \varepsilon_t y_{t-j}]$$
$$= \phi_1 \gamma(j-1) + \phi_2 \gamma(j-2) + \cdots + \phi_p \gamma(j-p)$$
Autocovariance Function
Stacking these equations for $j = 1, \ldots, p$ gives the Yule-Walker system:
$$\begin{bmatrix} \gamma(0) & \gamma(1) & \cdots & \gamma(p-1) \\ \gamma(1) & \gamma(0) & \cdots & \gamma(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ \gamma(p-1) & \gamma(p-2) & \cdots & \gamma(0) \end{bmatrix} \begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \end{bmatrix} = \begin{bmatrix} \gamma(1) \\ \gamma(2) \\ \vdots \\ \gamma(p) \end{bmatrix}$$
Example: ARMA(1,1) process, $y_t = \phi y_{t-1} + \varepsilon_t + \theta\varepsilon_{t-1}$:
$$\gamma_0 = \phi\gamma_1 + \underbrace{E[\varepsilon_t\, y_t]}_{=\sigma^2} + \theta\underbrace{E[\varepsilon_{t-1}\, y_t]}_{=(\phi+\theta)\sigma^2} = \phi\gamma_1 + \sigma^2 + \theta(\phi+\theta)\sigma^2$$
$$\gamma_1 = \phi\gamma_0 + \underbrace{E[\varepsilon_t\, y_{t-1}]}_{=0} + \theta\underbrace{E[\varepsilon_{t-1}\, y_{t-1}]}_{=\sigma^2} = \phi\gamma_0 + \theta\sigma^2$$
Two equations for $\gamma_0$ and $\gamma_1$:
$$\gamma_1 = \phi\gamma_0 + \theta\sigma^2, \qquad \gamma_0 = \frac{1 + 2\phi\theta + \theta^2}{1 - \phi^2}\,\sigma^2$$
In general:
$$\gamma_k = \phi\gamma_{k-1} \;\Rightarrow\; \rho_k = \phi\rho_{k-1} = \phi^{k-1}\rho_1, \quad k > 1.$$
ACF
The autocorrelation function (ACF) is:
$$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\text{covariance at lag } k}{\text{variance}}$$
Dividing the Yule-Walker equations by $\gamma(0)$ gives the same system in terms of autocorrelations:
$$\begin{bmatrix} 1 & \rho(1) & \cdots & \rho(p-1) \\ \rho(1) & 1 & \cdots & \rho(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ \rho(p-1) & \rho(p-2) & \cdots & 1 \end{bmatrix} \begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \end{bmatrix} = \begin{bmatrix} \rho(1) \\ \rho(2) \\ \vdots \\ \rho(p) \end{bmatrix}$$
The sample ACF:
$$\hat{\rho}_k = \frac{\sum_{t=1}^{T-k}(Y_t - \bar{Y})(Y_{t+k} - \bar{Y})}{\sum_{t=1}^{T}(Y_t - \bar{Y})^2}$$
In general, for an MA(q) process:
$$\gamma(k) = \sigma^2 \sum_{j=k}^{q} \theta_j\,\theta_{j-k}, \quad k \le q \;(\text{with } \theta_0 = 1); \qquad \gamma(k) = 0 \text{ otherwise.}$$
Then,
$$\rho(k) = \frac{\sum_{j=k}^{q} \theta_j\,\theta_{j-k}}{1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2}, \quad k \le q; \qquad \rho(k) = 0 \text{ otherwise.}$$
[SAS output: Autocorrelation Check for White Noise; the p-values and individual autocorrelations were lost in extraction.]

To Lag   Chi-Square   DF   Pr > ChiSq
   6        38.29      6
  12        46.89     12
  18        75.07     18
  24        89.77     24
ACF: Correlogram
The sample correlogram is the plot of the ACF against k.
As the ACF lies between -1 and +1, the correlogram also lies between
these values.
Example: Correlogram for US Monthly Returns (1800-2013).
The Box-Pierce Q statistic and the Ljung-Box (LB) refinement test the joint significance of the first m autocorrelations:
$$Q = T \sum_{k=1}^{m} \hat{\rho}_k^2, \qquad LB = T(T+2) \sum_{k=1}^{m} \frac{\hat{\rho}_k^2}{T-k}$$
Under the null of no autocorrelation, both are asymptotically $\chi^2_m$.
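To make the statistics concrete, here is a minimal numpy sketch (the function names and the simulated white-noise series are illustrative; statsmodels.stats.diagnostic.acorr_ljungbox offers a packaged implementation):

```python
import numpy as np

def sample_acf(y, nlags):
    """Sample autocorrelations rho_hat_1, ..., rho_hat_nlags."""
    yd = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.sum(yd**2)
    return np.array([np.sum(yd[k:] * yd[:-k]) / denom for k in range(1, nlags + 1)])

def box_pierce_ljung_box(y, m):
    """Return (Q, LB) computed over lags 1..m."""
    T = len(y)
    rho = sample_acf(y, m)
    Q = T * np.sum(rho**2)
    LB = T * (T + 2) * np.sum(rho**2 / (T - np.arange(1, m + 1)))
    return Q, LB

rng = np.random.default_rng(0)
y = rng.standard_normal(500)            # white noise: Q, LB ~ chi2(m)
print(box_pierce_ljung_box(y, 12))
```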
Partial ACF
The Partial Autocorrelation Function (PACF) is similar to the ACF. It measures the correlation between observations that are h time periods apart, after controlling for correlations at intermediate lags.
Definition: The PACF of a stationary time series {y_t} is
$$\phi_{11} = \text{Corr}(y_t, y_{t-1}) = \rho(1)$$
$$\phi_{hh} = \text{Corr}\big(y_t - E[y_t \mid y_{t-1}, \ldots, y_{t-h+1}],\; y_{t-h} - E[y_{t-h} \mid y_{t-1}, \ldots, y_{t-h+1}]\big), \quad h = 2, 3, \ldots$$
This removes the linear effects of the intermediate lags $y_{t-1}, y_{t-2}, \ldots, y_{t-h+1}$.
The PACF $\phi_{hh}$ is also the last coefficient in the best linear prediction of $y_t$ given $y_{t-1}, y_{t-2}, \ldots, y_{t-h}$:
$$\hat{y}_t = \phi_{h1}\, y_{t-1} + \phi_{h2}\, y_{t-2} + \cdots + \phi_{hh}\, y_{t-h},$$
where $\boldsymbol{\phi}_h = (\phi_{h1}, \phi_{h2}, \ldots, \phi_{hh})$ solves the Yule-Walker equations of order h.
Partial ACF
Example: AR(p) process:
$$y_t = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t$$
$$E[y_t \mid F_{t-1}] = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p}$$
Then,
$$\phi_{hh} = \phi_h \text{ if } 1 \le h \le p; \qquad \phi_{hh} = 0 \text{ otherwise.}$$
That is, the PACF of an AR(p) cuts off after lag p.
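To see the cut-off numerically, a sketch that estimates each $\hat{\phi}_{hh}$ as the last coefficient of an OLS regression of $y_t$ on its first h lags - one standard route; the helper name is illustrative, and library PACFs (e.g., statsmodels) use refinements such as Yule-Walker or Burg:

```python
import numpy as np

def pacf_ols(y, max_lag):
    """phi_hat_hh = last OLS coefficient of y_t on (1, y_{t-1}, ..., y_{t-h})."""
    y = np.asarray(y, dtype=float)
    out = []
    for h in range(1, max_lag + 1):
        # Row t of the regressor matrix: (1, y_{t-1}, ..., y_{t-h}).
        X = np.column_stack([np.ones(len(y) - h)] +
                            [y[h - j:len(y) - j] for j in range(1, h + 1)])
        beta, *_ = np.linalg.lstsq(X, y[h:], rcond=None)
        out.append(beta[-1])            # last coefficient = phi_hat_hh
    return np.array(out)

# AR(2) simulation: the PACF should be near zero beyond lag 2.
rng = np.random.default_rng(1)
e = rng.standard_normal(2000)
y = np.zeros(2000)
for t in range(2, 2000):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + e[t]
print(np.round(pacf_ols(y[200:], 5), 3))
```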
Inverse ACF
The Inverse Autocorrelation Function (IACF) of the ARMA process
$$\phi(L)\, y_t = \theta(L)\,\varepsilon_t$$
is defined to be (assuming invertibility) the ACF of the inverse (or dual) process
$$\theta(L)\, y_t = \phi(L)\,\varepsilon_t$$
The IACF has the same property as the PACF: an AR(p) is characterized by an IACF that is nonzero at lag p but zero for larger lags.
The IACF can also be used to detect over-differencing. If the data come from a nonstationary or nearly nonstationary model, the IACF has the characteristics of a noninvertible moving average.
Consider the deterministic-trend model:
$$y_t = \mu + \beta t + \varepsilon_t, \qquad \Delta y_t = y_t - y_{t-1} = \beta[t - (t-1)] + \varepsilon_t - \varepsilon_{t-1} = \beta + \varepsilon_t - \varepsilon_{t-1}$$
$$E(y_t) = \mu + \beta t$$
The sequence {y_t} will exhibit only temporary departures from the trend line $\mu + \beta t$. This type of model is called a trend stationary (TS) model.
If a series has a deterministic time trend, then we simply regress $y_t$ on an intercept and a time trend (t = 1, 2, ..., T) and save the residuals. The residuals are the detrended $y_t$ series.
If the trend is stochastic, we do not necessarily get a stationary series.
For an AR(p) with a polynomial trend:
$$y_t = \mu + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \delta_1 t + \delta_2 t^2 + \cdots + \delta_k t^k + \varepsilon_t$$
Estimation:
- OLS.
- Frisch-Waugh method: first detrend $y_t$ and get the residuals; second, use the residuals to estimate the AR(p) model (a sketch follows below).
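A sketch of the two-step route on simulated trend-stationary data (all names and parameter values are illustrative):

```python
import numpy as np

def detrend_then_ar(y, p, trend_degree=1):
    """Step 1: OLS of y on a polynomial trend; keep residuals.
       Step 2: OLS AR(p) on the detrended series."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    t = np.arange(1, T + 1)
    # Step 1: regress y_t on (1, t, ..., t^k) and save the residuals.
    X = np.column_stack([t**j for j in range(trend_degree + 1)])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    # Step 2: AR(p) by OLS on the (mean-zero) detrended residuals.
    Z = np.column_stack([resid[p - j:T - j] for j in range(1, p + 1)])
    phi, *_ = np.linalg.lstsq(Z, resid[p:], rcond=None)
    return b, phi

# TS example: y_t = 1 + 0.05 t + u_t, with u_t = 0.6 u_{t-1} + e_t.
rng = np.random.default_rng(2)
T = 1000
u = np.zeros(T)
e = rng.standard_normal(T)
for s in range(1, T):
    u[s] = 0.6 * u[s - 1] + e[s]
y = 1.0 + 0.05 * np.arange(1, T + 1) + u
trend_coefs, ar_coefs = detrend_then_ar(y, p=1)
print(np.round(trend_coefs, 3), np.round(ar_coefs, 3))   # ~ (1, 0.05), ~0.6
```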
$$y_t = c + y_{t-1} + \varepsilon_t = c + (c + y_{t-2} + \varepsilon_{t-1}) + \varepsilon_t = \cdots = \underbrace{tc}_{\text{deterministic trend}} + \; y_0 + \sum_{i=1}^{t}\varepsilon_i$$
This is what we call a random walk with drift. The series grows with t. Each $\varepsilon_t$ shock represents a shift in the intercept. All values of $\{\varepsilon_t\}$ enter with a coefficient of 1, so each shock never vanishes (it is permanent).
ARIMA(p,d,q) Models
For p, d, q ≥ 0, we say that a time series {y_t} is an ARIMA(p,d,q) process if $w_t = \Delta^d y_t = (1-L)^d y_t$ is ARMA(p,q). That is,
$$\phi(L)(1-L)^d\, y_t = \theta(L)\,\varepsilon_t$$
Applying the (1-L) operator to a time series is called differencing.
Notation: If $y_t$ is non-stationary, but $\Delta^d y_t$ is stationary, then $y_t$ is integrated of order d, or I(d). A time series with a unit root is I(1). A stationary time series is I(0).
Examples:
Example 1: Random walk: $y_t = y_{t-1} + \varepsilon_t$.
$y_t$ is non-stationary, but
$$(1-L)\,y_t = \varepsilon_t \quad\Rightarrow\quad \text{white noise!}$$
Now, $y_t$ ~ ARIMA(0,1,0).
ARIMA(p,d,q) Models
Example 2: AR(1) with time trend: $y_t = \mu + \beta t + \phi y_{t-1} + \varepsilon_t$.
$y_t$ is non-stationary, but
$$w_t = (1-L)\,y_t = \beta + \phi\, w_{t-1} + \varepsilon_t - \varepsilon_{t-1}$$
Now, $y_t$ ~ ARIMA(1,1,1).
We call both processes first-difference stationary.
Note:
- Example 1: Differencing a series with a unit root in the AR part of the model reduces the AR order (see the simulation sketch below).
- Example 2: Differencing can introduce an extra MA structure. We introduced non-invertibility (the MA polynomial has a unit root). This happens when we difference a TS series. Detrending is used in these cases.
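A small simulation sketch of Example 1 (values illustrative): differencing a random walk produces a serially uncorrelated series.

```python
import numpy as np

rng = np.random.default_rng(3)
eps = rng.standard_normal(1000)
y = np.cumsum(eps)                  # random walk: y_t = y_{t-1} + eps_t
dy = np.diff(y)                     # (1 - L) y_t = eps_t

def rho1(x):
    """First-lag sample autocorrelation."""
    xd = x - x.mean()
    return np.sum(xd[1:] * xd[:-1]) / np.sum(xd**2)

# Near 1 for the level, near 0 after differencing.
print(round(rho1(y), 3), round(rho1(dy), 3))
```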
ARIMA(p,d,q) Models
In practice:
- A root near 1 of the AR polynomial suggests differencing.
- A root near 1 of the MA polynomial signals over-differencing.
[SAS output: autocorrelation check. The ACF starts near 1 and decays very slowly - the signature of a (near-)nonstationary series.]

To Lag   Chi-Square   DF   Pr > ChiSq   Autocorrelations
   6       9999.99     6    <.0001      0.995  0.989  0.984  0.979  0.974  0.969
  12       9999.99    12    <.0001      0.963  0.957  0.952  0.947  0.943  0.940
  18       9999.99    18    <.0001      0.937  0.934  0.931  0.929  0.927  0.924
  24       9999.99    24    <.0001      0.920  0.917  0.914  0.911  0.907  0.903
$$\Delta y_t = y_t - y_{t-1} = \mu + \varepsilon_t, \qquad \varepsilon_t \sim WN(0, \sigma^2)$$
It can also be written as
$$y_t = y_0 + \underbrace{\mu t}_{\text{deterministic trend}} + \underbrace{\sum_{i=1}^{t}\varepsilon_i}_{\text{stochastic trend}}$$
$$E_t(y_{t+s}) = y_t + \mu s \;\Rightarrow\; \text{the forecast function is not flat.}$$
Model Selection: Information Criteria
The Gaussian log-likelihood of an ARMA(p,q) model (assuming $\varepsilon_t \sim$ i.i.d. $N(0, \sigma^2)$):
$$\ln L = -\frac{T}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\underbrace{S(\phi, \mu, \theta)}_{\text{residual SS}}$$
Evaluated at the MLE $\hat{\sigma}^2 = S/T$, the concentrated log-likelihood is
$$\ln L = -\frac{T}{2}\ln\hat{\sigma}^2 - \underbrace{\frac{T}{2}(1 + \ln 2\pi)}_{\text{constant}}$$
Information criteria (M = number of estimated parameters):
$$AIC = T\ln\hat{\sigma}^2 + 2M$$
$$BIC = T\ln\hat{\sigma}^2 + M\ln T$$
$$HQC = T\ln\hat{\sigma}^2 + 2M\ln(\ln T)$$
A small-sample correction of AIC (AICc) adds a penalty term:
$$AICc = T\ln\hat{\sigma}^2 + 2M + \frac{2M(M+1)}{T - M - 1}$$
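A sketch computing the criteria from a model's residuals (function and variable names are illustrative):

```python
import numpy as np

def info_criteria(resid, M):
    """AIC, BIC, HQC, and AICc from residuals of a model with M parameters."""
    T = len(resid)
    sigma2_hat = np.sum(np.asarray(resid)**2) / T   # MLE of sigma^2
    aic = T * np.log(sigma2_hat) + 2 * M
    bic = T * np.log(sigma2_hat) + M * np.log(T)
    hqc = T * np.log(sigma2_hat) + 2 * M * np.log(np.log(T))
    aicc = aic + 2 * M * (M + 1) / (T - M - 1)
    return {"AIC": aic, "BIC": bic, "HQC": hqc, "AICc": aicc}

rng = np.random.default_rng(4)
print(info_criteria(rng.standard_normal(300), M=3))
```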
[Table: information-criterion values for ARMA(p,q) models, rows AR 0-AR 5 by columns MA 0-MA 5; most entries were lost in extraction. The smallest recovered value, -6.19511, occurs in the AR 1 row.]
Estimation of ARMA Models
$$\phi(L)(y_t - \mu) = \theta(L)\,\varepsilon_t$$
There are several ways to estimate an ARMA(p,q) model:
1) MLE. Assume a distribution, usually normal, and do ML (see the sketch below).
2) Yule-Walker for ARMA(p,q). Method of moments. Not efficient.
3) Innovations algorithm for MA(q).
4) Hannan-Rissanen algorithm for ARMA(p,q).
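For route 1) in practice, a sketch using statsmodels' ARIMA on simulated data (the class maximizes a Gaussian likelihood via a state-space representation; data and parameter values are illustrative):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Simulate an ARMA(1,1): y_t = 0.7 y_{t-1} + eps_t + 0.4 eps_{t-1}
rng = np.random.default_rng(5)
T = 2000
eps = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.7 * y[t - 1] + eps[t] + 0.4 * eps[t - 1]

res = ARIMA(y, order=(1, 0, 1)).fit()   # Gaussian MLE
print(res.params)                        # const, ar.L1, ma.L1, sigma2
```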
With normality, the joint density of the sample is
$$f(y_1, \ldots, y_T;\, \phi, \mu, \theta, \sigma^2) = (2\pi\sigma^2)^{-T/2}\exp\left(-\frac{1}{2\sigma^2}\sum_{t=1}^{T}\varepsilon_t^2\right)$$
Example: MLE for an AR(1) process:
$$y_t = \phi y_{t-1} + \varepsilon_t, \qquad \varepsilon_t \overset{i.i.d.}{\sim} N(0, \sigma^2)$$
The joint density of the errors is
$$f(\varepsilon_1, \ldots, \varepsilon_n) = (2\pi\sigma^2)^{-n/2}\exp\left(-\frac{1}{2\sigma^2}\sum_{t=1}^{n}\varepsilon_t^2\right)$$
- Solve for $\varepsilon_t$:
$$Y_1 = \phi Y_0 + \varepsilon_1 \quad (\text{let's take } Y_0 = 0)$$
$$Y_2 = \phi Y_1 + \varepsilon_2 \;\Rightarrow\; \varepsilon_2 = Y_2 - \phi Y_1$$
$$Y_3 = \phi Y_2 + \varepsilon_3 \;\Rightarrow\; \varepsilon_3 = Y_3 - \phi Y_2$$
$$\vdots$$
$$Y_n = \phi Y_{n-1} + \varepsilon_n \;\Rightarrow\; \varepsilon_n = Y_n - \phi Y_{n-1}$$
In matrix form:
$$\begin{bmatrix}\varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_n\end{bmatrix} = \begin{bmatrix} -\phi & 1 & 0 & \cdots & 0 \\ 0 & -\phi & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -\phi & 1 \end{bmatrix}\begin{bmatrix}Y_1 \\ Y_2 \\ \vdots \\ Y_n\end{bmatrix}$$
Holding $Y_1$ fixed, the Jacobian of $(\varepsilon_2, \ldots, \varepsilon_n)$ with respect to $(Y_2, \ldots, Y_n)$ is triangular with ones on the diagonal, so $|J| = 1$. Hence
$$f(Y_2, \ldots, Y_n \mid Y_1) = f(\varepsilon_2, \ldots, \varepsilon_n)\,|J| = f(\varepsilon_2, \ldots, \varepsilon_n)$$
For the exact likelihood, use the stationary distribution of the first observation, $Y_1 \sim N\!\left(0,\; \gamma_0 = \frac{\sigma^2}{1-\phi^2}\right)$:
$$f(Y_1) = \frac{1}{\sqrt{2\pi\gamma_0}}\, e^{-\frac{(Y_1 - 0)^2}{2\gamma_0}}$$
so that
$$f(Y_1, \ldots, Y_n) = f(Y_1)\,(2\pi\sigma^2)^{-(n-1)/2}\exp\left(-\frac{1}{2\sigma^2}\sum_{t=2}^{n}(Y_t - \phi Y_{t-1})^2\right)$$
Collecting terms, the exact likelihood is
$$L(\phi, \sigma^2) = (1-\phi^2)^{1/2}\,(2\pi\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}\left[(1-\phi^2)Y_1^2 + \sum_{t=2}^{n}(Y_t - \phi Y_{t-1})^2\right]\right\}$$
and the log-likelihood is
$$\ln L(\phi, \sigma^2) = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\sigma^2 + \frac{1}{2}\ln(1-\phi^2) - \frac{1}{2\sigma^2}\underbrace{\left[(1-\phi^2)Y_1^2 + \sum_{t=2}^{n}(Y_t - \phi Y_{t-1})^2\right]}_{S^*(\phi)}$$
where $S^*(\phi)$ is the unconditional sum of squares and $S(\phi) = \sum_{t=2}^{n}(Y_t - \phi Y_{t-1})^2$ is the conditional sum of squares.
The MLEs solve the first-order conditions:
$$\frac{\partial \ln L(\phi, \sigma^2)}{\partial \phi} = 0, \qquad \frac{\partial \ln L(\phi, \sigma^2)}{\partial \sigma^2} = 0$$
Maximizing $L(\phi, \sigma^2)$ amounts to minimizing $S^*(\phi)$. If instead we condition on $Y_1$ (dropping the $(1-\phi^2)$ terms), then $\max L(\phi, \sigma^2) = \min S(\phi)$: conditional (least squares) estimation. A sketch of the exact log-likelihood follows below.
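A minimal sketch of the exact AR(1) log-likelihood above, maximized numerically (function names illustrative; a zero-mean series is assumed):

```python
import numpy as np
from scipy.optimize import minimize

def ar1_exact_loglik(params, y):
    """Exact Gaussian log-likelihood of a zero-mean AR(1): params = (phi, sigma2)."""
    phi, sigma2 = params
    if abs(phi) >= 1 or sigma2 <= 0:
        return -np.inf                 # outside the stationary region
    n = len(y)
    s_star = (1 - phi**2) * y[0]**2 + np.sum((y[1:] - phi * y[:-1])**2)
    return (-0.5 * n * np.log(2 * np.pi) - 0.5 * n * np.log(sigma2)
            + 0.5 * np.log(1 - phi**2) - s_star / (2 * sigma2))

# Simulate an AR(1) with phi = 0.8 and maximize the likelihood.
rng = np.random.default_rng(6)
n = 500
y = np.zeros(n)
e = rng.standard_normal(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + e[t]

res = minimize(lambda p: -ar1_exact_loglik(p, y), x0=[0.5, 1.0],
               method="Nelder-Mead")
print(np.round(res.x, 3))              # phi_hat, sigma2_hat
```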
Yule-Walker Estimation
Method of moments: replace population moments by sample moments. Estimate $\mu$ and $\gamma_k$ by
$$\bar{Y} = \frac{1}{T}\sum_{t=1}^{T} Y_t$$
$$\hat{\gamma}_k = \frac{1}{T}\sum_{t=k+1}^{T}(Y_t - \bar{Y})(Y_{t-k} - \bar{Y}), \qquad \hat{\rho}_k = \hat{\gamma}_k/\hat{\gamma}_0$$
Then solve the sample Yule-Walker system:
$$\begin{bmatrix} \hat{\gamma}(0) & \hat{\gamma}(1) & \cdots & \hat{\gamma}(p-1) \\ \hat{\gamma}(1) & \hat{\gamma}(0) & \cdots & \hat{\gamma}(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ \hat{\gamma}(p-1) & \hat{\gamma}(p-2) & \cdots & \hat{\gamma}(0) \end{bmatrix}\begin{bmatrix}\hat{\phi}_1 \\ \hat{\phi}_2 \\ \vdots \\ \hat{\phi}_p\end{bmatrix} = \begin{bmatrix}\hat{\gamma}(1) \\ \hat{\gamma}(2) \\ \vdots \\ \hat{\gamma}(p)\end{bmatrix}$$
$$\Rightarrow \hat{\boldsymbol{\phi}} = \hat{\Gamma}_p^{-1}\hat{\boldsymbol{\gamma}}_p, \qquad \hat{\sigma}^2 = \hat{\gamma}_0 - \hat{\boldsymbol{\phi}}'\hat{\boldsymbol{\gamma}}_p$$
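A sketch of the sample Yule-Walker estimator (names illustrative; statsmodels.regression.linear_model.yule_walker is a packaged alternative):

```python
import numpy as np

def yule_walker(y, p):
    """Solve Gamma_hat phi_hat = gamma_hat; return phi_hat and sigma2_hat."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    yd = y - y.mean()
    gamma = np.array([np.sum(yd[k:] * yd[:T - k]) / T for k in range(p + 1)])
    Gamma = np.array([[gamma[abs(i - j)] for j in range(p)] for i in range(p)])
    phi = np.linalg.solve(Gamma, gamma[1:])
    sigma2 = gamma[0] - phi @ gamma[1:]
    return phi, sigma2

# AR(2) with phi = (0.5, 0.3): the estimates should be close.
rng = np.random.default_rng(7)
n = 5000
y = np.zeros(n)
e = rng.standard_normal(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + e[t]
print(yule_walker(y, 2))   # phi_hat ~ (0.5, 0.3), sigma2_hat ~ 1
```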
Asymptotics: If $\{Y_t\}$ is an AR(p) process, then
$$\sqrt{T}\,(\hat{\boldsymbol{\phi}} - \boldsymbol{\phi}) \overset{d}{\to} N\!\left(\mathbf{0},\; \sigma^2\,\Gamma_p^{-1}\right)$$
and, for the sample PACF,
$$\sqrt{T}\,\hat{\phi}_{kk} \overset{d}{\to} N(0, 1) \quad \text{for } k > p.$$
Thus, we can use the sample PACF to test for the AR order, and we can calculate approximate C.I.s for $\phi$.
$$\hat{\boldsymbol{\phi}} \overset{approx.}{\sim} N\!\left(\boldsymbol{\phi},\; \hat{\sigma}^2\,\hat{\Gamma}_p^{-1}/T\right)$$
A 100(1-α)% approximate C.I. for $\phi_j$ is
$$\hat{\phi}_j \pm z_{\alpha/2}\left(\frac{\hat{\sigma}^2\,[\hat{\Gamma}_p^{-1}]_{jj}}{T}\right)^{1/2}$$
Example: AR(1): $y_t = \phi y_{t-1} + \varepsilon_t$.
It is known that $\rho_1 = \phi$. Then, the MME of $\phi$ is
$$\hat{\phi} = \hat{\rho}_1 = \frac{\sum_{t=2}^{T}(Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{T}(Y_t - \bar{Y})^2}$$
Also, $\sigma^2$ is unknown. Since $\gamma_0 = \frac{\sigma^2}{1-\phi^2}$:
$$\hat{\sigma}^2 = \hat{\gamma}_0\,(1 - \hat{\phi}^2)$$
Example: MA(1): $y_t = \varepsilon_t + \theta\varepsilon_{t-1}$. Here
$$\rho_1 = \frac{\theta}{1+\theta^2} \;\Rightarrow\; \hat{\rho}_1\hat{\theta}^2 - \hat{\theta} + \hat{\rho}_1 = 0$$
$$\hat{\theta}_{1,2} = \frac{1 \pm \sqrt{1 - 4\hat{\rho}_1^2}}{2\hat{\rho}_1}$$
Choose the root satisfying the invertibility condition. For real roots:
$$1 - 4\hat{\rho}_1^2 \ge 0 \;\Leftrightarrow\; 0.25 \ge \hat{\rho}_1^2 \;\Leftrightarrow\; -0.5 \le \hat{\rho}_1 \le 0.5$$
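A sketch of the moment estimator with the invertible-root choice (function name illustrative):

```python
import numpy as np

def ma1_mme(rho1_hat):
    """Solve rho1 = theta/(1 + theta^2) for theta; keep the invertible root."""
    if abs(rho1_hat) > 0.5:
        raise ValueError("No real root: need |rho1_hat| <= 0.5.")
    if rho1_hat == 0.0:
        return 0.0
    disc = np.sqrt(1 - 4 * rho1_hat**2)
    roots = ((1 + disc) / (2 * rho1_hat), (1 - disc) / (2 * rho1_hat))
    # The two roots are reciprocals; the one with |theta| < 1 is invertible.
    return min(roots, key=abs)

print(ma1_mme(0.4))   # ~0.5, since 0.5 / (1 + 0.25) = 0.4
```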
Fractional Integration
$$(1-L)^d\, y_t = \varepsilon_t$$
We went over two cases, d = 0 and d = 1. Granger and Joyeux (1980) consider the model where 0 ≤ d ≤ 1 (fractional d).
Using the binomial series expansion of $(1-L)^d$:
$$(1-L)^d = \sum_{k=0}^{\infty}\binom{d}{k}(-L)^k = \sum_{k=0}^{\infty}\frac{d!}{(d-k)!\,k!}(-L)^k = \sum_{k=0}^{\infty}\frac{\prod_{a=0}^{k-1}(d-a)}{k!}(-L)^k$$
$$= 1 - dL + \frac{d(d-1)}{2!}L^2 - \cdots$$
Similarly,
$$(1-L)^{-d} = 1 + dL + \frac{(d+1)d}{2!}L^2 + \cdots$$
In the ARIMA(0,d,0):
$$y_t = (1-L)^{-d}\varepsilon_t = \varepsilon_t + d\,\varepsilon_{t-1} + \frac{(d+1)d}{2!}\,\varepsilon_{t-2} + \cdots$$
That is, $y_t$ has an MA(∞) representation
$$y_t = (1-L)^{-d}\varepsilon_t = \varepsilon_t + d\,\varepsilon_{t-1} + \frac{(d+1)d}{2!}\,\varepsilon_{t-2} + \cdots = \sum_{j=0}^{\infty}\psi_j\,\varepsilon_{t-j}$$
with slowly (hyperbolically) decaying weights $\psi_j$ - the "long memory" feature of fractionally integrated series.
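A sketch computing the $\pi_k$ weights of $(1-L)^d$ via the equivalent recursion $\pi_0 = 1$, $\pi_k = \pi_{k-1}(k-1-d)/k$, and applying the truncated filter (names and the truncation length are illustrative):

```python
import numpy as np

def frac_diff_weights(d, n_weights):
    """Coefficients pi_k of (1 - L)^d = sum_k pi_k L^k."""
    w = np.empty(n_weights)
    w[0] = 1.0
    for k in range(1, n_weights):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def frac_diff(y, d, n_weights=100):
    """Apply a truncated (1 - L)^d filter: x_t = sum_k pi_k y_{t-k}."""
    w = frac_diff_weights(d, n_weights)
    return np.convolve(np.asarray(y, dtype=float), w)[:len(y)]

# Matches 1, -d, d(d-1)/2!, ... : for d = 0.4 -> 1, -0.4, -0.12, -0.064, ...
print(np.round(frac_diff_weights(0.4, 5), 4))
```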
Seasonal ARMA Models
A purely seasonal model with period s:
$$\Phi_P(B^s)(1 - B^s)\,y_t = \theta_0 + \Theta_Q(B^s)\,\varepsilon_t$$
where
$$\Phi_P(B^s) = 1 - \Phi_1 B^s - \Phi_2 B^{2s} - \cdots - \Phi_P B^{sP}$$
$$\Theta_Q(B^s) = 1 - \Theta_1 B^s - \Theta_2 B^{2s} - \cdots - \Theta_Q B^{sQ}$$
Example: $y_t = (1 - \Theta B^{12})\,\varepsilon_t$:
$$\text{Var}(y_t) = (1 + \Theta^2)\,\sigma^2$$
$$\text{ACF:}\quad \rho_k = \frac{-\Theta}{1+\Theta^2} \text{ for } k = 12; \qquad \rho_k = 0 \text{ otherwise.}$$
Example: Seasonal AR(1), s = 12:
$$(1 - \Phi B^{12})\,y_t = \theta_0 + a_t, \qquad a_t \sim WN(0, \sigma_a^2)$$
$$\text{ACF:}\quad \rho_{12k} = \Phi^k, \quad k = 0, 1, 2, \ldots; \qquad \rho_j = 0 \text{ at non-seasonal lags.}$$
where all zeros of $\phi(B)$, $\Phi(B^s)$, $\theta(B)$ and $\Theta(B^s)$ lie outside the unit circle. Of course, there are no common factors between $\phi(B)\Phi(B^s)$ and $\theta(B)\Theta(B^s)$.
When $\Phi = 1$, the series is non-stationary. To test for a unit root, consider seasonal unit root tests.
$$W_t = (1 - \theta B)(1 - \Theta B^{12})\,a_t$$
$$W_t = a_t - \theta a_{t-1} - \Theta a_{t-12} + \theta\Theta\,a_{t-13}$$
$$W_t \sim I(0)$$
Its autocovariances:
$$\gamma_0 = (1+\theta^2)(1+\Theta^2)\,\sigma_a^2$$
$$\gamma_1 = -\theta(1+\Theta^2)\,\sigma_a^2$$
$$\gamma_{12} = -\Theta(1+\theta^2)\,\sigma_a^2$$
$$\gamma_{11} = \gamma_{13} = \theta\Theta\,\sigma_a^2$$
$$\gamma_k = 0 \text{ otherwise}$$
and autocorrelations:
$$\rho_1 = \frac{-\theta}{1+\theta^2}, \qquad \rho_{12} = \frac{-\Theta}{1+\Theta^2}, \qquad \rho_{11} = \rho_{13} = \frac{\theta\Theta}{(1+\theta^2)(1+\Theta^2)}, \qquad \rho_k = 0 \text{ otherwise.}$$
In the original series, this is the multiplicative seasonal model
$$(1-B)(1-B^{12})\,y_t = (1-\theta B)(1-\Theta B^{12})\,a_t$$
i.e., the Box-Jenkins "airline model," an $\text{ARIMA}(0,1,1)\times(0,1,1)_{12}$. A fitting sketch follows below.
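A sketch of fitting this specification with statsmodels' SARIMAX (the series here is a simulated placeholder; note that statsmodels parameterizes MA polynomials as $(1 + \theta_1 B)$, so reported signs are flipped relative to the slides' $(1 - \theta B)$ convention):

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Placeholder data: drifting random walk plus a period-12 seasonal pattern.
rng = np.random.default_rng(8)
T = 240
y = np.cumsum(0.01 + 0.05 * rng.standard_normal(T))
y += 0.3 * np.sin(2 * np.pi * np.arange(T) / 12)

# ARIMA(0,1,1) x (0,1,1)_12: the multiplicative "airline" specification.
res = SARIMAX(y, order=(0, 1, 1), seasonal_order=(0, 1, 1, 12)).fit(disp=False)
print(res.params)   # ma.L1, ma.S.L12, sigma2
```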
Non-Stationarity in Variance
Stationarity in mean does not imply stationarity in variance, but non-stationarity in mean implies non-stationarity in variance.
If the mean function is time dependent:
1. The variance, Var(y_t), is time dependent.
2. Var(y_t) is unbounded as t → ∞.
3. Autocovariance functions and ACFs are also time dependent.
4. If t is large relative to y_0, then ρ_k ≈ 1.
It is common to use variance-stabilizing transformations: find a function G(.) so that the transformed series G(y_t) has a constant variance. For example, the Box-Cox transformation:
$$T(Y_t) = \frac{Y_t^{\lambda} - 1}{\lambda}$$
(with the limit $\lambda \to 0$ giving the log transformation). A sketch follows below.
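A sketch using scipy's Box-Cox implementation on simulated positive, right-skewed data (scipy also selects λ by maximum likelihood when it is not supplied):

```python
import numpy as np
from scipy.stats import boxcox

rng = np.random.default_rng(9)
y = np.exp(0.5 * rng.standard_normal(200) + 3)   # positive, right-skewed

y_fixed = boxcox(y, lmbda=0.5)    # (y**0.5 - 1) / 0.5, a fixed lambda
y_mle, lam = boxcox(y)            # lambda chosen by maximum likelihood
print(round(lam, 3))              # should be near 0 (log-like) for these data
```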