ensemble average:
$$E(Y_t) = \operatorname*{plim}_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N} y_t^{(i)} \tag{1.2}$$
$$\gamma_{0t} = E(Y_t-\mu_t)^2 = \int_{-\infty}^{\infty}(y_t-\mu_t)^2\, f_{Y_t}(y_t)\,dy_t \tag{1.3}$$
$$\gamma_{kt} = E\big[(Y_t-\mu_t)(Y_{t-k}-\mu_{t-k})\big] \tag{1.4}$$
$$= \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty}(y_t-\mu_t)(y_{t-k}-\mu_{t-k})\, f_{Y_t,Y_{t-1},\dots,Y_{t-k}}(y_t,y_{t-1},\dots,y_{t-k})\,dy_t\,dy_{t-1}\cdots dy_{t-k}$$
and the $k$th autocovariance can likewise be viewed as an ensemble average:
$$\gamma_{kt} = \operatorname*{plim}_{N\to\infty}\frac{1}{N}\sum_{i=1}^{N}\big(y_t^{(i)}-\mu_t\big)\big(y_{t-k}^{(i)}-\mu_{t-k}\big) \tag{1.5}$$
Definition
1.1.3 A stationary time series is ergodic if $\gamma_k \to 0$ as $k \to \infty$.
Proof of Theorem 1.1.3. Part (1) is a direct consequence of the Ergodic Theorem.
$$\hat{\gamma}_k = \frac{1}{T}\sum_{t=1}^{T} y_t y_{t-k} - \hat{\mu}\,\frac{1}{T}\sum_{t=1}^{T} y_t - \hat{\mu}\,\frac{1}{T}\sum_{t=1}^{T} y_{t-k} + \hat{\mu}^2$$
$$\frac{1}{T}\sum_{t=1}^{T} y_t y_{t-k} \xrightarrow{\ p\ } E(y_t y_{t-k})$$
Hence,
$$\hat{\gamma}_k \xrightarrow{\ p\ } E(y_t y_{t-k}) - \mu^2 - \mu^2 + \mu^2 = E(y_t y_{t-k}) - \mu^2 = \gamma_k$$
A covariance-stationary process is said to be ergodic for the mean if $\bar{y}$ converges in probability to $E(y_t)$ as $T \to \infty$.
Alternatively, if the autocovariances of a covariance-stationary process satisfy $\sum_{j=1}^{\infty} |\gamma_j| < \infty$, then $\{y_t\}$ is ergodic for the mean.
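As an illustration, this is easy to check numerically. The following is a minimal Python/numpy sketch (not part of the original notes; the simulate_ar1 helper and all parameter values are invented for the example): for a stationary AR(1), the time average $\bar{y}$ settles on $E(y_t) = \alpha/(1-\phi)$ as $T$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(alpha, phi, sigma, T, burn=500):
    """Simulate y_t = alpha + phi*y_{t-1} + e_t, discarding a burn-in."""
    y = np.zeros(T + burn)
    e = rng.normal(0.0, sigma, T + burn)
    for t in range(1, T + burn):
        y[t] = alpha + phi * y[t - 1] + e[t]
    return y[burn:]

alpha, phi, sigma = 1.0, 0.8, 1.0
mu = alpha / (1 - phi)              # population mean E(y_t) = 5
for T in (100, 1_000, 100_000):
    ybar = simulate_ar1(alpha, phi, sigma, T).mean()
    print(T, ybar, mu)              # ybar -> mu as T grows (ergodic for the mean)
```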
Theorem
1.1.4 If $x_t$ is a stationary process, and if $\{\phi_i,\ i \in \mathbb{Z}\}$ forms a sequence of absolutely summable real numbers with $\sum_{i=-\infty}^{\infty} |\phi_i| < \infty$, then the new variable $y_t = \sum_{i=-\infty}^{\infty} \phi_i x_{t-i}$ defines a new stationary process.
$$\operatorname{cov}(y_t, y_{t+k}) = \operatorname{cov}\left(\sum_{i=-\infty}^{\infty}\phi_i x_{t-i},\ \sum_{j=-\infty}^{\infty}\phi_j x_{t+k-j}\right)$$
$$= \sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty}\phi_i \phi_j \operatorname{cov}(x_{t-i}, x_{t+k-j})$$
$$= \sum_{i=-\infty}^{\infty}\sum_{j=-\infty}^{\infty}\phi_i \phi_j\, \gamma_x(k+i-j) = \gamma_y(k)$$
$$e_t \sim N(0, \sigma^2) \tag{1.10}$$
$$E(y_t \mid \mathcal{F}_{t-1}) = E(y_t \mid y_{t-1}, y_{t-2}, \dots, y_{t-k})$$
$$E(y_t \mid \mathcal{F}_{t-1}) = \rho_0 + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \cdots + \rho_k y_{t-k}$$
$$y_t = \rho_0 + \rho_1 y_{t-1} + \rho_2 y_{t-2} + \cdots + \rho_k y_{t-k} + e_t$$
$$E(e_t \mid \mathcal{F}_{t-1}) = 0$$
Example
Some versions of the life-cycle hypothesis imply that either changes in consumption or consumption growth rates should be an MDS.
Most asset pricing models imply that asset returns should be the sum of a constant plus an MDS.
$$y_t = \alpha + \phi y_{t-1} + e_t \tag{1.2.1}$$
Theorem
1.2.1 If $|\phi| < 1$ then $y_t$ is strictly stationary and ergodic.
$$\gamma_0 = E(y_t - \mu)^2 = E\big(e_t + \phi e_{t-1} + \phi^2 e_{t-2} + \cdots\big)^2$$
$$= \big(1 + \phi^2 + \phi^4 + \phi^6 + \cdots\big)\sigma^2 = \frac{\sigma^2}{1-\phi^2} \tag{1.2.4}$$
$$\gamma_k = E\big[(y_t - \mu)(y_{t-k} - \mu)\big], \quad\text{where } y_{t-k} - \mu = e_{t-k} + \phi e_{t-k-1} + \phi^2 e_{t-k-2} + \cdots$$
$$\gamma_k = \big(\phi^k + \phi^{k+2} + \phi^{k+4} + \cdots\big)\sigma^2 = \frac{\phi^k}{1-\phi^2}\,\sigma^2 \tag{1.2.5}$$
$$(y_{t-1} - \mu) = e_{t-1} + \phi e_{t-2} + \phi^2 e_{t-3} + \cdots$$
$$\gamma_k = \phi \gamma_{k-1} \ \Rightarrow\ \gamma_k = \phi^k \gamma_0 \tag{1.2.15}$$
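A short simulation can confirm (1.2.5). The sketch below (illustrative Python/numpy, not from the notes; the parameter values are arbitrary) compares sample autocovariances of a simulated AR(1) with $\phi^k \sigma^2/(1-\phi^2)$.

```python
import numpy as np

rng = np.random.default_rng(1)
phi, sigma, T = 0.7, 1.0, 200_000

# simulate a long zero-mean AR(1): y_t = phi*y_{t-1} + e_t
e = rng.normal(0.0, sigma, T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi * y[t - 1] + e[t]

def gamma_hat(y, k):
    """Sample autocovariance at lag k (divisor T-k; asymptotically equivalent to T)."""
    ybar = y.mean()
    if k == 0:
        return np.mean((y - ybar) ** 2)
    return np.mean((y[k:] - ybar) * (y[:-k] - ybar))

for k in range(4):
    theory = phi**k * sigma**2 / (1 - phi**2)   # eq. (1.2.5)
    print(k, gamma_hat(y, k), theory)
```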
$$y_t = \alpha + \phi_1 y_{t-1} + \phi_2 y_{t-2} + e_t \tag{1.2.16}$$
$$\big(1 - \phi_1 L - \phi_2 L^2\big)\, y_t = \alpha + e_t \tag{1.2.17}$$
$$\psi(L) = \big(1 - \phi_1 L - \phi_2 L^2\big)^{-1} = \psi_0 + \psi_1 L + \psi_2 L^2 + \cdots \tag{1.2.19}$$
$$\psi_0 = 1,\quad \psi_1 = \phi_1,\quad \psi_j = \phi_1 \psi_{j-1} + \phi_2 \psi_{j-2} \ \text{ for } j \geq 2$$
Proof.
Taking expectations of both sides of (1.2.16) yields $E(y_t) = \psi(L)\alpha$, where
$$\psi(L)\alpha = \frac{\alpha}{1-\phi_1-\phi_2} = \psi(1)\,\alpha$$
$$\big(1-\phi_1 L - \phi_2 L^2\big)^{-1} = \big[(1-\lambda_1 L)(1-\lambda_2 L)\big]^{-1} = \frac{c_1}{1-\lambda_1 L} + \frac{c_2}{1-\lambda_2 L}$$
where $c_1 = \dfrac{\lambda_1}{\lambda_1-\lambda_2}$ and $c_2 = \dfrac{\lambda_2}{\lambda_2-\lambda_1}$.
Proof.
Assuming that the eigenvalues lie inside the unit circle (equivalently, that the roots of the reverse characteristic polynomial lie outside the unit circle), one can express the inverse of the second-order polynomial in the lag operator as
$$\big(1-\phi_1 L - \phi_2 L^2\big)^{-1} = c_1 \sum_{j=0}^{\infty} (\lambda_1 L)^j + c_2 \sum_{j=0}^{\infty} (\lambda_2 L)^j = \sum_{j=0}^{\infty} \big(c_1 \lambda_1^j + c_2 \lambda_2^j\big) L^j$$
Hence,
$$\sum_{j=0}^{\infty} |\psi_j| = \sum_{j=0}^{\infty} \big|c_1 \lambda_1^j + c_2 \lambda_2^j\big| \leq |c_1| \sum_{j=0}^{\infty} |\lambda_1|^j + |c_2| \sum_{j=0}^{\infty} |\lambda_2|^j = \frac{|c_1|}{1-|\lambda_1|} + \frac{|c_2|}{1-|\lambda_2|} < \infty$$
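The recursion $\psi_j = \phi_1\psi_{j-1} + \phi_2\psi_{j-2}$ and the partial-fraction form $\psi_j = c_1\lambda_1^j + c_2\lambda_2^j$ can be checked against each other numerically, as in this illustrative Python sketch (not from the notes; the coefficient values are arbitrary, and np.roots is used to obtain the eigenvalues):

```python
import numpy as np

phi1, phi2 = 1.1, -0.3       # a stationary AR(2) example

# eigenvalues: roots of lam^2 - phi1*lam - phi2 = 0 (here 0.6 and 0.5)
lam1, lam2 = np.roots([1.0, -phi1, -phi2])
c1 = lam1 / (lam1 - lam2)
c2 = lam2 / (lam2 - lam1)

# psi_j by the recursion psi_j = phi1*psi_{j-1} + phi2*psi_{j-2}
psi = [1.0, phi1]
for j in range(2, 10):
    psi.append(phi1 * psi[-1] + phi2 * psi[-2])

# compare with the partial-fraction form psi_j = c1*lam1^j + c2*lam2^j
for j in range(10):
    print(j, psi[j], c1 * lam1**j + c2 * lam2**j)
```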
$$y_t = \mu(1-\phi_1-\phi_2) + \phi_1 y_{t-1} + \phi_2 y_{t-2} + e_t$$
Or
$$\gamma_k = \phi_1 \gamma_{k-1} + \phi_2 \gamma_{k-2} \quad \text{for } k = 1, 2, \dots \tag{1.2.24}$$
$$\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2} \quad \text{for } k = 1, 2, \dots \tag{1.2.25}$$
For $k = 1$,
$$\rho_1 = \phi_1 + \phi_2 \rho_1$$
Or,
$$\rho_1 = \frac{\phi_1}{1-\phi_2} \tag{1.2.26}$$
For $k = 2$,
$$\rho_2 = \phi_1 \rho_1 + \phi_2 \tag{1.2.27}$$
$$\gamma_0 = \phi_1 \rho_1 \gamma_0 + \phi_2 \rho_2 \gamma_0 + \sigma^2 \tag{1.2.29}$$
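A few lines of Python make the Yule-Walker computations (1.2.26)-(1.2.29) concrete (an illustrative sketch with arbitrary parameter values, not from the notes):

```python
phi1, phi2, sigma2 = 0.5, 0.3, 1.0   # a stationary AR(2) example

# rho_1 = phi1/(1-phi2) and rho_2 = phi1*rho_1 + phi2, eqs. (1.2.26)-(1.2.27)
rho1 = phi1 / (1 - phi2)
rho2 = phi1 * rho1 + phi2

# solving (1.2.29) for gamma_0: gamma_0 = sigma^2 / (1 - phi1*rho1 - phi2*rho2)
gamma0 = sigma2 / (1 - phi1 * rho1 - phi2 * rho2)
print(rho1, rho2, gamma0)
```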
satisfies
$$y_t = \alpha + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + e_t \tag{1.2.30}$$
Provided the roots of
$$1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p = 0 \tag{1.2.31}$$
all lie outside the unit circle, one can easily verify that a covariance-stationary representation of the form
$$y_t = \mu + \psi(L)\, e_t \tag{1.2.32}$$
exists, where
$$\psi(L) = \big(1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p\big)^{-1} \quad \text{and} \quad \sum_{j=0}^{\infty} |\psi_j| < \infty$$
Theorem
1.3.1 The AR($p$) process is strictly stationary and ergodic if and only if $|\lambda_k| < 1$ for all eigenvalues $\lambda_k$.
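In practice the eigenvalue condition is easy to check numerically. A possible sketch (the helper name is_stationary and the coefficient values are invented for illustration):

```python
import numpy as np

def is_stationary(phis):
    """Check |lambda_k| < 1 for all eigenvalues of an AR(p).

    phis = (phi_1, ..., phi_p); the eigenvalues solve
    lam^p - phi_1*lam^{p-1} - ... - phi_p = 0.
    """
    lams = np.roots([1.0] + [-p for p in phis])
    return bool(np.all(np.abs(lams) < 1)), lams

print(is_stationary([0.5, 0.3]))    # stationary
print(is_stationary([0.9, 0.2]))    # not stationary (one eigenvalue exceeds 1)
```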
Assuming that the stationarity condition is satisfied, one alternative way to find the mean is to take expectations of (1.2.30):
$$\mu = \frac{\alpha}{1-\phi_1-\phi_2-\cdots-\phi_p} \tag{1.2.33}$$
Using (1.2.33), one can easily rewrite equation (1.2.30) as
$$y_t - \mu = \phi_1(y_{t-1}-\mu) + \phi_2(y_{t-2}-\mu) + \cdots + \phi_p(y_{t-p}-\mu) + e_t \tag{1.2.34}$$
One can easily find the autocovariances by multiplying both sides of (1.2.34) by $(y_{t-k}-\mu)$ and taking expectations:
$$\gamma_k = \begin{cases} \phi_1 \gamma_{k-1} + \phi_2 \gamma_{k-2} + \cdots + \phi_p \gamma_{k-p} & \text{for } k = 1, 2, \dots \\[4pt] \phi_1 \gamma_1 + \phi_2 \gamma_2 + \cdots + \phi_p \gamma_p + \sigma^2 & \text{for } k = 0 \end{cases} \tag{1.2.35}$$
$$\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2} + \cdots + \phi_p \rho_{k-p} \quad \text{for } k = 1, 2, \dots \tag{1.2.36}$$
$$\lambda^p - \phi_1 \lambda^{p-1} - \phi_2 \lambda^{p-2} - \cdots - \phi_p = 0$$
$$y_t = \mu + e_t + \theta e_{t-1} \tag{1.2.38}$$
where $\mu$ and $\theta$ could be any constants.
This time series process is called a first-order moving average process, denoted MA(1).
The term "moving average" comes from the fact that $y_t$ is constructed from a weighted sum of the two most recent values of $e$.
The expectation of $y_t$ is $E(y_t) = \mu$; the variance and first autocovariance are
$$\gamma_0 = E(y_t-\mu)^2 = E\big(e_t^2\big) + 2\theta\, E(e_t e_{t-1}) + \theta^2 E\big(e_{t-1}^2\big) = \big(1+\theta^2\big)\sigma^2 \tag{1.2.40}$$
$$\gamma_1 = E\big[(y_t-\mu)(y_{t-1}-\mu)\big] = \theta\sigma^2 \tag{1.2.41}$$
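These two moments are easy to verify by simulation. A minimal numpy sketch (not from the notes; $\theta$ and $\sigma$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, theta, sigma, T = 0.0, 0.6, 1.0, 500_000

e = rng.normal(0.0, sigma, T + 1)
y = mu + e[1:] + theta * e[:-1]            # y_t = mu + e_t + theta*e_{t-1}

g0 = y.var()
g1 = np.mean((y[1:] - y.mean()) * (y[:-1] - y.mean()))
print(g0, (1 + theta**2) * sigma**2)       # matches (1.2.40)
print(g1, theta * sigma**2)                # matches (1.2.41)
```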
The $q$th-order moving average process, MA($q$), is characterized by
$$y_t = \mu + e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q} \tag{1.2.45}$$
$$E(y_t) = \mu \tag{1.2.46}$$
$$\gamma_0 = E(y_t-\mu)^2 = E\big(e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q}\big)^2 = \big(1+\theta_1^2+\theta_2^2+\cdots+\theta_q^2\big)\sigma^2 \tag{1.2.47}$$
For $k = 1, 2, \dots, q$,
$$\gamma_k = E\big\{\big(e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \cdots + \theta_q e_{t-q}\big)\big(e_{t-k} + \theta_1 e_{t-k-1} + \theta_2 e_{t-k-2} + \cdots + \theta_q e_{t-k-q}\big)\big\} \tag{1.2.48}$$
$$y_t = \mu + \sum_{j=0}^{\infty} \psi_j e_{t-j} \quad \text{with } \psi_0 = 1$$
$$E\Big[\sum_{j=N+1}^{M} \psi_j e_{t-j}\Big]^2 = \big(\psi_M^2 + \psi_{M-1}^2 + \cdots + \psi_{N+1}^2\big)\sigma^2 = \Big[\sum_{j=0}^{M}\psi_j^2 - \sum_{j=0}^{N}\psi_j^2\Big]\sigma^2 \tag{1.2.54}$$
$$\gamma_0 = E(y_t-\mu)^2 = \lim_{T\to\infty} E\big(\psi_0 e_t + \psi_1 e_{t-1} + \psi_2 e_{t-2} + \cdots + \psi_T e_{t-T}\big)^2$$
$$\gamma_k = \big(\psi_k\psi_0 + \psi_{k+1}\psi_1 + \psi_{k+2}\psi_2 + \psi_{k+3}\psi_3 + \cdots\big)\sigma^2$$
Hence,
$$\sum_{k=0}^{\infty}|\gamma_k| \leq \sigma^2 \sum_{k=0}^{\infty}\sum_{j=0}^{\infty} \big|\psi_{j+k}\big|\big|\psi_j\big| = \sigma^2 \sum_{j=0}^{\infty}|\psi_j| \sum_{k=0}^{\infty}\big|\psi_{j+k}\big|$$
$$\big(1-\phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p\big)\, y_t = \alpha + \big(1+\theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q\big)\, e_t \tag{1.2.60}$$
$$1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p = 0 \tag{1.2.61}$$
$$y_t - \mu = \phi_1(y_{t-1}-\mu) + \cdots + \phi_p(y_{t-p}-\mu) + e_t + \theta_1 e_{t-1} + \cdots + \theta_q e_{t-q} \tag{1.2.62}$$
$$\gamma_k = \phi_1 \gamma_{k-1} + \phi_2 \gamma_{k-2} + \cdots + \phi_p \gamma_{k-p} \quad \text{for } k = q+1, q+2, \dots \tag{1.2.63}$$
$$y_t = e_t \tag{1.2.65}$$
$$\big(1-\phi_1 L - \cdots - \phi_{p-1} L^{p-1}\big)(y_t - \mu) = \big(1+\theta_1 L + \cdots + \theta_{q-1} L^{q-1}\big)\, e_t$$
where
$$1-\phi_1 L - \cdots - \phi_{p-1} L^{p-1} = \prod_{k=1,\,k\neq i}^{p} (1-\lambda_k L)$$
$$1+\theta_1 L + \cdots + \theta_{q-1} L^{q-1} = \prod_{k=1,\,k\neq j}^{q} (1-\eta_k L)$$
$$e_t = y_t - E(y_t \mid y_{t-1}, y_{t-2}, \dots) \tag{1.2.71}$$
The value of $\kappa_t$ is uncorrelated with $e_{t-j}$ for any $j$, though $\kappa_t$ can be predicted arbitrarily well from a linear function of past values of $y$:
$$\kappa_t = E(\kappa_t \mid y_{t-1}, y_{t-2}, \dots)$$
Definition
1.3.1 The lag operator $L$ satisfies $L y_t = y_{t-1}$.
Or $(1-\phi L)\, y_t = \alpha + e_t$.
The operator $\Phi(L) = 1 - \phi L$ is a polynomial in the lag operator $L$.
We say that the root of the polynomial is $1/\phi$, since $\Phi(z) = 0$ when $z = 1/\phi$.
We call $\Phi(L)$ the autoregressive polynomial of $y_t$.
From Theorem 1.2.1, an AR(1) is stationary iff $|\phi| < 1$.
Note that an equivalent way to say this is that an AR(1) is stationary iff the root of the autoregressive polynomial lies outside the unit circle, i.e., is larger than one in absolute value.
We can write a general AR($p$) model as
$$\Phi(L)\, y_t = e_t$$
and an MA($q$) model as
$$y_t = \theta(L)\, e_t$$
where $\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q$.
Now, if $\theta(L)^{-1}$ exists, we can write
$$\theta(L)^{-1}\, y_t = e_t$$
$$(1+\theta L)^{-1} = \sum_{j=0}^{\infty} (-\theta)^j L^j, \quad \text{provided that } |\theta| < 1 \tag{1.3.3}$$
$$\Phi(L)\, y_t = \theta(L)\, e_t$$
$$y_t = \Phi^{-1}(L)\,\theta(L)\, e_t$$
$$\theta^{-1}(L)\,\Phi(L)\, y_t = e_t$$
Both $\Phi^{-1}(L)\theta(L)$ and $\theta^{-1}(L)\Phi(L)$ are lag polynomials of infinite lag length, with restrictions on the coefficients.
1.4 Invertibility of Lag Polynomials
$$(1-\lambda_1 z)(1-\lambda_2 z) = 0 \tag{1.4.2}$$
$$\big(1-\phi_1 L - \phi_2 L^2\big)\, y_t = (1+\theta L)\, e_t \tag{1.5.1}$$
$$(1-\lambda_1 L)(1-\lambda_2 L)\, y_t = (1+\theta L)\, e_t \tag{1.5.2}$$
$$(1-0.5L)\, y_t = e_t \tag{1.5.6}$$
$$z = e^{-i\omega} = \cos\omega - i\sin\omega$$
where $i = \sqrt{-1}$ and $\omega$ is the radian angle that $z$ makes with the real axis.
If the autocovariance-generating function is evaluated at $z = e^{-i\omega}$ and divided by $2\pi$, the resulting function of $\omega$,
$$S_y(\omega) = \frac{1}{2\pi}\, g_y(z) = \frac{1}{2\pi} \sum_{j=-\infty}^{\infty} \gamma_j e^{-i\omega j}$$
is called the population spectrum of $y$.
$$g_y(z) = \sigma^2 (1+\theta z)\big(1+\theta z^{-1}\big) \tag{1.6.2}$$
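Evaluating (1.6.2) at $z = e^{-i\omega}$ and dividing by $2\pi$ gives the MA(1) spectrum $S_y(\omega) = \sigma^2(1+\theta^2+2\theta\cos\omega)/(2\pi)$. The following illustrative numpy sketch (not from the notes; $\theta$ is arbitrary) confirms this numerically:

```python
import numpy as np

theta, sigma2 = 0.5, 1.0
omega = np.linspace(0, np.pi, 5)

# spectrum from g_y(z) = sigma^2 (1+theta*z)(1+theta*z^{-1}) at z = exp(-i*omega)
z = np.exp(-1j * omega)
S = sigma2 * (1 + theta * z) * (1 + theta / z) / (2 * np.pi)

# the same thing in real form: sigma^2 (1 + theta^2 + 2*theta*cos(omega)) / (2*pi)
S_real = sigma2 * (1 + theta**2 + 2 * theta * np.cos(omega)) / (2 * np.pi)
print(np.allclose(S.real, S_real), np.max(np.abs(S.imag)))   # True, ~0
```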
$$\big(1+\theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q\big)\big(1+\theta_1 z^{-1} + \theta_2 z^{-2} + \cdots + \theta_q z^{-q}\big)$$
$$= \theta_q z^q + (\theta_{q-1} + \theta_q\theta_1)\, z^{q-1} + (\theta_{q-2} + \theta_{q-1}\theta_1 + \theta_q\theta_2)\, z^{q-2} + \cdots$$
$$+ (\theta_1 + \theta_2\theta_1 + \theta_3\theta_2 + \cdots + \theta_q\theta_{q-1})\, z + \big(1+\theta_1^2+\theta_2^2+\cdots+\theta_q^2\big)\, z^0$$
$$+ (\theta_1 + \theta_2\theta_1 + \theta_3\theta_2 + \cdots + \theta_q\theta_{q-1})\, z^{-1} + \cdots + \theta_q z^{-q} \tag{1.6.4}$$
$$y_t = \mu + \theta(L)\, e_t \tag{1.6.5}$$
with
$$\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \cdots \tag{1.6.6}$$
and
$$\sum_{j=0}^{\infty} |\theta_j| < \infty, \tag{1.6.7}$$
then
$$g_y(z) = \sigma^2\, \theta(z)\,\theta\big(z^{-1}\big) \tag{1.6.8}$$
Filters
$$y_t = (1+\theta L)\, e_t \tag{1.6.11}$$
$$x_t = y_t - y_{t-1} = (1-L)\, y_t \tag{1.6.12}$$
$$x_t = (1-L)(1+\theta L)\, e_t = \big(1 + (\theta-1)L - \theta L^2\big)\, e_t = \big(1 + \theta_1 L + \theta_2 L^2\big)\, e_t \tag{1.6.13}$$
with $\theta_1 = (\theta-1)$ and $\theta_2 = -\theta$.
Direct application of (1.6.3) gives:
$$g_x(z) = \sigma^2 \big(1+\theta_1 z + \theta_2 z^2\big)\big(1+\theta_1 z^{-1} + \theta_2 z^{-2}\big) \tag{1.6.14}$$
$$g_x(z) = \sigma^2 (1-z)(1+\theta z)\big(1-z^{-1}\big)\big(1+\theta z^{-1}\big) = (1-z)\big(1-z^{-1}\big)\, g_y(z) \tag{1.6.15}$$
$$x_t = h(L)\, y_t \tag{1.6.16}$$
$$g_x(z) = \sigma^2\, h(z)\theta(z)\, h\big(z^{-1}\big)\theta\big(z^{-1}\big) = h(z)\, h\big(z^{-1}\big)\, g_y(z) \tag{1.6.17}$$
Applying the filter $h(L)$ to a series thus results in multiplying its autocovariance-generating function by $h(z)h\big(z^{-1}\big)$.
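Concretely, for the first-difference filter applied to the MA(1) above, the filtered coefficients and the implied autocovariances can be read off the acgf. An illustrative numpy sketch (not from the notes; $\theta$ is arbitrary):

```python
import numpy as np

theta, sigma2 = 0.6, 1.0

# coefficients of (1-L)(1+theta*L) = 1 + (theta-1)L - theta*L^2, cf. (1.6.13)
h = np.convolve([1.0, -1.0], [1.0, theta])
print(h)                                   # [1, theta-1, -theta]

# autocovariances of x_t = h(L)e_t: gamma_k = sigma^2 * sum_j h_j h_{j+k}
for k in range(4):
    gk = sigma2 * sum(h[j] * h[j + k] for j in range(len(h) - k))
    print(k, gk)                           # gamma_k = 0 for k > 2, as for an MA(2)
```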
$$y_t = \mu_0 + \mu_1 t + S_t \tag{1.7.1}$$
$$S_t = \rho_1 S_{t-1} + \rho_2 S_{t-2} + \cdots + \rho_k S_{t-k} + e_t \tag{1.7.2}$$
Or,
$$y_t = \mu_0 + \mu_1 t + \rho_1 S_{t-1} + \rho_2 S_{t-2} + \cdots + \rho_k S_{t-k} + e_t \tag{1.7.3}$$
There are two essentially equivalent ways to estimate the autoregressive parameters $(\rho_1, \rho_2, \dots, \rho_k)$.
You can estimate (1.7.3) by OLS.
You can estimate (1.7.1)-(1.7.2) sequentially by OLS.
That is, first estimate (1.7.1), get the residual $\hat{S}_t$, and then perform regression (1.7.2) replacing $S_t$ with $\hat{S}_t$.
This procedure is sometimes called detrending.
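A minimal Python sketch of the two-step (detrending) procedure, using plain least squares (all names and parameter values invented for the example, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(3)
T, mu0, mu1, rho = 500, 2.0, 0.1, 0.7

# y_t = mu0 + mu1*t + S_t with S_t = rho*S_{t-1} + e_t, eqs. (1.7.1)-(1.7.2)
S = np.zeros(T)
for t in range(1, T):
    S[t] = rho * S[t - 1] + rng.normal()
trend = np.arange(T)
y = mu0 + mu1 * trend + S

# step 1: regress y on (1, t) by OLS, keep the residual S_hat
X = np.column_stack([np.ones(T), trend])
b = np.linalg.lstsq(X, y, rcond=None)[0]
S_hat = y - X @ b

# step 2: regress S_hat on its own lag to estimate rho
rho_hat = np.linalg.lstsq(S_hat[:-1, None], S_hat[1:], rcond=None)[0][0]
print(b, rho_hat)       # approx (2.0, 0.1) and 0.7
```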
Seasonal Effects
$$u_t = \theta u_{t-1} + e_t \tag{1.8.2}$$
with $e_t$ an MDS.
$$H_0: \theta = 0 \quad \text{vs} \quad H_1: \theta \neq 0$$
We want to test $H_0$ against $H_1$.
To combine (1.8.1) and (1.8.2), we take (1.8.1) and lag the equation once:
$$y_{t-1} = \alpha + \rho y_{t-2} + u_{t-1}$$
Or
$$y_t = \alpha(1-\theta) + (\rho+\theta)\, y_{t-1} - \theta\rho\, y_{t-2} + e_t = \text{AR}(2)$$
Thus under $H_0$, $y_t$ is an AR(1), and under $H_1$ it is an AR(2).
$H_0$ may be expressed as the restriction that the coefficient on $y_{t-2}$ is zero.
An appropriate test of $H_0$ against $H_1$ is therefore a Wald test that the coefficient on $y_{t-2}$ is zero (a simple exclusion test).
In general, if the null hypothesis is that $y_t$ is an AR($k$), and the alternative is that the error is an AR($m$), this is the same as saying that under the alternative $y_t$ is an AR($k+m$), and this is equivalent to the restriction that the coefficients on $y_{t-k-1}, y_{t-k-2}, \dots, y_{t-k-m}$ are jointly zero.
An appropriate test is the Wald test of this restriction.
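The following illustrative numpy sketch carries out such an exclusion test for the AR(1)-vs-AR(2) case, computing the Wald (squared-$t$) statistic for the coefficient on $y_{t-2}$ by hand (simulated data and arbitrary parameters; not from the notes):

```python
import numpy as np

rng = np.random.default_rng(4)

# simulate y_t = alpha + rho*y_{t-1} + u_t with white-noise u (so H0 is true)
T, alpha, rho = 1000, 0.5, 0.6
y = np.zeros(T)
for t in range(1, T):
    y[t] = alpha + rho * y[t - 1] + rng.normal()

# fit the alternative AR(2): regress y_t on (1, y_{t-1}, y_{t-2})
Y = y[2:]
X = np.column_stack([np.ones(T - 2), y[1:-1], y[:-2]])
b = np.linalg.lstsq(X, Y, rcond=None)[0]
e = Y - X @ b
s2 = e @ e / (len(Y) - X.shape[1])
V = s2 * np.linalg.inv(X.T @ X)            # estimated covariance of b

# Wald statistic for the coefficient on y_{t-2}
W = b[2] ** 2 / V[2, 2]
print(b, W)          # under H0, W is asymptotically chi-squared(1)
```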
$$y_t = e_t + g(t)\, e_{t-1} \tag{1.9.1}$$
$$y_t = \phi y_{t-1} + e_t, \tag{1.9.2}$$
with $\phi = 1$.
Taking variances on both sides gives $\operatorname{var}(y_t) = \operatorname{var}(y_{t-1}) + \sigma^2$, which has no solution for the variance of the process consistent with stationarity, unless $\sigma^2 = 0$, in which case an infinity of solutions exists.
$$(1-0.2L)(1-L)\, y_t = (1-0.5L)\, e_t$$
$$y_t = \mu + \rho y_{t-1} + e_t \tag{1.9.6}$$
$$H_0: \rho = 1$$
$$H_1: \rho < 1$$
$$\Delta y_t = \mu + (\rho-1)\, y_{t-1} + e_t \tag{1.9.8}$$
$$\Delta y_t = \mu + e_t \tag{1.9.9}$$
$$y_t = \mu + \gamma t + \rho y_{t-1} + e_t \tag{1.9.10}$$
$$\Delta y_t = \mu + \gamma t + (\rho-1)\, y_{t-1} + e_t \tag{1.9.11}$$
$$\rho(L)\, y_t = \mu + e_t \tag{1.9.13}$$
$$\rho(L) = 1 - \rho_1 L - \rho_2 L^2 - \cdots - \rho_k L^k$$
$$\Delta y_t = \mu + \alpha_0 y_{t-1} + \alpha_1 \Delta y_{t-1} + \cdots + \alpha_{k-1} \Delta y_{t-(k-1)} + e_t \tag{1.9.14}$$
$$\rho(L) = (1-L) - \alpha_0 L - \alpha_1\big(L-L^2\big) - \cdots - \alpha_{k-1}\big(L^{k-1}-L^k\big)$$
$$H_1: \alpha_0 < 0$$
$$T\hat{\alpha}_0 \xrightarrow{\ d\ } \big(1-\alpha_1-\alpha_2-\cdots-\alpha_{k-1}\big)\,\rho_\mu$$
$$\text{ADF} = \frac{\hat{\alpha}_0}{se(\hat{\alpha}_0)} \xrightarrow{\ d\ } \tau_\mu$$
The limiting distributions $\rho_\mu$ and $\tau_\mu$ are non-normal.
They are skewed to the left and have negative means.
The first result states that $\hat{\alpha}_0$ converges to its true value (of zero) at rate $T$, rather than the conventional rate of $T^{1/2}$.
This is called a "super-consistent" rate of convergence.
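For illustration, the ADF statistic can be computed with the adfuller routine in statsmodels (assuming that library is available; the data and parameter values below are arbitrary). A random walk should fail to reject $H_0$; a stationary AR(1) should reject it:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller   # standard ADF implementation

rng = np.random.default_rng(5)
T = 500

rw = np.cumsum(rng.normal(size=T))     # random walk: rho = 1
ar = np.zeros(T)                       # stationary AR(1): rho = 0.5
for t in range(1, T):
    ar[t] = 0.5 * ar[t - 1] + rng.normal()

for name, x in (("random walk", rw), ("AR(1)", ar)):
    stat, pval, *_ = adfuller(x, regression="c")   # intercept-only case, tau_mu
    print(name, stat, pval)    # reject H0 (unit root) when stat is very negative
```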
$$y_t = \alpha + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + e_t \tag{1.10.1}$$
Consequently, we have
$$E(y_{t-j} e_t) = 0 \quad \text{for } j = 1, 2, \dots, p$$
$$E(u_t \mid \mathcal{F}_{t-1}) = E(X_t e_t \mid \mathcal{F}_{t-1}) = X_t\, E(e_t \mid \mathcal{F}_{t-1}) = 0$$
Asymptotic Distribution
Theorem
1.10.3 If the AR($p$) process $y_t$ is strictly stationary and ergodic and $E(y_t^4) < \infty$, then as $T \to \infty$,
$$\sqrt{T}\big(\hat{\beta} - \beta\big) \xrightarrow{\ d\ } N\big(0,\ Q^{-1}\Omega Q^{-1}\big)$$
Estimation of MA Process
$$S(\theta, \alpha) = \sum_{t=2}^{T} \big(y_t - \alpha - \theta e_{t-1}\big)^2$$
$$e_{t-1} = \sum_{j=0}^{\infty} (-\theta)^j\big(y_{t-1-j} - \alpha\big)$$
and write
$$S(\theta, \alpha) = \sum_{t=2}^{T} \Big[y_t - \alpha - \theta \sum_{j=0}^{\infty} (-\theta)^j\big(y_{t-1-j} - \alpha\big)\Big]^2$$
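Minimizing the sum of squares numerically might look as follows (an illustrative sketch, not from the notes; it assumes scipy is available for the Nelder-Mead search and uses the $e_t$ recursion with the starting value $e_0 = 0$ rather than the infinite sum):

```python
import numpy as np
from scipy.optimize import minimize    # assumed available for the numerical search

rng = np.random.default_rng(6)
T, alpha_true, theta_true = 2000, 1.0, 0.5
e = rng.normal(size=T + 1)
y = alpha_true + e[1:] + theta_true * e[:-1]    # MA(1) with mean alpha

def css(params, y):
    """Conditional sum of squares with e_0 = 0 (the e_t recursion)."""
    alpha, theta = params
    e_prev, s = 0.0, 0.0
    for yt in y:
        et = yt - alpha - theta * e_prev
        s += et * et
        e_prev = et
    return s

res = minimize(css, x0=[0.0, 0.0], args=(y,), method="Nelder-Mead")
print(res.x)    # should be close to (1.0, 0.5)
```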
$$y_t = \alpha + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + e_t + \theta_1 e_{t-1} + \cdots + \theta_q e_{t-q} \tag{1.10.4}$$
$$E(e_t) = 0 \tag{1.10.5}$$
$$E(e_t e_\tau) = \begin{cases} \sigma^2, & t = \tau \\ 0, & \text{otherwise} \end{cases} \tag{1.10.6}$$
This section explores how to estimate the values of $\alpha, \phi_1, \dots, \phi_p, \theta_1, \dots, \theta_q, \sigma^2$ on the basis of observations on $y$.
The primary principle on which estimation will be based is maximum likelihood.
Let $\boldsymbol{\theta} = \big(\alpha, \phi_1, \dots, \phi_p, \theta_1, \dots, \theta_q, \sigma^2\big)'$ denote the vector of population parameters.
Suppose we have observed a sample of size $T$, $(y_1, \dots, y_T)$.
The approach will be to calculate the probability density
$$Y_t = \alpha + \phi Y_{t-1} + e_t \tag{1.10.9}$$
$$f_{Y_1}(y_1; \boldsymbol{\theta}) = f_{Y_1}\big(y_1; \alpha, \phi, \sigma^2\big) \tag{1.10.10}$$
$$= \frac{1}{\sqrt{2\pi\sigma^2/(1-\phi^2)}} \exp\left\{-\frac{\big[y_1 - \alpha/(1-\phi)\big]^2}{2\sigma^2/(1-\phi^2)}\right\}$$
$$Y_2 = \alpha + \phi Y_1 + e_2 \tag{1.10.11}$$
Conditional on $Y_1 = y_1$, $\ Y_2 \mid Y_1 = y_1 \sim N\big(\alpha + \phi y_1,\ \sigma^2\big)$. Hence,
$$f_{Y_2|Y_1}(y_2 \mid y_1; \boldsymbol{\theta}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(y_2 - \alpha - \phi y_1)^2}{2\sigma^2}\right\} \tag{1.10.12}$$
from which,
$$f_{Y_t,\dots,Y_1}(y_t, \dots, y_1; \boldsymbol{\theta}) = f_{Y_t|Y_{t-1}}(y_t \mid y_{t-1}; \boldsymbol{\theta})\; f_{Y_{t-1},\dots,Y_1}(y_{t-1}, \dots, y_1; \boldsymbol{\theta}) \tag{1.10.14}$$
$$\ell(\boldsymbol{\theta}) = \ln f_{Y_1}(y_1; \boldsymbol{\theta}) + \sum_{t=2}^{T} \ln f_{Y_t|Y_{t-1}}(y_t \mid y_{t-1}; \boldsymbol{\theta}) \tag{1.10.16}$$
The log likelihood of the process is seen to be
$$\ell(\boldsymbol{\theta}) = -\tfrac{1}{2}\ln 2\pi - \tfrac{1}{2}\ln\big[\sigma^2/(1-\phi^2)\big] - \frac{\{y_1 - \alpha/(1-\phi)\}^2}{2\sigma^2/(1-\phi^2)}$$
$$- \frac{T-1}{2}\ln 2\pi - \frac{T-1}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=2}^{T}\big(y_t - \alpha - \phi y_{t-1}\big)^2 \tag{1.10.17}$$
The MLE $\hat{\boldsymbol{\theta}}$ is the value for which (1.10.17) is maximized.
In principle, this requires differentiating (1.10.17) and setting the result equal to zero.
In practice, when an attempt is made to carry this out, the result is a system of nonlinear equations in $\boldsymbol{\theta}$ and $(y_1, \dots, y_T)$ for which there is no simple solution for $\boldsymbol{\theta}$ in terms of $(y_1, \dots, y_T)$.
Maximization of (1.10.17) thus requires iterative or numerical procedures.
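A hedged numerical sketch of this: code the exact log likelihood (1.10.17) directly and hand it to a derivative-free optimizer (scipy is assumed available; the data are simulated with arbitrary true values, and the function name is invented):

```python
import numpy as np
from scipy.optimize import minimize

def neg_exact_ar1_loglik(params, y):
    """Negative of the exact Gaussian AR(1) log likelihood, eq. (1.10.17)."""
    alpha, phi, s2 = params
    if abs(phi) >= 1 or s2 <= 0:
        return 1e12                    # penalty: outside the stationary region
    T = len(y)
    v1 = s2 / (1 - phi**2)             # variance of y_1
    ll = -0.5 * np.log(2 * np.pi * v1) - (y[0] - alpha / (1 - phi))**2 / (2 * v1)
    resid = y[1:] - alpha - phi * y[:-1]
    ll += -0.5 * (T - 1) * np.log(2 * np.pi * s2) - resid @ resid / (2 * s2)
    return -ll

rng = np.random.default_rng(7)
T, alpha, phi, s2 = 1000, 0.4, 0.7, 1.0
y = np.zeros(T)
y[0] = alpha / (1 - phi) + rng.normal(scale=np.sqrt(s2 / (1 - phi**2)))
for t in range(1, T):
    y[t] = alpha + phi * y[t - 1] + rng.normal(scale=np.sqrt(s2))

res = minimize(neg_exact_ar1_loglik, x0=[0.0, 0.5, 1.0], args=(y,),
               method="Nelder-Mead")
print(res.x)      # approx (0.4, 0.7, 1.0)
```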
Conditional ML Estimates of AR(1) Process
An alternative to numerical optimization of the exact likelihood is to condition on the first observation and maximize
$$f_{Y_T,\dots,Y_2|Y_1}(y_T, \dots, y_2 \mid y_1; \boldsymbol{\theta}) = \prod_{t=2}^{T} f_{Y_t|Y_{t-1}}(y_t \mid y_{t-1}; \boldsymbol{\theta}) \tag{1.10.18}$$
$$\ln f_{Y_T,\dots,Y_2|Y_1}(y_T, \dots, y_2 \mid y_1; \boldsymbol{\theta}) = -\frac{T-1}{2}\ln 2\pi - \frac{T-1}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=2}^{T}\big(y_t - \alpha - \phi y_{t-1}\big)^2 \tag{1.10.19}$$
Maximization of (1.10.19) with respect to $\alpha$ and $\phi$ is equivalent to minimization of
$$\sum_{t=2}^{T}\big(y_t - \alpha - \phi y_{t-1}\big)^2 \tag{1.10.20}$$
that is, to an OLS regression of $y_t$ on a constant and $y_{t-1}$. The first-order condition for $\sigma^2$ is
$$-\frac{T-1}{2\hat{\sigma}^2} + \frac{1}{2\hat{\sigma}^4}\sum_{t=2}^{T}\big(y_t - \hat{\alpha} - \hat{\phi} y_{t-1}\big)^2 = 0$$
or
$$\hat{\sigma}^2 = \frac{1}{T-1}\sum_{t=2}^{T}\big(y_t - \hat{\alpha} - \hat{\phi} y_{t-1}\big)^2.$$
In contrast to the exact maximum likelihood estimates, the conditional maximum likelihood estimates are thus trivial to compute.
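Indeed, the conditional MLE amounts to an OLS regression of $y_t$ on a constant and $y_{t-1}$, as in this minimal sketch (the helper name and parameter values are invented for illustration):

```python
import numpy as np

def ar1_conditional_mle(y):
    """Conditional MLE of an AR(1): OLS of y_t on (1, y_{t-1}), t = 2..T,
    with sigma^2_hat = RSS/(T-1) as in the notes."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    b = np.linalg.lstsq(X, y[1:], rcond=None)[0]     # (alpha_hat, phi_hat)
    resid = y[1:] - X @ b
    return b, resid @ resid / (len(y) - 1)

rng = np.random.default_rng(8)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.4 + 0.7 * y[t - 1] + rng.normal()
print(ar1_conditional_mle(y))    # approx (0.4, 0.7) and sigma^2 near 1
```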
$$Y_t = \alpha + \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + e_t \tag{1.10.21}$$
$$\ln f_{Y_T,\dots,Y_{p+1}|Y_p,\dots,Y_1}(y_T, \dots, y_{p+1} \mid y_p, \dots, y_1; \boldsymbol{\theta})$$
$$= -\frac{T-p}{2}\ln 2\pi - \frac{T-p}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=p+1}^{T}\big(Y_t - \alpha - \phi_1 Y_{t-1} - \cdots - \phi_p Y_{t-p}\big)^2 \tag{1.10.22}$$
The conditional ML estimate of $\sigma^2$ turns out to be
$$\hat{\sigma}^2 = \frac{1}{T-p}\sum_{t=p+1}^{T}\big(Y_t - \hat{\alpha} - \hat{\phi}_1 Y_{t-1} - \cdots - \hat{\phi}_p Y_{t-p}\big)^2$$
$$Y_t = \mu + e_t + \theta e_{t-1} \tag{1.10.24}$$
Or
$$f_{Y_t|e_{t-1}}(y_t \mid e_{t-1}; \boldsymbol{\theta}) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2}\big(y_t - \mu - \theta e_{t-1}\big)^2\right\} \tag{1.10.25}$$
$$e_t = y_t - \mu - \theta e_{t-1} \tag{1.10.26}$$
$$f_{Y_t|Y_{t-1},\dots,Y_1,e_0=0}(y_t \mid y_{t-1}, \dots, y_1, e_0 = 0; \boldsymbol{\theta}) = f_{Y_t|e_{t-1}}(y_t \mid e_{t-1}; \boldsymbol{\theta}) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{e_t^2}{2\sigma^2}\right\} \tag{1.10.27}$$
$$f_{Y_T,\dots,Y_1|e_0=0}(y_T, \dots, y_1 \mid e_0 = 0; \boldsymbol{\theta}) = f_{Y_1|e_0=0}(y_1 \mid e_0 = 0; \boldsymbol{\theta}) \prod_{t=2}^{T} f_{Y_t|Y_{t-1},\dots,Y_1,e_0=0}(y_t \mid y_{t-1}, \dots, y_1, e_0 = 0; \boldsymbol{\theta})$$
$$Y_t = \mu + e_t + \theta_1 e_{t-1} + \cdots + \theta_q e_{t-q} \tag{1.10.29}$$
$$e_0 = e_{-1} = \cdots = e_{-q+1} = 0 \tag{1.10.30}$$
From these starting values we can iterate on
$$e_t = Y_t - \mu - \theta_1 e_{t-1} - \cdots - \theta_q e_{t-q} \tag{1.10.31}$$
for $t = 1, 2, \dots, T$.
Let $\mathbf{e}_0$ denote the $(q \times 1)$ vector $(e_0, e_{-1}, \dots, e_{-q+1})'$.
The conditional log likelihood is then
$$\ell(\boldsymbol{\theta}) = -\frac{T}{2}\ln 2\pi - \frac{T}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^{T} e_t^2 \tag{1.10.32}$$
where $\boldsymbol{\theta} = (\mu, \theta_1, \dots, \theta_q, \sigma^2)'$.
Again, expression (1.10.32) is useful only if all values of $z$ for which
$$1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q = 0$$
lie outside the unit circle.
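The recursion (1.10.31) and the conditional log likelihood (1.10.32) translate almost line for line into code. An illustrative sketch (the function name is invented; an MA(1) is used as the test case, with arbitrary values):

```python
import numpy as np

def ma_conditional_loglik(y, mu, thetas, s2):
    """Conditional log likelihood (1.10.32) of an MA(q), using the e_t
    recursion (1.10.31) with starting values e_0 = ... = e_{-q+1} = 0."""
    q = len(thetas)
    e = np.zeros(len(y) + q)            # e[:q] hold the zero starting values
    th = np.asarray(thetas)
    for t in range(len(y)):
        past = e[t:t + q][::-1]         # e_{t-1}, ..., e_{t-q}
        e[t + q] = y[t] - mu - th @ past
    T, resid = len(y), e[q:]
    return -T / 2 * np.log(2 * np.pi) - T / 2 * np.log(s2) - resid @ resid / (2 * s2)

rng = np.random.default_rng(9)
eps = rng.normal(size=1001)
y = eps[1:] + 0.4 * eps[:-1]            # MA(1) with mu = 0, theta = 0.4
print(ma_conditional_loglik(y, 0.0, [0.4], 1.0))
```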
Conditional LF for a Gaussian ARMA(p,q) Process
$$Y_t = \alpha + \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + e_t + \theta_1 e_{t-1} + \cdots + \theta_q e_{t-q} \tag{1.10.33}$$
where $e_t \sim \text{i.i.d. } N(0, \sigma^2)$.
$$e_t = Y_t - \alpha - \phi_1 Y_{t-1} - \cdots - \phi_p Y_{t-p} - \theta_1 e_{t-1} - \cdots - \theta_q e_{t-q} \tag{1.10.34}$$
for $t = 1, 2, \dots, T$.
$$\ell(\boldsymbol{\theta}) = -\frac{T}{2}\ln 2\pi - \frac{T}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^{T} e_t^2 \tag{1.10.35}$$
$$1 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_q z^q = 0$$
$$y_t = \alpha + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_k y_{t-k} + e_t$$
$$\operatorname{plim} \hat{\phi}_k = 0 \quad \text{if } k > p \tag{1.11.3}$$
This way one can look at the PACF and test for each lag whether the partial autocorrelation coefficient is zero.
For a genuine AR($p$) model the partial autocorrelations will be close to zero after the $p$th lag.
For a moving average model it can be shown that the partial autocorrelations do not have a cut-off point but tail off to zero, just like the autocorrelations in an autoregressive model.
Diagnostic Checking
$$\text{BIC}(p) = \ln\hat{\sigma}^2(p) + \frac{p+1}{T}\ln T \tag{1.11.7}$$
For an ARMA($p,q$) model the criteria are given by
$$\text{AIC}(p+q) = \ln\hat{\sigma}^2(p+q) + \frac{2(p+q+1)}{T}$$
$$\text{BIC}(p+q) = \ln\hat{\sigma}^2(p+q) + \frac{p+q+1}{T}\ln T$$
Both criteria are likelihood based and represent a different trade-off between 'fit', as measured by the log-likelihood value, and 'parsimony', as measured by the number of free parameters, $p+q+1$ (assuming the models include a constant).
Usually the model with the smallest AIC or BIC value is preferred, although one can choose to deviate from this if the differences in criterion values are small for a subset of the models.
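As an illustration, one can fit AR($p$) models of increasing order by conditional OLS and tabulate both criteria; with a simulated AR(2), both should bottom out at $p = 2$. A sketch with invented helper names and arbitrary values (the AIC penalty $2(p+1)/T$ is used by analogy with the ARMA formula above):

```python
import numpy as np

def fit_ar(y, p):
    """Conditional OLS fit of an AR(p); returns the residual variance."""
    T = len(y)
    X = np.column_stack([np.ones(T - p)] +
                        [y[p - j - 1:T - j - 1] for j in range(p)])
    b = np.linalg.lstsq(X, y[p:], rcond=None)[0]
    resid = y[p:] - X @ b
    return resid @ resid / (T - p)

rng = np.random.default_rng(10)
T = 2000
y = np.zeros(T)
for t in range(2, T):                        # true model: AR(2)
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.normal()

for p in range(1, 6):
    s2 = fit_ar(y, p)
    aic = np.log(s2) + 2 * (p + 1) / T
    bic = np.log(s2) + (p + 1) * np.log(T) / T     # as in (1.11.7)
    print(p, aic, bic)                             # smallest value selects p
```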
$$\mathcal{F}_T = \{Y_{-\infty}, \dots, Y_{T-1}, Y_T\} \tag{1.12.1}$$
$$\hat{Y}_{T+h|T} \equiv E(Y_{T+h} \mid \mathcal{F}_T) \tag{1.12.3}$$
$$y_{T+1} = \phi y_T + e_{T+1}$$
Consequently,
$$\hat{y}_{T+1|T} = E(y_{T+1} \mid y_T, y_{T-1}, \dots) = \phi y_T + E(e_{T+1} \mid y_T, y_{T-1}, \dots) = \phi y_T \tag{1.12.4}$$
To predict two periods ahead ($h = 2$), we write
$$y_{T+2} = \phi y_{T+1} + e_{T+2}$$
$$\hat{y}_{T+2|T} = E(y_{T+2} \mid y_T, \dots) = \phi E(y_{T+1} \mid y_T, \dots) + E(e_{T+2} \mid y_T, \dots) = \phi^2 y_T \tag{1.12.5}$$
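In code, the AR(1) forecast rule is essentially one line (illustrative values only):

```python
import numpy as np

phi, yT = 0.8, 2.5

# AR(1): yhat_{T+h|T} = phi^h * y_T, cf. (1.12.4)-(1.12.5)
forecasts = [phi**h * yT for h in range(1, 6)]
print(np.round(forecasts, 4))    # geometric decay toward the mean (zero here)
```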
Then we have
$$E(y_{T+1} \mid y_T, \dots) = \theta E(e_T \mid y_T, \dots) = \theta e_T$$
$$E(y_{T+2} \mid y_T, \dots) = E(e_{T+2} \mid y_T, \dots) + \theta E(e_{T+1} \mid y_T, \dots) = 0$$
$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + e_t + \theta_1 e_{t-1} + \cdots + \theta_q e_{t-q}$$
$$\hat{y}_{T+h|T} = \phi_1 \hat{y}_{T+h-1|T} + \cdots + \phi_p \hat{y}_{T+h-p|T} + \theta_1 \hat{e}_{T+h-1|T} + \cdots + \theta_q \hat{e}_{T+h-q|T} \tag{1.12.8}$$
where $\hat{e}_{T+k|T}$ is the optimal predictor for $e_{T+k}$ at time $T$, and
$$\hat{y}_{T+k|T} = y_{T+k} \quad \text{if } k \leq 0$$
$$\hat{e}_{T+k|T} = 0 \quad \text{if } k > 0$$
$$\hat{e}_{T+k|T} = e_{T+k} \quad \text{if } k \leq 0$$
where the latter innovation can be solved from the autoregressive representation of the model.
$$y_t = \phi y_{t-1} + e_t + \theta e_{t-1}$$
Assuming invertibility,
$$y_t - \phi y_{t-1} = (1+\theta L)\, e_t$$
can be written as
$$e_t = (1+\theta L)^{-1}\big(y_t - \phi y_{t-1}\big) = \sum_{j=0}^{\infty} (-\theta)^j\big(y_{t-j} - \phi y_{t-1-j}\big)$$
$$\hat{y}_{T+1|T} = \phi y_T + \theta \sum_{j=0}^{\infty} (-\theta)^j\big(y_{T-j} - \phi y_{T-j-1}\big) \tag{1.12.9}$$
$$\hat{y}_{T+2|T} = \phi \hat{y}_{T+1|T} + \hat{e}_{T+2|T} + \theta \hat{e}_{T+1|T} = \phi \hat{y}_{T+1|T} \tag{1.12.10}$$
Prediction Accuracy
In addition to the prediction itself, it is important to know how accurate this prediction is.
Define the prediction error as $Y_{T+h} - \hat{Y}_{T+h|T}$ and the expected quadratic prediction error as
$$C_h = E\big(y_{T+h} - \hat{y}_{T+h|T}\big)^2 = \operatorname{var}(y_{T+h} \mid \mathcal{F}_T) \tag{1.12.11}$$
where $\hat{y}_{T+h|T} = E(y_{T+h} \mid \mathcal{F}_T)$. For the MA(1) model,
$$C_1 = \operatorname{var}(e_{T+1}) = \sigma^2$$
$$C_2 = \big(1+\theta^2\big)\sigma^2$$
As one would expect, the accuracy of the prediction decreases if we predict further into the future.
It will not, however, increase any further if $h$ is increased beyond 2.
This becomes clear if we compare the expected quadratic prediction error with that of a simple unconditional predictor.
The optimal predictor is given by
$$\hat{y}_{T+h|T} = E(y_{T+h} \mid y_T, y_{T-1}, \dots) = \sum_{j=0}^{\infty} \alpha_j\, E(e_{T+h-j} \mid e_T, e_{T-1}, \dots) = \sum_{j=h}^{\infty} \alpha_j\, e_{T+h-j}$$
Consequently, we have
$$E\big(y_{T+h} - \hat{y}_{T+h|T}\big)^2 = \sigma^2 \sum_{j=0}^{h-1} \alpha_j^2 \tag{1.12.12}$$
$$C_1 = \sigma^2,\quad C_2 = \sigma^2\big(1+\phi^2\big),\quad C_3 = \sigma^2\big(1+\phi^2+\phi^4\big),\ \text{etc.}$$
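For the AR(1) case ($\alpha_j = \phi^j$), $C_h$ is easily tabulated and can be seen to converge to the unconditional variance $\gamma_0 = \sigma^2/(1-\phi^2)$. An illustrative sketch (arbitrary values):

```python
import numpy as np

phi, sigma2 = 0.8, 1.0

# C_h = sigma^2 * sum_{j=0}^{h-1} phi^(2j) for an AR(1), cf. (1.12.12)
for h in range(1, 8):
    Ch = sigma2 * sum(phi ** (2 * j) for j in range(h))
    print(h, Ch)
print("gamma_0 =", sigma2 / (1 - phi**2))   # C_h -> gamma_0 as h grows
```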