Question 1
Estimate ARCH(1), ARCH(4) and GARCH(1,1) models using the nysewk data.
We have done so in the accompanying MatLab document. First, we compute the weekly
percentage growth in order to measure volatility. The data are shown in figure 1.
A quick visual assessment suggests that the series is covariance stationary.
A GARCH(p,q) model takes two parameters, where p is the number of lagged conditional
variances and q is the number of lagged squared residuals. For p = 0, the GARCH model
reduces to an ARCH(q) process, e.g. GARCH(0,2) = ARCH(2).
a)
Check that stationarity restrictions hold. Compare likelihood values. Which of the three
models do you prefer? But do the models have the same number of parameters?
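Let’s first review what an ARCH(q) and a GARCH(p,q) model consist of:
ARCH(q):
$$y_t = \mu + \rho y_{t-1} + \epsilon_t$$
$$\epsilon_t = \sigma_t u_t$$
$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2$$
GARCH(p,q):
$$y_t = \mu + \rho y_{t-1} + \epsilon_t$$
$$\epsilon_t = \sigma_t u_t$$
$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2$$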
We have then estimated each model separately. It is important, however, to be clear
about how $y_t$ is defined. Following the “Garch11Example.jl” script,
$$y_t = 100 \times \log\left(\frac{p_t}{p_{t-1}}\right)$$
where $p_t$ is the price at time t. That is, $y_t$ is the return on this market, defined as the
logarithmic difference of the price between two consecutive periods, multiplied by
100 to express it as a percentage growth rate.
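The return definition above can be sketched in a few lines. This is a Python illustration, not the accompanying MatLab code; the function name `pct_log_returns` and the price list are made up for the example and are not the nysewk data.

```python
import math

def pct_log_returns(prices):
    """Return 100 * log(p_t / p_{t-1}) for each pair of consecutive prices."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Hypothetical price path, for illustration only (NOT the nysewk data).
prices = [100.0, 101.0, 99.5, 100.2]
returns = pct_log_returns(prices)
print(returns)
```

A series of n prices gives n - 1 returns, and the log returns sum to the log return over the whole period.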
ARCH(1):
Following the suggestion in the “Garch11Example.jl” script, we have defined $\sigma_1^2$ as the
sample variance of the first ten observations.
Parameter Value
µ 0.1663
ρ -0.0017
ω 3.1711
α1 0.2537
The log-likelihood value for this model is,
$$\text{LogL} = -4.4632 \times 10^{3}$$
GARCH(1,1):
Following the suggestion in the “Garch11Example.jl” script, we have defined $\sigma_1^2$ as the
sample variance of the first ten observations.
Parameter Value
µ 0.1764
ρ 0.0017
ω 0.1597
α1 0.1123
β1 0.8536
ARCH(4):
Following the suggestion in the “Garch11Example.jl” script, we have defined $\sigma_i^2$ ($i =
1, 2, 3, 4$) as the sample variance of the first ten observations.
Parameter Value
µ 0.1945
ρ -0.0074
ω 2.2135
α1 0.2183
α2 0.1161
α3 0.0623
α4 0.0977
Conclusions
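1. Stationarity
A series is weakly stationary if its mean and autocovariances do not depend on t,
$$\mu_t = \mu, \quad \forall t$$
$$\gamma_{jt} = \gamma_j, \quad \forall t$$
For the ARCH and GARCH models, this requires the following restrictions on the
variance parameters:
$$\alpha_i \geq 0, \quad i = 1, \dots, q$$
$$\beta_i \geq 0, \quad i = 1, \dots, p$$
$$\sum_{i=1}^{q} \alpha_i + \sum_{i=1}^{p} \beta_i < 1$$
We can see from the tables above that these conditions are met for the ARCH and
the GARCH models we have estimated: all coefficients are non-negative, and the sums
are 0.2537 for ARCH(1), 0.4944 for ARCH(4) and 0.9659 for GARCH(1,1).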
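The restrictions can also be checked mechanically. The following Python sketch uses the point estimates from the tables above; the helper name `is_covariance_stationary` is ours and is not part of the accompanying script.

```python
def is_covariance_stationary(alphas, betas=()):
    """Check alpha_i >= 0, beta_i >= 0 and sum(alpha) + sum(beta) < 1."""
    coeffs = list(alphas) + list(betas)
    return all(c >= 0 for c in coeffs) and sum(coeffs) < 1

# (alphas, betas) taken from the estimation tables above.
estimates = {
    "ARCH(1)": ([0.2537], []),
    "ARCH(4)": ([0.2183, 0.1161, 0.0623, 0.0977], []),
    "GARCH(1,1)": ([0.1123], [0.8536]),
}

persistence = {name: sum(a) + sum(b) for name, (a, b) in estimates.items()}
checks = {name: is_covariance_stationary(a, b) for name, (a, b) in estimates.items()}

for name in estimates:
    print(name, round(persistence[name], 4), checks[name])
```

The GARCH(1,1) sum is the closest to one of the three, which means volatility shocks die out slowly in that model.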
b)
2. Likelihood
The GARCH(1,1) model reports the highest log-likelihood, indicating that this
specification is the most likely to have generated the observed data. The LogL are ordered as,
The models do not have the same number of parameters, however: from the tables
above, ARCH(1) has 4, GARCH(1,1) has 5 and ARCH(4) has 7. Raw log-likelihoods are
therefore not directly comparable, and an information criterion that penalizes extra
parameters is a fairer basis for comparison. We can see from the table above that the
GARCH(1,1) model, according to both the AIC and the BIC, is the best model for us to
select, given that from a set of candidate models for the data, the preferred model is
the one with the minimum AIC/BIC value. As we know from the class notes, the idea
behind a GARCH model is that a specification with low values of p and q may fit the
data as well as or better than an ARCH model with large q.
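For reference, the two criteria can be computed from a log-likelihood and a parameter count as sketched below in Python. Only the ARCH(1) log-likelihood is reported in the text, so the example uses that model alone. The sample size `n_obs` is a hypothetical placeholder, since the number of observations is not stated here, so the BIC value is purely illustrative.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 logL (smaller is better)."""
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 logL (smaller is better)."""
    return n_params * math.log(n_obs) - 2 * loglik

arch1_loglik = -4.4632e3  # reported above for ARCH(1)
arch1_params = 4          # mu, rho, omega, alpha_1
n_obs = 2000              # HYPOTHETICAL sample size, for illustration only

arch1_aic = aic(arch1_loglik, arch1_params)
arch1_bic = bic(arch1_loglik, arch1_params, n_obs)
print(arch1_aic, arch1_bic)
```

Because ln(n) exceeds 2 for any n above 7, the BIC penalizes each extra parameter more heavily than the AIC, which matters when comparing the 4-, 5- and 7-parameter models.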
Question 2
Estimate (by ML) the same GARCH(1,1) model as in the previous problem using the
nysewk.gdt data set. Do you get the same parameter estimates? Why or why not, explain.
As we know, when doing Maximum Likelihood Estimation we have to make
distributional assumptions about the errors of the model.
We first recall what a GARCH(1,1) model consists of (following the notation from the
previous exercise):
$$y_t = \mu + \rho y_{t-1} + \epsilon_t$$
$$\epsilon_t = \sigma_t u_t$$
$$\sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2$$
Then,
$$f(y_t \mid \sigma_t; \theta) = \frac{1}{\sqrt{2\pi\sigma_t^2}} \exp\left(-\frac{\epsilon_t^2}{2\sigma_t^2}\right)$$
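So that the log-likelihood function for GARCH(1,1) is:
$$L(\theta) = \sum_{t=1}^{n} \left[ -\frac{1}{2}\ln(2\pi) - \frac{1}{2}\ln(\sigma_t^2) - \frac{\epsilon_t^2}{2\sigma_t^2} \right]$$
where
$$\sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2$$
Subject to,
$$\alpha \geq 0, \quad \beta \geq 0, \quad \alpha + \beta < 1$$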
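To make the objective concrete, here is a minimal Python sketch of the Gaussian log-likelihood under the AR(1) mean equation and GARCH(1,1) variance recursion written above. It is not the Matlab routine used for the results that follow. The function name `garch11_loglik` and the `n_init` argument are ours. As an assumption borrowed from Question 1, the first conditional variance is set to the sample variance of the first ten observations.

```python
import math

def garch11_loglik(y, mu, rho, omega, alpha, beta, n_init=10):
    """Gaussian log-likelihood of an AR(1)-GARCH(1,1) model for the series y."""
    # Residuals of the mean equation y_t = mu + rho * y_{t-1} + eps_t.
    eps = [y[t] - mu - rho * y[t - 1] for t in range(1, len(y))]

    # ASSUMPTION: initial variance = sample variance of the first n_init observations.
    first = y[:n_init]
    mean_first = sum(first) / len(first)
    sigma2 = sum((v - mean_first) ** 2 for v in first) / (len(first) - 1)

    loglik = 0.0
    for t, e in enumerate(eps):
        if t > 0:
            # sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2
            sigma2 = omega + alpha * eps[t - 1] ** 2 + beta * sigma2
        loglik += -0.5 * math.log(2 * math.pi) - 0.5 * math.log(sigma2) - e ** 2 / (2 * sigma2)
    return loglik

# Toy series, for illustration only (NOT the nysewk data).
toy = [1.0, -1.0] * 10
print(garch11_loglik(toy, mu=0.0, rho=0.0, omega=0.2, alpha=0.1, beta=0.8))
```

An optimizer would maximize this function over (µ, ρ, ω, α, β) subject to the constraints above. Different choices of initial variance or optimizer can lead to slightly different estimates.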
We can find the solution to this optimization problem by using the built-in “garch”
function in Matlab:
Description: “GARCH(1,1) Conditional Variance Model (Gaussian Distribution)”
Distribution: “Gaussian”
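Parameter   Value     Standard Error   T-Stat   P-Value
Constant    0.14266   0.03573          3.9926   6.5339 × 10^{-5}
GARCH(1)    0.8655    0.018483         46.828   0
ARCH(1)     0.10398   0.012698         8.1887   2.6415 × 10^{-16}
These estimates are close to, but not identical to, those from Question 1
(ω = 0.1597, α1 = 0.1123, β1 = 0.8536).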
Question 3
Write a Matlab script that generates two independent random walks, $x_t = x_{t-1} + u_t$ and
$y_t = y_{t-1} + u_t$, where the initial conditions are $x_0 = 1$ and $y_0 = 1$, and the two errors are
both iid $N(0, 1)$. Use a sample size of 1000: $t = 1, 2, \dots, 1000$.
In the accompanying MatLab script we have specified the following random walk
processes, where $\epsilon_t$ and $u_t$ are independent iid $N(0, 1)$ errors,
$$x_t = x_{t-1} + \epsilon_t$$
$$y_t = y_{t-1} + u_t$$
1. Regress y upon x and a constant.
The model we have to estimate is:
$$y_t = \beta_1 + \beta_2 x_t + e_t$$
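The experiment can be sketched as follows. This is a Python illustration using only the standard library, not the accompanying MatLab script; the helper names `random_walk` and `ols_simple` and the seed are ours. No output is quoted because the results depend on the random draw.

```python
import math
import random

def random_walk(n, start, rng):
    """Return observations t = 1..n of a random walk with N(0,1) shocks."""
    path, level = [], start
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def ols_simple(x, y):
    """OLS of y on a constant and x: intercept, slope, slope t statistic, R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    se_slope = math.sqrt(ssr / (n - 2) / sxx)  # classical (non-robust) standard error
    return {"intercept": intercept, "slope": slope,
            "t_slope": slope / se_slope, "r2": 1.0 - ssr / sst}

rng = random.Random(0)           # arbitrary seed, for reproducibility only
x = random_walk(1000, 1.0, rng)  # independent shocks for x ...
y = random_walk(1000, 1.0, rng)  # ... and for y
fit = ols_simple(x, y)
print(fit)
```

Re-running with different seeds shows how much the slope, its t statistic and R² move from draw to draw, even though x and y are independent by construction.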
2. Discuss your findings, especially the slope coefficient, the t statistic of the slope, and
$R^2$. Are the findings sensible, given that we know that x has nothing to do with y?
We organize the answer to this question by commenting on each of the findings
separately. First, it is worth mentioning that the estimates of the slope coefficient
and $R^2$ are very sensitive to the random draw: when the random generator process is
run multiple times in MatLab the values change significantly, while consistently
supporting the general conclusions we present:
$$y_t = x_t + e_t$$
And since,
$$y_t = \phi y_{t-1} + u_t$$
$$x_t = \rho x_{t-1} + \epsilon_t$$
where $\phi = \rho = 1$.
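To prove that the variances of $y_t$ and $x_t$ depend on t, we work with
$y_t$ (the argument is the same for $x_t$):
$$y_t = y_{t-1} + u_t$$
And,
$$y_t = y_{t-2} + u_{t-1} + u_t$$
Substituting back to the initial condition gives $y_t = y_0 + \sum_{i=1}^{t} u_i$, so that, since
the $u_i$ are iid,
$$Var(y_t) = t\sigma_u^2$$
and, by the same argument, $Var(x_t) = t\sigma_\epsilon^2$.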
So we can clearly see that the variances of both $y_t$ and $x_t$ depend on t. This
happens because φ = ρ = 1, whereas a necessary condition for the variance not to
depend on t is that |φ| < 1 and |ρ| < 1, which is not satisfied in this case. Therefore,
the variances of both $y_t$ and $x_t$ “explode” (grow without bound) as we increase the
number of observations. Moreover, we can also see that the initial conditions $y_0$ and
$x_0$ do not affect the variance.
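The linear growth of the variance can be checked with a small Monte Carlo experiment. The Python sketch below is ours and is not part of the accompanying script. It simulates many independent random walks with N(0,1) shocks, starting at 1, and computes the cross-sectional variance at each date; the number of paths, the horizon and the seed are arbitrary choices.

```python
import random

def variance_by_time(n_paths, horizon, y0, rng):
    """Cross-sectional variance of y_t across simulated paths, for t = 1..horizon."""
    sums = [0.0] * horizon
    sums_sq = [0.0] * horizon
    for _ in range(n_paths):
        y = y0
        for t in range(horizon):
            y += rng.gauss(0.0, 1.0)  # y_t = y_{t-1} + u_t
            sums[t] += y
            sums_sq[t] += y * y
    # Var = E[y^2] - (E[y])^2, estimated across paths at each date.
    return [sums_sq[t] / n_paths - (sums[t] / n_paths) ** 2 for t in range(horizon)]

rng = random.Random(0)  # arbitrary seed
variances = variance_by_time(n_paths=2000, horizon=50, y0=1.0, rng=rng)
print(variances[0], variances[9], variances[49])
```

With unit shock variance the three printed values should be roughly 1, 10 and 50, in line with Var(y_t) = tσ². Changing `y0` shifts the mean of the paths but not these variances.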
4. Which of the assumptions of the classical linear regression model are not satisfied
by this data generating process?
This model (for both $y_t$ and $x_t$) breaks the assumption of no autocorrelation of the
error terms, so the classical assumption of spherical errors is not satisfied in this
case. To show it formally, let’s look at the covariance between $y_t$ and $y_{t-1}$,
$$\gamma_1 = Cov(y_t, y_{t-1}) = Cov(y_{t-1} + u_t, y_{t-1}) = Var(y_{t-1}) = (t-1)\sigma_u^2 \neq 0$$
Since x is unrelated to y, the regression error $e_t$ inherits this dependence, so that
for $i \neq j$
$$E(e_i e_j \mid x) \neq 0$$
So the spherical error assumption is not satisfied for $y_t$. The same proof shows
that it is not satisfied for $x_t$ either.
5. Present estimation results using transformation(s) of y and/or x so that the regression
using the transformed variables confirms that there is no relationship between
the variables. Explain why the transformation(s) you use are successful in eliminating
the problem of a spurious relationship.
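We propose taking first differences to show that there is no relationship between
x and y:
$$y_t - y_{t-1} = \beta_1 + \beta_2 (x_t - x_{t-1}) + w_t$$
The transformation works because the first differences are just the shocks,
$y_t - y_{t-1} = u_t$ and $x_t - x_{t-1} = \epsilon_t$, which are stationary and independent of each
other, so the classical assumptions hold for the transformed regression.
The resulting $R^2$ is 0.0005. Clearly, these results confirm that there is no relationship
between x and y, since the differenced series are uncorrelated (the p-value is high).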
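The effect of differencing can be illustrated by comparing the R² of the regression in levels with that of the regression in first differences. The Python sketch below is ours, with hypothetical helper names and an arbitrary seed, and it will not reproduce the exact figure reported above. It relies on the fact that, in a regression on a constant and a single regressor, R² equals the squared sample correlation.

```python
import random

def r_squared(x, y):
    """R^2 of OLS of y on a constant and x (the squared sample correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)

def diff(series):
    """First differences: series[t] - series[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]

rng = random.Random(1)  # arbitrary seed
x, y = [1.0], [1.0]     # initial conditions x_0 = y_0 = 1
for _ in range(1000):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
    y.append(y[-1] + rng.gauss(0.0, 1.0))

r2_levels = r_squared(x, y)
r2_diffs = r_squared(diff(x), diff(y))
print(r2_levels, r2_diffs)
```

Across seeds, the R² in differences stays near zero, while the R² in levels varies widely from draw to draw.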
Question 4
Suppose that data follows a MA(1) process with a constant: $y_t = \alpha_0 + \epsilon_t + \phi_0 \epsilon_{t-1}$ ($t =
1, 2, \dots, T$), where $V(\epsilon_t) = \sigma_0^2$ ($\forall t$), and the $\epsilon_t$ are white noise shocks. Assume that the
parameters satisfy restrictions so that the process is invertible (this is a technical detail,
don’t let it confuse you).
4.a)
We have solved this exercise both analytically and computationally to see how closely the
results agree. In Matlab, we have generated the data with the following parameters:
$$\alpha_0 = 1, \quad \phi_0 = 0.5 \quad \text{and} \quad \epsilon_t \sim \text{iid } N(0, 1)$$
i) Mean
Analytical result:
$$E(y_t) = E(\alpha_0 + \epsilon_t + \phi_0 \epsilon_{t-1}) = E(\alpha_0) = \alpha_0$$
Computational results:
Sample size 10 100 1000 5000
Mean 0.7747 0.8352 1.0527 0.9899
Since $\alpha_0 = 1$, the sample mean gets closer to the true value as we increase the
sample size.
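ii) Variance
Analytical results:
$$Var(y_t) = E(y_t^2) - [E(y_t)]^2$$
$$\Rightarrow E[\alpha_0^2 + \epsilon_t^2 + \phi_0^2 \epsilon_{t-1}^2 + 2\alpha_0 \epsilon_t + 2\alpha_0 \phi_0 \epsilon_{t-1} + 2\phi_0 \epsilon_t \epsilon_{t-1}] - \alpha_0^2$$
$$\Rightarrow \alpha_0^2 + \sigma_0^2 + \phi_0^2 \sigma_0^2 - \alpha_0^2 = \sigma_0^2 [1 + \phi_0^2]$$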
Computational results:
Since $\sigma_0^2 = 1$ and $\phi_0 = 0.5$, so that $\sigma_0^2 (1 + \phi_0^2) = 1.25$, we can see that the sample
variance gets closer to the true value as we increase the sample size.
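iii) First-order autocovariance
Analytical results:
$$\gamma_1 = Cov(y_t, y_{t-1}) = E(y_t y_{t-1}) - E(y_t)E(y_{t-1}) = E([\alpha_0 + \epsilon_t + \phi_0 \epsilon_{t-1}][\alpha_0 + \epsilon_{t-1} + \phi_0 \epsilon_{t-2}]) - \alpha_0^2$$
$$= \alpha_0^2 + \phi_0 \sigma_0^2 - \alpha_0^2 = \phi_0 \sigma_0^2$$
Since $\sigma_0^2 = 1$ and $\phi_0 = 0.5$, so that $\sigma_0^2 \phi_0 = 0.5$, we can see that the sample first-order
autocovariance gets closer to the true value as we increase the sample size.
iv) Second-order autocovariance
Analytical results:
$$\gamma_2 = Cov(y_t, y_{t-2}) = E(y_t y_{t-2}) - E(y_t)E(y_{t-2}) = E([\alpha_0 + \epsilon_t + \phi_0 \epsilon_{t-1}][\alpha_0 + \epsilon_{t-2} + \phi_0 \epsilon_{t-3}]) - \alpha_0^2$$
$$= \alpha_0^2 - \alpha_0^2 = 0$$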
Computational results:
We can see that the sample second-order autocovariance gets closer to the true value
as we increase the sample size.
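The computational side of 4.a) can be sketched as follows. This is a Python illustration, not the Matlab code behind the reported figures, so the numbers it prints will differ from the tables; the helper names and the seed are ours. It uses the parameters stated above (α0 = 1, φ0 = 0.5, unit-variance Gaussian shocks) and the same four sample sizes.

```python
import random

def simulate_ma1(n, alpha0, phi0, rng):
    """Simulate y_t = alpha0 + eps_t + phi0 * eps_{t-1} for t = 1..n."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]  # eps[0] is the pre-sample shock
    return [alpha0 + eps[t] + phi0 * eps[t - 1] for t in range(1, n + 1)]

def sample_autocov(y, lag):
    """Sample autocovariance at the given lag (lag 0 gives the variance)."""
    n = len(y)
    mean = sum(y) / n
    return sum((y[t] - mean) * (y[t - lag] - mean) for t in range(lag, n)) / n

rng = random.Random(42)  # arbitrary seed
results = {}
for n in (10, 100, 1000, 5000):
    y = simulate_ma1(n, alpha0=1.0, phi0=0.5, rng=rng)
    results[n] = {
        "mean": sum(y) / n,                 # theory: alpha0 = 1
        "variance": sample_autocov(y, 0),   # theory: 1.25
        "gamma1": sample_autocov(y, 1),     # theory: 0.5
        "gamma2": sample_autocov(y, 2),     # theory: 0
    }

for n, stats in results.items():
    print(n, stats)
```

The small samples can be far from the theoretical values, while the n = 5000 row should be close to 1, 1.25, 0.5 and 0.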
4.b)
Is the process covariance stationary? Plot and explain.
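Yes, this MA(1) process is covariance stationary since:
• The mean $E(y_t) = \alpha_0$ does not depend on t.
• The variance $\sigma_0^2 (1 + \phi_0^2)$ is finite and does not depend on t.
• The autocovariance $\gamma_s$ does not depend on t, only on s. We have shown that for s = 1,
$\gamma_s = \phi_0 \sigma_0^2$ and for any s ≥ 2, $\gamma_s = 0$.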
We have plotted one graph for each sample size.
As you can see in the graphs above, the mean of each sample is plotted in red. As
the number of observations increases, the graphs appear to show stationarity, as the points
are evenly dispersed around the mean. Furthermore, the graphs appear to display no
increase or decrease in the autocovariance across the observation period.
We can see in the graphs that the value of $y_t$ is always centered on $\alpha_0 = 1$. Moreover, we
can see that the variance seems constant across time. This pattern of the mean and the
variance becomes clearer as we increase the sample size.
In addition, in the graphs with 10 and 100 observations (it is more difficult to appreciate
in the graphs with 1000 and 5000 observations) we can also see that shocks persist for
one period, as suggested by $\gamma_1 > 0$ and $\gamma_k = 0$ ($\forall k > 1$).
4.c)
Is the process ergodic (for the mean)? Plot and explain.
A stationary stochastic process is ergodic (for the mean) if the time average converges
to the mean:
$$\frac{1}{n}\sum_{t=1}^{n} y_t \to \mu$$
We have shown numerically in question 4.a), and it can be seen graphically in the figure
of question 4.b), that this stochastic process is ergodic for the mean: the time average
converges to the true value of the mean, since the observations are all centered around
the mean ($\alpha_0$), and this becomes clearer as we increase the sample size. Recall that
$\alpha_0 = 1$; the mean of each sample clearly converges to 1 as the sample size increases,
which is consistent with the process being ergodic.
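The numerical evidence can be backed by an analytical argument. A standard textbook sufficient condition, stated here as an addition rather than as part of the original answer, is that a covariance stationary process with absolutely summable autocovariances is ergodic for the mean. Using the autocovariances derived in 4.a) and the parameter values σ0² = 1 and φ0 = 0.5:

```latex
\sum_{j=-\infty}^{\infty} |\gamma_j|
  = \gamma_0 + 2|\gamma_1|
  = \sigma_0^2 (1 + \phi_0^2) + 2|\phi_0| \sigma_0^2
  = 1.25 + 1 = 2.25 < \infty

Var\left(\frac{1}{n}\sum_{t=1}^{n} y_t\right)
  = \frac{1}{n^2}\left[ n\gamma_0 + 2(n-1)\gamma_1 \right]
  \to 0 \quad \text{as } n \to \infty
```

Since the sample mean has expectation α0 and a variance that shrinks to zero, it converges to α0 in mean square, which is the sense of convergence behind the definition above.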