Models
ARCH MODEL AND TIME-VARYING VOLATILITY
In this lesson we'll use Stata to estimate several models in which the variance of the dependent
variable changes over time. These are broadly referred to as ARCH (autoregressive conditional
heteroskedasticity) models and there are many variations upon the theme. Again, this is all
covered in POE4.
The first thing to do is illustrate the problem graphically using data on stock returns. The data
are stored in the Stata dataset returns.dta.
use returns, clear
The data contain four monthly stock price indices: U.S. Nasdaq (nasdaq), the Australian All
Ordinaries (allords), the Japanese Nikkei (nikkei) and the U.K. FTSE (ftse). The data are
recorded monthly beginning in 1988m1 and ending in 2009m7.
gen date = m(1988m1) + _n - 1
format date %tm
tsset date
Plots of the series in their levels are generated using twoway(tsline varname).
qui tsline nasdaq, name(nas, replace)
qui tsline allords, name(a, replace)
qui tsline ftse, name(f, replace)
qui tsline nikkei, name(nk, replace)
graph combine nas a f nk, cols(2) name(all1, replace)
The series are characterized by random, rapid changes and are said to be volatile. The volatility
seems to change over time as well. For instance, U.S. stock returns (nasdaq) experience a
relatively sedate period from 1992 to 1996. Then, stock returns become much more volatile until
early 2004. Volatility increases again at the end of the sample. The other series exhibit similar
periods of relative calm followed by increased volatility.
Next, the histogram command is used to generate graphs of the empirical distribution of
returns. A curve from a normal distribution is overlaid using the normal option.
qui histogram nasdaq, normal name(nas, replace)
qui histogram allords, normal name(a, replace)
qui histogram ftse, normal name(f, replace)
qui histogram nikkei, normal name(nk, replace)
graph combine nas a f nk, cols(2) name(all2, replace)

[Figure: time series plots of the four series against date (1990m1 to 2010m1), in four panels: NASDAQ Stock Index (USA), All Ordinaries Stock Index (Australia), FTSE Stock Index (UK), Nikkei Stock Index (Japan).]
Time-Varying Volatility and ARCH Models
These series are leptokurtic. That means they have lots of observations around the average and a
relatively large number of observations that are far from average; the center of the histogram has
a high peak and the tails are relatively heavy compared to the normal.
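Leptokurtosis can be checked numerically as well as visually. The sketch below is written in Python rather than Stata, purely to show the arithmetic; the function name excess_kurtosis and both toy samples are assumptions for illustration and are not taken from returns.dta. It computes the fourth central moment divided by the squared variance, minus 3, so a positive value signals heavier tails than the normal.

```python
def excess_kurtosis(x):
    """Sample excess kurtosis: m4 / m2**2 - 3, using population (1/n) moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # second central moment
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    return m4 / m2 ** 2 - 3.0

# Toy data (hypothetical): most points at the center, two far from it.
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -3, 3]
# Toy data (hypothetical) with no tails: every point one unit from the mean.
flat = [-1, 1, -1, 1]

print(excess_kurtosis(peaked))  # positive: leptokurtic
print(excess_kurtosis(flat))    # negative: thinner tails than the normal
```

In Stata itself, summarize varname, detail reports kurtosis without subtracting 3, so the normal benchmark there is 3 rather than 0.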
TESTING, ESTIMATING, AND FORECASTING
The basic ARCH models consist of two equations. The mean equation describes the behavior of
the mean of your time series; it is a linear regression function that contains a constant and
possibly some explanatory variables. In the cases considered below, the mean function contains
only an intercept.
$$y_t = \beta + e_t$$
In this case we expect the time series to vary randomly about its mean, $\beta$. If the mean of your
time series drifts over time or is explained by other variables, you'd add them to this equation just
as you would in the usual regression model. The error of the regression is normally distributed
and heteroskedastic. The variance of the current period's error depends on information that is
revealed in the preceding period. The variance of $e_t$ is given the symbol $h_t$. The variance
equation describes how the error variance behaves.
$$h_t = \alpha_0 + \alpha_1 e_{t-1}^2$$
[Figure: histograms of the four series with a normal density overlaid (vertical axis: Density), in four panels: NASDAQ Stock Index (USA), All Ordinaries Stock Index (Australia), FTSE Stock Index (UK), Nikkei Stock Index (Japan).]
Notice that $h_t$ depends on the squared error in the preceding time period. The parameters in this
equation have to be positive to ensure that the variance, $h_t$, is positive.
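To see how the two equations interact, it can help to simulate them. The following is a minimal Python sketch, not Stata code and not part of the POE4 examples: the function name simulate_arch1, the parameter values, the seed, and the choice of zero for the first lagged error are all assumptions made for illustration.

```python
import random

def simulate_arch1(n, beta, alpha0, alpha1, seed=0):
    """Simulate y_t = beta + e_t with e_t ~ N(0, h_t), h_t = alpha0 + alpha1 * e_{t-1}^2."""
    rng = random.Random(seed)
    y, h = [], []
    e_prev = 0.0                              # assumed starting value for the lagged error
    for _ in range(n):
        h_t = alpha0 + alpha1 * e_prev ** 2   # variance equation
        e_t = rng.gauss(0.0, h_t ** 0.5)      # draw an error with variance h_t
        y.append(beta + e_t)                  # mean equation
        h.append(h_t)
        e_prev = e_t
    return y, h

# Hypothetical parameter values chosen only for illustration.
y, h = simulate_arch1(500, beta=1.0, alpha0=0.5, alpha1=0.5)
print(min(h) >= 0.5)   # True: h_t never falls below alpha0 when alpha1 >= 0
```

With positive parameters the simulated variance is bounded below by alpha0, which is the positivity requirement stated above; a large shock in one period raises the variance of the next period's error.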
A Lagrange Multiplier (LM) test can be used to test for the presence of ARCH effects (i.e.,
whether $\alpha_1 > 0$). To perform this test, first estimate the mean equation. Save and square the
estimated residuals, $\hat e_t^2$. You will use these in an auxiliary regression from which you'll use the
sample size and goodness-of-fit measure to compute a test statistic. For first order ARCH, regress
$\hat e_t^2$ on the lagged squared residuals $\hat e_{t-1}^2$ and a constant:
$$\hat e_t^2 = \gamma_0 + \gamma_1 \hat e_{t-1}^2 + v_t$$
where $v_t$ is a random term. The null and alternative hypotheses are:
$$H_0: \gamma_1 = 0$$
$$H_1: \gamma_1 \ne 0$$
The test statistic is $TR^2$, where T is the number of observations in the auxiliary regression. It has a
$\chi^2(1)$ distribution if the null hypothesis is true. Compare the p-value from this statistic to the
desired test level ($\alpha$) and reject the null if the p-value is smaller. If you suspect a higher order
ARCH(q) error variance process, then include q lags of $\hat e_t^2$ as regressors, compute $TR^2$, and use
the $\chi^2(q)$ distribution to obtain the p-value.
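The first-order version of this procedure can be sketched in a few lines. The Python function below is an illustration only (the name arch_lm_stat and the sample residuals are hypothetical); it assumes the residuals from the mean equation are already available and that their squares are not constant. Because the auxiliary regression has a single regressor, its R-squared equals the squared correlation between the squared residuals and their first lag.

```python
def arch_lm_stat(ehat):
    """Return (T, TR2) for the first-order ARCH LM test from residuals ehat."""
    e2 = [e * e for e in ehat]
    y, x = e2[1:], e2[:-1]            # regress e2_t on a constant and e2_{t-1}
    T = len(y)                        # observations in the auxiliary regression
    ybar, xbar = sum(y) / T, sum(x) / T
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    r2 = sxy ** 2 / (sxx * syy)       # R-squared of a one-regressor OLS fit
    return T, T * r2

# Hypothetical residuals: a calm stretch followed by a turbulent one.
resid = [0.1, -0.2, 0.1, 2.0, -2.5, 2.2, 0.1, -0.1]
T, TR2 = arch_lm_stat(resid)
print(T, TR2)
```

Note that one observation is lost to the lag, so T is one less than the number of residuals.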
In the first ARCH example the byd.dta data are used. Load the data using the clear option to
remove any previous data from Stata’s memory.
use byd, clear
This dataset contains a single undated time series. Generate a time variable in the easiest way
possible and declare the data to be time series.
gen time = _n
tsset time
In this instance, a time counter equal to the observation number is created using _n and stored
in the variable time. Then the tsset command is used to declare the data to be a time series.
The first thing to do is plot the time series using
tsline r, name(g1, replace)
This yields
There is visual evidence of time varying volatility. Towards the end of the time series, returns for
BYD appear to become more volatile. An ARCH(1) model is proposed and the ARCH(1) model
is tested against the null hypothesis of no ARCH using the LM test discussed above. The first
step is to estimate a regression that contains only an intercept. Obtain the residuals, which we call
ehat, and square them.
regress r
predict ehat, residual
gen ehat2 = ehat * ehat
The auxiliary regression $\hat e_t^2 = \gamma_0 + \gamma_1 \hat e_{t-1}^2 + v_t$ uses the lag operator L. to take a single lag to include as
a regressor in the auxiliary model.
regress ehat2 L.ehat2
The test statistic is $TR^2$ from this regression. The p-value is computed using the chi2tail
function. Remember, the first argument of chi2tail is the degrees of freedom for your test
(equal to q) and the second argument is the computed value of your statistic. Reject no ARCH if the
p-value is less than the desired significance level, $\alpha$. The Stata code is:
scalar TR2 = e(N)*e(r2)
scalar pvalue = chi2tail(1, TR2)
scalar crit = invchi2tail(1, .05)
scalar list TR2 pvalue crit
[Figure (from the tsline r command above): returns to shares in BrightenYourDay (BYD) Lighting plotted against time, 0 to 500.]
This yields the result:
. scalar list TR2 pvalue crit
       TR2 =  62.159504
    pvalue =  3.167e-15
      crit =  3.8414588
Stata also includes a built-in function to compute this test statistic. Using it will provide identical
results. First estimate the regression:
regress r
Then use the postestimation command archlm as shown below.
estat archlm, lags(1)
As we know, postestimation commands begin with estat, after which the archlm command is
issued. The archlm command uses the lags(q) option, where q is the order of the ARCH process
you wish to include in the alternative hypothesis. In this example q=1.
The results from the archlm command are:
. regress r

      Source |       SS       df       MS              Number of obs =     500
-------------+------------------------------           F(  0,   499) =    0.00
       Model |           0     0           .           Prob > F      =       .
    Residual |  700.737278   499  1.40428312           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared =  0.0000
       Total |  700.737278   499  1.40428312           Root MSE      =   1.185

------------------------------------------------------------------------------
           r |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   1.078294   .0529959    20.35   0.000     .9741716    1.182417
------------------------------------------------------------------------------

. estat archlm, lags(1)
LM test for autoregressive conditional heteroskedasticity (ARCH)

---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-------------------------------------------------------------
       1     |         62.160               1                   0.0000
---------------------------------------------------------------------------
         H0: no ARCH effects      vs.  H1: ARCH(p) disturbance

This is a particularly useful alternative to the manual process of computing $TR^2$ from an auxiliary
regression. The null and alternative hypotheses are clearly stated, the statistic and its distribution
are given, and the p-value is computed and shown in the default output. That means that Stata is
generating all the information you need to properly conduct the test. Excellent!
The archlm test can be accessed through the dialogs, but the process is fairly convoluted. Just
in case you haven't weaned yourself from using the pull-down menus yet, here is how. First you
need to estimate the mean equation using regression. Select Statistics > Linear models and
related > Linear regression. Choose r as the dependent variable (with no independent
variables) and click OK. Then, choose Statistics > Time series > Tests > Time-series tests after
regress.
This reveals the estat dialog box that we’ve seen before.
In this case, scroll down to the option Test for ARCH effects in the residuals (archlm – time
series only) and then specify the number of lags to be tested (1 as shown). Click OK.
In this example, the no ARCH effects hypothesis is rejected at the 5% level and we proceed to
estimation of the model.
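The p-value and critical value reported above can be cross-checked outside Stata. The sketch below is Python, not Stata; the helper names chi2tail_1df and invchi2tail_1df are hypothetical and only echo the Stata function names. It relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal, so it covers the q = 1 case only.

```python
import math
from statistics import NormalDist

def chi2tail_1df(x):
    """Upper-tail probability of a chi-square(1) variable: P(X > x) = erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))

def invchi2tail_1df(p):
    """Value c with P(X > c) = p for chi-square(1): the square of a N(0,1) quantile."""
    return NormalDist().inv_cdf(1.0 - p / 2.0) ** 2

TR2 = 62.159504                     # LM statistic reported in the output above
print(chi2tail_1df(TR2))            # p-value, about 3.167e-15
print(invchi2tail_1df(0.05))        # 5% critical value, about 3.8415
```

Since 62.16 is far above 3.84, the comparison of the statistic with the critical value and the comparison of the p-value with 0.05 lead to the same rejection.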
The basic ARCH model and all the variants considered below are estimated using the arch
command. The syntax is shown below:
arch depvar [indepvars] [if] [in] [weight] [, options]
After issuing the arch command, list the dependent variable, independent variables (if you have
any), and any conditionals or weights you may wish to use. Then, list the desired options. These
options are what make Stata's arch command very flexible and powerful.
For the ARCH(1) model of BYD, the option to use is simply arch(1). The complete
command syntax for an ARCH(1) model of BYD's returns is
arch r, arch(1)
which produces this output:
In the Stata output (but not shown) is a list of iterations; this gives a clue as to how this magic is
actually being performed. Iterations indicate that a nonlinear numerical optimization is being
done behind the scenes, in this case to maximize the likelihood function. The log likelihood
should be getting larger as the iterations proceed. If the numerical optimization somehow fails, an
error message will appear just after the (many) iterations.
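The function being maximized during those iterations can be written down directly. The Python sketch below is illustrative only: arch1_loglik is a hypothetical name, the errors are assumed Gaussian, and the presample squared error is set to the average squared residual, which is an assumption and need not match how Stata's arch command initializes the recursion. An optimizer would search over beta, alpha0 and alpha1 for the largest returned value.

```python
import math

def arch1_loglik(r, beta, alpha0, alpha1):
    """Gaussian log likelihood of r_t = beta + e_t, h_t = alpha0 + alpha1 * e_{t-1}^2."""
    e = [v - beta for v in r]
    e2_prev = sum(x * x for x in e) / len(e)   # assumed presample value: average e^2
    ll = 0.0
    for e_t in e:
        h_t = alpha0 + alpha1 * e2_prev        # conditional variance for period t
        ll += -0.5 * (math.log(2 * math.pi) + math.log(h_t) + e_t ** 2 / h_t)
        e2_prev = e_t ** 2
    return ll

r = [1.0, 2.0, 3.0]                            # toy data, not the BYD series
print(arch1_loglik(r, beta=2.0, alpha0=1.0, alpha1=0.0))   # about -3.757
```

With alpha1 set to zero the variance is constant and the expression collapses to the familiar log likelihood of a homoskedastic normal regression, which gives a simple way to sanity-check the function.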
The parameter estimates follow the iteration summary. In this case they match those in POE4,
but the standard errors are a little different. Don't worry about this; they are valid if the ARCH(1)
model is appropriate. So, in the BYD example, the average return is about 1.06%. The ARCH
term's t-ratio is statistically significant and you conclude that the variance is autoregressive
conditionally heteroskedastic (which for good measure should be repeated out loud three times).
To arrive at these results through the dialogs choose Statistics > Time series >
ARCH/GARCH > ARCH and GARCH models from the pull-down menu. This reveals the
arch – Autoregressive conditional heteroskedasticity family of estimators dialog box shown
below:
In this box choose r as the dependent variable and select a single lag in the ARCH maximum lag
box. Click OK and you are done. Note, you can choose longer maximum ARCH lags (i.e., q) or
even specify a list of lags in this dialog. The dialog is also used to estimate a generalization of
ARCH that is considered in the next section. Before moving on though, let's graph the estimated
future return $r_{t+1}$ and the conditional volatility $h_{t+1}$.
The forecasted return is just a constant in this case, since no explanatory variables other than
a constant were included in the regression portion of the ARCH model:
$$\hat r_{t+1} = \hat\beta_0 = 1.063$$
The forecasted error variance is essentially an in-sample prediction model based on the estimated
variance function.
$$\hat h_{t+1} = \hat\alpha_0 + \hat\alpha_1 (r_t - \hat\beta_0)^2 = 0.642 + 0.569 (r_t - 1.063)^2$$
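As a quick numerical illustration of the variance forecast, the Python sketch below plugs the rounded estimates quoted above (1.063, 0.642 and 0.569) into the formula. The function name arch1_variance_forecast and the two example returns are hypothetical; the function is not part of Stata.

```python
def arch1_variance_forecast(r_t, beta0=1.063, alpha0=0.642, alpha1=0.569):
    """One-step-ahead variance: alpha0 + alpha1 * (r_t - beta0)^2, rounded BYD estimates."""
    return alpha0 + alpha1 * (r_t - beta0) ** 2

print(arch1_variance_forecast(1.063))   # return at its mean: forecast equals alpha0
print(arch1_variance_forecast(3.063))   # return 2 points above the mean
```

A return equal to the estimated mean gives the minimum forecast, 0.642; a return two points away from the mean adds 0.569 times 4, for a forecast of about 2.918.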
Stata generates this whenever it estimates an ARCH model and saves the result to a variable using
the predict command with option variance. Here the ARCH(1) model is estimated and the
variance is saved as a variable called htarch.
arch r, arch(1)
predict htarch, variance
This could be generated manually using saved results from the estimated ARCH model
gen ht_1 = _b[ARCH:_cons]+_b[ARCH:L1.arch]*(L.r-_b[r:_cons])^2
list htarch ht_1 in 496/500
which produces:
. list htarch ht_1 in 496/500

     +---------------------+
     |   htarch       ht_1 |
     |---------------------|
496. | 1.412281   1.412281 |
497. | .8093833   .8093833 |
498. | 1.968768   1.968768 |
499. | 1.614941   1.614941 |
500. | 2.122526   2.122526 |
     +---------------------+

The built-in computation from Stata's predict command is confirmed by our manual calculation.
Then tsline is used to plot the forecast error variance against time.
tsline htarch, name(g2, replace)
This produces the time series plot
[Figure: Conditional variance, one-step, plotted against time (0 to 500).]
Obviously, there is a lot more volatility towards the end of the sample.
EXTENSIONS
An important extension of the ARCH(1) is the ARCH(q) model. Here, additional lags of the
squared residuals are added as determinants of the equation's variance, $h_t$:
$$h_t = \alpha_0 + \alpha_1 e_{t-1}^2 + \alpha_2 e_{t-2}^2 + \dots + \alpha_q e_{t-q}^2$$
GARCH
Another extension is the Generalized ARCH or GARCH model. The GARCH model adds lags
of the variance, $h_{t-p}$, to the standard ARCH. A GARCH(1,1) model would look like this:
$$h_t = \alpha_0 + \alpha_1 e_{t-1}^2 + \beta_1 h_{t-1}$$
It has one lag of the regression model's squared residual (1 ARCH term) and one lag of the variance itself
(1 GARCH term). Additional ARCH or GARCH terms can be added to obtain the GARCH(p,q),
where p is the number of lags for $h_t$ and q is the number of lags of $e_t^2$ included in the model.
Estimating a GARCH(1,1) model for BYD is simple. Basically, you just add a single
GARCH term to the existing ARCH model, so the command is
arch r, arch(1) garch(1)
The syntax is interpreted this way. We have an arch regression model that includes r as a
dependent variable and has no independent variables other than a constant. The first option
arch(1) tells Stata to add a single lagged value of the squared error, $e_t^2$, to the modeled variance; the second option
garch(1) tells Stata to add a single lag of the variance, $h_t$, to the modeled variance. The result is:
Iteration 7:   log likelihood = -736.02814

ARCH family regression

Sample: 1 - 500                                    Number of obs   =       500
Distribution: Gaussian                             Wald chi2(.)    =         .
Log likelihood = -736.0281                         Prob > chi2     =         .

------------------------------------------------------------------------------
             |                 OPG
           r |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
r            |
       _cons |   1.049856   .0404623    25.95   0.000     .9705517    1.129161
-------------+----------------------------------------------------------------
ARCH         |
        arch |
         L1. |   .4911796   .1015995     4.83   0.000     .2920482    .6903109
             |
       garch |
         L1. |   .2379837   .1114836     2.13   0.033     .0194799    .4564875
             |
       _cons |   .4009868   .0899182     4.46   0.000     .2247505    .5772232
------------------------------------------------------------------------------

The estimate of $\alpha_1$ is 0.491 and the estimated coefficient on the lagged variance, $\beta_1$, is 0.238.
Again, there are a few minor differences between these results and those in the text, but that is to
be expected when coefficient estimates have to be solved for via numerical methods rather than
analytical ones.
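The GARCH(1,1) variance is a simple recursion, which the Python sketch below spells out. It is an illustration rather than Stata's implementation: garch11_variance is a hypothetical helper, the starting variance h0 must be supplied by the user, and the long-run variance formula alpha0 / (1 - alpha1 - beta1) is a standard result for GARCH(1,1) that is not stated in this chapter and holds only when alpha1 + beta1 < 1.

```python
def garch11_variance(e, alpha0, alpha1, beta1, h0):
    """Return [h_1, ..., h_n] with h_t = alpha0 + alpha1 * e_{t-1}^2 + beta1 * h_{t-1}.

    e holds e_0, ..., e_{n-1}; h0 is an assumed starting variance.
    """
    h, h_prev = [], h0
    for e_prev in e:
        h_t = alpha0 + alpha1 * e_prev ** 2 + beta1 * h_prev
        h.append(h_t)
        h_prev = h_t
    return h

# Rounded BYD estimates from the output above.
a0, a1, b1 = 0.401, 0.491, 0.238
long_run = a0 / (1 - a1 - b1)        # implied unconditional variance (needs a1 + b1 < 1)
print(round(long_run, 3))            # 1.48

# With no shocks, the variance decays from a high starting value toward a0 / (1 - b1).
print(garch11_variance([0.0, 0.0, 0.0], a0, a1, b1, h0=5.0))
```

Because a1 + b1 is about 0.73 here, a burst of volatility dies out gradually rather than immediately, which is what the lagged variance term adds relative to ARCH(1).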
As in the ARCH model, the predicted forecast variance can be saved and plotted:
predict htgarch, variance
tsline htgarch
which yields the time series plot:
[Figure: Conditional variance, one-step, plotted against time (0 to 500).]
Threshold GARCH
The threshold GARCH model, or TGARCH, is another generalization of the GARCH model
where positive and negative news are treated asymmetrically. In the TGARCH version of the
model, the specification of the conditional variance is:
$$h_t = \alpha_0 + \alpha_1 e_{t-1}^2 + \gamma d_{t-1} e_{t-1}^2 + \beta_1 h_{t-1}$$
$$d_t = \begin{cases} 1 & e_t < 0 \text{ (bad news)} \\ 0 & e_t \ge 0 \text{ (good news)} \end{cases}$$
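The asymmetry in the TGARCH specification is easiest to see by feeding in two shocks of equal size and opposite sign. The Python sketch below follows the equation as written in the text, with the indicator equal to 1 for negative errors; the function name tgarch_variance and all parameter values are hypothetical and are not the BYD estimates.

```python
def tgarch_variance(e_prev, h_prev, alpha0, alpha1, gamma, beta1):
    """h_t = alpha0 + alpha1*e_{t-1}^2 + gamma*d_{t-1}*e_{t-1}^2 + beta1*h_{t-1}."""
    d_prev = 1 if e_prev < 0 else 0          # 1 for bad news, 0 for good news
    return alpha0 + (alpha1 + gamma * d_prev) * e_prev ** 2 + beta1 * h_prev

# Hypothetical parameter values, chosen only to show the asymmetry.
params = dict(alpha0=0.5, alpha1=0.25, gamma=0.5, beta1=0.25)
print(tgarch_variance(+2.0, 1.0, **params))  # good news of size 2 -> 1.75
print(tgarch_variance(-2.0, 1.0, **params))  # bad news of size 2  -> 3.75
```

With a positive gamma, bad news raises next period's variance by more than good news of the same magnitude; setting gamma to zero recovers the symmetric GARCH(1,1) response.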
In Stata, the TGARCH model just means that another option is added to the arch r regression model. The option to
add asymmetry of this sort is tarch(), where the argument tells Stata how many lagged asymmetry
terms to add. This can be less than the number of ARCH terms, q, but not greater.
Here is a TGARCH model for BYD.
arch r, arch(1) garch(1) tarch(1)
predict httgarch, variance
tsline httgarch
Once again, the variance is saved and plotted using a time series plot. The Threshold GARCH
result is:
Iteration 7:   log likelihood = -730.55397

ARCH family regression

Sample: 1 - 500                                    Number of obs   =       500
Distribution: Gaussian                             Wald chi2(.)    =         .
Log likelihood = -730.554                          Prob > chi2     =         .

------------------------------------------------------------------------------
             |                 OPG
           r |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
r            |
       _cons |   .9948399   .0429174    23.18   0.000     .9107234    1.078956
-------------+----------------------------------------------------------------
ARCH         |
        arch |
         L1. |    .754298   .2003852     3.76   0.000     .3615501    1.147046
             |
       tarch |
         L1. |  -.4917071   .2045045    -2.40   0.016    -.8925285   -.0908856
             |
       garch |
         L1. |      .2873   .1154888     2.49   0.013     .0609462    .5136538
             |
       _cons |   .3557296   .0900538     3.95   0.000     .1792274    .5322318
------------------------------------------------------------------------------

and the plotted predicted error variances are:
[Figure: Conditional variance, one-step, plotted against time (0 to 500).]
GARCH-in-mean
A final variation of the ARCH model is called GARCH-in-mean (MGARCH). In this model,
the variance, $h_t$, is added to the regression function.
$$y_t = \beta_0 + \theta h_t + e_t$$
If its parameter, $\theta$, is positive then higher variances will cause the average return E(y) to increase.
This seems reasonable: more risk, higher average reward! To add a GARCH-in-mean to the BYD
example, we simply add another option to the growing list in the arch statement. The command
becomes:
arch r, archm arch(1) garch(1) tarch(1)
In this case, the option archm (which stands for arch in mean) is added to the others, arch(1)
garch(1) and tarch(1). These are retained since these terms are included in the BYD example
from the text. The results are
Iteration 7:   log likelihood = -724.65492

ARCH family regression

Sample: 1 - 500                                    Number of obs   =       500
Distribution: Gaussian                             Wald chi2(1)    =      8.51
Log likelihood = -724.6549                         Prob > chi2     =    0.0035

------------------------------------------------------------------------------
             |                 OPG
           r |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
r            |
       _cons |   .8181453   .0711579    11.50   0.000     .6786783    .9576122
-------------+----------------------------------------------------------------
ARCHM        |
      sigma2 |   .1958843    .067164     2.92   0.004     .0642453    .3275233
-------------+----------------------------------------------------------------
ARCH         |
        arch |
         L1. |   .6160302   .1634603     3.77   0.000     .2956538    .9364066
             |
       tarch |
         L1. |   -.321069   .1621927    -1.98   0.048    -.6389608   -.0031772
             |
       garch |
         L1. |   .2783425   .1039073     2.68   0.007      .074688     .481997
             |
       _cons |   .3705214   .0818646     4.53   0.000     .2100698    .5309731
------------------------------------------------------------------------------

You can see that the coefficient on the GARCH-in-mean term, $\hat\theta = .1959$, is positive and
statistically significant at the 5% level in this instance.
Finally, the predicted mean and variance functions are saved and plotted using time series
plots.
predict m_mgarch, xb
predict htmgarch, variance
qui tsline m_mgarch, name(g5, replace)
qui tsline htmgarch, name(g6, replace)
graph combine g5 g6, cols(1)
In this case, the mean and variance are plotted in the same graph in a single column:
The predictions of the mean and variance follow very similar patterns.
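The similarity is what the mean equation implies. Assuming the plotted mean prediction is the fitted value of the mean equation above, it is an intercept plus a positive multiple of the conditional variance, so the two plots differ only by a shift and a rescaling. The Python sketch below uses the rounded estimates from the output above (0.818 and 0.196); the function name garch_in_mean_xb and the three variance values are hypothetical.

```python
def garch_in_mean_xb(h_t, beta0=0.818, theta=0.196):
    """Predicted mean beta0 + theta * h_t, using rounded estimates from the output above."""
    return beta0 + theta * h_t

for h_t in (1.0, 5.0, 10.0):          # hypothetical conditional variances
    print(h_t, round(garch_in_mean_xb(h_t), 3))
```

Each additional unit of conditional variance raises the predicted return by about 0.196, so peaks in the variance plot appear as peaks in the mean plot.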
[Figure: two stacked time series plots against time (0 to 500): xb prediction, one-step (top panel) and Conditional variance, one-step (bottom panel).]