Wiley, Economics Department of the University of Pennsylvania, Institute of Social and Economic Research --
Osaka University are collaborating with JSTOR to digitize, preserve and extend access to International
Economic Review.
BY A.C. HARVEY1
1. INTRODUCTION
(2.8) $v_t = \varepsilon_t + \theta\varepsilon_{t-1}$,
with $\theta = -1$.
The GLS estimator of $\beta$ in (2.7) is the ML estimator. If this is denoted by $\tilde\beta$,
the residuals are defined by
(2.9) $\tilde v_t = \Delta y_t - (\Delta x_t)'\tilde\beta$, t = 2,..., T.
where $\tilde v = (\tilde v_2, \ldots, \tilde v_T)'$ and $\sigma^2\Omega$ is the covariance matrix of the disturbances in (2.7).
The last term in (2.10) may be evaluated very easily since $|\Omega| = T$. Comparing
(2.10) with the log-likelihood for (2.2) therefore provides a basis for discriminating
between the two models, since (2.2) and (2.7) contain the same number of unknown
parameters. However, in the Appendix it is shown that the application
of GLS to (2.7) is not necessary, since $\tilde v'\Omega^{-1}\tilde v$ is identical to $SSE_0$. This suggests
the adoption of the criterion
(2.11) $\delta^* = \dfrac{SSE_0}{SSE_1}\exp[(T-1)^{-1}\ln T]$,
the levels formulation again being chosen if $\delta^* < 1$.
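The determinant result $|\Omega| = T$ used above can be checked numerically. The following is a minimal sketch, assuming that $\Omega$ is the (T-1) x (T-1) matrix implied by (2.8) with $\theta = -1$, i.e. tridiagonal with 2 on the leading diagonal and -1 on the two adjacent diagonals. The function name is illustrative and does not come from the paper.

```python
def ma1_unit_root_det(n):
    """Determinant of the n x n tridiagonal matrix with 2 on the diagonal and -1 beside it."""
    # Three-term recurrence for tridiagonal determinants:
    # D_k = 2 * D_{k-1} - (-1) * (-1) * D_{k-2}, with D_0 = 1 and D_1 = 2.
    d_prev, d_curr = 1, 2
    for _ in range(n - 1):
        d_prev, d_curr = d_curr, 2 * d_curr - d_prev
    return d_curr


for T in (2, 5, 11, 50):
    # (2.7) contains T - 1 differenced observations, so Omega is (T-1) x (T-1).
    assert ma1_unit_root_det(T - 1) == T
```

Because the determinant depends only on T, the final term of the log-likelihood needs no matrix computation at all.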
The above three criteria will tend to yield similar results if T is at all large.
However, (2.6) is the most appealing in practice. Furthermore this approach
retains its simplicity when extended to situations where higher order differencing
or seasonal differencing is involved. With quarterly observations, for example,
a levels model might be compared with one in fourth differences. If the T-4
fourth difference equations are augmented by four levels equations with quarterly
dummies, it is once again straightforward to justify the use of $s^2$ as a means of
measuring goodness of fit.
3. APPLICATION
in a systematic fashion.
The simple autonomous expenditure model appears from the Friedman-
Meiselman article, and from later work, to give a reasonably satisfactory 'ex-
planation' of the behavior of consumption over the period 1929-1939. It is less
impressive in other periods, however, suggesting that some change in specification
may be appropriate. Restricting attention to the 1929-1939 period, therefore,
the results are as follows:
(3.2) $C_t = 58335.9 + 2.498A_t$,
            (1169.9)   (0.312)
      $R^2 = 0.8771$, $d = 0.89$, $SSE = 11943 \times 10^4$
(3.3) $\Delta C_t = 1.993\,\Delta A_t$
                    (0.324)
      $R^2 = 0.8096$, $d = 1.51$, $SSE = 83872 \times 10^3$.
The figures in parentheses are standard errors.
The levels equation has a higher $R^2$ than the differenced equation. However,
a comparison of $R^2$'s is irrelevant in this context. The residual sum of squares
is considerably smaller in (3.3) and this is reflected in values of 1.40, 1.49 and
1.81 for $\delta$, $\delta^\dagger$ and $\delta^*$ respectively. The evidence in favor of first differences is
further supported by the Durbin-Watson d-statistic being very low in (3.2).
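The reported value of 1.81 for the criterion (2.11) can be reproduced from the two residual sums of squares. This is a minimal sketch, reading (2.11) as the ratio $SSE_0/SSE_1$ multiplied by $\exp[(T-1)^{-1}\ln T]$ and assuming T = 11 annual observations for 1929-1939; the function and variable names are illustrative.

```python
import math


def delta_star(sse_levels, sse_diff, T):
    """Criterion (2.11): SSE ratio scaled by exp[ln(T) / (T - 1)]."""
    return (sse_levels / sse_diff) * math.exp(math.log(T) / (T - 1))


sse0 = 11943e4  # SSE reported for the levels equation (3.2)
sse1 = 83872e3  # SSE reported for the first difference equation (3.3)
T = 11          # assumption: one observation per year, 1929-1939 inclusive

value = delta_star(sse0, sse1, T)
print(round(value, 2))  # 1.81
```

A value above one points to the first difference formulation, in line with the discussion above.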
The irrelevance of $R^2$ in comparing the goodness of fit of levels and first difference
models suggests that some modification may be appropriate. Since $R^2$ is
so open to misinterpretation for regressions involving time series data in levels,
the most useful approach is to define a coefficient of multiple correlation in terms
of first differences. If $R_1^2$ denotes the coefficient of multiple correlation for the
first difference model, the corresponding $R^2$ for the levels model may be defined by
$R_D^2 = 1 - \dfrac{SSE_0}{\sum_{t=2}^{T}(\Delta y_t - \overline{\Delta y})^2} = 1 - \dfrac{SSE_0}{SSE_1}(1 - R_1^2)$,
where $\overline{\Delta y}$ is the arithmetic mean of the $\Delta y_t$'s. Thus $R_D^2 = R_1^2$ if $SSE_0 = SSE_1$ while
in the example of the previous section
5. SERIAL CORRELATION
6=
(5.4) SSE? exp f ln(IV01/1V11)+ 2 (po+qo-pl -qi))
where $SSE_1 = w'V_1^{-1}w$. As before, the levels model is accepted if $\delta$ is less than
one.
Allowing the levels and first difference models to have ARMA disturbances
suggests the possibility of nesting the hypothesis. Taking first differences in the
3 Such a model is of the form $y_t = \alpha + y_{t-1} + \varepsilon_t$, and is termed a random walk with drift.
A series of Monte Carlo experiments were carried out in order to assess the
effectiveness of the criterion described in the previous section.
Observations on $y_t$ were generated by a levels model of the form
(6.1) $y_t = \alpha + \beta x_t + u_t$, t = 1,..., T,
in which $\alpha = \beta = 1$, and the disturbances followed a first-order autoregressive
process,
(6.2) $u_t = \phi u_{t-1} + \varepsilon_t$,
with $\varepsilon_t \sim NID(0, 0.0036)$. Values of $\phi$ equal to 0.5 and 0.9 were used in (6.2).
Observations were also generated for first differences of $y_t$ using the model (6.3),
where $\eta_t \sim NID(0, 0.0036)$ and $\gamma = 1$. A series of observations in levels was then
constructed by arbitrarily setting $y_1 = 1$.
Values of the explanatory variable, xt, were generated artificially, but were kept
fixed for repeated realizations of the disturbances. Both trending and stationary
data sets were employed. In the trending case, values of the explanatory variable
were generated from
(6.4) $x_t = \exp(0.04t) + w_t$, t = 1,..., T,
where $w_t \sim NID(0, 0.0009)$, while for stationary data, $x_t \sim NID(0, 0.0625)$. These
data formed the basis for both the levels and first difference models. These
methods of generating explanatory variables correspond to the methods employed
by Beach and MacKinnon [1978] in their study of the properties of estimators
for regression models with AR(1) disturbances. However, it should be noted
that the actual values taken by $\alpha$, $\beta$, $\gamma$ and the variances of $\varepsilon_t$ and $\eta_t$ have no effect
on the overall pattern of the results.
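The data-generating design described above can be sketched as follows. Several details are assumptions rather than statements from the paper: the AR(1) disturbance is started from its stationary distribution, the first difference model is taken to have the form $\Delta y_t = \gamma \Delta x_t + \eta_t$ (equation (6.3) is not reproduced in the text here), and all function names and the random seed are illustrative.

```python
import math
import random


def generate_x(T, trending, rng):
    """Explanatory variable: trending as in (6.4), or stationary NID(0, 0.0625)."""
    if trending:
        sd_w = math.sqrt(0.0009)
        return [math.exp(0.04 * t) + rng.gauss(0.0, sd_w) for t in range(1, T + 1)]
    return [rng.gauss(0.0, math.sqrt(0.0625)) for _ in range(T)]


def generate_levels(x, phi, rng, alpha=1.0, beta=1.0, sigma2=0.0036):
    """Levels model (6.1) with AR(1) disturbances (6.2)."""
    sigma = math.sqrt(sigma2)
    # Assumption: the first disturbance is drawn from the stationary distribution.
    u = rng.gauss(0.0, sigma / math.sqrt(1.0 - phi ** 2))
    y = []
    for xt in x:
        y.append(alpha + beta * xt + u)
        u = phi * u + rng.gauss(0.0, sigma)
    return y


def generate_first_difference(x, rng, gamma=1.0, sigma2=0.0036, y1=1.0):
    """Assumed form of (6.3): dy_t = gamma * dx_t + eta_t, cumulated from y_1 = 1."""
    sigma = math.sqrt(sigma2)
    y = [y1]
    for t in range(1, len(x)):
        y.append(y[-1] + gamma * (x[t] - x[t - 1]) + rng.gauss(0.0, sigma))
    return y


rng = random.Random(0)
x = generate_x(20, trending=True, rng=rng)  # x is held fixed across replications
y_levels = generate_levels(x, phi=0.5, rng=rng)
y_diff = generate_first_difference(x, rng)
```

Keeping `x` fixed and redrawing only the disturbances mirrors the statement that the explanatory variable was held fixed over repeated realizations.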
Overall, therefore, experiments were carried out for six different models: levels
with AR(1) disturbances and $\phi = 0.5$, levels with AR(1) disturbances and $\phi = 0.9$,
and first differences, for both trending and stationary data. Samples of sizes 20,
50 and 100 were employed for each model. In all cases three estimation procedures
were carried out. These were as follows:
(a) Full maximum likelihood (ML) estimation of $\alpha$, $\beta$ and $\phi$ on the assumption
that the model is of the form (6.1), (6.2); cf. Beach and MacKinnon
[1978].
(b) OLS applied to the first difference model.
(c) Full ML estimation of the first difference model on the assumption that
the disturbances, $w_t$, are generated by an AR(1) process.
Method (a) is appropriate when the true model is in levels with AR(1) disturb-
ances, while (b) is the best method for (6.3). Method (c) yields asymptotically
efficient estimators of the parameters in (6.3), but represents a case of 'overfitting'
since the AR(1) specification is unnecessary.
Procedures (a) and (c) were carried out by concentrating the log-likelihood
function and performing a grid search with respect to the autoregressive parameter.
The search was accurate to two places of decimals, the largest permitted value of
$\phi$ being 0.99.
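The grid search for procedure (a) might look like the sketch below. It assumes the exact Gaussian likelihood for a regression with AR(1) disturbances, concentrated over $\alpha$, $\beta$ and $\sigma^2$, so that only $\phi$ is searched. The lower grid bound of -0.99, the simulated inputs and every function name are assumptions made for illustration; only the step of 0.01 and the upper bound of 0.99 come from the text.

```python
import math
import random


def ols_two_regressors(z1, z2, y):
    """OLS of y on two columns (no separate intercept); returns coefficients and SSE."""
    s11 = sum(a * a for a in z1)
    s12 = sum(a * b for a, b in zip(z1, z2))
    s22 = sum(b * b for b in z2)
    s1y = sum(a * c for a, c in zip(z1, y))
    s2y = sum(b * c for b, c in zip(z2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    sse = sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(z1, z2, y))
    return b1, b2, sse


def concentrated_loglik(phi, x, y):
    """Log-likelihood of (6.1)-(6.2) concentrated over alpha, beta and sigma^2."""
    T = len(y)
    w = math.sqrt(1.0 - phi * phi)
    # Scale the first observation and quasi-difference the remaining ones.
    const = [w] + [1.0 - phi] * (T - 1)
    xs = [w * x[0]] + [x[t] - phi * x[t - 1] for t in range(1, T)]
    ys = [w * y[0]] + [y[t] - phi * y[t - 1] for t in range(1, T)]
    alpha, beta, sse = ols_two_regressors(const, xs, ys)
    loglik = -0.5 * T * math.log(sse / T) + 0.5 * math.log(1.0 - phi * phi)
    return loglik, alpha, beta


def grid_search_ml(x, y):
    """Search phi on a grid of step 0.01, capped at 0.99 as in the text."""
    grid = [k / 100.0 for k in range(-99, 100)]  # assumed lower bound of -0.99
    best = max(grid, key=lambda phi: concentrated_loglik(phi, x, y)[0])
    _, alpha, beta = concentrated_loglik(best, x, y)
    return best, alpha, beta


# Illustrative data: stationary x, alpha = beta = 1, phi = 0.5.
rng = random.Random(1)
x = [rng.gauss(0.0, 0.25) for _ in range(100)]
u, y = 0.0, []
for xt in x:
    u = 0.5 * u + rng.gauss(0.0, 0.06)
    y.append(1.0 + xt + u)

phi_hat, alpha_hat, beta_hat = grid_search_ml(x, y)
```

Concentrating the likelihood reduces the problem to a one-dimensional search, which is why a simple grid is adequate.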
The results of the simulations are set out in Tables 1 and 2. These are based
on two hundred independent replications for each model. Table 1 shows the
estimated probabilities of choosing a first difference formulation, the criteria $\delta_1$
and $\delta_2$ being based on a comparison of (a) and (b), and (a) and (c) respectively.
The main conclusions from Table 1 are as follows:
(i) For all three criteria, the probability of choosing the incorrect model
tends to fall as T increases.
(ii) The estimated probabilities of choosing the first difference formulation
TABLE 1
ESTIMATED PROBABILITIES OF CHOOSING A MODEL IN FIRST DIFFERENCES
TABLE 2
SAMPLING PROPERTIES OF LEVELS AND FIRST DIFFERENCE ESTIMATORS
                         Stationary data                             Trending data
True model      T    MSE (a)  MSE (b)  MSE (c)  Mean $\hat\phi$   MSE (a)  MSE (b)  MSE (c)  Mean $\hat\phi$
Levels with     20    3.413    3.245    3.343     .365            3.894   18.289   13.080     .300
$\phi = 0.5$*   50     .624     .648     .695     .445             .084     .689     .440     .437
               100     .452     .471     .492     .466            .0006    .0068    .0057     .467
Levels with     20    2.638    2.229    2.546     .698           23.041   19.062   28.853     .564
$\phi = 0.9$    50     .643     .636     .670     .829             .752    1.097    1.104     .766
               100     .263     .262     .265     .869             .009     .018     .018     .845
order to answer these questions, various sampling statistics associated with the
procedures are reported in Table 2. The table shows the mean squared error
(MSE) of each of the estimators of $\beta$ taken over all replications. Since the
estimators are unbiased, the mean squared errors may be regarded as estimates
of variance, and hence can be used as a guide to small sample efficiency. The
sample means for the estimates of $\phi$ produced by (a) are also given.
The following points emerge from Table 2:
(vi) When the true model is in levels, and the $x_t$'s are stationary, there is very
little to choose between the three methods of estimating $\beta$, although for
T = 20 the sample MSE of the first difference estimator (b) is actually
smaller than that of the levels estimators for both $\phi = 0.5$ and $\phi = 0.9$.
The relative efficiency of applying OLS to first differences in such cases
may be explained by noting that the disturbance term in the first difference
formulation is an ARMA(1, 1) process,
(6.5) $v_t = \phi v_{t-1} + \varepsilon_t - \varepsilon_{t-1}$.
well for the stationary data sets. However, for trending data, the gain
from using (b) is much more pronounced.
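The ARMA(1, 1) representation in (6.5) follows from differencing the AR(1) process (6.2), and can be confirmed on simulated data. The sketch below assumes that (6.5) reads $v_t = \phi v_{t-1} + \varepsilon_t - \varepsilon_{t-1}$ with $v_t = u_t - u_{t-1}$; the seed, series length and start-up value are arbitrary choices.

```python
import random

rng = random.Random(0)
phi = 0.5
eps = [rng.gauss(0.0, 0.06) for _ in range(200)]

# AR(1) disturbances as in (6.2); the start-up value is arbitrary.
u = [eps[0]]
for t in range(1, len(eps)):
    u.append(phi * u[t - 1] + eps[t])

# v[i] holds the first difference u[i + 1] - u[i].
v = [u[t] - u[t - 1] for t in range(1, len(u))]

# Largest discrepancy between the two sides of the ARMA(1, 1) recursion.
max_gap = max(
    abs((v[i] - phi * v[i - 1]) - (eps[i + 1] - eps[i]))
    for i in range(1, len(v))
)
assert max_gap < 1e-12
```

The moving average part has a unit root, which is why an AR(1) specification for the differenced disturbances can only approximate it.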
The overall conclusion from these results is that $\delta$ provides a relatively effective
criterion for discriminating between levels and first difference models. In cases
where it is difficult to discriminate - e.g., levels with $\phi = 0.9$ - the costs of
making a wrong decision are unlikely to be high. The least satisfactory performance
of the criterion is for trending data and a first difference model. In
small samples the chances of choosing a first difference formulation are not as
high as one might wish given the losses incurred in assuming a levels model to be
appropriate. Particular care should therefore be exercised for trending data
when the decision to choose a levels model is a marginal one. Finally, the results
in Table 2 provide some interesting supplementary information on estimator (c).
(viii) When the true model is (6.3), the performance of (c) shows the effect of
overparameterization. Although estimating the model under the
assumption that the disturbances are generated by an AR(1) process
does not affect asymptotic efficiency, the loss of efficiency in small
samples may be considerable. The case of trending data with T= 20
provides the most extreme example.
(ix) For a levels model with trending data and $\phi = 0.5$ the MSE's for (c) are
considerably higher than those produced by (a). Since a levels model is
equivalent to a first difference model with ARMA(1, 1) disturbances,
this is an indication that the assumption of an AR(1) disturbance may
be totally inadequate in these circumstances. The figures for $\phi = 0$,
which are not shown in the table, illustrate the point even more dramatically.
For T = 20, for example, the MSE of (c) was over six times the
size of the MSE of (a) for trending data.4 These findings may have
important practical implications in applied econometric work, where the
assumption of AR(1) disturbances is widely employed, even in first
difference models.
7. CONCLUSION
This paper gives a formal justification for the use of $s^2$, the unbiased estimator
of the disturbance variance, as the basis for assessing the goodness of fit of regression
models irrespective of whether the dependent variable is in levels or first
differences. A coefficient of multiple correlation in terms of first differences is
defined for a levels regression model. This would seem to be more useful than
the conventional $R^2$ for a time series regression.
When the disturbances in a model have been modeled by an ARMA process an
4 However, as with $\phi = 0.5$ and $\phi = 0.9$, method (c) performed much better for stationary
data. For T = 20 and $\phi = 0$, it gave an MSE only 14% greater than that of (a). Nevertheless it is
the trending case which is most relevant in this context, since it is in such cases that differencing
tends to be carried out automatically.
APPENDIX
Consider the levels model, (2.1), and pre-multiply y, the T x 1 vector of observations
on the dependent variable, and X, the T x (k+1) matrix of independent
variables, by the T x T matrix
$$H = \begin{bmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -1 & 1 \\ 0 & \cdots & \cdots & 0 & 1 \end{bmatrix}.$$
This gives the T-1 equations (2.7), together with (2.1) for t = T, i.e.,
(A.1) $\Delta y_t = (\Delta x_t)'\beta + v_t$, t = 2,..., T,
      $y_T = \alpha + x_T'\beta + \varepsilon_T$.
Note that H is an upper triangular matrix with non-zero elements on the leading
diagonal, and so it is non-singular.
The covariance matrix of the disturbances $v_2, \ldots, v_T, \varepsilon_T$ in (A.1) is $\sigma^2 HH'$.
Since HH' is positive definite, there must be a lower triangular matrix, J, such
that $J'J = (HH')^{-1}$. Pre-multiplying the dependent and independent variables in
(A.1) by J leads to a new set of transformed equations in which the disturbances
have a scalar covariance matrix, i.e., $\sigma^2 I$. Thus OLS applied to this model will
yield exactly the same estimator of $\beta$ and exactly the same residual sum of squares,
$SS_T$, as OLS applied to the original model (2.1). Furthermore, in view of the
lower triangular nature of J, the new transformed system of equations will have
a similar form to (A.1), in that a constant term appears in the last equation only.
An estimator of all the parameters apart from the constant term may be constructed
from the first T-1 observations. When the observations corresponding
to the final equation are added, however, the residual sum of squares remains
unchanged, since the extra parameter must be accommodated; cf. Brown, Durbin
and Evans [1975, p. 153]. Thus $SS_T = SS_{T-1}$.
Suppose now that the observations in the first T-1 equations of (A.1) are pre-multiplied
by $J_1$, the (T-1) x (T-1) sub-matrix of J obtained by deleting the
T-th row and T-th column of J. In view of the lower triangularity of J, this
yields the first T-1 equations of the system obtained by pre-multiplying the full
set of equations in (A.1) by J. The matrix $\Omega$ defined in (2.10) is equal to the
sub-matrix of HH' given by deleting the last row and column in HH'. A little
algebraic manipulation shows that $J_1^{-1}(J_1^{-1})' = \Omega$ and so $J_1'J_1 = \Omega^{-1}$. The residual
sum of squares, $SS_{T-1}$, is therefore equal to $\tilde v'\Omega^{-1}\tilde v$ and by the argument of the
previous paragraph this is also equal to the residual sum of squares in the untransformed
levels model (2.1).
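The equality established in this appendix can be illustrated numerically for a single regressor. The sketch below is an illustration under stated assumptions, not the paper's own computation: the data are simulated, $\Omega$ is taken to be the tridiagonal matrix with 2 on the diagonal and -1 beside it, and the helper names are hypothetical. It compares the OLS residual sum of squares from the levels regression (with a constant) against the GLS residual sum of squares from the differenced regression.

```python
import random


def solve_omega(b):
    """Solve Omega z = b for tridiagonal Omega (2 on the diagonal, -1 beside it)."""
    n = len(b)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -1.0 / 2.0
    d[0] = b[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (b[i] + d[i - 1]) / denom
    z = [0.0] * n
    z[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        z[i] = d[i] - c[i] * z[i + 1]
    return z


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def sse_levels_ols(x, y):
    """Residual sum of squares from OLS of y on a constant and x."""
    T = len(y)
    xbar, ybar = sum(x) / T, sum(y) / T
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    syy = sum((b - ybar) ** 2 for b in y)
    return syy - sxy ** 2 / sxx


def sse_diff_gls(x, y):
    """GLS residual sum of squares for the differenced model with covariance Omega."""
    dx = [x[t] - x[t - 1] for t in range(1, len(x))]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    beta = dot(dx, solve_omega(dy)) / dot(dx, solve_omega(dx))
    resid = [b - beta * a for a, b in zip(dx, dy)]
    return dot(resid, solve_omega(resid))


rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(12)]
y = [1.0 + xt + rng.gauss(0.0, 0.5) for xt in x]

sse0 = sse_levels_ols(x, y)
sse_gls = sse_diff_gls(x, y)
assert abs(sse0 - sse_gls) < 1e-8 * max(1.0, sse0)
```

The two sums of squares agree up to rounding error, so GLS on the differenced equations is unnecessary once the levels regression has been run.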
REFERENCES