Making Good Inferences from Bad Data

Author(s): John G. Cragg


Source: The Canadian Journal of Economics / Revue canadienne d'Economique, Vol. 27, No. 4 (Nov., 1994), pp. 776-800
Published by: Wiley on behalf of the Canadian Economics Association

Stable URL: https://www.jstor.org/stable/136183

Making good inferences from bad data

JOHN G. CRAGG University of British Columbia

Abstract. Errors in variables can seriously distort inference when they are not taken into
account explicitly. Coefficient values, their significance, and whether some explanatory vari-
ables should instead be used as instruments are largely a matter of interpretation unless
further information is available. Higher moments of the observable variables impose restric-
tions that allow testing for identification and specification and estimating the parameters of
the standard errors-in-variables model. The argument is developed partly through examples
illustrating the points.

Making good inferences from bad data. Errors in the measurement of variables can seriously distort the
process of inference when they are not taken into account explicitly. The values of the coefficients, their
significance, and the question of whether some explanatory variables should instead be used as
instruments are a matter of interpretation unless additional information can be obtained. Higher-order
moments of the observed variables impose restrictions that make it possible to test for identification and
specification and also to estimate the parameters of a standard errors-in-variables model. The author
develops his argument in part through examples that illustrate his points.

I. INTRODUCTION

My acquaintance with measurement error goes back farther than my knowledge of
formal economics. Indeed, I almost lost my first full-time job over measurement
error. I was working in the mill of a mine in northern Quebec as a summer job, doing

The 1994 Harold Innis Memorial Lecture given at the Canadian Economics Association Meetings,
University of Calgary, Calgary, Alta, 10 June 1994. Research supported by the Social Sciences
and Humanities Research Council of Canada under grant 94-0737. I have benefited greatly from
the comments and patience of Adolph Buse, Noxy Dastoor, Masao Nakamura, and David Ryan. I
especially thank Alice Nakamura both for her comments and for her gracious invitation to deliver
this lecture.


odd jobs around the place. One of these tasks involved sampling the concentrate
in railway cars to determine its metal content.
Concentrate is a sort of mud, high in metal content, produced at the mill of
a mine. It was sent in railway cars to smelters for further processing to extract
the metals. The sampling consisted of taking a long hollow tube, shoving it down
into the mud until it hit the floor of the railway car, drawing it back up, and then
tapping the resulting core into a bucket. The samples were taken at several points
in the car. Often when the tube was drawn up, the core would fall back into the
hole, or part of it would fall back, before it could be put into the bucket. This did
not bother the person who had been doing the job and who was showing me how
to do the sampling. Nor did he repeat the task until he got a full tube. Instead, at
the end of his sampling, he would gather up several handfuls of mud to fill up the
bucket.
When I took over the task, I realized that this was not the best way to sample. I
assiduously made sure that the bucket was filled only with full-length cores. After
ten days, the foreman took me aside. Management had decided to discontinue
the activity. Recently agreement between the samples taken and the results at the
smelter had broken down. There had never been perfect agreement, but now the
nature of the relationship between the measures was different, and the mill was
claiming different content. Since the samples no longer led to good predictions
using the previous algorithm (though they were actually closer point estimates)
there was no longer any point in collecting them. My job had just disappeared.
Luckily, there were other (rather grungier) tasks available, so I did not get fired.
There can be little doubt that most - possibly all - data we use in economic
research are subject to errors of measurement. In this lecture I want to recall the
rather horrendous effects measurement error can have on the inferences that we
draw. Also, and more importantly, I want to call attention to the considerable
strides that have been made towards arriving at solutions.
One of my colleagues, on hearing the title of this lecture, remarked, 'For once the
Innis Lecture will not be policy oriented.' That is true in one sense, but in another
it is profoundly false. Any empirically oriented economist must be disturbed by the
deep suspicion that most economists seem to hold towards empirical work and the
extent to which empirical research is not genuinely relied on in forming our view
of how the economy works, in confirming or contradicting theoretical accounts,
or in evaluating precisely what should be done about the way the economy is
performing. By contrast, more reliance is placed (quite appropriately) on forecasting
using models built with standard econometric tools but without sustainable claims
to involve structural parameters.
I am not making the common complaint about the primacy of theory. There are
good reasons for this suspicion of empirical findings. We all seem to suspect (is
this partly because we have all done it to a greater or lesser degree?) that empirical
results have been massaged carefully to conform to some interpretation of events.
In the common phrase, we have taken the data down into the basement and beaten
the truth out of them. The resulting evidence is of as little reliability as are most
forced confessions.


While this may be true, surely many (hopefully most) investigators do strive
conscientiously to report objectively what is buried in the data rather than to cook
the results; at least they do so within the confines imposed by common sense and the
need to present an intelligible account of results and the willingness of colleagues
to follow the elaborate process of detection by which empirical regularities are
usually discovered, validated, and interpreted. The source of suspicion of empirical
results should be sought in more difficult areas than the probity of the investigator.
Anyone with extensive experience applying econometric methods has, I sus-
pect, discovered how often reasonable hypotheses do not seem to be supported
by the data; 'strange' results arise that cannot apparently be given a reasonable
explanation. Such findings tend not to reach the public domain because they are
not sufficiently 'interesting.' One forgets that 'strange' results are what should be
expected when the data do not correspond precisely to the theoretical quantities.
Errors-in-variables is only one of a variety of possible sources for entirely proper
scepticism about econometric results. But measurement error is sufficient in and of
itself to cast serious doubt on results if it has not been taken into account explicitly.
I shall argue that much can and should be done about allowing for it.
The lecture proceeds as follows. First (in section II) I consider why measurement
error so often seems to be ignored. Then I shall explain (in section III) why in a
multiple regression context the effects are serious but cannot be assessed usefully,
even qualitatively, without further information. An illustration using a regression
of considerable prima facie policy relevance is then presented (in section IV). Next
I shall outline one way of coping with the problem, namely, using information
in sample moments up to the fourth, in the simple regression model (in section
V). The procedure will be illustrated (in section VI) with a CAPM example. Then I
outline how this approach connects with instrumental variables if they are available
(in section VII). That brings me to the conclusion (section VIII) of the lecture, when
I shall also indulge in some of the preaching which, with this topic, one is tempted
to do on such an occasion.

II. IGNORING THE PROBLEM OF MEASUREMENT ERRORS

That measurement error is not taken more seriously or treated explicitly is one
of the mysteries of methodology. Its importance has been recognized from very
early stages in econometrics, and the number of articles on the subject is legion.1
Measurement error initially played a central role (cf. Morgan 1990), as illustrated
by the work of Frisch (1934). While the seminal work of the Cowles Commission
(Koopmans 1950; Koopmans and Hood 1953) changed the immediate focus to the
simultaneous-equations problem, those who developed that approach regarded it as
a way of getting on with one problem, while the measurement-error problem was
to be treated subsequently (Koopmans and Hood 1953, 117). But while typically a

1 However, the number of these papers that suggest positive steps to be taken to overcome the
problem which are applicable in the situations met by many researchers in the social sciences is
far more limited. A good, fairly recent survey can be found in Bickel and Ritov (1987).


great deal of attention is now given not only in theoretical papers but also in serious
applied work to matters such as sample-selection bias, limited dependent variables,
or heteroscedasticity-consistent estimation of covariance matrices, and while we
routinely conduct highly imaginative investigations of non-nested hypotheses and
estimate highly sophisticated non-linear models, the Achilles heel of measurement
error is ignored. At best, it is apt to be treated in cursory fashion using instrumental
variables without investigation of their appropriateness.
The need to handle measurement error remains pressing. The effects of ignored
errors-in-variables on structural-parameter estimates are qualitatively and possibly
quantitatively as serious as the effects of ignoring simultaneous-equation bias. The
reasons for needing estimates of the structural parameters when there are measure-
ment errors are at least as cogent as those so well elucidated in the classic paper by
Marschak (1953) on the need for structural estimates of the simultaneous-equation
model. Dealing with the measurement-error problem has been short-changed, how-
ever, except in so far as the instrumental-variable approach can be applied. This
goes so far that it is not even widely realized that having only as many instruments
as there are variables measured with error may not lead to happy results2 and
involves merely reinterpretation of an equation that uses the instruments as addi-
tional explanatory variables; by contrast, having redundant instruments is as beneficial
as having overidentified equations.
I suspect that the general neglect of errors-in-variables arises for at least three
reasons - each of which is to some extent mistaken. The first reason is that many
of us have the impression that not much can be done about measurement error. It
is notable that what is still the most extensive and (unfortunately) important inves-
tigation of measurement error in economics, Oskar Morgenstern's (1963) On the
Accuracy of Economic Observations, contains a large list of types of measurement
error and examples of them. By contrast, the book offers little practical advice on
what to do when the data that are available or that might practically be gathered
are thoroughly corrupted by the problems so graphically described. Partly this is
because the book deals with two different problems: (a) measurement error in the
sense of a mistake in measuring an appropriately conceived quantity, and (b) error
in the concept that is being measured, so that the observation does not correspond
to what it is interpreted as measuring.
Lack of correspondence between what is actually measured and the concepts on
which our theoretical analyses are based is a major problem, though I shall not deal
with it here. For example, if you work with financial data, you know that invest-
ment - capital expenditure - figures are determined by elaborate accounting rules
(not necessarily the same for all companies). The notion of investment embodied in
these rules may well diverge from the economist's notion of the increase in capital
which is embodied in some sophisticated model of the behaviour of firms or of
the effects of investment at some aggregate level. When we investigate investment
behaviour or the effects of investment using such data, it is the accountant's defi-
nition, not the economist's definition, that has primacy. If we blame any empirical

2 The standard analysis of two-stage least squares in exactly identified models applies here.


failure on the lack of correspondence between measured and theoretical quantities,
we shall surely be condemning ourselves to making little progress in developing
a truly empirically based economics3 unless, at a minimum, we give content to
the auxiliary hypotheses about data definitions and how they are connected to
theoretical constructs.
Nevertheless, I suspect that we can make a reasonable stab at dealing with the
conceptual issues that are at the heart of the second type of measurement error.4 In
any case, and without downplaying the importance of relating the concepts actually
being measured and any systematic bias in their implementation to the theoretical
concepts on which models are built, I shall concentrate on measurement error in
the more classic sense. That is, I shall treat measurement error as a random error
added to an otherwise appropriate measure.
The second reason that I suspect accounts for economists' generally ignoring
measurement error is the impression obtained from basic econometrics texts about
the effects of measurement error. Typically the analysis is taken up using the
bivariate linear model, where the results are that measurement error biases5 the
slope coefficient towards zero - an effect termed 'attenuation.' It also produces a
bias in the intercept of opposite sign when the average of the explanatory variable
is positive - an effect we may call 'contamination.' Typically the latter effect is
not stressed, so that most students retain only the result that measurement error
produces a bias towards zero. The prime focus of many empirical investigations
is on the sign of effects (and whether they are significant). This analysis offers
the comforting impression that no serious damage has been done if the regression
coefficient is significantly different from zero, except to get a conservative estimate
of magnitude. Though less clearly emphasized in the standard presentation, it is
also the case that the possible scope of attenuation is indicated by r², a high value
of r² precluding a large attenuation effect. Unfortunately, as we shall see in the
next section, these results carry over only to a very limited extent in less simple
situations.
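
For reference, the textbook result just described can be written out explicitly. In the
bivariate model y_i = α + βx_i + ε_i, with z_i = x_i + η_i observed in place of x_i (the
notation of section V below), the OLS estimates satisfy

$$\operatorname{plim}\hat\beta = \beta\,\frac{\sigma_x^2}{\sigma_x^2+\sigma_\eta^2}, \qquad \operatorname{plim}\hat\alpha = \alpha + \beta\mu_x\,\frac{\sigma_\eta^2}{\sigma_x^2+\sigma_\eta^2},$$

so that the slope is attenuated towards zero while the intercept is biased in the
opposite direction whenever the average of the explanatory variable is positive.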
The third reason for ignoring errors-in-variables also springs partly from the
simple exposition typical of textbook treatments of the problem. If all the random
quantities in the model (including the explanatory variables) are normally dis-
tributed, then the parameters of the errors-in-variables model are unidentified
without further information. This is true, but the conditions preventing identifi-
cation are stringent.
Like the belief that the problem is not 'really' serious, the belief that the pa-
rameters are not identified (and so the situation is hopeless) may be quite simply

3 Of course, there are those who do not regard the lack of empirical content of much of economics
as a damning condemnation of its claims to being a science, though it does mean that a rather
different enterprise is being undertaken by economists (e.g., Rosenberg 1992).
4 It is notable that the factor-analysis approach places at the forefront the need to relate the mea-
sured quantities to theoretically useful underlying variables.
5 Strictly speaking, the bias demonstrated is usually the asymptotic bias (and may even be just the
extent of inconsistency, since the relevant expectations may not exist or be obtainable) unless all
random or unobservable quantities are presumed to be normally distributed.


in error. Furthermore, we may be able to test whether the unfortunate situation in
which lack of identification occurs is part of our problem without having first to es-
timate the model. Even if the identification problem were insoluble, that would be no
justification for ignoring the problem of measurement error; its implication is that
the parameters being estimated cannot possibly be structural ones, and therefore
the estimates cannot be given a structural interpretation.
But this is too gloomy a conclusion. Even if we could do nothing explicitly
to correct the problems, knowing what they are and ways to evaluate how serious
they may be should help us not to make unwarranted conclusions. Furthermore,
and positively, I shall be suggesting that in general the notion that nothing can be
done about errors-in-variables is itself in error: there are a great many things one
may do.

III. ALLOWING FOR MORE GENERAL EFFECTS OF MEASUREMENT ERROR

The clear qualitative results about the effects of errors in a bivariate regression carry
over only to a limited extent to wider models. The conclusion about attenuation
of the estimate of the coefficient of the variable still holds when only one variable
in a multiple regression is subject to error. The possible extent of attenuation is
indicated by the partial correlation between the dependent variable and the one
measured with error, holding the others constant, not by R². Furthermore, the bias
of other coefficients is less than when that variable is simply omitted from the
equation (cf. McCallum 1972; Wickens 1972). The effects on the other coefficients
are more difficult to predict (though possible on the basis of information in the
data used6).
Things are not so simple when more than one variable is measured with error,
though some conclusions can be drawn when only two variables have measure-
ment error (cf Garber and Klepper 1980). In multiple regression with errors in
more than one explanatory variable, these conclusions, including especially the
attenuation/correct-sign conclusion, do not hold except as special cases, for ex-
ample, when the (true) explanatory variables are uncorrelated with each other. The
inconsistency of the estimate of the coefficient of any one variable now arises both
from its own error (the attenuation effect) and from the errors in other variables
(the contamination effect). The latter effect depends on the values of a whole set
of other parameters, including the variable's own error.
The overall effect is such that it is not even the case that the inconsistency is
necessarily reduced by having a better measured variable. This is because while the
attenuation effect shrinks when the variance of the measurement error is reduced,

6 The reason is obvious when we remember that we can always treat an individual regression
coefficient as the result of the simple regression of the residuals obtained from regressing the
dependent variable on the other explanatory variables regressed on the residuals from regressing
the explanatory variable in question on the others. Since the measurement error affects only this
last step, the results for simple regression apply to the 'own' coefficient.


the contamination effect is increased.7 Unambiguously, as we might anticipate,
having a more accurately measured variable will lead to a lower (asymptotic) stan-
dard error of estimate of the regression. More surprisingly, it will lead to a lower
asymptotic or population standard error of the coefficient of the variable involved
only if the partial correlation between that variable and the dependent variable is
greater than 0.75 in absolute value. Like our standard way of breaking the effects of
price changes into substitution and income effects, analysis in terms of attenuation
and contamination effects of errors-in-variables may help intuitive understanding,
but it does not provide a vehicle for concrete predictions about effects.
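
The point can be made concrete with a small simulation sketch (for illustration only;
the variable names and parameter values below are hypothetical). Two highly correlated
regressors are generated, one measured very noisily and the other almost exactly; the
OLS coefficient of the badly measured variable is attenuated almost to zero, while that
of the accurately measured one takes the wrong sign because it partly proxies for its
noisy companion.

import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# True regressors x1, x2: unit variances, correlation 0.9.
rho = 0.9
x = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)

# Structural equation: strong positive effect of x1, small negative effect of x2.
beta_true = np.array([1.0, -0.1])
y = x @ beta_true + rng.normal(scale=1.0, size=n)

# Observed regressors: x1 is measured very noisily, x2 almost exactly.
z1 = x[:, 0] + rng.normal(scale=1.5, size=n)
z2 = x[:, 1] + rng.normal(scale=0.1, size=n)

Z = np.column_stack([np.ones(n), z1, z2])
b = np.linalg.lstsq(Z, y, rcond=None)[0]
print("true:", beta_true, "OLS:", b[1:])
# The probability limits here are roughly 0.08 and +0.72: the coefficient of the
# badly measured variable is attenuated almost to zero, while the accurately
# measured variable picks up a sizeable coefficient of the wrong sign because it
# acts partly as a proxy for its noisy, correlated companion.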
The intuitions lying behind understanding the effects of errors-in-variables are
fairly straightforward. We know (thanks to the very useful analysis pioneered by
Hans Theil 1957) that omitting variables that should be included in a structural
regression equation alters the nature of the coefficients being estimated for the
other variables unless the omitted variables are uncorrelated with all the other
variables included in the regression. The coefficients actually being estimated are now
not merely the direct effects that would be present in a correctly specified
equation: each also picks up, for every omitted variable, the coefficient of that included
variable in the (empirical) best linear predictor of the omitted variable, based on the
still included explanatory variables, multiplied by the omitted variable's own effect.
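
In matrix notation (not used elsewhere in this lecture), this standard omitted-variables
result reads: if the correct specification is y = X₁β₁ + X₂β₂ + ε but only X₁ is included,
then

$$\operatorname{plim} b_1 = \beta_1 + \Gamma\beta_2,$$

where b₁ is the OLS coefficient vector from the short regression and Γ collects the
coefficients of the population linear regressions of the omitted variables X₂ on the
included variables X₁.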
What is not always realized is that the effects persist even when we then include
proxies for the omitted variables, again provided that there is correlation between
the omitted and the now included variables. While in a correctly specified equa-
tion such variables would have zero population coefficients, this is not true for
the misspecified equation. Furthermore, the coefficients of other variables remain
inconsistent, the inconsistency possibly improving and possibly getting worse as a
result of using the additional variables.
Relying on intuition to determine even the direction of inconsistency becomes
daunting when more than one variable is measured with error. One can think of a
regression as producing a 'risk'-minimizing portfolio of variables to represent the
underlying influences. The coefficients that we estimate reflect this portfolio selec-
tion as well as the underlying relationship between correctly measured variables.
Not surprisingly, such dual-purpose estimates are not directly subject to transparent
interpretation.
This is not to say that no headway can be made; conjecturing about the size
and nature of errors can be revealing.8 The requirement that the variances of the
measurement errors as well as of the underlying variables all must be positive, com-
bined with some orthogonality conditions, may produce the ability to put bounds
on what is a sensible region in which the coefficients might lie, as pointed out in
the work of Klepper and Leamer (1984) and extended in other papers (cf Klepper
1987). Such analysis provides the multivariate analogue of realizing that in the

7 Since these results, appearing in a never-published discussion paper by Cragg (1977), do not seem
to be readily available in the literature (though they are referred to in various places), they are
summarized in the appendix to this paper.
8 An example in a related context occurs in Cragg, Harberger, and Mieszkowski (1968).


simple regression model the reverse regression provides an upper bound on how
much attenuation one may believe to exist. Furthermore, the bounds may be re-
fined by considering what are reasonable as opposed to feasible limits to place
on the error variances. As with most other approaches to measurement error, this
one is more remarkable for its not being used by empirical workers than for their
embracing it enthusiastically without concern for any difficulties it may present.
One thing that we can safely expect when there are measurement errors is
that coefficients will not have their usually interpreted meaning. Furthermore, the
discovery that variables are statistically significant does not imply that the effect
that they represent is present in the way indicated. It is not even true that there are
as many genuine effects as there are significant coefficients. Indeed, if there are
'good' instruments for variables measured with error, then these variables should
act as significant explanatory variables in the equation. That such variables may
fail to be significant reflects lack of power of the testing procedure rather than
valid null hypotheses. Conversely, several variables can be separately and jointly
significant in a standard multiple regression because other variables with which
they are correlated and that should enter the model are measured with error, not
because they themselves enter the underlying relationship.9
One implication of this is particularly serious. Unless there are overidentifying
restrictions that can be incorporated in the procedures, it is merely a matter of
interpretation, when the right-hand-side variables are correlated with each other,
whether any number of them up to one-half of the total represent separate
explanatory variables or whether they should instead be providing instrumental
variables for the others.
Since measurement errors may be disastrous for the usual purposes to which
our estimates are put,10 it may hardly be surprising that empirical researchers
have tended to ignore the problem, even investigators who 'should' know better
- meaning myself! However, the effects - especially the incorrect inferences -
will certainly not go away by ignoring the problem. Could it be that some of the
contempt that our profession at times elicits from those who would be users of
our insights arises from the same contempt that ostriches meet when engaging in
their proverbial ways of dealing with unpleasant facts? Is it this tendency among
ourselves that elicits scepticism about our contribution rather than the fact that we
practise the dismal science and people tend to denigrate (though not, thank heavens,
to kill) the messenger?

9 Testing whether their coefficients are zero can be considered a test of the hypothesis that there are
errors in variables and so provides an alternative to the Hausman (1978) test. In either case, the
possible instruments must have appropriate covariance with the variables for which they might be
instruments, tests for which are discussed in Cragg and Donald (1993).
10 The effects are far less serious if the intention is forecasting on the basis of mismeasured data, in
which case the usual coefficients (inconsistencies and all) may be just what is wanted.


IV. AN ILLUSTRATION

To illustrate the preceding points, consider the following regression for unemploy-
ment-rate averages in the period 1983-8 across twenty OECD countries; it appears
in an important study of unemployment by Layard, Nickell, and Jackman (1991,
55):

U = 0.24 + 0.92D + 0.17R - 0.13S + 2.45B - 1.42C - 4.28E - 0.35I,     R² = 0.91
    (0.1)  (2.9)   (7.1)   (2.3)   (2.4)   (2.0)   (2.9)   (2.8)

The quantities in parentheses are t-ratios. The variables used are

U: The average unemployment rate
D: Benefit duration (of unemployment insurance)
R: The replacement ratio (the fraction of income that UI benefits represent)
S: Active labour market spending (by government)
B: Coverage of collective bargaining
C: Union coordination
E: Employer coordination
I: Change in inflation.

Besides being authored by distinguished scholars, the study had the benefit of
extensive discussion in conferences by other leading researchers in the field. The
subject is obviously of pressing concern and policy importance.
It is quite clear that the explanatory variables are not regarded as being measured
without error by the authors. They do note, however, the appropriateness of the
various signs in the equation relative to their theoretical considerations. They further
comment: 'the standardized regression coefficients are all about one-tenth of the
t-statistics ... So the t-statistics indicate well the partial contribution of the different
variables to explaining the unemployment differences. These differences are thus
explained in roughly equal measure by the treatment of the unemployed, and by the
bargaining structure.' Note from this that the sizes of the coefficients, especially
of their standard errors, are to be given an essentially structural interpretation -
though there is also a hint that in some sense there are only two major influences
at work.
These are conclusions that any of us would tend to draw when presented with
this equation. So also are ones that arise from examining the coefficients. We
would also be tempted to make policy conclusions from them12 or (more likely)
to infer that the coefficients provide clear empirical support for judgments that we
are inclined to make in any case. Thus we might well conclude that the estimates
support the notion that generous unemployment insurance provisions lead to higher

11 Full details of the definitions and the sources for the variables are given in Layard, Nickell, and
Jackman (1991), which also contains the data.
12 Notably the policy conclusions drawn by Layard, Nickell, and Jackman (1991, 471-501) do not
explicitly rely on the magnitude of the coefficients, though they do correspond to the views of the
workings (and misworkings) of labour markets embodied in the equation.


rates of unemployment. This in turn suggests that the unpleasant levels of unem-
ployment that we have been experiencing can be tackled by having less generous
unemployment insurance provisions. The estimates also indicate that pro-active
labour market policies (involving government spending) can reduce unemploy-
ment. A further conclusion is that policies that facilitate unionization will increase
unemployment, though their effects may be offset if coordination in bargaining is
encouraged. This is not an entirely unfamiliar set of policy opinions!
These conclusions may well be correct. The question here is whether they are
supported by the data to the extent that they appear to be. In fact, they are thrown
into doubt by recognition that errors-in-variables are clearly a feature of the ex-
planatory variables.
The explanatory variables are correlated with each other and each can serve
as a proxy for the set of remaining ones. Bounds on possible estimates based on
recognition that the variance of the measurement errors must be small enough to
leave the adjusted covariance matrices of the observations positive definite (the
multivariate analogues of running reverse regression) are such that any of the con-
clusions just cited can be overturned. There are no obvious immediate alternative
variables to serve as instruments, at least not while the variables reported in this
study are used rather than other variables available in their sources.13 The possi-
bility exists, however, that some of the variables are proxies for a smaller number
of underlying influences. We can explore what would happen if we hypothesized
errors with various variances or explored using some variables as instruments for
others.14
To see how interpretation can be changed, we might first suppose that there is
a measurement error in B, the coverage of collective bargaining, whose variance
equals one-eighth of the total variance of B or 30 per cent of the estimated residual
variance of B given the other variables. With this hypothesis, the structural equation
becomes the one given in the second column of table 1. (The first column repro-
duces the original equation.) While the signs of coefficients remain unchanged,
their relative magnitude is altered. The inflation variable becomes totally insignifi-
cant. While the coefficient of B becomes larger, reflecting the attenuation that is an
immediate consequence of our assumption in this particular exercise that the only
measurement error is in B, its standard error increases sufficiently that the t-ratio
for B becomes smaller.
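
The kind of computation involved can be sketched as follows. Under the standard
errors-in-variables algebra, hypothesizing a known measurement-error variance amounts
to subtracting it from the sample covariance matrix of the regressors before solving the
normal equations. The sketch below assumes exactly that adjustment and uses
hypothetical array names; it is not necessarily the precise procedure behind table 1, and
standard errors for the adjusted estimates would require the appropriate asymptotic
formulas (or a bootstrap).

import numpy as np

def eiv_adjusted_ols(X, y, error_var):
    """OLS corrected for hypothesized measurement-error variances.

    X : (n, k) array of observed regressors (no constant column)
    y : (n,) dependent variable
    error_var : (k,) hypothesized measurement-error variance of each regressor
    """
    Xc = X - X.mean(axis=0)            # deviations from means
    yc = y - y.mean()
    n = len(y)
    S_zz = Xc.T @ Xc / n               # sample covariance matrix of regressors
    S_zy = Xc.T @ yc / n
    S_xx = S_zz - np.diag(error_var)   # subtract hypothesized error variances
    beta = np.linalg.solve(S_xx, S_zy)
    alpha = y.mean() - X.mean(axis=0) @ beta
    return alpha, beta

For version 2 of table 1, error_var would be zero except for the entry corresponding to
B, set to one-eighth of B's sample variance; the hypothesized variances must be small
enough to leave the adjusted matrix positive definite, which is exactly the feasibility
bound discussed above.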
Note particularly what happens to D - the period of time over which UI benefits
may be drawn. The size of its coefficient is almost halved and it could be dropped
from the equation with little loss. If we took this equation seriously, we might well
now doubt whether reducing the benefit duration would have any effect on unem-
ployment. This change comes from assuming that it is the coverage of collective
bargaining that is not correctly measured - hardly a variable one would think was
well proxied by benefit duration!

13 Such variables do exist, but the point of this illustration is not how best to handle the large sets of
information actually available relevant to unemployment on a cross-section basis.
14 It would also be possible in general to explore the multivariate errors-in-variables model, though
the number of observations here prevents doing so without very strong additional assumptions
about normality.


TABLE 1
Variants on a regression; estimates produced with different error assumptions (absolute
values of t-ratios in parentheses)

Version          1          2          3          4          5          6
                            Errors assumed in               Instrumental variables
Variable         Original   B          D, R, S    B, C, E

D                0.92       0.47       1.09       0.54
                 (2.9)      (0.9)      (2.2)      (1.2)
R                0.17       0.16       0.17       0.17       0.37
                 (7.1)      (6.0)      (6.0)      (6.1)      (1.0)
S               -0.13      -0.14      -0.19      -0.11      -0.14
                 (2.3)      (2.1)      (2.1)      (1.6)      (0.6)
B                2.45       4.34       1.99       4.16       4.99       4.28
                 (2.4)      (2.1)      (1.4)      (2.2)      (2.3)      (2.5)
C               -1.42      -1.75      -1.06      -2.08
                 (2.0)      (1.9)      (1.2)      (1.6)
E               -4.28      -4.33      -4.21      -4.54      -9.36      -3.31
                 (2.9)      (6.4)      (5.7)      (5.3)      (2.1)      (2.8)
I               -0.35      -0.28      -0.41      -0.24      -0.12
                 (2.8)      (1.7)      (2.5)      (1.3)      (0.4)
Constant         0.24      -1.92       0.49      -1.32      -8.07       2.78
                 (0.1)      (0.6)      (0.2)      (0.5)      (0.5)      (0.6)

To illustrate further how apparently solid conclusions are weakened by consid-
ering the possibility of errors in the variables,15 suppose that we hypothesize that
each of the three government-program variables has measurement error. Specif-
ically suppose that 40 per cent of the residual variances of D, R, and S, given
in each case all the other variables, arises from measurement error. The resulting
equation appears as version 3 in table 1. The major effect is to make the other
variables, except E, appear to be of much smaller importance and also to be statis-
tically insignificant. Correspondingly, D and S appear to be more important. Before
jumping to substantive conclusions, however, consider that we could, instead, have
assumed that the errors are in the market-structure variables. This corresponds to
version 4 in table 1. Now it is D and S that apparently lose their importance.
One might be tempted to conclude that at least R and E appear to be the
important variables. But consider what happens when we suppose that D and C do
not 'really' belong in the equation, so that they can serve as instruments for R and
E. The result of this instrumental-variable estimation provides version 5 in table 1.
Variable R, though its point estimate has increased, is now apparently insignificant.
This version is observationally equivalent to the first one; that is, differences that

15 Another example showing how recognizing measurement error upsets conclusions occurs in
Dagenais and Dagenais (1994) using a different approach.


appear between them arise from differences in interpretation of the underlying
situation based on exactly the same observational evidence.
Variables B and E apparently provide the explanation as suggested by version
6 of table 1 in which all the other variables are used as instruments for these
two. Now there are, indeed, overidentifying restrictions. They cannot be rejected
at the 0.10 level.16 This equation points by construction to there being only two
'underlying' influences or factors.17
Note how radically the immediate policy conclusions have changed if we con-
sider version 6 to be the appropriate model rather than the first version - and
version 6 is about as well supported by the data as the original one. Apparently
there is no direct role for standard government programs in affecting the level of
long-term unemployment; for both labour-market spending and also the parameters
of the unemployment-insurance system have 'disappeared' from the model.18
There is nothing very special about this equation in the sense that it is uniquely
subject to this sort of analysis. I invite you to do it with your own favourite
regression (or better still an adversary's favourite regression). If the explanatory
variables are correlated, then this sort of interpretation change can be performed.
Does this mean that we are stuck either with accepting the results at face value
according to whichever interpretive paradigm we adopt or else with simply aban-
doning empirical research? Although some a priori interpretation may be inevitable,
the basic answer is 'no.' There are steps that can be taken to alleviate the problem.
I shall not pursue them for this rather complicated equation; instead I shall illus-
trate the basic approach through a simpler - and therefore more easily exposited -
example.

V. COPING WITH ERRORS IN VARIABLES

The obvious course of action would seem to be examination of any further informa-
tion in the data that might bear on the question of the effect of measurement error

16 The test used here is one that remains valid even if the coefficients were not identified, as dis-
cussed in Cragg and Donald (1994b). The hypothesis that the equation is not identified is rejected
at the 0.1 level, using the test of Cragg and Donald (1993).
17 We could have picked other variables to represent these two factors, and if we interpret the effect
as being directly that of the aspect of the economy apparently being measured by the variable, we
shall reach very different policy conclusions. This ambiguity is simply the problem of normalizing
and interpreting latent factors with which other social scientists, especially psychologists, have
laboured. It is often present (without usually being recognized) in instrumental-variable models.
It may be remarked that one of the less revealing of the possible interpretations is obtained by
having the two variables represented by R and E, which had seemed appropriate on earlier spec-
ification, be the ones for which the others are instruments. Both coefficients are insignificant and
R has a negative sign. The reason is partly that the hypothesis that this specification represents a
misspecification (here meaning that the two variables do not span the latent ones) can be rejected
only at very large significance levels.
18 This is not to denigrate the work of Layard, Nickell, and Jackman (1991), which contains a great
wealth of further important empirical research and is remarkable for its meticulous handling of
data and explanation of procedures. Rather it is a comment on the possibility of interpreting
too readily a single piece of evidence without carefully exploring the data and other aspects of
investigation on which it is based.


on the inferences to be drawn. One type of such additional information arises from
examining higher moments. This suggestion has arisen at various times, notably
in Pal (1980) and most recently in an important paper by Dagenais and Dagenais
(1994).
The idea is that we can base inferences not only on the averages and the co-
variances of the data, but also on the third moments, especially those that involve
multiplying the dependent variable by the products of the explanatory variables and
also the squares of the dependent variables multiplied by the explanatory variables.
Similarly we can use the various statistics that arise from multiplying four vari-
ables (including self-multiplications) and averaging the results. With the common
errors-in-variables specification, this produces overidentification, in the sense that
the number of empirical moments being considered exceeds the number of parame-
ters. This has the advantage that one can test whether the parameters are identified,
estimate them, and furthermore test the errors-in-variables specification on which
the estimates will be grounded.
To see what is involved, consider the simplest model with one explanatory
variable, which is measured with error. Suppose that y_i is the dependent variable
and x_i is the (true) explanatory variable for which the only observations, z_i, have
a measurement error, η_i, which is independent19 of both x_i and the disturbance in
the equation for y_i, say ε_i. All random quantities are assumed independent across
observations. Specifically, the model is

$$y_i = \alpha + \beta x_i + \epsilon_i; \qquad z_i = x_i + \eta_i. \tag{1}$$

Assuming that ε_i and η_i are independent of x_i and of each other, we can write
the underlying first four moments as

$$E(x_i) = \mu_x, \qquad E(\epsilon_i) = 0, \qquad E(\eta_i) = 0;$$
$$E\{(x_i-\mu_x)^2\} = \sigma_x^2, \qquad E(\epsilon_i^2) = \sigma_\epsilon^2, \qquad E(\eta_i^2) = \sigma_\eta^2;$$
$$E\{(x_i-\mu_x)^3\} = \tau_x, \qquad E(\epsilon_i^3) = \tau_\epsilon, \qquad E(\eta_i^3) = \tau_\eta;$$
$$E\{(x_i-\mu_x)^4\} = \theta_x, \qquad E(\epsilon_i^4) = \theta_\epsilon, \qquad E(\eta_i^4) = \theta_\eta. \tag{2}$$
Expressing the first two moments of the observable variables in terms of these
parameters, we obtain

$$E(y_i) = \mu_y = \alpha + \beta\mu_x; \qquad E(z_i) = \mu_z = \mu_x;$$
$$E(y_i-\mu_y)^2 = \beta^2\sigma_x^2 + \sigma_\epsilon^2; \qquad E(y_i-\mu_y)(z_i-\mu_z) = \beta\sigma_x^2;$$
$$E(z_i-\mu_z)^2 = \sigma_x^2 + \sigma_\eta^2. \tag{3}$$

19 Independence is a stronger assumption than is required. What is needed in what follows is that all
cross-moments about the mean up to the fourth order be zero. We shall assume the existence of at
least sixth moments of the z_i and y_i to render inference feasible using minimum χ².


These equations give rise to the OLS regression estimates, since if σ_η² = 0, the
other parameters are identified. Regarding expressions (3) as equations relating the
unknown parameters to the moments of the data, there are five equations in six
unknowns, precluding simply using estimates of these moments in these equations
to produce estimates.
If we consider the third cross-moments of the observables, we obtain, using the
independence assumptions,

$$E(y_i-\mu_y)^2(z_i-\mu_z) = \beta^2\tau_x; \qquad E(y_i-\mu_y)(z_i-\mu_z)^2 = \beta\tau_x. \tag{4}$$

Equations (4) add one further parameter and two equations to the system. These
additions allow solution for the parameters provided that τ_x ≠ 0. The critical aspect
is that the moments in (4) must not be zero if we hope to identify and estimate β using
only equations (3) and (4). If the third moment of x_i is not zero, then one could
estimate the parameters, though one would still only be imposing a particular
framework on the analysis of the data. The framework cannot be tested without
further information because there is a one-to-one transformation from the moments
to the parameters.
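
Indeed, writing out the solution that (3) and (4) imply when τ_x ≠ 0: dividing the two
expressions in (4) gives

$$\beta = \frac{E\{(y_i-\mu_y)^2(z_i-\mu_z)\}}{E\{(y_i-\mu_y)(z_i-\mu_z)^2\}},$$

after which σ_x² = E{(y_i − μ_y)(z_i − μ_z)}/β, and σ_η², σ_ε², τ_x, and α follow directly
from (3) and (4).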
To get information that would exhibit restrictions that could be violated and
to avoid disaster in the event that the third moments are zero, we can similarly
examine the fourth cross-moments:

$$E(y_i-\mu_y)^3(z_i-\mu_z) = \beta^3\theta_x + 3\beta\sigma_x^2\sigma_\epsilon^2;$$
$$E(y_i-\mu_y)^2(z_i-\mu_z)^2 = \beta^2\theta_x + \beta^2\sigma_x^2\sigma_\eta^2 + \sigma_x^2\sigma_\epsilon^2 + \sigma_\epsilon^2\sigma_\eta^2;$$
$$E(y_i-\mu_y)(z_i-\mu_z)^3 = \beta\theta_x + 3\beta\sigma_x^2\sigma_\eta^2. \tag{5}$$

These add three equations with one more unknown.
For there to be a solution using equations (3) and (5), it is sufficient that θ_x ≠ 3σ_x⁴,
provided that β ≠ 0. This identifiability condition would be violated, that is, it would
be the case that

$$\theta_x = 3\sigma_x^4 \tag{6}$$

if the x_i's were normally distributed. Whether condition (6) holds can be checked
using estimated moments without requiring prior estimation of the parameters.
Specifically, if θ_x = 3σ_x⁴, then combining equations (3) and (5),

$$E(y_i-\mu_y)^3(z_i-\mu_z) = 3\,E(y_i-\mu_y)^2\,E(y_i-\mu_y)(z_i-\mu_z);$$
$$E(y_i-\mu_y)^2(z_i-\mu_z)^2 = E(y_i-\mu_y)^2\,E(z_i-\mu_z)^2 + 2\,[E(y_i-\mu_y)(z_i-\mu_z)]^2;$$
$$E(y_i-\mu_y)(z_i-\mu_z)^3 = 3\,E(y_i-\mu_y)(z_i-\mu_z)\,E(z_i-\mu_z)^2. \tag{7}$$


The interesting feature of equations (7) is that they do not involve unknown
parameters except for the means of the observable variables. We can test these


(non-linear) restrictions by seeing whether the calculated moments appear to obey


this theoretical constraint. And we can do so without having to estimate the possibly
unidentified model. (If they do obey the constraint and if the moments in (4) appear
to be zero, we would presumably have to go on to some other procedure if we
hoped to extract useful information from the data, and it simply may not be possible
if indeed only normally distributed variables are involved.)
Equations (3), (4), and (5) provide ten equations in eight unknown parame-
ters. Minimum χ² (equivalently named efficient Generalized Method of Moments
or Asymptotic Least Squares) provides a method of estimating the model and a
framework for testing both the identifying conditions that allow estimation and also
the overidentifying restrictions imposed by the specification. We shall employ still
higher moments to obtain the estimated covariance matrix of the sample moments
corresponding to the expectations in equations (3)-(5) needed for minimum χ²
estimation.
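
To make the mechanics concrete, the following sketch sets up the ten moment
conditions in (3)-(5) and fits them by minimizing an equally weighted distance between
sample and theoretical moments (for illustration only; efficient minimum χ² would
instead weight by the inverse of the estimated covariance matrix of the sample
moments, obtained from the still higher moments just mentioned).

import numpy as np
from scipy.optimize import minimize

def sample_moments(y, z):
    """The ten sample counterparts of the expectations in (3)-(5)."""
    yd, zd = y - y.mean(), z - z.mean()
    return np.array([
        y.mean(), z.mean(),                                   # first moments
        np.mean(yd**2), np.mean(yd*zd), np.mean(zd**2),       # (3)
        np.mean(yd**2*zd), np.mean(yd*zd**2),                 # (4)
        np.mean(yd**3*zd), np.mean(yd**2*zd**2), np.mean(yd*zd**3),  # (5)
    ])

def theory_moments(p):
    """Theoretical moments implied by (3)-(5); variances and higher moments of x."""
    alpha, beta, mu_x, s_x, s_e, s_n, tau_x, th_x = p
    return np.array([
        alpha + beta*mu_x, mu_x,
        beta**2*s_x + s_e, beta*s_x, s_x + s_n,
        beta**2*tau_x, beta*tau_x,
        beta**3*th_x + 3*beta*s_x*s_e,
        beta**2*th_x + beta**2*s_x*s_n + s_x*s_e + s_e*s_n,
        beta*th_x + 3*beta*s_x*s_n,
    ])

def fit(y, z):
    m_hat = sample_moments(y, z)
    start = np.array([0.0, 1.0, z.mean(), np.var(z)/2, np.var(y)/2,
                      np.var(z)/2, 0.0, 3*(np.var(z)/2)**2])
    obj = lambda p: np.sum((theory_moments(p) - m_hat)**2)  # equal weighting
    return minimize(obj, start, method="Nelder-Mead",
                    options={"maxiter": 20000, "xatol": 1e-8, "fatol": 1e-8})

Applied to data (y, z) generated according to (1), fit(y, z).x returns estimates of
(α, β, μ_x, σ_x², σ_ε², σ_η², τ_x, θ_x); with an efficient weighting matrix, the minimized
criterion, scaled by the sample size, would provide the overidentification test statistic
used in the next section.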
The upshot of these considerations is that by using the first four moments,
one can consistently determine whether there is enough information to allow one
to adopt an errors-in-variables specification and to test whether that specification
does indeed adequately represent the data.

VI. A SECOND EXAMPLE

To illustrate, we can use the simple CAPM regression (regarded as a one-factor
model) using the 120 observations provided in Berndt (1990) for Delta Airlines.
These data are used to produce the simple regression of the rate of return net of the
'risk-free' rate of interest on the net market rate of return - both rates measured in
per-cent-per-month terms. The question is whether one can regard this as a simple
regression in which the relevant 'explanatory' variable is measured with error.
As we noticed earlier, in order for this model to be investigated by basing
estimation on the first four moments,20 we require that the variables do not appear
to be normally distributed, in particular that the z_i's do not have the moments of
a normally distributed variable. Testing for zero third moments yields21 a p-value
of 0.52, but testing the restriction (7) on fourth moments coming from normality
produces a p-value of 0.001. We can therefore hope that the approach will be
revealing,22 especially since this identifiability test in small samples is not very
powerful.
On fitting the model by minimum χ², the p-value for the hypothesis of the
errors-in-variables restrictions is 0.48. That is, the hypothesis that the restrictions
imposed on the first four moments by the errors-in-variables specification are satisfied
cannot be rejected; indeed, the test-statistic is very close to the median value of the
(asymptotic) distribution under the null hypothesis.

20 In the implementation of minimum x2, sample moments were calculated by dividing the appro-
priate sums by the number of observations rather than by the various denominators needed to
produce unbiased estimates.
21 This may come as a surprise. It is widely believed that financial returns exhibit positive skewness,
but that is not true for this period either for the market return or for the return to the selected
stock.
22 Note that there is some danger of pretesting-bias here, a difficulty we shall ignore.


TABLE 2
Estimates of a textbook regression

                    OLS - no errors              MCS - with errors
Parameter        Estimate   Standard error     Estimate   Standard error

α                   0.12        0.82             -0.15        1.04
β                   0.49        0.12              0.89        0.09
σ_ε²               81.43       10.74             68.00       10.90
σ_x²               47.07        9.56             28.54       10.70
σ_η²                                              17.22        7.37

The associated estimates for the model are given in table 2.
According to these estimates, the measurement error in the 'market' variable has
substantial variance, accounting for a little less than two-fifths of the variance of the
measured explanatory variable. As a result, the suggested bias in the β coefficient
is substantial, its estimate rising from 0.49 to 0.89 through the recognition of the
measurement error.23 Since these estimates (if one were to take them seriously)
would then enter into an appreciation of the nature and valuation of risk in the
financial markets, the ability to correct for the bias is not unimportant!
Estimates based on higher moments have developed a poor reputation in the
folklore of econometrics. This may come partly from cases where they involve es-
timates with no overidentifying restrictions, as for instance in the series of possible
(not generalized) method-of-moments estimators considered by Pal (1980). The un-
derlying problem, however, is that estimates of fourth moments are more unstable
- or rather more precisely have much larger relative sampling variances - than do
the second moments on which we usually rely. It is compounded by the estima-
tors' (including the ones considered here) being non-linear. In the problem as here
stated, however, there is not a readily available alternative, and OLS is apparently
badly biased.
Monte Carlo work, based on the sort of model with the data properties used
here, reveals that the statistical inferences drawn, namely, that the bias is a large
fraction of the OLS estimates and that much more reliable inference can be made than
is achieved by hoping that the measurement error is not serious, are appropriate.
(Refinements on the estimation technique can apparently improve matters further,
but that is another story; cf Cragg 1994.)
Of course, there are precious few cases where a simple regression model can be
taken seriously. The basic idea extends immediately to multiple regression, how-
ever, though with a stronger requirement on having a sufficiently large number of

23 It needs to be recognized that one may reasonably express doubts about the model and its imple-
mentation. Furthermore, we should note that use of different companies for which data are found
in Berndt (1991) gives different results.


observations. Indeed, the usual errors-in-variables specification, in which the errors
in the individual variables are uncorrelated across variables, rapidly adds strong
overidentifying restrictions (which may be tested). In the same vein, using higher
moments directly for variables suspected of being measured with error can be com-
bined with the instrumental-variable approach (if such instruments are available).
This combination of procedures has the advantage of allowing extraction of possible
information in instruments while also retaining the information and specification
advantages of considering the higher moments. It also allows extracting informa-
tion consistently when there are not enough instruments available to identify the
model. Though it has not been worked out as far as I know, I strongly suspect that
the framework can be extended to handle some of the other econometric problems
with which we wrestle, just as has happened in instrumental-variable estimation
(cf., e.g., White 1982).

VII. COMPARISON WITH INSTRUMENTAL VARIABLES

To see how the approach just sketched can be combined with instrumental variables,
again consider the two-variable model in equations (1). But now suppose that there
is an instrumental variable q_i with mean μ_q, which is correlated with x_i but not
with ε_i or with η_i. The relevant cross-moments are given by

$$E(z_i-\mu_z)(q_i-\mu_q) = \sigma_{zq}; \qquad E(y_i-\mu_y)(q_i-\mu_q) = \sigma_{yq} = \beta\sigma_{zq}. \tag{8}$$

When combined with equations (3), these conditions identify the parameters, pro-
vided that24 σ_zq ≠ 0. They provide the basis for instrumental-variables estimation.
Combining equations (8) with equations (3), (4), and (5) leads to an overidentified
system. It also may be estimated by minimum χ², using more information than is
incorporated in the instrumental-variable estimates alone.
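
Written out explicitly, (8) together with (3) gives

$$\beta = \frac{\sigma_{yq}}{\sigma_{zq}}, \qquad \sigma_x^2 = \frac{E\{(y_i-\mu_y)(z_i-\mu_z)\}}{\beta}, \qquad \sigma_\eta^2 = E(z_i-\mu_z)^2 - \sigma_x^2,$$
$$\sigma_\epsilon^2 = E(y_i-\mu_y)^2 - \beta^2\sigma_x^2, \qquad \alpha = \mu_y - \beta\mu_z,$$

and replacing the population moments by sample moments yields the usual
instrumental-variable estimator.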
To illustrate with the simple CAPM model just considered, we can treat the net
rate of return on Pan American, say r_1, as a possible instrument for the net market
rate of return. The earlier results suggest that this quantity is measured with error
by the observed variable, r_M. Interestingly, if we simply regress the net rate of
return for Delta Airlines, r_3, on r_M and r_1, we obtain an equation in which both
regressors appear to be significant using the conventional procedures. The fitted
equation (with t-ratios in parentheses) is

r_3 = 0.32 + 0.33 r_M + 0.22 r_1.
      (0.41)  (2.64)    (3.38)

An observationally equivalent way of treating these data is the instrumental-variable
estimator based on equations (3) and (8). The estimated equation is

r_3 = -0.54 + 1.43 r_M.
       (0.43)  (3.78)

24 Testing for this condition (usually in a more general model) is considered in Cragg and Donald
(1993).


TABLE 3
Instrumental variable estimates

                    IV                           MCS - with σ_zq
Parameter        Estimate   Standard error     Estimate   Standard error

α                  -0.54        1.25             -0.18        1.05
β                   1.43        0.38              0.93        0.08
σ_ε²               58.55       28.68             57.25        8.70
σ_x²               15.96        4.74             24.38        9.36
σ_η²               30.71        6.90             23.44        6.17
σ_zq               34.28       10.96             34.84       10.89

If we also use equations (4) and (5), we again obtain overidentified equations,
with there now being three overidentifying restrictions arising from the specifica-
tion. Applying minimum χ² to these data results in a test-statistic for these three
restrictions with an asymptotic p-value of 0.205. The estimates and standard errors
are given in table 3.
Using higher moments and instrumental variables are far from being the only
ways of using specifications about the nature of the data to allow us to take account of
errors-in-variables. Nor have I outlined all the possibilities for using higher mo-
ments or other information. For example, Dagenais and Dagenais (1994) obtain very
useful results by imposing some additional constraints that come from assuming
that the errors are normally distributed. The classic latent-variables models, specif-
ically the factor-analysis model, also known with a particular normalization as the
multivariate errors-in-variables model, are almost the paradigm case for dealing
with measurement error. Such models play the central role in Fuller's (1987) book,
Measurement Error Models. While these models have typically used normality as-
sumptions for deriving the asymptotic distributions on which inference is based, it
is quite possible (cf Cragg and Donald 1994a) to proceed without this assumption
using a minimum x2 approach. It is also possible to combine these specification
variants with more usual problems, in that case heteroscedasticity of unknown form.
Similarly, when one has multivariate regressions and there are more proxies for the
explanatory variables than are strictly necessary, those two aspects of the data im-
pose rank restraints on the matrix of regression coefficients which can be tested and
used in exploring the errors-in-variables hypothesis (cf Cragg and Donald 1992).
Another approach that should be mentioned here, though in detail it differs
conceptually in important ways from what I have been discussing, is exploiting the
restrictions imposed by economic theory to investigate or allow for measurement
error. Thus Stapleton (1984) pointed out that the symmetry restrictions in demand
systems (which, in the presence of measurement error, do not hold for the probability
limits of the usual estimates even when they are true of the underlying theoretical
parameters) can be used to identify and correct for
measurement error. The same was found by Geraci (1977) for exploiting over-
identifying restrictions in simultaneous-equations models. It is also the essence of
the approach that can be used to deal with the problem in multivariate time-series
models (cf Nowak 1992).
The classic approach supposes that knowledge of some parameters needed for
identification is available. However, as Fuller (1980) shows, it is possible to use
such information if it is only stochastic or imprecise in nature. As Schafer (1986)
pointed out, this approach can be combined with instrumental variables to obtain
an estimator that is superior to either.
All of these findings and approaches are suggestive of what may be accomplished
when measurement error is confronted directly. The appropriate remedy depends
on the model and data used and the knowledge available, and several remedies may
be combined. The point to stress is that solutions to the errors-in-variables problem
appear when they are sought and so errors-in-variables need not - and should not
- be ignored. It is perhaps indicative of the extent to which we have become blind
to the problems of measurement error that a recent, up-to-date, and fairly advanced
textbook on econometrics (Davidson and MacKinnon 1993) mentions none of the
references just cited and has only two pages devoted to measurement error, found
in the context of instrumental variables.

VIII. CONCLUSION

I have tried to stress two things in this lecture. First, measurement error seriously
distorts the interpretation that should be given to standard econometric estimates.
They are inconsistent, and in general they are inconsistent in ways that are very
difficult to unravel directly. Measurement error also means that, when the error is
admitted but one does not have enough information or data to be more precise
about appropriate constrained specifications, quite different specifications may
appear to be appropriate.
These problems alone are sufficient to cast much doubt on many reported find-
ings. I would suggest that they may be part of the reason (along with all the
problems of specification, etc.) why it may actually be perilous (as well as often
being viewed as perilous) to rely on our estimates for policy. While I have
concentrated on linear models, it would be an implausible fortuity for the effects
not to impinge at least as seriously on the non-linear models that are becoming
increasingly fashionable.
This is not to say that we should abandon empirical work and adhere more
explicitly, without any check by data, to some mixture of hunch and prejudice,
guided, one fears, by one or more of those scribblers to whom Keynes alluded,
whose alleged results guide interpretation although their works have ceased to be
examined explicitly.
Rather, I would argue, advancement will come from meeting the problem
head on. The appropriate response to misleading results is to develop methods to
purge the estimates of their dubious aspects, not to rely on using only those that
can be persuaded to meet our prejudices. We learned that lesson in the develop-
ment of the simultaneous-equation model, but the applicability is not limited to
that case.
This brings me to the second important point: there are indeed ways of over-
coming the problems of errors-in-variables. I illustrated a few of these techniques
and would not suggest that they are exhaustive. These methods did rely in part
on being able to impose overidentifying information that would allow for richer
testing of specification and consequently more confidence in estimation.
Finally, as noted in combining instrumental-variable estimates with other tech-
niques, more variables may well be better than fewer. There are often many proxies
available. They can serve as instruments. Further, it is entirely appropriate to test
whether they are suitable as instruments - both in the sense that they do contribute
to the model and also in the sense that they do not contain information about inde-
pendent influences that needs to be incorporated directly in the model. Of course,
some interpretation is needed to relate our observations to our understanding of
the economy and to allow the former to develop the latter, but interpretation that
imposes testable constraints is more likely to be revealing. As the implications of
assuming independent measurement errors demonstrate, comparatively innocuous
restrictions may lead to surprisingly rich implications.
I started by noting that I once almost got fired for attempting to correct (in a very
naive way) for a problem of measurement. Present efforts by researchers may have
less serious effects; indeed, I would hope that taking measurement error seriously,
so that we can overcome the problems it presents, may make our conclusions more
worthy of being taken seriously.

APPENDIX: ERRORS-IN-VARIABLES INCONSISTENCIES

Let

y = Xβ + ε,

where y is a T × 1 vector of observations on the dependent variable, X is a T × K
matrix of explanatory variables for which plim_{T→∞}(X'X/T) = Ξ, while ε is a T × 1
vector of unobserved residuals with E(ε) = 0 and E(εε') = σ²_ε I_T. What is observed
in addition to y is

Z = X + N,   (A1)

where N is a T × K matrix of measurement errors with E(N) = 0,

E[vec(N) vec(N)'] = Σ ⊗ I_T,

and E[vec(N)ε'] = 0. We presume that these variables are such that Z'Z/T is of
full rank a.s., that plim(Z'Z/T) = Ξ + Σ, that plim(Z'y/T) = Ξβ, and that

plim(y'y/T) = σ_yy ≡ β'Ξβ + σ²_ε.


We are concerned with the least-squares estimates

γ̂ = (Z'Z)⁻¹Z'y.

We presume that the random variables are such that

plim γ̂ = γ = plim(Z'Z/T)⁻¹ plim(Z'y/T),

while

√T(γ̂ − γ) →_d N(0, V),

where

V = {plim(y'y/T) − plim(y'Z/T)[plim(Z'Z/T)]⁻¹ plim(Z'y/T)}[plim(Z'Z/T)]⁻¹.

We are interested in relating γ and V to the parameters of the model. The
inconsistency of γ is given by²⁵

γ − β = −[Ξ + Σ]⁻¹Σβ.   (A2)

To examine a typical element of the vector in (A2), say the first, partition γ and
β into 1 and K − 1 elements:

γ = [γ1, γ2']',   β = [β1, β2']',

and partition Ξ and Σ conformably:

Ξ = [ξ11  Ξ12]       Σ = [σ11  Σ12]
    [Ξ21  Ξ22],          [Σ21  Σ22].

Let

v = ξ11 + σ11 − (Ξ12 + Σ12)[Ξ22 + Σ22]⁻¹(Ξ21 + Σ21)   (A3)

and

π = [Ξ22 + Σ22]⁻¹(Ξ21 + Σ21).   (A4)

π is the vector of (population) regression coefficients of regressing that first Z
variable on the other (K − 1) Z variables (the coefficients of the best linear predictor)

25 Frost (1979), in considering a related problem, implicitly assumes that, instead of (A1), Z =
XD + N, so that what is being estimated is δ = D⁻¹β rather than β itself, and the inconsistency
findings produced by N then apply by relating γ to δ rather than to β.


and v is the corresponding asymptotic value of the (calculated) conditional variance
of the first Z given the others.
The first element of (A2) is

(γ1 − β1) = −σ11β1/v + π'Σ21β1/v − Σ12β2/v + π'Σ22β2/v.   (A5)

The first term in (A5) gives the usual attenuation or asymptotic bias towards
zero of the estimates. It is the standard result when only one variable is subject to
error. The two middle terms give the effect of covariance among the measurement
errors. They disappear if the different measurement errors are uncorrelated with
each other, so that Σ12 = 0. This is the case with the common errors-in-variables
assumption that Σ is diagonal. The final term gives the contamination coming from
other errors. It involves 'smearing' γ1 with the effects that other variables have on
y. This last term will be zero only by accident if the other X variables are correlated
with the first X, provided that β2 ≠ 0 and that Σ22 ≠ 0.
In general, neither the sign nor the magnitude of the last term relative to the
first one in (A5) can be deduced without knowledge of the parameters. If σ11 = 0
and only one element of Σ22, say the ith, is not zero, however, the direction of bias
may be inferred if the signs of β2i and πi are known. Since π can be estimated
and the corresponding estimate of γ2i is attenuated, estimation of the direction of
inconsistency is quite feasible. This fact is exploited by Levy (1973). Furthermore,
if σ11 ≠ 0 and only one element of Σ22 is non-zero, various qualitative results were
obtained by Garber and Klepper (1980). In general, both the sign of expression
(A5) and its magnitude are ambiguous.
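A small numerical check (Python, with purely illustrative values of Ξ, Σ, and β) confirms that the first element of (A2) and the decomposition in (A5) agree, and anticipates the point made below: a variable with β1 = 0 can still have a non-zero probability limit.

import numpy as np

# Purely illustrative population moments for K = 3 regressors
Xi = np.array([[1.0, 0.5, 0.3],
               [0.5, 1.0, 0.4],
               [0.3, 0.4, 1.0]])            # plim X'X/T
Sigma = np.diag([0.2, 0.5, 0.0])            # measurement-error covariance (diagonal)
beta = np.array([0.0, 1.0, 1.0])            # the first regressor is irrelevant

# Equation (A2): the full vector of probability limits
gamma = beta - np.linalg.solve(Xi + Sigma, Sigma @ beta)
print("plim of LS coefficients:", gamma.round(3))   # first element is non-zero

# First element again, via the decomposition in (A5)
M = Xi + Sigma
pi = np.linalg.solve(M[1:, 1:], M[1:, 0])
v = M[0, 0] - M[0, 1:] @ pi
s11, S12, S22 = Sigma[0, 0], Sigma[0, 1:], Sigma[1:, 1:]
b1, b2 = beta[0], beta[1:]
first = (-s11*b1 + (pi @ S12)*b1 - S12 @ b2 + pi @ S22 @ b2) / v
print("gamma_1 - beta_1 from (A5):", round(first, 4),
      "  directly from (A2):", round(gamma[0] - beta[0], 4))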
If we make assumptions about the sizes of the elements of Σ, we can calculate
estimates of β based on these assumptions, since Z'Z/T can be taken to estimate
Ξ + Σ. Ξ must be positive semi-definite, and this requirement puts limits on
what might be hypothesized. Supposing that Σ* is the hypothesized value of Σ,
such that Z'Z/T − Σ* is positive definite, corrected estimates can be calculated as

β̃ = (Z'Z/T − Σ*)⁻¹Z'y/T.

The asymptotic covariance matrix of √T(β̃ − β) (when the assumption Σ = Σ*
is correct) is given by

plim[(y − Zβ̃)'(y − Zβ̃)/T] plim[Z'Z/T − Σ*]⁻¹.   (A6)

Such corrected estimates are what is reported in section iv, with standard errors
based on replacing the probability limits with their actual values in (A6).
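A minimal sketch of such corrected estimates (Python, with a made-up Σ* and simulated data, not the data of section iv) is the following; the standard errors use (A6) with sample counterparts in place of the probability limits.

import numpy as np

rng = np.random.default_rng(3)
T = 2000
beta = np.array([1.0, 0.5])                       # hypothetical true coefficients
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.4], [0.4, 1.0]], size=T)
Sigma_star = np.diag([0.3, 0.1])                  # hypothesized error covariance
N = np.column_stack([rng.normal(scale=np.sqrt(0.3), size=T),
                     rng.normal(scale=np.sqrt(0.1), size=T)])
Z = X + N
y = X @ beta + rng.normal(scale=1.0, size=T)

Mzz = Z.T @ Z / T
b_ls = np.linalg.solve(Mzz, Z.T @ y / T)                  # uncorrected least squares
b_corr = np.linalg.solve(Mzz - Sigma_star, Z.T @ y / T)   # corrected estimates

# Standard errors from (A6), with sample counterparts for the probability limits
s2 = np.mean((y - Z @ b_corr) ** 2)
V = s2 * np.linalg.inv(Mzz - Sigma_star) / T
print("LS:", b_ls.round(3), " corrected:", b_corr.round(3),
      " s.e.:", np.sqrt(np.diag(V)).round(3))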
If β1 = 0, then from (A5) γ1 = 0 only if π'Σ22β2 = 0. If these three factors
are all non-zero, it would not generally be the case that γ1 = 0. That is, an irrelevant
variable included in the equation, which is correlated with variables that should enter
the equation and are measured with error, will typically have a non-zero probability
limit. Conversely, adding an irrelevant variable among the other Z variables will
decrease v unless its partial correlation with the first Z, holding the remaining Z
variables constant, is zero. Its inclusion will thus increase the attenuation effect.
Since such an addition will change the values of the elements of π pertaining to
the other variables, however, the effect of including an irrelevant variable on the
contamination effect is unpredictable.
Now consider the effects of having a more accurately measured variable, in the
sense that σ11 is smaller. For simplicity, suppose that Σ21 = 0, as is standard in
errors-in-variables models, so that the middle terms of (A5) vanish. Then

∂(γ1 − β1)/∂σ11 = −(v − σ11)β1/v² − π'Σ22β2/v².   (A7)

The first term in (A7) has the same sign as the inconsistency due to attenuation,
since from (A3) v > σ11. The second term has sign opposite to the inconsistency
due to contamination.
The effects are no more clear cut when we consider altering one of the diagonal
elements of Σ22, say the ith one, indicated by σii, with πi the corresponding element
of π. Then

∂(γ1 − β1)/∂σii = σ11β1πi²/v² − π'Σ22β2πi²/v² + πiβ2i/v − πihi/v,   (A8)

where hi is the ith element of [Ξ22 + Σ22]⁻¹Σ22β2. The first term gives a diminu-
tion of the attenuation and the second gives a diminution of the contamination. The
last two terms also alter the contamination, however, and are ambiguous as to sign
and magnitude without knowledge of the parameters.
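Because the signs in (A8) are easy to mistake, a finite-difference check against the exact inconsistency in (A2) may be useful (Python, illustrative values only):

import numpy as np

# Illustrative values only; Sigma_12 = 0 as assumed in the text
Xi = np.array([[1.0, 0.5, 0.3],
               [0.5, 1.0, 0.4],
               [0.3, 0.4, 1.0]])
beta = np.array([1.0, 0.8, -0.5])
s11 = 0.2
S22 = np.diag([0.5, 0.3])

def gap(s11_, S22_):
    # gamma_1 - beta_1 computed exactly from (A2)
    Sigma = np.zeros((3, 3))
    Sigma[0, 0] = s11_
    Sigma[1:, 1:] = S22_
    g = beta - np.linalg.solve(Xi + Sigma, Sigma @ beta)
    return g[0] - beta[0]

A22 = Xi[1:, 1:] + S22
pi = np.linalg.solve(A22, Xi[1:, 0])
v = Xi[0, 0] + s11 - Xi[0, 1:] @ pi
h = np.linalg.solve(A22, S22 @ beta[1:])

i = 0                                  # perturb the first diagonal element of Sigma_22
analytic = (s11*beta[0]*pi[i]**2 / v**2
            - (pi @ S22 @ beta[1:]) * pi[i]**2 / v**2
            + pi[i]*beta[1:][i] / v
            - pi[i]*h[i] / v)          # the four terms of (A8)

eps = 1e-6
S22_hi = S22.copy(); S22_hi[i, i] += eps
S22_lo = S22.copy(); S22_lo[i, i] -= eps
numeric = (gap(s11, S22_hi) - gap(s11, S22_lo)) / (2 * eps)
print("analytic (A8):", round(analytic, 6), "  finite difference:", round(numeric, 6))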
About the only unambiguous qualitative result coming from altering the magni-
tude of one of the measurement errors is that decreasing the error of measurement
decreases the standard error of estimate of the regression, in the sense that

σ_yy|z = σ_yy − β'Ξ[Ξ + Σ]⁻¹Ξβ = plim[(y − Zγ̂)'(y − Zγ̂)/T]

is lessened by using better variables, since

∂σ_yy|z/∂σ11 = γ1² > 0.   (A9)


Similarly, one may make a definite (though more complicated) statement about the
effect of improving the measurement of one variable on the asymptotic variance of
the estimate of its own coefficient, V11. This is simply σ_yy|z/v, and

∂(σ_yy|z/v)/∂σ11 = γ1²/v − σ_yy|z/v².   (A10)

The sign of expression (A10) is the same as that of

γ1²v/(σ_yy|z + γ1²v) − 1/2.   (A11)


The first term in (A11) is the square of the probability limit of the partial correla-
tion coefficient between y and the first Z variable, holding the other Zs constant.
A similar, though less easily interpreted, expression holds for increases in other
measurement errors.
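The following sketch (Python, again with illustrative values rather than anything from the paper) checks (A9) by finite differences and verifies the partial-correlation reading of the first term of (A11):

import numpy as np

# Illustrative values only
Xi = np.array([[1.0, 0.5, 0.3],
               [0.5, 1.0, 0.4],
               [0.3, 0.4, 1.0]])
Sigma = np.diag([0.2, 0.5, 0.1])
beta = np.array([1.0, 0.8, -0.5])
s_eps2 = 1.0
M = Xi + Sigma                                    # plim Z'Z/T
c = Xi @ beta                                     # plim Z'y/T
syy = beta @ Xi @ beta + s_eps2                   # plim y'y/T

def syy_z(Sig):
    # sigma_yy|z = sigma_yy - beta' Xi (Xi + Sigma)^{-1} Xi beta
    return syy - c @ np.linalg.solve(Xi + Sig, c)

gamma = np.linalg.solve(M, c)                     # plim of the LS coefficients
pi = np.linalg.solve(M[1:, 1:], M[1:, 0])
v = M[0, 0] - M[0, 1:] @ pi                       # conditional variance of z1 given Z2

# (A9): d sigma_yy|z / d sigma_11 = gamma_1^2, checked by central differences
eps = 1e-6
Shi = Sigma.copy(); Shi[0, 0] += eps
Slo = Sigma.copy(); Slo[0, 0] -= eps
print("gamma_1^2:", round(gamma[0]**2, 6),
      "  finite difference:", round((syy_z(Shi) - syy_z(Slo)) / (2*eps), 6))

# (A11): its first term equals the squared partial correlation of y and z1 given Z2
first_term = gamma[0]**2 * v / (syy_z(Sigma) + gamma[0]**2 * v)
cov_y_z1 = c[0] - M[0, 1:] @ np.linalg.solve(M[1:, 1:], c[1:])   # partial covariance
var_y_2 = syy - c[1:] @ np.linalg.solve(M[1:, 1:], c[1:])        # var(y | Z2)
print("first term of (A11):", round(first_term, 6),
      "  squared partial correlation:", round(cov_y_z1**2 / (v * var_y_2), 6))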

REFERENCES

Berndt, Ernst R. (1990) The Practice of Econometrics (Reading, MA: Addison-Wesley)
Bickel, P.J., and Y. Ritov (1987) 'Efficient estimation in the errors in variables model.'
Annals of Statistics 15, 513-40
Cragg, John G. (1977) 'On using proxy variables.' Discussion Paper 77-14, Department of
Economics, University of British Columbia
(1993) 'The asymptotic efficiency of summary generalized least squares estimators.'
Discussion Paper 93-17, Department of Economics, University of British Columbia
(1994) 'Using higher moments to estimate the simple errors-in-variables model.'
Mimeo, Department of Economics, University of British Columbia
Cragg, John G., and Stephen G. Donald (1992) 'Testing and determining arbitrage pricing
structure from regressions on macro variables.' Discussion Paper 92-14, Department of
Economics, University of British Columbia
(1993) 'Testing identifiability and specification in instrumental variable models.'
Econometric Theory 9, 222-40
(1994a) 'Factor analysis under more general conditions with reference to heteroskedas-
ticity of unknown form.' Forthcoming in Statistical Methods in Econometrics and
Quantitative Economics, ed. G.S. Maddala, P.C.B. Phillips, and T.N. Srinivasan
(Oxford: Blackwell)
-(1994b) 'Testing overidentifying restrictions in unidentified models.' Mimeo, Depart-
ment of Economics, University of British Columbia
Cragg, John G., Arnold C. Harberger, and Peter Mieszkowski (1967) 'Empirical evidence
on the incidence of the corporation income tax.' Journal of Political Economy 75,
811-21
Dagenais, Marcel G. (1992) 'Parameter estimation in regression models with errors in the
variables and autocorrelated disturbances.' Mimeo, University of Montreal
Dagenais, Marcel G., and Denyse L. Dagenais (1994) 'GMM estimators for linear regres-
sion models with errors in the variables.' Cahier 0594, CRDE Universite de Montreal
Davidson, Russell, and James G. MacKinnon (1993) Estimation and Inference in Econo-
metrics (New York and Oxford: Oxford University Press)
Frisch, R. (1934) Statistical Confluence Analysis by Means of Complete Regression Sys-
tems (Oslo: University Institute of Economics)
Frost, Peter A. (1979) 'Proxy variables and specification bias.' Review of Economics and
Statistics 61, 323-5
Fuller, Wayne A. (1980) 'Properties of some estimators for the errors in variables model.'
Annals of Statistics 8, 407-22
-(1987) Measurement Error Models (New York: Wiley)
Garber, Steven, and Steven Klepper (1980) 'Extending the classical normal errors-in-
variables model.' Econometrica 48, 1541-5
Hausman, J.A. (1978) 'Specification tests in econometrics.' Econometrica 46, 1251-71
Hood, William C., and T.C. Koopmans, eds (1953) Studies in Econometric Method,
Cowles Commission Monograph 14 (New York: Wiley)
Koopmans, T.C., ed. (1950) Statistical Inference in Dynamic Economic Models, Cowles
Commission Monograph 10 (New York: Wiley)
Layard, Richard, Stephen Nickell, and Richard Jackman (1991) Unemployment: Macroeco-
nomic Performance and the Labour Market (Oxford: Oxford University Press)


Levy, M.D. (1973) 'Errors in variables in the presence of correctly measured variables.'
Econometrica 41, 985-6
Marschak, Jacob (1953) 'Economic measurements for policy and prediction.' In Hood and
Koopmans (1953)
McCallum, B.T. (1972) 'Relative asymptotic bias from errors of omission and measure-
ment.' Econometrica 40, 757-8
Morgan, Mary S. (1990) The History of Econometric Ideas (Cambridge: Cambridge
University Press)
Morgenstern, Oskar (1963) On the Accuracy of Economic Observations (Princeton, NJ:
Princeton University Press)
Nowak, Eugen (1992) 'Identifiability in multivariate dynamic linear error-in-variables
models.' Journal of the American Statistical Association 87, 714-23
Pal, Manoranjan (1980) 'Consistent moment estimators of regression coefficients in the
presence of errors in variables.' Journal of Econometrics 14, 349-64
Rosenberg, Alexander (1992) Economics: Mathematical Politics or Science of Dimin-
ishing Returns? (Chicago: University of Chicago Press)
Schafer, Daniel W. (1986) 'Combining information on measurement error in the errors-in-
variables model.' Journal of the American Statistical Association 81, 181-5
Stapleton, David C. (1984) 'Errors-in-variables in demand systems.' Journal of Economet-
rics 26, 255-70
Theil, H. (1957) 'Specification errors and the estimation of economic relationships.'
Review of the International Statistical Institute 25, 41-51
Wickens, Michael R. (1972) 'A note on the use of proxy variables.' Econometrica 40,
759-61
White, Halbert (1982) 'Instrumental variable regressions with independent observations.'
Econometrica 50, 483-99
