
Some Remarks about Relations between Stochastic Variables: A Discussion Document

Author(s): R. C. Geary
Source: Revue de l'Institut International de Statistique / Review of the International Statistical
Institute, Vol. 31, No. 2 (1963), pp. 163-181
Published by: International Statistical Institute (ISI)
Stable URL: http://www.jstor.org/stable/1401371
Accessed: 01-06-2015 14:21 UTC

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://www.jstor.org/page/
info/about/policies/terms.jsp
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content
in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship.
For more information about JSTOR, please contact support@jstor.org.

International Statistical Institute (ISI) is collaborating with JSTOR to digitize, preserve and extend access to Revue de
l'Institut International de Statistique / Review of the International Statistical Institute.

http://www.jstor.org

This content downloaded from 147.188.128.74 on Mon, 01 Jun 2015 14:21:48 UTC
All use subject to JSTOR Terms and Conditions

REVIEW OF THE INTERNATIONAL STATISTICAL INSTITUTE

Volume 31: 2, 1963

SOME REMARKS ABOUT RELATIONS BETWEEN STOCHASTIC VARIABLES: A DISCUSSION DOCUMENT*

by
R. C. Geary
The Economic Research Institute, Dublin

It is the contention of the writer that the fundamental problem of the meaning of stochastic relationship, in the economic context and in general, remains unsettled. It is true that much of the work in this field is excellent, but real progress has been confined to mathematics - and the essence of mathematics is the certainty of conclusions from stated hypotheses. In the problem of stochastic relationship it is the formulation of the hypotheses that is the trouble. It is not surprising that authors - the writer is one - tend to return to the topic at intervals of years to shake its uneasy bones. We may, or may not, politely mention one another in our lists of references but there is little evidence in our individual writings that we have deeply studied the others' thinking; and the present paper is no exception to this sorry rule. More like poets than scientists, each of us seems to want to work this one out for himself; the struggle is in one's own soul.

The mathematics in what follows are very simple, deliberately so, to highlight the characteristics of the hypotheses, in particular, assumptions as to the stochastic residual error. Also deliberately, the writer's expression of views will be forthright, to inspire or to provoke debate. It was an Irish statesman of other days who said that he exaggerated in speech to attain moderation in ends. Perhaps it is high time workers in this field get together.

I. WHAT IS REGRESSION?

In the writer's opinion regression is essentially a cause-effect relationship, the independent variables being the causes and the dependent variable the effect. With

$$Y = \alpha + \beta X + u$$

you are saying that, given the numerical values of $\alpha$ and $\beta$, $Y$ is found by substituting a given value for $X$, calculating $\alpha + \beta X$ and adding a random variable $u$. Simple regression theory is concerned with the estimation of $\alpha$ and $\beta$ from a series of pairs of observations $(X, Y)$ and discussing the confidence limits of these estimates, as well as estimating the variance of the random element $u$ - the latter being of major importance for estimating the confidence limits of the estimate of the average value of $Y$, given $X$. Viewed as cause-effect, it becomes clear why there are in general two regression lines in the two-variable case: for one line $X$ is, by hypothesis, the cause of $Y$, for the other $Y$ is the cause of $X$ and there is no reason why these should coincide, even when the number of pairs of observations (the data) is infinite.
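
The non-coincidence of the two regression lines can be checked numerically. The following is only a sketch on simulated data: the values $\alpha = 1$, $\beta = 2$, the unit error variance, the seed and the helper name `slope` are all assumptions made for the illustration and do not come from the paper.

```python
import random


def slope(xs, ys):
    """Least-squares slope of ys on xs, both measured from their means."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)   # arbitrary fixed seed (assumption)
T = 20000                # number of pairs of observations (illustrative)
X = [rng.gauss(0.0, 1.0) for _ in range(T)]
Y = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in X]   # alpha = 1, beta = 2 (assumed)

b_yx = slope(X, Y)        # X treated as the cause of Y: slope dY/dX
b_xy = slope(Y, X)        # Y treated as the cause of X: slope dX/dY
implied = 1.0 / b_xy      # the second line, drawn in the same (X, Y) plane

print(b_yx, implied)      # two different slopes for the same scatter
```

Under these assumptions the first slope settles near 2 and the second line has slope near 2.5 in the same plane; increasing `T` sharpens both values but does not bring them together.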
Still confining oneself to the two-variable case, the basic problem confronting the statistician is, given a scatter diagram in $(X, Y)$, to find the law, if any, governing the relationships, having regard to probability, or stochastic, theory. We have already mentioned two such relationships, the two regression straight lines: there could of course be curvilinear relationships, regressional (i.e. cause-effect) in character, to which random sampling theory can be made to apply. We have tests for determining whether there is any relationship and, if there is a relationship, of what kind it can plausibly be regarded. Attention will be confined to the linear case.

* Paper presented at the Joint European Conference of the Econometric Society, the Institute of Management Sciences and the Institute of Mathematical Statistics, Dublin 3-7 September 1962.

From the earliest statistical times, however, statisticians have recognised that there were conceivably relationships, other than regressional, between random variables. They said (more or less) let us abandon the notion of any special role (e.g. a particular variable regarded as a cause or an effect) for each variable: be quite neutral as to the role of the variable, treat them all as equals, and see what happens. Call the resulting relationship "functional", "associative", "neutral", or what you will. I shall, in what follows, use the term associative. What is the law governing the joint movement of pairs of observations? The question posed in this way indicates that the associative viewpoint predominates in the field of experimental science where so often we can believe in the existence of a law, if only we could find it, our difficulty being due solely to errors of the ordinary kind in our observations.

An early favourite as an associative law was the line (or plane) of closest fit, i.e. the straight line which minimises the sum of squares of distances from the point observations. The trouble here is that, in general, the procedure cannot be justified stochastically, though it is perfectly sensible on practical grounds. A stochastic theory has been developed on the following lines ([1], [2]). In the simplest case of two variables let the model be

$$y_t = \beta x_t, \qquad X_t = x_t + u_t, \qquad Y_t = y_t + v_t, \qquad t = 1, 2, \ldots, T, \tag{1.1}$$

where the $(X_t, Y_t)$ are the observations subject to errors of observation $(u_t, v_t)$ about which nothing else is assumed except that they are independent of one another and of $(x_t, y_t)$, the "true" measures, and that all their moments exist. The problem is to estimate the coefficient $\beta$ from $T$ sets of observations. All variables are assumed measured from their means. The problem as formulated cannot be solved using the variances of $X_t$ and $Y_t$ and/or the covariance $(X_t, Y_t)$ since (when $T$ is indefinitely large) the use of these moments supplies only three equations to obtain four unknowns $\beta$, $Ex^2$ ($E$ = expectation), $Eu^2$, $Ev^2$. Instead, recourse must be had to higher moments, or the mathematically equivalent two-dimensional cumulants ($L$ for $(X, Y)$, $\lambda$ for $(x, y)$) defined by the identity in $(s, t)$:

$$E \exp(sX + tY) = \exp\Big\{ \sum_{i,j} L(i,j)\, s^i t^j / i!\, j! \Big\}. \tag{1.2}$$
But, from the independence assumptions about the error variables $u$, $v$ and using (1.1)

$$E \exp(sX + tY) = E \exp(sx + ty) \cdot E \exp su \cdot E \exp tv \tag{1.3}$$

whence the fundamental relation

$$L(i,j) = \lambda(i,j) \tag{1.4}$$

when both $i$ and $j$ are non-zero positive integers. But, from (1.1),

$$E \exp(sx + ty) = E \exp x(s + t\beta) = \exp \sum_k \lambda_k (s + t\beta)^k / k! = \exp \sum_{i,j} \lambda(i,j)\, s^i t^j / i!\, j!$$

where $\lambda_k$ is the $k$th cumulant of $x$. Equating coefficients of $s^i t^j$, with $k = i + j$,

$$\lambda_k \beta^j = \lambda(i,j).$$

Hence

$$\lambda(i, j+1) = \lambda_{k+1} \beta^{j+1} = \beta\, \lambda(i+1, j).$$

It follows that

$$\lambda(i, j+1) - \beta\, \lambda(i+1, j) = 0 \tag{1.5}$$

or, using (1.4) when $i, j \geq 1$,

$$L(i, j+1) - \beta\, L(i+1, j) = 0. \tag{1.6}$$
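
Relation (1.6) with $i = j = 1$ gives $\beta = L(1,2)/L(2,1)$, which can be tried on simulated data. The sketch below rests on assumptions that are not in the paper: an exponentially distributed "true" variable (so that its third cumulant is non-zero), normal errors with standard deviation 0.5, $\beta = 2$, and the helper names `centred` and `joint_cumulant3`. At order three a joint cumulant equals the corresponding central product moment, so the sample mean of $X^i Y^j$ is used directly.

```python
import random


def centred(values):
    """Return the values measured from their sample mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def joint_cumulant3(xs, ys, i, j):
    """Sample L(i, j) for i + j = 3 from centred data; at order three the
    joint cumulant coincides with the central product moment."""
    assert i + j == 3
    return sum((x ** i) * (y ** j) for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(1)   # arbitrary fixed seed (assumption)
T = 200000               # large T, since third-order statistics are noisy
beta = 2.0               # assumed true coefficient
x_true = [rng.expovariate(1.0) for _ in range(T)]          # skewed, non-normal
X = centred([x + rng.gauss(0.0, 0.5) for x in x_true])     # X = x + u
Y = centred([beta * x + rng.gauss(0.0, 0.5) for x in x_true])  # Y = beta x + v

# (1.6) with i = j = 1: L(1,2) - beta L(2,1) = 0
beta_cumulant = joint_cumulant3(X, Y, 1, 2) / joint_cumulant3(X, Y, 2, 1)

# Ordinary regression of Y on X, for comparison; biased by the error in X
beta_regression = sum(a * b for a, b in zip(X, Y)) / sum(a * a for a in X)

print(beta_cumulant, beta_regression)
```

With these assumed variances the regression slope tends to $2/(1 + 0.25) = 1.6$, while the cumulant ratio tends to 2. The price is the large sample needed to steady the third-order terms.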

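This theory can readily be extended to any number $k$ of variables when the model is

$$\sum_{i=1}^{k} \beta_i x_i = 0, \qquad X_i = x_i + u_i, \qquad i = 1, 2, \ldots, k, \tag{1.7}$$

omitting the cursive subscript $t$ ($t = 1, 2, \ldots, T$). The equations for finding the coefficients $\beta_i$ are then

$$\sum_{i=1}^{k} \beta_i\, L(c_1, c_2, \ldots, c_{i-1}, c_i + 1, c_{i+1}, \ldots, c_k) = 0 \tag{1.8}$$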

where the integers $c_i \geq 1$. There are, in general, an infinity of relations (1.8) which constitute (when $T$ is indefinitely large) the necessary and sufficient conditions for the acceptability of the model (1.7). It can easily be shown that, when the number of sets $T$ of observations is finite, consistent estimates of the $L$ functions can be found from (1.2) (and analogously in the general case of $k$ variables) by substituting the operation

$$\frac{1}{T} \sum_{t=1}^{T}$$

for $E$. There is an asymptotic random sampling theory available for the theory outlined above [2]. It suffers from the disadvantage that it is computationally difficult using a desk machine except when the number of variables is two or three, or perhaps in the "Reiersol case" of instrumental variables - see (vii) below. Also, since with this theory we must have (in general) recourse to cumulants of power greater than two, the error variances tend to become large. That is why one must have more than a sneaking regard for empirical devices like the straight line (or plane) of closest fit, which involves only the variances and covariances.
Following are some remarks on associative relationship:

(i) The theory is not applicable when the observations $(X_1, X_2, \ldots, X_k)$ are jointly normally distributed, for then all the cumulants of more than one dimension and of power greater than 2 are zero so that the equation system (1.8) reduces to the trivial $0 = 0$.

(ii) The theory has been very little applied. The writer himself used it to estimate formally the coefficient involved in the Irishman Boyle's (17th Century) Law using Boyle's original 25 pairs of observations. For constant temperature the Law is

$$\log P + \beta \log V = \text{Constant}$$

The estimate $\hat{\beta}$ of $\beta$ is

$$\hat{\beta} = L(3,1) / L(2,2) = 1.00404$$

which scarcely requires a significance test to establish insignificant difference from unity. In a lecture in Paris, the writer remarked:

"Remarquons incidemment, que la loi de Boyle s'appelle loi de Mariotte en France, avec la même logique qui fait que la loi normale, découverte dans des conditions différentes par de Moivre et Laplace, s'appelle quelquefois loi de Gauss. Sans doute, tous les pays reçoivent éventuellement justice en moyenne". [Let us note in passing that Boyle's law is called Mariotte's law in France, by the same logic by which the normal law, discovered under different conditions by de Moivre and Laplace, is sometimes called Gauss's law. No doubt all countries eventually receive justice on average.]
(iii) There is a non-linear two-variable theory also available though here nuisance parameters intervene, which under certain additional hypotheses can be estimated from the data.
(iv) Linear associative theory can be regarded as a generalization of regression theory throwing some light on the latter. In the usual notation the model is

$$Y = \sum_{i=1}^{k} \beta_i X_i + u,$$

all variables measured from means. The standard equations for estimating the $\beta_i$ by $b_i$ are

$$\frac{b_1}{T} \sum X_1 X_i + \frac{b_2}{T} \sum X_2 X_i + \ldots + \frac{b_k}{T} \sum X_k X_i = \frac{1}{T} \sum Y X_i, \qquad i = 1, 2, \ldots, k.$$

Now, from the viewpoint of earlier theory, there are $k + 1$ variables and the covariances involved are equal to the corresponding cumulants so that, for associative theory the covariance coefficients are estimable since

$$E\, Y X_i = E\, y x_i, \qquad E\, X_i X_j = E\, x_i x_j, \quad i \neq j.$$

But

$$E\, X_i^2 = E\, x_i^2 + E\, u_i^2$$

in which there intervene the nuisance parameters $E\, u_i^2$. The regression equations therefore become associative only when $E\, u_i^2 = 0$, i.e. $u_i = 0$, $i = 1, 2, \ldots, k$. Hence by a circuitous route we come to the basic assumption of regression theory, namely that it yields associative values of the coefficients only when the independent variables are observed without error, the single error variable in the model pertaining to the dependent variable $Y$.
(v) The R. A. Fisher stochastic model envisages the regression data as a sample, or realisation, from a universe in which the independent variables are the same for all samples. J. Berkson [3] has, however, isolated a linear two-variable case in which regression theory yields the correct associative estimate though both variables are subject to error. In the Berkson case when we think our measure of the independent variable is $X$ it is really $x$ where

$$x = X + u,$$

$u$ being the random error assumed uncorrelated with $X$. The contrast with associative theory will be noted: here the observation

$$X = x + u$$

and $u$ is uncorrelated with $x$ though it is, of course, in general correlated with $X$, compared with $\sigma_u = 0$ in the Fisher case. In the Berkson case the regression of $Y$ on $X$
yields a consistent estimate of the coefficient. The Fisher significance theory applies, though the price paid for the imprecision in measurement of the independent variable is that the error variance $V$ is now

$$V = \sigma_v^2 + \beta^2 \sigma_u^2$$

where $\sigma_v^2$ and $\sigma_u^2$ are respectively the error variances of $Y$ (residual) and $X$. A non-linear significance theory for the Berkson case is available [4] but there the nuisance parameter $\sigma_u$ intervenes which can, however, be estimated from the observations.
(vi) When the number of sets of observations is indefinitely large in the linear associative case of two variables it is easy to show that the associative line must lie between the two regression lines. This need not necessarily be the case when the number of pairs of observations is limited. In fact in the general associative case, as remarked earlier, the random sampling error variance of the estimate of $\beta$ tends to be so large as to give very aberrant results.
(vii) When one has available many economic time series all inter-related and when in one's model of several behaviouristic equations only a few of these variables appear in each equation, for the consistent estimation of the coefficients one can use the instrumental variable method, due essentially to O. Reiersol ([5], [6]), the instruments, in regard to any equation, being the variables which do not appear in the equation. This constitutes a particular case of (1.8) above, the equation system for the estimation of the coefficients $\beta_i$ now containing only covariances since the coefficients of the terms in $E X_i^2$, through which intervene the biassing error variances, are assumed to be zero. Suppose the equation contains only two variables $X$ and $Y$ (measured from their means) and the model is

$$y = \beta x, \qquad X = x + u, \qquad Y = y + v,$$

$u$ and $v$ being uncorrelated with one another or with $x$ and $y$. Suppose we have an additional variable $Z$, correlated with $X$ and $Y$ but uncorrelated with $u$ and $v$. Then

$$E\, XZ = E\, xZ$$
$$E\, YZ = E\, yZ = \beta\, E\, xZ$$

so that

$$\beta = E\, YZ / E\, XZ$$

of which

$$b = \sum (Y - \bar{Y})(Z - \bar{Z}) \Big/ \sum (X - \bar{X})(Z - \bar{Z})$$

(where the three variates are not necessarily measured from their means) is a consistent estimate. There is a theorem that when $(X, Y, Z)$ are normally distributed a certain function of $b$ is distributed as the Student-Fisher $t$ [7]. The writer would wish for a simpler proof of this theorem than that which he found, for such a proof might lead to a generalisation, i.e. for any number of variables.

Of course, if $X$ is non-stochastic, i.e. if $u$ is zero, the instrumental variable should be $Z = X$ itself when the solution is the regression one, for the reason (Markov) that, of all possible instrumental variables $Z = X$ yields minimum variance of the estimate of $\beta$. In the multivariate case the corresponding theorem is that the matrix $X$ minimizes the generalised variance of the estimates of the coefficients. The instrumental variable procedure for coefficient estimation has the merit of statistical consistency but at the cost of a measure of asymptotic inefficiency. As we know from sampling practice, sometimes it may be expedient to sacrifice consistency for greater efficiency in estimation and simplicity in calculation.
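
The instrumental variable estimate $b$ of remark (vii) can be compared with ordinary regression on simulated data. This is a sketch only: the normal distributions, unit variances, $\beta = 2$, the seed and the helper name `cross` are assumptions for the illustration, and the instrument is manufactured as the true $x$ plus an independent disturbance.

```python
import random


def cross(a, b):
    """Sum of products of deviations from the sample means."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b))


rng = random.Random(3)   # arbitrary fixed seed (assumption)
T = 50000
beta = 2.0               # assumed true coefficient in y = beta x
x_true = [rng.gauss(0.0, 1.0) for _ in range(T)]
Z = [x + rng.gauss(0.0, 1.0) for x in x_true]          # instrument, independent of u and v
X = [x + rng.gauss(0.0, 1.0) for x in x_true]          # X = x + u
Y = [beta * x + rng.gauss(0.0, 1.0) for x in x_true]   # Y = y + v

b_iv = cross(Y, Z) / cross(X, Z)   # the estimate b of remark (vii)
b_ls = cross(Y, X) / cross(X, X)   # regression of Y on X, biased by the error u

print(b_iv, b_ls)
```

Under these assumptions the regression slope tends to $\beta/2 = 1$ because the error variance of $X$ equals the variance of $x$, whereas the instrumental estimate tends to 2, at the cost of a larger sampling variance.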
II. A PROPERTY OF REGRESSION COEFFICIENTS

It is curious that the following rather fundamental property is not to be found in any of the text-books which the writer has consulted, though he is aware that other colleagues know it, and indeed it might be suspected by anyone familiar with regression theory. Let the original model, in matrix form, be

$$y = \beta X + u \tag{2.1}$$

where $y$ and $u$ are $(1 \times T)$, $\beta$ is $(1 \times k)$ and $X$ is $(k \times T)$. Divide the independent variables into any two groups of $k_1$ and $k_2$ variables so that $k = k_1 + k_2$. Model (2.1) can then be written in the identical form

$$y = \beta_1 X_1 + \beta_2 X_2 + u, \tag{2.2}$$

where now $\beta_1$ is $(1 \times k_1)$, $X_1$ is $(k_1 \times T)$ and similarly for the second term on the right. Let $V_1$ be the residual matrix of the regression of $X_1$ on $X_2$. The property is that the estimate $b_1$ of $\beta_1$ is identical with the transposed regression coefficient vector $c_1$ of $y$ on $V_1$*. This is the generalisation of a proposition due to R. Frisch and F. V. Waugh [8], proved for the case of $k_2 = 1$. The main interest of this property in its general form is computational; as the number of independent variables increases (beyond 4 or 5) partitioning is an increasingly efficient method (using a desk machine) of computing the regression coefficients in terms of number of computational operations involved. It is even possible to determine the numbers $k_1$ and $k_2$, given $k$, which afford the most efficient partition.
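
The property can be verified numerically in the smallest case $k_1 = k_2 = 1$. The sketch below uses simulated data with arbitrary assumed coefficients; the helper names `dot` and `centred` are hypothetical. It also checks the variant in which $y$ is first replaced by its residual $z$ from the regression on $X_2$.

```python
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def centred(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


rng = random.Random(4)   # arbitrary fixed seed (assumption)
T = 200
x2 = centred([rng.gauss(0.0, 1.0) for _ in range(T)])
x1 = centred([0.7 * b + rng.gauss(0.0, 1.0) for b in x2])      # x1 correlated with x2
y = centred([1.0 * a - 2.0 * b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)])

# Full regression of y on (x1, x2): solve the 2 x 2 normal equations for b1
s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
s1y, s2y = dot(x1, y), dot(x2, y)
b1 = (s1y * s22 - s12 * s2y) / (s11 * s22 - s12 ** 2)

# V1: residual of the regression of x1 on x2
g = s12 / s22
v1 = [a - g * b for a, b in zip(x1, x2)]
c1 = dot(y, v1) / dot(v1, v1)        # regression of y on V1

# Variant: z is the residual of the regression of y on x2
h = s2y / s22
z = [c - h * b for c, b in zip(y, x2)]
d1 = dot(z, v1) / dot(v1, v1)        # regression of z on V1

print(b1, c1, d1)                    # identical up to rounding error
```

The three numbers agree to rounding error for any data set, not only on average, since the identity is algebraic rather than stochastic.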
Another form of the property is that if $z$ is the residual matrix of the regression of $y$ on $X_2$ the regression of $z$ on $V_1$ also yields identically the coefficient matrix $b_1$. This is why, for the post-war period, a Cobb-Douglas function for Italy (V. Cao-Pinna [9]) of the form
$$q = \text{const} \times H^{\beta} K^{\gamma} e^{\delta t}, \tag{2.3}$$

where $q$ = GNP (with certain industries excluded) at constant prices, $H$ = hours and $K$ = capital stock at constant prices, had (as Cao-Pinna found) coefficients $\beta$ and $\gamma$ insignificantly different from zero. In fact $q$, $H$ and $K$ were almost linearly increasing with $t$ = time so that when the estimates of $\beta$ and $\gamma$ are interpreted as the regression of the (small) and probably random residuals when the effect of $t$ is excluded from $\log q$, $\log H$ and $\log K$, the nul-result is understandable. The writer is sceptical about results for many countries for the inter-war period, where $\beta$ was so often found equal to about 2/3 and $\gamma$ to 1/3, highly significant both, on the grounds that (i) with $K$ as capital stock (and not capital in use or capital actually consumed in the production process) formula (2.3) could not possibly be a good theory for explaining year-to-year variation in $q$ and (ii) that the "good fits" found were due to spurious correlation helped by the pronounced dip of all variables (including, exceptionally, $K$) in the depression period 1929-35 of which the exp term could obviously not take sufficient account.

III. HAVE INDIVIDUAL REGRESSION COEFFICIENTS OBJECTIVE SIGNIFICANCE?

Since regression is essentially a cause-effect relationship the only valid object of the exercise is to be able to estimate on average the value of $y$ corresponding to given values of the independent, or causal variables. The coefficients are therefore collectively useful. In most cases, especially when economic time series are involved, the individual coefficients are devoid of interest or significance.

* The proof is a pretty exercise in matrix manipulation at student level. G. Tintner [12] has a theorem very like this though he does not use a matrix method to prove it.

Suppose that, in a three independent variable case, one has determined the coefficients $b_1$, $b_2$ and $b_3$ by least square procedure and writes

$$y = b_1 x_1 + b_2 x_2 + b_3 x_3 \tag{3.1}$$

(all measured from their means). Marginal theory teachers are prone to interpret, say $b_2$ as "a rise of one in $x_2$ entails a rise of $b_2$ in $y$ when $x_1$ and $x_3$ remain constant". The trouble is that the ceteris paribus part does not obtain except in the very special and rare case of $x_1$, $x_2$ and $x_3$ being mutually uncorrelated, a case which never arises when one is dealing with macro-economic time series when one can find very high correlations indeed; in fact in a paper [10] of many years ago the writer found a correlation of .97 between employees' compensation and consumers' perishable goods for U.S.A., 1921-38 (using H. Barger's data) and, even after the removal of terms to degree 7 in time the correlation (of residuals) remained as high as .93.

It may even be of some little interest to consider the value of $y$, in the three independent variables case, corresponding to a value $x_2'$ of $x_2$ when account is taken of concomitant variations in $x_1$ and $x_3$, within the logic of regression theory. Let $x_1'$ and $x_3'$ be the average or expected values to be assigned to $x_1$ and $x_3$ respectively consequent to the value $x_2'$ being assigned to $x_2$. From simple regression

$$x_1' = \frac{\sum x_2 x_1}{\sum x_2^2}\, x_2'; \qquad x_3' = \frac{\sum x_2 x_3}{\sum x_2^2}\, x_2'. \tag{3.2}$$

Let $y'$ be the value of $y$ corresponding to these values of $x_1$, $x_2$, $x_3$. Then from (3.1),

$$y' = b_1 x_1' + b_2 x_2' + b_3 x_3' = \frac{x_2'}{\sum x_2^2} \Big( b_1 \sum x_2 x_1 + b_2 \sum x_2^2 + b_3 \sum x_2 x_3 \Big). \tag{3.3}$$

But, from the second of the standard equations for determining the coefficients, the expression in brackets equals $\sum x_2 y$. So finally we have

$$y' = \frac{\sum x_2 y}{\sum x_2^2}\, x_2', \tag{3.4}$$

the simple regression of $y$ on $x_2$. The right answer to the question of the average effect on $y$ of a rise of unity on $x_2$ (any independent variable) is furnished by the simple regression of $y$ on $x_2$, no matter how many other variables or equations there are in the system.
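
The step from (3.3) to (3.4) is an algebraic identity and can be confirmed on any data set. The sketch below simulates three inter-correlated independent variables; the data-generating process, the seed and the helper names (`dot`, `centred`, `solve`) are assumptions for illustration. Index 1 in the zero-based lists corresponds to $x_2$.

```python
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def centred(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        sol[r] = (M[r][n] - sum(M[r][k] * sol[k] for k in range(r + 1, n))) / M[r][r]
    return sol


rng = random.Random(5)   # arbitrary fixed seed (assumption)
T = 2000
common = [rng.gauss(0.0, 1.0) for _ in range(T)]   # shared component: correlated x's
xs = [centred([c + rng.gauss(0.0, 1.0) for c in common]) for _ in range(3)]
y = centred([a + b + c + rng.gauss(0.0, 1.0) for a, b, c in zip(*xs)])

S = [[dot(xi, xj) for xj in xs] for xi in xs]      # sums of squares and products
sy = [dot(xi, y) for xi in xs]
b = solve(S, sy)                                   # b1, b2, b3 of (3.1)

# Bracket of (3.3) divided by sum of x2 squared: effect of a unit rise in x2
total_effect = (b[0] * S[1][0] + b[1] * S[1][1] + b[2] * S[1][2]) / S[1][1]
simple_slope = sy[1] / S[1][1]                     # simple regression of y on x2, (3.4)

print(b[1], total_effect, simple_slope)
```

In this assumed set-up the multiple coefficient $b_2$ is near 1 while the simple regression slope, which equals the total effect exactly, is near 2.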
A rational meaning can therefore in many cases be attributed to the single coefficient in simple regression, and perhaps only in such a case. At the other extreme one must be extremely sceptical on statistical grounds alone about the meaning or usefulness of individual coefficients in the many-variable case when one so well knows that small changes in the basic data (sometimes well within the range of accuracy of the data) can result in substantial changes in the estimates of the coefficients.
The writer is aware that the statement that values found for individual multiple regression coefficients are meaningless has rather devastating implications for marginal analysis in practice. One of the best-known applications is that of price-income demand analysis based on time series in the form (with the usual notation)
$$\log q = \alpha + \beta \log (p/P) + \gamma \log (Y/P) + \delta t + u, \tag{3.5}$$

where $\beta$ and $\gamma$ are interpreted as price and income elasticity. The special Providence which watches over virtuous analysts may ordain that $\log p/P$ and $\log Y/P$ are uncorrelated but if they are not the usual elasticity interpretation seems invalid. If we regress $\log q$ on $\log p/P$ and $t$ we obtain a coefficient $\beta'$ to which a rational meaning in the elasticity context can be attached: no matter how many other causal variables should be in the true formula for $\log q$ and whether these are inter-correlated or not, the regression estimate represents the value of the coefficient when all the other variables assume their average value consequent on $\log p/P$ having a given value. But in (3.5) $\beta$ is not necessarily equal to $\beta'$ and yet $\beta$ seems to be the price elasticity for all values of $\log Y/P$ when the effect of time $t$ is eliminated. Of course, (3.5) in its regression form and under the usual conditions will afford a perfectly valid answer to the problem of expected $q$ consequent on given values of $p$, $P$, $Y$ and $t$. Rather similarly any theory of marginal rates of return to labour and capital based on partial differentials of a production function e.g.

$$q = f(H, K) \tag{3.6}$$

is dubious unless one cares to sponsor the curious hypothesis that $H$ - hours worked - is independent of $K$ - capital stock. Can hours be worked without tools and machines or vice versa?
The foregoing considerations also lead one to the conclusion that much of the preoccupation with the error variances of the coefficients is irrelevant and comparatively unimportant.

In one special case of multiple regression the individual regression coefficients have a meaning, namely when the independent variables are uncorrelated. But in this case, of course, the coefficients are exactly those which would be found on regressing the dependent variable on each of the independent variables separately, i.e. by simple regression. This fact lends some interest to the problem of orthogonalizing the original independent variable system. There is an infinity of linear transformations in which this may be effected, in matrix form as follows

$$Z = B X, \tag{3.7}$$

where $X$ and $Z$ $(k \times T)$ are the original and transformed matrices and $B$ is $(k \times k)$. One transformation has the merit that it is symmetrical in the original independents, namely that in which the transformed variables are the principal components, orthogonal, of course, to one another [11]. This transformation has no stochastic implications whatever: analysis in $Z$ is mathematically identical with the original in $X$ since the estimated value of $y$ given $X$ will be identical with the value of $y$ given $Z$ when $X$ is transformed into $Z$ by (3.7). In the context of economic time series the principal component may take up the greater part of the variance of $y$ and so impart an objective validity to the simple regression coefficient of $y$ on this principal component.

In forecasting what usually matters is the variance of the forecast which, unfortunately for forecasters, depends absolutely on the unit residual variance $\sigma^2$ of the series used to determine the regression, even if this series is very long and even if one makes the most favourable assumptions about stability of coefficients and all the rest. In fact, if in simple regression

$$\hat{Y} = a + b X, \tag{3.8}$$

where $X$ is given and $\hat{Y}$ is the estimate of $Y$, then, as is well-known,

$$\operatorname{Var} \hat{Y} = \sigma^2 \Big[ \frac{1}{T} + \frac{(X - \bar{X})^2}{\sum_t (X_t - \bar{X})^2} \Big] = O(T^{-1}). \tag{3.9}$$

However, this takes care only of the expected or average value of $Y$. For actual $Y$ forecasted the variance is

$$\operatorname{Var} Y = \sigma^2 + \operatorname{Var} \hat{Y}, \tag{3.10}$$

where the first term on the right predominates when the number of observations $T$ is large. If $\sigma$ is of the order of the year to year change in $Y$ no valid forecast can be made in the short term. In the longer term we may be in better case, since we can reasonably assume that we are concerned with the "average" or normal situation and changes from base to reference year may be substantial.
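
Formulas (3.9) and (3.10) are simple enough to evaluate directly. The sketch below uses hypothetical function names (`var_fitted`, `var_forecast`) and an assumed design in which $X$ is the year number $1, 2, \ldots, T$ and the forecast is made one year ahead; $\sigma^2 = 1$ is likewise an assumption.

```python
def var_fitted(x0, xs, sigma2):
    """Variance of the fitted value a + b*x0, as in (3.9)."""
    T = len(xs)
    mean_x = sum(xs) / T
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sigma2 * (1.0 / T + (x0 - mean_x) ** 2 / sxx)


def var_forecast(x0, xs, sigma2):
    """Variance of an actual future Y at x0, as in (3.10)."""
    return sigma2 + var_fitted(x0, xs, sigma2)


sigma2 = 1.0                                  # unit residual variance (assumed)
short_series = list(range(1, 11))             # T = 10 yearly observations
long_series = list(range(1, 1001))            # T = 1000, for contrast

short_fit = var_fitted(11, short_series, sigma2)
long_fit = var_fitted(1001, long_series, sigma2)
short_fc = var_forecast(11, short_series, sigma2)
long_fc = var_forecast(1001, long_series, sigma2)

print(short_fit, short_fc)
print(long_fit, long_fc)
```

Lengthening the series shrinks the variance of the fitted value towards zero, but the variance of the forecast of an actual $Y$ cannot fall below $\sigma^2$.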
IV. REMARKS ON SYSTEMS OF EQUATIONS

When the writer was actively researching on relations between economic variables about fifteen or twenty years ago, it was in the highest degree heretical to take the attitude of letting the figures speak for themselves. No, one must have regard to what was called, and perhaps is still called, "economic theory". Actually the formula of the priestcraft is enshrined in the subtitle of the Econometric Society "An International Society for the Advancement of Economic Theory in its Relation to Statistics and Mathematics", putting Statistics and Mathematics well and truly in their subordinate place. A very poor view was taken of the notion that the problem of establishing relationships between economic time series should be approached in a neutral way, without, however, any abdication of good sense, with the object of finding a set of complete (in the sense of the writer in [7]) linear relations in which the error variance of the system as a whole - perhaps the generalised variance - was as small as possible. He well recalls the shock of disagreement at a session of the International Statistical Institute many years ago when O. Morgenstern (with no doubt deliberate exaggeration) remarked "let's throw all the figures into a computer and see what comes out at the other end". He was possibly the only person present who had any sympathy with Morgenstern. The writer does not assert that the lack of success which has attended efforts to set up economic equation systems was necessarily due to the shackles of the priestcraft: he does know that shackles of any kind are inimical to scientific objectivity and development. While paying a warm tribute to the so very few devoted workers who dared to apply their theory to actual data, the writer must record that the total volume of applied work in trying to derive working macro-economic models has been puny in the extreme. That these efforts were not on a larger scale was due in a degree to the scepticism, the inspissated gloom, of the priestcraft. As La Rochefoucauld almost remarked, they were not too unhappy in their econometric friends' misfortunes. Those who genuinely want to know in measured terms how the economic system works must sever their connection with every prejudice of so-called economic theory and set up computational experiments on a vastly larger scale in the future than in the past.

experiments
Of course in practice the few devoted model-workers did not allow themselves to be spancelled by economic theory. Having made obeisance, they set down perfectly sensible putative relationships (apart from the accounting and definitional identities) such as current consumption being related to current (and possibly lagged) income, that output was related to manpower and capital, that government expenditure was related to taxes and all the rest. One does not need to be an economist to surmise such forms of relationship. Those who tried to verify anything which might properly be called "economics" have not had happy experiences: the role of interest is a case in point.

As already remarked, very few of the coefficients in systems of equations have any significance in themselves; those which have are those occurring in equations with only two variables. That the main object of model-makers is forecasting is implied in the arranged equality of the number of current endogenous variables to the number of equations. Such equality permits the derivation of the reduced form (after the determination of the coefficients), i.e. of expressions for each of the current endogenous variables in terms of predetermined variables. It is high time that model-makers should shed their preoccupation with the coefficients and assert that predominantly the object of model-making is forecasting.

Any models with which the writer is familiar are redolent of cause-effect relationships. Usually one finds that each equation consists of one current endogenous variable on the left and one or more current endogenous variables and predetermined (including lagged endogenous) variables on the right; it is evident that the latter variables, in the thought of the model-maker, are regarded as causes and the variable on the left as the effect. Since number of equations equals number of current endogenous, each of the latter has a solo part in a particular equation. Now this is surely a very curious way to imagine how the system works. How can a variable be simultaneously a cause and an effect? Is one to imagine that the causative variable is to be lagged "a little" in time as compared with ostensibly the same variable as an effect? But, if so, in principle there are two variables and not one. It is not quite satisfactory to rejoin that the values will be only a little different: one doesn't know. One way of dealing with this difficulty is, of course, to insert on the right with each current endogenous variable the same variable lagged one time unit, with the idea that the two values weighted by the coefficients are equal to one lagged value, i.e.

$$\alpha x_t + \beta x_{t-1} = x_{t-\tau}, \qquad \alpha + \beta = 1.$$
This device would have some plausibility if the variable was more or less continuous in time, which it rarely is. Considerably more attention must be given in the future than in the past to the time interval whether one believes that cause-effect is a useful approach to the study of economic relationships or not. To expect good results when one has imposed (usually) the year as the time unit, giving one the choice only of simultaneity or effect after a whole year, is to expect too much. Relationships which are obviously significant after a time lag of a week or a month (when one has the statistics!) often vanish when the figures are totalled for a year. Of course, model-makers cannot be faulted for not working with short time units when the required statistics are not available.

From the forecasting point of view what we really want to know are the values of some $k$ variables in year $T + \tau$ when we know the data for years $t = 1, 2, \ldots, T$. We have no direct interest in what caused what; we just want to know. The equation system is a means to this end, but, as already remarked, all economic model-makers have followed the cause-effect route. The writer suspects that this approach has sometimes involved them in logical contradiction at the stage when the original equation system (with coefficients estimated) is expressed in reduced form for forecasting purposes. As every statistical neophyte knows, having written the simple regression of $Y$ on $X$ in the form

$$Y = a + bX,$$

one cannot state that

$$X = (Y - a)/b,$$

in any very meaningful, as distinct from formal, way. Yet this seems to be the kind of thing one does with transformation to reduced form. It is the writer's growing conviction that when several variables appear in an equation the relation between them should be associative, not regressional; at any rate when the relationship is associative substitution of variables of the type indicated is always permissible. Such sanctity attaches to the full maximum likelihood method of coefficient estimation, that it is commonly overlooked that ML does not produce associative results.

In some models it is customary to introduce variables termed "policy instruments", usually those variables which are under the direct control of government, level of taxation and the like, the problem being to determine the effect on other macro-economic variables of changes in the instruments. One must be very careful here. Suppose the model consists of a single equation

$$y_t = \beta x_{t-1} + u_t, \qquad t = 1, 2, \ldots, T, \tag{4.1}$$

where $y$ is income and $x$ is amount of taxes both measured from their means. The coefficient $\beta$ (presumably negative in sign) is estimated by regression and $T$ is indefinitely large. There might appear to be no difficulty since $x_{t-1}$, at a time unit back, can be conceived as the cause of $y_t$, but in the ordinary meaning of words, $x_{t-1}$ cannot be regarded as the "cause" of $y_t$. Yet on occasion $y$ is a target with value $r$ and the problem arises of finding the value of $x$, say $\xi$, the "instrument", which corresponds to this target.
From (4.1),

$$\xi = (\tau - u)/\beta, \tag{4.2}$$

where u is a nuisance error term. To obtain the right average value of the $\xi$, the instrument variable corresponding to given $\tau$, we must assign to u its average value $\bar{u}$ corresponding to $\tau$. This value is found as

$$\bar{u} = \tau\, Eyu / Ey^2. \tag{4.3}$$

Then, on substitution in (4.2),

$$\xi = \tau\,(Ey^2 - Eyu)/(\beta\, Ey^2) = \tau\, Exy / Ey^2, \tag{4.4}$$

the regression of x on y. This may be obvious in this simple case (namely that if one wants to know the lagged tax level corresponding to target income one should regress lagged tax on current income to find the regression coefficient and not the other way about), yet one wonders if this is always recognised when the issue is complicated by many variables and many equations.
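
The distinction between the formal inversion of (4.1) and the regression of x on y in (4.4) can be checked numerically. The following is only a sketch under assumed values: the coefficient, the two standard deviations, the sample size and the target `tau` are illustrative choices that do not come from the text, and the errors are taken to be normal.

```python
import random


def slope(xs, ys):
    """Least-squares slope of ys on xs through the origin
    (both variables are treated as measured from their means)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


random.seed(0)
beta, n = -0.5, 20000                      # assumed values, for illustration only
x = [random.gauss(0.0, 2.0) for _ in range(n)]    # lagged taxes x_{t-1}
u = [random.gauss(0.0, 1.0) for _ in range(n)]    # error term u_t
y = [beta * xi + ui for xi, ui in zip(x, u)]      # income, generated as in (4.1)

b_yx = slope(x, y)   # regression of y on x: estimates beta
b_xy = slope(y, x)   # regression of x on y: Exy / Ey^2, as in (4.4)

tau = 1.0                       # hypothetical income target
xi_naive = tau / b_yx           # formal inversion of (4.1), ignoring u
xi_regression = tau * b_xy      # instrument value given by (4.4)

print("b_yx =", round(b_yx, 3), " b_xy =", round(b_xy, 3))
print("naive instrument =", round(xi_naive, 3),
      " regression instrument =", round(xi_regression, 3))
```

With these assumed variances the two answers differ by a factor of about two, since the regression coefficient of x on y is $\beta Ex^2 / (\beta^2 Ex^2 + Eu^2)$ and not $1/\beta$.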
The writer would very much like to provoke a discussion on this point at this Conference. He would like his colleagues to address themselves to this little problem in particular. Let the model of a consumption function be

$$C = \beta Y + u \tag{4.5}$$

and suppose that C personal consumption and Y personal income are so defined that when Y = 0 then C = 0. Can I estimate $\beta$ by

$$\beta = \bar{C} / \bar{Y}, \tag{4.6}$$

the associative formula, in this case, or is my estimate to be (as usual) the regression of C on Y? Before you answer too hastily let me warn you that, despite the initial hypotheses of zero association of Y and C, the regression will contain a positive constant term, contrary to hypothesis. In fact, in the associative case, the model is

$$C = c + u, \qquad Y = y + v, \qquad c = \beta y, \tag{4.7}$$

where u and v are error terms with means zero uncorrelated with one another and with the inherent but unknown variables c and y. When the number of sets of observations is indefinitely large

$$Ec / Ey = EC / EY. \tag{4.8}$$

But from regression of C on Y the absolute term $\alpha'$ is given by

$$\alpha' = EC - \beta'\, EY, \tag{4.9}$$

where

$$\beta' = E(C - EC)(Y - EY) / E(Y - EY)^2. \tag{4.10}$$

From (4.7), (4.9) and (4.10) and using the assumed properties of u and v, we find

$$\alpha' = \beta \sigma^2\, EY / E(Y - EY)^2, \tag{4.11}$$

where $\sigma^2 = Ev^2$. The constant term $\alpha'$ is accordingly positive when EY is positive (as it will always be) and $\sigma \neq 0$.
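
A small simulation can illustrate (4.6) to (4.11). It is a sketch only: the value of the coefficient, the normal distributions of the inherent income and of the two error terms, and the sample size are all assumptions made for the illustration, not figures from the text.

```python
import random

random.seed(1)
beta, n = 0.8, 50000          # assumed values, for illustration only
sigma_v = 1.5                 # assumed standard deviation of the error v in Y

y_true = [random.gauss(10.0, 2.0) for _ in range(n)]        # inherent income y
Y = [yt + random.gauss(0.0, sigma_v) for yt in y_true]      # observed Y = y + v
C = [beta * yt + random.gauss(0.0, 1.0) for yt in y_true]   # observed C = c + u, c = beta*y

mean_Y = sum(Y) / n
mean_C = sum(C) / n
var_Y = sum((Yi - mean_Y) ** 2 for Yi in Y) / n
cov_CY = sum((Ci - mean_C) * (Yi - mean_Y) for Ci, Yi in zip(C, Y)) / n

beta_assoc = mean_C / mean_Y                 # associative formula (4.6)
beta_reg = cov_CY / var_Y                    # regression slope (4.10)
alpha_reg = mean_C - beta_reg * mean_Y       # regression constant (4.9)
alpha_theory = beta * sigma_v ** 2 * mean_Y / var_Y   # right-hand side of (4.11)

print("associative estimate of beta:", round(beta_assoc, 3))
print("regression slope:", round(beta_reg, 3))
print("regression constant:", round(alpha_reg, 3),
      " value from (4.11):", round(alpha_theory, 3))
```

Under these assumptions the associative ratio stays close to the assumed coefficient, while the regression slope is pulled downward and the constant term is clearly positive, in line with (4.11).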
The writer's question "Are the relations we want to be causative or associative?" is not a rhetorical one; he really wants to know. There are conceptual difficulties in regarding the relationship as associative, in particular in any economic variable X being decomposable into an inherent part x, to which the relationships are supposed to pertain and a purely random part u, so that the observation

$$X = x + u,$$

which many will have trouble in accepting especially when the economic series are time series. From the writer's own work [10], associative relationship between time series tends to be relationship between secular trends whereas the results previously mentioned tend to show strong relationships between residuals after trends are removed. Are associative relationships suitable for longer term forecasting but not suitable for short-term?
If one succeeds in establishing associative relations (and this will be by no means easy in practice, though there is plenty of theory) no contradictions or paradoxes of substitution of the kinds indicated above can arise.
V. THE ERROR TERM IN EQUATION SYSTEMS

In the historical development of the mathematical theory of errors, error was conceived as an error of measurement, due to human fallibility or the limitations of the measuring instruments. It was easy to attribute conceptually the quality of randomness to errors of this kind. Now in those days the astronomical and other physical laws which had emerged or were emerging were simple in character: often indeed their form could be inferred theoretically, the actual observations being required only for verification - a simple situation indeed. In the social sciences, on the other hand, the laws, if any obtain, are immensely more complicated than in the physical sciences. Theoretical and qualitative economics give us but little guidance in the mathematical form of these laws. Geometrical and mathematical economics use only the simplest functional forms and these, one suspects, are selected more for their simplicity and for classroom purposes than for any conviction on the part of their inventors that they represent reality. You will have noticed, for example, that in time series they almost invariably have the solution $y = Ce^{\alpha t}$, which no economic time series has obeyed except between consecutive years. Came the econometrician. He showed little disposition to work in functional relationships higher than the first degree though there may be this to be said for him, in justification, that in introducing lagged terms into his equations, he was implicitly using the calculus of finite differences, just one remove therefore from linear differential equations, which, as you are aware, can involve solutions of high functional complexity. One quality shared by economists and econometricians alike is a distinct preference for the dialectic and for mathematical abstractions as against the brutalising discipline of numerical calculation.
Inevitably there appeared discrepancies between theory and practice. The "expected" values were found to deviate in greater or lesser degree from the true values, if one can so politely term the statistics. To make up for the discrepancy an error term was introduced. Stochastic theory was then available for coefficient-estimation and tests of significance, with R. A. Fisher's tests of consistency, efficiency and the well-known properties of maximum likelihood estimation. By far the greater volume of data were time series for which it was found necessary to make a considerable extension of existing statistical theory, due principally to the fact of serial correlation in the statistical time series.
The error term in any equation is the measure of what we don't know. In the social sciences knowledge of law of cause and effect is far less than in the case of experimental science. In economic investigation we have to make the best use of what we can get and the statistics available tend to be of unsuitable definition, inaccurate and incomplete. In addition, we don't know in advance the mathematical form of the law or laws of relationship. It is really only in the field of sampling social surveys that the economic statistician is in anything like the situation of the experimental statistician in having his measurements and the whole plan of his inquiry under control.
It is not as clearly recognised as it should be that the introduction of the random variable completely changed the character of mathematical economics in the broader sense. Any reasonable system of behaviouristic equations in time series will contain lagged as well as current endogenous variables and it is customary to arrange that the number of endogenous variables equals the number of equations. The formal solution contains a term linear in the random variables with coefficients which are more or less estimable but this error term is of the same order of magnitude as the variable to be determined. As remarked earlier, in mathematical economics without the errors the solution is usually in exponential or Fourier form: in any realistic solution these terms will long since have vanished when account is taken of the error terms. The point is that the character of the solution is fundamentally altered by the introduction of error terms: the formal solution for each endogenous variable for current time is an expression linear in the error terms and in the exogenous variables, stretching back in time to the start of the series.
The special problems of economic time series posed theoretical problems of special interest to the mathematician and many of these have been ingeniously solved. These problems and their solution have justly endowed this particular branch of mathematical statistics with a high prestige, certainly a much higher prestige than it deserves for its practical usefulness.
In economic equations, single or in sets, beguiled by theory, we are asking that error term to do too much. Surely, in reason, we cannot expect much of a stochastic theory when we make the error term stand for all the variables which should be in the equations if only we knew what they were, for the errors of measurement in the variables we have included and for the inevitable simplification of the law of relationship.
Since all the series exhibited the phenomenon of serial correlation usually in emphatic degree and since the simple models could not possibly represent the immensely complicated working of the economic system correctly, it was inevitable that the phenomenon of autocorrelation of residuals should appear in the results. It was surely not surprising that if this phenomenon is admitted as part of the hypothesis, the model could not yield satisfactory results in practice. The postulate that the residuals are (i) independent of the predetermined variables and at the same time include, as they must, (ii) the contributions of variables (necessarily serially correlated) not explicit in the equations because they are not known, seems a contradiction in terms, for the reason that the unknowns and therefore the residuals which encompass them cannot be postulated as independent of the known predetermined variables.
It is the writer's conviction that the hypothesis of auto-regression of residuals in equation systems based on time series (however attractive it is mathematically) is inadmissible from the practical point of view. If, after estimation of the coefficients in any particular hypothetical equation, the residuals exhibit this phenomenon the equation should be rejected or, by trial and error (adding fresh variables or taking others out), the original equation should be amended until one attains non-autocorrelated residual errors. Admittedly this is a highly empiricist point of view. The writer believes that, when all the original time series are so highly autocorrelated, the best criterion of adequacy of relationship is that the residuals should be found to be completely random by the von Neumann or other tests.
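
As an illustration of the kind of check meant here, the sketch below computes the von Neumann ratio (mean square successive difference divided by the variance), which is close to 2 for a random series and well below 2 for a positively autocorrelated one. The two simulated series, their length and the autoregressive coefficient 0.8 are assumptions made only for the example; a formal test would compare the ratio with its tabulated sampling distribution.

```python
import random


def von_neumann_ratio(e):
    """Mean square successive difference divided by the variance of e."""
    n = len(e)
    mean = sum(e) / n
    msd = sum((e[t] - e[t - 1]) ** 2 for t in range(1, n)) / (n - 1)
    var = sum((v - mean) ** 2 for v in e) / n
    return msd / var


random.seed(2)
n = 5000                                    # assumed series length

# A purely random series of residuals.
white = [random.gauss(0.0, 1.0) for _ in range(n)]

# Autocorrelated residuals: e_t = 0.8 * e_{t-1} + random shock (assumed model).
auto = [0.0] * n
for t in range(1, n):
    auto[t] = 0.8 * auto[t - 1] + random.gauss(0.0, 1.0)

ratio_white = von_neumann_ratio(white)
ratio_auto = von_neumann_ratio(auto)
print("random residuals:        ", round(ratio_white, 3))
print("autocorrelated residuals:", round(ratio_auto, 3))
```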
If this viewpoint be accepted then models incorporating the hypothesis of residual auto-regression are erroneous. Consider the model

$$y_t = \beta y_{t-1} + v_t, \tag{5.1}$$

where the $v_t$ are autocorrelated. The simplest model of this would be

$$v_t = \alpha v_{t-1} + u_t, \tag{5.2}$$

where $u_t$ are non-autocorrelated and non-correlated with $v_{t-1}$, $v_t$ being uncorrelated with $y_{t-1}$. The problem is to estimate $\beta$ from a series of T y's. In my opinion this, the simplest possible case of the kind of thing which is not uncommon in more complicated form, is a wrong formulation. More correctly, by substitution from (5.1) in (5.2),

$$y_t - \beta y_{t-1} = \alpha (y_{t-1} - \beta y_{t-2}) + u_t$$

or

$$y_t = (\alpha + \beta) y_{t-1} - \alpha\beta\, y_{t-2} + u_t. \tag{5.3}$$

The latter surely is the law we are seeking. We are interested in estimating $(\alpha + \beta)$ and $\alpha\beta$ for the purpose of forecasting $y_t$. The $\alpha$ and $\beta$ in the original formulation are of no interest in themselves.*
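
The passage from (5.1) and (5.2) to (5.3) can be illustrated by simulation. The sketch below generates a series from (5.1) and (5.2) with assumed coefficients 0.7 and 0.2 (illustrative values, not from the text), fits the two-lag equation (5.3) by least squares without a constant, and recovers the sum and the product of the two coefficients. The two roots obtained at the end are symmetric in the two coefficients, so the data cannot say which root belongs to (5.1) and which to (5.2).

```python
import random

random.seed(3)
beta, alpha, n = 0.7, 0.2, 20000      # assumed values, for illustration only

# Generate y_t = beta*y_{t-1} + v_t with v_t = alpha*v_{t-1} + u_t.
y, v = [0.0], 0.0
for _ in range(n):
    v = alpha * v + random.gauss(0.0, 1.0)
    y.append(beta * y[-1] + v)

y0, y1, y2 = y[2:], y[1:-1], y[:-2]   # y_t, y_{t-1}, y_{t-2}

# Least squares of y_t on y_{t-1} and y_{t-2}: solve the 2x2 normal equations.
s11 = sum(a * a for a in y1)
s22 = sum(b * b for b in y2)
s12 = sum(a * b for a, b in zip(y1, y2))
s01 = sum(a * b for a, b in zip(y0, y1))
s02 = sum(a * b for a, b in zip(y0, y2))
det = s11 * s22 - s12 ** 2
phi1 = (s01 * s22 - s02 * s12) / det
phi2 = (s02 * s11 - s01 * s12) / det

sum_est = phi1          # estimates alpha + beta
prod_est = -phi2        # estimates alpha * beta

# Regressing y_t on y_{t-1} alone does not recover beta.
naive = s01 / s11

# Roots of z^2 - (alpha+beta) z + alpha*beta = 0.
disc = max(sum_est ** 2 - 4.0 * prod_est, 0.0)
roots = ((sum_est + disc ** 0.5) / 2.0, (sum_est - disc ** 0.5) / 2.0)

print("alpha + beta:", round(sum_est, 3), " alpha * beta:", round(prod_est, 3))
print("one-lag regression coefficient:", round(naive, 3))
print("roots (unordered pair):", [round(r, 3) for r in roots])
```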
* I am indebted to M. H. Quenouille for the interesting observation that if, as appears to be the only method, the solution of the problem of estimating $\alpha$ and $\beta$, given by (5.1) and (5.2) from a set of observations is via (5.3), then, since $\alpha + \beta$ and $\alpha\beta$ are symmetrical, the estimates of $\alpha$ and $\beta$ are indistinguishable from one another.

And here is an example of rather a different character discussed by many authors, though the present gloss is the writer's own. The model is

$$C_t = \beta Y_t + u_t, \qquad Y_t = C_t + I_t, \qquad t = 1, 2, \ldots, T, \tag{5.4}$$

$u_t$ random, $C_t$ and $Y_t$ endogenous and $I_t$ exogenous. The object is to estimate $\beta$, presumably designed as the propensity to consume, though we shall see that it isn't. First (and there is a hint here) try to set up this system in this form given the columns of $u_t$ and $I_t$ as well as the coefficient $\beta$. You will find you cannot. You can only do so by reducing the form to either

$$Y_t = \beta' I_t + u_t', \qquad \beta' = \frac{1}{1 - \beta}, \quad u_t' = \frac{u_t}{1 - \beta}, \tag{5.5}$$

or

$$C_t = \beta'' I_t + u_t'. \tag{5.6}$$

Actually when solved by least squares the last two equations are found to be consistent in that $\beta' - \beta'' = 1$, as should be the case since

$$\frac{1}{1 - \beta} - \frac{\beta}{1 - \beta} = 1.$$

We may also remark (though this is not the point) that $\beta$ estimated by least squares directly from the first of (5.4) is inconsistent with the estimates $\beta'$ and $\beta''$. The point is that (5.4) is a false formulation which states, if it states anything, that at given level of $Y_t$ we expect (subject to a random error) that $C_t$ will have a given value, i.e. a consumption function. What we are really saying is that $Y_t$ and $I_t$ or $C_t$ and $I_t$ are related, a quite different economic theory, a capital/output theory, or, equivalently, a capital/consumption theory. This may be a trivial example but it raises the whole question of the validity of the solution of a mixture of behaviouristic equations and the accounting identities and allowing oneself, as is common, complete freedom of action as regards elimination of variables before proceeding to solution by maximum likelihood or other methods.
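
The remarks on (5.4) to (5.6) can be tried out on artificial data. In the sketch below the coefficient 0.6, the distributions of $I_t$ and $u_t$ and the sample size are assumptions chosen for illustration, and the least-squares slopes are computed about the sample means (a choice made here, not stated in the text). The data are necessarily built from the reduced form, which is the "hint" referred to above.

```python
import random


def slope(xs, ys):
    """Least-squares slope of ys on xs, both taken about their sample means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


random.seed(4)
beta, n = 0.6, 20000                               # assumed values, for illustration only
I = [random.gauss(5.0, 1.0) for _ in range(n)]     # exogenous I_t
u = [random.gauss(0.0, 1.0) for _ in range(n)]     # random u_t

# The system can only be generated through the reduced form (5.5):
Y = [(i + e) / (1.0 - beta) for i, e in zip(I, u)]
C = [yv - i for yv, i in zip(Y, I)]                # identity Y_t = C_t + I_t

b1 = slope(I, Y)             # estimate of beta'  = 1 / (1 - beta)
b2 = slope(I, C)             # estimate of beta'' = beta / (1 - beta)
beta_implied = b2 / b1       # beta recovered from the two reduced-form slopes
beta_direct = slope(Y, C)    # least squares applied directly to the first of (5.4)

print("beta' - beta'':", round(b1 - b2, 6))
print("beta implied by reduced form:", round(beta_implied, 3))
print("beta from direct regression of C on Y:", round(beta_direct, 3))
```

With these assumed variances the direct regression of C on Y settles near 0.8 rather than 0.6, while the reduced-form slopes differ by exactly one and imply a coefficient close to the assumed 0.6.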
When we state that an equation in our economic model is

$$Y_t = \alpha + \sum_{i=1}^{k} \beta_i X_{it} + u_t, \qquad t = 1, 2, \ldots, T, \tag{5.7}$$

what is our picture of the reality? Of course, we have no illusions about the linearity of the equation: we usually cross that hurdle (which will trouble us no more in this section) by assuming, sometimes against all the evidence, that we are concerned only with the estimation of "small" deviations from some norms, when we have Taylor's Theorem to sustain us. If (5.7) is a classical regression equation the assumption of linearity affects only the dependent variable Y since the X's, being pre-determined, can have any functional form whatever, e.g. $X_2$ can be $X_1^2$, or what we wish. But can anyone be happy about the classical regression model in economic applications, that Y on the one hand and the X's on the other hand have so very different stochastic properties; for what we are saying is that Y is envisaged as exactly equal to a linear expression in the X's together with a random scattering of values u uncorrelated with the linear term? Whatever view be taken of the associative concept, surely one must be unhappy about this as an image of economic reality or anything approaching it?
Or do we say that Y is exactly representable by a linear expression of which we know only k + 1 terms? That, in fact, u has the following form:

$$u_t = \sum_{j=1}^{k'} \beta_{k+j} X_{k+j,\,t}, \tag{5.8}$$

with no knowledge whatever of the number of additional variables k', the coefficients or the $X_{k+j}$. (Of course it would be more sensible in these circumstances to use a single symbol for each term; the expression is written in the way it is to point the analogy with (5.7)). We assume throughout that var (u) is an ordinary magnitude: if it were "small" there would be no problem. If we knew the values of the $X_{k+j}$ then any k + k' + 1 sets of $(Y; X_i, X_{k+j})$ would serve to obtain the exact values of the $(\alpha; \beta_i, \beta_{k+j})$, consistent with the whole T > k + k' + 1 sets of values. If the correlation of each of the $X_i$ with each of the $X_{k+j}$ were exactly zero the values of the $\beta_i$ found would be exactly equal to those found by regression from (5.7). What regression has done is to give the first k + 1 terms of a linear expression containing k + k' + 1 terms. If the correlations between the X's in the two sets, instead of being all exactly zero, were simply not significantly different from zero in the random sample of T sets of observations then the $\beta_i$ calculated by regression would be unbiased estimates of the true values $\beta_i$.
It is difficult to believe that the known and unknown independent variables would divide themselves up into two groups like this, unless, of course, the relationship was associative and complete in which case the error term would merely synthesize the random errors in the $(Y; X_i)$. In the known set, typically, the numbers are all mutually correlated and each is auto-correlated in time as well; since non-correlation with the known set is postulated in the unknown set, it is unlikely that members of the latter are auto-correlated and lack of this property seems to disqualify them as time series.
The process of regression imposes non-correlation between the residual u and the variables $X_i$ in (5.7). If in truth u has the form (5.8) where the X variables exist (though we do not know them) and if, in fact, some of these variables are correlated with some of the X's in the known set then regression causes a distortion in the estimates of $\beta_i$ which are not consistent with their true values. If these true values are supposed to have some kind of economic validity, so much the worse for the regression process.
If, after estimation of the coefficients $\beta_i$ in (5.7) by regression, one finds on substitution that the estimates of the individual residuals are auto-correlated in time, this result seems to establish a prima facie case for the fact that u has in fact the form (5.8) with at least one of the coefficients $\beta_{k+j}$ non-zero, the corresponding variable values $X_{k+j}$ having ordinary magnitudes and the variable having the expected property of auto-correlation. In a word the variable exists and the obvious course is to go look for it instead of postulating the property of auto-correlation in the residuals, of which no practical good can come.
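
The argument of this section can be sketched with one known variable and one "unknown" variable, both auto-correlated and mutually correlated. Everything numerical below (the coefficients 2 and 3, the autoregressive coefficient 0.9, the sample size, the small random term added to Y) is an assumption made for illustration. Leaving the second variable out distorts the coefficient of the first and leaves auto-correlated residuals; once the missing variable is found and included, both defects disappear.

```python
import random


def ar1_series(n, rho):
    """Auto-correlated series x_t = rho * x_{t-1} + random shock."""
    out, prev = [], 0.0
    for _ in range(n):
        prev = rho * prev + random.gauss(0.0, 1.0)
        out.append(prev)
    return out


def centred(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def lag1_autocorr(e):
    e = centred(e)
    return sum(a * b for a, b in zip(e[1:], e[:-1])) / sum(a * a for a in e)


random.seed(5)
n = 30000                                     # assumed sample size
x2 = ar1_series(n, 0.9)                       # the variable we "do not know"
w = ar1_series(n, 0.9)
x1 = [0.5 * a + b for a, b in zip(x2, w)]     # known variable, correlated with x2
y = [1.0 + 2.0 * a + 3.0 * b + random.gauss(0.0, 0.5) for a, b in zip(x1, x2)]

yc, x1c, x2c = centred(y), centred(x1), centred(x2)

# Regression on the known variable only.
b_short = sum(a * b for a, b in zip(x1c, yc)) / sum(a * a for a in x1c)
res_short = [yy - b_short * a for yy, a in zip(yc, x1c)]

# Regression on both variables: solve the 2x2 normal equations.
s11 = sum(a * a for a in x1c)
s22 = sum(b * b for b in x2c)
s12 = sum(a * b for a, b in zip(x1c, x2c))
s1y = sum(a * c for a, c in zip(x1c, yc))
s2y = sum(b * c for b, c in zip(x2c, yc))
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det
res_full = [yy - b1 * a - b2 * b for yy, a, b in zip(yc, x1c, x2c)]

r_short = lag1_autocorr(res_short)
r_full = lag1_autocorr(res_full)
print("coefficient of x1, x2 omitted:", round(b_short, 3), "(true value 2)")
print("coefficients with x2 included:", round(b1, 3), round(b2, 3))
print("lag-1 autocorrelation of residuals:", round(r_short, 3), "versus", round(r_full, 3))
```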
VI. INTEGRAL SOLUTION OR INDIVIDUAL LEAST SQUARES?

On this famous issue the writer's attitude might be described as one of malevolent neutrality. As he attaches little importance to individual coefficient-estimation, if he had to choose he would incline towards the solution of each equation in the system by least squares, preferring the plane of closest fit to regression, if one is unable to use the full associative solution. The writer frankly confesses himself to be a rank empiricist. As we don't know the laws of economics and probably will never know them, let's go find what has worked best over a series of years by all kinds of experiments, by trial and error. While ML applied integrally yields asymptotically the most efficient estimates of the coefficients, it by no means follows that this is necessarily the case when the time series is short. A good deal of imprecision is likely to remain in the best models and in these circumstances simplest methods of solution seem best.
If one attaches no special importance to the estimation of individual coefficients but wishes to use the system only for forecasting - and if, of course, one believes in the least squares or ML methods of solution - there seems to be no good reason why one should retain original form, with all its difficulties of calculation, in solving the system. Instead, proceed at once to reduced form, and solve that. The original form, if soundly based, will, of course, serve to define the predetermined variables which should appear in each reduced form equation, i.e. the variables with presumed non-zero coefficients. There are no identification problems with reduced form and each equation can be solved separately by least squares.
VII. CONCLUSION

Every econometrician has his own definition of "econometrics", yet all would probably agree that the science has something to do with economic measurement, i.e. economic statistics. Yet in a recent very large issue of Econometrica only one-third of the pages in the contribution section were devoted to articles which could be regarded as "econometric" by this definition. The writer is aware of the Society's Policy Statement of 1954 but perhaps since then matters have got a little out of control. No one can object to mathematics as the noblest exercise of the human spirit but the disproportion between mathematics and application in statistical science is to be regretted. All articles accepted for Econometrica should at least have application in mind, even if they themselves do not contain applications. They should therefore be couched in terms comprehensible by potential practitioners who are not mathematicians in the offensive sense of the term. The Annals of Mathematical Statistics and Biometrika presumably exist to accommodate papers of adequate quality which are not applied papers in the sense indicated. The writer speaks with some feeling on this matter since in the issue of Econometrica referred to there was a short article on an important topic which interests him greatly at the moment yet he was stopped in his tracks at an early stage by the statement "let N (X) = {j : xi = C}", X being a transposed vector. It may well be that every young mathematician knows what this means; the present writer, in the language of his grandchildren, "hasn't a clue"; an additional sentence or two might have made all clear. Existence theorems, stronger theorems and epsilon-mongering are all very well in their proper place but that place should not be a journal devoted to an applied science.
Surely the vast potential of the modern digital computer, rendering possible a thousand computational combinations where one could not have been envisaged before, should be harnessed to econometric ends. Let us not favour a priori any particular model but try them all. The need is urgent for the whole world has turned to planning and forecasting and we econometricians are in the sorry position that, with all our theory, we have so little that we can confidently recommend on the only grounds that matter, namely that our theory has been tried out and found successful in practice. Certain it is that the great majority of planners use the most elementary statistical devices in making their forecasts. Econometric research schools in universities and institutes should turn their resources overwhelmingly towards application, relegating theory to the role of general guide, to be modified in the light of computational results.
At the same time we should exercise all the pressure we can on official statistical organisations to give us the statistics, particularly at the macro-economic level, which we require. National account statistics in even the statistically more advanced countries are of poor quality, are insufficient in scope and come out too late. Yet with the advent of planning these statistics have assumed essential importance. As a one-time government statistician I know how anxious my colleagues everywhere are to improve these statistics and my strictures are not addressed to them: for good national account statistics, which encompass all economic statistics, the cooperation of all sectors of the economy is necessary. The element most inimical to the development of these statistics is apathy and disinterest on the part of industrialists and businessmen generally.
To end, may I summarize the principal discussion points in the paper proper:
(i) In systems of economic equations the individual coefficients have little significance; all that really matters is the estimation formulae.
(ii) In establishing sets of behaviouristic equations should we seek associative or cause-effect relationships?
(iii) The hypothesis of auto-regression of residuals in time series is unuseful and misleading in the economic context; we must go on experimenting with actual data until we find relationships in which residuals are truly random in time and in everything else.
(iv) Is reduced form the only valid form?

REFERENCES
[1] Geary, R. C. Inherent relations between random variables. Proceedings of the Royal Irish Academy (A), 47 : 6. 1942.
[2] Geary, R. C. Relations between statistics: the general and the sampling problem when the samples are large. Proceedings of the Royal Irish Academy (A), 49 : 10. 1943.
[3] Berkson, J. Are there two regressions? Journal of the American Statistical Association, 45. 1950.
[4] Geary, R. C. Non-linear functional relationship between two variables when one variable is controlled. Journal of the American Statistical Association, 48. 1953.
[5] Reiersøl, O. Confluence analysis by means of lag moments and other methods of confluence analysis. Econometrica, 9. 1941.
[6] Reiersøl, O. Confluence analysis by means of instrumental sets of variables. Uppsala. 1945.
[7] Geary, R. C. Determination of linear relations between systematic parts of variables with errors of observation the variances of which are unknown. Econometrica, 17 : 1. 1949.
[8] Frisch, R., Waugh, F. V. Partial time regression as compared with individual trends. Econometrica, 1. 1933.
[9] Cao-Pinna, V. Validité théorique et empirique d'une prévision globale de la croissance de l'économie italienne de 1958 à 1970. In: Europe's Future in Figures. North Holland Publishing Company, Amsterdam, 1962. Chapter 4.
[10] Geary, R. C. Studies in relations between economic time series. Journal of the Royal Statistical Society (B), 10 : 1. 1948.
[11] Geary, R. C. The contiguity ratio and statistical mapping. The Incorporated Statistician, 5 : 3. 1954.
[12] Tintner, G. Econometrics. Wiley, New York, 1952. Chapter 11.

SUMMARY
In this paper the author examines some fundamental problems in the theory of relations between stochastic variables. Regression essentially implies a relation of cause-effect character, the independent variables being the causes and the dependent variable the effect. Regression must not be confused with a relation of the associative type, the linear theory of which is sketched in the text. In the associative theory there is no need to invoke the cause-effect hypothesis.
In the regression of several variables the individual coefficients are without significance or importance except only in the special case of uncorrelated independent variables. The sole purpose of regression on several variables is to estimate (for forecasting etc.) the mean value of the dependent variable for given values of the independent variables. In econometrics, it is only in the case of simple regression that the coefficient has a meaning. By way of example, it is shown that, to estimate the elasticities of prices and wages from time series, each of these variables must be calculated separately by a simple regression after the trend has been eliminated.
The author asks whether the notion of a system of structural equations is useful in econometrics. It is only in the case of the reduced form, where each equation contains a single endogenous variable, that the theory has a practical value, for forecasting.
The author raises the question of the practical usefulness of the hypothesis of auto-regression of errors in time series. He maintains the thesis that the errors must be supposed absolutely random. He shows, from well-known economic examples, that the hypothesis of auto-regression leads to incorrect statements concerning the relations.
All the statements are summarized in the form of questions at the end of the paper, to serve as a basis for discussion.
