
Techniques for Testing the Constancy of Regression Relationships over Time

Author(s): R. L. Brown, J. Durbin, J. M. Evans


Source: Journal of the Royal Statistical Society. Series B (Methodological), Vol. 37, No. 2
(1975), pp. 149-192
Published by: Blackwell Publishing for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2984889 .
Accessed: 13/02/2011 11:50


By R. L. BROWN, J. DURBIN and J. M. EVANS

R. L. Brown: Central Statistical Office
J. Durbin: London School of Economics and Political Science
J. M. Evans: Central Statistical Office

[Read before the ROYAL STATISTICAL SOCIETY at a meeting organized by the RESEARCH SECTION on Wednesday, December 4th, 1974, Professor R. L. PLACKETT in the Chair]

SUMMARY
Methods for studying the stability over time of regression relationships are considered. Recursive residuals, defined to be uncorrelated with zero means and constant variance, are introduced and tests based on the cusum and cusum of squares of recursive residuals are developed. Further techniques based on moving regressions, in which the regression model is fitted from a segment of data which is moved along the series, and on regression models whose coefficients are polynomials in time are studied. The Quandt log-likelihood ratio statistic is considered. Emphasis is placed on the use of graphical methods. The techniques proposed have been embodied in a comprehensive computer program, TIMVAR. Use of the techniques is illustrated by applying them to three sets of data.
Keywords: CUSUM; REGRESSION RESIDUALS; RECURSIVE RESIDUALS
1. INTRODUCTION
THIS paper describes and exemplifies a set of techniques for detecting departures from constancy of regression relationships over time when regression analysis is applied to time-series data. All the techniques described have been embodied in a computer program, TIMVAR. Enquiries about the availability of this program should be addressed to the Computer Development for Statistics Unit of the Central Statistical Office. A "User's Guide" to the program (Evans, 1973) is available from the Central Statistical Office. In what follows, the name TIMVAR will be used indifferently to describe either the set of methods used or the computer program written to implement them.
The theory underlying the paper was developed jointly by Brown and Durbin who gave a preliminary account of it in Brown and Durbin (1968). The original version of the program was written by C. E. Rogers and later work on it was done by R. P. Bayes and Evans. Mr Brown unfortunately died in 1972 so the actual writing of the paper was done by Durbin and Evans who accept full responsibility for the final version. However, since they have made substantial use of material left by Mr Brown they feel that he should be regarded as a co-author of the paper.
Regression analysis of time-series data is usually based on the assumption that the regression relationship is constant over time. In some applications, particularly in the social and economic fields, the validity of this assumption is open to question, and it is often desirable to examine it critically, particularly if the model is to be used for forecasting.
TIMVAR includes formal significance tests but its philosophy is basically that of data analysis as expounded by Tukey (1962). Essentially, the techniques are designed to bring out departures from constancy in a graphic way instead of parametrizing particular types of departure in advance and then developing formal significance tests intended to have high power against these particular alternatives. From this point of view the significance tests suggested should be regarded as yardsticks for the interpretation of data rather than leading to hard and fast decisions.
The problem we consider is a special case of a general class of problems concerned with the detection of changes of model structure over time, but we shall not attempt to review here the extensive literature dealing with the whole range of problems. Apart from citing references which have specific relevance to our own treatment we merely call attention to two papers of special importance, those of Chernoff and Zacks (1964) and Hinkley (1972).
The next section begins by specifying the basic regression model and the null hypothesis under consideration. It goes on to show how this hypothesis can be investigated by constructing plots of cumulative sums and sums of squares of the so-called recursive residuals. These are the standardized residuals from the regression of each observation $y_t$, the regression coefficients being calculated from the observations $y_1, \ldots, y_{t-1}$, for $t = k+1, \ldots, T$, where $k$ is the number of regressors and $T$ is the number of observations. It is shown that on the null hypothesis the recursive residuals are uncorrelated with zero mean and constant variance and are therefore independent under the normality assumption. Suitable formulae for carrying out the recursive calculations in a highly economical way are presented.
Other methods of transforming least-squares residuals to independent $N(0, \sigma^2)$ variables have been given by several authors including Theil (1965, 1968) and Durbin (1970). However, the recursive residuals seem preferable for detecting the change of model over time since until a change takes place the recursive residuals behave exactly as on the null hypothesis. When the change does occur, one hopes that signs of it will soon be apparent. With the other methods one would normally expect the effects of the change to be spread over the full set of transformed residuals.
In Section 2.5 further techniques based on plotting the coefficients obtained by fitting the model to a segment of $n$ successive observations and moving this segment along the series are presented. The plots are supplemented by a homogeneity test based on the analysis of variance. Section 2.6 considers the fitting and testing of time-trending regressions in which each coefficient is represented as a polynomial in time. The final technique considered is the plotting of Quandt's log-likelihood ratio statistic, intended to detect the single time-point, if any, at which there is a discontinuous change from one constant set of regression parameters to another.
In Section 3 the techniques developed are applied to three sets of real data taken from the field of economics. These examples illustrate how TIMVAR can be used in the model-building process to investigate the stability of models over time. The first example refers to data from the Post Office on the growth in the number of local telephone calls, the second uses data from the International Monetary Fund on the demand for money and the third deals with a model for forecasting manpower requirements using data provided by the Civil Service Department. Section 4 outlines the structure of the TIMVAR program and indicates the options available.

2. THE TECHNIQUES PROPOSED


2.1. The Regression Model under Study
The basic regression model we consider is
$$y_t = x_t' \beta_t + u_t, \qquad t = 1, \ldots, T, \qquad (1)$$
where at time $t$, $y_t$ is the observation on the dependent variable and $x_t$ is the column vector of observations on $k$ regressors. The first regressor, $x_{1t}$, will be taken to equal unity for all values of $t$ if the model contains a constant. The other regressors are assumed to be non-stochastic so auto-regressive models are excluded from consideration. The column vector of parameters, $\beta_t$, is written with the subscript $t$ to indicate that it may vary with time. We assume that the error terms, $u_t$, are independent and normally distributed with means zero and variances $\sigma_t^2$, $t = 1, \ldots, T$. The hypothesis of constancy over time, which will be denoted by $H_0$, is
$$\beta_1 = \beta_2 = \cdots = \beta_T = \beta,$$
$$\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_T^2 = \sigma^2.$$
We shall be more concerned with detecting differences among the $\beta$'s than among the $\sigma$'s though we do give a procedure which permits the investigation of variance changes. We have not considered the effects of serial correlation in the $u$'s on the performance of the tests proposed.
It is natural to look at residuals to investigate departures from model specification, and a variety of procedures for doing this have been proposed in the literature (see, for example, Anscombe, 1961; Anscombe and Tukey, 1963). However, experience has shown that in the present situation the plot of the ordinary least-squares residuals, or the plot of their squares, against time is not a very sensitive indicator of small or gradual changes in the $\beta$'s. In this respect the problem resembles that of detecting changes in the mean in industrial quality control for which the cumulative sum or cusum technique, introduced by Page (1954) and discussed further by Barnard (1959) and by Woodward and Goldsmith (1964), has been found to be a more effective tool for detecting small changes than the ordinary control chart in some circumstances.
This suggests that instead of plotting out the individual least-squares residuals $z_t$ the cusums $Z_r = \sum_{t=1}^{r} z_t / \hat{\sigma}$, $r = 1, \ldots, T$, should be plotted, where we have divided by the estimated standard deviation $\hat{\sigma}$ to eliminate the irrelevant scale factor. The difficulty about this suggestion is that there seems no way of assessing the significance of the departure of the observed graph of $Z_r$ against $r$ from the mean-value line $E(Z_r) = 0$. The intractability of the problem arises from the fact that in general the covariance function $E(Z_r Z_s)$ does not reduce to a form that is manageable by standard Gaussian process techniques (cf. Mehr and McFadden, 1965). For instance, for the simple case of regression on a linear time trend with zero intercept, the covariance function is asymptotically $r - 3r^2 s^2 / (4T^3)$ $(r < s)$, which is an unmanageable form.
An alternative is to consider the standardized cusum of squares, $\hat{\sigma}^{-2} \sum_{t=1}^{r} z_t^2$. Although more tractable, this is still difficult to deal with. Instead of considering it we prefer to make the transformation to recursive residuals given in the following section which enables us to treat the problem in terms of standardized cusums and cusums of squares of independent $N(0, \sigma^2)$ variables.
2.2. The Recursive Residuals and their Properties
Assuming $H_0$ to be true, let $b_r$ be the least-squares estimate of $\beta$ based on the first $r$ observations, i.e. $b_r = (X_r' X_r)^{-1} X_r' Y_r$ where the matrix $X_r' X_r$ is assumed to be non-singular, and let
$$w_r = \frac{y_r - x_r' b_{r-1}}{\sqrt{1 + x_r' (X_{r-1}' X_{r-1})^{-1} x_r}}, \qquad r = k+1, \ldots, T, \qquad (2)$$
where $X_{r-1}' = [x_1, \ldots, x_{r-1}]$ and $Y_r' = [y_1, \ldots, y_r]$.
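Definition (2) can be illustrated by a minimal Python sketch that refits the regression from scratch at each step. This is only an illustration under stated assumptions, not the TIMVAR implementation: the function name `recursive_residuals`, the use of NumPy and the simulated data are all hypothetical choices made here. The final lines compare the sum of squared recursive residuals with the full-sample residual sum of squares, which should agree when the first $k$ observations are fitted exactly.

```python
import numpy as np

def recursive_residuals(X, y):
    """Recursive residuals w_{k+1}, ..., w_T of equation (2), by direct refitting.

    X is a T x k regressor matrix whose rows are x_t'; y has length T.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T, k = X.shape
    w = np.empty(T - k)
    for r in range(k, T):                      # 0-based row r is observation r + 1
        X_prev, y_prev = X[:r], y[:r]
        XtX_inv = np.linalg.inv(X_prev.T @ X_prev)
        b_prev = XtX_inv @ X_prev.T @ y_prev   # estimate from the preceding observations
        x_r = X[r]
        w[r - k] = (y[r] - x_r @ b_prev) / np.sqrt(1.0 + x_r @ XtX_inv @ x_r)
    return w

# Hypothetical data: constant plus linear trend, T = 12, k = 2.
rng = np.random.default_rng(0)
T = 12
X = np.column_stack([np.ones(T), np.arange(1, T + 1)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=T)
w = recursive_residuals(X, y)

# Full-sample residual sum of squares, for comparison with sum(w**2).
b_full = np.linalg.lstsq(X, y, rcond=None)[0]
rss_full = float(np.sum((y - X @ b_full) ** 2))
```

Refitting at every step costs a matrix inversion per observation; it is shown here only because it mirrors the definition line by line.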


Lemma 1. Under $H_0$, $w_{k+1}, \ldots, w_T$ are independent, $N(0, \sigma^2)$.
Proof. The unbiasedness of $w_r$ is obvious and the assertion $V(w_r) = \sigma^2$ follows immediately from the independence of $y_r$ and $b_{r-1}$. Also,
$$w_r = \frac{u_r - x_r' (X_{r-1}' X_{r-1})^{-1} \sum_{j=1}^{r-1} x_j u_j}{\sqrt{1 + x_r' (X_{r-1}' X_{r-1})^{-1} x_r}}.$$
Since each $w_r$ is a linear combination of the normal variates $u_j$, the $w_j$'s are jointly normally distributed. Now
$$E\Big[\Big\{u_r - x_r' (X_{r-1}' X_{r-1})^{-1} \sum_{j=1}^{r-1} x_j u_j\Big\} \Big\{u_s - x_s' (X_{s-1}' X_{s-1})^{-1} \sum_{j=1}^{s-1} x_j u_j\Big\}\Big]$$
$$= \sigma^2 \big[0 - 0 - x_s' (X_{s-1}' X_{s-1})^{-1} x_r + x_r' (X_{r-1}' X_{r-1})^{-1} X_{r-1}' X_{r-1} (X_{s-1}' X_{s-1})^{-1} x_s\big] = 0 \qquad (r < s).$$
It follows that $w_{k+1}, \ldots, w_T$ are uncorrelated and are therefore independent in view of their joint normality. The transformation from the $u_r$'s to the $w_r$'s is a generalized form of the Helmert transformation (Kendall and Stuart, 1969, p. 250).
Let $S_r$ be the residual sum of squares after fitting the model to the first $r$ observations assuming $H_0$ true, i.e. $S_r = (Y_r - X_r b_r)'(Y_r - X_r b_r)$.
Lemma 2.
$$(X_r' X_r)^{-1} = (X_{r-1}' X_{r-1})^{-1} - \frac{(X_{r-1}' X_{r-1})^{-1} x_r x_r' (X_{r-1}' X_{r-1})^{-1}}{1 + x_r' (X_{r-1}' X_{r-1})^{-1} x_r}, \qquad (3)$$
$$b_r = b_{r-1} + (X_r' X_r)^{-1} x_r (y_r - x_r' b_{r-1}), \qquad (4)$$
$$S_r = S_{r-1} + w_r^2, \qquad r = k+1, \ldots, T. \qquad (5)$$
The relation (3) was given by Plackett (1950) and Bartlett (1951). It is used in the program to avoid having to invert the matrix $(X_r' X_r)$ directly at each stage of the calculations. It is proved by multiplying the left-hand side by $X_r' X_r$ and the right-hand side by $X_{r-1}' X_{r-1} + x_r x_r' = X_r' X_r$.
Proof of (4). Since $b_r$ is the least-squares estimate it satisfies
$$X_r' X_r b_r = X_r' Y_r = X_{r-1}' Y_{r-1} + x_r y_r = X_{r-1}' X_{r-1} b_{r-1} + x_r y_r = X_r' X_r b_{r-1} + x_r (y_r - x_r' b_{r-1}).$$
Proof of (5).
$$S_r = (Y_r - X_r b_r)'(Y_r - X_r b_r)$$
$$= (Y_r - X_r b_{r-1})'(Y_r - X_r b_{r-1}) - (b_r - b_{r-1})' X_r' X_r (b_r - b_{r-1})$$
$$= S_{r-1} + (y_r - x_r' b_{r-1})^2 - x_r' (X_r' X_r)^{-1} x_r (y_r - x_r' b_{r-1})^2,$$
which gives (5) on substituting for $(X_r' X_r)^{-1}$ from (3).
An alternative proof of (3), (4) and (5) may be derived from the results of Hedayat and Robson (1970) since their quantities $f_r$ are multiples of our quantities $w_r$. Note that $w_r$ is the standardized prediction error of $y_r$ when predicted from $y_1, \ldots, y_{r-1}$.
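The updating relations (3), (4) and (5) can be sketched in Python as follows. This is a minimal illustration assuming NumPy, not the TIMVAR code; the function name `recursive_update` and the simulated data are hypothetical. A single matrix inversion is performed for the first $k$ observations, after which each new observation is absorbed by rank-one updates, and the final coefficient vector and residual sum of squares are compared with an ordinary full-sample fit.

```python
import numpy as np

def recursive_update(X, y):
    """Run recursions (3)-(5); return (w, b_T, S_T)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T, k = X.shape
    # Start from the first k observations, which are fitted exactly (S_k = 0).
    P = np.linalg.inv(X[:k].T @ X[:k])       # (X_k' X_k)^{-1}
    b = P @ X[:k].T @ y[:k]                  # b_k
    S = 0.0
    w = np.empty(T - k)
    for r in range(k, T):                    # 0-based row r is observation r + 1
        x = X[r]
        Px = P @ x
        denom = 1.0 + x @ Px                 # 1 + x_r'(X_{r-1}'X_{r-1})^{-1} x_r
        err = y[r] - x @ b                   # y_r - x_r' b_{r-1}
        w[r - k] = err / np.sqrt(denom)      # equation (2)
        P = P - np.outer(Px, Px) / denom     # equation (3)
        b = b + (P @ x) * err                # equation (4), using the updated inverse
        S = S + w[r - k] ** 2                # equation (5)
    return w, b, S

# Hypothetical data: constant plus linear trend with noise.
rng = np.random.default_rng(1)
T = 15
X = np.column_stack([np.ones(T), np.arange(1, T + 1)])
y = X @ np.array([2.0, -1.0]) + rng.normal(size=T)
w, b_T, S_T = recursive_update(X, y)

# Reference values from a single full-sample least-squares fit.
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
rss_ols = float(np.sum((y - X @ b_ols) ** 2))
```

Agreement of `b_T` with `b_ols` and of `S_T` with `rss_ols` is a convenient sanity check on any implementation of the recursions.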
A situation arising frequently in practice is one where the regression model contains a constant and one of the regressor variables is itself constant for the first $r_1$ observations, where $r_1 > k$. Even though, in this case, the recursive residuals cannot be calculated from direct application of (2) above because of multicollinearity, it is possible to derive them in the following manner. The basic idea is to drop the initially constant regressor (which must be supplied to the program last) at the beginning of the recursions, reducing the number of regressors to $k-1$. Then recursive residuals $w_k^0, \ldots, w_{r_1}^0$ are derived from estimates $b_{k-1}^0, \ldots, b_{r_1-1}^0$ of the regressor vector $\beta^0$, the 0 denoting the fact that only $k-1$ regressors have been used. (The first component of each of these estimates will of course be an estimate of $(\beta_1 + \beta_k d)$ where $d$ is the initial value of the last regressor.) When the last regressor has changed it is brought into the regression and recursive residuals are calculated from then on by formula (2). If, for $r = k+1, \ldots, r_1$, $w_r$ is defined to be $w_{r-1}^0$ then we again have a set of $T-k$ values $w_{k+1}, \ldots, w_T$, which may be shown to be independent $N(0, \sigma^2)$ as before. When the $k$th regressor is brought into the regression the extra degree of freedom absorbed means that there is no increase in the residual sum of squares, i.e. $S_{r_1+1} = S_{r_1}^0$; moreover, apart from the constant term, the first $k-1$ components of $b_{r_1+1}$ are equal to the corresponding components of $b_{r_1}^0$. Having made the transition from $k-1$ to $k$ regressors, the recursion proceeds as in the standard case.
2.3. The Cusum Test
If $\beta_t$ is constant up to time $t = t_0$ and differs from this constant value from then on, the $w_r$'s will have zero means for $r$ up to $t_0$ but in general will have non-zero means subsequently. This suggests examination of plots intended to reveal departures of the means of the $w_r$'s from zero as one travels along the series through time.
The first plot we consider is the plot of the cusum quantity
$$W_r = \frac{1}{\hat{\sigma}} \sum_{j=k+1}^{r} w_j$$
against $r$ for $r = k+1, \ldots, T$, where $\hat{\sigma}$ denotes the estimated standard deviation determined by $\hat{\sigma}^2 = S_T/(T-k)$. We require a method of testing the significance of the departure of the sample path of $W_r$ from its mean value line $E(W_r) = 0$. A suitable procedure is to find a pair of lines lying symmetrically above and below the line $W_r = 0$ such that the probability of crossing one or both lines is $\alpha$, the required significance level.
From the properties of the $w_r$'s, under $H_0$ the sequence $W_{k+1}, \ldots, W_T$ is a sequence of approximately normal variables such that
$$E(W_r) = 0, \qquad V(W_r) = r - k \qquad \text{and} \qquad C(W_r, W_s) = \min(r, s) - k,$$
to a good approximation. To derive the test, $W_r$ is approximated by the continuous Gaussian process $\{Z_t, k \le t \le T\}$ with these mean and covariance functions. This is in fact the Brownian motion process starting from zero at time $t = k$. The form of straight line to choose was decided in two stages. The standard deviation of $Z_t$ is $\sqrt{(t-k)}$. Consequently, if we wished to find a curve such that under $H_0$ the probability that the sample path lies above the curve at any point between $t = k$ and $t = T$ is constant, we should choose curves of the form $\pm \lambda \sqrt{(t-k)}$ where $\lambda$ is constant. However, since we wish to limit ourselves to straight lines, the crossing probability cannot be constant for all $t$ and the procedure adopted is to choose the family of lines tangent to the curves $\pm \lambda \sqrt{(t-k)}$ at the points halfway between $t = k$ and $t = T$. This leads to the family of pairs of straight lines through the points $\{k, \pm a \sqrt{(T-k)}\}$, $\{T, \pm 3a \sqrt{(T-k)}\}$, where $a$ is the parameter. For any given line in this family, the probability that the point $(r, W_r)$ lies outside the line is a maximum for $r$ halfway between $r = k$ and $r = T$. We want to find a member of this family such that the probability that a sample path $Z_t$ crosses it is $\tfrac{1}{2}\alpha$. Known results in Brownian motion theory give for the probability that a sample path $Z_t$ crosses the line $y = d + c(t-k)$ for some $t$ in $(k, T)$ the value
$$Q\left(\frac{d + c(T-k)}{\sqrt{(T-k)}}\right) + \exp(-2dc)\, Q\left(\frac{d - c(T-k)}{\sqrt{(T-k)}}\right),$$
where
$$Q(z) = \frac{1}{\sqrt{(2\pi)}} \int_z^{\infty} \exp(-\tfrac{1}{2} u^2)\, du$$
(see, for example, Durbin, 1971, Lemma 3). Substituting $d = a \sqrt{(T-k)}$ and $c = 2a/\sqrt{(T-k)}$ we obtain the equation
$$Q(3a) + \exp(-4a^2)\,(1 - Q(a)) = \tfrac{1}{2}\alpha$$
to be solved for $a$.
It has been assumed that the probability that $W_r$ crosses both lines is negligible, which will be justifiable for values of $\alpha$ normally used for significance testing, say 0.1 or less. Useful pairs of values of $a$ and $\alpha$ are
$\alpha = 0.01$, $a = 1.143$,
$\alpha = 0.05$, $a = 0.948$,
$\alpha = 0.10$, $a = 0.850$.
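The equation for $a$ has no closed-form solution, but it can be solved numerically. The following Python sketch uses only the standard library; the bisection bracket and tolerance are assumptions chosen here for illustration, and the function names are hypothetical. It relies on the left-hand side decreasing in $a$ over the bracket, and it reproduces the tabulated values to within rounding.

```python
import math

def Q(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def crossing_probability(a):
    """Left-hand side of the equation: Q(3a) + exp(-4 a^2) (1 - Q(a))."""
    return Q(3.0 * a) + math.exp(-4.0 * a * a) * (1.0 - Q(a))

def solve_a(alpha, lo=0.1, hi=3.0, tol=1e-10):
    """Bisection for the a at which crossing_probability(a) equals alpha / 2.

    The bracket [lo, hi] is an assumption suited to alpha of about 0.1 or less.
    """
    target = 0.5 * alpha
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if crossing_probability(mid) > target:
            lo = mid        # probability still too large: move to a larger a
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Boundary parameters for the three significance levels quoted in the text.
a_values = {alpha: solve_a(alpha) for alpha in (0.01, 0.05, 0.10)}
```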
From the standpoint of data analysis, the function of these lines is to provide a yardstick against which to assess the observed behaviour of the sample path, though of course they can be used to provide a formal test of significance by rejecting if the sample path travels outside the region between the lines.
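Given the recursive residuals and a boundary parameter $a$, the cusum path and the pair of lines can be computed as in the sketch below. It assumes NumPy and takes the residuals $w_{k+1}, \ldots, w_T$ as already available; the function name `cusum_test` and the two artificial residual series are hypothetical illustrations, not data from the paper.

```python
import numpy as np

def cusum_test(w, k, a):
    """Cusum path W_r and upper boundary for r = k+1, ..., T.

    w holds the recursive residuals w_{k+1}, ..., w_T; a is the boundary
    parameter (for example 0.948 for a 5 per cent test).
    """
    w = np.asarray(w, dtype=float)
    n = w.size                                   # n = T - k
    T = k + n
    sigma_hat = np.sqrt(np.sum(w ** 2) / n)      # sigma_hat^2 = S_T / (T - k)
    W = np.cumsum(w) / sigma_hat
    r = np.arange(k + 1, T + 1)
    # Line through (k, a*sqrt(T-k)) and (T, 3a*sqrt(T-k)); the lower line is its mirror image.
    upper = a * np.sqrt(n) + 2.0 * a * (r - k) / np.sqrt(n)
    crossed = bool(np.any(np.abs(W) > upper))
    return r, W, upper, crossed

# Artificial residuals: an alternating series (no drift) and a constant series (steady drift).
w_alternating = np.array([1.0, -1.0] * 10)
w_drifting = np.ones(20)
_, W_alt, upper_alt, crossed_alt = cusum_test(w_alternating, k=2, a=0.948)
_, W_drift, upper_drift, crossed_drift = cusum_test(w_drifting, k=2, a=0.948)
```

In keeping with the yardstick interpretation above, the returned path and boundary are intended to be plotted; the boolean `crossed` is only a summary.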
2.4. The Cusum of Squares Test
This test uses the squared recursive residuals, $w_r^2$, and is based on the plot of the quantities
$$s_r = \left(\sum_{j=k+1}^{r} w_j^2\right) \Big/ \left(\sum_{j=k+1}^{T} w_j^2\right) = S_r / S_T, \qquad r = k+1, \ldots, T.$$
The test provides a useful complement to the cusum test, particularly when the departure from constancy of the $\beta_t$'s is haphazard rather than systematic. On $H_0$, $s_r$ may be shown to have a beta distribution with mean $(r-k)/(T-k)$. This suggests drawing a pair of lines $s_r = \pm c_0 + (r-k)/(T-k)$ on the diagram parallel to the mean-value line such that the probability that the sample path crosses one or both lines is $\alpha$, the required significance level.
To find the significance values, $c_0$, it is convenient first to consider the case when $T-k$ is even. Then the joint distribution of the $\tfrac{1}{2}(T-k)-1$ statistics $s_{k+2}, s_{k+4}, \ldots, s_{T-2}$ is the same as that of an ordered sample of $\tfrac{1}{2}(T-k)-1$ independent observations from the uniform (0,1) distribution. This may be shown by writing
$$n = \tfrac{1}{2}(T-k) - 1 \quad \text{and} \quad z_j = (w_{k+2j}^2 + w_{k+2j-1}^2)/2\sigma^2, \qquad j = 1, \ldots, n+1.$$
Then the $z_j$'s are independent, exponentially distributed random variables with mean one. If $Z$ is the sum of the $z_j$'s we have
$$s_{k+2j} = (z_1 + \cdots + z_j)/Z, \qquad j = 1, \ldots, n.$$
The required result follows by transforming the variables $z_1, \ldots, z_{n+1}$ to give the joint distribution of $s_{k+2}, \ldots, s_{T-2}, Z$ and then integrating out $Z$.
The distribution of an ordered sample of independent observations from the uniform (0,1) distribution plays an important part in the theory of non-parametric statistics, and the distribution of each of the statistics $c^+$ and $c^-$, defined by
$$c^+ = \max_{j=1,\ldots,m-1} (s_{k+2j} - j/m), \qquad c^- = \max_{j=1,\ldots,m-1} (j/m - s_{k+2j}),$$
where $m = \tfrac{1}{2}(T-k)$, can be recognized as being equivalent to that of Pyke's (1959) modified Kolmogorov-Smirnov statistic, $C_n^+$, with $n = m-1$. The statistics $c^+$ and $c^-$ are the maximum positive and negative deviations respectively of the set of statistics $(s_{k+2}, \ldots, s_{T-2})$ from their mean-value line.
A table of significance values of the quantity $C_n^+$ for $n = m-1$ is given by Durbin (1969) (Table 1, p. 4). The procedure suggested for the cusum of squares test is to take these values as approximations to the significance values of
$$c^+ = \max_{j=1,\ldots,T-k-1} \left(s_{k+j} - \frac{j}{T-k}\right) \quad \text{and} \quad c^- = \max_{j=1,\ldots,T-k-1} \left(\frac{j}{T-k} - s_{k+j}\right),$$
which are the maximum positive and negative deviations of the whole set of $s_r$'s from the mean-value line. For the value of the significance level, $\alpha$, normally chosen, say 0.1 or less, the probability of crossing both lines is negligible, so that given a significance level $\alpha$, to find the value of $c_0$ we may take the value obtained by entering the table at $n = \tfrac{1}{2}(T-k)-1$ and $\tfrac{1}{2}\alpha$. If $T-k$ is odd the procedure suggested is to interpolate linearly between the values for $n = \tfrac{1}{2}(T-k)-\tfrac{3}{2}$ and $n = \tfrac{1}{2}(T-k)-\tfrac{1}{2}$. Monte Carlo runs made by M. C. Hutchison have shown that this test gives significant results more often than the exact test would give, but that the discrepancy is very small when $(T-k)$ exceeds 30.
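The quantities plotted in the cusum of squares test, together with the maximum deviations of the whole set of $s_r$'s from the mean-value line, can be computed as in the following sketch. It assumes NumPy and recursive residuals that are already available; the function name `cusum_of_squares` and the toy inputs are hypothetical. The significance value $c_0$ is not computed here: it has to be read from the table in Durbin (1969) as described above.

```python
import numpy as np

def cusum_of_squares(w, k):
    """Return r, s_r, the mean-value line, and the deviations c_plus and c_minus.

    w holds the recursive residuals w_{k+1}, ..., w_T.
    """
    w = np.asarray(w, dtype=float)
    n = w.size                                   # n = T - k
    r = np.arange(k + 1, k + n + 1)
    s = np.cumsum(w ** 2) / np.sum(w ** 2)       # s_{k+1}, ..., s_T (s_T = 1)
    mean_line = (r - k) / n                      # (r - k) / (T - k)
    dev = s - mean_line
    # Maxima over j = 1, ..., T-k-1; the last point always lies on the line.
    c_plus = float(np.max(dev[:-1]))
    c_minus = float(np.max(-dev[:-1]))
    return r, s, mean_line, c_plus, c_minus

# Toy inputs: equal squared residuals lie exactly on the mean-value line,
# while a single large early residual pushes the path above it.
_, s_flat, _, c_plus_flat, c_minus_flat = cusum_of_squares([1.0, 1.0, 1.0, 1.0], k=2)
_, s_early, _, c_plus_early, c_minus_early = cusum_of_squares([2.0, 0.0, 0.0, 0.0], k=2)
```

A path would be judged against the lines by comparing `c_plus` and `c_minus` with the tabulated $c_0$ for the chosen significance level.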
It may sometimes be appropriate to consider a one-sided test. For example, if it is assumed that $\beta_t = \beta^*$ for $t \le r$ and $\beta_t = \beta^{**} \ne \beta^*$ for $t > r$ while $\sigma_t^2 = \sigma^2$ for all $t$, then $E(w_t^2) = \sigma^2$ for $t \le r$ and $E(w_t^2) > \sigma^2$ for $t > r$. One would therefore expect the departure from the null hypothesis to be indicated by a tendency for the sample path $s_r$ to lie below the mean value line, and would therefore use a one-sided test. For this purpose, one would take the significance value of $c_0$ corresponding to significance level $\alpha$, not $\tfrac{1}{2}\alpha$. However, whether the two- or one-sided situations are envisaged we ourselves prefer to regard the lines constructed in this way as yardsticks against which to assess the observed sample path rather than providing formal tests of significance.
If the two plots described above do indicate departures from constancy it may be useful to examine plots of the components of $b_r$ against time to try to identify the source. Further, to help locate the point of change it is often informative to look at the set of plots which are obtained by running the analysis backwards through time as well as forwards.
2.5. Moving Regressions
Another useful way of investigating the time-variation of $\beta_t$ is to fit the regression on a short segment of $n$ successive observations and to move this segment along the series. The graphs of the resulting coefficients against time provide further evidence of departures from constancy. In addition, the estimated residual variance may be computed and plotted to investigate the constancy of $\sigma^2$.
The quantities required for each new segment are computed by first adding a new observation to the segment just dealt with using formulae (3)-(5) and then allowing for the effect of dropping an observation from the beginning by means of the following analogues of (3)-(5):
$$(\tilde{X}_n' \tilde{X}_n)^{-1} = (X_{n+1}' X_{n+1})^{-1} + \frac{(X_{n+1}' X_{n+1})^{-1} x_1 x_1' (X_{n+1}' X_{n+1})^{-1}}{1 - x_1' (X_{n+1}' X_{n+1})^{-1} x_1},$$
$$\tilde{b}_n = b_{n+1} - (\tilde{X}_n' \tilde{X}_n)^{-1} x_1 (y_1 - x_1' b_{n+1}),$$
$$\tilde{S}_n = S_{n+1} - (y_1 - x_1' \tilde{b}_n)^2 / (1 + x_1' (\tilde{X}_n' \tilde{X}_n)^{-1} x_1),$$
where $\tilde{X}_n$, $\tilde{b}_n$, $\tilde{S}_n$ are the values of the regressor matrix, the coefficient vector and the residual sum of squares based on observations from $t = 2$ to $n+1$. For simplicity of notation we have given the formulae for the first update only but the further formulae are, of course, similar. Proofs are similar to those of (3)-(5).
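The step that drops the first observation can be checked numerically. The sketch below assumes NumPy; the helper names `fit` and `drop_first_observation` and the simulated data are hypothetical. It applies the three analogues of (3)-(5) to a regression fitted on observations 1 to $n+1$ and compares the outcome with a direct fit on observations 2 to $n+1$.

```python
import numpy as np

def fit(X, y):
    """Direct least-squares fit: returns (X'X)^{-1}, coefficients and residual sum of squares."""
    P = np.linalg.inv(X.T @ X)
    b = P @ X.T @ y
    S = float(np.sum((y - X @ b) ** 2))
    return P, b, S

def drop_first_observation(P, b, S, x1, y1):
    """Remove observation (x1, y1) from a fit summarised by P = (X'X)^{-1}, b and S."""
    Px = P @ x1
    P_new = P + np.outer(Px, Px) / (1.0 - x1 @ Px)                 # analogue of (3)
    b_new = b - (P_new @ x1) * (y1 - x1 @ b)                       # analogue of (4)
    S_new = S - (y1 - x1 @ b_new) ** 2 / (1.0 + x1 @ P_new @ x1)   # analogue of (5)
    return P_new, b_new, float(S_new)

# Hypothetical data: n + 1 = 8 observations on a constant and a linear trend.
rng = np.random.default_rng(5)
X = np.column_stack([np.ones(8), np.arange(1.0, 9.0)])
y = X @ np.array([0.5, 1.5]) + rng.normal(size=8)

P_all, b_all, S_all = fit(X, y)
P_new, b_new, S_new = drop_first_observation(P_all, b_all, S_all, X[0], y[0])
P_direct, b_direct, S_direct = fit(X[1:], y[1:])    # observations 2 to n + 1
```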
A significance test for constancy based on this approach, called by us the homogeneity test, is derived from the results of regressions based on non-overlapping time segments, using the analysis of variance. The time segments used by the program, for a moving regression of length $n$, are $(1, n)$, $((n+1), 2n)$, ..., $((p-2)n+1, (p-1)n)$, $((p-1)n+1, T)$, where $p$ is the integral part of $T/n$, and the variance ratio considered, called by us the homogeneity statistic, is
$$\frac{(T-kp)}{(kp-k)} \cdot \frac{S(1,T) - \{S(1,n) + S(n+1,2n) + \cdots + S(pn-2n+1, pn-n) + S(pn-n+1, T)\}}{\{S(1,n) + S(n+1,2n) + \cdots + S(pn-n+1, T)\}}$$
where $S(r,s)$ is the residual sum of squares from the regression calculated from observations from $t = r$ to $s$ inclusive. This is equivalent to the usual "between groups over within groups" ratio of mean squares and under $H_0$ is distributed as $F(kp-k, T-kp)$.
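A direct computation of the homogeneity statistic might look like the sketch below. It assumes NumPy, at least two segments, and segments long enough for each regression to be estimable; the function names `rss` and `homogeneity_statistic` and the simulated data are hypothetical. As in the text, the last segment absorbs any observations left over when $T$ is not a multiple of $n$.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return float(e @ e)

def homogeneity_statistic(X, y, n):
    """Variance ratio and its degrees of freedom (kp - k, T - kp)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T, k = X.shape
    p = T // n                                    # integral part of T / n
    starts = [i * n for i in range(p)]            # 0-based segment starts
    ends = [(i + 1) * n for i in range(p - 1)] + [T]
    within = sum(rss(X[s:e], y[s:e]) for s, e in zip(starts, ends))
    pooled = rss(X, y)                            # S(1, T)
    ratio = ((T - k * p) / (k * p - k)) * (pooled - within) / within
    return ratio, (k * p - k, T - k * p)

# Hypothetical data: T = 25 observations, segments of length n = 10, so p = 2.
rng = np.random.default_rng(2)
T = 25
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=T)
F_ratio, dof = homogeneity_statistic(X, y, n=10)
```

The ratio would then be referred to the $F$ distribution with the returned degrees of freedom.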
The TIMVAR program also calculates the quantity $M_1$, the mean square prediction error one period ahead. This is defined as
$$M_1 = \sum_{m=n+1}^{T} \{y_m - x_m' b(m-n, m-1)\}^2 / (T-n),$$
where the vector $b(m-n, m-1)$ is the estimate of the vector of regression coefficients from the time segment $(m-n, m-1)$. When calculated for moving regressions of several different lengths its minimum value gives a useful criterion for the length of record to use when predicting one period ahead. Also calculated are $M_2$, defined by
$$M_2 = \sum_{m=1}^{T-n} \{y_m - x_m' b(m+1, m+n)\}^2 / (T-n),$$
which is equivalent to $M_1$ calculated from a moving regression passed in the reverse direction, and $M$, the sum of $M_1$ and $M_2$. Of these $M_1$ will normally be the most useful. Finally, $M_3$ is calculated. This is $\sum_{m=n_1+1}^{T} (y_m - x_m' b(m-n, m-1))^2 / (T-n)$ where $n_1$ is the maximum length of regression considered. This fulfils the same function as $M_1$, except that the different regression lengths are now compared over the same part of the record, namely that from $n_1$ to $T$.
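The one-period-ahead criterion $M_1$ can be computed by refitting the regression on each window, as in this sketch. It assumes NumPy; the function name `one_step_mse`, the candidate lengths and the simulated data are hypothetical, and direct refitting is used for clarity rather than the updating formulae of this section.

```python
import numpy as np

def one_step_mse(X, y, n):
    """M1 for a moving regression of length n: mean squared one-step prediction error."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T = len(y)
    total = 0.0
    for m in range(n, T):                        # 0-based row m is observation m + 1
        # Fit on the n observations immediately preceding row m, then predict row m.
        b = np.linalg.lstsq(X[m - n:m], y[m - n:m], rcond=None)[0]
        total += (y[m] - X[m] @ b) ** 2
    return total / (T - n)

# Hypothetical data: an exact linear relation, and the same relation with noise added.
T = 30
t = np.arange(1, T + 1, dtype=float)
X = np.column_stack([np.ones(T), np.sin(t)])
y_exact = X @ np.array([1.0, 2.0])
rng = np.random.default_rng(4)
y_noisy = y_exact + rng.normal(size=T)

m1_exact = one_step_mse(X, y_exact, 5)           # essentially zero for an exact relation
m1_by_length = {n: one_step_mse(X, y_noisy, n) for n in (5, 10, 15)}
best_n = min(m1_by_length, key=m1_by_length.get)
```

Choosing the length with the smallest value, as `best_n` does, follows the criterion suggested in the text for selecting the length of record.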

2.6. Time-trending Regressions
This technique introduces time variation into the regression model explicitly by allowing the regression coefficients to become polynomials in time. To determine whether this extended model will produce a significantly better fit than one based on constancy, and further to determine what degree of polynomial should be employed, the program calculates the sum of squares removed by each of the following nested hypotheses:
$$(0): \quad y_t = x_t' \beta^{(0)} + \varepsilon_t,$$
$$(1): \quad y_t = x_t' (\beta^{(0)} + \beta^{(1)} t) + \varepsilon_t,$$
$$\vdots$$
$$(e): \quad y_t = x_t' (\beta^{(0)} + \beta^{(1)} t + \cdots + \beta^{(e)} t^e) + \varepsilon_t.$$
The $\beta$'s are all vectors of length $k$, and $e$ is a positive integer specified by the user. Comparison of the mean-square increase in the explained variation with an estimate of the error variance gives a test for determining whether each model gives a significantly better fit than the one before. This estimate of the error variance may be derived from either the residual sum of squares from the model in the next higher degree in $t$ or from the residual sum of squares of the full model $(e)$ and so two F-ratios are calculated, one for each estimate.
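The nested time-trending fits can be sketched as follows, assuming NumPy. Several choices here are assumptions rather than details taken from the paper: time is rescaled to $t/T$ purely for numerical conditioning (this leaves the fitted values and residual sums of squares unchanged), "the model in the next higher degree" is read as the higher-degree model of each pair being compared, and the function names and simulated data are hypothetical. The regressors in the example do not themselves contain a time trend, which would make some product columns collinear.

```python
import numpy as np

def trending_design(X, degree):
    """Columns x_t, x_t * t, ..., x_t * t^degree stacked side by side."""
    T = X.shape[0]
    t = np.arange(1, T + 1, dtype=float) / T     # rescaled time (an assumption, for conditioning)
    return np.hstack([X * (t ** d)[:, None] for d in range(degree + 1)])

def rss(Z, y):
    """Residual sum of squares from a least-squares fit of y on Z."""
    b = np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.sum((y - Z @ b) ** 2))

def nested_f_ratios(X, y, e):
    """Residual sums of squares for degrees 0..e and the two F-ratios for each step."""
    T, k = X.shape
    S = [rss(trending_design(X, d), y) for d in range(e + 1)]
    var_full = S[e] / (T - k * (e + 1))          # error variance from the full model (e)
    out = []
    for d in range(1, e + 1):
        gain = (S[d - 1] - S[d]) / k             # mean-square increase in explained variation
        var_next = S[d] / (T - k * (d + 1))      # error variance from the degree-d model
        out.append((d, gain / var_next, gain / var_full))
    return S, out

# Hypothetical data: the slope coefficient drifts linearly with time.
rng = np.random.default_rng(6)
T = 30
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
time = np.arange(1, T + 1) / T
y = 1.0 + (0.5 + 2.0 * time) * x + 0.3 * rng.normal(size=T)
S, ratios = nested_f_ratios(X, y, e=2)
```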
2.7. Quandt's Log-likelihood Ratio Technique
This technique, described in two papers by Quandt (1958, 1960), is appropriate when it is believed that the regression relationship may have changed abruptly at an unknown time point $t = r$ from one constant relationship specified by $\beta^{(1)}$, $\sigma_1^2$, to another constant relationship specified by $\beta^{(2)}$, $\sigma_2^2$. For each $r$ from $r = k+1$ to $r = T-k-1$ the program computes and plots
$$\lambda_r = \log_{10} \left(\frac{\text{max likelihood of the observations given } H_0}{\text{max likelihood of the observations given } H_1}\right),$$
where $H_1$ is the hypothesis that the observations in the time segments $(1, \ldots, r)$ and $(r+1, \ldots, T)$ come from two different regressions. This is the standard likelihood ratio statistic for deciding between the two hypotheses $H_0$ and $H_1$, and it is easy to show that
$$\lambda_r = \tfrac{1}{2} r \log_{10} \hat{\sigma}_1^2 + \tfrac{1}{2}(T-r) \log_{10} \hat{\sigma}_2^2 - \tfrac{1}{2} T \log_{10} \hat{\sigma}^2,$$
where $\hat{\sigma}_1^2$, $\hat{\sigma}_2^2$ and $\hat{\sigma}^2$ are the ratios of the residual sums of squares to number of observations when the regression is fitted to the first $r$ observations, the remaining $T-r$ observations and the whole set of $T$ observations, respectively. The estimate of the point at which the switch from one relationship to another has occurred is then the value of $r$ at which $\lambda_r$ attains its minimum. Unfortunately, no test has yet been devised for $\min \lambda_r$ since its distribution on $H_0$ is unknown. However, the behaviour of the graph of $\lambda_r$ against $r$ sheds light on the stability of the regression and in particular indicates whether changes have occurred as an abrupt transition or gradually.
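The closed-form expression for $\lambda_r$ lends itself to a short computation. The sketch below assumes NumPy; the function name `quandt_log_likelihood_ratio` and the simulated data (a level shift after the 20th of 40 observations) are hypothetical and are not one of the paper's examples. Each candidate split point is evaluated by fitting the two segment regressions directly.

```python
import numpy as np

def quandt_log_likelihood_ratio(X, y):
    """lambda_r for r = k+1, ..., T-k-1, using base-10 logarithms."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T, k = X.shape

    def var_hat(Xs, ys):
        # Residual sum of squares divided by the number of observations.
        b = np.linalg.lstsq(Xs, ys, rcond=None)[0]
        return float(np.sum((ys - Xs @ b) ** 2)) / len(ys)

    s2_all = var_hat(X, y)
    r_values = np.arange(k + 1, T - k)           # r = k+1, ..., T-k-1
    lam = np.array([
        0.5 * r * np.log10(var_hat(X[:r], y[:r]))
        + 0.5 * (T - r) * np.log10(var_hat(X[r:], y[r:]))
        - 0.5 * T * np.log10(s2_all)
        for r in r_values
    ])
    return r_values, lam

# Hypothetical data: the intercept jumps from 1 to 5 after observation 20.
rng = np.random.default_rng(3)
T = 40
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
intercept = np.where(np.arange(T) < 20, 1.0, 5.0)
y = intercept + x + 0.1 * rng.normal(size=T)

r_values, lam = quandt_log_likelihood_ratio(X, y)
r_hat = int(r_values[np.argmin(lam)])            # estimated switch point
```

Plotting `lam` against `r_values` gives the kind of graph described in the text; a sharp, isolated minimum points to an abrupt change.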
3. EXAMPLES OF THE APPLICATION OF TIMVAR
In this section we present three examples which illustrate the use of TIMVAR techniques. The first and third examples reveal evidence of change while the second does not. The graphs have been chosen to illustrate different kinds of TIMVAR output but in each case the graphs shown are only a small fraction of the total output available.
Example 1. This example was made available by the Statistics and Business Research Department of the Post Office. As part of a wider study of posts and telecommunications described in Turner (1973), a regression model was developed to explain growth in the number of local telephone calls (i.e. the differences between the numbers of calls in consecutive years) in terms of a linear model involving a constant and four independent variables. These four variables, which were used in first difference forms, were a measure of economic activity, the number of residential telephones, the "real" price of local calls and the "real" price of residential telephones. (The deflator used to arrive at the "real" prices was the retail price index.) The data ran from 1950/51 to 1971/72, but it had been felt that the estimates of the number of local calls and hence of local call growth were subject to some uncertainty after 1964/65; however, the 1971/72 figure was thought to be reliable. In order to use this model as part of a larger model, reliable estimates for the growth of local calls were required for the whole period. Thus it was important that the stability of the relationship over time should be investigated.
Figs 1-4 give respectively the plots of the least-squares residuals, the cusum of least-squares residuals, the cusum of recursive residuals and the cusum of squares of recursive residuals. These plots provide increasingly strong evidence of instability after 1964/65, 1 per cent significance being attained in Fig. 4. The fact that significance is achieved by the cusum of squares plot but not by the cusum of recursive residuals suggests that instability may be due to a shift in residual variance rather than to shifts in values of regression coefficients. However, examination of the plots of coefficients and residual variance estimated from moving regressions showed that the instability was due to local changes in the regression coefficients and not to changes in variance. The model shows no sign of instability in the years up to 1964/65 and it was further found that a forecast of the 1971/72 figure from the model fitted to the data up to 1964/65 was very close to the actual 1971/72 estimate, which had been accepted as reliable. In the circumstances it was decided to ignore the suspect estimates between 1964/65 and 1971/72, to use the model fitted from the remaining data to provide the explanatory equation required and to replace the discarded data by forecasts derived from it.

FIG. 1. Example 1: Ordinary least-squares residuals (plotted against observation number).
FIG. 2. Example 1: Cusum of ordinary least-squares residuals (plotted against observation number).
FIG. 3. Example 1: Cusum of recursive residuals, forward recursion.
FIG. 4. Example 1: Cusum of squares of recursive residuals, forward recursion (plotted against observation number, with 1964/65 marked).

Example 2. The second example, using data made available by Dr M. S. Khan of the International Monetary Fund, is based on a study of the demand for money function for the United States, 1901-65 in Khan (1974). In this paper, Khan considers several possible specifications for the function and uses TIMVAR tests to investigate their stability over time. He argues that the question of stability of the function over time is of crucial importance for the effectiveness of monetary policy. The particular model considered here expresses the "narrow" real per capita stock of money $M_t$ in terms of the long-term interest rate $R_t$ and the permanent real per capita income $Y_t$ in an equation of the form
$$\Delta \log M_t = \alpha + \beta \log \Delta R_t + \gamma \log \Delta Y_t + w_t,$$
where $\Delta$ is the first difference operator and $w_t$ is an error term with the property $w_t \sim NID(0, \sigma_w^2)$. This is one of eight specifications tested in the paper using annual data from 1901 to 1965.
None of the TIMVAR results were significant at 5 per cent. Figs 5 and 6 show the cusum and cusum of squares graphs from the forward recursion. The results are clearly consistent with the hypothesis of stability over time. In his paper Khan goes on to draw conclusions from the results for this and the other model specifications.

FIG. 5. Example 2: Cusum of recursive residuals, forward recursion.
FIG. 6. Example 2: Cusum of squares of recursive residuals, forward recursion (plotted against observation number).

Example 3. This example uses data provided by the Civil Service Department
and is concerned with the staff requirement $S_t$ of an organization expressed as a
function of workloads of 9 different categories. The example is studied in detail in
Cameron and Nash (1974) and uses quarterly data from the first quarter of 1960 to
the third quarter of 1970 (43 observations). Cameron and Nash found that the 9
workload categories were highly intercorrelated so they employed factor analysis to
reduce them to three uncorrelated factors $F_1, F_2, F_3$. They then fitted the regression
model:
$$S_t = \beta_0 + \sum_{j=1}^{3} \beta_j F_{jt} + e_t,$$
where $e_t$ is a disturbance term.
Forward and backward cusum and cusum squared recursive residuals plots showed
strong evidence of instability. Fig. 7 gives the graph of Quandt's log-likelihood ratio
and this indicates clearly that an abrupt change took place just after the 27th quarter.

FIG. 7. Example 3: Quandt's log-likelihood ratio. (Horizontal axis: observation number.)

In fact two separate bodies were amalgamated during this quarter to form the
organization under study so there was an obvious administrative explanation in this
case for the results observed. However, the example serves to illustrate the way in
which the Quandt log-likelihood ratio can serve to pinpoint a change in the relation.
Fig. 8 gives the graph of estimates of $\beta_3$ calculated from segments of length six
quarters moved along the series. These graphs show the consequences of the abrupt
change in the relation just after the 27th quarter. As a result of these and other tests
Cameron and Nash decided to fit the model for forecasting purposes from the last
14 observations only.
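How a graph such as Fig. 7 locates a change point can be sketched as follows. The sketch assumes the usual form of Quandt's statistic, in which a regression is fitted separately to the first r and the last T - r observations and the maximized log-likelihoods are compared with that of a single regression; the function names and the simulated series (43 observations with a shift after the 27th) are hypothetical and are not the Civil Service Department data.

```python
import numpy as np

def ml_variance(X, y):
    """Maximum likelihood estimate of the error variance, RSS / n."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return resid @ resid / len(y)

def quandt_log_likelihood_ratio(X, y):
    """lambda_r for each admissible split point r (first r observations vs. the rest)."""
    T, k = X.shape
    s2_all = ml_variance(X, y)
    lam = {}
    for r in range(k + 1, T - k):              # both segments keep more than k points
        s2_1 = ml_variance(X[:r], y[:r])
        s2_2 = ml_variance(X[r:], y[r:])
        lam[r] = 0.5 * (r * np.log(s2_1) + (T - r) * np.log(s2_2) - T * np.log(s2_all))
    return lam

# Simulated series: the intercept jumps after observation 27.
rng = np.random.default_rng(0)
T = 43
x = rng.normal(size=T)
shift = np.where(np.arange(1, T + 1) > 27, 10.0, 0.0)
y = 1.0 + 2.0 * x + shift + rng.normal(scale=0.1, size=T)
X = np.column_stack([np.ones(T), x])

lam = quandt_log_likelihood_ratio(X, y)
r_hat = min(lam, key=lam.get)                  # the minimum marks the estimated change point
```

With an abrupt and large shift of this kind the minimum of the ratio is sharp; with gradual change the curve would be expected to be much flatter.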
4. THE TIMVAR PROGRAM
The TIMVAR program was written to calculate all the results necessary for the tests
and techniques described in the paper and to produce a considerable amount of
graphical output to aid the interpretation of the results. Any or all of the following
results are available:
(a) The standard regression. The analysis of variance over the whole time span
and the DW statistic. Tables and plots of the residuals and regression
estimates.
(b) The results of the time-trending regressions (see Section 2.6).
(c) The results of the recursive regressions, both forward and backward, giving
tables and plots of the successive regression estimates, the recursive residuals
and their cusums, and the test statistics (see Sections 2.2, 2.3, 2.4).
(d) The values of Quandt's log-likelihood ratio (see Section 2.7).
(e) The results of the moving regressions for each specified length giving tables
and plots of the successive regression estimates, mean square errors, the
quantities $M_1$, $M_2$, $M$ and $M_3$, and the statistic for the homogeneity test
(see Section 2.5).

FIG. 8. Example 3: Estimate of coefficient of third independent variable derived from
moving regression of length 6. (Horizontal axis: number of observation at start of
segment used in regression.)

The program makes full use of the formulae described in Sections 2.2 and 2.5
during the calculations of the recursive and moving regressions. Because of the large
number of successive matrix operations performed during these calculations there is
a danger that some of the matrices may become ill-conditioned. Any such tendency
is usually reduced by subtracting the means from each of the variables and this is
done automatically by the program if the model contains a constant. The value of
the constant term is then recovered by another mechanism. In the case where the
model contains a constant and one of the other regressors is constant at the beginning
or end of the record for a number of observations greater than k, this regressor, if
supplied last, is dealt with by the program during the recursive regressions in the
manner described in Section 2.2. If it is constant at the start only it can be dealt with
in a similar fashion during the moving regressions. The extension to the case where
two or more regressors are constant at the beginning or end of the record has not
been programmed.

REFERENCES
ANSCOMBE, F. J. (1961). Examination of residuals. Proc. 4th Berkeley Symp. Math. Statist. Prob.,
1, 1-36.
ANSCOMBE, F. J. and TUKEY, J. W. (1963). The examination and analysis of residuals.
Technometrics, 5, 141-160.
BARNARD, G. A. (1959). Control charts and stochastic processes. J. R. Statist. Soc. B, 21, 239-271.
BARTLETT, M. S. (1951). An inverse matrix adjustment arising in discriminant analysis. Ann.
Math. Statist., 22, 107-111.
BROWN, R. L. and DURBIN, J. (1968). Methods of investigating whether a regression relationship
is constant over time. Selected Statistical Papers, European Meeting, Mathematical
Centre Tracts No. 26, Amsterdam.
CAMERON, M. H. and NASH, J. E. (1974). On forecasting the manpower requirements of an
organization with homogeneous workloads. J. R. Statist. Soc. A, 137, 200-218.
CHERNOFF, H. and ZACKS, S. (1964). Estimating the current mean of a normal distribution which
is subjected to changes in time. Ann. Math. Statist., 35, 999-1018.
DURBIN, J. (1969). Tests for serial correlation in regression analysis based on the periodogram of
least squares residuals. Biometrika, 56, 1-15.
DURBIN, J. (1970). An alternative to the bound test for testing serial correlation in least-squares
regression. Econometrica, 38, 422-429.
DURBIN, J. (1971). Boundary-crossing probabilities for the Brownian motion and Poisson processes
and techniques for computing the power of the Kolmogorov-Smirnov test. J. Appl. Prob.,
8, 431-453.
EVANS, J. M. (1973). User's Guide to TIMVAR. Research Exercise Note 10/73 of the Central
Statistical Office.
HEDAYAT, A. and ROBSON, D. S. (1970). Independent stepwise residuals for testing
homoscedasticity. J. Amer. Statist. Ass., 65, 1573-1581.
HINKLEY, D. V. (1972). Time-ordered classification. Biometrika, 59, 509-523.
KENDALL, M. G. and STUART, A. (1969). The Advanced Theory of Statistics, Vol. I, 3rd ed.
London: Griffin.
KHAN, M. S. (1974). The stability of the demand for money function in the United States, 1901-65.
J. Polit. Econ., 82, 1205-1219.
MEHR, C. B. and McFADDEN, J. A. (1965). Certain properties of Gaussian processes and their
first passage times. J. R. Statist. Soc. B, 27, 505-522.
PAGE, E. S. (1954). Continuous inspection schemes. Biometrika, 41, 100-114.
PLACKETT, R. L. (1950). Some theorems in least squares. Biometrika, 37, 149-157.
PYKE, R. (1959). The supremum and the infimum of the Poisson process. Ann. Math. Statist.,
30, 569-576.
QUANDT, R. E. (1958). The estimation of the parameters of a linear regression system obeying
two separate regimes. J. Amer. Statist. Ass., 53, 873-880.
QUANDT, R. E. (1960). Tests of the hypothesis that a linear regression system obeys two separate
regimes. J. Amer. Statist. Ass., 55, 324-330.
THEIL, H. (1965). The analysis of disturbances in regression analysis. J. Amer. Statist. Ass.,
60, 1067-1079.
THEIL, H. (1968). A simplification of the BLUS procedure for analysing regression disturbances.
J. Amer. Statist. Ass., 63, 242-251.
TUKEY, J. W. (1962). The future of data analysis. Ann. Math. Statist., 33, 1-67.
TURNER, W. M. (1973). A critical reappraisal of the interaction of posts and telecommunications.
Report No. 29. Statistics and Business Research Department, Post Office.
WOODWARD, R. H. and GOLDSMITH, P. L. (1964). Cumulative Sum Techniques. Monograph
No. 3, ICI Series on Mathematical and Statistical Techniques for Industry, Edinburgh:
Oliver & Boyd.

DISCUSSION OF THE PAPER BY DR BROWN, PROFESSOR DURBIN AND MR EVANS

Professor D. R. Cox (Imperial College): This is an important and interesting paper;
it deals with a common practical problem, gives valuable new methods and their theory and
concludes with cogent illustrations.
My comments concern two theoretical aspects of the paper. First there is the efficiency
of the procedures in idealized situations which, notwithstanding the remarks in Section 1
of the paper, seems of some interest in understanding the applicability of the methods. I
shall deal only with the very simplest situation, in particular where the fitted model contains
just a constant term, so that the recursive residuals are defined by the standard Helmert
transformation and are
$$\frac{y_2 - y_1}{\sqrt{2}}, \quad \frac{2y_3 - y_2 - y_1}{\sqrt{6}}, \quad \ldots$$
Suppose further that $E(y_i) = \mu$ $(i = 1, \ldots, m)$, $E(y_{m+j}) = \mu + \delta$ $(j = 1, \ldots, n)$.
The simplest analogue of the procedures of the paper is to use the cumulative sum of the
last n recursive residuals as a test statistic for $\delta = 0$, the change-point being regarded as
known and the amount of data fixed, both assumptions in contrast with the situations
contemplated by the authors. Both for the cumulative sum statistic and for the efficient
test, the ratio of the expectation of the test statistic to its standard deviation can be found.
The values are respectively
$$\frac{\delta m}{\sigma \sqrt{n}} \sum_{s=1}^{n} \{(m+s-1)(m+s)\}^{-1/2}$$
and
$$\frac{\delta}{\sigma} \left( \frac{mn}{m+n} \right)^{1/2}.$$
Thus the ratio can be found, simple results emerging in the limit $m, n \to \infty$, $n/m = k$. The
relative efficiency is somewhat conventionally defined as the square of the ratio and
asymptotically this is
$$(1+k)\{\log(1+k)\}^2 k^{-2}.$$
This is close to 1 except for large k, some representative values being
k      1      2      5      10
ARE  0.961  0.905  0.770  0.632.
For small k, the asymptotic relative efficiency is $1 - k^2/12 + O(k^3)$. Very similar results
hold for small m, n. Thus except in the untypical case when the discontinuity emerges
relatively close to the start of the data, the cumulative sum method is very efficient.
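The tabulated efficiencies can be checked numerically. The following sketch evaluates the asymptotic expression and, for finite m and n, the squared ratio of the two expectation-to-standard-deviation values quoted above (the factor $\delta/\sigma$ cancels); the function names are hypothetical.

```python
import math

def are_asymptotic(k):
    """Asymptotic relative efficiency (1 + k) {log(1 + k)}^2 / k^2 with k = n/m."""
    return (1 + k) * math.log(1 + k) ** 2 / k ** 2

def are_finite(m, n):
    """Squared ratio of the cusum value to the efficient-test value for finite m, n."""
    total = sum(1.0 / math.sqrt((m + s - 1) * (m + s)) for s in range(1, n + 1))
    cusum_ratio = m * total / math.sqrt(n)          # delta/sigma omitted
    efficient_ratio = math.sqrt(m * n / (m + n))    # delta/sigma omitted
    return (cusum_ratio / efficient_ratio) ** 2

table = {k: round(are_asymptotic(k), 3) for k in (1, 2, 5, 10)}
small_sample = are_finite(20, 20)                   # compare with are_asymptotic(1)
```

Evaluating `are_finite` for modest m and n gives values very close to the limiting ones, in line with the remark that very similar results hold for small m, n.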
A similar calculation can be made for the departure $E(y_{m+j}) = \mu + j\beta$ $(j = 1, \ldots, n)$
comparing the efficient test both with the cumulative sum of the last n recursive residuals
and with their regression coefficient on time. There is no difficulty in principle in making
similar calculations for more general models.
A second theoretical point concerns the effect of serial correlation. It would, of course,
be possible to define recursive residuals relative to an assumed or estimated covariance
matrix; a simpler possibility is to keep to the definitions of the paper and to examine the
effect of serial correlation in the data. While general formulae can be written down, I have
investigated again only the case where the fitted model contains just an unknown mean.
It can then be shown that for large m
$$\mathrm{var}(w_m) = \sigma^2 \{1 + O(m^{-1})\}, \qquad \mathrm{corr}(w_m, w_{m+h}) \approx \rho_h,$$
where $\{\rho_h\}$ is the autocorrelation function of the data. This suggests that the standard
deviation of sums of w's is inflated over its value in the independent case by the factor
$$\left\{ 1 + 2 \sum_{h=1}^{\infty} \rho_h \right\}^{1/2};$$
in particular, for a first-order autoregressive process this factor is $\{(1 + \rho_1)/(1 - \rho_1)\}^{1/2}$.
Provided this holds also for more general models, it would be reasonable to inflate the
limits of the paper by a rough estimate of this factor.
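As a small numerical illustration of the suggested inflation, the sketch below compares the general factor, computed from a truncated autocorrelation sequence, with the closed form for a first-order autoregressive process. The function names and the value $\rho_1 = 0.3$ are assumptions chosen for illustration only.

```python
import math

def inflation_factor(rho):
    """sqrt(1 + 2 * sum of rho_h) for a (truncated) sequence rho_1, rho_2, ..."""
    return math.sqrt(1.0 + 2.0 * sum(rho))

def ar1_inflation_factor(rho1):
    """Closed form sqrt((1 + rho1) / (1 - rho1)) for a first-order autoregression."""
    return math.sqrt((1.0 + rho1) / (1.0 - rho1))

rho1 = 0.3                                           # hypothetical first-order autocorrelation
truncated = inflation_factor([rho1 ** h for h in range(1, 200)])
closed_form = ar1_inflation_factor(rho1)
```

For $\rho_1 = 0.3$ the factor is about 1.36, so significance lines drawn for independent errors would be widened by roughly a third under this rough adjustment.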
I propose a most cordial vote of thanks to the authors.

Mr P. R. FISK (University of Edinburgh): I should like to start by paying homage to
Dr R. L. Brown who, as Professor Durbin has said, was the first Director of the Research
and Special Studies Unit in the Central Statistical Office. The CSO was very fortunate to
have him in that position because during his period of office he demonstrated what the
unit was capable of doing. His untimely death was regretted by all who knew him. It is
noteworthy that this evening's paper is the first to be read before the Society from that unit.
It is my earnest hope that we will receive more contributions, either of read or published
papers, on methodology from this or other sources in the Government statistical service.
A basic feature of the procedures given in the paper is the set of recursive residuals
described in equation (2). Various properties of these have been mentioned by the authors.
One notable feature mentioned in the paper is that under the assumptions made about the
terms in equation (1) the recursive residuals have zero correlation between any pair. It may
prove of some interest to give a little attention to the nature of the transformation involved.
I always tend to operate here in terms of matrix transformations, which is implicit in the
paper but not spelt out.
As Professor Durbin said, the transformation is clearly linear from a T-dimensional
space to a (T - k)-dimensional space. It is possible to write the transformation matrix in
such a way that it is orthogonal to the regressor matrix $X_T$. This is sufficient to show that
the recursive residuals may be described in terms of the same transformation matrix applied
to the vector of dependent variables, or to the vector of least-squares residuals, or to the
vector of errors in the equation provided the assumptions underlying the model in
equation (1) are correct. This variety of ways in which the recursive residuals may be
described makes it conceptually possible to describe the nature of the distribution of the
recursive residuals, and so of statistics based on those residuals, under alternative
assumptions such as heteroscedasticity of the errors.
The transformation matrix mentioned above has some nice properties. One that is
implied in the paper is that when post-multiplied by its transpose we get an identity matrix.
When pre-multiplied by its transpose we get the idempotent matrix used in the definition
of least-squares residuals. The structure of the transformation matrix makes the
interpretation of the statistic $S_T$ in the paper as a residual sum of squares obvious at a glance.
Interest in the transformation matrix does not stop there. This matrix, of order
$(T - k) \times T$, may be partitioned by the first k columns. The remaining (T - k) columns
form a square matrix, D say, which is lower triangular and satisfies the equation
$$D[I + ZZ']D' = I,$$
where I is an identity matrix of order (T - k) and $Z = X_{T-k} X_k^{-1}$. Here I have partitioned
$X_T'$ as $(X_k' : X_{T-k}')$ in which $X_k$ must be non-singular for recursive residuals to be derivable
at all. It is evident from the properties of the transformation matrix mentioned above that
recursive residuals are members of the Theil system of residual transformations. The
distinction is that whereas Theil, and those who follow his particular approach, have used
the spectral resolution of the matrix (I + ZZ'), the authors of the present paper have used
the Choleski factorization. I have been told, although I have seen no published
demonstration, that the Householder transformation recommended by Golub yields transformed
residuals which are also members of the Theil system of transformations. This leads me to
wonder whether when one attempts any linear transformation of the least-squares residuals
to a set of uncorrelated random variables we are perhaps producing just another member
of the Theil system.
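The properties listed above are easy to verify numerically. In the following sketch the matrix C (a hypothetical name) is built row by row from the definition of the recursive residuals, so that the vector of recursive residuals is C applied to the dependent variable; the regressor matrix is simulated and the checks are illustrative only.

```python
import numpy as np

def recursive_residual_matrix(X):
    """(T-k) x T matrix C such that C @ y is the vector of recursive residuals."""
    X = np.asarray(X, float)
    T, k = X.shape
    C = np.zeros((T - k, T))
    for r in range(k, T):                      # 0-based row r is observation r+1
        Xp, x = X[:r], X[r]
        A = np.linalg.inv(Xp.T @ Xp)
        d = np.sqrt(1.0 + x @ A @ x)           # standard deviation scale of the prediction error
        C[r - k, :r] = -(x @ A @ Xp.T) / d     # weights on the earlier observations
        C[r - k, r] = 1.0 / d                  # weight on the current observation
    return C

rng = np.random.default_rng(1)
T, k = 12, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])   # simulated regressors
C = recursive_residual_matrix(X)

M = np.eye(T) - X @ np.linalg.solve(X.T @ X, X.T)   # idempotent matrix of least squares
D = C[:, k:]                                         # last T-k columns of C
Z = X[k:] @ np.linalg.inv(X[:k])                     # Z = X_{T-k} X_k^{-1}
```

Here `C @ X` vanishes, `C @ C.T` is the identity of order T - k, `C.T @ C` reproduces `M`, and `D` is lower triangular with `D @ (I + Z Z') @ D.T` equal to the identity, up to rounding error.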
One noticeable feature of the recursive residual transformation is that it is not unique.
The rows of $X_T$ can be arranged in any order we please, so long as the first k rows given by
$X_k$ form a non-singular matrix. The authors have a natural order in their examples which
is induced by time, but there is no reason why they should be inhibited from trying some
other order. Thus, before the analysis was conducted there was a suspicion in Example 1
that 1971/72 was more like the earlier periods than the period after 1964/65. That particular
observation could have been inserted between those for 1964/65 and 1965/66 for the
purpose of the test applied.
I have not looked very closely at the moving regressions described in the paper. I
confess that I find the procedures appealing, possibly because I have used similar techniques
without any attempt at a theoretical justification. I was examining the errors in preliminary
estimates of economic time series, defined as the difference between the first published
figure of the value of the series and the figure published three years hence. I was interested
in detecting any marked changes in the values of variances and first-order autoregression
coefficients over the length of the observed record. I think this kind of non-stationarity
may be expected with such error series and might be found with other types of economic
time series also. Simple procedures, such as moving regressions, are of benefit in searching
for marked changes, although I confess that I had some difficulty in interpreting the meaning
of the charts that I constructed. This was mainly because I could recognize that the
patterns observed could have been induced by changes in the series other than the one of
prime interest. I experienced a similar feeling when looking at the authors' examples.
The authors are to be congratulated on their very interesting paper.
I am very pleased to second the vote of thanks.
The vote of thanks was passed by acclamation.
Sir MAURICE KENDALL (World Fertility Survey): I fully endorse what previous speakers
have said about the merits of this paper. It is the most important contribution to regression
problems that we have had for some time. I should like to ask three questions about the
paper itself and make three suggestions for further work.
The procedure suggested for computing successive residuals relies on the result given
by our Chairman and Professor Bartlett enabling the covariance matrix of regressors to be
up-dated when a new observation becomes available. If this were not so the arithmetic
would be very tedious. However, the up-dating does require matrix multiplication and if
it is done over a fairly long sequence there are dangers of cumulative rounding-off errors.
My inclination would be, having arrived at the end of a series, to recalculate the covariance
matrix and to check whether the iterative process has arrived at the correct result. The
point applies equally and possibly more strongly to moving regressions.
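The end-of-series check suggested here can be sketched as follows, assuming the usual rank-one updating formula for the inverse cross-product matrix of the regressors (the result attributed above to Plackett and Bartlett). The function name and the simulated data are hypothetical; the sketch is not the TIMVAR implementation.

```python
import numpy as np

def recursive_least_squares(X, y):
    """Update (X'X)^{-1} and the coefficient vector one observation at a time."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    T, k = X.shape
    P = np.linalg.inv(X[:k].T @ X[:k])         # start from the first k observations
    b = np.linalg.solve(X[:k], y[:k])          # exact fit to the first k observations
    for r in range(k, T):
        x = X[r]
        Px = P @ x
        P = P - np.outer(Px, Px) / (1.0 + x @ Px)   # rank-one update of the inverse
        b = b + P @ x * (y[r] - x @ b)               # correction by the prediction error
    return b, P

rng = np.random.default_rng(2)
T, k = 200, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=T)

b_rec, P_rec = recursive_least_squares(X, y)

# Recalculate directly at the end of the series and measure the accumulated discrepancy.
P_direct = np.linalg.inv(X.T @ X)
b_direct = np.linalg.lstsq(X, y, rcond=None)[0]
max_drift_P = float(np.max(np.abs(P_rec - P_direct)))
max_drift_b = float(np.max(np.abs(b_rec - b_direct)))
```

On well-conditioned data of this kind the two discrepancies are of the order of rounding error; a noticeably larger value would signal the cumulative rounding-off that the check is intended to catch.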
A point which was not entirely clear to me was how the authors distinguish between
departures from the null hypothesis concerning the regression coefficients and those
concerning the magnitude of the residual errors. I should be glad to know how they decide
between one or the other explanation of a significant result.
A third point on which I should value the authors' comments concerns the application
of their technique to data which do not have a temporal order. So far as I can see, one
could use their method for data arising in any order, but if the order were to be determined
by reference to the values of the regressor variables, which of them should one choose?
I think further examination is required of Quandt's method of testing where a regression
changes routine. Maximizing likelihood has the disadvantage that maxima are flat, by
which I mean that much the same maximum value is reached for a fairly wide range of the
variables. In certain cases, as for example when a metal changes molecular shape, it is
important to narrow down the point of change with great accuracy and some further
research on this subject is desirable.
Finally, two directions in which an extension of the work in this paper would be very
valuable. One would be an application to autoregressive series. The other would be an
extension to the case where all variables are subject to error. Both cases are more likely to
arise in practice than the one of pure regression on non-autocorrelated regressors.

Professor M. B. PRIESTLEY (University of Manchester Institute of Science and
Technology): The problem discussed by the authors is indeed an interesting one, and they
have presented us with a rich variety of techniques for its solution. Essentially, what they
seem to be saying is that in the real world one would expect relationships between variables
to be "dynamic" rather than "static"; hence simple approximations (such as "static"
linear models) cannot be expected to remain valid over indefinitely long periods of time
unless we are prepared to modify continually the values of the parameters so as to allow
these models to adapt themselves to "local" conditions. The proposed techniques for
monitoring and testing changes in the parameter values should therefore prove extremely
useful in many fields of application.
However, although the authors assume at the outset that the x variables are deterministic
(so that, presumably, model (1) may be treated within the framework of classical
regression theory), this would seem to be a rather unnatural assumption, particularly in
the context of the examples discussed in Section 3. Indeed, the independent variables in
Examples 1 and 2 seem just as "stochastic" as the dependent variable! Moreover, although
the dynamic element is incorporated via the possible time dependence of the parameter, it
would seem more natural to express this notion in the more conventional manner by
introducing lagged terms into the regression relationships. (These would allow, for example,
for the effects of "inertia" in the interrelationships between the variables.) If we now
combine both these ideas we are led to a more general form of equation (1), namely,
$$y_t = \sum_{j=1}^{k} \sum_{s=0}^{\infty} \beta_s^{(j)}(t)\, x_{t-s}^{(j)} + u_t. \qquad (*)$$
Equations of the form (*) are familiar in the study of relationships between time series,
and arise, for example, in the context of linear stochastic control systems. Model (1)
reduces to a special case of (*) in which only $\beta_0^{(j)}(t)$ is non-zero, all the remaining $\beta$'s being
zero.
Of course, if only finitely many of the $\beta_s^{(j)}$ are non-zero, (*) may still be expressed in the
form (1) where now the $x_{t-s}^{(j)}$ are simply regarded as additional variables. However, this
would, in general, lead to a model with an extremely large number of parameters, and if
the $x_t^{(j)}$ are regarded as stationary processes, the serial correlation within each process and
the cross-correlation between $x_t^{(i)}$, $x_t^{(j)}$, would no doubt invalidate the distribution theory
of the various test statistics. More importantly, if the original relationship between $y_t$ and
the $\{x_t^{(j)}\}$ involved lagged values of $\{y_t\}$, the formulation (*) would require an infinite set
of the parameters $\{\beta_s^{(j)}\}$. For these reasons it is usually more convenient to transform (*)
into its equivalent frequency domain representation. If we assume, for the moment, that
the $\beta_s^{(j)}$ are time invariant, i.e. $\beta_s^{(j)}(t) = \beta_s^{(j)}$, all t, then the inter-relationships between $y_t$ and
$\{x_t^{(j)}\}$ are completely characterized by the sequence of transfer functions,
$$\beta^{(j)}(\omega) = \sum_{s=0}^{\infty} \beta_s^{(j)} e^{-i\omega s}, \qquad j = 1, 2, \ldots, k.$$
If each $\beta^{(j)}(\omega)$ is a sufficiently "smooth" function of $\omega$ it is then possible to estimate all
these functions "non-parametrically", i.e. without making any specific assumptions about
the form of the $\{\beta_s^{(j)}\}$, and this, in turn, provides estimates of the complete set of parameters
$\{\beta_s^{(j)}\}$. It is well known, for example, that if there is no cross-correlation between the $\{x_t^{(j)}\}$
processes, then the least-squares estimate of $\beta^{(j)}(\omega)$ is (asymptotically) given by
$$\hat{\beta}^{(j)}(\omega) = \hat{f}_{yj}(\omega) / \hat{f}_{jj}(\omega),$$
where $\hat{f}_{yj}(\omega)$ is the estimated cross-spectral density function between $y_t$ and $\{x_t^{(j)}\}$ and
$\hat{f}_{jj}(\omega)$ is the estimated spectral density function of $x_t^{(j)}$. When the $\beta_s^{(j)}$ are time-dependent
$y_t$ is, in general, no longer a stationary process, but the essential point is that the same
basic ideas may still be used to obtain a frequency domain description of (*) even in this more
general case. The transfer functions themselves now become time-dependent, i.e. in place
of the $\{\beta^{(j)}(\omega)\}$ we now have the "generalized transfer functions"
$$\beta_t^{(j)}(\omega) = \sum_{s=0}^{\infty} \beta_s^{(j)}(t)\, e^{-i\omega s},$$
but, assuming that the $\beta_s^{(j)}(t)$ do not vary "too rapidly" over time, the $\beta_t^{(j)}(\omega)$ may still be
estimated "non-parametrically" by introducing the notion of evolutionary (i.e.
time-dependent) spectra and cross-spectra. This approach has been studied by myself and my
colleagues, Dr Subba Rao and Dr Tong, and the main ideas were reported in Priestley
(1965) and Priestley and Tong (1973). Moreover, although models of the form (*) are
more complicated than (1) in that they involve "lagged" regressor variables, it is still
possible to construct tests for the constancy over time of the complete sequence of transfer
functions, $\{\beta_t^{(j)}(\omega)\}$, using a MANOVA approach. Such tests have been applied and tested
on "real data", and were described in recent published papers by Subba Rao and Tong
(1972, 1973). In a very loose sense the ideas underlying this approach are not unlike those
discussed in Section 2.5 on "moving regressions", but its advantage is that it allows one to
examine possible variations over time of the complete form of each of the transfer functions,
$\beta_t^{(j)}(\omega)$. (The authors' model (1) assumes, in effect, that each $\beta_t^{(j)}(\omega)$ is a constant,
independent of $\omega$.)
The more general approach described above does, of course, require the availability of
fairly long series of observations on $y_t$ and the $\{x_t^{(j)}\}$, but this requirement is surely inherent
in the basic nature of the problem, irrespective of the approach adopted for its solution.
There is, after all, a limit to the amount of information which one can extract from a
finite amount of data; once the parameters are allowed to become time-dependent the
accuracy of the estimators is related in quite a fundamental way to the maximum rate at
which the parameters can be allowed to change (cf. the "Uncertainty Principle", Priestley,
1965). It may well prove to be false economy to try to extract additional information from
the data by over-simplifying the model.
It is well known (for example, Rosenbrock, 1965), that the recursive relations for
updating regression coefficients (due to Plackett and Bartlett and mentioned in Section
2.2) are very closely related to the recursive relations which arise in Kalman filtering
theory, and it may be interesting, therefore, to investigate whether the (now massive)
literature on Kalman filtering could be exploited to provide further results on the properties
of the recursive residuals, $\{w_r\}$. It may be noted also that both Harrison and Stevens
(1971) and Bohlin (1968) have considered models of the form (*) in which the coefficients
$\beta_s^{(j)}(t)$ are themselves stochastic processes, and Kalman filtering then emerges almost
inevitably as the appropriate technique for updating (or more precisely in this case,
predicting) the coefficients in such models. Models involving "stochastic parameters",
although not conceived in a true Bayesian spirit, may perhaps be regarded as a first step
in this direction.
The points mentioned in the preceding paragraph illustrate the close relationship which
exists between certain branches of stochastic control theory and problems in time series
analysis. Unfortunately, much of the control theory literature tends to be written in a
language and style which, at first sight, may seem unfamiliar to statisticians, and this may
account for the fact that many statisticians are unaware of the full potentialities of this
work. It would be to the mutual advantage of both control theorists and statisticians to
foster closer collaboration between workers in these two fields. On the other hand, it
would be wholly regrettable if, by adopting a parochial and introverted approach to their
work, statisticians failed to appreciate the relevance and importance of much of the recent
control theory research.

Dr PETER C. YOUNG (Centre for Resource and Environmental Studies, Australian
National University, Canberra†): It has been suggested (Kailath, 1974), although without
specific reference, that lying somewhere in the Collected Works of C. F. Gauss there is a
recursive formulation of the least-squares equations. In recent years, however, there can
be no doubt that the development of the recursive least-squares algorithm is due, in large
part, to the work of our Chairman tonight, R. L. Plackett, who published an important
paper on the subject in 1950. Indeed Professor Plackett's paper was to me, as a very young
research worker in the early nineteen sixties, a great revelation and it had an important
influence on my future work; an influence for which I am extremely grateful and for which
I am, after more than a decade, now able to thank him personally.
But like most works of innovative quality Professor Plackett's paper was, in many ways,
ahead of its time and its significance was rather lost on the pre-computer statistical
audience of the day. Indeed it was left to a control theorist, Rudolf Kalman, to continue the
saga of recursive least squares in 1960 when he published his influential paper on state
variable estimation theory. In effect, Kalman utilized the principle of orthogonal projection
to evolve a more general form of the recursive least-squares equations for the case where
the unknown parameters are no longer considered constant coefficients (as assumed
implicitly in Professor Plackett's formulation) but are treated as time variable states of a
dynamic system described by a set of linear stochastic state equations of a Gauss-Markov
type.
† Formerly at Control Division, Department of Engineering, University of Cambridge.
Since 1960 many papers have appeared in the control literature on both recursive least
squares and state variable estimation (or filtering as it is referred to in the literature),
certainly too many to discuss here: it will suffice merely to mention a recent survey paper
on filtering theory by Kailath (1974), which lists 390 references from both the control and
statistical literature and provides an excellent appreciation of the subject, albeit somewhat
orientated towards the information and communication theory audience to whom it was
principally directed. It is a little surprising, however, that the statistical literature has not
been similarly influenced by the early work on recursive least-squares methods, particularly
now that the availability of electronic computers makes the recursive formulation so
attractive in practical terms. With this in mind, it is especially welcome to hear a paper
read at the Royal Statistical Society which recognizes the importance of the recursive
least-squares formulation in analysing data with general non-stationary statistical
properties and, in particular, which discusses how the recursive residuals can be used to
detect the possibility of temporal change in the coefficients of a regression relationship.
The paper tonight will, I am sure, prove of practical use in day-to-day statistical analysis
for I know from experience with real data from a variety of different sources that the
possibility of parametric non-stationarity is ever present and classical methods of block
data analysis simply do not have the flexibility to handle such problems.
One minor criticism of the paper, which I may perhaps be allowed to voice, is its
notable lack of reference to parallel developments in the control literature; developments
which have considerable bearing on the type of analysis suggested by the authors and
which should, I believe, be brought to the attention of the audience. I have attempted, at
a previous meeting of the Society (Young, 1971), to correct the apparent lack of contact
between the disciplines but, apart from some exceptions, I have clearly failed to get my
message across and, with the Chairman's indulgence, I will try again.
Much of the paper tonight is concerned with the statistical properties of a normalized
function $w_r$ of the recursive residuals $y_r - x_r' b_{r-1}$. In the control literature these recursive
residuals have been termed the innovations process; a term that appears to have been first
introduced in this connection by Wiener and Masani in the mid-fifties (see Kailath, 1974).
It was not until 1968, however, that Kailath showed that this process (or its
continuous-time equivalent) is, under assumptions similar to those of regression analysis, zero mean,
gaussian and serially uncorrelated (although it should be emphasized that these properties
are to a certain extent implicit in the orthogonal projection arguments of Kalman).
Bearing the "white noise" properties in mind, it is not surprising that statistical tests on
the recursive residuals have been the basis of many methods of verifying the efficacy of
recursive estimation schemes. In our own work on autoregressive-moving-average
time-series model estimation (see, for example, Young et al., 1971), for instance, it is normal
to compute the sample autocorrelation function of the residuals and assess whether the
statistical assumptions are satisfied; in particular whether the recursive residuals are
serially independent. And in recent years, there has been considerable research into the
statistical properties of the innovations process when, for example, a sub-optimal Kalman
filter-estimation algorithm is applied to a stochastic system (as is often the case, since the
optimal algorithm requires a priori information on the nature of the dynamic system and the
covariance properties of the stochastic disturbances; information which, if it is available
at all, can be subject to considerable uncertainty). Such research has produced algorithms
for both (a) explicitly estimating either the statistical properties of the stochastic
disturbances, or the gain matrix of the Kalman filter, from the sample correlations of the
innovations process (for example, Mehra, 1970; Carew and Belanger, 1973; Neethling and
Young, 1974) or (b) implicitly adapting the Kalman filter to ensure a satisfactory
innovations process (Neethling, 1974).
The authors, with their experience of statistical hypotheses testing, suggest various
checks that can be applied to the normalized innovations process which appear to be of
great value in assessing the existence of non-stationarity. A somewhat similar approach
to detecting the possibility of parametric change using a serial correlation test has been
suggested by Hancock (1971). But hypothesis tests are only diagnostic tools and they do
little to tell us how, having discovered the existence of parametric non-stationarity, we are
able to modify our analysis to account for its presence. In this sense, I think the authors
have erred in not recommending that the user refers more often to the recursive estimation
of the parameters in the regression relationship, as well as to the recursive residuals.
Certainly our own work over the past ten years shows that such information can be
particularly useful in practical terms providing, as it does, information not only on the
existence of time variability but also, and perhaps more important, on the physical cause
of such non-stationarity; information which could well be used, often in an iterative
manner, to "identify" better the process under investigation, in some cases removing the
cause of non-stationarity completely and so yielding a model of greater practical utility.
As an example of this latter approach consider the rainfall ($u_k$)-runoff ($y_k$) data shown
in Fig. 1 for a stretch of the River Ouse near Bedford, in the year 1972. (This example has
arisen in connection with a systems analysis study of water quality in the Great Ouse River
system; a study being carried out by Paul Whitehead and myself in collaboration with the
Great Ouse River Division of the Anglian Water Authority and the Department of
the Environment; see Whitehead and Young, 1975.) The recursive estimates of the
moving average parameters $b_1$ and $b_2$ in an autoregressive-moving average model relating
$y_k$ and $u_k$ are shown in Fig. 2. These estimates were, in fact, obtained from a special
recursive instrumental variable (I.V.) algorithm which may be interesting to Professor
Durbin whose early paper on the instrumental variable method (Durbin, 1954) proved
useful in our initial development of this approach to time-series analysis. The algorithm
is special in the sense that it can be given the ability to estimate parametric non-stationarity
if the user has reason to believe that significant changes may occur over the observation
interval. In this present case, it is clear that the parameters show a marked tendency to
both long- and short-term variation. Reference to the physical nature of the system
indicates that such non-stationarity is probably due to the effects of evaporation etc. (in
the long term) and soil moisture deficit (in the short term) and suggests that the system might
be "purged" of its non-stationary behaviour by pre-processing or filtering the data with
these factors in mind in order to yield an effective rainfall input, as shown in Fig. 1(b),
such that the resulting estimates are, at least approximately, time-invariant as shown in
Fig. 2(b). In this figure the estimates do not indicate strict time invariance because of the
nature of the estimation algorithm: in effect, the ability to track parameter variation is only
obtained at some cost in estimation efficiency unless the exact nature of the time variation
is known a priori. In this case such information is not available and the algorithm is
instructed to expect only random variation in the b parameters between samples, with the
result that the estimates have a fairly high residual variance. But, at the same time, it is
now clear that the apparent estimated parameter variations are probably due in large part
to the residual random noise effects that have not been sufficiently "smoothed out" by the
modified algorithm which, in effect, has difficulty in differentiating between random noise
effects and any random changes in parameters of the model (see, for example, Young,
1974). As a result, further analysis can now proceed under the assumption that the parameters
are now sensibly constant over the observation interval. In this case, such analysis was
carried out using an iterative version of the I.V. algorithm which is able to refine the
estimates by making multiple passes through the data, each time updating the instrumental
variables and so improving the statistical efficiency.
The success of the stationary time-series model obtained from the iterative I.V.
algorithm is demonstrated in Fig. 3, which shows the forecasted river flow compared with
the observed flow for the year 1972, and Fig. 4, which gives similar results for the following
year on a different reach of the river but using the model as fitted to the 1972 data.

[FIG. 1a. Two panels plotted against TIME (DAYS); axis values not recoverable.]
[FIG. 1b. One panel plotted against TIME (DAYS); axis values not recoverable.]
[FIG. 2a. Panels labelled b2 and b1, plotted against TIME (DAYS).]
[FIG. 2b. Panels labelled b1 and b2, plotted against TIME (DAYS).]
[FIG. 3. OBSERVED FLOW and FORECAST FLOW plotted against TIME (DAYS).]
[FIG. 4. OBSERVED FLOW and FORECAST FLOW.]

Before leaving the topic of time-series analysis, I think it is a little misleading to a
general audience when the authors, in the first sentence of their paper, refer to time-series
data, so tending to give the impression that the techniques they describe are particularly
useful for time-series analysis. And yet it is well known that, except in special cases,
regression analysis yields asymptotically biased estimates when applied to more general
time-series problems such as the example I have just outlined, where there are clearly
problems of errors in variables. (This is probably no problem in the examples quoted in
the paper since the authors are not, apparently, interested in the structural parameters
but only in the forecasting ability of the models.)
I merely wish to note here that recursive techniques of time-series analysis that are not
prone to such disadvantages are, as we have seen, available and have been applied
successfully to real data from a variety of systems, in addition to the water resources
problem discussed above. Descriptions of these IV-AML (Instrumental Variable-
Approximate Maximum Likelihood) methods of time-series analysis have appeared both
in the control literature (Young and Hastings-James, 1970; Young, 1972) and, more
recently, in the mathematical literature (Young, 1974), while a report describing the
techniques in detail is available from the author (Young et al., 1971). In addition, a paper
emphasizing their importance in general dynamic systems analysis will be submitted to this
Society in the near future and will deal not only with single input-single output systems
of the conventional time-series type (see, for example, Box and Jenkins, 1970) but also
with multi-input, multi-output systems of the kind met so often in practice (Young and
Whitehead, 1975).
Mr G. PHILLIPS (University of Kent): I found the paper extremely interesting, and it
clearly indicates the usefulness of the recursive residuals in testing for specification errors
in linear regression models. Further evidence for this is provided in two forthcoming
papers, the first of which (Phillips and Harvey, 1974) discusses tests for serial correlation
using recursive residuals, and the second of which (Harvey and Phillips, 1974) discusses
tests for heteroscedasticity.
However I find some difficulty in deciding the likely usefulness of the proposed tests in
the absence of any investigations of their power under a range of alternative hypotheses.
I noticed that the authors referred to the fact that they have not investigated the problem
of serial correlation on their tests. There is some evidence in a paper by R. A. Johnson
and M. Bagshaw, in Technometrics (1974), 16, 103-112, which suggests that the cusum tests
are not robust to departures from independence.
A further point is that the recursive residuals are independent only when disturbances
are normal. When there is a departure from normality perhaps recursive residuals may be
no more effective than the ordinary least-squares residuals.
Finally, in the first example of an application of TIMVAR, the problem as posed appeared
to be one of errors of measurement on the dependent variable. I was not clear how the
analysis came to its conclusion that the instability was due to local changes in regression
coefficients although I agree with the course of action taken.

Dr T. W. ANDERSON (Stanford University and London School of Economics): This
paper was interesting to me on many counts because its contributions touch several of my
own areas of research. Let me comment on different aspects. I thought it would be
amusing to suggest to Professor Durbin that the recursive residuals could be used to
construct tests of serial correlation which would be alternative to the well-known Durbin-
Watson procedure, but that idea has already come up earlier in the discussion.
The generalization of the Helmert transformation is particularly useful in its natural
time sequence for indefinitely long series. Before reading the paper I had used the
transformation to prove the following theorem. (My reading of the discussions in the JRSS
indicates that not infrequently the opportunity is used for the discussant to display his own
results, and I seldom get the chance.) In the model of the paper, with $H_0$ true, $b_T \to \beta$ as
$T \to \infty$ with probability 1 if and only if $(X_T' X_T)^{-1} \to 0$.
Another aspect of the paper that struck a familiar note was the use of the continuous
Gaussian process to approximate the probability that the cusums lie between two lines for
$r = k+1, \ldots, T$. At one time I worked very hard to obtain the probability that the
Brownian motion process remains between two specified lines. (Apparently, these RSS
discussions permit a discussant to refer to his earlier work as well.) Using Corollary 4.2
of "A modification of the sequential probability ratio test to reduce the sample size"
(Annals of Mathematical Statistics (1960), 31, 165-197) I find, for example, that when
$a = 0.850$, the probability of going out of the interior region is 0.0987, which the authors
use for 10 per cent significance. This is an error of only 1.3 per cent. Since the error will
decrease with significance level, my calculations justify the authors' assumption that the
probability that $W_r$ crosses both lines is negligible. It might be noted that the continuous
time computation exaggerates the probability of cusums calculated discretely crossing a
line.
It was instructive to me to write (4) as
$$b_r - b_{r-1} = w_r\,(X_r' X_r)^{-1} x_r \sqrt{1 + x_r'(X_{r-1}' X_{r-1})^{-1} x_r}.$$
This shows that the recursive residuals are based on changes in the estimate of the
regression parameter vector and $w_r^2$ is a quadratic form in the difference.
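The relation between the change in the coefficient estimate and the recursive residual can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes k = 2 regressors (intercept and trend), uses made-up data, and the helper names (`ols`, `check_identity`) are hypothetical. It computes $b_r$ by refitting on the first r observations and compares $b_r - b_{r-1}$ with $w_r (X_r'X_r)^{-1} x_r \sqrt{1 + x_r'(X_{r-1}'X_{r-1})^{-1}x_r}$.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(A, v):
    return [dot(row, v) for row in A]

def inv2(A):
    """Inverse of a 2x2 matrix (sufficient for this two-regressor illustration)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def ols(X, y):
    """Return (b, P) where b is the least-squares estimate and P = (X'X)^{-1}."""
    XtX = [[sum(row[a] * row[c] for row in X) for c in range(2)] for a in range(2)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(2)]
    P = inv2(XtX)
    return mat_vec(P, Xty), P

def check_identity(X, y, k=2):
    """For r = k+1, ..., T return (w_r, b_r - b_{r-1}, right-hand side of the identity)."""
    out = []
    b_prev, P_prev = ols(X[:k], y[:k])
    for r in range(k, len(y)):
        x = X[r]
        f = 1.0 + dot(x, mat_vec(P_prev, x))          # 1 + x_r'(X_{r-1}'X_{r-1})^{-1} x_r
        w_r = (y[r] - dot(x, b_prev)) / math.sqrt(f)   # recursive residual
        b_r, P_r = ols(X[:r + 1], y[:r + 1])           # refit on the first r observations
        lhs = [bn - bo for bn, bo in zip(b_r, b_prev)]
        rhs = [w_r * g * math.sqrt(f) for g in mat_vec(P_r, x)]
        out.append((w_r, lhs, rhs))
        b_prev, P_prev = b_r, P_r
    return out

# Made-up data: intercept plus linear trend with small irregularities.
X = [[1.0, t] for t in (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)]
y = [1.1, 1.9, 3.2, 3.8, 5.3, 7.0]
results = check_identity(X, y)
```

Refitting from scratch at each step is deliberately wasteful; it keeps the two sides of the identity computed by independent routes.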

Dr A. F. M. SMITH (University College London): The topics dealt with in this paper
are of considerable practical importance and provide a number of challenging problems
for the theoretical statistician. The authors have adopted an unashamedly exploratory,
data-analytic approach, and are careful to point out that their proposals "should be
regarded as yardsticks for the interpretation of data rather than leading to hard and fast
decisions".
I wonder if this is really satisfactory? Data exploration is certainly a necessary
preliminary, but it seems to me that sooner or later one must provide a more formal,
theoretical framework within which to assess the outcomes of the significance tests, or
the departures occurring in the cusum plots. Other contributors have already touched on
the problems of assessing power against particular alternatives, distinguishing changes in
regression coefficients from changes in variances, and taking into consideration the effects
of autocorrelated errors. The latter point, in particular, seems crucial given the type of
problem under consideration. Have the authors any idea of how procedures based on
recursive residuals are affected by the introduction of serial correlation?
Turning more specifically to the cusum techniques, is it not possible to improve the
procedure by adopting a moving V-mask? Considering Fig. 5, for example, one notices
that the plot has returned to the origin by about observation 35. Could this information
not be utilized by resetting the mask (or some modification thereof) in order to begin again
at this point?
As they stand, the procedures put forward by the authors seem to require a great deal
of informal use of personal judgment (as evidenced in Section 2.4, for example, where we
encounter such phrases as "we ourselves prefer", "it may be useful to examine" and
"it is often informative to"). The Bayesian approach offers a more formal framework for
the inclusion of personal judgments, and I should like to bring to the authors' attention
two such Bayesian offerings.
The first of these is due to Harrison and Stevens (1971), and was developed in a time-
series context. Basically, they take as their model a weighted average of submodels, the
latter being selected to cover the range of potential behaviour of the process in question.
The weights are the continuously updated probabilities of the submodels. Changes in the
underlying relationship are reflected in changes in the weights, and the submodels can be
chosen to include a whole range of possible departures. The method has been applied, in
particular, to forecasting demand in a mail-order context where fashions are prone to
sudden change (see Green and Harrison, 1973).
The second approach is one I have been working on myself in the general context of
change-point inference. The simplest version concerns a series of n observations which
may all follow the same distribution, with density $f_1(\cdot \mid \theta_1)$, or may have changed at some
(unknown) point to a different distribution, with density $f_2(\cdot \mid \theta_2)$. A change at time r
would generate a likelihood
$$l(r, \theta_1, \theta_2 \mid x_1, x_2, \ldots, x_n) = \prod_{i=1}^{r} f_1(x_i \mid \theta_1) \prod_{i=r+1}^{n} f_2(x_i \mid \theta_2).$$
If $p_0(r)$ represents the prior probability of a change at time r, then $p_n(r)$, the posterior
probability given the data, is defined by
$$p_n(r)/p_0(r) \propto \int l(r, \theta_1, \theta_2 \mid x_1, x_2, \ldots, x_n)\, p(\theta_1, \theta_2)\, d\theta_1\, d\theta_2,$$
where $p(\theta_1, \theta_2)$ denotes the joint prior density for $\theta_1$ and $\theta_2$. The $p_n(r)$ provide a starting
point for inferences about r, $\theta_1$ and $\theta_2$.
In the particular case of detecting a possible change in the regression coefficients
corresponding to a particular set of regressor variables, if $X_r$, $X_{n-r}$ denote the portions of
the design matrix corresponding to the first r and the last n - r observations, respectively,
and $SS_r$ and $SS_{n-r}$ the corresponding residual sums of squares, the assignment of standard
vague priors for the regression coefficients and the variance (assumed constant throughout)
leads to the result
$$p_n(r)/p_0(r) \propto |X_r' X_r|^{-1/2}\, |X_{n-r}' X_{n-r}|^{-1/2}\, (SS_r + SS_{n-r})^{-d},$$
where d is a function of n and the number of regression coefficients. In so far as there is
any connection here with the authors' analysis, the Bayesian approach would seem to
favour techniques based on the squares of the recursive residuals (cf. the authors'
equation (5)).
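A change-point posterior of this kind can be sketched for a straight-line regression (k = 2: intercept and slope). The sketch rests on assumptions that the text does not fix: the kernel is taken as $|X_r'X_r|^{-1/2}|X_{n-r}'X_{n-r}|^{-1/2}(SS_r + SS_{n-r})^{-d}$ with $d = (n - 2k)/2$ (the value a prior proportional to $1/\sigma$ would give), the prior $p_0(r)$ is uniform, the data are invented, and the function names are hypothetical.

```python
import math

def fit_segment(xs, ys):
    """Least-squares line through a segment: return (|X'X|, residual sum of squares)."""
    m = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = m * sxx - sx * sx                 # determinant of X'X for X = [1, x]
    slope = (m * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / m
    ss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return det, ss

def change_point_posterior(xs, ys, k=2):
    """Posterior over the change point r under a uniform prior (an assumption)."""
    n = len(xs)
    d = (n - 2 * k) / 2.0                   # ASSUMED form of d; the text leaves it unspecified
    log_kernel = {}
    for r in range(k, n - k + 1):           # each segment needs at least k observations
        det1, ss1 = fit_segment(xs[:r], ys[:r])
        det2, ss2 = fit_segment(xs[r:], ys[r:])
        log_kernel[r] = (-0.5 * math.log(det1) - 0.5 * math.log(det2)
                         - d * math.log(ss1 + ss2))
    top = max(log_kernel.values())          # subtract the maximum before exponentiating
    weights = {r: math.exp(v - top) for r, v in log_kernel.items()}
    total = sum(weights.values())
    return {r: wt / total for r, wt in weights.items()}

# Invented data: a rising line for the first six points, then a flat level.
xs = [float(t) for t in range(1, 13)]
ys = [1.6, 1.9, 2.6, 3.0, 3.4, 4.1, 1.0, 1.1, 0.9, 1.2, 0.9, 1.1]
posterior = change_point_posterior(xs, ys)
best_r = max(posterior, key=posterior.get)
```

Working with logarithms avoids overflow when the pooled residual sum of squares is very small for some r.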
More complicated situations involving a number of possible change points can be
dealt with straightforwardly by performing a similar prior to posterior analysis for
appropriate k-tuples $(r_1, r_2, \ldots, r_k)$, where $1 < r_1 < r_2 < \ldots < r_k < n$.
The authors are to be congratulated on stimulating this evening's discussion. By their
own admission, a number of problems remain unsolved, but, as was said of St Denis when
he walked some considerable distance with his head in his hand: "La distance n'y fait
rien; il n'y a que le premier pas qui coûte".
The following contributions were received in writing, after the meeting.
Mr M. R. B. CLARKE (University of London): I have a brief comment to make about
the updating formulae for moving regressions in Section 2.5. Many people have pointed
out possible dangers arising from numerical ill-conditioning in forming sums of squares and
products in order to solve the normal equations. More recently updating formulae such as
those quoted at the beginning of Section 2.5 have come in for severe criticism, notably by
Chambers, when used for numerical rather than theoretical computations. Broadly
speaking such problems can be avoided by using one of the orthogonal decomposition
methods such as Householder or Gram-Schmidt. These decompose the data matrix
into the form
$$[X \mid Y] = Q \begin{bmatrix} U & v \\ 0 & w \end{bmatrix},$$
where Q is (n x n) orthogonal and U is the (k x k) upper triangular square root of X'X.
Q need not be known explicitly as all the information required is in U and v, the
regression coefficients being the roots of $U\beta = v$ and the regression sum of squares v'v.
If we now add another observation (x, y) we have
$$\begin{bmatrix} X & Y \\ x' & y \end{bmatrix} = \begin{bmatrix} Q & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} U & v \\ 0 & w \\ x' & y \end{bmatrix}$$
and since $\begin{bmatrix} Q & 0 \\ 0 & 1 \end{bmatrix}$ is orthogonal we need only write down some simple equations to
determine the k plane rotations that annihilate the x part of (x, y). These updating
formulae are numerically stable being linear in the data and very nearly as economical, once the
original decomposition is completed, as those quoted in the paper.
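The row-updating step can be sketched with plane (Givens) rotations. This is an illustrative reading of the idea rather than the contributor's own routine: it assumes the triangular information is stored as a single (k+1) x (k+1) upper triangular array holding U, v and the residual norm, and the names `givens_update` and `back_solve` are hypothetical.

```python
import math

def givens_update(R, new_row):
    """Fold one observation (x, y) into an upper triangular factor R.

    R is a list of p rows (p = k + 1): the first k rows hold [U | v] and the
    last diagonal entry holds the residual norm. new_row is x followed by y.
    Returns a new factor; the inputs are left unchanged.
    """
    p = len(R)
    R = [row[:] for row in R]
    z = new_row[:]
    for j in range(p):
        if z[j] == 0.0:
            continue                          # nothing to annihilate in this column
        h = math.hypot(R[j][j], z[j])
        c, s = R[j][j] / h, z[j] / h          # rotation in the plane of row j and the new row
        for col in range(j, p):
            rj, zj = R[j][col], z[col]
            R[j][col] = c * rj + s * zj
            z[col] = -s * rj + c * zj         # z[j] becomes zero
    return R

def back_solve(U, v):
    """Solve U beta = v for upper triangular U."""
    k = len(v)
    beta = [0.0] * k
    for i in reversed(range(k)):
        rest = sum(U[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (v[i] - rest) / U[i][i]
    return beta

# Made-up data lying exactly on y = 2 + 3t, added one observation at a time.
rows = [[1.0, t, 2.0 + 3.0 * t] for t in (0.0, 1.0, 2.0, 3.0)]
R = [[0.0] * 3 for _ in range(3)]
for row in rows:
    R = givens_update(R, row)
U = [r[:2] for r in R[:2]]
v = [r[2] for r in R[:2]]
beta = back_solve(U, v)
```

Each update touches only the triangular array, so Q is never formed, in line with the remark that Q need not be known explicitly.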

Professor A. S. C. EHRENBERG (London Business School): I welcome a paper addressing
itself to practical problems in regression analysis, but I have difficulties with it. For
example, I do not understand the data-analysis "yardsticks" which the authors proffer in
place of classical tests of significance. In what probabilistic or other units are the yardsticks
calibrated?
My main difficulty, however, arises from the assumption at the beginning of Section 2
that the regressors are non-stochastic. This is obviously not true in most practical cases,
and certainly not in the authors' own examples in Section 3. The failure of this assumption
leads not merely to minor technical difficulties or small biases, but radically affects the
authors' null hypothesis and indeed the applicability of regression analysis as a whole.
This can be readily demonstrated in terms of the "moving regression" approach developed
in Section 2.5.
Here regressions are fitted to non-overlapping sets of n successive observations. The
authors' null-hypothesis is that the regression for the first n readings is the same as that
for another n, say the last n. This is illustrated in Fig. A for two variables, y and x. (For
simplicity I consider only two variables here, but the argument generalizes to more than
two.)
If x is non-stochastic (as assumed in the paper) this null-hypothesis says that the
expected values of y for each x in the first set of n readings lie on a straight line and that the
expected values of y for each x in the second set of data lie on the same straight line. This
is of course perfectly feasible and would therefore be worth testing against any data.

[FIG. A. The null-hypothesis with a non-stochastic x-variable. Axes: y against x.]
[FIG. B. Regressions of y on x for the first and last n readings, and for the first and last
n/2 readings. Axes: y against x.]

But this null-hypothesis is inherently impossible if the x-variable is stochastic (other
than in trivial cases). It is then well known that the regression of y on x for the first n
readings will generally differ from the regression for a systematic subset of the readings,
such as the first n/2. Similarly, the regression of y on x for the last n readings will be
different from that for the last n/2 readings. This is illustrated in Fig. B.

[FIG. C. A line fitted to means of the first and last n readings. Axes: y against x.]
[FIG. D. The regressions of y on x and of x on y for the first and last n readings.
Axes: y against x.]

It follows that if the null-hypothesis that the regressions of y on x are the same holds for
the first n and last n readings, the corresponding hypothesis cannot hold for the first n/2
and last n/2 readings. But n is an arbitrary number. Hence the regressions of y on x with
stochastic x-variables cannot in general be the same for any two sets of n readings. The
null-hypothesis is therefore untrue, and this can be seen without any data-analysis, or
even without the data. The distinction between the non-stochastic situation considered in
the paper and the realistic one where the regressors are in some sense stochastic is therefore
crucial.
An alternative formulation which avoids the above dilemma is discussed more generally
elsewhere (Ehrenberg, 1975). In terms of the approach of analysing separate sets of n
readings as developed in Section 2.5 it is that for a straight line to hold for both the first
and the last sets of n successive readings (and for any other sets of readings), the line must
go through the mean values of each set of readings, as shown in Fig. C. This is both a
necessary and a sufficient condition.
Such a line is in general not a regression equation for any of the data. This is illustrated
in Fig. D which indicates the regressions of y on x and of x on y for both sets of n readings.
Since the line in Fig. C is fitted without recourse to the residuals from the line (i.e. with
no minimization procedures), the residuals are in practice relatively easy to examine for
systematic deviations. The line is also not affected by unbiased errors (of measurement or
the like) in either of the variables.

Dr A. C. HARVEY (University of Kent at Canterbury): Recursive residuals have two
great attractions; one is their simplicity, and the other, which is perhaps more important,
is their flexibility. For a given set of T observations, there are T!/k! different sets of
recursive residuals. Which set is actually computed depends on the result of two closely
connected decisions. The first concerns which k observations should be used to form the
"basis" (i.e. used to form the initial estimate of $\beta$); the second concerns how the remaining
T - k observations should be ordered. If the basis is chosen in a certain way it is possible
to obtain a set of recursive residuals which follow a similar pattern to that produced by
the O.L.S. residuals. Exact tests against such alternative hypotheses as serial correlation
(Phillips and Harvey, 1974) and heteroscedasticity (Harvey and Phillips, 1974) may then be
constructed. On the other hand, it is sometimes possible to choose the basis and the
ordering in such a way that under certain misspecifications of the model, the recursive
residuals have a distinctive pattern, very different to that produced by O.L.S. residuals.
Tonight's paper has provided one example of this type of distinctive pattern in the context
of structural change. Another example concerns the case when the functional form of one
of the regressors is incorrectly specified. Mr P. Collier and myself have recently completed
a paper (Harvey and Collier, 1975) in which we show that a test based on recursive
residuals is relatively powerful compared with a number of other tests. The test is based
on the statistic
$$\psi = \Big\{(T-k-1)^{-1} \sum_{j=k+1}^{T} (w_j - \bar{w})^2\Big\}^{-1/2} \Big\{(T-k)^{-1/2} \sum_{j=k+1}^{T} w_j\Big\}, \qquad (1)$$
where $\bar{w}$ is the arithmetic mean of the recursive residuals. This statistic follows a t-
distribution with (T - k - 1) degrees of freedom under the null hypothesis. Under the
alternative hypothesis the recursive residuals tend to have the same sign and so $\psi$ tends to
be large in absolute value, thus leading to rejection of the null hypothesis.
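As a small illustration, the statistic in (1) is the ordinary one-sample t-ratio of the recursive residuals, which the following sketch computes. The function name and the three-element input are made up for the example; the recursive residuals themselves are assumed to have been obtained beforehand.

```python
import math

def psi_statistic(w):
    """t-type statistic of equation (1) from recursive residuals w_{k+1}, ..., w_T."""
    m = len(w)                                        # m = T - k
    mean = sum(w) / m
    s2 = sum((wj - mean) ** 2 for wj in w) / (m - 1)  # divisor T - k - 1
    return sum(w) / math.sqrt(m * s2)                 # (T-k)^{-1/2} * sum(w) / s

# Made-up residuals with a common sign, so the statistic is large and positive.
psi = psi_statistic([1.0, 2.0, 3.0])
```

The value would be referred to a t-distribution with len(w) - 1 degrees of freedom.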
Perhaps I can now turn to some specific points on tonight's paper. The first concerns
the definition of the cusum quantity, $W_r$. It is suggested that this be obtained by using, as
a deflator of the recursive residuals, the estimator $\hat{\sigma}$, defined as the square root of
$\sum w_j^2/(T-k)$. It seems much more sensible to me, however, to estimate $\sigma$ by the square root
of $\sum (w_j - \bar{w})^2/(T-k-1)$. This does not affect the theory behind the cusum test, but it is
likely to make the procedure more effective as the cusum will tend to be larger in absolute
value under the alternative hypothesis. (Another advantage of defining the cusum in this
way is that when the last cusum quantity has been obtained it only needs to be divided
through by $(T-k)^{1/2}$ in order to yield the statistic $\psi$ of equation (1), which, as I have
already said, has a t-distribution under the null hypothesis.)
My second point concerns the type of variation in the $\beta$'s which we are interested in
detecting. The authors prefer to leave their specification of the alternative hypothesis in a
rather vague form. However, if a more concrete alternative is proposed more powerful
tests are available. For example, the test proposed by Farley and Hinich (1970) is likely
to be much more powerful than the cusum test against certain special types of structural
changes in the $\beta$'s. Now it seems to me perfectly acceptable to leave the alternative
hypothesis vague, but this leaves us with a test which may be very weak when used in the
presence of many types of structural change likely to occur in practice. Of course, I think
the authors implicitly recognize this when they say that the significance lines drawn should
be interpreted as yardsticks rather than as part of a formal test. However, if the lines
yield an ineffective test they may well be ineffective as yardsticks also.
Finally may I suggest an alternative, or rather additional, way of calculating moving
regressions, which are the subject of Section 2.5 of the paper. Suppose estimates of $\beta$
based on r observations are calculated using an exponential weighting system. In this
system the current set of observations would receive a weight of unity, the previous set
of observations would receive a weight of q, the set before a weight of $q^2$ and so on. Of
course, 0 < q < 1. Successive estimates could be computed recursively since if we define
$$Q_r = \sum_{i=1}^{r} x_i x_i'\, q^{r-i} \qquad (2)$$
we have
$$Q_r = q\,Q_{r-1} + x_r x_r' \qquad (3)$$
and so
$$Q_r^{-1} = q^{-1} Q_{r-1}^{-1} - \frac{q^{-2}\, Q_{r-1}^{-1} x_r x_r' Q_{r-1}^{-1}}{1 + q^{-1} x_r' Q_{r-1}^{-1} x_r}. \qquad (4)$$
I wonder if the authors have used this type of weighted recursion? It would be interesting
to know if it is more sensitive to changes in the $\beta$'s than the moving average method.
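The weighted recursion can be checked numerically. The sketch below assumes the weighted moment matrix $Q_r = \sum_{i \le r} x_i x_i' q^{r-i}$ and the rank-one inverse update $Q_r^{-1} = q^{-1}Q_{r-1}^{-1} - q^{-2}Q_{r-1}^{-1}x_r x_r'Q_{r-1}^{-1}/(1 + q^{-1}x_r'Q_{r-1}^{-1}x_r)$; the two-regressor setting, the value q = 0.9, the data and the function names are all choices made for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(A, v):
    return [dot(row, v) for row in A]

def inv2(A):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def weighted_moment(xs, q):
    """Q_r = sum_i x_i x_i' q^(r-i), computed directly from the definition."""
    r = len(xs)
    return [[sum(x[a] * x[b] * q ** (r - 1 - i) for i, x in enumerate(xs))
             for b in range(2)] for a in range(2)]

def update_inverse(P_prev, x, q):
    """Rank-one update: Q_r^{-1} from P_prev = Q_{r-1}^{-1} and the new regressor x."""
    Px = mat_vec(P_prev, x)                   # Q_{r-1}^{-1} x_r (P_prev is symmetric)
    denom = 1.0 + dot(x, Px) / q
    return [[P_prev[i][j] / q - Px[i] * Px[j] / (q * q * denom)
             for j in range(2)] for i in range(2)]

q = 0.9                                       # illustrative discount factor, 0 < q < 1
xs = [[1.0, float(t)] for t in range(1, 6)]   # made-up regressors: intercept and trend
P = inv2(weighted_moment(xs[:2], q))          # start from a direct inverse at r = 2
max_gap = 0.0
for r in range(3, len(xs) + 1):
    P = update_inverse(P, xs[r - 1], q)
    direct = inv2(weighted_moment(xs[:r], q))
    gap = max(abs(P[i][j] - direct[i][j]) for i in range(2) for j in range(2))
    max_gap = max(max_gap, gap)
```

A small `max_gap` indicates that the recursive and direct inverses agree; the numerical-stability reservations raised elsewhere in the discussion about such updating formulae apply to this recursion as well.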

Dr AGNES M. HERZBERG (Imperial College, London): Andrews (1972) proposed a
simple way of plotting higher dimensional data in two dimensions, i.e. if the data are
m-dimensional each point $x' = (x_1, \ldots, x_m)$ defines a function
$$f_x(t) = x_1/\sqrt{2} + x_2 \sin t + x_3 \cos t + x_4 \sin 2t + x_5 \cos 2t + \ldots,$$
the function being plotted over the range $-\pi < t < \pi$.
The basic regression model considered in equation (1) is $y_t = x_t' \beta_t + u_t$ (t = 1, ..., T).
In Section 2.5, the authors suggest a way of investigating the time-variation of $\beta_t$ by fitting
equation (1) on n successive observations and then on the next n, and so on. They suggest
that the graphs of the resulting estimates of the coefficients in $\beta_t$ against time provide
evidence of departures from constancy.
Let $\hat{\beta}_i = (\hat{\beta}_{1i}, \ldots, \hat{\beta}_{ki})'$ be the k x 1 vector of estimates of $\beta_t$ obtained from the ith set of
n observations, i.e. $\hat{\beta}_1$ is estimated from the first n observations, $\hat{\beta}_2$ is estimated from the
second observation to the (n+1)st observation, etcetera. From each $\hat{\beta}_i$ form the function
$f_{\hat{\beta}_i}(t)$ as above. Plot the resulting functions. The plots of the functions should show the
gradual change of the whole vector of coefficients from constancy by the change of the
clusters of the plots.
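A minimal sketch of evaluating such a function for a coefficient vector is given below, assuming the usual form $f(t) = c_1/\sqrt{2} + c_2 \sin t + c_3 \cos t + c_4 \sin 2t + c_5 \cos 2t + \ldots$. The function name, the grid and the three coefficient vectors are hypothetical; plotting is left to whatever graphics facility is to hand.

```python
import math

def andrews_curve(coeffs, t):
    """Evaluate c1/sqrt(2) + c2 sin t + c3 cos t + c4 sin 2t + c5 cos 2t + ... at t."""
    total = coeffs[0] / math.sqrt(2.0)
    for idx, c in enumerate(coeffs[1:], start=1):
        freq = (idx + 1) // 2                 # frequencies 1, 1, 2, 2, 3, 3, ...
        if idx % 2 == 1:
            total += c * math.sin(freq * t)   # even-numbered coefficients pair with sines
        else:
            total += c * math.cos(freq * t)   # odd-numbered coefficients pair with cosines
    return total

# Hypothetical moving-regression estimates (one vector per window of n observations).
estimates = [[1.0, 0.50, 0.20], [1.1, 0.52, 0.18], [1.6, 0.90, -0.30]]
grid = [-math.pi + 2.0 * math.pi * i / 200 for i in range(1, 200)]   # open range (-pi, pi)
curves = [[andrews_curve(b, t) for t in grid] for b in estimates]
```

Curves from windows with similar coefficient vectors should lie close together, so a drift in the coefficients would show up as a drift between clusters of curves.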

Mr M. C. HUTCHISON (Department of Health and Social Security): I would like to
congratulate the authors in providing a most interesting paper which, in the form of TIMVAR,
is especially useful to statisticians working with regressional time series.
The cusum of squares test is of the form
$$y_j = \sum_{i=1}^{j} x_i \Big/ \sum_{i=1}^{n} x_i \quad (j = 1, \ldots, n),$$
where n = T - K and the $x_i$ are distributed as $\chi^2$ variates under $H_0$. The approximate test
given in the paper essentially considers only half of the $y_j$ for even j or equivalently adds
$x_{j-1}$ to $x_j$ for even j and treats $x_{j-1} + x_j$ as a $\chi^2$ variate. Obviously this test will give more
significant results than the exact test because the $y_j$ for odd j can overstate the confidence
limit calculated on even values only while the $y_j$ for even j stay within it. The following
diagram shows this clearly.

[Diagram: plot of y_j for even j only and plot of y_j for all j, shown against the
confidence limit for (n/2) chi-squared variates; horizontal axis j, marked Even, Odd, Even.]
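For concreteness, the ratios $y_j = \sum_{i \le j} x_i / \sum_{i \le n} x_i$, and the even-indexed subsequence on which the approximate test is said to rest, can be computed as in the following sketch. The squared values used as input are invented, and the function name is hypothetical.

```python
def cusum_of_squares(x):
    """Return y_j = (x_1 + ... + x_j) / (x_1 + ... + x_n) for j = 1, ..., n."""
    total = sum(x)
    running, out = 0.0, []
    for xi in x:
        running += xi
        out.append(running / total)
    return out

# Invented squared residuals (n = 6).
x = [w * w for w in (0.5, -1.0, 1.5, -0.5, 1.0, 2.0)]
y = cusum_of_squares(x)
y_even = y[1::2]          # y_2, y_4, y_6: the values for even j only
```

Comparing `y` with `y_even` shows how an excursion at an odd j can be missed when only the even-indexed values are examined.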

An exact test for $y_j$ can be obtained. Considering the probabilities of sample paths of $y_j$
(j = 1, ..., n) crossing a linear boundary using Durbin (1971), formulae can be obtained
for calculating the confidence limits of the joint distribution of the $y_j$ for even values of n
only. A program has been written to calculate these percentage points but unfortunately
due to the lengthy summations involved in integration by parts combined with an iterative
process, the running time is prohibitive. Consequently, percentage points for probabilities
of 0.05 and 0.005 of crossing one boundary have been calculated only for even values
of n up to 34. Work is in hand to obtain percentage points for further values of n and
further probabilities. This will give a test for detecting changes in the mean of normal
variates.
My other comments concern the use of TIMVAR in practice. I have used TIMVAR in the
past on regressional time-series data and it has successfully picked a point of discontinuity
in the regression coefficients which was believed to be highly likely a priori. Since then,
other data have been brought to my notice for which TIMVAR does not seem readily
applicable. First, there is the case of data for a small number of years (say, m < k) but with many
observations within each year. If one can be certain that data within each year come from
the same equation then I can find no reason why TIMVAR should not be used, which implies
a random ordering of observations within each year to give a new "time dimension". Then
one is concerned with discontinuities at m points only which are the changes of data from
one year to the next. If one is unsure that the data within each year come from the same
equation then presumably a simple test between corresponding coefficients calculated from
regressions on data within years should be applied. Secondly, the "other dimension" need
not be time. One may want to consider points of discontinuity over an ordered variable,
say adjusted income (income minus needs) collected from a survey together with other
variables. One might wish to relate expenditure on luxury goods to other factors as
adjusted income increases. A point of discontinuity may be suspected for high adjusted
income. In this case one may not have ready access to more data than just averages
within adjusted income bands. These averages will almost surely not lie equidistant from
each other. How robust is TIMVAR to changes in the "other dimension" from a discrete to
a continuous variable? One could perhaps use TIMVAR if the deviations of the data from
discreteness were small in some sense.
I would be interested to hear the views of the authors on these points.


Professor MOHSIN S. KHAN (International Monetary Fund): I am very pleased to have
the opportunity to discuss this interesting paper, although unfortunately I was unable to
be present at its presentation. I believe that this paper will be of considerable importance
in the field of applied economics where more and more researchers are becoming interested
in regression relationships that contain parameters that vary over time. In discussing this
paper I would like to make two points about the tests of stability that have been proposed,
particularly the tests utilizing the cusums of residuals.
Since the authors caution the reader against the use of the tests of significance in any
rigid way, it may be useful to report the results from some Monte Carlo experiments that
I have conducted. In these experiments I have examined the cusums tests of the Brown-
Durbin-Evans paper, for a random coefficients model. It is interesting that the cusums
tests have reasonably high power even for sample sizes of 20 and 30 as compared to the
maximum-likelihood ratio test and a test based on the estimated values of the variances
that is due to Hildreth and Houck (1968).
My other point has to do with the question of the exclusion of autoregressive models
from consideration. Since it is precisely these models, namely ones involving the use of
lagged dependent variables as regressors, that are currently of greatest interest to
economists, it would be extremely useful to have tests of stability that would be applicable to this
class of models. In certain special cases it is possible to apply the methods contained in
the paper, for example, if one has a regression equation of the form:
$$y_t = \alpha_0 + \alpha_1 x_t^e + u_t, \qquad (1)$$
where the dependent variable, y, is related to the "expected" value of the independent
variable, x. The error term, $u_t$, is assumed to have classical properties. The expected
variable is generated by a recursive mechanism such as
$$x_t^e = \beta x_t + (1-\beta)\, x_{t-1}^e, \qquad (2)$$
where $\beta$ is the coefficient of expectations, $1 > \beta > 0$. Substituting (2) into (1) and eliminating
$x_t^e$ we obtain:
$$y_t = \alpha_0 \beta + \alpha_1 \beta x_t + (1-\beta)\, y_{t-1} + u_t - (1-\beta)\, u_{t-1}. \qquad (3)$$
Obviously the cusums method cannot be applied to equation (3) because of the appearance
of $y_{t-1}$ as a regressor and the moving-average nature of the error process. However, it is
possible to generate $x_t^e$ from (2) for various values of $\beta$ (as it is bounded) and substitute the
generated series of $x_t^e$ into equation (1). Such a procedure would fit into the framework of
the tests described in the paper although, unfortunately, it is fairly time consuming. With
other models involving lagged dependent variables even this does not appear to be possible.
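The grid procedure just described might be organised along the following lines. This is a sketch under assumptions: the expectations recursion is taken as $x_t^e = \beta x_t + (1-\beta)x_{t-1}^e$, the starting value for the expected series is not given in the text and is here set to the first observation unless supplied, the grid of $\beta$ values and the data are invented, and the function name is hypothetical.

```python
def expected_series(x, beta, x0=None):
    """Generate x^e_t = beta * x_t + (1 - beta) * x^e_{t-1} for a given beta.

    The starting value x^e_0 is an assumption: the first observation by default.
    """
    prev = x[0] if x0 is None else x0
    out = []
    for xt in x:
        prev = beta * xt + (1.0 - beta) * prev
        out.append(prev)
    return out

# Invented regressor series and a coarse grid over the bounded interval 0 < beta < 1.
x = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]
grid = [i / 10 for i in range(1, 10)]
candidates = {beta: expected_series(x, beta) for beta in grid}
```

Each generated series in `candidates` would then serve as the single regressor in the static equation relating $y_t$ to $x_t^e$, to which the cusum procedures apply; repeating the tests over the whole grid is what makes the approach time consuming.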
As I said earlier, the cusums tests of Brown et al. have considerable potential use in
economics. In addition to the applications described by the authors, the techniques have
been used to evaluate the stability of Phillips' curve relationships in the U.K. and import
functions for the U.S.

J. A. NELDER (Rothamsted Experimental Station): The updating procedure given by
relation (3), and more particularly its analogue for deleting a point, is unstable numerically.
Chambers (1971) describes better methods based on updating the components of, for
example, a QR decomposition of the data matrix.
I would like to make a small protest about the misuse of the word "recursive" in the
paper. The authors are following what appears to be a well-established practice in talking
about "recursive residuals" et cetera, but nonetheless the underlying procedures are in no
sense recursive. A recursive procedure is one that invokes itself in the course of execution;
the procedures in the paper are sequential or updating procedures in which new units are
added to (or dropped from) an existing fit. This type of algorithm also needs to be
distinguished from an iterative procedure, which is applied sequentially, but to the same
set of data. "Sequential" seems as good an adjective as any.


RICHARD E. QUANDT (Princeton University, N.J.): Parameter variation in
regression models has recently been considered in a variety of contexts (see Annals of
Economic and Social Measurement, 1973, entire issue) and econometric model builders
have increasingly been willing to entertain the notion that regression coefficients may have
different values corresponding to unknown partitions of the sample (see Davis et al., 1966;
Fair and Jaffee, 1972). In such cases it is important to be able to test the hypothesis that no
shift in parameter values has taken place. There are basically two types of mechanisms
that may be thought to be responsible for shifts: (a) deterministic mechanisms according to
which the parameter vector $\beta$ has the value $\beta_1$ if some specified function $\phi(z, \pi) < 0$, where
$z$ is a vector of some observable exogenous variables and $\pi$ a vector of unknown parameters,
and $\beta$ has value $\beta_2$ if $\phi(z, \pi) > 0$; (b) stochastic mechanisms among which we
include random coefficients regression models as well as mixture models according to
which, for each observation, $\beta = \beta_1$ with probability $\lambda$ and $\beta_2$ with probability $1 - \lambda$. Brown,
Durbin and Evans address themselves to the first case with the further specialization that
the function $\phi(z, \pi)$ is of the form $t + \pi$ where the only exogenous variable responsible for
the shift is the time index $t$. Their procedures involving recursive residuals are particularly
appealing because of (a) the particular suitability of the recursive residuals, as contrasted
with, say, Theil's BLUS residual, for the test at hand; (b) the ease of computation of the
tests; (c) the fact that the tests serve the combined purpose of testing hypotheses and
performing data analysis in the Tukey sense; and (d) their adaptability to cases where the
shift in parameter values occurs according to the values of some extraneous variable other
than $t$. If, for example, it were posited that $\beta = \beta_1$ for values of a variable $Z_t < z_0$ and
$\beta = \beta_2$ otherwise, where $Z_t$ is observable and $z_0$ unknown, all that would be necessary for
the Brown, Durbin and Evans procedures would be to sort the observations according to
the values of $Z_t$ and then apply the tests. It is particularly laudable that their several
procedures as well as the log-likelihood ratio technique are operational in the TIMVAR
program.
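
The sort-then-test idea can be sketched in Python. This is a minimal sketch and not the TIMVAR program: to stay short it assumes the simplest model, a constant mean $y_t = \beta + u_t$ (so $k = 1$), for which the recursive residuals reduce to the Helmert form $w_r = (y_r - \bar{y}_{r-1})\sqrt{(r-1)/r}$. The function names and data are hypothetical.

```python
import math


def order_by(z, y):
    """Return the observations y reordered by increasing value of z."""
    return [yi for _, yi in sorted(zip(z, y), key=lambda pair: pair[0])]


def recursive_residuals_mean(y):
    """Recursive residuals w_2, ..., w_T for the constant-mean model (k = 1)."""
    w, total = [], y[0]
    for r in range(2, len(y) + 1):
        mean_prev = total / (r - 1)
        w.append((y[r - 1] - mean_prev) * math.sqrt((r - 1) / r))
        total += y[r - 1]
    return w


def cusum(w):
    """Cumulative sums of w divided by the estimated standard deviation."""
    sigma = math.sqrt(sum(v * v for v in w) / len(w))  # len(w) = T - k
    path, running = [], 0.0
    for v in w:
        running += v
        path.append(running / sigma)
    return path


# Hypothetical data in which the mean is near 0 for z < 3.5 and near 10 above it.
z = [5, 1, 4, 2, 6, 3]
y = [10.2, 0.1, 9.9, -0.2, 10.1, 0.0]
ordered = order_by(z, y)
path = cusum(recursive_residuals_mean(ordered))
```

After sorting, the low-$Z_t$ observations come first, so a shift at the unknown $z_0$ appears as a drift in the cusum path exactly as a shift in time would.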

Several questions and problems remain, however. Some of these are as follows.
(1) What are the asymptotic properties of the various procedures? One of the questions
here is the manner in which one proceeds to the limit. In a recent paper, Farley et al.
(1973) suggest holding constant the period spanned by the observations and letting the
intervals between observations converge to zero; only by such a device can one guarantee
that the fraction of observations belonging to the two regression regimes remains unaltered
as we pass to the limit. On this basis they indicate that both cusum tests have undesirable
asymptotic properties, such as the power of the cusum of squares test not converging to 1
as $n$ goes to infinity.
(2) What are the powers of the various procedures in finite samples?
Some preliminary Monte Carlo experiments by Farley et al. have compared the power of
their own procedure with that of the Chow test (Chow, 1960) performed on the assumption
that the shift occurs at the midpoint of the data series and the log-likelihood ratio test
using empirically derived critical
values. Their own procedure involves estimating $y = X\beta + \varepsilon$
and $y = X\beta + H\delta + \varepsilon$, where $H$ has $(t,j)$th element $t\,x_{tj}$, computing the sums of squared
residuals $S_0$ and $S_1$ from the two regressions and rejecting $H_0\colon \delta = 0$ if $(S_0 - S_1)/S_1$ is significantly
different from zero. Among the three procedures compared they find that there is no
test most powerful uniformly in the value of the true shift point. Unpublished results by
Goldfeld and Quandt suggest that the cusum test performs well in samples of size 30-60
observations and in some cases has power equal to 100 per cent. It seems possible to gain
additional power by performing the test both "forward" and "backward" on the data
series, although we have not determined how to use the thus gained information
"rigorously". It would clearly be desirable to have more systematic information about the
several procedures proposed by the authors in contrast with the procedures proposed by
others.
(3) What are sensible procedures if the error terms are serially correlated? It seems
fairly certain that the tests mentioned heretofore do not remain acceptable in the presence
of serial correlation, yet it is precisely in time series models that serial correlation is most
likely to occur. It would be interesting to see to what kind of modification the authors'
methods would have to be subjected in order to cope with this problem. An approximate
maximum-likelihood procedure for estimating the parameters of the two regression regimes
which can then be used in likelihood ratio tests is as follows (see Goldfeld and Quandt,
1974).
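Let
$$y_t = x_t'\beta_1 + u_{1t} \quad \text{if } t < t_0,$$
$$y_t = x_t'\beta_2 + u_{2t} \quad \text{if } t \ge t_0,$$
with $\beta_1$, $\beta_2$, $t_0$ and the error variances unknown.
Define $D_t = 0$ if $t < t_0$ and $D_t = 1$ otherwise, and posit that the error terms are
generated by
$$u_{1t} = \rho_1\{(1 - D_{t-1})\,u_{1,t-1} + D_{t-1}\,u_{2,t-1}\} + \varepsilon_{1t},$$
$$u_{2t} = \rho_2\{(1 - D_{t-1})\,u_{1,t-1} + D_{t-1}\,u_{2,t-1}\} + \varepsilon_{2t},$$
where $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are jointly normal and
$$E(\varepsilon_{1t}) = E(\varepsilon_{2t}) = E(\varepsilon_{1t}\varepsilon_{1,t-1}) = E(\varepsilon_{2t}\varepsilon_{2,t-1}) = 0,$$
$$E(\varepsilon_{1t}^2) = \sigma_1^2, \quad E(\varepsilon_{2t}^2) = \sigma_2^2, \quad E(\varepsilon_{1t}\varepsilon_{2t}) = \sigma_{12}.$$
The two regression regimes may be combined in obvious fashion to yield
$$\begin{aligned}
y_t = {} & (1 - D_t)\,[x_t'\beta_1 + \rho_1\{(1 - D_{t-1})(y_{t-1} - x_{t-1}'\beta_1) + D_{t-1}(y_{t-1} - x_{t-1}'\beta_2)\}] \\
& + D_t\,[x_t'\beta_2 + \rho_2\{(1 - D_{t-1})(y_{t-1} - x_{t-1}'\beta_1) + D_{t-1}(y_{t-1} - x_{t-1}'\beta_2)\}] \\
& + (1 - D_t)\,\varepsilon_{1t} + D_t\,\varepsilon_{2t}.
\end{aligned}$$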
From this the likelihood function (conditional on $y_0$) can be derived and is a function of all
the parameters including the discontinuous $D_t$. It is possible to replace $D_t$ in the likelihood
function with a continuous approximation with the correct qualitative properties such as
$$D_t = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(\xi - t_0)^2}{2\sigma^2}\right\} d\xi;$$
the likelihood function then becomes a function of two new parameters $t_0$ and $\sigma$ but all the
parameters can now be estimated by fairly routine numerical optimization techniques.
Preliminary indications seem to be that if we form the likelihood ratio $\lambda$ by dividing the
maximum of the likelihood function under $H_0$ by the maximum of the likelihood function
suggested above, the quantity $-2\log\lambda$ has approximately a $\chi^2$ distribution with appropriate
degrees of freedom as suggested by asymptotic theory even in moderate-sized samples.
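
The effect of the continuous approximation can be illustrated with a short Python sketch. It assumes that the approximation is the normal distribution function with location $t_0$ and scale $\sigma$, evaluated through the error function; the function names are hypothetical.

```python
import math


def smooth_indicator(t, t0, sigma):
    """Normal-cdf approximation to D_t: near 0 well before t0, near 1 well after."""
    return 0.5 * (1.0 + math.erf((t - t0) / (sigma * math.sqrt(2.0))))


def step_indicator(t, t0):
    """The discontinuous D_t: 0 if t < t0 and 1 otherwise."""
    return 0.0 if t < t0 else 1.0


# With a small sigma the smooth version is close to the step at every integer t.
values = [smooth_indicator(t, t0=10.5, sigma=0.1) for t in range(1, 21)]
```

Because `smooth_indicator` is differentiable in both `t0` and `sigma`, a likelihood built from it can be handed to a general-purpose optimizer, which is not possible with the step function.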

The preceding comments and suggestions are merely intended to give some indications
of directions in which future research might go. It is clear that the tests proposed in the
paper are already useful since several researchers have been using them and indeed the
cusum test has been programmed by several econometricians.
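
Returning to question (2) above, the two-regression comparison attributed to Farley et al. can be sketched in Python. This is a minimal sketch under stated assumptions: the statistic is taken to be $(S_0 - S_1)/S_1$, the time index in $H$ runs $t = 1, \ldots, n$, and the normal equations are solved by plain Gaussian elimination for brevity, which is not a numerically careful choice. No critical values are given, and the function names and data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def rss(X, y):
    """Residual sum of squares from least squares of y on the columns of X."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    b = solve(XtX, Xty)
    return sum((yi - sum(bi * xi for bi, xi in zip(b, r))) ** 2 for r, yi in zip(X, y))


def shift_statistic(X, y):
    """(S0 - S1) / S1, where S1 comes from the regression augmented by H."""
    # H has (t, j)th element t * x_tj, so each row of X is extended by t times itself.
    XH = [list(r) + [t * v for v in r] for t, r in enumerate(X, start=1)]
    s0, s1 = rss(X, y), rss(XH, y)
    return (s0 - s1) / s1


# Hypothetical data: a stable relation and one whose slope drifts with t.
x = [2.0, 5.0, 1.0, 4.0, 3.0, 6.0, 2.0, 5.0, 1.0, 4.0]
e = [0.10, -0.07, 0.03, 0.08, -0.11, 0.02, -0.05, 0.09, -0.04, -0.06]
X = [[1.0, v] for v in x]
y_stable = [1.0 + 2.0 * v + d for v, d in zip(x, e)]
y_shift = [1.0 + (2.0 + 0.3 * t) * v + d for t, (v, d) in enumerate(zip(x, e), start=1)]
stat_stable = shift_statistic(X, y_stable)
stat_shift = shift_statistic(X, y_shift)
```

The statistic is much larger for the drifting series than for the stable one, which is the behaviour the test relies on.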

Dr T. SUBBA RAO (University of Manchester Institute of Science and Technology): The
problem considered by the authors is interesting, and a similar problem has been considered
earlier by Subba Rao and Tong (1972, 1973) and Bohlin (1971). By first formulating the
problem in the context of a control system, I will show how the problem considered by the
authors can be deduced as a particular case of the one considered by Subba Rao and Tong
(1972, 1973).

Let us assume we have a time-dependent system with $k$ inputs $\{x_{rt}, r = 1, 2, \ldots, k\}$ and
a single output $\{y_t\}$ contaminated by the noise $\{u_t\}$. The system can be described
schematically as follows:

[Schematic diagram: the inputs $x_{1t}, x_{2t}, \ldots, x_{kt}$ enter a time-dependent system, and the noise $u_t$ is added to its output to give $\{y_t\}$.]
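Further let us assume that the stochastic processes $\{x_{rt}, r = 1, 2, \ldots, k\}$, $\{u_t\}$, $\{y_t\}$ are all
oscillatory processes (Priestley, 1965), with zero mean and spectral representations
$$x_{rt} = \int_{-\pi}^{\pi} e^{it\omega} A_{t,r}(\omega)\, dZ_{x,r}(\omega) \quad (r = 1, 2, \ldots, k), \tag{1}$$
$$u_t = \int_{-\pi}^{\pi} e^{it\omega} A_{t,u}(\omega)\, dZ_u(\omega)$$
and
$$y_t = \int_{-\pi}^{\pi} e^{it\omega} A_{t,y}(\omega)\, dZ_y(\omega),$$
where $\{dZ_{x,r}(\omega)\}$, $\{dZ_u(\omega)\}$ and $\{dZ_y(\omega)\}$ are all orthonormal processes. Let the system be
described by the linear relationship
$$y_t = \sum_{j=1}^{k} \sum_{l=0}^{\infty} h_{j,t}(l)\, x_{j,t-l} + u_t, \tag{2}$$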
where it is assumed that $\{u_t\}$ and $\{x_{j,s}\}$ are independent. We note that by choosing the
impulse response functions
$$h_{j,t}(l) = h_{j,t}\,\Delta(l) \quad (j = 1, 2, \ldots, k), \tag{3}$$
where $\Delta(l) = 1$ if $l = 0$ and $\Delta(l) = 0$ otherwise, the model (1) considered by the authors can be obtained.
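By substituting the spectral representations in (2), we can show that
$$dZ_{t,y}(\omega) = dZ_{t,x}'(\omega)\, H_t(\omega) + dZ_{t,u}(\omega), \tag{4}$$
where
$$dZ_{t,y}(\omega) = A_{t,y}(\omega)\, dZ_y(\omega), \qquad dZ_{t,u}(\omega) = A_{t,u}(\omega)\, dZ_u(\omega),$$
$$dZ_{t,x}'(\omega) = \{A_{t,1}(\omega)\, dZ_{x,1}(\omega), \ldots, A_{t,k}(\omega)\, dZ_{x,k}(\omega)\},$$
$$H_t'(\omega) = \{H_{1,t}(\omega), H_{2,t}(\omega), \ldots, H_{k,t}(\omega)\},$$
$$H_{j,t}(\omega) = \sum_{l=0}^{\infty} h_{j,t}(l)\, e^{-i\omega l} \quad (j = 1, 2, \ldots, k).$$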
For the type of impulse response functions given by (3),
$$H_{j,t}(\omega) = h_{j,t} \quad (j = 1, 2, \ldots, k). \tag{5}$$
Multiplying both sides of equation (4) by $dZ_{t,x}^{*}(\omega)$ and taking expectations, we get
$$F_{t,x}(\omega)\, H_t(\omega) = F_{t,y}(\omega),$$
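whence
$$H_t(\omega) = F_{t,x}^{-1}(\omega)\, F_{t,y}(\omega), \tag{6}$$
where
$$F_{t,x}(\omega) = E\{dZ_{t,x}^{*}(\omega)\, dZ_{t,x}'(\omega)\},$$
$$F_{t,y}(\omega) = E\{dZ_{t,x}^{*}(\omega)\, dZ_{t,y}(\omega)\}.$$
For the type of impulse response functions given by (3) and (5), we have the estimate of
$H_t(\omega)$,
$$\hat{H}_t = \hat{F}_{t,x}^{-1}(\omega)\, \hat{F}_{t,y}(\omega). \tag{7}$$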
We note $H_t$ (or equivalently $\beta_t$) is a gain vector and hence testing the constancy of the
coefficient vector $H_t$ is equivalent to testing the constancy of gain vector, and this problem
has been considered by Subba Rao and Tong (1972, 1973). Briefly this approach is as
follows: Estimate $\beta_t$ at several frequencies, covering the whole frequency range $(-\pi, \pi)$.
Since $\hat{\beta}_t$ is approximately a multivariate normal, we can perform single-factor multivariate
analysis of variance test for the hypotheses
$$H_{t_1} = H_{t_2} = \ldots = H_{t_T}$$
on the lines suggested by Subba Rao and Tong (1972, 1973). From the spectral
representation of $\{u_t\}$, we have
$$\operatorname{var} u_t = \sigma_t^2 = \int_{-\pi}^{\pi} f_{t,u}(\omega)\, d\omega.$$
Testing the constancy of $f_{t,u}(\omega)$ for all $\omega$ is equivalent to testing the constancy of $\sigma_t^2$. This
can be performed following the two-factor analysis of variance technique suggested by
Priestley and Subba Rao (1969).

Bohlin (1971) considered the following time domain approach. Consider the time
series $\{y_t\}$ generated from the model
$$y_t + a_1(t)\, y_{t-1} + \ldots + a_n(t)\, y_{t-n} = \lambda e_0(t) + k(t), \tag{8}$$
where
$$a_i(t) = a_i(t-1) + q_i e_i(t) \quad (i = 1, 2, \ldots, n),$$
$$k(t) = q_{n+1} e_{n+1}(t),$$
where $\{e_i(t)\}$ is a sequence of i.i.d. random variables $N(0, 1)$. Bohlin (1971, equation 5)
derived the updating equations for the parameter vector $\theta'(t) = \{a_1(t), a_2(t), \ldots, a_n(t)\}$, $k(t)$
based on the sample $(y_t, y_{t-1}, \ldots)$. Assuming $q_1 = q_2 = \ldots = q_n = q$ he obtained the
maximum-likelihood estimate of $q$ and tested the null-hypothesis $q = 0$. If the null-hypothesis is
accepted it implies that the coefficient vector $\theta(t)$ is time invariant.

By choosing $\lambda = 0$, $-y_{t-1} = x_{1t}$, $-y_{t-2} = x_{2t}$, ... et cetera we can reduce the model (8)
to the one considered by the authors, and hence the problem has been solved by Bohlin
(1971) in a more general set-up.

Dr H. TONG (University of Manchester Institute of Science and Technology): The
study of the dependency of regression relationships on time is, of course, a very important
one and the authors are to be congratulated for bringing forward a very timely paper.

Although the authors have confined themselves to the case of non-stochastic regressors,
many of the problems studied and some of the results obtained have their counterparts in
the case of stochastic regressors, as could arise in, for example, a time-dependent system
with a stochastic input subject to an additive stochastic noise disturbance. This type of
problem has been systematically studied. See, for example, Subba Rao and Tong (1972,
1973), Priestley and Tong (1973) and Tong (1974).

If we pose the problem in the above general form, with stochastic regressors, then tests
for constancy of relationships over time have been proposed by Subba Rao and myself
(1972, 1973). I am glad to report that these tests have been successfully applied to real data.
(See Subba Rao and Tong, 1973, 1974. The latter paper is currently available in the form
of a technical report issued from the Department of Mathematics, UMIST.)

Coming back to the problem considered in the paper, I would make the following
comments.
(i) In practice, it would seem that restricting the regression model (1) to a fixed number,
$k$, of regressors over all time is sometimes unrealistic. Just as in the case of stationary
autoregressive model building, the determination of the number of regressors is an
important problem. In the former case, Dr H. Akaike of the Institute of Statistical
Mathematics, Japan, has recently obtained important results and practical procedures
(see, for example, Akaike, 1973).
(ii) The idea of using $w_r$ ($r = k+1, \ldots, T$) to study the adequacy of the proposed model
has its counterparts in the case of stochastic regressors. For example, Mehra and Peschon
(1971) have summarized the experiences of control engineers in "Fault detection and
diagnosis in dynamic systems" and their approach is based on what is commonly known
as the "innovation sequence" which, in the control literature, is defined to be $y_r - \hat{y}_{r|r-1}$,
$\hat{y}_{r|r-1}$ being the minimum linear mean-square predictor of $y_r$ in terms of $y_{r-1}, y_{r-2}, \ldots$.
(iii) In designing moving regressions, it would seem that the choice of the length $n$ is
quite important. Have the authors any systematic procedure for its selection? If so, is it
in any way related to something like the "maximum width over which the short segment
may be regarded to follow one and the same regression relationship"?
(iv) I would just mention that a technique similar to that based on Quandt's
log-likelihood ratio has recently been proposed by Ozaki and me in a paper to be presented at
the Eighth Hawaii International Conference in Systems Science, 1975, for the detection
of abrupt changes over time in auto-regressive relationships. An alternative approach to
this problem may be formulated in a Bayesian framework. Recently, some control
engineers in Russia have studied this problem by the Bayesian method. See, for example,
Telksnys (1973) and the references quoted there.
(v) In the case of stochastic regressors, Jones and Brelsford (1967) have considered
Fourier series expansion of regression coefficients, using an approach due to Gladyshev
(1961). Here, the problem of determining an optimal number of Fourier terms is important,
and a Ph.D. student at Manchester (Mrs M. Green) has recently studied this problem.
This approach is somewhat similar to the polynomial parameterization suggested in
Section 2.6 of the paper, but must be applied with caution if prediction is the objective.

Dr W. G. GILCHRIST (Sheffield Polytechnic): Though every facet of daily life proclaims
that the world is non-stationary, the literature of regression tends to ignore it. The methods
and plots proposed by the authors provide useful tools that will help those who wish to
check their assumptions. I would be interested to know what proportion of the data the
authors have looked at satisfies the assumption of constant $\beta$.

The methods used by the authors apply least squares to either all the data up to a time
$r$ or to a moving sequence of $n$ observations up to that time. Thus the estimates of $\beta$ treat
all the data used as being equally important. The deviations from the fitted model, as
represented by the recursive residuals, are used to indicate possible changes in the $\beta$'s. An
alternative approach is to seek to find "local" estimates of the values of $\beta_t$. This can be
done, for example, by using discounted least squares, e.g. Gilchrist (1967). Moving
forward through time the estimates that minimize the discounted least-squares criteria,
$$\sum_{t=1}^{r} a^{r-t} u_t^2,$$
are given by
$$b_r = b_{r-1} + P_r x_r (y_r - x_r' b_{r-1}),$$
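where
$$P_r = [X_r' \Lambda_r X_r]^{-1}, \qquad \Lambda_r = \operatorname{diag}(a^{r-1}, \ldots, a, 1),$$
and $P_r$ may be updated by
$$P_r = \frac{P_{r-1}}{a} - \frac{P_{r-1} x_r x_r' P_{r-1}}{a(a + x_r' P_{r-1} x_r)}.$$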
These correspond to equations (4) and (3) in the paper. The above estimate of $\beta_r$ puts the
greatest emphasis on data close to time $r$. The value of this approach is that we can plot
components of $b_r$ against time and see how the regression coefficients actually vary over
time. Clearly the ordinary least-squares estimate is a special case of this when $a = 1$.
Where we wish to look at past values of $b_r$, a criterion that discounts $u_{r+\tau}^2$ by a factor $a^{|\tau|}$
($\tau = \ldots, -1, 0, 1, \ldots$) provides the equivalent to the moving regression.
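
The discounted recursion can be sketched in Python as follows. This is a minimal sketch, not the discussant's program: it assumes the update $P_r = \{P_{r-1} - P_{r-1} x_r x_r' P_{r-1}/(a + x_r' P_{r-1} x_r)\}/a$ together with $b_r = b_{r-1} + P_r x_r (y_r - x_r' b_{r-1})$, and for convenience it starts from $b_0 = 0$ and a large diagonal $P_0$, a start-up device that is an assumption of this sketch. The function name and data are hypothetical.

```python
def discounted_rls(rows, ys, a=0.9, p0=1e6):
    """Discounted least squares by recursion; returns the path of estimates b_r."""
    k = len(rows[0])
    b = [0.0] * k
    # Assumed start-up: a large diagonal P_0, standing in for no prior information.
    P = [[p0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    path = []
    for x, y in zip(rows, ys):
        Px = [sum(P[i][j] * x[j] for j in range(k)) for i in range(k)]
        denom = a + sum(x[i] * Px[i] for i in range(k))
        # P_r = {P_{r-1} - P_{r-1} x x' P_{r-1} / (a + x' P_{r-1} x)} / a  (P is symmetric)
        P = [[(P[i][j] - Px[i] * Px[j] / denom) / a for j in range(k)] for i in range(k)]
        err = y - sum(x[i] * b[i] for i in range(k))      # y_r - x_r' b_{r-1}
        gain = [sum(P[i][j] * x[j] for j in range(k)) for i in range(k)]  # P_r x_r
        b = [b[i] + gain[i] * err for i in range(k)]
        path.append(list(b))
    return path


# Hypothetical data lying exactly on y = 1 + 2x, with regressors (1, x).
rows = [[1.0, float(t)] for t in range(10)]
ys = [1.0 + 2.0 * r[1] for r in rows]
path = discounted_rls(rows, ys, a=0.8)
final_b = path[-1]
```

Plotting the components of `path` against time gives the picture of locally estimated coefficients described above; setting `a=1` gives ordinary recursive least squares.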

An implication of the possible variation in coefficients, that has been explored by
Singleton (1971), is that, in selecting variables for regression, the best variables to select
in one locality in time may not be the same as those at a different locality in time.

Professor DURBIN and Mr EVANS replied briefly at the meeting and subsequently in
writing as follows:

We agree with Professors Cox and Quandt that it would be useful to compare the
powers achieved by a variety of tests, including those we have suggested, against alternatives
of interest. Professor Cox has made a useful start but takes the cusum of a fixed number
of terms. Where, as in our case, the number of terms varies the mathematics becomes
intractable, though of course one could use simulation.

As Mr Fisk indicates, there are many ways of transforming the least-squares residuals
into an orthogonal set. This raises the interesting question: which of them is best for a
given test statistic and a given alternative hypothesis?

Sir Maurice Kendall, Mr Clarke and Dr Nelder point to the possibility of undesirable
build-up of rounding errors. In order to guard against this, all the recursive calculations
in our program are performed in double precision and, if the model contains a constant,
are carried out on deviations from means. In addition, various checks can be made on
the output. First, the final estimates of the regression coefficients after all the recursive
calculations have been performed may be compared with those obtained in one step from
the entire data set. Secondly, the final cusum of squares may be compared with the
theoretical value of unity. Thirdly, in the case of the moving regressions the coefficients
obtained from the final segment of the data after all the recursions have been performed
may be compared with the corresponding estimates derived from the backward recursive
regression. These checks have been carried out as a matter of routine and have rarely
shown any sizeable discrepancies. For the three examples in the paper the discrepancies
were negligible.
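
The first two checks can be illustrated with a short Python sketch. It is not the TIMVAR code: it assumes a simple regression on a constant and one regressor, starts the recursion from an exact fit to the first two observations, and uses the standard updating formulae for the coefficients and for $(X'X)^{-1}$. The function names and data are hypothetical.

```python
import math


def recursive_fit(xs, ys):
    """Recursive least squares for y = b0 + b1 * x; returns final b and residuals w."""
    # Exact fit through the first k = 2 observations (requires xs[0] != xs[1]).
    x0, x1, y0, y1 = xs[0], xs[1], ys[0], ys[1]
    slope = (y1 - y0) / (x1 - x0)
    b = [y0 - slope * x0, slope]
    # P = (X'X)^{-1} for the two rows (1, x0) and (1, x1).
    sx, sxx = x0 + x1, x0 * x0 + x1 * x1
    det = 2.0 * sxx - sx * sx
    P = [[sxx / det, -sx / det], [-sx / det, 2.0 / det]]
    w = []
    for x, y in zip(xs[2:], ys[2:]):
        row = (1.0, x)
        Px = [P[i][0] * row[0] + P[i][1] * row[1] for i in range(2)]
        f = 1.0 + row[0] * Px[0] + row[1] * Px[1]
        err = y - (b[0] + b[1] * x)          # one-step prediction error
        w.append(err / math.sqrt(f))         # recursive residual
        b = [b[i] + Px[i] * err / f for i in range(2)]
        P = [[P[i][j] - Px[i] * Px[j] / f for j in range(2)] for i in range(2)]
    return b, w


def one_step_fit(xs, ys):
    """Least squares on the entire data set in one step; returns b and the RSS."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return [b0, b1], sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.9, 5.2, 6.8, 9.1, 11.3]
b_rec, w = recursive_fit(xs, ys)
b_ols, rss = one_step_fit(xs, ys)
# Final cusum of squares, using the one-step RSS as the denominator.
cusum_sq_final = sum(v * v for v in w) / rss
```

A material difference between `b_rec` and `b_ols`, or a final value of `cusum_sq_final` away from unity, would point to accumulated rounding error.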

Sir Maurice Kendall's question about how to distinguish between changes in regression
coefficients and changes in residual variance lends support to the emphasis we have placed
on the examination of the data from several different standpoints. The information on
variance change obtained as part of the output of the moving regression technique
discussed in Section 2.5 has to be balanced against the information on coefficient changes
provided by the other tests. The result of this examination will often indicate fairly clearly
which is the more likely explanation. This was the case with our Example 1 as we stated
in the paper, notwithstanding Mr Phillips's remark about this example.

Sir Maurice's suggestions on extensions are well worth following up. As regards
autoregressive series, Dr Young has referred to his use of recursive residuals for dynamic
models and Dr Khan has pointed to the need for treatment of the general model including
both exogenous variables and lagged dependent variables. At present we do not know
which of our tests, if any, are valid even asymptotically for models containing lagged
dependent variables. That the situation requires care is clear from the study of similar
problems for ordinary least-squares residuals (Durbin, 1970). Models containing errors
in variables might be amenable to the Kalman technique.

Mr Fisk, Sir Maurice Kendall, Mr Phillips, Mr Harvey and Professor Quandt all point
out that the observations can be ordered by criteria other than time. This extends the
domain of application of the techniques to other tests of model specification. One could
go further and transform the data first and then order by some appropriate quantity. For
example, one could follow Duncan and Jones (1966) and transform first to the frequency
domain and then use TIMVAR to investigate the stability of the regression relationships
with respect to frequency.

We recognize the merit of the work on time-varying techniques of time-series analysis
developed by Professor Priestley, Dr Subba Rao and Dr Tong and are grateful for the
references to it. For the problem mentioned by Dr Tong of choosing the length $n$ of the
moving regression our approach is purely pragmatic. We plot the mean-square
one-step-ahead prediction error, derived as indicated in Section 2.5, as a function of $n$ and choose
the largest $n$ for which this value has attained its effective minimum. This technique has
been used to determine the length of base over which to average in seasonal adjustment
work (Durbin and Murphy, 1975).
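
The selection rule can be illustrated with a deliberately simplified Python sketch. It is not the Section 2.5 calculation: the moving regression is replaced by a moving mean, so that each observation is predicted by the average of the previous $n$ values, and "effective minimum" is interpreted, as an assumption, as lying within a tolerance `tol` of the smallest mean-square error. The function names are hypothetical.

```python
def mean_sq_prediction_error(y, n):
    """Mean-square one-step-ahead error when y_t is predicted by the mean of the previous n values."""
    errors = [(y[t] - sum(y[t - n:t]) / n) ** 2 for t in range(n, len(y))]
    return sum(errors) / len(errors)


def choose_length(y, candidates, tol=0.05):
    """Largest n whose error is within a factor (1 + tol) of the smallest error."""
    mse = {n: mean_sq_prediction_error(y, n) for n in candidates}
    smallest = min(mse.values())
    chosen = max(n for n in candidates if mse[n] <= (1.0 + tol) * smallest)
    return chosen, mse


# Hypothetical series: a constant level favours long windows, a trend short ones.
n_flat, _ = choose_length([1.0] * 20, [2, 4, 8])
n_trend, mse_trend = choose_length([float(t) for t in range(20)], [2, 4, 8])
```

For the flat series every window predicts perfectly, so the largest length is chosen; for the trending series the error grows with $n$ and the shortest length is chosen.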

As regards Professor Priestley's comments on the control theory literature and related
remarks by Dr Young, we of course agree that it is important that communications
between statisticians and control engineers working on related problems should be kept
open. Both contributors have done valuable work in this respect. At the same time we do
not think that statisticians are quite as parochial as their remarks might be understood to
suggest. The original Kalman (1960) paper was an outstanding achievement which is
surely well known to all time-series specialists. Papers relating Kalman's work to
statistical problems were published in British statistical journals by Jones (1966) and Walker
and Duncan (1967). This Society held an Ordinary Meeting devoted to control theory in
1969 at which papers were read by Wishart (1969), Whittle (1969) and Bather (1969). The
principal speaker at the Society's Conference at Nottingham in 1972 was Professor
K. J. Åström who is a leading control theorist. These are just a few examples of the
influence of control theory on British statistics.

Having included Kalman's equations in lecture courses we were well aware of the
relation between them and our relations (2)-(5) and should probably have referred to this.
The reason we did not was that our relations owe nothing to Kalman historically. The
definition (2) of recursive residuals was used by us in the form of a generalization of the
Helmert transformation in lectures in the mid-1950's and is surely "well known" and much
older. The remaining relations come from the papers of Plackett and Bartlett referred to.

Dr Young recommends greater emphasis on the recursive estimates of the parameters.
These are in fact produced by the program and study of the resulting plots along the lines
he suggests has been found useful in practice. Mention of this and other aspects of the
work was omitted only in order to try to keep the paper short and simple. We certainly
agree with him that if time-variation is found, it is important to investigate the physical
nature of the system in order to seek transformations of the data which will yield a system
which is time-invariant.

Mr Phillips questions the robustness of the tests against non-normality. We have not
investigated this but would conjecture that the tests would be sufficiently robust for most
practical work. As regards the effects of serial correlation, raised also by Dr Smith and
Professor Quandt, these are likely to be substantial. It would be worth investigating
whether simple correction factors along the lines suggested by Professor Cox can be
developed.

We thank Professor Anderson for confirming our assertion that at the usual significance
levels the probability that a sample path crosses both lines is negligible. In fact the results
in his (1960) paper could be used to produce a complete set of percentage points for $a$.

Dr Smith's suggestion of using a V-mask is a very good one which can certainly be
expected to lead to a gain in power against some types of alternative. His development
of related Bayesian techniques is to be encouraged, though we believe that our own
approach, in which one looks at the data from a variety of standpoints and uses different
test statistics to measure different types of departure from the null-hypothesis, is more
informative when there is no specific alternative in mind at the outset.

We agree with Professor Ehrenberg that in particular practical situations a model
based on stochastic regressors might well be appropriate. But our approach in such a
situation would be simply to perform a conditional analysis given the observed values of
the regressors. This would immediately reduce to the model considered in the paper.
Leaving aside sampling fluctuations, we cannot follow his argument that when the
regressors are stochastic two half-samples would necessarily give different regression lines.

Mr Harvey makes a useful point about the choice of the estimator of $\sigma$ and we agree
that the use of his estimator should lead to an increase of power. As regards the use of
exponential weights, suggested by him and also by Dr Gilchrist, we considered this
possibility and obtained the relevant formulae at an early stage of the work, but gave up
the idea and we have not, in fact, ever used the formulae in practice.

Dr Herzberg makes an imaginative suggestion but we have not been able to put it to
the test and find it hard to evaluate the merit of the idea in the abstract. This is one of
those cases where the proof of the pudding will be in the eating.

We wish to encourage Mr Hutchison to produce his table which will be a useful
contribution. While the use of TIMVAR after randomization within years as he suggests
will give a test which is valid in the sense of giving the right rejection probabilities on the
null-hypothesis, we feel that power would inevitably be lost relative to the corresponding
test based only on year-to-year changes.

We have some sympathy with Dr Nelder's remarks on the use of the word "recursive"
but feel that the usage is too well-established to change now. Dr Young has referred to
the use of the term "innovations process" in the control literature. In the time-series
literature (Wiener and Masani, 1957; Cramér, 1961) the term "innovation" is normally
used in connection with stochastic processes whose realizations are of infinite length.
The innovation at time $t$ is then defined as the difference between the observation at time
$t$ and its best linear predictor given all the observations up to and including time $t-1$.
If one were to regard $y_1, \ldots, y_T$ as a sample from an underlying infinite population, one
would take $u_t$ in (1) as the innovation. Extending the definition to finite samples of data,
however, one could take as the "sample innovation" at time $t$ the difference between $y_t$
and its best linear estimate based on $y_1, \ldots, y_{t-1}$, i.e. $y_t - x_t' b_{t-1}$, $t = k+1, \ldots, T$ in our
situation. But it would still be necessary to introduce a new term, such as our term
"recursive residual", to denote the standardized residual $w_t$.

We are grateful to Professor Quandt for the brief summary of recent work by him and
his colleagues. It is a matter for regret that there are no distributional results available for
the original Quandt log-likelihood ratio statistic considered in Section 2.6. The idea of
replacing the discontinuous $D_t$ by a continuous approximation is ingenious and we hope
it will make these difficult problems more tractable.

REFERENCES IN THE DISCUSSION


AKAIKE, H. (1973). Information theory and an extension of the maximum likelihood principle. In International Symposium in Information Theory, pp. 267-281 (B. N. Petrov and F. Csaki, eds). Budapest: Akademiai Kiado.
Annals of Economic and Social Measurement (1973), 2, No. 4 (whole issue).
ANDREWS, D. F. (1972). Plots of high-dimensional data. Biometrics, 28, 125-136.
BATHER, J. A. (1969). Diffusion models in stochastic control theory (with Discussion). J. R. Statist. Soc. A, 132, 335-352.
BOHLIN, T. (1968). Information pattern for linear discrete-time models with stochastic coefficients. Report No. T.P. 18.192. I.B.M. Nordic Laboratory.
BOHLIN, T. (1971). Analysis of EEG-signals with changing spectra. Report No. T.P. 18.212, I.B.M. Nordic Laboratory.
BOX, G. E. P. and JENKINS, G. M. (1970). Time Series Analysis, Forecasting and Control. San Francisco: Holden-Day.
CAREW, B. and BELANGER, P. R. (1973). Identification of optimum filter steady state gain for systems with unknown noise covariances. I.E.E.E. Trans. on Aut. Cont., AC-18, 582-587.
CHAMBERS, J. M. (1971). Regression updating. J. Amer. Statist. Ass., 66, 744-748.
CHOW, G. C. (1960). Tests of the equality between two sets of coefficients in two linear regressions. Econometrica, 28, 561-605.
CRAMÉR, H. (1961). On some classes of nonstationary stochastic processes. Proc. 4th Berk. Symp., 2, 57-78.
DAVIS, O. A., DEMPSTER, M. A. H. and WILDAVSKY, A. (1966). A theory of the budgetary process. Amer. Polit. Sci. Rev., 60, 529-547.
DUNCAN, D. B. and JONES, R. H. (1966). Multiple regression with stationary errors. J. Amer. Statist. Ass., 61, 917-928.
DURBIN, J. (1954). Errors in variables. Rev. Int. Statist. Inst., 22, 23-32.
DURBIN, J. (1970). Testing for serial correlation in least squares regression when some of the regressors are lagged dependent variables. Econometrica, 38, 410-421.
DURBIN, J. and MURPHY, M. J. (1975). Seasonal adjustment based on a mixed additive-multiplicative model. J. R. Statist. Soc. A, 138, Part 3 (to appear).
EHRENBERG, A. S. C. (1975). Data Reduction. London and New York: Wiley.
FAIR, R. C. and JAFFEE, D. M. (1972). Methods of estimation for markets in disequilibrium. Econometrica, 40, 497-514.
FARLEY, J. U. and HINICH, M. J. (1970). Testing for a shifting slope coefficient in a linear model. J. Amer. Statist. Ass., 65, 1320-1329.
FARLEY, J. U., HINICH, M. J. and McGUIRE, T. W. (1973). Testing for a shift in the slopes of a multivariate linear time series model. Working paper, Carnegie-Mellon University, June 1973.
GILCHRIST, W. G. (1967). Methods of estimation using discounting. J. R. Statist. Soc. B, 29, 355-369.
GLADYSHEV, E. G. (1961). Periodically correlated random sequences. Soviet Maths, 2, 385-388.
GOLDFELD, S. M. and QUANDT, R. E. (1974). The estimation of structural shifts by switching regressions. Ann. Econ. Soc. Measur., 2, 475-485.
GREEN, M. and HARRISON, P. J. (1973). Fashion forecasting for a mail-order company using a Bayesian approach. Oper. Res. Quart., 24, 193-205.
HANCOCK, R. A. (1971). Problems relating to the identification of linear stochastic systems. Ph.D. Thesis, Department of Engineering, University of Cambridge.
HARRISON, P. J. and STEVENS, C. F. (1971). A Bayesian approach to short-term forecasting. Oper. Res. Quart., 22, 341-362.
HARVEY, A. C. and COLLIER, P. (1975). Testing for functional mis-specification in regression analysis. University of Kent Q.S.S. Discussion Paper No. 12.
HARVEY, A. C. and PHILLIPS, G. D. A. (1974). A comparison of the power of some tests for heteroscedasticity in the general linear model. J. Econometrics, 2, 307-316.
HILDRETH, C. and HOUCK, J. P. (1968). Some estimators for a linear model with random coefficients. J. Amer. Statist. Ass., 63, 584-595.
JONES, R. H. (1966). Exponential smoothing for multivariate time series. J. R. Statist. Soc. B, 28, 241-251.
JONES, R. H. and BRELSFORD, W. M. (1967). Time series with periodic structure. Biometrika, 54, 403-408.
KAILATH, T. (1968). An innovations approach to least squares estimation, Part I: Linear filtering in additive white noise. I.E.E.E. Trans. on Aut. Cont., AC-13, 655-660.
KAILATH, T. (1974). A view of three decades of linear filtering theory. I.E.E.E. Trans. on Inf. Theory, IT-20, 145-181.
KALMAN, R. E. (1960). A new approach to linear filtering and prediction problems. Trans. A.S.M.E., J. Basic Engng, 82, 35-45.
MEHRA, R. K. (1970). On the identification of variances and adaptive Kalman filtering. I.E.E.E. Trans. on Aut. Cont., AC-15, 175-184.
MEHRA, R. K. and PESCHON, J. (1971). An innovation approach to fault detection and diagnosis in dynamic systems. Automatica, 7, 637-640.
NEETHLING, C. G. and YOUNG, P. C. (1974). Comments on "Identification of optimal filter steady-state gain for systems with unknown noise covariances". I.E.E.E. Trans. on Aut. Cont., AC-19, 623-624.
PHILLIPS, G. D. A. and HARVEY, A. C. (1974). A simple test for serial correlation in regression analysis. J. Amer. Statist. Ass., 69, 935-939.
PRIESTLEY, M. B. (1965). Evolutionary spectra and non-stationary processes. J. R. Statist. Soc. B, 27, 204-237.
PRIESTLEY, M. B. and SUBBA RAO, T. (1969). A test for non-stationarity of time series. J. R. Statist. Soc. B, 31, 140-149.
PRIESTLEY, M. B. and TONG, H. (1973). On the analysis of bivariate non-stationary processes (with Discussion). J. R. Statist. Soc. B, 35, 153-188.
ROSENBROCK, H. (1965). Sur les relations entre les filtres lineaires discrets et quelques formules de Gauss. Conference on Identification, Optimalisation et Stabilite des Systemes Automatiques. Paris:
SINGLETON, P. W. (1971). Multiple regression methods in forecasting. M.Phil. Thesis, University of London.
SUBBA RAO, T. and TONG, H. (1972). A test for time dependence of linear open-loop systems. J. R. Statist. Soc. B, 34, 235-250.
SUBBA RAO, T. and TONG, H. (1973). On some tests for the time dependence of a transfer function. Biometrika, 60, 589-597.
SUBBA RAO, T. and TONG, H. (1974). Linear time-dependent systems. I.E.E.E. Trans. on Aut. Cont., AC-19, 735-737.
TELKSNYS, L. (1973). Determination of changes in the properties of random processes by the Bayes method. In Identification and System Parameter Estimation (P. Eykoff, ed.). Amsterdam: North-Holland.
TONG, H. (1974). On time-dependent linear transformations of non-stationary stochastic processes. J. Appl. Prob., 11, 53-62.
WALKER, S. H. and DUNCAN, D. B. (1967). Estimation of the probability of an event as a function of several independent variables. Biometrika, 54, 167-179.
WHITEHEAD, P. G. and YOUNG, P. C. (1975). A dynamic-stochastic model for water quality in part of the Bedford-Ouse river system. In Proc. IFIP Working Conference on Modelling and Simulation of Water Resource Systems. Ghent, Belgium: (to appear).
WHITTLE, P. (1969). A view of stochastic control theory (with Discussion). J. R. Statist. Soc. A, 132, 320-334.
WIENER, N. and MASANI, P. (1957). The prediction theory of multivariate stochastic processes. Acta Math., 98, 111-150.
WISHART, D. M. G. (1969). A survey of control theory (with Discussion). J. R. Statist. Soc. A, 132, 293-319.
YOUNG, P. C. (1969). Applying parameter estimation to dynamic systems. Control, 16 (10), 119-125; 16 (11), 118-124.
YOUNG, P. C. (1971). Comments on "Dynamic equations for economic forecasting with G.D.P.-unemployment relation and growth of G.D.P. in the U.K. as an example". J. R. Statist. Soc. A, 134, 220-223.
YOUNG, P. C. (1972). Comments on "On line identification of linear dynamic systems with application to Kalman filtering". I.E.E.E. Trans. on Aut. Cont., AC-17, 269-270.
YOUNG, P. C. (1974). Recursive approaches to time series analysis. Bull. Inst. Maths. & Applic., 10, 209-224.
YOUNG, P. C. and HASTINGS-JAMES, R. (1970). Identification and control of discrete dynamic systems subject to disturbances with rational spectral density. In Proc. 9th I.E.E.E. Symp. on Adaptive Processes: Decision and Control. New York: I.E.E.E.
YOUNG, P. C., SHELSWELL, S. H. and NEETHLING, C. G. (1971). A recursive approach to time series analysis. Department of Engineering, Cambridge University, Report No. CUED/B-Control/TR16.
YOUNG, P. C. and WHITEHEAD, P. G. (1975). A recursive approach to time series analysis for multivariate systems. In Proc. IFIP Working Conference on Modelling and Simulation of Water Resource Systems. Ghent, Belgium: (to appear).
