Biometrika (1984), 71, 3, pp. 599-607
Printed in Great Britain
Testing for unit roots in autoregressive-moving average
models of unknown order
By SAID E. SAID
Department of Mathematics, East Carolina University, Greenville, North Carolina, U.S.A.
AND DAVID A. DICKEY
Department of Statistics, North Carolina State University, Raleigh, North Carolina, U.S.A.
Summary
Recently, methods for detecting unit roots in autoregressive and autoregressive-
moving average time series have been proposed. The presence of a unit root indicates
that the time series is not stationary but that differencing will reduce it to stationarity.
The tests proposed to date require specification of the number of autoregressive and
moving average coefficients in the model. In this paper we develop a test for unit roots
which is based on an approximation of an autoregressive-moving average model by an
autoregression. The test statistic is standard output from most regression programs and
has a limit distribution whose percentiles have been tabulated. An example is provided.
Some key words: Mixed model; Nonstationary; Time series; Unit root.
1. INTRODUCTION
Box & Jenkins (1970) discuss a class of models known as ARIMA(p, d, q) models. The
term ARIMA(p, d, q) refers to an autoregressive-moving average model of the form

Z_t − α_1 Z_{t−1} − ⋯ − α_p Z_{t−p} = e_t + β_1 e_{t−1} + ⋯ + β_q e_{t−q},

where Z_t is the order d difference of the data and e_t is a white noise sequence. These have
become quite popular for modelling time series data. Methods are available for testing
the p autoregressive and q moving average coefficients for significance. The value of d
indicates the amount of differencing necessary to reduce the series to stationarity. The
number d also equals the number of unit roots in the characteristic equation for the time
series.
While much has been written about estimation when d = 0, relatively little has been
written about cases with d > 0. D. A. Dickey, in his Iowa State University Ph.D. thesis,
Dickey & Fuller (1979) and Hasza & Fuller (1979) discuss cases with q = 0. Dickey &
Said (1981) discuss a method for testing the hypothesis H_0: d = 1 when p and q are
known. Unfortunately, p and q are usually unknown and it is necessary to determine d
before estimating p and q if we use the approach described by Box & Jenkins (1970).
In the present paper we show that it is possible to approximate an ARIMA(p, 1, q) by an
autoregression whose order is a function of the number of observations n. Using least
squares to estimate coefficients in this autoregressive approximation produces statistics
whose limit distributions are the same as those tabulated by Dickey and listed by Fuller
(1976, p. 373). Thus it is possible to test the null hypothesis H_0: d = 1 without knowing p
or q.
The development is illustrated for p = q = 1 and normal errors, then extended to the
general p, q case with independent identically distributed errors.
2. ESTIMATION PROCEDURE
We first consider the model described by

Y_t = ρ Y_{t−1} + Z_t,   Z_t = α Z_{t−1} + e_t + β e_{t−1}   (t = 1, 2, …),   (2.1)

where it is assumed that |α| < 1, |β| < 1, Y_0 = 0 and {e_t} is a sequence of independent
identically distributed random variables which, for the moment, we will assume to be
normal. If |ρ| < 1 then, except for transitory start up effects, Y_t is a stationary
ARMA(2, 1) series in the terminology of Box & Jenkins (1970). If ρ = 1 the series (2.1) is
an ARIMA(1, 1, 1) process. We consider ρ = 1 as a null hypothesis to be tested. Notice
that

e_t = Σ_{j=0}^{∞} (−β)^j (Z_{t−j} − α Z_{t−j−1})

and it follows that

Y_t − Y_{t−1} = (ρ − 1) Y_{t−1} + (α + β)(Z_{t−1} − β Z_{t−2} + β² Z_{t−3} − ⋯) + e_t.   (2.2)
Under the null hypothesis that ρ = 1, we see that Z_t = Y_t − Y_{t−1}. This motivates us to
estimate the coefficients in (2.2) by regressing the first difference Ỹ_t = Y_t − Y_{t−1} on
Y_{t−1}, Ỹ_{t−1}, …, Ỹ_{t−k}, where k is a suitably chosen integer. To get consistent estimates of
the coefficients in (2.2) it is necessary to let k be a function of n. We shall assume that
n^{−1/3} k → 0 and that there exist c > 0, r > 0 such that c k > n^{1/r}.
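As a concrete illustration of the procedure just described, the long autoregression can be fitted by ordinary least squares in a few lines. This is our own sketch, not code from the paper; the parameter values alpha = 0.5, beta = 0.3, n = 500 and k = 8 are arbitrary illustrative choices.

```python
import numpy as np

# Sketch (not the authors' code): simulate the ARIMA(1,1,1) model of Section 2
# and regress the first difference of Y on the lagged level and k lagged
# differences.  The coefficient on the lagged level estimates rho - 1.

def simulate_model(n, alpha, beta, rho=1.0, seed=0):
    """Y_t = rho*Y_{t-1} + Z_t,  Z_t = alpha*Z_{t-1} + e_t + beta*e_{t-1},  Y_0 = 0."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n + 1)
    z = np.zeros(n + 1)
    y = np.zeros(n + 1)
    for t in range(1, n + 1):
        z[t] = alpha * z[t - 1] + e[t] + beta * e[t - 1]
        y[t] = rho * y[t - 1] + z[t]
    return y[1:]

def rho_hat_minus_one(y, k):
    """Least squares coefficient on Y_{t-1} in the long autoregression."""
    dy = np.diff(y)                    # first differences Y_t - Y_{t-1}
    m = len(dy)
    X = np.empty((m - k, k + 1))
    X[:, 0] = y[k:-1]                  # lagged level Y_{t-1}
    for j in range(1, k + 1):
        X[:, j] = dy[k - j:m - j]      # lagged differences
    coef, *_ = np.linalg.lstsq(X, dy[k:], rcond=None)
    return float(coef[0])              # estimate of rho - 1 (d_0)

y = simulate_model(500, 0.5, 0.3, rho=1.0, seed=42)
print(rho_hat_minus_one(y, 8))         # close to zero under the unit root
```

Under the null hypothesis ρ = 1 the printed estimate is O_p(1/n) and hence near zero; for a stationary series (ρ < 1) it settles at a clearly negative value.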
3. DEFINITIONS AND NOTATION
Let d_0 = ρ − 1 and d_i = (α + β)(−β)^{i−1} (i > 0) be the coefficients in (2.2). Let

X_t′ = (Z_{t−1}, …, Z_{t−k}),   U_t′ = (Y_{t−1}, X_t′)

and d′ = (d_0, d_1, …, d_k). Now consider a truncated version of (2.2),

Y_t − Y_{t−1} = (ρ − 1) Y_{t−1} + (α + β){Z_{t−1} − β Z_{t−2} + ⋯ + (−β)^{k−1} Z_{t−k}} + e_{tk}.   (3.1)

Notice that e_{tk} is not a white noise series. In fact e_{tk} = Ỹ_t − U_t′ d. Define â′ = (â_0, â_1, …, â_k)
to be the vector of estimated coefficients in the regression of Ỹ_t on Y_{t−1}, Ỹ_{t−1}, …, Ỹ_{t−k},
hereafter referred to as regression (3.1). Further, define a sum of squares and cross-
products matrix R_0 and a normalizing matrix D_n by

R_0 = [ Σ Y²_{t−1}    Σ Y_{t−1} X_t′ ]
      [ Σ X_t Y_{t−1}   Σ X_t X_t′  ],   (3.2)

D_n = diag{(n−k)^{−1}, (n−k)^{−1/2}, …, (n−k)^{−1/2}}.
We shall develop the limit distribution of

D_n^{−1}(â − d) = (D_n R_0 D_n)^{−1} D_n Σ U_t e_{tk}.   (3.3)
The first step will be to show that D_n R_0 D_n − R converges to zero, where R is a block
diagonal matrix. Let γ_z(i−j) = E(Z_{t−i} Z_{t−j}) be the autocovariance function of the
stationary series Z_t, Γ_{ij} = γ_z(i−j), where Γ is a k × k matrix, and γ′ = {γ_z(1), …, γ_z(k)}. It
is well known (Fuller, 1976, p. 239) that

(n−k)^{−1} Σ X_t X_t′ → Γ,   (n−k)^{−1} Σ X_t Z_t → γ,

where the convergence is in probability. Finally, let W_t = e_1 + ⋯ + e_t and define the block
diagonal matrix R by

R = diag{(1−α)^{−2}(1+β)²(n−k)^{−2} Σ W²_{t−1}, Γ}.   (3.4)
The ideas presented in the next section follow the lines of Berk (1974). He uses the
standard Euclidean norm ||x|| = (x′x)^{1/2} of a column vector x to define a matrix norm
||B|| and defines

||B|| = sup{||Bx|| : ||x|| ≤ 1}.

Notice that ||B||² is bounded by the sum of squares of the elements of B and that ||B|| is
bounded below by the largest modulus of the eigenvalues of B. Using this norm we investigate
the convergence of R_0.
4. ASYMPTOTIC EQUIVALENCE OF D_n R_0 D_n AND R

In this section, we let Q = D_n R_0 D_n − R and show that ||Q|| converges in probability to
0. Let the (i, j)th element of Q be denoted by q_{ij}.

Lemma 4.1. Under the assumptions in §3, if ρ = 1 and k³/n → 0, then k^{1/2} ||Q||
converges in probability to 0.
Proof. First,

q_{11} = (n−k)^{−2} {Σ Y²_{t−1} − (1−α)^{−2}(1+β)² Σ W²_{t−1}}.

For ρ = 1 we have Y_{t−1} = Z_1 + ⋯ + Z_{t−1}, so that

Σ_{j=1}^{t−1} Z_j − α Σ_{j=1}^{t−1} Z_{j−1} = Σ_{j=1}^{t−1} e_j + β Σ_{j=1}^{t−1} e_{j−1}

and thus

(1−α) Y_{t−1} + α Z_{t−1} = (1+β) W_{t−1} + β(e_0 − e_{t−1}) + α Z_0   (4.1)

or

Y_{t−1} = (1−α)^{−1}(1+β) W_{t−1} + O_p(1).

Dickey & Fuller (1979) show that Σ W²_{t−1} = O_p(n²). Their arguments show that
(n−k) E(q²_{11}) ≤ C_1, where C_1 is some constant.
Berk (1974) shows that for some C_2,

(n−k) E[{(n−k)^{−1} Σ Z_{t−i} Z_{t−j} − γ_z(i−j)}²] ≤ C_2   (i, j = 1, …, k).

For i > 1, j > 1, we note that q_{ij} = (n−k)^{−1} Σ Z_{t−i} Z_{t−j} − γ_z(i−j).

We now consider q_{ij} for i = 1, j > 1. The vector of these q_{1j} from (3.2) is
(n−k)^{−3/2} Σ Y_{t−1} X_t′. Now for some C_3,

(n−k) E(q²_{1j}) = (n−k)^{−2} E{(Σ Y_{t−1} Z_{t−j})²} ≤ C_3.
This follows from the facts that

|E(Y_t Z_s)| ≤ Σ_{i=−∞}^{∞} |γ_z(i)| = C_4 < ∞,   |E(Y_t Y_s)| ≤ t Σ_{i=−∞}^{∞} |γ_z(i)| = t C_4   (t ≤ s).

To finish the proof, let C = max(C_1, C_2, C_3), so that (n−k) E(q²_{ij}) ≤ C for all i, j; note
that C_1, C_2 and C_3 do not depend on n or k. Since Q has dimension (k+1) × (k+1) we see
E(||Q||²) ≤ C(k+1)²/(n−k), and if k²/n → 0 then ||Q|| converges in probability to 0.
Under our assumption that k³/n → 0 we have k^{1/2} ||Q|| converging to 0 also.
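The random walk approximation behind (4.1), that Y_{t−1} behaves like (1−α)^{−1}(1+β)W_{t−1} up to an O_p(1) remainder, is easy to check numerically. The following is our own illustration, not part of the paper; alpha = 0.5, beta = 0.3 and n = 2000 are arbitrary choices.

```python
import numpy as np

# Numerical check of the approximation below (4.1): under rho = 1 the level
# Y_{t-1} tracks (1 - alpha)^{-1} (1 + beta) W_{t-1}, where W_t = e_1 + ... + e_t.

alpha, beta, n = 0.5, 0.3, 2000
rng = np.random.default_rng(7)
e = rng.standard_normal(n + 1)
z = np.zeros(n + 1)
y = np.zeros(n + 1)
for t in range(1, n + 1):
    z[t] = alpha * z[t - 1] + e[t] + beta * e[t - 1]
    y[t] = y[t - 1] + z[t]                 # rho = 1
w = np.cumsum(e[1:])                       # W_t = e_1 + ... + e_t
approx = (1 + beta) / (1 - alpha) * w

# The remainder stays O_p(1) while both paths grow like sqrt(t), so the two
# series are almost perfectly correlated over a long stretch.
corr = np.corrcoef(y[1:], approx)[0, 1]
print(corr)
```

Because the remainder is bounded in probability while the random walk component grows, the printed correlation is very close to one.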
Lemma 4.2. Under the assumptions of model (2.1) with ρ = 1,

{(1−α)^{−2}(1+β)²(n−k)^{−2} Σ_{t=k+1}^{n} W²_{t−1}}^{−1} = O_p(1).
Proof. Now

Σ_{t=k+1}^{n} W²_{t−1} = e′Ae,

where e′ = (e_1, …, e_{n−1}) and the (i, j)th element of A is n − max(i, j). Thus

e′Ae = Σ_{i=1}^{n−1} λ_{in} Z²_i,

where {Z_i} is an independent N(0, σ²) sequence, and (Dickey & Fuller, 1979)

λ_{in} = 4^{−1} sec²{(2n−1)^{−1}(n−i)π}   (i = 1, …, n).
Also, for any fixed i, n^{−2} λ_{in} → 4{(2i−1)π}^{−2} > 0 as n → ∞, so that, given ε > 0, we
can find M_ε such that

pr[{n^{−2} Σ λ_{in} Z²_i}^{−1} > M_ε] = pr(n^{−2} Σ λ_{in} Z²_i < M_ε^{−1}) < ε.

Theorem 4.1. Under the assumptions of Lemma 4.1, ||(D_n R_0 D_n)^{−1} − R^{−1}|| converges
in probability to 0. The proof uses the fact that pr(||Q|| > 1) can be made arbitrarily small
by choice of n, since ||Q|| converges in probability to 0.
5. CONSISTENCY OF LEAST SQUARES ESTIMATORS
In §2 we propose the use of least squares autoregression to estimate ρ in model (2.1).
We estimate ρ − 1 by the first element of the vector â given by expression (3.3). In §4 we
looked at D_n R_0 D_n, so it remains to consider D_n Σ U_t e_{tk}. We show that this vector
converges to 0 and thus prove consistency.
Lemma 5.1. As n → ∞, ||D_n Σ U_t (e_{tk} − e_t)|| converges in probability to 0.

Proof. We have

||D_n Σ U_t (e_{tk} − e_t)||² = (n−k)^{−2} {Σ Y_{t−1}(e_{tk} − e_t)}² + (n−k)^{−1} Σ_{j=1}^{k} {Σ_t Z_{t−j}(e_{tk} − e_t)}².   (5.1)

Since {Z_t} is a stationary invertible ARMA process, we can use (3.1) and (3.2) to see that

(n−k)^{−1/2} | Σ_t Z_{t−j}(e_{tk} − e_t) | → 0

in probability. The order of the first term in (5.1) is established by expressing Y_{t−1} and
e_{tk} − e_t in terms of {Z_t}, and the proof is thereby completed.
At this point we will establish the probability order of

D_n Σ_{t=k+1}^{n} U_t e_t,

which, in combination with Lemma 5.1 and Theorem 4.1, will establish consistency.
Lemma 5.2. As n → ∞,

||D_n Σ U_t e_t|| = O_p(k^{1/2}).

Proof. Now

E ||D_n Σ U_t e_t||² = (n−k)^{−2} E{(Σ Y_{t−1} e_t)²} + (n−k)^{−1} Σ_{j=1}^{k} E{(Σ_t Z_{t−j} e_t)²}.

Since Z_t is a stationary invertible autoregressive-moving average process,

(n−k)^{−1} Σ_{j=1}^{k} E{(Σ_t Z_{t−j} e_t)²} = k σ² γ_z(0).

Here we have used the independence of Z_{t−j} and e_t for j > 0. This same fact shows that

E{(Σ Y_{t−1} e_t)²} = σ² Σ E(Y²_{t−1}) = O(n²).
Combining Lemmas 5.1 and 5.2, we see that

||D_n Σ U_t e_{tk}|| = O_p(k^{1/2})   (5.2)

and we thus are in a position to prove our main result.
Theorem 5.1. Under the assumptions of model (2.1) with ρ = 1, ||â − d|| converges in
probability to 0.

Proof. We write

D_n^{−1}(â − d) = {(D_n R_0 D_n)^{−1} − R^{−1}} D_n Σ U_t e_{tk} + R^{−1} D_n Σ U_t e_t + R^{−1} D_n Σ U_t (e_{tk} − e_t).   (5.3)

Taking norms of the three terms above we obtain, respectively, o_p(1), O_p(k^{1/2}) and o_p(1),
so that ||D_n^{−1}(â − d)|| = O_p(k^{1/2}). Since ||D_n|| = O{(n−k)^{−1/2}} and k = o(n^{1/3}),
||â − d|| = ||D_n D_n^{−1}(â − d)|| converges in probability to 0 and the proof is complete.
6. LIMIT DISTRIBUTION OF UNIT ROOT TEST STATISTICS
Having shown that ||â − d|| converges in probability to 0, we now develop the limit
distribution of D_n^{−1}(â − d) which, by (5.3), is the same as that of R^{−1} D_n Σ U_t e_t.
The first element of D_n^{−1}(â − d) is

(n−k)(ρ̂ − 1) = {(n−k)^{−2} Σ Y²_{t−1}}^{−1} (n−k)^{−1} Σ Y_{t−1} e_{tk}.

By Lemma 4.1,

(n−k)^{−2} Σ Y²_{t−1} = (1−α)^{−2}(1+β)² (n−k)^{−2} Σ W²_{t−1} + o_p(k^{−1/2}).   (6.1)

By the arguments of Lemma 5.1,

(n−k)^{−1} Σ Y_{t−1} e_{tk} = (n−k)^{−1} Σ Y_{t−1} e_t + o_p(1)   (6.2)

and, if we use result (4.1),

(n−k)^{−1} Σ Y_{t−1} e_t = (n−k)^{−1} (1−α)^{−1}(1+β) Σ W_{t−1} e_t + O_p(n^{−1/2}).
Dickey & Fuller (1979) define random variables (Γ, ξ) and show that

(σ^{−2} n^{−2} Σ W²_{t−1},  σ^{−2} n^{−1} Σ W_{t−1} e_t)

converges in law to (Γ, ξ). If we use (6.1) and (6.2), the distribution of Γ^{−1} ξ is the same as
the limit distribution of

(1−α)^{−1}(1+β)(n−k)(â_0 − d_0) = (1−α)^{−1}(1+β)(n−k)(ρ̂ − 1).
If we use the results of Hasza & Fuller (1979), the above limiting distribution is
unchanged if the e_t sequence is taken to be independently and identically distributed
with zero mean and variance σ². Further, since R is block diagonal, the results of Berk
(1974) apply to the coefficient vector (â_1 − d_1, …, â_k − d_k). That is, the limit distribution of
this part of â − d is the same whether or not we include Y_{t−1} on the right-hand side in
regression (3.1). Thus, for large n, significance tests for these coefficients are not affected
by including Y_{t−1} in the regression.
Since the distribution of n(ρ̂ − 1) involves the unknown parameters α and β, we would
be unable to use ρ̂ as a test for unit roots at the identification stage of analysis. We
now show that the limit distribution of the studentized t statistic associated with ρ̂
does not depend on any unknown parameters.
Theorem 6.1. Under the assumptions of Theorem 4.1, define τ = (C_{11} σ̂²)^{−1/2}(ρ̂ − 1), where
C_{11} is the upper left-hand corner element of R_0^{−1}, ρ̂ − 1 = â_0 − d_0, and σ̂² is the error mean
square from regression (3.1). Then the limit distribution of τ is the same as the distribution of
Γ^{−1/2} ξ.

Proof. By our previous consistency results, σ̂² converges in probability to σ². Using the
arguments of Lemma 4.1 we see that

τ − {n^{−2} Σ Y²_{t−1}}^{1/2} σ̂^{−1} n(ρ̂ − 1) → 0.

If we use (6.1) and (6.2) it follows that

τ − σ̂^{−1} {n^{−2} Σ W²_{t−1}}^{1/2} (1−α)^{−1}(1+β) n(ρ̂ − 1) → 0

and thus τ → Γ^{−1/2} ξ in law.
Percentiles of the distribution of τ are given by Fuller (1976, p. 373) and can be used to
test the null hypothesis of a unit root. An illustration is given in §8.
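The studentized statistic of Theorem 6.1 is the ordinary regression t ratio on the lagged level, so it can be computed directly from the least squares fit. The sketch below is our own illustration, not the authors' code; alpha = 0.5, beta = 0.3, n = 500 and k = 8 are arbitrary choices, and −1.95 is the commonly quoted approximate 5% point of the tabulated no-intercept distribution.

```python
import numpy as np

# Sketch of tau = (C11 * s^2)^{-1/2} (rho_hat - 1): the coefficient on Y_{t-1}
# divided by its usual regression standard error from the long autoregression.

def tau_statistic(y, k):
    dy = np.diff(y)
    m = len(dy)
    X = np.empty((m - k, k + 1))
    X[:, 0] = y[k:-1]                          # lagged level Y_{t-1}
    for j in range(1, k + 1):
        X[:, j] = dy[k - j:m - j]              # lagged differences
    resp = dy[k:]
    coef, *_ = np.linalg.lstsq(X, resp, rcond=None)
    resid = resp - X @ coef
    s2 = resid @ resid / (len(resp) - (k + 1))  # error mean square
    c11 = np.linalg.inv(X.T @ X)[0, 0]          # upper-left element of R0^{-1}
    return float(coef[0] / np.sqrt(c11 * s2))

rng = np.random.default_rng(1)
e = rng.standard_normal(501)
z = np.zeros(501)
y = np.zeros(501)
for t in range(1, 501):
    z[t] = 0.5 * z[t - 1] + e[t] + 0.3 * e[t - 1]
    y[t] = y[t - 1] + z[t]                      # unit root null
tau = tau_statistic(y[1:], 8)
print(tau)   # reject the unit root at roughly the 5% level only if tau < about -1.95
```

Because τ has the tabulated Dickey-Fuller limit distribution rather than Student's t, the critical value comes from Fuller's tables, not the usual t tables.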
7. EXTENSIONS
The general ARIMA(p, 1, q) model is defined by

Y_t = ρ Y_{t−1} + Z_t   (t = 1, 2, …),   (7.1)

Z_t = Σ_{i=1}^{p} α_i Z_{t−i} + e_t + Σ_{j=1}^{q} β_j e_{t−j},   (7.2)

with Y_0 = 0, e_t a normal pure noise sequence and ρ = 1. We assume that Z_t is stationary
and invertible. Thus there exists a sequence of real numbers {d_j}, a number 0 < λ < 1
and a number M such that e_t = Σ d_j Z_{t−j} and |d_j| ≤ M λ^j (Fuller, 1976, Theorem 2.7.2).
Rearranging, we can write, in analogy to (3.1),

Y_t − Y_{t−1} = (ρ − 1) Y_{t−1} − Σ_{j=1}^{k} d_j Z_{t−j} + e_{tk}.   (7.3)

Notice that the consistency proofs in the ARIMA(1, 1, 1) case depend only on the
existence of exponentially decreasing bounds on the d_j and so they generalize
immediately to the higher order case.
If ρ = 1, then Z_{t−j} = Y_{t−j} − Y_{t−j−1} = Ỹ_{t−j}, and in regression (7.3) we will refer to the
least squares coefficient on Y_{t−1} as ρ̂ − 1. Summation of (7.2) on both sides as t goes from
1 to n shows that, in analogy to (4.1) with ρ = 1,

(1 − α_1 − ⋯ − α_p) Y_t = (1 + β_1 + ⋯ + β_q) W_t + V_t,   (7.4)

where V_t is a linear combination of a finite number of Z_t and e_t and W_t = e_1 + ⋯ + e_t as
before. If we use (7.4) and the arguments of §6 it follows that, for the general
ARIMA(p, 1, q) case,

n (1 − α_1 − ⋯ − α_p)^{−1} (1 + β_1 + ⋯ + β_q) (ρ̂ − 1) → Γ^{−1} ξ   (7.5)

in law, as does the studentized statistic from the least squares regression (7.3).
Finally, if the series mean Ȳ is subtracted from each Y_t prior to analysis, we replace
(7.4) by

(1 − α_1 − ⋯ − α_p)(Y_t − Ȳ) = (1 + β_1 + ⋯ + β_q)(W_t − W̄) + V_t,   (7.6)

where W̄ = n^{−1} Σ W_t. We note that Z_t = (Y_t − Ȳ) − (Y_{t−1} − Ȳ) = Ỹ_t when ρ = 1. Using
the results of Dickey & Fuller (1979) we get distributions similar to (7.5) except that the
limit distribution percentiles now correspond to the ρ̂_μ and τ_μ tables of Fuller (1976,
pp. 371-3). This follows from the fact that, for fixed j, Σ_t Z²_{t−j} = O_p(n), so that the order
results following (4.1) are unchanged if the series mean Ȳ is subtracted prior to analysis.
The limit matrix R is still block diagonal. The only difference from (3.4) is that the upper
left-hand element of R is now, if we use (6.1),

(n−k)^{−2} (1 − α_1 − ⋯ − α_p)^{−2} (1 + β_1 + ⋯ + β_q)² Σ (W_t − W̄)²,

and a similar modification in (6.2) yields results like (7.5).
8. EXAMPLE
Box & Jenkins (1970, p. 525) list 197 concentration readings from a chemical process.
The authors conclude that an ARMA(1, 1) or an ARIMA(0, 1, 1) model should be fitted to
the data. We use our testing procedure to test the hypothesis that the model form is
ARIMA(p, 1, q).
Using SAS (Barr, Goodnight & Sall, 1979) we fit the model

Ỹ_t = −0.1601 (Y_{t−1} − Ȳ) − 0.4941 Ỹ_{t−1} − 0.2919 Ỹ_{t−2} − 0.2640 Ỹ_{t−3}
      − 0.2477 Ỹ_{t−4} − 0.2682 Ỹ_{t−5} − 0.1888 Ỹ_{t−6},

where the error mean square is 0.0938 and Ỹ_t = Y_t − Y_{t−1}. The coefficient standard errors
are 0.0785, 0.0963, 0.0985, 0.0947, 0.0903, 0.0858 and 0.0726. The studentized statistic
τ̂_μ = (0.0785)^{−1}(−0.1601) = −2.04 is compared to the τ_μ tables of Fuller (1976, p. 373).
We do not reject the unit root hypothesis using a one-sided 10% significance level test.
The limit theory, of course, does not specify the value of k for any given n. To select k,
we initially fit regression (3.1) for k = 6, 7, 8, 9 and 10. Assuming that an autoregressive
order 10 model gives a sufficient approximation to the data, we used the standard
regression F test that the coefficients on Ỹ_{t−7} to Ỹ_{t−10} are simultaneously zero. The
justification for this as an approximate test is the fact from §6 that Berk's results
hold for the coefficients on Ỹ_{t−j} in regression (3.1). The sequential sums of squares
for Ỹ_{t−7} to Ỹ_{t−10} are 0.0279, 0.0019, 0.0013 and 0.1341. The error sum of squares is
16.6423 with 174 degrees of freedom and we compute F_{4,174} = 0.43. None of the individual
t tests for these 4 lags was significant, even at the 20% level.
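The lag-selection device above can be sketched as a nested-model F test: fit the long autoregression with the larger and the smaller k on a common sample and compare residual sums of squares. This is our own illustration on a simulated stand-in series, not the original data or code; the value 2.4 quoted in the comment is the approximate 5% point of an F distribution with 4 and roughly 175 degrees of freedom.

```python
import numpy as np

# F test that the coefficients on the last four lagged differences are zero,
# comparing the k = 10 fit with the k = 6 fit on the same observations.

def fit_rss(y, k, t0):
    """Residual sum of squares of regression (3.1) with k lags, starting at
    index t0 so that both fits use exactly the same sample."""
    dy = np.diff(y)
    m = len(dy)
    X = np.empty((m - t0, k + 1))
    X[:, 0] = y[t0:-1]                   # lagged level
    for j in range(1, k + 1):
        X[:, j] = dy[t0 - j:m - j]       # lagged differences
    resp = dy[t0:]
    coef, *_ = np.linalg.lstsq(X, resp, rcond=None)
    resid = resp - X @ coef
    return float(resid @ resid), len(resp)

rng = np.random.default_rng(5)
e = rng.standard_normal(198)
y = np.cumsum(e[1:] + 0.7 * e[:-1])      # simulated ARIMA(0,1,1) stand-in

rss_full, nobs = fit_rss(y, 10, 10)      # unrestricted: 10 lagged differences
rss_restr, _ = fit_rss(y, 6, 10)         # restricted: last 4 lags dropped
df_full = nobs - 11                      # nobs minus (k + 1) coefficients
F = ((rss_restr - rss_full) / 4) / (rss_full / df_full)
print(F)  # an F value well below about 2.4 supports dropping the extra lags
```

Because the restricted model is nested in the full one and both are fitted to the same observations, the restricted residual sum of squares can never be smaller than the unrestricted one.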
Furthermore, when regression (3.1) is fitted with k = 7, 8, 9 and 10, the τ_μ statistics are
−1.931, −1.830, −1.796 and −2.013, so that our unit root test is not affected over this
range of k.
One will now want to fit a model to the differenced data. Once this has been done, unit
root tests based on known p and q are available (Dickey & Said, 1981).
REFERENCES

Barr, A. J., Goodnight, J. H. & Sall, J. P. (1979). SAS User's Guide. Cary, North Carolina: SAS Institute.
Berk, K. N. (1974). Consistent autoregressive spectral estimates. Ann. Statist. 2, 489-502.
Box, G. E. P. & Jenkins, G. M. (1970). Time Series Analysis: Forecasting and Control. San Francisco:
Holden-Day.
Dickey, D. A. & Fuller, W. A. (1979). Distribution of the estimators for autoregressive time series with a
unit root. J. Am. Statist. Assoc. 74, 427-31.
Dickey, D. A. & Said, S. E. (1981). Testing ARIMA(p, 1, q) versus ARMA(p+1, q). Proc. Bus. Econ. Statist.
Sect., Am. Statist. Assoc., 318-22.
Fuller, W. A. (1976). Introduction to Statistical Time Series. New York: Wiley.
Hasza, D. P. & Fuller, W. A. (1979). Estimation for autoregressive processes with unit roots. Ann. Statist.
7, 1106-20.
[Received November 1983. Revised February 1984]