
Distributed Lag and Autoregressive DL Models. Class notes by Prof. H. D. Vinod, All rights reserved.

1. DISTRIBUTED LAG MODELS

Lags are present in econometrics for several reasons: psychological inertia (habit), permanent vs. transitory income, technical and technological reasons causing delay in implementing changes in the capital-labor composition, institutional reasons, labor contracts, etc. Whether to include a finite or an infinite number of lag terms in the regression is one issue. The model

$$Y_t = \alpha + \beta_0 X_t + \beta_1 X_{t-1} + \dots + \beta_k X_{t-k} + u_t$$

has k finite lags. Some types of lags considered in the literature:

Arithmetic Lag: $\beta_i = (k+1-i)\beta$, where the lag coefficients decline on a straight line from $(k+1)\beta$ at $i=0$ to 0 at $i=k+1$.

Inverted V Lag: $\beta_i = i\beta$ for $i \in [0, k/2]$ and $\beta_i = (k-i)\beta$ for $i \in (k/2, k]$; the line goes up and comes down.

Koyck Lag Structure (geometrically declining effect of past on current events): $k=\infty$ is possible by the Koyck method. There is no theory about when to stop lagging the regressors; the choice seems mostly ad hoc, and estimation is subject to serious collinearity. Koyck assumes that all coefficients are of the same sign (positive) and decline geometrically as $\beta_k = \beta_0\lambda^k$, where $\lambda \in (0,1)$ is the rate of decline and $(1-\lambda)$ is the speed of adjustment. If we estimate $\lambda$, we know both the rate of decline and the speed of adjustment. If $\beta_k = \beta_0\lambda^k$, then the sum of the coefficients of the lag terms is

$$\sum_{k=0}^{\infty}\beta_k = \beta_0(1+\lambda+\lambda^2+\dots+\lambda^k+\dots) = \beta_0(1-\lambda)^{-1}.$$

The steps in the derivation of Koyck are designed to make the model easy to estimate. Write the distributed lag model as ADL(0,$\infty$):

$$Y_t = \alpha + \beta_0 X_t + \beta_1 X_{t-1} + \dots + u_t.$$

Substitute the geometrically declining definition of the coefficients into this model to yield

(1) $Y_t = \alpha + \beta_0 X_t + \beta_0\lambda X_{t-1} + \beta_0\lambda^2 X_{t-2} + \dots + u_t.$

Lag the above equation by one period to yield

$Y_{t-1} = \alpha + \beta_0 X_{t-1} + \beta_0\lambda X_{t-2} + \beta_0\lambda^2 X_{t-3} + \dots + u_{t-1}.$

Multiply both sides of this second equation by $\lambda$ to give

(2) $\lambda Y_{t-1} = \lambda\alpha + \beta_0\lambda X_{t-1} + \beta_0\lambda^2 X_{t-2} + \beta_0\lambda^3 X_{t-3} + \dots + \lambda u_{t-1}.$

Now subtract (2) from (1) to yield an ADL(1,0) model:

(3) $Y_t - \lambda Y_{t-1} = \alpha(1-\lambda) + \beta_0 X_t + (u_t - \lambda u_{t-1}).$

The error term here involves $\lambda$ and the lagged error $u_{t-1}$, and that must be recognized. Rewrite (3) as

(4) $Y_t = \alpha(1-\lambda) + \lambda Y_{t-1} + \beta_0 X_t + (u_t - \lambda u_{t-1}).$

The lagged dependent variable is a problem in finding reliable estimates of (4).
Testing for autocorrelation requires the Durbin h test, since a lagged dependent variable is present. Even if the maximum lag is infinite, the average lag for the Koyck model need not be long. The mean lag of the Koyck model is

$$\text{mean lag} = \frac{\sum_{k=0}^{\infty} k\,\beta_k}{\sum_{k=0}^{\infty} \beta_k},$$

which simplifies to $\lambda/(1-\lambda)$. For example, if $\lambda = 0.5$, the mean lag is 1 period.
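The Koyck sum and mean-lag formulas can be checked numerically. The sketch below is only an illustration: it truncates the infinite lag at 200 terms (a computational assumption), and the values of `beta0` and `lam` as well as the function names are hypothetical.

```python
def koyck_weights(beta0, lam, n_lags):
    """Return the Koyck coefficients beta_k = beta0 * lam**k for k = 0..n_lags."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    return [beta0 * lam ** k for k in range(n_lags + 1)]


def mean_lag(weights):
    """Weighted average lag: sum(k * beta_k) / sum(beta_k)."""
    total = sum(weights)
    return sum(k * w for k, w in enumerate(weights)) / total


# Hypothetical values: beta0 = 2, lam = 0.5, truncation at 200 lags.
beta0, lam = 2.0, 0.5
weights = koyck_weights(beta0, lam, 200)

long_run = sum(weights)        # should be close to beta0 / (1 - lam)
avg_lag = mean_lag(weights)    # should be close to lam / (1 - lam)

print(long_run, beta0 / (1.0 - lam))
print(avg_lag, lam / (1.0 - lam))
```

With these inputs both printed pairs should agree up to rounding, since the truncated tail is negligible when `lam` is well below one.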

Pascal Lag: used by Solow (Econometrica, 1960, p. 393): $(1-\lambda L)^r y_t = \beta(1-\lambda)^r x_t + \varepsilon_t$. Jorgenson's rational distributed lag is just a ratio of polynomials in L (Econometrica, 1966, p. 135).

How can we provide economic theory behind the Koyck model? Use the ADAPTIVE EXPECTATIONS framework to rationalize it. Start with the model

(1) $Y_t = \beta_0 + \beta_1 X^*_t + u_t,$

where Y = demand for money (real cash balances) and $X^*_t$ = the equilibrium, optimum, expected long-run or normal rate of interest, which is not observable. The * suggests that it is not observable at time t but known for t-1. Let $\gamma$ denote the rate at which the system adapts to past errors. Make an adaptive model for the past error in expectation made by an economic agent:

$X^*_t - X^*_{t-1} = \gamma(X_t - X^*_{t-1}),$

which is what we learned from last period's expectation compared to the actual. Rewrite this with the unknown on the left side as

$X^*_t = X^*_{t-1} + \gamma(X_t - X^*_{t-1}) = \gamma X_t + (1-\gamma)X^*_{t-1}.$

Substituting in model (1) above, we have

(2) $Y_t = \beta_0 + \beta_1[\gamma X_t + (1-\gamma)X^*_{t-1}] + u_t.$

Now lag model (1) by one period and multiply it by $(1-\gamma)$:

(3) $(1-\gamma)Y_{t-1} = \beta_0(1-\gamma) + (1-\gamma)\beta_1 X^*_{t-1} + (1-\gamma)u_{t-1}.$

If you subtract (3) from (2), you get

$Y_t = \gamma\beta_0 + \gamma\beta_1 X_t + (1-\gamma)Y_{t-1} + u_t - (1-\gamma)u_{t-1}.$

All variables are now observable, but the interpretation is not the same as Koyck, even though the lagged dependent variable is on the RHS, because the coefficient $(1-\gamma)$ of the lagged dependent variable is obviously different here.

The PARTIAL ADJUSTMENT MODEL says it takes time to adjust. Desired capital investment is a linear function of output:

(4) $Y^*_t = \beta_0 + \beta_1 X_t + u_t.$

Even if I want $Y^*_t$, I cannot have it right away:

$Y_t - Y_{t-1} = \delta(Y^*_t - Y_{t-1}),$

where the left side has the actual change and the RHS has a fraction $\delta$ of the desired change. Write this as $Y_t = Y_{t-1} + \delta(Y^*_t - Y_{t-1})$, that is,

(5) $Y_t = (1-\delta)Y_{t-1} + \delta Y^*_t.$

This is almost like Koyck. Substitute (4) in (5) to yield our partial adjustment model:

$Y_t = (1-\delta)Y_{t-1} + \delta(\beta_0 + \beta_1 X_t + u_t),$

where the error term is simply $\delta u_t$, which is simpler than the error $u_t - (1-\gamma)u_{t-1}$ under adaptive expectations. This is an advantage of the partial adjustment framework.

2. AUTOREGRESSIVE DISTRIBUTED LAG ADL(p,q) MODELS

ADL of order 1 in autoregression and order 1 in distributed lags: the ADL(1,1) model is defined as

M1: $y_t = \beta_1 z_t + \beta_2 y_{t-1} + \beta_3 z_{t-1} + \varepsilon_t.$

The ADL(p,q) model is defined as

Mpq: $y_t = \beta_1 z_t + \sum_{i=1}^{p}\beta_{2i}\,y_{t-i} + \sum_{j=0}^{q}\beta_{3j}\,z_{t-j} + \varepsilon_t,$

where the second summation starts at 0, not at 1.

Special case: Static regression, M2: $y_t = \beta_1 z_t + \varepsilon_t.$

Special case: Univariate time series, M3: $y_t = \beta_2 y_{t-1} + \varepsilon_t.$

Special case: Differenced data, M4: $\Delta y_t = \beta_1 \Delta z_t + \varepsilon_t$, which is $y_t = \beta_1 z_t + \beta_2 y_{t-1} + \beta_3 z_{t-1} + \varepsilon_t$ with $\beta_2 = 1$ and $\beta_3 = -\beta_1$. Campbell and Mankiw (1991) considered this to illustrate $\beta_1$ as the proportion of consumers who are income constrained and $(1-\beta_1)$ as the remaining consumers, who are permanent income consumers. The coefficient $\beta_2 = 1$ means that when the consumer is out of equilibrium she will stay there. This may be thought of as a growth rate model. It has the counter-intuitive interpretation that consumers do not try to remove disequilibria in the level of the variable. See model M10 below to understand the equilibrium interpretation.

Special case: Leading indicator, M5: $y_t = \beta_3 z_{t-1} + \varepsilon_t$, where $z_t$ is the leading indicator of $y_t$.
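As a short worked step, the differenced-data case M4 follows from M1 by direct substitution of the two restrictions. The symbols are those of M1; writing $\Delta$ for the first difference is an assumption about the notation lost in the original typesetting.

```latex
\begin{aligned}
y_t &= \beta_1 z_t + \beta_2 y_{t-1} + \beta_3 z_{t-1} + \varepsilon_t \\
    &= \beta_1 z_t + y_{t-1} - \beta_1 z_{t-1} + \varepsilon_t
       \qquad (\beta_2 = 1,\ \beta_3 = -\beta_1) \\
\Rightarrow\quad
y_t - y_{t-1} &= \beta_1 (z_t - z_{t-1}) + \varepsilon_t ,
\qquad\text{i.e. } \Delta y_t = \beta_1 \Delta z_t + \varepsilon_t .
\end{aligned}
```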

Special case: Partial adjustment, M6, ADL(1,0): $y_t = \beta_1 z_t + \beta_2 y_{t-1} + \varepsilon_t.$ The economic agent has a target $y^*_t$ and a one-period cost function

[cf] $C_t = (y_t - y^*_t)^2 + \theta\,(y_t - y_{t-1})^2,$

which incorporates the cost of not meeting the target plus the cost of adjustment (too soon). Note that $\theta > 0$ determines the partial adjustment coefficient. Minimization of cost yields $(y_t - y^*_t) + \theta(y_t - y_{t-1}) = 0.$ We can rewrite this first order condition as $(y_t - y_{t-1}) - (1+\theta)^{-1}[y^*_t - y_{t-1}] = 0$, or as $y_t = \lambda y_{t-1} + (1-\lambda)y^*_t$, where $\lambda = \theta/(1+\theta)$ for $\theta > 0$.

In practice, estimates from the regression $y_t = b_1 z_t + b_2 y_{t-1} + \varepsilon_t$ suffer from a subtle problem. The mean lag is $b_2/(1-b_2)$, which is not defined when $b_2 = 1$ and creates problems when $b_2$ is close to 1. That is, we have cointegration failure arising from the choice of an incorrect long-tailed distribution.
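The cost-minimization step can be illustrated with a small sketch. The symbol `theta` (weight on the adjustment cost), the function names and all numbers are assumptions introduced here for illustration; the code checks that the closed form $y_t = \lambda y_{t-1} + (1-\lambda)y^*_t$ with $\lambda = \theta/(1+\theta)$ is not beaten by nearby values, and shows how the mean lag $b_2/(1-b_2)$ grows as $b_2$ approaches 1.

```python
def cost(y, y_target, y_prev, theta):
    """One-period cost: squared miss of the target plus weighted adjustment cost."""
    return (y - y_target) ** 2 + theta * (y - y_prev) ** 2


def optimal_y(y_target, y_prev, theta):
    """Closed-form minimizer: lam * y_prev + (1 - lam) * y_target, lam = theta/(1+theta)."""
    lam = theta / (1.0 + theta)
    return lam * y_prev + (1.0 - lam) * y_target


def mean_lag(b2):
    """Mean lag b2 / (1 - b2) of the partial adjustment model; undefined at b2 = 1."""
    if b2 == 1.0:
        raise ZeroDivisionError("mean lag is undefined when b2 equals 1")
    return b2 / (1.0 - b2)


# Hypothetical numbers: target 10, last period's value 4, adjustment weight 3.
y_star, y_prev, theta = 10.0, 4.0, 3.0
y_opt = optimal_y(y_star, y_prev, theta)

# The closed form should have lower cost than nearby candidate values.
nearby = [y_opt + d for d in (-0.1, -0.01, 0.01, 0.1)]
is_minimum = all(
    cost(y_opt, y_star, y_prev, theta) < cost(y, y_star, y_prev, theta)
    for y in nearby
)

print(y_opt, is_minimum)
print([mean_lag(b) for b in (0.5, 0.9, 0.99)])   # increases sharply as b2 -> 1
```

With `theta = 3` the agent closes one quarter of the gap between last period's value and the target each period.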

One can consider an inter-temporal optimization problem and introduce an expectation operator in the above cost function [cf above]. The solution to this using Wiener-Hopf methods is derived, using frequency domain optimization methods, in Vinod (Consumer Behavior and a Target Seeking Supply Side Model for Income Determination, Ch. 18 in M. Ahsanullah and D. Bhoj (eds.) Applied Statistical Science, I (1996), Nova Science Pub. Carbondale, IL, pp. 219-233).

Special case: Common factor and autoregressive error, M7: $y_t = \beta_1 z_t + u_t$, $u_t = \beta_2 u_{t-1} + \varepsilon_t$, where $\beta_3 = -\beta_1\beta_2$. Why is it called the common factor model? What is the common factor? ANS: $(1-\beta_2 L)$, where L is the lag operator. These common factors lack much economic interpretation.

Special case: Finite distributed lag, M8: $y_t = \beta_1 z_t + \beta_3 z_{t-1} + \varepsilon_t$. An example is the output of coffee as dependent on the number of trees planted in previous years. However, this is not realistic, as farmers will surely consider market forces such as the stock of coffee beans, price, etc. Another example is the completion of dwellings, given the past history of housing starts ($z_t$). The problem is that a given housing start only influences its own completion, not that of other houses.

Special case: Dead start, M9: $y_t = \beta_2 y_{t-1} + \beta_3 z_{t-1} + \varepsilon_t$. Hall's (1978) random walk in consumption is a dead start model. Under the permanent income hypothesis, if we also assume rational expectations, the change in consumption must be an innovation (random walk) according to Hall. The corresponding data generating process (DGP) will be $\Delta y_t = \varepsilon_t$. This is unrealistic, since it allows the possibility of negative consumption. One should be careful before using statistical theory in economics and somehow rule out negative consumption before using such a model.

Special case: Homogeneous equilibrium correction, M10: $\Delta y_t = \beta_1 \Delta z_t + (\beta_2 - 1)(y_{t-1} - z_{t-1}) + \varepsilon_t$. Starting with a stochastic process, the long-run solution is static in the sense that it does not change over time. Then the process equals its expected value.
(This assumes that the process is stationary.) Consider $y_t = \beta_1 z_t + \beta_2 y_{t-1} + \beta_3 z_{t-1} + \varepsilon_t$. Let $E y_t = y^*$ and $E z_t = z^*$. Taking expectations, the above becomes $y^* = \beta_1 z^* + \beta_2 y^* + \beta_3 z^*$. Hence

$$y^* = (1-\beta_2)^{-1}(\beta_1 + \beta_3)\,z^* = K_1 z^*,$$

which defines $K_1$. So the homogeneous equilibrium solution says that in equilibrium y and z are proportional to each other (e.g. consumption and income?). If there is disequilibrium in one period, there would be an induced change in the next period.

Special case: General equilibrium correction, M11: $\Delta y_t = \beta_1 \Delta z_t + (\beta_2 - 1)(y_{t-1} - K_1 z_{t-1}) + \varepsilon_t$, where $K_1 = (1-\beta_2)^{-1}(\beta_1 + \beta_3)$ as defined above and $\beta_2 \neq 1$.

TESTING the 11 models: Run the regressions and find $e^{*\prime}e^{*}$, the residual sum of squares under the restriction. Use the test based on a loss of fit (Greene, p. 283). When the restriction is nonlinear, use the tests of nonlinear restrictions (Greene, p. 298).

Example 1: In the consumption function, ADL(0,q): $y_t = \alpha_0 + \sum_{j=0}^{q}\beta_{3j}\,z_{t-j} + \varepsilon_t$, where $\beta_{30}$ is the impact multiplier. A unit change in income z at the initial time, held constant forever after, leads to an immediate change of $\beta_{30}$ in the (mean value of) consumption. Note that $\partial(\text{consumption})/\partial(\text{lagged income}) = \partial y_t/\partial z_{t-j} = \beta_{3j}$. The partial sum of the first few coefficients is called the intermediate-term multiplier, and the sum of all coefficients is known as the long-run multiplier.

Example 2: In monetary theory, ADL(0,q): $y_t = \alpha_0 + \sum_{j=0}^{q}\beta_{3j}\,z_{t-j} + \varepsilon_t$, where the left side is inflation and the right side variable is money supply.

Example 3: In production theory, ADL(0,q): $y_t = \alpha_0 + \sum_{j=0}^{q}\beta_{3j}\,z_{t-j} + \varepsilon_t$. Vinod (1976) used this model to construct an index of effective R&D based on past data on R&D: "Application of New Ridge Regression Methods to a Study of Bell System Scale Economies," Journal of the American Statistical Association, Vol. 71, December 1976, pp. 835-841.

The Koyck method can be thought of as ADL(0,q) with $q = \infty$: $y_t = \alpha_0 + \sum_{j=0}^{\infty}\beta_{3j}\,z_{t-j} + \varepsilon_t$, where $\beta_{3j} = \beta_0\lambda^j$, so that $\beta_0 = \beta_{30}$ is the impact multiplier. The sum of the coefficients is simply $\beta_0(1+\lambda+\lambda^2+\dots)$. The long-run multiplier becomes $\beta_0(1-\lambda)^{-1}$ if $|\lambda| < 1$.
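The link between the ADL(1,1) model in levels (M1) and the general equilibrium-correction form (M11) can be verified numerically. The following is a minimal sketch under stated assumptions: the coefficient values, the simulated series and the function names are hypothetical, and the only claim checked is that both forms generate the same path for y when $K_1 = (\beta_1+\beta_3)/(1-\beta_2)$.

```python
import random


def long_run_coefficient(b1, b2, b3):
    """K1 = (b1 + b3) / (1 - b2); only defined when b2 != 1."""
    if b2 == 1.0:
        raise ValueError("K1 is undefined when b2 equals 1")
    return (b1 + b3) / (1.0 - b2)


def simulate(b1, b2, b3, z, eps, use_ecm):
    """Generate y from M1 in levels, or from the equilibrium-correction form M11."""
    k1 = long_run_coefficient(b1, b2, b3)
    y = [0.0]                                  # arbitrary starting value
    for t in range(1, len(z)):
        if use_ecm:
            # Change in y = short-run effect + correction of last period's gap.
            dy = (b1 * (z[t] - z[t - 1])
                  + (b2 - 1.0) * (y[-1] - k1 * z[t - 1]) + eps[t])
            y.append(y[-1] + dy)
        else:
            y.append(b1 * z[t] + b2 * y[-1] + b3 * z[t - 1] + eps[t])
    return y


rng = random.Random(0)                         # fixed seed for reproducibility
b1, b2, b3 = 0.5, 0.7, 0.1                     # hypothetical coefficients
z = [rng.gauss(0.0, 1.0) for _ in range(50)]
eps = [rng.gauss(0.0, 0.1) for _ in range(50)]

y_levels = simulate(b1, b2, b3, z, eps, use_ecm=False)
y_ecm = simulate(b1, b2, b3, z, eps, use_ecm=True)

k1 = long_run_coefficient(b1, b2, b3)
max_gap = max(abs(a - b) for a, b in zip(y_levels, y_ecm))
print(k1, max_gap)
```

The two paths should differ only by floating-point rounding, which illustrates that M11 is a reparametrization of M1 rather than a different model.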

Write the expression for $y_t$, and also for $y_{t-1}$ and for $y_t - \lambda y_{t-1}$, and verify that an ADL(1,0) model can represent the Koyck model. If an empirical estimate of $\lambda$ is larger than unity, we can conclude that something is wrong. For $|\lambda| < 1$, the expression $\lambda^j \to 0$ as $j \to \infty$. The values $\beta_0$ times $(1, \lambda, \lambda^2, \dots)$ can be thought of as generating a discrete probability distribution if their sum is unity. One can study the properties of Koyck's lag distribution, or generalize the idea to another discrete or continuous probability distribution. The median lag of the Koyck model is $-\log 2/\log\lambda$ and the mean lag is $\lambda/(1-\lambda)$.

How can the Adaptive Expectations model be used to test the rational expectations hypothesis? The adaptive expectations (AE) model is distinguished from rational expectations (RE). The basic relation is $y_t = f(x^*_t)$, where an equilibrium (or long-run expectational) variable $x^*_t$ is introduced, though it is not directly observable. The expectation adapts itself over time in the AE framework. At time t, one knows the actual, lagged x values and the expectation about $x_{t-1}$ made at time t-1, denoted by $x^*_{t-1}$, and hence the error in that expectation, $(x_{t-1} - x^*_{t-1})$. From this experience, one learns (adapts) the expectation formation for $x^*_t$, the equilibrium value:

$x^*_t = \gamma x_t + (1-\gamma)x^*_{t-1},$

where $0 \le \gamma \le 1$ and $\gamma$, $(1-\gamma)$ are weights. Note that when $\gamma = 0$ we have static expectations, and when $\gamma = 1$ we have immediately adapting expectations. Rational expectations mean that all available information is taken into account by economic agents. Thus the value $\gamma = 0$ will not fit in the RE framework. Hence a statistical test for $\gamma = 0$ might be a test of the rational expectations model.

ESTIMATION: OLS is consistent, though biased. Instrumental variable estimation is possible (see Maddala-Kim). According to Durbin's (1960) estimating functions viewpoint, OLS is optimal despite the lagged dependent variable.
But if, in addition to $y_{t-1}$ on the right side, there is a problem of autocorrelated errors, these too can be handled by Durbin's (1960) two-step estimator described in Gujarati (1995, p. 432). The first step is to regress $y_t$ on $z_t$, $z_{t-1}$ and $y_{t-1}$, and to find the coefficient $\hat\rho$ of $y_{t-1}$. The second step is to regress the pseudo first differences $y^*_t = y_t - \hat\rho\,y_{t-1}$ on the similarly transformed regressor $x^*_t = x_t - \hat\rho\,x_{t-1}$, where $x_t$ stands for the regressor $z_t$. This process has optimality properties according to some theoretical results. See Vinod (1996), Using Godambe-Durbin Estimating Functions in Econometrics, in I. Basawa, V. P. Godambe and R. Taylor (eds.) Selected Proceedings of the Symposium on Estimating Equations, IMS Lecture Notes-Monograph Series Vol. 32, pp. 215-237. Vinod, H. D., 2000, Foundations of multivariate inference using modern computers, Linear Algebra and Its Applications, 321, 365-385, discusses construction of scaled score functions for inference. Hence Vinod adds a third step:

Step 3: Use the residuals of the second step above as $u_t$. Denote the residual variance by $s^2$. The score function for $\beta$ is $(1/s^2)\sum_t u_t x^*_t$, where $x^*_t$ is the pseudo difference with $\hat\rho$ from the second step. The Godambe pivot function is $\mathrm{GPF} = (X^{*\prime}X^{*})^{-0.5}X^{*\prime}u$. Computation of 999 values of the roots of the equation GPF = 0 for $\beta$ is used for confidence intervals and hence for inference.

Interpretation problems in the presence of expectational variables: McCallum (JME, 1984, pp. 3-14) associates these problems with long-run effects, but Banerjee et al (1993, p. 65) show that the problem is really due to an invalid weak exogeneity assumption, and the same problem can be present in short-run elasticity estimation also. McCallum's example is estimation of a regression of the interest rate $r_t$ on inflation $\pi_t$ as $r_t = \alpha + \beta_1\pi_t + \varepsilon_t$. The Fisher hypothesis is that $\beta_1 = 1$, i.e., in long-run equilibrium the nominal interest rate reflects inflation one-for-one. Now suppose the data are generated by an expectational variable as $r_t = \beta_0 + \beta_1\pi^e_{t+1|t} + \varepsilon_t$, where $\pi^e_{t+1|t}$ denotes the inflationary expectation formed at time t. Let $\pi_t$ be an AR(1) process with AR coefficient $|\rho_1| < 1$. Then $\pi^e_{t+1|t} = \rho_0 + \rho_1\pi_t$, and substituting this in the above equation gives $r_t = \beta_0 + \beta_1[\rho_0 + \rho_1\pi_t] + \varepsilon_t$. The corresponding OLS estimate of the coefficient of $\pi_t$ will converge to $\beta_1\rho_1$, which is less than 1 even when $\beta_1 = 1$. This would tend to reject the Fisher hypothesis $\beta_1 = 1$. Here the problem is that $\pi_t$ is not "weakly exogenous" for estimation of $\beta_1$. This is not a new problem as McCallum thought, but a ramification of a known problem. The solution is joint estimation of two equations:

$r_t = \beta_0 + \beta_1\pi_t + \varepsilon_t$ and $\pi_t = \rho_0 + \rho_1\pi_{t-1} + u_t.$
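McCallum's point can be illustrated by simulation. The sketch below is not taken from the cited papers: the parameter values, sample size, seed and function name are assumptions chosen for illustration. Inflation is generated as an AR(1) process, the interest rate responds one-for-one to expected inflation $\rho_0 + \rho_1\pi_t$, and a simple OLS regression of the interest rate on observed inflation is run.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical parameter values; beta1 = 1 means the Fisher hypothesis holds.
beta0, beta1 = 1.0, 1.0
rho0, rho1 = 0.5, 0.6
n_obs = 20000
rng = random.Random(12345)            # fixed seed for reproducibility

# Inflation follows an AR(1): pi_t = rho0 + rho1 * pi_{t-1} + u_t.
inflation = [rho0 / (1.0 - rho1)]     # start at the unconditional mean
for _ in range(n_obs - 1):
    inflation.append(rho0 + rho1 * inflation[-1] + rng.gauss(0.0, 1.0))

# The interest rate responds to EXPECTED inflation, rho0 + rho1 * pi_t.
interest = [beta0 + beta1 * (rho0 + rho1 * p) + rng.gauss(0.0, 0.5)
            for p in inflation]

slope = ols_slope(inflation, interest)
print(slope, "compared with beta1 * rho1 =", beta1 * rho1)
```

The estimated slope should settle near $\beta_1\rho_1$ rather than near $\beta_1 = 1$, so a naive test would tend to reject the Fisher hypothesis even though it holds in the simulated data.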

It is interesting to note that if the series are nonstationary, and if $r_t$ and $\pi_t$ are cointegrated, then even if weak exogeneity is ABSENT, we can obtain a consistent long-run solution. See Banerjee et al (1993, p. 67).

Lag polynomials and computation of the mean lag in the ADL(1,1) model: The model $y_t = \beta_1 z_t + \beta_2 y_{t-1} + \beta_3 z_{t-1} + \varepsilon_t$ is written in lag polynomials as

$(1-\beta_2 L)\,y_t = (\beta_1 + \beta_3 L)\,z_t + \varepsilon_t,$

$y_t = (1-\beta_2 L)^{-1}(\beta_1 + \beta_3 L)\,z_t + (1-\beta_2 L)^{-1}\varepsilon_t.$

Now change notation and write $y_t = w(L)z_t + u_t$, where $w(L) = \sum_i w_i L^i$ is, in general, an infinite order polynomial in L. Note that when L = 1, $w(1) = \sum_i w_i$. Hence a convenient way to write the sum of all w coefficients, $K_1$, is w(1). Note that $\partial y_t/\partial z_{t-i}$ is $w_i$. Hence the effect of $z_t$ on $y_t$ is $w_0$, the effect of $z_t$ on $y_{t+1}$ is $w_1$, and the effect of $z_t$ on $y_{t+2}$ is $w_2$. The sum of the effects of $z_t$ on all future $y_t$ is the sum of the $w_i$, or $K_1$, which is the cumulative (long-run) response.

If the w's are in the [0,1] interval and their sum is nonzero, then the $w_i$ correspond to discrete probability distribution weights. Any weighted average is $(\sum x\,w)/(\sum w)$. Here we use the lag i instead of x, with i = 0, 1, 2, etc. Thus the average lag by the standard weighted average formula is $(\sum_i i\,w_i)/(\sum_i w_i)$. It is interesting to note that the derivative $\partial w(L)/\partial L$ evaluated at L = 1 is simply $\sum_i i\,w_i$. Hence the mean lag is $w'(1)/w(1)$, where $w'$ denotes the derivative.

Normalize the weights so that they add up to 1 by dividing them by their sum. This is used to develop lag distributions of various shapes. For example, if $\beta_2$ is close to 1, the distribution has fat tails. The median lag is the first lag point at which the normalized cumulative sum of weights exceeds 0.5. For weights normalized so that $\beta_1 + \beta_2 + \beta_3 = 1$, the median lag $= -\log[2(1-\beta_1)]/\log\beta_2$, provided this expression is well defined.
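The lag weights, mean lag and median lag of the ADL(1,1) model can be computed directly. This is a sketch under stated assumptions: the weights are generated from the recursion implied by $w(L) = (\beta_1 + \beta_3 L)/(1-\beta_2 L)$, the infinite sum is truncated at 200 terms, and the coefficient values (chosen so that $\beta_1+\beta_2+\beta_3 = 1$) and function names are hypothetical. The closed form $w'(1)/w(1) = \beta_3/(\beta_1+\beta_3) + \beta_2/(1-\beta_2)$ used below follows from differentiating the ratio of the two polynomials.

```python
import math


def adl11_weights(b1, b2, b3, n_terms):
    """First n_terms weights of w(L) = (b1 + b3 L) / (1 - b2 L), assuming |b2| < 1."""
    weights = [b1]                                   # w_0 = b1
    for i in range(1, n_terms):
        # w_1 = b2 * w_0 + b3, and w_i = b2 * w_{i-1} for i >= 2.
        weights.append(b2 * weights[-1] + (b3 if i == 1 else 0.0))
    return weights


def mean_lag_numeric(weights):
    """Weighted average lag: sum(i * w_i) / sum(w_i)."""
    return sum(i * w for i, w in enumerate(weights)) / sum(weights)


def mean_lag_formula(b1, b2, b3):
    """w'(1) / w(1) for the ADL(1,1) lag polynomial."""
    return b3 / (b1 + b3) + b2 / (1.0 - b2)


def median_lag(weights):
    """First lag at which the normalized cumulative weight exceeds 0.5."""
    total = sum(weights)
    running = 0.0
    for i, w in enumerate(weights):
        running += w / total
        if running > 0.5:
            return i
    raise ValueError("cumulative weight never exceeds 0.5; use more terms")


b1, b2, b3 = 0.3, 0.6, 0.1            # hypothetical, with b1 + b2 + b3 = 1
w = adl11_weights(b1, b2, b3, 200)

numeric = mean_lag_numeric(w)
formula = mean_lag_formula(b1, b2, b3)
med = median_lag(w)
closed_form = -math.log(2.0 * (1.0 - b1)) / math.log(b2)

print(numeric, formula)
print(med, closed_form)
```

The closed-form median expression is a continuous value; the first integer lag at or above it should coincide with the lag found by accumulating the weights.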
